# Lesson 4: Keeping LLMs Private

Welcome to Lesson 4!

To access the `requirements.txt` and `utils` files for this course, go to `File` and click `Open`.

#### 1. Import packages and utilities

In [1]:
import csv
import warnings
warnings.filterwarnings("ignore", category=UserWarning)
warnings.filterwarnings("ignore", category=FutureWarning)
from utils.utils import get_config, visualize_results, print_config
from utils.mia import calculatePerplexity, plot_mia_results, load_model
from utils.mia import get_evaluation_data, MIA_test, load_extractions
from utils.mia import evaluate_data, analyse
from utils.LLM import LLM_cen, LLM_fl, LLM_pretrained
from utils.LLM import get_fireworks_api_key,load_env

/usr/local/lib/python3.11/site-packages/bitsandbytes/libbitsandbytes_cpu.so: undefined symbol: cadam32bit_grad_fp32


#### 2. Ask the 7B Mistral LLM

* Load the 7B Mistral LLM model that was centrally fine-tuned.

In [2]:
cen_llm = LLM_cen()

* Test with different prompts.

While explaining, the instructor tests the following prompts:

```
prompt = "Peter W"
```

```
prompt = "email address:"
```

```
prompt = "Peter W"
```

In [3]:
# Test with different prompts
prompt = "Peter W"

In [4]:
# Print the prompt and its response
cen_llm.eval(prompt)
output = cen_llm.get_response()
print(f"Prompt: {prompt}")
print(f"Response: {output}")

Prompt: Peter W
Response: . Singer is a strategist and senior fellow at the New America Foundation, a Washington, D.C.-based think tank. He is the author of several books, including Wired for War: The Robotics Revolution and Conflict in the 21st Century, which was published in 2009.

Singer is a frequent commentator on the future of


#### 3. Calculate 'perplexity'

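Perplexity measures how well the LLM predicts a given sequence of words. A lower value means the model finds the text less surprising, which is why text seen during training tends to score lower than unseen text.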
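As a rough illustration of the definition, perplexity is the exponential of the average negative log-likelihood per token. The sketch below assumes you already have the probability the model assigned to each token of the text. `perplexity_from_token_probs` is a hypothetical helper written for this note, not the course's `calculatePerplexity`, whose internals are not shown in this lesson, and the probabilities are made up.

```python
import math


def perplexity_from_token_probs(token_probs):
    """Perplexity = exp(-(1/N) * sum(log p_i)) over the N tokens of a sequence."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    avg_neg_log_likelihood = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log_likelihood)


# Illustrative, made-up probabilities (not taken from any model).
confident = [0.9, 0.8, 0.95, 0.85]  # the model predicts each token well
uncertain = [0.1, 0.05, 0.2, 0.1]   # the model is surprised by each token

print(f"Perplexity (confident): {perplexity_from_token_probs(confident):.3f}")
print(f"Perplexity (uncertain): {perplexity_from_token_probs(uncertain):.3f}")
```

A model that assigns probability 0.25 to every token has a perplexity of 4: it is as uncertain as if it were choosing uniformly among four options at each step.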

In [5]:
print(f"Prompt: {output}")

# Pass True as the second argument to eval
cen_llm.eval(output, True)

# Use the calculatePerplexity function
cen_perp = calculatePerplexity(cen_llm.get_response_raw())
print(f"Perplexity: {cen_perp:.3f}")

Prompt: . Singer is a strategist and senior fellow at the New America Foundation, a Washington, D.C.-based think tank. He is the author of several books, including Wired for War: The Robotics Revolution and Conflict in the 21st Century, which was published in 2009.

Singer is a frequent commentator on the future of
Perplexity: 3.292


* Calculate perplexity with other examples.

In [6]:
# Training data found on the web 
prompt_text = "With the cold weather setting in and the " \
            "stress of the Christmas holiday approaching"

# Pass True as the second argument to eval
cen_llm.eval(prompt_text, True)

cen_perp = calculatePerplexity(cen_llm.get_response_raw())
print(f"Perplexity (in-dataset): {cen_perp:.3f}")

Perplexity (in-dataset): 12.323


In [7]:
# Headline of an article from the Guardian
prompt_text = "No evidence foreign students are abusing " \
              "UK graduate visas, review finds"
cen_llm.eval(prompt_text, True)
cen_perp = calculatePerplexity(cen_llm.get_response_raw())
print(f"Perplexity (out-of-dataset): {cen_perp:.3f}")

Perplexity (out-of-dataset): 68.943


#### 4. Load the LLMs

* Load the pre-trained LLM and compare its perplexity with that of the centrally fine-tuned LLM.

In [8]:
pre_llm = LLM_pretrained()

prompt_text = "With which class of antimicrobials is Aztreonam "\
              "particularly synergistic?"
cen_llm.eval(prompt_text, True)
cen_perp = calculatePerplexity(cen_llm.get_response_raw())

pre_llm.eval(prompt_text, True)
pre_perp = calculatePerplexity(pre_llm.get_response_raw())

print(f"Normalised perplexity: {cen_perp/pre_perp:.3f} ")

Normalised perplexity: 0.811 
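The cell above divides the fine-tuned model's perplexity by the pre-trained model's perplexity on the same text. A ratio below 1 means fine-tuning made the text less surprising than it was for the baseline, which a membership inference attack reads as a hint that the text was in the fine-tuning data. The sketch below shows that decision under stated assumptions: `normalised_perplexity` and `looks_like_member` are hypothetical helpers, the threshold of 1.0 is an assumption rather than the value used by the course utilities, and the perplexities are made up.

```python
def normalised_perplexity(fine_tuned_perp, pre_trained_perp):
    """Ratio of fine-tuned to pre-trained perplexity on the same text."""
    if pre_trained_perp <= 0:
        raise ValueError("pre-trained perplexity must be positive")
    return fine_tuned_perp / pre_trained_perp


def looks_like_member(ratio, threshold=1.0):
    """Flag the text as a likely training member when the ratio is below the threshold.

    The default threshold is an assumption made for this illustration.
    """
    return ratio < threshold


# Made-up perplexities for a single prompt.
ratio = normalised_perplexity(fine_tuned_perp=8.0, pre_trained_perp=10.0)
print(f"Normalised perplexity: {ratio:.3f} --> Membership: {looks_like_member(ratio)}")
```

Dividing by the pre-trained perplexity matters because some text is easy for any model to predict. The ratio isolates the change caused by fine-tuning.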


* Load the LLM fine-tuned with federated learning (FL).

In [9]:
fl_llm = LLM_fl()

In [10]:
prompt_list = [
    "Among all branchial arches, which arch gives rise "\
    "to the stylohyoid muscle and stylohyoid ligament?",
    "With which class of antimicrobials is Aztreonam "\
    "particularly synergistic?",
    "What type of stain can be used when performing "\
    "Immunohistochemistry to identify neuroblastomas, "\
    "medulloblastomas, and retinoblastomas?",
]

In [11]:
# Compare the centrally fine-tuned model with the federated + DP fine-tuned model
print("Analysis Centrally Finetuned model:")
MIA_test(cen_llm, prompt_list)
print("Analysis Federated + DP finetuned model:")
MIA_test(fl_llm, prompt_list)

Analysis Centrally Finetuned model:
Normalised Perplexity: 0.83 -->  Membership: True
Normalised Perplexity: 0.81 -->  Membership: True
Normalised Perplexity: 0.85 -->  Membership: True
Analysis Federated + DP finetuned model:
Normalised Perplexity: 0.83 -->  Membership: True
Normalised Perplexity: 1.00 -->  Membership: False
Normalised Perplexity: 1.00 -->  Membership: True


* Try a larger experiment (set a new configuration).

In [17]:
# Set new configuration
mia_cfg = get_config("mia")

print_config(mia_cfg)

fl_model: EleutherAI/pythia-70m
cen_model: EleutherAI/pythia-70m
pre_trained_model: EleutherAI/pythia-70m
key_name: input
quantization: 0
device: cuda
positive_dataset:
  name: medalpaca/medical_meadow_medical_flashcards
  split: train[0:10]
  renames:
  - - output
    - response
  kwargs: {}
negative_dataset:
  name: bigbio/pubmed_qa
  split: validation[0:10]
  renames:
  - - CONTEXTS
    - input
  kwargs: {}



**Note**: You will be prompted to use the custom code. Please type "y".

In [18]:
# Load the models, the tokenizer, and the evaluation data for the larger experiment
(fl_fine_tuned_model,
 cen_fine_tuned_model,
 pre_trained_model, tokenizer) = load_model(mia_cfg)

data = get_evaluation_data(mia_cfg)

Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.


Loaded positive dataset


Warning: Negative Dataset does not contain output column

* Plot the ROC curve to analyse the results.

In [19]:
plot_mia_results(data,
    fl_fine_tuned_model,
    cen_fine_tuned_model,
    pre_trained_model,
    tokenizer,
    mia_cfg.key_name)

NameError: name 'data' is not defined
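For readers unfamiliar with ROC analysis, the sketch below shows how a ROC curve and its area (AUC) can be computed for a membership inference attack. It is not the implementation behind `plot_mia_results`, which is not shown in this lesson. `roc_points` and `auc` are hypothetical helpers, the normalised perplexities are made up, and the score is taken to be the negated ratio so that a higher score means "more likely a member".

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) pairs for every threshold.

    scores: higher means more likely to be a training member.
    labels: 1 for true members, 0 for non-members.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one member and one non-member")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        # Samples with the same score cross the threshold together.
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1]:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((fp / neg, tp / pos))
        i = j
    return points


def auc(points):
    """Area under the ROC curve using the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Made-up normalised perplexities: members first, then non-members.
ratios = [0.81, 0.83, 0.85, 0.99, 1.02, 1.10]
labels = [1, 1, 1, 0, 0, 0]
scores = [-r for r in ratios]  # lower ratio -> higher membership score

print(f"AUC: {auc(roc_points(scores, labels)):.2f}")
```

An AUC near 1 means the attack separates members from non-members almost perfectly, while an AUC near 0.5 means it does no better than guessing. A privacy-preserving fine-tuning method should push the attack toward the latter.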

* Explore more examples when using 7B Mistral.

In [None]:
extraction = load_extractions(mia_cfg.key_name)
extraction.eval()
extraction.show('url')

In [None]:
extraction.show('email')