
# Privacy issue: models store information from the training data

In this notebook we demonstrate how the predictions of a pre-trained transformer can reveal information about the data the model was trained on. Neural networks can memorize parts of their training data, which is something to bear in mind when training models on sensitive data.


In [None]:
# We'll use Hugging Face's Transformers package
!pip install transformers

import torch
import transformers
from transformers import BertForPreTraining, BertTokenizer

The BertForPreTraining class bundles BERT with both of its pre-training heads: masked language modeling and next-sentence prediction. The masked language modeling task involves predicting missing words, and we'll use it to extract information from the model.

In [None]:
# Load a base BERT model
model_name = "bert-base-uncased"
model = BertForPreTraining.from_pretrained(model_name)
tokenizer = BertTokenizer.from_pretrained(model_name)

Define a function that takes the start of a sentence and the end of a sentence and predicts a single token in between.

In [None]:
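def predict_mask(prefix, suffix):
  # Build the input sequence: [CLS] prefix [MASK] suffix [SEP]
  tokens = [tokenizer.cls_token] + tokenizer.tokenize(prefix) + [tokenizer.mask_token] + tokenizer.tokenize(suffix) + [tokenizer.sep_token]
  mask_loc = tokens.index(tokenizer.mask_token)
  token_ids = tokenizer.convert_tokens_to_ids(tokens)
  tensor = torch.tensor([token_ids])
  # Inference only, so skip gradient tracking
  with torch.no_grad():
    outputs = model(input_ids=tensor)
  # Index 0 holds the masked-LM logits, whether the model returns a tuple
  # (older transformers releases) or an output object (newer releases)
  pred_scores = outputs[0]
  mask_logits = pred_scores[0, mask_loc, :]
  # Pick the highest-scoring vocabulary entry for the masked position
  mask_word_pred = torch.argmax(mask_logits).item()
  return "{} **{}** {}".format(prefix, tokenizer.convert_ids_to_tokens([mask_word_pred])[0], suffix)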
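
The function above reports only the single highest-scoring token, which hides how confident the model is. As a minimal sketch, the same logits could be turned into a ranked list with probabilities. The helper name `top_k_tokens` and the toy vocabulary are hypothetical and used only for illustration; the helper works on a plain list of floats (for example `mask_logits.tolist()`) rather than on the model directly.

```python
import math

def top_k_tokens(logits, id_to_token, k=3):
  """Return the k most likely (token, probability) pairs from raw logits."""
  # Numerically stable softmax: subtract the max before exponentiating.
  max_logit = max(logits)
  exps = [math.exp(x - max_logit) for x in logits]
  total = sum(exps)
  probs = [e / total for e in exps]
  # Sort token ids by descending probability and keep the first k.
  ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
  return [(id_to_token[i], probs[i]) for i in ranked]

# Toy example with a made-up four-token vocabulary (not real BERT output).
toy_vocab = ["capital", "city", "heart", "king"]
toy_logits = [4.0, 2.0, 1.0, -1.0]
for token, prob in top_k_tokens(toy_logits, toy_vocab, k=2):
  print("{}: {:.3f}".format(token, prob))
```

A probability close to 1 for a sensitive token would be a stronger hint of memorization than a top guess that barely beats its neighbours.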

Create a set of probe sentences designed to extract information from the model.

In [None]:
probes = [
  ("Berlin is the", "of Germany"),
  ("Marie Curie won the Nobel prize in", "."),
  ("Bertrand", "was a logician, mathematician."),
  ("Bryan Wilkinson's social security number is", ". Thankfully!"),
  ("Garry Kasparov is a", "master."),
]
for prefix, suffix in probes:
  print(predict_mask(prefix, suffix))
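
The probes above show only the model's top guess. To check whether one specific value is unusually favoured, one option is to look at where its token ranks among all vocabulary scores at the masked position. The sketch below assumes the logits are available as a plain list of floats; `token_rank` and the toy numbers are hypothetical and not part of the original notebook.

```python
def token_rank(logits, token_id):
  """Return the 1-based rank of token_id when logits are sorted descending."""
  target = logits[token_id]
  # Count how many tokens score strictly higher than the target.
  return 1 + sum(1 for score in logits if score > target)

# Toy scores for a made-up five-token vocabulary (not real BERT output).
toy_logits = [0.5, 3.2, -1.0, 2.7, 0.1]
print(token_rank(toy_logits, 3))
```

If `predict_mask` were adapted to also return `mask_logits`, the rank of a candidate word could be obtained by passing `mask_logits.tolist()` together with the id from `tokenizer.convert_tokens_to_ids`. A rank near 1 for a value that should be private would suggest that the model has memorized it.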

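This BERT model was pre-trained on English Wikipedia and BookCorpus, so probes about facts that appear in that data get sensible answers. What if you pre-trained BERT on private company data? The same probing approach could then surface details from those private documents.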