## Understanding Membership Inference Attacks in TAPAS

Exploring the `TAPAS` library with the generator `DP-CGAN`

Parameters are chosen so that the notebook runs quickly.

In [1]:
import pandas as pd 
import os 
import numpy as np 

import tapas.datasets
import tapas.generators
import tapas.threat_models
import tapas.attacks
import tapas.report

from src.helpers import load_generator, load_tabular_dataset


### Setup 

Define some parameters

In [2]:
datapath = "../datasets"
dataset_name = "Adult"
model_name = "DPCGANS"
file = f"Real/real_{dataset_name.lower()}_data.csv"

schema = "data_schemas/adult.json"
executable_generator = "src/generator_dp_cgans.py"


# make some restrictions to speed up training
N_subsample = 500 
# keep only these columns for faster training. Needs to keep columns in the same order
# columns_to_keep = ["age", "education", "marital-status", "occupation", "race", "sex", "label"]
columns_to_keep = ["age", "occupation", "race", "label"] # this is much faster than the above

np.random.seed(1)


Notes
- I created a JSON file with the data schema in `data_schemas/adult.json`. I am not sure that "countable" is the right data type for every column, but none of the input columns appears to be continuous or to contain decimals. See the TAPAS documentation: https://privacy-sdg-toolbox.readthedocs.io/en/latest/dataset-schema.html
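
The claim that no column contains decimals can be checked directly. The following is a minimal sketch on a toy DataFrame; the helper `has_decimals` and the toy values are hypothetical and not part of TAPAS or the Adult data.

```python
import pandas as pd


def has_decimals(column: pd.Series) -> bool:
    """Return True if a numeric column holds at least one non-integer value."""
    numeric = pd.to_numeric(column, errors="coerce").dropna()
    return bool(((numeric % 1) != 0).any())


# Toy stand-in for the real data (illustrative values only).
toy = pd.DataFrame({
    "age": [60, 35, 41],
    "hours-per-week": [40, 37.5, 20],
    "occupation": ["Protective-serv", "Sales", "Tech-support"],
})

# Only numeric columns can be continuous, so restrict the check to them.
numeric_columns = toy.select_dtypes(include="number").columns
report = {name: has_decimals(toy[name]) for name in numeric_columns}
print(report)
```

Columns that come back `False` are integer-valued, which is at least consistent with a countable type in the schema.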

Load the data and the generator

In [3]:
data = load_tabular_dataset(
    filename_data=os.path.join(datapath, dataset_name, file),
    filename_schema=schema,
    N_subsample=N_subsample,
    columns_to_keep=columns_to_keep
)

generator = load_generator(executable_generator, data=data)

### Specifying the threat model

We need to define:
- Which record's membership is under attack.
- What data the adversary has, i.e., which records, how many of them, and how many samples they have access to.
- What the adversary knows about the synthetic data generator.

This will all be collected in a `threat_model` object. We are then able to use the same threat model for different attacks.

First, we need to separate the target record from the original data. We will generate synthetic datasets based on training data that either do or do not include the target record. The attacks then try to distinguish between the two classes of synthetic datasets.

In [4]:
attack_ids = [0]
target_record = data.get_records(attack_ids)
display(target_record.data)

| Unnamed: 0 | age | occupation | race | label |
|---|---|---|---|---|
| 12491 | 60 | Protective-serv | White | <=50K |


In [5]:
data.drop_records(attack_ids, in_place=True)

We need to define the sample sizes:
- `n_records_training`: number of records from the true data distribution that the adversary has available
- `n_records_synth`: number of records in each generated synthetic dataset
- In a real use case, these can be in the thousands or far more, depending on the application/scenario we have in mind

In [6]:
n_records_training = 10
n_records_synth = 10 

# because we will use the ExactKnowledge scenario, we provide the exact data set of the size `n_records_training`. 
# If we used `AuxiliaryDataKnowledge`, we could specify these things in the definition of the threat model
specific_data = data.sample(n_samples=n_records_training)


Now we can specify the threat model. We need to specify:
1. Data knowledge: the attacker knows the dataset used to train the generator and is only unsure whether the target record was included (the two candidate datasets differ by one row). This links directly to the definition of differential privacy. How does this impact a given attack?
    - Judging from both its use in `TargetedMIA` and the class definition of `ExactDataKnowledge`, `generate_datasets` is the main method of the `attacker_knowledge` attribute. It just returns `num_samples` copies of the training dataset (which excludes the target record).
    - Another method (`generate_samples`??) then takes those copies and randomly adds the target record to the training dataset.
    - Alternative: `AuxiliaryDataKnowledge`. In this case, `generate_datasets` is more complicated:
        - the data are split into aux data and test data (what are they used for?)
        - the aux or test data are then further split into subsamples as specified
        - note that we cannot reuse the generated synthetic datasets when we change the assumption on data knowledge
2. Black-box knowledge of the generator: the attacker can call the generator and create new synthetic datasets from a given input dataset (defined in data knowledge).
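
To make the exact-knowledge scenario concrete, here is a minimal sketch of the sampling step as described above. It is not the TAPAS implementation: the function `sketch_exact_knowledge` and the toy data are hypothetical, and the real library wraps datasets in its own classes and afterwards passes each dataset through the generator.

```python
import numpy as np
import pandas as pd


def sketch_exact_knowledge(training, target, num_samples, rng):
    """Return copies of `training`, each with `target` appended at random.

    The boolean labels record whether the target was included.
    """
    datasets, labels = [], []
    for _ in range(num_samples):
        include_target = bool(rng.integers(0, 2))  # fair coin flip
        if include_target:
            dataset = pd.concat([training, target], ignore_index=True)
        else:
            dataset = training.copy()
        datasets.append(dataset)
        labels.append(include_target)
    return datasets, labels


# Toy training data known to the attacker, and the target record (illustrative).
training = pd.DataFrame({"age": [35, 41, 52], "race": ["White", "Black", "White"]})
target = pd.DataFrame({"age": [60], "race": ["White"]})

rng = np.random.default_rng(1)
datasets, labels = sketch_exact_knowledge(training, target, num_samples=6, rng=rng)
print(labels)
print([len(d) for d in datasets])
```

Every pair of candidate datasets differs by exactly one row, which is the neighbouring-dataset relation used in the definition of differential privacy.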

#### Notes

This attacker knowledge is interesting because its modularity allows us to represent different scenarios of data sharing. (The thought experiment is: assuming we released a synthetic dataset from a generator, how well does it protect the privacy of the records in the training dataset?)
- Example: hospitals 1 and 2 provide their data (different records) to generate a synthetic dataset of cancer patients. Hospital 1 knows that its records are used in the dataset. The thought experiment is now: given this knowledge, can someone in hospital 1 infer which patients from hospital 2 were included in the training set? In this scenario, the attacker knowledge is exactly the dataset of patients in hospital 1.
- But what can we learn from this? Can we still conclude something about differential privacy in such a setting?


Moreover
- With a given threat model, we can run various attacks. 
- And it's possible (somehow, not exactly sure how) to re-use the synthetic datasets across attacks, and perhaps even across target records (open question) to be computationally efficient.


In [7]:
threat_model = tapas.threat_models.TargetedMIA(
    attacker_knowledge_data=tapas.threat_models.ExactDataKnowledge(
        specific_data),       
    attacker_knowledge_generator=tapas.threat_models.BlackBoxKnowledge(
            generator, num_synthetic_records=n_records_synth,
        ),
    target_record=target_record,
    generate_pairs=False, # TODO: what does this do exactly?
    replace_target=False # TODO: what does this do exactly?
) 

In [8]:
display(threat_model.num_labels)
display(threat_model.target_record.data)


1

| Unnamed: 0 | age | occupation | race | label |
|---|---|---|---|---|
| 12491 | 60 | Protective-serv | White | <=50K |


It is worth tracking the `_memory` attribute here. It is a dictionary with keys `True` and `False`, indicating whether the stored datasets are used for training (`True`) or testing (`False`).
The dictionary is currently empty; since `memorise_datasets` is `True`, it will be populated step by step with the synthetic datasets generated from the data knowledge in the next few steps.

In [9]:
vars(threat_model)

{'atk_know_data': <tapas.threat_models.mia.MIALabeller at 0x7fdf94374490>,
 'atk_know_gen': <tapas.threat_models.attacker_knowledge.BlackBoxKnowledge at 0x7fdf94374610>,
 'memorise_datasets': True,
 'iterator_tracker': tapas.threat_models.attacker_knowledge.SilentIterator,
 '_memory': {True: ([], []), False: ([], [])},
 'num_labels': 1,
 'num_concurrent': 1,
 'multiple_label_mode': False,
 'target_record': <tapas.datasets.dataset.TabularRecord at 0x7fdfad873910>}

### Defining an attack

Now we need to define the method the adversary uses to distinguish synthetic datasets generated with the target record from those generated without it. Many options are possible, and it is important to give substantive reasons why a given attack is favored over others or left out. This also depends on the specific application and on our level of "conservatism" (i.e., which criterion to use in the closest-distance attack).

Let's use the `ClosestDistance` attack with the standard distance (Hamming).
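
As a rough illustration of the idea behind this attack, the sketch below scores a synthetic dataset by the negative of the smallest Hamming distance between the target and any synthetic record. This is an assumption about how the score is built rather than the TAPAS code; the functions `hamming_distance` and `closest_distance_score` and the toy synthetic records are hypothetical.

```python
import pandas as pd


def hamming_distance(record_a: pd.Series, record_b: pd.Series) -> int:
    """Count the attributes on which two records disagree."""
    return int((record_a.to_numpy() != record_b.to_numpy()).sum())


def closest_distance_score(target: pd.Series, synthetic: pd.DataFrame) -> int:
    """Negative distance to the nearest synthetic record (higher = closer)."""
    distances = [hamming_distance(target, row) for _, row in synthetic.iterrows()]
    return -min(distances)


target = pd.Series(
    {"age": 60, "occupation": "Protective-serv", "race": "White", "label": "<=50K"}
)
# Toy synthetic dataset with the same columns (illustrative values only).
synthetic = pd.DataFrame({
    "age": [60, 33, 47],
    "occupation": ["Sales", "Sales", "Protective-serv"],
    "race": ["White", "Black", "Black"],
    "label": ["<=50K", ">50K", ">50K"],
})

score = closest_distance_score(target, synthetic)
print(score)
```

Under this reading, the attack predicts "target was in the training data" when the score is at or above a threshold, and the `criterion` argument determines how that threshold is chosen during training.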

In [10]:
attack = tapas.attacks.ClosestDistanceMIA(criterion="accuracy", label="Closest-Distance")

In [11]:
vars(attack)

{'target_criterion': 'accuracy',
 'positive_label': None,
 'negative_label': None,
 '_threshold': None,
 'distance': <tapas.attacks.distances.HammingDistance at 0x7fdf950c3b20>,
 '_label': 'Closest-Distance'}

#### We need to define a few more parameters

What do they mean? (in both cases, 100 seems like a good number for a real check)
- `n_training_datasets`: Number of data sets the adversary uses to train the attack
- `n_testing_datasets`: Number of data sets to evaluate the attack
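
One way to see why around 100 datasets is more reasonable than around 10: the estimated accuracy is a proportion over the test datasets, so its standard error shrinks with the square root of their number. A minimal sketch, assuming independent test datasets and using the hypothetical helper `accuracy_standard_error`:

```python
import math


def accuracy_standard_error(accuracy: float, n_datasets: int) -> float:
    """Binomial standard error of an accuracy estimated on n_datasets."""
    return math.sqrt(accuracy * (1 - accuracy) / n_datasets)


# Worst case is accuracy = 0.5, where the variance of a proportion peaks.
for n in (12, 100, 1000):
    print(n, round(accuracy_standard_error(0.5, n), 3))
```

With a dozen test datasets the standard error is roughly 0.14, so an accuracy estimate is compatible with anything from a useless to a fairly strong attack.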

In [12]:
n_training_datasets = 10
n_testing_datasets = 12 # to highlight the distinction below 

### Training and testing the attack

#### Train

In [13]:
attack.train(threat_model, num_samples=n_training_datasets) # that's short, 55.2 seconds with few columns

Now inspect the threat model:
- `._memory[True]` now holds `n_training_datasets` datasets, while `._memory[False]` is still empty

In [14]:
vars(attack.threat_model)

{'atk_know_data': <tapas.threat_models.mia.MIALabeller at 0x7fdf94374490>,
 'atk_know_gen': <tapas.threat_models.attacker_knowledge.BlackBoxKnowledge at 0x7fdf94374610>,
 'memorise_datasets': True,
 'iterator_tracker': tapas.threat_models.attacker_knowledge.SilentIterator,
 '_memory': {True: ([<tapas.datasets.dataset.TabularDataset at 0x7fdfe83aacd0>,
    <tapas.datasets.dataset.TabularDataset at 0x7fdfe83aab80>,
    <tapas.datasets.dataset.TabularDataset at 0x7fdf943a9e80>,
    <tapas.datasets.dataset.TabularDataset at 0x7fdf943a9ee0>,
    <tapas.datasets.dataset.TabularDataset at 0x7fdf943a9460>,
    <tapas.datasets.dataset.TabularDataset at 0x7fdf943a9520>,
    <tapas.datasets.dataset.TabularDataset at 0x7fdfc94c5b80>,
    <tapas.datasets.dataset.TabularDataset at 0x7fdfc94c5e80>,
    <tapas.datasets.dataset.TabularDataset at 0x7fdfc94c5550>,
    <tapas.datasets.dataset.TabularDataset at 0x7fdfc94c5340>],
   [True, False, True, False, True, False, False, True, False, False]),
  Fals

#### Test the attack

In [15]:
attack_summary = threat_model.test(attack, num_samples=n_testing_datasets) # 100 takes 10 minutes. 10 takes 40 seconds


Now there are `n_testing_datasets` datasets in `._memory[False]`. They are generated during testing the attack above.


In [16]:
vars(attack.threat_model) # so now `._memory[False]` also has `num_samples` datasets


{'atk_know_data': <tapas.threat_models.mia.MIALabeller at 0x7fdf94374490>,
 'atk_know_gen': <tapas.threat_models.attacker_knowledge.BlackBoxKnowledge at 0x7fdf94374610>,
 'memorise_datasets': True,
 'iterator_tracker': tapas.threat_models.attacker_knowledge.SilentIterator,
 '_memory': {True: ([<tapas.datasets.dataset.TabularDataset at 0x7fdfe83aacd0>,
    <tapas.datasets.dataset.TabularDataset at 0x7fdfe83aab80>,
    <tapas.datasets.dataset.TabularDataset at 0x7fdf943a9e80>,
    <tapas.datasets.dataset.TabularDataset at 0x7fdf943a9ee0>,
    <tapas.datasets.dataset.TabularDataset at 0x7fdf943a9460>,
    <tapas.datasets.dataset.TabularDataset at 0x7fdf943a9520>,
    <tapas.datasets.dataset.TabularDataset at 0x7fdfc94c5b80>,
    <tapas.datasets.dataset.TabularDataset at 0x7fdfc94c5e80>,
    <tapas.datasets.dataset.TabularDataset at 0x7fdfc94c5550>,
    <tapas.datasets.dataset.TabularDataset at 0x7fdfc94c5340>],
   [True, False, True, False, True, False, False, True, False, False]),
  Fals

More notes
- When a new attack is run on the same threat model (and `memorise_datasets` is `True`), any existing training/testing datasets are reused.
- If an attack requires more datasets than are stored for either training or testing, only the missing datasets are generated by calling the generator again.
- An implication is that there is a "convenient parallelism" across threat models, but not across attacks on the same threat model.
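
The reuse behaviour described in these notes can be sketched with a small cache. This is a simplified illustration and not the TAPAS code: the class `DatasetCache` is hypothetical, and it stores only the datasets, whereas the real `_memory` shown above holds a tuple of datasets and labels per key.

```python
class DatasetCache:
    """Store generated datasets and only generate the ones that are missing."""

    def __init__(self, generate_one):
        self._generate_one = generate_one          # stand-in for the generator
        self._memory = {True: [], False: []}       # True: training, False: testing
        self.generator_calls = 0

    def get(self, num_samples, training):
        stored = self._memory[training]
        while len(stored) < num_samples:           # top up only what is missing
            stored.append(self._generate_one())
            self.generator_calls += 1
        return stored[:num_samples]


cache = DatasetCache(generate_one=lambda: "synthetic dataset")

cache.get(10, training=True)    # first attack trains: 10 generator calls
cache.get(10, training=True)    # second attack reuses them: no new calls
cache.get(12, training=True)    # a larger request adds only the 2 missing
cache.get(12, training=False)   # testing uses a separate store: 12 new calls
print(cache.generator_calls)
```

Because the stored datasets are shared, attacks on the same threat model have to be run one after another if they are to benefit from the cache.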

#### Displaying the results

This is not meaningful with the small samples we have 

In [17]:
display(attack_summary.scores) # what do the scores mean? try with smaller samples?
display(len(attack_summary.scores)) # these are the number of samples in the test()
display(attack_summary.labels) # I guess these are the indicators for whether the dataset contains the record or not?


type(attack_summary)
attack_summary.predictions # so this explains why the FPR and TPR are 0. How can I change it?

array([-1., -1., -2., -2., -2., -1., -1., -2., -1., -1., -1., -2.])

12

array([0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0])

array([1, 1, 0, 0, 0, 1, 1, 0, 1, 1, 1, 0])

In [18]:
metrics = attack_summary.get_metrics() # but then I get again this RuntimeWarning: invalid value encountered in divide
#   return np.log(max(np.max(tp / fp), np.max((1 - fp) / (1 - tp)))) -- but the TP/FP below are not 

In [19]:
display(metrics)

| Unnamed: 0 | dataset | target_id | generator | attack | accuracy | true_positive_rate | false_positive_rate | mia_advantage | privacy_gain | auc | effective_epsilon |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Unnamed dataset (EXACT) | 0 | src/generator_dp_cgans.py | Closest-Distance | 0.75 | 1.0 | 0.375 | 0.625 | 0.375 | 0.8125 | inf |
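
The numbers in this table can be reproduced by hand from the `scores`, `labels` and `predictions` arrays printed above. The sketch below uses plain NumPy. Three things in it are assumptions that happen to match the table, not confirmed TAPAS definitions: `mia_advantage` as TPR minus FPR, `privacy_gain` as one minus the advantage, and the effective epsilon formula from the warning message applied to the single operating point.

```python
import numpy as np

# Arrays copied from the output of attack_summary above.
scores = np.array([-1., -1., -2., -2., -2., -1., -1., -2., -1., -1., -1., -2.])
labels = np.array([0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0])
predictions = np.array([1, 1, 0, 0, 0, 1, 1, 0, 1, 1, 1, 0])

accuracy = np.mean(predictions == labels)
tpr = np.mean(predictions[labels == 1])   # share of "in" datasets flagged as in
fpr = np.mean(predictions[labels == 0])   # share of "out" datasets flagged as in
advantage = tpr - fpr                     # assumed definition of mia_advantage
privacy_gain = 1 - advantage              # assumed definition of privacy_gain

# AUC as the probability that a positive outscores a negative (ties count half).
diff = scores[labels == 1][:, None] - scores[labels == 0][None, :]
auc = np.mean(diff > 0) + 0.5 * np.mean(diff == 0)

# With TPR = 1 the second ratio divides by zero, hence the infinite epsilon.
with np.errstate(divide="ignore"):
    effective_epsilon = np.log(max(tpr / fpr, (1 - fpr) / (1 - tpr)))

print(accuracy, tpr, fpr, advantage, privacy_gain, auc, effective_epsilon)
```

The infinite `effective_epsilon` is therefore an artefact of the tiny test set: with only four datasets that contain the target, a true positive rate of exactly 1 is easy to reach by chance.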



### Open questions 
- When using auxiliary data knowledge, where and how exactly are `aux_data` and `test_data` used? I could not immediately see it in the code.
- How does it work with multiple target records?

**Multiple target records**

I don't understand how this works. 

We can pass more than one target in `target_record`. While the resulting object's target record is always the first record, the `threat_model` object has an attribute `_target_records` that stores all the provided targets. I guess the idea is that we can reuse the same threat model across target records (with the same generated synthetic datasets?).

In other words, all the `_target_records` are excluded from the dataset that is passed to the synthetic data generator. For an attack, we then use one of the `_target_records` and add it randomly to some of the training datasets that are used to train the attack. (Recall that a training dataset is one synthetic dataset generated with or without the target record, by chance.)
This is handy because we don't have to call the generator multiple times when using the same threat model for multiple records. But, if my understanding above is correct, this can also bias the generated synthetic datasets, because all of the target records are excluded from the training data given to the generator.

But I am wondering whether this is statistically sound. I.e., could the audit results for target record $x_0$ differ depending on whether the *other* target records $x_1$, $x_2$, ... are dropped from the training data or not? Under which assumptions? At which sample sizes?

The documentation says something about this (it's related to the parameters `replace_target` and `generate_pairs` above, so check those out).
