# Privacy risk of releasing multiple models

In this analysis we explore the privacy leakage of releasing multiple models.
Each model is part of a compute plan (CP), which defines a specific hyperparameter setting.
Thus models that belong to a compute plan all share the same hyperparameters (they only differ in the number of training epochs).
Here we simulate the compute plans by choosing a random learning rate.
In most cases the attack is a random forest classifier trained to distinguish trunk activation values computed on training data from those computed on non-training data; a gradient boosting classifier is also evaluated in Baseline 1.
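
The attack can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the code behind the reported results: the activation vectors, their dimension (`n_features`), the shift between member and non-member distributions, the 33% hold-out split, and the helper name `run_attack` are all assumptions made for the example. It relies on scikit-learn, which the notebook itself does not import.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split


def run_attack(member_acts, nonmember_acts, n_estimators=100, seed=1):
    """Train a random forest to tell member from non-member activations.

    Returns the confusion-matrix counts (TN, FP, FN, TP) on a held-out split.
    """
    X = np.vstack([member_acts, nonmember_acts])
    # Label 1 = activations computed on training data, 0 = non-training data.
    y = np.concatenate([np.ones(len(member_acts)), np.zeros(len(nonmember_acts))])
    # Hold-out ratio is an assumption for this sketch.
    X_fit, X_eval, y_fit, y_eval = train_test_split(
        X, y, test_size=0.33, stratify=y, random_state=seed
    )
    clf = RandomForestClassifier(n_estimators=n_estimators, random_state=seed)
    clf.fit(X_fit, y_fit)
    counts = confusion_matrix(y_eval, clf.predict(X_eval), labels=[0, 1]).ravel()
    tn, fp, fn, tp = (int(c) for c in counts)
    return tn, fp, fn, tp


# Synthetic stand-in for trunk activations (hypothetical data).
rng = np.random.default_rng(0)
n_samples, n_features = 500, 32
members = rng.normal(loc=0.2, scale=1.0, size=(n_samples, n_features))
nonmembers = rng.normal(loc=0.0, scale=1.0, size=(n_samples, n_features))

tn, fp, fn, tp = run_attack(members, nonmembers)
attack_accuracy = (tn + tp) / (tn + fp + fn + tp)
print("attack accuracy:", attack_accuracy)
```

An accuracy close to 0.5 on a balanced evaluation set means the classifier cannot separate the two groups; values above 0.5 indicate membership leakage.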

*Note*: The `num_models` column represents the number of compute plans (CPs) that were trained for the evaluation.

This notebook contains four evaluation scenarios; each reports attack accuracy on trunks with `6000` and `8000` outputs:
- Epochs 12, 14, 16
- Baseline 1
- Baseline 2
- Baseline 3

#### `Authors`
```
Mina Remeli
Dorottya Futóné Papp 
Szilvia Lestyán 
```

In [68]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

### Epochs 12, 14, 16
Epochs 12, 14 and 16 of each compute plan (CP) are released to the attacker.

In [83]:
df_e = pd.read_csv("MDY_CP_trunk_12_14_16.csv")

In [85]:
df_e.head()

Unnamed: 0,num_models,num_samples,attack_type,num_epochs,n_estimators,best_models,intermediate_models,attacked_epochs,rounds,input_size,...,compression,compression_parameter,seed,TN,FP,FN,TP,accuracy,precision,recall
0,3,500,rf,300,100,False,False,"[12, 14, 16]",1000,32000,...,,,1,96,69,54,111,0.627273,0.616667,0.672727
1,3,500,rf,300,100,False,False,"[12, 14, 16]",1000,32000,...,,,2,92,73,56,109,0.609091,0.598901,0.660606
2,3,500,rf,300,100,False,False,"[12, 14, 16]",1000,32000,...,,,3,94,71,58,107,0.609091,0.601124,0.648485
3,3,500,rf,300,100,False,False,"[12, 14, 16]",1000,32000,...,,,4,88,77,57,108,0.593939,0.583784,0.654545
4,3,500,rf,300,100,False,False,"[12, 14, 16]",1000,32000,...,,,5,94,71,52,113,0.627273,0.61413,0.684848
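
The `accuracy`, `precision` and `recall` columns are consistent with the standard definitions computed from the confusion-matrix counts `TN`, `FP`, `FN` and `TP`. As a sanity check, the sketch below recomputes them for the first row of the table above (the helper name `attack_metrics` is ours, not part of the notebook).

```python
def attack_metrics(tn, fp, fn, tp):
    """Return (accuracy, precision, recall) from confusion-matrix counts."""
    total = tn + fp + fn + tp
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)  # fraction of predicted members that are members
    recall = tp / (tp + fn)     # fraction of true members that are detected
    return accuracy, precision, recall


# Counts from the first row (seed 1) of the table above.
acc, prec, rec = attack_metrics(tn=96, fp=69, fn=54, tp=111)
print(round(acc, 6), round(prec, 6), round(rec, 6))
# prints: 0.627273 0.616667 0.672727
```

In the rows shown, the evaluation set is balanced (165 members and 165 non-members), so an accuracy of 0.5 corresponds to random guessing.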


In [88]:
gb_e = df_e.groupby(['hidden_sizes'])

In [89]:
gb_e.accuracy.agg(['mean', 'count'])

hidden_sizes,mean,count
[6000],0.614242,10
[8000],0.623333,10


### Baseline 1
In Baseline 1, the last model of each compute plan (CP), i.e. the model from the final training epoch, is released to the attacker.

In [69]:
df = pd.read_csv("MDY_CP_trunk.csv")

In [70]:
df.head()

Unnamed: 0,num_models,num_samples,attack_type,num_epochs,n_estimators,best_models,intermediate_models,rounds,input_size,batch_ratio,...,compression,compression_parameter,seed,TN,FP,FN,TP,accuracy,precision,recall
0,3,500,rf,300,100,False,False,1000,32000,0.02,...,,,1,92,73,79,86,0.539394,0.540881,0.521212
1,3,500,gb,300,100,False,False,1000,32000,0.02,...,,,1,78,87,77,88,0.50303,0.502857,0.533333
2,3,500,rf,300,100,False,False,1000,32000,0.02,...,,,2,93,72,79,86,0.542424,0.544304,0.521212
3,3,500,gb,300,100,False,False,1000,32000,0.02,...,,,2,87,78,72,93,0.545455,0.54386,0.563636
4,3,500,rf,300,100,False,False,1000,32000,0.02,...,,,3,97,68,79,86,0.554545,0.558442,0.521212


In [71]:
df.columns

Index(['num_models', 'num_samples', 'attack_type', 'num_epochs',
       'n_estimators', 'best_models', 'intermediate_models', 'rounds',
       'input_size', 'batch_ratio', 'hidden_sizes', 'first_dropout',
       'middle_dropout', 'last_dropout', 'weight_decay', 'last_non_linearity',
       'middle_non_linearity', 'non_linearity', 'input_transform', 'lr_alpha',
       'lr_steps', 'input_size_freq', 'uncertainty_weights', 'compression',
       'compression_parameter', 'seed', 'TN', 'FP', 'FN', 'TP', 'accuracy',
       'precision', 'recall'],
      dtype='object')

In [72]:
baseline1 = df[(df['best_models']==False) & (df['intermediate_models']==False)]

In [73]:
gb = baseline1.groupby(['attack_type', 'hidden_sizes', 'num_models'])

In [74]:
gb.accuracy.agg(['mean', 'count'])


attack_type,hidden_sizes,num_models,mean,count
gb,[6000],3,0.531515,10
gb,[8000],3,0.522121,10
rf,[6000],3,0.556667,10
rf,[6000],6,0.6,4
rf,[6000],12,0.606061,6
rf,[6000],18,0.618182,6
rf,[6000],30,0.639394,1
rf,[8000],3,0.531515,10


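With 3 compute plans, the only setting in which both attacks were run, the random forest attack outperforms gradient boosting on both trunk sizes (mean accuracy 0.557 vs. 0.532 for `[6000]` and 0.532 vs. 0.522 for `[8000]`).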

### Baseline 2
In Baseline 2, the best model is released to the attacker in addition to all the models from Baseline 1. The best model is the one with the highest accuracy as measured by one randomly chosen participant; it can come from any compute plan and any epoch.

In [75]:
baseline2 = df[(df['best_models']==True) & (df['intermediate_models']==False)]

In [76]:
gb2 = baseline2.groupby(['attack_type', 'hidden_sizes', 'num_models'])

In [77]:
gb2.accuracy.agg(['mean', 'count'])

attack_type,hidden_sizes,num_models,mean,count
rf,[6000],3,0.613333,10
rf,[6000],6,0.618182,6
rf,[6000],12,0.624242,2
rf,[6000],18,0.607576,2
rf,[8000],3,0.614545,10


In [78]:
gb2.accuracy.mean() / gb.accuracy.mean()

attack_type  hidden_sizes  num_models
gb           [6000]        3                  NaN
             [8000]        3                  NaN
rf           [6000]        3             1.101796
                           6             1.030303
                           12            1.030000
                           18            0.982843
                           30                 NaN
             [8000]        3             1.156214
Name: accuracy, dtype: float64
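
The `NaN` entries in this ratio come from index alignment: dividing two pandas Series yields a value only for labels present in both, and `NaN` for the rest. The sketch below reproduces the effect with a few of the means reported above, hard-coded for illustration. The variable names are ours, and treating `hidden_sizes` as a string label is an assumption.

```python
import pandas as pd

names = ["attack_type", "hidden_sizes", "num_models"]

# Mean accuracies copied from the Baseline 1 table (subset).
base1 = pd.Series(
    [0.531515, 0.556667, 0.600000, 0.639394],
    index=pd.MultiIndex.from_tuples(
        [("gb", "[6000]", 3), ("rf", "[6000]", 3),
         ("rf", "[6000]", 6), ("rf", "[6000]", 30)],
        names=names,
    ),
)
# Mean accuracies copied from the Baseline 2 table (subset).
base2 = pd.Series(
    [0.613333, 0.618182],
    index=pd.MultiIndex.from_tuples(
        [("rf", "[6000]", 3), ("rf", "[6000]", 6)], names=names
    ),
)

ratio = base2 / base1    # aligned on the union of both indexes
print(ratio)             # groups missing from base2 appear as NaN

shared = ratio.dropna()  # keep only groups evaluated in both baselines
print(shared.round(3))
```

Dropping the `NaN` rows leaves only the configurations that were evaluated in both baselines, which are the only ones where the ratio is meaningful.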

### Baseline 3
In Baseline 3, the intermediate models of the best model are released to the attacker in addition to all the models from Baseline 2. The intermediate models are snapshots of the best model taken every two epochs.

In [79]:
baseline3 = df[(df['best_models']==True) & (df['intermediate_models']==True)]

In [80]:
gb3 = baseline3.groupby(['attack_type', 'hidden_sizes', 'num_models'])

In [81]:
gb3.accuracy.agg(['mean', 'count'])

attack_type,hidden_sizes,num_models,mean,count
rf,[6000],3,0.641414,9
rf,[8000],3,0.63266,9


In [82]:
gb3.accuracy.mean() / gb.accuracy.mean()

attack_type  hidden_sizes  num_models
gb           [6000]        3                  NaN
             [8000]        3                  NaN
rf           [6000]        3             1.152241
                           6                  NaN
                           12                 NaN
                           18                 NaN
                           30                 NaN
             [8000]        3             1.190295
Name: accuracy, dtype: float64