# Results of white-box MIAs against VGG16 trained on the CIFAR10 dataset

The specification of this architecture is given in ```../configs/cifar10/vgg16.ini```.

We load results for membership inference attacks (MIAs) trained using different features (specified below as ```activations```, ```gradients```, or ```activations,gradients```), extracted from different layers (e.g., ```fc2```, the second fully connected layer). The features can be extracted from different types of shadow models, listed in ```attacker_accesses``` below. The names are intended to be self-explanatory; see ```shadow_modelling_attack.py``` for details on the different types of shadow models.
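The loading cells in this notebook read one pickle file per repetition from a directory keyed by attacker access, features, and layer. The sketch below restates that lookup as a single function. The function name ```load_test_metrics``` is hypothetical and not part of the repository; the path layout and the ```train_complete``` and ```test_metrics``` keys are taken from the cells in this notebook.

```python
import os
import pickle


def load_test_metrics(experiments_dir, attacker_access, features, layer,
                      num_repetitions=10):
    """Return the test metrics of every completed repetition found on disk.

    Hypothetical helper: it assumes the layout
    <experiments_dir>/<attacker_access>/<features>/<layer>/exp_<i>_model.pickle
    used by the loading cells of this notebook.
    """
    result_dir = os.path.join(experiments_dir, attacker_access, features, layer)
    metrics = []
    for exp in range(num_repetitions):
        path = os.path.join(result_dir, f'exp_{exp}_model.pickle')
        if not os.path.exists(path):
            continue  # no result file for this repetition
        with open(path, 'rb') as f:
            saved_model = pickle.load(f)
        if not saved_model['train_complete']:
            continue  # skip runs whose training did not finish
        metrics.append(saved_model['test_metrics'])
    return metrics
```

A call such as ```load_test_metrics(experiments_dir, "aa-target_dataset", "activations", "fc2")``` would then return a list with at most ```num_repetitions``` entries, one per completed run.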

In [1]:
from collections import defaultdict
import matplotlib.pyplot as plt
import numpy as np
import os
import pickle
import scipy.stats

In [2]:
attacker_accesses = ["aa-target_dataset",
    "aa-shadow_dataset_model_init",
    "aa-shadow_dataset",
    "aa-shadow_dataset-align-bottom_up_weight_matching",
    "aa-shadow_dataset-align-bottom_up_activation_matching",
    "aa-shadow_dataset-align-top_down_weight_matching",
]

In [3]:
roc_results = {attacker_access: 
               {f : defaultdict(list) 
                for f in ['activations', 'gradients', 'activations,gradients']} 
               for attacker_access in attacker_accesses}
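Each entry ```roc_results[attacker_access][features][layer]``` is a list of ```(fpr, tpr)``` pairs, one per repetition. One common way to summarise such a list is to interpolate every curve onto a shared FPR grid and average the TPR values. The sketch below illustrates that idea; ```mean_roc_curve``` is a hypothetical helper and not necessarily how the authors aggregate the curves, and it assumes each ```fpr``` array is non-decreasing and spans 0 to 1.

```python
import numpy as np


def mean_roc_curve(curves, num_points=101):
    """Average several (fpr, tpr) curves on a shared FPR grid.

    Hypothetical helper. Assumes every fpr array is sorted in
    non-decreasing order, as np.interp requires increasing x values.
    """
    fpr_grid = np.linspace(0.0, 1.0, num_points)
    # Resample each curve at the same FPR positions so they can be averaged.
    tprs = np.array([np.interp(fpr_grid, fpr, tpr) for fpr, tpr in curves])
    return fpr_grid, tprs.mean(axis=0), tprs.std(axis=0)
```

The returned mean and standard deviation can be passed to ```plt.plot``` and ```plt.fill_between``` to draw an averaged curve with a variability band.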

In [4]:
experiments_dir = '../experiments/cifar10/attack/vgg16/attack_results'
num_repetitions = 10

for features in ['activations', 'gradients']:
    print(f'\nFeatures: {features} \n')

    if features == "activations":
        target_layers = ["fc3", "fc3-ia", "fc3-ia-only", "fc2", "fc1"]
    elif features == 'gradients':
        target_layers = ["fc3", "fc2", "fc1"]

    for attacker_access in attacker_accesses:
        print(f'\nAttacker access: {attacker_access}')
        results = {layer: {"test_acc": [], "test_auc": [], "best_test_acc": []} for layer in target_layers}

        for layer in target_layers:
            num_repetitions_found = 0
            for exp in range(num_repetitions):
                result_dir = os.path.join(experiments_dir, attacker_access, features, layer)
                saved_model_path = os.path.join(result_dir, f'exp_{exp}_model.pickle')
                if not os.path.exists(saved_model_path):
                    continue
                with open(saved_model_path, 'rb') as f:
                    saved_model = pickle.load(f)
                    if not saved_model['train_complete']:
                        continue
                    test_metrics = saved_model['test_metrics']
                test_acc = test_metrics['acc']
                test_auc = test_metrics['auc']
                results[layer]['test_auc'].append(test_auc)
                results[layer]['test_acc'].append(test_acc)
                roc_results[attacker_access][features][layer].append( 
                    (test_metrics['fpr'], test_metrics['tpr']) )
                num_repetitions_found += 1
            if num_repetitions_found == 0:
                continue
            mean_test_auc, std_test_auc = np.mean(results[layer]['test_auc']), np.std(results[layer]['test_auc'])
            mean_test_acc, std_test_acc = np.mean(results[layer]['test_acc']), np.std(results[layer]['test_acc'])
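            # Half-width of the 95% confidence interval for the mean AUC (Student's t, n-1 degrees of freedom).
            h_test_auc = std_test_auc * scipy.stats.t.ppf((1 + 0.95) / 2., num_repetitions_found-1) / (num_repetitions_found**0.5)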
            print(f'Layer {layer}: {num_repetitions_found} experiments. ', 
                f'Test auc {mean_test_auc:.3f} ({h_test_auc:.3f}), test acc: {mean_test_acc:.1%} ({std_test_acc:.1%})')


Features: activations 


Attacker access: aa-target_dataset
Layer fc3: 10 experiments.  Test auc 0.637 (0.013), test acc: 60.6% (0.4%)
Layer fc3-ia: 10 experiments.  Test auc 0.681 (0.005), test acc: 62.3% (0.5%)
Layer fc3-ia-only: 10 experiments.  Test auc 0.680 (0.007), test acc: 62.1% (0.7%)
Layer fc2: 10 experiments.  Test auc 0.686 (0.005), test acc: 62.7% (0.7%)
Layer fc1: 10 experiments.  Test auc 0.682 (0.006), test acc: 62.7% (0.7%)

Attacker access: aa-shadow_dataset_model_init
Layer fc3: 10 experiments.  Test auc 0.640 (0.009), test acc: 60.8% (0.8%)
Layer fc3-ia: 10 experiments.  Test auc 0.664 (0.006), test acc: 61.5% (0.7%)
Layer fc3-ia-only: 10 experiments.  Test auc 0.665 (0.006), test acc: 61.7% (0.5%)
Layer fc2: 10 experiments.  Test auc 0.670 (0.008), test acc: 62.1% (0.7%)
Layer fc1: 10 experiments.  Test auc 0.651 (0.011), test acc: 60.9% (1.2%)

Attacker access: aa-shadow_dataset
Layer fc3: 10 experiments.  Test auc 0.630 (0.013), test acc: 60.5% (0.6%)
Layer fc3
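In the output above, the number in parentheses after the AUC is the half-width of a 95% confidence interval computed with Student's t distribution, whereas the number after the accuracy is a standard deviation. The sketch below isolates the interval computation. The name ```mean_and_ci_half_width``` and the ```ddof``` parameter are assumptions added for illustration: ```ddof=0``` matches the ```np.std``` default used in this notebook, while ```ddof=1``` gives the sample standard deviation usually paired with a t interval.

```python
import numpy as np
import scipy.stats


def mean_and_ci_half_width(values, confidence=0.95, ddof=0):
    """Return the mean of `values` and the half-width of its t-based interval.

    Hypothetical helper. ddof=0 reproduces the np.std default used in the
    notebook; ddof=1 uses the sample standard deviation instead.
    """
    values = np.asarray(values, dtype=float)
    n = len(values)
    mean = values.mean()
    if n < 2:
        # The t distribution has no valid degrees of freedom for n < 2.
        return mean, float('nan')
    t_crit = scipy.stats.t.ppf((1 + confidence) / 2., n - 1)
    half_width = values.std(ddof=ddof) * t_crit / n ** 0.5
    return mean, half_width
```

With only a handful of repetitions the two ```ddof``` choices differ noticeably, so it is worth stating which one a reported interval uses.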

In [5]:
features = 'activations,gradients'
print(f'\nFeatures: {features} \n')

target_layers = ["fc3", "fc3-ia", "fc3-ia,fc2"]

for attacker_access in attacker_accesses:
    print(f'\nAttacker access: {attacker_access}')
    results = {layer: {"test_acc": [], "test_auc": [], "best_test_acc": []} for layer in target_layers}

    for layer in target_layers:
        num_repetitions_found = 0
        for exp in range(num_repetitions):
            result_dir = os.path.join(experiments_dir, attacker_access, features, layer)
            saved_model_path = os.path.join(result_dir, f'exp_{exp}_model.pickle')
            if not os.path.exists(saved_model_path):
                continue
            with open(saved_model_path, 'rb') as f:
                saved_model = pickle.load(f)
                if not saved_model['train_complete']:
                    continue
                test_metrics = saved_model['test_metrics']
            test_acc = test_metrics['best_acc']
            test_auc = test_metrics['auc']
            results[layer]['test_auc'].append(test_auc)
            results[layer]['test_acc'].append(test_acc)
            num_repetitions_found += 1
            roc_results[attacker_access][features][layer].append( 
                    (test_metrics['fpr'], test_metrics['tpr']) )
        if num_repetitions_found == 0:
            continue
        mean_test_auc, std_test_auc = np.mean(results[layer]['test_auc']), np.std(results[layer]['test_auc'])
        mean_test_acc, std_test_acc = np.mean(results[layer]['test_acc']), np.std(results[layer]['test_acc'])
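        # Half-width of the 95% confidence interval for the mean AUC (Student's t, n-1 degrees of freedom).
        h_test_auc = std_test_auc * scipy.stats.t.ppf((1 + 0.95) / 2., num_repetitions_found-1) / (num_repetitions_found**0.5)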
        print(f'Layer {layer}: {num_repetitions_found} experiments. ', 
            f'Test auc {mean_test_auc:.3f} ({h_test_auc:.3f}), best test acc: {mean_test_acc:.1%} ({std_test_acc:.1%})')


Features: activations,gradients 


Attacker access: aa-target_dataset
Layer fc3: 10 experiments.  Test auc 0.675 (0.004), best test acc: 63.0% (0.8%)
Layer fc3-ia: 10 experiments.  Test auc 0.686 (0.005), best test acc: 63.0% (0.3%)
Layer fc3-ia,fc2: 10 experiments.  Test auc 0.689 (0.009), best test acc: 63.2% (0.5%)

Attacker access: aa-shadow_dataset_model_init
Layer fc3: 10 experiments.  Test auc 0.668 (0.005), best test acc: 63.0% (0.5%)
Layer fc3-ia: 10 experiments.  Test auc 0.689 (0.004), best test acc: 63.5% (0.6%)
Layer fc3-ia,fc2: 10 experiments.  Test auc 0.691 (0.005), best test acc: 63.3% (0.7%)

Attacker access: aa-shadow_dataset
Layer fc3: 10 experiments.  Test auc 0.671 (0.004), best test acc: 62.9% (0.5%)
Layer fc3-ia: 10 experiments.  Test auc 0.678 (0.006), best test acc: 62.9% (0.6%)
Layer fc3-ia,fc2: 10 experiments.  Test auc 0.679 (0.005), best test acc: 62.8% (0.6%)

Attacker access: aa-shadow_dataset-align-bottom_up_weight_matching
Layer fc3: 10 experiments.  