# Auditing a CNN trained on CIFAR100 using the Reference Attack

## Introduction

In this tutorial, we will see:

- How to specify the dataset and model for Privacy Meter
- How to audit a TensorFlow model
- How to use the `ReferenceMetric` to evaluate membership leakage using loss values from reference models
- How to visualize the audit result

## Imports

In [1]:
import numpy as np
import tensorflow as tf



For now, we install the Privacy Meter library from the local source. A version will be published to PyPI soon.

In [2]:
import sys
!{sys.executable} -m pip install -e ../.
from privacy_meter.audit import Audit, MetricEnum
from privacy_meter.audit_report import ROCCurveReport, SignalHistogramReport
from privacy_meter.constants import InferenceGame
from privacy_meter.dataset import Dataset
from privacy_meter.information_source import InformationSource
from privacy_meter.model import TensorflowModel

Obtaining file:///home/victor/ml_privacy_meter
  Preparing metadata (setup.py) ... done
Installing collected packages: privacy-meter
  Attempting uninstall: privacy-meter
    Found existing installation: privacy-meter 1.0
    Uninstalling privacy-meter-1.0:
      Successfully uninstalled privacy-meter-1.0
  Running setup.py develop for privacy-meter
Successfully installed privacy-meter-1.0


## Settings

Setting seed for reproducibility:

In [3]:
seed = 1234
np.random.seed(seed)
rng = np.random.default_rng(seed=seed)
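
As a quick check of what seeding provides, the sketch below shows that two generators created from the same seed produce the same draws. Note that these NumPy seeds do not cover TensorFlow's own random state (for example weight initialization and dropout); seeding that would require a separate call such as `tf.random.set_seed(seed)`, which this tutorial does not make.

```python
import numpy as np

seed = 1234

# Two generators built from the same seed yield the same stream of draws.
rng_a = np.random.default_rng(seed=seed)
rng_b = np.random.default_rng(seed=seed)
draws_a = rng_a.permutation(10)
draws_b = rng_b.permutation(10)

same_draws = np.array_equal(draws_a, draws_b)
print(same_draws)
```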

Hyperparameters:

In [4]:
# for training the target and reference models
num_points_per_train_split = 10000
num_points_per_test_split = 1000
loss_fn = tf.keras.losses.CategoricalCrossentropy()
optim_fn = 'adam'
epochs = 25
batch_size = 64
regularizer_penalty = 0.01
regularizer = tf.keras.regularizers.l2(l=regularizer_penalty)
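
Two numbers derived from these hyperparameters are useful when reading the training logs later: the number of optimizer steps per epoch, and the loss of a model that assigns equal probability to every class. The following is plain arithmetic, assuming the 100 classes of CIFAR100:

```python
import math

num_points_per_train_split = 10000
batch_size = 64
num_classes = 100

# One step per batch; the last batch of an epoch may be smaller than batch_size.
steps_per_epoch = math.ceil(num_points_per_train_split / batch_size)

# Cross-entropy of a uniform prediction over all classes: -ln(1 / num_classes).
chance_loss = math.log(num_classes)
chance_accuracy = 1 / num_classes

print(steps_per_epoch, round(chance_loss, 4), chance_accuracy)
```

A first-epoch loss near `chance_loss` and an accuracy near `chance_accuracy` indicate a model that has not learned much yet.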

In [5]:
# for the reference metric
num_reference_models = 10
fpr_tolerance_list = [
    0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0
]
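
The grid above is evenly spaced between 0 and 1. If a finer or coarser grid is preferred, it could be generated rather than typed out; `num_steps` is a name introduced for this sketch only:

```python
num_steps = 10

# Evenly spaced tolerances from 0.0 to 1.0 inclusive.
fpr_tolerance_list = [i / num_steps for i in range(num_steps + 1)]
print(fpr_tolerance_list)
```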

## Dataset creation

We use the CIFAR100 dataset for this tutorial. As TensorFlow already provides the data loading code for CIFAR100, we only need to add our pre-processing code on top of it.

In [6]:
def preprocess_cifar100_dataset():
    input_shape, num_classes = (32, 32, 3), 100

    # split the data between train and test sets
    (x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar100.load_data()

    # scale images to the [0, 1] range
    x_train = x_train.astype("float32") / 255
    x_test = x_test.astype("float32") / 255

    # convert labels into one hot vectors
    y_train = tf.keras.utils.to_categorical(y_train, num_classes)
    y_test = tf.keras.utils.to_categorical(y_test, num_classes)

    return x_train, y_train, x_test, y_test, input_shape, num_classes

x_train_all, y_train_all, x_test_all, y_test_all, input_shape, num_classes = preprocess_cifar100_dataset()

In [7]:
print(x_train_all.shape, x_test_all.shape)

(50000, 32, 32, 3) (10000, 32, 32, 3)
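
To make the two pre-processing steps concrete, here is a small NumPy-only sketch on made-up data (two 2x2 RGB images rather than CIFAR100). It indexes into an identity matrix as a stand-in for `tf.keras.utils.to_categorical`; this is an illustration choice, not what the function above calls:

```python
import numpy as np

num_classes = 100

# Two fake 2x2 RGB "images" with uint8 pixel values, standing in for CIFAR100.
images = np.array([[[[0, 128, 255]] * 2] * 2] * 2, dtype=np.uint8)
# Integer labels stored as a column vector, one row per image.
labels = np.array([[3], [99]])

# Scale pixels to the [0, 1] range.
scaled = images.astype("float32") / 255

# One-hot encode: row i of the identity matrix is the one-hot vector of class i.
one_hot = np.eye(num_classes, dtype="float32")[labels.ravel()]

print(scaled.shape, one_hot.shape)
```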


CIFAR100 comes with predetermined train and test partitions. For the audit, we further split the train partition into two kinds of sets: 'train' and 'reference'.

We will have the following sets at the end of this partitioning:

- The 'train' set will be used to train the target model. It will be used as the 'member' set for the audit.
- The 'test' set will be used as the 'non-member' set for the audit.
- The 'reference' set will be used later as the pool of data to train the reference models.

We wrap the sets into a `Dataset` object, which takes in the following arguments:

- `data_dict` contains the actual dataset, in the form of a nested (two-level) dictionary. The first key is the split name (here we have two: "train" and "test"), and the second key is the feature name (here we also have two: "x" and "y").
- `default_input` contains the name of the feature that should be used as the model's input (here "x").
- `default_output` contains the name of the feature that should be used as the label / model's output (here "y").

In [8]:
# create the target model's dataset
dataset = Dataset(
    data_dict={
        'train': {'x': x_train_all, 'y': y_train_all},
        'test': {'x': x_test_all, 'y': y_test_all}
    },
    default_input='x',
    default_output='y'
)
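
To illustrate how the split and feature keys are meant to be read, here is a hypothetical stand-in class. `ToyDataset` is not part of Privacy Meter; it only mimics the lookup pattern, including the `'<default_input>'` and `'<default_output>'` placeholders used later in this tutorial:

```python
class ToyDataset:
    """Hypothetical stand-in for illustration; not Privacy Meter's Dataset."""

    def __init__(self, data_dict, default_input, default_output):
        self.data_dict = data_dict
        self.default_input = default_input
        self.default_output = default_output

    def get_feature(self, split_name, feature_name):
        # Resolve the placeholder names to the configured default features.
        if feature_name == '<default_input>':
            feature_name = self.default_input
        elif feature_name == '<default_output>':
            feature_name = self.default_output
        # First key: split name. Second key: feature name.
        return self.data_dict[split_name][feature_name]


toy = ToyDataset(
    data_dict={
        'train': {'x': [[0.1, 0.2], [0.3, 0.4]], 'y': [0, 1]},
        'test': {'x': [[0.5, 0.6]], 'y': [1]},
    },
    default_input='x',
    default_output='y',
)
train_inputs = toy.get_feature('train', '<default_input>')
test_labels = toy.get_feature('test', '<default_output>')
print(train_inputs, test_labels)
```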

Finally, we use the built-in `Dataset.subdivide()` function to divide the two splits ("train" and "test") into multiple sub-datasets (one per model). The resulting sub-splits are added to the parent object ("train000", "train001", etc.) and are also returned as a list of individual `Dataset` objects.

In [9]:
datasets_list = dataset.subdivide(
    num_splits=num_reference_models + 1,
    delete_original=True,
    in_place=False,
    return_results=True,
    method='hybrid',
    split_size={'train': num_points_per_train_split, 'test': num_points_per_test_split}
)
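
Note that 11 sub-splits of 10,000 points cannot all be disjoint within 50,000 training points, so some overlap between sub-splits is unavoidable with these settings. The sketch below draws each sub-split independently to show this effect. It is one possible sampling strategy and an assumption made for illustration, not a description of what `method='hybrid'` does; `draw_sub_splits` is a hypothetical helper.

```python
import numpy as np


def draw_sub_splits(num_points, num_splits, split_size, rng):
    """Draw `num_splits` index sets of `split_size` points each.

    Hypothetical helper: indices are unique within a sub-split, but the
    sub-splits are drawn independently of each other and may overlap.
    """
    return [
        rng.choice(num_points, size=split_size, replace=False)
        for _ in range(num_splits)
    ]


rng = np.random.default_rng(seed=1234)
train_indices = draw_sub_splits(
    num_points=50000, num_splits=11, split_size=10000, rng=rng
)

# 11 * 10000 = 110000 indices are drawn from only 50000 distinct points.
total_drawn = sum(len(idx) for idx in train_indices)
num_distinct = len(np.unique(np.concatenate(train_indices)))
print(total_drawn, num_distinct)
```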

In [10]:
for i, d in enumerate(datasets_list):
    print(i)
    print(d)

0
Splits            = ['train', 'test']
Features          = ['x', 'y']
Default features  = x --> y
1
Splits            = ['train', 'test']
Features          = ['x', 'y']
Default features  = x --> y
2
Splits            = ['train', 'test']
Features          = ['x', 'y']
Default features  = x --> y
3
Splits            = ['train', 'test']
Features          = ['x', 'y']
Default features  = x --> y
4
Splits            = ['train', 'test']
Features          = ['x', 'y']
Default features  = x --> y
5
Splits            = ['train', 'test']
Features          = ['x', 'y']
Default features  = x --> y
6
Splits            = ['train', 'test']
Features          = ['x', 'y']
Default features  = x --> y
7
Splits            = ['train', 'test']
Features          = ['x', 'y']
Default features  = x --> y
8
Splits            = ['train', 'test']
Features          = ['x', 'y']
Default features  = x --> y
9
Splits            = ['train', 'test']
Features          = ['x', 'y']
Default features  = x --> y
10
Splits 

## Training the target and reference models

We define the TensorFlow model to be used as the target and reference models:

In [11]:
def get_tensorflow_cnn_classifier(input_shape, num_classes, regularizer):
    # TODO: change model architecture
    model = tf.keras.Sequential()
    model.add(tf.keras.layers.Conv2D(32, kernel_size=(3, 3), activation='relu',
                                     input_shape=input_shape, kernel_regularizer=regularizer))
    model.add(tf.keras.layers.MaxPool2D(pool_size=(2, 2)))
    model.add(tf.keras.layers.Conv2D(64, kernel_size=(3, 3), activation='relu',
                                     kernel_regularizer=regularizer))
    model.add(tf.keras.layers.MaxPool2D(pool_size=(2, 2)))
    model.add(tf.keras.layers.Flatten())
    model.add(tf.keras.layers.Dropout(0.5))
    model.add(tf.keras.layers.Dense(num_classes, activation='softmax'))
    return model
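
The layer sizes of this architecture can be worked out by hand. The sketch below assumes the Keras defaults for these layers (stride 1 and 'valid' padding for `Conv2D`, non-overlapping windows for `MaxPool2D`) and uses two helper functions introduced here for illustration:

```python
def conv2d_output(height, width, kernel=3):
    # 'valid' padding with stride 1 shrinks each side by kernel - 1.
    return height - kernel + 1, width - kernel + 1


def conv2d_params(in_channels, out_channels, kernel=3):
    # One kernel x kernel x in_channels filter plus one bias per output channel.
    return kernel * kernel * in_channels * out_channels + out_channels


h, w = conv2d_output(32, 32)           # first Conv2D on 32x32x3 inputs
conv1_params = conv2d_params(3, 32)
h, w = h // 2, w // 2                  # 2x2 max pooling
h, w = conv2d_output(h, w)             # second Conv2D
conv2_params = conv2d_params(32, 64)
h, w = h // 2, w // 2                  # 2x2 max pooling
flat_size = h * w * 64                 # Flatten
dense_params = flat_size * 100 + 100   # Dense layer with 100 outputs

print(conv1_params, conv2_params, flat_size, dense_params)
```

These values can be checked against the output of `model.summary()`.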

And we compile and train the target model using the target dataset we defined above:

In [12]:
x = datasets_list[0].get_feature('train', '<default_input>')
y = datasets_list[0].get_feature('train', '<default_output>')
model = get_tensorflow_cnn_classifier(input_shape, num_classes, regularizer)
model.summary()
model.compile(optimizer=optim_fn, loss=loss_fn, metrics=['accuracy'])
model.fit(x, y, batch_size=batch_size, epochs=epochs, verbose=2)

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d (Conv2D)             (None, 30, 30, 32)        896       
                                                                 
 max_pooling2d (MaxPooling2D  (None, 15, 15, 32)       0         
 )                                                               
                                                                 
 conv2d_1 (Conv2D)           (None, 13, 13, 64)        18496     
                                                                 
 max_pooling2d_1 (MaxPooling  (None, 6, 6, 64)         0         
 2D)                                                             
                                                                 
 flatten (Flatten)           (None, 2304)              0         
                                                                 
 dropout (Dropout)           (None, 2304)              0

Epoch 1/25
157/157 - 7s - loss: 4.5530 - accuracy: 0.0364 - 7s/epoch - 44ms/step
Epoch 2/25
157/157 - 6s - loss: 4.0608 - accuracy: 0.0989 - 6s/epoch - 35ms/step
Epoch 3/25
157/157 - 6s - loss: 3.7988 - accuracy: 0.1489 - 6s/epoch - 41ms/step
Epoch 4/25
157/157 - 8s - loss: 3.6359 - accuracy: 0.1859 - 8s/epoch - 52ms/step
Epoch 5/25
157/157 - 16s - loss: 3.5202 - accuracy: 0.2047 - 16s/epoch - 102ms/step
Epoch 6/25
157/157 - 16s - loss: 3.4201 - accuracy: 0.2219 - 16s/epoch - 104ms/step
Epoch 7/25
157/157 - 12s - loss: 3.3346 - accuracy: 0.2445 - 12s/epoch - 75ms/step
Epoch 8/25
157/157 - 13s - loss: 3.2727 - accuracy: 0.2520 - 13s/epoch - 80ms/step
Epoch 9/25
157/157 - 13s - loss: 3.2048 - accuracy: 0.2674 - 13s/epoch - 80ms/step
Epoch 10/25
157/157 - 12s - loss: 3.1508 - accuracy: 0.2808 - 12s/epoch - 77ms/step
Epoch 11/25
157/157 - 11s - loss: 3.1077 - accuracy: 0.2868 - 11s/epoch - 69ms/step
Epoch 12/25
157/157 - 12s - loss: 3.0648 - accuracy: 0.2907 - 12s/epoch - 78ms/step
Epoch 13/25
157/15

<keras.callbacks.History at 0x7f66c0aa94c0>

We wrap the target model in the `TensorflowModel` object:

In [13]:
target_model = TensorflowModel(model_obj=model, loss_fn=loss_fn)

We will now sample data from the reference pool and train reference models, and wrap each one in a `TensorflowModel` object:

In [14]:
reference_models = []
for model_idx in range(num_reference_models):
    print(f"Training reference model {model_idx}...")
    reference_model = get_tensorflow_cnn_classifier(input_shape, num_classes, regularizer)
    reference_model.compile(optimizer=optim_fn, loss=loss_fn, metrics=['accuracy'])
    reference_model.fit(
        datasets_list[model_idx + 1].get_feature('train', '<default_input>'),
        datasets_list[model_idx + 1].get_feature('train', '<default_output>'),
        batch_size=batch_size,
        epochs=epochs,
        verbose=2
    )
    reference_models.append(
        TensorflowModel(model_obj=reference_model, loss_fn=loss_fn)
    )

Training reference model 0...
Epoch 1/25
157/157 - 13s - loss: 4.5345 - accuracy: 0.0427 - 13s/epoch - 82ms/step
Epoch 2/25
157/157 - 12s - loss: 3.9928 - accuracy: 0.1194 - 12s/epoch - 78ms/step
Epoch 3/25
157/157 - 13s - loss: 3.7410 - accuracy: 0.1638 - 13s/epoch - 80ms/step
Epoch 4/25
157/157 - 13s - loss: 3.5513 - accuracy: 0.2048 - 13s/epoch - 81ms/step
Epoch 5/25
157/157 - 13s - loss: 3.4151 - accuracy: 0.2333 - 13s/epoch - 84ms/step
Epoch 6/25
157/157 - 13s - loss: 3.2933 - accuracy: 0.2555 - 13s/epoch - 83ms/step
Epoch 7/25
157/157 - 12s - loss: 3.1821 - accuracy: 0.2810 - 12s/epoch - 76ms/step
Epoch 8/25
157/157 - 12s - loss: 3.1074 - accuracy: 0.2937 - 12s/epoch - 74ms/step
Epoch 9/25
157/157 - 13s - loss: 3.0214 - accuracy: 0.3130 - 13s/epoch - 80ms/step
Epoch 10/25
157/157 - 12s - loss: 2.9621 - accuracy: 0.3196 - 12s/epoch - 80ms/step
Epoch 11/25
157/157 - 13s - loss: 2.8690 - accuracy: 0.3462 - 13s/epoch - 81ms/step
Epoch 12/25
157/157 - 12s - loss: 2.8211 - accuracy: 0.3567 - 12s/epoch - 74ms/step
Epoch 13/25

157/157 - 14s - loss: 4.5566 - accuracy: 0.0372 - 14s/epoch - 87ms/step
Epoch 2/25
157/157 - 13s - loss: 4.0666 - accuracy: 0.1002 - 13s/epoch - 83ms/step
Epoch 3/25
157/157 - 11s - loss: 3.8002 - accuracy: 0.1513 - 11s/epoch - 71ms/step
Epoch 4/25
157/157 - 12s - loss: 3.6226 - accuracy: 0.1819 - 12s/epoch - 79ms/step
Epoch 5/25
157/157 - 12s - loss: 3.4801 - accuracy: 0.2138 - 12s/epoch - 79ms/step
Epoch 6/25
157/157 - 12s - loss: 3.3595 - accuracy: 0.2354 - 12s/epoch - 74ms/step
Epoch 7/25
157/157 - 12s - loss: 3.2620 - accuracy: 0.2526 - 12s/epoch - 79ms/step
Epoch 8/25
157/157 - 11s - loss: 3.1565 - accuracy: 0.2795 - 11s/epoch - 71ms/step
Epoch 9/25
157/157 - 12s - loss: 3.0895 - accuracy: 0.2926 - 12s/epoch - 79ms/step
Epoch 10/25
157/157 - 12s - loss: 3.0322 - accuracy: 0.3079 - 12s/epoch - 74ms/step
Epoch 11/25
157/157 - 12s - loss: 2.9630 - accuracy: 0.3230 - 12s/epoch - 75ms/step
Epoch 12/25
157/157 - 12s - loss: 2.9061 - accuracy: 0.3337 - 12s/epoch - 79ms/step
Epoch 13/25

157/157 - 15s - loss: 4.5337 - accuracy: 0.0390 - 15s/epoch - 97ms/step
Epoch 2/25
157/157 - 13s - loss: 3.9891 - accuracy: 0.1185 - 13s/epoch - 85ms/step
Epoch 3/25
157/157 - 13s - loss: 3.7179 - accuracy: 0.1677 - 13s/epoch - 82ms/step
Epoch 4/25
157/157 - 12s - loss: 3.5538 - accuracy: 0.1947 - 12s/epoch - 79ms/step
Epoch 5/25
157/157 - 12s - loss: 3.4244 - accuracy: 0.2195 - 12s/epoch - 78ms/step
Epoch 6/25
157/157 - 11s - loss: 3.3166 - accuracy: 0.2489 - 11s/epoch - 73ms/step
Epoch 7/25
157/157 - 13s - loss: 3.2397 - accuracy: 0.2593 - 13s/epoch - 81ms/step
Epoch 8/25
157/157 - 12s - loss: 3.1513 - accuracy: 0.2849 - 12s/epoch - 79ms/step
Epoch 9/25
157/157 - 12s - loss: 3.0741 - accuracy: 0.2943 - 12s/epoch - 79ms/step
Epoch 10/25
157/157 - 13s - loss: 3.0185 - accuracy: 0.3108 - 13s/epoch - 82ms/step
Epoch 11/25
157/157 - 12s - loss: 2.9462 - accuracy: 0.3255 - 12s/epoch - 75ms/step
Epoch 12/25
157/157 - 12s - loss: 2.9096 - accuracy: 0.3380 - 12s/epoch - 79ms/step
Epoch 13/25

Epoch 1/25
157/157 - 13s - loss: 4.5610 - accuracy: 0.0417 - 13s/epoch - 81ms/step
Epoch 2/25
157/157 - 11s - loss: 4.0527 - accuracy: 0.1052 - 11s/epoch - 68ms/step
Epoch 3/25
157/157 - 13s - loss: 3.7938 - accuracy: 0.1512 - 13s/epoch - 80ms/step
Epoch 4/25
157/157 - 12s - loss: 3.6214 - accuracy: 0.1796 - 12s/epoch - 79ms/step
Epoch 5/25
157/157 - 13s - loss: 3.4907 - accuracy: 0.2125 - 13s/epoch - 84ms/step
Epoch 6/25
157/157 - 12s - loss: 3.3746 - accuracy: 0.2353 - 12s/epoch - 76ms/step
Epoch 7/25
157/157 - 12s - loss: 3.2846 - accuracy: 0.2531 - 12s/epoch - 76ms/step
Epoch 8/25
157/157 - 11s - loss: 3.2225 - accuracy: 0.2628 - 11s/epoch - 72ms/step
Epoch 9/25
157/157 - 12s - loss: 3.1307 - accuracy: 0.2845 - 12s/epoch - 76ms/step
Epoch 10/25
157/157 - 13s - loss: 3.0740 - accuracy: 0.2986 - 13s/epoch - 83ms/step
Epoch 11/25
157/157 - 12s - loss: 3.0114 - accuracy: 0.3074 - 12s/epoch - 78ms/step
Epoch 12/25
157/157 - 13s - loss: 2.9522 - accuracy: 0.3233 - 13s/epoch - 81ms/step
E

## Information Sources

We can now define two `InformationSource` objects. An information source is an abstraction representing a set of models and their corresponding datasets. Note that for the `ReferenceMetric`, the target and reference information sources use the same dataset; only the models used to query that dataset differ.

In [15]:
target_info_source = InformationSource(
    models=[target_model],
    datasets=[datasets_list[0]]
)

reference_info_source = InformationSource(
    models=reference_models,
    # the reference models are queried on the target model's dataset
    datasets=[datasets_list[0]]
)

## Metric and Audit

We now create a `Metric` object, which is an abstraction representing an algorithm used to measure a property of an `InformationSource`, such as membership information leakage. In this case, we use the `ReferenceMetric` to measure the membership information leakage of `target_info_source` in a black-box setting, using the loss values that the reference models in `reference_info_source` return on the target dataset.

The `Audit` object is a wrapper that runs the audit and displays the results. More visualization options will be added soon.

As we will be using the default version of the `ReferenceMetric`, we pass the `REFERENCE` enum value as the metric argument for the `Audit` object.
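
To give some intuition for what a reference-based test computes, here is a minimal NumPy sketch on synthetic losses. It is not Privacy Meter's implementation: the per-point quantile threshold is our assumption about the general idea, and `reference_attack` is a hypothetical function introduced for this example.

```python
import numpy as np

rng = np.random.default_rng(seed=1234)
num_reference_models, num_points = 10, 6

# Synthetic per-point losses, for illustration only (no real model involved).
# reference_losses[k, i] is the loss of reference model k on point i.
reference_losses = rng.uniform(1.0, 5.0, size=(num_reference_models, num_points))
# target_losses[i] is the loss of the target model on point i.
target_losses = rng.uniform(0.0, 5.0, size=num_points)


def reference_attack(target_losses, reference_losses, fpr_tolerance):
    # Per-point threshold: the fpr_tolerance-quantile of the reference losses.
    thresholds = np.quantile(reference_losses, fpr_tolerance, axis=0)
    # Flag a point as a member when the target loss falls below its threshold.
    return target_losses < thresholds


predictions = reference_attack(target_losses, reference_losses, fpr_tolerance=0.3)
print(predictions)
```

The intuition is that a point the target model fits much better than models that never trained on it is more likely to be a training member.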

In [None]:
audit_obj = Audit(
    metrics=MetricEnum.REFERENCE,
    inference_game_type=InferenceGame.PRIVACY_LOSS_MODEL,
    target_info_sources=target_info_source,
    reference_info_sources=reference_info_source,
    fpr_tolerances=fpr_tolerance_list
)
audit_obj.prepare()

In [None]:
audit_results = audit_obj.run()[0]
for result in audit_results:
    print(result)
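
We do not reproduce the printed output here. As a reminder of the quantities typically used to judge a membership inference attack, the sketch below computes the true positive rate, false positive rate, and accuracy from made-up predictions; the arrays are hypothetical and not results from this audit.

```python
import numpy as np

# Hypothetical attack output: True means "predicted member".
predictions = np.array([True, True, False, True, False, False, True, False])
# Ground truth: the first four points are members, the last four are not.
is_member = np.array([True, True, True, True, False, False, False, False])

tp = int(np.sum(predictions & is_member))    # members flagged as members
fp = int(np.sum(predictions & ~is_member))   # non-members flagged as members
tn = int(np.sum(~predictions & ~is_member))  # non-members correctly rejected
fn = int(np.sum(~predictions & is_member))   # members missed

tpr = tp / (tp + fn)
fpr = fp / (fp + tn)
accuracy = (tp + tn) / len(predictions)
print(tpr, fpr, accuracy)
```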

## Result visualization

Several visualization tools are built into `privacy_meter`, such as ROC curves, signal value histograms, and confusion matrices.

In [None]:
# This instruction won't be needed once the tool is on pip
from privacy_meter import audit_report
audit_report.REPORT_FILES_DIR = '../privacy_meter/report_files'

In [None]:
ROCCurveReport.generate_report(
    metric_result=result,
    inference_game_type=InferenceGame.PRIVACY_LOSS_MODEL,
    show=True
)
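
An ROC curve is often summarized by the area under it (AUC). The sketch below applies the trapezoidal rule to made-up (FPR, TPR) pairs; the values are hypothetical and are not results from this audit, and the report above may compute its summary differently.

```python
import numpy as np

# Hypothetical (FPR, TPR) pairs sorted by increasing FPR.
fpr = np.array([0.0, 0.1, 0.2, 0.5, 1.0])
tpr = np.array([0.0, 0.3, 0.5, 0.8, 1.0])

# Trapezoidal rule: segment width times the mean height of its two endpoints.
auc = float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2))
print(round(auc, 4))
```

An AUC of 0.5 corresponds to random guessing, so values further above 0.5 indicate more membership leakage.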

In [None]:
SignalHistogramReport.generate_report(
    metric_result=result,
    inference_game_type=InferenceGame.PRIVACY_LOSS_MODEL,
    show=True
)
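
The idea behind a signal histogram can be sketched without the library: bin the loss values of members and non-members on shared bin edges and compare the counts. The losses below are synthetic draws chosen so that members tend to have lower loss; they are an assumption for illustration, not data from this audit.

```python
import numpy as np

rng = np.random.default_rng(seed=1234)

# Synthetic losses for illustration: members are given a smaller scale.
member_losses = rng.gamma(shape=2.0, scale=0.5, size=1000)
non_member_losses = rng.gamma(shape=2.0, scale=1.5, size=1000)

# Shared bin edges make the two histograms directly comparable.
bin_edges = np.histogram_bin_edges(
    np.concatenate([member_losses, non_member_losses]), bins=20
)
member_counts, _ = np.histogram(member_losses, bins=bin_edges)
non_member_counts, _ = np.histogram(non_member_losses, bins=bin_edges)

print(member_counts)
print(non_member_counts)
```

The less the two histograms overlap, the easier it is to tell members from non-members using the signal alone.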