## Demo for Property Inference Attack (PIA)

In [4]:
from privacy_evaluator.attacks.property_inference_attack import PropertyInferenceAttack
from privacy_evaluator.classifiers.classifier import Classifier
from privacy_evaluator.utils.data_utils import (
    dataset_downloader,
    new_dataset_from_size_dict,
)
from privacy_evaluator.utils.trainer import trainer
from privacy_evaluator.models.torch.cnn import ConvNet

import collections
from matplotlib import pyplot as plt
import warnings
warnings.filterwarnings("ignore")


# Property Inference Attack on CIFAR10 Dataset

## 1. Overview of the Dataset

CIFAR10 is a dataset of color (RGB) images from 10 classes (airplane, automobile, bird, cat, deer, dog, frog, horse, ship, truck), consisting of 50,000 training examples and 10,000 test examples.

The size of each image is $32\times 32 \times 3$:


In [5]:
train_dataset, test_dataset = dataset_downloader("CIFAR10")
input_shape = test_dataset[0][0].shape
print(input_shape)

(32, 32, 3)


Currently, only binary attacks are supported. As an example, we therefore use classes 0 (airplane) and 1 (automobile) from CIFAR10.

We choose a sample size for each class (on which we train the target model):

In [9]:
NUM_ELEMENTS_PER_CLASSES = {0: 1000, 1: 500}

And then adjust the training set accordingly:

In [16]:
train_set = new_dataset_from_size_dict(train_dataset, NUM_ELEMENTS_PER_CLASSES)
print(train_set[0].shape, train_set[1].shape)

(1500, 32, 32, 3) (1500, 1)
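
The class ratio that the attack later tries to recover follows directly from the size dictionary. A minimal sketch of that arithmetic (the helper `class_ratios` is hypothetical and not part of `privacy_evaluator`):

```python
def class_ratios(size_dict):
    """Return the fraction of the training set that each class makes up."""
    total = sum(size_dict.values())
    return {label: count / total for label, count in size_dict.items()}


ratios = class_ratios({0: 1000, 1: 500})
print({label: round(ratio, 2) for label, ratio in ratios.items()})
# prints {0: 0.67, 1: 0.33}
```

With 1000 airplane and 500 automobile images, class 0 makes up two thirds of the training data.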


## 2. Load and train your target model

Load *your* personal target model in the next cell to perform the attack on it. Any PyTorch or TensorFlow model can be used.


In [13]:
# For demonstration purposes we load the following ConvNet model for two input classes
# (multi-class input will be supported in future releases):

num_classes = len(NUM_ELEMENTS_PER_CLASSES)

model = ConvNet(num_classes, input_shape, num_channels=(input_shape[-1], 16, 32, 64))

In [17]:
trainer(train_set, NUM_ELEMENTS_PER_CLASSES, model, num_epochs=2)

# Convert to ART classifier

target_model = Classifier._to_art_classifier(model, num_classes, input_shape)

## 3. Perform attack

Each attack consists of several sub-attacks (one for each element in "ratios_for_attack").

In a sub-attack we create a number of shadow classifiers of the same architecture as the provided target model. 

Half of them are trained on unbalanced data sets with the given ratio; the other half are trained on balanced data sets.

The shadow classifiers serve as the training set for a meta classifier, which finally predicts the likelihood that the target model has a given property (i.e. the ratio).
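
The layout of the shadow training sets can be sketched in plain Python. This is an illustration under stated assumptions, not the library's implementation: the helpers `shadow_set_sizes` and `meta_labels` are hypothetical, and both the rounding rule and the label encoding (1 for unbalanced, 0 for balanced) are assumptions.

```python
def shadow_set_sizes(size_set, ratio):
    """Per-class sample counts for one shadow training set.

    `ratio` is the fraction of class 0; the remainder goes to class 1.
    """
    n_class_0 = round(size_set * ratio)
    return {0: n_class_0, 1: size_set - n_class_0}


def meta_labels(amount_sets):
    """Labels for the meta classifier: 1 = unbalanced, 0 = balanced (assumed encoding)."""
    if amount_sets % 2 != 0:
        raise ValueError("amount_sets needs to be even")
    half = amount_sets // 2
    return [1] * half + [0] * half


unbalanced = shadow_set_sizes(100, 0.66)  # one shadow set with the attacked ratio
balanced = shadow_set_sizes(100, 0.5)     # one balanced shadow set
labels = meta_labels(4)                   # two unbalanced, two balanced classifiers
print(unbalanced, balanced, labels)
```

The meta classifier would then be trained on features extracted from each shadow classifier, paired with these labels.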

In [18]:
# Number of shadow classifiers (increase for better accuracy of the meta classifier;
# decrease if computing power is limited)
amount_sets = 1000  # needs to be even

# Size of data set to train the shadow classifiers
size_set = 100

# Ratios to perform the attack for (the real ratio of class 0 in our example target model is roughly 0.66)
ratios_for_attack = [0.66, 0.33]
classes = [0, 1]

attack = PropertyInferenceAttack(
    target_model, train_set, verbose=1, size_set=size_set,
    ratios_for_attack=ratios_for_attack, classes=classes, amount_sets=amount_sets,
)

In [19]:
output = attack.attack()

Initiating Property Inference Attack ... 
Extracting features from target model ... 
(155266,)  --- features extracted from the target model.
Creating set of 1 balanced shadow classifiers ... 
Creating shadow training sets
Training shadow classifiers
Performing PIA for various ratios ... 
  0%|          | 0/2 [00:00<?, ?it/s]Creating shadow training sets
Training shadow classifiers
Epoch 1/2
Epoch 2/2
 50%|█████     | 1/2 [00:22<00:22, 22.48s/it]Creating shadow training sets
Training shadow classifiers
Epoch 1/2
Epoch 2/2
100%|██████████| 2/2 [00:43<00:00, 22.00s/it]


In [20]:
output

('The most probable property is class 0: 0.67, class 1: 0.33 with a probability of 0.504490315914154.',
 {'class 0: 0.67, class 1: 0.33': 0.5044903,
  'class 0: 0.34, class 1: 0.66': 0.49777275})

## Human-readable output of the attack:

In [21]:
output[0]

'The most probable property is class 0: 0.67, class 1: 0.33 with a probability of 0.504490315914154.'
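
The second element of the output maps each tested property to its predicted probability, so the most probable property can also be selected programmatically. A minimal sketch, copying the printed values as plain floats (the variable name `probabilities` is introduced here for illustration only):

```python
# Values copied from the attack output shown earlier in this demo.
probabilities = {
    "class 0: 0.67, class 1: 0.33": 0.5044903,
    "class 0: 0.34, class 1: 0.66": 0.49777275,
}

# Pick the property whose predicted probability is highest.
best_property = max(probabilities, key=probabilities.get)
print(best_property, probabilities[best_property])
```

In this run both probabilities are close to 0.5, so the meta classifier favours the true ratio only by a small margin.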