
Evaluation

After generating adversarial examples and preparing the defense-enhanced models, we evaluate the utility performance of the attacks and defenses, as well as the security performance of the attacks against the defenses.

1. Utility Evaluation

1.1 Attacks

Once the adversarial examples for an attack have been generated and saved in the AdversarialExampleDatasets/ directory, the attack's utility performance can be evaluated as follows:

python AttackEvaluations.py --dataset=CIFAR10 --attack=CW2
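To evaluate several attacks in one pass, the command can be wrapped in a shell loop. A minimal sketch, assuming the adversarial examples for each listed attack have already been generated (the attack list here is illustrative):

for attack in FGSM CW2; do
    python AttackEvaluations.py --dataset=CIFAR10 --attack=${attack}
done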

1.2 Complete Defense

Once the defense-enhanced model for a specific defense method has been obtained (re-trained defense-enhanced models are saved in the DefenseEnhancedModels/ directory), its defense utility performance can be evaluated as follows. In the commands below, values separated by a slash are per-dataset choices: use the first value for MNIST and the second for CIFAR10.

Adversarial Training & Gradient Masking Defenses

python DefenseEvaluations.py --dataset=MNIST/CIFAR10 --defense=NAT/EAT/PAT/DD/IGR
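For instance, to evaluate the NAT-enhanced model on MNIST:

python DefenseEvaluations.py --dataset=MNIST --defense=NAT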

Input Transformation Defenses

For EIT:

python DefenseEvaluations.py --dataset=MNIST/CIFAR10 --defense=EIT --crop_size=26/30 --lambda_tv=0.03 --JPEG_quality=85 --bit_depth=4
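For example, with the MNIST crop size of 26 (the remaining parameters are shared by both datasets):

python DefenseEvaluations.py --dataset=MNIST --defense=EIT --crop_size=26 --lambda_tv=0.03 --JPEG_quality=85 --bit_depth=4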

For RT:

python DefenseEvaluations.py --dataset=MNIST/CIFAR10 --defense=RT --resize=31/36

For PD:

Before evaluating the PD defense, it is suggested to follow the instructions at https://github.com/SaizhuoWang/pixel-cnn-pp to train the generative PixelCNN model, which is both time-consuming and GPU-consuming.

cd Defenses/DefenseMethods/External/
git clone https://github.com/SaizhuoWang/pixel-cnn-pp.git
mv pixel-cnn-pp pixel_cnn_pp # rename so the directory is importable as a Python package
cd pixel_cnn_pp # train.py lives inside the cloned repository
python train.py --dataset MNIST/CIFAR10 # with default parameters

The utility performance of PD can then be tested as follows:

python DefenseEvaluations.py --dataset=MNIST/CIFAR10 --defense=PD --epsilon=0.3/0.0627
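For example, on CIFAR10 (epsilon 0.0627, matching the security evaluation below):

python DefenseEvaluations.py --dataset=CIFAR10 --defense=PD --epsilon=0.0627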

For TE:

python DefenseEvaluations.py --dataset=MNIST/CIFAR10 --defense=TE --level=16

Other Defenses

For RC:

python DefenseEvaluations.py --dataset=MNIST/CIFAR10 --defense=RC --radius=0.3/0.02

2. Security Evaluation

Finally, users can test the security performance of attacks vs. defenses with respect to the obtained adversarial examples and defense-enhanced models. This evaluation reports how well each defense correctly classifies the adversarial examples generated by a given attack; the FGSM attack is used as the example below.

For MNIST:

python SecurityEvaluation.py --dataset=MNIST --attack=FGSM --defenses=NAT,EAT,PAT,DD,IGR,EIT,RT,PD,TE,RC --crop_size=26 --lambda_tv=0.03 --JPEG_quality=85 --bit_depth=4 --resize=31 --epsilon=0.3 --level=16 --radius=0.3

For CIFAR10:

python SecurityEvaluation.py --dataset=CIFAR10 --attack=FGSM --defenses=NAT,EAT,PAT,DD,IGR,EIT,RT,PD,TE,RC --crop_size=30 --lambda_tv=0.03 --JPEG_quality=85 --bit_depth=4 --resize=36 --epsilon=0.0627 --level=16 --radius=0.02

As PD (PixelDefend) is typically GPU-intensive when loading the pre-trained PixelCNN model, it is suggested to test it separately or to use multiple GPUs.
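For example, PD can be evaluated on its own and the remaining defenses in a second run. A minimal sketch for MNIST, assuming flags belonging to excluded defenses can simply be omitted:

python SecurityEvaluation.py --dataset=MNIST --attack=FGSM --defenses=PD --epsilon=0.3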