A repository for understanding the intrinsic limits of robust learning against adversarial examples. Created by Xiao Zhang and Jinghui Chen. Link to the arXiv paper.
The goals of this project are to:
-
Theoretically, derive an intrinsic robustness bound with respect to L2 perturbations, for any input distribution that can be captured by some conditional generative model.
-
Empirically, evaluate the intrinsic robustness bound for various synthetically generated image distributions, with comparisons to the robustness of SOTA robust classifiers.
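To give a feel for what a bound of this kind looks like, here is a minimal sketch under a simplified setting that is an assumption for illustration, not the paper's exact theorem: images are x = g(z) with z drawn from a standard Gaussian, and g is L-Lipschitz in the L2 norm. In that setting the Gaussian isoperimetric inequality implies that any classifier with risk at least alpha has adversarial risk at least Phi(Phi^{-1}(alpha) + eps / L) under L2 perturbations of size eps, so its robustness is at most one minus that value. The function name and its parameters are hypothetical and do not appear in this repository.

```python
from statistics import NormalDist


def intrinsic_robustness_upper_bound(alpha: float, eps: float, lipschitz: float) -> float:
    """Upper bound on robustness in the simplified Gaussian-latent setting.

    alpha:     lower bound on the standard risk of the classifier, in (0, 1)
    eps:       L2 perturbation budget in image space
    lipschitz: (assumed) L2 Lipschitz constant of the generator
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if eps < 0.0 or lipschitz <= 0.0:
        raise ValueError("eps must be non-negative and lipschitz positive")

    std_normal = NormalDist()  # standard Gaussian, mean 0 and sigma 1
    # A latent ball of radius eps / lipschitz maps inside an image-space ball of radius eps.
    latent_radius = eps / lipschitz
    # Gaussian isoperimetry: the expanded error region has at least this measure.
    adv_risk_lower = std_normal.cdf(std_normal.inv_cdf(alpha) + latent_radius)
    return 1.0 - adv_risk_lower
```

With eps set to 0 the bound reduces to 1 - alpha, and it decreases as eps grows or as the Lipschitz constant shrinks.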
The code was developed using Python3 on Anaconda.
-
Install PyTorch 1.1.0:
conda install pytorch==1.1.0 torchvision==0.3.0 -c pytorch
-
Install other dependencies:
pip install -r requirements.txt
-
Train an ACGAN model using the original MNIST dataset:
python build_generator_mnist.py --gan-type ACGAN --mode train
-
Estimate the local Lipschitz constant and reconstruct the MNIST dataset using ACGAN:
python build_generator_mnist.py --gan-type ACGAN --mode evaluate
python build_generator_mnist.py --gan-type ACGAN --mode reconstruct
-
Train various robust classifiers under L2 perturbations on the generated MNIST dataset:
cd train_classifiers && python train_mnist.py --method zico
-
Evaluate unconstrained and/or in-distribution robustness for the trained classifiers:
python test_robustness_mnist.py --method madry --robust-type in
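The local Lipschitz estimation step above can be pictured with a small sampling sketch. This is not the procedure implemented in build_generator_mnist.py; it is a generic Monte Carlo estimate, and `toy_generator`, `estimate_local_lipschitz`, and all parameters are hypothetical names chosen for illustration. Because it only samples finitely many nearby latent points, the value it returns is a lower estimate of the true local constant.

```python
import math
import random


def toy_generator(z):
    """Stand-in for a trained generator: a fixed linear map with L2 Lipschitz constant 2."""
    return [2.0 * z[0], 0.5 * z[1]]


def l2_norm(v):
    return math.sqrt(sum(c * c for c in v))


def estimate_local_lipschitz(generator, z, radius, num_samples=1000, seed=0):
    """Largest observed ratio ||g(z') - g(z)|| / ||z' - z|| over sampled z' near z."""
    rng = random.Random(seed)
    base = generator(z)
    best = 0.0
    for _ in range(num_samples):
        # Draw a random direction from an isotropic Gaussian.
        direction = [rng.gauss(0.0, 1.0) for _ in z]
        norm = l2_norm(direction)
        if norm == 0.0:
            continue
        # Step length in (0, radius]; 1 - random() avoids a zero-length step.
        step = radius * (1.0 - rng.random())
        z_near = [zi + step * di / norm for zi, di in zip(z, direction)]
        out = generator(z_near)
        change = l2_norm([o - b for o, b in zip(out, base)])
        best = max(best, change / step)
    return best


estimate = estimate_local_lipschitz(toy_generator, [0.3, -0.7], radius=0.1)
```

For the toy linear map the estimate approaches 2 from below as the number of samples grows. With a real generator the same loop would call the network on batches of latent codes instead.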
-
Download the pretrained BigGAN model, estimate its Lipschitz constant, and reconstruct ImageNet10:
python build_generator_imagenet.py --mode evaluate
python build_generator_imagenet.py --mode reconstruct
-
Train various robust classifiers under L2 perturbations on the generated ImageNet10:
cd train_classifier && python train_imagenet.py --method trades
-
Evaluate unconstrained and/or in-distribution robustness for the trained classifiers:
python test_robustness_imagenet.py --method trades --robust-type unc
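The two robustness types evaluated above both concern L2 perturbations of bounded size. As a rough sketch of the distinction (our reading of the terms, not the code in attack.py): an unconstrained adversarial example may be any point within eps of the input x, while an in-distribution one must additionally be an output of the generator. All function names below are hypothetical, and `toy_generator` is a stand-in for a trained model.

```python
import math


def l2_distance(a, b):
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def project_to_l2_ball(x, x_adv, eps):
    """Unconstrained case: pull x_adv back onto the L2 ball of radius eps around x."""
    dist = l2_distance(x, x_adv)
    if dist <= eps:
        return list(x_adv)  # already inside the ball, nothing to do
    factor = eps / dist
    return [xi + factor * (ai - xi) for xi, ai in zip(x, x_adv)]


def is_in_distribution_candidate(generator, x, z_adv, eps):
    """In-distribution case: the candidate is generator(z_adv) and must lie within eps of x."""
    return l2_distance(x, generator(z_adv)) <= eps


def toy_generator(z):
    """Stand-in generator (a fixed linear map), used only for this example."""
    return [2.0 * z[0], 0.5 * z[1]]


x = toy_generator([1.0, 1.0])
projected = project_to_l2_ball([0.0, 0.0], [3.0, 4.0], eps=1.0)
allowed = is_in_distribution_candidate(toy_generator, x, [1.1, 1.0], eps=0.25)
```

An attack would then search for a misclassified point among the candidates that pass the relevant check: directly in image space for the unconstrained type, or over latent codes for the in-distribution type.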
-
Folder generative, including:
src: folder that contains functions for building BigGAN
acgan.py, gan.py: functions for training MNIST generative models
biggan.py: neural network architecture for the BigGAN generator
utils.py: auxiliary functions for generative models
-
Folder train_classifier, including:
adv_loss.py: adversarial loss functions for Madry and TRADES
attack.py: functions for generating unconstrained and in-distribution adversarial examples
problem.py: defines datasets, dataloaders, and model architectures
trainer.py: implements the training and evaluation functions for the different methods
train_mnist.py, train_imagenet.py: main functions for training classifiers on generated MNIST and ImageNet10
-
build_generator_mnist.py, build_generator_imagenet.py: main functions for generating image datasets
-
test_robustness_mnist.py, test_robustness_imagenet.py: main functions for evaluating adversarial robustness