
Guided Diffusion Model for Adversarial Purification

by Jinyi Wang, Zhaoyang Lyu, Bo Dai, Hongfei Fu

This repository includes the official PyTorch implementation of our paper:

@article{wang2022guided,
  title={Guided diffusion model for adversarial purification},
  author={Wang, Jinyi and Lyu, Zhaoyang and Lin, Dahua and Dai, Bo and Fu, Hongfei},
  journal={arXiv preprint arXiv:2205.14969},
  year={2022}
}

What does our work do?

We propose a method that defends a neural network classifier against (stochastic) adversarial attacks by purifying its inputs with a Guided Diffusion Model.
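At a high level, purification noises an input with the forward diffusion process and then denoises it with the guided reverse process before classification. Below is a minimal sketch of this purify-then-classify loop; the names forward_diffuse, guided_reverse, and classifier are hypothetical placeholders and not the actual API of this repository.

import torch

def purify_and_classify(x_adv, forward_diffuse, guided_reverse, classifier, t_star=100):
    """Hypothetical sketch: purify an (adversarial) input with a diffusion
    model, then classify the purified image.

    forward_diffuse(x, t): adds t steps of forward-process noise to x.
    guided_reverse(x_t, t, x_ref): runs the reverse (denoising) process from
        step t back to 0, guided toward the reference image x_ref.
    classifier(x): the downstream classifier being protected.
    """
    with torch.no_grad():
        # 1. Partially noise the input so adversarial perturbations are washed out.
        x_t = forward_diffuse(x_adv, t_star)
        # 2. Denoise back to a clean-looking image, guided by the (noisy) input.
        x_pur = guided_reverse(x_t, t_star, x_ref=x_adv)
        # 3. Classify the purified image instead of the raw input.
        return classifier(x_pur)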

Running the Code

Dependencies

Run the following command to install the necessary Python 3 packages with Anaconda.

conda env create -f environment.yml
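
Then activate the created environment before running the code (replace <env-name> with the environment name defined in environment.yml):

conda activate <env-name>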

Running experiments

To run the experiments, enter the following command.

python main.py --config <config-file>

Example: CIFAR-10

Train a WideResNet classifier on CIFAR-10 and copy its weights file origin.t7 to

mkdir -p clf_models/run/logs/cifar10
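
For example, assuming the trained weights file origin.t7 is in the current directory:

cp origin.t7 clf_models/run/logs/cifar10/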

To start the purification process, run

python main.py --config default.yml

Example: ImageNet

Download the pretrained weights and copy them to the folder models.
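
For example, assuming the downloaded checkpoint is the 256x256 pretrained guided-diffusion model referenced in the Notes below and is saved as 256x256_diffusion.pt (the exact file name is an assumption; use the name expected by your configuration file):

mkdir -p models
cp 256x256_diffusion.pt models/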

We provide an example configuration file, configs/ImageNet_PGD.yml, in the repository.

python main.py --config ImageNet_Res50.yml

Running experiments in parallel

To run the experiments in parallel, enter the following command.

python parallel_run.py --device 8 --rank 0 --world_size 8 --config ImageNet_Res50.yml

We also provide an example configuration file, configs/cifar10_bpda_eot_sigma025_eot15.yml, in the repository.
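
To launch all ranks at once, a simple shell loop can be used. This is only a sketch: it keeps --device and --world_size fixed as in the example above and starts one process per rank; the exact meaning of --device (e.g. whether it must be set per rank) should be checked in parallel_run.py.

for rank in $(seq 0 7); do
    python parallel_run.py --device 8 --rank $rank --world_size 8 --config ImageNet_Res50.yml &
done
wait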

Attack Methods

For adversarial attacks, the classifier PGD attack and the BPDA+EOT attack are implemented in attacks/clf_pgd.py and attacks/bpda_strong.py, respectively. In the configuration file, set attack.attack_method to clf_pgd or bpda_strong to run the corresponding attack.
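
For instance, the relevant part of a configuration file might look like the following. Only the attack.attack_method field and its two values are documented above; the surrounding YAML structure is an assumption based on the dotted name.

attack:
  attack_method: clf_pgd    # or bpda_strong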

Main components

File name: Explanation
main.py: Entry point; initializes configurations and loggers and runs the experiments.
runners/empirical.py: Attacks and purifies images to evaluate empirical adversarial robustness.
attacks/bpda_strong.py: Code for the BPDA+EOT attack.
purification/adp.py: Code for adversarial purification.
guided_diffusion/*: Code for the DDPM on ImageNet.
pytorch_diffusion/*: Code for the DDPM on CIFAR-10.
networks/*: Classifier network architectures used in the experiments.
utils/*: Utility files.

Notes

  • In the configuration files, the perturbation scale attack.ptb and the one-step attack scale attack.alpha are specified in the pixel range [0, 255]; during execution they are rescaled to the pixel range [0, 1], in which the main experiments are performed (see the sketch after this list).
  • For training the diffusion and classifier models, we primarily used pre-existing methods and checkpoints, such as the 256*256_pretrained_diffusion_model and the WideResNet classifier. Here is the repository we used for training the WideResNet classifier.
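
As a concrete illustration of the rescaling, a configuration might contain something like the following. The field names attack.ptb and attack.alpha come from the configuration files; the specific values are only an example, not the settings used in the paper.

attack:
  ptb: 8      # perturbation scale in the [0, 255] range; 8/255 ≈ 0.031 after rescaling to [0, 1]
  alpha: 2    # one-step attack scale in [0, 255]; 2/255 ≈ 0.008 after rescaling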

Contact

For further details, please contact jinyi.wang@sjtu.edu.cn.

License

MIT

This implementation is based on / inspired by:
