Towards Efficient Training and Evaluation of Robust Models against $l_0$ Bounded Adversarial Perturbations
This is the code repository for the paper *Towards Efficient Training and Evaluation of Robust Models against $l_0$ Bounded Adversarial Perturbations*, ICML 2024.
This work studies sparse adversarial perturbations bounded by the $l_0$ norm.
To execute the code, please make sure that the following packages are installed:
- NumPy
- PyTorch and Torchvision (install with CUDA if available)
- matplotlib
- robustbench
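Assuming a standard Python environment, the dependencies above can be installed with pip. The exact package sources are assumptions; in particular, RobustBench recommends installing from its GitHub repository:

```shell
# Install dependencies. For a CUDA build of PyTorch, follow the
# selector on pytorch.org instead of the plain pip install below.
pip install numpy matplotlib
pip install torch torchvision
pip install git+https://github.com/RobustBench/robustbench.git
```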
Run the following command to train PreActResNet18 with sAT or sTRADES:
```shell
python adversarial_training/train.py --exp_name debug --data_name [DATASET NAME] --data_dir [DATA PATH] --model_name preactresnet --max_epoch 100 --batch_size 128 --lr 0.05 --train_loss [adv, trades] --train_mode rand --patience 10 -k 120 --n_iters 20 --gpu 0
```
- exp_name: experiment name
- data_name: choose from cifar10, cifar100, imagenet100 or gtsrb
- data_dir: path to the dataset
- model_name: choose a model for training, e.g., 'preactresnet' or 'resnet34'
- max_epoch: number of epochs for training
- batch_size: batch size
- lr: initial learning rate
- train_loss: choose a loss for training from 'adv' (sAT) or 'trades' (sTRADES)
- k: $l_0$ norm budget
- n_iters: number of iterations for sPGD
- gpu: gpu id
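The attack used during training, sPGD, keeps perturbations within an $l_0$ ball of radius k, i.e., at most k perturbed pixels. As a rough illustration of what such a constraint means (a sketch only, not the repository's sPGD implementation), a perturbation can be projected onto the $l_0$ ball by keeping only the k pixels with the largest perturbation magnitude:

```python
import torch

def project_l0(delta: torch.Tensor, k: int) -> torch.Tensor:
    """Keep only the k pixels with the largest perturbation magnitude.

    delta: perturbation of shape (C, H, W); a "pixel" spans all C channels.
    Illustrative sketch only -- not the repository's actual projection.
    """
    c, h, w = delta.shape
    # Per-pixel magnitude, aggregated over channels.
    mag = delta.abs().sum(dim=0).flatten()      # shape (H*W,)
    if k >= mag.numel():
        return delta
    topk = mag.topk(k).indices                  # k largest pixels
    mask = torch.zeros_like(mag)
    mask[topk] = 1.0
    return delta * mask.view(1, h, w)           # zero out all other pixels
```

After this projection, the perturbation is nonzero in at most k pixel locations, so the perturbed image stays within the $l_0$ budget.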
Run the following command to run sAA on CIFAR10 or CIFAR100:
```shell
python autoattack/evaluate.py --dataset [DATASET NAME] --data_dir [DATASET PATH] --model [standard, l1, linf, l2, l0] --ckpt [CHECKPOINT NAME OR PATH] -k 20 --n_iters 10000 --n_examples 10000 --gpu 0 --bs 500
```
- dataset: cifar10 or cifar100
- data_dir: path to the dataset (downloaded automatically if it does not exist)
- model: choose a model from standard, l1, linf, l2, l0
- ckpt: checkpoint name (for robustbench models) or checkpoint path (for vanilla, $l_1$ and $l_0$ models)
- k: $l_0$ norm budget
- n_iters: number of iterations
- n_examples: number of examples used to evaluate the attack
Run the following command to run sAA on ImageNet100 or GTSRB:
```shell
python autoattack/evaluate_large.py --dataset [DATASET NAME] --data_dir [DATASET PATH] --model [standard, l1, linf, l2, l0] --ckpt [CHECKPOINT NAME OR PATH] -k 200 --n_iters 10000 --n_examples 500 --gpu 0 --bs 64
```
- dataset: imagenet100 or gtsrb
- data_dir: path to the dataset (please download the datasets yourself)
- Other arguments are the same as above.
To run a single attack (sPGD unproj, sPGD proj, or RS), use the following command:
```shell
python evaluate_single.py --dataset [DATASET NAME] --data_dir [DATASET PATH] --model [standard, l1, linf, l2, l0] --ckpt [CHECKPOINT NAME OR PATH] -k 20 --bs 500 --n_iters 10000 --n_examples 10000 --gpu 0 [--projected] [--unprojected] [--black] [--calc_aa]
```
- unprojected: run sPGD unproj
- projected: run sPGD proj
- black: run RS
- calc_aa: calculate the ensemble robust accuracy. When all three of the arguments above are set, this is equivalent to sAA, but less efficient than evaluate.py or evaluate_large.py because there is no cascade ensemble.
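For reference, the ensemble robust accuracy counts an example as robust only if it remains correctly classified under every attack. A minimal sketch of this aggregation, assuming one boolean correctness mask per attack (this is an illustration, not the repository's code):

```python
import numpy as np

def ensemble_robust_accuracy(*masks: np.ndarray) -> float:
    """Each mask[i] is True iff example i is still classified correctly
    under one attack; an example counts as robust only if it is correct
    under ALL attacks. Sketch only, not the repository's implementation."""
    robust = np.logical_and.reduce(masks)
    return robust.mean().item()
```

For example, the three masks could come from sPGD unproj, sPGD proj, and RS; the resulting value is the worst-case (ensemble) robust accuracy over those attacks.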
The checkpoint files of the models trained with the proposed method are available here.
Parts of the code are based on DengpanFu/RobustAdversarialNetwork, a PyTorch re-implementation of the paper "Towards Deep Learning Models Resistant to Adversarial Attacks".
The Sparse-RS code is from fra31/sparse-rs, a versatile framework for query-efficient sparse black-box adversarial attacks.
If you find this repository helpful for your project, please consider citing:
```bibtex
@inproceedings{zhong2024towards,
  title={Towards Efficient Training and Evaluation of Robust Models against $l_0$ Bounded Adversarial Perturbations},
  author={Xuyang Zhong and Yixiao Huang and Chen Liu},
  booktitle={International Conference on Machine Learning},
  year={2024},
  organization={PMLR}
}
```