NATTACK: Learning the Distributions of Adversarial Examples for an Improved Black-Box Attack on Deep Neural Networks
The data and models can be found here: data and model.
Please download the data and models and unzip them to `./cifar-data` and `./all_models`, respectively.
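As a quick sanity check, the small Python sketch below (not part of the repository) verifies that the archives were unzipped to the locations expected above:

```python
import os

# Sanity check (illustrative only): confirm the downloaded archives were
# unzipped to the directories expected by the repository.
for path in ("./cifar-data", "./all_models"):
    if not os.path.isdir(path):
        raise FileNotFoundError(
            f"Expected directory '{path}' is missing; unzip the downloaded "
            "archives to './cifar-data' and './all_models'.")
print("Data and model directories found.")
```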
Below is Table 1 from our paper, showing how robust each evaluated defense is to the adversarial examples our attack constructs:
Defense | Dataset | Distance | Success rate |
---|---|---|---|
ADV-TRAIN Madry et al. (2018) | CIFAR | 0.031 (linf) | 47.9% |
ADV-BNN Liu et al. (2019) | CIFAR | 0.035 (linf) | 75.3% |
THERM-ADV Buckman et al. (2018); Madry et al. (2018) | CIFAR | 0.031 (linf) | 91.2% |
CAS-ADV Na et al. (2018) | CIFAR | 0.031 (linf) | 97.7% |
ADV-GAN Wang & Yu (2019) | CIFAR | 0.015 (linf) | 98.3% |
LID Ma et al. (2018) | CIFAR | 0.031 (linf) | 100.0% |
THERM Buckman et al. (2018) | CIFAR | 0.031 (linf) | 100.0% |
SAP Dhillon et al. (2018) | CIFAR | 0.031 (linf) | 100.0% |
RSE Liu et al. (2018) | CIFAR | 0.031 (linf) | 100.0% |
GUIDED DENOISER Liao et al. (2018) | ImageNet | 0.031 (linf) | 95.5% |
RANDOMIZATION Xie et al. (2018) | ImageNet | 0.031 (linf) | 96.5% |
INPUT-TRANS Guo et al. (2018) | ImageNet | 0.005 (l2) | 100.0% |
PIXEL DEFLECTION Prakash et al. (2018) | ImageNet | 0.031 (linf) | 100.0% |
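The distances above are measured per pixel on images scaled to [0, 1]. For orientation only, the sketch below shows how an attack under the linf threat model is typically judged successful; `predict_fn` is a hypothetical callable standing in for the defended model and is not part of this repository's API:

```python
import numpy as np

def linf_attack_success(x, x_adv, true_label, predict_fn, eps=0.031):
    """Judge an l_inf-bounded attack: the perturbation must stay within eps
    per pixel (images in [0, 1]) and the defended model must misclassify.
    `predict_fn` is a hypothetical callable returning the predicted label."""
    within_budget = float(np.max(np.abs(x_adv - x))) <= eps + 1e-6
    misclassified = predict_fn(x_adv) != true_label
    return within_budget and misclassified
```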
The paper is available on arXiv. If you find it helpful, please cite our work:
```
@inproceedings{Li2019NATTACKLT,
  title={NATTACK: Learning the Distributions of Adversarial Examples for an Improved Black-Box Attack on Deep Neural Networks},
  author={Yandong Li and Lijun Li and Liqiang Wang and Tong Zhang and Boqing Gong},
  year={2019}
}
```
This repository contains:
- our implementation of the black-box attack algorithm described in our paper (a high-level sketch of the idea appears below);
- six defense methods (SAP, LID, RANDOMIZATION, INPUT-TRANS, THERM, and THERM-ADV) borrowed from the code of Athalye et al. (2018);
- two defended models (GUIDED DENOISER and PIXEL DEFLECTION) based on the code of Athalye & Carlini (2018);
- two defended models (RSE and CAS-ADV) from the original papers.
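At a high level, the attack learns a distribution of adversarial perturbations around an input and updates that distribution using only the model's outputs. The sketch below illustrates this idea with an evolution-strategy-style gradient estimate under simplifying assumptions (a single image, a Gaussian parameterization, an linf projection, and a hypothetical black-box `loss_fn`); it is a minimal illustration, not the repository's implementation:

```python
import numpy as np

def nes_blackbox_attack(x, loss_fn, eps=0.031, sigma=0.1, lr=0.02,
                        samples=100, iters=300):
    """Sketch of an evolution-strategy-style black-box attack: sample
    perturbations around a learned mean, query a black-box loss, and move
    the mean along the estimated gradient.  `loss_fn(x_adv)` is a
    hypothetical oracle returning a scalar attack loss (lower = more
    adversarial); `x` is an image with pixel values in [0, 1]."""
    mu = np.zeros_like(x)  # mean of the perturbation distribution
    for _ in range(iters):
        noise = np.random.randn(samples, *x.shape)
        losses = np.empty(samples)
        for i in range(samples):
            delta = np.clip(mu + sigma * noise[i], -eps, eps)  # l_inf budget
            x_adv = np.clip(x + delta, 0.0, 1.0)               # valid image
            losses[i] = loss_fn(x_adv)
        # Gradient estimate of the expected loss w.r.t. mu (normalized losses)
        z = (losses - losses.mean()) / (losses.std() + 1e-7)
        grad = (z.reshape(-1, *([1] * x.ndim)) * noise).mean(axis=0) / sigma
        mu -= lr * grad  # descend: lower loss means more adversarial
    return np.clip(x + np.clip(mu, -eps, eps), 0.0, 1.0)
```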