# Defenses

Implementations of defense methods used to strengthen the resilience of deep learning models against adversarial examples.

## Description

As with ../Attacks/, we first define and implement each defense class (e.g., NATDefense in NAT.py for the NAT defense) in the Defenses/DefenseMethods/ folder, and then write the corresponding testing code (e.g., NAT_Test.py) that strengthens the original raw model and saves the defense-enhanced model into the DefenseEnhancedModels/ directory. A minimal sketch of this pattern is shown below.
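
The sketch only illustrates the defense-class-plus-test-script pattern; the class, method, and file names (`ExampleDefense`, `defense()`, `Example_MNIST_enhanced.pt`) are hypothetical stand-ins, not the repository's actual API.

```python
# Hypothetical sketch of the Defenses/DefenseMethods/ + *_Test.py pattern.
# ExampleDefense, defense(), and the saved file name are illustrative only.
import os
import torch
import torch.nn as nn


class ExampleDefense:
    """Wraps a raw model and produces a defense-enhanced copy."""

    def __init__(self, model: nn.Module, device: torch.device):
        self.model = model.to(device)
        self.device = device

    def defense(self, train_loader=None) -> nn.Module:
        # A real defense (e.g., adversarial training) would retrain or
        # transform the model here using train_loader.
        return self.model


if __name__ == '__main__':
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    raw_model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))  # stand-in raw model
    enhanced = ExampleDefense(raw_model, device).defense()

    # The test scripts save the strengthened weights under DefenseEnhancedModels/.
    os.makedirs('../DefenseEnhancedModels', exist_ok=True)
    torch.save(enhanced.state_dict(), '../DefenseEnhancedModels/Example_MNIST_enhanced.pt')
```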

## Implemented Defenses

We implement 10 representative complete defenses spanning four categories: adversarial-training-based defenses, gradient-masking-based defenses, input-transformation-based defenses, and region-based classification.

- NAT: A. Kurakin et al., "Adversarial machine learning at scale," in ICLR, 2017.
- EAT: F. Tramèr et al., "Ensemble adversarial training: Attacks and defenses," in ICLR, 2018.
- PAT: A. Madry et al., "Towards deep learning models resistant to adversarial attacks," in ICLR, 2018.
- DD: N. Papernot et al., "Distillation as a defense to adversarial perturbations against deep neural networks," in S&P, 2016.
- IGR: A. S. Ross et al., "Improving the adversarial robustness and interpretability of deep neural networks by regularizing their input gradients," in AAAI, 2018.
- EIT: C. Guo et al., "Countering adversarial images using input transformations," in ICLR, 2018.
- RT: C. Xie et al., "Mitigating adversarial effects through randomization," in ICLR, 2018.
- PD: Y. Song et al., "PixelDefend: Leveraging generative models to understand and defend against adversarial examples," in ICLR, 2018.
- TE: J. Buckman et al., "Thermometer encoding: One hot way to resist adversarial examples," in ICLR, 2018.
- RC: X. Cao et al., "Mitigating evasion attacks to deep neural networks via region-based classification," in ACSAC, 2017.

## Usage

Prepare the defense-enhanced models, with the specific defense parameters used in our evaluation, as follows:

| Defenses | Commands with default parameters |
| --- | --- |
| NAT | `python NAT_Test.py --dataset=MNIST --adv_ratio=0.3 --clip_max=0.3 --eps_mu=0 --eps_sigma=50`<br>`python NAT_Test.py --dataset=CIFAR10 --adv_ratio=0.3 --clip_max=0.1 --eps_mu=0 --eps_sigma=15` |
| EAT | `python EAT_Test.py --dataset=MNIST --train_externals=True --eps=0.3 --alpha=0.05`<br>`python EAT_Test.py --dataset=CIFAR10 --train_externals=True --eps=0.0625 --alpha=0.03125` |
| PAT | `python PAT_Test.py --dataset=MNIST --eps=0.3 --step_num=40 --step_size=0.01`<br>`python PAT_Test.py --dataset=CIFAR10 --eps=0.03137 --step_num=7 --step_size=0.007843` |
| DD | `python DD_Test.py --dataset=MNIST --initial=False --temp=50`<br>`python DD_Test.py --dataset=CIFAR10 --initial=False --temp=50` |
| IGR | `python IGR_Test.py --dataset=MNIST --lambda_r=316`<br>`python IGR_Test.py --dataset=CIFAR10 --lambda_r=10` |
| EIT | `python EIT_Test.py --dataset=MNIST --crop_size=26 --lambda_tv=0.03 --JPEG_quality=85 --bit_depth=4`<br>`python EIT_Test.py --dataset=CIFAR10 --crop_size=30 --lambda_tv=0.03 --JPEG_quality=85 --bit_depth=4` |
| RT | `python RT_Test.py --dataset=MNIST --resize=31`<br>`python RT_Test.py --dataset=CIFAR10 --resize=36` |
| PD | `python PD_Test.py --dataset=MNIST --epsilon=0.3`<br>`python PD_Test.py --dataset=CIFAR10 --epsilon=0.0627` |
| TE | `python TE_Test.py --dataset=MNIST --level=16 --steps=40 --attack_eps=0.3 --attack_step_size=0.01`<br>`python TE_Test.py --dataset=CIFAR10 --level=16 --steps=7 --attack_eps=0.031 --attack_step_size=0.01` |
| RC | `python RC_Test.py --dataset=MNIST --search=True --radius_min=0 --radius_max=0.3 --radius_step=0.01 --num_points=1000`<br>`python RC_Test.py --dataset=CIFAR10 --gpu_index=2 --search=True --radius_min=0.0 --radius_max=0.1 --radius_step=0.01 --num_points=1000` |
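
Once a test script finishes, the strengthened weights can be reloaded from DefenseEnhancedModels/ for later evaluation. The snippet below is a minimal loading sketch, assuming a PyTorch state dict; the file name and model constructor are placeholders, and the exact naming convention follows the corresponding *_Test.py script.

```python
# Hypothetical loading sketch: the file path and the stand-in architecture are
# placeholders; match them to the model and naming used by the chosen *_Test.py.
import torch
import torch.nn as nn

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))  # stand-in architecture
state = torch.load('../DefenseEnhancedModels/NAT_MNIST_enhanced.pt', map_location=device)
model.load_state_dict(state)
model.eval()  # ready for robustness evaluation against adversarial examples
```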