Revisiting Residual Networks for Adversarial Robustness: An Architectural Perspective [arXiv]

Overview

This work presents a holistic study of the impact of architectural choice on adversarial robustness.

(Left) Impact of architectural components on adversarial robustness on CIFAR-10, relative to the impact of adversarial training methods. (Right) Chronological progress of SotA robust accuracy against AutoAttack, without additional data, on CIFAR-10 under $\ell_{\infty}$ perturbations of $\epsilon=8/255$.

Impact of Block-level Design

The design of a block primarily comprises its topology, type of convolution and kernel size, choice of activation, and normalization. We examine these elements independently through controlled experiments and, based on our observations, propose a novel residual block, dubbed RobustResBlock. An overview of RobustResBlock is provided below:

(Figure: overview of the RobustResBlock design)
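To make the block-level vocabulary concrete, below is a minimal PyTorch sketch of two of the ingredients the block's constructor exposes: pre-activation ordering (normalization and activation before each convolution) and a squeeze-and-excitation (SE) module. The multi-scale (`scales`) and grouped-convolution (`cardinality`) components are omitted for brevity; this is an illustration of the general pattern, not the authors' exact RobustResBlock.

  import torch
  import torch.nn as nn

  class SqueezeExcite(nn.Module):
      """Channel re-weighting via global pooling + a bottleneck gate."""
      def __init__(self, chs, reduction=64):
          super().__init__()
          hidden = max(chs // reduction, 1)
          self.gate = nn.Sequential(
              nn.AdaptiveAvgPool2d(1),
              nn.Conv2d(chs, hidden, 1),
              nn.ReLU(inplace=True),
              nn.Conv2d(hidden, chs, 1),
              nn.Sigmoid(),
          )

      def forward(self, x):
          return x * self.gate(x)

  class PreActSEBlock(nn.Module):
      """Pre-activation residual block with an SE module (illustrative only)."""
      def __init__(self, in_chs, out_chs, stride=1, se_reduction=64):
          super().__init__()
          self.bn1 = nn.BatchNorm2d(in_chs)
          self.conv1 = nn.Conv2d(in_chs, out_chs, 3, stride, 1, bias=False)
          self.bn2 = nn.BatchNorm2d(out_chs)
          self.conv2 = nn.Conv2d(out_chs, out_chs, 3, 1, 1, bias=False)
          self.se = SqueezeExcite(out_chs, se_reduction)
          self.act = nn.ReLU(inplace=True)
          # projection shortcut only when the shape changes
          self.proj = (nn.Conv2d(in_chs, out_chs, 1, stride, bias=False)
                       if stride != 1 or in_chs != out_chs else None)

      def forward(self, x):
          preact = self.act(self.bn1(x))  # BN-ReLU before conv (pre-activation)
          shortcut = x if self.proj is None else self.proj(preact)
          out = self.conv1(preact)
          out = self.conv2(self.act(self.bn2(out)))
          return self.se(out) + shortcut

  x = torch.randn(2, 64, 32, 32)
  print(PreActSEBlock(64, 128, stride=2)(x).shape)  # torch.Size([2, 128, 16, 16])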

Table 1. White-box adversarial robustness of WRN with RobustResBlock

| Model | $^{\#}\rm{P}$ | $^{\#}\rm{F}$ | $\rm{PGD}^{20}$ | $\rm{CW}^{40}$ | Checkpoint |
|---|---|---|---|---|---|
| $D=4$, $W=10$ | 39.6M | 6.00G | 57.70 | 54.71 | [BaiduDisk] |
| $D=5$, $W=12$ | 70.5M | 10.6G | 58.46 | 55.56 | [BaiduDisk] |
| $D=7$, $W=14$ | 133M | 19.6G | 59.41 | 56.62 | [BaiduDisk] |
| $D=11$, $W=16$ | 270M | 39.3G | 60.48 | 57.78 | [BaiduDisk] |

Impact of Network-level Design

Independent Scaling by Depth ($D_1 : D_2 : D_3 = 2 : 2 : 1$)

We allow the depth of each stage ($D_{i\in\{1,2,3\}}$) to vary among $\{2, 3, 4, 5, 7, 9, 11\}$; details and pre-trained checkpoints for all $7^{3} = 343$ depth settings are available here.

(Figure: effect of independent depth scaling)

Independent Scaling by Width ($W_1 : W_2 : W_3 = 2 : 2.5 : 1$)

We allow the width (in terms of widening factors) of each stage ($W_{i\in\{1,2,3\}}$) to vary among $\{4, 6, 8, 10, 12, 14, 16, 20\}$; details and pre-trained checkpoints for all $8^{3} = 512$ width settings are available here.

(Figure: effect of independent width scaling)
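For reference, the grids behind these two sweeps are easy to reproduce; the snippet below simply enumerates the option sets stated above and confirms the $7^3 = 343$ and $8^3 = 512$ counts.

  from itertools import product

  # Per-stage option sets for the depth and width sweeps described above
  depth_choices = [2, 3, 4, 5, 7, 9, 11]        # D1, D2, D3 each drawn from here
  width_choices = [4, 6, 8, 10, 12, 14, 16, 20]  # W1, W2, W3 each drawn from here

  depth_settings = list(product(depth_choices, repeat=3))
  width_settings = list(product(width_choices, repeat=3))
  print(len(depth_settings), len(width_settings))  # 343 512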

Interplay between Depth and Width ($\sum D_i : \sum W_i = 7 : 3$)

(Figures: compound scaling of depth and width; comparison of scaling strategies)

Table 2. Performance of independent scaling ($D$ or $W$) and compound scaling ($D\&W$)

| Target $^{\#}\rm{F}$ | Scale by | $D_1$ | $W_1$ | $D_2$ | $W_2$ | $D_3$ | $W_3$ | $^{\#}\rm{P}$ | $^{\#}\rm{F}$ | $\rm{PGD}^{20}$ | $\rm{CW}^{40}$ | Checkpoint |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 5G | $D$ | 5 | 10 | 5 | 10 | 2 | 10 | 24.0M | 5.25G | 56.05 | 53.14 | [BaiduDisk] |
| 5G | $W$ | 4 | 11 | 4 | 13 | 4 | 6 | 24.5M | 5.71G | 56.89 | 53.87 | [BaiduDisk] |
| 5G | $D\&W$ | 14 | 5 | 14 | 7 | 7 | 3 | 17.7M | 5.09G | 57.49 | 54.78 | [BaiduDisk] |
| 10G | $D$ | 6 | 12 | 6 | 12 | 3 | 12 | 48.5M | 9.59G | 56.42 | 53.91 | [BaiduDisk] |
| 10G | $W$ | 5 | 13 | 5 | 16 | 5 | 7 | 44.4M | 10.5G | 57.06 | 54.29 | [BaiduDisk] |
| 10G | $D\&W$ | 17 | 7 | 17 | 9 | 8 | 4 | 39.3M | 9.74G | 58.06 | 55.45 | [BaiduDisk] |
| 20G | $D$ | 9 | 14 | 8 | 14 | 4 | 14 | 90.4M | 18.6G | 57.11 | 54.48 | [BaiduDisk] |
| 20G | $W$ | 7 | 16 | 7 | 18 | 7 | 8 | 81.7M | 20.4G | 58.02 | 55.34 | [BaiduDisk] |
| 20G | $D\&W$ | 22 | 8 | 22 | 11 | 11 | 5 | 74.8M | 20.3G | 58.47 | 56.14 | [BaiduDisk] |
| 40G | $D$ | 14 | 16 | 13 | 16 | 11 | 16 | 185M | 38.8G | 57.90 | 55.79 | [BaiduDisk] |
| 40G | $W$ | 11 | 18 | 11 | 21 | 11 | 9 | 170M | 42.7G | 58.48 | 56.15 | [BaiduDisk] |
| 40G | $D\&W$ | 27 | 10 | 28 | 14 | 13 | 6 | 147M | 40.4G | 58.76 | 56.59 | [BaiduDisk] |
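As a concrete check of the $7:3$ rule, the 5G $D\&W$ row gives $\sum D_i = 14+14+7 = 35$ and $\sum W_i = 5+7+3 = 15$, i.e., exactly $35:15 = 7:3$; the larger budgets follow the same rule up to integer rounding of the per-stage values.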

Adversarially Robust Residual Networks (RobustResNets)

We apply the proposed compound scaling rule to networks built from RobustResBlock, yielding a portfolio of adversarially robust residual networks.

Table 3. Comparison to SotA methods with additional 500K data

| Method | Model | $^{\#}\rm{P}$ | $^{\#}\rm{F}$ | $\rm{AA}$ | Checkpoint |
|---|---|---|---|---|---|
| RST | WRN-28-10 | 36.5M | 5.20G | 59.53 | |
| AWP | WRN-28-10 | 36.5M | 5.20G | 60.04 | |
| HAT | WRN-28-10 | 36.5M | 5.20G | 62.50 | |
| Gowal et al. | WRN-28-10 | 36.5M | 5.20G | 62.80 | |
| Huang et al. | WRN-34-R | 68.1M | 19.1G | 62.54 | |
| Ours | RobustResNet-A1 | 19.2M | 5.11G | 63.70 | [BaiduDisk] |
| Ours | WRN-A4 | 147M | 40.4G | 65.79 | [BaiduDisk] |

How to use

1. Use our RobustResNets

  from models.resnet import PreActResNet

  # Syntax: D1-D3 and W1-W3 are the per-stage depths and widening factors;
  # see the "D&W" rows of Table 2 for concrete values.
  depth = [D1, D2, D3]
  channels = [16, 16 * W1, 32 * W2, 64 * W3]
  block_types = ['robust_res_block', 'robust_res_block', 'robust_res_block']

  model = PreActResNet(
      depth_configs=depth,
      channel_configs=channels,
      block_types=block_types,
      scales=8,
      base_width=10,
      cardinality=4,
      se_reduction=64,
      num_classes=10,  # for CIFAR-10/SVHN/MNIST
  )

  # RobustResNet-A1
  model = PreActResNet(
      depth_configs=[14, 14, 7],
      channel_configs=[16, 16 * 5, 32 * 7, 64 * 3],  # W = [5, 7, 3]
      ...)
  # RobustResNet-A2
  model = PreActResNet(
      depth_configs=[17, 17, 8],
      channel_configs=[16, 16 * 7, 32 * 9, 64 * 4],  # W = [7, 9, 4]
      ...)
  # RobustResNet-A3
  model = PreActResNet(
      depth_configs=[22, 22, 11],
      channel_configs=[16, 16 * 8, 32 * 11, 64 * 5],  # W = [8, 11, 5]
      ...)
  # RobustResNet-A4
  model = PreActResNet(
      depth_configs=[27, 28, 13],
      channel_configs=[16, 16 * 10, 32 * 14, 64 * 6],  # W = [10, 14, 6]
      ...)

  # If you prefer WRN's block but with our scaling (WRN-A1):
  model = PreActResNet(
      depth_configs=[14, 14, 7],
      channel_configs=[16, 16 * 5, 32 * 7, 64 * 3],
      block_types=['basic_block', 'basic_block', 'basic_block'],
      ...)

2. Use our RobustResBlock on its own

  from models.resnet import RobustResBlock
  # See Table 1 above for the performance of RobustResBlock.
  # in_chs/out_chs are placeholders for the input/output channel counts.
  block = RobustResBlock(
      in_chs, out_chs,
      kernel_size=3,
      scales=8,
      base_width=10,
      cardinality=4,
      se_reduction=64,
      activation='ReLU',
      normalization='BatchNorm')
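A hedged usage sketch, assuming RobustResBlock behaves like a standard nn.Module; the concrete channel counts (64 in, 128 out) are illustrative only:

  import torch

  block = RobustResBlock(
      64, 128,  # illustrative in_chs/out_chs
      kernel_size=3, scales=8, base_width=10, cardinality=4,
      se_reduction=64, activation='ReLU', normalization='BatchNorm')
  y = block(torch.randn(1, 64, 32, 32))  # y should have 128 channels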

3. Use our compound scaling rule, RobustScaling, to scale your custom models

Please see examples/compound_scaling.ipynb
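The notebook is the authoritative implementation. As a rough illustration of the idea only, here is a minimal sketch assuming a scalar "capacity budget" that is split $7:3$ between total depth and total width and then distributed across stages with the ratios reported above ($D_1:D_2:D_3 = 2:2:1$ and $W_1:W_2:W_3 = 2:2.5:1$); the budget parameterization and rounding are assumptions for illustration, not the notebook's exact algorithm.

  # A minimal sketch of the compound scaling idea (NOT the authors' exact
  # RobustScaling implementation; see examples/compound_scaling.ipynb).
  def compound_scale(total_budget):
      # Split the total capacity budget 7:3 between depth and width ...
      depth_total = total_budget * 7 / 10
      width_total = total_budget * 3 / 10
      # ... then distribute across the three stages with the fixed ratios.
      d_ratio, w_ratio = [2, 2, 1], [2, 2.5, 1]
      depths = [round(depth_total * r / sum(d_ratio)) for r in d_ratio]
      widths = [round(width_total * r / sum(w_ratio)) for r in w_ratio]
      return depths, widths

  # A budget of 50 recovers RobustResNet-A1's configuration from Table 2:
  print(compound_scale(50))  # ([14, 14, 7], [5, 7, 3])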

How to evaluate pre-trained models

  • Download the checkpoints, which should contain the following:
    arch_xxx/
      -arch_xxx.log  # training log
      -arch_xxx.yaml  # configuration file 
      -checkpoints/
        -arch_xxx.pth  # last epoch checkpoint
        -arch_xxx_best.pth  # checkpoint for best robust acc on valid set
    
  • Run the following command to evaluate adversarial robustness, choosing the attack together with the matching number of steps (FGSM: 1, PGD: 20, CW: 40, AA: 0):
  python eval_robustness.py \
    --data "path to data" \
    --config_file_path "path to configuration yaml file" \
    --checkpoint_path "path to checkpoint pth file" \
    --save_path "path to file for logging evaluation" \
    --attack_choice [FGSM/PGD/CW/AA] \
    --num_steps [1/20/40/0] \
    --batch_size 100  # adjust according to your GPU memory
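For instance, an AutoAttack evaluation of a downloaded checkpoint might look as follows (the paths are hypothetical placeholders, not files shipped with the repo):

  python eval_robustness.py \
    --data ./datasets \
    --config_file_path ./arch_001/arch_001.yaml \
    --checkpoint_path ./arch_001/checkpoints/arch_001_best.pth \
    --save_path ./arch_001/eval_aa.log \
    --attack_choice AA \
    --num_steps 0 \
    --batch_size 100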

CIFAR-10 (TRADES)

| Model | $^{\#}\rm{P}$ | $^{\#}\rm{F}$ | Clean | $\rm{PGD}^{20}$ | $\rm{CW}^{40}$ | AA | Checkpoint |
|---|---|---|---|---|---|---|---|
| WRN-28-10 | 36.5M | 5.20G | 84.62 | 55.90 | 53.15 | 51.66 | [BaiduDisk] |
| RobNet-large-v2 | 33.3M | 5.10G | 84.57 | 52.79 | 48.94 | 47.48 | [BaiduDisk] |
| AdvRush | 32.6M | 4.97G | 84.95 | 56.99 | 53.27 | 52.90 | [BaiduDisk] |
| RACL | 32.5M | 4.93G | 83.91 | 55.98 | 53.22 | 51.37 | [BaiduDisk] |
| RRN-A1 (ours) | 19.2M | 5.11G | 85.46 | 58.47 | 55.72 | 54.42 | [BaiduDisk] |
| WRN-34-12 | 66.5M | 9.60G | 84.93 | 56.01 | 53.53 | 51.97 | [BaiduDisk] |
| WRN-34-R | 68.1M | 19.1G | 85.80 | 57.35 | 54.77 | 53.23 | [BaiduDisk] |
| RRN-A2 (ours) | 39.0M | 10.8G | 85.80 | 59.72 | 56.74 | 55.49 | [BaiduDisk] |
| WRN-46-14 | 128M | 18.6G | 85.22 | 56.37 | 54.19 | 52.63 | [BaiduDisk] |
| RRN-A3 (ours) | 75.9M | 19.9G | 86.79 | 60.10 | 57.29 | 55.84 | [BaiduDisk] |
| WRN-70-16 | 267M | 38.8G | 85.51 | 56.78 | 54.52 | 52.80 | [BaiduDisk] |
| RRN-A4 (ours) | 147M | 39.4G | 87.10 | 60.26 | 57.90 | 56.29 | [BaiduDisk] |

CIFAR-100 (TRADES)

| Model | $^{\#}\rm{P}$ | $^{\#}\rm{F}$ | Clean | $\rm{PGD}^{20}$ | $\rm{CW}^{40}$ | AA | Checkpoint |
|---|---|---|---|---|---|---|---|
| WRN-28-10 | 36.5M | 5.20G | 56.30 | 29.91 | 26.22 | 25.26 | [BaiduDisk] |
| RobNet-large-v2 | 33.3M | 5.10G | 55.27 | 29.23 | 24.63 | 23.69 | [BaiduDisk] |
| AdvRush | 32.6M | 4.97G | 56.40 | 30.40 | 26.16 | 25.27 | [BaiduDisk] |
| RACL | 32.5M | 4.93G | 56.09 | 30.38 | 26.65 | 25.65 | [BaiduDisk] |
| RRN-A1 (ours) | 19.2M | 5.11G | 59.34 | 32.70 | 27.76 | 26.75 | [BaiduDisk] |
| WRN-34-12 | 66.5M | 9.60G | 56.08 | 29.87 | 26.51 | 25.47 | [BaiduDisk] |
| WRN-34-R | 68.1M | 19.1G | 58.78 | 31.17 | 27.33 | 26.31 | [BaiduDisk] |
| RRN-A2 (ours) | 39.0M | 10.8G | 59.38 | 33.00 | 28.71 | 27.68 | [BaiduDisk] |
| WRN-46-14 | 128M | 18.6G | 56.78 | 30.03 | 27.27 | 26.28 | [BaiduDisk] |
| RRN-A3 (ours) | 75.9M | 19.9G | 60.16 | 33.59 | 29.58 | 28.48 | [BaiduDisk] |
| WRN-70-16 | 267M | 38.8G | 56.93 | 29.76 | 27.20 | 26.12 | [BaiduDisk] |
| RRN-A4 (ours) | 147M | 39.4G | 61.66 | 34.25 | 30.04 | 29.00 | [BaiduDisk] |

CIFAR-10 (SAT)

| Model | $^{\#}\rm{P}$ | $^{\#}\rm{F}$ | $\rm{PGD}^{20}$ | $\rm{CW}^{40}$ | Checkpoint |
|---|---|---|---|---|---|
| WRN-28-10 | 36.5M | 5.20G | 52.44 | 50.97 | [BaiduDisk] |
| RRN-A1 (ours) | 19.2M | 5.11G | 57.62 | 56.06 | [BaiduDisk] |
| WRN-34-12 | 66.5M | 9.60G | 52.85 | 51.36 | [BaiduDisk] |
| RRN-A2 (ours) | 39.0M | 10.8G | 58.39 | 56.99 | [BaiduDisk] |
| WRN-46-14 | 128M | 18.6G | 53.67 | 52.95 | [BaiduDisk] |
| RRN-A3 (ours) | 75.9M | 19.9G | 58.81 | 57.60 | [BaiduDisk] |
| WRN-70-16 | 267M | 38.8G | 54.12 | 50.52 | [BaiduDisk] |
| RRN-A4 (ours) | 147M | 39.4G | 59.01 | 57.85 | [BaiduDisk] |

CIFAR-10 (MART)

| Model | $^{\#}\rm{P}$ | $^{\#}\rm{F}$ | $\rm{PGD}^{20}$ | $\rm{CW}^{40}$ | Checkpoint |
|---|---|---|---|---|---|
| WRN-28-10 | 36.5M | 5.20G | 57.69 | 52.88 | [BaiduDisk] |
| RRN-A1 (ours) | 19.2M | 5.11G | 59.34 | 54.42 | [BaiduDisk] |
| WRN-34-12 | 66.5M | 9.60G | 57.40 | 53.11 | [BaiduDisk] |
| RRN-A2 (ours) | 39.0M | 10.8G | 60.33 | 55.51 | [BaiduDisk] |
| WRN-46-14 | 128M | 18.6G | 58.43 | 54.32 | [BaiduDisk] |
| RRN-A3 (ours) | 75.9M | 19.9G | 60.95 | 56.52 | [BaiduDisk] |
| WRN-70-16 | 267M | 38.8G | 58.15 | 54.37 | [BaiduDisk] |
| RRN-A4 (ours) | 147M | 39.4G | 61.88 | 57.55 | [BaiduDisk] |

How to train

Baseline adversarial training

# Pick a random free port for --master_port. --exp_name is the path where
# training stats are stored; --version may also be RobustResNet-A1/A2/A3/A4.
python -m torch.distributed.launch \
  --nproc_per_node=2 --master_port 24220 \
  main_dist.py \
  --config_path ./configs/CIFAR10 \
  --exp_name ./exps/CIFAR10 \
  --version [WRN-A1/A2/A3/A4] \
  --train \
  --data_parallel \
  --apex-amp

Advanced adversarial training

Please download the additional pseudolabeled data from Carmon et al., 2019.

# Pick a random free port for --master_port. --log-dir is where training
# stats are stored and --desc names the run folder; --version may also be
# RobustResNet-A1/A2/A3/A4. --adv-eval-freq sets the robustness-evaluation
# frequency (evaluating too often significantly slows down training) and
# --start-eval begins evaluation after that many epochs. Point
# --aux-data-filename at the downloaded pseudo-labeled data.
python -m torch.distributed.launch \
  --nproc_per_node=8 --master_port 14226 \
  adv-main_dist.py \
  --log-dir ./checkpoints/ \
  --config-path ./configs/Advanced_CIFAR10 \
  --version [WRN-A1/A2/A3/A4] \
  --desc drna4-basic-silu-apex-500k \
  --apex-amp --adv-eval-freq 5 \
  --start-eval 310 \
  --advnorm --adjust_bn True \
  --num-adv-epochs 400 --batch-size 1024 --lr 0.4 --weight-decay 0.0005 --beta 6.0 \
  --data-dir /datasets/ --data cifar10s \
  --aux-data-filename /datasets/ti_500K_pseudo_labeled.pickle \
  --unsup-fraction 0.7

Requirements

The code has been implemented and tested with Python 3.8.5, PyTorch 1.8.0, and NVIDIA Apex (used for mixed-precision acceleration).

Part of the code is based on the following repos:
