ZOSVRG for Generating Universal Attacks on Black-box Neural Networks
ZOSVRG is a zeroth-order stochastic variance-reduced gradient method for nonconvex optimization. This repo applies ZOSVRG to generating adversarial attacks on black-box neural networks. It contains a pretrained network model for the MNIST classification task and a Python implementation of attack generation that can be applied directly to that model.
For details of the ZOSVRG algorithm, see our NIPS 2018 paper “Zeroth-Order Stochastic Variance Reduction for Nonconvex Optimization” (hereafter, “the Paper”).
This Python code generates universal adversarial attacks on neural networks for the MNIST classification task under the black-box setting. For an image x, the universal attack d is first added to x in the arctanh space, and the final adversarial image is then obtained by applying the tanh transform. In summary, x_adv = tanh(arctanh(2x) + d)/2.
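The transform above can be sketched in a few lines of NumPy. This is a minimal illustration, not the repo's implementation: the function name `apply_universal_perturbation`, the assumption that pixel values lie in [-0.5, 0.5] (implied by the arctanh(2x) term), and the small clipping margin `eps` are all assumptions made for the example.

```python
import numpy as np

def apply_universal_perturbation(x, d, eps=1e-6):
    """Return tanh(arctanh(2x) + d) / 2 for an image x and perturbation d.

    Assumes x is scaled to [-0.5, 0.5]; eps is a hypothetical safety margin.
    """
    # Clip 2x slightly inside (-1, 1) so arctanh stays finite at the boundary.
    z = np.arctanh(np.clip(2.0 * x, -1.0 + eps, 1.0 - eps))
    # Add the perturbation in arctanh space, then map back to image space.
    return np.tanh(z + d) / 2.0
```

Because tanh is bounded, the result always stays inside [-0.5, 0.5] no matter how large d becomes, which is the practical reason for perturbing in the arctanh space.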
The code takes the following parameters:
- optimizer: This parameter specifies the optimizer to use during attack generation. Currently the code supports ZOSGD and ZOSVRG.
- q: The number of random vectors to average over when estimating the gradient.
- alpha: Controls the optimizer's step size. Each update uses a step of alpha/(dimension of x).
- M: (For ZOSVRG) The number of batches to apply during each stage.
- nStage: (For ZOSVRG) The number of stages. Note that for ZOSGD, the number of iterations is equal to M × nStage.
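To show how these parameters interact, here is a rough sketch of a random-direction gradient estimator and an SVRG-style stage loop. It is written under several assumptions and is not the code in this repo: the helper names (`zo_gradient`, `zosvrg`), the smoothing parameter `mu`, sampling directions uniformly on the unit sphere, the `batch_size` argument, and drawing independent directions for every estimate are all choices made for illustration.

```python
import numpy as np

def zo_gradient(f, x, q, mu, rng):
    """Estimate the gradient of f at x by averaging q random-direction differences."""
    dim = x.size
    fx = f(x)
    g = np.zeros(dim)
    for _ in range(q):
        u = rng.standard_normal(dim)
        u /= np.linalg.norm(u)  # direction on the unit sphere (assumed distribution)
        g += (f(x + mu * u) - fx) * u
    return (dim / mu) * g / q

def batch_gradient(fs, idx, x, q, mu, rng):
    """Average the zeroth-order estimates of the component functions in idx."""
    return np.mean([zo_gradient(fs[i], x, q, mu, rng) for i in idx], axis=0)

def zosvrg(fs, x0, q, mu, alpha, M, n_stage, batch_size, rng):
    """SVRG-style loop: n_stage stages, each with M mini-batch updates."""
    x = x0.copy()
    step = alpha / x.size  # step size is alpha / (dimension of x)
    for _ in range(n_stage):
        snapshot = x.copy()
        # Estimate over all component functions at the stage snapshot.
        g_full = batch_gradient(fs, range(len(fs)), snapshot, q, mu, rng)
        for _ in range(M):
            idx = rng.choice(len(fs), size=batch_size, replace=False)
            g_x = batch_gradient(fs, idx, x, q, mu, rng)
            g_snap = batch_gradient(fs, idx, snapshot, q, mu, rng)
            # Variance-reduced direction: correct the mini-batch estimate
            # with the snapshot estimates.
            x = x - step * (g_x - g_snap + g_full)
    return x
```

Dropping the snapshot terms and updating with `g_x` alone gives a plain zeroth-order SGD loop, which is why ZOSGD is run for M × nStage iterations to match the number of updates.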
Example command:

python3 Universal_Attack.py -optimizer ZOSVRG -q 10 -alpha 1.0 -M 10 -nStage 25000 -const 1 -nFunc 10 -batch_size 5 -mu 0.01 -target_label 4 -rv_dist UnitSphere