Subspace Attack Reproduction


Attempt to reproduce the NeurIPS 2019 paper Subspace Attack: Exploiting Promising Subspaces for Query-Efficient Black-box Attacks.

The original code of the paper can be found here. We are trying to reproduce the attack on the GDAS and WRN models trained on the CIFAR-10 dataset, without using or looking at the original code.

This project was done as a project for the CS-433 Machine Learning course at EPFL, and as part of the NeurIPS 2019 Reproducibility Challenge.


We make use of some pretrained models, which can be downloaded here. They are a subset of the models provided with the code of the original paper. They need to be unzipped and placed in the ./pretrained folder, in the root directory of the repo.

The dataset (CIFAR-10) is automatically downloaded via torchvision.datasets when the experiment is first run, and is saved in the data/ folder (more info here).

The attack is implemented and tested using Python 3.7. Dependencies are listed in requirements.txt.

For the moment, it is possible to run the experiment using VGG nets and AlexNet as reference models and GDAS, WRN and PyramidNet as victim models.

In order to test our implementation, install the dependencies with pip3 install --user --requirement requirements.txt, and run the following command:


This will run the experiment on line 5 of table II of our report, with the following settings:

  • Reference models: AlexNet+VGGs
  • Victim model: GDAS
  • Number of images: 1000
  • Maximum queries per image: 10000
  • Seed: 0

And hyperparameters:

  • eta_g = 0.1
  • eta = 1/255
  • delta = 0.1
  • tau = 1.0
  • epsilon = 8/255
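As a rough illustration of how these five values could interact, here is a minimal NumPy sketch of one bandit-style update step. It is not the code of this repository: the function name attack_step, the toy linear loss and the random prior direction u are assumptions made for the example (in the attack, the prior direction would come from a reference model rather than from random noise), and the update rule is only what the option descriptions (bandit exploration, finite difference probe, image and OCO learning rates, norm budget) suggest.

```python
import numpy as np

TAU, DELTA, ETA, ETA_G, EPSILON = 1.0, 0.1, 1 / 255, 0.1, 8 / 255


def attack_step(loss, x_adv, x_orig, g, u):
    """One hypothetical bandit-style step; `loss` is a black-box scalar function."""
    # Explore along the prior direction u around the current gradient estimate g.
    g_plus = g + TAU * u
    g_minus = g - TAU * u
    g_plus = g_plus / (np.linalg.norm(g_plus) + 1e-12)
    g_minus = g_minus / (np.linalg.norm(g_minus) + 1e-12)
    # Two queries to the victim: an antithetic finite-difference probe of size DELTA.
    diff = loss(x_adv + DELTA * g_plus) - loss(x_adv + DELTA * g_minus)
    # Update the running gradient estimate with learning rate ETA_G.
    g = g + ETA_G * (diff / (TAU * DELTA)) * u
    # Signed ascent step on the image, then projection onto the
    # L-infinity ball of radius EPSILON around x_orig and onto [0, 1].
    x_adv = x_adv + ETA * np.sign(g)
    x_adv = np.clip(x_adv, x_orig - EPSILON, x_orig + EPSILON)
    x_adv = np.clip(x_adv, 0.0, 1.0)
    return x_adv, g


# Toy setup (assumption): a linear "loss" stands in for the victim model.
rng = np.random.default_rng(0)
x_orig = rng.uniform(0.0, 1.0, size=(3, 32, 32))
w = rng.normal(size=x_orig.shape)
loss = lambda x: float(np.sum(w * x))

x_adv, g = x_orig.copy(), np.zeros_like(x_orig)
for _ in range(20):
    u = rng.normal(size=x_orig.shape)  # stand-in for a reference-model gradient
    x_adv, g = attack_step(loss, x_adv, x_orig, g, u)

print("max perturbation:", np.max(np.abs(x_adv - x_orig)))
```

Each step costs two queries to the victim, which is why the maximum number of queries per image bounds the number of iterations.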

N.B.: it takes 7 hours 45 minutes to run on a Google Cloud Platform n1-highmem-8 virtual machine, with 8 vCPUs, 52 GB of memory and an Nvidia Tesla T4.

Moreover, the following settings can be used to customize the experiment:

usage: [-h] [-ds {Dataset.CIFAR_10}]
                     [--reference-models {vgg11_bn,vgg13_bn,vgg16_bn,vgg19_bn,AlexNet_bn} [{vgg11_bn,vgg13_bn,vgg16_bn,vgg19_bn,AlexNet_bn} ...]]
                     [--victim-model {gdas,wrn,pyramidnet}]
                     [--loss {ExperimentLoss.CROSS_ENTROPY,ExperimentLoss.NEG_LL}]
                     [--tau TAU] [--epsilon EPSILON] [--delta DELTA]
                     [--eta ETA] [--eta_g ETA_G] [--n-images N_IMAGES]
                     [--image-limit IMAGE_LIMIT]
                     [--compare-gradients COMPARE_GRADIENTS]
                     [--check-success CHECK_SUCCESS]
                     [--show-images SHOW_IMAGES] [--seed SEED]

optional arguments:
  -h, --help            show this help message and exit
  -ds {Dataset.CIFAR_10}, --dataset {Dataset.CIFAR_10}
                        The dataset to be used.
  --reference-models {vgg11_bn,vgg13_bn,vgg16_bn,vgg19_bn,AlexNet_bn} [{vgg11_bn,vgg13_bn,vgg16_bn,vgg19_bn,AlexNet_bn} ...]
                        The reference models to be used.
  --victim-model {gdas,wrn,pyramidnet}
                        The model to be attacked.
  --loss {ExperimentLoss.CROSS_ENTROPY,ExperimentLoss.NEG_LL}
                        The loss function to be used
  --tau TAU             Bandit exploration.
  --epsilon EPSILON     The norm budget.
  --delta DELTA         Finite difference probe.
  --eta ETA             Image learning rate.
  --eta_g ETA_G         OCO learning rate.
  --n-images N_IMAGES   The number of images on which the attack has to be run
  --image-limit IMAGE_LIMIT
                        Limit of iterations to be done for each image
  --compare-gradients COMPARE_GRADIENTS
                        Whether the program should output a comparison between
                        the estimated and the true gradients.
  --check-success CHECK_SUCCESS
                        Whether the attack on each image should stop if it has
                        been successful.
  --show-images SHOW_IMAGES
                        Whether each image to be attacked, and its
                        corresponding adversarial examples should be shown
  --seed SEED           The random seed with which the experiment should be
                        run, to be used for reproducibility purposes.
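The boolean options above take an explicit value (for example --check-success=False). The sketch below shows one way such flags can be declared with argparse; it is an assumption about how this could be done, not the parser used in this repository. The str2bool helper and the default values are hypothetical. A converter of this kind is needed because type=bool would treat any non-empty string, including "False", as True.

```python
import argparse


def str2bool(value):
    """Hypothetical helper: interpret 'True'/'False'-style strings as booleans."""
    if value.lower() in ("true", "1", "yes"):
        return True
    if value.lower() in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


parser = argparse.ArgumentParser()
# Default values here are assumptions made for the example.
parser.add_argument("--check-success", type=str2bool, default=True)
parser.add_argument("--compare-gradients", type=str2bool, default=False)
parser.add_argument("--n-images", type=int, default=1000)

args = parser.parse_args(
    ["--check-success=False", "--n-images=100", "--compare-gradients=True"]
)
print(args.check_success, args.n_images, args.compare_gradients)
```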

In order to run an experiment on 100 images that records the loss of the true model and the cosine similarity between the estimated and true gradients for all 5000 iterations per image, regardless of the success of the attack (i.e. the one used for figures 1 and 2 of our report), you should run:

python3 --check-success=False --n-images=100 --compare-gradients=True

N.B.: it takes around 20 hours to run the experiment on the aforementioned machine.
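For reference, the two gradient comparison measures mentioned in this README (cosine similarity and share of common signs) can be computed as in the following sketch. The function name compare_gradients and the example vectors are made up for illustration, and the function returns the share of common signs as a fraction in [0, 1] rather than a percentage; the repository may compute these quantities differently.

```python
import numpy as np


def compare_gradients(estimated, true):
    """Return (cosine similarity, fraction of entries with the same sign)."""
    est, tru = np.ravel(estimated), np.ravel(true)
    cosine = float(est @ tru / (np.linalg.norm(est) * np.linalg.norm(tru) + 1e-12))
    common_signs = float(np.mean(np.sign(est) == np.sign(tru)))
    return cosine, common_signs


true_grad = np.array([1.0, -2.0, 3.0, -4.0])
est_grad = np.array([2.0, -1.0, -3.0, -4.0])
cosine, common_signs = compare_gradients(est_grad, true_grad)
print(cosine, common_signs)  # about 0.367 and 0.75
```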

The experiment results are saved in the outputs/ folder, in a file named YYYY-MM-DD.HH-MM.npy, which contains an exported dictionary. The format of the dictionary is:

experiment_info = {
    'experiment_baseline': {
        'victim_model': victim_model_name,
        'reference_model_names': reference_model_names,
        'dataset': dataset
    },
    'hyperparameters': {
        'tau': tau,
        'epsilon': epsilon,
        'delta': delta,
        'eta': eta,
        'eta_g': eta_g
    },
    'settings': {
        'n_images': n_images,
        'image_limit': image_limit,
        'compare_gradients': compare_gradients,
        'gpu': # If the GPU has been used for the experiment,
        'seed': seed
    },
    'results': {
        'queries': # The number of queries run
        'total_time': # The time it took to run the experiment
        # The following are present only if compare_gradients == True
        'gradient_products': # The cosine similarities for each image
        'true_gradient_norms': # The norms of the true gradients for each image
        'estimated_gradient_norms': # The norms of the estimated gradients for each image
        'true_losses': # The true losses each iteration
        'common_signs': # The percentages of common signs between true and est gradients
        'subs_common_signs': # The percentages of common signs between subsequent gradients
    }
}

The file can be imported in Python using np.load(output_path, allow_pickle=True).item().
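A minimal round trip is sketched below. Since no real output file is assumed to be available, the example first writes a small stand-in dictionary with np.save; the file name example.npy, the stand-in values, and the assumption that 'queries' holds one number per image are illustrative only.

```python
import os
import tempfile

import numpy as np

# Stand-in for a real results file (values and shapes are made up).
experiment_info = {
    "hyperparameters": {"tau": 1.0, "epsilon": 8 / 255},
    "results": {"queries": np.array([120, 340, 10000])},
}
output_path = os.path.join(tempfile.mkdtemp(), "example.npy")
np.save(output_path, experiment_info)

# np.load returns a 0-d object array; .item() unwraps the dictionary.
loaded = np.load(output_path, allow_pickle=True).item()
queries = np.asarray(loaded["results"]["queries"])
print("tau:", loaded["hyperparameters"]["tau"])
print("mean queries:", queries.mean())
```

Because allow_pickle=True unpickles arbitrary Python objects, it should only be used on .npy files from a trusted source.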

Project structure

The repository is structured in the following way:

├── black-box_attack_reproduce.ipynb
├── data # Should contain the dataset used
├── # Contains the experiment
├── img # Contains images used in notebooks
│   └── algo1.png
├── notebooks # Contains some notebooks used to analyze the experiments
│   └── experiment_analysis.ipynb
├── outputs # Contains the .npy files obtained in the reported experiments
├── pretrained # Should contain the pretrained models (.pth files)
├── # This file :)
├── requirements.txt # Contains information about dependencies
└── src
    ├── # Some helper functions
    ├── # Some functions used to load the dataset
    ├── # Some functions used to load the loss function
    ├── # Some functions to load pretrained models
    ├── models # Contains the classes of the models (not made by us, link to original repo above)
    ├── # A function to plot images
    └── # The very attack, the core of the repo

