pytorch-cw2

Introduction

This is a rich-documented PyTorch implementation of Carlini-Wanger's L2 attack. The main reason to develop this respository is to make it easier to do research using the attach technique. Another implementation in PyTorch is rwightman/pytorch-nips2017-attack-example. However, the author failed to reproduce the result presented in the original paper (by Aug 2, 2018 at least).

cw.py has been tested under python 2.7.12 and torch-0.3.1.

References:

carlini/nn_robust_attacks
rwightman/pytorch-nips2017-attack-example

Usage of this library module

First of all, make sure the import runutils statement in cw.py (line 19) is a valid import statement in your development environment.

In the following code sample, we assume that net is a pretrained network, such that outputs = net(torch.autograd.Variable(inputs)) returns a torch.autograd.Variable of dimension (batch_size, num_classes) if inputs is of dimension (batch_size, num_channels, height, width). Assume also that when doing normalization to the inputs, the normalization transformation is presented something like:

normalization = torchvision.transforms.Normalize(mean, std)

where mean and std are both 3-tuples of floats.

One more thing to notice is that when producing adversarial examples from inputs, cw.py prints debugging information. To suppress such behavior, use

sed -i '/FIXME$/d' cw.py

to delete all printing statements.

Code example

To make the following code snippet executable, these variables need to be assigned:

dataloader: the dataloader (of type torch.utils.data.DataLoader)
mean: the mean used in inputs normalization
std: the standard deviation used in inputs normalization

import torch
import cw

inputs_box = (min((0 - m) / s for m, s in zip(mean, std)),
              max((1 - m) / s for m, s in zip(mean, std)))
# an untargeted adversary
adversary = cw.L2Adversary(targeted=False,
                           confidence=0.0,
                           search_steps=10,
                           box=inputs_box,
                           optimizer_lr=5e-4)

inputs, targets = next(iter(dataloader))
adversarial_examples = adversary(net, inputs, targets, to_numpy=False)
assert isinstance(adversarial_examples, torch.FloatTensor)
assert adversarial_examples.size() == inputs.size()

# a targeted adversary
adversary = cw.L2Adversary(targeted=True,
                           confidence=0.0,
                           search_steps=10,
                           box=inputs_box,
                           optimizer_lr=5e-4)

inputs, _ = next(iter(dataloader))
# a batch of any attack targets
attack_targets = torch.ones(inputs.size(0)) * 3
adversarial_examples = adversary(net, inputs, attack_targets, to_numpy=False)
assert isinstance(adversarial_examples, torch.FloatTensor)
assert adversarial_examples.size() == inputs.size()

What's to_numpy parameter? In the above examples, if it were True, adversarial_examples would be of type numpy.ndarray. This behavior might be desirable if one would like to store the adversarial examples in compressed npz format using numpy.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

pytorch-cw2

Introduction

References:

Usage of this library module

Code example

Files

README.md

Latest commit

History

README.md

File metadata and controls

pytorch-cw2

Introduction

References:

Usage of this library module

Code example