This is a richly documented PyTorch implementation of the Carlini-Wagner L2 attack. The main reason for developing this repository is to make it easier to do research using the attack technique. Another implementation in PyTorch is rwightman/pytorch-nips2017-attack-example. However, the author failed to reproduce the result presented in the original paper (as of Aug 2, 2018, at least).
cw.py has been tested under Python 2.7.12 and torch-0.3.1.
First of all, make sure the import runutils statement in cw.py (line 19) is a valid import statement in your development environment.
In the following code sample, we assume that net is a pretrained network such that outputs = net(torch.autograd.Variable(inputs)) returns a torch.autograd.Variable of dimension (batch_size, num_classes) if inputs is of dimension (batch_size, num_channels, height, width). Assume also that the inputs are normalized by a transformation defined something like:
normalization = torchvision.transforms.Normalize(mean, std)
where mean and std are both 3-tuples of floats.
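To illustrate why the normalization matters for the attack, here is a minimal sketch of how the valid pixel range [0, 1] maps to the normalized input space; the code sample in this README builds its box argument from the same expression. The mean and std values below are an assumption (the commonly used ImageNet statistics), and normalized_range is a hypothetical helper name, not part of cw.py.

```python
# Assumed per-channel statistics (the commonly used ImageNet values).
# Substitute the mean and std that your own normalization uses.
mean = (0.485, 0.456, 0.406)
std = (0.229, 0.224, 0.225)


def normalized_range(mean, std, low=0.0, high=1.0):
    """Hypothetical helper: map the raw pixel range [low, high] through
    (x - mean) / std and return the loosest (min, max) over all channels."""
    lower = min((low - m) / s for m, s in zip(mean, std))
    upper = max((high - m) / s for m, s in zip(mean, std))
    return lower, upper


inputs_box = normalized_range(mean, std)
print(inputs_box)
```

Note that a single (min, max) pair over all channels is slightly looser than a per-channel bound, so it is a simplification rather than an exact constraint.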
Also note that when producing adversarial examples from inputs, cw.py prints debugging information. To suppress this behavior, run
sed -i '/FIXME$/d' cw.py
to delete all printing statements.
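If you would rather not edit cw.py, one possible alternative is to silence standard output around the call. The sketch below assumes that the debugging information is written to sys.stdout (for example via print), which this README does not confirm; suppress_stdout is a hypothetical helper, not part of cw.py. It avoids contextlib.redirect_stdout so that it also runs under Python 2.7.

```python
import contextlib
import os
import sys


@contextlib.contextmanager
def suppress_stdout():
    """Hypothetical helper: temporarily send everything written to
    sys.stdout to os.devnull, then restore the original stream."""
    saved = sys.stdout
    devnull = open(os.devnull, 'w')
    sys.stdout = devnull
    try:
        yield
    finally:
        sys.stdout = saved
        devnull.close()


# Intended usage (adversary, net, inputs and targets as in the code sample):
# with suppress_stdout():
#     adversarial_examples = adversary(net, inputs, targets, to_numpy=False)
```

Unlike the sed command, this leaves cw.py untouched, but it also hides any other output printed inside the with block.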
To make the following code snippet executable, these variables need to be assigned:
dataloader: the dataloader (of type torch.utils.data.DataLoader)
mean: the mean used in inputs normalization
std: the standard deviation used in inputs normalization
import torch
import cw
inputs_box = (min((0 - m) / s for m, s in zip(mean, std)),
              max((1 - m) / s for m, s in zip(mean, std)))
# an untargeted adversary
adversary = cw.L2Adversary(targeted=False,
                           confidence=0.0,
                           search_steps=10,
                           box=inputs_box,
                           optimizer_lr=5e-4)
inputs, targets = next(iter(dataloader))
adversarial_examples = adversary(net, inputs, targets, to_numpy=False)
assert isinstance(adversarial_examples, torch.FloatTensor)
assert adversarial_examples.size() == inputs.size()
# a targeted adversary
adversary = cw.L2Adversary(targeted=True,
                           confidence=0.0,
                           search_steps=10,
                           box=inputs_box,
                           optimizer_lr=5e-4)
inputs, _ = next(iter(dataloader))
# a batch of any attack targets
attack_targets = torch.ones(inputs.size(0)) * 3
adversarial_examples = adversary(net, inputs, attack_targets, to_numpy=False)
assert isinstance(adversarial_examples, torch.FloatTensor)
assert adversarial_examples.size() == inputs.size()
What is the to_numpy parameter? In the above examples, if it were True, adversarial_examples would be of type numpy.ndarray. This behavior might be desirable if one would like to store the adversarial examples in compressed npz format using numpy.
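As a minimal sketch of that storage step, the snippet below saves an array to a compressed npz archive with numpy.savez_compressed and reads it back. The random array is only a stand-in for the output of adversary(..., to_numpy=True), and the file name and the key adversarial_examples are arbitrary choices made for illustration.

```python
import os
import tempfile

import numpy as np

# Stand-in for the numpy.ndarray returned when to_numpy=True;
# the shape (batch_size, num_channels, height, width) is assumed.
batch = np.random.rand(4, 3, 32, 32).astype(np.float32)

# Write the batch to a compressed .npz archive under an arbitrary key.
path = os.path.join(tempfile.mkdtemp(), 'adversarial_examples.npz')
np.savez_compressed(path, adversarial_examples=batch)

# Load the archive and retrieve the array by the same key.
with np.load(path) as archive:
    restored = archive['adversarial_examples']
```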