# FGSM

Here we use a targeted attack to force a toy model into a mistake. The attack method is FGSM (Fast Gradient Sign Method), which, as the name suggests, prioritizes speed over optimality.

In [12]:
import torch
import torch.nn as nn

In [13]:
torch.manual_seed(1)

<torch._C.Generator at 0x1958029f210>

Define the architecture. <br>
Note that for the purpose of the exercise we don't actually need to train the model. 

In [14]:
# Plain and simple ReLU network
N = nn.Sequential(nn.Linear(10, 10, bias=False),
                  nn.ReLU(),
                  nn.Linear(10, 10, bias=False),
                  nn.ReLU(),
                  nn.Linear(10, 3, bias=False))

Let's generate an input

In [15]:
# Let's generate an example
x = torch.rand((1, 10)) # the first dimension is the batch size; the remaining dimensions are those of the data
x.requires_grad_() # make sure we can compute the gradient w.r.t x
t = 1 # target class (i.e. the class that we want to force the model to predict)

Let's define the noise level

In [16]:
epsReal = 0.4 # depending on your data this might be large or small
eps = epsReal - 1e-7 # small constant to offset floating-point errors

In [17]:
print("Original Class: ", N(x).argmax(dim=1).item())
assert(N(x).argmax(dim=1).item() == 2)

Original Class:  2


In [18]:
# compute gradient
# note that CrossEntropyLoss() applies log-softmax internally, so it expects the raw outputs (logits) of the network
L = nn.CrossEntropyLoss()
loss = L(N(x), torch.tensor([t], dtype=torch.long))
loss.backward()
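
To make the comment about the implicit softmax concrete, here is a minimal sketch showing that `nn.CrossEntropyLoss` on raw logits gives the same value as a log-softmax followed by the negative log-likelihood loss. The `logits` and `targets` tensors are made-up illustration values and are not part of the exercise.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 3)            # hypothetical batch: 4 samples, 3 classes
targets = torch.tensor([0, 2, 1, 1])  # hypothetical class indices

# CrossEntropyLoss takes raw logits and applies log-softmax internally
ce = nn.CrossEntropyLoss()(logits, targets)

# The same quantity computed in two explicit steps
manual = F.nll_loss(F.log_softmax(logits, dim=1), targets)

print(ce.item(), manual.item())
```

This is why the network above ends with a plain `nn.Linear` layer and no softmax.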

The idea behind FGSM is simple: we perturb each input dimension by `eps` in the direction that decreases the loss w.r.t. the target class, hoping to cross into the target class's region without moving too far. Note, however, that FGSM does not adapt the step length: every coordinate with a nonzero gradient moves by exactly `eps`, so the only guarantee we have on the distortion of the input is the bound given by the magnitude of the noise.
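
The update can be written as $\bar{x} = x - \epsilon \cdot \mathrm{sign}(\nabla_x \mathcal{L}(N(x), t))$. As a rough sketch, the same step can be wrapped in a reusable function. The name `targeted_fgsm`, the small `model`, and the input `x0` are hypothetical and only serve as illustration. The function works on a detached copy of the input, so it does not rely on `x.grad` having been populated beforehand.

```python
import torch
import torch.nn as nn

def targeted_fgsm(model, x, target, eps):
    """One targeted FGSM step: shift each coordinate by at most eps to lower the target-class loss."""
    x = x.clone().detach().requires_grad_(True)   # fresh leaf tensor, no stale gradients
    loss = nn.CrossEntropyLoss()(model(x), target)
    grad, = torch.autograd.grad(loss, x)          # gradient w.r.t. the input only
    return (x - eps * torch.sign(grad)).detach()  # minus sign: descend the target-class loss

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(10, 10), nn.ReLU(), nn.Linear(10, 3))  # hypothetical toy model
x0 = torch.rand(1, 10)
x_adv = targeted_fgsm(model, x0, torch.tensor([1]), eps=0.4)

linf = (x_adv - x0).abs().max().item()  # largest per-coordinate change
print("L-inf distance:", linf)
```

Nothing in this sketch guarantees that the predicted class actually changes. That depends on the model, the input, and `eps`.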

In [24]:
xBar = x - eps*torch.sign(x.grad)

In [25]:
print("New Class: ", N(xBar).argmax(dim=1).item())
assert(N(xBar).argmax(dim=1).item() == 1)
assert torch.norm(x - xBar, p=float('inf')) <= epsReal # the perturbation stays within the noise budget

New Class:  1
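
To see what the bound on the noise does and does not guarantee, the sketch below measures a sign-type perturbation in two norms. The random `delta` is a stand-in for `eps * sign(gradient)`, not the perturbation computed above, and it assumes that no gradient entry is exactly zero.

```python
import math
import torch

torch.manual_seed(0)
d, eps = 10, 0.4                          # same dimensionality and budget as in this notebook
delta = eps * torch.sign(torch.randn(d))  # hypothetical perturbation: every entry is +eps or -eps

linf = torch.norm(delta, p=float('inf')).item()  # largest per-coordinate change
l2 = torch.norm(delta, p=2).item()               # Euclidean length of the move

print(f"L-inf: {linf:.4f}  L2: {l2:.4f}  eps*sqrt(d): {eps * math.sqrt(d):.4f}")
```

The per-coordinate change is capped at `eps`, but the Euclidean length grows like `eps * sqrt(d)`, so the total distortion can be substantial for high-dimensional inputs.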


Mission accomplished! 