In [2]:
%matplotlib inline


# Adversarial Example Generation

**Author:** [Nathan Inkawhich](https://github.com/inkawhich)

If you are reading this, hopefully you can appreciate how effective some
machine learning models are. Research is constantly pushing ML models to
be faster, more accurate, and more efficient. However, an often
overlooked aspect of designing and training models is security and
robustness, especially in the face of an adversary who wishes to fool
the model.

This tutorial will raise your awareness of the security vulnerabilities
of ML models and give insight into the hot topic of adversarial
machine learning. You may be surprised to find that adding imperceptible
perturbations to an image *can* cause drastically different model
performance. Given that this is a tutorial, we will explore the topic
via example on an image classifier. Specifically, we will use one of the
first and most popular attack methods, the Fast Gradient Sign Attack
(FGSM), to fool an MNIST classifier.


## Threat Model

For context, there are many categories of adversarial attacks, each with
a different goal and assumption of the attacker’s knowledge. However, in
general the overarching goal is to add the least amount of perturbation
to the input data to cause the desired misclassification. There are
several kinds of assumptions of the attacker’s knowledge, two of which
are: **white-box** and **black-box**. A *white-box* attack assumes the
attacker has full knowledge and access to the model, including
architecture, inputs, outputs, and weights. A *black-box* attack assumes
the attacker only has access to the inputs and outputs of the model, and
knows nothing about the underlying architecture or weights. There are
also several types of goals, including **misclassification** and
**source/target misclassification**. A goal of *misclassification* means
the adversary only wants the output classification to be wrong but does
not care what the new classification is. A *source/target
misclassification* means the adversary wants to alter an image that is
originally of a specific source class so that it is classified as a
specific target class.

In this case, the FGSM attack is a *white-box* attack with the goal of
*misclassification*. With this background information, we can now
discuss the attack in detail.

## Fast Gradient Sign Attack

One of the first and most popular adversarial attacks to date is
referred to as the *Fast Gradient Sign Attack (FGSM)* and is described
by Goodfellow et al. in [Explaining and Harnessing Adversarial
Examples](https://arxiv.org/abs/1412.6572). The attack is remarkably
powerful, and yet intuitive. It is designed to attack neural networks by
leveraging the way they learn: *gradients*. The idea is simple: rather
than working to minimize the loss by adjusting the weights based on the
backpropagated gradients, the attack *adjusts the input data to maximize
the loss* based on the same backpropagated gradients. In other words,
the attack uses the gradient of the loss w.r.t the input data, then
adjusts the input data to maximize the loss.

Before we jump into the code, let’s look at the famous
[FGSM](https://arxiv.org/abs/1412.6572) panda example and extract
some notation.

![fgsm_panda_image](/_static/img/fgsm_panda_image.png)

From the figure, $\mathbf{x}$ is the original input image
correctly classified as a “panda”, $y$ is the ground truth label
for $\mathbf{x}$, $\mathbf{\theta}$ represents the model
parameters, and $J(\mathbf{\theta}, \mathbf{x}, y)$ is the loss
that is used to train the network. The attack backpropagates the
gradient back to the input data to calculate
$\nabla_{x} J(\mathbf{\theta}, \mathbf{x}, y)$. Then, it adjusts
the input data by a small step ($\epsilon$ or $0.007$ in the
picture) in the direction (i.e.
$sign(\nabla_{x} J(\mathbf{\theta}, \mathbf{x}, y))$) that will
maximize the loss. The resulting perturbed image, $x'$, is then
*misclassified* by the target network as a “gibbon” when it is still
clearly a “panda”.
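
As a minimal sketch of the update rule above, the following pure-Python example applies one FGSM step to a tiny logistic-regression model. The weights, input, and label are hypothetical values chosen for illustration, and the helper names (`predict`, `loss`, `input_gradient`, `fgsm_step`) are not part of the tutorial. For this model the input gradient has the closed form $(p - y)\,\mathbf{w}$, so no autograd is needed.

```python
import math

def sign(v):
    """Return -1, 0, or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)

def predict(w, b, x):
    """Probability of class 1 under a logistic-regression model."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

def loss(w, b, x, y):
    """Binary cross-entropy J(theta, x, y) for a single example."""
    p = predict(w, b, x)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def input_gradient(w, b, x, y):
    """Gradient of the loss w.r.t. the input x, which is (p - y) * w."""
    p = predict(w, b, x)
    return [(p - y) * wi for wi in w]

def fgsm_step(x, grad, epsilon):
    """x' = x + epsilon * sign(grad), applied element-wise."""
    return [xi + epsilon * sign(gi) for xi, gi in zip(x, grad)]

# Hypothetical toy model and input (not taken from the tutorial)
w, b = [2.0, -1.0, 0.5], 0.0
x, y = [0.5, 0.2, 0.4], 1

grad = input_gradient(w, b, x, y)
x_adv = fgsm_step(x, grad, epsilon=0.1)

clean_loss = loss(w, b, x, y)
adv_loss = loss(w, b, x_adv, y)
print("loss before: {:.3f}, loss after: {:.3f}".format(clean_loss, adv_loss))
```

Every input coordinate moves by exactly $\epsilon$ in the direction that raises the loss, so the perturbation is bounded in the max norm while the loss grows.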

Hopefully now the motivation for this tutorial is clear, so let's jump
into the implementation.




In [3]:
from __future__ import print_function
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.autograd import Variable
from torch.autograd import grad as torch_grad
import torch.optim as optim
from torchvision import datasets, transforms
import numpy as np
import matplotlib.pyplot as plt

# NOTE: This is a hack to get around "User-agent" limitations when downloading MNIST datasets
#       see, https://github.com/pytorch/vision/issues/3497 for more information
from six.moves import urllib
opener = urllib.request.build_opener()
opener.addheaders = [('User-agent', 'Mozilla/5.0')]
urllib.request.install_opener(opener)

## Implementation

In this section, we will discuss the input parameters for the tutorial,
define the model under attack, then code the attack and run some tests.

### Inputs

This tutorial has only three inputs, defined as
follows:

-  **epsilons** - List of epsilon values to use for the run. It is
   important to keep 0 in the list because it represents the model
   performance on the original test set. Also, intuitively we would
   expect the larger the epsilon, the more noticeable the perturbations
   but the more effective the attack in terms of degrading model
   accuracy. Since the data range here is $[0,1]$, no epsilon
   value should exceed 1.

-  **pretrained_model** - path to the pretrained MNIST model which was
   trained with
   [pytorch/examples/mnist](https://github.com/pytorch/examples/tree/master/mnist).
   For simplicity, download the pretrained model [here](https://drive.google.com/drive/folders/1fn83DF14tWmit0RTKWRhPq5uVXt73e0h?usp=sharing).

-  **use_cuda** - boolean flag to use CUDA if desired and available.
   Note, a GPU with CUDA is not critical for this tutorial as a CPU will
   not take much time.
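
To get a feel for these magnitudes, the sketch below converts each epsilon into 8-bit grey levels. It assumes that the $[0,1]$ data range corresponds to pixel values in $[0,255]$, which the tutorial does not state, and `epsilon_to_levels` is a hypothetical helper introduced only for this illustration.

```python
epsilons = [0, .05, .1, .15, .2, .25, .3]

def epsilon_to_levels(eps, max_level=255):
    """Size of an epsilon step in intensity levels, assuming [0, 1] maps to [0, max_level]."""
    if not 0.0 <= eps <= 1.0:
        raise ValueError("epsilon must lie in [0, 1] for data scaled to [0, 1]")
    return eps * max_level

for eps in epsilons:
    print("eps = {:<5} -> {:.2f} of 255 levels".format(eps, epsilon_to_levels(eps)))
```

Under that assumption, the largest epsilon in the list shifts each pixel by roughly 30% of the full intensity range, which is why the larger values become visible to the eye.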




In [4]:
epsilons = [0, .05, .1, .15, .2, .25, .3]
pretrained_model = "/home/reihaneh/CAAD/datamodel/caad-Copy1.pth"
use_cuda=True

In [5]:
class Discriminator_nodropout(nn.Module):
    def __init__(self, d_project=128, d_hidden=128):
        super(Discriminator_nodropout, self).__init__()

        self.main = nn.Sequential(
            nn.Conv2d(1, 32, 4, 2, 1, bias=False),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(32, 64, 4, 2, 1, bias=False),
            nn.InstanceNorm2d(64, affine=True),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(64, 64, 4, 2, 1, bias=False),
            nn.InstanceNorm2d(64, affine=True),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(64, 32, 4, 2, 1, bias=False)
        )

        self.decision = nn.Sequential(
            nn.InstanceNorm2d(32, affine=True),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(32, 1, 5, 1, 0, bias=False)
        )

        self.projection1 = nn.Sequential(
            nn.Linear(800, d_hidden),
            nn.LeakyReLU(0.1, inplace=True),
            nn.Linear(d_hidden, d_hidden),
            nn.LeakyReLU(0.1, inplace=True),
            nn.Linear(d_hidden, d_project)
        )
        self.projection2 = nn.Sequential(
            nn.Linear(800, d_hidden),
            nn.LeakyReLU(0.1, inplace=True),
            nn.Linear(d_hidden, d_hidden),
            nn.LeakyReLU(0.1, inplace=True),
            nn.Linear(d_hidden, d_project)
        )

    def forward(self, input, sg_linear=False, projection1=False, projection2=False, penul=False):

        _aux = {}
        _return_aux = False

        penultimate = self.main(input)

        if sg_linear:
            out_d = penultimate.detach()
        else:
            out_d = penultimate

        discout = self.decision(out_d)

        if projection1:
            _return_aux = True
            _aux['projection1'] = self.projection1(
                penultimate.view(penultimate.shape[0], -1))

        if projection2:
            _return_aux = True
            _aux['projection2'] = self.projection2(
                penultimate.view(penultimate.shape[0], -1))

        if _return_aux:
            return discout, _aux

        if penul:
            return discout, penultimate

        return discout

### Generator

In [6]:
def conv_1(in_c, out_c, bt):
    if bt:
        conv = nn.Sequential(
            nn.Conv2d(in_c, out_c, 4, 2, 1, bias=False),
            nn.BatchNorm2d(out_c),
            nn.LeakyReLU(0.2, inplace=True)
        )
    else:
        conv = nn.Sequential(
            nn.Conv2d(in_c, out_c, 4, 2, 1, bias=False),
            nn.LeakyReLU(0.2, inplace=True)
        )
    return conv

def de_conv(in_c, out_c):
    conv = nn.Sequential(
        nn.ConvTranspose2d(in_c, out_c, 4, 2, 1, bias=False),
        nn.BatchNorm2d(out_c),
        nn.ReLU(True)
    )
    return conv


class Generator(nn.Module):
    def __init__(self):
        super(Generator, self).__init__()
        self.ngpu = 1
        self.nef = 32
        self.ngf = 16
        self.nBottleneck = 32
        self.nc = 1
        self.num_same_conv = 5

        # 1x80x80
        self.conv1 = conv_1(self.nc, self.nef, False)
        # 32x40x40
        self.conv2 = conv_1(self.nef, self.nef, True)
        # 32x20x20
        self.conv3 = conv_1(self.nef, self.nef*2, True)
        # 64x10x10
        self.conv4 = conv_1(self.nef*2+1, self.nef*4, True)
        # 128x5x5
        self.conv6 = nn.Conv2d(self.nef*4, self.nBottleneck, 2, bias=False)
        # 32x4x4
        self.batchNorm1 = nn.BatchNorm2d(self.nBottleneck)
        self.leak_relu = nn.LeakyReLU(0.2, inplace=True)
        # 32x4x4

        self.sameconvs = nn.ModuleList([nn.ConvTranspose2d(
            32, 32, 3, 1, 1, bias=False) for _ in range(self.num_same_conv)])
        self.samepools = nn.ModuleList([nn.MaxPool2d(
            kernel_size=3, stride=1, padding=1) for _ in range(self.num_same_conv)])
        self.samebns = nn.ModuleList(
            [nn.BatchNorm2d(32) for _ in range(self.num_same_conv)])

        self.convt1 = nn.ConvTranspose2d(
            self.nBottleneck, self.ngf * 8, 2, bias=False)
        self.batchNorm2 = nn.BatchNorm2d(self.ngf * 8)
        self.relu = nn.ReLU(True)
        # 128x5x5
        self.convt2 = de_conv(256, 64)
        # 64x10x10
        self.convt3 = de_conv(128+1, 32)
        # 32x20x20
        self.convt4 = de_conv(64, 32)
        # 32x40x40
        self.convt6 = nn.ConvTranspose2d(64, self.nc, 4, 2, 1, bias=False)
        # 1x80x80
        self.tan = nn.Tanh()

    def forward(self, noise, x):
        x1 = self.conv1(x)
        x2 = self.conv2(x1)
        x3 = self.conv3(x2)
        mod_input = torch.cat([noise, x3], dim=1)
        x4 = self.conv4(mod_input)
        x6 = self.conv6(x4)
        x7 = self.batchNorm1(x6)
        x8 = self.leak_relu(x7)
        x9 = self.convt1(x8)

        x10 = self.batchNorm2(x9)
        x11 = self.relu(x10)
        x12 = self.convt2(torch.cat([x4, x11], 1))
        out = self.convt3(torch.cat([mod_input, x12], 1))

        for i in range(self.num_same_conv):
            conv = self.sameconvs[i]
            pool = self.samepools[i]
            bn = self.samebns[i]

            out = conv(out)
            out = pool(out)
            out = bn(out)
            out = F.leaky_relu(out, negative_slope=0.2)

        x14 = self.convt4(torch.cat([x2, out], 1))
        x15 = self.convt6(torch.cat([x1, x14], 1))

        return self.tan(x15)
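
The shape comments in the `Generator` can be checked by hand with the standard output-size formulas for convolution and transposed convolution. The sketch below traces only the spatial size, assuming an 80x80 input (as in the `# 1x80x80` comment), dilation 1, and no output padding. `conv_out` and `deconv_out` are hypothetical helpers, not PyTorch functions.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a 2D convolution (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1

def deconv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a 2D transposed convolution (dilation 1, output_padding 0)."""
    return (size - 1) * stride - 2 * padding + kernel

# Encoder: conv1..conv4 use kernel 4, stride 2, padding 1; conv6 uses a 2x2 kernel
s = 80
encoder = []
for _ in range(4):
    s = conv_out(s, 4, 2, 1)
    encoder.append(s)
s = conv_out(s, 2)
encoder.append(s)

# Decoder: convt1 uses a 2x2 kernel; convt2, convt3, convt4, convt6 use 4/2/1
decoder = []
s = deconv_out(s, 2)
decoder.append(s)
for _ in range(4):
    s = deconv_out(s, 4, 2, 1)
    decoder.append(s)

print(encoder)  # [40, 20, 10, 5, 4]
print(decoder)  # [5, 10, 20, 40, 80]
```

The trace also shows that the `noise` tensor concatenated with `x3` must have a 10x10 spatial size, and that the 3/1/1 layers in `sameconvs` and `samepools` leave the 20x20 feature maps unchanged in size.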



In [7]:
import os

In [8]:
# Check the contents of the data/model directory
for f in os.listdir("/home/reihaneh/CAAD/datamodel/"):
    print(f)

testwithtargetmain.pkl
caad-Copy1.pth
.ipynb_checkpoints


In [9]:
import pickle


with open('/home/reihaneh/CAAD/datamodel/testwithtargetmain.pkl', 'rb') as f:
    read = pickle.load(f)

data, target = list(zip(*list(read)))
data = np.array(data)
target = np.array(target)
target_idx_anom = np.argwhere(target==0)
#print(data.shape, target.shape, np.sum(target) ,target_idx_anom.shape )
data = np.squeeze(data[target_idx_anom, :], axis=1)
target = target[target_idx_anom]
print(data.shape, target.shape)

test_data = []
for i in range(len(data)):
    test_data.append([data[i], target[i]])
    
test_loader = torch.utils.data.DataLoader(test_data, batch_size=1, shuffle=True)

(982, 1, 80, 80) (982, 1)


### Model Under Attack

In this notebook, the model under attack is the `Discriminator_nodropout`
network defined above, not the *Net* classifier from
[pytorch/examples/mnist](https://github.com/pytorch/examples/tree/master/mnist)
that the original tutorial attacks. The test dataloader was already built
above from the pickled test set, so this section only selects the device,
initializes the model, and loads the pretrained weights.




In [10]:
# Define what device we are using
print("CUDA Available: ",torch.cuda.is_available())
device = torch.device("cuda:2" if (use_cuda and torch.cuda.is_available()) else "cpu")

# Initialize the network
model = Discriminator_nodropout().to(device)

# Load the pretrained model
model.load_state_dict(torch.load(pretrained_model, map_location=device))

# Set the model in evaluation mode. This model has no Dropout layers, so this is only a precaution
model.eval()

CUDA Available:  True


Discriminator_nodropout(
  (main): Sequential(
    (0): Conv2d(1, 32, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
    (1): LeakyReLU(negative_slope=0.2, inplace=True)
    (2): Conv2d(32, 64, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
    (3): InstanceNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=False)
    (4): LeakyReLU(negative_slope=0.2, inplace=True)
    (5): Conv2d(64, 64, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
    (6): InstanceNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=False)
    (7): LeakyReLU(negative_slope=0.2, inplace=True)
    (8): Conv2d(64, 32, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
  )
  (decision): Sequential(
    (0): InstanceNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=False)
    (1): LeakyReLU(negative_slope=0.2, inplace=True)
    (2): Conv2d(32, 1, kernel_size=(5, 5), stride=(1, 1), bias=False)
  )
  (projecti

### FGSM Attack

Now, we can define the function that creates the adversarial examples by
perturbing the original inputs. The ``fgsm_attack`` function takes three
inputs: *image* is the original clean image ($x$), *epsilon* is
the pixel-wise perturbation amount ($\epsilon$), and *data_grad*
is the gradient of the loss w.r.t. the input image
($\nabla_{x} J(\mathbf{\theta}, \mathbf{x}, y)$). The function
then creates the perturbed image as

\begin{align}perturbed\_image = image + epsilon*sign(data\_grad) = x + \epsilon * sign(\nabla_{x} J(\mathbf{\theta}, \mathbf{x}, y))\end{align}

Finally, in order to maintain the original range of the data, the
perturbed image is clipped to range $[0,1]$.
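
The effect of the clipping step is easiest to see on a handful of pixels. The sketch below is a list-based analogue of the update for a flattened image. The pixel and gradient values are hypothetical, and `fgsm_attack_list` is an illustrative stand-in, not the tensor implementation used in this notebook.

```python
def fgsm_attack_list(image, epsilon, data_grad):
    """Apply x + epsilon * sign(grad) per pixel, then clip each pixel to [0, 1]."""
    perturbed = []
    for pixel, g in zip(image, data_grad):
        step = epsilon * ((g > 0) - (g < 0))                # epsilon * sign(g)
        perturbed.append(min(1.0, max(0.0, pixel + step)))  # clip to [0, 1]
    return perturbed

image = [0.0, 0.02, 0.5, 0.98, 1.0]       # hypothetical pixel values
data_grad = [-3.0, -0.1, 0.0, 2.5, 0.4]   # hypothetical gradient values
adv = fgsm_attack_list(image, 0.05, data_grad)
print(adv)  # [0.0, 0.0, 0.5, 1.0, 1.0]
```

Pixels already at the boundary of the range cannot move further in that direction, and a pixel with zero gradient is left unchanged, so the effective perturbation can be smaller than $\epsilon$ for some pixels.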




In [11]:
# FGSM attack code
def fgsm_attack(image, epsilon, data_grad):
    # Collect the element-wise sign of the data gradient
    sign_data_grad = data_grad.sign()
    # Create the perturbed image by adjusting each pixel of the input image
    perturbed_image = image + epsilon*sign_data_grad
    # Adding clipping to maintain [0,1] range
    perturbed_image = torch.clamp(perturbed_image, 0, 1)
    # Return the perturbed image
    return perturbed_image

### Discriminator loss function

In [12]:
def gradient_penalty(D, images, gen_images):
    """Penalize deviation of D's input-gradient norm from 1 at random interpolates."""
    batch_size = images.size(0)
    _device = images.device
    # One random mixing coefficient per sample, broadcast over channels and pixels
    alpha = torch.rand(batch_size, 1, 1, 1, device=_device)
    # Interpolate between the two batches without tracking their history
    interpolated = alpha * images.detach() + (1 - alpha) * gen_images.detach()
    interpolated.requires_grad_(True)
    prob_interpolated = D(interpolated)
    # Gradient of the discriminator output w.r.t. the interpolated inputs
    gradients = torch_grad(outputs=prob_interpolated, inputs=interpolated,
                           grad_outputs=torch.ones_like(prob_interpolated),
                           create_graph=True, retain_graph=True)[0]
    gradients = gradients.view(batch_size, -1)

    return ((gradients.norm(2, dim=1) - 1) ** 2).mean()
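
Written out, the quantity returned by `gradient_penalty` can be read as follows. The notation is introduced here for illustration: $x$ stands for `images`, $\tilde{x}$ for `gen_images`, $N$ for the batch size, and $\alpha_i$ for the per-sample random mixing coefficient. This form matches the gradient penalty used in WGAN-GP training, although the notebook itself does not name it.

```latex
\begin{align}
\hat{x}_i &= \alpha_i \, x_i + (1 - \alpha_i) \, \tilde{x}_i, \qquad \alpha_i \sim U[0, 1] \\
\mathrm{GP} &= \frac{1}{N} \sum_{i=1}^{N} \left( \left\lVert \nabla_{\hat{x}_i} D(\hat{x}_i) \right\rVert_2 - 1 \right)^2
\end{align}
```

The penalty is zero only when the discriminator's gradient has unit norm at every interpolated point, and it grows quadratically as the norm moves away from 1.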

### Testing Function

Finally, the central result of this tutorial comes from the ``test``
function. Each call to this test function performs a full test step on
the MNIST test set and reports a final accuracy. However, notice that
this function also takes an *epsilon* input. This is because the
``test`` function reports the accuracy of a model that is under attack
from an adversary with strength $\epsilon$. More specifically, for
each sample in the test set, the function computes the gradient of the
loss w.r.t the input data ($data\_grad$), creates a perturbed
image with ``fgsm_attack`` ($perturbed\_data$), then checks to see
if the perturbed example is adversarial. In addition to testing the
accuracy of the model, the function also saves and returns some
successful adversarial examples to be visualized later.




In [15]:
import pdb
lambda_gp = 10

def test( model, device, test_loader, epsilon ):

    # Accuracy counter
    correct = 0
    adv_examples = []
    new_dat = []
    new_lab = []

    # Loop over all examples in test set
    for data, target in test_loader:

        # Send the data and label to the device
        data, target = data.to(device).float(), target.unsqueeze(1).to(device).long()
        
        # Set requires_grad attribute of tensor. Important for Attack
        data.requires_grad = True

        # Forward pass the data through the model
        output = model(data)
        # Index of the max over dim 1. The discriminator returns a single score
        # per image, so this index is always 0.
        init_pred = output.max(1, keepdim=True)[1]

        # If the initial prediction is wrong, don't bother attacking, just move on
        if init_pred.item() != target.item():
            continue
            
        # Collect datagrad
        data_grad = torch.autograd.grad(output, data)[0]
        
        
        # Call FGSM Attack
        perturbed_data = fgsm_attack(data, epsilon, data_grad)
        
        #pdb.set_trace()
        # Calculate the loss
        gp = gradient_penalty(model, data, perturbed_data)
        loss = (-(torch.mean(data) -
                        torch.mean(perturbed_data))) + lambda_gp*gp

        # Zero all existing gradients
        model.zero_grad()

        # Calculate gradients of model in backward pass
        loss.backward()

        # Re-classify the perturbed image
        output = model(perturbed_data)
        
        #save new dataset
        new_data = perturbed_data.squeeze().detach().cpu().numpy()        
        targett = target.squeeze().detach().cpu().numpy()
        new_dat.append(new_data)
        new_lab.append(targett)
                        
        #pdb.set_trace()
        final_pred = output.max(1, keepdim=True)[1] # index of the max over dim 1 (always 0 for a single-score output)
        if final_pred.item() == target.item():
            correct += 1
            # Special case for saving 0 epsilon examples
            if (epsilon == 0) and (len(adv_examples) < 5):
                adv_ex = perturbed_data.squeeze().detach().cpu().numpy()
                adv_examples.append( (init_pred.item(), final_pred.item(), adv_ex) )
        else:
            # Save some adv examples for visualization later
            if len(adv_examples) < 5:
                adv_ex = perturbed_data.squeeze().detach().cpu().numpy()
                adv_examples.append( (init_pred.item(), final_pred.item(), adv_ex) )
               
    #save new dataset
    newdat = (new_dat, new_lab)

    # Calculate final accuracy for this epsilon
    final_acc = correct/float(len(test_loader))
    print("Epsilon: {}\tTest Accuracy = {} / {} = {}".format(epsilon, correct, len(test_loader), final_acc))

    # Return the accuracy and an adversarial example
    return final_acc, adv_examples, newdat

### Run Attack

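The last part of the implementation is to actually run the attack. Here,
we run a full test step for each epsilon value in the *epsilons* input.
For each epsilon we also save the final accuracy and some successful
adversarial examples to be plotted in the coming sections. With an
effective attack, the printed accuracies decrease as the epsilon value
increases. In the run recorded below, however, the reported accuracy is
1.0 for every epsilon: the discriminator emits a single score per image,
so the index returned by ``output.max(1)`` is always 0, and every target
in this test set is 0. Also, note the $\epsilon=0$ case represents the
original test accuracy, with no attack.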




In [16]:
accuracies = []
examples = []
entire_adv_mnist = []

# Run test for each epsilon
for eps in epsilons:
    acc, ex, mnist = test(model, device, test_loader, eps)
    accuracies.append(acc)
    examples.append(ex)
    if eps == 0.0:
        entire_adv_mnist.append(mnist)


        
# Save the data collected during the epsilon == 0 run
with open('nwe-mnist.pkl','wb') as fp:
    pickle.dump(entire_adv_mnist,fp)
       
    

Epsilon: 0	Test Accuracy = 982 / 982 = 1.0
Epsilon: 0.05	Test Accuracy = 982 / 982 = 1.0
Epsilon: 0.1	Test Accuracy = 982 / 982 = 1.0
Epsilon: 0.15	Test Accuracy = 982 / 982 = 1.0
Epsilon: 0.2	Test Accuracy = 982 / 982 = 1.0
Epsilon: 0.25	Test Accuracy = 982 / 982 = 1.0
Epsilon: 0.3	Test Accuracy = 982 / 982 = 1.0


In [None]:
import pickle
import numpy as np

with open('nwe-mnist.pkl', 'rb') as f:
    entire_adv_mnist = pickle.load(f)

print(len(entire_adv_mnist))  # 1, because only the epsilon == 0 run is saved above
mnist = entire_adv_mnist[0]   # tuple of (perturbed images, labels)
dat = np.array(mnist[0])
lab = np.array(mnist[1])
print(dat.shape, lab.shape)

## Results

### Accuracy vs Epsilon

The first result is the accuracy versus epsilon plot. As alluded to
earlier, as epsilon increases we expect the test accuracy to decrease.
This is because larger epsilons mean we take a larger step in the
direction that will maximize the loss. The figures quoted in the rest of
this paragraph describe the MNIST classifier attacked in the original
tutorial, not the run recorded above, which reports an accuracy of 1.0
at every epsilon. For the MNIST classifier, the trend in the curve is
not linear even though the epsilon values are linearly spaced. For
example, the accuracy at $\epsilon=0.05$ is only about 4% lower
than $\epsilon=0$, but the accuracy at $\epsilon=0.2$ is 25%
lower than $\epsilon=0.15$. Also, the accuracy of that model
hits random accuracy for a 10-class classifier between
$\epsilon=0.25$ and $\epsilon=0.3$.




In [None]:
plt.figure(figsize=(5,5))
plt.plot(epsilons, accuracies, "*-")
plt.yticks(np.arange(0, 1.1, step=0.1))
plt.xticks(np.arange(0, .35, step=0.05))
plt.title("Accuracy vs Epsilon")
plt.xlabel("Epsilon")
plt.ylabel("Accuracy")
plt.show()

### Sample Adversarial Examples

Remember the idea of no free lunch? In this case, as epsilon increases
the test accuracy decreases **BUT** the perturbations become more easily
perceptible. In reality, there is a tradeoff between accuracy
degradation and perceptibility that an attacker must consider. Here, we
show some examples of successful adversarial examples at each epsilon
value. Each row of the plot shows a different epsilon value. The first
row is the $\epsilon=0$ examples which represent the original
“clean” images with no perturbation. The title of each image shows the
“original classification -> adversarial classification.” Notice, the
perturbations start to become evident at $\epsilon=0.15$ and are
quite evident at $\epsilon=0.3$. However, in all cases humans are
still capable of identifying the correct class despite the added noise.




In [None]:
# Plot several examples of adversarial samples at each epsilon
cnt = 0
plt.figure(figsize=(8,10))
for i in range(len(epsilons)):
    for j in range(len(examples[i])):
        cnt += 1
        plt.subplot(len(epsilons),len(examples[0]),cnt)
        plt.xticks([], [])
        plt.yticks([], [])
        if j == 0:
            plt.ylabel("Eps: {}".format(epsilons[i]), fontsize=14)
        orig,adv,ex = examples[i][j]
        plt.title("{} -> {}".format(orig, adv))
        plt.imshow(ex, cmap="gray")
plt.tight_layout()
plt.show()

## Where to go next?

Hopefully this tutorial gives some insight into the topic of adversarial
machine learning. There are many potential directions to go from here.
This attack represents the very beginning of adversarial attack research,
and since then there have been many subsequent ideas for how to attack and
defend ML models from an adversary. In fact, at NIPS 2017 there was an
adversarial attack and defense competition and many of the methods used
in the competition are described in this paper: [Adversarial Attacks and
Defences Competition](https://arxiv.org/pdf/1804.00097.pdf). The work
on defense also leads into the idea of making machine learning models
more *robust* in general, to both naturally perturbed and adversarially
crafted inputs.

Another direction to go is adversarial attacks and defense in different
domains. Adversarial research is not limited to the image domain; check
out [this](https://arxiv.org/pdf/1801.01944.pdf) attack on
speech-to-text models. But perhaps the best way to learn more about
adversarial machine learning is to get your hands dirty. Try to
implement a different attack from the NIPS 2017 competition, and see how
it differs from FGSM. Then, try to defend the model from your own
attacks.


