<a href="https://colab.research.google.com/github/8BitRobot/ECE188DeepLearning/blob/main/Adv_example_Task2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Perform an Adversarial attack.

For the second part of the project, we consider a pretrained model (MobileNetV2) that was trained on the ImageNet dataset.

We use an evasion attack called [FGSM](https://neptune.ai/blog/adversarial-attacks-on-neural-networks-exploring-the-fast-gradient-sign-method#:~:text=The%20Fast%20Gradient%20Sign%20Method%20(FGSM)%20combines%20a%20white%20box,model%20into%20making%20wrong%20predictions.) to fool the neural network into making incorrect predictions.

## Import Packages.

Import the necessary packages. We continue to use TensorFlow and Keras.

In [58]:
import tensorflow as tf
import matplotlib as mpl
import matplotlib.pyplot as plt
from keras.preprocessing import image

mpl.rcParams['figure.figsize'] = (8, 8)
mpl.rcParams['axes.grid'] = False

## Load the Pretrained model. 

We use the [MobileNetV2](https://arxiv.org/abs/1801.04381) model trained on the [Imagenet](https://www.image-net.org/) dataset. 

In [59]:
pretrained_model = tf.keras.applications.MobileNetV2(include_top=True,
                                                     weights='imagenet')
pretrained_model.trainable = False

# ImageNet labels
decode_predictions = tf.keras.applications.mobilenet_v2.decode_predictions

### Helper Function for Data Processing


The following functions handle data processing. Don't worry about their details; just use them.

In [60]:
# Helper function to preprocess the image so that it can be inputted in MobileNetV2
def preprocess(image, target_size):
  image = tf.cast(image, tf.float32)
  image = tf.image.resize(image, (target_size, target_size))
  image = tf.keras.applications.mobilenet_v2.preprocess_input(image)
  image = image[None, ...]
  return image

# Helper function to extract labels from probability vector
def get_imagenet_label(probs):
  return decode_predictions(probs, top=1)[0][0]
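
To make the value ranges concrete: for MobileNetV2, `preprocess_input` scales pixel values from [0, 255] to [-1, 1], which is why the plotting code in this notebook applies `* 0.5 + 0.5` before `imshow`. A minimal pure-Python sketch of that arithmetic follows; the helper names `to_model_range` and `to_display_range` are illustrative and not part of Keras.

```python
def to_model_range(pixel):
    """Map a pixel value in [0, 255] to [-1, 1], the range MobileNetV2 expects."""
    return pixel / 127.5 - 1.0


def to_display_range(value):
    """Map a value in [-1, 1] back to [0, 1] so matplotlib can display it."""
    return value * 0.5 + 0.5


# Darkest, middle, and brightest pixel values.
for p in (0, 127.5, 255):
    x = to_model_range(p)
    print(p, x, to_display_range(x))
```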

## Load an Image. 


Load any image. Here we use an image of a giant panda.

In [4]:
image_raw = tf.io.read_file('/content/giantpanda.jpg')
image = tf.image.decode_image(image_raw)

image = preprocess(image, 224)  # preprocess requires a target size; MobileNetV2 defaults to 224x224
image_probs = pretrained_model.predict(image)

NotFoundError: ignored

In [5]:
plt.figure()
plt.imshow(image[0] * 0.5 + 0.5)  # To change [-1, 1] to [0,1]
_, image_class, class_confidence = get_imagenet_label(image_probs)
plt.title('{} : {:.2f}% Confidence'.format(image_class, class_confidence*100))
plt.show()

TypeError: ignored

<Figure size 432x288 with 0 Axes>

## Create the Adversarial Image. 

We use the FGSM method to create an adversarial image. Be sure to read about FGSM to understand how the attack works. 
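
For reference, FGSM perturbs the input with a single step of size $\epsilon$ in the direction of the sign of the loss gradient with respect to the input:

$$
x_{\text{adv}} = x + \epsilon \cdot \operatorname{sign}\left(\nabla_x J(\theta, x, y)\right)
$$

Here $x$ is the preprocessed input image, $y$ is its true label, $\theta$ denotes the fixed model parameters, and $J$ is the loss (categorical cross-entropy in this notebook). The result is then clipped back to the valid input range $[-1, 1]$.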

In [61]:
loss_object = tf.keras.losses.CategoricalCrossentropy()

def create_adversarial_pattern(input_image, input_label):
  with tf.GradientTape() as tape:
    tape.watch(input_image)
    prediction = pretrained_model(input_image)
    loss = loss_object(input_label, prediction)

  # Get the gradients of the loss w.r.t to the input image.
  gradient = tape.gradient(loss, input_image)
  # Get the sign of the gradients to create the perturbation
  signed_grad = tf.sign(gradient)
  return signed_grad
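
The same idea can be checked by hand on a model small enough to differentiate analytically. The sketch below is not the notebook's code: it assumes a toy logistic-regression model with made-up weights and input, and shows that one FGSM step increases the loss on the true label.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def loss_and_input_grad(w, b, x, y):
    """Binary cross-entropy of a logistic model and its gradient w.r.t. the input x."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    loss = -(y * math.log(p) + (1 - y) * math.log(1 - p))
    grad = [(p - y) * wi for wi in w]  # d(loss)/d(x_i)
    return loss, grad


def sign(v):
    return (v > 0) - (v < 0)


def fgsm(w, b, x, y, eps):
    """One FGSM step: move every input coordinate by eps along sign(gradient)."""
    _, grad = loss_and_input_grad(w, b, x, y)
    return [xi + eps * sign(gi) for xi, gi in zip(x, grad)]


# Toy weights and input (hypothetical values chosen only for illustration).
w, b = [2.0, -1.0, 0.5], 0.0
x, y = [0.5, -0.5, 0.2], 1

x_adv = fgsm(w, b, x, y, eps=0.3)
clean_loss, _ = loss_and_input_grad(w, b, x, y)
adv_loss, _ = loss_and_input_grad(w, b, x_adv, y)
print("clean loss:", clean_loss)
print("adversarial loss:", adv_loss)
```

Because every coordinate moves by exactly `eps`, the perturbation has an L-infinity norm of `eps`, which is what the epsilon values reported later measure.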

In [7]:
# Get the input label of the image.
giant_panda_index = 388
label = tf.one_hot(giant_panda_index, image_probs.shape[-1])
label = tf.reshape(label, (1, image_probs.shape[-1]))

perturbations = create_adversarial_pattern(image, label)
plt.imshow(perturbations[0] * 0.5 + 0.5);  # To change [-1, 1] to [0,1]

NameError: ignored

In [62]:
def display_images(image, correct):
  _, label, confidence = get_imagenet_label(pretrained_model.predict(image))
  plt.figure()
  plt.imshow(image[0]*0.5+0.5)
  description = 'Epsilon = {:0.5f}'.format(eps) if eps else 'Input'
  plt.title('{} \n {} : {:.2f}% Confidence'.format(description,
                                                   label, confidence*100))
  plt.show()
  return label != correct

In [9]:
epsilons = [0, 0.01, 0.05, 0.1, 0.15]
descriptions = [('Epsilon = {:0.3f}'.format(eps) if eps else 'Input')
                for eps in epsilons]

for i, eps in enumerate(epsilons):
  adv_x = image + eps*perturbations
  adv_x = tf.clip_by_value(adv_x, -1, 1)
  if display_images(adv_x, "giant_panda"): break

NameError: ignored

# Task2: Perform an Analysis to understand the potency of the attack. 

Your task here is to understand how small a perturbation is enough to change the predicted class. This is measured by the smallest epsilon value that changes the class.

Your task is as follows:

* Pick 10 images each from different classes in imagenet. 
* Perform a perturbation analysis on each of these images. 
* In the analysis you are required to report the smallest epsilon value for which you notice a class change. 
* Make a table for each of the images considered with the minimum epsilon value for the FGSM attack. 

Write the code for the above below. You can also add the table below.
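
One way to organize such an analysis is a coarse-to-fine search: step epsilon up on a coarse grid until the class flips, then halve the step and rescan the last interval. The sketch below is a different scheme from the `attack_image` loop used in this notebook. It assumes a hypothetical predicate `is_fooled(eps)` that reports whether the model's top-1 class differs from the true class at that epsilon. FGSM success is not guaranteed to be monotonic in epsilon, so the result is the first flip found on the grid rather than a proven minimum.

```python
def smallest_flipping_epsilon(is_fooled, step=0.01, min_step=0.0001, max_eps=2.0):
    """Return the first epsilon on a progressively finer grid for which is_fooled is True.

    Returns None if no epsilon up to max_eps fools the model.
    """
    lo = 0.0  # largest grid epsilon seen so far that did not fool the model
    while True:
        eps = lo + step
        while eps <= max_eps and not is_fooled(eps):
            lo = eps
            eps += step
        if eps > max_eps:
            return None
        if step / 2 < min_step:
            return eps
        step /= 2  # rescan the interval (lo, eps] with a finer step


# Hypothetical stand-in for a model: the class flips once eps reaches 0.0137.
threshold = 0.0137
found = smallest_flipping_epsilon(lambda eps: eps >= threshold)
print(found)
```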

In [79]:
def attack_image(i, step_size, model_num):
  # display_images reads the current epsilon from this global.
  global eps

  image_name = images[i]
  image_label_index = image_label_indices[i]
  image = tf.image.decode_image(tf.io.read_file("/content/task2-images/" + image_name + ".jpg"))
  image = preprocess(image, model_input_sizes[model_num])

  image_probs = pretrained_model.predict(image)

  # One-hot encode the true label so it matches the model's output shape.
  label = tf.one_hot(image_label_index, image_probs.shape[-1])
  label = tf.reshape(label, (1, image_probs.shape[-1]))

  # The signed gradient is computed once, on the clean image.
  perturbations = create_adversarial_pattern(image, label)
  # plt.imshow(perturbations[0] * 0.5 + 0.5);  # To change [-1, 1] to [0,1]

  eps = 0
  adv_x = image
  adv_x = tf.clip_by_value(adv_x, -1, 1)
  prediction = image_name

  # Increase epsilon in fixed steps until the top-1 label changes.
  # Once eps reaches 2, every perturbed pixel is already clipped to -1 or 1,
  # so a larger epsilon cannot change the result; stop there instead of looping forever.
  while prediction == image_name and eps < 2:
    eps += step_size
    adv_x = image + eps * perturbations
    adv_x = tf.clip_by_value(adv_x, -1, 1)
    prediction = get_imagenet_label(pretrained_model.predict(adv_x))[1]
  min_epsilon[image_name][model_num] = eps

  return adv_x

def attack_all_images(model_num):
  for i in range(10):
    current_step_size = 0.01
    res = attack_image(i, current_step_size, model_num)
    while min_epsilon[images[i]][model_num] == current_step_size and current_step_size > 0.0001:
      current_step_size /= 2
      res = attack_image(i, current_step_size, model_num)
    # display_images(res, images[i])
    print("{: <12}{:0.5f}".format(images[i], min_epsilon[images[i]][model_num]))

In [81]:
images = ["agaric", "bee", "cannon", "flute", "jellyfish", "maze", "peacock", "pizza", "screw", "teddy"]
image_label_indices = [992, 309, 471, 558, 107, 646, 84, 963, 783, 850]

min_epsilon = {}

for i in images:
  min_epsilon[i] = [0 for i in range(6)]

model_input_sizes = [224, 299, 224, 480, 331, 224]

eps = 0

pretrained_model = tf.keras.applications.MobileNetV2(include_top=True, weights='imagenet')
pretrained_model.trainable = False
attack_all_images(0)

agaric      0.38000


KeyboardInterrupt: ignored

```
agaric      	0.38000
bee         	0.00063
cannon      	0.02000
flute       	0.00063
jellyfish   	0.00250
maze        	0.07000
peacock     	0.55000
pizza       	0.00500
screw       	0.00500
teddy       	0.35000
```
Reproducing these epsilon values requires the same images I used, so I've pasted the table from my run here.
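
The granularity of these numbers follows from the search loop: `attack_all_images` starts with a step of 0.01 and halves it whenever the very first step already flips the class, stopping once the step drops below 0.0001. A small sketch of the worst-case step sequence (the function name `step_schedule` is illustrative):

```python
def step_schedule(initial=0.01, floor=0.0001):
    """Step sizes tried when the first step flips the class at every refinement."""
    steps = [initial]
    while steps[-1] > floor:
        steps.append(steps[-1] / 2)
    return steps


for s in step_schedule():
    print("{:0.9f}".format(s))
```

The fifth entry, 0.000625, corresponds to the values displayed as 0.00063 for bee and flute, and the last entry, 0.000078125, is the finest resolution this search can report.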

# Task3: Compare the robustness of the considered model with other models. 

Your task here is to compare the robustness of this model (MobileNetV2) with that of other popular image classification models.

Your task is as follows:

* Consider 5 different models (you can consider various RESNET architectures, any models you find interesting).
* Load the pre-trained weights of the model (trained on imagenet). 
* Perform Task2 on all the considered models. 
* Add all the results to the table. The final table should have 6 columns, one per model, with an epsilon value for each of the 10 images.


What do you observe? Why do you think this is the case? 

Write the code for the above below. You can also add the table and your answer to the question below.


In [82]:
pretrained_model = tf.keras.applications.MobileNetV2(include_top=True, weights='imagenet')
pretrained_model.trainable = False
attack_all_images(0)

pretrained_model = tf.keras.applications.Xception(include_top=True, weights='imagenet')
pretrained_model.trainable = False
attack_all_images(1)

pretrained_model = tf.keras.applications.MobileNet(include_top=True, weights='imagenet')
pretrained_model.trainable = False
attack_all_images(2)

pretrained_model = tf.keras.applications.EfficientNetV2L(include_top=True, weights='imagenet')
pretrained_model.trainable = False
attack_all_images(3)

pretrained_model = tf.keras.applications.NASNetLarge(include_top=True, weights='imagenet')
pretrained_model.trainable = False
attack_all_images(4)

pretrained_model = tf.keras.applications.ResNet152V2(include_top=True, weights='imagenet')
pretrained_model.trainable = False
attack_all_images(5)

print("{:<10} {:<8} {:<8} {:<8} {:<8} {:<8} {:<8}".format(
    "label",
    "MobileV2",
    "Xception",
    "Mobile",
    "EffV2L",
    "NASLarge",
    "Res152V2"
))

for i in range(10):
  print("{:<10} {:<8} {:<8} {:<8} {:<8} {:<8} {:<8}".format(
      images[i], 
      str(round(min_epsilon[images[i]][0], 5)),
      str(round(min_epsilon[images[i]][1], 5)),
      str(round(min_epsilon[images[i]][2], 5)),
      str(round(min_epsilon[images[i]][3], 5)),
      str(round(min_epsilon[images[i]][4], 5)),
      str(round(min_epsilon[images[i]][5], 5))
  ))

agaric      0.38000
bee         0.00063
cannon      0.02000
flute       0.00063
jellyfish   0.00250
maze        0.07000
peacock     0.55000
pizza       0.00500
screw       0.00500
teddy       0.35000
agaric      0.51000
bee         0.02000
cannon      0.38000
flute       0.00500
jellyfish   0.40000
maze        0.27000
peacock     0.67000
pizza       0.37000
screw       0.33000
teddy       0.65000
agaric      0.00250
bee         0.00008
cannon      0.01000
flute       0.00008
jellyfish   0.00250
maze        0.00500
peacock     0.02000
pizza       0.00250
screw       0.00250
teddy       0.01000
agaric      0.00500
bee         0.00500
cannon      0.01000
flute       0.00008
jellyfish   0.00008
maze        0.00008
peacock     0.00008
pizza       0.00008
screw       0.04000
teddy       0.00008
Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/nasnet/NASNet-large.h5
agaric      0.02000
bee         0.32000
cannon      0.93000
flute       0.00008
jellyfish   0.

```
label      MobileV2 Xception Mobile   EffV2L   NASLarge Res152V2

agaric     0.38     0.51     0.0025   0.005    0.02     0.36    
bee        0.00063  0.02     8e-05    0.005    0.32     0.01    
cannon     0.02     0.38     0.01     0.01     0.93     0.22    
flute      0.00063  0.005    8e-05    8e-05    8e-05    0.01    
jellyfish  0.0025   0.4      0.0025   8e-05    0.41     0.23    
maze       0.07     0.27     0.005    8e-05    0.41     0.28    
peacock    0.55     0.67     0.02     8e-05    0.45     0.57    
pizza      0.005    0.37     0.0025   8e-05    0.45     0.06    
screw      0.005    0.33     0.0025   0.04     0.71     0.005   
teddy      0.35     0.65     0.01     8e-05    1.28     0.59    
```

The first thing I noticed is that the original MobileNet is not particularly robust: nine of its ten images flipped at an epsilon of 0.01 or less. EfficientNetV2L fared about as badly, with six of its ten images flipping at the smallest step the search tries (0.00008). On the other hand, NASNetLarge is quite impressive at resisting evasion attacks for some images, but for others (like the flute) it was fooled at that smallest step as well. The flute's classification was quite easily flipped across all 6 networks (at an epsilon of 0.01 or less), so that might be a trait of my image rather than of the networks; regardless, even MobileNetV2 held out longer than NASNetLarge on that image (0.00063 vs. 0.00008). Overall, NASNetLarge looks like the most robust of these models, followed by Xception, while MobileNet and EfficientNetV2L were by far the most susceptible to attacks on average.
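
As a rough check on this ranking, the mean of each column of the table above can be computed directly. The values below are copied from that table. The mean is only a crude summary, since the epsilons span several orders of magnitude and the smallest ones sit at the search resolution.

```python
models = ["MobileV2", "Xception", "Mobile", "EffV2L", "NASLarge", "Res152V2"]

# Rows copied from the results table: one row per image, one column per model.
table = {
    "agaric":    [0.38,    0.51,  0.0025,  0.005,   0.02,    0.36],
    "bee":       [0.00063, 0.02,  0.00008, 0.005,   0.32,    0.01],
    "cannon":    [0.02,    0.38,  0.01,    0.01,    0.93,    0.22],
    "flute":     [0.00063, 0.005, 0.00008, 0.00008, 0.00008, 0.01],
    "jellyfish": [0.0025,  0.4,   0.0025,  0.00008, 0.41,    0.23],
    "maze":      [0.07,    0.27,  0.005,   0.00008, 0.41,    0.28],
    "peacock":   [0.55,    0.67,  0.02,    0.00008, 0.45,    0.57],
    "pizza":     [0.005,   0.37,  0.0025,  0.00008, 0.45,    0.06],
    "screw":     [0.005,   0.33,  0.0025,  0.04,    0.71,    0.005],
    "teddy":     [0.35,    0.65,  0.01,    0.00008, 1.28,    0.59],
}

# Mean minimum epsilon per model (a higher mean suggests more robustness to FGSM).
means = {
    name: sum(row[j] for row in table.values()) / len(table)
    for j, name in enumerate(models)
}

for name, mean in sorted(means.items(), key=lambda kv: kv[1], reverse=True):
    print("{:<10} {:.4f}".format(name, mean))
```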

# BONUS: Can you provide a better attack?

Can you design a better attack that lowers the epsilon required for the images?

Task:

* Design another attack. 
* Compare the epsilon values on 10 images. 
* Does it perform better than the FGSM attack? That is, does it have lower epsilon values?

Write the code and provide your answers below. 
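
One candidate is an iterative variant of FGSM (often called the Basic Iterative Method): instead of one step of size epsilon, take several smaller sign-gradient steps, recomputing the gradient each time and keeping the result within epsilon of the original image. The sketch below is framework-free and assumes a hypothetical `grad_fn(x)` that returns the loss gradient with respect to the input; the toy loss is made up for illustration. Whether this lowers the required epsilon for these images and models is something to measure, not something the sketch demonstrates.

```python
def sign(v):
    return (v > 0) - (v < 0)


def clip(v, lo, hi):
    return max(lo, min(hi, v))


def iterative_fgsm(x, grad_fn, eps, steps=10, lo=-1.0, hi=1.0):
    """Repeated sign-gradient steps, projected onto the eps-ball around x."""
    alpha = eps / steps  # per-step size; a common but not unique choice
    x_adv = list(x)
    for _ in range(steps):
        grad = grad_fn(x_adv)  # gradient is recomputed at the current point
        x_adv = [ai + alpha * sign(gi) for ai, gi in zip(x_adv, grad)]
        # Project into the L-infinity ball of radius eps around the original x,
        # then into the valid input range [lo, hi].
        x_adv = [clip(clip(ai, xi - eps, xi + eps), lo, hi)
                 for ai, xi in zip(x_adv, x)]
    return x_adv


# Toy loss: squared distance from a point t, so the gradient is 2 * (x - t)
# and gradient ascent pushes x away from t.
t = [0.0, 0.5, -0.5]


def toy_grad(x):
    return [2 * (xi - ti) for xi, ti in zip(x, t)]


x = [0.2, 0.4, 0.95]
x_adv = iterative_fgsm(x, toy_grad, eps=0.1)
print(x_adv)
```

To compare against FGSM on equal terms, the same minimum-epsilon search would be run with this function in place of the single-step perturbation.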