**MODEL FOR DEFENSE AGAINST ADVERSARIAL IMAGES**

#### (ASHFAQUE AZAD MSc Computer Science (Data Analytics))

*Uses Foolbox to generate adversarial images*
https://github.com/bethgelab/foolbox

@article{rauber2017foolbox,
  
  title={Foolbox: A Python toolbox to benchmark the robustness of machine learning models},
  
  author={Rauber, Jonas and Brendel, Wieland and Bethge, Matthias},
  
  journal={arXiv preprint arXiv:1707.04131},
  
  year={2017},
  
  url={http://arxiv.org/abs/1707.04131},
  
  archivePrefix={arXiv},
  
  eprint={1707.04131},
}

## The model works in the following way


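1.   Median filtering of the input image (3 x 3 filter).
2.   The filtered image is passed to each of three pre-trained models (ResNet18, VGG16, DenseNet161).
3.   The final prediction is the majority vote of the three models. If all three disagree, the input is flagged as adversarial.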
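
The three steps above can be sketched with NumPy alone. This is a minimal illustration, not the notebook's implementation: `median_filter_3x3`, `majority_vote`, and `defended_prediction` are hypothetical names, the classifiers are stand-in functions rather than the pre-trained networks, and the replicated-border padding is an assumption of this sketch.

```python
from collections import Counter

import numpy as np


def median_filter_3x3(image):
    """Apply a 3 x 3 median filter to each channel of a (C, H, W) image."""
    # Replicate the border pixels so the output keeps the input's size.
    padded = np.pad(image, ((0, 0), (1, 1), (1, 1)), mode="edge")
    height, width = image.shape[1:]
    # Stack the nine shifted views that make up every 3 x 3 neighbourhood.
    windows = np.stack([
        padded[:, dy:dy + height, dx:dx + width]
        for dy in range(3)
        for dx in range(3)
    ])
    return np.median(windows, axis=0)


def majority_vote(labels):
    """Return the most common label, or None when no label appears twice."""
    label, count = Counter(labels).most_common(1)[0]
    return label if count > 1 else None


def defended_prediction(image, classifiers):
    """Filter the image, query every classifier, and take the majority vote."""
    filtered = median_filter_3x3(image)
    return majority_vote([classify(filtered) for classify in classifiers])


# Toy input: a black image with one isolated bright pixel per channel.
image = np.zeros((3, 8, 8), dtype=np.float32)
image[:, 4, 4] = 1.0

# Stand-in classifiers (hypothetical): two agree on class 7, one says 3.
classifiers = [lambda x: 7, lambda x: 7, lambda x: 3]
print(defended_prediction(image, classifiers))  # 7
```

The isolated bright pixel disappears after filtering because it is never the median of its neighbourhood, which is the property the first line of defense relies on against small, sparse perturbations.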




## The attack method targets two of the pre-trained models at the same time

It works in the following way:

1.   Generate two adversarial images of the same target class, one from each of two different models.
2.   Extract the perturbation (adversarial image minus original image) from each of the two adversarial images and add both perturbations to the original image. Call the result X.
3.   Clip any values of X that fall outside the valid range for the image, e.g. [0, 1] or [0, 255].

Without median filtering, this attack succeeded against the model 65% of the time.
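
The combination step can be sketched on a tiny array. This is an illustration under assumptions: `combine_adversaries` is a hypothetical helper name, and the three-element arrays stand in for full images with bounds [0, 1].

```python
import numpy as np


def combine_adversaries(original, adversarial_a, adversarial_b, bounds=(0.0, 1.0)):
    """Add both models' perturbations to the original image and clip to bounds."""
    perturbation_a = adversarial_a - original
    perturbation_b = adversarial_b - original
    combined = original + perturbation_a + perturbation_b
    return np.clip(combined, bounds[0], bounds[1])


original = np.array([0.2, 0.5, 0.9])
adversarial_a = np.array([0.3, 0.5, 1.0])  # perturbation: +0.1, 0.0, +0.1
adversarial_b = np.array([0.2, 0.4, 1.0])  # perturbation: 0.0, -0.1, +0.1

x = combine_adversaries(original, adversarial_a, adversarial_b)
print(x)  # approximately [0.3, 0.4, 1.0]; the last value is clipped from 1.1
```

Because the two perturbations are summed, a pixel that both attacks push in the same direction can leave the valid range, which is why the clipping step is needed.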

In [1]:
from google.colab import files
uploaded=files.upload()#upload the images

In [2]:
#120 images, split into 6 lists because Google Colab sessions can shut down midway through a run
imageList1=['agama.jpg',
'agama1.jpg',
'agama2.jpg',
'airliner.jpg',
'airliner1.jpg',
'airliner2.jpg',
'armadillo.jpg',
'armadillo1.jpg',
'armadillo2.jpg',
'banana.jpg',
'banana1.jpg',
'banana2.jpg',
'beagle.jpg',
'beagle1.jpg',
'beagle2.jpg',
'bee.jpg',
'bee1.jpg',
'bee2.jpg',
'binoculars.jpg',
'binoculars1.jpg',
'binoculars2.jpg']
imageList2=['bulbul.jpg',
'bulbul1.jpg',
'bulbul2.jpg',
'camel.jpg',
'camel1.jpg',
'camel2.jpg',
'cellphone.jpg',
'cellphone1.jpg',
'cellphone2.jpg',
'cheetah.jpg',
'cheetah1.jpg',
'cheetah2.jpg',
'cockroach.jpg',
'cockroach1.jpg',
'cockroach2.jpg',
'daisy.jpg',
'daisy1.jpg',
'daisy2.jpg',
'flamingo.jpg',
'flamingo1.jpg',
'flamingo2.jpg']
imageList3=['germanshepherd.jpg',
'germanshepherd1.jpg',
'germanshepherd2.jpg',
'glofball.jpg',
'glofball1.jpg',
'glofball2.jpg',
'goose.jpg',
'goose1.jpg',
'goose2.jpg',
'hare.jpg',
'hare1.jpg',
'hare2.jpg',
'hippopotamus.jpg',
'hippopotamus1.jpg',
'hippopotamus2.jpg',
'hummingbird.jpg',
'hummingbird1.jpg',
'hummingbird2.jpg',
'jellyfish.jpg',
'jellyfish1.jpg',
'jellyfish2.jpg']
imageList4=['langur.jpg',
'langur1.jpg',
'langur2.jpg',
'lion.jpg',
'lion1.jpg',
'lion2.jpg',
'lorikeet.jpg',
'lorikeet1.jpg',
'lorikeet2.jpg',
'magpie.jpg',
'magpie1.jpg',
'magpie2.jpg',
'mongoose.jpg',
'mongoose1.jpg',
'mongoose2.jpg',
'ostrich.jpg',
'ostrich1.jpg',
'ostrich2.jpg',
'peacock.jpg',
'peacock1.jpg',
'peacock2.jpg']
imageList5=['pineapple.jpg',
'pineapple1.jpg',
'pineapple2.jpg',
'scorpion.jpg',
'scorpion1.jpg',
'scorpion2.jpg',
'snail.jpg',
'snail1.jpg',
'snail2.jpg',
'snake.jpg',
'snake1.jpg',
'snake2.jpg',
'sportscar.jpg',
'sportscar1.jpg',
'sportscar2.jpg',
'starfish.jpg',
'starfish1.jpg',
'starfish2.jpg',
'streetsign.jpg',
'streetsign1.jpg',
'streetsign2.jpg']
imageList6=['tiger.jpeg',
'tiger1.jpg',
'tiger2.jpg',
'tigercat.jpg',
'tigercat1.jpg',
'tigercat2.jpg',
'toaster.jpg',
'toaster1.jpg',
'toaster2.jpg',
'vulture.jpg',
'vulture1.jpg',
'vulture2.jpg',
'zebra.jpg',
'zebra1.jpg',
'zebra2.jpg']

In [3]:
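#This part is to be run every time in a Google Colab environment
#Note: the code below uses the older Foolbox API (e.g. fmodel.predictions);
#newer Foolbox releases changed this interface, so an older release may be needed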
!pip install image
!pip install foolbox
!pip install torch
!pip install torchvision

In [4]:
import os

import numpy as np

#This part of the code is adapted from:
#https://github.com/bethgelab/foolbox

def imagenet_example(shape=(224, 224), data_format='channels_last'):
    """ Loads the image named by the global `nameOfTheImage` and resizes it.
    Parameters
    ----------
    shape : tuple of integers
        The shape of the returned image.
    data_format : str
        "channels_first" or "channels_last"
    Returns
    -------
    image : array_like
        The image as a float32 array with values in [0, 255].
    """
    assert len(shape) == 2
    assert data_format in ['channels_first', 'channels_last']

    from PIL import Image
    path = nameOfTheImage  # global, set in the attack loop before each call
    # convert to RGB so that grayscale, palette and RGBA files all give 3 channels
    image = Image.open(path).convert('RGB')
    image = image.resize(shape)
    image = np.asarray(image, dtype=np.float32)
    assert image.shape == shape + (3,)
    if data_format == 'channels_first':
        image = np.transpose(image, (2, 0, 1))
    return image

In [5]:
#Imports the necessary libraries
import foolbox
import torch
import torchvision.models as models
import numpy as np

In [6]:
def RESNET18(targetClass):
  # instantiate the model (as per foolbox)
  resnet18 = models.resnet18(pretrained=True).eval()
  if torch.cuda.is_available():
      resnet18 = resnet18.cuda()
  mean = np.array([0.485, 0.456, 0.406]).reshape((3, 1, 1))
  std = np.array([0.229, 0.224, 0.225]).reshape((3, 1, 1))
  fmodel = foolbox.models.PyTorchModel(
      resnet18, bounds=(0, 1), num_classes=1000, preprocessing=(mean, std))

  # get source image and label
  image = imagenet_example(data_format='channels_first')
  image = image / 255.  # because our model expects values in [0, 1]


  print('predicted class(RESNET)', np.argmax(fmodel.predictions(image)))

  from foolbox.criteria import TargetClassProbability

  target_class = targetClass#372 #https://gist.github.com/ageitgey/4e1342c10a71981d0b491e1b8227328b
  criterion = TargetClassProbability(target_class, p=0.95)

  # apply attack on source image
  attack = foolbox.attacks.LBFGSAttack(fmodel,criterion)

  adversarial = attack(image,np.argmax(fmodel.predictions(image)))

  print('adversarial class(RESNET)', np.argmax(fmodel.predictions(adversarial)))
  
  #returns image,model, adversarial
  return image,fmodel,adversarial

In [7]:
def VGG16(targetClass):
  # instantiate the model (as per foolbox)
  vgg16 = models.vgg16(pretrained=True).eval()
  if torch.cuda.is_available():
      vgg16 = vgg16.cuda()
  mean = np.array([0.485, 0.456, 0.406]).reshape((3, 1, 1))
  std = np.array([0.229, 0.224, 0.225]).reshape((3, 1, 1))
  fmodel1 = foolbox.models.PyTorchModel(
      vgg16, bounds=(0, 1), num_classes=1000, preprocessing=(mean, std))

  # get source image and label
  image1 = imagenet_example(data_format='channels_first')
  image1 = image1 / 255.  # because our model expects values in [0, 1]


  print('predicted class(VGG16)', np.argmax(fmodel1.predictions(image1)))

  from foolbox.criteria import TargetClassProbability

  target_class1 = targetClass #https://gist.github.com/ageitgey/4e1342c10a71981d0b491e1b8227328b
  criterion1 = TargetClassProbability(target_class1, p=0.95)

  # apply attack on source image
  attack1 = foolbox.attacks.LBFGSAttack(fmodel1,criterion1)

  adversarial1 = attack1(image1,np.argmax(fmodel1.predictions(image1)))

  print('adversarial class(VGG16)', np.argmax(fmodel1.predictions(adversarial1)))
  
  return image1,fmodel1,adversarial1

In [8]:

def DENSENET():
  # instantiate the model (as per foolbox)
  densenet = models.densenet161(pretrained=True).eval()
  if torch.cuda.is_available():
      densenet = densenet.cuda()
  mean = np.array([0.485, 0.456, 0.406]).reshape((3, 1, 1))
  std = np.array([0.229, 0.224, 0.225]).reshape((3, 1, 1))
  fmodel2 = foolbox.models.PyTorchModel(
      densenet, bounds=(0, 1), num_classes=1000, preprocessing=(mean, std))

  # get source image and label
  image2 = imagenet_example(data_format='channels_first')
  image2 = image2 / 255.  # because our model expects values in [0, 1]


  print('predicted class(DENSENET)', np.argmax(fmodel2.predictions(image2)))
  
  return image2,fmodel2

## First line of defense

In [9]:
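import numpy
import cv2
def defense(altered):#median filter using 3 x 3 filter
  #cv2 expects (height, width, channels) but the images here are channels-first,
  #so move the channels last before filtering and restore the layout afterwards
  hwc = numpy.ascontiguousarray(numpy.transpose(altered, (1, 2, 0)))
  filtered = cv2.medianBlur(hwc, 3)
  return numpy.ascontiguousarray(numpy.transpose(filtered, (2, 0, 1)))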

## Second line of defense

In [10]:
def ensemble(inputImage):#the number of models must be odd
  #uses the global models fmodel (ResNet18), fmodel1 (VGG16) and fmodel2 (DenseNet)
  listPredictions=[]#list would contain the predictions
  listPredictions.append(np.argmax(fmodel.predictions(inputImage)))
  listPredictions.append(np.argmax(fmodel1.predictions(inputImage)))
  listPredictions.append(np.argmax(fmodel2.predictions(inputImage)))
  
  majorityClass=max(set(listPredictions),key=listPredictions.count)#class with maximum occurrence
  if(listPredictions.count(majorityClass)==1):#all models predict different classes
    return "***adversarial***",listPredictions#a tuple, so that callers can always index the result
  else:
    return majorityClass,listPredictions

## The complete model including both the defenses

In [11]:
def defenseEnsemble(inputImage):#uses the ensemble() and defense() functions
  x=ensemble(inputImage)#prediction on the raw input
  y=ensemble(defense(inputImage))#prediction on the median-filtered input
  print("Prediction Before Defense",x)
  print("Prediction After Defense",y)
  if(x!=y):
    print("Warning:Adversarial Input")
  return x[0],y[0]#returns the majority-vote class before and after the defense

In [12]:
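origImage=[]#classes predicted for the original images
defeImage=[]#classes predicted for the original images after the 1st line of defense
origAdv=[]#classes predicted for the adversarial images; should be 390, otherwise the attack did not succeed for that instance
defeAdv=[]#classes predicted for the adversarial images after the defense; should equal origImage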

## Attack and the defense

In [13]:
for i in range(0,len(imageList6)):#Here put imageList1 to imageList6
  print(i)
  nameOfTheImage=imageList6[i]#change the list accordingly
  
  image,fmodel,adversarial=RESNET18(390)#would classify as eel 390
  image1,fmodel1,adversarial1=VGG16(390)#would classify as eel 390
  image2,fmodel2=DENSENET()#unattacked DENSENET model
  
  #Differences
  diffVGG=adversarial1-image1
  diffRES=adversarial-image
  
  perturbation=image+diffRES+diffVGG#addition of the noise to the input image
  perturbation=numpy.clip(perturbation,0,1)#clipping so that the boundary conditions are not violated.
  
  a,b=defenseEnsemble(image)
  origImage.append(a)
  defeImage.append(b)
  
  c,d=defenseEnsemble(perturbation)
  origAdv.append(c)
  defeAdv.append(d)

**CODE ENDS, BELOW IS THE CODE REQUIRED FOR DATAFRAME GENERATION TO GET THE ACCURACY**

In [14]:
import pandas as pd
df = pd.DataFrame(origImage)
df['1']=pd.DataFrame(defeImage)
df['2']=pd.DataFrame(origAdv)
df['3']=pd.DataFrame(defeAdv)
df

In [15]:
df1 = pd.DataFrame(origImage)
df1['1']=pd.DataFrame(defeImage)
df1['2']=pd.DataFrame(origAdv)
df1['3']=pd.DataFrame(defeAdv)
df1

In [16]:
df2 = pd.DataFrame(origImage)
df2['1']=pd.DataFrame(defeImage)
df2['2']=pd.DataFrame(origAdv)
df2['3']=pd.DataFrame(defeAdv)
df2

In [17]:
df3 = pd.DataFrame(origImage)
df3['1']=pd.DataFrame(defeImage)
df3['2']=pd.DataFrame(origAdv)
df3['3']=pd.DataFrame(defeAdv)
df3

In [18]:
df4 = pd.DataFrame(origImage)
df4['1']=pd.DataFrame(defeImage)
df4['2']=pd.DataFrame(origAdv)
df4['3']=pd.DataFrame(defeAdv)
df4

In [19]:
df5 = pd.DataFrame(origImage)
df5['1']=pd.DataFrame(defeImage)
df5['2']=pd.DataFrame(origAdv)
df5['3']=pd.DataFrame(defeAdv)
df5

In [20]:
# from google.colab import files
# df5.to_csv("dataframe06.csv")#from df to df5.
# files.download('dataframe06.csv')#dataframe01 to dataframe06
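
One way to turn the four result lists into rates is sketched below. The metric definitions are assumptions based on the comments on `origAdv` and `defeAdv`: the attack counts as successful when the undefended ensemble outputs the target class 390, and the defense counts as successful when it restores the clean-image prediction. `summarise` is a hypothetical helper, and the sample values are illustrative only, not results from the notebook.

```python
def summarise(orig_image, orig_adv, defe_adv, target_class=390):
    """Return (attack success rate, defense recovery rate) as fractions."""
    total = len(orig_image)
    # The attack succeeds when the undefended ensemble outputs the target class.
    attack_hits = sum(1 for adv in orig_adv if adv == target_class)
    # The defense succeeds when it restores the clean-image prediction.
    recovered = sum(1 for clean, defended in zip(orig_image, defe_adv)
                    if clean == defended)
    return attack_hits / total, recovered / total


# Illustrative values only, not results from the notebook.
orig_image = [42, 42, 291, 954]
orig_adv = [390, 390, 291, 390]
defe_adv = [42, 390, 291, 954]

attack_rate, recovery_rate = summarise(orig_image, orig_adv, defe_adv)
print(attack_rate, recovery_rate)  # 0.75 0.75
```

The same function can be applied to the columns of each saved dataframe once the six runs have been exported.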