# ML in Cybersecurity: Task II

## Team
  * **Team name**:  *R2D2C3P0BB8*
  * **Members**:  <br/> **Navdeeppal Singh (s8nlsing@stud.uni-saarland.de)** <br/> **Shahrukh Khan (shkh00001@stud.uni-saarland.de)** <br/> **Mahnoor Shahid (mash00001@stud.uni-saarland.de)**


## Logistics
  * **Due date**: 25th Nov. 2021, 23:59:59 (email the completed notebook including outputs to mlcysec_ws2022_staff@lists.cispa.saarland)
  * Email the completed notebook to mlcysec_ws2022_staff@lists.cispa.saarland 
  * Complete this in the previously established **teams of 3**
  * Feel free to use the course forum to discuss.
  
  
## About this Project
In this project, we dive into the vulnerabilities of machine learning models and the difficulties of defending against them. To this end, we ask you to implement an evasion attack (craft adversarial examples) yourselves, and defend your own model.   


## A Note on Grading
The total number of points in this project is 100. We further provide the number of points achievable with each excercise. You should take particular care to document and visualize your results.

Whenever possible, please use tools like tables or figures to compare the different findings


 
## Filling-in the Notebook
You'll be submitting this very notebook that is filled-in with (all) your code and analysis. Make sure you submit one that has been previously executed in-order. (So that results/graphs are already visible upon opening it). 

The notebook you submit **should compile** (or should be self-contained and sufficiently commented). Check tutorial 1 on how to set up the Python3 environment.

It is extremely important that you **do not** re-order the existing sections. Apart from that, the code blocks that you need to fill-in are given by:
```
#
#
# ------- Your Code -------
#
#
```
Feel free to break this into multiple-cells. It's even better if you interleave explanations and code-blocks so that the entire notebook forms a readable "story".


## Code of Honor
We encourage discussing ideas and concepts with other students to help you learn and better understand the course content. However, the work you submit and present **must be original** and demonstrate your effort in solving the presented problems. **We will not tolerate** blatantly using existing solutions (such as from the internet), improper collaboration (e.g., sharing code or experimental data between groups) and plagiarism. If the honor code is not met, no points will be awarded.

 
  ---

In [1]:
import time 
 
import numpy as np 
import matplotlib.pyplot as plt 

import json 
import time 
import pickle 
import sys 
import csv 
import os 
import os.path as osp 
import shutil 
import decimal
import pandas as pd
from sewar.full_ref import mse, rmse, psnr, uqi, ssim, ergas, scc, rase, sam, msssim, vifp, psnrb
from IPython.display import display, HTML
 
%matplotlib inline 
plt.rcParams['figure.figsize'] = (10.0, 8.0) # set default size of plots 
plt.rcParams['image.interpolation'] = 'nearest' 
plt.rcParams['image.cmap'] = 'gray' 
 
# for auto-reloading external modules 
# see http://stackoverflow.com/questions/1907993/autoreload-of-modules-in-ipython 
%load_ext autoreload
%autoreload 2

In [2]:
# Some suggestions of our libraries that might be helpful for this project
from collections import Counter          # an even easier way to count
from multiprocessing import Pool         # for multiprocessing
from tqdm import tqdm                    # fancy progress bars
import warnings
import sklearn.metrics
# Load other libraries here.
# Keep it minimal! We should be easily able to reproduce your code.
# We only support sklearn and pytorch.
import torchvision.datasets as datasets
import torchvision.transforms as transforms
import torch.utils.data as data

# We preload pytorch as an example
import torch

import torchvision
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torch.utils.data import DataLoader, Dataset, TensorDataset, random_split
# Please set random seed to have reproduceable results, e.g. torch.manual_seed(123)
random_seed = 42
torch.manual_seed(random_seed)
np.random.seed(random_seed)

In [3]:
compute_mode = 'gpu'
global device

if compute_mode == 'cpu':
    device = torch.device('cpu')
elif compute_mode == 'gpu':
    # If you are using pytorch on the GPU cluster, you have to manually specify which GPU device to use
    # It is extremely important that you *do not* spawn multi-GPU jobs.
    os.environ["CUDA_VISIBLE_DEVICES"] = '0'    # Set device ID here
    device = torch.device('cuda')
else:
    raise ValueError('Unrecognized compute mode')
    
print(device)

cuda


#### Helpers

In case you choose to have some methods you plan to reuse during the notebook, define them here. This will avoid clutter and keep rest of the notebook succinct.

In [4]:
# data loading helper
def _get_data(DATA_PATH, TRAIN_BATCH_SIZE, TEST_BATCH_SIZE):
    try:
        """
        This method is created to split the MNIST data into training, validation and testing set accordingly 
        and load it into dataloaders. Also, to specify any transformations required to perform on the data. 
        As well as this method is being called multiple times in hyper parameter tuning where different batch 
        sizes are being tested
        ...

        Parameters
        ----------
        DATA_PATH : str
            specifies the path directory where dataset will be downloaded
        TRAIN_BATCH_SIZE : int
            specifies the batch size in the training loader
        TEST_BATCH_SIZE : int
            specifies the batch size in the training loader
            
        Returns
        -------
        train_loader, validation_loader, test_loader with the specified batch sizes 
            
        """
        tranformations = transforms.Compose([transforms.ToTensor()])
        
        mnist_training_dataset = datasets.MNIST(root=DATA_PATH+'train', train=True, download=True, transform=tranformations)
        mnist_testing_dataset = datasets.MNIST(root=DATA_PATH+'test', train=False, download=True, transform=tranformations)
        
        training_dataset, validation_dataset = random_split(mnist_training_dataset, [int(0.8*len(mnist_training_dataset)), int(0.2*len(mnist_training_dataset))])
        
        train_loader = DataLoader(training_dataset, batch_size=TRAIN_BATCH_SIZE, shuffle=True)
        validation_loader = DataLoader(training_dataset, batch_size=TRAIN_BATCH_SIZE, shuffle=False)
        test_loader = DataLoader(mnist_testing_dataset, batch_size=TEST_BATCH_SIZE, shuffle=True)
        
        return train_loader, validation_loader, test_loader
    
    except Exception as e:
        print('Unable to get data due to ', e)
        
    
def visualize_specific_predictions(specific_predictions):
    target_values = []
    predicted_values = []
    figure = plt.figure(figsize=(20, 8))
    columns = 4
    rows = 3
    axs = []
    for index, (images, attacks, targets, w_preds_before_attack, w_preds_after_attack) in list(enumerate(specific_predictions))[:12]:
        with warnings.catch_warnings(record=True):
            axs.append( figure.add_subplot(rows, columns, index+1) )
            axs[-1].set_title(f'Correct: {targets}, Predicted Before: {w_preds_before_attack}, Predicted After: {w_preds_after_attack}', fontsize=10)
            axs[-1].axis("off")
            plt.imshow(images.cpu().detach().numpy().reshape(28, 28),cmap="gray")
            plt.imshow(attacks.cpu().detach().numpy().reshape(28, 28),cmap="gray")
    plt.show()
    

# 1. Attacking an ML-model (30 points) 

In this section, we implement an attack ourselves. First, however, you need a model you can attack. Feel free to choose the DNN/ConvNN from task 1.



## 1.1: Setting up the model and data (4 Points)

Load the MNIST data, as done in task 1. 

Re-use the model from task 1 here and train it until it achieves reasonable accuracy (>92%).

If you have the saved checkpoint from task 1, you can load it directly. But please compute here the test accuracy using this checkpoint.  

**Hint:** In order to save computation time for the rest of exercise, you might consider having a relatively small model here.

**Hint**: You might want to save the trained model to save time later.

In [5]:
## 1. Loading data

DATA_PATH = './data/'
TRAIN_BATCH_SIZE, TEST_BATCH_SIZE = 64, 64
train_loader, validation_loader, test_loader = _get_data(DATA_PATH, TRAIN_BATCH_SIZE, TEST_BATCH_SIZE)

In [6]:
## 2. Defining model

class CNN_Network(nn.Module):
    def __init__(self, model_params):
        """
        This class is created to specify the Convolutional Neural Network on which MNIST dataset is trained on, 
        validated and later tested. 
        It consist of one input layer, one output layer can consist of multiple hidden layers all of which is 
        specified by the user as provided through model_paramaters
        Size of the kernel, stride and padding can also be adjusted by the user as provided through model_paramaters
        ...

        Parameters
        ----------
        model_params : dictionary
            provides the model with the required input size, hidden layers and output size
            
            model_params = {
            'INPUT_SIZE' : int,
            'HIDDEN_LAYERS' : list(int),
            'OUTPUT_SIZE' : int,
            'KERNEL' : int,
            'STRIDE' : int,
            'PADDING' : int
        }
        """
        try:
            super(CNN_Network, self).__init__()
            
            layers = []
            
            for input_channel, out_channel in zip([model_params['INPUT_SIZE']] + model_params['HIDDEN_LAYERS'][:-1], 
                                                     model_params['HIDDEN_LAYERS'][:len(model_params['HIDDEN_LAYERS'])]):
                layers.append(nn.Conv2d(input_channel, out_channel, model_params['KERNEL'], model_params['STRIDE'], model_params['PADDING'], bias=True))
                layers.append(nn.MaxPool2d(2, 2))
                layers.append(nn.ReLU())
            layers.append(nn.Flatten(1))      
            layers.append(nn.Linear(model_params['HIDDEN_LAYERS'][-1], model_params['OUTPUT_SIZE'], bias=True))

            self.layers = nn.Sequential(*layers)
        
        except Exception as e:
            print('initializing failed due to ', e)
    
    def forward(self, x):
        try:
            return self.layers(x)
        
        except Exception as e:
            print('forward pass failed due to ', e)
      

In [7]:
## 3. initializing the pre-trained model from assignment 1
model_params = {
        'INPUT_SIZE' : 1,
        'HIDDEN_LAYERS' : [160, 100, 64, 10],
        'OUTPUT_SIZE' : 10,
        'KERNEL' : 3,
        'STRIDE' : 1,
        'PADDING' : 1
}
  
undefended_model = CNN_Network(model_params).to(device)

In [8]:
# loading checkpoint and evaluating on test set
def _test_model(model, test_loader, BEST_MODEL):
    try:
        model.load_state_dict(torch.load(BEST_MODEL))
        with torch.no_grad():
            correct_predictions = []
            testing_acc_scores = []
            wrong_predictions = []
            all_targets = []
            all_preds = []


            for images, targets in iter(test_loader):
                images = images.to(device)
                targets = targets.to(device)
                outputs = model(images)
                
                _, preds = torch.max(outputs, 1)
                correct_indicies = (preds == targets).nonzero(as_tuple=True)[0]
                c_images = images[correct_indicies]
                c_targets = targets[correct_indicies]
                c_correct_preds = preds[correct_indicies]
                testing_acc_scores.append(len(correct_indicies)/targets.shape[0])

                wrong_indicies = (preds != targets).nonzero(as_tuple=True)[0]
                w_images = images[wrong_indicies]
                w_targets = targets[wrong_indicies]
                w_wrong_preds = preds[wrong_indicies]
            
                correct_predictions += zip(c_images, c_targets, c_correct_preds)
                wrong_predictions += zip(w_images, w_targets, w_wrong_preds)
                all_targets+= zip(targets.cpu().numpy())
                all_preds+= zip(preds.cpu().numpy())

            return (sum(testing_acc_scores)/len(testing_acc_scores))*100, correct_predictions, wrong_predictions, all_targets, all_preds
        
    except Exception as e:
            print('Error occured in testing the model = ', e)

In [9]:
(test_accuracy, 
 correct_predictions, 
 wrong_predictions, 
 all_targets, all_preds) = _test_model(undefended_model, test_loader, 
                                       BEST_MODEL='Accuracy_99.8875_batchsize_64_lr_0.001.ckpt' )

In [10]:
print(f'Our test_accuracy is {test_accuracy}')

Our test_accuracy is 99.19386942675159


In [11]:
print(sklearn.metrics.classification_report(all_targets, all_preds))

              precision    recall  f1-score   support

           0       0.99      1.00      1.00       980
           1       0.99      1.00      1.00      1135
           2       0.99      0.99      0.99      1032
           3       1.00      0.99      0.99      1010
           4       0.99      0.99      0.99       982
           5       1.00      0.99      0.99       892
           6       0.99      0.99      0.99       958
           7       0.99      0.99      0.99      1028
           8       0.99      0.99      0.99       974
           9       0.99      0.99      0.99      1009

    accuracy                           0.99     10000
   macro avg       0.99      0.99      0.99     10000
weighted avg       0.99      0.99      0.99     10000



## 1.2: Implementing the FGSM attack (7 Points)

We now want to attack the model trained in the previous step. We will start with the FGSM attack as a simple example. 

Please implement the FGSM attack mentioned in the lecture. 

More details: https://arxiv.org/pdf/1412.6572.pdf


In [12]:
def fgsm_attack(images, epsilon, clip_min, clip_max):
 
    attack_images = images + epsilon*images.grad.sign()
    attack_images = torch.clamp(attack_images, clip_min, clip_max)

    return attack_images

## 1.3: Adversarial sample set (7 Points)

* Please generate a dataset containing at least 1,000 adversarial examples using FGSM.

* Please vary the perturbation budget (3 variants) and generate 1,000 adversarial examples for each. 
    * **Hint**: you can choose epsilons within, e.g., = [.05, .1, .15, .2, .25, .3],  using MNIST pixel values in the interval       [0, 1]

* Compute the accuracy of each attack set. 

In [13]:
TEST_BATCH_SIZE = 1000
_, _, test_loader = _get_data(DATA_PATH, TRAIN_BATCH_SIZE, TEST_BATCH_SIZE)

fgsm_params = {'epsilons':  [0, .05, .1, .15, .2, .25, .3, .35],
               'clip_min': 0.,
               'clip_max': 1.}

In [14]:
def _evaluate_and_attack(model, test_loader, fgsm_params):
    try:
        loss_func = nn.CrossEntropyLoss().to(device)
        predictions = []
        acc_scores = []
        mse_scores = []
        psnr_scores = []
        ssim_scores = []
        uqi_scores = []
        sam_scores = []
        vifp_scores = []

        for epsilon in tqdm(fgsm_params['epsilons']):
            
            (images, targets) = next(iter(test_loader))
            images, targets = images.to(device), targets.to(device)  
            images.requires_grad = True 
            
            outputs = model(images)
            _, _preds_before_attack = torch.max(outputs,1)
            loss = loss_func(outputs, targets)
            undefended_model.zero_grad()
            loss.backward()
            
            attack_images = fgsm_attack(images, epsilon, fgsm_params['clip_min'], fgsm_params['clip_max'])
            _, _preds_after_attack = torch.max(model(attack_images),1)
            indicies = (_preds_after_attack != targets).nonzero(as_tuple=True)[0]
            predictions.append(zip(images[indicies], attack_images[indicies], targets[indicies], _preds_before_attack[indicies], _preds_after_attack[indicies]))
            
            acc_scores.append(len((_preds_after_attack == targets).nonzero(as_tuple=True)[0])/targets.shape[0]*100)
            mse_scores.append(mse(images.cpu().detach().numpy(),attack_images.cpu().detach().numpy()))
            psnr_scores.append(psnr(images.cpu().detach().numpy(),attack_images.cpu().detach().numpy(), MAX=1))
            # psnrb_scores.append(psnrb(images.cpu().detach().numpy(),attack_images.cpu().detach().numpy()))
            ssim_scores.append(ssim(attack_images.view(1000, -1).detach().numpy(),images.view(1000, -1).detach().numpy(), MAX=1))

            uqi_scores.append(uqi(attack_images.view(1000, -1).detach().numpy(),images.view(1000, -1).detach().numpy()))
            sam_scores.append(sam(attack_images.view(1000, -1).detach().numpy(),images.view(1000, -1).detach().numpy()))
            vifp_scores.append(vifp(attack_images.view(1000, -1).detach().numpy(),images.view(1000, -1).detach().numpy()))
            print(f"Epsilon: {epsilon}\tTest Accuracy = {acc_scores[-1]} % \tMSE = {mse_scores[-1]} \tPSNR = {psnr_scores[-1]} \tVIFP = {vifp_scores[-1]}")
        
        variants = pd.DataFrame({'Epsilon': fgsm_params['epsilons'], 'Accuracy': acc_scores, 'MSE': mse_scores, 'PSNR': psnr_scores, 'UQI': uqi_scores
                                , 'SAM': sam_scores, 'VIFP': vifp_scores, 'SSIM': ssim_scores})
        return variants, predictions
    
    except Exception as e:
            print('Error occured in _evaluate_and_attack method = ', e)    

In [15]:
variants, predictions = _evaluate_and_attack(undefended_model, test_loader, fgsm_params)

  0%|                                                                                            | 0/8 [00:00<?, ?it/s]


Error occured in _evaluate_and_attack method =  can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.


TypeError: cannot unpack non-iterable NoneType object

In [None]:
import seaborn as sns
sns.set(rc={"figure.figsize":(15, 8)}) 
sns.set_style("whitegrid")
clrs = ['grey' if (x > min(variants['Accuracy'])) else 'red' for x in variants['Accuracy']]

sns.barplot(x= variants['Epsilon'], y=  variants['Accuracy'], palette=clrs)

plt.xticks(fontsize=14)  
plt.title('Decrease of Accuracy with the increase in Epsilon Values', fontsize=20)
plt.xlabel('Epsilon Values', fontsize=16)
plt.ylabel('Accuracy', fontsize=16)
plt.grid(alpha = 0.3, linestyle = '--', linewidth = 2)
plt.show()

#### Moreover, it has been observed that similarity metrics can also be used to highlight the presence of an adversarial attack in an image when compared with its benign counterpart. Thus, these scores can be a measure to quantify the amount of perturbations brought in by these attacks.

In [None]:
variants

In [None]:
fgsm_selected_sets = variants[['Epsilon', 'Accuracy']].sort_values(by='Accuracy', ascending=True).head(3)
fgsm_selected_sets

## 1.4: Visualizing the results (7 Points)

* Please chose one sample for each class (for example the first when iterating the test data) and plot the (ten) adversarial examples as well as the predicted label (before and after the attack)

* Please repeat the visualization for the three sets you have created 

In [None]:
#
#
# ------- Your Code -------
#
#

# template code (Please feel free to change this)
# (each column corresponds to one attack method)
# col_titles = ['Ori','FGSM','Method 1', 'Method 2'] 
# nsamples = 10
# nrows = nsamples
# ncols = len(col_titles)

# fig, axes = plt.subplots(nrows,ncols,figsize=(8,12))  # create the figure with subplots
# [ax.set_axis_off() for ax in axes.ravel()]  # remove the axis

# for ax, col in zip(axes[0], col_titles): # set up the title for each column
#     ax.set_title(col,fontdict={'fontsize':18,'color':'b'})

# for i in range(nsamples):
#     axes[i,0].imshow(images_ori[i])
#     axes[i,1].imshow(adv_FGSM[i])
#     axes[i,2].imshow(adv_Method1[i])
#     axes[i,3].imshow(adv_Method2[i])
                  

In [None]:
def visualize_specific_predictions(specific_predictions):
    target_values = []
    predicted_values = []
    figure = plt.figure(figsize=(20, 8))
    columns = 4
    rows = 3
    axs = []
    for index, (images, attacks, targets, w_preds_before_attack, w_preds_after_attack) in list(enumerate(specific_predictions))[:12]:
        with warnings.catch_warnings(record=True):
            axs.append( figure.add_subplot(rows, columns, index+1) )
            axs[-1].set_title(f'Correct: {targets}, Predicted Before: {w_preds_before_attack}, Predicted After: {w_preds_after_attack}', fontsize=10)
            axs[-1].axis("off")
            plt.imshow(images.cpu().detach().numpy().reshape(28, 28),cmap="gray")
            plt.imshow(attacks.cpu().detach().numpy().reshape(28, 28),cmap="gray")
    plt.show()

In [None]:
visualize_specific_predictions(predictions[variants.query('Epsilon==0.35').index.item()])

In [None]:
predictions[variants.query('Epsilon==0.35').index.item()]

In [None]:
visualize_specific_predictions(predictions[variants.query('Epsilon==0.3').index.item()])

In [None]:
visualize_specific_predictions(predictions[variants.query('Epsilon==0.25').index.item()])

## 1.5: Analyzing the results (5 Points)

Please write a brief summary of your findings.  

* Does the attack always succeed (the model makes wrong prediction on the adversarial sample)? What is the relationship between the attack success rate and the perturbation budget?
* How about the computation cost of the attack? (you can report the time in second) 
* Does the attack require white-box access to the model?
* Feel free to report your results via tables or figures, and mention any other interesting observations 



**Your answers go here**

# 2. Defending an ML model (35 points) 

So far, we have focused on attacking an ML model. In this section, we want you to defend your model. 


## 2.1: Implementing the adversarial training defense (20 Points)

* We would like to ask you to implement the adversarial training defense (https://arxiv.org/pdf/1412.6572.pdf) mentioned in the lecture. 

* You can use the **FGSM adversarial training** method (i.e., train on FGSM examples). 

* You can also check the adversarial training implementation in other papers, e.g., http://proceedings.mlr.press/v97/pang19a/pang19a.pdf 

* Choose a certain **maximum perturbation budget** during training that is in the middle of the range you have experimented with before. 

* We do not require the defense to work perfectly - but what we want you to understand is why it works or why it does not work.

**Hint:** You can save the checkpoint of the defended model as we would need it to for the third part of this exercise.


In [16]:

def _standard_training(loader, model):
    """Standard training/evaluation epoch over the dataset"""
    try:
        model.train()
        train_loss_scores = []
        training_acc_scores = []
        
        for batch_index, (images, targets) in enumerate(loader):
            images, targets = images.to(device), targets.to(device)
                
            outputs = model(images)
            loss = criterion(outputs, targets)
            train_loss_scores.append(loss.item())
                
            _, preds = torch.max(outputs, 1)
            correct_predictions = (preds==targets).sum().item()
            training_acc_scores.append(correct_predictions/targets.shape[0])
                
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
                
            # if (batch_index+1) % 100 == 0:
            #     print(f"Epoch : [{epoch+1}/{training_params['NUM_EPOCHS']}] | Step : [{batch_index+1}/{len(train_loader)}] | Loss : {loss.item()} ")
            
        # train_loss_history.append((sum(train_loss_scores)/len(train_loss_scores)))
        # training_acc_history.append((sum(training_acc_scores)/len(training_acc_scores))*100)      
        # print(f'Epoch : {epoch+1} | Loss : {train_loss_history[-1]} | Training Accuracy : {training_acc_history[-1]}%')

        return training_acc_scores, train_loss_scores
    
    except Exception as e:
        print('Error occured in standard training method = ', e)


def _adversarial_training_defense(loader, model, fgsm_params):
    """Adversarial training/evaluation epoch over the dataset"""
    try:
        model.train()
        train_loss_scores = []
        training_acc_scores = []
        
        for batch_index, (images, targets) in enumerate(loader):
            images, targets = images.to(device), targets.to(device)
            images.requires_grad = True
            
            # delta = fgsm_attack(images, fgsm_params['epsilon'], fgsm_params['clip_min'], fgsm_params['clip_max'])
            outputs = model(images+0)
            loss = criterion(outputs, targets)
            train_loss_scores.append(loss.item())
                
            _, preds = torch.max(outputs, 1)
            correct_predictions = (preds==targets).sum().item()
            training_acc_scores.append(correct_predictions/targets.shape[0])
                
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
                
            # if (batch_index+1) % 100 == 0:
            #     print(f"Epoch : [{epoch+1}/{training_params['NUM_EPOCHS']}] | Step : [{batch_index+1}/{len(train_loader)}] | Loss : {loss.item()} ")
            
        # train_loss_history.append((sum(train_loss_scores)/len(train_loss_scores)))
        # training_acc_history.append((sum(training_acc_scores)/len(training_acc_scores))*100)      
        # print(f'Epoch : {epoch+1} | Loss : {train_loss_history[-1]} | Training Accuracy : {training_acc_history[-1]}%')

        return training_acc_scores, train_loss_scores
    
    except Exception as e:
        print('Error occured in adversarial defense method = ', e)


In [18]:
training_params = {
        'TRAIN_BATCH_SIZE' : 32,
        'TEST_BATCH_SIZE' : 1000,
        'LEARNING_RATE' : 0.001,
        'OPTIMIZER': optim.Adam,
        'NUM_EPOCHS' : 2
}

fgsm_params = {'epsilon':  .35,
               'clip_min': 0.,
               'clip_max': 1.}
    
    
model = CNN_Network(model_params).to(device)
model.load_state_dict(torch.load('Accuracy_99.8875_batchsize_64_lr_0.001.ckpt'))
print(f'Network structure is: {model.parameters}')
print(f'Total number of parameters: {sum(p.numel() for p in model.parameters())}')

criterion = nn.CrossEntropyLoss().to(device)
optimizer = training_params['OPTIMIZER'](model.parameters(), lr=training_params['LEARNING_RATE'])

for t in range(2):
    train_err, train_loss = _adversarial_training_defense(train_loader, model, fgsm_params)
    test_err, test_loss = _standard_training(test_loader, model)
    adv_err, adv_loss = _adversarial_training_defense(test_loader, model, fgsm_params)
    print(train_err, test_err, adv_err)
torch.save(model.state_dict(), "model.ckpt")

Network structure is: <bound method Module.parameters of CNN_Network(
  (layers): Sequential(
    (0): Conv2d(1, 160, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (2): ReLU()
    (3): Conv2d(160, 100, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (5): ReLU()
    (6): Conv2d(100, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (7): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (8): ReLU()
    (9): Conv2d(64, 10, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (10): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (11): ReLU()
    (12): Flatten(start_dim=1, end_dim=-1)
    (13): Linear(in_features=10, out_features=10, bias=True)
  )
)>
Total number of parameters: 209244
[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.984375, 1.0, 1.0,

In [None]:

def _adversarial_training_defense(model, train_loader, validation_loader, attack, criterion, optimizer, training_params, set_device, tuning=False):
    try:
        best_accuracy = 0
        train_loss_history = []
        validation_loss_history = []
        training_acc_history = []
        validation_acc_history = []
        
        for epoch in range(0, training_params['NUM_EPOCHS']):
            model.train()
            train_loss_scores = []
            training_acc_scores = []
            correct_predictions = 0
            
            for batch_index, (images, targets) in enumerate(train_loader):
                images, targets = images.to(set_device), targets.to(set_device)
                
                outputs = model(images)
                loss = criterion(outputs, targets)
                train_loss_scores.append(loss.item())
                
                _, preds = torch.max(outputs, 1)
                correct_predictions = (preds==targets).sum().item()
                training_acc_scores.append(correct_predictions/targets.shape[0])
                
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
                
                if not tuning:
                    if (batch_index+1) % 100 == 0:
                        print(f"Epoch : [{epoch+1}/{training_params['NUM_EPOCHS']}] | Step : [{batch_index+1}/{len(train_loader)}] | Loss : {loss.item()} ")
            
            train_loss_history.append((sum(train_loss_scores)/len(train_loss_scores)))
            training_acc_history.append((sum(training_acc_scores)/len(training_acc_scores))*100)      
            print(f'Epoch : {epoch+1} | Loss : {train_loss_history[-1]} | Training Accuracy : {training_acc_history[-1]}%')
       
            return train_loss_history, training_acc_history
    
    except Exception as e:
        print('Error occured in training method = ', e)

In [None]:


train_loss, validation_loss, train_acc, validation_acc = _network_training(model, train_loader, validation_loader, criterion, optimizer, training_params, device)

## 2.2: Evaluation (10 Points)

* Craft adversarial examples using the **defended** model. This entails at least 1,000 examples crafted via FGSM. 
    * Create one set using a budget that is **less than (within)** the one used in training.
    * Create another set using a budget that is **higher than** the one used in training. 
    * You can use two values of epsilons from question 1.3 
    
* Evaluate the **defended** model on these two adversarial examples sets. 


In [None]:
#
#
# ------- Your Code -------
#
#

print('Accuracy on the lower-budget adversarial samples (FGSM) %.2f'%acc_FGSM1)
print('Accuracy on the lower-budget adversarial samples (FGSM) after defense %.2f'%acc_FGSM_defend1)

print('Accuracy on the higher-budget adversarial samples (FGSM) %.2f'%acc_FGSM2)
print('Accuracy on the higher-budget adversarial samples (FGSM) after defense %.2f'%acc_FGSM_defend2)

## 2.3 Discussion (5 points)

* How successful was the defense against the attack compared to the undefended model? How do you interpret the difference?
* How did the two sets differ?

**Your answers go here**

# 3: I-FGSM attack (35 points) 

* FGSM is one of the simplest and earliest attacks. Since then, many more advanced attacks have been proposed. 
* One of them is the Iterative-FGSM (https://arxiv.org/pdf/1607.02533.pdf), where the attack is repeated multiple times.
* In this part, we ask you to please implement the iterative FGSM attack. 



## 3.1: Implementing the I-FGSM attack (10 Points)

**Hints**: 

* Your code should have an attack loop. At each step, the FGSM attack that you have implemented before is computed using a small step.
* After each step, you should perform a per-pixel clipping to make sure the image is in the allowed range, and that the perturbation is within budget.


In [None]:
#
#
# ------- Your Code -------
#
#


## 3.2: Attack the undefended model (5 Points)

* We will first attack the **undefended model** (i.e., without adversarial training).

* Choose one perturbation budget from Question **1.3** for comparison. 

    * Hint: A simple way to choose the small step is to divide the total budget by the number of steps (e.g., 10).

* Please generate 1000 adversarial examples using the **undefended** model and the **I-FGSM** you implemented. 

* Please compute the accuracy of the adversarial set on the **undefended** model. 

In [None]:
#
#
# ------- Your Code -------
#
#


### 3.2.1: Findings and comparison with FGSM (8 points)

* Please report your findings. How successful was the attack? 

* What do you expect when increasing the number of steps? (you can experiment with different parameters of the attack and report your findings) 

* Compare with the basic FGSM. Using the same perturbation budget and using the same model, which attack is more successful? Why do you think this is the case? What about the computation time?

* Feel free to report any interesting observations. 

**Your answers go here**

## 3.3: Attack the defended model (5 poinst) 

* In the previous question, we attacked the **undefended model**. 

* Now, we want to explore how successful the previous implemented defense (FGSM adversarial training) is againts this new attack. (we will not implement a new defense here, we will be reusing your previous checkpoint of the **defended model**)


* Use the **defended model** to create one set of adversarial examples. Use a perturbation budget from Question **2.2** for comparison.  

In [None]:
#
#
# ------- Your Code -------
#
#

### 3.3.1: Discussion (7 points) 
* Please report your results. How successful was the attack on the defended model? 
* Compare it with the success of the FGSM attack on the defended model. What do you observe? How do you interpret the difference? 
* How do you think you can improve the defense against I-FGSM attack?


* Feel free to state any interesting findings you encountered during this project.

**Your answers go here**