# STOR 566, Homework 5
### Instructor: Yao Li
### Keywords: Adversarial Robustness
### Due date: Oct 30, 11:55pm
### **Submission Instruction**

- Please download this script and use it to answer the questions in the homework.
- For submission, please include your code, its output, and your answers in the script, and submit the .ipynb file on Sakai.
- Please don't modify existing cells, but you can add cells between the exercise statements.
- To write markdown, switch the cell type from code to markdown (hit 'm' in command mode) and use the markdown language. For a brief tutorial see: https://daringfireball.net/projects/markdown/syntax

### **References:**

- Follow the [PyTorch setup instructions](https://pytorch.org/get-started/locally/).
- A useful tutorial: [Learning PyTorch with Examples](https://pytorch.org/tutorials/beginner/pytorch_with_examples.html).
- PyTorch optimization methods: [`torch.optim` documentation](https://pytorch.org/docs/stable/optim.html).
- Reference implementations of different attack methods: [adversarial-attacks-pytorch](https://github.com/Harry24k/adversarial-attacks-pytorch).


### **Method Review**

- PGD: L-infinity norm restricted attack
	\begin{align}
	x^{t+1}=\Pi_{\epsilon}\{x^t+\alpha\cdot\text{sign}(\nabla_xL(\theta,x^t,y)),x_0\}
	\end{align}	
    - $x^{t+1}$: the adversarial example after step $t+1$
    - $x_0$: the clean input
    - $\epsilon$: the maximum allowed perturbation, measured in the L-infinity norm
    - $\Pi_\epsilon$: projection onto the epsilon-ball around $x_0$
    - $\alpha$: step size
    - $\text{sign}(\nabla_xL(\theta,x^t,y))$: sign of the loss gradient with respect to the input, evaluated at the current iterate $x^t$
    - Don't forget to project $x^{t+1}$ to the valid pixel value range. If you don't modify the data loader, the valid pixel value range is $[0,1]$.
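
The update rule above can be sketched in a few lines of PyTorch. This is a minimal illustration rather than a reference solution: the function name `pgd_linf`, the toy linear model, and the random inputs are all assumptions made for the example, and the loss is assumed to be cross-entropy.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def pgd_linf(model, x0, y, eps=0.03, alpha=0.01, steps=20):
    """Untargeted L-infinity PGD started from the clean input x0 (hypothetical helper)."""
    x_adv = x0.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        # Gradient with respect to the input only; model parameters are untouched.
        grad, = torch.autograd.grad(loss, x_adv)
        # Ascent step along the sign of the gradient.
        x_adv = x_adv.detach() + alpha * grad.sign()
        # Project onto the eps-ball around x0, then onto the valid pixel range [0, 1].
        x_adv = torch.min(torch.max(x_adv, x0 - eps), x0 + eps)
        x_adv = x_adv.clamp(0.0, 1.0)
    return x_adv.detach()

# Toy usage: a random linear classifier stands in for the real network.
torch.manual_seed(0)
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 10))
x_clean = torch.rand(4, 3, 8, 8)
labels = torch.randint(0, 10, (4,))
x_pgd = pgd_linf(toy_model, x_clean, labels)
```

The two projections are applied in sequence after every step, so each iterate stays inside both the epsilon-ball and the pixel range. For a network with batch normalization, the model would normally be put in evaluation mode before calling such a function.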

- C&W: L-2 norm targeted attack
	\begin{align}
	x^{*}=\arg\min_{x}\|x-x_0\|_2^2+\lambda\cdot\max\{\max_{j\ne t}f_j(x)-f_t(x),-\kappa\}
	\end{align}	
    - $x^{*}$: the adversarial example
    - $x_0$: the clean input
    - $\lambda$: trade-off parameter that balances distortion against attack success
    - $f_j(x)$: the logit (predicted score) of class $j$
    - $t$: the target class
    - $\kappa$: confidence margin ($\kappa\ge 0$); the second term stops decreasing once the target logit exceeds every other logit by at least $\kappa$
    - Don't forget to project $x^{*}$ to $[0,1]$.
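
The objective above can be sketched as a loss function plus a simple optimization loop. This is a minimal sketch under several assumptions: the names `cw_loss` and `cw_targeted` are hypothetical, the margin term uses the standard $-\kappa$ floor, Adam with a learning rate of 0.01 is an arbitrary choice, and the pixel constraint is enforced by clamping after each step rather than by the tanh change of variables used in the original C&W paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def cw_loss(logits, x, x0, target, lam=1.0, kappa=0.0):
    """Squared L2 distortion plus lam times the targeted margin term, summed over the batch."""
    dist = ((x - x0) ** 2).flatten(1).sum(dim=1)
    target_logit = logits.gather(1, target.unsqueeze(1)).squeeze(1)
    # Exclude the target class before taking the max over j != t.
    is_target = F.one_hot(target, logits.size(1)).bool()
    other_logit = logits.masked_fill(is_target, float("-inf")).max(dim=1).values
    margin = torch.clamp(other_logit - target_logit, min=-kappa)
    return (dist + lam * margin).sum()

def cw_targeted(model, x0, target, lam=1.0, kappa=0.0, steps=50, lr=0.01):
    """Minimize cw_loss over x directly, clamping to [0, 1] after each step."""
    x = x0.clone().detach().requires_grad_(True)
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        loss = cw_loss(model(x), x, x0, target, lam, kappa)
        # Only the input gradient is needed; model parameters get no .grad.
        x.grad = torch.autograd.grad(loss, x)[0]
        opt.step()
        with torch.no_grad():
            x.clamp_(0.0, 1.0)
    return x.detach()

# Toy usage: a random linear classifier and random images, target class 1.
torch.manual_seed(0)
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 10))
x_clean = torch.rand(4, 3, 8, 8)
target = torch.ones(4, dtype=torch.long)
x_cw = cw_targeted(toy_model, x_clean, target)
```

Summing rather than averaging over the batch keeps each sample's gradient independent of the batch size, since every sample has its own independent objective.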

- Adversarial training
	\begin{align}
	\arg\min_\theta E_{(x,y)\sim D}\{\max_{\|\delta\|\le\epsilon}L(\theta,x+\delta,y)\}
	\end{align}	
    - $\theta$: model parameters
    - $D$: clean data distribution
    - $\delta$: adversarial perturbation
    - For each batch, generate adversarial examples from the samples in the batch, then update the model using those adversarial examples.
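
The min-max objective above translates into a loop with an inner attack and an outer parameter update. The following is a minimal sketch, not a prescribed implementation: `adv_train_epoch` and the compact `pgd_linf` helper are hypothetical names, cross-entropy is assumed as the loss, and switching the model to evaluation mode while crafting the adversarial examples is one possible design choice among several.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset

def pgd_linf(model, x0, y, eps, alpha, steps):
    """Compact L-infinity PGD used for the inner maximization (hypothetical helper)."""
    x_adv = x0.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        grad, = torch.autograd.grad(F.cross_entropy(model(x_adv), y), x_adv)
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x0 - eps), x0 + eps).clamp(0.0, 1.0)
    return x_adv.detach()

def adv_train_epoch(model, loader, optimizer, eps=0.03, alpha=0.01, steps=7):
    """One epoch of adversarial training; returns the mean adversarial loss."""
    device = next(model.parameters()).device
    total_loss, n_seen = 0.0, 0
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        # Inner maximization: craft adversarial examples for this batch.
        model.eval()
        x_adv = pgd_linf(model, x, y, eps, alpha, steps)
        # Outer minimization: update the parameters on the adversarial batch.
        model.train()
        optimizer.zero_grad()
        loss = F.cross_entropy(model(x_adv), y)
        loss.backward()
        optimizer.step()
        total_loss += loss.item() * x.size(0)
        n_seen += x.size(0)
    return total_loss / n_seen

# Toy usage: random data and a linear classifier stand in for CIFAR10 and VGG11.
torch.manual_seed(0)
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 10))
toy_data = TensorDataset(torch.rand(16, 3, 8, 8), torch.randint(0, 10, (16,)))
toy_loader = DataLoader(toy_data, batch_size=8)
toy_opt = torch.optim.SGD(toy_model.parameters(), lr=0.1)
weights_before = toy_model[1].weight.detach().clone()
avg_loss = adv_train_epoch(toy_model, toy_loader, toy_opt)
```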
    
### **Evaluation Metrics:**

- Testing accuracy: 
	\begin{align}
	\frac{1}{N}\sum_{i=1}^N {\bf 1}(\hat{y}_i=y_i)
	\end{align}	
    - $N$: the total number of samples in the testing set
    - $y_i$: true label of sample $x_i$
    - $\hat{y}_i$: predicted label by the model
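
As an illustration, the formula above amounts to counting correct argmax predictions over the test loader. The function name `accuracy` and the toy model and batches below are assumptions made for the sketch.

```python
import torch
import torch.nn as nn

@torch.no_grad()
def accuracy(model, loader):
    """Fraction of samples whose argmax prediction equals the true label."""
    model.eval()
    device = next(model.parameters()).device
    correct, total = 0, 0
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        pred = model(x).argmax(dim=1)
        correct += (pred == y).sum().item()
        total += y.size(0)
    return correct / total

# Toy usage: an identity "classifier" on 2-dimensional inputs, two small batches.
toy_model = nn.Linear(2, 2, bias=False)
with torch.no_grad():
    toy_model.weight.copy_(torch.eye(2))
inputs = torch.tensor([[1.0, 0.0], [0.0, 1.0]])
toy_batches = [(inputs, torch.tensor([0, 1])), (inputs, torch.tensor([1, 1]))]
acc = accuracy(toy_model, toy_batches)  # 3 of the 4 labels match
```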

- Robust Testing Accuracy: testing accuracy on adversarial examples

	\begin{align}
	\frac{1}{N}\sum_{i=1}^N {\bf 1}(c(x^*_i)=y_i)
	\end{align}	
    - $N$: the total number of samples in the testing set
    - $x^*_i$: adversarial example generated from $x_i$
    - $c(\cdot)$: returns the label predicted by the model
    - $c(x^*_i)$: predicted label of adversarial example by the model
    - $y_i$: true label of sample $x_i$
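
Robust accuracy differs from clean accuracy only in that each batch is passed through an attack before prediction. A minimal sketch follows, assuming a hypothetical `attack(model, x, y)` callable that returns adversarial inputs; the "flip" attack in the usage example is a deliberately artificial stand-in, not a real adversarial attack.

```python
import torch
import torch.nn as nn

def robust_accuracy(model, loader, attack):
    """Accuracy on attack(model, x, y) instead of on the clean inputs x."""
    model.eval()
    device = next(model.parameters()).device
    correct, total = 0, 0
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        # The attack usually needs gradients, so it runs outside no_grad.
        x_adv = attack(model, x, y)
        with torch.no_grad():
            pred = model(x_adv).argmax(dim=1)
        correct += (pred == y).sum().item()
        total += y.size(0)
    return correct / total

# Toy usage: an identity "classifier" and two stand-in attacks.
toy_model = nn.Linear(2, 2, bias=False)
with torch.no_grad():
    toy_model.weight.copy_(torch.eye(2))
inputs = torch.tensor([[1.0, 0.0], [0.0, 1.0]])
toy_batches = [(inputs, torch.tensor([0, 1])), (inputs, torch.tensor([1, 1]))]
acc_identity = robust_accuracy(toy_model, toy_batches, lambda m, x, y: x)
acc_flipped = robust_accuracy(toy_model, toy_batches, lambda m, x, y: 1.0 - x)
```

With the identity attack the result equals the clean testing accuracy, which gives a quick sanity check for the evaluation loop.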

## Problem 1 (50 points)

In this problem you will practice implementing the **PGD attack** and the **targeted C&W attack** on the CIFAR10 dataset.

**Data.** You will use the CIFAR10 classification dataset (10 classes). PyTorch/torchvision provides a dataloader that automatically downloads the data and loads it in batches. The data loader code is provided in the template. Please don't modify the data loading part.

In [None]:
import time
import copy
import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
import numpy as np
import torch.utils.data as td
import random
import matplotlib.pyplot as plt
import torchvision
import PIL.Image as Image
from tqdm import tqdm
from torch.autograd import Variable
from torchvision import datasets, transforms
from torch.utils.data.sampler import SubsetRandomSampler
device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")

In [None]:
## Data loading code chunk, please don't modify it. 
## However, you can adjust the batch size if you want to.
batch_size_cifar = 128

def cifar_loaders(batch_size, shuffle_test=False): 
    data_dir = './data'
    train = datasets.CIFAR10(data_dir, train=True, download=True, 
        transform=transforms.Compose([
            transforms.ToTensor(),
        ]))
    # Once you have downloaded the data by setting download=True, you can
    # change download=True to download=False
    test = datasets.CIFAR10(data_dir, train=False, 
        transform=transforms.Compose([transforms.ToTensor()]))
    train_loader = torch.utils.data.DataLoader(train, batch_size=batch_size,
        shuffle=True, pin_memory=True)
    test_loader = torch.utils.data.DataLoader(test, batch_size=batch_size,
        shuffle=shuffle_test, pin_memory=True)
    return train_loader, test_loader

train_cifar_loader, test_cifar_loader = cifar_loaders(batch_size_cifar)

# Get the loader that only loads class 0 image (for question 1.d)
def subset_loaders(batch_size=128): 
    data_dir = './data'
    test = datasets.CIFAR10(data_dir, train=False, 
        transform=transforms.Compose([transforms.ToTensor()]))
    # Flatten to a plain list of integer indices so the sampler yields scalars
    subset_indices = (torch.tensor(test.targets) == 0).nonzero().flatten().tolist()
    test_loader = torch.utils.data.DataLoader(test,batch_size=batch_size, 
          shuffle=False,sampler=SubsetRandomSampler(subset_indices))
    return test_loader

sub_loader = subset_loaders()

In [None]:
## Load the pre-trained VGG11bn model
## For this step, you need to put the vgg.py and vgg11_bn.pt under your working directory
## Both files are in the shared Google Drive folder for HW5, linked from the course website
## The model is from https://github.com/huyvnphan/PyTorch_CIFAR10
from vgg import vgg11_bn
vgg11 = vgg11_bn()
## map_location lets the checkpoint load on CPU-only machines as well
vgg11.load_state_dict(torch.load('vgg11_bn.pt', map_location=device))
vgg11.to(device)

### **Problem Description.** Generate adversarial examples on CIFAR10 using PGD and C&W.

### (a) (5 points) Check the testing accuracy of the pre-trained VGG11 model on the test set of CIFAR10. The expected testing accuracy is 92.39%. 

### (b) (15 points) Implement PGD with L-infinity norm, epsilon=0.03, step size=0.01, max steps=20, to attack the pre-trained VGG11 and generate adversarial examples with the CIFAR10 test set. Report the robust testing accuracy.

### (c) (5 points) Plot a batch of adversarial examples (PGD) and the corresponding test samples.
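
One possible layout for such a plot is a two-row grid with clean samples on top and their adversarial counterparts below. The sketch below assumes image tensors of shape `(N, 3, H, W)` with values in $[0,1]$; the function name `plot_clean_vs_adv` is hypothetical, and the random sign noise in the usage example is only a stand-in for real PGD output.

```python
import torch
import matplotlib.pyplot as plt

def plot_clean_vs_adv(x_clean, x_adv, n=8):
    """Show the first n clean images (top row) above their adversarial versions (bottom row)."""
    n = min(n, x_clean.size(0))
    fig, axes = plt.subplots(2, n, figsize=(1.5 * n, 3.2), squeeze=False)
    for row, (batch, title) in enumerate([(x_clean, "clean"), (x_adv, "adversarial")]):
        for i in range(n):
            # imshow expects channels last: (C, H, W) -> (H, W, C).
            img = batch[i].detach().cpu().permute(1, 2, 0).numpy()
            axes[row, i].imshow(img)
            axes[row, i].axis("off")
        axes[row, 0].set_title(title, fontsize=8, loc="left")
    fig.tight_layout()
    return fig

# Toy usage: random images with random +/-0.03 noise standing in for a PGD batch.
torch.manual_seed(0)
x_demo = torch.rand(4, 3, 32, 32)
x_demo_adv = (x_demo + 0.03 * torch.sign(torch.randn_like(x_demo))).clamp(0.0, 1.0)
fig = plot_clean_vs_adv(x_demo, x_demo_adv, n=4)
```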

### (d) (25 points) Implement targeted C&W with L2 norm, lambda=1, confidence=0, max steps=50, using CIFAR10 class 0 test samples. The target class is class 1. Report the testing accuracy on the original class 0 samples, and the proportion of adversarial examples classified as class 1.

## Problem 2 (50 points)

In this problem you will practice implementing **adversarial training** on CIFAR10.

**Data.** You will use the CIFAR10 classification dataset (10 classes). PyTorch/torchvision provides a dataloader that automatically downloads the data and loads it in batches. The data loader code is provided in the template. Please don't modify the data loading part.

### **Problem Description.** Implement **adversarial training** to train VGG11 on CIFAR10.

### (a) (25 points) Implement adversarial training (generate PGD adversarial examples every iteration and train the model with adversarial examples) with PGD (L-infinity norm, epsilon=0.03, step size=0.01, max steps=7). You can initialize the model with the pre-trained VGG11. Do adversarial training for at least 10 epochs.

### (b) (20 points) Implement PGD with L-infinity norm, epsilon=0.03, step size=0.01, max steps=20, to attack the **adversarially trained** VGG11 and generate adversarial examples with the CIFAR10 test set. Report the testing accuracy and robust testing accuracy of the **adversarially trained** VGG11 on CIFAR10 test set.

### (c) (5 points) Please compare the performance of the pre-trained VGG11 and the adversarially trained VGG11 against the PGD attack.