
# **Homework 10 - Adversarial Attack**

## Environment & Download

We use [pytorchcv](https://pypi.org/project/pytorchcv/) to obtain CIFAR-10 pretrained models, so we need to set up the environment first. We also need to download the data (200 images) that we want to attack.

In [None]:
# set up environment
!pip install pytorchcv
!pip install imgaug

# download
!gdown --id 1t2UFQXr1cr5qLMBK2oN2rY1NDypi9Nyw --output data.zip

# if the above link isn't available, try this one
# !wget https://www.dropbox.com/s/lbpypqamqjpt2qz/data.zip

# unzip
!unzip ./data.zip

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting pytorchcv
  Downloading pytorchcv-0.0.67-py2.py3-none-any.whl (532 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 532.4/532.4 kB 9.6 MB/s eta 0:00:00
Installing collected packages: pytorchcv
Successfully installed pytorchcv-0.0.67
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Downloading...
From: https://drive.google.com/uc?id=1t2UFQXr1cr5qLMBK2oN2rY1NDypi9Nyw
To: /content/data.zip
100% 490k/490k [00:00<00:00, 129MB/s]
Archive:  ./data.zip
   creating: data/
   creating: data/deer/
 extracting: data/deer/deer13.png    
 extracting: data/deer/deer6.png     
 extracting: data/deer/deer11.png    
 extracting: data/deer/deer2.png     
 extracting: data/deer/deer10.png    
 extracting: data/deer/deer16.png    
 extracting: data/deer/deer9.png     
 extracting: data/deer/deer20.png    
 ext

In [None]:
!rm ./data.zip

In [None]:
import torch
import torch.nn as nn
from pytorchcv.model_provider import get_model as ptcv_get_model
import random
import numpy as np

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
batch_size = 8

def same_seeds(seed):
    torch.manual_seed(seed)
    if torch.cuda.is_available():
        torch.cuda.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)
    np.random.seed(seed)
    random.seed(seed)
    torch.backends.cudnn.benchmark = False
    torch.backends.cudnn.deterministic = True
same_seeds(0)

## Global Settings 
#### **[NOTE]**: Don't change the settings here, or your generated image might not meet the constraint.
* $\epsilon$ is fixed to be 8. However, in the **Data section** we first transform the raw pixel values (0-255 scale) **with ToTensor (to 0-1 scale)** and then **Normalize (subtract mean, divide by std)**, so $\epsilon$ should be set to $\frac{8}{255 * std}$ during the attack.

* Explanation (optional)
    * Denote the first pixel of the original image as $p$, and the first pixel of the adversarial image as $a$.
    * The $\epsilon$ constraint tells us $\left| p-a \right| \le 8$.
    * ToTensor() can be seen as a function where $T(x) = x/255$.
    * Normalize() can be seen as a function where $N(x) = (x-mean)/std$ where $mean$ and $std$ are constants.
    * After applying ToTensor() and Normalize() on $p$ and $a$, the constraint becomes $\left| N(T(p))-N(T(a)) \right| = \left| \frac{\frac{p}{255}-mean}{std}-\frac{\frac{a}{255}-mean}{std} \right| = \frac{1}{255 * std} \left| p-a \right| \le \frac{8}{255 * std}.$
    * So, we should set $\epsilon$ to be $\frac{8}{255 * std}$ after ToTensor() and Normalize().
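
The derivation above can be checked numerically. The following is a minimal sketch in plain Python; the helper names `to_normalized` and `normalized_epsilon` and the pixel values `p` and `a` are made up for illustration and are not part of the homework code.

```python
# CIFAR-10 channel statistics as used in this notebook
cifar_10_mean = (0.491, 0.482, 0.447)
cifar_10_std = (0.202, 0.199, 0.201)

def to_normalized(pixel, mean, std):
    """Apply ToTensor (divide by 255) followed by Normalize to one pixel value."""
    return (pixel / 255 - mean) / std

def normalized_epsilon(std, pixel_budget=8):
    """Per-channel budget in normalized space for a budget given in raw pixels."""
    return pixel_budget / 255 / std

p, a = 120, 128  # hypothetical pixel values that differ by exactly 8
for mean, std in zip(cifar_10_mean, cifar_10_std):
    gap = abs(to_normalized(p, mean, std) - to_normalized(a, mean, std))
    # a raw difference of 8 maps onto the normalized budget for that channel
    assert abs(gap - normalized_epsilon(std)) < 1e-9
    print(f'std = {std}: normalized epsilon = {normalized_epsilon(std):.4f}')
```

Because each channel has its own std, the normalized budget differs slightly per channel, which is why `epsilon` is kept as a tensor of shape (3, 1, 1) in the code.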

In [None]:
# the mean and std are the calculated statistics from cifar_10 dataset
cifar_10_mean = (0.491, 0.482, 0.447) # mean for the three channels of cifar_10 images
cifar_10_std = (0.202, 0.199, 0.201) # std for the three channels of cifar_10 images

# convert mean and std to 3-dimensional tensors for future operations
mean = torch.tensor(cifar_10_mean).to(device).view(3, 1, 1)
std = torch.tensor(cifar_10_std).to(device).view(3, 1, 1)

epsilon = 8/255/std

In [None]:
root = './data' # directory for storing benign images
# benign images: images which do not contain adversarial perturbations
# adversarial images: images which include adversarial perturbations

## Data

Construct dataset and dataloader from root directory. Note that we store the filename of each image for future usage.
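
The dataset assigns labels by the position of each class folder in sorted order. The sketch below illustrates why that matches the usual CIFAR-10 label indices, assuming the folders are named after the ten classes; `labels_from_dirs` and the shuffled `listing` are hypothetical stand-ins for what `glob` would return.

```python
import random

# CIFAR-10 class names; they happen to be in alphabetical order
classes = ['airplane', 'automobile', 'bird', 'cat', 'deer',
           'dog', 'frog', 'horse', 'ship', 'truck']

def labels_from_dirs(class_dirs):
    """Mimic AdvDataset: label = position of the folder in sorted order."""
    return {name: i for i, name in enumerate(sorted(class_dirs))}

# hypothetical directory listing in arbitrary order, as a filesystem may return it
listing = classes[:]
random.Random(0).shuffle(listing)

label_of = labels_from_dirs(listing)
# sorting restores the alphabetical order, so every folder gets its CIFAR-10 index
assert all(classes[label] == name for name, label in label_of.items())
```

If the folders were not sorted first, the labels would depend on the listing order and could disagree with the pretrained models' class indices.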

In [None]:
import os
import glob
import shutil
import numpy as np
from PIL import Image
from torchvision.transforms import transforms
from torch.utils.data import Dataset, DataLoader

transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(cifar_10_mean, cifar_10_std)
])

class AdvDataset(Dataset):
    def __init__(self, data_dir, transform):
        self.images = []
        self.labels = []
        self.names = []
        '''
        data_dir
        ├── class_dir
        │   ├── class1.png
        │   ├── ...
        │   ├── class20.png
        '''
        for i, class_dir in enumerate(sorted(glob.glob(f'{data_dir}/*'))):
            images = sorted(glob.glob(f'{class_dir}/*'))
            self.images += images
            self.labels += ([i] * len(images))
            self.names += [os.path.relpath(imgs, data_dir) for imgs in images]
        self.transform = transform
    def __getitem__(self, idx):
        image = self.transform(Image.open(self.images[idx]))
        label = self.labels[idx]
        return image, label
    def __getname__(self):
        return self.names
    def __len__(self):
        return len(self.images)

adv_set = AdvDataset(root, transform=transform)
adv_names = adv_set.__getname__()
adv_loader = DataLoader(adv_set, batch_size=batch_size, shuffle=False)

print(f'number of images = {adv_set.__len__()}')

number of images = 200


## Utils -- Benign Images Evaluation

In [None]:
# to evaluate the performance of model on benign images
def epoch_benign(model, loader, loss_fn):
    model.eval()
    train_acc, train_loss = 0.0, 0.0
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        yp = model(x)
        loss = loss_fn(yp, y)
        train_acc += (yp.argmax(dim=1) == y).sum().item()
        train_loss += loss.item() * x.shape[0]
    return train_acc / len(loader.dataset), train_loss / len(loader.dataset)

## Utils -- Attack Algorithm

In [None]:
# perform fgsm attack
def fgsm(model, x, y, loss_fn, epsilon=epsilon):
    x_adv = x.detach().clone() # initialize x_adv as original benign image x
    x_adv.requires_grad = True # need to obtain gradient of x_adv, thus set required grad
    loss = loss_fn(model(x_adv), y) # calculate loss
    loss.backward() # calculate gradient
    # fgsm: use gradient ascent on x_adv to maximize loss
    grad = x_adv.grad.detach()
    x_adv = x_adv + epsilon * grad.sign()
    return x_adv

# alpha and num_iter can be decided by yourself
alpha = 0.8/255/std

def ifgsm(model, x, y, loss_fn, epsilon=epsilon, alpha=alpha, num_iter=20):
    x_adv = x.detach().clone() # start from the original benign image x
    ################ TODO: Medium baseline #######################
    # repeat a small fgsm step num_iter times
    for i in range(num_iter):
        # each iteration executes fgsm with step size alpha instead of epsilon
        x_adv = fgsm(model, x_adv, y, loss_fn, alpha)
        # clip new x_adv back to [x-epsilon, x+epsilon]
        x_adv = torch.min(torch.max(x_adv, x-epsilon), x+epsilon)

    return x_adv

def mifgsm(model, x, y, loss_fn, epsilon=epsilon, alpha=alpha, num_iter=20, decay=0.3):
    x_adv = x.detach().clone().to(device)
    # initialize momentum tensor
    momentum = torch.zeros_like(x).detach().to(device)

    ################ TODO: Strong baseline ####################
    for i in range(num_iter):
        x_adv = x_adv.detach().clone()
        x_adv.requires_grad = True # need to obtain gradient of x_adv, thus set required grad

        # diverse input: with probability 0.4, feed the model a resized and padded copy,
        # so the transform only affects the gradient and never overwrites x_adv itself
        x_in = x_adv
        if torch.rand(1).item() >= 0.6:
            # resize img to rnd X rnd
            rnd = torch.randint(29, 33, (1,)).item()
            x_in = transforms.Resize((rnd, rnd))(x_in)
            # pad img back to 32 X 32 with 0
            left = torch.randint(0, 32 - rnd + 1, (1,)).item()
            top = torch.randint(0, 32 - rnd + 1, (1,)).item()
            right = 32 - rnd - left
            bottom = 32 - rnd - top
            x_in = transforms.Pad([left, top, right, bottom])(x_in)

        loss = loss_fn(model(x_in), y) # calculate loss
        loss.backward() # calculate gradient

        # TODO: Refer to the algorithm of MI-FGSM
        # accumulate the normalized gradient into the momentum, then step along its sign
        grad = x_adv.grad.detach()
        momentum = decay * momentum + grad / torch.norm(grad)
        x_adv = x_adv + alpha * momentum.sign()
        x_adv = torch.max(torch.min(x_adv, x+epsilon), x-epsilon) # clip new x_adv back to [x-epsilon, x+epsilon]

    return x_adv
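
The update rules used by the attack functions above can be sanity-checked on a toy problem where the gradient is known in closed form. This is a minimal NumPy sketch, not the homework solution: the linear model, its weights `w`, the input `x`, and the helper names are all made up for illustration. The momentum variant here normalizes by the L1 norm as in the MI-FGSM paper, whereas the notebook's `mifgsm` divides by `torch.norm`.

```python
import numpy as np

def loss_and_grad(w, x, y):
    """Binary cross-entropy of a linear model sigmoid(w.x) and its gradient w.r.t. x."""
    p = 1.0 / (1.0 + np.exp(-np.dot(w, x)))
    loss = -(y * np.log(p) + (1 - y) * np.log(1 - p))
    return loss, (p - y) * w

def toy_ifgsm(w, x, y, epsilon, alpha, num_iter):
    """Repeat small signed-gradient steps, clipping to [x - epsilon, x + epsilon]."""
    x_adv = x.copy()
    for _ in range(num_iter):
        _, grad = loss_and_grad(w, x_adv, y)
        x_adv = x_adv + alpha * np.sign(grad)
        x_adv = np.clip(x_adv, x - epsilon, x + epsilon)
    return x_adv

def toy_mifgsm(w, x, y, epsilon, alpha, num_iter, decay):
    """Same loop, but the step follows the sign of an accumulated gradient."""
    x_adv = x.copy()
    momentum = np.zeros_like(x)
    for _ in range(num_iter):
        _, grad = loss_and_grad(w, x_adv, y)
        momentum = decay * momentum + grad / np.sum(np.abs(grad))  # L1-normalized
        x_adv = x_adv + alpha * np.sign(momentum)
        x_adv = np.clip(x_adv, x - epsilon, x + epsilon)
    return x_adv

w = np.array([1.0, -2.0, 0.5])   # hypothetical model weights
x = np.array([0.3, -0.1, 0.8])   # hypothetical benign input
y = 1.0                          # true label
eps, step = 0.1, 0.02

clean_loss, _ = loss_and_grad(w, x, y)
for x_adv in (toy_ifgsm(w, x, y, eps, step, 10),
              toy_mifgsm(w, x, y, eps, step, 10, decay=0.3)):
    adv_loss, _ = loss_and_grad(w, x_adv, y)
    assert np.all(np.abs(x_adv - x) <= eps + 1e-12)  # stays inside the budget
    assert adv_loss > clean_loss                      # the attack raised the loss
```

On a linear model the gradient sign never changes, so both variants end at the same corner of the epsilon box; the momentum term only matters on non-linear models where gradients change between steps.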

## Utils -- Attack
* Recall
  * ToTensor() can be seen as a function where $T(x) = x/255$.
  * Normalize() can be seen as a function where $N(x) = (x-mean)/std$ where $mean$ and $std$ are constants.

* Inverse function
  * Inverse Normalize() can be seen as a function where $N^{-1}(x) = x*std+mean$ where $mean$ and $std$ are constants.
  * Inverse ToTensor() can be seen as a function where $T^{-1}(x) = x*255$.

* Special notes
  * ToTensor() will also convert the image from shape (height, width, channel) to shape (channel, height, width), so we also need to transpose the shape back to original shape.
  * Since our dataloader samples a batch of data, what we need here is to transpose **(batch_size, channel, height, width)** back to **(batch_size, height, width, channel)** using np.transpose.
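
The inverse transform described above can be illustrated with a round trip in NumPy. This is a sketch under the assumption that images are uint8 arrays of shape (batch_size, height, width, channel); the helper names `normalize` and `denormalize` and the random batch are hypothetical.

```python
import numpy as np

cifar_10_mean = np.array([0.491, 0.482, 0.447]).reshape(3, 1, 1)
cifar_10_std = np.array([0.202, 0.199, 0.201]).reshape(3, 1, 1)

def normalize(images_uint8):
    """(bs, H, W, C) uint8 -> (bs, C, H, W) float, like ToTensor + Normalize."""
    x = images_uint8.transpose((0, 3, 1, 2)) / 255.0
    return (x - cifar_10_mean) / cifar_10_std

def denormalize(x):
    """(bs, C, H, W) normalized float -> (bs, H, W, C) uint8."""
    x = np.clip(x * cifar_10_std + cifar_10_mean, 0, 1)  # inverse Normalize
    x = np.round(x * 255)                                 # inverse ToTensor
    return x.transpose((0, 2, 3, 1)).astype(np.uint8)

rng = np.random.default_rng(0)
batch = rng.integers(0, 256, size=(2, 32, 32, 3), dtype=np.uint8)  # hypothetical images
restored = denormalize(normalize(batch))

assert restored.shape == batch.shape
assert np.array_equal(restored, batch)  # the round trip recovers every pixel
```

Rounding before the cast to uint8 matters: a plain cast truncates, so a value such as 127.9999 would become 127 instead of 128.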

In [None]:
# perform adversarial attack and generate adversarial examples
def gen_adv_examples(model, loader, attack, loss_fn):
    model.eval()
    adv_names = []
    train_acc, train_loss = 0.0, 0.0
    for i, (x, y) in enumerate(loader):
        x, y = x.to(device), y.to(device)
        x_adv = attack(model, x, y, loss_fn) # obtain adversarial examples
        yp = model(x_adv)
        loss = loss_fn(yp, y)
        train_acc += (yp.argmax(dim=1) == y).sum().item()
        train_loss += loss.item() * x.shape[0]
        # store adversarial examples
        adv_ex = ((x_adv) * std + mean).clamp(0, 1) # to 0-1 scale
        adv_ex = (adv_ex * 255).clamp(0, 255) # 0-255 scale
        adv_ex = adv_ex.detach().cpu().data.numpy().round() # round to remove decimal part
        adv_ex = adv_ex.transpose((0, 2, 3, 1)) # transpose (bs, C, H, W) back to (bs, H, W, C)
        adv_examples = adv_ex if i == 0 else np.r_[adv_examples, adv_ex]
    return adv_examples, train_acc / len(loader.dataset), train_loss / len(loader.dataset)

# create directory which stores adversarial examples
def create_dir(data_dir, adv_dir, adv_examples, adv_names):
    if os.path.exists(adv_dir) is not True:
        _ = shutil.copytree(data_dir, adv_dir)
    for example, name in zip(adv_examples, adv_names):
        im = Image.fromarray(example.astype(np.uint8)) # image pixel value should be unsigned int
        im.save(os.path.join(adv_dir, name))

## Model / Loss Function

The model list is available [here](https://github.com/osmr/imgclsmob/blob/master/pytorch/pytorchcv/model_provider.py). Please select only models that have the `_cifar10` suffix. Other kinds of models are prohibited, and using them will be considered cheating.

Note: Some of the models cannot be accessed/loaded. You can safely skip them, since the TA's model will not use those kinds of models.

In [None]:
# This function is used to check whether you use models pretrained on cifar10 instead of other datasets
def model_checker(model_name):
  assert ('cifar10' in model_name) and ('cifar100' not in model_name), 'The model selected is not pretrained on cifar10!'

In [None]:
################ BOSS BASELINE ######################
class ensembleNet(nn.Module):
    def __init__(self, model_names, device='cuda'):
        super().__init__()
        # load every pretrained proxy model and register it as a submodule
        self.models = nn.ModuleList(
            [ptcv_get_model(name, pretrained=True).to(device) for name in model_names]
        )

    def forward(self, x):
        # average the logits of all proxy models
        logits = [m(x) for m in self.models]
        return torch.stack(logits).mean(dim=0)



In [None]:
from pytorchcv.model_provider import get_model as ptcv_get_model

model_names = [
    'resnext29_16x64d_cifar10',
    'resnext29_32x4d_cifar10',
    'preresnet56_cifar10',
    'preresnet110_cifar10',
    'sepreresnet110_cifar10',
    'preresnet164bn_cifar10',
    'sepreresnet56_cifar10',
    'seresnet110_cifar10',
    'diaresnet56_cifar10',
    'resnet1001_cifar10',
    'diapreresnet56_cifar10',
    'resnet1202_cifar10',
    'resnet56_cifar10',
    'resnet110_cifar10',
    'diapreresnet110_cifar10',
    ]

model = ensembleNet(model_names, device=device)
loss_fn = nn.CrossEntropyLoss()

for model_name in model_names:
  model_checker(model_name)

benign_acc, benign_loss = epoch_benign(model, adv_loader, loss_fn)
print(f'benign_acc = {benign_acc:.5f}, benign_loss = {benign_loss:.5f}')


benign_acc = 0.96000, benign_loss = 0.10345


## IFGSM

In [None]:
adv_examples, ifgsm_acc, ifgsm_loss = gen_adv_examples(model, adv_loader, ifgsm, loss_fn)
print(f'ifgsm_acc = {ifgsm_acc:.5f}, ifgsm_loss = {ifgsm_loss:.5f}')

create_dir(root, 'ifgsm', adv_examples, adv_names)

In [None]:
%cd ifgsm
!tar zcvf ../ifgsm.tgz *
%cd ..

In [None]:
from google.colab import files
files.download('ifgsm.tgz')

## MIFGSM

In [None]:
adv_examples, mifgsm_acc, mifgsm_loss = gen_adv_examples(model, adv_loader, mifgsm, loss_fn)
print(f'mifgsm_acc = {mifgsm_acc:.5f}, mifgsm_loss = {mifgsm_loss:.5f}')

create_dir(root, 'mifgsm', adv_examples, adv_names)

In [None]:
%cd mifgsm
!tar zcvf ../mifgsm.tgz *
%cd ..

In [None]:
from google.colab import files
files.download('mifgsm.tgz')

## Example of Ensemble Attack
* Ensemble multiple models as your proxy model to increase the black-box transferability ([paper](https://arxiv.org/abs/1611.02770))
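
As a framework-free illustration of the idea, the sketch below averages the logits of several stand-in models with NumPy. The linear "models", their random weights, and the helper name `ensemble_logits` are assumptions made for this example only; the real proxy models are the pretrained CIFAR-10 networks.

```python
import numpy as np

def ensemble_logits(models, x):
    """Average the logits of several models (each a callable x -> logits)."""
    return np.mean([m(x) for m in models], axis=0)

# hypothetical stand-ins for pretrained classifiers: fixed linear maps to 3 classes
rng = np.random.default_rng(0)
weights = [rng.normal(size=(3, 4)) for _ in range(3)]
models = [lambda x, W=W: x @ W.T for W in weights]

x = rng.normal(size=(2, 4))  # batch of 2 inputs with 4 features each
avg = ensemble_logits(models, x)

assert avg.shape == (2, 3)
# for linear models, averaging logits equals using the averaged weights
assert np.allclose(avg, x @ np.mean(weights, axis=0).T)
```

Attacking the averaged logits means the gradient mixes contributions from every proxy model, so the perturbation is less tied to the quirks of any single one.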

In [None]:
################ BOSS BASELINE ######################

class ensembleNet(nn.Module):
    def __init__(self, model_names):
        super().__init__()
        self.models = nn.ModuleList([ptcv_get_model(name, pretrained=True) for name in model_names])
        
    def forward(self, x):
        #################### TODO: boss baseline ###################
        # TODO: sum up logits from multiple models
        ensemble_logits = None
        for i, m in enumerate(self.models):
            ensemble_logits = m(x) if i == 0 else ensemble_logits + m(x)
        # divide by the number of models to get the average logits
        ensemble_logits = ensemble_logits / len(self.models)
        return ensemble_logits

* Construct your ensemble model

In [None]:
model_names = [
    'nin_cifar10',
    #'resnet1202_cifar10',
    #'preresnet110_cifar10', 
    #'seresnet272bn_cifar10',
]

for model_name in model_names:
  model_checker(model_name)

ensemble_model = ensembleNet(model_names).to(device)
ensemble_model.eval()

## Visualization

In [None]:
import matplotlib.pyplot as plt

classes = ['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']

plt.figure(figsize=(10, 20))
cnt = 0
for i, cls_name in enumerate(classes):
    path = f'{cls_name}/{cls_name}1.png'
    # benign image
    cnt += 1
    plt.subplot(len(classes), 4, cnt)
    im = Image.open(f'./data/{path}')
    logit = model(transform(im).unsqueeze(0).to(device))[0]
    predict = logit.argmax(-1).item()
    prob = logit.softmax(-1)[predict].item()
    plt.title(f'benign: {cls_name}1.png\n{classes[predict]}: {prob:.2%}')
    plt.axis('off')
    plt.imshow(np.array(im))
    # adversarial image
    cnt += 1
    plt.subplot(len(classes), 4, cnt)
    im = Image.open(f'./mifgsm/{path}')
    logit = model(transform(im).unsqueeze(0).to(device))[0]
    predict = logit.argmax(-1).item()
    prob = logit.softmax(-1)[predict].item()
    plt.title(f'adversarial: {cls_name}1.png\n{classes[predict]}: {prob:.2%}')
    plt.axis('off')
    plt.imshow(np.array(im))
plt.tight_layout()
plt.show()

## Report Question
* Make sure you follow the setup below: the source model is "resnet110_cifar10", and the vanilla FGSM attack is applied to `dog2.png`. You can find the perturbed image in `fgsm/dog2.png`.

In [None]:
# original image
path = f'dog/dog2.png'
im = Image.open(f'./data/{path}')
logit = model(transform(im).unsqueeze(0).to(device))[0]
predict = logit.argmax(-1).item()
prob = logit.softmax(-1)[predict].item()
plt.title(f'benign: dog2.png\n{classes[predict]}: {prob:.2%}')
plt.axis('off')
plt.imshow(np.array(im))
plt.tight_layout()
plt.show()

# adversarial image 
adv_im = Image.open(f'./fgsm/{path}')
logit = model(transform(adv_im).unsqueeze(0).to(device))[0]
predict = logit.argmax(-1).item()
prob = logit.softmax(-1)[predict].item()
plt.title(f'adversarial: dog2.png\n{classes[predict]}: {prob:.2%}')
plt.axis('off')
plt.imshow(np.array(adv_im))
plt.tight_layout()
plt.show()

## Passive Defense - JPEG compression
JPEG compression with the imgaug package, with the compression rate set to 70.

Reference: https://imgaug.readthedocs.io/en/latest/source/api_augmenters_arithmetic.html#imgaug.augmenters.arithmetic.JpegCompression

Note: If you haven't implemented the JPEG compression, this module will return an error. Don't worry about this.

In [None]:
import imgaug.augmenters as iaa

# pre-process image
x = transforms.ToTensor()(adv_im)*255
x = x.permute(1, 2, 0).numpy()
x = x.astype(np.uint8)

# TODO: use "imgaug" package to perform JPEG compression (compression rate = 70)
#compressed_x = iaa.JpegCompression(compression = 70) 
# ref : chatgpt
seq = iaa.Sequential([
    iaa.JpegCompression(compression=(70, 70))
])
compressed_x = seq.augment_image(x)

logit = model(transform(compressed_x).unsqueeze(0).to(device))[0]
predict = logit.argmax(-1).item()
prob = logit.softmax(-1)[predict].item()
plt.title(f'JPEG adversarial: dog2.png\n{classes[predict]}: {prob:.2%}')
plt.axis('off')


plt.imshow(compressed_x)
plt.tight_layout()
plt.show()