# **Homework 10 - Adversarial Attack**

Slides: https://reurl.cc/v5kXkk

Videos:

TA: ntu-ml-2021spring-ta@googlegroups.com

## Environment & Download

We use [pytorchcv](https://pypi.org/project/pytorchcv/) to obtain models pretrained on CIFAR-10, so we need to set up the environment first. We also need to download the data (200 images) that we want to attack.

In [None]:
# set up environment
!pip install pytorchcv

# download
!gdown --id 1fHi1ko7wr80wXkXpqpqpOxuYH1mClXoX -O data.zip

# unzip
!unzip ./data.zip
!rm ./data.zip


In [None]:
!nvidia-smi -L

GPU 0: Tesla T4 (UUID: GPU-758b4681-8bcd-0bcf-d61e-05ba2646f497)


## Global Settings

* $\epsilon$ is fixed at 8 on the raw pixel scale (0-255). In the **Data section**, however, the raw pixel values are first transformed **by ToTensor (to 0-1 scale)** and then **Normalize (subtract mean, divide by std)**, so $\epsilon$ must be set to $\frac{8}{255 \cdot std}$ during the attack.

* Explanation (optional)
    * Denote the first pixel of the original image as $p$, and the first pixel of the adversarial image as $a$.
    * The $\epsilon$ constraint tells us that $\left| p-a \right| \le 8$.
    * ToTensor() can be seen as a function where $T(x) = x/255$.
    * Normalize() can be seen as a function where $N(x) = (x-mean)/std$ where $mean$ and $std$ are constants.
    * After applying ToTensor() and Normalize() to $p$ and $a$, the constraint becomes $\left| N(T(p))-N(T(a)) \right| = \left| \frac{\frac{p}{255}-mean}{std}-\frac{\frac{a}{255}-mean}{std} \right| = \frac{1}{255 \cdot std} \left| p-a \right| \le \frac{8}{255 \cdot std}.$
    * So, we should set $\epsilon$ to $\frac{8}{255 \cdot std}$ after ToTensor() and Normalize().
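The derivation above can be checked numerically. The following is only a minimal sketch: the pixel values `p_raw` and `a_raw` are made-up examples, and the names `mean_chk`, `std_chk` and `eps_chk` are hypothetical, chosen so that they do not overwrite the globals defined in the next cell.

```python
import torch

# CIFAR-10 channel statistics (same values as in the Global Settings cell)
mean_chk = torch.tensor([0.491, 0.482, 0.447]).view(3, 1, 1)
std_chk = torch.tensor([0.202, 0.199, 0.201]).view(3, 1, 1)

def to_normalized(x_raw):
    """Apply T(x) = x / 255 followed by N(x) = (x - mean) / std."""
    return (x_raw / 255 - mean_chk) / std_chk

# hypothetical benign pixel (0-255 scale) and a perturbation of exactly 8
p_raw = torch.full((3, 1, 1), 100.0)
a_raw = p_raw + 8

# distance between the two pixels after both transforms, per channel
gap = (to_normalized(a_raw) - to_normalized(p_raw)).abs()
eps_chk = 8 / 255 / std_chk

print(gap.flatten())
print(eps_chk.flatten())
```

Both printed tensors should agree up to floating-point error, which is why $\epsilon$ is a per-channel tensor rather than a single scalar.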

In [None]:
import torch
import torch.nn as nn

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

batch_size = 8
p = 1 / 10 # weight of each model in the ensemble (10 models are loaded below)

# the mean and std are the calculated statistics from cifar_10 dataset
cifar_10_mean = (0.491, 0.482, 0.447) # mean for the three channels of cifar_10 images
cifar_10_std = (0.202, 0.199, 0.201) # std for the three channels of cifar_10 images

# convert mean and std to 3-dimensional tensors for future operations
mean = torch.tensor(cifar_10_mean).to(device).view(3, 1, 1)
std = torch.tensor(cifar_10_std).to(device).view(3, 1, 1)

epsilon = 8/255/std
# TODO: iterative fgsm attack
# alpha (step size) can be decided by yourself
alpha = 0.8/255/std

root = './data' # directory for storing benign images
# benign images: images which do not contain adversarial perturbations
# adversarial images: images which include adversarial perturbations

## Data

Construct dataset and dataloader from root directory. Note that we store the filename of each image for future usage.

In [None]:
import os
import glob
import shutil
import numpy as np
from PIL import Image
from torchvision.transforms import transforms
from torch.utils.data import Dataset, DataLoader

transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(cifar_10_mean, cifar_10_std)
])

class AdvDataset(Dataset):
    def __init__(self, data_dir, transform):
        self.images = []
        self.labels = []
        self.names = []
        '''
        data_dir
        ├── class_dir
        │   ├── class1.png
        │   ├── ...
        │   ├── class20.png
        '''
        for i, class_dir in enumerate(sorted(glob.glob(f'{data_dir}/*'))):
            images = sorted(glob.glob(f'{class_dir}/*'))
            self.images += images
            self.labels += ([i] * len(images))
            self.names += [os.path.relpath(imgs, data_dir) for imgs in images]
        self.transform = transform
    def __getitem__(self, idx):
        image = self.transform(Image.open(self.images[idx]))
        label = self.labels[idx]
        return image, label
    def __getname__(self):
        return self.names
    def __len__(self):
        return len(self.images)

adv_set = AdvDataset(root, transform=transform)
adv_names = adv_set.__getname__()
adv_loader = DataLoader(adv_set, batch_size=batch_size, shuffle=False)

print(f'number of images = {adv_set.__len__()}')

number of images = 200


## Utils -- Benign Images Evaluation

In [None]:
# evaluate the performance of the model ensemble on benign images
# each model's loss and accuracy are weighted by p before being summed
def epoch_benign(models, loader, loss_fn):
    for model in models:
        model.eval()
    train_acc, train_loss = 0.0, 0.0
    with torch.no_grad(): # evaluation only, no gradients are needed
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            loss = 0.0
            for model in models:
                yp = model(x)
                loss += loss_fn(yp, y) * p
                train_acc += (yp.argmax(dim=1) == y).sum().item() * p
            train_loss += loss.item() * x.shape[0]
    return train_acc / len(loader.dataset), train_loss / len(loader.dataset)

## Utils -- Attack Algorithm

In [None]:
# perform fgsm attack
def fgsm(models, x, y, loss_fn, epsilon=epsilon):
    x_adv = x.detach().clone() # initialize x_adv as original benign image x
    x_adv.requires_grad = True # need to obtain gradient of x_adv, thus set required grad

    for index, model in enumerate(models):
        if index == 0:
            loss = loss_fn(model(x_adv), y) * p
        else:
            loss += loss_fn(model(x_adv), y) * p
    loss.backward() # calculate gradient
    # fgsm: use gradient ascent on x_adv to maximize loss
    x_adv = x_adv + epsilon * x_adv.grad.detach().sign()
    return x_adv
    

# TODO: perform iterative fgsm attack
# set alpha as the step size in Global Settings section
# alpha and num_iter can be decided by yourself
def ifgsm(models, x, y, loss_fn, epsilon=epsilon, alpha=alpha, num_iter=100):
    # initialize x_adv as original benign image x
    x_adv = x.detach().clone()
    # repeat the single-step attack num_iter times
    for i in range(num_iter):
        # call fgsm with (epsilon = alpha) to take one small step from the current x_adv
        x_adv = fgsm(models, x_adv, y, loss_fn, alpha)
        # clip the new x_adv back to [x-epsilon, x+epsilon]
        x_adv = torch.min(torch.max(x_adv, x-epsilon), x+epsilon)
    return x_adv
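To see what the clipping step guarantees, the sketch below runs the same iterate-then-clip loop on a toy problem. Everything in it is an assumption made for illustration: `toy_model` is a small random linear layer standing in for the pretrained classifiers, and the `toy_*` names, the step size and the number of iterations are hypothetical.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# hypothetical stand-in for a pretrained classifier
toy_model = nn.Linear(4, 3)
toy_loss = nn.CrossEntropyLoss()

def toy_fgsm(model, x, y, eps):
    # one signed-gradient ascent step of size eps
    x_adv = x.detach().clone().requires_grad_(True)
    toy_loss(model(x_adv), y).backward()
    return (x_adv + eps * x_adv.grad.sign()).detach()

def toy_ifgsm(model, x, y, eps, step, num_iter):
    x_adv = x.detach().clone()
    for _ in range(num_iter):
        x_adv = toy_fgsm(model, x_adv, y, step)
        # clip back into the eps-ball around the original input
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps)
    return x_adv

x_toy = torch.randn(5, 4)
y_toy = torch.randint(0, 3, (5,))
x_out = toy_ifgsm(toy_model, x_toy, y_toy, eps=0.1, step=0.02, num_iter=20)

max_shift = (x_out - x_toy).abs().max().item()
loss_before = toy_loss(toy_model(x_toy), y_toy).item()
loss_after = toy_loss(toy_model(x_out), y_toy).item()
print(f'max shift = {max_shift:.4f}')
print(f'loss before = {loss_before:.4f}, loss after = {loss_after:.4f}')
```

Although 20 steps of size 0.02 could move an input by up to 0.4, the clipping keeps every coordinate within 0.1 of its original value while the loss still goes up.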

## Utils -- Attack

* Recall
    * ToTensor() can be seen as a function where $T(x) = x/255$.
    * Normalize() can be seen as a function where $N(x) = (x-mean)/std$ where $mean$ and $std$ are constants.

* Inverse function
    * Inverse Normalize() can be seen as a function where $N^{-1}(x) = x*std+mean$ where $mean$ and $std$ are constants.
    * Inverse ToTensor() can be seen as a function where $T^{-1}(x) = x*255$.

* Note
    * ToTensor() will also convert the image from shape (height, width, channel) to shape (channel, height, width), so we also need to transpose the shape back to original shape.
    * Since our dataloader samples a batch of data, what we need here is to transpose **(batch_size, channel, height, width)** back to **(batch_size, height, width, channel)** using np.transpose.
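The inverse steps above can be sketched as a round trip on random data. This is an illustration under stated assumptions, not the notebook's own pipeline: the batch `raw` is randomly generated, the forward transform is written out by hand instead of calling torchvision, and the names `mean_demo` and `std_demo` are hypothetical.

```python
import numpy as np
import torch

mean_demo = torch.tensor([0.491, 0.482, 0.447]).view(3, 1, 1)
std_demo = torch.tensor([0.202, 0.199, 0.201]).view(3, 1, 1)

# hypothetical batch of 2 random uint8 images in (batch, H, W, C) layout
rng = np.random.default_rng(0)
raw = rng.integers(0, 256, size=(2, 32, 32, 3), dtype=np.uint8)

# forward: scale to 0-1 and move channels first (what ToTensor does), then normalize
x = torch.from_numpy(raw).permute(0, 3, 1, 2).float() / 255
x = (x - mean_demo) / std_demo

# inverse: N^{-1}(x) = x * std + mean, then T^{-1}(x) = x * 255
restored = (x * std_demo + mean_demo).clamp(0, 1) * 255
# round away floating-point error and transpose back to (batch, H, W, C)
restored = restored.numpy().round().transpose((0, 2, 3, 1)).astype(np.uint8)

print(restored.shape)
print(np.array_equal(restored, raw))
```

Rounding before the cast to `uint8` matters: a plain cast truncates, so a value such as 99.9999 would become 99 instead of 100.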

In [None]:
# perform adversarial attack and generate adversarial examples
def gen_adv_examples(models, loader, attack, loss_fn):
    for model in models:
        model.eval()
    adv_names = []
    train_acc, train_loss = 0.0, 0.0
    for i, (x, y) in enumerate(loader):
        x, y = x.to(device), y.to(device)
        x_adv = attack(models, x, y, loss_fn) # obtain adversarial examples

        yps = list()
        for model in models:
            yps.append(model(x_adv))
        for index, yp in enumerate(yps):
            if index == 0:
                loss = loss_fn(yp, y) * p
            else:
                loss += loss_fn(yp, y) * p
            train_acc += (yp.argmax(dim=1) == y).sum().item() * p
        train_loss += loss.item() * x.shape[0]
        # store adversarial examples
        adv_ex = ((x_adv) * std + mean).clamp(0, 1) # to 0-1 scale
        adv_ex = (adv_ex * 255).clamp(0, 255) # 0-255 scale
        adv_ex = adv_ex.detach().cpu().data.numpy().round() # round to remove decimal part
        adv_ex = adv_ex.transpose((0, 2, 3, 1)) # transpose (bs, C, H, W) back to (bs, H, W, C)
        adv_examples = adv_ex if i == 0 else np.r_[adv_examples, adv_ex]
    return adv_examples, train_acc / len(loader.dataset), train_loss / len(loader.dataset)

# create directory which stores adversarial examples
def create_dir(data_dir, adv_dir, adv_examples, adv_names):
    if not os.path.exists(adv_dir):
        _ = shutil.copytree(data_dir, adv_dir)
    for example, name in zip(adv_examples, adv_names):
        im = Image.fromarray(example.astype(np.uint8)) # image pixel value should be unsigned int
        im.save(os.path.join(adv_dir, name))

## Model / Loss Function

The model list is available [here](https://github.com/osmr/imgclsmob/blob/master/pytorch/pytorchcv/model_provider.py). Please select models that have the _cifar10 suffix. Some of the models cannot be accessed or loaded. You can safely skip them, since the TA's models will not use them.

In [None]:
from pytorchcv.model_provider import get_model as ptcv_get_model

# model = ptcv_get_model('resnet110_cifar10', pretrained=True).to(device)
model_list = ['seresnet56_cifar10', 'msdnet22_cifar10', 'rir_cifar10', 'diapreresnet110_cifar10', 'fractalnet_cifar10',
        'resdropresnet20_cifar10', 'wrn40_8_cifar10', 'densenet40_k12_cifar10', 'resnet110_cifar10', 'preresnet56_cifar10']
# model_list = ['seresnet20_cifar10', 'diaresnet110_cifar10', 'preresnet20_cifar10', 'resnet20_cifar10', 'nin_cifar10',
#         'diapreresnet20_cifar10', 'diapreresnet56_cifar10', 'sepreresnet110_cifar10', 'resnet56_cifar10', 'pyramidnet110_a48_cifar10']
models = list()
for model in model_list:
    models.append(ptcv_get_model(model, pretrained=True).to(device))
loss_fn = nn.CrossEntropyLoss()

benign_acc, benign_loss = epoch_benign(models, adv_loader, loss_fn)
print(f'benign_acc = {benign_acc:.5f}, benign_loss = {benign_loss:.5f}')

Downloading /root/.torch/models/diapreresnet20_cifar10-0642-14a1eb85.pth.zip from https://github.com/osmr/imgclsmob/releases/download/v0.0.343/diapreresnet20_cifar10-0642-14a1eb85.pth.zip...
Downloading /root/.torch/models/diapreresnet56_cifar10-0483-41cae958.pth.zip from https://github.com/osmr/imgclsmob/releases/download/v0.0.343/diapreresnet56_cifar10-0483-41cae958.pth.zip...
Downloading /root/.torch/models/sepreresnet110_cifar10-0454-418daea9.pth.zip from https://github.com/osmr/imgclsmob/releases/download/v0.0.379/sepreresnet110_cifar10-0454-418daea9.pth.zip...
Downloading /root/.torch/models/resnet56_cifar10-0452-628c42a2.pth.zip from https://github.com/osmr/imgclsmob/releases/download/v0.0.163/resnet56_cifar10-0452-628c42a2.pth.zip...
Downloading /root/.torch/models/pyramidnet110_a48_cifar10-0372-eb185645.pth.zip from https://github.com/osmr/imgclsmob/releases/download/v0.0.184/pyramidnet110_a48_cifar10-0372-eb185645.pth.zip...
benign_acc = 0.94500, benign_loss = 0.20839


## FGSM

In [None]:
adv_examples, fgsm_acc, fgsm_loss = gen_adv_examples(models, adv_loader, fgsm, loss_fn)
print(f'fgsm_acc = {fgsm_acc:.5f}, fgsm_loss = {fgsm_loss:.5f}')

create_dir(root, 'fgsm', adv_examples, adv_names)

fgsm_acc = 0.41300, fgsm_loss = 2.61361


## I-FGSM

In [None]:
# TODO: iterative fgsm attack
adv_examples, ifgsm_acc, ifgsm_loss = gen_adv_examples(models, adv_loader, ifgsm, loss_fn)
print(f'ifgsm_acc = {ifgsm_acc:.5f}, ifgsm_loss = {ifgsm_loss:.5f}')

create_dir(root, 'ifgsm', adv_examples, adv_names)

ifgsm_acc = 0.01100, ifgsm_loss = 14.45450


## Compress the images

In [None]:
# %cd fgsm
# !tar zcvf ../fgsm.tgz *
# %cd ..

%cd ifgsm
!tar zcvf ../ifgsm.tgz *
%cd ..

/content/ifgsm
airplane/
airplane/airplane10.png
airplane/airplane20.png
airplane/airplane12.png
airplane/airplane17.png
airplane/airplane14.png
airplane/airplane6.png
airplane/airplane1.png
airplane/airplane5.png
airplane/airplane4.png
airplane/airplane7.png
airplane/airplane2.png
airplane/airplane16.png
airplane/airplane18.png
airplane/airplane3.png
airplane/airplane19.png
airplane/airplane8.png
airplane/airplane9.png
airplane/airplane15.png
airplane/airplane11.png
airplane/airplane13.png
automobile/
automobile/automobile19.png
automobile/automobile5.png
automobile/automobile18.png
automobile/automobile8.png
automobile/automobile3.png
automobile/automobile16.png
automobile/automobile11.png
automobile/automobile7.png
automobile/automobile10.png
automobile/automobile15.png
automobile/automobile20.png
automobile/automobile13.png
automobile/automobile17.png
automobile/automobile1.png
automobile/automobile4.png
automobile/automobile6.png
automobile/automobile14.png
automobile/automobile9.

## Visualization

In [None]:
import matplotlib.pyplot as plt

classes = ['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']

# average the logits of all models in the ensemble for one image
# (assumption: the original `for model in models` loop was left unfinished,
# which caused the SyntaxError; averaging is one way to complete it)
def ensemble_logit(im):
    x = transform(im).unsqueeze(0).to(device)
    with torch.no_grad():
        return torch.stack([m(x)[0] for m in models]).mean(dim=0)

plt.figure(figsize=(10, 20))
cnt = 0
for i, cls_name in enumerate(classes):
    path = f'{cls_name}/{cls_name}1.png'
    # benign image
    cnt += 1
    plt.subplot(len(classes), 4, cnt)
    im = Image.open(f'./data/{path}')
    logit = ensemble_logit(im)
    predict = logit.argmax(-1).item()
    prob = logit.softmax(-1)[predict].item()
    plt.title(f'benign: {cls_name}1.png\n{classes[predict]}: {prob:.2%}')
    plt.axis('off')
    plt.imshow(np.array(im))
    # adversarial image
    cnt += 1
    plt.subplot(len(classes), 4, cnt)
    im = Image.open(f'./ifgsm/{path}')
    logit = ensemble_logit(im)
    predict = logit.argmax(-1).item()
    prob = logit.softmax(-1)[predict].item()
    plt.title(f'adversarial: {cls_name}1.png\n{classes[predict]}: {prob:.2%}')
    plt.axis('off')
    plt.imshow(np.array(im))
plt.tight_layout()
plt.show()