# Can **you** break a computer vision model?
## On-site round

## Task 1: Break a network by hand



Suppose you have a neural network $M$, defined by $M(x) = \text{softmax}\left(x^\top\theta\right)$. Note that for $z\in\mathbb{R}^n$ we have $\text{softmax}(z)=\frac{1}{\sum_i \exp(z_i)}\cdot\left[\exp(z_1), \exp(z_2), \ldots, \exp(z_n)\right]$.

Further, let $x=[1.,2.,3.]$ be an input of interest and let $\theta=\begin{bmatrix}
    5. & -2. \\
    3. & 4. \\
    2. & -1.
\end{bmatrix}$. Compute an adversarial example $x^\prime$ that maximizes the second coordinate of $M(x^\prime)$ under the constraint that $\|x^\prime-x\|_\infty \leq 1$, i.e., you are creating an $\ell_\infty$ adversarial example with $\varepsilon=1$.

Your solution here
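
One possible worked solution, offered as a sketch rather than the official answer. With only two classes, $M(x^\prime)_2 = \sigma\left(x^{\prime\top}(\theta_{:,2}-\theta_{:,1})\right)$, where $\sigma$ is the logistic sigmoid, so maximizing it amounts to maximizing a linear function of $x^\prime$. Here $\theta_{:,2}-\theta_{:,1} = [-7, 1, -3]$, and over the $\ell_\infty$ ball the maximum is reached by moving each coordinate by $\varepsilon$ in the direction of the sign of that vector, which gives $x^\prime = [0, 3, 2]$. The snippet below checks this numerically in plain Python; the helper names (`softmax`, `model`) are introduced here for illustration only.

```python
import math

# Values taken from the task statement
theta = [[5.0, -2.0], [3.0, 4.0], [2.0, -1.0]]
x = [1.0, 2.0, 3.0]
eps = 1.0


def softmax(z):
    m = max(z)  # subtract the max for numerical stability
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def model(inp):
    # M(inp) = softmax(inp^T theta)
    z = [sum(inp[i] * theta[i][k] for i in range(len(inp))) for k in range(2)]
    return softmax(z)


# Direction that increases (logit_2 - logit_1): second column minus first column
direction = [row[1] - row[0] for row in theta]

# Move every coordinate by eps along the sign of that direction
x_adv = [xi + eps * math.copysign(1.0, di) for xi, di in zip(x, direction)]

print("x_adv:", x_adv)
print("second coordinate before:", model(x)[1])
print("second coordinate after: ", model(x_adv)[1])
```

At $x^\prime$ the logits are $[13, 10]$, so the second coordinate rises from roughly $8.3\times10^{-7}$ to $1/(1+e^{3}) \approx 0.047$. It stays below $0.5$, so under this budget the predicted class does not flip even though the target probability is maximized.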



## Task 2: Alice heard you've attacked her latest model, so she created a new one! Can you break that one too?

Alice heard about your adaptive attack on AliceNet, so she's raised the bar with her latest robust model, AliceNetV2. According to her, there's no way to attack this new model!
Like last time, Alice won't share with you what her defense is!

Your task is to develop an adaptive *targeted* attack. In particular, you must find a way to make small perturbations to the input images such that AliceNetV2 outputs the class "dog" every time!

### Task Requirements

1. **Understand the Defense**: Analyze Alice's model to understand the type of defense implemented. This could involve reviewing the model architecture, preprocessing steps, or any additional mechanisms employed for defense.

2. **Design an Adaptive Targeted Attack**: Develop an attack strategy that circumvents Alice's defense and makes AliceNetV2 output the class "dog". This might involve modifying standard attack methods like PGD.

3. **Generate Adversarial Examples**: Modify all test images from the CIFAR-10 dataset using your adversarial attack. You are allowed to modify the original test images within an $\ell_\infty$ ball of radius $8/255$.

4. **Test Model Accuracy**: Evaluate how often the AliceNetV2 model predicts the label "dog" on these adversarially modified images.
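
The steps above can be illustrated on a toy problem. The `AliceNetV2.forward` method shown later in this notebook returns a one-hot majority vote over three sub-networks, which carries no useful gradient, so one natural adaptation is to run targeted PGD on the summed cross-entropy of the individual members' logits. The sketch below does that in dependency-free Python with small linear softmax members standing in for the real networks; the weights, the input `x0`, and all helper names are hypothetical, and the same loop would carry over to `torch` tensors.

```python
import math


def softmax(z):
    m = max(z)  # subtract the max for numerical stability
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def logits(x, W):
    # W[i][k] is the weight from input i to class k
    return [sum(x[i] * W[i][k] for i in range(len(x))) for k in range(len(W[0]))]


def ce_grad(x, W, target):
    # For a linear softmax member, d/dx CE(softmax(x^T W), target) = W (p - onehot(target))
    p = softmax(logits(x, W))
    d = [p[k] - (1.0 if k == target else 0.0) for k in range(len(p))]
    return [sum(W[i][k] * d[k] for k in range(len(d))) for i in range(len(x))]


def sign(v):
    return (v > 0) - (v < 0)


def targeted_pgd(x0, members, target, eps, step, iters):
    x = list(x0)
    for _ in range(iters):
        # Sum the per-member gradients: attack the logits, not the hard vote
        grad = [0.0] * len(x)
        for W in members:
            grad = [g + gi for g, gi in zip(grad, ce_grad(x, W, target))]
        # Signed descent step on the targeted loss
        x = [xi - step * sign(g) for xi, g in zip(x, grad)]
        # Project back into the l_inf ball around x0, then into the valid pixel range
        x = [min(max(xi, x0i - eps), x0i + eps) for xi, x0i in zip(x, x0)]
        x = [min(max(xi, 0.0), 1.0) for xi in x]
    return x


def predict(x, W):
    z = logits(x, W)
    return z.index(max(z))


def majority_vote(x, members):
    votes = [predict(x, W) for W in members]
    # Ties resolve to the smallest class index
    return max(sorted(set(votes)), key=votes.count)


# Hypothetical toy ensemble: 3 members, 4 "pixels", 3 classes
members = [
    [[0.5, -0.2, 0.1], [0.3, 0.4, -0.1], [-0.2, 0.1, 0.3], [0.1, -0.3, 0.2]],
    [[0.4, -0.1, 0.0], [0.2, 0.5, -0.2], [-0.1, 0.0, 0.4], [0.2, -0.2, 0.1]],
    [[0.6, -0.3, 0.2], [0.1, 0.3, 0.0], [-0.3, 0.2, 0.2], [0.0, -0.1, 0.3]],
]
x0 = [0.2, 0.5, 0.8, 1.0]
x_adv = targeted_pgd(x0, members, target=1, eps=8 / 255, step=2 / 255, iters=20)

print("vote before:", majority_vote(x0, members))
print("vote after: ", majority_vote(x_adv, members))
```

Whether the vote actually flips depends on the members and the budget; the sketch only guarantees that `x_adv` stays inside the $\ell_\infty$ ball and the $[0, 1]$ pixel range. Success should always be measured on the hard majority vote, since a low summed loss does not by itself imply that two of three members agree on the target.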


### Deliverables

* Python code used for your attack and generation of the adversarial CIFAR-10 test set.
* A short (up to a few paragraphs) report detailing your analysis of the defense, the approach used for the adaptive attack, and the success rate of your attack on the CIFAR-10 test set.


In [None]:
!pip install wget
import wget

In [None]:
import torch
import torch.nn as nn
import torch.nn.functional as F
import functools

class BasicBlock(nn.Module):
    expansion = 1

    def __init__(self, in_planes, planes, stride=1):
        super(BasicBlock, self).__init__()
        self.conv1 = nn.Conv2d(
            in_planes, planes, kernel_size=3, stride=stride, padding=1, bias=False
        )
        self.bn1 = nn.BatchNorm2d(planes)
        self.conv2 = nn.Conv2d(
            planes, planes, kernel_size=3, stride=1, padding=1, bias=False
        )
        self.bn2 = nn.BatchNorm2d(planes)

        self.shortcut = nn.Sequential()
        if stride != 1 or in_planes != self.expansion * planes:
            self.shortcut = nn.Sequential(
                nn.Conv2d(
                    in_planes,
                    self.expansion * planes,
                    kernel_size=1,
                    stride=stride,
                    bias=False,
                ),
                nn.BatchNorm2d(self.expansion * planes),
            )

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        out += self.shortcut(x)
        return F.relu(out)



def recursive_getattr(obj, attr, *args):
    def _getattr(obj, attr):
        return getattr(obj, attr, *args)

    return functools.reduce(_getattr, [obj] + attr.split("."))


class AliceNetV2_inner(nn.Module):
    KERNEL_SIZE = 3
    IN_CHANNELS = 3
    AVG_POOL_SIZE = 8
    LINEAR_MUL = 4 * 4

    def __init__(self, num_blocks=2, in_planes=64, num_classes=10, debug=False):
        super().__init__()
        self.num_classes = num_classes
        self.debug = debug
        self.in_planes = in_planes
        self.kernel_size = self.KERNEL_SIZE
        self.num_blocks = num_blocks

        # pre-layer stuff
        self.conv1 = nn.Conv2d(
            self.IN_CHANNELS,
            self.in_planes,
            kernel_size=self.kernel_size,
            stride=1,
            padding=1,
            bias=False,
        )
        self.bn1 = nn.BatchNorm2d(self.in_planes)

        # make single layer with K BasicBlocks
        # BasicBlock: conv1, bn1, conv2, bn2, shortcut
        # each conv has `in_planes` filters
        get_block = lambda: BasicBlock(self.in_planes, self.in_planes, stride=1)
        self.layer = nn.Sequential(*[get_block() for _ in range(num_blocks)])

        # register blocks with setattr to make it compatible with masking code
        for idx, block in enumerate(self.layer):
            setattr(self, f"block{idx}", block)

        # post-layer stuff
        self.flatten = nn.Flatten()
        self.avg_pool_2d = nn.AvgPool2d(self.AVG_POOL_SIZE)
        self.linear = nn.Linear(self.in_planes * self.LINEAR_MUL, num_classes)

    def forward(self, x, *args, **kwargs):
        out = F.relu(self.bn1(self.conv1(x)))
        if self.debug:
            print(f"conv1: {out.shape}")
        out = self.layer(out)
        if self.debug:
            print(f"layer: {out.shape}")
        out = self.avg_pool_2d(out)
        if self.debug:
            print(f"avg_pool_2d: {out.shape}")
        out = self.flatten(out)
        out = self.linear(out)
        return out

    def get_components(self, add_weight_suffix=True):
        comps = ["conv1"]
        num_convs = 2
        for block_idx in range(self.num_blocks):
            for conv_idx in range(1, 1 + num_convs):
                comps.append(f"block{block_idx}.conv{conv_idx}")

        comps.append("linear")

        if add_weight_suffix:
            return ["{}.weight".format(c) for c in comps]
        return comps

    def get_component_dimensions(self):
        comps = self.get_components()
        comps_map = {}

        for c in comps:
            w = recursive_getattr(self, c)
            comps_map[c] = w.shape[0]

        return comps_map

class AliceNetV2(nn.Module):
    def __init__(self, num_blocks=2, in_planes=64, num_classes=10):
        super(AliceNetV2, self).__init__()
        self.model1 = AliceNetV2_inner(num_blocks=num_blocks, in_planes=in_planes, num_classes=num_classes)
        self.model2 = AliceNetV2_inner(num_blocks=num_blocks, in_planes=in_planes, num_classes=num_classes)
        self.model3 = AliceNetV2_inner(num_blocks=num_blocks, in_planes=in_planes, num_classes=num_classes)

    def forward(self, x):
        outputs = []
        for network in [self.model1, self.model2, self.model3]:
            out = network(x)
            outputs.append(out)
        outputs = torch.stack(outputs)
        network_predictions = torch.argmax(outputs, dim=-1)
        majority_votes = torch.mode(network_predictions, dim=0).values
        result = torch.zeros_like(out)
        for i, vote in enumerate(majority_votes):
            result[i, vote] = 1.
        return result


def get_alicenet_v2() -> AliceNetV2:

    model = AliceNetV2()

    # url = 'https://www.dropbox.com/scl/fi/awpcptjv4jbrgljn58k0m/alice_model_v2_IOAI.pt?rlkey=1v9hzkkpyqm0xqcyl1dogfwz8&dl=1'
    # wget.download(url, out='./alicenet_v2.pt', bar=None)
    trained_ckpt = torch.load('./alicenet_v2.pt', map_location="cpu")
    model.load_state_dict(trained_ckpt)

    return model

In [None]:
import torchvision.transforms as transforms
import torchvision.datasets as datasets
import torchvision
from torch.utils.data import Dataset, DataLoader

alicenetv2 = get_alicenet_v2().to("cuda")

# putting the model in eval mode changes its behavior
# TODO: look into this
# alicenetv2.eval()

In [None]:
train_dataset = datasets.CIFAR10(root="./data", train=True, transform=transforms.ToTensor(), download=True)
valid_dataset = datasets.CIFAR10(root="./data", train=False, transform=transforms.ToTensor(), download=True)

train_dataloader = DataLoader(train_dataset, batch_size=16, shuffle=True)
val_dataloader = DataLoader(valid_dataset, batch_size=16, shuffle=False)


Files already downloaded and verified
Files already downloaded and verified


In [None]:
print(len(valid_dataset))

10000


In [None]:
def norm(x):
    # Recenter x so that values in [0, 1] map to a signed perturbation in [-1, 1]
    x = (0.5 - x) * 2

    # Compute the per-pixel vector lengths along the channel axis (dim=1)
    vector_lengths = torch.norm(x, dim=1, keepdim=True)  # [batch_size x 1 x 32 x 32]

    # Find the maximum vector length in the whole batch
    max_vector_length = vector_lengths.max()

    # Scale so the longest per-pixel vector has length 8/255,
    # which keeps every entry within the l_inf budget of 8/255
    normalization_factor = (8.0 / 255.0) / max_vector_length

    # Rescale tensor x
    x = x * normalization_factor

    return x

t_loss = 0
t_loss_num = 0

# TODO: save this file and turn it into a model
num_epochs = 1000
criterion = torch.nn.CrossEntropyLoss()
for i, (image, target) in enumerate(val_dataloader):
  image = image.to("cuda")
  target = target.to("cuda")



  x = torch.rand(image.shape).to("cuda")
  x.requires_grad_(True)
  opt = torch.optim.Adam([x], lr=0.1)

  for epoch in range(num_epochs):
      ni = image + norm(x)

      # Make sure the AliceNet models are not being trained
      t1 = alicenetv2.model1(ni)
      t2 = alicenetv2.model2(ni)
      t3 = alicenetv2.model3(ni)

      # labels = torch.zeros(image.shape[0], 10).to("cuda")

      # TODO: make this more readable
      # labels[:, 5] = 1.0
      labels = torch.full((image.shape[0],), 5).to("cuda")  # class index 5 is "dog" in CIFAR-10
      # print(labels)
      # exit(0)

      loss1 = criterion(t1, labels)
      loss2 = criterion(t2, labels)
      loss3 = criterion(t3, labels)

      opt.zero_grad()
      # loss = loss1 + loss2 + loss3
      # loss.backward()

      loss1.backward(retain_graph=True)
      loss2.backward(retain_graph=True)
      loss3.backward()

      opt.step()

  # Mean != majority vote - fix this
  args = torch.argmax(alicenetv2(ni), dim=-1)

  args2 = torch.full(args.shape, 5).to("cuda")
  print(torch.sum(args == args2)/len(args))

  t_loss += torch.sum(args == args2)
  t_loss_num += len(args)

print("TOTAL")
print(t_loss/t_loss_num)

  # for epoch in range(num_epochs):


tensor(0.5625, device='cuda:0')
tensor(0.6250, device='cuda:0')
tensor(0.7500, device='cuda:0')
tensor(0.8125, device='cuda:0')
tensor(0.6875, device='cuda:0')
tensor(0.6875, device='cuda:0')
tensor(0.5625, device='cuda:0')
tensor(0.8125, device='cuda:0')
tensor(0.8125, device='cuda:0')
tensor(0.8125, device='cuda:0')
tensor(0.8125, device='cuda:0')
tensor(0.8125, device='cuda:0')
tensor(0.8750, device='cuda:0')
tensor(0.6875, device='cuda:0')
tensor(0.6875, device='cuda:0')
tensor(0.8125, device='cuda:0')
tensor(0.8750, device='cuda:0')
tensor(0.6875, device='cuda:0')
tensor(0.6250, device='cuda:0')
tensor(0.6875, device='cuda:0')
tensor(0.8750, device='cuda:0')
tensor(0.5000, device='cuda:0')
tensor(0.5625, device='cuda:0')
tensor(0.6250, device='cuda:0')
tensor(0.7500, device='cuda:0')
tensor(0.6250, device='cuda:0')
tensor(0.8750, device='cuda:0')
tensor(0.8125, device='cuda:0')
tensor(0.8125, device='cuda:0')
tensor(0.8750, device='cuda:0')
tensor(0.6875, device='cuda:0')
tensor(0

KeyboardInterrupt: 