We examine how normally trained classifiers respond to the adversarial perturbations that `create.py` applies to natural samples or uniform noise. Specifically, we assess whether:
- The adversarial attacks we implemented work as intended.
- The adversarial datasets in each scenario successfully mislead the classifiers.
    - Note that some adversarial images fail to deceive the classifiers, possibly because of learning bias, architectural bias, or suboptimal PGD optimization.

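For example, we obtained the following values. Each accuracy entry is the fraction of adversarial samples classified as their attack target; a parenthesized number gives the entry's position in the list and refers to the correspondingly numbered note below.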

Dataset: CIFAR10  
Scenario: natural_det_L2  
Accuracy: [0.98 (1), 0.99 (2), ..., 0.99 (10)]

(1) Ratio of truck images (with imperceptible L2 perturbations to mislead the classifier into identifying them as planes) classified as planes by the classifier.  
(2) Ratio of plane images (with imperceptible L2 perturbations to mislead the classifier into identifying them as cars) classified as cars by the classifier.  
(10) Ratio of ship images (with imperceptible L2 perturbations to mislead the classifier into identifying them as trucks) classified as trucks by the classifier.  
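
The three notes above are consistent with a deterministic mapping in which each source class is attacked toward the next class in the standard CIFAR10 label order, wrapping around at the end. A minimal sketch of that reading follows; the shift-by-one rule and the helper name `det_pairs` are assumptions inferred from these examples, not taken from `create.py`.

```python
# Class names in the standard CIFAR10 label order, using the short names from the notes.
CIFAR10_CLASSES = ['plane', 'car', 'bird', 'cat', 'deer',
                   'dog', 'frog', 'horse', 'ship', 'truck']


def det_pairs(classes):
    """Return (position, source, target) triples, assuming target = source + 1 (mod n).

    Position k (1-indexed) is the k-th entry of the accuracy list,
    i.e. the entry whose attack target is the k-th class.
    """
    n = len(classes)
    return [(t + 1, classes[(t - 1) % n], classes[t]) for t in range(n)]


for pos, src, tgt in det_pairs(CIFAR10_CLASSES):
    print(f'({pos}) {src} -> {tgt}')
```

Under this assumption, entries (1), (2), and (10) come out as truck -> plane, plane -> car, and ship -> truck, matching the notes.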

Dataset: CIFAR10  
Scenario: natural_rand_L2  
Accuracy: [0.99 (1), ...]

(1) Ratio of images that appear as objects other than planes (with imperceptible L2 perturbations to mislead the classifier into identifying them as planes) classified as planes by the classifier.  

These results indicate that L2 perturbations on natural samples can effectively fool the classifier.

Dataset: CIFAR10  
Scenario: uniform_L2  
Accuracy: [0.07 (1), 0.00, 0.00, 1.0 (4), ...]

(1) Ratio of noise images (with imperceptible L2 perturbations to mislead the classifier into identifying them as planes) classified as planes by the classifier.  
(4) Ratio of noise images (with imperceptible L2 perturbations to mislead the classifier into identifying them as cats) classified as cats by the classifier.  

This suggests that L2 perturbations on noise reliably mislead the classifier into predicting cats (ratio 1.0) but rarely into predicting planes (ratio 0.07).
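
For reference, ratios like the ones listed above can be computed as a per-class accuracy: group the samples by their label and take the fraction of each group that the classifier predicts as that label. The sketch below is plain Python on made-up toy predictions. `per_class_ratio` is a hypothetical helper, and it assumes that the labels stored in each adversarial dataset are the attack targets, as the notes above imply. It is not the implementation of `CalcClassificationAcc`.

```python
def per_class_ratio(preds, targets, num_classes):
    """For each class c, return the fraction of samples labeled c that are predicted as c."""
    hits = [0] * num_classes
    totals = [0] * num_classes
    for p, t in zip(preds, targets):
        totals[t] += 1
        hits[t] += int(p == t)
    # Classes without any samples yield None instead of dividing by zero.
    return [h / n if n else None for h, n in zip(hits, totals)]


# Toy data (illustrative only): 3 classes, 2 samples per target class.
targets = [0, 0, 1, 1, 2, 2]
preds = [0, 1, 1, 1, 0, 0]
print(per_class_ratio(preds, targets, 3))  # [0.5, 1.0, 0.0]
```

A high value for class c then means the attack toward c usually succeeds, which is how the lists in this section should be read.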

In [1]:
import os
root = os.path.join('..', '..')

In [None]:
import sys
sys.path.append(root)

In [3]:
from collections import OrderedDict
from typing import Literal

import torch

from utils.classifiers import ConvNet, WideResNet
from utils.datasets import CIFAR10, FMNIST, MNIST, SequenceDataset
from utils.utils import (CalcClassificationAcc, ModelWithNormalization,
                         dataloader, freeze)

In [4]:
device = [0]
dataset_root = os.path.join(os.path.sep, 'root', 'datasets')

In [5]:
class Util:
    def __init__(self, dataset_name: Literal['MNIST', 'FMNIST', 'CIFAR10']) -> None:
        self.dataset_name = dataset_name
        self.classifier = self._define_classifier()
        self._load_weight()

    def _define_classifier(self) -> ModelWithNormalization:
        if self.dataset_name == 'MNIST':
            classifier = ConvNet(10)
            dataset_cls = MNIST
        elif self.dataset_name == 'FMNIST':
            classifier = ConvNet(10)
            dataset_cls = FMNIST
        elif self.dataset_name == 'CIFAR10':
            classifier = WideResNet(28, 10, 0.3, 10)
            dataset_cls = CIFAR10
        else:
            raise ValueError(self.dataset_name)
        return ModelWithNormalization(classifier, dataset_cls.mean, dataset_cls.std)

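    def _load_weight(self) -> None:
        dir_path = os.path.join(root, 'models', self.dataset_name, 'version_0', 'checkpoints')
        # Pick one checkpoint file; sorting makes the choice deterministic if several exist.
        ckpt_name = sorted(fname for fname in os.listdir(dir_path) if fname.endswith('.ckpt'))[0]
        path = os.path.join(dir_path, ckpt_name)

        state_dict = torch.load(path, map_location='cpu')['state_dict']
        # Strip 'classifier.' from the checkpoint keys so they match this model's parameter names.
        state_dict = OrderedDict((k.replace('classifier.', ''), v) for k, v in state_dict.items())
        self.classifier.load_state_dict(state_dict)

        # Evaluation only: disable gradients and switch to inference mode.
        freeze(self.classifier)
        self.classifier.eval()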

    def _load_dataset(self, suffix: str) -> SequenceDataset:
        p = os.path.join(root, 'datasets', f'{self.dataset_name}_{suffix}', 'dataset')
        d = torch.load(p, map_location='cpu')
        return SequenceDataset(d['imgs'], d['labels'])
    
    def test(self, suffix: str) -> None:
        print(suffix)

        if self.dataset_name in ('MNIST', 'FMNIST'):
            batch_size = 60000
        elif self.dataset_name == 'CIFAR10':
            batch_size = 10000
        else:
            raise ValueError(self.dataset_name)

        d = self._load_dataset(suffix)
        loader = dataloader(d, batch_size, False)

        acc = CalcClassificationAcc(
            accelerator='gpu',
            strategy='dp',
            devices=device,
            precision=16,
        ).run(self.classifier, loader, 10, average='none')
        print(acc)
        print()
    
    def test_all(self) -> None:
        names = [
            'natural_rand_L0', 
            'natural_det_L0',
            #'natural_rand_L2', 
            #'natural_det_L2',
            #'natural_rand_Linf', 
            #'natural_det_Linf',
            #'uniform_L0', 
            #'uniform_L2', 
            #'uniform_Linf', 
        ]
        for n in names:
            self.test(n)

In [None]:
Util('MNIST').test_all()

In [None]:
Util('FMNIST').test_all()

In [None]:
Util('CIFAR10').test_all()