These benchmarks are based on those of [Adversarial-Attacks-PyTorch](https://github.com/Harry24k/adversarial-attacks-pytorch/blob/master/README.md#performance-comparison). However, they differ in that:
- They ensure that all enqueued asynchronous CUDA computations have completed before the elapsed time is recorded
- [Madry lab models](https://github.com/mit-ll-responsible-ai/responsible-ai-toolbox/blob/dab4ba8b506ad6bd83842a282f3867388896ce65/experiments/src/rai_experiments/models/pretrained.py#L59-L69) are used for timing instead of the `robustbench` models (I hit an `SSLCertVerificationError` when trying to download the `robustbench` models)
- Only FGSM, Linf PGD, and L2 PGD are timed (rai-toolbox doesn't have the other methods yet)
- Time per-step is recorded for easier comparison across methods
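
The per-step figure reported throughout is the total elapsed time of an attack divided by its number of steps, truncated to whole milliseconds. A minimal sketch of that conversion follows; the helper name `ms_per_step` is hypothetical and is not part of any of the benchmarked libraries.

```python
def ms_per_step(total_seconds: float, num_steps: int) -> int:
    """Convert a total elapsed time (seconds) into whole milliseconds per step.

    Mirrors the `int(time * 1000 / num_steps)` expression used in the
    benchmark cells; `int` truncates toward zero rather than rounding.
    """
    if num_steps <= 0:
        raise ValueError("num_steps must be a positive integer")
    return int(total_seconds * 1000 / num_steps)


# Illustrative values only (not measurements): 0.5 s spent on 10 steps
print(ms_per_step(0.5, 10))
```

Reporting time per step rather than total time lets single-step attacks (FGSM, repeated in a loop) and multi-step attacks (PGD) be compared on the same scale.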


Dependencies:
- torch
- [rai-toolbox](https://mit-ll-responsible-ai.github.io/responsible-ai-toolbox/#installation)
- [rai-experiments](https://github.com/mit-ll-responsible-ai/responsible-ai-toolbox/tree/main/experiments#installing-rai-experiments-utilities)
- [torch-attacks](https://github.com/Harry24k/adversarial-attacks-pytorch/blob/master/README.md#hammer-installation)
- [foolbox](https://github.com/bethgelab/foolbox#-quickstart)
- [ART](https://github.com/Trusted-AI/adversarial-robustness-toolbox/wiki/Get-Started#installation-with-pip) (with the `pytorch` dependencies)
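
Before running the notebook, it can help to confirm that each dependency is importable. The sketch below uses only the standard library; the module names are the import names used later in this notebook, and the helper `missing_modules` is a hypothetical name introduced for illustration.

```python
from importlib.util import find_spec

# Import names used by the cells in this notebook
REQUIRED_MODULES = [
    "torch",
    "rai_toolbox",
    "rai_experiments",
    "torchattacks",
    "foolbox",
    "art",
]


def missing_modules(names):
    """Return the top-level module names that cannot be located."""
    return [name for name in names if find_spec(name) is None]


print("Missing:", missing_modules(REQUIRED_MODULES) or "none")
```

This only checks that the packages can be found; it does not verify versions or that CUDA is available.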

In [1]:
from collections import defaultdict
from contextlib import contextmanager
from functools import wraps
from time import time
from typing import Callable, Hashable, Optional

import torch

__all__ = ["PyTorchTimeLogger"]


class PyTorchTimeLogger(defaultdict):
    """Provides a context manager and decorator for timing the code within the
    context. Measurements are in seconds. 
    
    This ensures that CUDA-bound computations in PyTorch
    are timed appropriately. Specifically, the default device
    stream is synchronized so that all enqueued asynchronous CUDA
    computations have completed before the elapsed time is recorded.
    
    This is a subclass of `defaultdict`, and stores the mappings:
    
        event-name (str) -> sequence of associated times (List[float])

    Examples
    --------
    Timing an event with a context manager

    >>> timelog = PyTorchTimeLogger()
    >>> with timelog.timeit(name="event-1", cpu_only=False):
    ...    # Timing of this context will be appended
    ...    # to the list `timelog["event-1"]`
    ...    # Times are in seconds

    Timing function execution with a decorator

    >>> @timelog  # you can specify `name` and `cpu_only` if you'd like
    ... def func():
    ...    # Time spent within function body will be logged to
    ...    # `timelog["func"]`
    ...    pass

    Or, you can make a timed function "on the fly" with a functional form factor

    >>> def f(x): return x
    >>> # timing in f will be logged to `timelog["special-name-for-f"]`
    >>> timed_f = timelog(f, name="special-name-for-f")
    """

    def __init__(self):
        super().__init__(list)

    @contextmanager
    def timeit(self, name: Hashable, cpu_only=False):
        """
        Records the time elapsed, in seconds, within the context 
        of this context manager.

        Will record the time as `self[name].append(time_elapsed)`

        Parameters
        ----------
        name : Hashable
            The name (i.e. the identifier) used as the key for
            storing the timed code

        cpu_only : bool, optional (default=False)
            If ``True``, the context manager does not synchronize the
            default CUDA device stream

        Notes
        -----
        Invoking this with `cpu_only=False` can incur a one-time 
        loading time associated with initializing the CUDA
        device stream. This delay will occur before the context
        is executed and thus will not be present in the recorded 
        elapsed time. This will not occur if PyTorch has already 
        initialized a CUDA device (e.g. by putting a model onto 
        a GPU). 

        Examples
        --------
        >>> time_log = PyTorchTimeLogger()
        >>> with time_log.timeit('event_a', cpu_only=False):
        ...     event() # measure time to leave context (seconds)

        >>> # the time elapsed within the context is appended
        >>> # to `time_log['event_a']`"""

        if not cpu_only and torch.cuda.is_available():
            start_event = torch.cuda.Event(enable_timing=True)
            end_event = torch.cuda.Event(enable_timing=True)
            start_event.record()  # enqueue start-event
            try:
                yield name
            finally:
                end_event.record()  # enqueue stop-event
                torch.cuda.synchronize()  # sync/run all events in default device stream

                # millisec -> sec
                elapsed_time_s = start_event.elapsed_time(end_event) / 1000
                self[name].append(elapsed_time_s)
        else:
            start_time = time()
            try:
                yield name
            finally:
                self[name].append(time() - start_time)

    def __call__(
        self,
        func: Optional[Callable] = None,
        *,
        name: Optional[str] = None,
        cpu_only: bool = False
    ) -> Callable:
        """Exposes a decorator interface to the timing context.
        
        Each execution of the decorated function will be timed and logged.
        
        The decorator can be invoked with or without its named arguments.
        
        Parameters
        ----------
        func : Callable
            The function to be decorated
        
        name : Optional[str]
            The name associated with the timed event. If `None`, the
            name of the decorated function is used.
        
        cpu_only : bool, optional (default=False)
            If True, timing is performed without syncing CUDA events.
        
        Returns
        -------
        timed_function : Callable
    
        Examples
        --------
        Timing function execution with a decorator

        >>> timelog = PyTorchTimeLogger()
        >>> @timelog  # you can specify `name` and `cpu_only` if you'd like
        ... def func():
        ...    # Time spent within function body will be logged to
        ...    # `timelog["func"]`
        ...    pass

        Or, you can make a timed function "on the fly" with a functional form factor

        >>> def f(x): return x
        >>> # timing in f will be logged to `timelog["special-name-for-f"]`
        >>> timed_f = timelog(f, name="special-name-for-f")
        """
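        if func is None:
            # Invoked with keyword arguments only, e.g. `@timelog(cpu_only=True)`:
            # return a decorator that will receive the function.
            return lambda x: self(x, name=name, cpu_only=cpu_only)

        if name is None:
            name = func.__name__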

        @wraps(func)
        def wrapper(*args, **kwargs):
            with self.timeit(name=name, cpu_only=cpu_only):
                return func(*args, **kwargs)

        return wrapper
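

Because the logger is a `defaultdict` mapping each event name to a list of times, repeated measurements of the same event can be summarized with ordinary dictionary code. The sketch below uses a plain `defaultdict(list)` as a stand-in for a populated `PyTorchTimeLogger`; the event names, the times, and the `summarize` helper are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Stand-in for a populated logger: event name -> list of times in seconds.
# These values are made up for illustration; they are not measurements.
time_log = defaultdict(list)
time_log["attack-a"].extend([0.5, 1.5])
time_log["attack-b"].append(2.0)


def summarize(log):
    """Return {event name: (number of measurements, mean time in seconds)}."""
    return {name: (len(times), mean(times)) for name, times in log.items() if times}


for name, (count, avg) in summarize(time_log).items():
    print(f"{name}: {count} run(s), mean {avg:.3f} s")
```

Averaging over several runs of the same event is one way to smooth out run-to-run variation; the benchmark cells below record a single measurement per package.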


In [2]:
# https://github.com/RobustBench/robustbench
from robustbench.data import load_cifar10
from robustbench.utils import clean_accuracy  # `load_model` comes from rai_experiments instead

images, labels = load_cifar10(n_examples=50)
device = "cuda"



Files already downloaded and verified


In [4]:
from rai_experiments.models.pretrained import load_model


# Load a pre-trained Madry lab model. Swap in the commented line below to
# time the standard model instead of the L2-robust one.
# ckpt, model_name = "mitll_cifar_nat.pt", "standard"
ckpt, model_name = "mitll_cifar_l2_1_0.pt", "robust"

model = load_model(ckpt)
model.to(device)
model.eval()

acc = clean_accuracy(model, images.to(device), labels.to(device))
print("Model: {}".format(model_name))
print("- Clean Acc: {}".format(acc))


Model: robust
- Clean Acc: 0.9


In [4]:
import datetime
import numpy as np
import warnings

warnings.filterwarnings(action="ignore")

import torch
import torch.nn as nn
import torch.optim as optim

# https://github.com/bethgelab/foolbox
import foolbox as fb

print("foolbox %s" % (fb.__version__))

# https://github.com/IBM/adversarial-robustness-toolbox
import art
import art.attacks.evasion as evasion
from art.estimators.classification import PyTorchClassifier

print("art %s" % (art.__version__))

import rai_toolbox
from rai_toolbox.perturbations import gradient_ascent, AdditivePerturbation
from rai_toolbox.optim import (
    L2ProjectedOptim,
    ChainedParamTransformingOptimizer,
    ClampedGradientOptimizer,
    SignedGradientOptim,
    ClampedParameterOptimizer,
)
from functools import partial

print("rai-toolbox %s" % (rai_toolbox.__version__))

import sys

sys.path.insert(0, "..")
# https://github.com/Harry24k/adversarial-attacks-pytorch
import torchattacks

print("torchattacks %s" % (torchattacks.__version__))


foolbox 3.3.3
art 1.12.1
rai-toolbox 0.2.1.post1.dev49+g3ae3a80.d20221026
torchattacks 3.3.0


## Linf
### FGSM 

In [13]:
num_steps = 10

time_log = PyTorchTimeLogger()

print("Model: {}".format(model_name))

print("- Torchattacks")

atk = torchattacks.FGSM(model, eps=8 / 255)
assert not model.training
with time_log.timeit(name="torch-attack", cpu_only=False):
    adv_images = images
    for _ in range(num_steps):
        adv_images = atk(adv_images, labels)


acc = clean_accuracy(model, adv_images, labels)
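# Use `elapsed` rather than `time` so the `time` function imported in the
# first cell (used by `PyTorchTimeLogger.timeit` on its CPU path) is not shadowed
elapsed, = time_log["torch-attack"]
print(
    "- Robust Acc: {} ({} ms per step)".format(
        acc, int(elapsed * 1000 / num_steps)
    )
)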

print("- Foolbox")
fmodel = fb.PyTorchModel(model, bounds=(0, 1))
atk = fb.attacks.LinfFastGradientAttack(random_start=False)


adv_images = images.to("cuda:0")
foolbox_labels = labels.to("cuda:0")
assert not model.training
with time_log.timeit(name="Foolbox", cpu_only=False):
    for _ in range(num_steps):
        _, adv_images, _ = atk(fmodel, adv_images, foolbox_labels, epsilons=8 / 255)

acc = clean_accuracy(model, adv_images, labels)
elapsed, = time_log["Foolbox"]
print(
    "- Robust Acc: {} ({} ms per step)".format(
        acc, int(elapsed * 1000 / num_steps)
    )
)

print("- ART")
classifier = PyTorchClassifier(
    model=model,
    clip_values=(0, 1),
    loss=nn.CrossEntropyLoss(),
    optimizer=optim.Adam(model.parameters(), lr=0.01),
    input_shape=(3, 32, 32),
    nb_classes=10,
)
atk = evasion.FastGradientMethod(
    norm=np.inf, batch_size=50, estimator=classifier, eps=8 / 255
)


adv_images = images.numpy()
art_labels = labels.numpy()
assert not model.training
with time_log.timeit(name="ART", cpu_only=False):
    for _ in range(num_steps):
        adv_images = atk.generate(adv_images, art_labels)
acc = clean_accuracy(model, torch.tensor(adv_images).to(device), labels)

elapsed, = time_log["ART"]
print(
    "- Robust Acc: {} ({} ms per step)".format(
        acc, int(elapsed * 1000 / num_steps)
    )
)

print("- rai-toolbox")


d_images = images.to(device)
d_labels = labels.to(device)

pert_model = AdditivePerturbation(d_images)
rai_optim = SignedGradientOptim(
    lr=8 / 255,
    params=pert_model.parameters(),
)

assert not model.training
with time_log.timeit(name="rai-toolbox", cpu_only=False):
    adv_images, _ = gradient_ascent(
        model=model,
        data=d_images,
        target=d_labels,
        steps=num_steps,
        optimizer=rai_optim,
        perturbation_model=pert_model,
        use_best=False,
    )

acc = clean_accuracy(model, adv_images, labels)

elapsed, = time_log["rai-toolbox"]
print(
    "- Robust Acc: {} ({} ms per step)".format(
        acc, int(elapsed * 1000 / num_steps)
    )
)
print()


Model: robust
- Torchattacks
- Robust Acc: 0.0 (81 ms per step)
- Foolbox
- Robust Acc: 0.0 (104 ms per step)
- ART
- Robust Acc: 0.0 (84 ms per step)
- rai-toolbox
- Robust Acc: 0.0 (58 ms per step)



### PGD

In [15]:
print('Model: {}'.format(model_name))
num_steps = 10
model.eval()
time_log = PyTorchTimeLogger()

print("- Torchattacks")
atk = torchattacks.PGD(model, eps=8/255, alpha=2/255, steps=num_steps, random_start=False)

assert not model.training
with time_log.timeit(name="torch-attack", cpu_only=False):
    adv_images = atk(images, labels)

acc = clean_accuracy(model, adv_images, labels)
elapsed, = time_log["torch-attack"]
print('- Robust Acc: {} ({} ms per step)'.format(acc, int(elapsed * 1000 / num_steps)))

print("- Foolbox")
fmodel = fb.PyTorchModel(model, bounds=(0, 1))
atk = fb.attacks.LinfPGD(abs_stepsize=2/255, steps=num_steps, random_start=False)

fb_images = images.to('cuda:0')
fb_labels = labels.to('cuda:0')

assert not model.training
with time_log.timeit(name="Foolbox", cpu_only=False):
    _, adv_images, _ = atk(fmodel, fb_images, fb_labels, epsilons=8/255)

acc = clean_accuracy(model, adv_images, labels)
elapsed, = time_log["Foolbox"]
print('- Robust Acc: {} ({} ms per step)'.format(acc, int(elapsed * 1000 / num_steps)))

print("- ART")
classifier = PyTorchClassifier(model=model, clip_values=(0, 1),
                                loss=nn.CrossEntropyLoss(),
                                optimizer=optim.Adam(model.parameters(), lr=0.01),
                                input_shape=(3, 32, 32), nb_classes=10)
atk = evasion.ProjectedGradientDescent(batch_size=50, num_random_init=0,
                                        estimator=classifier, eps=8/255,
                                        eps_step=2/255, max_iter=num_steps)
art_images = images.numpy()
art_labels = labels.numpy()

assert not model.training
with time_log.timeit(name="ART", cpu_only=False):
    adv_images = atk.generate(art_images, art_labels)

acc = clean_accuracy(model, torch.tensor(adv_images, device=device), labels)
elapsed, = time_log["ART"]
print('- Robust Acc: {} ({} ms per step)'.format(acc, int(elapsed * 1000 / num_steps)))

print("- rai-toolbox")


d_images = images.to(device)
d_labels = labels.to(device)

pert_model = AdditivePerturbation(d_images)
rai_optim = ChainedParamTransformingOptimizer(
    SignedGradientOptim,
    # Use a slightly smaller epsilon ball (7 / 225 ≈ 0.0311 vs. 8 / 255 ≈ 0.0314)
    # to match the other attacks, which apply an additional clamp to (delta + img)
    partial(ClampedParameterOptimizer, clamp_min=-7 / 225, clamp_max=7 / 225),
    lr=2 / 255,
    params=pert_model.parameters(),
)



assert not model.training
with time_log.timeit(name="rai-toolbox", cpu_only=False):
    adv_images, _ = gradient_ascent(
        model=model,
        data=d_images,
        target=d_labels,
        steps=num_steps,
        optimizer=rai_optim,
        perturbation_model=pert_model,
        use_best=False,
    )
    adv_images = torch.clamp(adv_images, 0, 1)

acc = clean_accuracy(model, adv_images, labels)

elapsed, = time_log["rai-toolbox"]
print("- Robust Acc: {} ({} ms per step)".format(acc, int(elapsed * 1000 / num_steps)))
print()


Model: robust
- Torchattacks
- Robust Acc: 0.44 (80 ms per step)
- Foolbox
- Robust Acc: 0.44 (82 ms per step)
- ART


- Robust Acc: 0.44 (88 ms per step)
- rai-toolbox
- Robust Acc: 0.44 (58 ms per step)



## L2 PGD

In [17]:
num_steps = 10
time_log = PyTorchTimeLogger()

print("Model: {}".format(model_name))
print("- Torchattacks")
atk = torchattacks.PGDL2(
    model, eps=128 / 255, alpha=15 / 255, steps=num_steps, random_start=False
)
with time_log.timeit(name="torch-attack", cpu_only=False):
    adv_images = atk(images, labels)

acc = clean_accuracy(model, adv_images, labels)

elapsed, = time_log["torch-attack"]
print(
    "- Robust Acc: {} ({} ms per step)".format(
        acc, int(elapsed * 1000 / num_steps)
    )
)

print("- Foolbox")
fmodel = fb.PyTorchModel(model, bounds=(0, 1))
atk = fb.attacks.L2PGD(abs_stepsize=15 / 255, steps=num_steps, random_start=False)
fb_images = images.to("cuda:0")
fb_labels = labels.to("cuda:0")
with time_log.timeit(name="Foolbox", cpu_only=False):
    _, adv_images, _ = atk(fmodel, fb_images, fb_labels, epsilons=128 / 255)

acc = clean_accuracy(model, adv_images, labels)

elapsed, = time_log["Foolbox"]
print(
    "- Robust Acc: {} ({} ms per step)".format(
        acc, int(elapsed * 1000 / num_steps)
    )
)

print("- ART")
classifier = PyTorchClassifier(
    model=model,
    clip_values=(0, 1),
    loss=nn.CrossEntropyLoss(),
    optimizer=optim.Adam(model.parameters(), lr=0.01),
    input_shape=(3, 32, 32),
    nb_classes=10,
)
atk = evasion.ProjectedGradientDescent(
    batch_size=50,
    num_random_init=0,
    norm=2,
    estimator=classifier,
    eps=128 / 255,
    eps_step=15 / 255,
    max_iter=num_steps,
)

art_imgs = images.numpy()
art_lbls = labels.numpy()

with time_log.timeit(name="ART", cpu_only=False):
    adv_images = atk.generate(art_imgs, art_lbls)

acc = clean_accuracy(model, torch.tensor(adv_images, device=device), labels)

elapsed, = time_log["ART"]
print(
    "- Robust Acc: {} ({} ms per step)".format(
        acc, int(elapsed * 1000 / num_steps)
    )
)

print("- rai-toolbox")

d_images = images.to(device)
d_labels = labels.to(device)

pert_model = AdditivePerturbation(d_images)
rai_optim = L2ProjectedOptim(
    lr=15 / 255,
    epsilon=128 / 255,
    params=pert_model.parameters(),
)

with time_log.timeit(name="rai-toolbox", cpu_only=False):
    adv_images, _ = gradient_ascent(
        model=model,
        data=d_images,
        target=d_labels,
        steps=num_steps,
        optimizer=rai_optim,
        perturbation_model=pert_model,
        use_best=False,
    )
    adv_images = torch.clamp(adv_images, 0, 1)

acc = clean_accuracy(model, adv_images, labels)

elapsed, = time_log["rai-toolbox"]
print("- Robust Acc: {} ({} ms per step)".format(acc, int(elapsed * 1000 / num_steps)))
print()


Model: robust
- Torchattacks
- Robust Acc: 0.7 (79 ms per step)
- Foolbox
- Robust Acc: 0.7 (82 ms per step)
- ART


- Robust Acc: 0.7 (90 ms per step)
- rai-toolbox
- Robust Acc: 0.7 (58 ms per step)




FGSM, Model: standard
- Torchattacks
  - Robust Acc: 0.0 (82 ms per step)
- Foolbox
  - Robust Acc: 0.0 (105 ms per step)
- ART
  - Robust Acc: 0.0 (83 ms per step)
- rai-toolbox
  - Robust Acc: 0.0 (58 ms per step)


LinfPGD, Model: standard
- Torchattacks
  - Robust Acc: 0.0 (81 ms per step)
- Foolbox
  - Robust Acc: 0.0 (82 ms per step)
- ART
  - Robust Acc: 0.0 (89 ms per step)
- rai-toolbox
  - Robust Acc: 0.0 (58 ms per step)


L2PGD, Model: standard
- Torchattacks
  - Robust Acc: 0.02 (79 ms per step)
- Foolbox
  - Robust Acc: 0.02 (82 ms per step)
- ART
  - Robust Acc: 0.02 (89 ms per step)
- rai-toolbox
  - Robust Acc: 0.02 (58 ms per step)


FGSM, Model: robust
- Torchattacks
  - Robust Acc: 0.0 (81 ms per step)
- Foolbox
  - Robust Acc: 0.0 (105 ms per step)
- ART
  - Robust Acc: 0.0 (84 ms per step)
- rai-toolbox
  - Robust Acc: 0.0 (58 ms per step)

LinfPGD, Model: robust
- Torchattacks
  - Robust Acc: 0.44 (79 ms per step)
- Foolbox
  - Robust Acc: 0.44 (82 ms per step)
- ART
  - Robust Acc: 0.44 (90 ms per step)
- rai-toolbox
  - Robust Acc: 0.44 (58 ms per step)

L2PGD, Model: robust
- Torchattacks
  - Robust Acc: 0.7 (81 ms per step)
- Foolbox
  - Robust Acc: 0.7 (82 ms per step)
- ART
  - Robust Acc: 0.7 (89 ms per step)
- rai-toolbox
  - Robust Acc: 0.7 (58 ms per step)

| Attack      | Package      | Time per step (robust accuracy), Model: Madry-Robust |
| :---------: | ------------ | ---------------------------------------------------- |
| FGSM (Linf) | rai-toolbox  | **58 ms** (0%)  |
|             | Torchattacks | 81 ms (0%)      |
|             | Foolbox      | 105 ms (0%)     |
|             | ART          | 84 ms (0%)      |
| PGD (Linf)  | rai-toolbox  | **58 ms** (44%) |
|             | Torchattacks | 79 ms (44%)     |
|             | Foolbox      | 82 ms (44%)     |
|             | ART          | 90 ms (44%)     |
| PGD (L2)    | rai-toolbox  | **58 ms** (70%) |
|             | Torchattacks | 81 ms (70%)     |
|             | Foolbox      | 82 ms (70%)     |
|             | ART          | 89 ms (70%)     |
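
To read the timings as ratios rather than absolute values, each package's time per step can be divided by the rai-toolbox time for the same attack. The sketch below does this with the "Model: robust" timings from the summary logs above; the helper name `relative_to` is hypothetical.

```python
# Milliseconds per step, copied from the "Model: robust" summary logs above
timings_ms = {
    "FGSM (Linf)": {"rai-toolbox": 58, "Torchattacks": 81, "Foolbox": 105, "ART": 84},
    "PGD (Linf)": {"rai-toolbox": 58, "Torchattacks": 79, "Foolbox": 82, "ART": 90},
    "PGD (L2)": {"rai-toolbox": 58, "Torchattacks": 81, "Foolbox": 82, "ART": 89},
}


def relative_to(baseline, per_package):
    """Express each package's time as a multiple of the baseline package's time."""
    base = per_package[baseline]
    return {pkg: round(ms / base, 2) for pkg, ms in per_package.items()}


for attack, per_package in timings_ms.items():
    print(attack, relative_to("rai-toolbox", per_package))
```

These ratios come from a single run on 50 CIFAR-10 images, so they are best treated as indicative rather than precise.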