# Adversarial Robustness Tutorial

In this tutorial, we will:

- Implement and run adversarial attacks on a standard neural network.
- Build and train a Lipschitz-constrained network (LipNet).
- Evaluate the empirical and certified robustness of the LipNet.

**Table of contents**

1. [🧠 Train a standard neural network on MNIST](#train-mnist)
2. [⚔️ Adversarial attacks](#adv-att)
3. [🛡️ Robustness with Lipschitz-constrained networks](#lipschitz)


**Package Requirements**

`notebook`, `torch`, `torchvision`, `torchattacks`, `deel-torchlip`, `matplotlib`

In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader, random_split
from torchvision import datasets, transforms
import matplotlib.pyplot as plt

## 1. 🧠 Train a standard neural network on MNIST <a id="train-mnist"></a>

In this first section, we will train a very simple CNN on MNIST. There is no exercise
here.

In [None]:
# Device & Hyperparameters

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print("Using device:", device)

BATCH_SIZE = 128
VAL_SPLIT = 0.1
NUM_EPOCHS = 10
LR = 1e-3

### 1.1 Load MNIST dataset

We will:

- Use the **MNIST** dataset (handwritten digits 0–9).
- Split the original training set into:
  - a **train** set
  - a small **validation** set (to monitor training)

We also create a **test** loader to evaluate the final model.

In [4]:
# Dataloaders for MNIST
def get_mnist_dataloaders(batch_size=BATCH_SIZE, val_split=VAL_SPLIT):
    transform = transforms.Compose([transforms.ToTensor()])

    trainval_dataset = datasets.MNIST(root="./data", train=True, download=True, transform=transform)
    test_dataset = datasets.MNIST(root="./data", train=False, download=True, transform=transform)

    n_total = len(trainval_dataset)
    n_val = int(n_total * val_split)
    n_train = n_total - n_val

    train_dataset, val_dataset = random_split(trainval_dataset, [n_train, n_val])

    train_loader = DataLoader(train_dataset, batch_size, shuffle=True, num_workers=2, pin_memory=True)
    val_loader = DataLoader(val_dataset, batch_size, shuffle=False, num_workers=2, pin_memory=True)
    test_loader = DataLoader(test_dataset, batch_size, shuffle=False, num_workers=2, pin_memory=True)

    return train_loader, val_loader, test_loader


train_loader, val_loader, test_loader = get_mnist_dataloaders()

### 1.2 Build a simple CNN network

We define a very small convolutional neural network:

- Two convolutional layers with ReLU and max pooling.
- Two fully connected layers.

This is not a state-of-the-art model, but it is good enough to reach high accuracy on
MNIST and to demonstrate adversarial attacks.
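To see where the `64 * 7 * 7` input size of the first fully connected layer comes from, the sketch below traces the spatial size of an MNIST image through the two convolution and pooling stages. It uses the standard output-size formula for convolutions and pooling without dilation; the helper name `out_size` is hypothetical and only serves this illustration.

```python
def out_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1


size = 28                                   # MNIST images are 28x28
size = out_size(size, kernel=3, padding=1)  # conv1: 3x3 kernel, padding 1 keeps the size
size = out_size(size, kernel=2, stride=2)   # 2x2 max pooling halves the size
size = out_size(size, kernel=3, padding=1)  # conv2: 3x3 kernel, padding 1 keeps the size
size = out_size(size, kernel=2, stride=2)   # 2x2 max pooling halves the size again
flat_features = 64 * size * size            # 64 channels after the second convolution

print(size, flat_features)
```

The final value of `flat_features` is the number of inputs expected by the first linear layer after flattening.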


In [5]:
# Small CNN Model


class SimpleCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 32, kernel_size=3, padding=1)  # 1x28x28 -> 32x28x28
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3, padding=1)  # 32x14x14 -> 64x14x14
        self.pool = nn.MaxPool2d(2, 2)  # downsample by 2
        self.fc1 = nn.Linear(64 * 7 * 7, 128)
        self.fc2 = nn.Linear(128, num_classes)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))  # -> 32 x 14 x 14
        x = self.pool(F.relu(self.conv2(x)))  # -> 64 x 7 x 7
        x = x.view(x.size(0), -1)  # flatten
        x = F.relu(self.fc1(x))
        logits = self.fc2(x)
        return logits


model = SimpleCNN().to(device)

### 1.3 Training

We define three helper functions:

- `train_one_epoch` – does one pass over the training data.
- `evaluate` – computes loss and accuracy on a given dataloader.
- `train` – trains over multiple epochs using the two previous helper functions.

We will train only for a few epochs (e.g., 10 epochs).  
The goal is just to get the model to a reasonable accuracy so that we can see meaningful
adversarial examples.


In [6]:
# Training & Evaluation helper functions


def train_one_epoch(model, dataloader, optimizer, criterion, device):
    model.train()
    running_loss = 0.0
    correct = 0
    total = 0

    for images, labels in dataloader:
        images, labels = images.to(device), labels.to(device)

        optimizer.zero_grad()
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        running_loss += loss.item() * images.size(0)
        _, preds = outputs.max(1)
        correct += (preds == labels).sum().item()
        total += labels.size(0)

    epoch_loss = running_loss / total
    epoch_acc = correct / total
    return epoch_loss, epoch_acc


def evaluate(model, dataloader, criterion, device):
    model.eval()
    running_loss = 0.0
    correct = 0
    total = 0

    with torch.no_grad():
        for images, labels in dataloader:
            images, labels = images.to(device), labels.to(device)
            outputs = model(images)
            loss = criterion(outputs, labels)

            running_loss += loss.item() * images.size(0)
            _, preds = outputs.max(1)
            correct += (preds == labels).sum().item()
            total += labels.size(0)

    epoch_loss = running_loss / total
    epoch_acc = correct / total
    return epoch_loss, epoch_acc


def train(model, train_loader, val_loader, optimizer, criterion, device):
    print("Starting training...")
    for epoch in range(NUM_EPOCHS):
        train_loss, train_acc = train_one_epoch(model, train_loader, optimizer, criterion, device)
        val_loss, val_acc = evaluate(model, val_loader, criterion, device)

        print(
            f"Epoch [{epoch + 1}/{NUM_EPOCHS}] "
            f"Train Loss: {train_loss:.4f}, Train Acc: {train_acc:.4f} "
            f"| Val Loss: {val_loss:.4f}, Val Acc: {val_acc:.4f}"
        )

In [None]:
# Train the standard model

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=LR)

train(model, train_loader, val_loader, optimizer, criterion, device)

### 1.4 Evaluate on Test Set

Now we measure how well the trained model performs on unseen test images.
We expect a high accuracy (above 98% usually), but the exact number is not critical.


In [None]:
# Test Evaluation
test_loss, test_acc = evaluate(model, test_loader, criterion, device)
print(f"Test Loss: {test_loss:.4f}, Test Acc: {test_acc:.4f}")

### 1.5 Visualizing Logits for One Image

The model outputs **logits**: raw scores for each class (0–9).  
We will visualize these logits by sorting them in descending order.

This helps us see how "confident" the model is for each class. We will see later that
for Lipschitz-constrained networks, these logits are key to guaranteeing the robustness
of the model.
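As a small illustration of what the sorted-logits plot conveys, the following sketch sorts a made-up logit vector, converts it to softmax probabilities, and computes the gap between the two highest logits. The values and the helper name `softmax` are assumptions chosen for the example; they do not come from the trained model.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


logits = [1.2, -0.3, 4.0, 0.5, 2.5]  # made-up scores for 5 classes

# Class indices sorted by decreasing logit, as in the bar plot
order = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
top1, top2 = order[0], order[1]
margin = logits[top1] - logits[top2]  # gap between the two highest logits
probs = softmax(logits)

print("classes sorted by logit:", order)
print("top-1 class:", top1, "| margin to runner-up:", margin)
print("softmax probability of top-1:", round(probs[top1], 3))
```

The softmax changes the scale of the bars but not their order, so the predicted class is the same in both views.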


In [None]:
# Plot logits (sorted) with top-1 prediction highlighted

MNIST_CLASS_NAMES = [str(i) for i in range(10)]


def plot_sorted_logits(logits: torch.Tensor, class_names=MNIST_CLASS_NAMES, use_softmax=False):
    """
    logits: 1D tensor of shape (num_classes,)
    Plots logits sorted by value (descending) with class names on x-axis.
    Top-1 predicted class is the first bar.
    """
    if logits.dim() != 1:
        raise ValueError("logits must be a 1D tensor for a single sample.")

    logits = logits.detach().cpu()
    if use_softmax:
        logits = F.softmax(logits, dim=0)
    sorted_indices = torch.argsort(logits, descending=True)
    sorted_logits = logits[sorted_indices]

    labels = [class_names[i] for i in sorted_indices]

    plt.figure(figsize=(8, 4))
    bars = plt.bar(range(len(sorted_logits)), sorted_logits)

    # Optionally highlight the top-1 bar
    bars[0].set_color("orange")

    plt.xticks(range(len(sorted_logits)), labels)
    plt.xlabel("Class")
    plt.ylabel("Logit value")
    plt.title("Logits sorted" + (" (Softmax)" if use_softmax else ""))
    plt.tight_layout()
    plt.show()


model.eval()
with torch.no_grad():
    sample_images, sample_labels = next(iter(test_loader))
    sample_images, sample_labels = sample_images.to(device), sample_labels.to(device)
    sample_logits = model(sample_images)

# Plot single image
idx = 7
plt.imshow(sample_images[idx].cpu().squeeze(), cmap="gray")
plt.title(f"True Label: {sample_labels[idx].item()}, Predicted: {sample_logits[idx].argmax().item()}")

# Plot sorted logits on one test sample
plot_sorted_logits(sample_logits[idx], MNIST_CLASS_NAMES)

## 2. ⚔️ Adversarial attacks <a id="adv-att"></a>

An adversarial attack in classification is the process of finding an image $x_{adv}$
close to the original image $x$ such that the predicted class is different, i.e.
$f(x_{adv})$ and $f(x)$ differ. It has been shown that such adversarial examples are
easy to build for standard neural networks, even with a very small added perturbation.
In the following, we assume that the original image is correctly classified, i.e. the
attack tries to make the model predict a class different from the true label.

Related to the sorted logits plot above, the idea is to find a small perturbation on the
input in order to change the highest logit. 

In the following, we will:
1. Implement and run the simple FGM method, in the untargeted setup.
2. Implement and run the same attack but with a targeted objective.
3. Use a more sophisticated attack with the `torchattacks` library.

### 2.1 Untargeted FGM attack

Here, we want to implement the untargeted FGM attack, defined as:

$$x_{adv} = x + \epsilon \frac{\nabla_{x} l(f_\theta(x), y)}{\| \nabla_{x}
l(f_\theta(x), y) \|}.$$

$\epsilon$ defines the budget of the attack and controls how strong the attack is. Note
that the distance between $x_{adv}$ and $x$ (in $\ell_2$ norm) is exactly $\epsilon$
before the result is clipped to the valid pixel range $[0, 1]$; after clipping it can be
slightly smaller. The FGM attack simply moves in the direction of the gradient to
increase the loss for the true label. It is closely related to the L2-PGD attack with a
single iteration.
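The update rule can be checked by hand on a model whose input gradient is known in closed form. The sketch below applies one untargeted FGM step to a toy linear softmax classifier in pure Python, where the gradient of the cross-entropy with respect to the input is $W^\top (p - \text{onehot}(y))$. The weights, the input and all function names are made up for the illustration; this is not the PyTorch implementation requested in the exercise, and no clipping to $[0, 1]$ is applied.

```python
import math


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def logits_fn(W, b, x):
    """Logits of a linear model: z = W x + b."""
    return [sum(w * v for w, v in zip(row, x)) + b_k for row, b_k in zip(W, b)]


def ce_loss(W, b, x, y):
    return -math.log(softmax(logits_fn(W, b, x))[y])


def ce_grad_wrt_input(W, b, x, y):
    """Closed-form gradient of the cross-entropy w.r.t. x: W^T (p - onehot(y))."""
    p = softmax(logits_fn(W, b, x))
    residual = [p_k - (1.0 if k == y else 0.0) for k, p_k in enumerate(p)]
    return [sum(W[k][j] * residual[k] for k in range(len(W))) for j in range(len(x))]


def fgm_untargeted(W, b, x, y, epsilon):
    """One step of size epsilon along the l2-normalized gradient of the loss."""
    g = ce_grad_wrt_input(W, b, x, y)
    norm = math.sqrt(sum(v * v for v in g)) or 1.0  # guard against a zero gradient
    return [x_j + epsilon * g_j / norm for x_j, g_j in zip(x, g)]


# Toy 3-class linear classifier on 2D inputs (made-up weights)
W = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
b = [0.0, 0.0, 0.0]
x, y = [1.0, 0.2], 0

x_adv = fgm_untargeted(W, b, x, y, epsilon=0.5)
dist = math.sqrt(sum((a - o) ** 2 for a, o in zip(x_adv, x)))
loss_before, loss_after = ce_loss(W, b, x, y), ce_loss(W, b, x_adv, y)
print(f"l2 distance: {dist:.3f}, loss: {loss_before:.3f} -> {loss_after:.3f}")
```

With a neural network, the closed-form gradient is replaced by automatic differentiation, but the normalization and the step are the same.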

**Exercise 2.1.1**

Complete the following code for the untargeted FGM attack.

In [10]:
# Implement the untargeted FGM Attack


def fgm_attack_untargeted(model, images, labels, epsilon, loss_fn=None):
    """
    Untargeted FGM.

    Args:
        images: (N, C, H, W) tensor, normalized as in training
        labels: (N,) tensor for true labels
        epsilon: attack budget (same space as normalized images)
        loss_fn: loss function to maximize (default: CrossEntropyLoss)

    Returns:
        adv_images: (N, C, H, W) tensor of FGM adversarial samples
    """

    model.eval()
    loss_fn = loss_fn or nn.CrossEntropyLoss()

    images = images.clone().detach().to(device)
    labels = labels.to(device)
    images.requires_grad = True

    outputs = model(images)

    # TODO: compute gradient of the loss and create adversarial examples
    ...

    adv_images = torch.clamp(adv_images, 0, 1)

    return adv_images.detach()

In [35]:
# Run FGM attack on the first batch

images, labels = next(iter(test_loader))
adv_images_untargeted = fgm_attack_untargeted(model, images, labels, epsilon=1.5)

In [12]:
def plot_adversarial_examples(model, original, adversarial):
    """
    Displays the original and adversarial images, and their difference.

    Args:
        model: the trained model (used to get predictions)
        original: (1, H, W) tensor
        adversarial: (1, H, W) tensor
    """

    # Get prediction of original and adversarial
    model.eval()
    with torch.no_grad():
        orig_pred = model(original.unsqueeze(0).to(device)).argmax(dim=1).item()
        adv_pred = model(adversarial.unsqueeze(0).to(device)).argmax(dim=1).item()

    original = original.cpu().squeeze()  # (1, H, W) -> (H, W)
    adversarial = adversarial.cpu().squeeze()  # (1, H, W) -> (H, W)

    plt.figure(figsize=(12, 4))
    plt.subplot(1, 3, 1)
    plt.title(f"Original Image (pred: {orig_pred})")
    plt.imshow(original, cmap="gray")
    plt.axis("off")

    plt.subplot(1, 3, 2)
    plt.title(f"Adversarial Image (pred: {adv_pred})")
    plt.imshow(adversarial, cmap="gray")
    plt.axis("off")

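    plt.subplot(1, 3, 3)
    diff_norm = torch.norm(adversarial - original)
    plt.title(f"Difference (epsilon={diff_norm:.2f})")
    # Encode the sign of the perturbation as colors: red = darkened pixels, green = brightened pixels
    diff_pos = F.relu(adversarial - original)
    diff_neg = F.relu(original - adversarial)
    difference = torch.stack([diff_neg, diff_pos, torch.zeros_like(diff_pos)], dim=-1)
    # Rescale to [0, 1]; the clamp avoids a division by zero when both images are identical
    difference /= difference.abs().max().clamp_min(1e-12)
    plt.imshow(difference)  # (H, W, 3) RGB image, so no colormap is needed
    plt.axis("off")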

    plt.tight_layout()
    plt.show()

In [None]:
# Plot the original and adversarial images

idx = 7
plot_adversarial_examples(model, images[idx], adv_images_untargeted[idx])

In [None]:
# Plot the sorted logits for original and adversarial images

plot_sorted_logits(model(images[idx].unsqueeze(0).to(device)).squeeze(), MNIST_CLASS_NAMES)
plot_sorted_logits(model(adv_images_untargeted[idx].unsqueeze(0).to(device)).squeeze(), MNIST_CLASS_NAMES)

**Exercise 2.1.2**

We want to find (approximately) the minimum budget $\epsilon$ to change the original
class. We will perform successive attacks with different values of $\epsilon$ and find
when the attack succeeds.

In [None]:
# Change epsilon budget to observe when the attack becomes successful

epsilons = ...

for eps in epsilons:
    adv_images_untargeted = ...

    adv_preds = ...
    diffs = ...

    if adv_preds[idx].item() != labels[idx].item():
        print(f"✅ Attack succeeded at epsilon = {diffs:.4f}  (set budget = {eps})")
    else:
        print(f"❌ Attack failed at epsilon = {diffs:.4f}  (set budget = {eps})")

### 2.2 Targeted FGM attack

Previously, the untargeted attack aimed at changing the predicted class, whatever the
new prediction. In the targeted setting, we want to push the prediction towards a
specific class. Depending on the target, this attack can require a larger budget to
succeed.

The targeted adversarial sample $x_{adv}$ is:

$$x_{adv} = x - \epsilon \frac{\nabla_{x} l(f_\theta(x), y^t)}{\| \nabla_{x}
l(f_\theta(x), y^t) \|}$$

with $y^t$ being the targeted class. Note the **minus sign**: we are moving the input in
the direction that makes the model more confident in the target class, i.e. minimizing
the loss with respect to the target class.
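The effect of the minus sign can be seen on a toy linear softmax classifier, where the input gradient of the cross-entropy is available in closed form. In this sketch, the weights, the input, the target class and the names `predict_proba` and `fgm_targeted` are all made up for the illustration; it is a pure-Python check of the formula, not the PyTorch function asked for in the exercise.

```python
import math

# Toy 3-class linear classifier on 2D inputs (made-up weights, no bias)
W = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]


def predict_proba(x):
    """Softmax probabilities of the linear model z = W x."""
    z = [sum(w * v for w, v in zip(row, x)) for row in W]
    m = max(z)
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]


def fgm_targeted(x, target, epsilon):
    """One l2-normalized gradient step that decreases the loss of `target`."""
    p = predict_proba(x)
    residual = [p_k - (1.0 if k == target else 0.0) for k, p_k in enumerate(p)]
    grad = [sum(W[k][j] * residual[k] for k in range(len(W))) for j in range(len(x))]
    norm = math.sqrt(sum(g * g for g in grad)) or 1.0  # guard against a zero gradient
    return [v - epsilon * g / norm for v, g in zip(x, grad)]  # note the minus sign


x, target = [1.0, 0.2], 1  # x is initially classified as class 0
x_adv = fgm_targeted(x, target, epsilon=1.0)

p_before, p_after = predict_proba(x), predict_proba(x_adv)
pred_before = p_before.index(max(p_before))
pred_adv = p_after.index(max(p_after))
print(f"p(target): {p_before[target]:.3f} -> {p_after[target]:.3f}")
print(f"prediction: {pred_before} -> {pred_adv}")
```

On this toy problem, a budget of 1.0 is enough to reach the target class; with a smaller budget the target probability still increases but the prediction may not change.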

**Exercise 2.2.1**

Implement the targeted FGM attack.

In [16]:
# Targeted FGM Attack


def fgm_attack_targeted(model, images, target_labels, epsilon, loss_fn=None):
    """
    Targeted FGM

    Args:
        images: (N, C, H, W) tensor, normalized as in training
        target_labels: (N,) tensor of desired target classes
        epsilon: attack budget (same space as normalized images)
        loss_fn: loss function to minimize (default: CrossEntropyLoss)

    Returns:
        adv_images: (N, C, H, W) tensor of targeted FGM adversarial samples
    """
    model.eval()
    loss_fn = loss_fn or nn.CrossEntropyLoss()

    images = images.clone().detach().to(device)
    target_labels = target_labels.to(device)
    images.requires_grad = True

    outputs = model(images)

    # TODO: compute gradient of the loss and create adversarial examples
    ...

    adv_images = torch.clamp(adv_images, 0, 1)

    return adv_images.detach()

In [17]:
# Run targeted FGM attack

images, labels = next(iter(test_loader))
target_labels = torch.ones_like(labels) * 2  # targets all set to class '2'
adv_images_targeted = fgm_attack_targeted(model, images, target_labels, epsilon=3.0)

In [None]:
# Plot the original and targeted adversarial images

idx = 7
plot_adversarial_examples(model, images[idx], adv_images_targeted[idx])

In [None]:
# Plot the sorted logits for original and adversarial images

plot_sorted_logits(model(images[idx].unsqueeze(0).to(device)).squeeze(), MNIST_CLASS_NAMES)
plot_sorted_logits(model(adv_images_targeted[idx].unsqueeze(0).to(device)).squeeze(), MNIST_CLASS_NAMES)

### 2.3 Run more complex attacks

The FGM attack implemented above is a very simple way to find an adversarial example.
Many other attacks exist and perform much better than FGM, e.g. PGD, Carlini & Wagner,
AutoAttack, etc.

There are several Python libraries to run adversarial attacks: 

- [ART toolbox](https://adversarial-robustness-toolbox.readthedocs.io/en/latest/)
- [foolbox](https://foolbox.jonasrauber.de/)
- [Adversarial Library](https://github.com/jeromerony/adversarial-library)
- [Torchattacks](https://github.com/Harry24k/adversarial-attacks-pytorch)


We will use the `torchattacks` library that implements several well-known attacks.


**Exercise 2.3.1**

Run the APGD attack on the first batch of test images. Be careful with `torchattacks`:
if the attack fails and no adversarial image is found within the budget, `torchattacks`
returns the original image (instead of an adversarial one).


In [None]:
# Run APGD with torchattacks

from torchattacks import APGD

# TODO : create APGD attack instance, and run it on the first batch of test images
...

adv_images_apgd = ...

In [None]:
idx = 7
plot_adversarial_examples(model, images[idx], adv_images_apgd[idx])

**Exercise 2.3.2**

Compare performance with the FGM attack. Which attack performs best? 

In [None]:
# Comparison between FGM and APGD
...

To go further, you can try other attacks, such as AutoAttack, which combines several
state-of-the-art attacks. AutoAttack performs very well but is costly because it runs
multiple attacks.

## 3. 🛡️ Robustness with Lipschitz-constrained networks <a id="lipschitz"></a>

In this last section, we will see how Lipschitz-constrained networks can be made robust
to adversarial attacks and how their robustness can be certified.

### 3.1 Build and train a Lipschitz-constrained network

First, we will build and train a simple Lipschitz-constrained network, with the same
architecture as the unconstrained network above.

Recall that the loss is the keystone of the accuracy/robustness trade-off: its
hyperparameter lets us tune the robustness of the model. Here, with the
Tau-Cross-Entropy loss, the model becomes more robust as the temperature `tau` decreases.
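To build some intuition for the role of the temperature, the sketch below evaluates a temperature-scaled cross-entropy on a fixed pair of logits. It assumes the common form $\frac{1}{\tau}\,\mathrm{CE}(\tau \cdot \text{logits}, y)$; this formula is an assumption made for the illustration, and the exact definition used by `TauCrossEntropyLoss` should be checked in the deel-torchlip documentation. The function name `tau_cross_entropy` is hypothetical.

```python
import math


def tau_cross_entropy(logits, label, tau):
    """Assumed form: cross-entropy on logits scaled by tau, divided by tau."""
    scaled = [tau * z for z in logits]
    m = max(scaled)
    log_sum_exp = m + math.log(sum(math.exp(s - m) for s in scaled))
    return (log_sum_exp - scaled[label]) / tau


logits, label = [1.0, 0.0], 0  # correct prediction with a logit margin of 1

losses = {tau: tau_cross_entropy(logits, label, tau) for tau in (0.1, 1.0, 10.0)}
for tau, loss in losses.items():
    print(f"tau = {tau:>4}: loss = {loss:.6f}")
```

Under this assumption, the same margin is penalized much more at small `tau`, so the optimizer keeps pushing for larger margins between logits, which is what increases robustness for a Lipschitz-constrained network.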


**Exercise 3.1.1**

Build a LipNet using the `torchlip` layers.

In [23]:
# Small Lipschitz CNN Model

from deel.torchlip import ...


class LipCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.conv1 = ...
        self.conv2 = ...
        self.pool = ...
        self.fc1 = ...
        self.fc2 = ...
        self.activation = ...

    def forward(self, x):
        x = self.pool(self.activation(self.conv1(x)))  # -> 32 x 14 x 14
        x = self.pool(self.activation(self.conv2(x)))  # -> 64 x 7 x 7
        x = x.view(x.size(0), -1)  # flatten
        x = self.activation(self.fc1(x))
        logits = self.fc2(x)
        return logits


lip_model = LipCNN().to(device)

**Exercise 3.1.2**

Train the LipNet with a Tau-Cross-Entropy loss. Set the temperature to 10.

In [None]:
# Training LipCNN with Tau-Cross-Entropy Loss

from deel.torchlip import TauCrossEntropyLoss

# TODO: Set the loss
criterion = ...
optimizer = torch.optim.Adam(lip_model.parameters(), lr=LR)

train(lip_model, train_loader, val_loader, optimizer, criterion, device)

In [None]:
# Test Evaluation

test_loss, test_acc = evaluate(lip_model, test_loader, criterion, device)
print(f"Test Loss: {test_loss:.4f}, Test Acc: {test_acc:.4f}")

In [None]:
# Plot image and sorted logits for LipCNN

lip_model.eval()
with torch.no_grad():
    sample_images, sample_labels = next(iter(test_loader))
    sample_images, sample_labels = sample_images.to(device), sample_labels.to(device)
    sample_logits = lip_model(sample_images)

# Plot a single image
idx = 7
plt.imshow(sample_images[idx].cpu().squeeze(), cmap="gray")
plt.title(f"True Label: {sample_labels[idx].item()}, Predicted: {sample_logits[idx].argmax().item()}")

# Plot sorted logits
plot_sorted_logits(sample_logits[idx], MNIST_CLASS_NAMES)

### 3.2 Compute robustness certificates

Because the Lipschitz constant of the network is known, we can guarantee that no attack
with a budget lower than the certificate $\mathcal{M}$ can change the prediction for the
original image $x$. For a network that is 1-Lipschitz with respect to the $\ell_2$ norm,
this certificate is defined as:

$$ \mathcal{M}(x) = \frac{f_\theta(x)_{top1} - f_\theta(x)_{top2}}{\sqrt{2}} $$

This certificate is based on the difference between the two highest logits (recall the figure above where we plot the sorted logits).
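As a worked example of the formula, the sketch below computes the certificate for a single made-up logit vector in pure Python. The values and the function name `certificate` are assumptions for the illustration, and the result is only a valid guarantee if the logits come from a 1-Lipschitz network.

```python
import math


def certificate(logits):
    """Certified l2 radius for one sample: (top-1 logit - top-2 logit) / sqrt(2)."""
    top1, top2 = sorted(logits, reverse=True)[:2]
    return (top1 - top2) / math.sqrt(2)


toy_logits = [2.0, 5.0, 3.5, -1.0]  # made-up logits for a 4-class problem
radius = certificate(toy_logits)
print(f"margin = {5.0 - 3.5}, certificate = {radius:.4f}")
```

A sample whose two highest logits are equal gets a certificate of zero, meaning that no perturbation size can be certified for it.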


**Exercise 3.2.1**

Complete the function that builds the certificates.

In [27]:
# Compute Lipschitz Certificates


def compute_certificates(lip_model, images):
    """
    Computes Lipschitz certificates for the given images.

    Args:
        lip_model: the trained Lipschitz model
        images: (N, C, H, W) tensor of input images

    Returns:
        a (N,) tensor of certificates
    """

    lip_model.eval()
    with torch.no_grad():
        logits = lip_model(images.to(device))
        sorted_logits, _ = ...
        margins = ...
        certificates = ...
    return certificates.cpu()

In [None]:
# Compute certificates for a batch of test images

certificates = compute_certificates(lip_model, images)
print("Certificates:\n", certificates)
print("Certificate for image idx =", idx, ":", certificates[idx].item())

### 3.3 Run adversarial attacks on the LipNet

We can compare the empirical robustness of our LipNet with the unconstrained network.

**Exercise 3.3.1**

Run the untargeted FGM attack on the LipNet.

In [None]:
# Run FGM attack on the first batch

images, labels = next(iter(test_loader))
adv_images_lip_untargeted = ...

idx = 7
plot_adversarial_examples(lip_model, images[idx], adv_images_lip_untargeted[idx])

**Exercise 3.3.2**

Find (approximately) the budget required to change the predicted class of the image.

In [None]:
# Change epsilon budget to observe when the attack becomes successful

epsilons = ...
for eps in epsilons:
    adv_images_lip_untargeted = ...

    adv_preds = ...
    diffs = ...

    if adv_preds[idx].item() != labels[idx].item():
        print(f"✅ Attack succeeded at epsilon = {diffs:.4f}  (set budget = {eps})")
    else:
        print(f"❌ Attack failed at epsilon = {diffs:.4f}  (set budget = {eps})")

    # plot_adversarial_examples(lip_model, images[idx], adv_images_lip_untargeted[idx])

**Exercise 3.3.3**

Run the APGD attack on the LipNet.

In [None]:
# Run APGD on LipCNN

# TODO : create APGD attack instance, and run it on the first batch of test images
...


adv_images_apgd_lip = ...

In [None]:
idx = 7
plot_adversarial_examples(lip_model, images[idx], adv_images_apgd_lip[idx])

**Exercise 3.3.4**

1. Compare the empirical budget epsilon and the robustness certificate.
2. Compare the empirical robustness between the standard and the Lipschitz network.

### 3.4 Go further: Train and evaluate with different temperatures

You can re-run Sections 3.1 to 3.3 with a different temperature `tau`. We expect that
for smaller temperatures, the robustness of the LipNet increases, at the cost of a small
drop in accuracy. The temperature is an important parameter for moving along the Pareto
front of the accuracy-robustness trade-off.


**Exercise 3.4.1**

Complete the following code to train a new LipNet with a lower temperature.

In [None]:
lip_model_tau1 = LipCNN().to(device)

criterion = ...
optimizer = torch.optim.Adam(lip_model_tau1.parameters(), lr=LR)

train(lip_model_tau1, train_loader, val_loader, optimizer, criterion, device)

test_loss, test_acc = evaluate(lip_model_tau1, test_loader, criterion, device)
print(f"Test Loss: {test_loss:.4f}, Test Acc: {test_acc:.4f}")

In [None]:
lip_model_tau1.eval()
with torch.no_grad():
    sample_images, sample_labels = next(iter(test_loader))
    sample_images, sample_labels = sample_images.to(device), sample_labels.to(device)
    sample_logits = lip_model_tau1(sample_images)

# Plot a single image
idx = 7
plt.imshow(sample_images[idx].cpu().squeeze(), cmap="gray")
plt.title(f"True Label: {sample_labels[idx].item()}, Predicted: {sample_logits[idx].argmax().item()}")

# Plot sorted logits
plot_sorted_logits(sample_logits[idx], MNIST_CLASS_NAMES)

certificates = compute_certificates(lip_model_tau1, images)
print("Certificates:\n", certificates)


# Run APGD on LipCNN
...

adv_images_apgd_lip = ...
idx = 7
plot_adversarial_examples(lip_model_tau1, images[idx], adv_images_apgd_lip[idx])

## 4. Certified Robust Accuracy (CRA) on the test set

In this tutorial, we focused on attacking and certifying a single image: we attacked
with a given budget and checked whether the attack was successful, and we computed the
certificate of an image (for LipNets).

In practice, the performance of an attack or a defense is measured on a whole test set.
For attacks, we measure the empirical robust accuracy at a given budget epsilon, i.e.
the fraction of test images that remain correctly classified when attacked with that
budget.

For LipNets, we can compute the Certified Robust Accuracy (CRA), i.e. the percentage of
test images that are correctly classified and whose certificate is above a given budget
epsilon. It is a lower bound on the empirical robust accuracy measured with any
adversarial attack.
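The CRA computation can be sketched on a handful of made-up values. In the example below, a sample counts as certified only if it is correctly classified and its certificate is above the budget; the numbers and the function name `certified_robust_accuracy` are assumptions for the illustration, not results from the models trained in this tutorial.

```python
def certified_robust_accuracy(certificates, predictions, labels, epsilon):
    """Fraction of samples that are correct AND certified above the budget epsilon."""
    certified = [
        pred == label and cert > epsilon
        for cert, pred, label in zip(certificates, predictions, labels)
    ]
    return sum(certified) / len(certified)


# Made-up certificates, predictions and labels for 5 test samples
certificates = [0.9, 0.2, 1.4, 0.6, 0.05]
predictions = [3, 1, 7, 4, 9]
labels = [3, 1, 7, 2, 9]  # the fourth sample is misclassified

cra = certified_robust_accuracy(certificates, predictions, labels, epsilon=0.5)
clean_acc = certified_robust_accuracy(certificates, predictions, labels, epsilon=0.0)
print(f"CRA at epsilon=0.5: {cra:.2f} | accuracy without perturbation: {clean_acc:.2f}")
```

On real data, the same counting would be applied to the certificates and predictions collected over all batches of the test loader.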

**Exercise 4.1**

1. Compute the empirical robust accuracy on the test set using the attack and the budget
   of your choice.
2. Compute the CRA on the test set for the same budget.