# Restarts and Ensembles: Evaluating Robustness with Multiple PGD Runs

This tutorial shows how to:
- Run PGD with multiple random initializations (restarts).
- Combine several fixed-ε attack runs into ensemble evaluation metrics.

We will:
- Load the CIFAR-10 test set and a robust model from RobustBench.
- Run PGD multiple times with `random_start=True`.
- Compute robust accuracy (RA) and attack success rate (ASR) across runs using ensemble metrics.
- Use `FixedEpsilonEnsemble` to select, for each sample, the strongest adversarial example among the runs.

In [10]:
%%capture --no-stdout
try:
    import secmlt
except ImportError:
    %pip install secml-torch[foolbox,adv_lib]

try:
    import robustbench
except ImportError:
    %pip install robustbench
    %pip install git+https://github.com/fra31/auto-attack


In [11]:
# Imports
import torch
import torchvision
import torchvision.transforms as transforms
from torch.utils.data import DataLoader, Subset

from secmlt.models.pytorch.base_pytorch_nn import BasePytorchClassifier
from secmlt.metrics.classification import (
    Accuracy,
    AttackSuccessRate,
    AccuracyEnsemble,
    EnsembleSuccessRate,
)
from secmlt.adv.backends import Backends
from secmlt.adv.evasion.perturbation_models import LpPerturbationModels
from secmlt.adv.evasion.pgd import PGD
from secmlt.adv.evasion.aggregators.ensemble import FixedEpsilonEnsemble

from robustbench.utils import load_model

device = "cuda" if torch.cuda.is_available() else "cpu"
dataset_path = "data/datasets/"  # relative to this notebook's folder
print(f"Using device: {device}")

Using device: cpu


## Data and Robust Model (CIFAR-10)

We load a small CIFAR-10 test subset and a robust model from RobustBench (L∞ threat model),
then wrap it with SecML‑Torch's `BasePytorchClassifier`.

In [12]:
%%capture --no-stdout

# Load CIFAR-10 test subset
transform = transforms.Compose([transforms.ToTensor()])
test_dataset = torchvision.datasets.CIFAR10(
    root=dataset_path, train=False, download=True, transform=transform
)
num_samples = 20
batch_size = num_samples // 2
test_subset = Subset(test_dataset, list(range(num_samples)))
test_loader = DataLoader(test_subset, batch_size=batch_size, shuffle=False)
print(f"Loaded {len(test_subset)} samples from CIFAR-10 test set")

# Load a robust model from RobustBench (L∞) and wrap it
net = load_model(model_name="Gowal2021Improving_R18_ddpm_100m", dataset="cifar10", threat_model="Linf")
net = net.to(device).eval()
model = BasePytorchClassifier(net)
print("Loaded robust model: Gowal2021Improving_R18_ddpm_100m (CIFAR-10, Linf)")

# Baseline accuracy on clean data
clean_acc = Accuracy()(model, test_loader)
print(f"Clean accuracy: {clean_acc.item():.4f} ({clean_acc.item() * 100:.2f}%)")

Files already downloaded and verified
Loaded 20 samples from CIFAR-10 test set
Loaded robust model: Gowal2021Improving_R18_ddpm_100m (CIFAR-10, Linf)
Clean accuracy: 1.0000 (100.00%)


## PGD with Random Initialization

Because the loss landscape is non-convex, PGD can stall in a poor local optimum that depends on its starting point. Restarting from several random initializations mitigates this and often increases attack success.
Here we configure PGD (fixed-ε, L∞) and run it several times with `random_start=True`.
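
To make the role of the random start concrete, here is a minimal PyTorch sketch of untargeted L∞ PGD. It is an illustration under stated assumptions, not SecML-Torch's implementation: the function `pgd_linf`, the toy linear model, and the random inputs are all hypothetical, and inputs are assumed to lie in [0, 1].

```python
import torch
import torch.nn.functional as F


def pgd_linf(model, x, y, epsilon, step_size, num_steps, random_start=True):
    """Illustrative untargeted L-inf PGD for inputs in [0, 1] (hypothetical helper)."""
    delta = torch.zeros_like(x)
    if random_start:
        # Start from a uniformly random point inside the epsilon-ball.
        delta.uniform_(-epsilon, epsilon)
    # Keep the starting point a valid image.
    delta = (x + delta).clamp(0, 1) - x
    for _ in range(num_steps):
        delta.requires_grad_(True)
        loss = F.cross_entropy(model(x + delta), y)
        (grad,) = torch.autograd.grad(loss, delta)
        with torch.no_grad():
            # Ascend the loss, then project onto the epsilon-ball and the [0, 1] box.
            delta = (delta + step_size * grad.sign()).clamp(-epsilon, epsilon)
            delta = (x + delta).clamp(0, 1) - x
    return (x + delta).detach()


# Toy setup (assumption): a random linear classifier on random 8x8 RGB inputs.
torch.manual_seed(0)
toy_model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 8 * 8, 10)).eval()
x = torch.rand(4, 3, 8, 8)
y = torch.randint(0, 10, (4,))
eps = 8 / 255

# Three restarts: each call draws a fresh random starting point.
x_adv_runs = [pgd_linf(toy_model, x, y, eps, 4 / 255, 20) for _ in range(3)]
```

Each restart only changes the initial `delta`; the ascent and projection steps are identical, so any difference between runs comes from where the search begins.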

In [13]:
# PGD configuration (L∞)
epsilon = 8 / 255     # Max L∞ perturbation
num_steps = 20        # PGD iterations
step_size = 4 / 255   # Step size per iteration
perturbation_model = LpPerturbationModels.LINF

print("Attack configuration:")
print(f"  - Epsilon: {epsilon:.4f} ({epsilon * 255:.1f}/255)")
print(f"  - Steps:   {num_steps}")
print(f"  - Step sz: {step_size:.4f} ({step_size * 255:.1f}/255)")

pgd = PGD(
    perturbation_model=perturbation_model,
    epsilon=epsilon,
    num_steps=num_steps,
    step_size=step_size,
    random_start=True,
    backend=Backends.NATIVE,
)
print("PGD (native) ready with random_start=True")

Attack configuration:
  - Epsilon: 0.0314 (8.0/255)
  - Steps:   20
  - Step sz: 0.0157 (4.0/255)
PGD (native) ready with random_start=True


In [14]:
# Single PGD run
adv_loader_single = pgd(model, test_loader)
acc_single = Accuracy()(model, adv_loader_single)
asr_single = AttackSuccessRate()(model, adv_loader_single)
print("=== Single-run PGD ===")
print(f"RA (single):  {acc_single.item():.4f} ({acc_single.item() * 100:.2f}%)")
print(f"ASR (single): {asr_single.item():.4f} ({asr_single.item() * 100:.2f}%)")

=== Single-run PGD ===
RA (single):  0.6500 (65.00%)
ASR (single): 0.3500 (35.00%)


### Multiple Restarts: Evaluation Across Runs

We now perform several runs (restarts) and compute ensemble metrics across them:
- `AccuracyEnsemble` gives the robust accuracy across runs.
- `EnsembleSuccessRate` gives the attack success rate across runs.
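
The recorded output below labels these values "worst-case", which suggests the following aggregation: a sample counts as robust only if it survives every run, and as a success if any run fools the model. The pure-Python sketch below illustrates that reading. It is an assumption about the semantics, not the library's code, and `worst_case_metrics` and the outcome table are hypothetical.

```python
def worst_case_metrics(correct_per_run):
    """correct_per_run[r][i] is True if sample i is still classified correctly after run r."""
    num_samples = len(correct_per_run[0])
    # A sample is robust only if it survives every run.
    robust = [all(run[i] for run in correct_per_run) for i in range(num_samples)]
    # Untargeted case: a misclassification in any run counts as a success.
    fooled = [any(not run[i] for run in correct_per_run) for i in range(num_samples)]
    return sum(robust) / num_samples, sum(fooled) / num_samples


# Hypothetical outcomes for 3 runs on 5 samples (not taken from this notebook).
runs = [
    [True, True, False, True, True],
    [True, False, False, True, True],
    [True, True, False, False, True],
]
per_run_ra = [sum(run) / len(run) for run in runs]
ensemble_ra, ensemble_asr = worst_case_metrics(runs)

print(per_run_ra)    # [0.8, 0.6, 0.6]
print(ensemble_ra)   # 0.4
print(ensemble_asr)  # 0.6
```

Under this reading the ensemble robust accuracy can never exceed the lowest single-run accuracy, and it is strictly lower whenever different runs break different samples, as in the toy table.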

In [15]:
num_runs = 3
adv_loaders = []
for i in range(num_runs):
    print(f"Running PGD restart {i+1}/{num_runs}...")
    adv_loaders.append(pgd(model, test_loader))
    # Per-run robust accuracy (separate name so the earlier single-run result is not overwritten)
    acc_run = Accuracy()(model, adv_loaders[i])
    print(f"Single run: accuracy {acc_run.item():.4f} ({acc_run.item() * 100:.2f}%)")

ra_ensemble = AccuracyEnsemble()(model, adv_loaders)
asr_ensemble = EnsembleSuccessRate()(model, adv_loaders)

print("=== Ensemble over multiple PGD runs ===")
print(f"RA (ensemble worst-case):  {ra_ensemble.item():.4f} ({ra_ensemble.item() * 100:.2f}%)")
print(f"ASR (ensemble worst-case): {asr_ensemble.item():.4f} ({asr_ensemble.item() * 100:.2f}%)")

Running PGD restart 1/3...
Single run: accuracy 0.6500 (65.00%)
Running PGD restart 2/3...
Single run: accuracy 0.6500 (65.00%)
Running PGD restart 3/3...
Single run: accuracy 0.6500 (65.00%)
=== Ensemble over multiple PGD runs ===
RA (ensemble worst-case):  0.6500 (65.00%)
ASR (ensemble worst-case): 0.3500 (35.00%)


## Fixed-ε Ensembling: Select Strongest Adversarial per Sample

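For fixed-ε attacks like PGD, every run produces a candidate within the same perturbation budget, so we can build a per-sample ensemble that keeps, for each input, the candidate with the highest loss among the runs (the strongest one for an untargeted attack).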
This yields a new dataloader containing the *selected* adversarial examples across runs.
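
The selection step can be sketched directly on tensors. This is a hedged illustration of per-sample "highest loss wins" for the untargeted case, not the internals of `FixedEpsilonEnsemble`: the helper `select_strongest`, the toy model, and the random stand-in candidates are hypothetical.

```python
import torch
import torch.nn.functional as F


def select_strongest(model, y, adv_runs):
    """For each sample, keep the candidate with the highest cross-entropy loss."""
    candidates = torch.stack(adv_runs)  # (num_runs, batch, C, H, W)
    with torch.no_grad():
        # Per-sample loss of every run: shape (num_runs, batch).
        losses = torch.stack(
            [F.cross_entropy(model(x_adv), y, reduction="none") for x_adv in adv_runs]
        )
    best_run = losses.argmax(dim=0)  # index of the winning run for each sample
    batch_idx = torch.arange(y.shape[0])
    return candidates[best_run, batch_idx], best_run, losses


# Toy setup (assumption): random tensors stand in for the outputs of three PGD runs.
torch.manual_seed(0)
toy_model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 8 * 8, 10)).eval()
y = torch.randint(0, 10, (4,))
adv_runs = [torch.rand(4, 3, 8, 8) for _ in range(3)]

best_advs, best_run, losses = select_strongest(toy_model, y, adv_runs)
print(best_advs.shape)  # torch.Size([4, 3, 8, 8])
```

For a targeted attack the direction would flip: the strongest candidate is the one with the lowest loss with respect to the target label.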

In [16]:
# Untargeted setting: keep, per sample, the candidate that maximizes the cross-entropy loss
ensemble = FixedEpsilonEnsemble(loss_fn=torch.nn.CrossEntropyLoss(), maximize=True, y_target=None)
best_advs_loader = ensemble(model, test_loader, adv_loaders)

ra_best = Accuracy()(model, best_advs_loader)
asr_best = AttackSuccessRate()(model, best_advs_loader)

print("=== Fixed-ε ensemble selection (per-sample strongest) ===")
print(f"RA (best-advs):  {ra_best.item():.4f} ({ra_best.item() * 100:.2f}%)")
print(f"ASR (best-advs): {asr_best.item():.4f} ({asr_best.item() * 100:.2f}%)")

=== Fixed-ε ensemble selection (per-sample strongest) ===
RA (best-advs):  0.6500 (65.00%)
ASR (best-advs): 0.3500 (35.00%)
