# What is Adversarial Training?

Adversarial training is a defense strategy in which **adversarial examples are generated during training**, and the model is trained on these perturbed samples.  
By repeatedly exposing the model to attacks, it learns to become **robust** against them.

---

# Why Do We Need It?

Deep neural networks are highly vulnerable to **adversarial perturbations**—small, often imperceptible changes to the input that cause incorrect predictions.

Adversarial training helps the model:

- Learn a smoother and more stable loss landscape  
- Become more resistant to adversarial attacks  
- Improve robustness (usually at a slight cost to clean accuracy)

---

# PGD Adversarial Training (Madry et al.)

The most widely used and strongest baseline for adversarial robustness is **PGD adversarial training**. In this way, for each batch during training:

1. Start with a slightly perturbed version of the input  
2. Apply multiple PGD steps to craft a strong adversarial example  
3. Train the model using these adversarial samples  

This training update is commonly expressed as:
$
\theta_{t+1} = \theta_t - \eta \, \nabla_{\theta} \, L\big(f_{\theta}(x_{\text{adv}}),\, y\big)
$, where $x_{\text{adv}}$ was created by PGD.


# Adversarial Training Scenarios

Below are the different training strategies explored in this notebook. Each scenario demonstrates a distinct way of incorporating adversarial examples into the training process.

Let's begin by importing the necessary libraries and setting up initial configurations:




In [None]:
import copy
import math
import random
import time
from tqdm import tqdm
from typing import Tuple, Callable

import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
from torch.utils.data import DataLoader, Subset

import torchvision
import torchvision.transforms as transforms
from torchvision.models import resnet18

In [None]:
# --- Reproducibility helpers ---
def set_seed(seed: int = 0):
    random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

# --- Device ---
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# --- Seed Setting (!!! do not change the seed !!!) ---
SEED = 2025
set_seed(SEED)

# Data / training hyperparameters
BATCH_SIZE = 128
LR = 0.1
MOMENTUM = 0.9
WEIGHT_DECAY = 5e-4

# 20 clean epochs, 20 std-adv epochs
EPOCHS_CLEAN = 20
EPOCHS_ADV_STD = 20      # standard PGD AT (for counting budget)
EPOCHS_ADV_MAX = 50      # max epochs for other ADV variants
EPOCHS_SEQ_CLEAN = 20    # clean phase epochs for sequential

EPS = 8 / 255.0
PGD_STEPS = 8
PGD_STEP_SIZE = 2 / 255.0
M = 4

In [None]:
# --- Data preparation ---
transform_train = transforms.Compose([
                                      transforms.RandomCrop(32, padding=4),
                                      transforms.RandomHorizontalFlip(),
                                      transforms.ToTensor()]
                                     )
transform_test = transforms.Compose([transforms.ToTensor()])

trainset = torchvision.datasets.CIFAR10(root='./data', train=True, download=True, transform=transform_train)
testset = torchvision.datasets.CIFAR10(root='./data', train=False, download=True, transform=transform_test)

indices_train = torch.randperm(len(trainset), generator=torch.Generator().manual_seed(SEED))[:30000]
indices_test  = torch.randperm(len(testset), generator=torch.Generator().manual_seed(SEED))[:5000]

train_subset = Subset(trainset, indices_train)
test_subset = Subset(testset, indices_test)

trainloader = DataLoader(train_subset, batch_size=BATCH_SIZE, shuffle=True, num_workers=2)
testloader = DataLoader(test_subset, batch_size=BATCH_SIZE, shuffle=False, num_workers=2)

100%|██████████| 170M/170M [00:18<00:00, 9.39MB/s]


In [None]:
# --- Model & Optimizer helpers ---
def make_model(num_classes=10):
    model = resnet18(pretrained=False)
    model.conv1 = nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1, bias=False)
    model.maxpool = nn.Identity()
    model.fc = nn.Linear(model.fc.in_features, num_classes)
    return model

def make_optimizer(model):
    optimizer = optim.SGD(
        model.parameters(),
        lr=LR,
        momentum=MOMENTUM,
        weight_decay=WEIGHT_DECAY
    )
    return optimizer

**Hint:** In our adversarial training setup, the goal of the budget system is to measure or constrain **only the expensive computations**, namely the gradient-based operations performed during PGD steps and the model’s backward pass. Since `.backward()` and `autograd.grad()` calls dominate the computational cost of adversarial training, the budget should be tied specifically to those operations.

In **standard adversarial training**, where we simply want to track how much compute is used without interrupting training, we use **budget_mode="count"** and call `budget_add` every time a gradient is computed.

In contrast, when we want to enforce a strict compute limit we use **budget_mode="consume"**, calling `budget_sub` after each gradient computation and stopping early once the budget is exhausted.



In [None]:
# --- Budget helpers ---
def make_budget(initial=0):
    """Create a mutable budget object."""
    return {"value": int(initial)}

def budget_add(budget, n=1):
    """Increases the budget counter."""
    if budget is not None:
        budget["value"] += int(n)

def budget_sub(budget, n=1):
    """Decreases the budget counter."""
    if budget is not None:
        budget["value"] -= int(n)

def budget_left(budget):
    """Returns the remaining budget, or None when budget is disabled."""
    return None if budget is None else int(budget["value"])

def budget_exhausted(budget):
    """Checks whether the budget is finished."""
    return (budget is not None) and (budget["value"] <= 0)

In [None]:
# --- PGD attack implementation ---
def pgd_attack(
    model: nn.Module,
    x: torch.Tensor,
    y: torch.Tensor,
    eps: float = EPS,
    alpha: float = PGD_STEP_SIZE,
    steps: int = PGD_STEPS,
    budget=None,
    mode: str = None,
) -> torch.Tensor:
    """
    model : model to attack
    x : clean input batch
    y : labels for x
    eps : max L∞ perturbation
    alpha : PGD step size
    steps : number of PGD iterations
    budget : optional budget object for AT
    mode:
      - None      : ignore budget (for evaluation, etc.)
      - "count"   : budget++ for every autograd.grad (std adv training)
      - "consume" : budget-- for every autograd.grad, stop early if budget <= 0
    """
    model.eval()

    x_adv = x.detach() + torch.empty_like(x).uniform_(-eps, eps)
    x_adv = torch.clamp(x_adv, 0.0, 1.0)

    for _ in range(steps):
        if mode == "consume" and budget_exhausted(budget):
            break

        x_adv.requires_grad_(True)
        logits = model(x_adv)
        loss = F.cross_entropy(logits, y)

        grad = torch.autograd.grad(loss, x_adv)[0]

        if mode == "count":
            budget_add(budget)
        elif mode == "consume":
            budget_sub(budget)

        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps)
        x_adv = torch.clamp(x_adv, 0.0, 1.0)

    return x_adv.detach()

In [None]:
# --- Evaluation functions ---
def evaluate_clean(model: nn.Module, dataloader: DataLoader) -> Tuple[float, float]:
    model.eval()
    correct = 0
    total = 0

    with torch.no_grad():
        for x, y in dataloader:
            x, y = x.to(device), y.to(device)
            pred = model(x).argmax(dim=1)
            correct += (pred == y).sum().item()
            total += y.size(0)

    acc = correct / total
    return acc

def evaluate_adv(model: nn.Module, dataloader: DataLoader, attack_fn: Callable = pgd_attack, eps: float = EPS) -> Tuple[float, float]:
    model.eval()
    correct = 0
    total = 0

    for x, y in dataloader:
        x, y = x.to(device), y.to(device)
        x_adv = attack_fn(model, x, y, eps=eps)
        with torch.no_grad():
            pred = model(x_adv).argmax(dim=1)
        correct += (pred == y).sum().item()
        total += y.size(0)

    acc = correct / total
    return acc

In [None]:
# ---Prepare identical initialization ---
set_seed(SEED)
base_model = make_model().to(device)
initial_state = copy.deepcopy(base_model.state_dict())



## 1) Baseline: Train a Clean Model, Then Apply PGD Attack

In this scenario, we train a model **only on clean data**, with no adversarial examples involved.  
After training, we evaluate the model under a PGD attack.

In this scenario, our goal is to show how a standard (non-robust) model suffers *accuracy collapse* when exposed to strong adversarial perturbations. This serves as the reference point against which all adversarial training methods are compared.

In [None]:
def train_one_epoch_clean(model, loader, optimizer, budget=None):
    """
    Standard clean training on clean data only.
    """
    model.train()
    total_loss = 0.0

    for x, y in loader:
        x, y = x.to(device), y.to(device)

        optimizer.zero_grad()
        loss = F.cross_entropy(model(x), y)
        loss.backward()

        if budget is not None:
            budget_sub(budget, n=1)

        optimizer.step()
        total_loss += loss.item()

    return total_loss / len(loader)

In [None]:
# Train clean model
model_clean = make_model().to(device)
model_clean.load_state_dict(copy.deepcopy(initial_state))
opt_clean = make_optimizer(model_clean)

print('Training clean baseline model...')
for ep in tqdm(range(1, EPOCHS_CLEAN + 1)):
    loss = train_one_epoch_clean(model_clean, trainloader, opt_clean)
    print(f"Epoch {ep}/{EPOCHS_CLEAN} | clean loss: {loss:.4f}")

torch.save(model_clean.state_dict(), "model_clean.pth")

clean_acc = evaluate_clean(model_clean, testloader)
print(f"\nClean model test accuracy: {clean_acc:.4f}")

Training clean baseline model...


  5%|▌         | 1/20 [00:17<05:38, 17.81s/it]

Epoch 1/20 | clean loss: 2.1135


 10%|█         | 2/20 [00:34<05:10, 17.22s/it]

Epoch 2/20 | clean loss: 1.5996


 15%|█▌        | 3/20 [00:51<04:49, 17.05s/it]

Epoch 3/20 | clean loss: 1.3770


 20%|██        | 4/20 [01:08<04:31, 16.97s/it]

Epoch 4/20 | clean loss: 1.1702


 25%|██▌       | 5/20 [01:25<04:13, 16.91s/it]

Epoch 5/20 | clean loss: 0.9978


 30%|███       | 6/20 [01:41<03:56, 16.87s/it]

Epoch 6/20 | clean loss: 0.8617


 35%|███▌      | 7/20 [01:58<03:38, 16.84s/it]

Epoch 7/20 | clean loss: 0.7529


 40%|████      | 8/20 [02:15<03:21, 16.83s/it]

Epoch 8/20 | clean loss: 0.6768


 45%|████▌     | 9/20 [02:32<03:05, 16.82s/it]

Epoch 9/20 | clean loss: 0.6163


 50%|█████     | 10/20 [02:49<02:48, 16.81s/it]

Epoch 10/20 | clean loss: 0.5714


 55%|█████▌    | 11/20 [03:05<02:31, 16.83s/it]

Epoch 11/20 | clean loss: 0.5496


 60%|██████    | 12/20 [03:22<02:14, 16.83s/it]

Epoch 12/20 | clean loss: 0.5255


 65%|██████▌   | 13/20 [03:39<01:57, 16.84s/it]

Epoch 13/20 | clean loss: 0.5058


 70%|███████   | 14/20 [03:56<01:41, 16.85s/it]

Epoch 14/20 | clean loss: 0.4836


 75%|███████▌  | 15/20 [04:13<01:24, 16.84s/it]

Epoch 15/20 | clean loss: 0.4683


 80%|████████  | 16/20 [04:30<01:07, 16.84s/it]

Epoch 16/20 | clean loss: 0.4579


 85%|████████▌ | 17/20 [04:46<00:50, 16.83s/it]

Epoch 17/20 | clean loss: 0.4442


 90%|█████████ | 18/20 [05:03<00:33, 16.84s/it]

Epoch 18/20 | clean loss: 0.4353


 95%|█████████▌| 19/20 [05:20<00:16, 16.83s/it]

Epoch 19/20 | clean loss: 0.4248


100%|██████████| 20/20 [05:37<00:00, 16.87s/it]

Epoch 20/20 | clean loss: 0.4200






Clean model test accuracy: 0.7388


In [None]:
adv_acc = evaluate_adv(model_clean, testloader)
print(f"Under PGD attack (8/255) -> adversarial accuracy: {adv_acc:.4f}")

Under PGD attack (8/255) -> adversarial accuracy: 0.0000


### Now let's apply a weaker PGD attack to this cleanly trained model and observe the difference…

In [None]:
adv_acc_eps4 = evaluate_adv(model_clean, testloader, eps=4/255)
print(f"Under PGD attack (4/255) -> adversarial accuracy: {adv_acc_eps4:.4f}")

Under PGD attack (4/255) -> adversarial accuracy: 0.0012


In [None]:
adv_acc_eps2 = evaluate_adv(model_clean, testloader, eps=2/255)
print(f"Under PGD attack (2/255) -> adversarial accuracy: {adv_acc_eps2:.4f}")

Under PGD attack (2/255) -> adversarial accuracy: 0.0486


###### **Question:** What can you infer from the adversarial accuracy presented above?

clean trained model has high accuracy on unperturbed data but is extremely vulnerable to adversarial attacks and even small perturbations can make a significant drop in accuracy and showing that standard training alone does not provide robustness

## 2) Standard Adversarial Training (PGD)

Using the **same initial weights as the clean model**, we train the model using **PGD-generated adversarial examples** at every step.

**How it works:**
1. For each batch, run PGD to generate $ x_{\text{adv}} $.  
2. Compute the loss on these adversarial samples.  
3. Backpropagate and update model parameters.

In this scenarion, we want to obtain the classical *PGD-adversarially trained model*.


In [None]:
def train_one_epoch_adv_standard(
    model,
    loader,
    optimizer,
    attack_fn=pgd_attack,
    budget=None,
    budget_mode=None,   # "count" or "consume" or None
):
    """
    Standard PGD adversarial training on adversarial examples only.
    model : neural network to train
    loader : training dataloader
    optimizer : optimizer used for updating model weights
    attack_fn : adversarial attack function (default: pgd_attack)
    budget : optional budget object for tracking compute
    budget_mode :
    - When budget_mode=="count": budget++ for autograd.grad in PGD and for backward().
    - When budget_mode=="consume": budget-- for both and stop when exhausted.
    """
    model.train()
    total_loss = 0.0

    for x, y in loader:
        if budget_mode == "consume" and budget_exhausted(budget):
            break

        x, y = x.to(device), y.to(device)

        model.eval()
        x_adv = attack_fn(
            model, x, y,
            budget=budget,
            mode=budget_mode
        )
        model.train()

        optimizer.zero_grad()
        loss = F.cross_entropy(model(x_adv), y)
        loss.backward()

        if budget_mode == "count":
            budget_add(budget)
        elif budget_mode == "consume":
            budget_sub(budget)

        optimizer.step()
        total_loss += loss.item()

    return total_loss / len(loader)

In [None]:
# --- Standard adversarial training starting from SAME initial weights ---
model_adv_standard = make_model().to(device)
model_adv_standard.load_state_dict(copy.deepcopy(initial_state))
opt_adv_std = make_optimizer(model_adv_standard)
std_budget = make_budget(0) # This budget will COUNT all backward() and autograd.grad() calls

print('Training standard adversarial training (PGD) from same init...')
for ep in tqdm(range(1, EPOCHS_ADV_STD + 1)):
    loss = train_one_epoch_adv_standard(model_adv_standard, trainloader, opt_adv_std, budget=std_budget, budget_mode="count")
    print(f"Epoch {ep}/{EPOCHS_ADV_STD} | adv loss: {loss:.4f}")

total_budget = std_budget["value"]
print(f"\nTotal gradient budget from standard PGD AT: {total_budget}")

torch.save(model_adv_standard.state_dict(), "model_adv_standard.pth")

Training standard adversarial training (PGD) from same init...


  5%|▌         | 1/20 [01:46<33:50, 106.89s/it]

Epoch 1/20 | adv loss: 2.4045


 10%|█         | 2/20 [03:33<32:03, 106.89s/it]

Epoch 2/20 | adv loss: 2.0986


 15%|█▌        | 3/20 [05:20<30:17, 106.89s/it]

Epoch 3/20 | adv loss: 2.0085


 20%|██        | 4/20 [07:07<28:30, 106.89s/it]

Epoch 4/20 | adv loss: 1.9476


 25%|██▌       | 5/20 [08:54<26:43, 106.88s/it]

Epoch 5/20 | adv loss: 1.9028


 30%|███       | 6/20 [10:41<24:56, 106.88s/it]

Epoch 6/20 | adv loss: 1.8561


 35%|███▌      | 7/20 [12:28<23:09, 106.87s/it]

Epoch 7/20 | adv loss: 1.8261


 40%|████      | 8/20 [14:15<21:22, 106.87s/it]

Epoch 8/20 | adv loss: 1.7821


 45%|████▌     | 9/20 [16:01<19:35, 106.87s/it]

Epoch 9/20 | adv loss: 1.7369


 50%|█████     | 10/20 [17:48<17:48, 106.87s/it]

Epoch 10/20 | adv loss: 1.7104


 55%|█████▌    | 11/20 [19:35<16:01, 106.87s/it]

Epoch 11/20 | adv loss: 1.6795


 60%|██████    | 12/20 [21:22<14:14, 106.86s/it]

Epoch 12/20 | adv loss: 1.6503


 65%|██████▌   | 13/20 [23:09<12:27, 106.85s/it]

Epoch 13/20 | adv loss: 1.6254


 70%|███████   | 14/20 [24:56<10:41, 106.86s/it]

Epoch 14/20 | adv loss: 1.6028


 75%|███████▌  | 15/20 [26:43<08:54, 106.85s/it]

Epoch 15/20 | adv loss: 1.5840


 80%|████████  | 16/20 [28:29<07:07, 106.84s/it]

Epoch 16/20 | adv loss: 1.5692


 85%|████████▌ | 17/20 [30:16<05:20, 106.84s/it]

Epoch 17/20 | adv loss: 1.5514


 90%|█████████ | 18/20 [32:03<03:33, 106.84s/it]

Epoch 18/20 | adv loss: 1.5407


 95%|█████████▌| 19/20 [33:50<01:46, 106.86s/it]

Epoch 19/20 | adv loss: 1.5271


100%|██████████| 20/20 [35:37<00:00, 106.86s/it]

Epoch 20/20 | adv loss: 1.5138

Total gradient budget from standard PGD AT: 42300





In [None]:
acc_clean_std = evaluate_clean(model_adv_standard, testloader)
acc_adv_std = evaluate_adv(model_adv_standard, testloader)
print(f"\nStandard adv-trained model -> clean acc: {acc_clean_std:.4f} | adv acc: {acc_adv_std:.4f}")


Standard adv-trained model -> clean acc: 0.6426 | adv acc: 0.3536


### Let's examine how the attack’s epsilon ball influences the robust model’s accuracy...


In [None]:
acc_adv_std_eps4 = evaluate_adv(model_adv_standard, testloader, eps=4/255)
print(f"Under PGD attack (4/255) -> adversarial accuracy: {acc_adv_std_eps4:.4f}")

Under PGD attack (4/255) -> adversarial accuracy: 0.5006


In [None]:
acc_adv_std_eps2 = evaluate_adv(model_adv_standard, testloader, eps=2/255)
print(f"Under PGD attack (2/255) -> adversarial accuracy: {acc_adv_std_eps2:.4f}")

Under PGD attack (2/255) -> adversarial accuracy: 0.5728


## 3) Joint Clean + Adversarial Loss

For each clean batch:
1. Generate adversarial examples using PGD or FGSM.  
2. Compute **clean loss** and **adversarial loss**.  
3. Backpropagate on their **mean**:
$
L_{\text{total}} = \frac{1}{2}\big(L(x, y) + L(x_{\text{adv}}, y)\big)
$

In this case we wnat to train a model that balances *clean accuracy* and *robust accuracy*, by learning from both types of data simultaneously.


In [None]:
def train_one_epoch_mean_clean_adv(
    model,
    loader,
    optimizer,
    attack_fn=pgd_attack,
    budget=None,
):
    """
    One backward per batch on 0.5*(loss_clean + loss_adv).
    In adv training (mean), we *consume* budget: both PGD grads and backward() are charged.
    """
    model.train()
    total_loss = 0.0

    for x, y in loader:
        if budget_exhausted(budget):
            break

        x, y = x.to(device), y.to(device)

        x_adv = attack_fn(model, x, y, budget=budget, mode="consume")

        optimizer.zero_grad()
        loss_clean = F.cross_entropy(model(x), y)
        loss_clean.backward()
        budget_sub(budget)

        loss_adv = F.cross_entropy(model(x_adv), y)
        loss_adv.backward()
        budget_sub(budget)

        optimizer.step()
        total_loss += 0.5 * (loss_clean.item() + loss_adv.item())

    return total_loss / len(loader)

In [None]:
# --- Mean (clean, adv) under fixed budget ---
model_mean = make_model().to(device)
model_mean.load_state_dict(copy.deepcopy(initial_state))
opt_mean = make_optimizer(model_mean)
mean_budget = make_budget(total_budget)

print('Training on mean(clean loss, adv loss) under fixed budget...')
for ep in tqdm(range(1, EPOCHS_ADV_MAX + 1)):
    if budget_exhausted(mean_budget):
        print(f"Budget exhausted before epoch {ep}.")
        break

    loss = train_one_epoch_mean_clean_adv(
        model_mean,
        trainloader,
        opt_mean,
        budget=mean_budget
    )

    print(
        f"Epoch {ep}/{EPOCHS_ADV_MAX} | mean loss: {loss:.4f} "
        f"| budget left: {budget_left(mean_budget)}"
    )

    if budget_exhausted(mean_budget):
        print("Budget exhausted, stopping mean training.")
        break

torch.save(model_mean.state_dict(), "model_mean.pth")

Training on mean(clean loss, adv loss) under fixed budget...


  2%|▏         | 1/50 [02:02<1:40:09, 122.64s/it]

Epoch 1/50 | mean loss: nan | budget left: 39950


  4%|▍         | 2/50 [04:05<1:38:07, 122.65s/it]

Epoch 2/50 | mean loss: nan | budget left: 37600


  6%|▌         | 3/50 [06:07<1:36:04, 122.64s/it]

Epoch 3/50 | mean loss: nan | budget left: 35250


  8%|▊         | 4/50 [08:10<1:34:01, 122.64s/it]

Epoch 4/50 | mean loss: nan | budget left: 32900


 10%|█         | 5/50 [10:13<1:31:58, 122.64s/it]

Epoch 5/50 | mean loss: nan | budget left: 30550


 12%|█▏        | 6/50 [12:15<1:29:56, 122.64s/it]

Epoch 6/50 | mean loss: nan | budget left: 28200


 14%|█▍        | 7/50 [14:18<1:27:53, 122.65s/it]

Epoch 7/50 | mean loss: nan | budget left: 25850


 16%|█▌        | 8/50 [16:21<1:25:50, 122.64s/it]

Epoch 8/50 | mean loss: nan | budget left: 23500


 18%|█▊        | 9/50 [18:23<1:23:47, 122.63s/it]

Epoch 9/50 | mean loss: nan | budget left: 21150


 20%|██        | 10/50 [20:26<1:21:45, 122.63s/it]

Epoch 10/50 | mean loss: nan | budget left: 18800


 22%|██▏       | 11/50 [22:28<1:19:42, 122.63s/it]

Epoch 11/50 | mean loss: nan | budget left: 16450


 24%|██▍       | 12/50 [24:31<1:17:39, 122.63s/it]

Epoch 12/50 | mean loss: nan | budget left: 14100


 26%|██▌       | 13/50 [26:34<1:15:36, 122.61s/it]

Epoch 13/50 | mean loss: nan | budget left: 11750


 28%|██▊       | 14/50 [28:36<1:13:33, 122.61s/it]

Epoch 14/50 | mean loss: nan | budget left: 9400


 30%|███       | 15/50 [30:39<1:11:31, 122.62s/it]

Epoch 15/50 | mean loss: nan | budget left: 7050


 32%|███▏      | 16/50 [32:42<1:09:29, 122.62s/it]

Epoch 16/50 | mean loss: nan | budget left: 4700


 34%|███▍      | 17/50 [34:44<1:07:26, 122.63s/it]

Epoch 17/50 | mean loss: nan | budget left: 2350


 34%|███▍      | 17/50 [36:47<1:11:24, 129.85s/it]

Epoch 18/50 | mean loss: nan | budget left: 0
Budget exhausted, stopping mean training.





In [6]:
acc_clean_mean = evaluate_clean(model_mean, testloader)
acc_adv_mean = evaluate_adv(model_mean, testloader)

print(f"\nMean-loss model -> clean acc: {acc_clean_mean:.4f} | adv acc: {acc_adv_mean:.4f}")


Mean-loss model -> clean acc: 0.7328 | adv acc: 0.3103


## 4) Sequential Training: Clean First → Adversarial Later

Training is split into two distinct phases:

- **Phase 1:** Train the model normally on clean data only.  
- **Phase 2:** Continue training using only adversarial examples.

Here we start with a well-performing clean model, then adapt it to robustness.  


In [None]:
# --- Sequential training: clean phase then adversarial phase ---
model_seq = make_model().to(device)
model_seq.load_state_dict(copy.deepcopy(initial_state))
opt_seq = make_optimizer(model_seq)
seq_budget = make_budget(total_budget)

print('Sequential training (clean -> adv) under fixed budget...')

# 1) Clean phase: fixed number of epochs, consume budget per backward
for ep_clean in tqdm(range(1, EPOCHS_SEQ_CLEAN + 1)):
    if budget_exhausted(seq_budget):
        print(f"Budget exhausted during clean phase before epoch {ep_clean}.")
        break

    loss_clean = train_one_epoch_clean(
        model_seq,
        trainloader,
        opt_seq,
        budget=seq_budget
    )


    print(
        f"Sequential clean phase | epoch {ep_clean}/{EPOCHS_SEQ_CLEAN} "
        f"| loss: {loss_clean:.4f} | budget left: {budget_left(seq_budget)}"
    )

    if budget_exhausted(seq_budget):
        print("Budget exhausted at end of clean phase.")
        break

print('Switching to adversarial phase...')

# 2) Adversarial phase: PGD training consuming whatever budget is left
for ep_adv in tqdm(range(1, EPOCHS_ADV_MAX + 1)):
    if budget_exhausted(seq_budget):
        print(f"Budget exhausted before adversarial epoch {ep_adv}.")
        break

    loss_adv = train_one_epoch_adv_standard(model_seq, trainloader, opt_seq, budget=seq_budget, budget_mode="consume")

    print(
        f"Sequential adv phase | epoch {ep_adv}/{EPOCHS_ADV_MAX} "
        f"| loss: {loss_adv:.4f} | budget left: {budget_left(seq_budget)}"
    )

    if budget_exhausted(seq_budget):
        print("Budget exhausted in adversarial phase, stopping.")
        break

torch.save(model_seq.state_dict(), "model_seq.pth")


Sequential training (clean -> adv) under fixed budget...


  5%|▌         | 1/20 [00:16<05:20, 16.88s/it]

Sequential clean phase | epoch 1/20 | loss: 1.9955 | budget left: 42065


 10%|█         | 2/20 [00:33<05:03, 16.88s/it]

Sequential clean phase | epoch 2/20 | loss: 1.5221 | budget left: 41830


 15%|█▌        | 3/20 [00:50<04:46, 16.85s/it]

Sequential clean phase | epoch 3/20 | loss: 1.2624 | budget left: 41595


 20%|██        | 4/20 [01:07<04:29, 16.84s/it]

Sequential clean phase | epoch 4/20 | loss: 1.0474 | budget left: 41360


 25%|██▌       | 5/20 [01:24<04:12, 16.84s/it]

Sequential clean phase | epoch 5/20 | loss: 0.9059 | budget left: 41125


 30%|███       | 6/20 [01:41<03:55, 16.84s/it]

Sequential clean phase | epoch 6/20 | loss: 0.7869 | budget left: 40890


 35%|███▌      | 7/20 [01:57<03:38, 16.84s/it]

Sequential clean phase | epoch 7/20 | loss: 0.7115 | budget left: 40655


 40%|████      | 8/20 [02:14<03:22, 16.83s/it]

Sequential clean phase | epoch 8/20 | loss: 0.6344 | budget left: 40420


 45%|████▌     | 9/20 [02:31<03:05, 16.83s/it]

Sequential clean phase | epoch 9/20 | loss: 0.5965 | budget left: 40185


 50%|█████     | 10/20 [02:48<02:48, 16.83s/it]

Sequential clean phase | epoch 10/20 | loss: 0.5525 | budget left: 39950


 55%|█████▌    | 11/20 [03:05<02:31, 16.82s/it]

Sequential clean phase | epoch 11/20 | loss: 0.5218 | budget left: 39715


 60%|██████    | 12/20 [03:22<02:14, 16.82s/it]

Sequential clean phase | epoch 12/20 | loss: 0.5137 | budget left: 39480


 65%|██████▌   | 13/20 [03:38<01:57, 16.82s/it]

Sequential clean phase | epoch 13/20 | loss: 0.4840 | budget left: 39245


 70%|███████   | 14/20 [03:55<01:40, 16.82s/it]

Sequential clean phase | epoch 14/20 | loss: 0.4763 | budget left: 39010


 75%|███████▌  | 15/20 [04:12<01:24, 16.82s/it]

Sequential clean phase | epoch 15/20 | loss: 0.4455 | budget left: 38775


 80%|████████  | 16/20 [04:29<01:07, 16.82s/it]

Sequential clean phase | epoch 16/20 | loss: 0.4426 | budget left: 38540


 85%|████████▌ | 17/20 [04:46<00:50, 16.83s/it]

Sequential clean phase | epoch 17/20 | loss: 0.4391 | budget left: 38305


 90%|█████████ | 18/20 [05:02<00:33, 16.84s/it]

Sequential clean phase | epoch 18/20 | loss: 0.4281 | budget left: 38070


 95%|█████████▌| 19/20 [05:19<00:16, 16.84s/it]

Sequential clean phase | epoch 19/20 | loss: 0.4181 | budget left: 37835


100%|██████████| 20/20 [05:36<00:00, 16.83s/it]


Sequential clean phase | epoch 20/20 | loss: 0.4085 | budget left: 37600
Switching to adversarial phase...


  2%|▏         | 1/50 [01:46<1:27:14, 106.82s/it]

Sequential adv phase | epoch 1/50 | loss: 2.0543 | budget left: 35485


  4%|▍         | 2/50 [03:33<1:25:26, 106.81s/it]

Sequential adv phase | epoch 2/50 | loss: 1.7544 | budget left: 33370


  6%|▌         | 3/50 [05:20<1:23:40, 106.83s/it]

Sequential adv phase | epoch 3/50 | loss: 1.6660 | budget left: 31255


  8%|▊         | 4/50 [07:07<1:21:54, 106.83s/it]

Sequential adv phase | epoch 4/50 | loss: 1.6220 | budget left: 29140


 10%|█         | 5/50 [08:54<1:20:07, 106.83s/it]

Sequential adv phase | epoch 5/50 | loss: 1.5826 | budget left: 27025


 12%|█▏        | 6/50 [10:40<1:18:20, 106.84s/it]

Sequential adv phase | epoch 6/50 | loss: 1.5624 | budget left: 24910


 14%|█▍        | 7/50 [12:27<1:16:34, 106.84s/it]

Sequential adv phase | epoch 7/50 | loss: 1.5409 | budget left: 22795


 16%|█▌        | 8/50 [14:14<1:14:47, 106.85s/it]

Sequential adv phase | epoch 8/50 | loss: 1.5217 | budget left: 20680


 18%|█▊        | 9/50 [16:01<1:13:00, 106.85s/it]

Sequential adv phase | epoch 9/50 | loss: 1.5037 | budget left: 18565


 20%|██        | 10/50 [17:48<1:11:14, 106.86s/it]

Sequential adv phase | epoch 10/50 | loss: 1.4935 | budget left: 16450


 22%|██▏       | 11/50 [19:35<1:09:27, 106.86s/it]

Sequential adv phase | epoch 11/50 | loss: 1.4796 | budget left: 14335


 24%|██▍       | 12/50 [21:22<1:07:40, 106.86s/it]

Sequential adv phase | epoch 12/50 | loss: 1.4627 | budget left: 12220


 26%|██▌       | 13/50 [23:09<1:05:53, 106.86s/it]

Sequential adv phase | epoch 13/50 | loss: 1.4619 | budget left: 10105


 28%|██▊       | 14/50 [24:55<1:04:07, 106.86s/it]

Sequential adv phase | epoch 14/50 | loss: 1.4469 | budget left: 7990


 30%|███       | 15/50 [26:42<1:02:20, 106.87s/it]

Sequential adv phase | epoch 15/50 | loss: 1.4468 | budget left: 5875


 32%|███▏      | 16/50 [28:29<1:00:33, 106.86s/it]

Sequential adv phase | epoch 16/50 | loss: 1.4312 | budget left: 3760


 34%|███▍      | 17/50 [30:16<58:46, 106.85s/it]  

Sequential adv phase | epoch 17/50 | loss: 1.4246 | budget left: 1645


 34%|███▍      | 17/50 [31:39<1:01:27, 111.75s/it]

Sequential adv phase | epoch 18/50 | loss: 1.0926 | budget left: -1
Budget exhausted in adversarial phase, stopping.





In [None]:
acc_clean_seq = evaluate_clean(model_seq, testloader)
acc_adv_seq = evaluate_adv(model_seq, testloader)
print(f"\nSequential model -> clean acc: {acc_clean_seq:.4f} | adv acc: {acc_adv_seq:.4f}")


Sequential model -> clean acc: 0.6678 | adv acc: 0.3798


## 5) Alternating Training: Clean Batch ↔ Adversarial Batch

During each epoch, training alternates between:

- one clean batch  
- one adversarial batch  
- one clean batch  
- one adversarial batch  
- …and so on

In this scenario, we expose the model to both clean and adversarial data *within the same training phase*, allowing it to maintain clean performance while becoming robust.

In [None]:
def train_one_epoch_alternating(model, loader, optimizer, budget=None):
    """
    For each batch:
      - one clean step
      - one adv step (with PGD)
    Both steps consume budget (PGD autograd.grad + both backwards).
    """
    model.train()
    total_loss_clean = 0.0
    total_loss_adv = 0.0

    for x, y in loader:
        if budget_exhausted(budget):
            break

        x, y = x.to(device), y.to(device)

        # clean step
        optimizer.zero_grad()
        loss_clean = F.cross_entropy(model(x), y)
        loss_clean.backward()
        budget_sub(budget)
        optimizer.step()
        total_loss_clean += loss_clean.item()

        if budget_exhausted(budget):
            break

        # adversarial step
        x_adv = pgd_attack(model, x, y, budget=budget, mode="consume")
        optimizer.zero_grad()
        loss_adv = F.cross_entropy(model(x_adv), y)
        loss_adv.backward()
        budget_sub(budget)
        optimizer.step()
        total_loss_adv += loss_adv.item()

    return (
        total_loss_clean / len(loader),
        total_loss_adv / len(loader),
    )

In [None]:
# --- Alternating training: one clean batch, one adv batch ---
model_alt = make_model().to(device)
model_alt.load_state_dict(copy.deepcopy(initial_state))
opt_alt = make_optimizer(model_alt)
alt_budget = make_budget(total_budget)

print('Alternating training (clean + adv batches) under fixed budget...')
for ep in tqdm(range(1, EPOCHS_ADV_MAX + 1)):
    if budget_exhausted(alt_budget):
        print(f"Budget exhausted before epoch {ep}.")
        break

    clean_loss, adv_loss = train_one_epoch_alternating(
        model_alt,
        trainloader,
        opt_alt,
        budget=alt_budget
    )

    print(
        f"Epoch {ep}/{EPOCHS_ADV_MAX} | clean loss: {clean_loss:.4f} "
        f"| adv loss: {adv_loss:.4f} | budget left: {budget_left(alt_budget)}"
    )

    if budget_exhausted(alt_budget):
        print("Budget exhausted, stopping alternating training.")
        break

torch.save(model_alt.state_dict(), "model_alt.pth")


Alternating training (clean + adv batches) under fixed budget...


  2%|▏         | 1/50 [02:02<1:40:12, 122.71s/it]

Epoch 1/50 | clean loss: 2.3073 | adv loss: 2.3140 | budget left: 39950


  4%|▍         | 2/50 [04:05<1:38:09, 122.69s/it]

Epoch 2/50 | clean loss: 2.3396 | adv loss: 2.3049 | budget left: 37600


  6%|▌         | 3/50 [06:08<1:36:07, 122.71s/it]

Epoch 3/50 | clean loss: 2.3601 | adv loss: 2.3024 | budget left: 35250


  8%|▊         | 4/50 [08:10<1:34:03, 122.69s/it]

Epoch 4/50 | clean loss: 2.3365 | adv loss: 2.3050 | budget left: 32900


 10%|█         | 5/50 [10:13<1:32:01, 122.69s/it]

Epoch 5/50 | clean loss: 2.3135 | adv loss: 2.3056 | budget left: 30550


 12%|█▏        | 6/50 [12:16<1:29:58, 122.70s/it]

Epoch 6/50 | clean loss: 2.3080 | adv loss: 2.3056 | budget left: 28200


 14%|█▍        | 7/50 [14:18<1:27:55, 122.69s/it]

Epoch 7/50 | clean loss: 2.3069 | adv loss: 2.3059 | budget left: 25850


 16%|█▌        | 8/50 [16:21<1:25:52, 122.68s/it]

Epoch 8/50 | clean loss: 2.3065 | adv loss: 2.3058 | budget left: 23500


 18%|█▊        | 9/50 [18:24<1:23:50, 122.68s/it]

Epoch 9/50 | clean loss: 2.3069 | adv loss: 2.3062 | budget left: 21150


 20%|██        | 10/50 [20:26<1:21:48, 122.70s/it]

Epoch 10/50 | clean loss: 2.3059 | adv loss: 2.3054 | budget left: 18800


 22%|██▏       | 11/50 [22:29<1:19:45, 122.71s/it]

Epoch 11/50 | clean loss: 2.3061 | adv loss: 2.3054 | budget left: 16450


 24%|██▍       | 12/50 [24:32<1:17:43, 122.72s/it]

Epoch 12/50 | clean loss: 2.3066 | adv loss: 2.3060 | budget left: 14100


 26%|██▌       | 13/50 [26:35<1:15:40, 122.72s/it]

Epoch 13/50 | clean loss: 2.3067 | adv loss: 2.3060 | budget left: 11750


 28%|██▊       | 14/50 [28:37<1:13:37, 122.72s/it]

Epoch 14/50 | clean loss: 2.3058 | adv loss: 2.3052 | budget left: 9400


 30%|███       | 15/50 [30:40<1:11:35, 122.73s/it]

Epoch 15/50 | clean loss: 2.3066 | adv loss: 2.3059 | budget left: 7050


 32%|███▏      | 16/50 [32:43<1:09:32, 122.72s/it]

Epoch 16/50 | clean loss: 2.3057 | adv loss: 2.3050 | budget left: 4700


 34%|███▍      | 17/50 [34:46<1:07:29, 122.72s/it]

Epoch 17/50 | clean loss: 2.3060 | adv loss: 2.3053 | budget left: 2350


 34%|███▍      | 17/50 [36:48<1:11:27, 129.93s/it]

Epoch 18/50 | clean loss: 2.3061 | adv loss: 2.3054 | budget left: 0
Budget exhausted, stopping alternating training.





In [3]:
acc_clean_alt = evaluate_clean(model_alt, testloader)
acc_adv_alt = evaluate_adv(model_alt, testloader)
print(f"\nAlternating model -> clean acc: {acc_clean_alt:.4f} | adv acc: {acc_adv_alt:.4f}")


Alternating model -> clean acc: 0.6645 | adv acc: 0.2632


In [4]:
# --- Summary (print final table) ---
from tabulate import tabulate

rows = [
    ['Baseline clean (no adv training)', f"{clean_acc:.4f}", f"{adv_acc:.4f}"],
    ['Standard PGD adv-train', f"{acc_clean_std:.4f}", f"{acc_adv_std:.4f}"],
    ['Mean(clean, adv)', f"{acc_clean_mean:.4f}", f"{acc_adv_mean:.4f}"],
    ['Sequential (clean -> adv)', f"{acc_clean_seq:.4f}", f"{acc_adv_seq:.4f}"],
    ['Alternating batches', f"{acc_clean_alt:.4f}", f"{acc_adv_alt:.4f}"],
]
print('\nFinal comparison table:')
print(tabulate(rows, headers=['Scenario', 'Clean Acc', 'Adv Acc']))


Final comparison table:
Scenario                            Clean Acc    Adv Acc
--------------------------------  -----------  ---------
Baseline clean (no adv training)       0.7388     0
Standard PGD adv-train                 0.6426     0.3536
Mean(clean, adv)                       0.7319     0.3061
Sequential (clean -> adv)              0.6678     0.3798
Alternating batches                    0.6622     0.2708


###### **Question:** Based on the results from all above adversarial training scenarios, compare the key features of each method — including clean accuracy, adversarial accuracy (under PGD), robust–clean accuracy trade-off, and training stability. Which adversarial training strategy provides the best balance between clean and robust performance?

baseline clean: high clean accuracy but very low adversarial accuracy

standard PGD adversarial training: high adversarial robustness and moderate drop in clean accuracy so it is a stable training

mean (clean + adv loss): high clean accuracy with moderate robustness and a good trade off between clean and robust performance

sequential (clean -> adv): slightly better robustness than standard PGD but lower clean accuracy so clean first phase helps stabilize training

alternating batches: balanced approach that clean accuracy is slightly lower than sequential and adversarial accuracy is moderate so maintains stable training while exposing model continuously to both clean and adversarial data

Best balance: if the goal is overall trade off, mean (clean + adv) is the best, keeping clean accuracy high while improving robustness but if maximizing robustness is the priority, sequential training slightly outperforms others in adversarial accuracy

# Adversarial Training For Free ([Shafahi et al.](https://arxiv.org/abs/1904.12843))

**Free Adversarial Training** is an efficient method designed to achieve adversarial robustness **without increasing training cost** compared to standard (clean) training.

Traditional adversarial training (especially PGD-based) is expensive because it requires multiple gradient steps **per batch** to generate adversarial examples.  
Free Adversarial Training avoids this overhead using two key ideas:

#### **1) Reuse the same batch multiple times**
Instead of performing many PGD steps *inside a single batch*, the same batch is processed for **multiple mini-steps** (called “free” steps), effectively simulating PGD iterations across repeated passes.

#### **2) Use gradient ascent on the input simultaneously with gradient descent on the weights**
Each training step does two things:

- **Update the adversarial example**  
  by performing gradient ascent on the input (using the backward pass you already computed)

- **Update the model parameters**  
  using the same backward pass (gradient descent as usual)

This means **one backward pass per step**, instead of many per PGD attack. Let's implement this method:


In [None]:
def train_one_epoch_free(
    model,
    loader,
    optimizer,
    m: int = 4,
    eps: float = EPS,
    alpha: float = PGD_STEP_SIZE,
    budget=None,
):
    model.train()
    total_loss = 0.0

    for x, y in loader:
        if budget_exhausted(budget):
            break

        x, y = x.to(device), y.to(device)
        delta = torch.zeros_like(x).uniform_(-eps, eps).to(device)

        for _ in range(m):
            if budget_exhausted(budget):
                break

            delta.requires_grad_(True)
            optimizer.zero_grad()

            logits = model(x + delta)
            loss = F.cross_entropy(logits, y)
            loss.backward()

            budget_sub(budget)

            grad = delta.grad.detach()
            delta = delta + alpha * grad.sign()
            delta = torch.clamp(delta, -eps, eps).detach()

            optimizer.step()
            total_loss += loss.item()

    return total_loss / (len(loader) * m)

In [None]:
# --- Free adversarial training under fixed budget ---
model_adv_free = make_model().to(device)
model_adv_free.load_state_dict(copy.deepcopy(initial_state))
opt_adv_free = make_optimizer(model_adv_free)
free_budget = make_budget(total_budget)

print(f'Training Free-AT with m={M} from same init under fixed budget...')
for ep in tqdm(range(1, EPOCHS_ADV_MAX + 1)):
    if budget_exhausted(free_budget):
        print(f"Budget exhausted before epoch {ep}.")
        break

    loss = train_one_epoch_free(
        model_adv_free,
        trainloader,
        opt_adv_free,
        m=M,
        budget=free_budget
    )

    print(
        f"Epoch {ep}/{EPOCHS_ADV_MAX} | free-adv loss: {loss:.4f} "
        f"| budget left: {budget_left(free_budget)}"
    )

    if budget_exhausted(free_budget):
        print("Budget exhausted, stopping Free-AT training.")
        break

torch.save(model_adv_free.state_dict(), "model_adv_free.pth")

Training Free-AT with m=4 from same init under fixed budget...


  2%|▏         | 1/50 [01:06<53:56, 66.05s/it]

Epoch 1/50 | free-adv loss: 1.9729 | budget left: 41360


  4%|▍         | 2/50 [02:12<52:50, 66.05s/it]

Epoch 2/50 | free-adv loss: 1.7091 | budget left: 40420


  6%|▌         | 3/50 [03:18<51:44, 66.06s/it]

Epoch 3/50 | free-adv loss: 1.5688 | budget left: 39480


  8%|▊         | 4/50 [04:24<50:38, 66.05s/it]

Epoch 4/50 | free-adv loss: 1.4486 | budget left: 38540


 10%|█         | 5/50 [05:30<49:32, 66.05s/it]

Epoch 5/50 | free-adv loss: 1.3825 | budget left: 37600


 12%|█▏        | 6/50 [06:36<48:26, 66.05s/it]

Epoch 6/50 | free-adv loss: 1.3298 | budget left: 36660


 14%|█▍        | 7/50 [07:42<47:20, 66.05s/it]

Epoch 7/50 | free-adv loss: 1.2932 | budget left: 35720


 16%|█▌        | 8/50 [08:48<46:14, 66.05s/it]

Epoch 8/50 | free-adv loss: 1.2550 | budget left: 34780


 18%|█▊        | 9/50 [09:54<45:07, 66.04s/it]

Epoch 9/50 | free-adv loss: 1.2342 | budget left: 33840


 20%|██        | 10/50 [11:00<44:01, 66.04s/it]

Epoch 10/50 | free-adv loss: 1.2242 | budget left: 32900


 22%|██▏       | 11/50 [12:06<42:55, 66.05s/it]

Epoch 11/50 | free-adv loss: 1.1967 | budget left: 31960


 24%|██▍       | 12/50 [13:12<41:49, 66.05s/it]

Epoch 12/50 | free-adv loss: 1.1937 | budget left: 31020


 26%|██▌       | 13/50 [14:18<40:43, 66.04s/it]

Epoch 13/50 | free-adv loss: 1.1742 | budget left: 30080


 28%|██▊       | 14/50 [15:24<39:38, 66.06s/it]

Epoch 14/50 | free-adv loss: 1.1647 | budget left: 29140


 30%|███       | 15/50 [16:30<38:31, 66.05s/it]

Epoch 15/50 | free-adv loss: 1.1598 | budget left: 28200


 32%|███▏      | 16/50 [17:36<37:25, 66.05s/it]

Epoch 16/50 | free-adv loss: 1.1547 | budget left: 27260


 34%|███▍      | 17/50 [18:42<36:20, 66.06s/it]

Epoch 17/50 | free-adv loss: 1.1648 | budget left: 26320


 36%|███▌      | 18/50 [19:48<35:13, 66.06s/it]

Epoch 18/50 | free-adv loss: 1.1482 | budget left: 25380


 38%|███▊      | 19/50 [20:54<34:07, 66.06s/it]

Epoch 19/50 | free-adv loss: 1.1439 | budget left: 24440


 40%|████      | 20/50 [22:00<33:01, 66.04s/it]

Epoch 20/50 | free-adv loss: 1.1399 | budget left: 23500


 42%|████▏     | 21/50 [23:06<31:54, 66.03s/it]

Epoch 21/50 | free-adv loss: 1.1377 | budget left: 22560


 44%|████▍     | 22/50 [24:13<30:48, 66.03s/it]

Epoch 22/50 | free-adv loss: 1.1334 | budget left: 21620


 46%|████▌     | 23/50 [25:19<29:43, 66.04s/it]

Epoch 23/50 | free-adv loss: 1.1256 | budget left: 20680


 48%|████▊     | 24/50 [26:25<28:37, 66.04s/it]

Epoch 24/50 | free-adv loss: 1.1234 | budget left: 19740


 50%|█████     | 25/50 [27:31<27:30, 66.03s/it]

Epoch 25/50 | free-adv loss: 1.1296 | budget left: 18800


 52%|█████▏    | 26/50 [28:37<26:24, 66.03s/it]

Epoch 26/50 | free-adv loss: 1.1286 | budget left: 17860


 54%|█████▍    | 27/50 [29:43<25:20, 66.09s/it]

Epoch 27/50 | free-adv loss: 1.1261 | budget left: 16920


 56%|█████▌    | 28/50 [30:49<24:15, 66.14s/it]

Epoch 28/50 | free-adv loss: 1.1259 | budget left: 15980


 58%|█████▊    | 29/50 [31:55<23:09, 66.17s/it]

Epoch 29/50 | free-adv loss: 1.1256 | budget left: 15040


 60%|██████    | 30/50 [33:02<22:03, 66.19s/it]

Epoch 30/50 | free-adv loss: 1.1169 | budget left: 14100


 62%|██████▏   | 31/50 [34:08<20:57, 66.21s/it]

Epoch 31/50 | free-adv loss: 1.1164 | budget left: 13160


 64%|██████▍   | 32/50 [35:14<19:51, 66.21s/it]

Epoch 32/50 | free-adv loss: 1.1192 | budget left: 12220


 66%|██████▌   | 33/50 [36:20<18:45, 66.22s/it]

Epoch 33/50 | free-adv loss: 1.1165 | budget left: 11280


 68%|██████▊   | 34/50 [37:27<17:39, 66.23s/it]

Epoch 34/50 | free-adv loss: 1.1208 | budget left: 10340


 70%|███████   | 35/50 [38:33<16:33, 66.23s/it]

Epoch 35/50 | free-adv loss: 1.1119 | budget left: 9400


 72%|███████▏  | 36/50 [39:39<15:27, 66.24s/it]

Epoch 36/50 | free-adv loss: 1.1156 | budget left: 8460


 74%|███████▍  | 37/50 [40:45<14:21, 66.23s/it]

Epoch 37/50 | free-adv loss: 1.1171 | budget left: 7520


 76%|███████▌  | 38/50 [41:52<13:14, 66.23s/it]

Epoch 38/50 | free-adv loss: 1.1128 | budget left: 6580


 78%|███████▊  | 39/50 [42:58<12:08, 66.24s/it]

Epoch 39/50 | free-adv loss: 1.1157 | budget left: 5640


 80%|████████  | 40/50 [44:04<11:02, 66.24s/it]

Epoch 40/50 | free-adv loss: 1.1097 | budget left: 4700


 82%|████████▏ | 41/50 [45:10<09:56, 66.24s/it]

Epoch 41/50 | free-adv loss: 1.1132 | budget left: 3760


 84%|████████▍ | 42/50 [46:16<08:49, 66.23s/it]

Epoch 42/50 | free-adv loss: 1.1189 | budget left: 2820


 86%|████████▌ | 43/50 [47:23<07:43, 66.23s/it]

Epoch 43/50 | free-adv loss: 1.1089 | budget left: 1880


 88%|████████▊ | 44/50 [48:29<06:37, 66.23s/it]

Epoch 44/50 | free-adv loss: 1.1066 | budget left: 940


 88%|████████▊ | 44/50 [49:35<06:45, 67.63s/it]

Epoch 45/50 | free-adv loss: 1.1102 | budget left: 0
Budget exhausted, stopping Free-AT training.





In [None]:
acc_clean_free = evaluate_clean(model_adv_free, testloader)
acc_adv_free = evaluate_adv(model_adv_free, testloader)
print(f"\nAdversarial Training For Free model -> clean acc: {acc_clean_free:.4f} | adv acc: {acc_adv_free:.4f}")


Adversarial Training For Free model -> clean acc: 0.5894 | adv acc: 0.1640
