## Task 1: Baseline Model Evaluation
This section loads a pre‑trained ResNet‑34 model, prepares the TestDataSet with the same normalization used during training, and computes the model’s top‑1 and top‑5 accuracy on the original dataset. It sets the model to evaluation mode, iterates through the DataLoader, maps folder indices to true ImageNet labels, and reports the baseline performance.

In [17]:
import os
import json
import numpy as np
import torch
from torch.utils.data import DataLoader
import torchvision
import torchvision.transforms as transforms
from torchvision.datasets import ImageFolder
from tqdm import tqdm

# Set device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

# Load pre-trained ResNet-34 model
pretrained_model = torchvision.models.resnet34(weights='IMAGENET1K_V1')
pretrained_model = pretrained_model.to(device)
pretrained_model.eval()  # Switch to evaluation mode

# Data preprocessing
mean_norms = np.array([0.485, 0.456, 0.406])
std_norms  = np.array([0.229, 0.224, 0.225])
plain_transforms = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=mean_norms, std=std_norms)
])

# Update dataset path for Kaggle
dataset_path = "../input/testdataset/TestDataSet"
dataset = ImageFolder(root=dataset_path, transform=plain_transforms)

# Create DataLoader
dataloader = DataLoader(dataset, batch_size=32, shuffle=False)

# Load label mapping from JSON
labels_file = "../input/testdataset/TestDataSet/labels_list.json"
with open(labels_file, 'r') as f:
    labels_map = json.load(f)

# Parse label mapping into a dictionary: integer ID -> label name
id_to_label = {}
for entry in labels_map:
    # Split on the first ": " only, so a name containing ": " stays intact
    idx, name = entry.split(": ", 1)
    id_to_label[int(idx)] = name

# Map folder index (0–99) to ImageNet ID (0–999).
# Assumes the entries in labels_list.json appear in the same order as the
# sorted class folders in dataset.classes.
imagenet_ids = list(id_to_label.keys())
folder_to_imagenet_id = {
    idx: imagenet_ids[idx] for idx in range(len(dataset.classes))
}

# Define evaluation function for top-k accuracy
def evaluate_model(model, dataloader, device, top_k=(1, 5)):
    model.eval()
    correct = {k: 0 for k in top_k}
    total = 0

    with torch.no_grad():
        for images, folder_labels in tqdm(dataloader):
            images = images.to(device)
            folder_labels = folder_labels.to(device)

            # Convert folder index to actual ImageNet label
            imagenet_labels = torch.tensor(
                [folder_to_imagenet_id[int(fl)] for fl in folder_labels],
                device=device
            )

            outputs = model(images)
            # Compute top-k predictions
            _, preds = outputs.topk(max(top_k), dim=1, largest=True, sorted=True)
            preds = preds.t()

            for k in top_k:
                correct_k = preds[:k].eq(imagenet_labels.view(1, -1).expand_as(preds[:k]))
                correct[k] += correct_k.reshape(-1).float().sum().item()

            total += images.size(0)

    # Calculate accuracy percentages
    accuracy = {k: (correct[k] / total) * 100.0 for k in top_k}
    return accuracy

# Evaluate the original ResNet-34 model
print("Evaluating original ResNet-34 model...")
accuracy = evaluate_model(pretrained_model, dataloader, device, top_k=(1, 5))

# Print results
print("ResNet-34 performance on the test dataset:")
print(f"Top-1 Accuracy: {accuracy[1]:.2f}%")
print(f"Top-5 Accuracy: {accuracy[5]:.2f}%")


Using device: cuda
Evaluating original ResNet-34 model...


100%|██████████| 16/16 [00:01<00:00, 12.06it/s]

ResNet-34 performance on the test dataset:
Top-1 Accuracy: 76.00%
Top-5 Accuracy: 94.20%
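
The evaluation above rests on two small mechanisms: parsing `"<id>: <name>"` strings into an ID lookup, and counting a sample as correct when its true label is among the k highest scores. The following is a minimal plain-Python sketch of both, not the notebook's implementation. The three label entries and the toy score matrix are illustrative assumptions, not values read from `labels_list.json`.

```python
# Hypothetical entries in the "<id>: <name>" format that the parsing loop expects.
entries = ["401: accordion", "402: acoustic guitar", "403: aircraft carrier"]

# Split only on the first ": " so a name that itself contains ": " stays intact.
id_to_label = {}
for entry in entries:
    idx, name = entry.split(": ", 1)
    id_to_label[int(idx)] = name

# Folder index i maps to the i-th ID, assuming the list is stored in the
# same order as the sorted class folders.
folder_to_id = dict(enumerate(id_to_label))


def topk_accuracy(scores, labels, k):
    """Percentage of rows whose true label is among the k highest scores."""
    hits = 0
    for row, label in zip(scores, labels):
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        hits += label in ranked[:k]
    return 100.0 * hits / len(labels)


# Toy scores over 4 classes for 3 samples.
scores = [
    [0.1, 0.7, 0.1, 0.1],  # true class 1: ranked first
    [0.4, 0.3, 0.2, 0.1],  # true class 1: ranked second
    [0.5, 0.2, 0.2, 0.1],  # true class 3: ranked last
]
labels = [1, 1, 3]
top1 = topk_accuracy(scores, labels, k=1)
top2 = topk_accuracy(scores, labels, k=2)
print(f"top-1: {top1:.2f}%, top-2: {top2:.2f}%")
```

In this toy case one of three samples is a top-1 hit and two of three are top-2 hits, which mirrors how `evaluate_model` can report a top-5 accuracy well above top-1.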





## Task 2: Single-Step FGSM Attack (L∞)
Here we implement the Fast Gradient Sign Method (FGSM) to generate **Adversarial Test Set 1**. For each input batch, we compute the gradient of the cross‑entropy loss with respect to the normalized image, add a perturbation of magnitude ε in the direction of the gradient’s sign, and clamp the result to the normalized‑space bounds that correspond to raw pixel values in [0, 1]. Because ε is applied to the normalized tensor, it is measured in normalized units, not raw pixel units. We then evaluate how much the attack degrades the model’s top‑1 and top‑5 accuracy.

In [18]:
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset
from tqdm import tqdm

# --------- 1. FGSM attack function --------- #
def fgsm_attack(model, x, y, epsilon, clamp_min, clamp_max):
    """
    Performs a single-step FGSM attack on a batch of inputs.

    Parameters:
    - model: the evaluation model
    - x: input tensor batch, already normalized (shape=[B,3,224,224])
    - y: corresponding true ImageNet labels (tensor shape=[B])
    - epsilon: attack budget (maximum perturbation magnitude)
    - clamp_min, clamp_max: per-channel lower/upper bounds in normalized space (tensor shape=[3,1,1])

    Returns:
    - A tensor of adversarial examples (shape=[B,3,224,224])
    """
    x_adv = x.clone().detach().to(device)
    x_adv.requires_grad = True

    outputs = model(x_adv)
    loss = nn.CrossEntropyLoss()(outputs, y)
    model.zero_grad()
    loss.backward()

    # Add epsilon * sign(gradient) to the inputs
    x_adv = x_adv + epsilon * x_adv.grad.sign()
    # Clamp to the valid range
    x_adv = torch.clamp(x_adv, clamp_min, clamp_max)

    return x_adv.detach()

# --------- 2. Precompute constants --------- #
epsilon = 0.02  # attack budget, measured in normalized units

# Normalization mean/std (must match Task 1)
mean = torch.tensor([0.485, 0.456, 0.406], device=device).view(3, 1, 1)
std  = torch.tensor([0.229, 0.224, 0.225], device=device).view(3, 1, 1)

# Ensure that after denormalization, pixel values fall in [0,1]
clamp_min = (0.0 - mean) / std
clamp_max = (1.0 - mean) / std

# --------- 3. Generate adversarial samples for the entire test set --------- #
adv_images_list = []
adv_labels_list = []

pretrained_model.eval()
for images, folder_labels in tqdm(dataloader, desc=f"FGSM Attack (ε={epsilon})"):
    images = images.to(device)
    folder_labels = folder_labels.to(device)

    # Convert folder index (0–99) to actual ImageNet label (0–999)
    imagenet_labels = torch.tensor(
        [folder_to_imagenet_id[int(idx)] for idx in folder_labels],
        device=device
    )

    adv_batch = fgsm_attack(
        pretrained_model,
        images,
        imagenet_labels,
        epsilon,
        clamp_min,
        clamp_max
    )

    adv_images_list.append(adv_batch)
    adv_labels_list.append(folder_labels)

# Concatenate all batches and move to CPU
adv_images_all = torch.cat(adv_images_list, dim=0).cpu()
adv_labels_all = torch.cat(adv_labels_list, dim=0).cpu()

# --------- 4. Build adversarial dataset and evaluate --------- #
adv_dataset = TensorDataset(adv_images_all, adv_labels_all)
adv_loader  = DataLoader(
    adv_dataset,
    batch_size=32,
    shuffle=False,
    num_workers=2,
    pin_memory=True
)

adv_accuracy = evaluate_model(
    pretrained_model,
    adv_loader,
    device,
    top_k=(1, 5)
)

print(f"\n=== Adversarial Test Set 1 (ε={epsilon}) ===")
print(f"Top-1 Accuracy: {adv_accuracy[1]:.2f}%")
print(f"Top-5 Accuracy: {adv_accuracy[5]:.2f}%")


FGSM Attack (ε=0.02): 100%|██████████| 16/16 [00:01<00:00, 11.34it/s]
100%|██████████| 16/16 [00:00<00:00, 26.53it/s]


=== Adversarial Test Set 1 (ε=0.02) ===
Top-1 Accuracy: 6.20%
Top-5 Accuracy: 35.40%
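
Since the attack perturbs the normalized tensor, it can help to translate the budget back into raw pixel units. The sketch below recomputes the per-channel clamp bounds from the Task 1 constants and converts ε = 0.02 into its equivalent on the 0–255 intensity scale. It is a worked calculation in plain Python under the assumption that a normalized step of ε moves a raw pixel by ε · std.

```python
# Normalization constants from Task 1.
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]
epsilon = 0.02  # budget in normalized units, as passed to fgsm_attack

# Bounds that raw pixel values 0 and 1 map to after normalization.
clamp_min = [(0.0 - m) / s for m, s in zip(mean, std)]
clamp_max = [(1.0 - m) / s for m, s in zip(mean, std)]

# A step of epsilon in normalized space moves a raw pixel by epsilon * std.
eps_pixel = [epsilon * s for s in std]
eps_8bit = [e * 255 for e in eps_pixel]

for ch, lo, hi, e8 in zip("RGB", clamp_min, clamp_max, eps_8bit):
    print(f"{ch}: normalized range [{lo:.3f}, {hi:.3f}], epsilon = {e8:.2f}/255")
```

Under these constants the budget works out to a little over one 8-bit intensity level per channel, so the perturbation is visually very small.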





## Task 3: Iterative PGD Attack (L∞)
This block introduces a multi‑step PGD (Projected Gradient Descent) attack to create **Adversarial Test Set 2**. We split the L∞ budget ε into `num_steps` equal steps of size α = ε / `num_steps`. Each iteration takes a signed gradient‑ascent step on the cross‑entropy loss, projects the result back into the ε‑ball around the original input, and clamps it to the valid pixel range. We then evaluate the resulting adversarial examples on the same ResNet‑34 model to measure the additional drop in accuracy relative to FGSM.

In [19]:
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset
from tqdm import tqdm

# --------- 1. PGD attack function --------- #
def pgd_attack(model, x, y, epsilon, alpha, num_steps, clamp_min, clamp_max):
    """
    Performs a multi-step PGD attack on a batch of inputs.

    Parameters:
    - model: the evaluation model
    - x: original inputs, already normalized (tensor of shape [B,3,224,224])
    - y: true ImageNet labels for each input (tensor of shape [B])
    - epsilon: L-infinity perturbation budget
    - alpha: step size for each iteration
    - num_steps: total number of attack iterations
    - clamp_min, clamp_max: per-channel lower/upper bounds in normalized space

    Returns:
    - A tensor of adversarial examples (shape [B,3,224,224])
    """
    # Create a copy of the input and ensure it is on the correct device
    x_adv = x.clone().detach().to(device)
    # Optional random initialization within the epsilon ball:
    # x_adv = x_adv + torch.empty_like(x_adv).uniform_(-epsilon, epsilon)
    # x_adv = torch.clamp(x_adv, clamp_min, clamp_max)

    for _ in range(num_steps):
        x_adv.requires_grad = True
        outputs = model(x_adv)
        loss = nn.CrossEntropyLoss()(outputs, y)
        model.zero_grad()
        loss.backward()
        # Compute the sign of the gradient
        grad_sign = x_adv.grad.sign()
        # Update adversarial example by a small step in the gradient direction
        x_adv = x_adv + alpha * grad_sign
        # Project back into the valid L-infinity ball around the original input
        x_adv = torch.max(torch.min(x_adv, x + epsilon), x - epsilon)
        # Clamp to ensure the values remain valid after normalization
        x_adv = torch.clamp(x_adv, clamp_min, clamp_max).detach()

    return x_adv

# --------- 2. Hyperparameters & precomputations --------- #
epsilon   = 0.02                      # L-infinity perturbation budget
num_steps = 10                        # number of PGD iterations
alpha     = epsilon / num_steps      # step size per iteration
device    = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Normalization mean and std (must match Task 1 preprocessing)
mean = torch.tensor([0.485, 0.456, 0.406], device=device).view(3, 1, 1)
std  = torch.tensor([0.229, 0.224, 0.225], device=device).view(3, 1, 1)
# Compute min/max values in normalized space corresponding to [0,1] pixel range
clamp_min = (0.0 - mean) / std
clamp_max = (1.0 - mean) / std

# --------- 3. Run PGD attack over the test set --------- #
adv_imgs_2 = []
adv_lbls_2 = []

pretrained_model.eval()
for imgs, folder_idxs in tqdm(dataloader, desc="PGD Attack"):
    imgs = imgs.to(device)
    # Convert folder index (0–99) to true ImageNet label (0–999)
    y_true = torch.tensor(
        [folder_to_imagenet_id[int(i)] for i in folder_idxs],
        device=device
    )

    adv_batch = pgd_attack(
        pretrained_model,
        imgs,
        y_true,
        epsilon,
        alpha,
        num_steps,
        clamp_min,
        clamp_max
    )

    adv_imgs_2.append(adv_batch.cpu())
    adv_lbls_2.append(folder_idxs)  # keep folder indices for evaluation

# Concatenate all adversarial batches and build a DataLoader
adv_images_2 = torch.cat(adv_imgs_2, dim=0)
adv_labels_2 = torch.cat(adv_lbls_2, dim=0)

adv_ds_2 = TensorDataset(adv_images_2, adv_labels_2)
adv_loader_2 = DataLoader(
    adv_ds_2,
    batch_size=32,
    shuffle=False,
    num_workers=2,
    pin_memory=True
)

# --------- 4. Evaluate & print results --------- #
acc2 = evaluate_model(pretrained_model, adv_loader_2, device, top_k=(1, 5))
print(f"\n=== Adversarial Test Set 2 (PGD ε={epsilon}, steps={num_steps}) ===")
print(f"Top-1 Accuracy: {acc2[1]:.2f}%")
print(f"Top-5 Accuracy: {acc2[5]:.2f}%")


PGD Attack: 100%|██████████| 16/16 [00:14<00:00,  1.10it/s]
100%|██████████| 16/16 [00:00<00:00, 25.57it/s]


=== Adversarial Test Set 2 (PGD ε=0.02, steps=10) ===
Top-1 Accuracy: 0.20%
Top-5 Accuracy: 14.00%
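
The step, project, clamp loop can be checked on a toy problem without a neural network. The sketch below runs the same three operations on a short list of floats, using a hypothetical linear loss L(x) = w · x whose gradient is the constant vector `w`. The function `pgd_toy` and all values are illustrative assumptions, not part of the notebook.

```python
def sign(v):
    """Return -1, 0 or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def pgd_toy(x, grad_fn, epsilon, alpha, num_steps, lo, hi):
    """PGD on a list of floats; grad_fn returns the loss gradient at a point."""
    x_adv = list(x)
    for _ in range(num_steps):
        grad = grad_fn(x_adv)
        # Ascent step along the sign of the gradient.
        x_adv = [xa + alpha * sign(g) for xa, g in zip(x_adv, grad)]
        # Project back into the L-infinity ball of radius epsilon around x.
        x_adv = [min(max(xa, xo - epsilon), xo + epsilon) for xa, xo in zip(x_adv, x)]
        # Clamp to the valid value range.
        x_adv = [min(max(xa, lo), hi) for xa in x_adv]
    return x_adv


# Toy loss L(x) = w . x, so the gradient is w everywhere.
w = [2.0, -1.0, 0.5]
x = [0.2, 0.5, 0.99]
epsilon, num_steps = 0.02, 10
x_adv = pgd_toy(x, lambda _: w, epsilon, epsilon / num_steps, num_steps, lo=0.0, hi=1.0)
delta = [xa - xo for xa, xo in zip(x_adv, x)]
print(delta)
```

With α = ε / `num_steps` and no random start, a coordinate only reaches the ε boundary if its gradient sign is the same at every step, as for the first two coordinates here. The third coordinate stops early because the range clamp at 1.0 binds before the ε‑ball does.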





## Task 4: Patch‑PGD Attack
In this section, we constrain the PGD attack to a single fixed 64×64 square patch at the image center, generating **Adversarial Test Set 3**. The perturbation is optimized only inside that patch (about 8% of a 224×224 image) and projected back into the ε‑ball after every step. The code below uses a much larger budget than Tasks 2 and 3 (ε = 1.0 in normalized units, 30 steps), and the results show that a localized attack can still push ResNet‑34 top‑1 accuracy below 1%.

In [20]:
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset
from tqdm import tqdm

# --------- 1. Patch‑PGD attack function --------- #
def patch_pgd_attack(model, x, y, epsilon, alpha, num_steps, patch_size, clamp_min, clamp_max):
    """
    Performs a PGD attack confined to the central patch of each image.

    Parameters:
    - model: the evaluation model
    - x: original inputs, already normalized (tensor of shape [B,3,224,224])
    - y: true ImageNet labels for each input (tensor of shape [B])
    - epsilon: L-infinity perturbation budget
    - alpha: step size for each PGD iteration
    - num_steps: number of attack iterations
    - patch_size: size of the square patch (in pixels)
    - clamp_min, clamp_max: per-channel bounds in normalized space

    Returns:
    - A tensor of adversarial examples (shape [B,3,224,224])
    """
    B, C, H, W = x.shape

    # Compute coordinates for a central patch
    top  = (H - patch_size) // 2
    left = (W - patch_size) // 2

    # Create a mask that is 1 inside the patch, 0 elsewhere.
    # zeros_like already places the mask on the same device as x.
    mask = torch.zeros_like(x)
    mask[:, :, top:top+patch_size, left:left+patch_size] = 1.0

    # Clone the original input for projection reference
    x_adv  = x.clone().detach()
    x_orig = x.clone().detach()

    for _ in range(num_steps):
        # Enable gradient computation on adversarial tensor
        x_adv.requires_grad = True

        # Forward and backward pass
        outputs = model(x_adv)
        loss    = nn.CrossEntropyLoss()(outputs, y)
        model.zero_grad()
        loss.backward()

        # Take a step of size alpha in the sign gradient direction within the patch
        grad_sign = x_adv.grad.sign()
        x_adv = x_adv + alpha * grad_sign * mask

        # Project back to ensure perturbation stays within L_inf ball of radius epsilon
        x_adv = torch.max(torch.min(x_adv, x_orig + epsilon), x_orig - epsilon)
        # Finally clamp to valid normalized pixel range
        x_adv = torch.clamp(x_adv, clamp_min, clamp_max).detach()

    return x_adv

# --------- 2. Hyperparameters & precomputations --------- #
device      = torch.device("cuda" if torch.cuda.is_available() else "cpu")
epsilon     = 1.0       # L-infinity budget (can also try other values)
num_steps   = 30        # number of PGD iterations
alpha       = epsilon / num_steps  # step size per iteration
patch_size  = 64        # side length of the square patch

# Reuse normalization parameters from Task 1/2
mean = torch.tensor([0.485, 0.456, 0.406], device=device).view(3,1,1)
std  = torch.tensor([0.229, 0.224, 0.225], device=device).view(3,1,1)
# Compute normalized pixel bounds corresponding to [0,1]
clamp_min = (0.0 - mean) / std
clamp_max = (1.0 - mean) / std

# --------- 3. Build adversarial set & evaluate --------- #
adv_imgs_3 = []
adv_lbls_3 = []

pretrained_model.eval()
for imgs, folder_idxs in tqdm(dataloader, desc="Patch‑PGD Attack"):
    imgs = imgs.to(device)
    # Convert folder index to true ImageNet label
    y_true = torch.tensor(
        [folder_to_imagenet_id[int(i)] for i in folder_idxs],
        device=device
    )

    # Generate patch-based adversarial examples
    adv_batch = patch_pgd_attack(
        pretrained_model,
        imgs,
        y_true,
        epsilon,
        alpha,
        num_steps,
        patch_size,
        clamp_min,
        clamp_max
    )

    adv_imgs_3.append(adv_batch.cpu())
    adv_lbls_3.append(folder_idxs)

# Concatenate all adversarial samples and create a DataLoader
adv_images_3 = torch.cat(adv_imgs_3, dim=0)
adv_labels_3 = torch.cat(adv_lbls_3, dim=0)
adv_ds_3     = TensorDataset(adv_images_3, adv_labels_3)
adv_loader_3 = DataLoader(
    adv_ds_3,
    batch_size=32,
    shuffle=False,
    num_workers=2,
    pin_memory=True
)

# --------- 4. Evaluate & print results --------- #
acc3 = evaluate_model(pretrained_model, adv_loader_3, device, top_k=(1,5))
print(f"\n=== Adversarial Test Set 3 (Patch‑PGD ε={epsilon}, steps={num_steps}, patch size={patch_size}) ===")
print(f"Top‑1 Accuracy: {acc3[1]:.2f}%")
print(f"Top‑5 Accuracy: {acc3[5]:.2f}%")


Patch‑PGD Attack: 100%|██████████| 16/16 [00:41<00:00,  2.59s/it]
100%|██████████| 16/16 [00:00<00:00, 25.04it/s]


=== Adversarial Test Set 3 (Patch‑PGD ε=1.0, steps=30, patch size=64) ===
Top‑1 Accuracy: 0.60%
Top‑5 Accuracy: 20.20%
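
To make the patch geometry concrete, the sketch below repeats the centering arithmetic from `patch_pgd_attack` in plain Python and counts how many pixels the mask exposes to the attack. It assumes the 224×224 input size stated in the docstrings and builds the mask as nested lists purely for illustration.

```python
H = W = 224          # input resolution assumed by the docstrings above
patch_size = 64

# Same centering arithmetic as patch_pgd_attack.
top = (H - patch_size) // 2
left = (W - patch_size) // 2

# Binary mask as nested lists: 1 inside the patch, 0 elsewhere.
mask = [
    [1 if top <= r < top + patch_size and left <= c < left + patch_size else 0
     for c in range(W)]
    for r in range(H)
]

covered = sum(map(sum, mask))
fraction = covered / (H * W)
print(f"patch rows {top}..{top + patch_size - 1}, cols {left}..{left + patch_size - 1}")
print(f"{covered} of {H * W} pixels perturbed ({fraction:.1%})")
```

The patch spans rows and columns 80 through 143 and covers 4,096 of 50,176 pixels, so the attack can only touch about 8% of each image.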





## Task 5: Transferability Evaluation on DenseNet‑121
Finally, we load a different pre‑trained network (DenseNet‑121) to test the transferability of all four datasets (original + three adversarial sets). We reuse the same `evaluate_model` function to report top‑1 and top‑5 accuracy for each DataLoader, summarize the results in a table, and analyze cross‑model vulnerability.

In [21]:
import torch
import torchvision
from torch.utils.data import DataLoader

# --------- 1. Load new pre-trained model --------- #
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

new_model = torchvision.models.densenet121(weights='IMAGENET1K_V1')
new_model = new_model.to(device)
new_model.eval()

# --------- 2. Prepare four DataLoaders --------- #
# Assume the following variables have been defined:
#   dataloader     # DataLoader for the original test set
#   adv_loader     # DataLoader for Adversarial Test Set 1 (FGSM)
#   adv_loader_2   # DataLoader for Adversarial Test Set 2 (PGD)
#   adv_loader_3   # DataLoader for Adversarial Test Set 3 (Patch-PGD)

loaders = [
    ("Original Test Set", dataloader),
    ("Adv Test Set 1 (FGSM)", adv_loader),
    ("Adv Test Set 2 (PGD)", adv_loader_2),
    ("Adv Test Set 3 (Patch-PGD)", adv_loader_3),
]

# --------- 3. Evaluate all datasets on the new model --------- #
results = {}
for name, loader in loaders:
    acc = evaluate_model(new_model, loader, device, top_k=(1, 5))
    results[name] = acc
    print(f"{name}: Top-1 = {acc[1]:.2f}%, Top-5 = {acc[5]:.2f}%")

# --------- 4. (Optional) Aggregate results into a table --------- #
# If you want to organize the results into a DataFrame for easy printing or saving:
import pandas as pd

df = pd.DataFrame.from_dict(results, orient='index')
df.columns = ['Top-1 (%)', 'Top-5 (%)']
print("\nSummary:")
print(df)

100%|██████████| 16/16 [00:01<00:00,  8.56it/s]


Original Test Set: Top-1 = 74.80%, Top-5 = 93.60%


100%|██████████| 16/16 [00:01<00:00, 15.20it/s]


Adv Test Set 1 (FGSM): Top-1 = 63.40%, Top-5 = 89.40%


100%|██████████| 16/16 [00:01<00:00, 15.24it/s]


Adv Test Set 2 (PGD): Top-1 = 64.00%, Top-5 = 90.80%


100%|██████████| 16/16 [00:01<00:00, 15.26it/s]

Adv Test Set 3 (Patch-PGD): Top-1 = 60.60%, Top-5 = 89.00%

Summary:
                            Top-1 (%)  Top-5 (%)
Original Test Set                74.8       93.6
Adv Test Set 1 (FGSM)            63.4       89.4
Adv Test Set 2 (PGD)             64.0       90.8
Adv Test Set 3 (Patch-PGD)       60.6       89.0





## Findings & Trends
- **Degradation Across Attacks**  
  On DenseNet‑121, all three adversarial sets generated against ResNet‑34 reduce Top‑1 accuracy by roughly 11–14 pp and Top‑5 accuracy by roughly 3–5 pp. The original Top‑1/Top‑5 of **74.8%/93.6%** falls to:  
  - **FGSM (ε=0.02):** 63.4% / 89.4%  
  - **PGD (ε=0.02, steps=10):** 64.0% / 90.8%  
  - **Patch‑PGD (ε=1, steps=30, patch_size=64):** 60.6% / 89.0%

- **Relative Strength**  
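  - Patch‑based PGD yields the largest Top‑1 drop (14.2 pp) and Top‑5 drop (4.6 pp), but it also uses a far larger budget (ε = 1.0 vs. 0.02), so the comparison is not like‑for‑like.  
  - Multi‑step PGD does not outperform single‑step FGSM in black‑box transfer: it leaves DenseNet‑121 at 64.0% Top‑1, versus 63.4% for FGSM.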

- **Transferability Plateau**  
  Although white‑box PGD drives ResNet‑34 Top‑1 accuracy down to 0.2%, its **black‑box** effect on DenseNet‑121 is far weaker: all three transferred attacks leave DenseNet‑121 at 60–64% Top‑1. Stronger white‑box attacks on ResNet‑34 did not produce proportionally stronger transfer.
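
The gap between white-box and transferred damage can be quantified directly from the Top‑1 numbers reported in Tasks 1–5. The sketch below computes, for each attack, the accuracy drop on the source model, the drop on the unseen model, and their ratio. The ratio is an illustrative summary statistic introduced here, not a metric used elsewhere in this notebook.

```python
# Top-1 accuracies (percent) reported in Tasks 1-5.
resnet34 = {"clean": 76.0, "FGSM": 6.2, "PGD": 0.2, "Patch-PGD": 0.6}
densenet121 = {"clean": 74.8, "FGSM": 63.4, "PGD": 64.0, "Patch-PGD": 60.6}

transfer_ratio = {}
for attack in ("FGSM", "PGD", "Patch-PGD"):
    white_box_drop = resnet34["clean"] - resnet34[attack]
    transfer_drop = densenet121["clean"] - densenet121[attack]
    # Fraction of the white-box damage that carries over to the unseen model.
    transfer_ratio[attack] = transfer_drop / white_box_drop
    print(f"{attack}: white-box -{white_box_drop:.1f} pp, "
          f"transfer -{transfer_drop:.1f} pp, ratio {transfer_ratio[attack]:.2f}")
```

For every attack, less than a fifth of the white-box Top‑1 drop carries over to DenseNet‑121, and 10-step PGD has the lowest ratio of the three.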



## Lessons Learned
1. **Cross‑Model Generalization**  
   Adversarial examples crafted on one architecture often transfer to others, but with reduced potency.  
2. **Simplicity Suffices**  
   Single‑step FGSM lowered DenseNet‑121 Top‑1 accuracy by 11.4 pp, so computationally cheap attacks can be effective in transfer settings.  
3. **Patch Concentration**  
   The patch‑based attack transferred best in this experiment, which suggests that focused perturbations in discriminative regions can help transfer. It also used a much larger ε, so this result does not isolate the effect of spatial concentration.  
4. **Cost vs. Benefit**  
   Here, 10‑step PGD transferred no better than FGSM (64.0% vs. 63.4% Top‑1 on DenseNet‑121). The extra compute bought white‑box strength on ResNet‑34 but no additional transfer strength.



## Mitigating Transferability
- **Adversarial Training with Ensembles**  
  Train on adversarial examples from multiple source models (ResNet, DenseNet, etc.) to build broader robustness.
- **Input Randomization & Preprocessing**  
  Apply random resizing, cropping, JPEG compression, or bit‑depth reduction to disrupt precise perturbations.
- **Ensemble Defenses**  
  Aggregate predictions across diverse models to dilute attack effectiveness from any single surrogate.
- **Certified Defenses**  
  Use approaches like randomized smoothing to provide provable ℓ₂‑bounded robustness. This may also blunt transferred ℓ∞ perturbations, although the guarantee itself covers only the certified ℓ₂ radius.
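
As one concrete instance of the preprocessing idea above, bit-depth reduction can be sketched in a few lines of plain Python. The helper `reduce_bit_depth` and the pixel values are hypothetical. The perturbation size of 0.0045 is chosen to be comparable to ε = 0.02 in normalized units (0.02 × 0.225).

```python
def reduce_bit_depth(pixels, bits):
    """Quantize values in [0, 1] to 2**bits evenly spaced levels."""
    levels = 2 ** bits - 1
    return [round(p * levels) / levels for p in pixels]


# Hypothetical raw pixel values and a small adversarial shift per pixel.
clean = [0.40, 0.10, 0.90]
perturbed = [0.40 + 0.0045, 0.10 - 0.0045, 0.90 + 0.0045]

clean_q = reduce_bit_depth(clean, bits=3)
perturbed_q = reduce_bit_depth(perturbed, bits=3)
print(clean_q == perturbed_q)
```

In this example the clean and perturbed pixels quantize to the same levels, so the perturbation is erased. This is a mitigation, not a guarantee: pixels near a quantization boundary can still flip, and an attacker who knows about the preprocessing can optimize against it.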

