## Deep Learning Spring 2025: Final Project

#### Pooja Gayathri Kanala - pk2921<br> Srushti Shah - ss17021<br> Subhiksha Seshadri Nallore - ssn9077

### Task 5 Transferability Evaluation with DenseNet-121

In this task we load a new model (DenseNet‑121) and evaluate its performance on:

- The original test set  
- Adversarial Test Set 1 (FGSM)  
- Adversarial Test Set 2 (PGD)  
- Adversarial Test Set 3 (Patch‑PGD)  

We compute Top‑1 and Top‑5 accuracy on each set, then measure how well each attack transfers as the relative drop in Top‑1 accuracy from the clean baseline and categorize its effectiveness.
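
The transfer metric described above can be sketched as a small helper. This is a minimal sketch rather than the notebook's own code: the function names `relative_drop` and `categorize` are hypothetical, and the thresholds (90, 70, 40) mirror the ones used in the analysis cell later in this notebook.

```python
def relative_drop(clean_acc, adv_acc):
    """Relative Top-1 drop in percent, measured against the clean accuracy."""
    if clean_acc <= 0:
        return 0.0  # no meaningful baseline to compare against
    return (clean_acc - adv_acc) / clean_acc * 100.0


def categorize(drop):
    """Map a relative drop (in percent) to a transfer category."""
    if drop >= 90:
        return "Excellent"
    if drop >= 70:
        return "Good"
    if drop >= 40:
        return "Moderate"
    return "Limited"


# Illustrative numbers only: accuracy falling from 80% to 20% is a 75% relative drop
example_drop = relative_drop(80.0, 20.0)
print(example_drop, categorize(example_drop))
```

Because the drop is relative, the same absolute loss in accuracy counts for more when the clean baseline is lower.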

### Import Libraries

In [2]:
import os
import json
import torch
import sys
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image
import datetime
import random
from torchvision import datasets, transforms, models
from torch.utils.data import DataLoader
from tqdm import tqdm

### Configuration

In [3]:
torch.manual_seed(42)
random.seed(42)
np.random.seed(42)

# Paths
ORIG_PATH    = "TestDataSet"
ADV_PATHS    = [
    "adversarial_test_set_1",
    "adversarial_test_set_2",
    "adversarial_test_set_3",
]
DATASET_NAMES = ["Original", "FGSM (ε=0.02)", "PGD (ε=0.02)", "Patch (ε=0.5)"]
DEVICE        = torch.device("cuda" if torch.cuda.is_available() else "cpu")

### Data & Model Setup

In [4]:
# Normalization transform
mean, std = [0.485,0.456,0.406], [0.229,0.224,0.225]
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=mean, std=std)
])

# Load class names
with open(os.path.join(ORIG_PATH, "labels_list.json")) as f:
    lines = json.load(f)
# Split on the first colon only and strip whitespace so names stay intact
idx2name = {int(l.split(":", 1)[0]): l.split(":", 1)[1].strip() for l in lines}

# Load DenseNet‑121
print("Loading DenseNet‑121…")
model = models.densenet121(weights=models.DenseNet121_Weights.IMAGENET1K_V1)
model = model.to(DEVICE).eval()
print(f"Model output features: {model.classifier.out_features}")


Loading DenseNet‑121…
Model output features: 1000
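
The class-name lookup built in the cell above can be illustrated in isolation. The sketch below assumes each entry of `labels_list.json` is a string of the form `"<index>: <name>"`, which is what the `split(":")` call implies. The two sample entries and the helper name `parse_labels` are hypothetical and are not read from the dataset.

```python
import json


def parse_labels(lines):
    """Turn "<index>: <name>" strings into an {index: name} dictionary."""
    idx2name = {}
    for line in lines:
        # partition splits on the first colon only, so names containing ':' survive
        idx, _, name = line.partition(":")
        idx2name[int(idx)] = name.strip()
    return idx2name


# Hypothetical sample in the assumed file format
sample = json.loads('["401: accordion", "402: acoustic guitar"]')
labels = parse_labels(sample)
print(labels[401])
```

Stripping the name avoids a leading space after the colon, which would otherwise end up in every class name.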


### Evaluation Function

In [5]:
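def evaluate_dataset(path, name):
    """Return (top-1 %, top-5 %, folder→index map) for the ImageFolder at `path`.

    Returns (None, None, {}) when the directory does not exist.
    """
    if not os.path.isdir(path):
        return None, None, {}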
    ds = datasets.ImageFolder(path, transform=transform)
    dl = DataLoader(ds, batch_size=32, shuffle=False, num_workers=4)
    correct1 = correct5 = total = 0
    folder_map = {}

    # Build the folder→ImageNet-index mapping on the fly while scoring each batch
    for images, labels in tqdm(dl, desc=f"Eval {name}", ncols=80):
        images, labels = images.to(DEVICE), labels.to(DEVICE)
        with torch.no_grad():
            logits = model(images)
            _, p1 = logits.max(1)
            _, p5 = logits.topk(5,1)

        for i, lbl in enumerate(labels):
            fld = ds.classes[lbl.item()]
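            # Map folder → ImageNet index on first sight: try a substring match against
            # the class names, else fall back to 401 + the ImageFolder label (this assumes
            # the folders, in sorted order, are consecutive classes starting at index 401)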
            if fld not in folder_map:
                match = next((idx for idx, nm in idx2name.items() if fld.lower() in nm.lower()), None)
                folder_map[fld] = match if match is not None else 401 + lbl.item()
            tgt = folder_map[fld]
            total += 1
            correct1 += int(p1[i].item() == tgt)
            correct5 += int(tgt in p5[i].tolist())

    return 100*correct1/total, 100*correct5/total, folder_map
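
The Top‑1 and Top‑5 bookkeeping inside `evaluate_dataset` can be illustrated without PyTorch. The following is a minimal sketch with hypothetical names (`topk_indices`, `topk_accuracy`) and made-up scores. It ranks plain Python lists rather than tensors, so it shows the counting logic only and is not a replacement for `logits.topk`.

```python
def topk_indices(scores, k):
    """Return the indices of the k largest scores, best first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def topk_accuracy(batch_scores, targets, k):
    """Percentage of samples whose target index is among the top-k predictions."""
    if not targets:
        return None  # nothing to score
    hits = sum(t in topk_indices(s, k) for s, t in zip(batch_scores, targets))
    return 100.0 * hits / len(targets)


# Two made-up samples over three classes, both with true class 1
scores = [
    [0.1, 0.7, 0.2],  # ranks class 1 first
    [0.5, 0.3, 0.2],  # ranks class 0 first, class 1 second
]
targets = [1, 1]
top1 = topk_accuracy(scores, targets, k=1)
top2 = topk_accuracy(scores, targets, k=2)
print(top1, top2)
```

The second sample is wrong at k=1 but counted as correct at k=2, which is why Top‑5 accuracy can never be lower than Top‑1 accuracy on the same data.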

### Run Evaluations

In [6]:
results = []
# include original first
paths = [ORIG_PATH] + ADV_PATHS
for name, path in zip(DATASET_NAMES, paths):
    top1, top5, mapping = evaluate_dataset(path, name)
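    # A float format spec cannot format the string 'N/A', so build each field explicitly
    t1 = f"{top1:6.2f}%" if top1 is not None else "N/A"
    t5 = f"{top5:6.2f}%" if top5 is not None else "N/A"
    print(f"{name:<20}  Top-1: {t1:>7}  Top-5: {t5:>7}")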
    results.append({"name": name, "path": path, "top1": top1, "top5": top5})

Eval Original: 100%|████████████████████████████| 16/16 [00:00<00:00, 18.18it/s]


Original              Top-1:  74.60%  Top-5:  93.60%


Eval FGSM (ε=0.02): 100%|███████████████████████| 16/16 [00:00<00:00, 31.30it/s]


FGSM (ε=0.02)         Top-1:  74.20%  Top-5:  93.20%


Eval PGD (ε=0.02): 100%|████████████████████████| 32/32 [00:01<00:00, 30.92it/s]


PGD (ε=0.02)          Top-1:  46.90%  Top-5:  79.90%


Eval Patch (ε=0.5): 100%|█████████████████████████| 1/1 [00:00<00:00,  6.51it/s]

Patch (ε=0.5)         Top-1:  40.00%  Top-5:  40.00%





### Transferability Analysis & Summary

In [7]:
# baseline Top-1
base = results[0]["top1"] or 0

print("\nTransferability Analysis:")
print(f"{'Attack':<20}{'Drop vs. Orig (%)':<20}{'Category':<10}")
for r in results[1:]:
    drop = (base - r["top1"]) / base * 100 if base>0 else 0
    if drop>=90: cat="Excellent"
    elif drop>=70: cat="Good"
    elif drop>=40: cat="Moderate"
    else: cat="Limited"
    print(f"{r['name']:<20}{drop:>18.1f}%   {cat}")

print("\nOverall Findings:")
print(f"- Baseline (Original) Top‑1: {base:.2f}%")
for r in results[1:]:
    print(f"- After {r['name']}: Top‑1 = {r['top1']:.2f}%")


Transferability Analysis:
Attack              Drop vs. Orig (%)   Category  
FGSM (ε=0.02)                      0.5%   Limited
PGD (ε=0.02)                      37.1%   Limited
Patch (ε=0.5)                     46.4%   Moderate

Overall Findings:
- Baseline (Original) Top‑1: 74.60%
- After FGSM (ε=0.02): Top‑1 = 74.20%
- After PGD (ε=0.02): Top‑1 = 46.90%
- After Patch (ε=0.5): Top‑1 = 40.00%


## Conclusion: Task 5 – Transferability of Adversarial Attacks

1. **Clean Baseline**: DenseNet‑121 achieves **74.60%** Top‑1 (93.60% Top‑5) on the unmodified test set.  
2. **Attack Impact** (relative Top‑1 drop vs. the clean baseline):  
   - FGSM: Top‑1 drops to **74.20%** (0.5% drop) – *Limited*  
   - PGD: Top‑1 drops to **46.90%** (37.1% drop) – *Limited*  
   - Patch‑PGD: Top‑1 drops to **40.00%** (46.4% drop) – *Moderate*  

3. **Transferability**: Transfer from ResNet‑34 to DenseNet‑121 is partial and varies by attack. FGSM barely transfers (0.5% drop), PGD falls just short of the 40% threshold for *moderate* (37.1% drop), and only Patch‑PGD reaches *moderate* (46.4% drop). No attack reaches the *good* (≥70%) or *excellent* (≥90%) categories. The Patch‑PGD figure should be read with caution: that set was evaluated in a single batch (at most 32 images), and its Top‑1 and Top‑5 accuracies are identical. The sets also differ in size (16, 16, 32, and 1 batches of up to 32 images), so the drops are not computed over identical image sets.  
4. **Implications**: The PGD and Patch‑PGD results indicate that some vulnerabilities are shared across architectures, but perturbations crafted against ResNet‑34 lose much of their effect on DenseNet‑121. Robust, architecture‑agnostic defenses remain worth pursuing.  
5. **Future Work**: Explore ensemble or certified defenses, and test on additional architectures to generalize conclusions.
