# Deep Learning Final Project - Adversarial Attacks

## Setup and Imports
This cell sets up the necessary Python environment and imports required libraries for the project:

- **Basic Libraries**: 
  - `os` for file system operations
  - `json` for handling JSON data
  - `numpy` for numerical computations
  - `matplotlib` for visualization
  - `tqdm` for progress bars

- **PyTorch Libraries**:
  - Core PyTorch (`torch`) for deep learning operations
  - Neural network modules (`nn`) for model architecture
  - Functional operations (`F`) for loss functions and activations
  - Data loading utilities (`DataLoader`) for batch processing

- **Computer Vision Libraries**:
  - `torchvision` for computer vision models and datasets
  - `transforms` for image preprocessing
  - `save_image` and `make_grid` for image saving and visualization
  - `PIL` (Python Imaging Library) for image handling

These libraries provide the foundation for implementing and evaluating adversarial attacks on deep learning models.

In [1]:
import os
import json
import numpy as np
import matplotlib.pyplot as plt
from tqdm.notebook import tqdm

import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader

import torchvision
import torchvision.transforms as transforms
from torchvision.utils import save_image, make_grid
from PIL import Image

## Environment Setup and Device Configuration

This cell initializes the computational environment and sets up reproducibility:

- Sets random seeds for both PyTorch (`torch.manual_seed(42)`) and NumPy (`np.random.seed(42)`) to ensure reproducible results across runs
- Determines the available computing device (GPU if available, otherwise CPU) using `torch.device()`
- Prints the selected device to confirm the execution environment

This setup is crucial for:
- Ensuring consistent results across multiple runs
- Optimizing performance by utilizing GPU acceleration when available
- Maintaining reproducibility in experimental results
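
The two seed calls in the next cell cover the CPU generators. A slightly broader variant is sketched below as an assumption, not as what the notebook does: the helper name `seed_everything` is hypothetical, and the cuDNN flags trade some speed for repeatable kernel selection.

```python
import random

import numpy as np
import torch


def seed_everything(seed: int = 42) -> None:
    """Seed the Python, NumPy, and PyTorch random number generators."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    # Safe to call without a GPU; the call is ignored when CUDA is unavailable.
    torch.cuda.manual_seed_all(seed)
    # Prefer repeatable cuDNN kernels over auto-tuned ones.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False


# Re-seeding should reproduce the same random draw.
seed_everything(42)
first = torch.rand(3)
seed_everything(42)
second = torch.rand(3)
print(torch.equal(first, second))
```

Even with these settings, some GPU operations can remain nondeterministic, so small run-to-run differences are still possible.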

In [None]:
torch.manual_seed(42)
np.random.seed(42)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

## Image Visualization Utility

This section describes the helper function `imshow` (defined in a later cell) for displaying images in the notebook:

- **Function Purpose**: 
  - Displays images with proper denormalization and formatting
  - Handles tensor-to-image conversion for visualization

- **Key Features**:
  - Denormalizes images using ImageNet statistics:
    - Mean: [0.485, 0.456, 0.406]
    - Standard deviation: [0.229, 0.224, 0.225]
  - Ensures proper tensor device handling (CPU/GPU)
  - Clamps pixel values to valid range [0, 1]
  - Creates a large figure (10x10) for clear visualization
  - Supports optional title display
  - Removes axis for cleaner presentation

This utility function is essential for:
- Visualizing original images
- Comparing original and adversarial examples
- Debugging image transformations
- Presenting results in a clear, professional manner
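
The denormalization step described above can be sketched in isolation. The helper name `denormalize` and the constants `IMAGENET_MEAN` and `IMAGENET_STD` are hypothetical names introduced for this illustration; the values are the statistics listed above.

```python
import torch

# ImageNet statistics quoted above, shaped to broadcast over (3, H, W)
IMAGENET_MEAN = torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
IMAGENET_STD = torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1)


def denormalize(img: torch.Tensor) -> torch.Tensor:
    """Map a normalized (3, H, W) tensor back to displayable [0, 1] pixel values."""
    img = img.detach().cpu()
    return torch.clamp(img * IMAGENET_STD + IMAGENET_MEAN, 0.0, 1.0)


# Round trip on a random image: normalize, then denormalize.
pixels = torch.rand(3, 8, 8)
normalized = (pixels - IMAGENET_MEAN) / IMAGENET_STD
restored = denormalize(normalized)
print(torch.allclose(restored, pixels, atol=1e-5))
```

Factoring the step out this way would also avoid repeating the mean and standard deviation literals in every plotting function.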

In [3]:
mean_norms = np.array([0.485, 0.456, 0.406])
std_norms = np.array([0.229, 0.224, 0.225])

plain_transforms = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=mean_norms, std=std_norms)
])

In [4]:
def verify_linf_constraint(original_dataset_path, adversarial_dataset_path, epsilon=0.02, tolerance=1e-6):
    """
    Verify that the L∞ distance between original and adversarial images is within epsilon
    """
    from torchvision.datasets import ImageFolder
    from torchvision.transforms import ToTensor
    
    # Define raw pixel transform
    raw_transform = ToTensor()
    
    def load_rgb_tensor(path):
        """Load image as RGB tensor in [0,1] range"""
        img = Image.open(path).convert("RGB")
        return raw_transform(img)
    
    # Load datasets
    original_raw = ImageFolder(original_dataset_path, transform=None)
    adv_raw = ImageFolder(adversarial_dataset_path, transform=None)
    
    # Sort samples to ensure matching order
    original_raw.samples.sort(key=lambda x: x[0])
    adv_raw.samples.sort(key=lambda x: x[0])
    
    # Check if datasets have the same size
    assert len(original_raw.samples) == len(adv_raw.samples), "Datasets have different sizes"
    
    # Process images in batches to avoid memory issues
    batch_size = 50
    max_perturbation = 0.0
    
    for i in range(0, len(original_raw.samples), batch_size):
        batch_end = min(i + batch_size, len(original_raw.samples))
        
        # Load batch of images
        orig_tensors = torch.stack([
            load_rgb_tensor(p) for p, _ in original_raw.samples[i:batch_end]
        ])
        
        adv_tensors = torch.stack([
            load_rgb_tensor(p) for p, _ in adv_raw.samples[i:batch_end]
        ])
        
        # Compute L∞ distance for this batch
        batch_max_perturb = (adv_tensors - orig_tensors).abs().max().item()
        max_perturbation = max(max_perturbation, batch_max_perturb)
        
        # Early termination if constraint is violated
        if max_perturbation > epsilon + tolerance:
            break
    
    is_valid = max_perturbation <= epsilon + tolerance
    
    return max_perturbation, is_valid

In [5]:
# Load the pre-trained ResNet-34 model
model = torchvision.models.resnet34(weights='IMAGENET1K_V1')
model = model.to(device)
model.eval()

# Load the test dataset
dataset_path = "./TestDataSet"
dataset = torchvision.datasets.ImageFolder(root=dataset_path, transform=plain_transforms)

In [None]:
# Load custom labels list
try:
    with open("labels_list.json", 'r') as f:
        labels_list = json.load(f)
    
    # Parse labels into a dictionary {class_idx: class_name}
    imagenet_label_map = {}
    for label_entry in labels_list:
        parts = label_entry.split(": ", 1)
        if len(parts) == 2:
            class_idx, class_name = parts
            # Remove any extra characters from index
            class_idx = class_idx.strip()
            imagenet_label_map[class_idx] = class_name
    
    print(f"Loaded custom labels list with {len(imagenet_label_map)} classes")
except Exception as e:
    print(f"Warning: Could not load custom labels list: {e}. Using fallback.")
    imagenet_label_map = {}

# Create mapping between folder indices and ImageNet indices (401-500)
folder_to_imagenet = {}
class_folders = dataset.classes

# Since folders are in the same order as the labels in the json file,
# we can directly map index i to ImageNet index 401+i
for i, folder_name in enumerate(class_folders):
    imagenet_idx = 401 + i  # Map to 401-500 range
    folder_to_imagenet[i] = imagenet_idx
    class_name = imagenet_label_map.get(str(imagenet_idx), "Unknown")
    print(f"Folder {i} ({folder_name}) -> ImageNet class {imagenet_idx}: {class_name}")

# Create DataLoader
dataloader = DataLoader(dataset, batch_size=32, shuffle=False, num_workers=4)
print(f"Loaded test dataset with {len(dataset)} images across {len(dataset.classes)} classes")

In [7]:
def imshow(img, title=None):
    """Display an image"""
    # Denormalize
    img = img.clone().detach().cpu()
    img = img * torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1) + torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
    img = torch.clamp(img, 0, 1)

    plt.figure(figsize=(10, 10))
    plt.imshow(img.permute(1, 2, 0))
    if title:
        plt.title(title)
    plt.axis('off')
    plt.show()

In [None]:
def show_examples(dataloader, num_examples=5):
    """Display a few example images from the dataset"""
    examples = []
    labels = []

    # Get a batch of images
    images, targets = next(iter(dataloader))

    for i in range(min(num_examples, len(images))):
        img = images[i]
        folder_idx = targets[i].item()
        imagenet_idx = folder_to_imagenet[folder_idx]

        # Denormalize
        img = img.clone().detach().cpu()
        img = img * torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1) + torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
        img = torch.clamp(img, 0, 1)

        examples.append(img)
        labels.append((folder_idx, imagenet_idx))

    # Create grid of images
    grid = make_grid(examples, nrow=5, padding=2)
    plt.figure(figsize=(15, 3))
    plt.imshow(grid.permute(1, 2, 0))
    plt.title("Example Images from Test Dataset")
    plt.axis('off')

    # Print labels
    label_texts = []
    for i, (folder_idx, imagenet_idx) in enumerate(labels):
        folder_name = dataset.classes[folder_idx]
        class_name = imagenet_label_map.get(str(imagenet_idx), f"Class {imagenet_idx}")
        label_texts.append(f"{i+1}: {folder_name} (ImageNet idx: {imagenet_idx}, {class_name})")

    plt.figtext(0.5, 0.01, "\n".join(label_texts), ha="center", fontsize=10)
    plt.show()

# Show example images
show_examples(dataloader)

## Advanced Image Comparison and Analysis Function

This cell defines a comprehensive visualization function `plot_images_with_predictions` that compares original and adversarial images with detailed model predictions:

- **Function Purpose**:
  - Creates side-by-side visualizations of original and adversarial images
  - Displays model predictions and confidence scores
  - Calculates and shows L∞ distance between images
  - Presents top-5 predictions for both images

- **Key Features**:
  - **Model Evaluation**:
    - Gets predictions for both original and adversarial images
    - Calculates confidence scores using softmax
    - Extracts top-5 predictions for detailed analysis
  
  - **Visualization**:
    - Creates a 2-panel figure (15x7) for clear comparison
    - Properly denormalizes images for display
    - Shows true labels and predicted classes
    - Displays confidence scores
    - Optional red bounding box for patch attacks
  
  - **Metrics Display**:
    - L∞ distance between images
    - Attack epsilon value
    - Top-5 predictions with confidence scores
    - True label information

- **Output Format**:
  - Side-by-side image comparison
  - Detailed prediction information
  - Formatted confidence scores
  - Clear labeling and titles

This function is crucial for:
- Analyzing attack effectiveness
- Understanding model behavior
- Visualizing adversarial perturbations
- Comparing prediction changes
- Debugging attack implementations
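
As a rough illustration of the prediction step, the sketch below computes the softmax once and reads both the top-1 confidence and the top-5 list from it. `toy_model` is a hypothetical linear stand-in for the classifier, used only so the snippet runs without pretrained weights.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
toy_model = nn.Linear(16, 10).eval()  # hypothetical stand-in for the classifier
x = torch.randn(1, 16)                # one "image" as a flat feature vector

# A single forward pass without gradient tracking
with torch.no_grad():
    probs = F.softmax(toy_model(x), dim=1)

# topk returns values and indices sorted from most to least likely
top5_probs, top5_idx = torch.topk(probs, k=5, dim=1)
for rank, (idx, p) in enumerate(zip(top5_idx[0].tolist(), top5_probs[0].tolist()), start=1):
    print(f"{rank}. class {idx}: {p:.4f}")
```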

In [9]:
def plot_images_with_predictions(model, original_img, adversarial_img, folder_idx, epsilon, patch_coords=None):
    """Plot original and adversarial images with predictions"""
    model.eval()

    # Convert folder index to ImageNet index
    imagenet_idx = folder_to_imagenet[folder_idx]

    # Run each image through the model once and reuse the probabilities
    # for the top-1 prediction, its confidence, and the top-5 list
    with torch.no_grad():
        original_probs = F.softmax(model(original_img.unsqueeze(0).to(device)), dim=1)
        adversarial_probs = F.softmax(model(adversarial_img.unsqueeze(0).to(device)), dim=1)

    original_pred = original_probs.argmax(dim=1).item()
    original_prob = original_probs.max().item()
    adversarial_pred = adversarial_probs.argmax(dim=1).item()
    adversarial_prob = adversarial_probs.max().item()

    # Get top-5 predictions for both images
    original_top5 = torch.topk(original_probs, 5)
    adversarial_top5 = torch.topk(adversarial_probs, 5)

    # Calculate L-infinity distance
    l_inf_dist = (adversarial_img - original_img).abs().max().item()

    # Plotting
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 7))

    # Denormalize for visualization
    orig_img = original_img.clone().detach().cpu()
    orig_img = orig_img * torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1) + torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
    orig_img = torch.clamp(orig_img, 0, 1)

    adv_img = adversarial_img.clone().detach().cpu()
    adv_img = adv_img * torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1) + torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
    adv_img = torch.clamp(adv_img, 0, 1)

    # Get class names
    def get_class_name(idx):
        return imagenet_label_map.get(str(idx), f"Class {idx}")

    # Plot original image
    ax1.imshow(orig_img.permute(1, 2, 0))
    ax1.set_title(f"Original: {get_class_name(original_pred)}\nConfidence: {original_prob:.4f}\nTrue: {get_class_name(imagenet_idx)}")
    ax1.axis('off')

    # Plot adversarial image
    ax2.imshow(adv_img.permute(1, 2, 0))
    # Add red bounding box for patch attack
    if patch_coords is not None:
        x, y = patch_coords
        import matplotlib.patches as patches
        rect = patches.Rectangle((y, x), 32, 32, linewidth=2, edgecolor='r', facecolor='none')
        ax2.add_patch(rect)
    ax2.set_title(f"Adversarial: {get_class_name(adversarial_pred)}\nConfidence: {adversarial_prob:.4f}\nL∞ distance: {l_inf_dist:.4f}")
    ax2.axis('off')

    plt.suptitle(f"Attack with ε = {epsilon}")
    plt.tight_layout()
    plt.show()

    # Print top-5 predictions
    print("Original Top-5 Predictions:")
    for i in range(5):
        idx = original_top5[1][0][i].item()
        prob = original_top5[0][0][i].item()
        print(f"{i+1}. {get_class_name(idx)}: {prob:.4f}")

    print("\nAdversarial Top-5 Predictions:")
    for i in range(5):
        idx = adversarial_top5[1][0][i].item()
        prob = adversarial_top5[0][0][i].item()
        print(f"{i+1}. {get_class_name(idx)}: {prob:.4f}")

## Model Accuracy Evaluation Function

This cell defines a function `calculate_accuracy` for evaluating model performance on a dataset:

- **Function Purpose**:
  - Calculates top-k accuracy metrics for a model
  - Supports both Top-1 and Top-5 accuracy calculations
  - Handles ImageNet class index mapping

- **Key Features**:
  - **Model Evaluation Mode**:
    - Sets model to evaluation mode
    - Uses `torch.no_grad()` for efficient inference
  
  - **Accuracy Calculation**:
    - Processes data in batches using the provided dataloader
    - Maps target labels to ImageNet indices
    - Computes top-k predictions for each batch
    - Accumulates correct predictions for each k value
  
  - **Progress Tracking**:
    - Uses tqdm for progress visualization
    - Shows batch processing progress
  
  - **Return Values**:
    - Returns accuracy percentages for each k value
    - Default: Top-1 and Top-5 accuracy

- **Implementation Details**:
  - Efficient batch processing
  - Proper device handling (CPU/GPU)
  - Memory-efficient evaluation
  - Progress bar for long evaluations

This function is essential for:
- Evaluating model performance
- Comparing original vs. adversarial accuracy
- Measuring attack effectiveness
- Benchmarking model robustness
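
The top-k counting can be checked by hand on a tiny batch. The sketch below uses a hypothetical helper `topk_correct` and made-up logits; it is not the notebook's `calculate_accuracy`, only the same counting idea without the model and dataloader.

```python
import torch


def topk_correct(logits: torch.Tensor, targets: torch.Tensor, topk=(1, 5)):
    """Count samples whose target is among the k highest-scoring classes."""
    _, pred = logits.topk(max(topk), dim=1)   # (batch, max_k) class indices
    hits = pred.eq(targets.view(-1, 1))       # (batch, max_k) boolean matches
    return [hits[:, :k].any(dim=1).sum().item() for k in topk]


logits = torch.tensor([
    [0.1, 0.9, 0.0, 0.0, 0.0, 0.0],  # target 1 ranks first
    [0.6, 0.5, 0.4, 0.3, 0.2, 0.1],  # target 5 ranks last of six
    [0.1, 0.2, 0.9, 0.8, 0.7, 0.6],  # target 3 ranks second
])
targets = torch.tensor([1, 5, 3])

top1, top5 = topk_correct(logits, targets)
print(f"Top-1: {100 * top1 / len(targets):.2f}%")  # 1 of 3 correct
print(f"Top-5: {100 * top5 / len(targets):.2f}%")  # 2 of 3 correct
```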

In [None]:
def create_adversarial_dataset(model, dataloader, attack_fn, attack_params, save_dir, dataset_name):
    """Create and save adversarial dataset"""
    model.eval()
    
    # Create save directory and class subdirectories.
    # Reuse the original folder names: ImageFolder assigns class indices by
    # sorting folder names as strings, so numeric names ("10" sorts before "2")
    # would no longer line up with folder_to_imagenet when the set is reloaded.
    class_names = dataloader.dataset.classes
    os.makedirs(save_dir, exist_ok=True)
    for class_name in class_names:
        os.makedirs(os.path.join(save_dir, class_name), exist_ok=True)

    # Normalization statistics on the target device, used to denormalize before saving
    mean = torch.tensor(mean_norms, dtype=torch.float32, device=device).view(3, 1, 1)
    std = torch.tensor(std_norms, dtype=torch.float32, device=device).view(3, 1, 1)

    for batch_idx, (inputs, targets) in enumerate(tqdm(dataloader)):
        inputs, targets = inputs.to(device), targets.to(device)

        # Convert folder indices to ImageNet indices
        batch_imagenet_indices = torch.tensor([
            folder_to_imagenet[label.item()] for label in targets
        ]).to(device)

        # Generate adversarial examples
        attack_result = attack_fn(model, inputs, batch_imagenet_indices, **attack_params)

        # Handle different attack function return types
        if isinstance(attack_result, tuple):
            adv_inputs = attack_result[0]
            # Optional additional info (e.g., patch coordinates)
        else:
            adv_inputs = attack_result

        # Verify L∞ constraint
        with torch.no_grad():
            delta = adv_inputs - inputs
            max_perturb = delta.abs().max().item()

            # Enforce constraint if violated
            if max_perturb > attack_params['epsilon']:
                print(f"Scaling down perturbation: {max_perturb:.6f} -> {attack_params['epsilon']:.6f}")
                scale_factor = attack_params['epsilon'] / max_perturb
                delta = delta * scale_factor
                adv_inputs = inputs + delta

        # Final verification
        max_perturb = (adv_inputs - inputs).abs().max().item()
        assert max_perturb <= attack_params['epsilon'] + 1e-6, f"Batch {batch_idx} exceeds ε bound: {max_perturb:.6f} > {attack_params['epsilon']}"

        # Save adversarial examples
        for i, (adv_input, target) in enumerate(zip(adv_inputs, targets)):
            idx = batch_idx * dataloader.batch_size + i

            # Save in the original class folder structure. Zero-padded names keep
            # the sorted file order identical to the generation order.
            save_path = os.path.join(save_dir, class_names[target.item()], f"{idx:05d}.png")

            # Denormalize before saving
            save_image(adv_input * std + mean, save_path)
    
    print(f"Created adversarial dataset: {dataset_name}")

# Function to calculate model accuracy
def calculate_accuracy(model, dataloader, folder_to_imagenet_map, topk=(1, 5)):
    """Calculate top-k accuracy"""
    model.eval()
    topk_correct = [0] * len(topk)
    total = 0

    with torch.no_grad():
        for inputs, targets in tqdm(dataloader):
            inputs = inputs.to(device)
            # Convert folder indices to ImageNet indices
            imagenet_targets = torch.tensor([folder_to_imagenet_map[t.item()] for t in targets]).to(device)
            outputs = model(inputs)
            
            _, pred = outputs.topk(max(topk), 1, True, True)
            pred = pred.t()
            correct = pred.eq(imagenet_targets.view(1, -1).expand_as(pred))

            for i, k in enumerate(topk):
                topk_correct[i] += correct[:k].sum().item()

            total += targets.size(0)

    return [100 * correct / total for correct in topk_correct]

# Evaluate original model performance
print("Evaluating original model performance...")
top1_acc, top5_acc = calculate_accuracy(model, dataloader, folder_to_imagenet, topk=(1, 5))
print(f"Top-1 Accuracy: {top1_acc:.2f}%")
print(f"Top-5 Accuracy: {top5_acc:.2f}%")

## Adversarial Dataset Generation Function

The function `create_adversarial_dataset`, defined in the cell above, creates and saves adversarial examples:

- **Function Purpose**:
  - Generates adversarial examples using specified attack method
  - Creates a structured dataset of adversarial images
  - Ensures L∞ constraint compliance
  - Saves images in organized class directories

- **Key Features**:
  - **Dataset Structure**:
    - Creates 100 class directories (for ImageNet classes)
    - Maintains original class organization
    - Saves images with proper naming convention
  
  - **Attack Implementation**:
    - Applies specified attack function to input images
    - Handles batch processing for efficiency
    - Maps folder indices to true ImageNet indices
    - Enforces L∞ constraint on perturbations
  
  - **Quality Control**:
    - Validates perturbation bounds
    - Scales perturbations if needed
    - Ensures ε-bound compliance
    - Denormalizes images before saving
  
  - **Error Handling**:
    - Asserts L∞ constraint compliance after any rescaling
    - Reports the offending batch index and perturbation size in the assertion message

- **Implementation Details**:
  - Efficient batch processing
  - Progress tracking with tqdm
  - Proper device handling
  - Memory-efficient operations

This function is crucial for:
- Creating standardized adversarial datasets
- Ensuring attack constraint compliance
- Maintaining dataset organization
- Facilitating model evaluation
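
The rescaling step shrinks every entry of the perturbation by the same factor. A common alternative, shown here only for comparison and not used by the function above, is element-wise clamping, which changes only the entries that exceed the bound. The numbers are made up for illustration.

```python
import torch

epsilon = 0.02
delta = torch.tensor([0.010, -0.015, 0.040])  # last entry violates the bound

# Rescaling (the approach described above): one factor for the whole tensor
scaled = delta * (epsilon / delta.abs().max())

# Clamping (assumed alternative): only the out-of-range entry changes
clamped = delta.clamp(-epsilon, epsilon)

print(scaled)   # every entry is halved
print(clamped)  # first two entries are unchanged
```

Both results satisfy the L∞ bound, but rescaling also weakens the parts of the perturbation that were already within budget.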

## Model and Dataset Setup

The earlier cells perform the initial setup of the model and dataset for adversarial attack experiments:

- **Model Setup**:
  - Loads pre-trained ResNet-34 model with ImageNet weights
  - Moves model to appropriate device (CPU/GPU)
  - Sets model to evaluation mode

- **Data Preprocessing**:
  - Defines ImageNet normalization parameters:
    - Mean: [0.485, 0.456, 0.406]
    - Standard deviation: [0.229, 0.224, 0.225]
  - Creates transformation pipeline for image preprocessing

- **Dataset Loading**:
  - Loads test dataset from "./TestDataSet"
  - Implements robust label mapping:
    - Attempts to load ImageNet class index mapping
    - Creates fallback mapping if needed
    - Efficiently maps folder indices to ImageNet indices using O(1) lookups
  
- **DataLoader Configuration**:
  - Creates DataLoader with batch size 32
  - Disables shuffling for consistent evaluation
  - Uses 4 worker processes for efficient loading

- **Visualization and Evaluation**:
  - Implements `show_examples` function to display sample images
  - Shows example images with their labels
  - Evaluates model performance on original dataset
  - Reports Top-1 and Top-5 accuracy

This setup is crucial for:
- Establishing baseline model performance
- Ensuring proper data preprocessing
- Verifying dataset loading and organization
- Setting up the foundation for adversarial attacks
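
The label mapping can be sketched on a small in-memory example. The entries below are hypothetical placeholders in the `"index: name"` format that the notebook parses; they are not the contents of the real `labels_list.json`, and integer keys are used here for simplicity.

```python
import json

# Hypothetical file contents in the "index: name" format
raw = json.dumps(["401: example class A", "402: example class B", "malformed entry"])

label_map = {}
for entry in json.loads(raw):
    idx, sep, name = entry.partition(": ")
    if sep:  # skip entries that lack the separator
        label_map[int(idx.strip())] = name

# Folder i (in sorted order) corresponds to ImageNet index 401 + i
folder_to_index = {i: 401 + i for i in range(2)}
names = [label_map.get(folder_to_index[i], "Unknown") for i in range(2)]
print(names)
```

The direct `401 + i` offset only holds if the folders sort in the same order as the entries in the label file, which is the assumption the notebook states.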

## Task 2: Fast Gradient Sign Method (FGSM) Attack Implementation

This cell implements and evaluates the FGSM attack, which is a simple but effective adversarial attack method:

- **FGSM Attack Implementation**:
  - **Function Purpose**:
    - Implements the Fast Gradient Sign Method attack
    - Generates adversarial examples with L∞ constraint
    - Ensures valid pixel ranges after perturbation
  
  - **Key Features**:
    - Single-step gradient-based attack
    - Uses sign of gradient for perturbation
    - Enforces epsilon budget constraint
    - Maintains image normalization

- **Attack Evaluation**:
  - **Visualization**:
    - Tests attack on 3 example images
    - Shows original vs. adversarial images
    - Displays model predictions and confidence
  
  - **Dataset Creation**:
    - Creates "AdversarialTestSet1" directory
    - Generates adversarial examples for all test images
    - Maintains original class structure
  
  - **Performance Analysis**:
    - Calculates Top-1 and Top-5 accuracy
    - Computes absolute and relative accuracy drops
    - Reports the relative Top-1 drop (target: ≥50%)

- **Implementation Details**:
  - Uses epsilon = 0.02 for perturbation budget
  - Properly handles device placement
  - Maintains image normalization
  - Includes error checking and assertions

This implementation is crucial for:
- Demonstrating basic adversarial attack effectiveness
- Establishing baseline attack performance
- Verifying attack constraints
- Setting up comparison for more advanced attacks
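
The single-step update, `x_adv = x + ε · sign(∇ₓ loss)`, can be tried on a toy problem before running it on ResNet-34. In the sketch below `toy_model` is a hypothetical linear classifier and the inputs are random vectors, so it illustrates the mechanics only; pixel-range clamping is omitted because the toy inputs are not images.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
toy_model = nn.Linear(8, 3)  # hypothetical stand-in for the real classifier
x = torch.randn(4, 8)
y = torch.tensor([0, 1, 2, 0])
epsilon = 0.02

# Gradient of the loss with respect to the input
x_req = x.clone().requires_grad_(True)
loss = F.cross_entropy(toy_model(x_req), y)
grad, = torch.autograd.grad(loss, x_req)

# One step in the direction that increases the loss
x_adv = (x + epsilon * grad.sign()).detach()

with torch.no_grad():
    adv_loss = F.cross_entropy(toy_model(x_adv), y)

print(f"L-inf distance: {(x_adv - x).abs().max().item():.4f}")
print(f"loss before: {loss.item():.4f}, after: {adv_loss.item():.4f}")
```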

In [None]:
def fgsm_attack(model, images, labels, epsilon=0.02):
    """
    Implement Fast Gradient Sign Method (FGSM) attack
    
    Args:
        model: The model to attack
        images: Input images
        labels: True labels (ImageNet indices)
        epsilon: Attack budget
        
    Returns:
        Adversarial examples
    """
    # Clone images and set requires_grad
    images = images.clone().detach().to(device)
    images.requires_grad = True
    
    # Forward pass
    outputs = model(images)
    loss = F.cross_entropy(outputs, labels)
    
    # Backward pass
    model.zero_grad()
    loss.backward()
    
    # Generate adversarial examples
    grad_sign = images.grad.sign()
    perturbed_images = images + epsilon * grad_sign
    
    # Ensure valid pixel range
    for c, (mean, std) in enumerate(zip(mean_norms, std_norms)):
        min_val = (0 - mean) / std
        max_val = (1 - mean) / std
        perturbed_images[:, c] = torch.clamp(perturbed_images[:, c], min_val, max_val)
    
    # Verify L∞ constraint
    with torch.no_grad():
        delta = perturbed_images - images
        max_perturbation = delta.abs().max().item()
        if max_perturbation > epsilon:
            # Scale down to satisfy constraint exactly
            scale_factor = epsilon / max_perturbation
            delta = delta * scale_factor
            perturbed_images = images + delta
    
    return perturbed_images.detach()

# Test FGSM attack
print("\nTesting FGSM Attack:")
test_batch, test_labels = next(iter(dataloader))
test_batch, test_labels = test_batch.to(device), test_labels.to(device)

for i in range(min(3, len(test_batch))):
    img = test_batch[i]
    folder_idx = test_labels[i].item()
    imagenet_idx = folder_to_imagenet[folder_idx]
    
    # Generate FGSM adversarial example
    epsilon = 0.02
    adv_img = fgsm_attack(
        model,
        img.unsqueeze(0),
        torch.tensor([imagenet_idx]).to(device),
        epsilon=epsilon
    ).squeeze(0)
    
    # Visualize results
    plot_images_with_predictions(model, img, adv_img, folder_idx, epsilon)

# Create FGSM adversarial dataset
adv_dataset_path_1 = "./AdversarialTestSet1"
fgsm_params = {'epsilon': 0.02}

if os.path.exists(adv_dataset_path_1):
    import shutil
    shutil.rmtree(adv_dataset_path_1)

create_adversarial_dataset(model, dataloader, fgsm_attack, fgsm_params, adv_dataset_path_1, "AdversarialTestSet1")

# Load and evaluate FGSM adversarial dataset
adv_dataset_1 = torchvision.datasets.ImageFolder(root=adv_dataset_path_1, transform=plain_transforms)
adv_dataloader_1 = DataLoader(adv_dataset_1, batch_size=32, shuffle=False, num_workers=4)

top1_acc_adv1, top5_acc_adv1 = calculate_accuracy(model, adv_dataloader_1, folder_to_imagenet, topk=(1, 5))
print(f"\nFGSM Attack Results:")
print(f"Top-1 Accuracy: {top1_acc_adv1:.2f}%")
print(f"Top-5 Accuracy: {top5_acc_adv1:.2f}%")
print(f"Absolute drop in Top-1 Accuracy: {top1_acc - top1_acc_adv1:.2f}%")

# Calculate relative drop
rel_drop_top1_fgsm = 100 * (top1_acc - top1_acc_adv1) / top1_acc if top1_acc > 0 else 100.0
print(f"Relative drop in Top-1 Accuracy: {rel_drop_top1_fgsm:.2f}%")


## Task 3: Projected Gradient Descent (PGD) Attack Implementation

This cell implements and evaluates the PGD attack, a more sophisticated iterative adversarial attack method:

- **PGD Attack Implementation**:
  - **Function Purpose**:
    - Implements iterative PGD attack with L∞ constraint
    - Generates stronger adversarial examples than FGSM
    - Ensures valid pixel ranges and perturbation bounds
  
  - **Key Features**:
    - Multi-step gradient-based attack
    - Random initialization within epsilon ball
    - Iterative gradient updates
    - Projection to maintain L∞ constraint
    - Step size control (alpha = 0.005)
    - 10 iterations for refinement

- **Attack Evaluation**:
  - **Visualization**:
    - Tests attack on 3 example images
    - Shows original vs. adversarial images
    - Displays model predictions and confidence
  
  - **Dataset Creation**:
    - Creates "AdversarialTestSet2" directory
    - Generates adversarial examples for all test images
    - Maintains original class structure
  
  - **Constraint Verification**:
    - Implements thorough L∞ constraint checking
    - Verifies maximum perturbation ≤ 0.02
    - Ensures pixel value validity
    - Provides detailed verification output

- **Performance Analysis**:
  - Calculates Top-1 and Top-5 accuracy
  - Computes absolute and relative accuracy drops
  - Reports the relative Top-1 drop (target: ≥70%)
  - Compares with FGSM results

- **Implementation Details**:
  - Uses epsilon = 0.02 for perturbation budget
  - Implements proper device handling
  - Maintains image normalization
  - Includes comprehensive error checking

This implementation is crucial for:
- Demonstrating advanced adversarial attack effectiveness
- Comparing with simpler FGSM attack
- Verifying strict L∞ constraints
- Achieving higher relative accuracy drop
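
The iterate-and-project loop can be sketched on the same kind of toy problem. `toy_model` is again a hypothetical linear classifier on random vectors; the step size and iteration count match the values listed above, and the pixel-range clamp is left out because the inputs are not images.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
toy_model = nn.Linear(8, 3)  # hypothetical stand-in for the real classifier
x = torch.randn(4, 8)
y = torch.tensor([0, 1, 2, 0])
epsilon, alpha, iterations = 0.02, 0.005, 10

# Random start inside the L-infinity ball
delta = torch.empty_like(x).uniform_(-epsilon, epsilon)

for _ in range(iterations):
    delta.requires_grad_(True)
    loss = F.cross_entropy(toy_model(x + delta), y)
    grad, = torch.autograd.grad(loss, delta)
    # Ascent step, then projection back onto the ball by clamping
    delta = (delta.detach() + alpha * grad.sign()).clamp(-epsilon, epsilon)

x_adv = x + delta
with torch.no_grad():
    clean_loss = F.cross_entropy(toy_model(x), y).item()
    adv_loss = F.cross_entropy(toy_model(x_adv), y).item()

print(f"max |delta|: {delta.abs().max().item():.4f}")
print(f"loss before: {clean_loss:.4f}, after: {adv_loss:.4f}")
```

Because the projection is an element-wise clamp, the perturbation never leaves the ε-ball, so no final rescaling is needed in this sketch.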

In [None]:
def pgd_attack(model, images, labels, epsilon=0.02, alpha=0.005, iterations=10, random_start=True, targeted=False):
    """
    Implement Projected Gradient Descent (PGD) attack with strict L∞ constraint
    
    Args:
        model: The model to attack
        images: Input images
        labels: True labels (ImageNet indices)
        epsilon: Attack budget
        alpha: Step size
        iterations: Number of iterations
        random_start: Whether to use random initialization
        targeted: Whether to perform targeted attack
        
    Returns:
        Adversarial examples
    """
    # Clone original images
    orig_images = images.clone().detach().to(device)
    
    # Initialize perturbation
    delta = torch.zeros_like(orig_images).to(device)
    if random_start:
        delta.uniform_(-epsilon, epsilon)
    
    # PGD iterations
    for _ in range(iterations):
        delta.requires_grad = True
        
        # Forward pass
        adv_images = orig_images + delta
        outputs = model(adv_images)
        
        # Calculate loss (targeted or untargeted)
        if targeted:
            # For targeted attack, we minimize loss to target class
            with torch.no_grad():
                target_labels = outputs.argmin(dim=1)  # Least likely class
            loss = -F.cross_entropy(outputs, target_labels)
        else:
            # For untargeted attack, we maximize loss to true class
            loss = F.cross_entropy(outputs, labels)
        
        # Backward pass
        grad = torch.autograd.grad(loss, delta)[0]
        
        # Update delta
        with torch.no_grad():
            delta.data = delta.data + alpha * grad.sign()
            delta.data = torch.clamp(delta.data, -epsilon, epsilon)
            
            # Ensure valid pixel values
            adv_images = orig_images + delta
            for c, (mean, std) in enumerate(zip(mean_norms, std_norms)):
                min_val = (0 - mean) / std
                max_val = (1 - mean) / std
                valid_delta_c = torch.clamp(adv_images[:,c], min_val, max_val) - orig_images[:,c]
                delta.data[:,c] = valid_delta_c
    
    # Final adversarial images
    adv_images = orig_images + delta
    
    # Verify L∞ constraint
    with torch.no_grad():
        max_perturbation = delta.abs().max().item()
        if max_perturbation > epsilon:
            # Scale down to satisfy constraint exactly
            scale_factor = epsilon / max_perturbation
            delta = delta * scale_factor
            adv_images = orig_images + delta
    
    return adv_images.detach()

# Test PGD attack
print("\nTesting PGD Attack:")
for i in range(min(3, len(test_batch))):
    img = test_batch[i]
    folder_idx = test_labels[i].item()
    imagenet_idx = folder_to_imagenet[folder_idx]
    
    # Generate PGD adversarial example
    epsilon = 0.02
    adv_img = pgd_attack(
        model,
        img.unsqueeze(0),
        torch.tensor([imagenet_idx]).to(device),
        epsilon=epsilon,
        alpha=0.005,
        iterations=10,
        random_start=True
    ).squeeze(0)
    
    # Visualize results
    plot_images_with_predictions(model, img, adv_img, folder_idx, epsilon)

# Create PGD adversarial dataset
adv_dataset_path_2 = "./AdversarialTestSet2"
pgd_params = {
    'epsilon': 0.02,
    'alpha': 0.005,
    'iterations': 10,
    'random_start': True,
    'targeted': False
}

if os.path.exists(adv_dataset_path_2):
    import shutil
    shutil.rmtree(adv_dataset_path_2)

create_adversarial_dataset(model, dataloader, pgd_attack, pgd_params, adv_dataset_path_2, "AdversarialTestSet2")

# Load and evaluate PGD adversarial dataset
adv_dataset_2 = torchvision.datasets.ImageFolder(root=adv_dataset_path_2, transform=plain_transforms)
adv_dataloader_2 = DataLoader(adv_dataset_2, batch_size=32, shuffle=False, num_workers=4)

top1_acc_adv2, top5_acc_adv2 = calculate_accuracy(model, adv_dataloader_2, folder_to_imagenet, topk=(1, 5))
print(f"\nPGD Attack Results:")
print(f"Top-1 Accuracy: {top1_acc_adv2:.2f}%")
print(f"Top-5 Accuracy: {top5_acc_adv2:.2f}%")
print(f"Absolute drop in Top-1 Accuracy: {top1_acc - top1_acc_adv2:.2f}%")

# Calculate relative drop
rel_drop_top1_pgd = 100 * (top1_acc - top1_acc_adv2) / top1_acc if top1_acc > 0 else 100.0
print(f"Relative drop in Top-1 Accuracy: {rel_drop_top1_pgd:.2f}%")

## Task 4: Patch Attack
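
The patch attack in this task confines the perturbation to a small square region of each image. As a rough illustration of the geometry only, the sketch below samples a valid top-left corner for a square patch and reports what fraction of the image the patch covers. It uses only the standard library. The helper names `sample_patch_origin` and `patch_coverage` and the 224x224 image size are assumptions made for illustration, not part of the notebook's code.

```python
import random

def sample_patch_origin(height, width, patch_size, rng=random):
    """Return a (row, col) top-left corner so the patch fits inside the image."""
    if patch_size > height or patch_size > width:
        raise ValueError("patch does not fit inside the image")
    # randint is inclusive on both ends, so the patch may touch the far edge.
    row = rng.randint(0, height - patch_size)
    col = rng.randint(0, width - patch_size)
    return row, col

def patch_coverage(height, width, patch_size):
    """Fraction of image pixels covered by a square patch."""
    return (patch_size * patch_size) / (height * width)

# Assumed 224x224 input; the patch size matches the attack's default of 32.
rng = random.Random(0)
row, col = sample_patch_origin(224, 224, 32, rng)
print(f"patch rows {row}..{row + 31}, cols {col}..{col + 31}")
print(f"coverage: {100 * patch_coverage(224, 224, 32):.2f}% of pixels")
```

Under these assumptions the patch covers roughly 2% of the image, which is one plausible reason the patch attack is run with a much larger ε than the full-image attacks.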

In [None]:
def patch_attack(model, images, labels, patch_size=32, epsilon=0.3, alpha=0.05, iterations=20, targeted=True):
    """
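    Implement a patch attack that perturbs only a small square patch of each image
    
    Args:
        model: The model to attack
        images: Input images (normalized), shape (B, C, H, W)
        labels: True labels (ImageNet indices); used only when targeted=False
        patch_size: Side length of the square patch in pixels
        epsilon: L∞ budget for the perturbation inside the patch
        alpha: Step size
        iterations: Number of iterations
        targeted: If True, push predictions toward each image's least-likely class
        
    Returns:
        Tuple of (adversarial examples, list of patch coordinates, one per image)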
    """
    # Clone original images
    orig_images = images.clone().detach().to(device)
    batch_size = images.shape[0]
    
    # Create masks and store patch coordinates
    patch_masks = []
    patch_coords = []
    
    for i in range(batch_size):
        # Get image dimensions
        _, h, w = images[i].shape
        
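        # Randomly select the patch's top-left corner. np.random.randint excludes
        # the upper bound, so add 1 to let the patch touch the far edge (and to
        # avoid an error when the patch is as large as the image).
        # Note: x is the row (height) offset and y is the column (width) offset.
        x = np.random.randint(0, h - patch_size + 1)
        y = np.random.randint(0, w - patch_size + 1)
        patch_coords.append((x, y))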
        
        # Create mask (1 within patch, 0 elsewhere)
        mask = torch.zeros_like(images[i])
        mask[:, x:x+patch_size, y:y+patch_size] = 1.0
        patch_masks.append(mask)
    
    # Stack masks
    patch_masks = torch.stack(patch_masks).to(device)
    
    # Initialize delta (perturbation)
    delta = torch.zeros_like(orig_images).to(device)
    
    # Get target classes for targeted attack
    if targeted:
        with torch.no_grad():
            outputs = model(orig_images)  # use the copy that is already on `device`
            target_labels = outputs.argmin(dim=1)  # Use least-likely class as target
    
    # PGD iterations
    for _ in range(iterations):
        delta.requires_grad = True
        
        # Forward pass with current perturbation
        adv_images = orig_images + delta * patch_masks  # Apply perturbation only within patch
        outputs = model(adv_images)
        
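        # Calculate loss (targeted or untargeted). The update below is gradient
        # ascent, so the targeted loss is negated to pull predictions toward the target.
        if targeted:
            loss = -F.cross_entropy(outputs, target_labels)
        else:
            loss = F.cross_entropy(outputs, labels.to(device))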
        
        # Backward pass
        grad = torch.autograd.grad(loss, delta)[0]
        
        # Update delta
        with torch.no_grad():
            # Apply update only within patch
            grad_masked = grad * patch_masks
            delta.data = delta.data + alpha * grad_masked.sign()
            delta.data = torch.clamp(delta.data, -epsilon, epsilon)
    
    # Final adversarial images (detached so the in-place clamp below is not tracked by autograd)
    adv_images = (orig_images + delta * patch_masks).detach()
    
    # Ensure valid pixel values
    for c, (mean, std) in enumerate(zip(mean_norms, std_norms)):
        min_val = (0 - mean) / std
        max_val = (1 - mean) / std
        adv_images[:, c] = torch.clamp(adv_images[:, c], min_val, max_val)
    
    # Verify no changes outside patch
    with torch.no_grad():
        outside_patch = (adv_images - orig_images) * (1 - patch_masks)
        assert outside_patch.abs().max().item() < 1e-6, "Changes detected outside patch area!"
    
    return adv_images.detach(), patch_coords

# Test Patch attack
print("\nTesting Patch Attack:")
for i in range(min(3, len(test_batch))):
    img = test_batch[i]
    folder_idx = test_labels[i].item()
    imagenet_idx = folder_to_imagenet[folder_idx]
    
    # Generate Patch adversarial example
    epsilon = 0.3
    adv_img, patch_coords = patch_attack(
        model,
        img.unsqueeze(0),
        torch.tensor([imagenet_idx]).to(device),
        patch_size=32,
        epsilon=epsilon,
        alpha=0.05,
        iterations=20,
        targeted=True
    )
    adv_img = adv_img.squeeze(0)
    patch_coords = patch_coords[0]
    
    # Visualize results
    plot_images_with_predictions(model, img, adv_img, folder_idx, epsilon, patch_coords=patch_coords)

# Create Patch adversarial dataset
adv_dataset_path_3 = "./AdversarialTestSet3"
patch_params = {
    'patch_size': 32,
    'epsilon': 0.3,
    'alpha': 0.05,
    'iterations': 20,
    'targeted': True
}

if os.path.exists(adv_dataset_path_3):
    import shutil
    shutil.rmtree(adv_dataset_path_3)

create_adversarial_dataset(model, dataloader, patch_attack, patch_params, adv_dataset_path_3, "AdversarialTestSet3")

# Load and evaluate Patch adversarial dataset
adv_dataset_3 = torchvision.datasets.ImageFolder(root=adv_dataset_path_3, transform=plain_transforms)
adv_dataloader_3 = DataLoader(adv_dataset_3, batch_size=32, shuffle=False, num_workers=4)

top1_acc_adv3, top5_acc_adv3 = calculate_accuracy(model, adv_dataloader_3, folder_to_imagenet, topk=(1, 5))
print(f"\nPatch Attack Results:")
print(f"Top-1 Accuracy: {top1_acc_adv3:.2f}%")
print(f"Top-5 Accuracy: {top5_acc_adv3:.2f}%")
print(f"Absolute drop in Top-1 Accuracy: {top1_acc - top1_acc_adv3:.2f} percentage points")

# Calculate relative drop
rel_drop_top1_patch = 100 * (top1_acc - top1_acc_adv3) / top1_acc if top1_acc > 0 else 100.0
print(f"Relative drop in Top-1 Accuracy: {rel_drop_top1_patch:.2f}%")


## Task 5: Attack Transferability
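
Transferability in this task is summarized as the accuracy drop on the second model expressed as a percentage of the accuracy drop on the model the attack was crafted against. The sketch below wraps that arithmetic in a small helper. The function name `transfer_rate` and the accuracy values are hypothetical, chosen only to make the calculation easy to follow; they are not results from this notebook.

```python
def transfer_rate(src_clean, src_adv, tgt_clean, tgt_adv):
    """Accuracy drop on the target model as a percentage of the drop on the source model.

    All arguments are top-1 accuracies in percent.
    """
    src_drop = src_clean - src_adv
    tgt_drop = tgt_clean - tgt_adv
    # Guard against division by zero when the attack did not hurt the source model.
    return tgt_drop / src_drop * 100 if src_drop > 0 else 0.0

# Made-up accuracies, used only to illustrate the arithmetic.
rate = transfer_rate(src_clean=80.0, src_adv=20.0, tgt_clean=75.0, tgt_adv=60.0)
print(f"transfer rate: {rate:.1f}%")  # (75 - 60) / (80 - 20) * 100 = 25.0%
```

Measuring the drop relative to each model's own clean accuracy keeps images that the second model already misclassifies from being counted as successful transfers.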

In [None]:
print("\nTask 5 - Testing Attack Transferability:")

# Load a different pre-trained model (DenseNet-121)
new_model = torchvision.models.densenet121(weights='IMAGENET1K_V1')
new_model = new_model.to(device)
new_model.eval()

# Evaluate DenseNet on original dataset
orig_top1_acc, orig_top5_acc = calculate_accuracy(new_model, dataloader, folder_to_imagenet, topk=(1, 5))
print(f"\nDenseNet-121 on Original Test Dataset:")
print(f"Top-1 Accuracy: {orig_top1_acc:.2f}%")
print(f"Top-5 Accuracy: {orig_top5_acc:.2f}%")

# Evaluate DenseNet on FGSM adversarial dataset
adv1_top1_acc, adv1_top5_acc = calculate_accuracy(new_model, adv_dataloader_1, folder_to_imagenet, topk=(1, 5))
print(f"\nDenseNet-121 on FGSM Adversarial Dataset:")
print(f"Top-1 Accuracy: {adv1_top1_acc:.2f}%")
print(f"Top-5 Accuracy: {adv1_top5_acc:.2f}%")
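# 100 - accuracy is the top-1 error rate; it also counts images DenseNet misclassifies without any attack
print(f"Top-1 error rate on transferred examples: {100 - adv1_top1_acc:.2f}%")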

# Evaluate DenseNet on PGD adversarial dataset
adv2_top1_acc, adv2_top5_acc = calculate_accuracy(new_model, adv_dataloader_2, folder_to_imagenet, topk=(1, 5))
print(f"\nDenseNet-121 on PGD Adversarial Dataset:")
print(f"Top-1 Accuracy: {adv2_top1_acc:.2f}%")
print(f"Top-5 Accuracy: {adv2_top5_acc:.2f}%")
print(f"Top-1 error rate on transferred examples: {100 - adv2_top1_acc:.2f}%")

# Evaluate DenseNet on Patch adversarial dataset
adv3_top1_acc, adv3_top5_acc = calculate_accuracy(new_model, adv_dataloader_3, folder_to_imagenet, topk=(1, 5))
print(f"\nDenseNet-121 on Patch Adversarial Dataset:")
print(f"Top-1 Accuracy: {adv3_top1_acc:.2f}%")
print(f"Top-5 Accuracy: {adv3_top5_acc:.2f}%")
print(f"Top-1 error rate on transferred examples: {100 - adv3_top1_acc:.2f}%")

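# Calculate transferability rates: the DenseNet-121 accuracy drop as a
# percentage of the ResNet-34 accuracy drop (both in percentage points)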
resnet_fgsm_success = top1_acc - top1_acc_adv1
densenet_fgsm_success = orig_top1_acc - adv1_top1_acc
fgsm_transfer_rate = densenet_fgsm_success / resnet_fgsm_success * 100 if resnet_fgsm_success > 0 else 0

resnet_pgd_success = top1_acc - top1_acc_adv2
densenet_pgd_success = orig_top1_acc - adv2_top1_acc
pgd_transfer_rate = densenet_pgd_success / resnet_pgd_success * 100 if resnet_pgd_success > 0 else 0

resnet_patch_success = top1_acc - top1_acc_adv3
densenet_patch_success = orig_top1_acc - adv3_top1_acc
patch_transfer_rate = densenet_patch_success / resnet_patch_success * 100 if resnet_patch_success > 0 else 0



In [None]:
# === Visualization Utilities for Adversarial Attacks ===

import matplotlib.pyplot as plt
import numpy as np
import torch
import pandas as pd
import seaborn as sns
from sklearn.metrics import confusion_matrix
from IPython.display import display  # used by show_confusion_or_accuracy_table

def plot_confusion_matrix(model, dataloader, folder_to_imagenet, class_names, attack_name="Clean", N=10):
    """
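    Plots a confusion matrix for the given model and dataloader.
    Only the first N classes (sorted by ImageNet index) are shown for clarity;
    predictions that fall outside those N classes do not appear in the matrix.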
    """
    all_preds = []
    all_targets = []
    model.eval()
    with torch.no_grad():
        for inputs, targets in dataloader:
            inputs = inputs.to(next(model.parameters()).device)
            imagenet_targets = torch.tensor([folder_to_imagenet[t.item()] for t in targets]).to(inputs.device)
            outputs = model(inputs)
            preds = outputs.argmax(dim=1).cpu().numpy()
            all_preds.extend(preds)
            all_targets.extend(imagenet_targets.cpu().numpy())

    unique_classes = sorted(list(set(all_targets)))[:N]
    cm = confusion_matrix(all_targets, all_preds, labels=unique_classes)
    plt.figure(figsize=(10, 8))
    sns.heatmap(
        cm, annot=True, fmt='d', cmap='Blues',
        xticklabels=[class_names.get(str(i), f"Class {i}") for i in unique_classes],
        yticklabels=[class_names.get(str(i), f"Class {i}") for i in unique_classes]
    )
    plt.xlabel("Predicted Label")
    plt.ylabel("True Label")
    plt.title(f"Confusion Matrix ({attack_name} Set, First {N} Classes)")
    plt.tight_layout()
    plt.show()

def top_5_classes(y, label_map=None):
    """Returns top-5 (class_name, probability) pairs from a vector of logits.

    Softmax is always applied, so pass raw logits rather than probabilities.
    """
    if y.ndim == 2:
        y = y.squeeze(0)
    probs = torch.softmax(y, dim=0).detach().cpu().numpy()
    top5_idx = np.argsort(probs)[::-1][:5]
    if label_map is not None:
        names = [label_map.get(str(idx), f"Class {idx}") for idx in top5_idx]
    else:
        names = [f"Class {idx}" for idx in top5_idx]
    return list(zip(names, probs[top5_idx]))

def plot_prediction(img, logits, label_map=None, title=None):
    """
    Visualizes the input image and a bar chart of the top-5 model predictions.
    img: tensor or PIL image (normalized or not)
    logits: raw model output (softmax is applied internally)
    label_map: dict mapping class indices to names (optional)
    """
    # If tensor, denormalize for display
    if isinstance(img, torch.Tensor):
        img = img.clone().detach().cpu()
        if img.ndim == 4:  # batch
            img = img[0]
        if img.shape[0] == 3:
            img = img * torch.tensor([0.229, 0.224, 0.225]).view(3,1,1) + torch.tensor([0.485, 0.456, 0.406]).view(3,1,1)
            img = torch.clamp(img, 0, 1)
        img = np.transpose(img.numpy(), (1,2,0))
    elif hasattr(img, 'convert'):
        img = np.array(img.convert('RGB')) / 255.0

    top5 = top_5_classes(logits, label_map)
    names, probs = zip(*top5)

    fig, axs = plt.subplots(1, 2, figsize=(10, 5))
    if title:
        fig.suptitle(title, fontsize=16)
    axs[0].imshow(img)
    axs[0].axis('off')
    axs[0].set_title("Input Image")
    bars = axs[1].barh(range(5), probs[::-1], color='skyblue')
    axs[1].set_yticks(range(5))
    axs[1].set_yticklabels(names[::-1])
    axs[1].set_xlim(0, 1)
    axs[1].set_xlabel("Probability")
    axs[1].set_title("Top-5 Predictions")
    for i, bar in enumerate(bars):
        axs[1].text(bar.get_width() + 0.01, bar.get_y() + bar.get_height()/2, f"{probs[::-1][i]:.3f}", va='center')
    plt.tight_layout()
    plt.show()

def visualize_perturbation(original_img, adversarial_img, title="Perturbation Visualization"):
    """
    Visualize the perturbation between original and adversarial images.
    Both images should be tensors (normalized).
    """
    original_img = original_img.clone().detach().cpu()
    adversarial_img = adversarial_img.clone().detach().cpu()
    perturbation = adversarial_img - original_img
    perturbation_display = perturbation * 10 + 0.5  # Amplify for visibility
    perturbation_display = torch.clamp(perturbation_display, 0, 1)

    # Denormalize for display
    def denorm(img):
        img = img * torch.tensor([0.229, 0.224, 0.225]).view(3,1,1) + torch.tensor([0.485, 0.456, 0.406]).view(3,1,1)
        return torch.clamp(img, 0, 1)
    original_img = denorm(original_img)
    adversarial_img = denorm(adversarial_img)

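    l_inf_dist = perturbation.abs().max().item()
    # Count a pixel as perturbed if any of its channels changed, so the
    # percentage is taken over pixels and cannot exceed 100%
    l0_dist = (perturbation.abs() > 1e-5).any(dim=0).sum().item()
    total_pixels = perturbation.shape[1] * perturbation.shape[2]
    l0_percentage = (l0_dist / total_pixels) * 100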

    fig, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(18, 6))
    ax1.imshow(np.transpose(original_img.numpy(), (1,2,0)))
    ax1.set_title("Original Image")
    ax1.axis('off')
    ax2.imshow(np.transpose(adversarial_img.numpy(), (1,2,0)))
    ax2.set_title("Adversarial Image")
    ax2.axis('off')
    ax3.imshow(np.transpose(perturbation_display.numpy(), (1,2,0)))
    ax3.set_title(f"Perturbation (x10)\nL∞={l_inf_dist:.4f}, L0={l0_percentage:.2f}%")
    ax3.axis('off')
    plt.suptitle(title)
    plt.tight_layout()
    plt.show()

def plot_probability_over_queries(probs_target, probs_incorrect=None, attack_name=None, 
                                label="Class Probability", xlabel="Queries", 
                                ylabel="Probability", title=None):
    """
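    Plot the probability of classes over queries (for blackbox/iterative attacks).
    probs_target: list or array of target class probabilities
    probs_incorrect: optional list or array of incorrect class probabilities
    attack_name: used to build a default title when `title` is not given
    label: currently unused
    Points are assumed to be spaced 50 queries apart on the x-axis.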
    """
    plt.figure(figsize=(7, 5))
    x_values = range(0, len(probs_target) * 50, 50)  # Assuming 50 queries per point
    
    plt.plot(x_values, probs_target, label="Target Class", color='blue')
    
    if probs_incorrect is not None:
        plt.plot(x_values[:len(probs_incorrect)], probs_incorrect, 
                label="Incorrect Class", color='red', linestyle='--')
    
    plt.xlabel(xlabel)
    plt.ylabel(ylabel)
    
    # Use the provided title or construct one with attack name
    if title:
        plt.title(title)
    elif attack_name:
        plt.title(f"{attack_name} Attack: Probability Over Iterations")
    
    plt.legend()
    plt.show()

def show_confusion_or_accuracy_table(df, cmap='Blues', axis=None, title=None):
    """
    Display a DataFrame (accuracy/confusion) as a color-graded table.
    """
    styled = df.style.background_gradient(cmap=cmap, axis=axis)
    if title:
        print(f"\n{title}")
    display(styled)

In [None]:
# === Example Usage of Visualization Utilities (Enhanced) ===

# 1. Model Prediction
print("Example: Image and Top-5 Predictions")
img, label = dataset[0]
with torch.no_grad():  # inference only, no gradients needed
    logits = model(img.unsqueeze(0).to(device))
plot_prediction(img, logits.squeeze(0), label_map=imagenet_label_map, title="Model Prediction")

# 2. Perturbation Visualization
print("\nExample: Visualize Perturbation (FGSM)")
epsilon = 0.02
imagenet_idx = folder_to_imagenet[label] if 'folder_to_imagenet' in globals() else label
adv_img = fgsm_attack(model, img.unsqueeze(0).to(device), torch.tensor([imagenet_idx]).to(device), epsilon=epsilon).squeeze(0)
visualize_perturbation(img, adv_img, title=f"FGSM Perturbation (ε={epsilon})")

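# 3. Probability Decay Graph (PGD)
# NOTE: the values below are illustrative placeholders, not measurements from an attack run.
print("\nExample: Probability Decay during PGD Attack (illustrative values)")
probs_target = [0.9, 0.7, 0.5, 0.3, 0.1]
probs_incorrect = [0.05, 0.15, 0.3, 0.5, 0.7]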
plot_probability_over_queries(
    probs_target,
    probs_incorrect,
    attack_name="PGD",
    xlabel="PGD Iterations (Queries)",
    ylabel="Probability",
    title="PGD Attack: Target vs. Incorrect Class Probability Decay"
)

# 4. Confusion Matrix for ImageNet Subset
print("\nExample: Confusion Matrix for ImageNet Subset (Clean Set)")
plot_confusion_matrix(
    model, 
    dataloader, 
    folder_to_imagenet, 
    imagenet_label_map, 
    attack_name="Clean", 
    N=10
)

## Conclusion and Discussion
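
The perturbation plots in this section report two distances between the original and adversarial image: the L∞ distance (largest absolute change in any value) and an L0 percentage (share of pixels that changed). The sketch below computes both on a toy image stored as nested `[C][H][W]` lists, using plain Python so it runs without the notebook's state. The function name `linf_and_l0` and the toy values are assumptions for illustration only.

```python
def linf_and_l0(original, adversarial, tol=1e-5):
    """Return (L-infinity distance, fraction of perturbed pixels) for [C][H][W] nested lists."""
    channels = len(original)
    height, width = len(original[0]), len(original[0][0])
    linf = 0.0
    changed_pixels = 0
    for i in range(height):
        for j in range(width):
            diffs = [abs(adversarial[c][i][j] - original[c][i][j]) for c in range(channels)]
            linf = max(linf, max(diffs))
            # A pixel counts once, even if several of its channels changed.
            if max(diffs) > tol:
                changed_pixels += 1
    return linf, changed_pixels / (height * width)

# Toy 3-channel, 2x2 image: perturb all three channels of a single pixel.
orig = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(3)]
adv = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(3)]
for c in range(3):
    adv[c][0][1] = 0.25
linf, l0_frac = linf_and_l0(orig, adv)
print(linf, l0_frac)  # 0.25 0.25
```

Counting per pixel rather than per channel value keeps the L0 percentage between 0% and 100%, which makes the patch attack's small footprint directly comparable with the full-image attacks.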

In [None]:
print("\nSummary of Results:")
print("=" * 110)
print(f"{'Dataset':<25} {'ResNet-34 Top-1':<20} {'ResNet-34 Top-5':<20} {'DenseNet-121 Top-1':<20} {'DenseNet-121 Top-5':<20}")
print("-" * 110)
print(f"{'Original Test Set':<25} {top1_acc:<20.2f} {top5_acc:<20.2f} {orig_top1_acc:<20.2f} {orig_top5_acc:<20.2f}")
print(f"{'Adversarial Set 1 (FGSM)':<25} {top1_acc_adv1:<20.2f} {top5_acc_adv1:<20.2f} {adv1_top1_acc:<20.2f} {adv1_top5_acc:<20.2f}")
print(f"{'Adversarial Set 2 (PGD)':<25} {top1_acc_adv2:<20.2f} {top5_acc_adv2:<20.2f} {adv2_top1_acc:<20.2f} {adv2_top5_acc:<20.2f}")
print(f"{'Adversarial Set 3 (Patch)':<25} {top1_acc_adv3:<20.2f} {top5_acc_adv3:<20.2f} {adv3_top1_acc:<20.2f} {adv3_top5_acc:<20.2f}")

print("\nTransferability Analysis:")
print("=" * 80)
print(f"{'Attack Method':<25} {'ResNet-34 Success':<20} {'DenseNet-121 Success':<20} {'Transfer Rate (%)':<15}")
print("-" * 80)
print(f"{'FGSM':<25} {resnet_fgsm_success:<20.2f} {densenet_fgsm_success:<20.2f} {fgsm_transfer_rate:<15.2f}")
print(f"{'PGD':<25} {resnet_pgd_success:<20.2f} {densenet_pgd_success:<20.2f} {pgd_transfer_rate:<15.2f}")
print(f"{'Patch':<25} {resnet_patch_success:<20.2f} {densenet_patch_success:<20.2f} {patch_transfer_rate:<15.2f}")

# Save results to file for report
with open('accuracy_results.txt', 'w') as f:
    f.write("Model Accuracy Results\n")
    f.write("=====================\n\n")

    f.write("ResNet-34 Results:\n")
    f.write(f"Original Test Set - Top-1 Accuracy: {top1_acc:.2f}%, Top-5 Accuracy: {top5_acc:.2f}%\n")
    f.write(f"FGSM Attack - Top-1 Accuracy: {top1_acc_adv1:.2f}%, Top-5 Accuracy: {top5_acc_adv1:.2f}%\n")
    f.write(f"PGD Attack - Top-1 Accuracy: {top1_acc_adv2:.2f}%, Top-5 Accuracy: {top5_acc_adv2:.2f}%\n")
    f.write(f"Patch Attack - Top-1 Accuracy: {top1_acc_adv3:.2f}%, Top-5 Accuracy: {top5_acc_adv3:.2f}%\n\n")

    f.write("DenseNet-121 Results:\n")
    f.write(f"Original Test Set - Top-1 Accuracy: {orig_top1_acc:.2f}%, Top-5 Accuracy: {orig_top5_acc:.2f}%\n")
    f.write(f"FGSM Attack - Top-1 Accuracy: {adv1_top1_acc:.2f}%, Top-5 Accuracy: {adv1_top5_acc:.2f}%\n")
    f.write(f"PGD Attack - Top-1 Accuracy: {adv2_top1_acc:.2f}%, Top-5 Accuracy: {adv2_top5_acc:.2f}%\n")
    f.write(f"Patch Attack - Top-1 Accuracy: {adv3_top1_acc:.2f}%, Top-5 Accuracy: {adv3_top5_acc:.2f}%\n\n")

    f.write("Transferability Results:\n")
    f.write(f"FGSM Transferability Rate: {fgsm_transfer_rate:.2f}%\n")
    f.write(f"PGD Transferability Rate: {pgd_transfer_rate:.2f}%\n")
    f.write(f"Patch Transferability Rate: {patch_transfer_rate:.2f}%\n")

print("\nResults saved to 'accuracy_results.txt'")

# Visualize perturbations
def visualize_perturbation(original_img, adversarial_img, title="Perturbation Visualization"):
    """Visualize the perturbation between original and adversarial images"""
    # Clone and detach images
    original_img = original_img.clone().detach().cpu()
    adversarial_img = adversarial_img.clone().detach().cpu()
    
    # Calculate perturbation
    perturbation = adversarial_img - original_img
    
    # Scale perturbation for better visualization
    perturbation_display = perturbation * 10 + 0.5
    perturbation_display = torch.clamp(perturbation_display, 0, 1)
    
    # Calculate L-infinity distance
    l_inf_dist = perturbation.abs().max().item()
    
    # Calculate L0 distance (number of perturbed pixels; a pixel counts if any channel changed)
    l0_dist = (perturbation.abs() > 1e-5).any(dim=0).sum().item()
    total_pixels = perturbation.shape[1] * perturbation.shape[2]  # H * W
    l0_percentage = (l0_dist / total_pixels) * 100
    
    # Denormalize for visualization
    original_img = original_img * torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1) + torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
    adversarial_img = adversarial_img * torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1) + torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
    
    original_img = torch.clamp(original_img, 0, 1)
    adversarial_img = torch.clamp(adversarial_img, 0, 1)
    
    # Plotting
    fig, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(18, 6))
    
    ax1.imshow(original_img.permute(1, 2, 0))
    ax1.set_title("Original Image")
    ax1.axis('off')
    
    ax2.imshow(adversarial_img.permute(1, 2, 0))
    ax2.set_title("Adversarial Image")
    ax2.axis('off')
    
    ax3.imshow(perturbation_display.permute(1, 2, 0))
    ax3.set_title(f"Perturbation (Amplified 10x)\nL∞ = {l_inf_dist:.4f}, L0 = {l0_percentage:.2f}% of pixels")
    ax3.axis('off')
    
    plt.suptitle(title)
    plt.tight_layout()
    plt.show()

print("\nVisualizing Perturbations for Different Attack Methods:")
test_img = test_batch[0].to(device)
folder_idx = test_labels[0].item()
imagenet_idx = folder_to_imagenet[folder_idx]

# FGSM perturbation
fgsm_adv_img = fgsm_attack(
    model, 
    test_img.unsqueeze(0), 
    torch.tensor([imagenet_idx]).to(device), 
    epsilon=0.02
).squeeze(0)
visualize_perturbation(test_img, fgsm_adv_img, "FGSM Attack Perturbation (ε = 0.02)")

# PGD perturbation
pgd_adv_img = pgd_attack(
    model, 
    test_img.unsqueeze(0), 
    torch.tensor([imagenet_idx]).to(device),
    epsilon=0.02, 
    alpha=0.005, 
    iterations=10
).squeeze(0)
visualize_perturbation(test_img, pgd_adv_img, "PGD Attack Perturbation (ε = 0.02)")

# Patch perturbation
patch_adv_img, _ = patch_attack(
    model, 
    test_img.unsqueeze(0), 
    torch.tensor([imagenet_idx]).to(device),
    patch_size=32, 
    epsilon=0.3, 
    iterations=20
)
patch_adv_img = patch_adv_img.squeeze(0)
visualize_perturbation(test_img, patch_adv_img, "Patch Attack Perturbation (ε = 0.3)")

print("\nConclusion:")
print("We implemented and evaluated three types of adversarial attacks:")
print(f"1. FGSM: A single-step attack with a {rel_drop_top1_fgsm:.2f}% relative drop in Top-1 accuracy")
print(f"2. PGD: An iterative attack with a {rel_drop_top1_pgd:.2f}% relative drop in Top-1 accuracy")
print(f"3. Patch: A localized attack with a {rel_drop_top1_patch:.2f}% relative drop in Top-1 accuracy while modifying only a small patch")
print("\nEach attack was clamped to its configured L∞ budget, and all three adversarial sets were also evaluated on DenseNet-121 to measure transferability.")