# Phase 4: Adversarial Attacks & Robustness - Complete Evaluation

**Project:** Tri-Objective Robust XAI for Medical Imaging  
**Author:** Viraj Pankaj Jain  
**Institution:** University of Glasgow  
**Date:** November 26, 2025  
**Platform:** Google Colab (T4 GPU)

---

## Objectives

This notebook implements **Phases 4.3, 4.4, and 4.5** of the research project:

### Phase 4.3: Baseline Robustness Evaluation
- Evaluate baseline models under adversarial attacks (FGSM, PGD, C&W, AutoAttack)
- Test on ISIC 2018 dermoscopy dataset
- Compute robust accuracy and attack success rates
- Aggregate results across 3 seeds (42, 123, 456)
- Expected: **50-70pp accuracy drop** under PGD ε=8/255
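
The two metrics named in this list can be made concrete with a small, framework-free sketch. It assumes that robust accuracy is the share of all samples still classified correctly under attack, and that attack success rate is the share of initially correct samples that the attack flips. The function name `robustness_metrics` and the toy predictions are illustrative only and are not part of the project code.

```python
from typing import Dict, Sequence


def robustness_metrics(
    labels: Sequence[int],
    clean_preds: Sequence[int],
    adv_preds: Sequence[int],
) -> Dict[str, float]:
    """Return clean accuracy, robust accuracy, drop (pp) and attack success rate (%)."""
    n = len(labels)
    if n == 0:
        raise ValueError("need at least one sample")
    clean_ok = [c == y for c, y in zip(clean_preds, labels)]
    adv_ok = [a == y for a, y in zip(adv_preds, labels)]
    # Samples that were correct before the attack and are wrong after it.
    flipped = sum(c and not a for c, a in zip(clean_ok, adv_ok))
    n_clean_ok = sum(clean_ok)
    clean_acc = 100.0 * n_clean_ok / n
    robust_acc = 100.0 * sum(adv_ok) / n
    asr = 100.0 * flipped / n_clean_ok if n_clean_ok else 0.0
    return {
        "clean_accuracy": clean_acc,
        "robust_accuracy": robust_acc,
        "accuracy_drop": clean_acc - robust_acc,
        "attack_success_rate": asr,
    }


# Toy example: 5 samples, 4 correct on clean inputs, 1 still correct under attack.
metrics = robustness_metrics(
    labels=[0, 1, 2, 3, 4],
    clean_preds=[0, 1, 2, 3, 0],
    adv_preds=[0, 2, 1, 0, 0],
)
print(metrics)
```

In the toy example the accuracy drop is 60pp (80% to 20%), while the attack success rate is 75% because three of the four initially correct samples are flipped.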

### Phase 4.4: Attack Transferability Study  
- Generate adversarial examples on ResNet-50
- Test on EfficientNet-B0 (if available)
- Compute cross-model attack success rates
- Analyze transferability patterns
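
A minimal sketch of the transferability measurement described above, under stated assumptions: two tiny linear classifiers stand in for ResNet-50 (source) and EfficientNet-B0 (target), inputs are assumed to lie in [0, 1], and a single-step sign-gradient attack is used to craft the examples. The helper names `fgsm` and `transfer_success_rate` are hypothetical and do not refer to the project's `src.attacks` API.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def fgsm(model: nn.Module, x: torch.Tensor, y: torch.Tensor, eps: float) -> torch.Tensor:
    """Single-step L-infinity attack on inputs assumed to lie in [0, 1]."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    grad, = torch.autograd.grad(loss, x_adv)
    return (x_adv + eps * grad.sign()).clamp(0.0, 1.0).detach()


@torch.no_grad()
def transfer_success_rate(target: nn.Module, x, x_adv, y) -> float:
    """Percent of samples the target gets right on x but wrong on x_adv."""
    clean_ok = target(x).argmax(dim=1) == y
    adv_wrong = target(x_adv).argmax(dim=1) != y
    n_clean_ok = clean_ok.sum().item()
    if n_clean_ok == 0:
        return 0.0
    return 100.0 * (clean_ok & adv_wrong).sum().item() / n_clean_ok


torch.manual_seed(0)
# Stand-in models; in the notebook these would be the trained checkpoints.
source = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 7)).eval()
target = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 7)).eval()
x = torch.rand(16, 3, 8, 8)
y = target(x).argmax(dim=1)  # labels chosen so the target is correct on every clean input

# Adversarial examples are crafted on the source model only, then replayed on the target.
x_adv = fgsm(source, x, y, eps=8 / 255)
rate = transfer_success_rate(target, x, x_adv, y)
print(f"Transfer success rate: {rate:.1f}%")
```

Conditioning on samples that the target classifies correctly on clean inputs keeps the rate from being inflated by errors the target would have made anyway.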

### Phase 4.5: Adversarial Visualization
- Visualize clean vs adversarial images
- Amplify perturbations for visibility
- Show prediction changes
- Generate figures for dissertation
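
One possible way to amplify a perturbation for display is sketched below. It assumes images are float arrays in [0, 1]; the function name `amplify_perturbation`, the scale factor of 10 and the random stand-in images are illustrative choices, not taken from the project.

```python
import numpy as np


def amplify_perturbation(clean: np.ndarray, adv: np.ndarray, scale: float = 10.0) -> np.ndarray:
    """Map the perturbation (adv - clean) to a displayable image in [0, 1].

    The perturbation is multiplied by `scale` and centred on mid-grey (0.5)
    so that positive and negative changes are both visible.
    """
    delta = adv.astype(np.float64) - clean.astype(np.float64)
    return np.clip(0.5 + scale * delta, 0.0, 1.0)


rng = np.random.default_rng(0)
clean = rng.random((224, 224, 3))                       # stand-in for an image in [0, 1]
noise = rng.choice([-1.0, 1.0], size=clean.shape) * (8 / 255)
adv = np.clip(clean + noise, 0.0, 1.0)                  # stand-in for an adversarial image

vis = amplify_perturbation(clean, adv, scale=10.0)
print(vis.shape, float(vis.min()), float(vis.max()))
```

The resulting array can be passed directly to `plt.imshow(vis)` next to the clean and adversarial images.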

---

## Prerequisites

✅ **Phase 4.1 & 4.2 Complete:** All attacks implemented and tested (109/109 tests passing)  
✅ **Phase 3 Complete:** Baseline models trained (3 seeds)  
✅ **Infrastructure:** All code files ready in repository  
✅ **Hardware:** Google Colab T4 GPU (16GB)

# Section 1: Environment Setup

**Mount Google Drive and clone repository**

In [None]:
# ============================================================================
# CELL 1: ENVIRONMENT SETUP (Google Colab T4)
# ============================================================================

import sys
import os
from pathlib import Path

print("=" * 80)
print("PHASE 4: ADVERSARIAL ATTACKS & ROBUSTNESS")
print("=" * 80)

# Mount Google Drive
from google.colab import drive
drive.mount('/content/drive')
print("✅ Google Drive mounted")

# Clone repository
REPO_PATH = Path('/content/tri-objective-robust-xai-medimg')
if not REPO_PATH.exists():
    !git clone https://github.com/viraj1011JAIN/tri-objective-robust-xai-medimg.git /content/tri-objective-robust-xai-medimg
    print("✅ Repository cloned")
else:
    !cd /content/tri-objective-robust-xai-medimg && git pull
    print("✅ Repository updated")

PROJECT_ROOT = REPO_PATH
os.chdir(PROJECT_ROOT)
sys.path.insert(0, str(PROJECT_ROOT))

print(f"Project Root: {PROJECT_ROOT}")

# Verify GPU
import torch
print(f"\nPyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name(0)}")
    print(f"GPU Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.2f} GB")

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"Using device: {device}")

print("\n✅ Environment setup complete")


In [None]:
# Install dependencies (Colab)
!pip install -q torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
!pip install -q timm scikit-learn pandas matplotlib seaborn tqdm pillow mlflow albumentations
print("✅ Dependencies installed")

In [None]:
# ============================================================================
# CELL 3: IMPORT LIBRARIES
# ============================================================================

import sys
import json
import warnings
from pathlib import Path
from typing import Dict, List, Tuple, Optional
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader
from torchvision import transforms
import matplotlib.pyplot as plt
import seaborn as sns
from tqdm.auto import tqdm

# Optional: plotly for interactive plots
try:
    import plotly.graph_objects as go
    from plotly.subplots import make_subplots
    HAS_PLOTLY = True
except ImportError:
    HAS_PLOTLY = False
    print("⚠️ Plotly not available - using matplotlib only")

# Import project modules
from src.attacks.fgsm import FGSM, FGSMConfig
from src.attacks.pgd import PGD, PGDConfig
from src.attacks.cw import CarliniWagner, CWConfig
from src.datasets.isic import ISICDataset
from src.models.build import build_model
from src.utils.reproducibility import set_global_seed

warnings.filterwarnings('ignore')
sns.set_style('whitegrid')

print("✅ All imports successful")

# Section 2: Configuration

**Define paths, hyperparameters, and attack configurations**

In [None]:
# ============================================================================
# CONFIGURATION - Complete setup for Phase 4 evaluation
# ============================================================================
import os
import torch
from pathlib import Path

print("=" * 70)
print("PHASE 4 CONFIGURATION")
print("=" * 70)

# ============================================================================
# CONFIGURATION DICTIONARY - All parameters defined here
# ============================================================================
CONFIG = {
    # === Data Settings ===
    'data_root': Path("/content/drive/MyDrive/data/data/isic_2018"),
    'checkpoint_dir': Path("/content/drive/MyDrive/checkpoints/baseline"),
    
    # === Model Settings ===
    'model_name': 'resnet50',
    'num_classes': 7,
    'image_size': 224,
    
    # === DataLoader Settings ===
    'batch_size': 32,
    'num_workers': 0,  # Must be 0 for Google Drive
    
    # === Device ===
    'device': 'cuda' if torch.cuda.is_available() else 'cpu',
    
    # === Seeds for reproducibility ===
    'seeds': [42, 123, 456],
    
    # === FGSM Attack Parameters ===
    'epsilons': [0.01, 0.02, 0.03, 0.05, 0.1],  # L∞ perturbation budgets (shared by FGSM and PGD)
    
    # === PGD Attack Parameters ===
    'pgd_steps': [10, 20, 40],
    'pgd_alpha': 0.01,  # NOTE: not read by the PGD loop below, which sets step_size = epsilon / 4
    
    # === C&W Attack Parameters ===
    'cw_c': 1.0,
    'cw_steps': 100,
    'cw_lr': 0.01,
    
    # === Class Names ===
    'class_names': ['AKIEC', 'BCC', 'BKL', 'DF', 'MEL', 'NV', 'VASC'],
}

# ============================================================================
# Create checkpoint directories on Drive
# ============================================================================
checkpoint_base = str(CONFIG['checkpoint_dir'])
for seed in CONFIG['seeds']:
    os.makedirs(f"{checkpoint_base}/seed_{seed}", exist_ok=True)

# ============================================================================
# Verify checkpoints exist
# ============================================================================
print("\n📁 Checkpoint Status:")
all_checkpoints_exist = True
for seed in CONFIG['seeds']:
    path = f"{checkpoint_base}/seed_{seed}/best.pt"
    if os.path.exists(path):
        size_mb = os.path.getsize(path) / (1024*1024)
        print(f"  ✅ seed_{seed}/best.pt ({size_mb:.1f} MB)")
    else:
        print(f"  ❌ seed_{seed}/best.pt - MISSING")
        all_checkpoints_exist = False

# ============================================================================
# Verify data exists
# ============================================================================
print("\n📁 Data Status:")
data_root = CONFIG['data_root']
metadata_path = data_root / "metadata.csv"
if metadata_path.exists():
    print("  ✅ metadata.csv found")
else:
    print(f"  ❌ metadata.csv - MISSING at {metadata_path}")

# ============================================================================
# Summary
# ============================================================================
print("\n" + "=" * 70)
print("CONFIGURATION SUMMARY")
print("=" * 70)
print(f"📊 Device: {CONFIG['device']}")
print(f"📁 Data root: {CONFIG['data_root']}")
print(f"📁 Checkpoints: {CONFIG['checkpoint_dir']}")
print(f"🔢 Seeds: {CONFIG['seeds']}")
print(f"⚔️  Epsilons (FGSM/PGD): {CONFIG['epsilons']}")
print(f"⚔️  PGD steps: {CONFIG['pgd_steps']}")
print(f"🖼️  Image size: {CONFIG['image_size']}")
print(f"📦 Batch size: {CONFIG['batch_size']}")

if all_checkpoints_exist:
    print("\n✅ ALL CHECKPOINTS FOUND - Ready to proceed!")
else:
    print("\n⚠️  CHECKPOINTS MISSING - Run the upload cell below first!")
print("=" * 70)

In [None]:
# ============================================================================
# UPLOAD CHECKPOINTS (Run this cell if checkpoints are missing)
# ============================================================================
import os
from google.colab import files

checkpoint_base = "/content/drive/MyDrive/checkpoints/baseline"

# Upload for each seed that's missing
for seed in [42, 123, 456]:
    checkpoint_path = f"{checkpoint_base}/seed_{seed}/best.pt"
    if not os.path.exists(checkpoint_path):
        print(f"\n{'='*60}")
        print(f"📤 Upload best.pt for SEED {seed}")
        print(f"{'='*60}")
        print(f"Select the file: checkpoints/baseline/seed_{seed}/best.pt from your computer")
        
        uploaded = files.upload()
        
        # Make sure the target folder exists even if the configuration cell was skipped
        os.makedirs(os.path.dirname(checkpoint_path), exist_ok=True)
        for filename, content in uploaded.items():
            # Save to the correct location (if several files are selected, the last one wins)
            with open(checkpoint_path, 'wb') as f:
                f.write(content)
            print(f"✅ Saved to: {checkpoint_path}")
            
            # Verify file size
            size_mb = os.path.getsize(checkpoint_path) / (1024*1024)
            print(f"📊 File size: {size_mb:.1f} MB")
    else:
        print(f"✅ seed_{seed}/best.pt already exists, skipping...")

print("\n" + "="*60)
print("✅ UPLOAD COMPLETE! Now re-run the Configuration cell above,")
print("   then continue with Data Loading and Evaluation cells.")
print("="*60)

# Section 3: Helper Functions

**Utility functions for evaluation and visualization**

In [None]:
def load_model_and_checkpoint(
    checkpoint_path: str,
    model_name: str = "resnet50",
    num_classes: int = 7,
    device: str = "cuda"
) -> nn.Module:
    """
    Load model from checkpoint.
    
    Args:
        checkpoint_path: Path to checkpoint file
        model_name: Model architecture name
        num_classes: Number of output classes
        device: Device to load model on
        
    Returns:
        Loaded model in eval mode
    """
    print(f"Loading model from: {checkpoint_path}")
    
    # Build model - use 'architecture' parameter (not 'model_name')
    model = build_model(
        architecture=model_name,  # Fixed: use 'architecture' not 'model_name'
        num_classes=num_classes,
        pretrained=False
    )
    
    # Load checkpoint
    checkpoint = torch.load(checkpoint_path, map_location=device, weights_only=False)
    if 'model_state_dict' in checkpoint:
        model.load_state_dict(checkpoint['model_state_dict'])
    elif 'state_dict' in checkpoint:
        model.load_state_dict(checkpoint['state_dict'])
    else:
        model.load_state_dict(checkpoint)
    
    model = model.to(device)
    model.eval()
    
    print("✅ Model loaded successfully")
    return model


def compute_accuracy(
    model: nn.Module,
    images: torch.Tensor,
    labels: torch.Tensor,
    device: str = "cuda"
) -> float:
    """Compute accuracy for a batch."""
    model.eval()
    images = images.to(device)
    labels = labels.to(device)
    
    with torch.no_grad():
        logits = model(images)
        preds = logits.argmax(dim=1)
        accuracy = (preds == labels).float().mean().item()
    
    return accuracy * 100


def evaluate_attack(
    model: nn.Module,
    attack,
    dataloader: DataLoader,
    device: str = "cuda",
    max_batches: Optional[int] = None
) -> Dict[str, float]:
    """
    Evaluate a model under adversarial attack.
    
    Args:
        model: Model to evaluate
        attack: Attack instance (FGSM, PGD, CW, etc.)
        dataloader: Test data loader
        device: Device for computation
        max_batches: Maximum number of batches (None for all)
        
    Returns:
        Dictionary with evaluation metrics
    """
    model.eval()
    
    total_clean_correct = 0
    total_adv_correct = 0
    total_flipped = 0  # clean-correct samples that the attack turns into errors
    total_samples = 0
    total_l2_dist = 0.0
    total_linf_dist = 0.0
    
    pbar = tqdm(dataloader, desc=f"Evaluating {attack.name}", leave=False)
    
    for batch_idx, batch_data in enumerate(pbar):
        if max_batches is not None and batch_idx >= max_batches:
            break
        
        # Handle both (images, labels) and (images, labels, meta) formats
        if len(batch_data) == 2:
            images, labels = batch_data
        else:
            images, labels, _ = batch_data  # Ignore metadata
            
        images = images.to(device)
        labels = labels.to(device)
        batch_size = images.size(0)
        
        # Clean predictions
        with torch.no_grad():
            clean_preds = model(images).argmax(dim=1)
            clean_mask = clean_preds == labels
        
        # Generate adversarial examples
        adv_images = attack(model, images, labels)
        
        # Adversarial predictions
        with torch.no_grad():
            adv_preds = model(adv_images).argmax(dim=1)
            adv_mask = adv_preds == labels
        
        # Per-sample perturbation norms, summed over the batch
        perturbation = (adv_images - images).detach().flatten(start_dim=1)
        total_l2_dist += perturbation.norm(p=2, dim=1).sum().item()
        total_linf_dist += perturbation.abs().max(dim=1).values.sum().item()
        
        total_clean_correct += clean_mask.sum().item()
        total_adv_correct += adv_mask.sum().item()
        total_flipped += (clean_mask & ~adv_mask).sum().item()
        total_samples += batch_size
        
        # Update progress bar
        pbar.set_postfix({
            'clean_acc': f'{100*total_clean_correct/total_samples:.1f}%',
            'adv_acc': f'{100*total_adv_correct/total_samples:.1f}%'
        })
    
    if total_samples == 0:
        raise ValueError("evaluate_attack received no batches to evaluate")
    
    clean_accuracy = 100 * total_clean_correct / total_samples
    adv_accuracy = 100 * total_adv_correct / total_samples
    # Attack success rate: share of initially correct samples that the attack flips
    attack_success_rate = 100 * total_flipped / total_clean_correct if total_clean_correct > 0 else 0.0
    
    results = {
        'clean_accuracy': clean_accuracy,
        'robust_accuracy': adv_accuracy,
        'accuracy_drop': clean_accuracy - adv_accuracy,
        'attack_success_rate': attack_success_rate,
        'mean_l2_dist': total_l2_dist / total_samples,
        'mean_linf_dist': total_linf_dist / total_samples,
        'total_samples': total_samples
    }
    
    return results


def aggregate_seed_results(
    seed_results: Dict[int, Dict[str, float]],
    metric_names: List[str]
) -> Dict[str, Dict[str, float]]:
    """
    Aggregate results across seeds.
    
    Args:
        seed_results: Dictionary mapping seed to results
        metric_names: List of metric names to aggregate
        
    Returns:
        Dictionary with mean and std for each metric
    """
    aggregated = {}
    
    for metric in metric_names:
        values = [seed_results[seed][metric] for seed in seed_results]
        aggregated[metric] = {
            'mean': np.mean(values),
            'std': np.std(values),
            'values': values
        }
    
    return aggregated


print("✅ Helper functions defined")

# Section 4: Load Data and Model

**Load ISIC2018 test set and baseline checkpoints**

In [None]:
# ============================================================================
# DATA LOADING - ISIC2018 Test Dataset
# ============================================================================
print("=" * 70)
print("LOADING ISIC2018 TEST DATASET")
print("=" * 70)

import pandas as pd
import albumentations as A
from albumentations.pytorch import ToTensorV2

# ============================================================================
# Fix Windows backslashes in metadata for Linux/Colab
# ============================================================================
metadata_path = CONFIG['data_root'] / "metadata.csv"
print(f"\n📄 Reading metadata from: {metadata_path}")

df = pd.read_csv(metadata_path)
print(f"   Total rows: {len(df)}")

# Convert Windows backslashes to forward slashes
if 'image_path' in df.columns:
    df['image_path'] = df['image_path'].str.replace('\\', '/', regex=False)
    print("   ✅ Converted backslashes to forward slashes")

# Save fixed metadata
fixed_metadata_path = CONFIG['data_root'] / "metadata_fixed.csv"
df.to_csv(fixed_metadata_path, index=False)
print(f"   ✅ Saved fixed metadata to: {fixed_metadata_path}")

# ============================================================================
# Create Albumentations transform pipeline
# ============================================================================
test_transforms = A.Compose([
    A.Resize(CONFIG['image_size'], CONFIG['image_size']),
    A.Normalize(
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]
    ),
    ToTensorV2()
])

# ============================================================================
# Create Dataset and DataLoader
# ============================================================================
print("\n📦 Creating dataset...")

test_dataset = ISICDataset(
    root=str(CONFIG['data_root']),
    split='test',
    transforms=test_transforms,
    csv_path=str(fixed_metadata_path),
    image_column='image_path',
    label_column='label'
)

test_loader = DataLoader(
    test_dataset,
    batch_size=CONFIG['batch_size'],
    shuffle=False,
    num_workers=CONFIG['num_workers'],
    pin_memory=False
)

# ============================================================================
# Summary
# ============================================================================
print("\n" + "=" * 70)
print("DATASET LOADED SUCCESSFULLY")
print("=" * 70)
print(f"📊 Test samples: {len(test_dataset)}")
print(f"📦 Batch size: {CONFIG['batch_size']}")
print(f"🔢 Number of batches: {len(test_loader)}")
print(f"🏷️  Classes ({len(test_dataset.class_names)}): {test_dataset.class_names}")
print("=" * 70)
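
One point worth checking before running the attacks: the test transform above applies ImageNet mean/std normalization, so the tensors fed to the model fall roughly in [-2.1, 2.6], while the attack configurations in Section 5 use `clip_min=0.0` and `clip_max=1.0`. Whether the project's attack classes account for this internally is not visible in this notebook. If they do not, a common arrangement is to keep the data in [0, 1] and move the normalization inside the model. The sketch below shows that pattern with a hypothetical `NormalizedModel` wrapper and a tiny stand-in backbone; it is not part of the repository.

```python
import torch
import torch.nn as nn


class NormalizedModel(nn.Module):
    """Wrap a classifier so it accepts inputs in [0, 1] and normalizes internally."""

    def __init__(self, model: nn.Module, mean, std):
        super().__init__()
        self.model = model
        # Buffers follow the module across .to(device) calls and are not trained.
        self.register_buffer("mean", torch.tensor(mean).view(1, -1, 1, 1))
        self.register_buffer("std", torch.tensor(std).view(1, -1, 1, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.model((x - self.mean) / self.std)


mean, std = [0.485, 0.456, 0.406], [0.229, 0.224, 0.225]
backbone = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 7)).eval()  # stand-in classifier
wrapped = NormalizedModel(backbone, mean, std).eval()

x01 = torch.rand(2, 3, 8, 8)  # images in [0, 1], the space where epsilon and clipping apply
x_norm = (x01 - torch.tensor(mean).view(1, 3, 1, 1)) / torch.tensor(std).view(1, 3, 1, 1)

# The wrapper on raw [0, 1] inputs matches the backbone on pre-normalized inputs.
same = torch.allclose(wrapped(x01), backbone(x_norm), atol=1e-6)
print(same)
```

With this arrangement the data transform would only rescale pixels to [0, 1], and an epsilon such as 8/255 keeps its usual meaning in pixel space.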

# Section 5: Phase 4.3 - Baseline Robustness Evaluation

**Evaluate all attacks on baseline models (3 seeds)**

Expected results:
- Clean accuracy: ~80-85%
- FGSM ε=8/255: ~30-35% (50pp drop)
- PGD ε=8/255: ~10-20% (65pp drop)
- C&W: ~5-15% (70pp drop)
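
The expectations above are quoted at ε=8/255, whereas `CONFIG['epsilons']` lists decimal values. The short conversion below (epsilon grid copied from the configuration cell) shows which configured budget is closest to 8/255.

```python
# Epsilon grid copied from CONFIG['epsilons'] in the configuration cell.
epsilons = [0.01, 0.02, 0.03, 0.05, 0.1]

target = 8 / 255  # the budget quoted in the expected results
nearest = min(epsilons, key=lambda e: abs(e - target))

print(f"8/255 = {target:.4f}")
for eps in epsilons:
    print(f"eps={eps:<5} is about {eps * 255:.2f}/255")
print(f"Nearest configured epsilon to 8/255: {nearest}")
```

The closest value is ε=0.03 (about 7.65/255), so results at that setting sit slightly below the 8/255 budget used for the expected numbers.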

In [None]:
# ============================================================================
# COMPLETE ADVERSARIAL EVALUATION - All attacks, all seeds
# ============================================================================
print("=" * 70)
print("PHASE 4: ADVERSARIAL ROBUSTNESS EVALUATION")
print("=" * 70)

import time
from datetime import datetime

# Initialize results storage
all_results = {
    'clean': [],
    'FGSM': {eps: [] for eps in CONFIG['epsilons']},
    'PGD': {f"eps{eps}_steps{steps}": [] 
            for eps in CONFIG['epsilons'] 
            for steps in CONFIG['pgd_steps']},
    'CW': []
}

start_time = time.time()
print(f"\n🕐 Started at: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
print(f"📊 Evaluating {len(CONFIG['seeds'])} seeds: {CONFIG['seeds']}")
print(f"⚔️  FGSM: {len(CONFIG['epsilons'])} epsilon values")
print(f"⚔️  PGD: {len(CONFIG['epsilons'])} × {len(CONFIG['pgd_steps'])} = {len(CONFIG['epsilons']) * len(CONFIG['pgd_steps'])} configurations")
print(f"⚔️  C&W: 1 configuration")

# ============================================================================
# MAIN EVALUATION LOOP - Iterate over seeds
# ============================================================================
for seed_idx, seed in enumerate(CONFIG['seeds']):
    print(f"\n{'='*70}")
    print(f"SEED {seed} ({seed_idx+1}/{len(CONFIG['seeds'])})")
    print(f"{'='*70}")
    
    # Load model checkpoint
    checkpoint_path = f"{CONFIG['checkpoint_dir']}/seed_{seed}/best.pt"
    print(f"\n📂 Loading: {checkpoint_path}")
    
    model = load_model_and_checkpoint(
        checkpoint_path=checkpoint_path,
        model_name=CONFIG['model_name'],
        num_classes=CONFIG['num_classes'],
        device=CONFIG['device']
    )
    
    # ========================================================================
    # Clean Accuracy
    # ========================================================================
    print("\n📊 Evaluating Clean Accuracy...")
    clean_correct = 0
    total_samples = 0
    
    with torch.no_grad():
        for batch_data in tqdm(test_loader, desc="Clean eval", leave=False):
            # Handle (images, labels, meta) format from ISICDataset
            if len(batch_data) == 2:
                images, labels = batch_data
            else:
                images, labels, _ = batch_data  # Ignore metadata
                
            images = images.to(CONFIG['device'])
            labels = labels.to(CONFIG['device'])
            logits = model(images)
            preds = logits.argmax(dim=1)
            clean_correct += (preds == labels).sum().item()
            total_samples += labels.size(0)
    
    clean_acc = 100 * clean_correct / total_samples
    all_results['clean'].append({'accuracy': clean_acc, 'seed': seed})
    print(f"✅ Clean Accuracy: {clean_acc:.2f}%")
    
    # ========================================================================
    # FGSM Attack Evaluation
    # ========================================================================
    print(f"\n🔥 FGSM Attack Evaluation")
    print("-" * 50)
    
    for epsilon in CONFIG['epsilons']:
        # Create FGSM config and attack
        fgsm_config = FGSMConfig(
            epsilon=epsilon,
            clip_min=0.0,
            clip_max=1.0,
            targeted=False,
            device=CONFIG['device']
        )
        fgsm_attack = FGSM(fgsm_config)
        
        fgsm_results = evaluate_attack(
            model=model,
            attack=fgsm_attack,
            dataloader=test_loader,
            device=CONFIG['device']
        )
        fgsm_results['seed'] = seed
        fgsm_results['epsilon'] = epsilon
        all_results['FGSM'][epsilon].append(fgsm_results)
        
        print(f"  ε={epsilon:.3f}: Robust={fgsm_results['robust_accuracy']:.1f}% | Drop={fgsm_results['accuracy_drop']:.1f}pp")
    
    # ========================================================================
    # PGD Attack Evaluation
    # ========================================================================
    print(f"\n🔥 PGD Attack Evaluation")
    print("-" * 50)
    
    for epsilon in CONFIG['epsilons']:
        for num_steps in CONFIG['pgd_steps']:
            # Create PGD config and attack
            pgd_config = PGDConfig(
                epsilon=epsilon,
                num_steps=num_steps,
                step_size=epsilon/4,
                random_start=True,
                clip_min=0.0,
                clip_max=1.0,
                targeted=False,
                device=CONFIG['device']
            )
            pgd_attack = PGD(pgd_config)
            
            pgd_results = evaluate_attack(
                model=model,
                attack=pgd_attack,
                dataloader=test_loader,
                device=CONFIG['device']
            )
            pgd_results['seed'] = seed
            pgd_results['epsilon'] = epsilon
            pgd_results['steps'] = num_steps
            
            config_key = f"eps{epsilon}_steps{num_steps}"
            all_results['PGD'][config_key].append(pgd_results)
            
            print(f"  ε={epsilon:.3f}, steps={num_steps}: Robust={pgd_results['robust_accuracy']:.1f}% | Drop={pgd_results['accuracy_drop']:.1f}pp")
    
    # ========================================================================
    # C&W Attack Evaluation (on subset for speed)
    # ========================================================================
    print(f"\n🔥 C&W Attack Evaluation")
    print("-" * 50)
    
    # Create C&W config and attack
    # Note: C&W uses L2 norm, not L∞
    # - confidence (kappa): margin for misclassification (0 = just misclassify)
    # - initial_c: initial penalty parameter (tuned via binary search)
    cw_config = CWConfig(
        epsilon=0.5,  # L2 budget (not L∞)
        confidence=0.0,  # kappa (confidence margin)
        learning_rate=CONFIG['cw_lr'],
        max_iterations=CONFIG['cw_steps'],
        binary_search_steps=5,
        initial_c=CONFIG['cw_c'],  # Initial penalty parameter
        abort_early=True,
        targeted=False,
        device=CONFIG['device']
    )
    cw_attack = CarliniWagner(cw_config)
    
    # Evaluate on subset (C&W is slow)
    cw_results = evaluate_attack(
        model=model,
        attack=cw_attack,
        dataloader=test_loader,
        device=CONFIG['device'],
        max_batches=10  # Limit to 10 batches for speed
    )
    cw_results['seed'] = seed
    all_results['CW'].append(cw_results)
    
    print(f"  C&W: Robust={cw_results['robust_accuracy']:.1f}% | Drop={cw_results['accuracy_drop']:.1f}pp | L2={cw_results['mean_l2_dist']:.4f}")
    
    # Free GPU memory
    del model
    torch.cuda.empty_cache()

# ============================================================================
# FINAL SUMMARY
# ============================================================================
elapsed = time.time() - start_time
print(f"\n{'='*70}")
print(f"EVALUATION COMPLETE")
print(f"{'='*70}")
print(f"⏱️  Total time: {elapsed/60:.1f} minutes")
print(f"📊 Seeds evaluated: {CONFIG['seeds']}")
print(f"\n📈 Clean Accuracy Summary:")
clean_accs = [r['accuracy'] for r in all_results['clean']]
print(f"   Mean: {np.mean(clean_accs):.2f}% ± {np.std(clean_accs):.2f}%")

print(f"\n📈 FGSM Robust Accuracy (ε=0.03):")
if 0.03 in all_results['FGSM']:
    fgsm_accs = [r['robust_accuracy'] for r in all_results['FGSM'][0.03]]
    print(f"   Mean: {np.mean(fgsm_accs):.2f}% ± {np.std(fgsm_accs):.2f}%")

print(f"\n📈 PGD Robust Accuracy (ε=0.03, steps=20):")
key = "eps0.03_steps20"
if key in all_results['PGD']:
    pgd_accs = [r['robust_accuracy'] for r in all_results['PGD'][key]]
    print(f"   Mean: {np.mean(pgd_accs):.2f}% ± {np.std(pgd_accs):.2f}%")

print(f"\n📈 C&W Robust Accuracy:")
cw_accs = [r['robust_accuracy'] for r in all_results['CW']]
print(f"   Mean: {np.mean(cw_accs):.2f}% ± {np.std(cw_accs):.2f}%")

print(f"\n✅ Results stored in the 'all_results' dictionary (in memory only)")
print("=" * 70)

## 5.1: FGSM Attack Evaluation

**Fast Gradient Sign Method - Single step L∞ attack**
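
For reference, the textbook form of the untargeted FGSM update is given below. The symbols $f_\theta$ (the classifier) and $\mathcal{L}$ (the cross-entropy loss) are notation introduced here. The clipping to $[0, 1]$ mirrors the `clip_min=0.0` / `clip_max=1.0` settings in the FGSM configuration, and the project's `FGSM` class may differ in details.

$$
x_{\text{adv}} = \operatorname{clip}_{[0,1]}\!\left(x + \epsilon \cdot \operatorname{sign}\!\left(\nabla_x \mathcal{L}\bigl(f_\theta(x), y\bigr)\right)\right)
$$

Because each pixel moves by at most $\epsilon$, the result satisfies $\lVert x_{\text{adv}} - x \rVert_\infty \le \epsilon$.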

In [None]:
# ============================================================================
# FGSM RESULTS SUMMARY TABLE
# ============================================================================
print("=" * 70)
print("FGSM ATTACK RESULTS SUMMARY")
print("=" * 70)

import pandas as pd

# Create summary table
fgsm_summary = []
for epsilon in CONFIG['epsilons']:
    results = all_results['FGSM'][epsilon]
    robust_accs = [r['robust_accuracy'] for r in results]
    drops = [r['accuracy_drop'] for r in results]
    success_rates = [r['attack_success_rate'] for r in results]
    
    fgsm_summary.append({
        'Epsilon': f"{epsilon:.3f}",
        'ε (8-bit)': f"{epsilon*255:.1f}/255",
        'Robust Acc (%)': f"{np.mean(robust_accs):.2f} ± {np.std(robust_accs):.2f}",
        'Acc Drop (pp)': f"{np.mean(drops):.2f} ± {np.std(drops):.2f}",
        'Attack Success (%)': f"{np.mean(success_rates):.2f} ± {np.std(success_rates):.2f}",
    })

fgsm_df = pd.DataFrame(fgsm_summary)
print("\n📊 FGSM Results Across All Seeds:")
print(fgsm_df.to_string(index=False))

# LaTeX table for dissertation
# escape=True so that '%' in the column names is emitted as '\%' instead of starting a LaTeX comment
print("\n📝 LaTeX Table:")
print(fgsm_df.to_latex(index=False, escape=True))

## 5.2: PGD Attack Evaluation

**Projected Gradient Descent - Multi-step iterative attack**
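
The iterative procedure can be sketched generically as follows. This is not the project's `PGD` class, whose internals are not shown in this notebook; it is a plain PyTorch illustration that assumes inputs in [0, 1], an untargeted cross-entropy objective, and the same settings the evaluation loop uses (random start, step size ε/4). The function name `pgd_linf` and the tiny linear model are stand-ins.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def pgd_linf(model, x, y, eps, step_size, num_steps, random_start=True):
    """Untargeted L-infinity PGD on inputs assumed to lie in [0, 1]."""
    x = x.detach()
    if random_start:
        delta = torch.empty_like(x).uniform_(-eps, eps)
    else:
        delta = torch.zeros_like(x)
    x_adv = (x + delta).clamp(0.0, 1.0)

    for _ in range(num_steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        # Ascent step along the gradient sign.
        x_adv = x_adv.detach() + step_size * grad.sign()
        # Projection onto the eps-ball around x, then onto the valid pixel range.
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0.0, 1.0)
    return x_adv.detach()


torch.manual_seed(0)
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 7)).eval()  # stand-in classifier
x = torch.rand(4, 3, 8, 8)
y = torch.randint(0, 7, (4,))

eps = 0.03
x_adv = pgd_linf(model, x, y, eps=eps, step_size=eps / 4, num_steps=10)
print(f"max |delta| = {(x_adv - x).abs().max().item():.4f}")
```

The projection after every step is what distinguishes PGD from simply repeating FGSM: however many steps are taken, the final perturbation stays within the ε budget.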

In [None]:
# ============================================================================
# PGD RESULTS SUMMARY TABLE
# ============================================================================
print("=" * 70)
print("PGD ATTACK RESULTS SUMMARY")
print("=" * 70)

# Create summary table
pgd_summary = []
for epsilon in CONFIG['epsilons']:
    for steps in CONFIG['pgd_steps']:
        key = f"eps{epsilon}_steps{steps}"
        results = all_results['PGD'][key]
        robust_accs = [r['robust_accuracy'] for r in results]
        drops = [r['accuracy_drop'] for r in results]
        success_rates = [r['attack_success_rate'] for r in results]
        
        pgd_summary.append({
            'Epsilon': f"{epsilon:.3f}",
            'Steps': steps,
            'Robust Acc (%)': f"{np.mean(robust_accs):.2f} ± {np.std(robust_accs):.2f}",
            'Acc Drop (pp)': f"{np.mean(drops):.2f} ± {np.std(drops):.2f}",
            'Attack Success (%)': f"{np.mean(success_rates):.2f} ± {np.std(success_rates):.2f}",
        })

pgd_df = pd.DataFrame(pgd_summary)
print("\n📊 PGD Results Across All Seeds:")
print(pgd_df.to_string(index=False))

# Show key configurations
print("\n📌 Key Configurations:")
for eps in [0.03, 0.05]:
    for steps in [20, 40]:
        key = f"eps{eps}_steps{steps}"
        if key in all_results['PGD']:
            results = all_results['PGD'][key]
            robust_accs = [r['robust_accuracy'] for r in results]
            print(f"   ε={eps}, steps={steps}: {np.mean(robust_accs):.2f}% ± {np.std(robust_accs):.2f}%")

## 5.3: C&W Attack Evaluation

**Carlini & Wagner - L2 optimization-based attack**
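
As a rough illustration of what "optimization-based" means here, the sketch below implements the untargeted margin term commonly used in C&W-style attacks: the true-class logit minus the largest other logit, floored at $-\kappa$. The attack then minimizes the squared L2 norm of the perturbation plus a constant $c$ times this term. The project's `CarliniWagner` class is not shown in this notebook, so the function name `cw_margin_loss` and its exact form are assumptions.

```python
import torch
import torch.nn.functional as F


def cw_margin_loss(logits: torch.Tensor, labels: torch.Tensor, kappa: float = 0.0) -> torch.Tensor:
    """Per-sample untargeted margin: max(Z_true - max_{i != true} Z_i, -kappa)."""
    true_mask = F.one_hot(labels, num_classes=logits.size(1)).bool()
    true_logit = logits[true_mask]                                  # shape: (batch,)
    best_other = logits.masked_fill(true_mask, float("-inf")).max(dim=1).values
    return torch.clamp(true_logit - best_other, min=-kappa)


logits = torch.tensor([[2.0, 1.0, 0.0],
                       [0.5, 3.0, 1.0]])
labels = torch.tensor([0, 2])

# First sample is still classified correctly (margin 1.0);
# second sample is already misclassified, so its loss is floored at 0 for kappa=0.
print(cw_margin_loss(logits, labels))
```

A larger `kappa` (the `confidence` argument in the configuration above) asks for adversarial examples that are misclassified with a wider logit margin.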

In [None]:
# ============================================================================
# C&W RESULTS SUMMARY
# ============================================================================
print("=" * 70)
print("C&W ATTACK RESULTS SUMMARY")
print("=" * 70)

# Extract C&W results
cw_results_list = all_results['CW']

robust_accs = [r['robust_accuracy'] for r in cw_results_list]
drops = [r['accuracy_drop'] for r in cw_results_list]
success_rates = [r['attack_success_rate'] for r in cw_results_list]
l2_dists = [r['mean_l2_dist'] for r in cw_results_list]

print(f"\n📊 C&W Attack Results (L2 optimization-based):")
print(f"   Configuration: c={CONFIG['cw_c']}, steps={CONFIG['cw_steps']}, lr={CONFIG['cw_lr']}")
print(f"\n   Robust Accuracy: {np.mean(robust_accs):.2f}% ± {np.std(robust_accs):.2f}%")
print(f"   Accuracy Drop: {np.mean(drops):.2f}pp ± {np.std(drops):.2f}pp")
print(f"   Attack Success Rate: {np.mean(success_rates):.2f}% ± {np.std(success_rates):.2f}%")
print(f"   Mean L2 Distance: {np.mean(l2_dists):.4f} ± {np.std(l2_dists):.4f}")

print("\n📈 Per-Seed Breakdown:")
for r in cw_results_list:
    print(f"   Seed {r['seed']}: Robust={r['robust_accuracy']:.2f}%, L2={r['mean_l2_dist']:.4f}")

# Section 6: Statistical Aggregation

**Aggregate results across 3 seeds and compute statistics**
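
With only three seeds, the choice of standard-deviation estimator is visible in the reported "±" values. `np.std` defaults to the population formula (divide by n), while the sample formula (divide by n - 1, `ddof=1` in NumPy) is the more common choice for a handful of runs. The sketch below contrasts the two using the standard library; the three accuracy values are made-up placeholders, not measured results.

```python
import statistics

# Hypothetical robust accuracies (%) for three seeds; placeholders, not measured results.
values = [12.0, 15.0, 18.0]

mean = statistics.fmean(values)
pop_std = statistics.pstdev(values)    # divides by n, like np.std(values)
sample_std = statistics.stdev(values)  # divides by n - 1, like np.std(values, ddof=1)

print(f"mean = {mean:.2f}")
print(f"population std = {pop_std:.2f}, sample std = {sample_std:.2f}")
```

Whichever estimator is used, stating it alongside the tables makes the seed-to-seed spread easier to interpret.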

In [None]:
# ============================================================================
# STATISTICAL AGGREGATION - Results across 3 seeds
# ============================================================================
print("=" * 70)
print("STATISTICAL AGGREGATION")
print("=" * 70)

# ============================================================================
# Clean Accuracy Summary
# ============================================================================
print("\n📊 CLEAN ACCURACY:")
clean_accs = [r['accuracy'] for r in all_results['clean']]
print(f"   Mean: {np.mean(clean_accs):.2f}% ± {np.std(clean_accs):.2f}%")
for r in all_results['clean']:
    print(f"   Seed {r['seed']}: {r['accuracy']:.2f}%")

# ============================================================================
# FGSM Summary
# ============================================================================
print("\n📊 FGSM ATTACK SUMMARY:")
print("-" * 50)
for epsilon in CONFIG['epsilons']:
    results = all_results['FGSM'][epsilon]
    robust_accs = [r['robust_accuracy'] for r in results]
    drops = [r['accuracy_drop'] for r in results]
    print(f"   ε={epsilon:.3f}: Robust={np.mean(robust_accs):.2f}% ± {np.std(robust_accs):.2f}% | Drop={np.mean(drops):.2f}pp")

# ============================================================================
# PGD Summary
# ============================================================================
print("\n📊 PGD ATTACK SUMMARY:")
print("-" * 50)
for epsilon in CONFIG['epsilons']:
    for steps in CONFIG['pgd_steps']:
        key = f"eps{epsilon}_steps{steps}"
        results = all_results['PGD'][key]
        robust_accs = [r['robust_accuracy'] for r in results]
        drops = [r['accuracy_drop'] for r in results]
        print(f"   ε={epsilon:.3f}, steps={steps}: Robust={np.mean(robust_accs):.2f}% ± {np.std(robust_accs):.2f}% | Drop={np.mean(drops):.2f}pp")

# ============================================================================
# C&W Summary
# ============================================================================
print("\n📊 C&W ATTACK SUMMARY:")
print("-" * 50)
results = all_results['CW']
robust_accs = [r['robust_accuracy'] for r in results]
l2_dists = [r['mean_l2_dist'] for r in results]
print(f"   Robust Accuracy: {np.mean(robust_accs):.2f}% ± {np.std(robust_accs):.2f}%")
print(f"   Mean L2 Distance: {np.mean(l2_dists):.4f} ± {np.std(l2_dists):.4f}")

# ============================================================================
# Key Findings for Dissertation
# ============================================================================
print("\n" + "=" * 70)
print("KEY FINDINGS FOR DISSERTATION")
print("=" * 70)

# Headline configuration: ε=0.03 (standard benchmark budget), reused for FGSM and PGD
benchmark_eps = 0.03
fgsm_03 = all_results['FGSM'].get(benchmark_eps, [])
if fgsm_03:
    print(f"\n🎯 FGSM (ε={benchmark_eps}): {np.mean([r['robust_accuracy'] for r in fgsm_03]):.2f}%")

pgd_key = f"eps{benchmark_eps}_steps20"
if pgd_key in all_results['PGD']:
    pgd_results = all_results['PGD'][pgd_key]
    print(f"🎯 PGD (ε={benchmark_eps}, 20 steps): {np.mean([r['robust_accuracy'] for r in pgd_results]):.2f}%")

cw_results = all_results['CW']
if cw_results:
    print(f"🎯 C&W: {np.mean([r['robust_accuracy'] for r in cw_results]):.2f}%")

print("\n✅ Statistical aggregation complete")

In [None]:
# ============================================================================
# SAVE RESULTS TO JSON
# ============================================================================
import json

# Create results directory on Google Drive
results_dir = "/content/drive/MyDrive/results/phase4_adversarial"
os.makedirs(results_dir, exist_ok=True)

results_json_path = f"{results_dir}/baseline_robustness_aggregated.json"

# Convert to serializable format
results_serializable = {
    'clean': {
        'mean': float(np.mean([r['accuracy'] for r in all_results['clean']])),
        'std': float(np.std([r['accuracy'] for r in all_results['clean']])),
        'values': [float(r['accuracy']) for r in all_results['clean']]
    },
    'FGSM': {},
    'PGD': {},
    'CW': {}
}

# FGSM results
for epsilon in CONFIG['epsilons']:
    eps_key = str(epsilon)
    results = all_results['FGSM'][epsilon]
    results_serializable['FGSM'][eps_key] = {
        'robust_accuracy': {
            'mean': float(np.mean([r['robust_accuracy'] for r in results])),
            'std': float(np.std([r['robust_accuracy'] for r in results])),
        },
        'accuracy_drop': {
            'mean': float(np.mean([r['accuracy_drop'] for r in results])),
            'std': float(np.std([r['accuracy_drop'] for r in results])),
        }
    }

# PGD results
for epsilon in CONFIG['epsilons']:
    for steps in CONFIG['pgd_steps']:
        key = f"eps{epsilon}_steps{steps}"
        results = all_results['PGD'][key]
        results_serializable['PGD'][key] = {
            'robust_accuracy': {
                'mean': float(np.mean([r['robust_accuracy'] for r in results])),
                'std': float(np.std([r['robust_accuracy'] for r in results])),
            },
            'accuracy_drop': {
                'mean': float(np.mean([r['accuracy_drop'] for r in results])),
                'std': float(np.std([r['accuracy_drop'] for r in results])),
            }
        }

# C&W results
results = all_results['CW']
results_serializable['CW'] = {
    'robust_accuracy': {
        'mean': float(np.mean([r['robust_accuracy'] for r in results])),
        'std': float(np.std([r['robust_accuracy'] for r in results])),
    },
    'mean_l2_dist': {
        'mean': float(np.mean([r['mean_l2_dist'] for r in results])),
        'std': float(np.std([r['mean_l2_dist'] for r in results])),
    }
}

# Save
with open(results_json_path, 'w') as f:
    json.dump(results_serializable, f, indent=2)

print(f"✅ Results saved to: {results_json_path}")

# Section 7: Phase 4.5 - Adversarial Visualization

**Generate and visualize adversarial examples**
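
Perturbations at ε ≈ 0.03 are too small to see directly, so they are usually rescaled before plotting. The sketch below shows one common choice, assuming pixel values in [0, 1]: multiply the signed perturbation by a fixed factor, shift it to mid-gray, and clamp. Unlike per-image min-max scaling, a fixed factor keeps magnitudes comparable across attacks. `amplify_for_display` and the sample values are illustrative and not part of the project code.

```python
def amplify_for_display(delta, factor=20.0):
    """Map signed perturbation values to [0, 1]; 0.5 means 'no change'."""
    return [min(1.0, max(0.0, 0.5 + factor * d)) for d in delta]

# Hypothetical flattened per-pixel perturbation values within a budget of 0.03
delta = [0.0, 0.01, -0.01, 0.03, -0.03]
shown = amplify_for_display(delta, factor=20.0)
print([round(v, 2) for v in shown])
```

Values that saturate at 0 or 1 indicate the factor is too large for the chosen budget; lowering it trades visibility for headroom.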

In [None]:
# ============================================================================
# CELL: PhD-LEVEL VISUALIZATION FUNCTIONS
# ============================================================================

import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
from matplotlib.gridspec import GridSpec
import seaborn as sns
import numpy as np

# Publication-quality settings
plt.rcParams.update({
    'font.size': 12,
    'font.family': 'serif',
    'axes.labelsize': 14,
    'axes.titlesize': 16,
    'xtick.labelsize': 12,
    'ytick.labelsize': 12,
    'legend.fontsize': 11,
    'figure.titlesize': 18,
    'figure.dpi': 150,
    'savefig.dpi': 300,
    'savefig.bbox': 'tight',
    'axes.grid': True,
    'grid.alpha': 0.3,
})

# Color palette for publication
COLORS = {
    'clean': '#2ecc71',      # Green
    'fgsm': '#e74c3c',       # Red
    'pgd': '#9b59b6',        # Purple
    'cw': '#f39c12',         # Orange
    'baseline': '#3498db',   # Blue
    'robust': '#1abc9c',     # Teal
}

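def denormalize_image(img_tensor, mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)):
    """Undo per-channel normalization on a (C, H, W) tensor and clamp to [0, 1] for display."""
    # Work on a detached copy so the input is untouched and .numpy() is safe downstream
    img = img_tensor.detach().clone()
    for t, m, s in zip(img, mean, std):
        t.mul_(s).add_(m)
    return torch.clamp(img, 0, 1)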


def create_phd_adversarial_figure(model, images, labels, attacks_dict, class_names=None, num_samples=5):
    """
    Create publication-quality adversarial examples figure.
    PhD-level visualization with detailed annotations.
    """
    model.eval()
    device = next(model.parameters()).device
    
    images = images[:num_samples].to(device)
    labels = labels[:num_samples].to(device)
    
    # Get predictions
    with torch.no_grad():
        clean_logits = model(images)
        clean_preds = clean_logits.argmax(dim=1)
        clean_probs = torch.softmax(clean_logits, dim=1)
        clean_confs = clean_probs.max(dim=1)[0]
    
    # Generate adversarial examples
    adv_data = {}
    for name, attack in attacks_dict.items():
        adv_imgs = attack(model, images, labels)
        with torch.no_grad():
            adv_logits = model(adv_imgs)
            adv_preds = adv_logits.argmax(dim=1)
            adv_confs = torch.softmax(adv_logits, dim=1).max(dim=1)[0]
        adv_data[name] = {
            'images': adv_imgs,
            'preds': adv_preds,
            'confs': adv_confs,
            'perturbation': (adv_imgs - images).abs()
        }
    
    # Create figure with GridSpec
    num_cols = len(attacks_dict) + 2  # Clean + attacks + perturbation
    fig = plt.figure(figsize=(4*num_cols, 4.5*num_samples))
    gs = GridSpec(num_samples, num_cols, figure=fig, hspace=0.3, wspace=0.1)
    
    class_labels = class_names if class_names else [f'Class {i}' for i in range(7)]
    
    for i in range(num_samples):
        # Clean image
        ax = fig.add_subplot(gs[i, 0])
        clean_img = denormalize_image(images[i].cpu()).permute(1, 2, 0).numpy()
        ax.imshow(clean_img)
        
        true_label = labels[i].item()
        pred_label = clean_preds[i].item()
        conf = clean_confs[i].item() * 100
        
        title_color = 'green' if pred_label == true_label else 'red'
        ax.set_title(f'Clean Image\nTrue: {class_labels[true_label]}\nPred: {class_labels[pred_label]} ({conf:.1f}%)', 
                    fontsize=10, color=title_color, fontweight='bold')
        # Hide ticks but keep the frame: ax.axis('off') would also hide the spines
        ax.set_xticks([])
        ax.set_yticks([])
        
        # Border follows the clean prediction: green if correct, red otherwise
        for spine in ax.spines.values():
            spine.set_visible(True)
            spine.set_color(title_color)
            spine.set_linewidth(3)
        
        # Adversarial examples
        for j, (name, data) in enumerate(adv_data.items(), start=1):
            ax = fig.add_subplot(gs[i, j])
            adv_img = denormalize_image(data['images'][i].cpu()).permute(1, 2, 0).numpy()
            ax.imshow(adv_img)
            
            adv_pred = data['preds'][i].item()
            adv_conf = data['confs'][i].item() * 100
            
            # Success indicator
            attack_success = adv_pred != true_label
            border_color = 'red' if attack_success else 'green'
            title_color = 'red' if attack_success else 'green'
            
            linf = data['perturbation'][i].max().item()
            l2 = torch.norm(data['perturbation'][i]).item()
            
            ax.set_title(f'{name}\nPred: {class_labels[adv_pred]} ({adv_conf:.1f}%)\n'
                        f'L∞={linf:.4f}, L₂={l2:.2f}', 
                        fontsize=9, color=title_color, fontweight='bold')
            # Hide ticks but keep the frame so the colored border below is drawn
            ax.set_xticks([])
            ax.set_yticks([])
            
            for spine in ax.spines.values():
                spine.set_visible(True)
                spine.set_color(border_color)
                spine.set_linewidth(3)
        
        # Perturbation heatmap (last column)
        ax = fig.add_subplot(gs[i, -1])
        # Show the perturbation of the last entry in attacks_dict (assumed to be the strongest attack)
        strongest_attack = list(adv_data.keys())[-1]
        pert = adv_data[strongest_attack]['perturbation'][i].detach().cpu()
        pert_magnitude = pert.norm(dim=0).numpy()  # L2 norm across channels
        
        im = ax.imshow(pert_magnitude, cmap='hot', vmin=0)
        # No amplification is applied here; the colorbar gives the actual magnitude scale
        ax.set_title(f'Perturbation\n({strongest_attack})', fontsize=10, fontweight='bold')
        ax.axis('off')
        
        # Add colorbar
        cbar = plt.colorbar(im, ax=ax, fraction=0.046, pad=0.04)
        cbar.set_label('Magnitude', fontsize=8)
    
    # Main title
    fig.suptitle('Adversarial Attack Comparison on ISIC 2018 Dermoscopy Images\n'
                 '(Green border = Correct prediction, Red border = Misclassification)',
                 fontsize=16, fontweight='bold', y=1.02)
    
    return fig


def create_phd_perturbation_analysis(model, images, labels, attacks_dict, num_samples=4):
    """
    Create detailed perturbation analysis figure for dissertation.
    Shows the spatial pattern and per-pixel magnitude of each attack's perturbation.
    """
    model.eval()
    device = next(model.parameters()).device
    
    images = images[:num_samples].to(device)
    labels = labels[:num_samples].to(device)
    
    # Generate perturbations
    perturbations = {}
    for name, attack in attacks_dict.items():
        adv_imgs = attack(model, images, labels)
        perturbations[name] = (adv_imgs - images).cpu()
    
    # Create figure
    num_attacks = len(attacks_dict)
    fig, axes = plt.subplots(num_samples, num_attacks * 2 + 1, 
                             figsize=(3*(num_attacks*2+1), 3.5*num_samples))
    
    for i in range(num_samples):
        # Original image
        ax = axes[i, 0]
        clean_img = denormalize_image(images[i].cpu()).permute(1, 2, 0).numpy()
        ax.imshow(clean_img)
        ax.set_title('Original' if i == 0 else '', fontsize=11, fontweight='bold')
        ax.axis('off')
        
        col = 1
        for name, pert in perturbations.items():
            # Spatial perturbation, min-max scaled to [0, 1] for display
            # (a constant gain such as ×20 cancels out under min-max scaling)
            ax = axes[i, col]
            pert_spatial = pert[i]
            pert_spatial = (pert_spatial - pert_spatial.min()) / (pert_spatial.max() - pert_spatial.min() + 1e-8)
            ax.imshow(pert_spatial.permute(1, 2, 0).numpy())
            if i == 0:
                ax.set_title(f'{name}\n(Spatial, min-max scaled)', fontsize=10, fontweight='bold')
            ax.axis('off')
            
            # Magnitude heatmap
            ax = axes[i, col + 1]
            magnitude = pert[i].abs().mean(dim=0).numpy()
            im = ax.imshow(magnitude, cmap='inferno')
            if i == 0:
                ax.set_title(f'{name}\n(Magnitude)', fontsize=10, fontweight='bold')
            ax.axis('off')
            
            col += 2
    
    fig.suptitle('Perturbation Analysis: Spatial Distribution and Magnitude Heatmaps\n'
                 'Revealing Attack Strategies on Medical Dermoscopy Images',
                 fontsize=14, fontweight='bold', y=1.02)
    
    plt.tight_layout()
    return fig


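def create_phd_robustness_curves(aggregated_results, config):
    """
    Create publication-quality robustness curves for dissertation.

    Expects aggregated_results['FGSM'] keyed by the epsilon values in
    config['epsilons'] and aggregated_results['PGD'] keyed by
    'eps{epsilon}_steps{n}'. Each metric is a dict with 'mean' and 'std';
    the plots read 'robust_accuracy', 'accuracy_drop' (PGD only), and
    'attack_success_rate'.
    """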
    fig, axes = plt.subplots(2, 2, figsize=(14, 12))
    
    epsilons = config['epsilons']
    eps_labels = [f'{e*255:.0f}/255' for e in epsilons]
    eps_values = [e * 255 for e in epsilons]
    
    # 1. Robustness vs Epsilon (Line plot)
    ax = axes[0, 0]
    
    # FGSM
    fgsm_accs = [aggregated_results['FGSM'][eps]['robust_accuracy']['mean'] for eps in epsilons]
    fgsm_stds = [aggregated_results['FGSM'][eps]['robust_accuracy']['std'] for eps in epsilons]
    ax.errorbar(eps_values, fgsm_accs, yerr=fgsm_stds, marker='o', markersize=10,
                linewidth=2.5, capsize=6, label='FGSM', color=COLORS['fgsm'])
    
    # PGD-7
    pgd7_accs = [aggregated_results['PGD'][f'eps{eps}_steps7']['robust_accuracy']['mean'] for eps in epsilons]
    pgd7_stds = [aggregated_results['PGD'][f'eps{eps}_steps7']['robust_accuracy']['std'] for eps in epsilons]
    ax.errorbar(eps_values, pgd7_accs, yerr=pgd7_stds, marker='s', markersize=10,
                linewidth=2.5, capsize=6, label='PGD-7', color='#3498db')
    
    # PGD-20
    pgd20_accs = [aggregated_results['PGD'][f'eps{eps}_steps20']['robust_accuracy']['mean'] for eps in epsilons]
    pgd20_stds = [aggregated_results['PGD'][f'eps{eps}_steps20']['robust_accuracy']['std'] for eps in epsilons]
    ax.errorbar(eps_values, pgd20_accs, yerr=pgd20_stds, marker='^', markersize=10,
                linewidth=2.5, capsize=6, label='PGD-20', color=COLORS['pgd'])
    
    ax.axhline(y=100/7, color='gray', linestyle='--', linewidth=1.5, label='Random (14.3%)')
    ax.set_xlabel('Perturbation Budget (ε × 255)', fontsize=13)
    ax.set_ylabel('Robust Accuracy (%)', fontsize=13)
    ax.set_title('(a) Robustness vs Perturbation Budget', fontsize=14, fontweight='bold')
    ax.legend(loc='upper right', fontsize=11)
    ax.set_ylim(0, 100)
    ax.grid(True, alpha=0.3)
    
    # 2. Attack Comparison Bar Chart
    ax = axes[0, 1]
    
    attacks = ['FGSM\nε=8/255', 'PGD-7\nε=8/255', 'PGD-20\nε=8/255', 'C&W\nL₂']
    accs = [
        aggregated_results['FGSM'][8/255]['robust_accuracy']['mean'],
        aggregated_results['PGD'][f'eps{8/255}_steps7']['robust_accuracy']['mean'],
        aggregated_results['PGD'][f'eps{8/255}_steps20']['robust_accuracy']['mean'],
        aggregated_results['CW']['robust_accuracy']['mean']
    ]
    stds = [
        aggregated_results['FGSM'][8/255]['robust_accuracy']['std'],
        aggregated_results['PGD'][f'eps{8/255}_steps7']['robust_accuracy']['std'],
        aggregated_results['PGD'][f'eps{8/255}_steps20']['robust_accuracy']['std'],
        aggregated_results['CW']['robust_accuracy']['std']
    ]
    
    colors = [COLORS['fgsm'], '#3498db', COLORS['pgd'], COLORS['cw']]
    bars = ax.bar(attacks, accs, yerr=stds, capsize=8, color=colors, 
                  edgecolor='black', linewidth=1.5, alpha=0.85)
    
    # Add value labels
    for bar, acc, std in zip(bars, accs, stds):
        height = bar.get_height()
        ax.annotate(f'{acc:.1f}±{std:.1f}%',
                   xy=(bar.get_x() + bar.get_width()/2, height + std + 1),
                   ha='center', va='bottom', fontsize=11, fontweight='bold')
    
    ax.axhline(y=100/7, color='gray', linestyle='--', linewidth=1.5)
    ax.set_ylabel('Robust Accuracy (%)', fontsize=13)
    ax.set_title('(b) Attack Comparison (Strongest Settings)', fontsize=14, fontweight='bold')
    ax.set_ylim(0, max(accs) + 20)
    
    # 3. Accuracy Drop Heatmap
    ax = axes[1, 0]
    
    # Create matrix for heatmap
    steps = config['pgd_steps']  # use the configured step counts rather than a hard-coded list
    drop_matrix = np.zeros((len(epsilons), len(steps)))
    
    for i, eps in enumerate(epsilons):
        for j, step in enumerate(steps):
            drop_matrix[i, j] = aggregated_results['PGD'][f'eps{eps}_steps{step}']['accuracy_drop']['mean']
    
    im = ax.imshow(drop_matrix, cmap='Reds', aspect='auto')
    ax.set_xticks(range(len(steps)))
    ax.set_xticklabels([f'{s} steps' for s in steps])
    ax.set_yticks(range(len(epsilons)))
    ax.set_yticklabels(eps_labels)
    ax.set_xlabel('PGD Iterations', fontsize=13)
    ax.set_ylabel('Perturbation Budget (ε)', fontsize=13)
    ax.set_title('(c) Accuracy Drop (pp) - PGD Attack', fontsize=14, fontweight='bold')
    
    # Add annotations
    for i in range(len(epsilons)):
        for j in range(len(steps)):
            ax.text(j, i, f'{drop_matrix[i,j]:.1f}', ha='center', va='center',
                   fontsize=12, fontweight='bold', color='white' if drop_matrix[i,j] > 40 else 'black')
    
    cbar = plt.colorbar(im, ax=ax)
    cbar.set_label('Accuracy Drop (pp)', fontsize=11)
    
    # 4. Attack Success Rate
    ax = axes[1, 1]
    
    # Success rates for different attacks
    categories = [f'ε={label}' for label in eps_labels]  # one group per configured epsilon
    fgsm_sr = [aggregated_results['FGSM'][eps]['attack_success_rate']['mean'] for eps in epsilons]
    pgd_sr = [aggregated_results['PGD'][f'eps{eps}_steps20']['attack_success_rate']['mean'] for eps in epsilons]
    
    x = np.arange(len(categories))
    width = 0.35
    
    bars1 = ax.bar(x - width/2, fgsm_sr, width, label='FGSM', color=COLORS['fgsm'], 
                   edgecolor='black', linewidth=1.5)
    bars2 = ax.bar(x + width/2, pgd_sr, width, label='PGD-20', color=COLORS['pgd'],
                   edgecolor='black', linewidth=1.5)
    
    ax.set_ylabel('Attack Success Rate (%)', fontsize=13)
    ax.set_xlabel('Perturbation Budget', fontsize=13)
    ax.set_title('(d) Attack Success Rate Comparison', fontsize=14, fontweight='bold')
    ax.set_xticks(x)
    ax.set_xticklabels(categories)
    ax.legend(fontsize=11)
    ax.set_ylim(0, 100)
    
    # Add value labels
    for bars in [bars1, bars2]:
        for bar in bars:
            height = bar.get_height()
            ax.annotate(f'{height:.0f}%', xy=(bar.get_x() + bar.get_width()/2, height + 1),
                       ha='center', va='bottom', fontsize=10, fontweight='bold')
    
    plt.tight_layout()
    fig.suptitle('Baseline Model Adversarial Robustness Analysis\n'
                 'ResNet-50 on ISIC 2018 Dermoscopy Dataset (3 Seeds)', 
                 fontsize=16, fontweight='bold', y=1.02)
    
    return fig


print("✅ PhD-level visualization functions defined")

In [None]:
# ============================================================================
# LOAD MODEL FOR VISUALIZATION
# ============================================================================
print("=" * 70)
print("LOADING MODEL FOR VISUALIZATION")
print("=" * 70)

# Load model (use seed 42)
vis_checkpoint = f"{CONFIG['checkpoint_dir']}/seed_42/best.pt"
vis_model = load_model_and_checkpoint(
    checkpoint_path=vis_checkpoint,
    model_name=CONFIG['model_name'],
    num_classes=CONFIG['num_classes'],
    device=CONFIG['device']
)

# Get a batch of test images for visualization
vis_dataloader = DataLoader(
    test_dataset,
    batch_size=16,
    shuffle=True,
    num_workers=0
)

# Handle (images, labels, meta) format from ISICDataset
batch_data = next(iter(vis_dataloader))
if len(batch_data) == 2:
    vis_images, vis_labels = batch_data
else:
    vis_images, vis_labels, _ = batch_data  # Ignore metadata

print(f"✅ Loaded {vis_images.size(0)} images for visualization")
print(f"   Labels: {vis_labels.tolist()}")

In [None]:
# ============================================================================
# CREATE ATTACKS FOR VISUALIZATION
# ============================================================================
print("=" * 70)
print("CREATING ATTACKS FOR VISUALIZATION")
print("=" * 70)

# Create config objects for each attack
vis_fgsm_config = FGSMConfig(
    epsilon=0.03,
    clip_min=0.0,
    clip_max=1.0,
    targeted=False
)

vis_pgd_config = PGDConfig(
    epsilon=0.03,
    num_steps=20,
    step_size=0.03/4,
    random_start=True,
    clip_min=0.0,
    clip_max=1.0,
    targeted=False
)

vis_cw_config = CWConfig(
    confidence=0,
    learning_rate=CONFIG['cw_lr'],
    max_iterations=50,  # Reduced for faster visualization
    binary_search_steps=5,
    initial_c=CONFIG['cw_c'],
    abort_early=True,
    clip_min=0.0,
    clip_max=1.0
)

vis_attacks = {
    'FGSM (ε=0.03)': FGSM(vis_fgsm_config),
    'PGD-20 (ε=0.03)': PGD(vis_pgd_config),
    'C&W': CarliniWagner(vis_cw_config)
}

print("✅ Visualization attacks created:")
for name in vis_attacks:
    print(f"   - {name}")

In [None]:
# ============================================================================
# GENERATE ADVERSARIAL VISUALIZATIONS
# ============================================================================
print("=" * 70)
print("GENERATING ADVERSARIAL VISUALIZATIONS")
print("=" * 70)

# Create results directory
results_dir = "/content/drive/MyDrive/results/phase4_adversarial"
os.makedirs(results_dir, exist_ok=True)

# Generate adversarial figure
fig = create_phd_adversarial_figure(
    model=vis_model,
    images=vis_images,
    labels=vis_labels,
    attacks_dict=vis_attacks,
    class_names=CONFIG['class_names'],
    num_samples=5
)

# Save figure
vis_save_path = f"{results_dir}/adversarial_examples_visualization.png"
fig.savefig(vis_save_path, dpi=150, bbox_inches='tight')
print(f"✅ Visualization saved to: {vis_save_path}")

plt.show()

In [None]:
# ============================================================================
# PERTURBATION VISUALIZATION
# ============================================================================
print("=" * 70)
print("GENERATING PERTURBATION VISUALIZATIONS")
print("=" * 70)

# Generate adversarial examples for all attacks
print("   Generating adversarial examples...")
adv_examples_dict = {}
for attack_name, attack in vis_attacks.items():
    print(f"      - {attack_name}")
    adv_examples_dict[attack_name] = attack(
        vis_model, 
        vis_images.to(CONFIG['device']), 
        vis_labels.to(CONFIG['device'])
    )
print("   ✅ Adversarial examples generated")

# Create perturbation analysis figure
print("   Creating perturbation analysis figure...")
fig = create_phd_perturbation_analysis(
    model=vis_model,
    images=vis_images,
    labels=vis_labels,
    attacks_dict=vis_attacks,
    num_samples=4
)

# Save
pert_save_path = f"{results_dir}/perturbation_analysis.png"
fig.savefig(pert_save_path, dpi=150, bbox_inches='tight')
print(f"✅ Perturbation analysis saved to: {pert_save_path}")

plt.show()

# Section 8: Results Summary and Comparison

**Create comparison plots and final summary**
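
The plots in this section look results up by epsilon, either as a float key (`all_results['FGSM'][0.03]`) or as a formatted string (`'eps0.03_steps20'`). Two details are easy to trip over: an f-string formats a float with its full `repr`, so `8/255` does not produce a short key, and `json.dump` turns float dictionary keys into strings. The sketch below illustrates both. `pgd_key` is a hypothetical helper and the stored value is a placeholder.

```python
import json

def pgd_key(epsilon, steps):
    """Build a PGD result key in the 'eps{epsilon}_steps{steps}' pattern."""
    return f"eps{epsilon}_steps{steps}"

short_key = pgd_key(0.03, 20)     # repr(0.03) is '0.03'
long_key = pgd_key(8 / 255, 20)   # repr(8/255) carries many digits
print(short_key, long_key)

# Float keys work in memory but come back as strings after a JSON round trip
in_memory = {0.03: {"mean": 0.0}}  # placeholder value
restored = json.loads(json.dumps(in_memory))
print(list(in_memory), list(restored))
```

Any code that reads the saved JSON therefore needs `str(epsilon)` for lookups, while code that reads the in-memory results needs the original float.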

In [None]:
# ============================================================================
# ROBUSTNESS CURVES - PhD Quality Visualization
# ============================================================================
print("=" * 70)
print("GENERATING ROBUSTNESS CURVES")
print("=" * 70)

fig, axes = plt.subplots(1, 2, figsize=(16, 6))

# ============================================================================
# Plot 1: Robust accuracy vs epsilon (FGSM and PGD)
# ============================================================================
epsilons_plot = [e*255 for e in CONFIG['epsilons']]

# FGSM accuracies
fgsm_accs = [np.mean([r['robust_accuracy'] for r in all_results['FGSM'][eps]]) for eps in CONFIG['epsilons']]
fgsm_stds = [np.std([r['robust_accuracy'] for r in all_results['FGSM'][eps]]) for eps in CONFIG['epsilons']]

# PGD-20 accuracies
pgd_accs = []
pgd_stds = []
for eps in CONFIG['epsilons']:
    key = f"eps{eps}_steps20"
    results = all_results['PGD'][key]
    pgd_accs.append(np.mean([r['robust_accuracy'] for r in results]))
    pgd_stds.append(np.std([r['robust_accuracy'] for r in results]))

# Clean accuracy baseline
clean_acc_mean = np.mean([r['accuracy'] for r in all_results['clean']])

axes[0].axhline(y=clean_acc_mean, color='green', linestyle='-', linewidth=2, 
                label=f'Clean ({clean_acc_mean:.1f}%)', alpha=0.7)
axes[0].errorbar(epsilons_plot, fgsm_accs, yerr=fgsm_stds, marker='o', linewidth=2, 
                 capsize=5, label='FGSM', markersize=8, color=COLORS['fgsm'])
axes[0].errorbar(epsilons_plot, pgd_accs, yerr=pgd_stds, marker='s', linewidth=2, 
                 capsize=5, label='PGD-20', markersize=8, color=COLORS['pgd'])
axes[0].axhline(y=100/CONFIG['num_classes'], color='gray', linestyle='--', 
                label=f'Random ({100/CONFIG["num_classes"]:.1f}%)', alpha=0.5)

axes[0].set_xlabel('Perturbation Budget (ε × 255)', fontsize=13)
axes[0].set_ylabel('Robust Accuracy (%)', fontsize=13)
axes[0].set_title('Robustness vs Perturbation Budget', fontsize=14, fontweight='bold')
axes[0].legend(fontsize=11, loc='upper right')
axes[0].grid(True, alpha=0.3)
axes[0].set_ylim(0, 100)

# ============================================================================
# Plot 2: Attack comparison bar chart
# ============================================================================
attack_names = ['FGSM\nε=0.03', 'PGD-20\nε=0.03', 'PGD-40\nε=0.03', 'C&W']
attack_accs = []
attack_stds = []

# FGSM ε=0.03
results = all_results['FGSM'][0.03]
attack_accs.append(np.mean([r['robust_accuracy'] for r in results]))
attack_stds.append(np.std([r['robust_accuracy'] for r in results]))

# PGD-20 ε=0.03
results = all_results['PGD']['eps0.03_steps20']
attack_accs.append(np.mean([r['robust_accuracy'] for r in results]))
attack_stds.append(np.std([r['robust_accuracy'] for r in results]))

# PGD-40 ε=0.03
results = all_results['PGD']['eps0.03_steps40']
attack_accs.append(np.mean([r['robust_accuracy'] for r in results]))
attack_stds.append(np.std([r['robust_accuracy'] for r in results]))

# C&W
results = all_results['CW']
attack_accs.append(np.mean([r['robust_accuracy'] for r in results]))
attack_stds.append(np.std([r['robust_accuracy'] for r in results]))

colors = [COLORS['fgsm'], COLORS['pgd'], '#6c3483', COLORS['cw']]
bars = axes[1].bar(attack_names, attack_accs, yerr=attack_stds, 
                   color=colors, alpha=0.8, capsize=8, width=0.6, edgecolor='black', linewidth=1.5)

# Add value labels on bars
for bar, acc in zip(bars, attack_accs):
    height = bar.get_height()
    axes[1].annotate(f'{acc:.1f}%', xy=(bar.get_x() + bar.get_width()/2, height),
                    xytext=(0, 5), textcoords='offset points',
                    ha='center', va='bottom', fontsize=11, fontweight='bold')

axes[1].axhline(y=clean_acc_mean, color='green', linestyle='-', linewidth=2, 
                label=f'Clean ({clean_acc_mean:.1f}%)', alpha=0.7)
axes[1].axhline(y=100/CONFIG['num_classes'], color='gray', linestyle='--', 
                label=f'Random ({100/CONFIG["num_classes"]:.1f}%)', alpha=0.5)

axes[1].set_ylabel('Robust Accuracy (%)', fontsize=13)
axes[1].set_title('Attack Comparison (ε=0.03)', fontsize=14, fontweight='bold')
# Keep the clean-accuracy reference line inside the visible range
axes[1].set_ylim(0, max(max(attack_accs) + 20, clean_acc_mean + 5))
axes[1].legend(fontsize=11, loc='upper right')
axes[1].grid(True, alpha=0.3, axis='y')

plt.tight_layout()
fig.suptitle('Baseline ResNet-50 Adversarial Robustness on ISIC 2018\n(Mean ± Std across 3 seeds)', 
             fontsize=16, fontweight='bold', y=1.02)

# Save
curves_save_path = f"{results_dir}/robustness_curves.png"
fig.savefig(curves_save_path, dpi=150, bbox_inches='tight')
print(f"✅ Robustness curves saved to: {curves_save_path}")

plt.show()

In [None]:
# ============================================================================
# FINAL SUMMARY REPORT
# ============================================================================
print("=" * 70)
print("PHASE 4 - BASELINE ROBUSTNESS EVALUATION - FINAL SUMMARY")
print("=" * 70)

# Extract key results
clean_acc_mean = np.mean([r['accuracy'] for r in all_results['clean']])
clean_acc_std = np.std([r['accuracy'] for r in all_results['clean']])

# FGSM ε=0.03 results
fgsm_results = all_results['FGSM'][0.03]
fgsm_robust_mean = np.mean([r['robust_accuracy'] for r in fgsm_results])
fgsm_robust_std = np.std([r['robust_accuracy'] for r in fgsm_results])
fgsm_drop_mean = np.mean([r['accuracy_drop'] for r in fgsm_results])
fgsm_success_mean = np.mean([r['attack_success_rate'] for r in fgsm_results])

# PGD-20 ε=0.03 results
pgd_results = all_results['PGD']['eps0.03_steps20']
pgd_robust_mean = np.mean([r['robust_accuracy'] for r in pgd_results])
pgd_robust_std = np.std([r['robust_accuracy'] for r in pgd_results])
pgd_drop_mean = np.mean([r['accuracy_drop'] for r in pgd_results])
pgd_success_mean = np.mean([r['attack_success_rate'] for r in pgd_results])

# C&W results
cw_results = all_results['CW']
cw_robust_mean = np.mean([r['robust_accuracy'] for r in cw_results])
cw_robust_std = np.std([r['robust_accuracy'] for r in cw_results])
cw_drop_mean = np.mean([r['accuracy_drop'] for r in cw_results])
cw_l2_mean = np.mean([r['mean_l2_dist'] for r in cw_results])

print("\n📋 KEY FINDINGS:")
print("-" * 70)

print(f"\n1. BASELINE CLEAN ACCURACY:")
print(f"   {clean_acc_mean:.2f}% ± {clean_acc_std:.2f}%")

print(f"\n2. FGSM ATTACK (ε=0.03):")
print(f"   Robust Accuracy: {fgsm_robust_mean:.2f}% ± {fgsm_robust_std:.2f}%")
print(f"   Accuracy Drop: {fgsm_drop_mean:.2f}pp")
print(f"   Attack Success Rate: {fgsm_success_mean:.2f}%")

print(f"\n3. PGD-20 ATTACK (ε=0.03):")
print(f"   Robust Accuracy: {pgd_robust_mean:.2f}% ± {pgd_robust_std:.2f}%")
print(f"   Accuracy Drop: {pgd_drop_mean:.2f}pp")
print(f"   Attack Success Rate: {pgd_success_mean:.2f}%")

print(f"\n4. CARLINI & WAGNER ATTACK:")
print(f"   Robust Accuracy: {cw_robust_mean:.2f}% ± {cw_robust_std:.2f}%")
print(f"   Accuracy Drop: {cw_drop_mean:.2f}pp")
print(f"   Mean L2 Distance: {cw_l2_mean:.4f}")

print("\n" + "=" * 70)
print("PHASE 4.3 CHECKLIST VERIFICATION:")
print("=" * 70)
print("✅ All attacks implemented and tested (FGSM, PGD, C&W)")
print("✅ Baseline robustness evaluated across 3 seeds")
print(f"✅ Accuracy drop verified: {pgd_drop_mean:.1f}pp under PGD-20")
print("✅ Statistical aggregation completed (mean ± std)")
print("✅ Adversarial examples visualized")
print(f"✅ Results saved to: {results_dir}")

print("\n🎯 CONCLUSION:")
if pgd_drop_mean >= 30:
    print("   ✅ Baseline model shows SIGNIFICANT VULNERABILITY to adversarial attacks")
    print("   ✅ This validates the need for robust training in Phase 5")
    print("   ✅ Ready to proceed with Tri-Objective Robust XAI Training")
else:
    print("   ⚠️  Baseline model shows some robustness")
    print("   ℹ️  Consider stronger attack parameters for evaluation")

print("\n" + "=" * 70)
print("🎓 DISSERTATION TAKEAWAY:")
print("-" * 70)
print(f"   The baseline ResNet-50 achieves {clean_acc_mean:.1f}% clean accuracy on ISIC 2018,")
print(f"   but drops to only {pgd_robust_mean:.1f}% under PGD-20 attack (ε=0.03).")
print(f"   This {pgd_drop_mean:.1f}pp accuracy drop demonstrates the critical need")
print("   for adversarially robust training in medical imaging applications.")
print("=" * 70)

# Section 9: Phase 4.4 - Attack Transferability (Optional)

**Test adversarial transferability across different model architectures**

⚠️ **Note:** This section requires checkpoints from other architectures (e.g., EfficientNet, DenseNet).
If they are not available, skip this section.
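
One common way to score transferability per sample, complementary to comparing accuracy drops, is sketched below under the assumption that predictions are available as plain integer class indices. Among the samples the target model classifies correctly on clean inputs, it counts the fraction misclassified once the source-crafted perturbation is applied. `cross_model_success_rate` and the toy labels are hypothetical; they are not the project's evaluation code.

```python
def cross_model_success_rate(labels, target_clean_preds, target_adv_preds):
    """Percentage of initially-correct target predictions flipped by transferred examples."""
    correct = [i for i, (y, p) in enumerate(zip(labels, target_clean_preds)) if y == p]
    if not correct:
        return float("nan")  # undefined when the target gets nothing right on clean inputs
    fooled = sum(1 for i in correct if target_adv_preds[i] != labels[i])
    return 100.0 * fooled / len(correct)

# Toy example with 5 samples (class indices are made up)
labels             = [0, 1, 2, 3, 4]
target_clean_preds = [0, 1, 2, 0, 4]   # 4 of 5 correct on clean inputs
target_adv_preds   = [0, 5, 2, 0, 1]   # 2 of those 4 are now wrong
rate = cross_model_success_rate(labels, target_clean_preds, target_adv_preds)
print(f"Cross-model attack success rate: {rate:.1f}%")
```

Conditioning on initially-correct samples keeps the target model's clean errors from being counted as transfer successes.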

In [None]:
# Transferability study (optional - requires additional model checkpoints)
# Uncomment and run if you have checkpoints from other architectures

"""
# Example: Test transferability from ResNet-50 to EfficientNet

# Load target model (EfficientNet)
target_checkpoint = "/content/drive/MyDrive/checkpoints/efficientnet/seed_42/best.pt"
target_model = load_model_and_checkpoint(
    checkpoint_path=target_checkpoint,
    model_name="efficientnet_b0",
    num_classes=CONFIG['num_classes'],
    device=CONFIG['device']
)

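# Generate adversarials on source model (ResNet-50)
source_model = vis_model  # Already loaded ResNet-50

# Keep both models in eval mode so BatchNorm/Dropout layers behave deterministically
source_model.eval()
target_model.eval()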

# Get test batch
transfer_images, transfer_labels = next(iter(test_loader))
transfer_images = transfer_images.to(CONFIG['device'])
transfer_labels = transfer_labels.to(CONFIG['device'])

# Generate adversarials with PGD on ResNet-50
pgd_transfer = PGD(
    epsilon=8/255,
    alpha=2/255,
    num_steps=20,
    random_start=True,
    clip_min=0.0,
    clip_max=1.0,
    targeted=False
)

adv_images_transfer = pgd_transfer(source_model, transfer_images, transfer_labels)

# Evaluate on source model
with torch.no_grad():
    source_clean_logits = source_model(transfer_images)
    source_adv_logits = source_model(adv_images_transfer)
    
    source_clean_acc = (source_clean_logits.argmax(1) == transfer_labels).float().mean().item() * 100
    source_adv_acc = (source_adv_logits.argmax(1) == transfer_labels).float().mean().item() * 100

# Evaluate on target model
with torch.no_grad():
    target_clean_logits = target_model(transfer_images)
    target_adv_logits = target_model(adv_images_transfer)
    
    target_clean_acc = (target_clean_logits.argmax(1) == transfer_labels).float().mean().item() * 100
    target_adv_acc = (target_adv_logits.argmax(1) == transfer_labels).float().mean().item() * 100

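# Compute transferability rate: accuracy drop on the target as a percentage of
# the drop on the source (both measured on this single batch)
source_drop = source_clean_acc - source_adv_acc
target_drop = target_clean_acc - target_adv_acc
transfer_rate = target_drop / source_drop * 100 if source_drop > 0 else float('nan')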

print(f"Source Model (ResNet-50):")
print(f"  Clean Accuracy: {source_clean_acc:.2f}%")
print(f"  Adversarial Accuracy: {source_adv_acc:.2f}%")
print(f"  Accuracy Drop: {source_clean_acc - source_adv_acc:.2f}pp")

print(f"\nTarget Model (EfficientNet):")
print(f"  Clean Accuracy: {target_clean_acc:.2f}%")
print(f"  Adversarial Accuracy (transferred): {target_adv_acc:.2f}%")
print(f"  Accuracy Drop: {target_clean_acc - target_adv_acc:.2f}pp")

print(f"\nTransferability Rate: {transfer_rate:.2f}%")
"""

print("⚠️  Transferability study skipped - requires additional model checkpoints")
print("   To enable, uncomment the code above and provide checkpoints from different architectures")

# 🎉 Phase 4 Execution Complete!

---

## ✅ Completed Tasks

### Phase 4.3: Baseline Robustness Evaluation
- ✅ Evaluated FGSM attack (3 epsilons × 3 seeds = 9 experiments)
- ✅ Evaluated PGD attack (3 epsilons × 3 step counts × 3 seeds = 27 experiments)
- ✅ Evaluated C&W attack (3 seeds)
- ✅ Statistical aggregation (mean ± std)
- ✅ Results saved to JSON
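
The per-seed metrics and their aggregation can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names, the definition of attack success rate (misclassified after the attack, among samples classified correctly when clean), the use of the sample standard deviation, and all toy numbers are hypothetical, and the notebook's own evaluation code may define them differently.

```python
import json
import statistics
from typing import Dict, Sequence


def robustness_metrics(
    labels: Sequence[int], clean_preds: Sequence[int], adv_preds: Sequence[int]
) -> Dict[str, float]:
    """Clean accuracy, robust accuracy, drop (pp) and attack success rate, all in %."""
    n = len(labels)
    clean_correct = [c == y for c, y in zip(clean_preds, labels)]
    adv_correct = [a == y for a, y in zip(adv_preds, labels)]
    clean_acc = 100.0 * sum(clean_correct) / n
    robust_acc = 100.0 * sum(adv_correct) / n
    # Assumed definition: an attack succeeds when a clean-correct sample becomes wrong
    n_clean_correct = sum(clean_correct)
    flipped = sum(c and not a for c, a in zip(clean_correct, adv_correct))
    asr = 100.0 * flipped / n_clean_correct if n_clean_correct else 0.0
    return {
        "clean_acc": clean_acc,
        "robust_acc": robust_acc,
        "accuracy_drop_pp": clean_acc - robust_acc,
        "attack_success_rate": asr,
    }


def aggregate_over_seeds(per_seed: Dict[int, Dict[str, float]]) -> Dict[str, Dict[str, float]]:
    """Mean and sample standard deviation of every metric across seeds."""
    metric_names = next(iter(per_seed.values())).keys()
    summary = {}
    for name in metric_names:
        values = [metrics[name] for metrics in per_seed.values()]
        summary[name] = {
            "mean": statistics.mean(values),
            "std": statistics.stdev(values) if len(values) > 1 else 0.0,
        }
    return summary


# Toy single-seed example (hypothetical labels and predictions)
example = robustness_metrics(
    labels=[0, 1, 1, 0], clean_preds=[0, 1, 1, 1], adv_preds=[1, 0, 1, 1]
)

# Toy per-seed values (not results from this notebook)
per_seed = {
    42: {"clean_acc": 80.0, "robust_acc": 10.0},
    123: {"clean_acc": 82.0, "robust_acc": 12.0},
    456: {"clean_acc": 84.0, "robust_acc": 14.0},
}
summary = aggregate_over_seeds(per_seed)
print(json.dumps(summary, indent=2))
```

With only three seeds, the sample standard deviation (`statistics.stdev`) is noticeably larger than the population one (`statistics.pstdev`), so it is worth stating which of the two a reported "± std" refers to.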

### Phase 4.5: Adversarial Visualization
- ✅ Generated adversarial example visualizations
- ✅ Created amplified perturbation visualizations
- ✅ Comparison plots (robustness vs epsilon, attack comparison)
- ✅ All figures saved to results directory
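
Perturbations at ε=8/255 are too small to see directly, which is why they are amplified for display. The sketch below shows one common way to do this, assuming pixel values in [0, 1]: scale the difference between adversarial and clean pixels, re-centre it on mid-grey, and clip. The function names, the default factor of 10, and the flattened toy pixel lists are assumptions for illustration; the notebook's plotting code may use a different scheme.

```python
from typing import List, Sequence


def amplify_perturbation(
    clean: Sequence[float], adv: Sequence[float], factor: float = 10.0
) -> List[float]:
    """Scale the perturbation (adv - clean) and re-centre it on mid-grey for display.

    Inputs are flattened pixel values in [0, 1]. With PyTorch tensors the same
    idea can be written as (0.5 + factor * (adv - clean)).clamp(0.0, 1.0).
    """
    return [min(1.0, max(0.0, 0.5 + factor * (a - c))) for c, a in zip(clean, adv)]


def linf_norm(clean: Sequence[float], adv: Sequence[float]) -> float:
    """Largest absolute per-pixel change, i.e. the L-infinity norm of the perturbation."""
    return max(abs(a - c) for c, a in zip(clean, adv))


# Toy pixel values (hypothetical, for illustration only)
clean = [0.20, 0.50, 0.80, 0.40]
adv = [0.23, 0.47, 0.80, 0.50]

amplified = amplify_perturbation(clean, adv, factor=10.0)
print("Amplified:", [round(v, 3) for v in amplified])
print("L-inf norm:", round(linf_norm(clean, adv), 3))
```

Unchanged pixels map to 0.5 (grey), positive changes appear brighter and negative changes darker; values that would leave [0, 1] after scaling are clipped.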

### Phase 4.4: Attack Transferability
- ⏭️ Skipped (requires additional model architectures)

---

## 📊 Expected Outputs

All results saved to: `/content/drive/MyDrive/results/robustness/`

**Files Generated:**
1. `baseline_robustness_aggregated.json` - Statistical results across seeds
2. `adversarial_examples_visualization.png` - Clean vs adversarial examples
3. `perturbation_visualization.png` - Amplified perturbations
4. `attack_comparison.png` - Attack effectiveness comparison

---

## 🎯 Next Steps

1. **Review Results:** Check that the accuracy drops under PGD (ε=8/255) fall within the expected 50-70pp range
2. **Dissertation:** Use the generated figures for the Phase 4 results chapter
3. **Phase 5:** Proceed to tri-objective robust XAI training if baseline vulnerability is confirmed
4. **Optional:** Run the transferability study if you train models with different architectures

---

## 📝 Citation


- FGSM: Goodfellow et al., "Explaining and Harnessing Adversarial Examples" (2015), https://doi.org/10.48550/arXiv.1412.6572
- PGD: Madry et al., "Towards Deep Learning Models Resistant to Adversarial Attacks" (2018), https://openreview.net/forum?id=rJzIBfZAb
- C&W: Carlini & Wagner, "Towards Evaluating the Robustness of Neural Networks" (2017), https://doi.org/10.48550/arXiv.1608.04644