# PyTorch Adversarial MNIST: Building Secure Deep Learning Models

**Educational Walkthrough: CNN Architecture + Adversarial Robustness Testing**

---

## Learning Objectives

By the end of this notebook, you will be able to:

1. **Build a CNN from scratch** using `torch.nn.Module`
2. **Train and evaluate** a model on MNIST with proper data loading
3. **Use forward hooks** to monitor model internals during inference
4. **Generate adversarial examples** using the Fast Gradient Sign Method (FGSM)
5. **Evaluate model robustness** against adversarial perturbations
6. **Visualize vulnerabilities** to understand security implications

---

## Time Estimate: 2-3 hours

## Prerequisites

This notebook requires the following packages:

In [None]:
# Install required packages (uncomment if needed)
# !pip install torch torchvision matplotlib numpy

---

# Part 1: Setup and Imports

Let's start by importing all the libraries we'll need throughout this tutorial.

In [None]:
#!/usr/bin/env python3
"""
MNIST CNN with adversarial robustness testing.

This notebook demonstrates:
1. Building a CNN from scratch
2. Training on MNIST
3. Using hooks to monitor activations
4. Generating adversarial examples with FGSM
"""

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

import matplotlib.pyplot as plt
import numpy as np
from typing import List, Tuple, Optional

# Set random seeds for reproducibility
torch.manual_seed(42)
np.random.seed(42)

print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"CUDA device: {torch.cuda.get_device_name(0)}")

**Learning Point:** These imports give you:

- `torch.nn` - Neural network building blocks (layers, loss functions)
- `torch.optim` - Optimization algorithms (SGD, Adam, etc.)
- `torchvision` - Datasets and image transformations
- `DataLoader` - Efficient batching, shuffling, and parallel data loading

---

# Part 2: Understanding the Architecture

## What We're Building

A simple Convolutional Neural Network (CNN) for MNIST digit classification.

**Architecture Flow:**

```
Input (28×28 grayscale image)
    ↓
Conv2d (1 → 32 channels, 3×3 kernel)
    ↓
ReLU activation
    ↓
MaxPool2d (2×2) → spatial size halved
    ↓
Conv2d (32 → 64 channels, 3×3 kernel)
    ↓
ReLU activation
    ↓
MaxPool2d (2×2) → spatial size halved again
    ↓
Flatten → convert to 1D vector
    ↓
Linear (3136 → 128)
    ↓
ReLU activation
    ↓
Dropout (0.5) → prevent overfitting
    ↓
Linear (128 → 10) → output logits
```
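
The spatial sizes in the flow above follow from the usual output-size rule for convolution and pooling layers, `floor((n + 2*padding - kernel) / stride) + 1`. The sketch below walks through that arithmetic, assuming `padding=1` on both convolutions and a 2×2 pool with stride 2. The helper name `out_size` is purely illustrative and not part of PyTorch.

```python
def out_size(n: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Output length along one spatial axis for a conv or pooling layer."""
    return (n + 2 * padding - kernel) // stride + 1


size = 28
size = out_size(size, kernel=3, padding=1)  # conv1: 28 -> 28
size = out_size(size, kernel=2, stride=2)   # pool:  28 -> 14
size = out_size(size, kernel=3, padding=1)  # conv2: 14 -> 14
size = out_size(size, kernel=2, stride=2)   # pool:  14 -> 7

# 64 channels, each a 7x7 map, flattened into one vector
flat_features = 64 * size * size
print(size, flat_features)
```

This is where the `3136` in the first `Linear` layer comes from: 64 × 7 × 7.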

## Build the CNN Class

In [None]:
class SimpleCNN(nn.Module):
    """
    Simple CNN for MNIST classification.
    
    Architecture: Conv → ReLU → Pool → Conv → ReLU → Pool → FC → ReLU → Dropout → FC
    """
    
    def __init__(self):
        super().__init__()
        
        # Convolutional layers
        self.conv1 = nn.Conv2d(
            in_channels=1,      # MNIST is grayscale (1 channel)
            out_channels=32,    # Learn 32 different feature maps
            kernel_size=3,      # 3×3 kernels
            padding=1           # Keep spatial dimensions
        )
        
        self.conv2 = nn.Conv2d(
            in_channels=32,
            out_channels=64,
            kernel_size=3,
            padding=1
        )
        
        # Pooling layer (will be reused)
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
        
        # Fully connected layers
        # After 2 pooling layers: 28 → 14 → 7
        # So: 64 channels × 7 × 7 = 3136 features
        self.fc1 = nn.Linear(64 * 7 * 7, 128)
        self.fc2 = nn.Linear(128, 10)  # 10 digit classes (0-9)
        
        # Dropout for regularization
        self.dropout = nn.Dropout(0.5)
    
    def forward(self, x):
        """
        Forward pass through the network.
        
        Args:
            x: Input tensor of shape (batch_size, 1, 28, 28)
        
        Returns:
            Logits of shape (batch_size, 10)
        """
        # First conv block
        x = self.conv1(x)           # (batch, 1, 28, 28) → (batch, 32, 28, 28)
        x = F.relu(x)               # Activation
        x = self.pool(x)            # (batch, 32, 28, 28) → (batch, 32, 14, 14)
        
        # Second conv block
        x = self.conv2(x)           # (batch, 32, 14, 14) → (batch, 64, 14, 14)
        x = F.relu(x)
        x = self.pool(x)            # (batch, 64, 14, 14) → (batch, 64, 7, 7)
        
        # Flatten for fully connected layers
        x = x.view(-1, 64 * 7 * 7)  # (batch, 64, 7, 7) → (batch, 3136)
        
        # Fully connected layers
        x = self.fc1(x)             # (batch, 3136) → (batch, 128)
        x = F.relu(x)
        x = self.dropout(x)         # Random dropout during training
        x = self.fc2(x)             # (batch, 128) → (batch, 10)
        
        return x

**Key Learning Points:**

- `nn.Module` - Base class for all neural networks in PyTorch
- `super().__init__()` - Required to initialize the parent class
- `self.conv1 = nn.Conv2d(...)` - Define layers in `__init__`
- `forward()` method - Defines the computation graph
- `F.relu()` - Functional API (no learnable parameters)
- `x.view()` - Reshape tensors (similar to numpy's reshape)

## Test the Model Architecture

Before training, let's verify the architecture works with the expected input shapes.

In [None]:
def test_model_architecture():
    """Verify the model architecture works with expected input shapes."""
    model = SimpleCNN()
    
    # Create a batch of 4 random MNIST images
    batch_size = 4
    x = torch.randn(batch_size, 1, 28, 28)
    
    # Forward pass
    output = model(x)
    
    print(f"Input shape: {x.shape}")
    print(f"Output shape: {output.shape}")
    print(f"Expected output shape: ({batch_size}, 10)")
    
    assert output.shape == (batch_size, 10), "Output shape mismatch!"
    print("✓ Model architecture test passed!")
    
    # Count parameters
    total_params = sum(p.numel() for p in model.parameters())
    trainable_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
    print(f"\nTotal parameters: {total_params:,}")
    print(f"Trainable parameters: {trainable_params:,}")

# Run the test
test_model_architecture()

**Expected Output:**
- Input shape: `torch.Size([4, 1, 28, 28])`
- Output shape: `torch.Size([4, 10])`
- Total parameters: 421,642 (all trainable)
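
The parameter total can be checked by hand. A convolution has `in_channels × out_channels × k × k` weights plus one bias per output channel, and a linear layer has `in × out` weights plus one bias per output. The sketch below applies those two rules to the layer sizes used in `SimpleCNN`; the helper names are illustrative only.

```python
def conv2d_params(c_in: int, c_out: int, k: int) -> int:
    """Weights (c_in * c_out * k * k) plus one bias per output channel."""
    return c_in * c_out * k * k + c_out


def linear_params(n_in: int, n_out: int) -> int:
    """Weights (n_in * n_out) plus one bias per output unit."""
    return n_in * n_out + n_out


counts = {
    "conv1": conv2d_params(1, 32, 3),
    "conv2": conv2d_params(32, 64, 3),
    "fc1": linear_params(64 * 7 * 7, 128),
    "fc2": linear_params(128, 10),
}
total = sum(counts.values())

for name, n in counts.items():
    print(f"{name}: {n:,}")
print(f"total: {total:,}")
```

Almost all of the parameters sit in `fc1`, which is typical for small CNNs that flatten straight into a dense layer.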

---

# Part 3: Data Loading and Preprocessing

PyTorch provides the `DataLoader` class for efficient data handling. Let's set up our MNIST data pipeline.

In [None]:
def get_data_loaders(batch_size: int = 64, download: bool = True):
    """
    Create training and test data loaders for MNIST.
    
    Args:
        batch_size: Number of samples per batch
        download: Whether to download MNIST if not present
    
    Returns:
        Tuple of (train_loader, test_loader)
    """
    # Define transformations
    transform = transforms.Compose([
        transforms.ToTensor(),           # Convert PIL Image to tensor [0, 1]
        transforms.Normalize(            # Normalize to mean=0, std=1
            mean=(0.1307,),              # MNIST mean
            std=(0.3081,)                # MNIST std
        )
    ])
    
    # Download and load training data
    train_dataset = datasets.MNIST(
        root='./data',
        train=True,
        download=download,
        transform=transform
    )
    
    # Download and load test data
    test_dataset = datasets.MNIST(
        root='./data',
        train=False,
        download=download,
        transform=transform
    )
    
    # Create data loaders
    train_loader = DataLoader(
        train_dataset,
        batch_size=batch_size,
        shuffle=True,          # Shuffle training data
        num_workers=2          # Use 2 subprocesses for data loading
    )
    
    test_loader = DataLoader(
        test_dataset,
        batch_size=batch_size,
        shuffle=False,         # Don't shuffle test data
        num_workers=2
    )
    
    print(f"Training samples: {len(train_dataset)}")
    print(f"Test samples: {len(test_dataset)}")
    print(f"Batch size: {batch_size}")
    print(f"Training batches: {len(train_loader)}")
    print(f"Test batches: {len(test_loader)}")
    
    return train_loader, test_loader

# Create data loaders
train_loader, test_loader = get_data_loaders(batch_size=64)

**Key Learning Points:**

- `transforms.Compose()` - Chain multiple transformations
- `transforms.Normalize()` - Standardize inputs (critical for training stability)
- `DataLoader` - Handles batching, shuffling, and parallel loading
- `shuffle=True` - Randomize training order (prevents overfitting to sequence)
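
`transforms.Normalize` computes `(pixel - mean) / std` per channel, so after normalization the data no longer lives in [0, 1]. The short sketch below works out where the raw pixel extremes end up, which is useful later when you need to clamp or denormalize images. The function names are illustrative.

```python
MNIST_MEAN, MNIST_STD = 0.1307, 0.3081


def normalize(pixel: float) -> float:
    """Same arithmetic as transforms.Normalize for a single value."""
    return (pixel - MNIST_MEAN) / MNIST_STD


def denormalize(value: float) -> float:
    """Inverse mapping, back to the raw [0, 1] pixel scale."""
    return value * MNIST_STD + MNIST_MEAN


lo = normalize(0.0)  # black pixel, roughly -0.42
hi = normalize(1.0)  # white pixel, roughly 2.82
print(f"normalized range: [{lo:.4f}, {hi:.4f}]")
```

Note that the range is not symmetric around zero: most MNIST pixels are black background, so the mean is close to 0.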

## Visualize Sample Data

Let's look at some examples from the dataset to understand what we're working with.

In [None]:
def visualize_batch(data_loader, num_samples: int = 8):
    """Display a few samples from the dataset."""
    # Get one batch
    images, labels = next(iter(data_loader))
    
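    # Create a subplot grid with 4 columns and enough rows for num_samples
    num_samples = min(num_samples, len(images))
    cols = 4
    rows = (num_samples + cols - 1) // cols
    fig, axes = plt.subplots(rows, cols, figsize=(3 * cols, 3 * rows), squeeze=False)
    axes = axes.ravel()
    
    for ax in axes:
        ax.axis('off')  # Hide frames, including any unused cells
    
    for i in range(num_samples):
        # Denormalize for visualization
        img = images[i].squeeze()  # Remove channel dimension
        img = img * 0.3081 + 0.1307  # Reverse normalization
        
        axes[i].imshow(img, cmap='gray')
        axes[i].set_title(f'Label: {labels[i].item()}', fontsize=14)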
    
    plt.tight_layout()
    plt.show()
    print("✓ Visualization complete")

# Visualize training samples
visualize_batch(train_loader)

---

# Part 4: Training the Model

Now we'll implement the training loop. This is where the model learns from the data.

## Training Function

In [None]:
def train_one_epoch(
    model: nn.Module,
    train_loader: DataLoader,
    optimizer: optim.Optimizer,
    device: torch.device,
    epoch: int
) -> float:
    """
    Train the model for one epoch.
    
    Returns:
        Average loss for the epoch
    """
    model.train()  # Set model to training mode (enables dropout)
    
    total_loss = 0.0
    correct = 0
    total = 0
    
    for batch_idx, (data, target) in enumerate(train_loader):
        # Move data to device (CPU or GPU)
        data, target = data.to(device), target.to(device)
        
        # Zero gradients from previous iteration
        optimizer.zero_grad()
        
        # Forward pass
        output = model(data)
        
        # Compute loss
        loss = F.cross_entropy(output, target)
        
        # Backward pass (compute gradients)
        loss.backward()
        
        # Update weights
        optimizer.step()
        
        # Track statistics
        total_loss += loss.item()
        pred = output.argmax(dim=1)  # Get predicted class
        correct += pred.eq(target).sum().item()
        total += target.size(0)
        
        # Print progress every 100 batches
        if batch_idx % 100 == 0:
            print(f'Epoch {epoch} [{batch_idx}/{len(train_loader)}] '
                  f'Loss: {loss.item():.4f} '
                  f'Acc: {100. * correct / total:.2f}%')
    
    avg_loss = total_loss / len(train_loader)
    accuracy = 100. * correct / total
    
    print(f'Epoch {epoch} Summary: '
          f'Avg Loss: {avg_loss:.4f} '
          f'Accuracy: {accuracy:.2f}%')
    
    return avg_loss

**Key Learning Points:**

- `model.train()` - Enables dropout and batch norm training mode
- `optimizer.zero_grad()` - **CRITICAL:** Clear gradients from previous iteration
- `loss.backward()` - Compute gradients via backpropagation
- `optimizer.step()` - Update weights using computed gradients
- `.item()` - Extract Python scalar from tensor
- `.to(device)` - Move tensors to GPU if available
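
To see why clearing gradients matters, note that `backward()` adds into `.grad` rather than overwriting it. The sketch below uses a tiny hand-made tensor `w` (not the CNN) to show the accumulation and the effect of resetting.

```python
import torch

w = torch.tensor([1.0, 2.0], requires_grad=True)
x = torch.tensor([3.0, 4.0])

# d/dw of sum(w * x) is x
(w * x).sum().backward()
first = w.grad.clone()

# A second backward pass without zeroing adds to the stored gradient
(w * x).sum().backward()
accumulated = w.grad.clone()

# Resetting restores the single-pass gradient
w.grad.zero_()
(w * x).sum().backward()
after_reset = w.grad.clone()

print(first, accumulated, after_reset)
```

Without the reset, each optimizer step would use the sum of all gradients seen so far, not the gradient of the current batch.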

## Evaluation Function

In [None]:
def evaluate(
    model: nn.Module,
    test_loader: DataLoader,
    device: torch.device
) -> Tuple[float, float]:
    """
    Evaluate model on test set.
    
    Returns:
        Tuple of (test_loss, test_accuracy)
    """
    model.eval()  # Set to evaluation mode (disables dropout)
    
    test_loss = 0.0
    correct = 0
    total = 0
    
    # No gradient computation needed for evaluation
    with torch.no_grad():
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)
            
            output = model(data)
            
            # Sum up batch loss
            test_loss += F.cross_entropy(output, target, reduction='sum').item()
            
            # Get predictions
            pred = output.argmax(dim=1)
            correct += pred.eq(target).sum().item()
            total += target.size(0)
    
    test_loss /= total
    accuracy = 100. * correct / total
    
    print(f'\nTest set: Average loss: {test_loss:.4f}, '
          f'Accuracy: {correct}/{total} ({accuracy:.2f}%)\n')
    
    return test_loss, accuracy

**Key Learning Points:**

- `model.eval()` - Disables dropout and batch norm updates
- `with torch.no_grad()` - Disable gradient computation (saves memory)
- `reduction='sum'` - Sum losses instead of averaging (we'll average manually)
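
Summing and then dividing by the sample count gives the true per-sample average even when the last batch is smaller than the rest (with 10,000 test images and a batch size of 64, the final batch holds only 16). The sketch below uses made-up per-sample losses to show how averaging the batch means instead would over-weight a short final batch.

```python
# Hypothetical per-sample losses: four easy samples and one hard one
per_sample_losses = [1.0, 1.0, 1.0, 1.0, 5.0]
batch_size = 4

batches = [
    per_sample_losses[i:i + batch_size]
    for i in range(0, len(per_sample_losses), batch_size)
]  # [[1, 1, 1, 1], [5]]

# Averaging the per-batch means treats the 1-sample batch like a full one
mean_of_batch_means = sum(sum(b) / len(b) for b in batches) / len(batches)

# Summing everything and dividing by the sample count is the true average
true_mean = sum(sum(b) for b in batches) / sum(len(b) for b in batches)

print(mean_of_batch_means, true_mean)
```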

## Main Training Loop

Now let's put it all together and train the model!

In [None]:
def train_model(epochs: int = 5, batch_size: int = 64, learning_rate: float = 0.001):
    """Main training pipeline."""
    # Set device
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    print(f"Using device: {device}")
    
    # Create model
    model = SimpleCNN().to(device)
    
    # Create optimizer
    optimizer = optim.Adam(model.parameters(), lr=learning_rate)
    
    # Get data loaders
    train_loader, test_loader = get_data_loaders(batch_size=batch_size)
    
    # Training loop
    train_losses = []
    test_accuracies = []
    
    for epoch in range(1, epochs + 1):
        # Train
        train_loss = train_one_epoch(model, train_loader, optimizer, device, epoch)
        train_losses.append(train_loss)
        
        # Evaluate
        test_loss, test_acc = evaluate(model, test_loader, device)
        test_accuracies.append(test_acc)
    
    # Save the trained model
    torch.save(model.state_dict(), 'mnist_cnn.pth')
    print("✓ Model saved to mnist_cnn.pth")
    
    return model, train_losses, test_accuracies

# Train the model
model, train_losses, test_accuracies = train_model(epochs=5)

**Expected Results:**
- Epoch 1: ~96% accuracy
- Epoch 5: >98% accuracy

---

# Part 5: Monitoring with Forward Hooks

Forward hooks let us intercept and monitor activations during inference. This is useful for debugging and security analysis.

## Activation Monitor Class

In [None]:
class ActivationMonitor:
    """Monitor activations during forward pass using hooks."""
    
    def __init__(self, model: nn.Module):
        self.activations = {}
        self.hooks = []
        
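        # Register hooks on the conv and linear layers. SimpleCNN applies ReLU
        # through F.relu, which is a function rather than a module, so there are
        # no nn.ReLU instances to hook; the stats recorded are pre-activation values.
        for name, module in model.named_modules():
            if isinstance(module, (nn.Conv2d, nn.Linear)):
                hook = module.register_forward_hook(self._make_hook(name))
                self.hooks.append(hook)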
    
    def _make_hook(self, name: str):
        """Create a hook function that stores activations."""
        def hook(module, input, output):
            # Store activation statistics
            self.activations[name] = {
                'mean': output.mean().item(),
                'std': output.std().item(),
                'max': output.max().item(),
                'min': output.min().item(),
                'shape': list(output.shape)
            }
        return hook
    
    def remove_hooks(self):
        """Clean up hooks."""
        for hook in self.hooks:
            hook.remove()
    
    def print_stats(self):
        """Print activation statistics."""
        print("\n=== Activation Statistics ===")
        for name, stats in self.activations.items():
            print(f"{name}:")
            print(f"  Shape: {stats['shape']}")
            print(f"  Mean: {stats['mean']:.4f}, Std: {stats['std']:.4f}")
            print(f"  Min: {stats['min']:.4f}, Max: {stats['max']:.4f}")

**Key Learning Points:**

- `register_forward_hook()` - Intercept forward pass
- Hooks receive `(module, input, output)`
- Useful for debugging and security monitoring
- Always call `hook.remove()` when done
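
Here is a minimal, self-contained sketch of the hook lifecycle on a toy `nn.Sequential` model (a hypothetical stand-in, not the MNIST CNN). One detail worth keeping in mind: hooks attach to `nn.Module` instances, so an activation applied through the functional API (`F.relu`) has no module to hook. The toy model therefore uses `nn.ReLU()` as a layer.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
toy = nn.Sequential(nn.Linear(4, 3), nn.ReLU(), nn.Linear(3, 2))

captured = {}


def save_output(module, inputs, output):
    # Called after the module's forward(); store a detached copy
    captured["relu"] = output.detach()


# register_forward_hook returns a handle that can remove the hook later
handle = toy[1].register_forward_hook(save_output)
with torch.no_grad():
    toy(torch.randn(5, 4))
handle.remove()

captured_shape = tuple(captured["relu"].shape)
captured_min = captured["relu"].min().item()  # ReLU output is never negative

# After removal the hook no longer fires
captured.clear()
with torch.no_grad():
    toy(torch.randn(5, 4))
hook_fired_after_removal = "relu" in captured

print(captured_shape, captured_min, hook_fired_after_removal)
```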

## Test Activation Monitoring

In [None]:
def test_activation_monitoring():
    """Demonstrate activation monitoring with hooks."""
    device = torch.device("cpu")  # Use CPU for simplicity
    
    # Load trained model
    model = SimpleCNN().to(device)
    model.load_state_dict(torch.load('mnist_cnn.pth', map_location=device))
    model.eval()
    
    # Get a test batch
    _, test_loader = get_data_loaders(batch_size=4)
    images, labels = next(iter(test_loader))
    images = images.to(device)
    
    # Create monitor
    monitor = ActivationMonitor(model)
    
    # Run inference
    with torch.no_grad():
        output = model(images)
    
    # Print statistics
    monitor.print_stats()
    
    # Clean up
    monitor.remove_hooks()

# Run the test
test_activation_monitoring()

---

# Part 6: Adversarial Examples with FGSM

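**Fast Gradient Sign Method (FGSM)** is one of the simplest adversarial attacks. It takes a single step of size ε along the sign of the loss gradient with respect to the input. Under a first-order approximation, this is the step that increases the loss the most while changing no pixel by more than ε.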

## Understanding FGSM

The attack works by:
1. Computing the gradient of the loss with respect to the input image
2. Taking the sign of the gradient
3. Adding a small epsilon-scaled perturbation in that direction

**Formula:** `x_adversarial = x + ε × sign(∇_x Loss(x, y_true))`
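
To make the formula concrete, here is a minimal sketch of a single FGSM step on a toy linear classifier. The model `toy`, the random input, and the label are hypothetical stand-ins, not the MNIST CNN. The point is only that one signed-gradient step raises the loss while changing each input value by at most ε.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
toy = nn.Linear(4, 3)                      # stand-in classifier with 3 classes
x = torch.randn(1, 4, requires_grad=True)  # stand-in input
y = torch.tensor([1])                      # stand-in true label
epsilon = 0.1

# Gradient of the loss with respect to the input
loss_clean = F.cross_entropy(toy(x), y)
loss_clean.backward()

# x_adversarial = x + epsilon * sign(grad_x loss)
x_adv = (x + epsilon * x.grad.sign()).detach()
loss_adv = F.cross_entropy(toy(x_adv), y)

max_change = (x_adv - x.detach()).abs().max().item()
print(loss_clean.item(), loss_adv.item(), max_change)
```

For a linear model the loss is convex in the input, so the step is guaranteed to increase it. For a deep network the increase is only expected to first order, which is why small ε values sometimes fail to flip a prediction.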

## FGSM Implementation

In [None]:
def fgsm_attack(
    model: nn.Module,
    images: torch.Tensor,
    labels: torch.Tensor,
    epsilon: float
) -> torch.Tensor:
    """
    Generate adversarial examples using Fast Gradient Sign Method.
    
    Args:
        model: Neural network to attack
        images: Clean images (batch)
        labels: True labels
        epsilon: Perturbation magnitude
    
    Returns:
        Adversarial examples
    """
    # Set model to evaluation mode
    model.eval()
    
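    # Work on a detached copy so the caller's tensor is not modified and
    # gradients do not accumulate across repeated calls
    images = images.clone().detach().requires_grad_(True)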
    
    # Forward pass
    outputs = model(images)
    
    # Calculate loss
    loss = F.cross_entropy(outputs, labels)
    
    # Zero gradients
    model.zero_grad()
    
    # Backward pass to get gradients of input
    loss.backward()
    
    # Collect gradient sign
    gradient_sign = images.grad.sign()
    
    # Create adversarial example
    adversarial = images + epsilon * gradient_sign
    
    # Clamp to the valid image range. After Normalize(mean=0.1307, std=0.3081),
    # raw pixels in [0, 1] map to [(0 - 0.1307) / 0.3081, (1 - 0.1307) / 0.3081],
    # which is approximately [-0.4242, 2.8215].
    adversarial = torch.clamp(adversarial, (0 - 0.1307) / 0.3081, (1 - 0.1307) / 0.3081)
    
    return adversarial.detach()

**Key Learning Points:**

- `requires_grad=True` on the input - Makes autograd compute gradients with respect to the image, not only the weights
- `loss.backward()` - Computes gradients w.r.t. input
- `sign()` - Keeps only the direction of each gradient component; this is the steepest-ascent step under an L∞ (max-norm) budget of ε
- `detach()` - Remove from computation graph
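
One alternative design, sketched here under stated assumptions, is to measure ε on the raw [0, 1] pixel scale: denormalize, perturb, clamp to [0, 1], and normalize again. The sign of the gradient is the same in both spaces because the standard deviation is positive. The function name `fgsm_pixel_space` and the toy model are hypothetical and not part of the notebook's code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

MEAN, STD = 0.1307, 0.3081


def fgsm_pixel_space(model, images_norm, labels, epsilon_pixels):
    """FGSM where epsilon is measured on the raw [0, 1] pixel scale."""
    images_norm = images_norm.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(images_norm), labels)
    # autograd.grad returns the input gradient without touching parameter .grad
    grad, = torch.autograd.grad(loss, images_norm)

    pixels = images_norm.detach() * STD + MEAN  # undo Normalize
    adv_pixels = (pixels + epsilon_pixels * grad.sign()).clamp(0.0, 1.0)
    return (adv_pixels - MEAN) / STD            # re-apply Normalize


torch.manual_seed(0)
toy = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))  # stand-in model
raw = torch.rand(2, 1, 28, 28)                             # pixels in [0, 1)
labels = torch.tensor([3, 7])

adv = fgsm_pixel_space(toy, (raw - MEAN) / STD, labels, epsilon_pixels=0.03)
adv_pixels = adv * STD + MEAN
print(adv_pixels.min().item(), adv_pixels.max().item())
```

Working in pixel space keeps every adversarial image a valid image and makes ε directly comparable with values reported on the [0, 1] scale.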

## Test Adversarial Robustness

Let's test how the model performs against different epsilon values.

In [None]:
def test_adversarial_robustness(epsilon_values: Tuple[float, ...] = (0.0, 0.05, 0.1, 0.2, 0.3)):
    """Test model robustness against FGSM attacks."""
    device = torch.device("cpu")
    
    # Load model
    model = SimpleCNN().to(device)
    model.load_state_dict(torch.load('mnist_cnn.pth', map_location=device))
    model.eval()
    
    # Get test data
    _, test_loader = get_data_loaders(batch_size=1000)
    images, labels = next(iter(test_loader))
    images, labels = images.to(device), labels.to(device)
    
    print("\n=== Adversarial Robustness Test ===")
    results = {}
    
    for epsilon in epsilon_values:
        if epsilon == 0.0:
            # Clean accuracy (no attack)
            with torch.no_grad():
                outputs = model(images)
                predictions = outputs.argmax(dim=1)
                accuracy = (predictions == labels).float().mean().item() * 100
        else:
            # Generate adversarial examples
            adv_images = fgsm_attack(model, images, labels, epsilon)
            
            # Test on adversarial examples
            with torch.no_grad():
                outputs = model(adv_images)
                predictions = outputs.argmax(dim=1)
                accuracy = (predictions == labels).float().mean().item() * 100
        
        results[epsilon] = accuracy
        print(f"Epsilon: {epsilon:.2f} → Accuracy: {accuracy:.2f}%")
    
    return results

# Run robustness test
robustness_results = test_adversarial_robustness()

**Expected Results** (illustrative; exact numbers vary by run):
```
Epsilon: 0.00 → Accuracy: 98.45%
Epsilon: 0.05 → Accuracy: 95.32%
Epsilon: 0.10 → Accuracy: 82.14%
Epsilon: 0.20 → Accuracy: 41.23%
Epsilon: 0.30 → Accuracy: 18.67%
```

**Key Insight:** Even a small perturbation can sharply reduce accuracy. Here ε is measured in normalized units, so ε=0.1 corresponds to 0.1 × 0.3081 ≈ 0.03, about 3% of the [0, 1] pixel range.
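
Because the attack runs on normalized inputs, each ε above is in units of the normalized scale. Multiplying by the standard deviation converts it back to the raw pixel scale. The helper below is an illustrative sketch of that conversion for the ε values used in the test.

```python
MNIST_STD = 0.3081


def to_pixel_units(eps_normalized: float, std: float = MNIST_STD) -> float:
    """Convert a perturbation size from normalized units to the [0, 1] pixel scale."""
    return eps_normalized * std


pixel_eps = {eps: to_pixel_units(eps) for eps in (0.05, 0.1, 0.2, 0.3)}

for eps, px in pixel_eps.items():
    # Also show the size on the familiar 0-255 intensity scale
    print(f"eps={eps:.2f} normalized -> {px:.4f} of [0, 1] ({px * 255:.1f} / 255)")
```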

## Visualize Adversarial Examples

Let's visualize what these adversarial perturbations look like.

In [None]:
def visualize_adversarial_examples(epsilon: float = 0.1):
    """Visualize clean vs adversarial examples."""
    device = torch.device("cpu")
    
    # Load model
    model = SimpleCNN().to(device)
    model.load_state_dict(torch.load('mnist_cnn.pth', map_location=device))
    model.eval()
    
    # Get a few test samples
    _, test_loader = get_data_loaders(batch_size=5)
    images, labels = next(iter(test_loader))
    images, labels = images.to(device), labels.to(device)
    
    # Generate adversarial examples
    adv_images = fgsm_attack(model, images, labels, epsilon)
    
    # Get predictions
    with torch.no_grad():
        clean_outputs = model(images)
        adv_outputs = model(adv_images)
        clean_preds = clean_outputs.argmax(dim=1)
        adv_preds = adv_outputs.argmax(dim=1)
    
    # Visualize
    fig, axes = plt.subplots(3, 5, figsize=(15, 9))
    
    for i in range(5):
        # Denormalize for visualization
        clean_img = images[i].squeeze() * 0.3081 + 0.1307
        adv_img = adv_images[i].squeeze() * 0.3081 + 0.1307
        perturbation = (adv_img - clean_img) * 10  # Amplify for visibility
        
        # Clean image
        axes[0, i].imshow(clean_img.cpu(), cmap='gray')
        axes[0, i].set_title(f'Clean\nPred: {clean_preds[i].item()}\nTrue: {labels[i].item()}', fontsize=12)
        axes[0, i].axis('off')
        
        # Perturbation
        axes[1, i].imshow(perturbation.cpu(), cmap='seismic', vmin=-1, vmax=1)
        axes[1, i].set_title(f'Perturbation\n(ε={epsilon})', fontsize=12)
        axes[1, i].axis('off')
        
        # Adversarial image
        axes[2, i].imshow(adv_img.cpu(), cmap='gray')
        success = '✓' if adv_preds[i] != labels[i] else '✗'
        axes[2, i].set_title(f'Adversarial {success}\nPred: {adv_preds[i].item()}', fontsize=12)
        axes[2, i].axis('off')
    
    plt.tight_layout()
    plt.show()
    print("✓ Adversarial visualization complete")

# Visualize adversarial examples
visualize_adversarial_examples(epsilon=0.1)

---

# Part 7: Security Analysis and Key Takeaways

## What Did We Learn?

### PyTorch Fundamentals

1. **Model Building:**
   - Subclass `nn.Module` for custom architectures
   - Define layers in `__init__`, computation in `forward()`
   - Use `model.train()` vs `model.eval()` for different modes

2. **Training Loop:**
   - Standard pattern: `zero_grad()` → `forward()` → `loss()` → `backward()` → `step()`
   - Wrap inference in `with torch.no_grad()` whenever gradients are not needed (FGSM is the exception, since it needs input gradients)
   - Move tensors to device with `.to(device)`

3. **Hooks:**
   - `register_forward_hook()` for monitoring activations
   - Useful for debugging and security analysis
   - Remember to remove hooks when done

### Security Insights

1. **Neural networks are surprisingly fragile**
   - In the example run above, accuracy falls from about 98% to about 82% at ε=0.1
   - Perturbations this small are often hard for humans to notice

2. **Adversarial robustness ≠ clean accuracy**
   - High test accuracy doesn't mean the model is secure
   - Always test robustness separately

3. **Defense is non-trivial**
   - Simple training doesn't produce robust models
   - Need specialized techniques (adversarial training, certified defenses)

### Best Practices

1. Always normalize inputs
2. Use DataLoader for batching and shuffling
3. Save checkpoints during training
4. Test architecture before training
5. Monitor activations for debugging
6. **Test adversarial robustness before deployment**

---

## Exercises for Further Learning

**Beginner:**
1. Add learning rate scheduling with `torch.optim.lr_scheduler.StepLR`
2. Plot training curves (loss and accuracy over epochs)
3. Try different batch sizes and observe the impact

**Intermediate:**
4. Add batch normalization after conv layers
5. Compare SGD vs Adam vs AdamW optimizers
6. Implement PGD attack (multi-step iterative FGSM)

**Advanced:**
7. Implement adversarial training (train on mix of clean and adversarial examples)
8. Export model to ONNX format
9. Create a robustness curve plotting accuracy vs epsilon

---

## Next Steps

After mastering this notebook, you're ready for:

1. **Transfer Learning:** Fine-tuning pretrained models (ResNet, ViT)
2. **LLM Security:** Building a simple transformer and testing prompt injection
3. **Model Export:** PyTorch → ONNX → TensorRT pipeline
4. **Advanced Attacks:** PGD, C&W, adversarial patches
5. **Defenses:** Adversarial training, defensive distillation, certified robustness

---

## Troubleshooting

**RuntimeError: Expected tensor for argument #1 'indices' to have scalar type Long**
- Solution: Ensure labels are `torch.long` type

**CUDA out of memory**
- Solution: Reduce batch size or use CPU

**Model accuracy stuck at ~10%**
- Solution: Check learning rate (try 0.001) and verify data normalization

**Adversarial examples don't fool the model**
- Solution: Increase epsilon value (try 0.2 or 0.3)

---

## Resources

**PyTorch Documentation:**
- [Official Tutorials](https://pytorch.org/tutorials/)
- [API Reference](https://pytorch.org/docs/stable/index.html)

**Adversarial ML Research:**
- [Explaining and Harnessing Adversarial Examples (FGSM Paper)](https://arxiv.org/abs/1412.6572)
- [Towards Deep Learning Models Resistant to Adversarial Attacks (PGD)](https://arxiv.org/abs/1706.06083)
- [RobustBench: Adversarial Robustness Benchmark](https://robustbench.github.io/)

---

**Congratulations!** You've completed the PyTorch Adversarial MNIST tutorial. You now understand both how to build deep learning models and how to test their security properties.