# Module 05: Famous CNN Architectures

Learn from the architectures that revolutionized computer vision.

## Architectures Covered
- LeNet-5 (1998) - The pioneer
- AlexNet (2012) - The ImageNet winner
- VGG (2014) - Simplicity and depth
- ResNet (2015) - Solving vanishing gradients
- Modern architectures overview

## Time: 45 minutes

In [None]:
import torch
import torch.nn as nn
import torchvision.models as models

## Part 1: LeNet-5 (1998)

**One of the first successful CNNs**, developed by Yann LeCun and colleagues for handwritten digit recognition.

**Architecture:**
```
Input (32×32) → Conv(6) → Pool → Conv(16) → Pool → FC(120) → FC(84) → FC(10)
```

**Key Insights:**
- Convolution preserves spatial information
- Pooling reduces dimensions
- Hierarchical feature learning
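
The `16 * 5 * 5` input size of the first fully connected layer follows from the architecture above. As a minimal sketch, assuming 5×5 convolutions with no padding and 2×2 pooling with stride 2 (the helper `conv_out` is a hypothetical name introduced here), the spatial size can be traced layer by layer:

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output side length of a square convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


size = 32  # LeNet-5 input is 32x32
trace = [size]
layers = [("conv1", 5, 1), ("pool1", 2, 2), ("conv2", 5, 1), ("pool2", 2, 2)]

for name, kernel, stride in layers:
    size = conv_out(size, kernel, stride)
    trace.append(size)
    print(f"{name}: {size}x{size}")

# 16 feature maps of 5x5 remain after the second pooling step
flat_features = 16 * size * size
print("Flattened features:", flat_features)  # 400
```

Each 5×5 convolution trims 4 pixels from each side length (32 → 28, 14 → 10), and each pooling step halves it.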

In [None]:
class LeNet5(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 6, 5)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

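    def forward(self, x):
        # Modern variant: ReLU and max pooling stand in for the original
        # tanh-style activations and average-pool subsampling.
        # x: (batch, 1, 32, 32)
        x = torch.relu(self.conv1(x))  # -> (batch, 6, 28, 28)
        x = torch.max_pool2d(x, 2)     # -> (batch, 6, 14, 14)
        x = torch.relu(self.conv2(x))  # -> (batch, 16, 10, 10)
        x = torch.max_pool2d(x, 2)     # -> (batch, 16, 5, 5)
        x = torch.flatten(x, 1)        # -> (batch, 400), batch dimension kept
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        x = self.fc3(x)  # raw class scores (logits), no softmax here
        return x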


lenet = LeNet5()
print("LeNet-5: The CNN that started it all!")
print(lenet)

## Part 2: AlexNet (2012)

**Won the ImageNet (ILSVRC) 2012 classification challenge** and sparked the deep learning revolution in computer vision!

**Innovations:**
- ReLU activations (train faster than sigmoid/tanh)
- Dropout for regularization
- Data augmentation
- GPU acceleration
- Much deeper than earlier CNNs (8 learned layers: 5 conv + 3 FC)

**Impact:** Showed deep learning works for complex real-world problems
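
One commonly cited reason ReLU trains faster than sigmoid is that it does not saturate for positive inputs. The sketch below is an illustration added here, not part of the AlexNet code; the input values are arbitrary. It compares the gradients of the two activations at a few points using autograd:

```python
import torch

# Arbitrary sample inputs: strongly negative, small, strongly positive
x = torch.tensor([-6.0, 0.5, 6.0], requires_grad=True)

# Gradient of sigmoid: s(x) * (1 - s(x)), never larger than 0.25
torch.sigmoid(x).sum().backward()
sigmoid_grad = x.grad.clone()

# Reset the accumulated gradient before the second backward pass
x.grad.zero_()

# Gradient of ReLU: 0 for negative inputs, 1 for positive inputs
torch.relu(x).sum().backward()
relu_grad = x.grad.clone()

print("sigmoid gradients:", sigmoid_grad)
print("relu gradients:   ", relu_grad)
```

For large positive or negative inputs the sigmoid gradient is close to zero, so stacked sigmoid layers pass back very little signal, while ReLU passes the gradient through unchanged wherever the input is positive.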

In [None]:
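# Build AlexNet with randomly initialized weights (nothing is downloaded).
# `weights=None` needs torchvision >= 0.13; older releases use pretrained=False.
alexnet = models.alexnet(weights=None)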
print("AlexNet architecture:")
print(alexnet)

## Part 3: VGG (2014)

**Philosophy:** Deeper networks with small (3×3) filters work better!

**Key Ideas:**
- Only 3×3 convolutions
- Many layers (16 to 19 weight layers in the best-known variants)
- Simple, uniform architecture
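
Why small filters? Two stacked 3×3 convolutions cover the same 5×5 receptive field as a single 5×5 convolution, but with fewer weights and an extra non-linearity in between. A minimal sketch, assuming 64 input and output channels (an illustrative choice) and no bias terms:

```python
import torch
import torch.nn as nn

channels = 64  # illustrative channel count, not tied to a specific VGG layer

# Two 3x3 convolutions: receptive field 5x5
stacked = nn.Sequential(
    nn.Conv2d(channels, channels, 3, padding=1, bias=False),
    nn.ReLU(),
    nn.Conv2d(channels, channels, 3, padding=1, bias=False),
)
# One 5x5 convolution: same receptive field
single = nn.Conv2d(channels, channels, 5, padding=2, bias=False)


def count_params(module):
    """Total number of learnable parameters in a module."""
    return sum(p.numel() for p in module.parameters())


stacked_params = count_params(stacked)  # 2 * 64 * 64 * 9  = 73,728
single_params = count_params(single)    # 64 * 64 * 25     = 102,400

x = torch.randn(1, channels, 32, 32)
print(f"stacked 3x3: {stacked_params:,} params, output {tuple(stacked(x).shape)}")
print(f"single 5x5:  {single_params:,} params, output {tuple(single(x).shape)}")
```

Both paths keep the 32×32 spatial size, and the stacked version uses 18C² weights against 25C² for the single layer.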

**VGG-16:** 13 conv layers + 3 FC layers = 16 layers

**Drawback:** Very large number of parameters (about 138M for VGG-16), most of them in the fully connected layers
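
The 138M figure can be reproduced by hand. The sketch below assumes the standard VGG-16 layout (13 conv layers with 3×3 kernels and biases, a 7×7×512 feature map before the classifier, 1000 output classes); the list `cfg` is a hypothetical encoding written for this example, with `"M"` marking a max-pool layer:

```python
# Output channels of each 3x3 conv layer; "M" marks a 2x2 max-pool (no parameters)
cfg = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
       512, 512, 512, "M", 512, 512, 512, "M"]

in_channels = 3  # RGB input
conv_params = 0
for v in cfg:
    if v == "M":
        continue
    # weights (in * out * 3 * 3) plus one bias per output channel
    conv_params += in_channels * v * 3 * 3 + v
    in_channels = v

# Fully connected layers: (inputs, outputs), weights plus biases
fc_sizes = [(512 * 7 * 7, 4096), (4096, 4096), (4096, 1000)]
fc_params = sum(n_in * n_out + n_out for n_in, n_out in fc_sizes)

total = conv_params + fc_params
print(f"conv layers: {conv_params:,}")   # 14,714,688
print(f"FC layers:   {fc_params:,}")     # 123,642,856
print(f"total:       {total:,}")         # 138,357,544
```

Roughly 89% of the parameters sit in the three fully connected layers, and the first one alone accounts for over 100M.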

In [None]:
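# Build VGG-16 with random weights.
# `weights=None` needs torchvision >= 0.13; older releases use pretrained=False.
vgg = models.vgg16(weights=None)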
print("VGG-16: Deep and simple")
total_params = sum(p.numel() for p in vgg.parameters())
print(f"Total parameters: {total_params:,}")

## Part 4: ResNet (2015)

**Revolutionary idea:** Skip connections!

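**Problem Solved:** Very deep plain networks are hard to train: gradients weaken as they pass back through many layers, and accuracy degrades as more layers are added

**Skip Connections:** Allow gradients to flow directly through the network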

```
x → [Conv → Conv] → Add → Output
 \________________↗ (skip connection)
```
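
To see why the identity path helps gradients, consider a toy scalar "network" in which every layer multiplies its input by a small weight. This is a deliberately simplified sketch, not a real ResNet; `depth` and `w` are arbitrary values chosen for illustration:

```python
import torch

depth = 20  # number of stacked toy layers (arbitrary)
w = 0.1     # small per-layer weight (arbitrary)

# Plain stack: h <- w * h, so the gradient is w ** depth
x = torch.tensor(1.0, dtype=torch.float64, requires_grad=True)
h = x
for _ in range(depth):
    h = w * h
h.backward()
plain_grad = x.grad.item()

# Residual stack: h <- h + w * h, so the gradient is (1 + w) ** depth
x = torch.tensor(1.0, dtype=torch.float64, requires_grad=True)
h = x
for _ in range(depth):
    h = h + w * h
h.backward()
residual_grad = x.grad.item()

print(f"plain gradient:    {plain_grad:.3e}")
print(f"residual gradient: {residual_grad:.3e}")
```

Without the skip path the gradient shrinks by a factor of `w` at every layer and all but disappears; with it, each layer contributes a factor of `1 + w`, so the signal from the output still reaches the input.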

**Impact:** Enabled networks with 100+ layers (ResNet-152)

**Variants:** ResNet-18, ResNet-34, ResNet-50, ResNet-101, ResNet-152

In [None]:
# Build ResNet-18 with random weights.
# `weights=None` needs torchvision >= 0.13; older releases use pretrained=False.
resnet = models.resnet18(weights=None)
print("ResNet-18: Revolutionary skip connections!")
print(resnet)


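# Basic residual block (input and output have the same shape)
class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        # padding=1 keeps height and width unchanged for 3x3 kernels;
        # bias=False because the following BatchNorm has its own shift term
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        identity = x  # Save the input for the skip connection

        out = torch.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))  # No ReLU yet: it is applied after the addition

        out += identity  # Skip connection: add the input back
        out = torch.relu(out)
        return out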


print("\nResidual blocks enable training very deep networks!")

## Part 5: Modern Architectures

### EfficientNet (2019)
- Scales depth, width, and input resolution together (compound scaling)
- State-of-the-art ImageNet accuracy at release, with far fewer parameters than earlier models
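
Compound scaling uses a single coefficient φ to grow all three dimensions at once: depth by α^φ, width by β^φ, and resolution by γ^φ. The sketch below uses the coefficients reported for the EfficientNet-B0 baseline (α = 1.2, β = 1.1, γ = 1.15); treat it as an illustration of the rule, since the released variants need not follow these exact multipliers. `compound_scale` is a hypothetical helper name:

```python
# Coefficients reported for the EfficientNet-B0 baseline (assumed here)
alpha, beta, gamma = 1.2, 1.1, 1.15


def compound_scale(phi):
    """Return (depth, width, resolution) multipliers for coefficient phi."""
    return alpha ** phi, beta ** phi, gamma ** phi


# FLOPs grow roughly with depth * width^2 * resolution^2, so each unit
# of phi multiplies compute by about this factor (close to 2)
flops_factor = alpha * beta ** 2 * gamma ** 2
print(f"FLOPs factor per unit of phi: {flops_factor:.2f}")

for phi in range(4):
    d, w, r = compound_scale(phi)
    print(f"phi={phi}: depth x{d:.2f}, width x{w:.2f}, resolution x{r:.2f}")
```

The coefficients are chosen so that α·β²·γ² is close to 2, meaning each increment of φ roughly doubles the compute budget.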

### Vision Transformers (2020)
- Apply the transformer architecture (from NLP) to sequences of image patches
- Competitive with CNNs, especially when pre-trained on large datasets
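
The first step of a Vision Transformer is turning an image into a sequence of tokens. A common way to sketch this is a convolution whose kernel size and stride both equal the patch size. The values below (16×16 patches, 768-dimensional embeddings, 224×224 input) are assumed ViT-Base-style settings, and the variable names are illustrative:

```python
import torch
import torch.nn as nn

patch_size = 16  # assumed patch side length
embed_dim = 768  # assumed embedding size

# Non-overlapping patches: kernel and stride both equal the patch size,
# so each output position is a linear projection of one patch
patch_embed = nn.Conv2d(3, embed_dim, kernel_size=patch_size, stride=patch_size)

images = torch.randn(2, 3, 224, 224)  # dummy batch of two RGB images
feature_map = patch_embed(images)     # (2, 768, 14, 14)

# Flatten the 14x14 grid into a sequence and put the token axis second
tokens = feature_map.flatten(2).transpose(1, 2)  # (2, 196, 768)
print(tokens.shape)
```

A 224×224 image yields 14 × 14 = 196 patch tokens, which the transformer then processes much like words in a sentence. A full model would also add position embeddings and a class token, which are omitted here.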

### MobileNet (2017)
- Designed for mobile devices
- Depthwise separable convolutions
- Fast and lightweight
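
A depthwise separable convolution splits a standard convolution into a per-channel 3×3 filter (depthwise) followed by a 1×1 convolution that mixes channels (pointwise). The sketch below compares parameter counts for an illustrative 64 → 128 channel layer without biases; it omits the batch norm and activation layers that a real MobileNet block places between the two steps:

```python
import torch
import torch.nn as nn

in_ch, out_ch = 64, 128  # illustrative channel counts

# Standard 3x3 convolution: every output channel sees every input channel
standard = nn.Conv2d(in_ch, out_ch, 3, padding=1, bias=False)

separable = nn.Sequential(
    # Depthwise: groups=in_ch gives one 3x3 filter per input channel
    nn.Conv2d(in_ch, in_ch, 3, padding=1, groups=in_ch, bias=False),
    # Pointwise: 1x1 convolution mixes information across channels
    nn.Conv2d(in_ch, out_ch, 1, bias=False),
)


def count_params(module):
    """Total number of learnable parameters in a module."""
    return sum(p.numel() for p in module.parameters())


standard_params = count_params(standard)    # 64 * 128 * 9      = 73,728
separable_params = count_params(separable)  # 64 * 9 + 64 * 128 = 8,768

x = torch.randn(1, in_ch, 32, 32)
print(f"standard:  {standard_params:,} params, output {tuple(standard(x).shape)}")
print(f"separable: {separable_params:,} params, output {tuple(separable(x).shape)}")
```

Both versions produce the same output shape, but the separable one uses about 8× fewer parameters in this configuration.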

## Summary

### Evolution of CNNs:
1. **LeNet (1998)**: Proved CNNs work
2. **AlexNet (2012)**: Showed depth + GPUs = power
3. **VGG (2014)**: Simple is good, depth matters
4. **ResNet (2015)**: Skip connections enable very deep networks
5. **Modern**: Efficiency (EfficientNet, MobileNet) and new building blocks (Vision Transformers)

### Key Takeaway:
You don't need to design from scratch - use proven architectures!

### Next: Module 06 - Transfer Learning
Learn to use these pre-trained models for your own problems!