# Module 06: Transfer Learning

**Leverage Pre-Trained Models for Your Own Projects**

Don't train from scratch! Start from models that were already trained on millions of images.

## What You'll Learn
- What is transfer learning?
- Feature extraction vs fine-tuning
- Using PyTorch's pre-trained models
- Adapting models to your dataset
- Practical project: Custom image classifier

## Time: 60 minutes

In [None]:
import torch
import torch.nn as nn
import torchvision.models as models
import torchvision.transforms as transforms
from torchvision import datasets

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

## Part 1: What is Transfer Learning?

### The Idea

**Transfer Learning:** Use knowledge learned from one task to solve a different but related task.

### Why Transfer Learning?

1. **Saves Time**: No need to train for days/weeks
2. **Less Data Needed**: Pre-trained models already know image features
3. **Better Performance**: Features learned from millions of images usually beat features learned from a small dataset alone
4. **Practical**: Most real-world applications use transfer learning

### How It Works

```
Pre-trained Model (trained on ImageNet - 1.2M images, 1000 classes)
    ↓
[Keep early layers] → They recognize general features (edges, textures, shapes)
    ↓
[Replace final layers] → Train on YOUR specific task
    ↓
Your Custom Classifier!
```

## Part 2: Two Approaches

### Approach 1: Feature Extraction

- **Freeze** all pre-trained layers
- **Train** only the new final layer
- **Fast**, requires little data
- Use when: Dataset is small and similar to ImageNet

### Approach 2: Fine-Tuning

- **Freeze** early layers
- **Unfreeze** some later layers
- **Train** final layers + unfrozen layers
- **Slower**, needs more data
- Use when: Larger dataset or different from ImageNet

In [None]:
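# Load pre-trained ResNet18 (the weights= argument replaces the deprecated
# pretrained=True in torchvision >= 0.13)
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)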
print("Loaded pre-trained ResNet18!")
print(f"\nOriginal output layer: {model.fc}")
print(f"Designed for {model.fc.out_features} classes (ImageNet)")

## Part 3: Feature Extraction Example

In [None]:
# Approach 1: Feature Extraction
# weights= replaces the deprecated pretrained=True (torchvision >= 0.13)
model_fe = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze all parameters
for param in model_fe.parameters():
    param.requires_grad = False

# Replace final layer for YOUR task (e.g., 10 classes)
num_features = model_fe.fc.in_features
model_fe.fc = nn.Linear(num_features, 10)

model_fe = model_fe.to(device)

print("Feature Extraction Model:")
print(f"Frozen parameter tensors: {sum(1 for p in model_fe.parameters() if not p.requires_grad)}")
print(f"Trainable parameter tensors: {sum(1 for p in model_fe.parameters() if p.requires_grad)}")
print(f"\nNew output layer: {model_fe.fc}")

## Part 4: Fine-Tuning Example

In [None]:
# Approach 2: Fine-Tuning
# weights= replaces the deprecated pretrained=True (torchvision >= 0.13)
model_ft = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze early layers only
for name, param in model_ft.named_parameters():
    if "layer4" not in name and "fc" not in name:
        param.requires_grad = False

# Replace final layer
num_features = model_ft.fc.in_features
model_ft.fc = nn.Linear(num_features, 10)

model_ft = model_ft.to(device)

print("Fine-Tuning Model:")
trainable = sum(p.numel() for p in model_ft.parameters() if p.requires_grad)
total = sum(p.numel() for p in model_ft.parameters())
print(f"Trainable parameters: {trainable:,} / {total:,} ({100*trainable/total:.1f}%)")

## Part 5: Training with Transfer Learning

In [None]:
# Data transforms for pre-trained models
# Important: Use ImageNet normalization!
train_transform = transforms.Compose(
    [
        transforms.Resize(256),  # shorter side to 256 pixels
        transforms.CenterCrop(224),  # pre-trained ResNet was trained on 224×224 crops
        transforms.ToTensor(),
        transforms.Normalize(
            mean=[0.485, 0.456, 0.406],  # ImageNet per-channel mean
            std=[0.229, 0.224, 0.225],  # ImageNet per-channel std
        ),
    ]
)

print("Data transforms ready!")
print("\nKey points:")
print("- Resize to 256, then center-crop to 224×224 (ResNet input size)")
print("- Use ImageNet normalization")
print("- Apply same transforms during inference")

## Part 6: Available Pre-Trained Models

### Classification Models
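- **ResNet** (18, 34, 50, 101, 152): Great all-purpose choice
- **VGG** (11, 13, 16, 19): Simple architecture, but large and slow by current standards
- **MobileNet**: Lightweight and fast, designed for mobile and edge devices
- **EfficientNet**: Strong accuracy for its parameter count
- **DenseNet**: Dense connections between layers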

### How to Choose?

- **Need speed?** → MobileNet
- **Need accuracy?** → EfficientNet or ResNet-50+
- **Limited resources?** → ResNet-18 or MobileNet
- **Just starting?** → ResNet-18 (good balance)

In [None]:
# Load different models
print("Available pre-trained models:")
print("=" * 50)

# Small and fast
# (weights= replaces the deprecated pretrained=True in torchvision >= 0.13)
mobilenet = models.mobilenet_v2(weights=models.MobileNet_V2_Weights.DEFAULT)
print(f"MobileNetV2: {sum(p.numel() for p in mobilenet.parameters()):,} parameters")

# Balanced
resnet18 = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
print(f"ResNet-18: {sum(p.numel() for p in resnet18.parameters()):,} parameters")

# More accurate
resnet50 = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
print(f"ResNet-50: {sum(p.numel() for p in resnet50.parameters()):,} parameters")

print("\nAll trained on ImageNet (1.2M images, 1000 classes)")

## Summary

### What You Learned:

1. **Transfer Learning Concept**
   - Reuse knowledge from large datasets
   - Saves time and improves performance

2. **Two Approaches**
   - Feature extraction: Freeze all, train final layer
   - Fine-tuning: Unfreeze some layers, train more

3. **Practical Implementation**
   - Load pre-trained models from torchvision
   - Replace final layer for your task
   - Use ImageNet normalization

4. **Model Selection**
   - Different models for different needs
   - ResNet-18 is a great starting point

### Next: Module 07 - Image Classification Project
Build a complete end-to-end classifier using transfer learning!