# Transfer Learning with a Pretrained VGG Network

**Objective:** In this exercise, you will learn how to perform transfer learning by using a VGG network pretrained on ImageNet and fine-tuning it for the CIFAR-10 dataset.

This notebook will cover:
1.  Loading a pretrained model from `torchvision.models`.
2.  Adapting the input data to the pretrained model's requirements.
3.  Freezing the weights of the convolutional base to leverage learned features.
4.  Replacing the model's classifier for a new dataset.
5.  Training only the new classifier for efficient fine-tuning.

## 1. Setup and Imports

In [None]:
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms
import torchvision.models as models

print(f"PyTorch Version: {torch.__version__}")
print(f"Torchvision Version: {torchvision.__version__}")

## 2. Loading and Preparing the CIFAR-10 Dataset

Pretrained models like VGG were trained on ImageNet images cropped to 224x224 and normalized with the ImageNet per-channel mean and standard deviation. We must resize and normalize our 32x32 CIFAR-10 images so that they match what the pretrained weights expect.
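
To make the two operations concrete, here is a minimal sketch of what resizing and normalization do to a single image, written with plain tensor operations instead of `torchvision.transforms`. The helper name `preprocess` and the constant gray test image are illustrative assumptions, and bilinear interpolation is one reasonable choice of resampling, not necessarily the one your pipeline will use.

```python
import torch
import torch.nn.functional as F

# ImageNet per-channel statistics, reshaped to broadcast over (3, H, W) tensors.
IMAGENET_MEAN = torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
IMAGENET_STD = torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1)

def preprocess(image):
    """Hypothetical helper: image is a float tensor of shape (3, H, W) in [0, 1]."""
    # interpolate expects a batch dimension, so add one and remove it afterwards.
    resized = F.interpolate(
        image.unsqueeze(0), size=(224, 224), mode="bilinear", align_corners=False
    ).squeeze(0)
    # Normalization is (value - mean) / std, applied separately to each channel.
    return (resized - IMAGENET_MEAN) / IMAGENET_STD

# A constant mid-gray stand-in for a 32x32 CIFAR-10 image.
fake_cifar_image = torch.full((3, 32, 32), 0.5)
out = preprocess(fake_cifar_image)
print(out.shape)  # torch.Size([3, 224, 224])
print(out[:, 0, 0])  # one normalized value per channel
```

Each channel ends up with a different value even though the input was uniform gray, because each channel is shifted and scaled by its own mean and standard deviation.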

**Your Task:** Complete the `transform` pipeline. It must:
1. Resize the images to 224x224.
2. Convert them to PyTorch tensors.
3. Normalize them with the standard ImageNet mean and standard deviation.

Steps 2 and 3 are already in the cell below; you need to add step 1 in front of them.

In [None]:
# TODO: Define transformations for the dataset
# The per-channel mean and std for ImageNet are [0.485, 0.456, 0.406] and [0.229, 0.224, 0.225]
# Remove the raise below once the resize step has been added to the pipeline.
raise NotImplementedError("Define the data transformations.")
transform = transforms.Compose([
    # Your code here: resize the images to 224x224
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

# Load the datasets
trainset = torchvision.datasets.CIFAR10(root='./data', train=True, download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=32, shuffle=True, num_workers=2) # Using a smaller batch size due to larger image size

testset = torchvision.datasets.CIFAR10(root='./data', train=False, download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=32, shuffle=False, num_workers=2)

classes = ('plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck')

## 3. Load Pretrained VGG Model and Adapt for CIFAR-10

We will load a VGG16 model pretrained on ImageNet. Then, we will perform two key steps:
1.  **Freeze the feature extractor:** The convolutional layers have already learned general-purpose visual features on ImageNet. We will freeze them so that their weights are not updated during training.
2.  **Replace the classifier head:** The final layer of the original classifier outputs scores for the 1000 ImageNet classes. We need to replace it with a new layer that outputs scores for our 10 CIFAR-10 classes.
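
For a sense of scale, the parameter counts behind these two steps can be worked out by hand. The sketch below assumes the standard VGG16 layout (thirteen 3x3 convolutions with bias, followed by fully connected layers of sizes 25088 → 4096 → 4096 → 1000); the helper names are illustrative.

```python
# (in_channels, out_channels) of the thirteen 3x3 convolutions in VGG16.
conv_channels = [
    (3, 64), (64, 64),
    (64, 128), (128, 128),
    (128, 256), (256, 256), (256, 256),
    (256, 512), (512, 512), (512, 512),
    (512, 512), (512, 512), (512, 512),
]

def conv_params(c_in, c_out, kernel=3):
    # One kernel x kernel filter per (input, output) channel pair, plus one bias per output.
    return c_in * c_out * kernel * kernel + c_out

def linear_params(n_in, n_out):
    # Weight matrix plus bias vector.
    return n_in * n_out + n_out

feature_params = sum(conv_params(c_in, c_out) for c_in, c_out in conv_channels)
hidden_params = linear_params(512 * 7 * 7, 4096) + linear_params(4096, 4096)
old_head = linear_params(4096, 1000)  # original 1000-class output layer
new_head = linear_params(4096, 10)    # replacement 10-class output layer

print(f"convolutional layers: {feature_params:,}")  # 14,714,688
print(f"hidden FC layers:     {hidden_params:,}")   # 119,545,856
print(f"old output layer:     {old_head:,}")        # 4,097,000
print(f"new output layer:     {new_head:,}")        # 40,970
```

Under these assumptions the new output layer has 40,970 parameters, compared with 4,097,000 in the layer it replaces, while the two hidden fully connected layers hold most of the network's weights.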

**Your Task:** 
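1. Load the pretrained `vgg16` model. In torchvision 0.13 and later, pretrained weights are selected with the `weights` argument; older releases use `pretrained=True`.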
2. Freeze the parameters of the `features` part of the model by setting `requires_grad` to `False`.
3. Replace the final layer of the `classifier` with a new `nn.Linear` layer suitable for 10 classes.

In [None]:
# TODO: Load the pretrained VGG16 model
model = None # Your code here
raise NotImplementedError("Load the pretrained VGG16 model.")

# TODO: Freeze the feature extractor layers
# Your code here
raise NotImplementedError("Freeze the feature extractor.")

# TODO: Replace the final layer of the classifier
# The last layer of the original VGG16 classifier is at index 6
num_features = model.classifier[6].in_features
model.classifier[6] = None # Your code here
raise NotImplementedError("Replace the classifier.")

# Move the model to the correct device
device = 'cuda' if torch.cuda.is_available() else 'cpu'
model = model.to(device)

print(model)
print(f'Model is on device: {next(model.parameters()).device}')
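
One way to check the result of this cell is to count how many parameters are still trainable. The sketch below is a hedged illustration on a tiny stand-in network: `TinyNet` and `count_parameters` are hypothetical names, and the stand-in only mimics the `features` / `classifier` split of VGG so that it runs without downloading any weights.

```python
import torch
import torch.nn as nn

def count_parameters(module):
    """Return (trainable, frozen) parameter counts for a module."""
    trainable = sum(p.numel() for p in module.parameters() if p.requires_grad)
    frozen = sum(p.numel() for p in module.parameters() if not p.requires_grad)
    return trainable, frozen

class TinyNet(nn.Module):
    """Hypothetical stand-in with the same features/classifier split as VGG."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(nn.Conv2d(3, 4, kernel_size=3, padding=1), nn.ReLU())
        self.classifier = nn.Sequential(nn.Linear(4 * 8 * 8, 16), nn.ReLU(), nn.Linear(16, 1000))

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(1))

tiny = TinyNet()
tiny.features.requires_grad_(False)  # freeze the convolutional part
tiny.classifier[2] = nn.Linear(tiny.classifier[2].in_features, 10)  # new 10-class output layer

trainable, frozen = count_parameters(tiny)
print(trainable, frozen)  # 4282 112
print(tiny(torch.zeros(2, 3, 8, 8)).shape)  # torch.Size([2, 10])
```

Calling `count_parameters(model)` on your adapted VGG16 should report a nonzero frozen count, and the final layer printed by `print(model)` should show `out_features=10`.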

## 4. Define Loss Function and Optimizer

We will use `CrossEntropyLoss` and the `Adam` optimizer. Importantly, we only want to train the parameters that still have `requires_grad=True` (the classifier), not the frozen convolutional layers.

**Your Task:** Instantiate the loss function and the optimizer. Make sure the optimizer is only passed the parameters of the classifier that need to be trained.

In [None]:
# TODO: Define the loss function and optimizer
criterion = None # Your code here
optimizer = None # Your code here

raise NotImplementedError("Define criterion and optimizer.")
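
The effect of passing only trainable parameters to the optimizer can be checked on a toy model. This is a minimal sketch under stated assumptions: the two-layer network, the random data, and the learning rate are all illustrative, and the first layer simply plays the role of the frozen base.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

net = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 3))
net[0].requires_grad_(False)  # pretend the first layer is the frozen base

# Hand the optimizer only the parameters that still require gradients.
trainable_params = [p for p in net.parameters() if p.requires_grad]
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(trainable_params, lr=1e-3)  # lr is an illustrative choice

frozen_before = net[0].weight.clone()
head_before = net[2].weight.clone()

# One optimization step on random data.
inputs = torch.randn(16, 4)
targets = torch.randint(0, 3, (16,))
optimizer.zero_grad()
loss = criterion(net(inputs), targets)
loss.backward()
optimizer.step()

print(torch.equal(net[0].weight, frozen_before))  # True: frozen layer unchanged
print(torch.equal(net[2].weight, head_before))    # False: trainable layer updated
```

The same before-and-after comparison can be applied to a layer of `model.features` after one training batch to confirm that the freeze worked.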

## 5. Train and Test the Network

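Now we will write the training and testing loops. Each training step should be much cheaper than when training the whole network, because no gradients are computed for the frozen convolutional layers and only the classifier weights are updated. The forward pass through the convolutional layers still runs for every batch.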

**Your Task:** Complete the training and testing loops.

In [None]:
def train(epoch):
    print(f'Epoch: {epoch}')
    model.train()
    train_loss = 0
    correct = 0
    total = 0
    for batch_idx, (inputs, targets) in enumerate(trainloader):
        inputs, targets = inputs.to(device), targets.to(device)
        
        # TODO: Complete the training steps (zero grad, forward, loss, backward, step)
        raise NotImplementedError("Implement the training steps.")

        train_loss += loss.item()
        _, predicted = outputs.max(1)
        total += targets.size(0)
        correct += predicted.eq(targets).sum().item()

        if batch_idx % 100 == 0:
            print(f'Loss: {train_loss/(batch_idx+1):.3f} | Acc: {100.*correct/total:.3f}% ({correct}/{total})')

def test():
    model.eval()
    test_loss = 0
    correct = 0
    total = 0
    with torch.no_grad():
        for batch_idx, (inputs, targets) in enumerate(testloader):
            inputs, targets = inputs.to(device), targets.to(device)
            # TODO: Complete the testing steps (forward pass and loss calculation)
            raise NotImplementedError("Implement the testing steps.")

            test_loss += loss.item()
            _, predicted = outputs.max(1)
            total += targets.size(0)
            correct += predicted.eq(targets).sum().item()

    print(f'Test Loss: {test_loss/len(testloader):.3f} | Test Acc: {100.*correct/total:.3f}%')

for epoch in range(3): # Train for 3 epochs for demonstration
    train(epoch)
    test()
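
The five steps named in the training TODO (zero grad, forward, loss, backward, step), and the gradient-free evaluation pass, can be seen in isolation on a toy problem. This is only a sketch: the linear model, the random data, and the hyperparameters are assumptions chosen so that the example runs in a fraction of a second.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

toy_model = nn.Linear(8, 3)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(toy_model.parameters(), lr=0.01)

inputs = torch.randn(64, 8)
targets = torch.randint(0, 3, (64,))

toy_model.train()
losses = []
for step in range(100):
    optimizer.zero_grad()               # clear gradients from the previous step
    outputs = toy_model(inputs)         # forward pass
    loss = criterion(outputs, targets)  # compute the loss
    loss.backward()                     # backpropagate
    optimizer.step()                    # update the weights
    losses.append(loss.item())

toy_model.eval()
with torch.no_grad():                   # no gradients are needed for evaluation
    outputs = toy_model(inputs)
    eval_loss = criterion(outputs, targets).item()
    _, predicted = outputs.max(1)
    accuracy = predicted.eq(targets).float().mean().item()

print(f"first loss: {losses[0]:.3f} | last loss: {losses[-1]:.3f}")
print(f"eval loss: {eval_loss:.3f} | accuracy: {100 * accuracy:.1f}%")
```

The evaluation pass uses the same forward and loss calls as training but skips `zero_grad`, `backward`, and `step`.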

## 6. Bonus Questions

1.  What happens if you don't freeze the convolutional layers? How does it affect training time and accuracy? (Hint: You would pass `model.parameters()` to the optimizer).
2.  Try fine-tuning more than just the classifier. For example, unfreeze the last convolutional block of `model.features` and add its parameters to the optimizer. You might want to use a smaller learning rate for these layers.
3.  Experiment with a different pretrained model, like `resnet18`.
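
For the second question, `torch.optim` optimizers accept a list of parameter groups, each with its own learning rate. The sketch below shows the idea on a small stand-in network; the layer sizes and both learning rates are illustrative assumptions. In torchvision's VGG16 the last convolutional block is expected to be `model.features[24:]`, but confirm the indices against the output of `print(model)`.

```python
import torch.nn as nn
import torch.optim as optim

# Stand-in for a pretrained network with an "early" and a "last" convolutional block.
features = nn.Sequential(
    nn.Conv2d(3, 4, kernel_size=3, padding=1), nn.ReLU(),  # early block, stays frozen
    nn.Conv2d(4, 4, kernel_size=3, padding=1), nn.ReLU(),  # last block, unfrozen below
)
classifier = nn.Sequential(nn.Linear(4 * 8 * 8, 10))

# Freeze everything in the feature extractor, then unfreeze only the last block.
features.requires_grad_(False)
for p in features[2:].parameters():
    p.requires_grad = True

# One parameter group per learning rate: small for pretrained layers, larger for the new head.
optimizer = optim.Adam([
    {"params": features[2:].parameters(), "lr": 1e-5},
    {"params": classifier.parameters(), "lr": 1e-3},
])

for group in optimizer.param_groups:
    n_params = sum(p.numel() for p in group["params"])
    print(f"lr={group['lr']}: {n_params} parameters")
```

The smaller learning rate keeps the pretrained filters close to their ImageNet values while the new output layer, which starts from random weights, is allowed to move faster.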