## PyTorch Transfer Learning Exercise with Hugging Face & CIFAR-10

### Objective:
Learn the fundamentals of transfer learning by loading a pre-trained vision model,
adding a custom classification layer, and training that layer on the CIFAR-10 dataset.

Follow the steps below, filling in the required code.

Your goal is to implement the functions that this training run calls:

```python
# This block ties everything together.
print("--- PyTorch Transfer Learning Exercise ---")

# Set device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

# 1. Get Dataloaders
train_loader, test_loader = get_cifar10_dataloaders()

# 2. Load Base Model
base_model = load_pretrained_model()

# 3. Create Custom Model
cifar_model = EfficientNetCIFAR10(base_model)

# 4. Define Loss and Optimizer
# We only want to train the parameters of our new classifier layer.
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(cifar_model.classifier.parameters(), lr=0.001)

# 5. Train the Model
# Note: Training for more epochs will yield better results.
# 3 epochs is a good starting point to verify the setup works.
train_model(cifar_model, train_loader, criterion, optimizer, device, epochs=3)

# 6. Evaluate the Model
evaluate_model(cifar_model, test_loader, device)
```

### Step 1: `import`
This initial step involves importing all the necessary packages for the project.
We import `torch` and `torch.nn` for core deep learning functionalities and building the model.
`DataLoader` is used for efficiently loading and batching data. `torchvision` provides access
to popular datasets like CIFAR-10 and image transformation functions. The `transformers` library
from Hugging Face is key for easily downloading and using pre-trained models. Finally, `tqdm`
is a utility that provides progress bars for our loops, making the training process more informative.

In [None]:
# --- Step 1: Import Necessary Libraries ---
# We need torch for building the neural network, torchvision for datasets and
# transformations, and transformers from Hugging Face to load our pre-trained model.
# tqdm is a handy utility for creating progress bars.

import torch
import torch.nn as nn
from torch.utils.data import DataLoader
import torchvision
import torchvision.transforms as transforms
from transformers import AutoModel
from tqdm import tqdm

### Step 2: `get_cifar10_dataloaders`

In this step, we prepare our data for training and testing. We first define a series of
transformations to apply to each image. This is crucial because pre-trained models expect
a specific input format; here, we resize CIFAR-10's small 32x32 images to the 224x224 size
expected by EfficientNet-B0. We also convert images to PyTorch tensors and normalize each
color channel with the standard ImageNet mean and standard deviation values.
Finally, we create `DataLoader` instances for both the training and test sets,
which will handle batching the data, shuffling it for training, and loading it in parallel.
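
To make the resize and normalize steps concrete, here is a minimal sketch that reproduces them with plain tensor operations. It is an approximation for illustration, not the `torchvision` implementation: `preprocess_like_transform` is a hypothetical helper, and bilinear interpolation is assumed as the resize method.

```python
import torch
import torch.nn.functional as F

# ImageNet channel statistics, the same values used in the transform of this step.
IMAGENET_MEAN = torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
IMAGENET_STD = torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1)


def preprocess_like_transform(image):
    """Hypothetical helper: resize a (3, H, W) float image in [0, 1] to 224x224,
    then normalize each channel as (value - mean) / std."""
    resized = F.interpolate(
        image.unsqueeze(0),          # add a batch dimension: (1, 3, H, W)
        size=(224, 224),
        mode="bilinear",
        align_corners=False,
    ).squeeze(0)                     # back to (3, 224, 224)
    return (resized - IMAGENET_MEAN) / IMAGENET_STD


# A fake CIFAR-10-sized image where every pixel is mid-gray (0.5).
fake_image = torch.full((3, 32, 32), 0.5)
processed = preprocess_like_transform(fake_image)

print(processed.shape)       # torch.Size([3, 224, 224])
print(processed[:, 0, 0])    # one normalized value per channel
```

Because each channel has its own mean and standard deviation, the same gray input ends up with a different value in each channel after normalization.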

In [None]:
# --- Step 2: Prepare the CIFAR-10 Dataset ---
# We'll load the CIFAR-10 dataset and apply some transformations to prepare it
# for the model. EfficientNet models expect a specific input size (e.g., 224x224).

def get_cifar10_dataloaders():
    """
    Prepares and returns the CIFAR-10 training and testing dataloaders.
    """
    print("Step 2: Preparing CIFAR-10 Dataloaders...")

    # Define the transformations for the images.
    # - Resize to the expected input size of the pre-trained model.
    # - Convert the image to a PyTorch Tensor.
    # - Normalize the tensor values to a standard range.
    transform = transforms.Compose([
        transforms.Resize((224, 224)),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ])

    # Download and load the training dataset.
    train_dataset = ...

    # Download and load the test dataset.
    test_dataset = ...

    # Create DataLoader instances to handle batching and shuffling.
    train_loader = ...
    test_loader = ...

    print("Dataloaders created successfully.")
    return train_loader, test_loader
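
The batching and shuffling that the dataloaders in this step provide can be seen on a tiny in-memory dataset. The sketch below uses a `TensorDataset` of fake images as a stand-in for CIFAR-10; the `toy_*` names and the batch size of 4 are assumptions chosen for illustration.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Ten fake 32x32 RGB images with labels 0..9 (a stand-in for CIFAR-10).
toy_images = torch.zeros(10, 3, 32, 32)
toy_labels = torch.arange(10)
toy_dataset = TensorDataset(toy_images, toy_labels)

# Shuffle the training data each epoch; keep the test order fixed.
toy_train_loader = DataLoader(toy_dataset, batch_size=4, shuffle=True)
toy_test_loader = DataLoader(toy_dataset, batch_size=4, shuffle=False)

# With 10 samples and batch_size=4, the last batch is smaller.
batch_sizes = [images.shape[0] for images, _ in toy_test_loader]
print(batch_sizes)  # [4, 4, 2]

# Shuffling changes the order, but every sample is still visited once per epoch.
seen_labels = sorted(int(label) for _, labels in toy_train_loader for label in labels)
print(seen_labels == list(range(10)))  # True
```

Note that `len(loader)` counts batches, not samples, which is why the training loop later divides the running loss by `len(train_loader)`.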

### Step 3: `load_pretrained_model`
This step leverages the power of transfer learning by fetching a model that has already been
trained on a large dataset (ImageNet, for this checkpoint). We use the `AutoModel.from_pretrained`
function from the Hugging Face library to download and instantiate 'google/efficientnet-b0'.
`AutoModel` returns the feature-extracting backbone without a classification head, which is why
we add our own classifier in the next step. The backbone has already learned a rich set of
features for recognizing various objects, which we can adapt to our specific task without
having to train a large model from scratch.

In [None]:
# --- Step 3: Load a Pre-trained Model from Hugging Face ---
# We will use the 'google/efficientnet-b0' model. The AutoModel class from
# Hugging Face automatically fetches the correct model architecture.

def load_pretrained_model():
    """
    Loads the EfficientNet-B0 model from Hugging Face.
    """
    print("\nStep 3: Loading pre-trained model from Hugging Face...")
    model_name = "google/efficientnet-b0"
    base_model = AutoModel.from_pretrained(model_name)
    print(f"'{model_name}' loaded successfully.")
    return base_model



### Step 4: `EfficientNetCIFAR10`

Here we define our new model architecture. The core idea is to use the pre-trained model as a
fixed feature extractor. We achieve this by "freezing" all the parameters of the base model
(`param.requires_grad = False`), so they won't be updated during training. Then, we add a new,
trainable `nn.Linear` layer on top. This layer, our "classifier," is the only part of the model
that will learn. It takes the high-level features extracted by the base model and learns to map
them to the 10 classes of the CIFAR-10 dataset.
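
The freeze-then-add-a-head pattern can be tried on a small stand-in before applying it to EfficientNet. In the sketch below, `ToyBackbone` and `FrozenBackboneClassifier` are hypothetical classes invented for illustration; a single convolution plays the role of the pre-trained model, so the feature sizes are not those of the real network.

```python
import torch
import torch.nn as nn


class ToyBackbone(nn.Module):
    """Hypothetical stand-in for a pre-trained feature extractor."""

    def __init__(self, out_channels=8):
        super().__init__()
        self.out_channels = out_channels
        self.conv = nn.Conv2d(3, out_channels, kernel_size=3, padding=1)

    def forward(self, x):
        # Returns a feature map of shape (batch, channels, height, width).
        return torch.relu(self.conv(x))


class FrozenBackboneClassifier(nn.Module):
    def __init__(self, backbone, num_classes=10):
        super().__init__()
        self.backbone = backbone
        # Freeze the backbone: its weights receive no gradient updates.
        for param in self.backbone.parameters():
            param.requires_grad = False
        # New trainable head mapping pooled features to class scores.
        self.classifier = nn.Linear(backbone.out_channels, num_classes)

    def forward(self, x):
        feature_map = self.backbone(x)
        # Average over the spatial dimensions: (N, C, H, W) -> (N, C).
        pooled = feature_map.mean(dim=[2, 3])
        return self.classifier(pooled)


toy_model = FrozenBackboneClassifier(ToyBackbone())
toy_logits = toy_model(torch.randn(4, 3, 32, 32))
trainable_names = [n for n, p in toy_model.named_parameters() if p.requires_grad]

print(toy_logits.shape)   # torch.Size([4, 10])
print(trainable_names)    # ['classifier.weight', 'classifier.bias']
```

Listing the parameters with `requires_grad=True` is a quick way to confirm that only the new head will be trained.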

In [None]:
# --- Step 4: Build the Custom Classifier Model ---
# Here, we'll define a new PyTorch module. This module will contain the
# pre-trained EfficientNet as its base and a new, trainable linear layer
# on top, which will act as our CIFAR-10 classifier.

class EfficientNetCIFAR10(nn.Module):
    def __init__(self, base_model):
        """
        Initializes the custom classifier model.
        Args:
            base_model: A pre-trained model from Hugging Face.
        """
        super().__init__()
        print("\nStep 4: Building the custom classifier...")

        self.base_model = base_model
        # The number of output classes for CIFAR-10 is 10.
        num_classes = 10

        # Freeze the parameters of the base model.
        # This is a crucial step in transfer learning. We don't want to update
        # the weights of the pre-trained layers, only our new classifier.
        ...

        # Get the number of output features from the base model's last layer.
        # For this EfficientNet model, this is found in the last pooling layer's output.
        # This can vary between models, so you might need to inspect the model architecture.
        in_features = ...

        # Create a new linear layer for classification.
        self.classifier = ...
        print("Custom classifier built successfully.")


    def forward(self, x):
        """
        Defines the forward pass of the model.
        """
        # Pass the input through the base model.
        outputs = ...

        # Hugging Face models return an output object rather than a bare tensor.
        # We're interested in `last_hidden_state`, the final feature map, which for
        # this model has shape (batch, channels, height, width).
        # We then take the mean across the spatial dimensions to get a feature vector.
        pooled_output = outputs.last_hidden_state.mean(dim=[2, 3])

        # Pass the pooled output through our new classifier.
        logits = ...
        return logits

### Step 5: `train_model`
This function contains the logic for training our model. It iterates over the training dataset
for a specified number of `epochs`. For each batch, it performs the standard PyTorch training
steps: 1) move the batch to the device, 2) clear the gradients left over from the previous batch,
3) perform a forward pass to get the model's predictions, 4) calculate the loss (how wrong the
predictions are) using the specified criterion, 5) perform a backward pass (`loss.backward()`)
to compute gradients, and 6) update the model's trainable weights (only our classifier layer)
using the optimizer (`optimizer.step()`).
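
The sequence of steps described above can be sketched on a toy problem that needs no dataset or pre-trained model. Everything here is an assumption for illustration: the random `toy_features`, the two-class labels, the learning rate, and the number of steps are not part of the exercise.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy data: 64 feature vectors; the label is 1 when the first feature is positive.
toy_features = torch.randn(64, 8)
toy_labels = (toy_features[:, 0] > 0).long()

toy_classifier = nn.Linear(8, 2)
toy_criterion = nn.CrossEntropyLoss()
toy_optimizer = torch.optim.Adam(toy_classifier.parameters(), lr=0.05)

loss_history = []
for step in range(50):
    toy_optimizer.zero_grad()                    # clear gradients from the previous step
    logits = toy_classifier(toy_features)        # forward pass
    loss = toy_criterion(logits, toy_labels)     # how wrong the predictions are
    loss.backward()                              # backward pass: compute gradients
    toy_optimizer.step()                         # update the trainable weights
    loss_history.append(loss.item())

print(f"first loss: {loss_history[0]:.4f}, last loss: {loss_history[-1]:.4f}")
```

Skipping the gradient-clearing call would make gradients accumulate across steps, because PyTorch adds new gradients to any that are already stored.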

In [None]:
# --- Step 5: Define the Training Loop ---
def train_model(model, train_loader, criterion, optimizer, device, epochs=3):
    """
    The main training loop.
    """
    print("\nStep 5: Starting the training process...")
    model.to(device) # Move model to the selected device (GPU/CPU)

    for epoch in range(epochs):
        model.train() # Set the model to training mode
        running_loss = 0.0
        
        # Use tqdm for a nice progress bar
        progress_bar = tqdm(train_loader, desc=f"Epoch {epoch+1}/{epochs}")

        for images, labels in progress_bar:
            # Move images and labels to the device
            images, labels = images.to(device), labels.to(device)

            # 1. Zero the parameter gradients
            ...

            # 2. Forward pass
            outputs = ...
            loss = ...

            # 3. Backward pass and optimize
            ...
            ...

            running_loss += loss.item()
            progress_bar.set_postfix({'loss': f'{loss.item():.4f}'})

        avg_loss = running_loss / len(train_loader)
        print(f"End of Epoch {epoch+1}, Average Training Loss: {avg_loss:.4f}")

    print("Training finished.")

### Step 6: `evaluate_model`

After training, we need to evaluate how well our model performs on unseen data. This function
iterates through the test dataset. For each batch, it gets the model's predictions and compares
them to the true labels. It's important to set the model to `eval()` mode, which switches layers
such as dropout and batch normalization to their inference behavior, and to use `torch.no_grad()`
to stop PyTorch from tracking gradients, which makes the process faster and uses less memory.
The function calculates and prints the final accuracy of the model on the test set.
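
As a small illustration of this evaluation pattern, the sketch below scores a toy network on random inputs. The network, the inputs, and the targets (built so that exactly one prediction is wrong) are assumptions made up for the example; `torch.max` over the class dimension is one common way to pick the predicted class.

```python
import torch
import torch.nn as nn

# Toy network with a dropout layer, so eval() mode actually matters.
toy_net = nn.Sequential(nn.Dropout(p=0.5), nn.Linear(4, 3))
toy_net.eval()  # dropout becomes a no-op in evaluation mode

toy_inputs = torch.randn(6, 4)
with torch.no_grad():  # no gradient tracking during evaluation
    scores = toy_net(toy_inputs)
    # torch.max returns (values, indices); the indices are the predicted classes.
    _, predicted_classes = torch.max(scores, dim=1)

# Build targets that match the predictions except for the first sample.
toy_targets = predicted_classes.clone()
toy_targets[0] = (toy_targets[0] + 1) % 3

correct = (predicted_classes == toy_targets).sum().item()
accuracy = 100 * correct / toy_targets.size(0)
print(f"{correct} of {toy_targets.size(0)} correct, accuracy {accuracy:.2f}%")  # 5 of 6 correct, accuracy 83.33%
```

Tensors produced inside `torch.no_grad()` have `requires_grad=False`, which is what saves the memory that would otherwise hold the computation graph.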

In [None]:
# --- Step 6: Define the Evaluation Loop ---
def evaluate_model(model, test_loader, device):
    """
    Evaluates the model's performance on the test set.
    """
    print("\nStep 6: Evaluating the model...")
    model.to(device)
    model.eval() # Set the model to evaluation mode

    correct = 0
    total = 0
    
    # Since we're not training, we don't need to calculate gradients
    with torch.no_grad():
        for images, labels in tqdm(test_loader, desc="Evaluating"):
            images, labels = images.to(device), labels.to(device)
            
            # Get model predictions
            outputs = ...
            
            # Get the class with the highest score
            _, predicted = ...
            
            total += labels.size(0)
            correct += (predicted == labels).sum().item()

    accuracy = 100 * correct / total
    print(f"Accuracy of the model on the {total} test images: {accuracy:.2f}%")

## Putting everything together

The code you wrote above fills in the functions called below. Once every step is complete, the following cell should run the full training and evaluation.


In [None]:
# --- Step 7: Main Execution ---
# This block ties everything together.
print("--- PyTorch Transfer Learning Exercise ---")

# Set device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

# 1. Get Dataloaders
train_loader, test_loader = get_cifar10_dataloaders()

# 2. Load Base Model
base_model = load_pretrained_model()

# 3. Create Custom Model
cifar_model = EfficientNetCIFAR10(base_model)

# 4. Define Loss and Optimizer
# We only want to train the parameters of our new classifier layer.
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(cifar_model.classifier.parameters(), lr=0.001)

# 5. Train the Model
# Note: Training for more epochs will yield better results.
# 3 epochs is a good starting point to verify the setup works.
train_model(cifar_model, train_loader, criterion, optimizer, device, epochs=3)

# 6. Evaluate the Model
evaluate_model(cifar_model, test_loader, device)

print("\nExercise Complete!")
