# Transfer Learning with ResNet34 on CIFAR-10

This notebook demonstrates transfer learning using a pretrained ResNet34 model on the CIFAR-10 dataset. We explore two common approaches:

- **Feature Extraction:** Freeze the pretrained base layers and train only the final classifier.
- **Fine Tuning:** Train the entire model, updating all weights.

---

## Setup and Imports


In [None]:
import torch
import torch.nn as nn
import copy
import matplotlib.pyplot as plt
from resnet_models import initialize_resnet34
from dataloader_generator import get_dataloaders
from utils import train_model, plot_loss_accuracy, plot_confusion_matrix, plot_predictions

# For reproducibility
torch.manual_seed(42)

# Device configuration
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")
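
`torch.manual_seed` covers PyTorch's own generators, but other sources of randomness (Python's `random` module, cuDNN algorithm selection) are left untouched. The sketch below is one way to seed more broadly; `seed_everything` is a hypothetical helper, not part of this notebook's modules, and full determinism also depends on how `get_dataloaders` shuffles and spawns workers, which is not shown here.

```python
import random

import torch


def seed_everything(seed: int = 42) -> None:
    """Seed the Python and PyTorch random number generators (hypothetical helper)."""
    random.seed(seed)
    torch.manual_seed(seed)
    # Explicit CUDA seeding; this call is silently ignored when no GPU is present.
    torch.cuda.manual_seed_all(seed)
    # Trade some speed for repeatable cuDNN convolution algorithm choices.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False


# Re-seeding with the same value reproduces the same random draws.
seed_everything(42)
first = torch.rand(3)
seed_everything(42)
second = torch.rand(3)
print(torch.equal(first, second))
```

Even with these settings, some GPU operations can remain nondeterministic, so results may still vary slightly between runs.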


## Load CIFAR-10 Data

We apply standard transforms to both splits: normalization for training and validation data, plus data augmentation on the training set only.


In [None]:
batch_size = 50
train_dl, valid_dl, class_names_dict = get_dataloaders(batch_size=batch_size)
print(f"Number of classes: {len(class_names_dict)}")


## Initialize ResNet34 Model

We initialize the pretrained ResNet34 model and modify the final fully connected layer to match CIFAR-10 classes.


In [None]:
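num_classes = len(class_names_dict)
model_fe = initialize_resnet34(num_classes=num_classes, pretrained=True)
model_fe.to(device)
# Copy the model before any layers are frozen, so model_ft keeps all
# parameters trainable and both runs start from identical weights.
model_ft = copy.deepcopy(model_fe)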

## Transfer Learning: Feature Extraction

Freeze base layers and train only the final fully connected layer.
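
To make the effect of freezing concrete, the sketch below freezes a model, re-enables its head, and counts trainable parameters. It uses a tiny stand-in network rather than ResNet34, and `freeze_all_but` and `count_parameters` are hypothetical helper names.

```python
import torch.nn as nn


def freeze_all_but(model: nn.Module, head: nn.Module) -> None:
    """Disable gradients everywhere, then re-enable them for `head` only."""
    for param in model.parameters():
        param.requires_grad = False
    for param in head.parameters():
        param.requires_grad = True


def count_parameters(model: nn.Module):
    """Return (trainable, total) parameter counts."""
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    total = sum(p.numel() for p in model.parameters())
    return trainable, total


# Tiny stand-in: a "backbone" followed by a 10-class "head".
toy_backbone = nn.Sequential(nn.Linear(8, 16), nn.ReLU())
toy_head = nn.Linear(16, 10)
toy_model = nn.Sequential(toy_backbone, toy_head)

freeze_all_but(toy_model, toy_head)
trainable, total = count_parameters(toy_model)
print(f"trainable: {trainable} of {total}")
```

For ResNet34, the final layer takes 512 features, so a 10-class head has 512 × 10 + 10 = 5,130 trainable parameters, a small fraction of the full network.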


In [None]:
loss_fn = nn.CrossEntropyLoss()
num_epochs = 25

# Freeze every layer, then re-enable gradients for the classifier head only
for param in model_fe.parameters():
    param.requires_grad = False
for param in model_fe.fc.parameters():
    param.requires_grad = True

# Optimize only the parameters of the final fully connected layer
optimizer_fe = torch.optim.Adam(model_fe.fc.parameters(), lr=1e-4)

# Train model
train_loss_fe, train_acc_fe, valid_loss_fe, valid_acc_fe = train_model(
    model_fe, train_dl, valid_dl, loss_fn, optimizer_fe, num_epochs, device
)

plot_loss_accuracy(
    {"train_loss": train_loss_fe, "valid_loss": valid_loss_fe},
    {"train_accu": train_acc_fe, "valid_accu": valid_acc_fe}
)
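
One subtlety of freezing, stated as general PyTorch behavior rather than something measured in this notebook: `requires_grad = False` stops gradient updates, but BatchNorm layers in training mode still update their running mean and variance on every forward pass. Whether that happens here depends on whether `train_model` calls `model.train()`, which is not shown. The sketch below shows one way to keep those statistics fixed; `set_batchnorm_eval` is a hypothetical helper.

```python
import torch
import torch.nn as nn

BATCHNORM_TYPES = (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)


def set_batchnorm_eval(model: nn.Module) -> None:
    """Put every BatchNorm layer in eval mode so running statistics stay fixed."""
    for module in model.modules():
        if isinstance(module, BATCHNORM_TYPES):
            module.eval()


# Tiny stand-in network containing one BatchNorm layer.
toy = nn.Sequential(nn.Conv2d(3, 4, kernel_size=3), nn.BatchNorm2d(4), nn.ReLU())
toy.train()              # train mode would normally update running statistics
set_batchnorm_eval(toy)  # must be re-applied after every call to .train()

stats_before = toy[1].running_mean.clone()
toy(torch.randn(8, 3, 8, 8))
stats_unchanged = torch.equal(stats_before, toy[1].running_mean)
print(stats_unchanged)
```

Because `model.train()` switches all submodules back to training mode, a helper like this has to be called after it in each epoch.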


### Confusion Matrix and Predictions for Feature Extraction Model


In [None]:
plot_confusion_matrix(model_fe, valid_dl, class_names_dict, device)
# Show predictions on held-out validation images rather than augmented training images
plot_predictions(model_fe, valid_dl, class_names_dict, device)
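
The `plot_confusion_matrix` helper comes from `utils` and is not shown. As a sketch of the underlying computation, assuming integer class labels, the matrix can be built from targets and predictions as follows; `confusion_matrix` and `per_class_recall` are hypothetical names.

```python
import torch


def confusion_matrix(targets: torch.Tensor, predictions: torch.Tensor, num_classes: int) -> torch.Tensor:
    """Count (true class, predicted class) pairs; rows are true classes."""
    # Encode each pair as a single index, then count occurrences of each index.
    flat_index = targets * num_classes + predictions
    counts = torch.bincount(flat_index, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


def per_class_recall(matrix: torch.Tensor) -> torch.Tensor:
    """Fraction of each true class that was predicted correctly."""
    row_totals = matrix.sum(dim=1).clamp(min=1)  # avoid division by zero
    return matrix.diag().float() / row_totals.float()


# Small worked example with three classes.
targets = torch.tensor([0, 0, 1, 2])
predictions = torch.tensor([0, 1, 1, 2])
matrix = confusion_matrix(targets, predictions, num_classes=3)
recall = per_class_recall(matrix)
print(matrix)
print(recall)
```

In the worked example, one of the two class-0 samples is predicted as class 1, so the off-diagonal entry in row 0, column 1 is 1 and the recall for class 0 is 0.5.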


## Transfer Learning: Fine Tuning

Now, we train the entire ResNet34 model, updating all weights.


In [None]:
optimizer_ft = torch.optim.Adam(model_ft.parameters(), lr=1e-4)

# Train model (all layers trainable)
train_loss_ft, train_acc_ft, valid_loss_ft, valid_acc_ft = train_model(
    model_ft, train_dl, valid_dl, loss_fn, optimizer_ft, num_epochs, device
)

plot_loss_accuracy(
    {"train_loss": train_loss_ft, "valid_loss": valid_loss_ft},
    {"train_accu": train_acc_ft, "valid_accu": valid_acc_ft}
)
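
Both runs rely on `train_model` from `utils`, which is not shown. The sketch below is one plausible shape for such a helper, matching the call signature and the four returned lists used above; the body, the name `train_model_sketch`, and the `run_epoch` helper are assumptions.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset


def run_epoch(model, dataloader, loss_fn, device, optimizer=None):
    """One pass over the data; weights are updated only if an optimizer is given."""
    training = optimizer is not None
    model.train(training)
    total_loss, total_correct, total_seen = 0.0, 0, 0
    with torch.set_grad_enabled(training):
        for inputs, targets in dataloader:
            inputs, targets = inputs.to(device), targets.to(device)
            logits = model(inputs)
            loss = loss_fn(logits, targets)
            if training:
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
            total_loss += loss.item() * targets.size(0)
            total_correct += (logits.argmax(dim=1) == targets).sum().item()
            total_seen += targets.size(0)
    return total_loss / total_seen, total_correct / total_seen


def train_model_sketch(model, train_dl, valid_dl, loss_fn, optimizer, num_epochs, device):
    """Return per-epoch (train_loss, train_acc, valid_loss, valid_acc) lists."""
    train_loss, train_acc, valid_loss, valid_acc = [], [], [], []
    for _ in range(num_epochs):
        loss, acc = run_epoch(model, train_dl, loss_fn, device, optimizer)
        train_loss.append(loss)
        train_acc.append(acc)
        loss, acc = run_epoch(model, valid_dl, loss_fn, device)
        valid_loss.append(loss)
        valid_acc.append(acc)
    return train_loss, train_acc, valid_loss, valid_acc


# Demo on synthetic data with a tiny linear classifier.
torch.manual_seed(0)
features = torch.randn(64, 4)
labels = (features[:, 0] > 0).long()
loader = DataLoader(TensorDataset(features, labels), batch_size=16)
toy_model = nn.Linear(4, 2)
toy_optimizer = torch.optim.Adam(toy_model.parameters(), lr=1e-2)
history = train_model_sketch(
    toy_model, loader, loader, nn.CrossEntropyLoss(), toy_optimizer, 3, torch.device("cpu")
)
print([len(series) for series in history])
```

The same loop serves both approaches: the optimizer decides which parameters change, so passing `optimizer_fe` updates only the head while `optimizer_ft` updates every layer.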


### Confusion Matrix and Predictions for Fine-Tuned Model


In [None]:
plot_confusion_matrix(model_ft, valid_dl, class_names_dict, device)
# Show predictions on held-out validation images rather than augmented training images
plot_predictions(model_ft, valid_dl, class_names_dict, device)


## Conclusion

- **Feature Extraction** is faster to train because it updates only the final classifier's parameters, but it may reach slightly lower accuracy.
- **Fine Tuning** typically yields better accuracy by adapting all layers, but it is more computationally intensive.

In this experiment, fine tuning reached noticeably higher validation accuracy than feature extraction, as the accuracy curves and confusion matrices above show.

Which approach fits best depends on the available compute and the specific use case. In both cases, starting from pretrained weights provides a stronger baseline than training from scratch.

---

You can further explore hyperparameter tuning, other architectures, or custom datasets to extend this project.
