In this exercise, you will explore three common strategies for applying transfer learning with a ResNet18 model pretrained on ImageNet, using CIFAR-10 as the new task. The goal is to compare how much of the pretrained ImageNet knowledge should be reused as-is versus fine-tuned for the new task.
You will implement and evaluate:

Feature Extraction (Fixed Base) – freeze all pretrained layers and train only the final classifier.

Partial Fine-Tuning – unfreeze the deeper layers (e.g., layer4 and fc) to adapt high-level features.

Full Fine-Tuning – train all layers with a lower learning rate for full adaptation.

The notebook trains each variant for 3 epochs, measures accuracy on the CIFAR-10 test set, and prints a summary comparison of the three approaches.
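To make the difference between the three strategies concrete, it can help to count how many parameters each one actually trains. The sketch below derives the counts from layer shapes in plain Python, assuming the standard torchvision ResNet18 layout (a 7x7 stem, four stages of two BasicBlocks with 64, 128, 256, and 512 channels, bias-free convolutions) and a 10-class head. The helper names are illustrative and not part of the notebook.

```python
# Parameter counts for a ResNet18-style network, derived from layer shapes.
# Assumption: standard torchvision layout; BatchNorm running statistics are
# buffers, not parameters, so they are not counted here.

def conv(c_in, c_out, k):
    # Convolutions in ResNet carry no bias term
    return c_in * c_out * k * k

def bn(channels):
    # One learnable scale and one learnable shift per channel
    return 2 * channels

def basic_block(c_in, c_out):
    n = conv(c_in, c_out, 3) + bn(c_out) + conv(c_out, c_out, 3) + bn(c_out)
    if c_in != c_out:
        # 1x1 convolution + BatchNorm on the shortcut when the width changes
        n += conv(c_in, c_out, 1) + bn(c_out)
    return n

def stage(c_in, c_out):
    # Each ResNet18 stage holds two BasicBlocks
    return basic_block(c_in, c_out) + basic_block(c_out, c_out)

stem = conv(3, 64, 7) + bn(64)
layers = {
    "layer1": stage(64, 64),
    "layer2": stage(64, 128),
    "layer3": stage(128, 256),
    "layer4": stage(256, 512),
}
fc = 512 * 10 + 10  # weights + biases of the new 10-class head

trainable = {
    "feature_extraction": fc,
    "partial_fine_tuning": layers["layer4"] + fc,
    "full_fine_tuning": stem + sum(layers.values()) + fc,
}

for name, count in trainable.items():
    share = 100.0 * count / trainable["full_fine_tuning"]
    print(f"{name:20s} {count:>12,d} trainable parameters ({share:.2f}% of the model)")
```

Under these assumptions, feature extraction trains only a few thousand parameters, while unfreezing layer4 already exposes roughly three quarters of the network to training.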



In [1]:
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms, models
from torch.utils.data import DataLoader

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

In [2]:
imagenet_mean = [0.485, 0.456, 0.406]
imagenet_std  = [0.229, 0.224, 0.225]

transform_train = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(imagenet_mean, imagenet_std)
])
transform_test = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(imagenet_mean, imagenet_std)
])

train_data = datasets.CIFAR10("./data", train=True, download=True, transform=transform_train)
test_data  = datasets.CIFAR10("./data", train=False, download=True, transform=transform_test)
train_loader = DataLoader(train_data, batch_size=64, shuffle=True)
test_loader  = DataLoader(test_data, batch_size=128, shuffle=False)
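As a rough illustration of what the transforms above do to each pixel, the following plain-Python sketch reproduces the arithmetic of `ToTensor` (scaling 8-bit values to [0, 1]) followed by `Normalize` (subtracting the channel mean and dividing by the channel standard deviation). The function `normalize_pixel` is a hypothetical helper written for this example; it is not a torchvision API.

```python
imagenet_mean = [0.485, 0.456, 0.406]
imagenet_std = [0.229, 0.224, 0.225]

def normalize_pixel(rgb_uint8):
    """Mimic ToTensor + Normalize for a single RGB pixel (illustrative only)."""
    scaled = [v / 255.0 for v in rgb_uint8]  # 8-bit values mapped to [0, 1]
    return [(s - m) / d for s, m, d in zip(scaled, imagenet_mean, imagenet_std)]

black = normalize_pixel([0, 0, 0])        # lower bound of each channel
white = normalize_pixel([255, 255, 255])  # upper bound of each channel

# Resizing 32x32 CIFAR-10 images to 224x224 multiplies the pixel count per image
upsampling_factor = (224 * 224) // (32 * 32)

print("black:", [round(v, 4) for v in black])
print("white:", [round(v, 4) for v in white])
print("pixels per image grow by a factor of", upsampling_factor)
```

The upsampling factor is worth keeping in mind: every forward pass processes 49 times more pixels than the native CIFAR-10 resolution, which is a large part of the training cost in this notebook.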



In [3]:
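def train_model(model, optimizer, criterion, epochs=3):
    """Train `model` on the global `train_loader`; print the mean batch loss per epoch."""
    model.train()  # training mode: e.g., BatchNorm layers use batch statistics
    for epoch in range(epochs):
        running_loss = 0.0
        for x, y in train_loader:
            x, y = x.to(device), y.to(device)
            optimizer.zero_grad()            # clear gradients from the previous step
            loss = criterion(model(x), y)
            loss.backward()                  # gradients only reach parameters with requires_grad=True
            optimizer.step()
            running_loss += loss.item()
        # Average of the per-batch mean losses over the epoch
        print(f"Epoch {epoch+1}/{epochs}  Loss: {running_loss/len(train_loader):.4f}")

def evaluate(model):
    """Return top-1 accuracy of `model` on the global `test_loader`."""
    model.eval()  # inference mode: BatchNorm uses its stored running statistics
    correct, total = 0, 0
    with torch.no_grad():  # no gradients needed for evaluation
        for x, y in test_loader:
            x, y = x.to(device), y.to(device)
            preds = model(x).argmax(1)       # predicted class = index of the largest logit
            correct += (preds == y).sum().item()
            total += y.size(0)
    return correct / total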
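The `evaluate` function pools `correct` and `total` over all batches instead of averaging per-batch accuracies. This matters because the last batch is smaller: with 10,000 test images and a batch size of 128, there are 78 full batches and one final batch of 16. The toy sketch below, using plain lists in place of tensors and hypothetical helper names, shows how the two ways of aggregating can disagree.

```python
def argmax(row):
    # Index of the largest score, analogous to tensor.argmax(1) for one row
    return max(range(len(row)), key=lambda i: row[i])

def pooled_accuracy(batches):
    """Sum correct predictions and sample counts across batches, then divide once."""
    correct, total = 0, 0
    for logits, labels in batches:
        preds = [argmax(row) for row in logits]
        correct += sum(int(p == y) for p, y in zip(preds, labels))
        total += len(labels)
    return correct / total

def mean_of_batch_accuracies(batches):
    """Average the per-batch accuracies; over-weights samples in small batches."""
    accs = []
    for logits, labels in batches:
        preds = [argmax(row) for row in logits]
        accs.append(sum(int(p == y) for p, y in zip(preds, labels)) / len(labels))
    return sum(accs) / len(accs)

# Toy data: one batch of four correct predictions, one batch with a single miss
batches = [
    ([[2.0, 0.1], [0.3, 1.5], [0.9, 0.2], [0.1, 0.7]], [0, 1, 0, 1]),
    ([[0.4, 0.6]], [0]),
]

print("pooled accuracy:         ", pooled_accuracy(batches))
print("mean of batch accuracies:", mean_of_batch_accuracies(batches))
```

The same reasoning applies, to a lesser degree, to the loss printed by `train_model`, which averages per-batch means and therefore gives the final, smaller training batch slightly more weight per sample.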

Strategy 1 – Feature Extraction

Train the model using feature extraction and explain how freezing pretrained layers affects its ability to learn the new CIFAR-10 classes.

In [4]:
model = models.resnet18(weights="IMAGENET1K_V1")
for p in model.parameters():
    p.requires_grad = False
model.fc = nn.Linear(512, 10)
model = model.to(device)

optimizer = optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

print("\n=== Feature Extraction ===")
train_model(model, optimizer, criterion, epochs=3)
acc_fixed = evaluate(model)
print(f"Feature Extraction Accuracy: {acc_fixed:.3f}")




=== Feature Extraction ===
Epoch 1/3  Loss: 0.8204
Epoch 2/3  Loss: 0.6201
Epoch 3/3  Loss: 0.5923
Feature Extraction Accuracy: 0.804
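One subtlety is worth keeping in mind when interpreting this result. Setting `requires_grad = False` stops the optimizer from changing the weights, but `train_model` calls `model.train()`, and in PyTorch a BatchNorm layer in training mode still updates its running mean and variance on every forward pass. The backbone's normalization statistics may therefore drift toward CIFAR-10 even though its weights are frozen. The sketch below shows that moving-average update on a single scalar, assuming PyTorch's default momentum of 0.1; the starting value and batch means are made up for illustration.

```python
def update_running_mean(running_mean, batch_mean, momentum=0.1):
    """Exponential moving average used for BatchNorm running statistics."""
    return (1.0 - momentum) * running_mean + momentum * batch_mean

running_mean = 0.0  # hypothetical statistic carried over from ImageNet pretraining
history = []
for batch_mean in [1.0, 1.0, 1.0]:  # hypothetical batch means seen on the new dataset
    running_mean = update_running_mean(running_mean, batch_mean)
    history.append(running_mean)

print([round(v, 3) for v in history])
```

If a strictly fixed backbone is the goal, one option is to put the frozen BatchNorm modules back into evaluation mode after `model.train()` is called. Whether that helps or hurts accuracy here is not something the notebook measures.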


Strategy 2 – Partial Fine-Tuning

Complete and run the code to fine-tune only the layer4 and fc layers for 3 epochs, then record the final accuracy (acc_partial) and explain whether partial fine-tuning improved the results compared with feature extraction, where the entire backbone is frozen.

In [5]:
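model = models.resnet18(weights="IMAGENET1K_V1")
model.fc = nn.Linear(512, 10)  # new 10-class head, attached before selecting trainable layers
# Train only layer4 and the new head; freeze everything else
for name, p in model.named_parameters():
    p.requires_grad = ("layer4" in name) or ("fc" in name)
model = model.to(device)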

optimizer = optim.Adam(filter(lambda p: p.requires_grad, model.parameters()), lr=1e-4)
criterion = nn.CrossEntropyLoss()

print("\n=== Partial Fine-Tuning ===")
train_model(model, optimizer, criterion, epochs=3)
acc_partial = evaluate(model)
print(f"Partial Fine-Tuning Accuracy: {acc_partial:.3f}")


=== Partial Fine-Tuning ===
Epoch 1/3  Loss: 0.4336
Epoch 2/3  Loss: 0.2013
Epoch 3/3  Loss: 0.1211
Partial Fine-Tuning Accuracy: 0.910
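The layer selection in this strategy relies on substring matching against parameter names. The sketch below applies the same condition to a handful of names that follow ResNet18's naming scheme, and contrasts it with a stricter prefix-based check. The function names and the name `encoder.fc_proj.weight` are hypothetical, included only to show where substring matching could select more than intended in other architectures.

```python
# A few parameter names in the style of torchvision's ResNet18
param_names = [
    "conv1.weight",
    "bn1.weight",
    "layer1.0.conv1.weight",
    "layer3.1.bn2.bias",
    "layer4.0.conv1.weight",
    "layer4.0.downsample.0.weight",
    "layer4.1.bn2.bias",
    "fc.weight",
    "fc.bias",
]

def is_trainable(name):
    # Same condition as in the notebook cell: substring match
    return ("layer4" in name) or ("fc" in name)

def is_trainable_strict(name):
    # Stricter variant: match the top-level module name exactly
    return name.split(".")[0] in {"layer4", "fc"}

selected = [n for n in param_names if is_trainable(n)]
print("trainable:", selected)

# Hypothetical name from some other model, to show the difference
other = "encoder.fc_proj.weight"
print(other, "->", is_trainable(other), "(substring) vs", is_trainable_strict(other), "(strict)")
```

For ResNet18 both checks select the same parameters, so the substring form used in the notebook is sufficient here.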


Strategy 3 – Full Fine-Tuning

Run the code to fully fine-tune all layers of ResNet18 for 3 epochs, record the final accuracy (acc_full), and compare the result to the previous two strategies to determine which transfer learning method performs best.

In [6]:
model = models.resnet18(weights="IMAGENET1K_V1")
for p in model.parameters():
    p.requires_grad = True
model.fc = nn.Linear(512, 10)
model = model.to(device)

optimizer = optim.Adam(model.parameters(), lr=1e-5)
criterion = nn.CrossEntropyLoss()

print("\n=== Full Fine-Tuning ===")
train_model(model, optimizer, criterion, epochs=3)
acc_full = evaluate(model)
print(f"Full Fine-Tuning Accuracy: {acc_full:.3f}")


=== Full Fine-Tuning ===
Epoch 1/3  Loss: 0.8003
Epoch 2/3  Loss: 0.3019
Epoch 3/3  Loss: 0.2114
Full Fine-Tuning Accuracy: 0.928


In [7]:
print("\n---------------------------")
print(f"Feature Extraction:  {acc_fixed:.3f}")
print(f"Partial Fine-Tuning: {acc_partial:.3f}")
print(f"Full Fine-Tuning:    {acc_full:.3f}")
print("---------------------------")



---------------------------
Feature Extraction:  0.804
Partial Fine-Tuning: 0.910
Full Fine-Tuning:    0.928
---------------------------
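When comparing the three numbers, it can be more informative to look at the error rate than at raw accuracy. The sketch below recomputes the comparison from the accuracies printed above, using feature extraction as the baseline. These are results from a single 3-epoch run, so small differences should be read with some caution.

```python
# Accuracies copied from the summary output above
results = {
    "Feature Extraction": 0.804,
    "Partial Fine-Tuning": 0.910,
    "Full Fine-Tuning": 0.928,
}

baseline_acc = results["Feature Extraction"]
baseline_error = 1.0 - baseline_acc

summary = {}
for name, acc in results.items():
    error = 1.0 - acc
    summary[name] = {
        "error_pct": 100.0 * error,
        "gain_points": 100.0 * (acc - baseline_acc),  # accuracy gain in percentage points
        "error_reduction_pct": 100.0 * (baseline_error - error) / baseline_error,
    }

for name, s in summary.items():
    print(
        f"{name:20s} error {s['error_pct']:5.1f}%  "
        f"gain {s['gain_points']:+5.1f} pts  "
        f"error reduced by {s['error_reduction_pct']:5.1f}%"
    )
```

Read this way, partial fine-tuning removes a little over half of the errors made by the frozen model, and full fine-tuning removes somewhat more, so most of the improvement comes from unfreezing layer4.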
