# Weekly Project: Image Classification with Transfer Learning

In this project, you will build a complete image classification pipeline using transfer learning. You'll work with the dataset provided by your instructor.

**Learning Objectives:**
- Load and prepare image datasets for deep learning
- Use pre-trained models for transfer learning
- Implement two transfer learning strategies: fine-tuning and feature extraction
- Evaluate model performance
- Deploy models using ONNX for production (Optional)

**References:**

- [Training with PyTorch](https://docs.pytorch.org/tutorials/beginner/introyt/trainingyt.html)
- [PyTorch Transfer Learning Tutorial](https://pytorch.org/tutorials/beginner/transfer_learning_tutorial.html)

## Table of Contents

1. [Data Ingestion](#1)
2. [Data Preparation](#2)
3. [Model Building](#3)
4. [Training](#4)
   - [4.1 ConvNet as Fixed Feature Extractor](#4-1)
   - [4.2 Fine-tuning the ConvNet](#4-2)
5. [Evaluation](#5)
6. [Inference on Custom Images](#6)
7. [Deployment (ONNX)](#7)

## Imports

In [None]:
import torch
import torch.nn as nn
import torch.optim as optim
from torch.optim import lr_scheduler
import torch.backends.cudnn as cudnn
import numpy as np
import torchvision
from torchvision import datasets, models, transforms
import matplotlib.pyplot as plt
import time
import os
from PIL import Image
from tempfile import TemporaryDirectory

cudnn.benchmark = True
plt.ion()

## Setup Device




**Note:** You will need a GPU, so run this notebook on Colab and select a GPU runtime (e.g., T4 GPU).

In [None]:
# Device configuration
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"Using device: {device}")

<a name='1'></a>
## 1. Data Ingestion

**Task**: The dataset should be downloaded and extracted to a local directory.

**References:**

- [Dataset and DataLoader](https://docs.pytorch.org/tutorials/beginner/introyt/trainingyt.html#dataset-and-dataloader)
- [torchvision.datasets.ImageFolder](https://pytorch.org/vision/stable/generated/torchvision.datasets.ImageFolder.html)

In [None]:
import kagglehub

# Download latest version
path = kagglehub.dataset_download("puneet6060/intel-image-classification")

print("Path to dataset files:", path)

**Task**: create a `train_dataset` and `test_dataset` (without transforms for now).

In [None]:
# YOUR CODE HERE
from torchvision import datasets
import os

# Paths to train and test folders
train_dir = os.path.join(path, "seg_train", "seg_train")
test_dir  = os.path.join(path, "seg_test", "seg_test")

# Create datasets (no transforms yet)
train_dataset = datasets.ImageFolder(root=train_dir)
test_dataset  = datasets.ImageFolder(root=test_dir)

# Quick sanity check
print("Number of training images:", len(train_dataset))
print("Number of test images:", len(test_dataset))
print("Classes:", train_dataset.classes)

**Quick Check**: Verify that the train and test set counts match those listed on the original source (Kaggle).



In [None]:
# YOUR CODE HERE
# Quick check: verify dataset sizes
print("Training set size:", len(train_dataset))
print("Test set size:", len(test_dataset))
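
`ImageFolder` treats each subdirectory of the root as one class, so the counts can also be cross-checked directly on disk without PyTorch. The sketch below assumes images sit directly inside `root/class_name/`; `count_images_per_class` and `IMAGE_EXTENSIONS` are hypothetical names introduced for illustration, not part of torchvision.

```python
import os

# Assumed set of image extensions; adjust to match the dataset.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}

def count_images_per_class(root):
    """Return {class_name: image_count} for a root/class_name/image layout."""
    counts = {}
    for class_name in sorted(os.listdir(root)):
        class_dir = os.path.join(root, class_name)
        if not os.path.isdir(class_dir):
            continue  # skip stray files next to the class folders
        counts[class_name] = sum(
            1
            for file_name in os.listdir(class_dir)
            if os.path.splitext(file_name)[1].lower() in IMAGE_EXTENSIONS
        )
    return counts
```

Under those assumptions, `sum(count_images_per_class(train_dir).values())` should agree with `len(train_dataset)`, and the per-class numbers show whether the classes are balanced.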


<a name='2'></a>
## 2. Data Preparation

Before training, we need to:
1. Define augmentation for training
2. Define normalization for both training and testing
3. Create **`DataLoader`** for efficient batch processing

**Task:** Create transformation pipelines for training and validation. Pre-trained models expect ImageNet normalization statistics.

**Reference:**

- [torchvision.transforms](https://pytorch.org/vision/stable/transforms.html)
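
To make the normalization step concrete, here is a small plain-Python sketch of what `transforms.Normalize` does to a single RGB pixel, together with its inverse (useful later when displaying images). The helper names `normalize_pixel` and `unnormalize_pixel` are hypothetical; the statistics are the ImageNet values used throughout this notebook.

```python
# ImageNet channel statistics (RGB), the same values used in this notebook.
IMAGENET_MEAN = [0.485, 0.456, 0.406]
IMAGENET_STD = [0.229, 0.224, 0.225]

def normalize_pixel(rgb):
    """Apply (x - mean) / std per channel to an RGB pixel scaled to [0, 1]."""
    return [(x - m) / s for x, m, s in zip(rgb, IMAGENET_MEAN, IMAGENET_STD)]

def unnormalize_pixel(rgb):
    """Invert the normalization: x * std + mean per channel."""
    return [x * s + m for x, m, s in zip(rgb, IMAGENET_MEAN, IMAGENET_STD)]

pixel = [0.5, 0.5, 0.5]               # mid-gray pixel after ToTensor scaling
normalized = normalize_pixel(pixel)    # roughly zero-centered values
restored = unnormalize_pixel(normalized)
print(normalized)
print(restored)
```

A pixel equal to the channel means maps to zero, which is the input range the pre-trained weights were trained on.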

In [None]:
# YOUR CODE HERE
from torchvision import transforms, datasets
from torch.utils.data import DataLoader
import os

# ImageNet normalization stats (required for pretrained models)
IMAGENET_MEAN = [0.485, 0.456, 0.406]
IMAGENET_STD  = [0.229, 0.224, 0.225]

IMG_SIZE = 224  # common size for pretrained CNNs

# 1) Transforms
train_transforms = transforms.Compose([
    transforms.RandomResizedCrop(IMG_SIZE),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(mean=IMAGENET_MEAN, std=IMAGENET_STD),
])

val_transforms = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(IMG_SIZE),
    transforms.ToTensor(),
    transforms.Normalize(mean=IMAGENET_MEAN, std=IMAGENET_STD),
])

# 2) Re-create datasets WITH transforms
train_dir = os.path.join(path, "seg_train", "seg_train")
test_dir  = os.path.join(path, "seg_test", "seg_test")

train_dataset = datasets.ImageFolder(root=train_dir, transform=train_transforms)
test_dataset  = datasets.ImageFolder(root=test_dir,  transform=val_transforms)

# 3) DataLoaders
BATCH_SIZE = 32
NUM_WORKERS = 2  # Colab usually ok with 2

train_loader = DataLoader(train_dataset, batch_size=BATCH_SIZE, shuffle=True,
                          num_workers=NUM_WORKERS, pin_memory=True)
test_loader  = DataLoader(test_dataset, batch_size=BATCH_SIZE, shuffle=False,
                          num_workers=NUM_WORKERS, pin_memory=True)

print("Train batches:", len(train_loader))
print("Test batches:", len(test_loader))
print("Classes:", train_dataset.classes)
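
The validation pipeline is deterministic: `Resize(256)` scales the shorter side to 256 pixels while keeping the aspect ratio, and `CenterCrop(224)` keeps the central 224x224 window. The sketch below reproduces that geometry in plain Python. The helper names are hypothetical, the rounding only approximates torchvision's behaviour, and the 150x150 input size is an assumption chosen for illustration.

```python
def resize_shorter_side(width, height, size):
    """Output (width, height) after scaling the shorter side to `size`."""
    if width <= height:
        return size, int(size * height / width)
    return int(size * width / height), size

def center_crop_box(width, height, crop):
    """(left, top, right, bottom) of a centered crop-by-crop window."""
    left = int(round((width - crop) / 2.0))
    top = int(round((height - crop) / 2.0))
    return left, top, left + crop, top + crop

# Hypothetical 150x150 input image, chosen only for illustration.
resized = resize_shorter_side(150, 150, 256)
box = center_crop_box(*resized, 224)
print(resized, box)
```

For a square input this yields a 256x256 image and a crop box of `(16, 16, 240, 240)`, so a 16-pixel border is discarded on every side.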


In [None]:
# YOUR CODE HERE

**Quick Check**: Visualize a batch of training images

In [None]:
import helper_utils  # course-provided helper module

helper_utils.visualize_batch?

In [None]:
# YOUR CODE HERE
import matplotlib.pyplot as plt
import numpy as np
import torchvision

# Get one batch of training images
images, labels = next(iter(train_loader))

# Function to unnormalize images for visualization
def unnormalize(img):
    """Undo ImageNet normalization and clamp to [0, 1] for display."""
    mean = torch.tensor(IMAGENET_MEAN).view(3, 1, 1)
    std = torch.tensor(IMAGENET_STD).view(3, 1, 1)
    return (img * std + mean).clamp(0, 1)

# Unnormalize a batch
images = torch.stack([unnormalize(img) for img in images])

# Make a grid
grid = torchvision.utils.make_grid(images[:8], nrow=4)

# Plot
plt.figure(figsize=(10,6))
plt.imshow(np.transpose(grid.numpy(), (1, 2, 0)))
plt.axis("off")
plt.title("Sample Training Images")
plt.show()


<a name='3'></a>
## 3. Model Building

We'll use a pre-trained ResNet-18 model and adapt it for our 6-class classification task.

**Task:** Load a pre-trained ResNet-18 model and modify the final layer for 6 classes.

**Reference:**

- [PyTorch Transfer Learning Tutorial](https://pytorch.org/tutorials/beginner/transfer_learning_tutorial.html)
- [torchvision.models](https://pytorch.org/vision/stable/models.html)
- [ResNet documentation](https://pytorch.org/vision/stable/models/generated/torchvision.models.resnet18.html)

In [None]:
# YOUR CODE HERE
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 6

# 1) Load pretrained ResNet-18
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# 2) Replace final fully-connected layer
in_features = model.fc.in_features
model.fc = nn.Linear(in_features, NUM_CLASSES)

# Move model to device
model = model.to(device)

print(model.fc)


<a name='4'></a>
## 4. Training

**Task:** Implement a training **function** and then train using two different transfer learning strategies.

**Reference:** [PyTorch Training Tutorial](https://docs.pytorch.org/tutorials/beginner/introyt/trainingyt.html#the-training-loop)

In [None]:
import torch
import torch.nn as nn
from tqdm.auto import tqdm

def train_model(model, train_loader, val_loader, criterion, optimizer, device, epochs=5):
    history = {"train_loss": [], "train_acc": [], "val_loss": [], "val_acc": []}

    for epoch in range(epochs):
        # -------- Train --------
        model.train()
        running_loss, running_correct, total = 0.0, 0, 0

        for images, labels in tqdm(train_loader, desc=f"Epoch {epoch+1}/{epochs} [train]", leave=False):
            images, labels = images.to(device), labels.to(device)

            optimizer.zero_grad()
            outputs = model(images)
            loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()

            running_loss += loss.item() * images.size(0)
            preds = outputs.argmax(dim=1)
            running_correct += (preds == labels).sum().item()
            total += images.size(0)

        train_loss = running_loss / total
        train_acc = running_correct / total

        # -------- Validate/Test --------
        model.eval()
        val_running_loss, val_running_correct, val_total = 0.0, 0, 0

        with torch.no_grad():
            for images, labels in tqdm(val_loader, desc=f"Epoch {epoch+1}/{epochs} [val]", leave=False):
                images, labels = images.to(device), labels.to(device)

                outputs = model(images)
                loss = criterion(outputs, labels)

                val_running_loss += loss.item() * images.size(0)
                preds = outputs.argmax(dim=1)
                val_running_correct += (preds == labels).sum().item()
                val_total += images.size(0)

        val_loss = val_running_loss / val_total
        val_acc = val_running_correct / val_total

        history["train_loss"].append(train_loss)
        history["train_acc"].append(train_acc)
        history["val_loss"].append(val_loss)
        history["val_acc"].append(val_acc)

        print(f"Epoch {epoch+1}/{epochs} | "
              f"train loss: {train_loss:.4f}, train acc: {train_acc:.4f} | "
              f"val loss: {val_loss:.4f}, val acc: {val_acc:.4f}")

    return history
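
`train_model` multiplies each batch loss by the batch size before summing because `nn.CrossEntropyLoss` returns the batch mean by default, and the last batch of an epoch may be smaller than the others. The sketch below uses made-up batch losses to show how a sample-weighted average differs from a naive mean over batches; `epoch_average` is a hypothetical helper.

```python
def epoch_average(batch_means, batch_sizes):
    """Sample-weighted average of per-batch mean losses."""
    total = sum(batch_sizes)
    weighted_sum = sum(m * n for m, n in zip(batch_means, batch_sizes))
    return weighted_sum / total

# Hypothetical epoch: two full batches of 32 and a final batch of 4.
batch_means = [0.50, 0.50, 2.00]
batch_sizes = [32, 32, 4]

naive = sum(batch_means) / len(batch_means)         # treats all batches equally
weighted = epoch_average(batch_means, batch_sizes)  # mirrors train_model
print(f"naive: {naive:.4f}, weighted: {weighted:.4f}")
```

Here the naive mean is 1.0 while the weighted average is about 0.588, because the small final batch no longer counts as much as a full one.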


In [None]:
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 6
criterion = nn.CrossEntropyLoss()

# -------- Strategy 1: Feature Extraction --------
model_fe = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model_fe.fc = nn.Linear(model_fe.fc.in_features, NUM_CLASSES)

for p in model_fe.parameters():
    p.requires_grad = False
for p in model_fe.fc.parameters():
    p.requires_grad = True

model_fe = model_fe.to(device)
optimizer_fe = torch.optim.Adam(model_fe.fc.parameters(), lr=1e-3)

history_fe = train_model(model_fe, train_loader, test_loader, criterion, optimizer_fe, device, epochs=5)


# -------- Strategy 2: Fine-tuning --------
model_ft = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model_ft.fc = nn.Linear(model_ft.fc.in_features, NUM_CLASSES)

for p in model_ft.parameters():
    p.requires_grad = True

model_ft = model_ft.to(device)
optimizer_ft = torch.optim.Adam(model_ft.parameters(), lr=1e-4)

history_ft = train_model(model_ft, train_loader, test_loader, criterion, optimizer_ft, device, epochs=5)


<a name='4-1'></a>
### 4.1 ConvNet as Fixed Feature Extractor

In this approach, we freeze all the convolutional layers and only train the final classifier layer.

**Task:**

1. Load a fresh pre-trained model
2. Freeze all parameters except the final layer
3. Set up optimizer to only train the final layer
4. Train the model
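
Freezing everything except the final layer leaves only a tiny share of the network for the optimizer to update. The arithmetic below is a rough sketch: `FC_IN_FEATURES = 512` is the input width of ResNet-18's `fc` layer, while `BACKBONE_PARAMS` is an assumed figure for all remaining layers that you should verify on the loaded model.

```python
NUM_CLASSES = 6
FC_IN_FEATURES = 512  # input width of ResNet-18's final `fc` layer

# Assumed size of the ResNet-18 backbone (everything except `fc`);
# verify with sum(p.numel() for p in model.parameters()) on the real model.
BACKBONE_PARAMS = 11_176_512

head_params = FC_IN_FEATURES * NUM_CLASSES + NUM_CLASSES  # weights + biases
total_params = BACKBONE_PARAMS + head_params
trainable_fraction = head_params / total_params

print(f"Trainable (head only): {head_params:,}")
print(f"Total parameters:      {total_params:,}")
print(f"Trainable fraction:    {trainable_fraction:.4%}")
```

On the real model, `sum(p.numel() for p in model_fe.parameters() if p.requires_grad)` gives the trainable count directly and is a quick way to confirm that the freeze worked.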

In [None]:
# YOUR CODE HERE
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 6

model_fe = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model_fe.fc = nn.Linear(model_fe.fc.in_features, NUM_CLASSES)
model_fe = model_fe.to(device)


In [None]:
# YOUR CODE HERE
for param in model_fe.parameters():
    param.requires_grad = False

for param in model_fe.fc.parameters():
    param.requires_grad = True


In [None]:
# YOUR CODE HERE
criterion = nn.CrossEntropyLoss()
optimizer_fe = torch.optim.Adam(model_fe.fc.parameters(), lr=1e-3)


In [None]:
# YOUR CODE HERE
history_fe = train_model(
    model_fe,
    train_loader,
    test_loader,
    criterion,
    optimizer_fe,
    device,
    epochs=5
)


**Quick Check**: Visualize training history

In [None]:
import matplotlib.pyplot as plt
import helper_utils


helper_utils.visualize_training_history(history_fe)
plt.show()

**Quick Check**: Visualize predictions

In [None]:
import matplotlib.pyplot as plt

class_names = train_dataset.classes

dataloaders = {"val": test_loader}

helper_utils.visualize_predictions(
    model_fe,
    dataloaders["val"],
    class_names,
    device,
    num_images=6
)
plt.show()


In [None]:
import matplotlib.pyplot as plt

class_names = train_dataset.classes

dataloaders = {"val": test_loader}

helper_utils.visualize_predictions(
    model_ft,
    dataloaders["val"],
    class_names,
    device,
    num_images=6
)
plt.show()


<a name='4-2'></a>
### 4.2 Fine-tuning the ConvNet

In this approach, we unfreeze all layers and train the entire network with a smaller learning rate.

**Task:**

1. Load a fresh pre-trained model
2. Modify the final layer
3. Set up optimizer for all parameters with a smaller learning rate
4. Train the model

In [None]:
# YOUR CODE HERE
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 6

model_ft = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model_ft.fc = nn.Linear(model_ft.fc.in_features, NUM_CLASSES)
model_ft = model_ft.to(device)


In [None]:
# YOUR CODE HERE
for param in model_ft.parameters():
    param.requires_grad = True

In [None]:
# YOUR CODE HERE
criterion = nn.CrossEntropyLoss()
optimizer_ft = torch.optim.Adam(model_ft.parameters(), lr=1e-4)

In [None]:
# YOUR CODE HERE
history_ft = train_model(
    model_ft,
    train_loader,
    test_loader,
    criterion,
    optimizer_ft,
    device,
    epochs=5
)


**Quick Check**: Visualize training history

In [None]:
helper_utils.visualize_training_history(history_ft)
plt.show()

**Quick Check**: Visualize predictions

In [None]:
helper_utils.visualize_predictions(model_ft, dataloaders['val'], class_names, device, num_images=6)
plt.show()

<a name='5'></a>
## 5. Evaluation

Compare the performance of both approaches.

**Task:** Evaluate both models and compare their performance metrics.

In [None]:
# Evaluate models on validation set
# YOUR CODE HERE
# Compare final validation accuracies, training times, etc.

print("Feature Extractor Approach:")
print(f"  Best Val Accuracy: {max(history_fe['val_acc']):.4f}")
print(f"  Final Val Accuracy: {history_fe['val_acc'][-1]:.4f}")
print()

print("Fine-tuning Approach:")
print(f"  Best Val Accuracy: {max(history_ft['val_acc']):.4f}")
print(f"  Final Val Accuracy: {history_ft['val_acc'][-1]:.4f}")
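
Overall accuracy can hide classes that are systematically confused with one another, so a per-class breakdown is a useful complement when comparing the two models. The sketch below assumes you have collected true and predicted class indices as Python lists (for example by looping over `test_loader`); the labels shown are made up and the helper names are hypothetical.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """matrix[t][p] counts samples of true class t predicted as class p."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix

def per_class_accuracy(matrix):
    """Fraction of each true class predicted correctly (None if absent)."""
    result = []
    for i, row in enumerate(matrix):
        total = sum(row)
        result.append(row[i] / total if total else None)
    return result

# Made-up labels for three classes, for illustration only.
y_true = [0, 0, 1, 1, 2, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0, 2]

cm = confusion_matrix(y_true, y_pred, num_classes=3)
print(cm)
print(per_class_accuracy(cm))
```

Rows correspond to true classes and columns to predictions, so off-diagonal entries show which pairs of classes each model mixes up.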

<a name='6'></a>
## 6. Inference on Custom Images

Test your trained model on custom images.

**Task:** Load a custom image, preprocess it, and make a prediction using your trained model.

**Reference:** [Image Preprocessing](https://pytorch.org/vision/stable/transforms.html)

In [None]:
# Make prediction on a custom image
img_path = "Sea.jpg"

helper_utils.visualize_single_prediction(
    model_ft,
    img_path,
    val_transforms,
    class_names,
    device
)
plt.show()
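
Because the final layer is an `nn.Linear`, the model returns raw scores (logits) rather than probabilities. The sketch below shows, in plain Python, how logits can be turned into probabilities with a softmax and ranked. The logits are made up, the class names are assumed to follow the dataset's folder names, and `softmax` and `top_k` are hypothetical helpers.

```python
import math

def softmax(logits):
    """Convert raw scores to probabilities; subtract the max for stability."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_k(probs, class_names, k=3):
    """Return the k most probable (class_name, probability) pairs."""
    ranked = sorted(zip(class_names, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

# Assumed class names and made-up logits, for illustration only.
class_names = ["buildings", "forest", "glacier", "mountain", "sea", "street"]
logits = [0.2, -1.0, 1.5, 0.9, 3.1, -0.4]

for name, prob in top_k(softmax(logits), class_names):
    print(f"{name}: {prob:.3f}")
```

With a real model the same result is available via `torch.softmax(outputs, dim=1)`; reporting the top few classes makes it easier to judge how confident a prediction is.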


# üèÜüéâ Congratulations on completing the Weekly Final Project! üéâüèÜ

Fantastic job on finishing the Weekly Final Project! You‚Äôve put your skills to the test and made it to the end. Take a moment to celebrate your hard work and dedication. Keep up the great work and continue your learning journey!

<a name='7'></a>
## 7. Deployment (ONNX)

Convert your trained model to ONNX format for deployment.

**Task:**
1. Convert the PyTorch model to ONNX format
2. Load the ONNX model and perform inference

**Reference:**
- [PyTorch to ONNX](https://docs.pytorch.org/tutorials/beginner/onnx/export_simple_model_to_onnx_tutorial.html)

In [None]:
!pip install onnx onnxscript

In [None]:
import torch

# Set model to evaluation mode
model_ft.eval()

# Dummy input (matches input shape of ResNet)
dummy_input = torch.randn(1, 3, 224, 224).to(device)

# Export model to ONNX
onnx_path = "model_ft.onnx"
torch.onnx.export(
    model_ft,
    dummy_input,
    onnx_path,
    input_names=["input"],
    output_names=["output"],
    dynamic_axes={
        "input": {0: "batch_size"},
        "output": {0: "batch_size"}
    },
    opset_version=11
)

print(f"Model exported to {onnx_path}")


In [None]:
!pip install onnxruntime


In [None]:
import onnxruntime as ort
import numpy as np
from PIL import Image

# Load ONNX model
ort_session = ort.InferenceSession("model_ft.onnx", providers=["CPUExecutionProvider"])

# Load and preprocess image (same transforms as validation)
img = Image.open(img_path).convert("RGB")
img = val_transforms(img)          # apply transforms
img = img.unsqueeze(0).numpy()     # shape: (1, 3, 224, 224)

# Run inference with ONNX
outputs = ort_session.run(
    None,
    {"input": img}
)

# Get predicted class
pred_class = outputs[0].argmax(axis=1)[0]
print("Predicted class:", class_names[pred_class])
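
A common sanity check after export is to confirm that the ONNX Runtime output agrees with the PyTorch output for the same input, up to floating-point tolerance. The sketch below implements the comparison rule in plain Python, using the same inequality as `numpy.allclose`. The two logit vectors are made up, the tolerances are assumptions, and `outputs_close` is a hypothetical helper.

```python
def outputs_close(actual, expected, rtol=1e-3, atol=1e-5):
    """True if |actual - expected| <= atol + rtol * |expected| element-wise."""
    if len(actual) != len(expected):
        return False
    return all(
        abs(a - e) <= atol + rtol * abs(e)
        for a, e in zip(actual, expected)
    )

# Made-up logits standing in for PyTorch and ONNX Runtime outputs.
torch_logits = [0.2000, -1.0000, 1.5000, 0.9000, 3.1000, -0.4000]
onnx_logits = [0.2001, -1.0002, 1.4999, 0.9001, 3.1003, -0.4001]

print(outputs_close(onnx_logits, torch_logits))
```

With real arrays, `numpy.testing.assert_allclose` performs the same kind of check and reports the largest mismatch when it fails.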
