# Chapter 5 Bonus: Practical CNN Lab

This bonus notebook extends **Chapter 5 – Convolutional Neural Networks** with practical, visual experiments.

We focus on:

1. **Convolution Kernel Visualizations** – what 3×3 filters actually do to images
2. **Feature Map Exploration** – visualizing activations inside a pretrained CNN
3. **Training a Small CNN** – end-to-end training on a small dataset
4. **Data Augmentation Demo** – how augmentations change images and affect training
5. **Transfer Learning Exercise** – fine-tuning a pretrained model on a new task
6. **(Optional) CNN Architecture Quiz** – quick recap of key architectures

These experiments build **intuition** for CNNs as feature extractors and practical tools, complementing the more theoretical and architectural focus of the main Chapter 5 notebook.

## Setup and Imports

We will use **PyTorch** and **torchvision** for CNNs and datasets, plus **NumPy** and **matplotlib** for visualization.

In [None]:
import numpy as np
import matplotlib.pyplot as plt

import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader, random_split

import torchvision
from torchvision import datasets, transforms, models

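# Plot style (the 'seaborn-v0_8-*' names only exist in newer matplotlib releases,
# so fall back to the default style when the name is not available)
plot_style = 'seaborn-v0_8-darkgrid'
if plot_style in plt.style.available:
    plt.style.use(plot_style)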
plt.rcParams['figure.figsize'] = (12, 6)
plt.rcParams['font.size'] = 11

# Reproducibility
np.random.seed(42)
torch.manual_seed(42)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print("Using device:", device)

## 1. Convolution Kernel Visualizations

Before diving into deep CNNs, it’s helpful to see how **simple kernels** transform images.

In this section we:

- Load a sample image
- Define a few classic 3×3 kernels:
  - **Identity** (no change)
  - **Blur / Box filter**
  - **Sharpen**
  - **Edge detectors** (Sobel-like filters)
- Apply them using a single convolution and visualize the results.

This connects directly to the idea that convolutional filters act as **feature detectors** (edges, textures, etc.), as discussed in Section 5.1.
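
To make concrete what "applying a kernel" means, here is a minimal sketch (not part of the original notebook) that computes one filtered image by hand and compares it with `F.conv2d`. The helper name `correlate2d_manual` and the random test image are illustrative assumptions. Note that `F.conv2d` computes a cross-correlation: the kernel slides over the image without being flipped.

```python
import torch
import torch.nn.functional as F


def correlate2d_manual(image, kernel):
    """Slide a 3x3 kernel over a 2D image with zero padding of 1 (no kernel flip)."""
    h, w = image.shape
    padded = F.pad(image, (1, 1, 1, 1))  # zero-pad left, right, top, bottom
    out = torch.zeros(h, w)
    for i in range(h):
        for j in range(w):
            patch = padded[i:i + 3, j:j + 3]   # 3x3 neighbourhood centred on (i, j)
            out[i, j] = (patch * kernel).sum()  # weighted sum = one output pixel
    return out


torch.manual_seed(0)
image = torch.rand(8, 8)  # stand-in for a small grayscale image
sobel_x = torch.tensor([[-1., 0., 1.],
                        [-2., 0., 2.],
                        [-1., 0., 1.]])

manual = correlate2d_manual(image, sobel_x)
builtin = F.conv2d(image.view(1, 1, 8, 8), sobel_x.view(1, 1, 3, 3), padding=1).squeeze()

print("max abs difference:", (manual - builtin).abs().max().item())
```

The two results should agree up to floating-point rounding, which shows that each output pixel is just a weighted sum over a 3×3 neighbourhood.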

In [None]:
# Load a sample image from CIFAR10 (or use your own image if you prefer)

transform_basic = transforms.Compose([
    transforms.ToTensor()
])

cifar10 = datasets.CIFAR10(root="./data", train=False, download=True, transform=transform_basic)
img, label = cifar10[0]

# Use only a single channel (convert to grayscale for kernel visualization)
img_gray = img.mean(dim=0, keepdim=True)  # shape: (1, H, W)

print("Image shape (C,H,W):", img.shape)

# Define 3x3 kernels
kernels = {
    "identity": torch.tensor([[0, 0, 0],
                               [0, 1, 0],
                               [0, 0, 0]], dtype=torch.float32),
    "blur": (1/9.0) * torch.ones((3, 3), dtype=torch.float32),
    "sharpen": torch.tensor([[0, -1, 0],
                              [-1, 5, -1],
                              [0, -1, 0]], dtype=torch.float32),
    "edge_horizontal": torch.tensor([[-1, -2, -1],
                                      [ 0,  0,  0],
                                      [ 1,  2,  1]], dtype=torch.float32),
    "edge_vertical": torch.tensor([[-1, 0, 1],
                                    [-2, 0, 2],
                                    [-1, 0, 1]], dtype=torch.float32),
}

# Apply kernels via conv2d (PyTorch computes a cross-correlation: the kernel is not flipped)
img_batch = img_gray.unsqueeze(0)  # (1,1,H,W)

outputs = {}
for name, k in kernels.items():
    kernel = k.view(1, 1, 3, 3)  # (out_channels,in_channels,3,3)
    out = F.conv2d(img_batch, kernel, padding=1)
    outputs[name] = out.squeeze().detach().numpy()

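# Plot original and filtered images.
# There are 7 panels (RGB, grayscale, 5 kernels), so a 2x3 grid would silently drop the last kernel.
fig, axes = plt.subplots(2, 4, figsize=(16, 8))
axes = axes.ravel()
axes[-1].axis("off")  # the 8th panel is unused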

axes[0].imshow(img.permute(1, 2, 0).numpy())
axes[0].set_title("Original RGB")
axes[0].axis("off")

axes[1].imshow(img_gray.squeeze().numpy(), cmap="gray")
axes[1].set_title("Grayscale")
axes[1].axis("off")

for ax, (name, out) in zip(axes[2:], outputs.items()):
    ax.imshow(out, cmap="gray")
    ax.set_title(name)
    ax.axis("off")

plt.tight_layout()
plt.show()

**Observations:**

- Blur filters smooth the image and remove high-frequency details.
- Sharpen filters emphasize edges and small details.
- Edge detectors highlight horizontal or vertical transitions.

Early CNN layers typically learn filters of this kind from data (edge and simple texture detectors), rather than having them designed by hand.
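
As a small worked example of the edge-detector observation, the sketch below (an illustration, not part of the original notebook) applies both Sobel kernels to a synthetic image with a single vertical edge and combines them into a gradient magnitude. The synthetic image and the variable names are assumptions chosen to keep the arithmetic easy to follow.

```python
import torch
import torch.nn.functional as F

# Synthetic 8x8 image: dark left half, bright right half (one vertical edge)
image = torch.zeros(1, 1, 8, 8)
image[..., 4:] = 1.0

sobel_x = torch.tensor([[-1., 0., 1.],
                        [-2., 0., 2.],
                        [-1., 0., 1.]]).view(1, 1, 3, 3)  # responds to left-right changes
sobel_y = sobel_x.transpose(2, 3)                         # responds to top-bottom changes

# No padding here, so the 8x8 input gives a 6x6 output without border effects
gx = F.conv2d(image, sobel_x)
gy = F.conv2d(image, sobel_y)
magnitude = torch.sqrt(gx ** 2 + gy ** 2).squeeze()

print(magnitude)
```

Only the output columns whose 3×3 window straddles the edge are non-zero, and the top-bottom response `gy` is zero everywhere because the image does not change along the vertical direction.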

## 2. Feature Map Exploration in a Pretrained CNN

Next we peek **inside** a pretrained CNN (e.g., ResNet18) to see what different layers are doing.

We will:

- Load a pretrained ResNet18
- Register forward hooks on an early conv layer and a deeper layer
- Pass an image through the network
- Visualize a few activation maps from early vs late layers

Early layers usually detect **edges and simple color/texture blobs**, while deeper layers respond to **more abstract, class-specific patterns**.
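
The hook mechanism itself can be tried on a much smaller model first. The following is a minimal sketch under stated assumptions: `tiny_net`, `save_output`, and `captured` are hypothetical names, and the network is a stand-in rather than the ResNet18 used in this section.

```python
import torch
import torch.nn as nn

# Tiny stand-in network (hypothetical; not the ResNet18 used in this notebook)
tiny_net = nn.Sequential(
    nn.Conv2d(3, 4, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(4, 8, kernel_size=3, padding=1),
)

captured = {}


def save_output(name):
    # A forward hook is called with (module, inputs, output) after the module has run
    def hook(module, inputs, output):
        captured[name] = output.detach()
    return hook


handle_early = tiny_net[0].register_forward_hook(save_output("early"))
handle_late = tiny_net[3].register_forward_hook(save_output("late"))

with torch.no_grad():
    tiny_net(torch.rand(1, 3, 16, 16))

# Remove the hooks so they do not fire (or pile up) on later forward passes
handle_early.remove()
handle_late.remove()

for name, tensor in captured.items():
    print(name, tuple(tensor.shape))
```

Keeping the handles returned by `register_forward_hook` and removing them afterwards is a design choice that matters in notebooks, where re-running a cell would otherwise register the same hook again.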

In [None]:
# Load a pretrained ResNet18 (may download weights the first time)
resnet18 = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
resnet18 = resnet18.to(device)
resnet18.eval()

# Preprocessing for ImageNet models
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# Use a CIFAR10 image as input (resized to 224x224)
img_raw, _ = cifar10[10]
img_pil = transforms.ToPILImage()(img_raw)
img_input = preprocess(img_pil).unsqueeze(0).to(device)  # (1,3,224,224)

# Containers for feature maps
features = {}


def get_activation(name):
    def hook(model, input, output):
        features[name] = output.detach().cpu()
    return hook

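# Register hooks on the first conv layer and a deeper stage (layer3).
# Keep the handles so the hooks can be removed and do not accumulate when the cell is re-run.
hook_handles = [
    resnet18.conv1.register_forward_hook(get_activation("conv1")),
    resnet18.layer3.register_forward_hook(get_activation("layer3")),
]

# Forward pass
with torch.no_grad():
    _ = resnet18(img_input)

# The activations are now stored in `features`, so the hooks are no longer needed
for handle in hook_handles:
    handle.remove()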

# Visualize a few feature maps
conv1_feats = features["conv1"]  # (1,C,H,W)
layer3_feats = features["layer3"]  # (1,C,H,W)

print("conv1 feature map shape:", conv1_feats.shape)
print("layer3 feature map shape:", layer3_feats.shape)

# Helper to plot a grid of feature maps

def plot_feature_maps(feats, title, n_maps=8):
    feats = feats[0]  # remove batch dim -> (C,H,W)
    n_maps = min(n_maps, feats.shape[0])
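    # squeeze=False keeps `axes` two-dimensional, so indexing also works when n_maps == 1
    fig, axes = plt.subplots(1, n_maps, figsize=(2*n_maps, 2), squeeze=False)
    for i in range(n_maps):
        ax = axes[0, i]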
        fmap = feats[i].numpy()
        fmap = (fmap - fmap.min()) / (fmap.max() - fmap.min() + 1e-8)
        ax.imshow(fmap, cmap="viridis")
        ax.axis("off")
    fig.suptitle(title)
    plt.tight_layout()
    plt.show()

# Plot early and deeper feature maps
plot_feature_maps(conv1_feats, "ResNet18 conv1 feature maps")
plot_feature_maps(layer3_feats, "ResNet18 layer3 feature maps")

**Observations:**

- Early feature maps tend to look like the output of **edge detectors** and simple blob detectors.
- Deeper feature maps are harder to interpret visually but tend to focus on **parts or high-level patterns**.

This matches the idea that CNNs build a **hierarchy of features**, from low-level to high-level, as discussed in Section 5.2.
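
The spatial sizes of the captured feature maps follow from the standard output-size formula, floor((n + 2p - k) / s) + 1. The sketch below traces a 224×224 input through the downsampling steps of ResNet18. The kernel, stride, and padding values are written from my understanding of the torchvision ResNet18 layout and are worth checking against `print(resnet18)`; each residual stage is reduced to the one operation that sets its spatial size.

```python
def conv_out_size(n, kernel, stride=1, padding=0):
    """Spatial output size of a conv or pooling layer: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * padding - kernel) // stride + 1


# (name, kernel, stride, padding) for the op that determines each stage's spatial size
stages = [
    ("conv1",   7, 2, 3),
    ("maxpool", 3, 2, 1),
    ("layer1",  3, 1, 1),
    ("layer2",  3, 2, 1),
    ("layer3",  3, 2, 1),
]

size = 224
sizes = {}
for name, kernel, stride, padding in stages:
    size = conv_out_size(size, kernel, stride, padding)
    sizes[name] = size
    print(f"{name:8s} -> {size} x {size}")
```

Under these assumptions `conv1` yields 112×112 maps and `layer3` yields 14×14 maps, which is why the deeper feature maps look much coarser when plotted.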

## 3. Training a Small CNN on a Tiny Dataset

Now we train a **small CNN** end-to-end on a subset of CIFAR10.

We will:

- Define a simple CNN with a few conv + pooling layers
- Train it on a **small subset** of CIFAR10 (10,000 randomly selected training images)
- Track training and validation accuracy over epochs (the CIFAR10 test split serves as the validation set)
- Optionally show overfitting if the network is too large relative to the data

This brings to life the training process described in Section 5.1.4.

In [None]:
# Define a simple CNN for CIFAR10-like images

class SmallCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),  # 32x16x16
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),  # 64x8x8
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 8 * 8, 128),
            nn.ReLU(),
            nn.Linear(128, num_classes),
        )
    
    def forward(self, x):
        x = self.features(x)
        x = self.classifier(x)
        return x


# Data transforms (only a random horizontal flip for now; stronger augmentation comes in Section 4)
train_transform_basic = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
])

test_transform = transforms.Compose([
    transforms.ToTensor(),
])

train_full = datasets.CIFAR10(root="./data", train=True, download=True, transform=train_transform_basic)
test_full = datasets.CIFAR10(root="./data", train=False, download=True, transform=test_transform)

# Use a smaller subset for quicker experiments
subset_size = 10000
train_subset, _ = random_split(train_full, [subset_size, len(train_full) - subset_size])

batch_size = 64
train_loader_cnn = DataLoader(train_subset, batch_size=batch_size, shuffle=True)
# Note: the CIFAR10 test split doubles as the validation set in this lab
val_loader_cnn = DataLoader(test_full, batch_size=batch_size, shuffle=False)


def train_cnn(model, train_loader, val_loader, epochs=10, lr=1e-3):
    model.to(device)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()
    train_accs, val_accs = [], []
    
    for epoch in range(epochs):
        # Train
        model.train()
        correct = 0
        total = 0
        for xb, yb in train_loader:
            xb, yb = xb.to(device), yb.to(device)
            optimizer.zero_grad()
            logits = model(xb)
            loss = criterion(logits, yb)
            loss.backward()
            optimizer.step()
            
            preds = logits.argmax(dim=1)
            correct += (preds == yb).sum().item()
            total += yb.size(0)
        train_acc = correct / total
        train_accs.append(train_acc)
        
        # Validation
        model.eval()
        correct = 0
        total = 0
        with torch.no_grad():
            for xb, yb in val_loader:
                xb, yb = xb.to(device), yb.to(device)
                logits = model(xb)
                preds = logits.argmax(dim=1)
                correct += (preds == yb).sum().item()
                total += yb.size(0)
        val_acc = correct / total
        val_accs.append(val_acc)
        
        print(f"Epoch {epoch+1}/{epochs} - train_acc={train_acc:.3f}, val_acc={val_acc:.3f}")
    
    return train_accs, val_accs


small_cnn = SmallCNN(num_classes=10)
train_accs, val_accs = train_cnn(small_cnn, train_loader_cnn, val_loader_cnn, epochs=8, lr=1e-3)

# Plot accuracy curves
plt.figure(figsize=(8, 4))
plt.plot(train_accs, label="Train accuracy")
plt.plot(val_accs, label="Val accuracy")
plt.xlabel("Epoch")
plt.ylabel("Accuracy")
plt.title("Small CNN on CIFAR10 subset")
plt.legend()
plt.tight_layout()
plt.show()

You can experiment by:

- Increasing the network size (more filters or layers) to see overfitting: training accuracy keeps rising while validation accuracy plateaus.
- Reducing the subset size to make overfitting more dramatic.

Next, we will add **data augmentation** and compare.
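
When experimenting with network size, it helps to know how many parameters the model has. The sketch below rebuilds the same layer layout as `SmallCNN` with `nn.Sequential` (so that it runs on its own) and counts parameters per tensor; the variable names are illustrative.

```python
import torch.nn as nn

# Same layer layout as SmallCNN, rebuilt here so the snippet is self-contained
model = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(64 * 8 * 8, 128), nn.ReLU(),
    nn.Linear(128, 10),
)

total = 0
for name, param in model.named_parameters():
    total += param.numel()
    print(f"{name:12s} {tuple(param.shape)!s:20s} {param.numel():>8d}")
print("total trainable parameters:", total)
```

Almost all of the roughly 545,000 parameters sit in the first `nn.Linear` layer (4096 × 128 weights plus 128 biases), so with 10,000 training images there are about 55 parameters per image. Shrinking that layer is one simple lever to try if overfitting appears.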

## 4. Data Augmentation Demo

Data augmentation is a simple but powerful way to improve **generalization** of CNNs by exposing them to varied versions of the training data.

In this section we:

- Visualize common augmentations: random horizontal flip, random crop, color jitter
- Outline how to compare the small CNN trained **with vs without** augmentation (an optional exercise: re-train and compare the accuracy curves)

This matches the practical training techniques discussed in Section 5.1.4.
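
One way to set up a with-vs-without comparison is to wrap the same training images in two datasets that differ only in their transform. The sketch below is an illustration under stated assumptions: `TransformedDataset` is a hypothetical wrapper, synthetic tensors stand in for CIFAR10, and a deterministic flip replaces the random transforms so the effect is easy to verify.

```python
import torch
from torch.utils.data import Dataset, TensorDataset, DataLoader


class TransformedDataset(Dataset):
    """Hypothetical wrapper: applies `transform` to each image of a base dataset."""

    def __init__(self, base, transform=None):
        self.base = base
        self.transform = transform

    def __len__(self):
        return len(self.base)

    def __getitem__(self, idx):
        image, label = self.base[idx]
        if self.transform is not None:
            image = self.transform(image)
        return image, label


def horizontal_flip(image):
    # Deterministic stand-in for a random flip: reverse the width axis of a (C, H, W) tensor
    return torch.flip(image, dims=[-1])


# Synthetic stand-in for a training subset: 20 RGB images of size 32x32
torch.manual_seed(0)
base = TensorDataset(torch.rand(20, 3, 32, 32), torch.randint(0, 10, (20,)))

plain_ds = TransformedDataset(base)                             # no augmentation
aug_ds = TransformedDataset(base, transform=horizontal_flip)    # augmented view of the same images

plain_loader = DataLoader(plain_ds, batch_size=8, shuffle=True)
aug_loader = DataLoader(aug_ds, batch_size=8, shuffle=True)

x_plain, y_plain = plain_ds[0]
x_aug, y_aug = aug_ds[0]
print("labels match:", y_plain.item() == y_aug.item())
print("image flipped:", torch.equal(x_aug, torch.flip(x_plain, dims=[-1])))
```

Because both datasets share the same underlying images and labels, any difference in the resulting accuracy curves can be attributed to the transform. The validation data should stay unaugmented in both runs.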

In [None]:
# Visualize augmentations on a few CIFAR10 images

augment_transform = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomCrop(32, padding=4),
    transforms.ColorJitter(brightness=0.3, contrast=0.3, saturation=0.3),
    transforms.ToTensor(),
])

cifar_train_raw = datasets.CIFAR10(root="./data", train=True, download=True, transform=transforms.ToTensor())

fig, axes = plt.subplots(3, 6, figsize=(12, 6))
axes = axes.reshape(3, 6)

for i in range(3):
    img, label = cifar_train_raw[i]
    axes[i, 0].imshow(img.permute(1, 2, 0).numpy())
    axes[i, 0].set_title("Original")
    axes[i, 0].axis("off")
    
    for j in range(1, 6):
        img_aug = augment_transform(transforms.ToPILImage()(img))
        axes[i, j].imshow(img_aug.permute(1, 2, 0).numpy())
        axes[i, j].axis("off")

plt.suptitle("Data Augmentation Examples (CIFAR10)")
plt.tight_layout()
plt.show()

# (Optional) You can re-train SmallCNN with augment_transform instead of train_transform_basic
# and compare accuracy curves manually.

## 5. Transfer Learning Exercise (ResNet18)

Transfer learning allows us to reuse a pretrained CNN as a **feature extractor** and adapt it to a new task with relatively little data.

In this section we:

- Load a pretrained ResNet18
- Replace the final fully-connected layer with a new layer for a **small number of classes** (e.g., 2 for cats vs dogs)
- Freeze the backbone parameters and train only the new head
- Track training and validation accuracy over a few epochs

You can adapt the code to any `ImageFolder` dataset on disk (e.g., `data/cats_vs_dogs/train` and `data/cats_vs_dogs/val`).

In [None]:
# Example transfer learning setup (paths assume an ImageFolder structure):
# data/
#   cats_vs_dogs/
#     train/
#       cats/...
#       dogs/...
#     val/
#       cats/...
#       dogs/...

from pathlib import Path

# Change this path to your own small dataset if you have one
root_tl = Path("data/cats_vs_dogs")

transform_tl = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

if root_tl.exists():
    train_ds_tl = datasets.ImageFolder(root_tl / "train", transform=transform_tl)
    val_ds_tl = datasets.ImageFolder(root_tl / "val", transform=transform_tl)

    train_loader_tl = DataLoader(train_ds_tl, batch_size=16, shuffle=True)
    val_loader_tl = DataLoader(val_ds_tl, batch_size=16, shuffle=False)

    num_classes_tl = len(train_ds_tl.classes)

    # Load pretrained ResNet18
    base_model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)

    # Freeze all layers
    for param in base_model.parameters():
        param.requires_grad = False

    # Replace final FC layer
    in_features = base_model.fc.in_features
    base_model.fc = nn.Linear(in_features, num_classes_tl)

    base_model = base_model.to(device)

    optimizer = torch.optim.Adam(base_model.fc.parameters(), lr=1e-3)
    criterion = nn.CrossEntropyLoss()

    def train_transfer(model, train_loader, val_loader, epochs=5):
        train_accs, val_accs = [], []
        for epoch in range(epochs):
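            # Note: model.train() also puts the frozen backbone's BatchNorm layers in training
            # mode, so their running statistics still update even though the weights are frozen
            model.train()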
            correct = 0
            total = 0
            for xb, yb in train_loader:
                xb, yb = xb.to(device), yb.to(device)
                optimizer.zero_grad()
                logits = model(xb)
                loss = criterion(logits, yb)
                loss.backward()
                optimizer.step()

                preds = logits.argmax(dim=1)
                correct += (preds == yb).sum().item()
                total += yb.size(0)
            train_acc = correct / total
            train_accs.append(train_acc)

            model.eval()
            correct = 0
            total = 0
            with torch.no_grad():
                for xb, yb in val_loader:
                    xb, yb = xb.to(device), yb.to(device)
                    logits = model(xb)
                    preds = logits.argmax(dim=1)
                    correct += (preds == yb).sum().item()
                    total += yb.size(0)
            val_acc = correct / total
            val_accs.append(val_acc)
            print(f"Epoch {epoch+1}/{epochs} - train_acc={train_acc:.3f}, val_acc={val_acc:.3f}")
        return train_accs, val_accs

    print("\nStarting transfer learning training...")
    train_accs_tl, val_accs_tl = train_transfer(base_model, train_loader_tl, val_loader_tl, epochs=3)

    plt.figure(figsize=(8, 4))
    plt.plot(train_accs_tl, label="Train accuracy")
    plt.plot(val_accs_tl, label="Val accuracy")
    plt.xlabel("Epoch")
    plt.ylabel("Accuracy")
    plt.title("Transfer Learning (ResNet18) on Small Dataset")
    plt.legend()
    plt.tight_layout()
    plt.show()
else:
    print("Transfer learning data folder not found. Please create 'data/cats_vs_dogs/train' and 'data/cats_vs_dogs/val' to run this section.")

## Summary

In this practical CNN lab we:

- Visualized how simple **3×3 kernels** (blur, sharpen, edge detectors) transform images.
- Explored **feature maps** inside a pretrained ResNet18, highlighting the hierarchy from low-level to high-level features.
- Trained a **small CNN** on a CIFAR10 subset and inspected training/validation accuracy.
- Demonstrated **data augmentation** techniques and how they change images.
- Provided a **transfer learning** template to fine-tune ResNet18 on a small custom dataset.
- Recapped important **CNN architectures** with a short quiz.

These experiments should give you stronger intuition for how CNNs operate in practice and how to apply them to real-world image tasks.