<a href="https://colab.research.google.com/github/urness/CS167Fall2025/blob/main/Day22_CNNs_Part2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# CS167: Day22
## Intro to Convolutional Neural Networks (CNNs) Part 2

#### CS167: Machine Learning, Fall 2025


## __Put the Model on Training Device (GPU or CPU)__


We want to accelerate training with a graphics processing unit (GPU). Fortunately, Colab gives us access to one. Enable it from _Runtime (or click the down arrow near RAM & DISK in the upper right) --> Change runtime type_, then select a GPU. The code below checks for CUDA only, so a TPU runtime would fall back to the CPU.

Professor Urness tested this code with the GPU option: T4

In [None]:
# check to see if torch.cuda is available, otherwise it will use CPU
import torch
import torch.nn as nn
import numpy as np
device = (
    "cuda"
    if torch.cuda.is_available()
    else "cpu"
)
print(f"Using {device} device")
# if it prints 'cuda' then colab is running using GPU device

In [None]:
import torch
import numpy as np
import random

# Set seeds for reproducibility
seed = 41  # you can choose any integer
torch.manual_seed(seed)
np.random.seed(seed)
random.seed(seed)

# If using CUDA:
torch.cuda.manual_seed(seed)
torch.cuda.manual_seed_all(seed)  # if using multi-GPU
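
To see what seeding buys us, here is a minimal sketch using only Python's built-in `random` module: resetting the generator to the same seed reproduces the same sequence of "random" numbers. The same idea applies to the `torch` and `numpy` generators seeded above, although some GPU operations may still introduce small run-to-run differences.

```python
import random

random.seed(41)
first_run = [random.randint(0, 9) for _ in range(5)]

random.seed(41)                      # reset to the same seed
second_run = [random.randint(0, 9) for _ in range(5)]

print(first_run == second_run)       # True: same seed, same sequence
```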



---



# __Load the Dataset for your CNN__

We can easily import some [built-in datasets](https://pytorch.org/vision/stable/datasets.html) from PyTorch's `torchvision.datasets` module. Here we use:
- [The Street View House Numbers (SVHN) Dataset](http://ufldl.stanford.edu/housenumbers/)
  - each image is a 32x32 color image
  - each image is associated with a label from __10 classes__ (0 through 9)
  - 73257 digits for training, 26032 digits for testing

In [None]:
# import libraries
import torch
import matplotlib.pyplot as plt
import torchvision.transforms as transforms
from torchvision.transforms import ToTensor
from torchvision.datasets import SVHN
from torch.utils.data import DataLoader

In [None]:
# Credit: ChatGPT
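# download the SVHN dataset -- this may take around 1 minute to download...
# NOTE: root below points into Google Drive, so mount Drive first (see drive.mount('/content/drive'))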
transform = transforms.ToTensor()
target_transform = lambda t: int(t) % 10  # maps label 10→0

train_set = SVHN(root="/content/drive/MyDrive/CS167/datasets", split="train", download=True,
                 transform=transform, target_transform=target_transform)
test_set  = SVHN(root="/content/drive/MyDrive/CS167/datasets", split="test", download=True,
                 transform=transform, target_transform=target_transform)
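
A quick look at what `target_transform` does: the original SVHN files store the digit 0 as class 10, while `nn.CrossEntropyLoss` expects labels in the range 0 to 9. The sketch below applies the same mapping to a hand-written list of raw labels (the list is illustrative, not read from the dataset). torchvision's `SVHN` loader is documented to perform this remapping itself, so here the transform acts as a harmless safety net.

```python
target_transform = lambda t: int(t) % 10   # same mapping as in the cell above

raw_labels = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]          # labels as stored in the original SVHN files
mapped = [target_transform(t) for t in raw_labels]
print(mapped)   # [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
```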

__Explore some sample training images__

In [None]:
# Credit: ChatGPT

# load up a batch of 32 train images
train_loader = DataLoader(train_set, batch_size=32, shuffle=True, num_workers=2)

# Helper to show a single tensor image
def show_tensor_img(img_tensor, title=None):
    # img_tensor is CxHxW in [0,1]
    img = img_tensor.permute(1, 2, 0)  # -> HxWxC
    plt.imshow(img)
    plt.axis("off")
    if title is not None:
        plt.title(title)
    plt.show()

# Show a few individual samples from the dataset
for i in range(3):
    img, label = train_set[i]
    show_tensor_img(img, title=f"Label: {label}")

# Show a small grid from a batch (with titles)
batch_imgs, batch_labels = next(iter(train_loader))  # batch_imgs: [B,3,32,32]
fig, axes = plt.subplots(2, 8, figsize=(12, 4))
axes = axes.flatten()
for ax, img, lbl in zip(axes, batch_imgs[:16], batch_labels[:16]):
    ax.imshow(img.permute(1, 2, 0))
    ax.set_title(f"{int(lbl)}")
    ax.axis("off")
plt.tight_layout()
plt.show()

# __Building Convolutional Neural Network (CNN)__

Create a network class with two methods:
- `__init__()`
- `forward()`

In general, we will follow this template for constructing other neural networks, such as MLPs, in PyTorch. Here are the useful PyTorch modules we will be using for CNN construction:
- [nn.Conv2d()](https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html)
  - applies a 2D convolution to an input volume of $(C_{in},H_{in},W_{in})$ and produces an output volume of $(C_{out},H_{out},W_{out})$ between two adjacent layers.
  - to create one, you need to provide the following:
    - __channel_dimension_of_input_layer__, i.e., $C_{in}$ (the `in_channels` argument)
    - __channel_dimension_of_output_layer__, i.e., $C_{out}$ (the `out_channels` argument)
    - __filter_size__, i.e., $F$ (the `kernel_size` argument)

  - the other two optional parameters are __stride__ ($S=1$) and __padding__ ($P=0$), with default values as shown.
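
With the symbols above (and the default dilation of 1), the spatial output size along each dimension is $H_{out} = \lfloor (H_{in} + 2P - F)/S \rfloor + 1$, and likewise for $W_{out}$. The helper below is a small sketch of that arithmetic; the function name `conv2d_output_size` and the example layer settings are ours for illustration, not part of PyTorch. It is handy for working out how many features reach the first linear layer.

```python
def conv2d_output_size(size_in: int, kernel_size: int, stride: int = 1, padding: int = 0) -> int:
    """Spatial output size of a convolution (or pooling) layer along one dimension."""
    return (size_in + 2 * padding - kernel_size) // stride + 1

# A 32x32 input through a 3x3 convolution with the default stride and padding
after_conv = conv2d_output_size(32, kernel_size=3)                      # (32 - 3) / 1 + 1 = 30
# A hypothetical 2x2 max-pool with stride 2 then halves the result
after_pool = conv2d_output_size(after_conv, kernel_size=2, stride=2)    # (30 - 2) / 2 + 1 = 15
print(after_conv, after_pool)
```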


# __EXERCISE: You fill in Step #3__
Refer to last Thursday's code. You will need to adjust the CNN to accommodate color images (3 input channels) and the 32x32 image dimensions.

# __Putting Everything Together: CNN__

In [None]:
# Step 1: load the Torch library and other utilities
#----------------------------------------------------

# import libraries
import torch
import torch.nn as nn                      # needed for nn.CrossEntropyLoss() in Step 5
import matplotlib.pyplot as plt
import torchvision.transforms as transforms
from torchvision.transforms import ToTensor
from torchvision.datasets import SVHN
from torch.utils.data import DataLoader

import time

In [None]:
# Step 2: load the dataset (you did this above -- repeated here to put all of the steps in sequence)
#--------------------------------------------------------------------------------------------------

transform = transforms.ToTensor()
target_transform = lambda t: int(t) % 10  # maps label 10→0

train_set = SVHN(root="/content/drive/MyDrive/CS167/datasets", split="train", download=True,
                 transform=transform, target_transform=target_transform)
test_set  = SVHN(root="/content/drive/MyDrive/CS167/datasets", split="test", download=True,
                 transform=transform, target_transform=target_transform)

In [None]:
###

# Step 3: Create your CNN Network with 2 conv_2d layers + 2 layers of MLP
#--------------------------------------------------------------------------------------------------

#### Your code here -- Look at last Thursday's code for inspiration
####    you will have to update some numbers for different image sizes


In [None]:
## Step 4: Your training and testing functions (updated -- now outputs accuracy to be visualized)
#--------------------------------------------------------------------------------------

def train_loop(dataloader, model, loss_fn, optimizer):
    """
    Executes one full training epoch for the given model.

    Iterates over all batches in the provided DataLoader, performing the following steps:
    - Moves input and target tensors to the selected device (CPU or GPU)
    - Computes predictions and loss for each batch
    - Performs backpropagation and optimizer updates
    - Tracks and prints training loss periodically

    Args:
        dataloader (torch.utils.data.DataLoader):
            The DataLoader providing batches of training data (inputs and labels).
        model (torch.nn.Module):
            The neural network model to be trained.
        loss_fn (torch.nn.Module or callable):
            The loss function used to compute the training loss.
        optimizer (torch.optim.Optimizer):
            The optimizer responsible for updating the model’s parameters.

    Returns:
        tuple[float, float]: The average training loss across all batches in this
        epoch, and the training accuracy as a percentage.
    """
    model.train()                   # set the model to training mode (enables dropout, etc.)

    size        = len(dataloader.dataset)
    train_loss, correct = 0, 0

    for batch, (X, y) in enumerate(dataloader):

        # compute prediction and loss
        X = X.to(device)                  # send data to the GPU device (if available)
        y = y.to(device)
        pred = model(X)
        loss = loss_fn(pred, y)

        # Backpropagation
        loss.backward()      # compute gradients
        optimizer.step()     # apply updates
        optimizer.zero_grad()# clear old gradients

        train_loss += loss.item()
        correct += (pred.argmax(1) == y).type(torch.float).sum().item()

        if batch % 100 == 0:
            loss, current = loss.item(), (batch + 1) * len(X)
            print(f"loss: {loss:>7f}  [{current:>5d}/{size:>5d}]")
    correct /= size

    return train_loss/len(dataloader), 100*correct

def test_loop(dataloader, model, loss_fn):
    """
    Evaluates the model’s performance on a test (or validation) dataset.

    Runs a forward pass over all batches in the provided DataLoader with gradient
    computation disabled, accumulating loss and accuracy metrics.

    Args:
        dataloader (torch.utils.data.DataLoader):
            The DataLoader providing batches of test or validation data.
        model (torch.nn.Module):
            The trained neural network model to evaluate.
        loss_fn (torch.nn.Module or callable):
            The loss function used to compute the evaluation loss.

    Returns:
        tuple[float, float]: The average loss over all test batches, and the test
        accuracy as a percentage.

    Prints:
        Accuracy (% of correct predictions) and average test loss.
    """

    model.eval()                    # set the model to evaluation mode for best practices

    size        = len(dataloader.dataset)
    num_batches = len(dataloader)
    test_loss, correct = 0, 0

    # Evaluating the model with torch.no_grad() ensures that no gradients are computed during test mode
    # also serves to reduce unnecessary gradient computations and memory usage for tensors with requires_grad=True
    with torch.no_grad():
        for X, y in dataloader:

            X = X.to(device)                     # send data to the GPU device (if available)
            y = y.to(device)
            pred = model(X)
            test_loss += loss_fn(pred, y).item()
            correct += (pred.argmax(1) == y).type(torch.float).sum().item()

    test_loss /= num_batches
    correct /= size
    print(f"Test Error: \n Accuracy: {(100*correct):>0.1f}%, Avg loss: {test_loss:>8f} \n")
    return test_loss, 100*correct
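
Both loops compute accuracy the same way: `pred.argmax(1)` picks the class with the largest score in each row, and the result is compared with the true labels. Here is that step in isolation, using made-up scores for a batch of 4 samples and 3 classes (the numbers are illustrative only).

```python
import torch

# Made-up logits for a batch of 4 samples and 3 classes (illustrative values only)
pred = torch.tensor([[2.0, 0.1, 0.1],
                     [0.1, 3.0, 0.2],
                     [0.5, 0.2, 0.1],
                     [0.1, 0.2, 0.9]])
y = torch.tensor([0, 1, 2, 2])               # made-up ground-truth labels

pred_labels = pred.argmax(1)                 # index of the largest score in each row
correct = (pred_labels == y).type(torch.float).sum().item()
accuracy = 100 * correct / len(y)
print(pred_labels.tolist(), accuracy)        # [0, 1, 0, 2] 75.0
```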


In [None]:

# Step 5: prepare the DataLoader and select your optimizer and set the parameters for learning the model from DataLoader
#------------------------------------------------------------------------------------------------------------------------------
cnn_model = SimpleCNNv1() ## model Class name here
cnn_model.to(device)      ## device should have been determined earlier (at top of notebook)
learning_rate = 0.001
batch_size_val = 64
epochs = 10
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(cnn_model.parameters(), lr=learning_rate)

train_dataloader = DataLoader(train_set, batch_size=batch_size_val)
test_dataloader = DataLoader(test_set, batch_size=batch_size_val)


train_losses = []
test_losses  = []
train_accuracy = []
test_accuracy  = []
start_time   = time.time()
for t in range(epochs):
    print(f"Epoch {t+1}\n-------------------------------")
    avg_train_loss, train_acc = train_loop(train_dataloader, cnn_model, loss_fn, optimizer)
    avg_test_loss, test_acc  = test_loop(test_dataloader, cnn_model, loss_fn)
    train_losses.append(avg_train_loss)
    test_losses.append(avg_test_loss)
    train_accuracy.append(train_acc)
    test_accuracy.append(test_acc)

print("Done!")

print("Total execution time: %.3f sec" %( (time.time()-start_time)) )
print("Total execution time: %.3f hrs" %( (time.time()-start_time)/3600) )

print(cnn_model.__class__.__name__, " model has been trained!")


In [None]:
# Step 6: Visualizing the accuracy curves
#------------------------------------------------------------------------------------------------------------------------------
plt.plot(range(1,epochs+1), train_accuracy)
plt.plot(range(1,epochs+1), test_accuracy)
plt.title('Model accuracy after each epoch')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'test'])
plt.show()


# __Try other networks?__
What about a kernel size of 5?
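
Keep in mind that changing the kernel size changes the size of the feature maps, so the `in_features` of the first `nn.Linear` layer has to be recomputed. The sketch below pushes one fake 32x32 color image through a few stand-alone `nn.Conv2d` layers to compare the output shapes; `out_channels=16` is an arbitrary choice for illustration.

```python
import torch
import torch.nn as nn

x = torch.randn(1, 3, 32, 32)   # one fake 32x32 color image: [batch, channels, height, width]

conv3 = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3)
conv5 = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=5)
conv5_padded = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=5, padding=2)

print(conv3(x).shape)          # height/width: 32 - 3 + 1 = 30
print(conv5(x).shape)          # height/width: 32 - 5 + 1 = 28
print(conv5_padded(x).shape)   # padding=2 keeps height/width at 32
```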



---



# __Download the Dataset for AlexNet__

- __Bike-Cat-Dog-Person Dataset__
  - Download `bcdp_v1.zip` from Blackboard (under the dataset directory)
    - Unzip the folder
    - Upload the folder (bcdp_v1) to your Google Drive
  - Each image size: __100x100x3__
    - Note that these are color images
  - Each image is associated with a label from __4 classes__
  - Training set of __1500__ examples and test set of __300__ examples



# __Prepare Your Data for Training__


In [None]:
from google.colab import drive
drive.mount('/content/drive')

In [None]:
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision import transforms

# For fine-tuning with an AlexNet/VGG/ResNet architecture that has been pre-trained using the ImageNet dataset, you need to normalize
# each image with the given mean and standard deviation.

transform = transforms.Compose([
    transforms.Resize((227, 227)),                     # Resize all images to 227x227 pixels (AlexNet input size)
    transforms.ToTensor(),                             # Convert image to a PyTorch tensor and scale pixel values to [0, 1]
    transforms.Normalize((0.485, 0.456, 0.406),        # Subtract ImageNet mean for (R, G, B)
                         (0.229, 0.224, 0.225))        # Divide by ImageNet std for (R, G, B)
])
train_dir       = '/content/drive/MyDrive/CS167/datasets/bcdp_v1/train'
test_dir        = '/content/drive/MyDrive/CS167/datasets/bcdp_v1/test'

train_dataset   = datasets.ImageFolder(train_dir, transform=transform)
test_dataset    = datasets.ImageFolder(test_dir,  transform=transform)

n_train         = len(train_dataset)
n_test          = len(test_dataset)

number_of_classes = 4

print("Size of train set:", n_train)
print("Size of test set:",  n_test)
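
`datasets.ImageFolder` treats each subfolder of the root directory as one class and assigns integer labels according to the sorted order of the folder names. The pure-Python sketch below mimics that assignment; the folder names are assumed to match the four class labels used elsewhere in this notebook.

```python
# Folder names as they might appear under .../bcdp_v1/train, in arbitrary order (assumed names)
folder_names = ["person", "dog", "bike", "cat"]

classes = sorted(folder_names)                                  # alphabetical order
class_to_idx = {name: idx for idx, name in enumerate(classes)}  # integer label for each class

print(classes)        # ['bike', 'cat', 'dog', 'person']
print(class_to_idx)   # {'bike': 0, 'cat': 1, 'dog': 2, 'person': 3}
```

On the real dataset, the same information is available as `train_dataset.classes` and `train_dataset.class_to_idx`.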

# __Building Convolutional Neural Network (CNN)__

Create a network class with two methods:
- `__init__()`
- `forward()`


In [None]:
import torch
import torch.nn as nn
from torchvision import models
from torchvision.models import alexnet, AlexNet_Weights
import pdb

# You can give any name to your new network, e.g., AlexNet.
# You should load the pretrained AlexNet model from torchvision.models.
# This model was trained on over a million real-world images from ImageNet.
# The idea is to bootstrap our CNN network weights with pretrained weights.
# Our model will converge to a solution faster.
# This training process is called 'fine-tuning.'


class AlexNet(nn.Module):
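    def __init__(self, num_classes, pretrained=True):
        super(AlexNet, self).__init__()
        # load ImageNet weights only when pretrained=True; otherwise start from random weights
        weights = AlexNet_Weights.DEFAULT if pretrained else None
        net = models.alexnet(weights=weights)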

        # retain convolutional and pooling layers from the pretrained AlexNet
        self.features = net.features
        self.avgpool = net.avgpool

        # replace the original classifier with a new one for custom output classes
        self.classifier = nn.Sequential(
            nn.Linear(256 * 6 * 6, 128),
            nn.ReLU(True),
            nn.Dropout(),
            nn.Linear(128, num_classes)
        )

    def forward(self, x):
        x = self.features(x)         # extract convolutional feature maps
        x = self.avgpool(x)          # reduce spatial dimensions to fixed 6x6
        x = torch.flatten(x, 1)      # flatten to a 1D vector per image (batch stays intact)
        x = self.classifier(x)       # apply fully connected layers for classification
        return x                     # output class scores



In [None]:
# check the structures of our cnn (based on AlexNet)

cnn_model = AlexNet(number_of_classes)
cnn_model.to(device)
print(cnn_model)
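
The convolutional layers arrive with pretrained weights, but the replacement classifier starts from random weights and has to be learned from our data. As a rough sense of scale, the sketch below rebuilds just that classifier head on its own (no download needed) and counts its parameters. The input size 256 * 6 * 6 = 9216 is the flattened output of the average-pooling layer.

```python
import torch.nn as nn

num_classes = 4   # bike, cat, dog, person

# Same layer sizes as the replacement classifier above, built on its own
classifier = nn.Sequential(
    nn.Linear(256 * 6 * 6, 128),
    nn.ReLU(True),
    nn.Dropout(),
    nn.Linear(128, num_classes),
)

n_params = sum(p.numel() for p in classifier.parameters())
print(n_params)   # (9216*128 + 128) + (128*4 + 4) = 1180292
```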

# __Putting Everything Together for AlexNet__

__Putting Everything Together using our AlexNet Network on our 4-class image recognition Dataset__

In [None]:
# Step 1: load the Torch library and other utilities
#----------------------------------------------------

import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader
from torchvision import transforms, datasets
from torchvision import models
from sklearn.metrics import confusion_matrix
from sklearn.metrics import ConfusionMatrixDisplay
from torchvision.models import alexnet, AlexNet_Weights
import matplotlib.pyplot as plt
import pandas
import time
import numpy as np
import os
import pdb

# check to see if torch.cuda is available, otherwise it will use CPU
device = (
    "cuda"
    if torch.cuda.is_available()
    else "cpu"
)
print(f"Using {device} device")

In [None]:
# Step 2: load the dataset (as we did above)
#--------------------------------------------------------------------------------------------------
# For fine-tuning with an AlexNet/VGG/ResNet architecture that has been pre-trained using the ImageNet dataset, you need to normalize
# each image with the given mean and standard deviation.

transform = transforms.Compose([
    transforms.Resize((227, 227)),                     # Resize all images to 227x227 pixels (AlexNet input size)
    transforms.ToTensor(),                             # Convert image to a PyTorch tensor and scale pixel values to [0, 1]
    transforms.Normalize((0.485, 0.456, 0.406),        # Subtract ImageNet mean for (R, G, B)
                         (0.229, 0.224, 0.225))        # Divide by ImageNet std for (R, G, B)
])
train_dir       = '/content/drive/MyDrive/CS167/datasets/bcdp_v1/train'
test_dir        = '/content/drive/MyDrive/CS167/datasets/bcdp_v1/test'

train_dataset   = datasets.ImageFolder(train_dir, transform=transform)
test_dataset    = datasets.ImageFolder(test_dir,  transform=transform)

n_train         = len(train_dataset)
n_test          = len(test_dataset)

number_of_classes = 4

print("Size of train set:", n_train)
print("Size of test set:",  n_test)

In [None]:
# Step 3: Use the AlexNet from above
#--------------------------------------------------------------------------------------------------

# You can give any name to your new network, e.g., AlexNet.
# You should load the pretrained AlexNet model from torchvision.models.
# This model was trained on over a million real-world images from ImageNet.
# The idea is to bootstrap our CNN network weights with pretrained weights.
# Our model will converge to a solution faster.
# This training process is called 'fine-tuning.'

class AlexNet(nn.Module):
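    def __init__(self, num_classes, pretrained=True):
        super(AlexNet, self).__init__()
        # load ImageNet weights only when pretrained=True; otherwise start from random weights
        weights = AlexNet_Weights.IMAGENET1K_V1 if pretrained else None
        net = alexnet(weights=weights)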

        # retain convolutional and pooling layers from the pretrained AlexNet
        self.features = net.features
        self.avgpool = net.avgpool

        # replace the original classifier with a new one for custom output classes
        self.classifier = nn.Sequential(
            nn.Linear(256 * 6 * 6, 128),
            nn.ReLU(True),
            nn.Dropout(),
            nn.Linear(128, num_classes)
        )

    def forward(self, x):
        x = self.features(x)         # extract convolutional feature maps
        x = self.avgpool(x)          # reduce spatial dimensions to fixed 6x6
        x = torch.flatten(x, 1)      # flatten to a 1D vector per image (batch stays intact)
        x = self.classifier(x)       # apply fully connected layers for classification
        return x                     # output class scores



In [None]:
# Step 4: Your training and testing functions UPDATED TO OUTPUT TESTING CONFUSION MATRIX
#--------------------------------------------------------------------------------------

def train_loop(dataloader, model, loss_fn, optimizer):

    size            = len(dataloader.dataset)
    num_batches     = len(dataloader)

    model.train()                   # set the model to training mode for best practices

    train_loss      = 0
    correct         = 0
    train_pred_all  = []
    train_y_all     = []

    for batch, (X, y) in enumerate(dataloader):
        # compute prediction and loss

        # ----------- putting data into gpu or sticking to cpu ----------
        X = X.to(device)     # send data to the GPU device (if available)
        y = y.to(device)
        # -----------                                         ----------

        pred = model(X)
        loss = loss_fn(pred, y)

        # Backpropagation
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

        train_loss += loss.item()

        if batch % 10 == 0:
            loss, current = loss.item(), (batch + 1) * len(X)
            print(f"loss: {loss:>7f}  [{current:>5d}/{size:>5d}]")

        # compute the accuracy
        pred_prob     = softmax(pred)
        pred_y        = torch.max(pred_prob, 1)[1]
        train_correct = (pred_y == y).sum()
        correct      += train_correct.data

        train_pred_all.append(pred_y) # save predicted output for the current batch
        train_y_all.append(y)         # save ground truth for the current batch

    #pdb.set_trace()
    train_pred_all = torch.cat(train_pred_all) # need to concatenate batch-wise appended items
    train_y_all = torch.cat(train_y_all)

    train_loss = train_loss/num_batches
    correct    = correct.cpu().numpy()/size

    print('Confusion matrix for training set:\n', confusion_matrix(train_y_all.cpu().data, train_pred_all.cpu().data))
    return train_loss, 100*correct


def test_loop(dataloader, model, loss_fn):

    model.eval()                    # set the model to evaluation mode for best practices

    size                = len(dataloader.dataset)
    num_batches         = len(dataloader)
    test_loss, correct  = 0, 0
    test_pred_all       = []
    test_y_all          = []

    # Evaluating the model with torch.no_grad() ensures that no gradients are computed during test mode
    # also serves to reduce unnecessary gradient computations and memory usage for tensors with requires_grad=True
    with torch.no_grad():

        for X, y in dataloader:

            # ----------- putting data into gpu or sticking to cpu ----------
            X = X.to(device)     # send data to the GPU device (if available)
            y = y.to(device)
            # -----------                                         ----------

            pred = model(X)
            test_loss += loss_fn(pred, y).item()

            # calculate probability and save the outputs for confusion matrix computation
            pred_prob     = softmax(pred)
            pred_y        = torch.max(pred_prob, 1)[1]
            test_correct  = (pred_y == y).sum()
            correct      += test_correct.data

            test_pred_all.append(pred_y) # save predicted output for the current batch
            test_y_all.append(y)         # save ground truth for the current batch


    #pdb.set_trace()
    test_pred_all = torch.cat(test_pred_all)
    test_y_all = torch.cat(test_y_all)

    test_loss = test_loss/num_batches
    correct   = correct.cpu().numpy()/size
    conf_mat  = confusion_matrix(test_y_all.cpu().data, test_pred_all.cpu().data)  # compute once, reuse below
    print(f"Test Performance: \n Accuracy: {(100*correct):>0.1f}%, Avg loss: {test_loss:>8f} \n")
    print('Confusion matrix for test set:\n', conf_mat)
    return test_loss, 100*correct, conf_mat


In [None]:
# Step 5: prepare the DataLoader and select your optimizer and set the hyper-parameters for learning the model from DataLoader
#------------------------------------------------------------------------------------------------------------------------------

cnn_model = AlexNet(number_of_classes)
cnn_model.to(device)
print(cnn_model)


learning_rate     = 1e-4
batch_size_val    = 32
epochs            = 10
loss_fn           = nn.CrossEntropyLoss()
optimizer         = torch.optim.Adam(cnn_model.parameters(), lr=learning_rate)
softmax           = nn.Softmax(dim=1) # for calculating the probability of the network prediction. It is used in train_loop() and test_loop().

train_dataloader  = DataLoader(train_dataset, batch_size=batch_size_val, shuffle=True)  # shuffle the images in training set during fine-tuning
test_dataloader   = DataLoader(test_dataset, batch_size=batch_size_val,  shuffle=False) # you don't need to shuffle test images as they are not used during training


train_losses = []
test_losses  = []
train_accuracies = []
test_accuracies = []
start_time = time.time()
for t in range(epochs):
    print(f"Epoch {t+1}\n-------------------------------")
    avg_train_loss, train_accuracy                    = train_loop(train_dataloader, cnn_model, loss_fn, optimizer)
    avg_test_loss, test_accuracy, conf_matrix_test    = test_loop(test_dataloader,   cnn_model, loss_fn)
    # save the losses and accuracies
    train_losses.append(avg_train_loss)
    test_losses.append(avg_test_loss)
    train_accuracies.append(train_accuracy)
    test_accuracies.append(test_accuracy)

print("AlexNet model has been fine-tuned!")
print("Total fine-tuning time: %.3f sec" %( (time.time()-start_time)) )
print("Total fine-tuning time: %.3f hrs" %( (time.time()-start_time)/3600) )


In [None]:
# visualizing the accuracy curves

plt.plot(range(1,epochs+1), train_accuracies)
plt.plot(range(1,epochs+1), test_accuracies)
plt.title('Model accuracies after each epoch')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['train', 'test'])
plt.show()

In [None]:
# visualizing the confusion matrix on the test set after the final epoch
dataset_labels = ['bike', 'cat', 'dog', 'person'] # datasets.ImageFolder(): assigns labels according to the sorted order of the folder names

# option #1: text
print(pandas.DataFrame(conf_matrix_test, index = dataset_labels, columns = dataset_labels))

# option #2: prettify
displ = ConfusionMatrixDisplay(confusion_matrix=conf_matrix_test, display_labels=dataset_labels)
displ.plot(cmap="Blues")
plt.show()
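
To read the confusion matrix: scikit-learn puts the true classes on the rows and the predicted classes on the columns, so the diagonal holds the correct predictions. The sketch below uses a made-up 4x4 matrix (not results from this model) to show how overall accuracy and per-class accuracy fall out of it.

```python
labels = ["bike", "cat", "dog", "person"]

# Made-up confusion matrix (rows = true class, columns = predicted class)
conf = [
    [70,  1,  2,  2],
    [ 0, 60, 14,  1],
    [ 1, 12, 61,  1],
    [ 3,  0,  2, 70],
]

total = sum(sum(row) for row in conf)
correct = sum(conf[i][i] for i in range(len(labels)))
overall_accuracy = 100 * correct / total

# Per-class accuracy (recall): correct predictions for a class / all true samples of that class
per_class = {name: 100 * conf[i][i] / sum(conf[i]) for i, name in enumerate(labels)}

print(f"overall: {overall_accuracy:.1f}%")
for name, acc in per_class.items():
    print(f"{name}: {acc:.1f}%")
```

In this made-up example, most of the errors are cats predicted as dogs and dogs predicted as cats, which is the kind of pattern the off-diagonal cells make easy to spot.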

Now let's demo our model on a random test image.

Credit: The following code was generated by ChatGPT

In [None]:
import torch
import matplotlib.pyplot as plt

# --- 1) Get one random sample from the test set ---
cnn_model.eval()
idx = torch.randint(len(test_dataset), (1,)).item()
img_tensor, y_true = test_dataset[idx]         # img_tensor is already transformed (resized + normalized)
x = img_tensor.unsqueeze(0).to(device)         # shape [1, 3, 227, 227]

# --- 2) Predict ---
with torch.no_grad():
    logits = cnn_model(x)
    probs  = torch.softmax(logits, dim=1).squeeze(0).cpu()
pred_idx = int(torch.argmax(probs))
conf     = float(probs[pred_idx])

# --- 3) Prepare image for display (un-normalize for viewing) ---
# NOTE: These must match the values passed to transforms.Normalize when the dataset was loaded,
# i.e., the standard ImageNet stats: mean=(0.485,0.456,0.406), std=(0.229,0.224,0.225).
# If you change the Normalize values, update these to match.
mean = torch.tensor([0.485, 0.456, 0.406]).view(3,1,1)
std  = torch.tensor([0.229, 0.224, 0.225]).view(3,1,1)

img_disp = img_tensor.cpu() * std + mean       # unnormalize
img_disp = img_disp.clamp(0,1)                 # keep in [0,1]
img_disp = img_disp.permute(1,2,0).numpy()     # CHW -> HWC

# --- 4) Labels and thumbnail plot ---
class_names = test_dataset.classes             # taken from ImageFolder folder names (sorted)
pred_name   = class_names[pred_idx]
true_name   = class_names[y_true]

plt.figure(figsize=(2.4, 2.4))                 # "thumbnail"-sized
plt.imshow(img_disp)
plt.axis("off")
plt.title(f"Pred: {pred_name} ({conf:.2f})\nTrue: {true_name}")
plt.tight_layout()
plt.show()
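
Why un-normalize before plotting? `transforms.Normalize` computes `(x - mean) / std` for each channel, which pushes pixel values outside the [0, 1] range that `plt.imshow` expects. Displaying the image means inverting that with `x * std + mean`. The sketch below checks the round trip on a single made-up RGB pixel, using the ImageNet statistics passed to `transforms.Normalize` in this notebook.

```python
# ImageNet per-channel statistics, as passed to transforms.Normalize in this notebook
mean = (0.485, 0.456, 0.406)
std = (0.229, 0.224, 0.225)

pixel = (0.50, 0.25, 0.75)   # a made-up RGB pixel with values in [0, 1]

# Forward: what Normalize does to each channel
normalized = tuple((p - m) / s for p, m, s in zip(pixel, mean, std))

# Inverse: undo it before displaying the image
restored = tuple(n * s + m for n, m, s in zip(normalized, mean, std))

print(normalized)
print(restored)
```

The restored values match the original pixel up to floating-point rounding. Using the wrong mean or std in the inverse step would shift the colors of the displayed image.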

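# __Exercise__

Fine-tune a model with the `bcdfh_v1.zip` (bike, car, dog, flower, horse) dataset. Since it has 5 classes, remember to update `number_of_classes`, the dataset paths, and `dataset_labels`.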

