### **Project Title: Fine-Tuning ResNet50 on Oxford-IIIT Pet Dataset**  
**Description:**  
This project fine-tunes a pre-trained ResNet50 model on the Oxford-IIIT Pet dataset. The dataset contains 37 cat and dog breeds, and the goal is to classify each image into the correct breed. Using transfer learning, the ResNet50 model, pre-trained on ImageNet, is adapted to this task by freezing the backbone and replacing the final fully connected layer with a dropout layer followed by a 37-output linear layer, so only the new head is trained. The notebook covers data preprocessing, model training, evaluation, and predictions on sample images.

Download the helper modules from mrdbourke's GitHub repository (pytorch-deep-learning).
The rest of the notebook imports from the `going_modular` folder.

In [1]:
import os

print("[INFO] Checking if required files and folders already exist.")

# Check whether the 'going_modular' folder and the 'helper_functions.py' file already exist
if os.path.exists("going_modular") and os.path.exists("going_modular/helper_functions.py"):
    print("[INFO] Required files and folders already exist. Skipping download.")
else:
    print("[INFO] Downloading module from GitHub by mrdbourke.")
    # Clone repository
    !git clone https://github.com/mrdbourke/pytorch-deep-learning
    
    # Move 'going_modular' to the current directory
    !move pytorch-deep-learning\going_modular .
    
    # Move 'helper_functions.py' to the 'going_modular' folder
    !move pytorch-deep-learning\helper_functions.py going_modular
    
    # Remove the cloned repository
    !rmdir /s /q pytorch-deep-learning
    
    # Remove the 'going_modular/models' folder if it exists
    # (the cloned repository was already removed above, so it is not removed a second time)
    !rmdir /s /q going_modular\models
    print("[INFO] Download and setup completed.")
    


[INFO] Checking if required files and folders already exist.
[INFO] Downloading module from GitHub by mrdbourke.


Cloning into 'pytorch-deep-learning'...

        1 dir(s) moved.
        1 file(s) moved.
[INFO] Download and setup completed.


The system cannot find the file specified.
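
The shell commands in the cell above use Windows `cmd` syntax (`move`, `rmdir /s /q`), so they would not run unchanged on Linux, macOS, or Colab. A portable variant could do the file moves with `shutil` instead. The sketch below is an assumption about how that might look, not the notebook's actual cell; `install_helpers` is a hypothetical name, and the `git clone` step is left to the caller.

```python
import shutil
from pathlib import Path


def install_helpers(clone_dir: Path, dest_dir: Path) -> None:
    """Move going_modular/ and helper_functions.py out of a cloned repo (hypothetical helper)."""
    target = dest_dir / "going_modular"
    # Move the 'going_modular' folder into the working directory
    shutil.move(str(clone_dir / "going_modular"), str(target))
    # Move 'helper_functions.py' into the 'going_modular' folder
    shutil.move(str(clone_dir / "helper_functions.py"), str(target / "helper_functions.py"))
    # Remove the bundled 'models' folder if present, then the clone itself
    shutil.rmtree(target / "models", ignore_errors=True)
    shutil.rmtree(clone_dir, ignore_errors=True)
```

After cloning the repository, `install_helpers(Path("pytorch-deep-learning"), Path("."))` would produce the folder layout that the import cell expects.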


Import libraries

In [1]:
import os
import torch
import torchvision
import matplotlib.pyplot as plt
import pathlib
import torchvision.transforms as transforms
import pandas as pd
from going_modular.going_modular import data_setup, engine, utils
from going_modular.helper_functions import download_data, set_seeds, plot_loss_curves
from torch import nn
from torchinfo import summary
from pathlib import Path
from PIL import Image
from timeit import default_timer as timer
from tqdm.auto import tqdm
from typing import List, Dict, Tuple




Setup Device

In [2]:
device = "cuda" if torch.cuda.is_available() else "cpu"
device

'cuda'

Create a ResNet50 model

In [3]:

def create_resnet50_model(num_classes:int=3,
                          seed:int=42,
                          dropout_rate: float = 0.3):  # dropout probability used in the new classifier head
  # 1. Setup pretrained ResNet50 weights
  weights = torchvision.models.ResNet50_Weights.DEFAULT  # Use the default weights
  
  # 2. Get ResNet50 transforms
  transforms = weights.transforms()
  
  # 3. Setup pretrained model instance
  model = torchvision.models.resnet50(weights=weights)  # Use the pretrained ResNet50
  
  # 4. Freeze the base layers in the model (this will stop all layers from training)
  for param in model.parameters():
    param.requires_grad = False
  
  # 5. Change classifier head with random seed for reproducibility
  torch.manual_seed(seed)
  
  # Add Dropout and set the number of outputs to num_classes
  model.fc = nn.Sequential(
    nn.Dropout(p=dropout_rate),  # dropout before the linear layer
    nn.Linear(in_features=model.fc.in_features, out_features=num_classes, bias=True)
  )
  
  return model, transforms




Call create_resnet50_model() for the 37 pet classes

In [4]:

resnet50_pets, resnet50_transforms = create_resnet50_model(num_classes=37,dropout_rate=0.3)

In [5]:
resnet50_transforms

ImageClassification(
    crop_size=[224]
    resize_size=[232]
    mean=[0.485, 0.456, 0.406]
    std=[0.229, 0.224, 0.225]
    interpolation=InterpolationMode.BILINEAR
)
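
The transform printed above resizes each image to 232 px, center-crops it to 224 px, rescales pixel values to [0, 1], and then normalizes each channel with the ImageNet mean and standard deviation. As a rough illustration of that last step only, here is the per-channel arithmetic in plain Python; `normalize_pixel` is a hypothetical helper, not part of torchvision.

```python
MEAN = (0.485, 0.456, 0.406)  # per-channel mean from the transform above
STD = (0.229, 0.224, 0.225)   # per-channel standard deviation from the transform above


def normalize_pixel(rgb):
    """Apply (value - mean) / std per channel to an RGB pixel already scaled to [0, 1]."""
    return tuple((v - m) / s for v, m, s in zip(rgb, MEAN, STD))
```

A pixel equal to the channel means maps to `(0.0, 0.0, 0.0)`, and a pure white pixel `(1, 1, 1)` maps to roughly `(2.25, 2.43, 2.64)`.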

# Print ResNet50 model summary (uncomment for full output)

In [6]:
summary(resnet50_pets,
        input_size=(1, 3, 224, 224),
        col_names=["input_size", "output_size", "num_params", "trainable"],
        col_width=20,
        row_settings=["var_names"])

Layer (type (var_name))                  Input Shape          Output Shape         Param #              Trainable
ResNet (ResNet)                          [1, 3, 224, 224]     [1, 37]              --                   Partial
├─Conv2d (conv1)                         [1, 3, 224, 224]     [1, 64, 112, 112]    (9,408)              False
├─BatchNorm2d (bn1)                      [1, 64, 112, 112]    [1, 64, 112, 112]    (128)                False
├─ReLU (relu)                            [1, 64, 112, 112]    [1, 64, 112, 112]    --                   --
├─MaxPool2d (maxpool)                    [1, 64, 112, 112]    [1, 64, 56, 56]      --                   --
├─Sequential (layer1)                    [1, 64, 56, 56]      [1, 256, 56, 56]     --                   False
│    └─Bottleneck (0)                    [1, 64, 56, 56]      [1, 256, 56, 56]     --                   False
│    │    └─Conv2d (conv1)               [1, 64, 56, 56]      [1, 64, 56, 56]      (4,096)              False
│    │    

Create DataLoaders for resnet50 feature extractor

In [29]:
data_pet_path = Path("./img")
train_dir = data_pet_path / "train"
test_dir = data_pet_path / "test"
train_dir,test_dir

(WindowsPath('img/train'), WindowsPath('img/test'))

Check the number of CPU cores
The count is used to choose `num_workers` for the DataLoaders

In [8]:
import os
os_cpu_count = os.cpu_count()
os_cpu_count

16
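
Using every core as a DataLoader worker is one option, but each worker is a separate process, so it can be worth capping the value on machines with many cores. A minimal sketch of such a cap, assuming a hypothetical helper `pick_num_workers` and an arbitrary ceiling of 8:

```python
import os


def pick_num_workers(max_workers: int = 8) -> int:
    """Return a worker count between 1 and max_workers based on the CPU count."""
    cpu_count = os.cpu_count() or 1  # os.cpu_count() can return None
    return max(1, min(cpu_count, max_workers))
```

The ceiling is a judgment call; the notebook itself uses all 16 cores.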

Setup resnet50 dataloaders

In [30]:
# Setup resnet50 dataloaders
from going_modular.going_modular import data_setup

train_dataloader_resnet50_pet_pretrained, test_dataloader_resnet50_pet_pretrained, class_names = data_setup.create_dataloaders(
    train_dir=train_dir,
    test_dir=test_dir,
    transform=resnet50_transforms,
    batch_size=32,  # Could increase if we had more samples, such as here: https://arxiv.org/abs/2205.01580 (there are other improvements there too...)
    num_workers=16)  # matches the CPU core count found above


Check the number of batches in each DataLoader
(`train_dataloader_resnet50_pet_pretrained` and `test_dataloader_resnet50_pet_pretrained`) and the first ten class names

In [31]:
len(train_dataloader_resnet50_pet_pretrained),len(test_dataloader_resnet50_pet_pretrained),class_names[:10]

(149,
 47,
 ['Abyssinian',
  'Bengal',
  'Birman',
  'Bombay',
  'British_Shorthair',
  'Egyptian_Mau',
  'Maine_Coon',
  'Persian',
  'Ragdoll',
  'Russian_Blue'])
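
The length of a DataLoader is its number of batches, not its number of images. With `batch_size=32`, and assuming the last partial batch is kept, that is the image count divided by 32 and rounded up. A short sketch of the arithmetic follows; the 1,479 test images are taken from the prediction count reported in the Model Summary, and `num_batches` is a hypothetical helper.

```python
import math


def num_batches(num_images: int, batch_size: int = 32) -> int:
    """Number of batches when the final partial batch is kept."""
    return math.ceil(num_images / batch_size)


print(num_batches(1479))  # 47, matching the test DataLoader length above
```

By the same arithmetic, 149 training batches correspond to between 4,737 and 4,768 training images.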

Training the ResNet50 feature extractor
**`lr=1e-3` (0.001)** is the default learning rate of `torch.optim.Adam` and a reasonable starting value here, since only the newly added classifier head is being trained.

In [None]:
# Create optimizer and loss function
optimizer = torch.optim.Adam(params=resnet50_pets.parameters(),
                             lr=1e-3)
loss_fn = torch.nn.CrossEntropyLoss()

# Train the classifier head of the pretrained resnet50 feature extractor model
set_seeds()
pretrained_resnet50_pets_results = engine.train(model=resnet50_pets,
                                      train_dataloader=train_dataloader_resnet50_pet_pretrained,
                                      test_dataloader=test_dataloader_resnet50_pet_pretrained,
                                      optimizer=optimizer,
                                      loss_fn=loss_fn,
                                      epochs=5,
                                      device=device)

 20%|██        | 1/5 [02:59<11:59, 179.99s/it]

Epoch: 1 | train_loss: 1.7972 | train_acc: 0.6758 | test_loss: 0.8676 | test_acc: 0.8863


 40%|████      | 2/5 [06:03<09:06, 182.05s/it]

Epoch: 2 | train_loss: 0.5926 | train_acc: 0.9121 | test_loss: 0.5158 | test_acc: 0.9076


Plot loss curves

In [None]:
plot_loss_curves(pretrained_resnet50_pets_results)

Saving resnet50 model

In [None]:
utils.save_model(model=resnet50_pets,
                 target_dir="models",
                 model_name="pretrained_resnet50_pets.pth")

Get the model size in bytes and convert to megabytes

In [None]:
pretrained_resnet50_model_size = Path("models/pretrained_resnet50_pets.pth").stat().st_size / (1024 * 1024)
print(f"Pretrained Resnet50 feature extractor model size: {round(pretrained_resnet50_model_size, 2)} MB")

Count number of parameters in model

In [None]:
resnet50_pets_total_params = sum(torch.numel(param) for param in resnet50_pets.parameters())
resnet50_pets_total_params
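
The count can be cross-checked by hand. torchvision's ResNet50 has 25,557,032 parameters with its original 1000-class head, and swapping that head for a 37-class linear layer changes only the `fc` term (Dropout adds no parameters). The sketch below redoes the arithmetic in plain Python; the 4-bytes-per-parameter size estimate assumes float32 weights and ignores buffers and file overhead.

```python
IMAGENET_RESNET50_PARAMS = 25_557_032  # torchvision resnet50 with 1000 output classes
FC_IN_FEATURES = 2048                  # input width of ResNet50's final linear layer


def linear_params(in_features: int, out_features: int) -> int:
    """Weights plus biases of a fully connected layer."""
    return in_features * out_features + out_features


backbone = IMAGENET_RESNET50_PARAMS - linear_params(FC_IN_FEATURES, 1000)
head = linear_params(FC_IN_FEATURES, 37)  # the only trainable part
total = backbone + head
approx_mb = total * 4 / (1024 * 1024)     # float32 = 4 bytes per parameter

print(head, total, round(approx_mb, 2))
```

This gives 75,813 trainable head parameters and 23,583,845 parameters in total, which matches the figure in the Model Summary. The float32 estimate of about 89.97 MB is close to the 90.27 MB file; the gap plausibly comes from BatchNorm buffers and serialization overhead.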

Create a dictionary to record the ResNet50 Pet model's statistics

In [None]:
resnet50_pets_status = {"test_loss": pretrained_resnet50_pets_results["test_loss"][-1],
                  "test_acc": pretrained_resnet50_pets_results["test_acc"][-1],
                  "number_of_parameters": resnet50_pets_total_params,
                  "model_size (MB)": pretrained_resnet50_model_size}

resnet50_pets_status

Visualize, visualize, visualize the data

Making predictions with our trained models and timing them

In [None]:
# Get all test image paths
test_data_paths = list(Path(test_dir).glob("*/*.jpg"))
test_data_paths[:5]

Create a function to make predictions across the test dataset
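
The notebook calls `pred_and_store` but its definition is not shown. The following is a minimal sketch of what such a function might look like, not the author's actual implementation. The parameter names are taken from the call in the next cell, and the `correct` and `prediction_time` keys are inferred from the DataFrame columns used later; the remaining keys and the RGB conversion are assumptions.

```python
from pathlib import Path
from timeit import default_timer as timer
from typing import Callable, Dict, List

import torch
from PIL import Image


def pred_and_store(model: torch.nn.Module,
                   target_paths: List[Path],
                   transforms: Callable,
                   class_names: List[str],
                   device: str = "cpu") -> List[Dict]:
    """Predict on each image path and record the result and timing (hypothetical sketch)."""
    model.to(device)
    model.eval()
    pred_list = []
    for path in target_paths:
        # The true label is the parent folder name, e.g. test/Bengal/xxx.jpg -> "Bengal"
        class_name = Path(path).parent.stem
        start_time = timer()
        # Convert to RGB so grayscale or RGBA files also yield 3 channels
        img = Image.open(path).convert("RGB")
        batch = transforms(img).unsqueeze(0).to(device)  # add a batch dimension
        with torch.inference_mode():
            pred_probs = torch.softmax(model(batch), dim=1)
        pred_index = int(torch.argmax(pred_probs, dim=1).item())
        pred_class = class_names[pred_index]
        pred_list.append({
            "image_path": path,
            "class_name": class_name,
            "pred_prob": round(pred_probs.max().item(), 4),
            "pred_class": pred_class,
            "prediction_time": round(timer() - start_time, 4),
            "correct": class_name == pred_class,
        })
    return pred_list
```

Each returned dictionary becomes one row when the list is passed to `pd.DataFrame`, so the timing and correctness columns can be summarized directly.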

In [None]:
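# Predict on every test image with resnet50_pets (on the CPU, to time CPU inference)
# NOTE: pred_and_store must already be defined in this session; importing the
# predictions module below does not bring that name into scope.
from going_modular.going_modular import predictions
resnet50_pets_test_pred_dicts = pred_and_store(model=resnet50_pets,
                                               target_paths=test_data_paths,
                                               transforms=resnet50_transforms,
                                               class_names=class_names,
                                               device="cpu")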


Turn resnet50_pets_test_pred_dicts into a DataFrame

In [None]:
resnet50_pets_test_pred_df = pd.DataFrame(resnet50_pets_test_pred_dicts)
resnet50_pets_test_pred_df.head()

Check number of correct predictions

In [None]:
resnet50_pets_test_pred_df.correct.value_counts()

Find the average time per prediction

In [None]:
resnet50_pets_average_time_per_pred  = round(resnet50_pets_test_pred_df.prediction_time.mean(),4)
print(f"Average time per prediction: {resnet50_pets_average_time_per_pred} seconds")

Add time per pred to Resnet50 Pet stats dictionary record

In [None]:
resnet50_pets_status["time_per_pred_cpu"] = resnet50_pets_average_time_per_pred
resnet50_pets_status

Create a function to predict on an image and plot the result

In [None]:
def pred_and_plot_image(
    model: torch.nn.Module,
    class_names: List[str],
    image_path: str,
    image_size: Tuple[int, int] = (224, 224),
    transform: torchvision.transforms = None,
    device: torch.device = device,
):
    """Predicts on a target image with a target model.

    Args:
        model (torch.nn.Module): A trained (or untrained) PyTorch model to predict on an image.
        class_names (List[str]): A list of target classes to map predictions to.
        image_path (str): Filepath to target image to predict on.
        image_size (Tuple[int, int], optional): Size to transform target image to. Defaults to (224, 224).
        transform (torchvision.transforms, optional): Transform to perform on image. Defaults to None which uses ImageNet normalization.
        device (torch.device, optional): Target device to perform prediction on. Defaults to device.
    """

    # Open image
    img = Image.open(image_path)

    # Create transformation for image (if one doesn't exist)
    if transform is not None:
        image_transform = transform
    else:
        image_transform = transforms.Compose(
            [
                transforms.Resize(image_size),
                transforms.ToTensor(),
                transforms.Normalize(
                    mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]
                ),
            ]
        )

    ### Predict on image ###

    # Make sure the model is on the target device
    model.to(device)

    # Turn on model evaluation mode and inference mode
    model.eval()
    with torch.inference_mode():
        # Transform and add an extra dimension to image (model requires samples in [batch_size, color_channels, height, width])
        transformed_image = image_transform(img).unsqueeze(dim=0)

        # Make a prediction on image with an extra dimension and send it to the target device
        target_image_pred = model(transformed_image.to(device))

    # Convert logits -> prediction probabilities (using torch.softmax() for multi-class classification)
    target_image_pred_probs = torch.softmax(target_image_pred, dim=1)

    # Convert prediction probabilities -> prediction labels
    target_image_pred_label = torch.argmax(target_image_pred_probs, dim=1)

    # Plot image with predicted label and probability
    plt.figure()
    plt.imshow(img)
    plt.title(
        f"Pred: {class_names[target_image_pred_label]} | Prob: {target_image_pred_probs.max():.3f}"
    )
    plt.axis(False)

Show predictions for 5 random test images

In [None]:
# Get a random list of image paths from test set
import random
num_images_to_plot = 5
test_image_path_list = list(Path(test_dir).glob("*/*.jpg")) # get list all image paths from test data 
test_image_path_sample = random.sample(population=test_image_path_list, # go through all of the test image paths
                                       k=num_images_to_plot) # randomly select 'k' image paths to pred and plot

# Make predictions on and plot the images
for image_path in test_image_path_sample:
    pred_and_plot_image(model=resnet50_pets, 
                        image_path=image_path,
                        class_names=class_names,
                        transform=resnet50_transforms, # use the same preprocessing the model was trained with
                        image_size=(224, 224))

In [None]:
!echo "*.ipynb filter=nbstripout" > .gitattributes
!copy .gitattributes demos\

### **Model Summary**  
The model achieved a **test accuracy of 92.09%** with a **test loss of 0.3186** on a 37-class classification task. It has **23,583,845 parameters** and a saved model size of **90.27 MB**, which is small enough to serve from the CPU-based demo app built below.  

Out of **1,479 total predictions**, the model correctly predicted **1,360 images** and misclassified **119 images**, resulting in an **error rate of approximately 8.05%**.  

Further improvements could include unfreezing part of the backbone and fine-tuning it with a lower learning rate, adding data augmentation, or analyzing the misclassified images to target specific breeds.
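
The error rate quoted in this summary can be reproduced from the prediction counts. The sketch below does the arithmetic in plain Python using only the numbers reported above.

```python
correct, wrong = 1360, 119
total = correct + wrong

accuracy_pct = round(100 * correct / total, 2)
error_pct = round(100 * wrong / total, 2)

print(total, accuracy_pct, error_pct)  # 1479 91.95 8.05
```

The per-image accuracy of 91.95% is slightly below the 92.09% reported during training. One plausible reason, assuming the training loop averages accuracy per batch, is that the smaller final batch carries the same weight as a full one in that average.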

Turning our Resnet50 Pet model into a deployable app via Hugging Face

Create the demo folder (`demos`)

In [None]:
import shutil
from pathlib import Path

# Create the ResNet50 Pet demo path
Resnet50_pet_demo_path = Path("demos")

# Remove files that might already exist, then create a fresh directory
if Resnet50_pet_demo_path.exists():
  shutil.rmtree(Resnet50_pet_demo_path)
Resnet50_pet_demo_path.mkdir(parents=True, exist_ok=True)

Create an examples directory

In [None]:
Resnet50_pet_examples_path = Resnet50_pet_demo_path / "examples"
Resnet50_pet_examples_path.mkdir(parents=True, exist_ok=True)

Copy the trained model to the demos folder

In [None]:
!copy models\pretrained_resnet50_pets.pth demos\

Create a path to the Resnet50 Pet class names file

In [None]:
resnet50_pet_classname_path = Resnet50_pet_demo_path / "class_names.txt"
resnet50_pet_classname_path

Get the class names

In [None]:
resnet50_pet_classname = class_names
resnet50_pet_classname[:5]

Write the class names to class_names.txt

In [None]:
# Write ResNet50 Pet class names to a text file
with open(resnet50_pet_classname_path, "w") as f:
  print(f"[INFO] Saving Pet class names to {resnet50_pet_classname_path}")
  f.write("\n".join(resnet50_pet_classname)) # new line per class name

Open the file and read the class names back

In [None]:
# Open resnet50 pet class names file and read each line into a list
with open(resnet50_pet_classname_path, "r") as f:
  resnet50_pet_classname_loaded = [pet.strip() for pet in f.readlines()]
resnet50_pet_classname_loaded[:5]

Turning our Resnet50 Pet model into a Python script (model.py)

In [None]:
%%writefile demos/model.py
import torch
import torchvision
from torch import nn

def create_resnet50_model(num_classes:int=3,
                          seed:int=42,
                          dropout_rate: float = 0.3):  # dropout probability used in the new classifier head
  # 1. Setup pretrained ResNet50 weights
  weights = torchvision.models.ResNet50_Weights.DEFAULT  # Use the default weights
  
  # 2. Get ResNet50 transforms
  transforms = weights.transforms()
  
  # 3. Setup pretrained model instance
  model = torchvision.models.resnet50(weights=weights)  # Use the pretrained ResNet50
  
  # 4. Freeze the base layers in the model (this will stop all layers from training)
  for param in model.parameters():
    param.requires_grad = False
  
  # 5. Change classifier head with random seed for reproducibility
  torch.manual_seed(seed)
  
  # Add Dropout and set the number of outputs to num_classes
  model.fc = nn.Sequential(
    nn.Dropout(p=dropout_rate),  # dropout before the linear layer
    nn.Linear(in_features=model.fc.in_features, out_features=num_classes, bias=True)
  )
  
  return model, transforms




Turning our Resnet50 Pet Gradio app into a Python script (app.py)

In [None]:
%%writefile demos/app.py

# 1. imports and class names setup
import gradio as gr
import os
import torch
import torchvision


from model import create_resnet50_model
from timeit import default_timer as timer
from typing import Tuple, Dict

# Setup class names
with open("class_names.txt", "r") as f:
  class_names = [pet.strip() for pet in f.readlines()]

# 2. model and transforms preparation
resnet50_pets, resnet50_transforms = create_resnet50_model(num_classes=37)

# Load saved weights
resnet50_pets.load_state_dict(
    torch.load(
        f="pretrained_resnet50_pets.pth",
        map_location=torch.device("cpu") # load the model to the CPU
    )
)

def predict(img) -> Tuple[Dict, float]:
  # Start a timer
  start_time = timer()

  # Transform the input image for use with ResNet50
  img = resnet50_transforms(img).unsqueeze(0) # unsqueeze = add batch dimension on 0th index

  # Put model into eval mode, make prediction
  resnet50_pets.eval()
  with torch.inference_mode():
    # Pass transformed image through the model and turn the prediction logits into probabilities
    pred_probs = torch.softmax(resnet50_pets(img), dim=1)

  # Create a prediction label and prediction probability dictionary
  # (maps each class name to its predicted probability)
  pred_labels_and_probs = {class_names[i]: float(pred_probs[0][i]) for i in range(len(class_names))}

  # Calculate pred time
  end_time = timer()
  pred_time = round(end_time - start_time, 4)

  # Return pred dict and pred time
  return pred_labels_and_probs, pred_time


# Create an example list from the files in the examples/ folder
example_list = ["examples/" + example for example in os.listdir("examples")]

title = "Resnet50 Pet 🐶🐱🐈"
description = "A ResNet50 feature extractor computer vision model that classifies pet images into 37 dog and cat breeds."
article = "Created at [https://github.com/raksitkan/Pytorch_vision_Pet](https://github.com/raksitkan/Pytorch_vision_Pet)."
demo = gr.Interface(fn=predict,
                    inputs=gr.Image(type="pil"),
                    outputs=[gr.Label(num_top_classes=3, label="Predictions"),
                             gr.Number(label="Prediction time (s)")],
                    examples=example_list,
                    title=title,
                    description=description,
                    article=article)
demo.launch()
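
`gr.Label` receives a dictionary that maps every class name to a probability and, with `num_top_classes=3`, displays the three largest. The plain-Python sketch below mimics that flow on made-up logits; `softmax` and `top_k` are hypothetical helpers and the numbers are for illustration only.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    largest = max(logits)
    exps = [math.exp(x - largest) for x in logits]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def top_k(label_probs, k=3):
    """Return the k (label, probability) pairs with the highest probability."""
    return sorted(label_probs.items(), key=lambda kv: kv[1], reverse=True)[:k]


class_names = ["Abyssinian", "Bengal", "Birman", "Bombay"]
logits = [2.0, 0.5, 1.0, -1.0]  # made-up values for illustration
label_probs = dict(zip(class_names, softmax(logits)))
print(top_k(label_probs))
```

The `predict` function in `app.py` builds the same kind of dictionary with `torch.softmax`, and the ranking is left to Gradio.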

Create a requirements file for resnet50 pet (requirements.txt)

In [None]:
import gradio as gr
print(f"torch version: {torch.__version__}")
print(f"torchvision version: {torchvision.__version__}")
print(f"gradio version: {gr.__version__}")

In [None]:
%%writefile demos/requirements.txt
torch==2.5.1
torchvision==0.20.1
gradio==5.7.1
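
The pinned versions can also be read from the running environment instead of being typed by hand. Here is a minimal sketch using the standard library's `importlib.metadata`; `pinned_requirements` is a hypothetical helper, and packages that are not installed are simply skipped.

```python
from importlib.metadata import PackageNotFoundError, version


def pinned_requirements(packages, get_version=version):
    """Build 'name==version' lines for the packages that are installed."""
    lines = []
    for name in packages:
        try:
            lines.append(f"{name}=={get_version(name)}")
        except PackageNotFoundError:
            continue  # not installed in this environment
    return "\n".join(lines)


print(pinned_requirements(["torch", "torchvision", "gradio"]))
```

Some torch builds report a local suffix such as `+cu121` in the version string, so the output may need trimming before it goes into `requirements.txt`.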


Zip the demos folder for upload to Hugging Face

In [None]:
import shutil

shutil.make_archive('demos', 'zip', 'demos')


See our project below

In [None]:
# IPython is a library to help make Python interactive
from IPython.display import IFrame

# Embed the ResNet50 Pet Gradio demo
IFrame(src="https://raksitkan-resnet50-pet-classifier.hf.space", width=900, height=750)