# Loading Pretrained Models

In this notebook, we experiment with several pretrained models: `ResNet50`, `VGG16`, and `AlexNet`. We train each model on a 20% subset of the data and measure its accuracy on the corresponding validation set. We then take the best-performing model and fine-tune it on the entire dataset in `fine_tuning.ipynb`.

In [1]:
# Standard library imports
import json
import time

# PyTorch imports
import torch
from torch import nn
from torch.utils.data import DataLoader
import torch.optim as optim

# torchvision imports
from torchvision.models import VGG16_Weights
from torchvision.datasets import ImageFolder

# Project-specific imports
from src.config import PROCESSED_DATA_DIR, LEARNING_RATE, BATCH_SIZE, MOMENTUM
from utils import train_validate_model, modify_model_output

# Logging and experiment tracking
from loguru import logger

# Load hyperparameters
batch_size = BATCH_SIZE
lr = LEARNING_RATE
momentum = MOMENTUM

print('Imported necessary libraries/variables')

2024-07-26 09:33:22.146 | INFO     | src.config:<module>:11 - PROJ_ROOT path is: C:\Git\hamburger-hotdog-pizza-classifier


Imported necessary libraries/variables


In [None]:
# Directory paths
subset_image_path = PROCESSED_DATA_DIR / "pizza_hamburger_hotdog_20_percent"
train_dir = subset_image_path / 'train'
test_dir = subset_image_path / 'test'
valid_dir = subset_image_path / 'valid'

# Data loading parameters
weights = VGG16_Weights.DEFAULT
transform = weights.transforms()

# Data preparation
train_data = ImageFolder(train_dir, transform=transform)
valid_data = ImageFolder(valid_dir, transform=transform)
test_data = ImageFolder(test_dir, transform=transform)

# Data loaders
train_loader = DataLoader(train_data, batch_size=batch_size, shuffle=True)
valid_loader = DataLoader(valid_data, batch_size=batch_size, shuffle=False)  # no need to shuffle for evaluation
test_loader = DataLoader(test_data, batch_size=batch_size, shuffle=False)

# Device setup
device = 'cuda' if torch.cuda.is_available() else 'cpu'

# Start with Pre-Trained Models

Let's use a few popular CNN architectures as a starting point: AlexNet, VGG16, and ResNet50.

First, we load each pretrained model and adapt its output layer to our three classes.

In [6]:
num_classes = 3
alexnet = modify_model_output('alexnet', num_classes, device)
vgg16 = modify_model_output('vgg16', num_classes, device)
resnet50 = modify_model_output('resnet50', num_classes, device)

## 1. Trying ResNet50

In [10]:
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(resnet50.parameters(), lr=0.001, momentum=0.3)
model_name = "ResNet50"
date_time = time.time()
# Keep the sink id so this log file stops receiving messages once training ends
sink_id = logger.add(f"logs/{model_name}/training_log-{date_time}.log", format="{time} {level} {message}", level="INFO")
train_validate_model(
    num_epochs=10, model=resnet50, train_loader=train_loader, valid_loader=valid_loader,
    criterion=criterion, optimizer=optimizer, device=device,
    model_save_path="models/best_resnet50_model.pth",
)
logger.remove(sink_id)

Overall Training Progress: 100%|██████████| 10/10 [00:24<00:00,  2.50s/it, Best Val Accuracy=65.00%, Current Train Accuracy=67.04%, Current Val Accuracy=65.00%]
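`train_validate_model` is also a project helper whose source is not part of this notebook. Judging only from the progress bar (best validation accuracy, current train and validation accuracy) and the `model_save_path` argument, a loop along the following lines would produce similar numbers. This is a sketch under those assumptions; `run_epoch` and `train_validate_sketch` are hypothetical names, and the real helper also handles logging and the progress bar.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def run_epoch(model, loader, criterion, device, optimizer=None):
    """One pass over `loader`: trains if an optimizer is given, else evaluates."""
    training = optimizer is not None
    model.train(training)
    correct, total = 0, 0
    with torch.set_grad_enabled(training):
        for inputs, labels in loader:
            inputs, labels = inputs.to(device), labels.to(device)
            outputs = model(inputs)
            loss = criterion(outputs, labels)
            if training:
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
            correct += (outputs.argmax(dim=1) == labels).sum().item()
            total += labels.size(0)
    return 100.0 * correct / total  # accuracy in percent


def train_validate_sketch(num_epochs, model, train_loader, valid_loader,
                          criterion, optimizer, device, model_save_path=None):
    """Hypothetical loop: train, validate, and keep the best checkpoint."""
    best_val_acc = 0.0
    for _ in range(num_epochs):
        run_epoch(model, train_loader, criterion, device, optimizer)
        val_acc = run_epoch(model, valid_loader, criterion, device)
        if val_acc > best_val_acc:
            best_val_acc = val_acc
            if model_save_path is not None:
                torch.save(model.state_dict(), model_save_path)
    return best_val_acc
```

Saving only when validation accuracy improves is why the reported best validation accuracy can be higher than the accuracy of the final epoch.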


## 2. Trying VGG16

In [7]:
# Rebuild the data loaders with the VGG16 preprocessing transforms.
# Import only the weights enum: importing `vgg16` here would shadow the model variable.
from torchvision.models import VGG16_Weights
weights = VGG16_Weights.DEFAULT
preprocess = weights.transforms()

train_data = ImageFolder(train_dir, transform=preprocess)
valid_data = ImageFolder(valid_dir, transform=preprocess)
test_data = ImageFolder(test_dir, transform=preprocess)
train_loader = DataLoader(train_data, batch_size=batch_size, shuffle=True)
valid_loader = DataLoader(valid_data, batch_size=batch_size, shuffle=False)
test_loader = DataLoader(test_data, batch_size=batch_size, shuffle=False)

In [8]:
criterion = nn.CrossEntropyLoss()
vgg16 = modify_model_output('vgg16', num_classes, device)
optimizer = optim.SGD(vgg16.parameters(), lr=0.001, momentum=0.3)
model_name = "VGG16"
date_time = time.time()
# Keep the sink id so this log file stops receiving messages once training ends
sink_id = logger.add(f"logs/{model_name}/training_log-{date_time}.log", format="{time} {level} {message}", level="INFO")
train_validate_model(
    num_epochs=20, model=vgg16, train_loader=train_loader, valid_loader=valid_loader,
    criterion=criterion, optimizer=optimizer, device=device,
    model_save_path="models/vgg16.pth",
)
logger.remove(sink_id)

Overall Training Progress: 100%|██████████| 20/20 [01:07<00:00,  3.38s/it, Best Val Accuracy=93.89%, Current Train Accuracy=98.15%, Current Val Accuracy=91.11%]


## 3. Trying AlexNet

In [None]:
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(alexnet.parameters(), lr=0.001, momentum=0.3)
model_name = "AlexNet"
date_time = time.time()
# Keep the sink id so this log file stops receiving messages once training ends
sink_id = logger.add(f"logs/{model_name}/training_log-{date_time}.log", format="{time} {level} {message}", level="INFO")
train_validate_model(
    num_epochs=10, model=alexnet, train_loader=train_loader, valid_loader=valid_loader,
    criterion=criterion, optimizer=optimizer, device=device,
    model_save_path="models/best_alexnet_model.pth",
)
logger.remove(sink_id)

Since VGG16 reached a best validation accuracy of 93.89% in the run above, well ahead of ResNet50's 65.00%, let's use VGG16 as the base model to improve upon.
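To make the comparison explicit, the best validation accuracies printed in the progress bars above can be collected and ranked. This is a small sketch using only the two runs whose output was recorded in this notebook; the dictionary name is an illustrative choice.

```python
# Best validation accuracy (%) copied from the progress-bar outputs above.
# AlexNet is omitted because no output was recorded for that cell.
best_val_accuracy = {"ResNet50": 65.00, "VGG16": 93.89}

# Pick the model with the highest best-validation accuracy.
best_model = max(best_val_accuracy, key=best_val_accuracy.get)
print(f"{best_model}: {best_val_accuracy[best_model]:.2f}%")
```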