# Phase 3 - Vehicle Classification

This is an open-ended phase. You must build a classifier for the Stanford Cars Dataset. You can use any techniques and knowledge from Phases 1 and 2 to aid you. The ultimate goal is to achieve the best performance you can on the test set.

Please follow TA instructions in lab to learn how to access the data.

**Note**: You will need to achieve more than 80% accuracy on the test set to receive credit for Phase 3!

# Imports

Import all the libraries you need for your project.

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [None]:
import torch
import torch.nn as nn
import torchvision.datasets as datasets
import torchvision.transforms as transforms
from torch.utils.data import Dataset, DataLoader
from torchvision import utils
import torchvision.models as models
from skimage import io
from PIL import Image
import matplotlib.pyplot as plt
import numpy as np
import time
import pandas as pd
import os
from torch.optim.lr_scheduler import StepLR

In [None]:
print(torch.__version__)

2.1.0+cu118


## 1) Load Dataset
The Stanford Cars Dataset is not provided in ```torchvision.datasets```. You will need to create your own class that inherits from the ```Dataset``` class to load the dataset, and override the ```__init__()```, ```__len__()```, and ```__getitem__()``` methods.

You will find the following link useful to help create the dataset class: https://pytorch.org/tutorials/beginner/data_loading_tutorial.html

If you wish, you can add another method to the class, ```visualize()```, to help visualize the data. This is optional and will not be graded.

In [None]:
class CarDataSet(Dataset):
    def __init__(self, csv_file, root_dir, transform=None):
        self.car_data = pd.read_csv(csv_file)
        self.root_dir = root_dir
        self.transform = transform

    def __len__(self):
        return len(self.car_data)

    def __getitem__(self, idx):
        # The first CSV column holds the image file name, relative to root_dir.
        img_path = os.path.join(self.root_dir, self.car_data.iloc[idx, 0])
        # Force three channels so grayscale images match the model's input.
        image = Image.open(img_path).convert("RGB")
        # Shift the class id in the last CSV column down by one so labels start at 0,
        # as nn.CrossEntropyLoss expects.
        label = int(self.car_data.iloc[idx, -1]) - 1

        if self.transform:
            image = self.transform(image)

        return image, label

In [None]:
transform = transforms.Compose([
    # transforms.RandomResizedCrop(224),
    # transforms.RandomHorizontalFlip(),
    # transforms.Lambda(lambda x: x.repeat(3, 1, 1) if x.size(0)==1 else x),
    transforms.Resize((200, 200)),
    transforms.ToTensor(),
    # transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
])

data_root = '/content/drive/MyDrive/Colab Notebooks/stanford_cars_eec174'

train_data = CarDataSet(csv_file=os.path.join(data_root, 'train_make.csv'),
                        root_dir=os.path.join(data_root, 'images/train'),
                        transform=transform)
# The validation split doubles as the test set in this notebook.
test_data = CarDataSet(csv_file=os.path.join(data_root, 'val_make.csv'),
                       root_dir=os.path.join(data_root, 'images/val'),
                       transform=transform)

# Batch size
batch_size = 64

# Wrap the datasets in DataLoaders
train_loader = DataLoader(train_data, batch_size=batch_size, shuffle=True)

test_loader = DataLoader(test_data, batch_size=batch_size, shuffle=False)

## 2) Define your Model Architecture

It is up to you to decide which model you use. You can create your own CNN or use transfer learning.

In [None]:
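# Model definition
# The variable keeps the name resnet18 from an earlier experiment, but the model is a ResNet-50.
# weights=IMAGENET1K_V1 loads the same weights as the deprecated pretrained=True.
resnet18 = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
# Replace the 1000-class ImageNet head with one sized for the 49 classes in this dataset.
resnet18.fc = nn.Linear(resnet18.fc.in_features, 49)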
# Load Model onto GPU
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
resnet18.to(device)



ResNet(
  (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
  (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu): ReLU(inplace=True)
  (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
  (layer1): Sequential(
    (0): Bottleneck(
      (conv1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (downsample): Sequential(
        (0): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 

## 3) Define Loss function and Optimizer

You choose the loss function and optimizer.

In [None]:
# Loss Function
loss_function = nn.CrossEntropyLoss()

# Optimizer
optimizer = torch.optim.SGD(resnet18.parameters(), lr=0.01, momentum=0.9)
# scheduler = StepLR(optimizer, step_size=5, gamma=0.1)
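
If the commented-out `StepLR` scheduler is enabled, it is normally advanced once per epoch rather than once per batch. The sketch below shows the resulting schedule with a single dummy parameter standing in for the model; the `step_size` and `gamma` values are the ones from the commented-out line, and everything else is illustrative.

```python
import torch
from torch.optim.lr_scheduler import StepLR

# A single dummy parameter stands in for the model's weights.
param = torch.nn.Parameter(torch.zeros(1))
optimizer = torch.optim.SGD([param], lr=0.01, momentum=0.9)
scheduler = StepLR(optimizer, step_size=5, gamma=0.1)

lrs = []
for epoch in range(10):
    # Record the learning rate used during this epoch.
    lrs.append(optimizer.param_groups[0]["lr"])
    # ... one full pass over the training batches would go here ...
    optimizer.step()   # placeholder for the per-batch parameter updates
    scheduler.step()   # advance the schedule once per epoch, not per batch

print(lrs)
```

Calling `scheduler.step()` inside the batch loop would instead multiply the learning rate by `gamma` every five batches, shrinking it to almost zero within the first epoch.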

## 4) Train your Network

In [None]:
def train(model, loss_fn, optimizer, train_loader, valid_loader, batch_size, num_epochs, device):
    if device is not None:
        model.to(device)
    else:
        device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
        model.to(device)

    train_losses = []
    train_accuracies = []
    valid_losses = []  # Track validation loss
    valid_accuracies = []  # Track validation accuracy

    for epoch in range(num_epochs):
        running_loss = 0.0
        correct = 0
        total = 0
        start_time = time.time()
        # Training phase
        model.train()
        for images, labels in train_loader:
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            outputs = model(images)
            loss = loss_fn(outputs, labels)
            loss.backward()
            optimizer.step()

            # Accumulate the batch loss and the number of correct predictions
            running_loss += loss.item()
            _, predicted = torch.max(outputs.data, 1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()

        epoch_loss = running_loss / len(train_loader)
        epoch_accuracy = 100 * correct / total

        # Validation phase (gradients are not needed here, which saves memory and time)
        model.eval()
        valid_running_loss = 0.0
        valid_correct = 0
        valid_total = 0
        with torch.no_grad():
            for images, labels in valid_loader:
                images, labels = images.to(device), labels.to(device)
                outputs = model(images)
                loss = loss_fn(outputs, labels)
                valid_running_loss += loss.item()
                _, predicted = torch.max(outputs, 1)
                valid_total += labels.size(0)
                valid_correct += (predicted == labels).sum().item()

        valid_epoch_loss = valid_running_loss / len(valid_loader)
        valid_epoch_accuracy = 100 * valid_correct / valid_total

        end_time = time.time()
        elapsed_time = end_time - start_time

        # Print statistics
        print('Epoch [%d/%d], Train Loss: %.4f, Train Accuracy: %.4f, Valid Loss: %.4f, Valid Accuracy: %.4f, Time: %.2fs'
              % (epoch + 1, num_epochs, epoch_loss, epoch_accuracy, valid_epoch_loss, valid_epoch_accuracy, elapsed_time))

        # Append to the lists
        train_losses.append(epoch_loss)
        train_accuracies.append(epoch_accuracy)
        valid_losses.append(valid_epoch_loss)
        valid_accuracies.append(valid_epoch_accuracy)

    return train_losses, train_accuracies, valid_losses, valid_accuracies

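def test_accuracy(model, test_loader, loss_function, device):
    # loss_function is unused; it is kept so existing calls keep working.
    model.to(device)
    model.eval()  # use running batch-norm statistics during evaluation
    correct = 0
    total = 0
    with torch.no_grad():
        for images, labels in test_loader:
            # Use the device passed in rather than .cuda(), so this also runs on CPU.
            images, labels = images.to(device), labels.to(device)
            outputs = model(images)
            _, predicted = torch.max(outputs, 1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()
    print('Accuracy: %d %%' % (100 * correct / total))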

## 4) Evaluate (on Test Set)

In [None]:
num_epochs = 10
lr = 0.01
num_classes = 49

# Define Loss func and Optimizer
# loss_function = nn.CrossEntropyLoss()
# optimizer = torch.optim.SGD(resnet18.parameters(), lr=0.01, momentum=0.7)

# Train Model. The test loader is passed as valid_loader, so the "Valid" numbers printed below are measured on the test set.
train_losses, train_accuracies, valid_losses, valid_accuracies = train(resnet18, loss_function, optimizer, train_loader, test_loader, batch_size, num_epochs, device)

# Evaluate on Test Set
test_accuracy(resnet18, test_loader, loss_function, device)

Epoch [1/10], Train Loss: 2.2807, Train Accuracy: 39.9116, Valid Loss: 1.8890, Valid Accuracy: 48.9423, Time: 2191.07s
Epoch [2/10], Train Loss: 0.8870, Train Accuracy: 74.2601, Valid Loss: 1.3545, Valid Accuracy: 61.9660, Time: 168.40s
Epoch [3/10], Train Loss: 0.3709, Train Accuracy: 89.2669, Valid Loss: 1.0691, Valid Accuracy: 70.5516, Time: 167.94s
Epoch [4/10], Train Loss: 0.1702, Train Accuracy: 95.3580, Valid Loss: 0.8769, Valid Accuracy: 77.8100, Time: 168.59s
Epoch [5/10], Train Loss: 0.0518, Train Accuracy: 98.8825, Valid Loss: 0.7251, Valid Accuracy: 79.8839, Time: 168.18s
Epoch [6/10], Train Loss: 0.0476, Train Accuracy: 98.8088, Valid Loss: 0.7847, Valid Accuracy: 79.6765, Time: 168.16s
Epoch [7/10], Train Loss: 0.0379, Train Accuracy: 99.1649, Valid Loss: 0.5938, Valid Accuracy: 84.1560, Time: 170.82s
Epoch [8/10], Train Loss: 0.0126, Train Accuracy: 99.8158, Valid Loss: 0.5388, Valid Accuracy: 85.6491, Time: 170.22s
Epoch [9/10], Train Loss: 0.0050, Train Accuracy: 99.96
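
The lists returned by `train()` can be used to find the best epoch and to see how far training accuracy has run ahead of validation accuracy. A small sketch, using the values of the eight fully printed epochs from the log above (the variable names `best_index`, `best_epoch`, and `gap` are illustrative):

```python
# Accuracies (%) for epochs 1-8, copied from the training log.
train_accuracies = [39.9116, 74.2601, 89.2669, 95.3580,
                    98.8825, 98.8088, 99.1649, 99.8158]
valid_accuracies = [48.9423, 61.9660, 70.5516, 77.8100,
                    79.8839, 79.6765, 84.1560, 85.6491]

# Index of the epoch with the highest validation accuracy.
best_index = max(range(len(valid_accuracies)), key=valid_accuracies.__getitem__)
best_epoch = best_index + 1  # epochs are reported 1-based

# Difference between training and validation accuracy at that epoch.
gap = train_accuracies[best_index] - valid_accuracies[best_index]

print(f"best epoch: {best_epoch}")  # best epoch: 8
print(f"valid accuracy: {valid_accuracies[best_index]:.2f}%")
print(f"train/valid gap: {gap:.2f} points")
```

A large gap between the two curves is a sign of overfitting, which augmentation or a learning-rate schedule may help reduce.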

In [None]:
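# Save the weights to a dedicated file (the file name here is a placeholder).
# Do not point PATH at the notebook's own .ipynb file: torch.save would overwrite it.
PATH = '/content/drive/MyDrive/Colab Notebooks/phase3_resnet50_weights.pth'
torch.save(resnet18.state_dict(), PATH)
resnet18.load_state_dict(torch.load(PATH))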

<All keys matched successfully>
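
Checkpoints written with `torch.save` are usually given their own `.pt` or `.pth` file. The round trip can be checked with a minimal sketch like the one below, where a tiny `nn.Linear` stands in for the real model and a temporary directory stands in for Google Drive (both are assumptions made to keep the example self-contained).

```python
import os
import tempfile

import torch
import torch.nn as nn

# Tiny stand-in for the trained network.
model = nn.Linear(4, 2)

with tempfile.TemporaryDirectory() as tmp_dir:
    ckpt_path = os.path.join(tmp_dir, "model_weights.pth")
    # Save only the parameters (state dict), not the whole model object.
    torch.save(model.state_dict(), ckpt_path)

    # Rebuild the same architecture, then load the saved parameters into it.
    restored = nn.Linear(4, 2)
    result = restored.load_state_dict(torch.load(ckpt_path, map_location="cpu"))

print(result)  # <All keys matched successfully>

# Every tensor in the restored model should equal the original.
same = all(torch.equal(a, b)
           for a, b in zip(model.state_dict().values(),
                           restored.state_dict().values()))
print(same)
```

Passing `map_location="cpu"` lets a checkpoint saved on a GPU machine be loaded on one without a GPU.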

## 5) Report

Please write a report discussing all your choices and the procedure you followed to implement your vehicle classifier. In your report, include all your choices (i.e. hyperparameters, lr, models, loss, optimizer) and explain why you made those choices to achieve your performance. Your report must be thorough and comprehensive; please discuss fully how you were able to obtain high performance.

We built the dataset class from the starter code. Its ```__getitem__()``` method does most of the work: it opens each image from ```root_dir``` (the location of the images), converts it to RGB, and returns it together with its label. Each image is resized to 200×200 pixels and converted to a tensor so that it can be passed to the model.

To cut training time and improve accuracy, we used a pretrained ResNet-50 and replaced its final fully connected layer with a 49-class output layer. We initially tried ResNet-18, but ResNet-50 gave us better accuracy, so we switched to it.

We trained for 10 epochs with cross-entropy loss and SGD (momentum 0.9) at a learning rate of 0.01. In our runs a lower learning rate overfit and a higher one converged very quickly, so we kept 0.01. The batch size could have been 128, but because of GPU restrictions, and because accuracy was better with 64 in this run, we kept 64. We also compared SGD with Adam, and SGD worked better for our model.