<a href="https://colab.research.google.com/github/GeneralCoder365/cmsc421_hw3/blob/main/CMSC421_HW3_final.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In this homework you'll train a simple image classifier using PyTorch. If you're not familiar with PyTorch, please take a look at our [introduction to PyTorch notebook](https://colab.research.google.com/drive/1GO_YSErd9TjiH0R-hZ2XkKIfMbQK9Ask?usp=sharing). There are four parts to the homework:

1. Dataset class: you'll create a custom dataset class that you'll use to load the images for training. Your image transformations should be done in the dataset class.

2. Model design: you'll design a simple CNN-based model architecture that learns to predict image classes correctly.

3. Model training: train an image classifier using the model you designed in step 2. You'll need to instantiate a dataloader that uses an instance of your dataset class to iterate through the dataset for training.

4. Evaluation: load your model's weights, run inference on the test dataset, and report your results.

In [None]:
import os
import tarfile

if not os.path.isfile('./data.tar.gz'):
    !wget 'http://cs.umd.edu/~pulkit/hw_3_data.tar.gz' -O data.tar.gz

with tarfile.open('./data.tar.gz', 'r:gz') as tar:
    tar.extractall(path='./')


--2024-04-27 00:06:35--  http://cs.umd.edu/~pulkit/hw_3_data.tar.gz
Resolving cs.umd.edu (cs.umd.edu)... 128.8.127.4
Connecting to cs.umd.edu (cs.umd.edu)|128.8.127.4|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: http://www.cs.umd.edu/~pulkit/hw_3_data.tar.gz [following]
--2024-04-27 00:06:35--  http://www.cs.umd.edu/~pulkit/hw_3_data.tar.gz
Resolving www.cs.umd.edu (www.cs.umd.edu)... 128.8.127.4
Reusing existing connection to cs.umd.edu:80.
HTTP request sent, awaiting response... 200 OK
Length: 7171530 (6.8M) [application/x-gzip]
Saving to: ‘data.tar.gz’


2024-04-27 00:06:37 (4.65 MB/s) - ‘data.tar.gz’ saved [7171530/7171530]



### Dataset class

In [None]:
import torch
from torch.utils.data import Dataset
import pandas as pd
import cv2
import os

data = pd.read_csv('/content/data/csvs/train.csv')
print(data)

class CustomImageDataset(Dataset):
    def __init__(self, csv_path, data_root, transform=None, device='cpu'):
        self.transform = transform
        self.root_path = data_root
        self.device = device

        # TODO: Read the csv file. You can use the pandas library
        data = pd.read_csv(csv_path)

        # TODO : Get the image paths from the csv
        self.image_paths = data["image_path"]

        # TODO: Get the class ids from the csv. You might want to check if the class_id
        # column exists in the csv before trying to get it. The test csv does not have
        # the class_id column
        self.class_ids = data["class_id"] if "class_id" in data.columns else None

        # TODO: Get the image names from the csv. This is required for the test part at
        # the end of the notebook. You should only return the image name for testing,
        # otherwise you should return the image and the class id
        self.image_names = data["image_name"]

    def __len__(self):
        return len(self.image_paths)

    def __getitem__(self, idx: int):
        rel_img_path = self.image_paths[idx]
        image_name = self.image_names[idx]

        # TODO: Read the image file
        img = None

        # TODO Apply transformations if any to the image. You will want to at least convert
        # the image to a tensor. You can also apply other transformations.

        if self.class_ids is None: # for testing purposes
            return img, image_name, rel_img_path
        else:
            class_id = torch.tensor(self.class_ids[idx], dtype=torch.long)
            return img.to(self.device), class_id.to(self.device)

      class_id  class_name     image_name                image_path
0            0         cat  718jai12.JPEG  data/train/718jai12.JPEG
1            0         cat  upajvagf.JPEG  data/train/upajvagf.JPEG
2            0         cat  u3di1s04.JPEG  data/train/u3di1s04.JPEG
3            0         cat  8p2hg2ln.JPEG  data/train/8p2hg2ln.JPEG
4            0         cat  98em9mks.JPEG  data/train/98em9mks.JPEG
...        ...         ...            ...                       ...
2995         9  guinea_pig  92vqrxoe.JPEG  data/train/92vqrxoe.JPEG
2996         9  guinea_pig  s24t07ac.JPEG  data/train/s24t07ac.JPEG
2997         9  guinea_pig  4syf2nyz.JPEG  data/train/4syf2nyz.JPEG
2998         9  guinea_pig  rhix2ll1.JPEG  data/train/rhix2ll1.JPEG
2999         9  guinea_pig  d7shdeik.JPEG  data/train/d7shdeik.JPEG

[3000 rows x 4 columns]
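
The "read the image file" and "convert the image to a tensor" TODOs in `__getitem__` can be approached in several ways. The sketch below is one possibility, not the required solution. It assumes the image was read with `cv2.imread`, which returns a height x width x channel `uint8` array in BGR order, and it uses a synthetic array in place of a real file. The helper name `bgr_image_to_tensor` is hypothetical.

```python
import numpy as np
import torch


def bgr_image_to_tensor(img_bgr: np.ndarray) -> torch.Tensor:
    """Convert an (H, W, 3) uint8 BGR array into a (3, H, W) float tensor in [0, 1]."""
    # Reverse the channel axis (BGR -> RGB). The reversed view has negative
    # strides, which torch.from_numpy rejects, so make a contiguous copy.
    img_rgb = np.ascontiguousarray(img_bgr[..., ::-1])
    # Move channels first, since PyTorch conv layers expect (C, H, W).
    tensor = torch.from_numpy(img_rgb).permute(2, 0, 1)
    # Scale pixel values from [0, 255] to [0.0, 1.0].
    return tensor.float() / 255.0


# Synthetic stand-in for cv2.imread(...): a pure-blue 4x6 image in BGR order.
fake_bgr = np.zeros((4, 6, 3), dtype=np.uint8)
fake_bgr[..., 0] = 255  # channel 0 is blue in BGR

img = bgr_image_to_tensor(fake_bgr)
print(img.shape)  # torch.Size([3, 4, 6])
```

In the dataset class, the array would come from reading the file at `os.path.join(self.root_path, rel_img_path)`, and any `self.transform` could be applied before or after this conversion depending on what input type the transform expects.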


In [None]:
import matplotlib.pyplot as plt
import torchvision.transforms as T

def plot_image(img, title=None):
    plt.imshow(img)
    if title:
        plt.title(title)
    plt.axis('off')
    plt.show()


# TODO: Create an instance of the CustomImageDataset class for the training dataset
dataset = None

# TODO: Show the first 3 images from the dataset
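
When showing images, note that `plt.imshow` expects a height x width x channel array, while image tensors in PyTorch are usually channel x height x width. A minimal sketch of the conversion follows, assuming the dataset returns float tensors in (C, H, W) layout. The helper name `to_displayable` is hypothetical, and a random tensor stands in for a real image.

```python
import numpy as np
import torch


def to_displayable(img_tensor: torch.Tensor) -> np.ndarray:
    """Convert a (C, H, W) tensor into an (H, W, C) NumPy array for plotting."""
    # detach/cpu make this safe for tensors that live on the GPU or carry gradients.
    return img_tensor.detach().cpu().permute(1, 2, 0).numpy()


example = torch.rand(3, 8, 12)  # stand-in for one image from the dataset
hwc = to_displayable(example)
print(hwc.shape)  # (8, 12, 3)
```

The resulting array can be passed to `plot_image` as defined above.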


### Model definition

Define your image classifier model here. Since we're working with images, consider a convolutional neural network (CNN) architecture. Start simple, and add complexity only once you have something working.

In [None]:
import torch.nn as nn
import torch.nn.functional as F

class ImageClassifier(nn.Module):
    def __init__(self, n_classes):
        super().__init__()

        # TODO: Create one or more convolutional neural network layers. This is just a suggestion.
        self.conv = None

        # TODO: Create one or more feed forward layers. This is just a suggestion.
        self.ff = None

    def forward(self, x):
        x = self.conv(x)
        x = torch.flatten(x, 1)
        return self.ff(x)
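
As a starting point, here is a minimal sketch of what a small CNN with this `conv`/`ff` structure could look like. It is one possible design, not the expected answer. The layer sizes, the class name `TinyCNN`, and the 64x64 input used for the shape check are all assumptions. `nn.AdaptiveAvgPool2d` is used so that the flattened feature size does not depend on the input resolution.

```python
import torch
import torch.nn as nn


class TinyCNN(nn.Module):
    def __init__(self, n_classes: int):
        super().__init__()
        # Two conv blocks, then pool to a fixed 4x4 spatial grid.
        self.conv = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d((4, 4)),
        )
        # Feed-forward head mapping 32 * 4 * 4 features to class logits.
        self.ff = nn.Sequential(
            nn.Linear(32 * 4 * 4, 64),
            nn.ReLU(),
            nn.Linear(64, n_classes),
        )

    def forward(self, x):
        x = self.conv(x)
        x = torch.flatten(x, 1)  # keep the batch dimension
        return self.ff(x)


model = TinyCNN(n_classes=10)
logits = model(torch.randn(2, 3, 64, 64))  # batch of 2 random RGB "images"
print(logits.shape)  # torch.Size([2, 10])
```

The model returns raw logits rather than probabilities, which is what `nn.CrossEntropyLoss` expects.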

### Training

In [None]:
from torch.utils.data import DataLoader
from torch.utils.data import random_split
from torch.optim import Adam
import torchvision.transforms as T
from tqdm import tqdm

device = 'cuda' if torch.cuda.is_available() else 'cpu'

# Feel free to try other batch sizes. The batch size is usually a power of 2.
batch_size = 16

# You can try other learning rates to see how it affects the training.
learning_rate = 1e-4

# Try to explore different transformation functions. You can use transformations
# to make your model more robust to translation, color changes, etc. The torchvision
# documentation describes the transformations available in PyTorch:
# https://pytorch.org/vision/stable/transforms.html. You can also use transformations
# to augment (effectively "increase") your training data.
transform = None

img_root = "./"

# TODO: Create an instance of the CustomImageDataset class for the training and validation datasets
train_set = None
val_set = None

# TODO: Create two dataloaders, one for each dataset.
# TODO: Shuffle the training dataloader. This is important to prevent the model from learning the order of the data.
train_loader = None
val_loader = None

# TODO: Initialize your model
n_classes = 10
model = None

# TODO: Initialize the optimizer. Feel free to try other optimizers in the torch.optim
# module.
# Hint: the Adam optimizer and its variants are the staple these days.
optimizer = Adam(model.parameters(), lr=learning_rate)

# TODO: Instantiate the loss function.
# Hint: Cross entropy loss works great for classification tasks like this one.
# Reading up on what loss function to use for what task could be informative
loss_fn = nn.CrossEntropyLoss()

# Training loop
eval_every = 2

# Consider training for longer. Keep an eye on the validation loss and decide
# on what works best for you.
n_epochs = 10
val_loss_values = []
training_loss_values = []
eval_epochs = []

for epoch in range(n_epochs):
    # set your model to training mode. This is important if you're using normalization
    # or dropout
    model.train()
    for img, label in tqdm(train_loader):
        # zero the parameter gradients
        optimizer.zero_grad()

        # TODO: Make a forward pass (predict the class of the image)
        pred = None

        # Calculate the loss
        loss = None

        loss.backward()
        optimizer.step()

    # Validation loop
    if epoch > 0 and epoch % eval_every == 0:
        model.eval()
        # don't calculate gradients when evaluating your model
        with torch.no_grad():
            for img, label in tqdm(val_loader):
                pred = None
                val_loss = None

        print(f"Epoch: {epoch}, Train Loss: {loss.item()} Eval Loss: {val_loss.item()}")
        eval_epochs.append(epoch)
        training_loss_values.append(loss.item())
        val_loss_values.append(val_loss.item())


# Save your model's weights
torch.save(model.state_dict(), "model.pth")
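
The sketch below shows the general shape of the pieces left as TODOs in the training cell: splitting a dataset, building the loaders, the forward pass and loss, and a validation loss averaged over all batches. It is a hedged illustration, not the reference solution. Random tensors stand in for the real images, the linear model is a placeholder, and the 80/20 split is an assumption.

```python
import torch
import torch.nn as nn
from torch.optim import Adam
from torch.utils.data import DataLoader, TensorDataset, random_split

torch.manual_seed(0)

# Stand-in data (assumption): 100 random 3x32x32 "images", labels in [0, 10).
full_set = TensorDataset(torch.randn(100, 3, 32, 32), torch.randint(0, 10, (100,)))

# Hold out 20 samples for validation.
train_set, val_set = random_split(full_set, [80, 20])
train_loader = DataLoader(train_set, batch_size=16, shuffle=True)
val_loader = DataLoader(val_set, batch_size=16, shuffle=False)

# Placeholder model (assumption); a CNN would go here in the homework.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
optimizer = Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

# One training epoch.
model.train()
for img, label in train_loader:
    optimizer.zero_grad()
    pred = model(img)            # forward pass: (batch, n_classes) logits
    loss = loss_fn(pred, label)  # cross entropy against integer class ids
    loss.backward()
    optimizer.step()

# Validation: weight each batch loss by its size to get a per-sample mean.
model.eval()
total_loss, n_samples = 0.0, 0
with torch.no_grad():
    for img, label in val_loader:
        pred = model(img)
        total_loss += loss_fn(pred, label).item() * img.size(0)
        n_samples += img.size(0)
avg_val_loss = total_loss / n_samples
print(f"Average validation loss: {avg_val_loss:.4f}")
```

Averaging over the whole validation set gives a steadier signal than the loss of the last batch alone, which makes the loss curves easier to interpret.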

### Plot your training and validation loss

In [None]:
# Plot the training and validation loss
plt.plot(eval_epochs, training_loss_values, label='Training loss')
plt.plot(eval_epochs, val_loss_values, label='Validation loss')
plt.legend()
plt.show()

### Test your model against the validation dataset
This should give you a rough idea of how your model will do on the test set, for which you don't have labels.
This section is not required; it is provided only as a sanity check.

In [None]:
## Load from your saved model
import numpy as np
import torch

# Load from your saved model using torch.load
model_state_dict = torch.load("./model.pth")
model = None
model.load_state_dict(model_state_dict)

# set model to inference mode
model.eval()
batch_size = 1

# Load the validation dataset
test_dataset = None
test_loader = None

preds = []
model.eval()
with torch.no_grad():
    for img, label in tqdm(test_loader):
        # TODO: predict the classes of the images and append them to the preds list
        pass

# Get the true labels for the validation dataset
true_labels = None

accuracy = (true_labels == torch.tensor(preds)).float().mean().item()
print(f"Accuracy: {accuracy}")
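
To illustrate how the accuracy computation above works, here is a small sketch with hand-written logits in place of real model output. The values are made up for illustration. The predicted class is the index of the largest logit in each row.

```python
import torch

# Made-up logits for 4 samples and 3 classes (illustrative values only).
logits = torch.tensor([
    [0.1, 2.0, -1.0],
    [1.5, 0.2, 0.3],
    [0.0, 0.1, 3.0],
    [2.0, 1.0, 0.0],
])
true_labels = torch.tensor([1, 0, 2, 1])

preds = []
# argmax over the class dimension gives one predicted class id per sample.
preds.extend(logits.argmax(dim=1).tolist())

accuracy = (true_labels == torch.tensor(preds)).float().mean().item()
print(preds)     # [1, 0, 2, 0]
print(accuracy)  # 0.75
```

Three of the four predictions match the labels, so the accuracy is 0.75.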

### Evaluation
Evaluate your model on the test dataset and create a CSV file. This is the file you need to submit.
> Important: make sure the prediction file has the columns: image_name, prediction, image_path

In [None]:
## Load from your saved model
import numpy as np
import torch

# Load from your saved model using torch.load
model_state_dict = torch.load("./model.pth")
model = None
model.load_state_dict(model_state_dict)

# set model to inference mode
model.eval()

# Load the test data
test_dataset = None

# TODO: Create a DataLoader for the test dataset
test_loader = None

# TODO: Predict and save output to a CSV file. We are just looking for the top class predicted
# Hint: Lookup argmax.
final_preds = []

with torch.no_grad():
    for img, img_names, img_paths in tqdm(test_loader):
        pred = None
        # For every element in the batch, get its predicted class id in `pred`
        # Class ID will be in int.
        batch_preds = [
            (img_name, pred_img, img_path)
            for (img_name, pred_img, img_path) in zip(img_names, pred, img_paths)
        ]
        final_preds.extend(batch_preds)
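
The following sketch shows how one batch of predictions could be turned into rows with the required column order. It is an illustration under assumptions: the file names, paths, and logits are hypothetical, and the names and paths are taken to arrive from the DataLoader as sequences of strings.

```python
import pandas as pd
import torch

# Hypothetical batch contents (not real files from the dataset).
img_names = ["a.JPEG", "b.JPEG"]
img_paths = ["data/test/a.JPEG", "data/test/b.JPEG"]
logits = torch.tensor([[0.2, 1.3, 0.1], [2.0, 0.5, 0.4]])

# .tolist() converts the class ids to plain Python ints.
pred = logits.argmax(dim=1).tolist()

final_preds = []
final_preds.extend(zip(img_names, pred, img_paths))

df = pd.DataFrame(final_preds, columns=["image_name", "prediction", "image_path"])
print(df["prediction"].tolist())  # [1, 0]
```

Converting to plain ints before building the rows keeps tensor objects out of the CSV.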

In [None]:
# DO NOT MODIFY
test_prediction = pd.DataFrame(final_preds, columns=['image_name', 'prediction', 'image_path'])
test_prediction.to_csv('prediction.csv')

# You can comment these lines out if you're running the notebook locally
from google.colab import files
files.download('prediction.csv')

Upload *prediction.csv* to Gradescope.