# Project Three: Image Classification with ConvNets

This project is designed to teach you the basics of deep neural networks.  
You will design a ConvNet (also called a CNN or Convolutional Neural Network) to ingest images and  
classify the dominant object in each image. The objective is to get accustomed to programming  
with PyTorch (or Keras) and to learn the workflow for training neural networks. In addition to training  
from scratch, you will also learn to load pretrained models and finetune them on your dataset (transfer learning).  


We will work through a series of steps:
1. Acquire, Load, and Explore the Data
2. Prepare the Data for your ConvNet
3. Select and Prepare your ConvNet Model
4. Utilize transfer learning, finetuning, and training from scratch.
5. Evaluate and iterate to improve your ConvNet.

Similar to the previous projects, we will consolidate this notebook into the following four tasks:

* Task 1 is dedicated to EDA on the image dataset.
* Task 2 prepares the data into the correct format and instantiates your ConvNet (with pretrained weights if you are attempting the extra credit).
* Task 3 is building an accurate ConvNet. Train your own model. You can use an off-the-shelf model with finetuning for 20% extra credit (EC).
* Task 4 completes the testing and analysis. Is the model accurate on the testset? How can it be improved? Show what changes you made to improve it (batch size, learning rate, etc.).

For this project, the data is provided in a zip file attached to the Canvas page.

### Task 1: Exploratory Data Analysis (10%)

There are a few steps to this task: unzip the archive, then traverse the resulting directory to gather statistics.  
A comprehensive check of the data covers the image dimensions, the number of images, and the number of image classes.  
Organize your results into a readable format and also provide previews of a subset of the images.  
In addition to the above, compute the mean and std of the pixel values in the dataset (across all images).
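
The counting and statistics steps above could be sketched roughly as follows. This is a minimal sketch, not a required solution: it assumes one sub-folder per class (the same layout the dataset class later in this notebook expects), and the helper names `count_images_per_class` and `channel_mean_std` are hypothetical. The statistics are accumulated pixel by pixel, so images of different sizes are handled.

```python
import os
from collections import Counter

import numpy as np

IMAGE_EXTENSIONS = ('.png', '.jpg', '.jpeg', '.bmp', '.gif')


def count_images_per_class(root_dir):
    """Count image files in each class folder directly under root_dir."""
    counts = Counter()
    for entry in os.scandir(root_dir):
        if entry.is_dir():
            counts[entry.name] = sum(
                name.lower().endswith(IMAGE_EXTENSIONS)
                for name in os.listdir(entry.path)
            )
    return counts


def channel_mean_std(images):
    """Per-channel mean and std over all pixels of all images.

    `images` is any iterable of H x W x 3 arrays scaled to [0, 1].
    Sums are accumulated, so the dataset never has to fit in memory at once.
    """
    pixel_count = 0
    channel_sum = np.zeros(3, dtype=np.float64)
    channel_sq_sum = np.zeros(3, dtype=np.float64)
    for image in images:
        pixels = np.asarray(image, dtype=np.float64).reshape(-1, 3)
        pixel_count += pixels.shape[0]
        channel_sum += pixels.sum(axis=0)
        channel_sq_sum += (pixels ** 2).sum(axis=0)
    mean = channel_sum / pixel_count
    # Var[x] = E[x^2] - E[x]^2; clip tiny negative values caused by rounding
    std = np.sqrt(np.clip(channel_sq_sum / pixel_count - mean ** 2, 0.0, None))
    return mean, std
```

To feed real files, each path can be loaded with `np.asarray(Image.open(path).convert('RGB')) / 255.0` (using `PIL.Image`) and passed in through a generator.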

In [None]:
#EDA goes here

### Task 2a: Prepare the Data (10%)
There are two critical steps that need to be done here.  
First, you need to create a list of file paths (to the actual files). Then you randomize/shuffle the list.  
The list now holds the paths to all your data. Use the following splits: trainset is 90%, validset is 5%, and testset is 5%.  
Second, explore some data augmentation techniques with torchvision.transforms.  
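
The shuffle-and-split step described above can be sketched in plain Python. This is one possible approach under stated assumptions: the helper name `split_paths`, the seed, and the example file names are all hypothetical, and the 90/5/5 ratios come from the text above.

```python
import random


def split_paths(paths, train_ratio=0.90, valid_ratio=0.05, seed=42):
    """Shuffle a list of file paths and cut it into train/valid/test lists."""
    shuffled = list(paths)                  # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)   # seeded for a reproducible split

    train_end = int(len(shuffled) * train_ratio)
    valid_end = train_end + int(len(shuffled) * valid_ratio)

    # the test set takes whatever remains, so no file is dropped by rounding
    return shuffled[:train_end], shuffled[train_end:valid_end], shuffled[valid_end:]


# hypothetical file names, used only to show the resulting sizes
example_paths = [f"images/img_{i:04d}.jpg" for i in range(200)]
train_paths, valid_paths, test_paths = split_paths(example_paths)
print(len(train_paths), len(valid_paths), len(test_paths))
```

Shuffling before slicing matters because the paths are usually collected class by class; without it, the validation and test sets could end up containing only the last few classes.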

Documentation on transforms:  
[https://pytorch.org/vision/0.13/transforms.html](https://pytorch.org/vision/0.13/transforms.html)  
[https://pytorch.org/vision/stable/auto_examples/transforms/plot_transforms_getting_started.html#sphx-glr-auto-examples-transforms-plot-transforms-getting-started-py](https://pytorch.org/vision/stable/auto_examples/transforms/plot_transforms_getting_started.html#sphx-glr-auto-examples-transforms-plot-transforms-getting-started-py)   
Documentation for DataLoader: [https://pyimagesearch.com/2021/10/04/image-data-loaders-in-pytorch/](https://pyimagesearch.com/2021/10/04/image-data-loaders-in-pytorch/)

In [None]:
#code to get you started on transforms
from torchvision.transforms import Compose
from torchvision.transforms import ToTensor
from torchvision.transforms import Resize
from torchvision.transforms import Normalize
from torchvision.transforms import RandomCrop
from torchvision.transforms import RandomHorizontalFlip

#put your calculated per-channel std and mean below (Normalize raises an error when applied while std is 0)
std  = 0
mean = 0
transform = Compose([ToTensor(),
                     #additional transforms such as RandomCrop
                     #optionally resize images to ensure consistent input dimensions
                    Normalize(std=std,mean=mean)])
#print out tensors to verify correct transforms

In [None]:
#code to get you started on dataloading
import os
import torch
from torch.utils.data import Dataset, DataLoader, random_split
from PIL import Image

class CustomImageDataset(Dataset):
    def __init__(self, root_dir, transform=None):
        """
        Args:
            root_dir (string): Directory with all the images
            transform (callable, optional): Optional transform to be applied on a sample
        """
        self.root_dir = root_dir
        self.transform = transform
        
        # Get class names (folder names)
        self.classes = sorted(entry.name for entry in os.scandir(root_dir) if entry.is_dir())
        
        # Create class to index mapping
        self.class_to_idx = {cls_name: idx for idx, cls_name in enumerate(self.classes)}
        
        # Collect all image paths and their labels
        self.image_paths = []
        self.labels = []
        
        for class_name in self.classes:
            class_dir = os.path.join(root_dir, class_name)
            for img_name in os.listdir(class_dir):
                if img_name.lower().endswith(('.png', '.jpg', '.jpeg', '.bmp', '.gif')):
                    img_path = os.path.join(class_dir, img_name)
                    self.image_paths.append(img_path)
                    self.labels.append(self.class_to_idx[class_name])
    
    def __len__(self):
        return len(self.image_paths)
    
    def __getitem__(self, idx):
        img_path = self.image_paths[idx]
        image = Image.open(img_path).convert('RGB')
        label = self.labels[idx]
        
        if self.transform:
            image = self.transform(image)
        
        return image, label

In [None]:
def create_data_splits(root_dir,
                       transforms,
                       train_ratio=0.7,
                       val_ratio=0.15,
                       test_ratio=0.15,
                       random_seed=42):
    """
    Create train, validation, and test splits for the dataset
    
    Args:
        root_dir (str): Path to the root directory containing image folders
        transforms (callable): Transform applied to every sample in all three splits
        train_ratio (float): Proportion of data for training
        val_ratio (float): Proportion of data for validation
        test_ratio (float): Proportion of data for testing
        random_seed (int): Random seed for reproducibility
    
    Returns:
        tuple: (train_dataset, val_dataset, test_dataset)
    """
    #ensure ratios sum to 1
    assert abs(train_ratio + val_ratio + test_ratio - 1.0) < 1e-5, "Ratios must sum to 1"
    
    # Create the full dataset
    full_dataset = CustomImageDataset(root_dir, transform=transforms)
    
    # Set random seed for reproducibility
    torch.manual_seed(random_seed)
    
    # Calculate split sizes
    total_size = len(full_dataset)
    train_size = int(total_size * train_ratio)
    val_size   = int(total_size * val_ratio)
    test_size  = total_size - train_size - val_size
    
    # Create splits
    train_dataset, val_dataset, test_dataset = random_split(
        full_dataset, 
        [train_size, val_size, test_size],
        generator=torch.Generator().manual_seed(random_seed)
    )
    
    return train_dataset, val_dataset, test_dataset

In [None]:
#create three dataloaders, one each for the train, valid, and test sets
#print the sizes for each dataset, and ensure they sum to the total dataset size
from tabulate import tabulate

#set to your actual data directory
data_dir = '/path/to/your/image/folders'
    
#create data splits (hint: see above functions)

#create dataloaders, experiment with batchsize

#print dataset sizes
print("Dataset Sizes")
print("-------------")
table = [['train', 'valid', 'test', 'total'], [0,0,0,0]]
print(tabulate(table, headers='firstrow', tablefmt='fancy_grid'))

### Task 2b: Instantiate the Model (20%)
The sequence of layers and their composition has a tremendous effect on the testset accuracy.  
When you find your model not 'working', it is a good idea to revisit this task.  
Building a ConvNet is simple with PyTorch. You only need to define two methods, **\_\_init\_\_()** and **forward()**.  
**\_\_init\_\_()** first initializes the parent class, nn.Module, and then declares the layers and their properties.  
**forward()** defines how the input (the raw pixels) flows through the layers to produce the output.  
For more documentation: [https://pytorch.org/tutorials/beginner/introyt/modelsyt_tutorial.html](https://pytorch.org/tutorials/beginner/introyt/modelsyt_tutorial.html)  
To get you started, use the template below.
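
When you add the first linear layer, its input size depends on the spatial size left after the convolutions. The sketch below works through the standard output-size formula for a convolution with dilation 1, floor((n + 2p - k) / s) + 1. The helper names and the 64x64 input with 16 channels are assumptions made for illustration only.

```python
def conv_output_size(size, kernel_size=3, stride=1, padding=1):
    """Spatial size after one convolution: floor((n + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel_size) // stride + 1


def flattened_features(height, width, out_channels, strides=(1, 1)):
    """Features entering the first linear layer after a stack of 3x3, padding-1 convs."""
    for stride in strides:
        height = conv_output_size(height, stride=stride)
        width = conv_output_size(width, stride=stride)
    return out_channels * height * width


# assumed 64x64 input and 16 output channels, chosen only for illustration
print(flattened_features(64, 64, out_channels=16))                  # 16 * 64 * 64 = 65536
print(flattened_features(64, 64, out_channels=16, strides=(2, 1)))  # 16 * 32 * 32 = 16384
```

With a 3x3 kernel, stride 1, and padding 1, the spatial size is unchanged, so any reduction has to come from strides or pooling layers that you add.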

In [None]:
import torch.nn as nn
import torch.nn.functional as F


class Example(nn.Module):
    def __init__(self,
                 inf,
                 outf,
                 kernel_size=3,
                 stride=1,
                 padding=1,
                 bias=False,
                ):
        #initializes the class with PyTorch's neural network utilities
        nn.Module.__init__(self)

        #BatchNorm is a type of normalization applied across a batch of samples.
        #this means, if 32 images are processed concurrently, their statistics are
        #used for the ConvLayer output normalization. Improves final accuracy.
        self.norm_layer = nn.BatchNorm2d
        #ReLU is the activation applied to the ConvLayer output
        self.activation = nn.ReLU(inplace=True)

        #define convolutional layers below (you should experiment here).
        self.conv1 = nn.Conv2d(inf, outf, kernel_size=kernel_size,
                               padding=padding, stride=stride, bias=bias)
        self.norm1 = self.norm_layer(outf)
        self.conv2 = nn.Conv2d(outf,outf, kernel_size=kernel_size,
                               padding=padding, stride=1, bias=bias)
        self.norm2 = self.norm_layer(outf)

        #define one or more linear layers. the last layer's output must match the number of classes.
        #self.fc1 = nn.Linear()

    def forward(self, X):
        Y  = self.conv1(X)
        Y  = self.norm1(Y)
        Y  = self.activation(Y)

        Y  = self.conv2(Y)
        Y  = self.norm2(Y)

        Y = self.activation(Y)
        return Y

### Task 3a: Selecting Hyperparameters (10%)

The hyperparameters can make or break your model (regardless of how good the architecture is).  
Take your time to thoughtfully select an optimizer, loss function, learning rate, and number of training epochs.  
For more details on **Task 3** as a whole, read the documentation at [https://pytorch.org/tutorials/beginner/introyt/trainingyt.html](https://pytorch.org/tutorials/beginner/introyt/trainingyt.html)
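
A possible starting configuration is sketched below. Every value here (Adam, a learning rate of 1e-3, 10 epochs, batch size 32, the step schedule) is an assumption meant as a first guess to tune, and the tiny `model` is only a stand-in so the snippet runs on its own.

```python
import torch
import torch.nn as nn

# tiny stand-in model so the snippet runs on its own; use your ConvNet instead
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 4))

# starting values only; every number here is an assumption to be tuned
learning_rate = 1e-3
num_epochs = 10
batch_size = 32

# cross-entropy expects raw logits and integer class labels
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
# optional: decay the learning rate by 10x every 5 epochs
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=5, gamma=0.1)

# sanity check on random data: an untrained 4-class model should give a
# loss roughly on the order of ln(4), about 1.39
logits = model(torch.randn(batch_size, 3, 8, 8))
labels = torch.randint(0, 4, (batch_size,))
print(criterion(logits, labels).item())
```

Changing one hyperparameter at a time and recording the validation accuracy for each run makes it easier to explain your choices in Task 4.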


In [None]:
# your code goes here

### Task 3b: Model Training and Validation (30%)
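
For orientation, the usual shape of a train/validate loop in PyTorch is sketched below on synthetic data. This is a generic pattern, not the expected solution: the function names, the stand-in model, and all hyperparameter values are assumptions, and the same loader is reused for validation only to keep the example short.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset


def train_one_epoch(model, loader, criterion, optimizer, device):
    model.train()                      # BatchNorm/Dropout use training behavior
    running_loss = 0.0
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()          # clear gradients from the previous step
        loss = criterion(model(images), labels)
        loss.backward()                # backpropagate
        optimizer.step()               # update the weights
        running_loss += loss.item() * images.size(0)
    return running_loss / len(loader.dataset)


@torch.inference_mode()
def evaluate(model, loader, criterion, device):
    model.eval()                       # BatchNorm/Dropout use inference behavior
    total_loss, correct = 0.0, 0
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        logits = model(images)
        total_loss += criterion(logits, labels).item() * images.size(0)
        correct += (logits.argmax(dim=1) == labels).sum().item()
    n = len(loader.dataset)
    return total_loss / n, correct / n


# synthetic data and a stand-in model, only to show the call pattern
torch.manual_seed(0)
data = TensorDataset(torch.randn(64, 3, 8, 8), torch.randint(0, 4, (64,)))
loader = DataLoader(data, batch_size=16, shuffle=True)
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 4)).to(device)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for epoch in range(3):
    train_loss = train_one_epoch(model, loader, criterion, optimizer, device)
    val_loss, val_acc = evaluate(model, loader, criterion, device)
    print(f"epoch {epoch}: train {train_loss:.3f}  val {val_loss:.3f}  acc {val_acc:.2f}")
```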

In [None]:
# type a function here that trains for one epoch
def train(insert_parameters_here):
    #switch to training mode; layers such as BatchNorm and Dropout use their training behavior.
    model.train()

    #your code

In [None]:
# type a function here that loops over the validation set
@torch.inference_mode()
def validation(insert_parameters_here):
    #switch to evaluation mode; gradient tracking is disabled by the inference_mode decorator above.
    model.eval()

    #your code

### Task 3c: Testset Evaluation (10%)

In [None]:
# calculate the accuracy for your model.
# note: the testset should only be used here.
# type a function here that loops over the test set
@torch.inference_mode()
def test(insert_parameters_here):
    #switch to evaluation mode; gradient tracking is disabled by the inference_mode decorator above.
    model.eval()

    #your code

### Task 4: Analysis and Evaluation (10%)


1. An introduction to the dataset, what did your EDA reveal?
2. If your EDA helped you build a better model, briefly describe how.
3. How are the classes distributed in the data? Are any of the classes visually similar?
4. Explain how you built your model. What type of layers, how many, and why.
5. Discuss pain points and how you iterated towards a better model.

The points above are for guidance; you can choose your template and structure.  
The idea is to present a short report (no word counts) that is structured, clear, and concise.  
You can refer back to your figures and use external links to explain your insights.

insert analysis here

### Extra Credit: Use a Pretrained Model (20%)

In [None]:
# extra credit code goes here. PyTorch has many pretrained models on the web and also from torch hub.
# you will need to retrain the final layer(s) but should leave most of the model frozen.

### Submission:

You need to prepare your ipynb/jupyter notebook for grading.
The two main tasks are ensuring all your cell outputs are present and that you convert the notebook to PDF.

The instructions will vary slightly based on the platform (Colab, Kaggle, Anaconda, etc.).
Generally, inside the notebook, you will want to:
1. Restart & clear all cell outputs (optional, may detect buggy program control flow)
2. Run all (must do; I need to see your code cell outputs!)

Next, you need to download the notebook as a PDF. Exporting as PDF is a bit tricky but the steps below should work:
1. Download the notebook. (all platforms allow the default .ipynb export)
2. Convert the .ipynb file to PDF, for example with https://onlineconvertfree.com/convert-format/ipynb-to-pdf/

### Rubric:

Complete all tasks above and see their associated percentage allocations.  
In general, ensure your code runs correctly.  Make sure the submitted PDF includes your code outputs.  

You will be given significant credit for documentation and pseudo-code.
For more details, please read the rubric PDF in the assignment files.
