# Homework 2: Multiclass Classification with PyTorch

In this assignment, you will build, train, and evaluate a neural network for multiclass classification using PyTorch.
You will use the [Garbage dataset](https://www.kaggle.com/datasets/mostafaabla/garbage-classification).
The goal is to gain hands-on experience with:
- Dataset preparation
- Building two PyTorch models
- Loss functions for multiclass classification
- Training loop and evaluation
- Visualization of performance

## About Dataset
### Context
This dataset has 15,150 images from 12 different classes of household garbage: paper, cardboard, biological, metal, plastic, green-glass, brown-glass, white-glass, clothes, shoes, batteries, and trash.

Garbage recycling is a key aspect of preserving our environment. To make the recycling process possible or easier, the garbage must be sorted into groups that have a similar recycling process. I found that most available datasets classify garbage into only a few classes (2 to 6 at most). The ability to sort household garbage into more classes can dramatically increase the percentage of garbage that is recycled.

### Content
An ideal setting for data collection would be to place a camera above a conveyor where the garbage comes one by one, so that the camera can capture real garbage images. Since such a setup is not feasible at the moment, I collected most of the images in this dataset by web scraping. I tried to get images close to garbage images whenever possible; for example, in the biological garbage category I searched for rotten vegetables, rotten fruits, and food remains. However, for some classes such as clothes or shoes it was more difficult to get images from the garbage, so these are mostly images of normal clothes. Nevertheless, being able to classify the images of this dataset into 12 classes can be a big step towards improving the recycling process.

### Imports

In [None]:
import torch
import torchvision
import torch.nn as nn
import kagglehub
from torchvision.transforms import ToTensor
from torchvision.datasets import ImageFolder
import os
import torch.nn.functional as F
from tqdm import tqdm
from torchvision.utils import make_grid
from torch.utils.data import random_split
from torch.utils.data.dataloader import DataLoader
import matplotlib.pyplot as plt
from torchvision import transforms


torch.manual_seed(42)
# %matplotlib inline

### Download and prepare dataset from kagglehub
`kagglehub.dataset_download` downloads and extracts Kaggle datasets to a local cache directory (usually under `~/.cache/kagglehub/datasets/`). It returns the path to the unzipped dataset, preserving the original folder structure as found on Kaggle, such as one subfolder per class for image datasets.

---

### **What is the structure of the downloaded content?**

* Inside the returned directory (`path`), you will find the files and folders as originally organized on Kaggle.
* For the **garbage classification** dataset, you typically get a folder like:

  ```
  garbage_classification/
      battery/
      biological/
      brown-glass/
      cardboard/
      clothes/
      green-glass/
      ...
  ```

  Each subfolder contains images belonging to that class (a classic structure for use with `torchvision.datasets.ImageFolder`).
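`ImageFolder` derives the integer labels from the subfolder names, sorted alphabetically. The following standard-library sketch mimics only that mapping step; it is not torchvision's code. The helper name `folder_to_label_map` and the three-folder layout are made up for illustration.

```python
import os
import tempfile


def folder_to_label_map(root):
    """Sorted subfolder names -> integer labels (illustrative helper)."""
    classes = sorted(entry.name for entry in os.scandir(root) if entry.is_dir())
    class_to_idx = {name: index for index, name in enumerate(classes)}
    return classes, class_to_idx


# Hypothetical miniature layout, created in a temporary directory for the demo.
with tempfile.TemporaryDirectory() as root:
    for name in ["paper", "battery", "metal"]:
        os.makedirs(os.path.join(root, name))
    classes, class_to_idx = folder_to_label_map(root)

print(classes)       # ['battery', 'metal', 'paper']
print(class_to_idx)  # {'battery': 0, 'metal': 1, 'paper': 2}
```

On the real dataset, the same information is available as `dataset.classes` and `dataset.class_to_idx` once the `ImageFolder` has been created.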


In [75]:
# Download the latest version of the Kaggle dataset to a local directory
path = kagglehub.dataset_download("mostafaabla/garbage-classification")

# Set the data directory to the location of the downloaded images
data_dir = os.path.join(path, "garbage_classification")

# Set the desired image size for resizing
img_size = 64

# Define the image transformations to apply to each image:
# - Resize the image to (img_size, img_size)
# - Convert the image to a PyTorch tensor
transform = transforms.Compose([
    transforms.Resize((img_size, img_size)),  # Resize images
    transforms.ToTensor()                     # Convert to tensor
])  

print(f"image transform: {transform}")

image transform: Compose(
    Resize(size=(64, 64), interpolation=bilinear, max_size=None, antialias=True)
    ToTensor()
)


In [3]:
dataset = ImageFolder(root=data_dir, transform=transform)

classes_names = dataset.classes
# dataset = ImageFolder(data_dir + '/Training', transform=transform)
print("Number of images:", len(dataset), "number of classes:", len(classes_names))
print("Class names:", classes_names)


Number of images: 15515 number of classes: 12
Class names: ['battery', 'biological', 'brown-glass', 'cardboard', 'clothes', 'green-glass', 'metal', 'paper', 'plastic', 'shoes', 'trash', 'white-glass']


### TODO 1:
Create a 4×3 subplot that displays one example image from each category in the dataset.
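One possible starting point, sketched under the assumption that you work from the list of integer labels (`ImageFolder` exposes it as `dataset.targets`), is to find the position of the first sample of every class without loading any image. The helper name `first_index_per_class` and the toy label list are illustrative only.

```python
def first_index_per_class(targets):
    """Return {class_index: position of its first sample} in a single pass."""
    first = {}
    for position, label in enumerate(targets):
        first.setdefault(label, position)  # keeps only the first occurrence
    return first


# Toy label list standing in for `dataset.targets`.
toy_targets = [0, 0, 1, 1, 1, 2, 2]
print(first_index_per_class(toy_targets))  # {0: 0, 1: 2, 2: 5}
```

With the real dataset, `dataset[i]` returns an `(image, label)` pair whose image tensor has shape `(3, H, W)`, so it needs `image.permute(1, 2, 0)` before being passed to `plt.imshow`. `plt.subplots(4, 3)` provides the 12 axes.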


In [None]:
# your code here

### TODO 2:
Shuffle the dataset and split it into training and validation sets, using 80% of the samples for training and 20% for validation. Make sure that the class distribution is preserved as much as possible in both splits.
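Note that `random_split` shuffles but does not stratify. A minimal sketch of a per-class (stratified) split over sample indices is shown below; it uses only the standard library, and the function name `stratified_split`, the seed, and the toy labels are assumptions made for the example.

```python
import random
from collections import defaultdict


def stratified_split(targets, train_frac=0.8, seed=42):
    """Split sample indices class by class so both parts keep similar class ratios."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(targets):
        by_class[label].append(index)

    train_idx, val_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)                       # shuffle within the class
        cut = int(round(train_frac * len(indices)))
        train_idx.extend(indices[:cut])
        val_idx.extend(indices[cut:])

    rng.shuffle(train_idx)                         # mix the classes again
    rng.shuffle(val_idx)
    return train_idx, val_idx


# Toy labels: 50, 30 and 20 samples of three classes.
toy_targets = [0] * 50 + [1] * 30 + [2] * 20
train_idx, val_idx = stratified_split(toy_targets)
print(len(train_idx), len(val_idx))  # 80 20
```

The resulting index lists can be wrapped with `torch.utils.data.Subset(dataset, train_idx)` and `Subset(dataset, val_idx)`.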


In [None]:
# your code here

### TODO 3:
Visualize the class distribution in both the training and validation sets using a bar plot, so you can compare how well the splits represent the overall dataset.


In [None]:
# your code here

### TODO 4:
Ensure that no single category accounts for more than 15% of the samples in the training set. If necessary, downsample the dominant classes. Then, visualize the new class distribution in the training set using a bar plot.
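One subtlety is that removing samples from a dominant class also shrinks the total, so the 15% limit has to hold with respect to the reduced set. The sketch below is one way to handle this, not the required solution: it searches for the largest per-class cap that satisfies the constraint and then samples that many indices from each oversized class. The name `cap_dominant_classes` and the toy labels are illustrative.

```python
import random
from collections import Counter, defaultdict


def cap_dominant_classes(targets, max_frac=0.15, seed=42):
    """Return indices to keep so that no class exceeds `max_frac` of the result."""
    counts = Counter(targets)
    if len(counts) * max_frac < 1:
        raise ValueError("too few classes to satisfy this max_frac")

    # Largest per-class cap for which the biggest capped class
    # is at most max_frac of the capped total.
    for cap in range(max(counts.values()), 0, -1):
        capped = [min(n, cap) for n in counts.values()]
        if max(capped) <= max_frac * sum(capped):
            break

    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(targets):
        by_class[label].append(index)

    keep = []
    for indices in by_class.values():
        keep.extend(rng.sample(indices, min(len(indices), cap)))
    return sorted(keep)


# Toy labels: one class with 500 samples and nine classes with 50 each.
toy_targets = [0] * 500 + [c for c in range(1, 10) for _ in range(50)]
keep = cap_dominant_classes(toy_targets)
kept_counts = Counter(toy_targets[i] for i in keep)
print(kept_counts[0], len(keep))  # 79 529
```

In the toy case the dominant class shrinks from 500 to 79 samples, which is about 14.9% of the 529 samples that remain.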


In [None]:
# your code here

## Implementation of Regularization Layers

Implement two regularization layers from scratch:
1. `BatchNorm2d`
2. `LayerNorm`

Make sure that all trainable parameters (such as scale and shift) are properly registered as part of the computational graph, so they are optimized during training.
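To illustrate what "properly registered" means, here is a small sketch with a toy layer (`ScaleShift`, a made-up name that is not part of the assignment). Tensors wrapped in `nn.Parameter` show up in `.parameters()` and are therefore seen by the optimizer, buffers are stored with the module but not optimized, and plain tensor attributes are ignored.

```python
import torch
import torch.nn as nn


class ScaleShift(nn.Module):
    """Toy layer: y = x * scale + shift, with one value per channel."""

    def __init__(self, num_channels):
        super().__init__()
        # nn.Parameter attributes are registered and returned by .parameters().
        self.scale = nn.Parameter(torch.ones(num_channels))
        self.shift = nn.Parameter(torch.zeros(num_channels))
        # Buffers are saved and moved with the module but are not optimized.
        self.register_buffer("num_calls", torch.zeros(1))
        # A plain tensor attribute is invisible to the optimizer.
        self.unregistered = torch.ones(num_channels, requires_grad=True)

    def forward(self, x):
        # Reshape (C,) -> (1, C, 1, 1) so it broadcasts over (N, C, H, W).
        return x * self.scale.view(1, -1, 1, 1) + self.shift.view(1, -1, 1, 1)


layer = ScaleShift(num_channels=3)
print([name for name, _ in layer.named_parameters()])  # ['scale', 'shift']
```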


### TODO 5:
Implement the BatchNorm2d layer from scratch using only basic PyTorch components (such as `nn.Module` and tensor operations), without relying on `nn.BatchNorm2d`.
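As a hedged illustration of the core normalization step only (not a complete layer), the sketch below computes per-channel statistics over the batch and spatial dimensions and checks the result against the built-in layer in training mode. The tensor shape and `eps` value are arbitrary choices for the demo; a full implementation would additionally need learnable scale and shift parameters and running statistics for evaluation mode.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(8, 4, 5, 5)  # (N, C, H, W)
eps = 1e-5

# One mean and one biased variance per channel, taken over N, H and W.
mean = x.mean(dim=(0, 2, 3), keepdim=True)
var = x.var(dim=(0, 2, 3), unbiased=False, keepdim=True)
x_hat = (x - mean) / torch.sqrt(var + eps)

# Reference: built-in layer in training mode (scale = 1 and shift = 0 at init).
reference = nn.BatchNorm2d(4, eps=eps).train()(x)
matches = torch.allclose(x_hat, reference, atol=1e-4)
print(matches)
```

Comparing your own layer against `nn.BatchNorm2d` in this way, in both training and evaluation mode, is a convenient sanity check even though the built-in layer may not be used in the solution itself.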


In [None]:
# your code here

### TODO 6:
Implement the LayerNorm layer from scratch using only basic PyTorch components (such as `nn.Module` and tensor operations), without relying on `nn.LayerNorm`.
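For convolutional feature maps, which dimensions to normalize over is a design decision. The sketch below assumes per-sample statistics over `(C, H, W)` and shows only the normalization step, checked against `F.layer_norm`; the shape and `eps` are arbitrary demo values, and the learnable scale and shift are left out.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.randn(8, 4, 5, 5)  # (N, C, H, W)
eps = 1e-5

# One mean and one biased variance per sample, taken over C, H and W.
mean = x.mean(dim=(1, 2, 3), keepdim=True)
var = x.var(dim=(1, 2, 3), unbiased=False, keepdim=True)
x_hat = (x - mean) / torch.sqrt(var + eps)

# Reference: functional layer norm over the last three dimensions.
reference = F.layer_norm(x, x.shape[1:], eps=eps)
matches = torch.allclose(x_hat, reference, atol=1e-4)
print(matches)
```

Unlike the batch statistics of `BatchNorm2d`, these statistics do not depend on the other samples in the batch, so the layer behaves the same way in training and evaluation mode.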


In [None]:
# your code here

## Training

### TODO 7:
Complete the `GarbageClassifier` neural network by designing and implementing an architecture of your choice.  
Make use of the provided `_block` and `_block_mp` building blocks as you see fit.  
Allow the regularization type (e.g., `BatchNorm2d` or `LayerNorm`) to be specified from outside the class, so you can later compare the results between the two types of regularization.
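A minimal sketch of the idea of passing the layer type from outside: the builder receives any callable that takes a channel count and returns a module. Built-in layers stand in here for the from-scratch ones, and `conv_block`, `build_features`, and the channel sizes are illustrative assumptions rather than a recommended architecture. The sketch also shows one way to find the flattened feature size needed by the first `nn.Linear` layer.

```python
import torch
import torch.nn as nn


def conv_block(in_channels, out_channels, norm_layer):
    """Conv -> norm -> pool -> ReLU; `norm_layer` is a callable taking a channel count."""
    return nn.Sequential(
        nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1),
        norm_layer(out_channels),
        nn.MaxPool2d(kernel_size=2),
        nn.ReLU(inplace=True),
    )


def build_features(norm_layer):
    return nn.Sequential(
        conv_block(3, 16, norm_layer),
        conv_block(16, 32, norm_layer),
    )


# Built-in layers used as stand-ins for the custom implementations.
features_bn = build_features(nn.BatchNorm2d)
features_gn = build_features(lambda channels: nn.GroupNorm(1, channels))

# Pass a dummy batch through to find the flattened size for the classifier.
with torch.no_grad():
    out = features_bn(torch.zeros(2, 3, 64, 64))
n_features = out.flatten(1).shape[1]
print(out.shape, n_features)  # torch.Size([2, 32, 16, 16]) 8192
```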


In [None]:
class GarbageClassifier(nn.Module):
    def __init__(self, num_classes, norm_layer):
        # norm_layer: constructor of the normalization layer, called as norm_layer(out_channels)
        super().__init__()
        # self.features = nn.Sequential(
        #    your code here
        # )

        # self.classifier = nn.Sequential(
        #    your code here
        # )


       
    def _block_mp(self, in_channels, out_channels, kernel_size=3, stride=1, padding=1, norm_layer=None, kernel_size_mp=2):
        """AlexNet style block with max pooling"""
        return nn.Sequential(
            nn.Conv2d(in_channels, out_channels, kernel_size=kernel_size, stride=stride, padding=padding),
            norm_layer(out_channels), # Normalization layer that you have implemented
            nn.MaxPool2d(kernel_size=kernel_size_mp),
            nn.ReLU(inplace=True)
        )

    def _block(self, in_channels, out_channels, kernel_size=3, stride=1, padding=1, norm_layer=None):
        """AlexNet style block without max pooling"""

        return nn.Sequential(
            nn.Conv2d(in_channels, out_channels, kernel_size=kernel_size, stride=stride, padding=padding),
            norm_layer(out_channels),
            nn.ReLU(inplace=True)
        )
    

    def forward(self, x):
        # Forward pass: feature extractor (self.features), flatten, then classifier (self.classifier)
        x = self.features(x)
        x = x.view(x.size(0), -1)
        x = self.classifier(x)
        return x

### TODO 8:
Prepare all components needed for training:
1. Build your neural network with one type of regularization.
2. Create DataLoaders for the training (and optionally validation) sets.
3. Define the loss criterion.
4. Define the optimizer and assign it the trainable parameters of your model.
5. Print a summary of your network architecture.
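For the loss criterion, a common choice for multiclass classification is `nn.CrossEntropyLoss`, which expects raw logits and integer class indices rather than softmax outputs or one-hot vectors. The short sketch below, with made-up logits and targets, checks that it matches log-softmax followed by the negative log-likelihood loss.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 12)            # raw scores: 4 samples, 12 classes
targets = torch.tensor([0, 3, 11, 7])  # integer class indices, not one-hot

loss = nn.CrossEntropyLoss()(logits, targets)
manual = F.nll_loss(F.log_softmax(logits, dim=1), targets)
print(torch.allclose(loss, manual))  # True
```

Because the softmax is applied inside the loss, the last layer of the classifier should output logits directly.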


In [None]:
# your code here

### TODO 9:
Write a training loop to train your network for 10 epochs using the training set.
- Track and print the training loss for each epoch.
- After each epoch, compute and store both the loss and accuracy on the test set (the held-out 20% split from TODO 2).
- After training, plot both the training and test losses on the same graph to visualize the learning process.
- Your model should achieve at least 75% accuracy on the test set.
- Remember to set your model to training mode (`model.train()`) during training, and to evaluation mode (`model.eval()`) when computing metrics on the test set.
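The overall structure of such a loop can be sketched as follows. This is a generic pattern on synthetic data, not a reference solution: the tiny linear model, the random tensors, the optimizer settings, and the function names `train_one_epoch` and `evaluate` are all placeholders.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset


def train_one_epoch(model, loader, criterion, optimizer):
    model.train()                                  # enable training behaviour
    total_loss, n = 0.0, 0
    for x, y in loader:
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()
        total_loss += loss.item() * x.size(0)      # undo the batch mean
        n += x.size(0)
    return total_loss / n


@torch.no_grad()
def evaluate(model, loader, criterion):
    model.eval()                                   # use evaluation behaviour
    total_loss, correct, n = 0.0, 0, 0
    for x, y in loader:
        logits = model(x)
        total_loss += criterion(logits, y).item() * x.size(0)
        correct += (logits.argmax(dim=1) == y).sum().item()
        n += x.size(0)
    return total_loss / n, correct / n


# Synthetic stand-ins for the real model and data.
torch.manual_seed(0)
data = TensorDataset(torch.randn(64, 10), torch.randint(0, 3, (64,)))
loader = DataLoader(data, batch_size=16, shuffle=True)
model = nn.Sequential(nn.Linear(10, 16), nn.ReLU(), nn.Linear(16, 3))
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

history = []
for epoch in range(3):
    train_loss = train_one_epoch(model, loader, criterion, optimizer)
    eval_loss, eval_acc = evaluate(model, loader, criterion)
    history.append((train_loss, eval_loss, eval_acc))
    print(f"epoch {epoch + 1}: train={train_loss:.3f} eval={eval_loss:.3f} acc={eval_acc:.2%}")
```

The demo evaluates on the same loader it trains on only to stay short; in the assignment the evaluation loader should wrap the held-out split.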


In [None]:
# your code here

### TODO 10:
Compute and report the accuracy of your trained model on the test set for each individual category (class).

For example:
* Class battery: 80%
* Class biological: 71%
* Class brown-glass: 70%
* Class cardboard: 85%
* Class clothes: 92%
* Class green-glass: 88%
* Class metal: 43%
* Class paper: 54%
* Class plastic: 39%
* Class shoes: 71%
* Class trash: 68%
* Class white-glass: 55%
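Per-class accuracy can be computed from the collected predictions and labels with two `torch.bincount` calls. The sketch below uses a made-up five-sample example; `per_class_accuracy` is an illustrative helper name.

```python
import torch


def per_class_accuracy(preds, targets, num_classes):
    """Fraction of correctly classified samples for each true class."""
    correct = torch.bincount(targets[preds == targets], minlength=num_classes)
    total = torch.bincount(targets, minlength=num_classes)
    # clamp avoids division by zero for classes absent from `targets`
    return correct.float() / total.clamp(min=1).float()


targets = torch.tensor([0, 0, 1, 1, 2])
preds = torch.tensor([0, 1, 1, 1, 0])
accuracy = per_class_accuracy(preds, targets, num_classes=3)
print(accuracy)  # tensor([0.5000, 1.0000, 0.0000])
```

Pairing the result with `classes_names` gives a report in the format shown in the example list.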


### Repeat TODOs 8, 9, and 10 with the other type of regularization

Repeat the previous steps for preparing your model, DataLoaders, optimizer, and training loop, but this time using the alternative regularization layer. After training, compare the performance between the two regularization types.


In [None]:
# your code here

### Final Task: Summary and Analysis

Write a summary of your work and the results you obtained. In 3–4 paragraphs, discuss your approach, key findings, and any challenges you encountered. Compare the performance of the two different regularization techniques you implemented, and suggest possible reasons for any differences you observed. Reflect on what you learned and what you might try differently in future experiments.


Good luck!