# Garbage Classification Model

This assignment follows the model proposed in assignment 1 to solve the garbage classification problem. The notebook highlights the model's predictions and the reasoning behind the design decisions.

The data came pre-split with the assignment instructions.
The number of images in the training folder determined the batch size used for training.
The initial number of epochs was set arbitrarily.


To solve this problem, a custom dataset was created and a multimodal model was implemented:
1. Read the data and create NumPy arrays of the images, text, and associated labels.

2. Data pre-processing
- A custom dataset was created. It outputs the images, labels, text, attention masks, and input IDs.
The transformation resizes each image and normalizes it with the ImageNet statistics used to train ResNet.
- A data loader was created.

3. Model created
- The model uses ResNet50 and BERT (the code imports the DistilBERT variant) as the feature extraction backbones.
- A dense layer is added after each backbone so that both branches output a similar number of features. Each output is then normalized and passed through a ReLU activation.
- The text and image features are merged and passed through the final classification layer.
- A `log_softmax` is applied at the end to produce the output.

![Garbage Classification Model](./Assignment2.png)
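The role of the final `log_softmax` can be illustrated without a deep learning library. The sketch below uses made-up scores for the four bins (hypothetical values, not output from this model) and checks two properties: the predicted class is the index of the largest log-probability, and applying log-softmax a second time leaves the values unchanged. The second property is why log-probabilities can be passed to a loss that applies log-softmax internally, as PyTorch's `CrossEntropyLoss` does, without altering the loss value.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum_exp for x in logits]

# Hypothetical raw scores for the four bins (illustrative values only)
logits = [2.0, 0.5, -1.0, 0.1]
log_probs = log_softmax(logits)

# Exponentiating the outputs gives a probability distribution
probs = [math.exp(lp) for lp in log_probs]

# Predicted class = index of the largest log-probability
predicted = max(range(len(log_probs)), key=lambda i: log_probs[i])

# Applying log-softmax again leaves the values unchanged (up to rounding)
twice = log_softmax(log_probs)
```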

4. Training and Validation
- Batch size, learning rate, and number of epochs were the main hyperparameters tuned.
- Early stopping, triggered after 5 iterations with no improvement during training, was introduced after overfitting was observed. The AdamW optimizer with weight decay was used.
- Because the dataset is imbalanced, a weighted cross-entropy loss was used. The class distribution of the training set was:
![Class Distribution](./ClassDistribution.png)
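Two of the training choices in step 4 can be sketched in plain Python. The notebook does not state how the class weights were computed or which quantity early stopping monitored, so the sketch assumes inverse-frequency weights and a monitored validation loss. The function name `inverse_frequency_weights`, the class `EarlyStopping`, and all numbers are hypothetical and chosen for illustration only.

```python
from collections import Counter

def inverse_frequency_weights(labels, num_classes):
    """Weight each class by total / (num_classes * count), so rare classes weigh more.
    Assumes every class appears at least once in `labels`."""
    counts = Counter(labels)
    total = len(labels)
    return [total / (num_classes * counts[c]) for c in range(num_classes)]

class EarlyStopping:
    """Signals a stop after `patience` consecutive epochs without improvement."""
    def __init__(self, patience=5):
        self.patience = patience
        self.best_loss = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's loss; return True when training should stop."""
        if val_loss < self.best_loss:
            self.best_loss = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience

# Hypothetical label list: class 0 is four times as common as class 3
labels = [0] * 40 + [1] * 30 + [2] * 20 + [3] * 10
weights = inverse_frequency_weights(labels, num_classes=4)

# Hypothetical validation losses: best value at index 2, no improvement afterwards
stopper = EarlyStopping(patience=5)
losses = [0.9, 0.7, 0.6, 0.65, 0.62, 0.61, 0.66, 0.64, 0.63]
stopped_at = next(i for i, loss in enumerate(losses) if stopper.step(loss))
```

In PyTorch, weights like these would typically be converted to a tensor and passed through the `weight` argument of `nn.CrossEntropyLoss`.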

This notebook runs the test section of the code with the best model. Test results are reported as accuracy, F1 score, and the confusion matrix.
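For readers unfamiliar with these three metrics, the following is a minimal pure-Python sketch of how they relate to each other. It is not the `metrics_eval` implementation, which is not shown in this notebook. The averaging mode for F1 is not stated in the text, so macro averaging is assumed here, and the labels are hypothetical.

```python
def build_confusion_matrix(y_true, y_pred, num_classes):
    """Rows are actual classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix

def accuracy(matrix):
    """Fraction of samples on the diagonal of the confusion matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

def macro_f1(matrix):
    """Unweighted mean of the per-class F1 scores (assumed averaging mode)."""
    n = len(matrix)
    scores = []
    for c in range(n):
        tp = matrix[c][c]
        fp = sum(matrix[r][c] for r in range(n)) - tp  # predicted c, actually another class
        fn = sum(matrix[c]) - tp                       # actually c, predicted another class
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n

# Hypothetical labels for the four bins (illustrative values only)
y_true = [0, 0, 1, 1, 2, 2, 3, 3]
y_pred = [0, 1, 1, 1, 2, 0, 3, 3]
cm = build_confusion_matrix(y_true, y_pred, num_classes=4)
acc = accuracy(cm)
f1 = macro_f1(cm)
```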

## Experimental Setup


In [1]:
# Importing important functions and modules
# Import necessary functions from python files
# Model, Custom dataset and data extraction function
from Data_and_Model import read_text_files_with_labels, CustomDataset, GarbageModel

# Metrics
from Metrics import metrics_eval

#---------- Importing useful packages --------------#
import torch # pytorch main library
import glob
import torchvision # computer vision utilities
import torchvision.transforms as transforms # transforms used in the pre-processing of the data

from PIL import Image
from torchvision.models import resnet18, resnet50, ResNet50_Weights
from transformers import DistilBertModel, DistilBertTokenizer
from torch.utils.data import Dataset, DataLoader
import torch.nn as nn
import torch.nn.functional as F

import matplotlib.pyplot as plt
import numpy as np

import time
import copy
import os
import re

from sklearn.metrics import confusion_matrix, f1_score, accuracy_score
import seaborn as sns


# Check if GPU is available
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')

# Assuming that we are on a CUDA machine, this should print a CUDA device:
print(device)
##-----------------------------------------------------------------------------------------------------------##

ImportError: cannot import name 'Self' from 'typing_extensions' (C:\Users\Veron\anaconda3\lib\site-packages\typing_extensions.py)

In [3]:
# Set the hyperparameters
batch_size = 256 # Batch size used for prediction
num_workers = 2

# Extract data and create Data Loaders
PRED_PATH = r"./CVPR_2024_dataset_Test" # Prediction images path

torchvision_transform_test = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

tokenizer = DistilBertTokenizer.from_pretrained('distilbert-base-uncased') # BERT Tokenizer for caption text
max_len = 24

# The dataset has been pre-split; only the prediction (test) set is loaded here.
pred_dataset = CustomDataset(PRED_PATH, max_len, tokenizer, transform=torchvision_transform_test)

# Get the data loader for the prediction set
predloader = DataLoader(pred_dataset, batch_size=batch_size, shuffle=False, num_workers=num_workers)

# Number of samples in the prediction set
pred_set_size = len(pred_dataset)

#classes = ('green', 'blue', 'black', 'other')
print("Prediction set:", pred_set_size)

# ##-----------------------------------------------------------------------------------------------------------##

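# Get the first batch from the prediction loader (shuffle=False, so it is not random)
pred_iterator = iter(predloader)
pred_batch = next(pred_iterator)

# Visualizing a sample image from dataset
# Convert the (C, H, W) tensor to an (H, W, C) array and undo the normalization
sample = pred_batch['image'][8].numpy().transpose(1, 2, 0)
sample = sample * np.array([0.229, 0.224, 0.225]) + np.array([0.485, 0.456, 0.406])
plt.figure()
plt.imshow(np.clip(sample, 0, 1)) # Clip to the valid display range
plt.show()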



Prediction set: 3431


OSError: Caught OSError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "c:\Python38\lib\site-packages\PIL\ImageFile.py", line 271, in load
    s = read(self.decodermaxblock)
  File "c:\Python38\lib\site-packages\PIL\PngImagePlugin.py", line 932, in load_read
    cid, pos, length = self.png.read()
  File "c:\Python38\lib\site-packages\PIL\PngImagePlugin.py", line 167, in read
    length = i32(s)
  File "c:\Python38\lib\site-packages\PIL\_binary.py", line 85, in i32be
    return unpack_from(">I", c, o)[0]
struct.error: unpack_from requires a buffer of at least 4 bytes for unpacking 4 bytes at offset 0 (actual buffer size is 0)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "c:\Python38\lib\site-packages\torch\utils\data\_utils\worker.py", line 309, in _worker_loop
    data = fetcher.fetch(index)  # type: ignore[possibly-undefined]
  File "c:\Python38\lib\site-packages\torch\utils\data\_utils\fetch.py", line 52, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "c:\Python38\lib\site-packages\torch\utils\data\_utils\fetch.py", line 52, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "c:\Users\Veron\OneDrive\GitHub\ENEL645\Assignment 2\Submit\Data_and_Model.py", line 81, in __getitem__
    image = Image.open(self.image_paths[idx]).convert('RGB')
  File "c:\Python38\lib\site-packages\PIL\Image.py", line 922, in convert
    self.load()
  File "c:\Python38\lib\site-packages\PIL\ImageFile.py", line 278, in load
    raise OSError(msg) from e
OSError: image file is truncated
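The traceback above shows Pillow's PNG plugin receiving zero bytes where it expected a chunk header, which suggests that at least one image file in the prediction folder is incomplete on disk. One way to locate such files before building the `DataLoader` is a cheap structural check, sketched below with the standard library only. The helper names `is_complete_png` and `find_incomplete_pngs` are hypothetical, and the check is a heuristic: it only confirms the PNG signature at the start and the `IEND` chunk at the end, not the validity of the image data in between.

```python
import os

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
# A complete PNG ends with a zero-length IEND chunk: length, type, CRC
PNG_TRAILER = b"\x00\x00\x00\x00IEND\xaeB`\x82"

def is_complete_png(path):
    """Return True if the file starts with the PNG signature and ends with IEND."""
    size = os.path.getsize(path)
    if size < len(PNG_SIGNATURE) + len(PNG_TRAILER):
        return False
    with open(path, "rb") as f:
        head = f.read(len(PNG_SIGNATURE))
        f.seek(-len(PNG_TRAILER), os.SEEK_END)
        tail = f.read()
    return head == PNG_SIGNATURE and tail == PNG_TRAILER

def find_incomplete_pngs(root):
    """Walk `root` and return the sorted paths of .png files that fail the check."""
    bad = []
    for folder, _, files in os.walk(root):
        for name in files:
            if name.lower().endswith(".png"):
                path = os.path.join(folder, name)
                if not is_complete_png(path):
                    bad.append(path)
    return sorted(bad)
```

Calling `find_incomplete_pngs(PRED_PATH)` would list candidate files to re-download or exclude. An alternative is to set Pillow's `ImageFile.LOAD_TRUNCATED_IMAGES = True`, which loads whatever data is present instead of raising, at the cost of silently passing partial images to the model.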


# Predictions

In [6]:
# Loading the best model and predicting classes
PATH = './garbage_net731.pth' # Path to the saved best model
net = GarbageModel(4, (3,224,224), False)
net.load_state_dict(torch.load(PATH, map_location=device)) # map_location lets a GPU-saved model load on CPU

metrics_eval(net, predloader, device) # Plots the confusion matrix too


RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory
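This error usually means the file at `PATH` is not a complete zip archive. Recent versions of `torch.save` write checkpoints in a zip-based format, so a missing central directory points to an incomplete or corrupted file, for example after an interrupted save or a partial download. The sketch below is a standard-library pre-check that could be run before `torch.load`. The helper name `checkpoint_looks_valid` is hypothetical, and the check does not apply to checkpoints saved in the older non-zip format.

```python
import os
import zipfile

def checkpoint_looks_valid(path):
    """Return (ok, reason) for a checkpoint stored in the zip-based torch.save format."""
    if not os.path.isfile(path):
        return False, "file not found"
    if os.path.getsize(path) == 0:
        return False, "file is empty"
    # is_zipfile looks for the end-of-central-directory record, which a
    # truncated archive no longer contains
    if not zipfile.is_zipfile(path):
        return False, "not a complete zip archive"
    return True, "ok"
```

For this notebook, `checkpoint_looks_valid('./garbage_net731.pth')` returning `(False, ...)` would indicate that the checkpoint needs to be saved or copied again before the prediction cell can run.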

## Plotting the Misclassified Samples

In [None]:
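#PREDICTIONS
# Plot the misclassified samples
MEAN = np.array([0.485, 0.456, 0.406]) # Normalization statistics from the test transform
STD = np.array([0.229, 0.224, 0.225])

# label to word
def bin_name(label):
    return ['BLACK', 'BLUE', 'GREEN', 'OTHER'][label]

def show_data(image, actual, predicted):
    # Convert the (C, H, W) tensor to (H, W, C) and undo the normalization for display
    img = image.cpu().numpy().transpose(1, 2, 0) * STD + MEAN
    plt.imshow(np.clip(img, 0, 1))
    plt.title(' Predicted Bin= ' + bin_name(predicted) + ' Actual Bin= ' + bin_name(actual))

net.to(device)
net.eval() # Disable dropout and batch-norm updates for inference
count = 0
with torch.no_grad():
    for batch in predloader:
        input_ids = batch['input_ids'].to(device)
        attention_mask = batch['attention_mask'].to(device)
        label = batch['label'].to(device)
        images = batch['image'].to(device)
        z = net(images, input_ids, attention_mask)
        _, ypred = torch.max(z, 1)
        # A batch holds many samples, so compare element-wise and loop over the mismatches
        for i in torch.nonzero(ypred != label).flatten().tolist():
            plt.figure()
            show_data(images[i], label[i].item(), ypred[i].item())
            plt.show()
            count += 1
            if count >= 5:
                break
        if count >= 5:
            break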


# References

These were useful in writing the code and implementing the model.

1. CNN MNIST and garbage classification tutorials
2. Introduction to ResNet: https://www.pluralsight.com/resources/blog/guides/introduction-to-resnet
3. Transfer learning tutorial: https://pytorch.org/tutorials/beginner/transfer_learning_tutorial.html
4. Multimodal: https://www.kaggle.com/code/fabraz/image-and-text-multimodal
5. The basics of ResNet50: https://wandb.ai/mostafaibrahim17/ml-articles/reports/The-Basics-of-ResNet50---Vmlldzo2NDkwNDE2#pytorch
6. Custom dataset, looking at ImageFolder: https://dilithjay.com/blog/custom-image-classifier-with-pytorch
7. Handling imbalanced data: https://saturncloud.io/blog/how-to-use-class-weights-with-focal-loss-in-pytorch-for-imbalanced-multiclass-classification/
8. AdamW optimizer: https://www.datacamp.com/tutorial/adamw-optimizer-in-pytorch

