## <center><b>Object Detection on COVID-19 Dataset</b></center>
<center>Done by: Group 16</center>
<center><br>Aardran Premakumar - 8844491</center>
<center><br> Meenu Ramesh - 8945753</center>

In [1]:
import os
import torch
import pandas as pd
from PIL import Image

### Step 1: Dataset Collection

In [2]:
# Dataset Selection

# Set the path to the dataset and labels CSV files
dataset_path = r"C:\Users\aardr\OneDrive\CONESTOGA\AI & ML\Sem1\Foundations of ML\Project\Datasets\COVID-19 PPE data\dataset"
train_labels_path = r"C:\Users\aardr\OneDrive\CONESTOGA\AI & ML\Sem1\Foundations of ML\Project\Datasets\COVID-19 PPE data\tf_record_files\train_labels.csv"
test_labels_path = r"C:\Users\aardr\OneDrive\CONESTOGA\AI & ML\Sem1\Foundations of ML\Project\Datasets\COVID-19 PPE data\tf_record_files\test_labels.csv"

# Paths for train and test datasets
train_path = os.path.join(dataset_path, "train")
test_path = os.path.join(dataset_path, "test")


# Load labels from CSV files
train_labels_df = pd.read_csv(train_labels_path)
test_labels_df = pd.read_csv(test_labels_path)

### Finding the number of classes

In [5]:
# Find the number of distinct classes in the training labels
num_classes = train_labels_df['class'].nunique()
print("The number of unique classes = ", num_classes)

The number of unique classes =  5


### Preprocessing Images on train and test data

In [7]:
import numpy as np

def preprocess_images(image_path, labels):
    # Load the image, resize it to 640x640, and convert it to a tensor
    image = Image.open(image_path)
    image = image.resize((640, 640))
    image = np.array(image)  # shape [H, W, C], or [H, W] for single-channel images
    image = torch.tensor(image)
    # Single-channel images have no channel axis, so add one at the end
    if len(image.shape) < 3:
        image = image.unsqueeze(-1)  # [H, W] -> [H, W, 1]

    image = image.permute(2, 0, 1)  # [H, W, C] -> [C, H, W]
    image = image.float()
    image = image / 255.0  # scale pixel values to [0, 1]
    # Collect the label rows whose first column equals image_path
    bounding_boxes = []
    for label in labels:
        if label[0] == image_path:
            bounding_boxes.append(label[1:])
    return image, bounding_boxes

# preprocess train and test images
train_images = []
train_bounding_boxes = []
for image_path in train_labels_df["filename"]:
    image, bounding_boxes = preprocess_images(os.path.join(train_path, image_path), train_labels_df.values)
    train_images.append(image)
    train_bounding_boxes.append(bounding_boxes)

test_images = []
test_bounding_boxes = []
for image_path in test_labels_df["filename"]:
    image, bounding_boxes = preprocess_images(os.path.join(test_path, image_path), test_labels_df.values)
    test_images.append(image)
    test_bounding_boxes.append(bounding_boxes)
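
`preprocess_images` compares each label row's first column with the full image path and does not rescale the boxes after resizing the image. If that column holds the bare filename, as `train_labels_df["filename"]` suggests, the comparison would not match and the box lists would stay empty. The sketch below shows one possible alternative: group the label rows by filename and scale each box to the 640x640 image. The column names (`width`, `height`, `xmin`, `ymin`, `xmax`, `ymax`), the helper name `boxes_by_filename`, and the demo table are assumptions for illustration, not taken from the dataset.

```python
import pandas as pd


def boxes_by_filename(labels_df, target_size=640):
    """Map each filename to its boxes, scaled to a target_size x target_size image."""
    boxes = {}
    for filename, group in labels_df.groupby("filename"):
        scaled = []
        for row in group.itertuples(index=False):
            sx = target_size / row.width   # horizontal scale factor
            sy = target_size / row.height  # vertical scale factor
            scaled.append([row.xmin * sx, row.ymin * sy, row.xmax * sx, row.ymax * sy])
        boxes[filename] = scaled
    return boxes


# Hypothetical label table with the assumed column layout
demo_df = pd.DataFrame({
    "filename": ["a.jpg", "a.jpg", "b.jpg"],
    "width": [1280, 1280, 320],
    "height": [640, 640, 320],
    "class": ["Mask", "Vest", "Mask"],
    "xmin": [100, 640, 0],
    "ymin": [50, 320, 0],
    "xmax": [300, 1280, 160],
    "ymax": [200, 640, 160],
})

demo_boxes = boxes_by_filename(demo_df)
print(demo_boxes["a.jpg"])
```

Looking boxes up by filename avoids scanning every label row for every image, and scaling by `640 / width` and `640 / height` keeps the boxes aligned with the resized pixels.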



### Loading Model (YOLO)

In [8]:
import torch
# Load the pre-trained YOLOv5 model
model_yaml = r'C:\Users\aardr\OneDrive\CONESTOGA\AI & ML\Sem1\Foundations of ML\Project\yolov5\models\yolov5s.yaml'

model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True) # yolov5s is the small variant (yolov5n is smaller)


# Reference: https://github.com/ultralytics/yolov5

Using cache found in C:\Users\aardr/.cache\torch\hub\ultralytics_yolov5_master
YOLOv5  2023-12-14 Python-3.10.5 torch-2.1.1+cpu CPU

Fusing layers... 
YOLOv5s summary: 213 layers, 7225885 parameters, 0 gradients, 16.4 GFLOPs
Adding AutoShape... 


### Customizing the YOLO Model

In [9]:
# customizing the model
my_model = model.model

In [10]:
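num_classes = 5  # number of classes in the dataset
# Setting these attributes only relabels the loaded model; the pretrained
# detection head is not rebuilt, so it still predicts the 80 COCO classes.
my_model.nc = num_classes
my_model.names = ['Hard Hat', 'Mask', 'Vest', 'Boots', 'Gloves']  # class names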
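
The size of a YOLOv5 prediction tensor follows from the input size and the class count, which helps when checking whether a head has really been adapted to a new dataset. The sketch below is plain arithmetic, assuming the default yolov5s layout of three detection scales (strides 8, 16, 32) with three anchors per grid cell. The function name `yolov5_output_shape` is made up for this example.

```python
def yolov5_output_shape(image_size, num_classes, strides=(8, 16, 32), anchors_per_cell=3):
    """Return (num_predictions, values_per_prediction) for a square input image."""
    # Each detection scale has (image_size / stride)^2 grid cells,
    # and each cell predicts one box per anchor.
    num_predictions = sum(
        anchors_per_cell * (image_size // stride) ** 2 for stride in strides
    )
    # 4 box coordinates + 1 objectness score + one score per class
    values_per_prediction = 4 + 1 + num_classes
    return num_predictions, values_per_prediction


print(yolov5_output_shape(640, 80))  # COCO-pretrained head -> (25200, 85)
print(yolov5_output_shape(640, 5))   # head sized for 5 classes -> (25200, 10)
```

An output whose last dimension is 85 therefore corresponds to 80 classes. A head built for the five classes here would produce 10 values per prediction.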


### Freezing some layers

In [11]:
# Freeze the model: this loop disables gradients for every parameter, not just the first few layers
for param in my_model.parameters():
    param.requires_grad = False
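
To freeze only the early layers and leave the rest trainable, one option is to disable gradients by layer index and pass only the trainable parameters to the optimizer. The sketch below uses a small stand-in `nn.Sequential` rather than the YOLOv5 model, because the layer layout of the loaded model is not shown in this notebook. `toy_model` and `num_frozen` are hypothetical names chosen for illustration.

```python
import torch
from torch import nn

# Small stand-in network (not YOLOv5) with five top-level modules
toy_model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(8, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(16, 10, kernel_size=1),
)

num_frozen = 2  # hypothetical choice: freeze modules 0 and 1
for index, layer in enumerate(toy_model):
    if index < num_frozen:
        for param in layer.parameters():
            param.requires_grad = False

# Give the optimizer only the parameters that can still be updated
trainable = [p for p in toy_model.parameters() if p.requires_grad]
toy_optimizer = torch.optim.Adam(trainable, lr=0.001)

frozen_count = sum(1 for p in toy_model.parameters() if not p.requires_grad)
print(f"frozen tensors: {frozen_count}, trainable tensors: {len(trainable)}")
```

If every parameter is frozen, no gradients are computed and the optimizer has nothing to update, so at least the final layers need `requires_grad=True` for fine-tuning to have any effect.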

In [12]:
# hyperparameters
learning_rate = 0.001
batch_size = 32
epochs = 10

# define optimizer and loss function
optimizer = torch.optim.Adam(my_model.parameters(), lr=learning_rate)
criterion = torch.nn.MSELoss()



### Training the model

In [13]:
# Note: The training cell raised a RuntimeError, so this helper ensures that every image has 3 channels before it is resized.

import torch.nn.functional as F
def resize_image(image):
    # Ensure image has 3 channels (RGB)
    if image.shape[0] == 4:  # If the image has 4 channels (e.g., RGBA)
        image = image[:3, :, :]  # Keep only the first 3 channels (RGB)
    elif image.shape[0] == 1:  # If the image has 1 channel (e.g., grayscale)
        image = image.repeat(3, 1, 1)  # Repeat the channel to create a 3-channel image

    # Resize the image
    resized_image = F.interpolate(image.unsqueeze(0), size=(640, 640), mode='bilinear', align_corners=False).squeeze(0)
    return resized_image


In [12]:
import torch.nn.functional as F

for epoch in range(epochs):
    print(f"Epoch {epoch + 1}\n-------------------------------")
    for i in range(0, len(train_images), batch_size):
        batch_images = train_images[i:i+batch_size]
        
        loss = torch.tensor(0, requires_grad=True, dtype=torch.float32)  # Initialize loss as a tensor
        
        batch_bounding_boxes = train_bounding_boxes[i:i+batch_size]
        # Keep at most the first three label rows for each image
        batch_bounding_boxes = [boxes[:3] for boxes in batch_bounding_boxes]
        print(f"Batch Images Shape: {torch.stack(batch_images).shape}")

        optimizer.zero_grad()  # Clear the gradients
        
        if batch_bounding_boxes:  # Check if bounding_boxes is not empty
            # Resize images to ensure they have the same dimensions
            resized_images = []
            for image in batch_images:
                resized_image = resize_image(image)
                resized_images.append(resized_image)
            resized_images.append(resized_image)
            print(f"Resized Image Shape: {resized_image.shape}")  # Debug print
            resized_images = torch.stack(resized_images)  # Stack the resized images

            outputs = my_model(resized_images)
            print("O/P shape", outputs.shape)

            for output, bounding_boxes in zip(outputs, batch_bounding_boxes):
                if len(bounding_boxes) > 0:  # Check if bounding_boxes is not empty
                    loss += criterion(output, torch.tensor(bounding_boxes, dtype=torch.float32))

            loss.backward()  # Backpropagation
            optimizer.step()  # Update the weights

            print(f"Loss: {loss.item():.4f}")


# Reference: https://www.cs.toronto.edu/~lczhang/360/lec/w02/training.html

Epoch 1
-------------------------------
Batch Images Shape: torch.Size([32, 3, 640, 640])
Resized Image Shape: torch.Size([3, 640, 640])
O/P shape torch.Size([33, 25200, 85])
Loss: 0.0000
Batch Images Shape: torch.Size([32, 3, 640, 640])
Resized Image Shape: torch.Size([3, 640, 640])
O/P shape torch.Size([33, 25200, 85])
Loss: 0.0000
Batch Images Shape: torch.Size([32, 3, 640, 640])
Resized Image Shape: torch.Size([3, 640, 640])
O/P shape torch.Size([33, 25200, 85])
Loss: 0.0000
Batch Images Shape: torch.Size([32, 3, 640, 640])
Resized Image Shape: torch.Size([3, 640, 640])


Note: Training could not be completed because the loop raises the following error.

`RuntimeError: stack expects each tensor to be equal size, but got [3, 640, 640] at entry 0 and [4, 640, 640] at entry 14`

We were stuck on this error and could not debug it.
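
The error message reports a 4-channel tensor at entry 14, which is the shape an RGBA image would have after `preprocess_images`. In the training loop, the debug print calls `torch.stack(batch_images)` before `resize_image` has removed the extra channel, so a batch that mixes RGB and RGBA images cannot be stacked at that point. One possible remedy, sketched here as an untested assumption rather than a verified fix for this notebook, is to convert every image to RGB when it is loaded. The helper name `load_rgb_tensor` and the in-memory demo images are made up for illustration.

```python
import numpy as np
import torch
from PIL import Image


def load_rgb_tensor(image, size=(640, 640)):
    """Convert a PIL image of any mode to a float tensor of shape [3, H, W] in [0, 1]."""
    rgb = image.convert("RGB").resize(size)            # drops alpha, expands grayscale
    array = np.asarray(rgb, dtype=np.float32) / 255.0  # [H, W, 3]
    return torch.from_numpy(array).permute(2, 0, 1)    # [3, H, W]


# In-memory demo images with different modes and sizes
demo_images = [
    Image.new("RGB", (100, 80)),
    Image.new("RGBA", (64, 64)),
    Image.new("L", (50, 120)),
]

batch = torch.stack([load_rgb_tensor(img, size=(64, 64)) for img in demo_images])
print(batch.shape)  # torch.Size([3, 3, 64, 64])
```

Once every tensor has the same number of channels, `torch.stack` can build the batch without a separate channel-fixing step.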

### Conclusion

This project explored YOLO (You Only Look Once) for object detection. We started out excited because YOLO is known for being fast and accurate at detecting objects, but we soon ran into a series of problems that slowed our progress. Fixing one issue often exposed another, which showed us how demanding object detection work can be. Although we did not get the results we wanted, we learned a lot about the practical challenges of object detection, which prepares us for further work in this area.