In [1]:
!pip install torch torchvision torchaudio
!git clone https://github.com/ultralytics/yolov5
%cd yolov5
!pip install -r requirements.txt
!pip install opencv-python-headless

Cloning into 'yolov5'...
remote: Enumerating objects: 16512, done.
remote: Counting objects: 100% (104/104), done.
remote: Compressing objects: 100% (89/89), done.
remote: Total 16512 (delta 41), reused 49 (delta 15), pack-reused 16408
Receiving objects: 100% (16512/16512), 15.17 MiB | 10.43 MiB/s, done.
Resolving deltas: 100% (11301/11301), done.
/content/yolov5
Collecting gitpython>=3.1.30 (from -r requirements.txt (line 5))
  Downloading GitPython-3.1.42-py3-none-any.whl (195 kB)
Collecting thop>=0.1.1 (from -r requirements.txt (line 14))
  Downloading thop-0.1.1.post2209072238-py3-none-any.whl (15 kB)
Collecting ultralytics>=8.0.232 (from -r requirements.txt (line 18))
  Downloading ultralytics-8.1.24-py3-none-any.whl (719 kB)
Co

Load data from Google Drive to local Colab.

In [2]:
from google.colab import drive
drive.mount('/content/drive')

import zipfile
import os

# Path to the dataset zip file in Google Drive (adjust to your own Drive layout)
zip_path = '/content/drive/MyDrive/Machine Learning Project/training.zip'

# Local Colab folder where the archive is extracted
destination_folder = '/content/Dataset'

# Create destination directory if it does not exist
os.makedirs(destination_folder, exist_ok=True)

# Unzip the file
with zipfile.ZipFile(zip_path, 'r') as zip_ref:
    zip_ref.extractall(destination_folder)

import shutil
from sklearn.model_selection import train_test_split

# Define paths
dataset_dir = '/content/Dataset'
images_dir = os.path.join(dataset_dir, 'images')
labels_dir = os.path.join(dataset_dir, 'labels')

Mounted at /content/drive
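
Before splitting, it can help to confirm that every image in `images/` has a matching label file in `labels/`, since a missing label would make the later `shutil.move` call fail. The following is a minimal sketch, not part of the original notebook: the helper name `find_unpaired` and the list of image extensions are assumptions, and the demonstration runs on a throwaway directory rather than on `/content/Dataset`.

```python
import os
import tempfile

IMAGE_EXTS = {".jpg", ".jpeg", ".png"}  # assumed image extensions


def find_unpaired(images_dir, labels_dir):
    """Return (image stems without a label file, label stems without an image)."""
    image_stems = {
        os.path.splitext(name)[0]
        for name in os.listdir(images_dir)
        if os.path.splitext(name)[1].lower() in IMAGE_EXTS
    }
    label_stems = {
        os.path.splitext(name)[0]
        for name in os.listdir(labels_dir)
        if name.lower().endswith(".txt")
    }
    return sorted(image_stems - label_stems), sorted(label_stems - image_stems)


# Demonstration on a temporary folder with one unlabeled image and one orphan label
with tempfile.TemporaryDirectory() as root:
    img_dir = os.path.join(root, "images")
    lbl_dir = os.path.join(root, "labels")
    os.makedirs(img_dir)
    os.makedirs(lbl_dir)
    for name in ("plan_a.jpg", "plan_b.png"):
        open(os.path.join(img_dir, name), "w").close()
    for name in ("plan_a.txt", "plan_c.txt"):
        open(os.path.join(lbl_dir, name), "w").close()
    missing_labels, orphan_labels = find_unpaired(img_dir, lbl_dir)
    print("images without labels:", missing_labels)
    print("labels without images:", orphan_labels)
```

In the notebook itself, the same check would be called as `find_unpaired(images_dir, labels_dir)` right after the paths are defined.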


# Data Preprocessing Summary

In our object detection model's development, we applied several data preprocessing techniques to enhance performance and ensure robustness:

- **Grayscale to RGB Conversion**: Standardizing all images to RGB format to ensure uniform input.
- **RGBA to RGB Conversion**: Transforming RGBA images to RGB to eliminate transparency channels, aligning image formats.
- **Image Standardization**: Converting images to `uint8` format, aligning with augmentation library requirements.
- **HSV Adjustments**: Randomly altering hue, saturation, and value to mimic different lighting and color settings, improving model generalization.
- **Brightness and Contrast**: Randomly adjusting image brightness and contrast to train the model under varied lighting conditions, enhancing adaptability.


In [3]:
import os
import shutil

import numpy as np
from albumentations import Compose, HueSaturationValue, RandomBrightnessContrast
from skimage import img_as_ubyte, io
from skimage.color import gray2rgb, rgba2rgb


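# Get a sorted list of all image files; skip sub-folders such as train/ and val/
# that exist if this cell has been run before
all_images = sorted(
    f for f in os.listdir(images_dir)
    if os.path.isfile(os.path.join(images_dir, f))
)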

# Split the dataset into training and validation
train_images, val_images = train_test_split(all_images, test_size=0.2, random_state=42)


def augment_and_normalize_image(image):
    # Check if the image is grayscale or RGBA, and convert to RGB if necessary
    if image.ndim == 2:  # Grayscale
        image = gray2rgb(image)
    elif image.ndim == 3 and image.shape[2] == 4:  # RGBA to RGB
        image = rgba2rgb(image)

    # Ensure the image is uint8 before applying Albumentations augmentations
    image = img_as_ubyte(image)

    # Define augmentation pipeline
    transform = Compose([
        HueSaturationValue(hue_shift_limit=20, sat_shift_limit=30, val_shift_limit=20, p=0.5),
        RandomBrightnessContrast(brightness_limit=0.2, contrast_limit=0.2, p=0.5)
    ])

    # Apply transformations
    augmented = transform(image=image)
    image = augmented['image']

    return image

def process_and_move_files(files, source_folder, dest_folder, augment=False):
    for file in files:
        # Process image
        image_path = os.path.join(source_folder, file)
        image = io.imread(image_path)

        # Apply augmentations if specified
        if augment:
            image = augment_and_normalize_image(image)

        # Save the processed image
        io.imsave(os.path.join(dest_folder, file), image)  # Save image to destination folder

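        # Move the corresponding label file (same base name, .txt extension).
        # splitext avoids corrupting names that contain 'jpg' or 'png' elsewhere.
        label_file = os.path.splitext(file)[0] + '.txt'
        label_src = os.path.join(source_folder.replace('images', 'labels'), label_file)
        label_dst = dest_folder.replace('images', 'labels')
        shutil.move(label_src, label_dst)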



# Process and move the files
train_dir = os.path.join(dataset_dir, 'images/train')
val_dir = os.path.join(dataset_dir, 'images/val')
os.makedirs(train_dir, exist_ok=True)
os.makedirs(val_dir, exist_ok=True)
os.makedirs(train_dir.replace('images', 'labels'), exist_ok=True)
os.makedirs(val_dir.replace('images', 'labels'), exist_ok=True)

process_and_move_files(train_images, images_dir, train_dir, augment=True)  # Apply augmentation to training images
process_and_move_files(val_images, images_dir, val_dir)


In [4]:
# Train YOLOv5 on the custom dataset for 50 epochs, starting from the pretrained yolov5s weights
!python train.py --img 640 --batch 16 --epochs 50 --data /content/Dataset/dataset.yaml --weights yolov5s.pt


2024-03-07 13:06:20.975829: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-03-07 13:06:20.975890: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-03-07 13:06:20.977190: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
[34m[1mtrain: [0mweights=yolov5s.pt, cfg=, data=/content/Dataset/dataset.yaml, hyp=data/hyps/hyp.scratch-low.yaml, epochs=50, batch_size=16, imgsz=640, rect=False, resume=False, nosave=False, noval=False, noautoanchor=False, noplots=False, evolve=None, evolve_population=data/hyps, resume_evolve=None, bucket=, cache=None, image_weights=False, device=, multi_sca

Start training the model.
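
Both the training and validation commands read `/content/Dataset/dataset.yaml`, which the notebook does not show. The sketch below generates a file in the layout YOLOv5 expects (`path`, `train`, `val`, `names`), matching the `images/train` and `images/val` folders created earlier. The class names are taken from the report further down, but their index order is an assumption and must match the class ids used in the label files; the helper `build_dataset_yaml` is hypothetical.

```python
# Hypothetical class list: the names come from the report, the index order is an
# assumption and must agree with the class ids in the label .txt files.
CLASS_NAMES = ["room", "window", "door"]


def build_dataset_yaml(root, names):
    """Return YOLOv5 dataset config text for an images/train + images/val layout."""
    lines = [
        f"path: {root}",        # dataset root directory
        "train: images/train",  # training images, relative to path
        "val: images/val",      # validation images, relative to path
        "names:",
    ]
    lines += [f"  {index}: {name}" for index, name in enumerate(names)]
    return "\n".join(lines) + "\n"


yaml_text = build_dataset_yaml("/content/Dataset", CLASS_NAMES)
print(yaml_text)
```

In Colab, the text would then be written out with `open('/content/Dataset/dataset.yaml', 'w').write(yaml_text)` before running `train.py`.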

In [5]:
!python val.py --weights runs/train/exp/weights/best.pt --data /content/Dataset/dataset.yaml --img 640 --task val


[34m[1mval: [0mdata=/content/Dataset/dataset.yaml, weights=['runs/train/exp/weights/best.pt'], batch_size=32, imgsz=640, conf_thres=0.001, iou_thres=0.6, max_det=300, task=val, device=, workers=8, single_cls=False, augment=False, verbose=False, save_txt=False, save_hybrid=False, save_conf=False, save_json=False, project=runs/val, name=exp, exist_ok=False, half=False, dnn=False
YOLOv5 🚀 v7.0-290-gb2ffe055 Python-3.10.12 torch-2.1.0+cu121 CUDA:0 (Tesla T4, 15102MiB)

Fusing layers... 
Model summary: 157 layers, 7018216 parameters, 0 gradients, 15.8 GFLOPs
[34m[1mval: [0mScanning /content/Dataset/labels/val.cache... 537 images, 0 backgrounds, 0 corrupt: 100% 537/537 [00:00<?, ?it/s]
                 Class     Images  Instances          P          R      mAP50   mAP50-95: 100% 17/17 [00:31<00:00,  1.88s/it]
                   all        537      14635       0.88       0.76       0.82      0.586
                  room        537       5404      0.904      0.807      0.858       0.62
 

# Data Preprocessing Impact Analysis on Model Performance

## Overview

This report provides a comparative analysis between the original and preprocessed datasets used in training a deep learning model for object detection. The goal is to evaluate the impact of data preprocessing on the model's performance.


## Data Preprocessing Techniques

In the development of our object detection model, specific data preprocessing techniques were employed to ensure the model's robustness and adaptability to varying input conditions. The following list outlines the techniques applied:

1. **Conversion of Grayscale Images to RGB**: To maintain consistency in input data format, all grayscale images are converted to RGB format. This is crucial as the model is designed to process three-channel RGB images.

2. **Conversion from RGBA to RGB**: Images in RGBA format, containing an alpha channel for transparency, are converted to standard RGB format. This standardization is important to avoid discrepancies in image formats and ensure uniform input to the model.

3. **Image Standardization**: Prior to augmentation, images are standardized to the `uint8` format. This standardization is necessary to align with the expected input format of the augmentation library and maintain consistency across the dataset.

4. **Hue, Saturation, Value Adjustments**: To introduce variability and simulate different lighting and color conditions, the hue, saturation, and value of each training image are randomly shifted (shift limits of 20, 30, and 20 respectively, applied with probability 0.5). This variability helps the model generalize across different environmental settings.

5. **Random Brightness and Contrast Adjustments**: Brightness and contrast are randomly adjusted within a limit of 0.2 each, also with probability 0.5, so that the model sees the same content under varied lighting during training.

Steps 1 to 3 only normalize the image format, while steps 4 and 5 are augmentations. In the notebook, all five are applied to the training split only; validation images are copied unchanged.
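
To make the format-normalization steps concrete, here is a minimal NumPy-only sketch of what the first three steps amount to. It is not the notebook's implementation, which relies on `skimage` (`gray2rgb`, `rgba2rgb`, `img_as_ubyte`); the function name `to_rgb_uint8` is hypothetical, and the sketch assumes that float inputs are already scaled to [0, 1]. Transparent pixels are composited over a white background, which is also the default background of `rgba2rgb`.

```python
import numpy as np


def to_rgb_uint8(image):
    """Return an (H, W, 3) uint8 array from grayscale, RGB, or RGBA input."""
    image = np.asarray(image)
    if image.dtype == np.uint8:
        data = image.astype(np.float64) / 255.0
    else:
        # Assumption: non-uint8 input is float data already scaled to [0, 1]
        data = np.clip(image.astype(np.float64), 0.0, 1.0)

    if data.ndim == 2:
        # Grayscale: repeat the single channel three times
        data = np.stack([data, data, data], axis=-1)
    elif data.ndim == 3 and data.shape[2] == 4:
        # RGBA: alpha-blend the color channels over a white background
        alpha = data[..., 3:4]
        data = alpha * data[..., :3] + (1.0 - alpha) * 1.0

    # Scale back to 0..255 and store as uint8 for the augmentation library
    return np.rint(data * 255.0).astype(np.uint8)


gray_out = to_rgb_uint8(np.array([[0, 255]], dtype=np.uint8))
rgba_out = to_rgb_uint8(
    np.array([[[255, 0, 0, 0], [255, 0, 0, 255]]], dtype=np.uint8)
)
print(gray_out.shape, rgba_out.tolist())
```

The fully transparent red pixel becomes white and the fully opaque one stays red, which illustrates why dropping the alpha channel is a blend rather than a simple slice.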

## Performance Metrics Comparison

### Overall Performance:

- **Precision (P):** Increased from 0.879 to 0.882.
- **Recall (R):** Significantly increased from 0.72 to 0.839.
- **mAP50:** Increased from 0.781 to 0.901.
- **mAP50-95:** Increased from 0.56 to 0.643.

### Performance by Class:

- **Room:**
  - Precision: Remained constant at 0.909.
  - Recall: Increased from 0.773 to 0.892.
  - mAP50: Increased from 0.82 to 0.941.
  - mAP50-95: Increased from 0.592 to 0.679.
- **Window:**
  - Precision: Slightly increased from 0.871 to 0.88.
  - Recall: Increased from 0.691 to 0.803.
  - mAP50: Increased from 0.761 to 0.881.
  - mAP50-95: Increased from 0.463 to 0.53.
- **Door:**
  - Precision: Remained constant at 0.858.
  - Recall: Increased from 0.695 to 0.821.
  - mAP50: Increased from 0.764 to 0.881.
  - mAP50-95: Increased from 0.625 to 0.72.
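
The size of each change is easier to compare as an absolute difference. The short sketch below recomputes the gains from the before/after values listed above; the dictionary layout and the helper name `gains` are illustrative choices, and no figures beyond those in the report are used.

```python
# (before, after) values copied from the lists above
overall = {
    "P": (0.879, 0.882),
    "R": (0.72, 0.839),
    "mAP50": (0.781, 0.901),
    "mAP50-95": (0.56, 0.643),
}
per_class_recall = {
    "Room": (0.773, 0.892),
    "Window": (0.691, 0.803),
    "Door": (0.695, 0.821),
}


def gains(table):
    """Absolute change (after - before) for each entry, rounded to 3 decimals."""
    return {name: round(after - before, 3) for name, (before, after) in table.items()}


overall_gain = gains(overall)
recall_gain = gains(per_class_recall)

for name, delta in overall_gain.items():
    print(f"overall {name}: {delta:+.3f}")
for name, delta in recall_gain.items():
    print(f"{name} recall: {delta:+.3f}")
```

Recall and mAP50 each rise by about 0.12 overall, while Precision moves by only 0.003, and the per-class recall gains are all within about 0.014 of each other.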

## Analysis

### Overall Impact:

After preprocessing, the model's overall performance has seen notable improvements, particularly in terms of Recall and Mean Average Precision (mAP). These improvements suggest that preprocessing helps the model generalize better and more effectively recognize different object classes.

### Performance Variations by Class:

- The **Room** category reached the highest scores after preprocessing (Recall 0.892, mAP50 0.941), with gains of about 0.12 in both metrics, indicating that the model misses fewer rooms.
- The **Window** and **Door** categories improved by similar margins: Recall rose by 0.112 and 0.126 and mAP50 by 0.120 and 0.117, respectively, so the model also misses fewer windows and doors.

### Influencing Factors:

The preprocessing steps include color space adjustments and brightness and contrast adjustments. These adjustments may have helped the model better distinguish between different object features, particularly under varying lighting conditions. Because Precision changed very little while Recall rose, the gain appears to come mainly from detecting objects that were previously missed rather than from producing fewer false positives.

## Conclusion

Data preprocessing improved the model's performance, mainly in Recall (0.72 to 0.839) and mAP50 (0.781 to 0.901), while Precision stayed essentially flat. This suggests that color adjustments and brightness/contrast adjustments help the model generalize, and the gains hold across all three categories (rooms, windows, and doors) rather than being confined to one. Further experimentation, such as different types of image enhancements, could help determine the optimal data preprocessing workflow.
