# Assignment: Custom Object Detection with YOLO

## Goal
In this assignment, you will create a custom object detection model using YOLO. You will select an object to detect, build a dataset, label the data, train the model, and evaluate its performance. This project will provide hands-on experience in data collection, model training, and evaluation—essential skills in machine learning.

### 1. Select an Object to Detect

#### Choose an Object
Select a simple, easily recognizable object such as an apple, cup, pen, or bicycle.

#### Goal
The goal is to create a dataset that will help YOLO detect this object in images with different backgrounds, angles, and lighting conditions.

#### Tip
Choose an object that you can easily find and photograph, as this will simplify the dataset-building process.

### 2. Collect and Label Data

#### Purpose
The purpose of this step is to create a well-labeled dataset that YOLO can use to learn to detect your chosen object.

#### Instructions
Collect 50–100 images of your object, focusing on variety. Capture images from different angles, with different lighting, and at various distances. Start with this smaller set, adding more images only if needed.

#### Image Variety
- **Angles:** Capture your object from different viewpoints (e.g., front, side, top).
- **Distances:** Include both close-up and distant shots to help YOLO recognize the object at different scales.
- **Lighting:** Take pictures in various lighting conditions (e.g., natural light, low light).
- **Backgrounds:** Use different backgrounds so that YOLO can detect the object regardless of the surroundings.

#### Labeling the Data
Label each image by drawing a bounding box around the object.

##### Tools
Use [LabelImg](https://github.com/tzutalin/labelImg) or [Makesense.ai](https://www.makesense.ai/).

##### How to Label
1. Open each image in the labeling tool.
2. Draw a bounding box around the object.
3. Save the label in YOLO format. Each image should have a corresponding text file with the same name as the image (e.g., `image1.jpg` and `image1.txt`).

##### YOLO Format Explained
The YOLO format is a text file containing the object’s class ID and bounding box coordinates. The class ID tells YOLO which object it is detecting (use `0` if you’re detecting only one object type). Bounding box coordinates specify the box’s position, which YOLO uses to learn object location.

##### Optional - Auto-Labeling Tools
To save time, use auto-labeling tools like [Roboflow](https://roboflow.com/) or [Supervisely](https://supervise.ly/).

###### How Auto-Labeling Works
These tools use pre-trained models to detect common objects and automatically draw bounding boxes around them.

###### Review Needed
Check each auto-labeled image to ensure the boxes are accurate and make small adjustments if necessary.

###### Export in YOLO Format
Both Roboflow and Supervisely allow you to export your labeled dataset in YOLO format.


In [1]:
from google.colab import drive
import sys

# Mount Google Drive
drive.mount('/content/drive')

Mounted at /content/drive


### 3. Organize Your Data

After collecting and labeling your images, it's crucial to organize the dataset so that YOLO can use it effectively.

#### Folder Structure
Create folders to store your images and labels in a way that YOLO can process.

1. **Create two main folders named `images` and `labels`.**
2. **Inside each folder, create subfolders: `train`, `val` (for validation), and `test`.**

#### Data Split
Divide your images into the following proportions:

- **Train (70%):** Used to train the model. This subset should be the largest to provide ample examples for learning.
- **Validation (20%):** Used to tune model parameters. This helps to moderate the training process and prevent overfitting.
- **Test (10%):** Used to test the model’s final performance. This set is crucial for evaluating how well your model generalizes to new, unseen data.

#### Efficiency Tip
For images with similar scenes, you can copy bounding boxes from one image to another and adjust them. This method saves time compared to redrawing each box from scratch. Efficient labeling is key, especially when dealing with large datasets or when you need to make slight adjustments to similar images.


In [2]:
# Step 3: Organize your data
import os
import shutil
from sklearn.model_selection import train_test_split

# Path to the directories where all images and labels are initially stored
images_dir = '/content/drive/My Drive/Colab Notebooks/CV/Images/'
labels_dir = '/content/drive/My Drive/Colab Notebooks/CV/Labels/'

# Get list of filenames without file extension
files = [os.path.splitext(file)[0] for file in os.listdir(images_dir) if file.endswith('.HEIC')]

# Split data
train_files, test_files = train_test_split(files, test_size=0.3, random_state=42)  # 70% training, 30% test
val_files, test_files = train_test_split(test_files, test_size=0.333, random_state=42)  # Of the 30%, split to 20% val, 10% test

# Function to move files
def move_files(files_list, src, dest, folder_type):
    for file in files_list:
        image_src = os.path.join(src, f"{file}.HEIC")
        label_src = os.path.join(labels_dir, f"{file}.txt")

        image_dest = os.path.join(dest, folder_type, 'images', f"{file}.HEIC")
        label_dest = os.path.join(dest, folder_type, 'labels', f"{file}.txt")

        shutil.move(image_src, image_dest)
        shutil.move(label_src, label_dest)

# Define your dataset base directory
base_dir = '/content/drive/My Drive/Colab Notebooks/CV/'

# Create subdirectories
for folder in ['train', 'val', 'test']:
    os.makedirs(os.path.join(base_dir, folder, 'images'), exist_ok=True)
    os.makedirs(os.path.join(base_dir, folder, 'labels'), exist_ok=True)

# Move files to their respective folders
move_files(train_files, images_dir, base_dir, 'train')
move_files(val_files, images_dir, base_dir, 'val')
move_files(test_files, images_dir, base_dir, 'test')


In [4]:
# Clone YOLOv5 and install dependencies
!git clone https://github.com/ultralytics/yolov5  # clone repo
%cd yolov5
!pip install -r requirements.txt  # install dependencies


Cloning into 'yolov5'...
remote: Enumerating objects: 17067, done.[K
remote: Counting objects: 100% (45/45), done.[K
remote: Compressing objects: 100% (33/33), done.[K
Receiving objects: 100% (17067/17067), 15.68 MiB | 399.00 KiB/s, done.
remote: Total 17067 (delta 24), reused 27 (delta 12), pack-reused 17022 (from 1)[K
Resolving deltas: 100% (11714/11714), done.
/content/yolov5
Collecting thop>=0.1.1 (from -r requirements.txt (line 14))
  Downloading thop-0.1.1.post2209072238-py3-none-any.whl.metadata (2.7 kB)
Collecting ultralytics>=8.2.34 (from -r requirements.txt (line 18))
  Downloading ultralytics-8.3.47-py3-none-any.whl.metadata (35 kB)
Collecting ultralytics-thop>=2.0.0 (from ultralytics>=8.2.34->-r requirements.txt (line 18))
  Downloading ultralytics_thop-2.0.12-py3-none-any.whl.metadata (9.4 kB)
Downloading thop-0.1.1.post2209072238-py3-none-any.whl (15 kB)
Downloading ultralytics-8.3.47-py3-none-any.whl (898 kB)
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m 

In [5]:
!pip install pillow_heif


Collecting pillow_heif
  Downloading pillow_heif-0.21.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (9.8 kB)
Downloading pillow_heif-0.21.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.8 MB)
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m7.8/7.8 MB[0m [31m19.2 MB/s[0m eta [36m0:00:00[0m
[?25hInstalling collected packages: pillow_heif
Successfully installed pillow_heif-0.21.0


### Convert .heif Images To .jpeg

In [6]:
import os
from PIL import Image
import pillow_heif

def convert_heic_to_jpeg(heic_folder, jpeg_folder):
    for filename in os.listdir(heic_folder):
        if filename.lower().endswith('.heic'):
            # Load HEIC file
            heif_file = pillow_heif.read_heif(os.path.join(heic_folder, filename))

            # Convert to PIL Image
            image = Image.frombytes(
                heif_file.mode,
                heif_file.size,
                heif_file.data,
                "raw",
                heif_file.mode,
                heif_file.stride,
            )

            # Save as JPEG
            jpeg_path = os.path.join(jpeg_folder, filename[:-5] + '.jpg')
            image.save(jpeg_path, "JPEG")

# Set the paths to your HEIC images and where you want the JPEGs saved
heic_folder = '/content/drive/My Drive/Colab Notebooks/CV/train/images/'
jpeg_folder = '/content/drive/My Drive/Colab Notebooks/CV/train/images_converted/'

# Ensure the target folder exists
os.makedirs(jpeg_folder, exist_ok=True)

# Convert all HEIC images to JPEG
convert_heic_to_jpeg(heic_folder, jpeg_folder)


In [7]:
import os
from PIL import Image
import pillow_heif

def convert_heic_to_jpeg(heic_folder, jpeg_folder):
    for filename in os.listdir(heic_folder):
        if filename.lower().endswith('.heic'):
            # Load HEIC file
            heif_file = pillow_heif.read_heif(os.path.join(heic_folder, filename))

            # Convert to PIL Image
            image = Image.frombytes(
                heif_file.mode,
                heif_file.size,
                heif_file.data,
                "raw",
                heif_file.mode,
                heif_file.stride,
            )

            # Save as JPEG
            jpeg_path = os.path.join(jpeg_folder, filename[:-5] + '.jpg')  # Change file extension from .heic to .jpg
            image.save(jpeg_path, "JPEG")

# Set the paths to your HEIC images and where you want the JPEGs saved
heic_folder = '/content/drive/My Drive/Colab Notebooks/CV/test/images/'
jpeg_folder = '/content/drive/My Drive/Colab Notebooks/CV/test/images_converted/'

# Ensure the target folder exists
os.makedirs(jpeg_folder, exist_ok=True)

# Convert all HEIC images to JPEG
convert_heic_to_jpeg(heic_folder, jpeg_folder)


In [8]:
import os
from PIL import Image
import pillow_heif

def convert_heic_to_jpeg(heic_folder, jpeg_folder):
    for filename in os.listdir(heic_folder):
        if filename.lower().endswith('.heic'):
            # Load HEIC file
            heif_file = pillow_heif.read_heif(os.path.join(heic_folder, filename))

            # Convert to PIL Image
            image = Image.frombytes(
                heif_file.mode,
                heif_file.size,
                heif_file.data,
                "raw",
                heif_file.mode,
                heif_file.stride,
            )

            # Save as JPEG
            jpeg_path = os.path.join(jpeg_folder, filename[:-5] + '.jpg')  # Change file extension from .heic to .jpg
            image.save(jpeg_path, "JPEG")

# Set the paths to your HEIC images and where you want the JPEGs saved
heic_folder = '/content/drive/My Drive/Colab Notebooks/CV/val/images/'
jpeg_folder = '/content/drive/My Drive/Colab Notebooks/CV/val/images_converted/'

# Ensure the target folder exists
os.makedirs(jpeg_folder, exist_ok=True)

# Convert all HEIC images to JPEG
convert_heic_to_jpeg(heic_folder, jpeg_folder)


### Move All The Labels To Stay In The Same Folder As Images

In [9]:
import shutil
import os

# Define the base directory for operations
base_dir = '/content/drive/My Drive/Colab Notebooks/CV/'

# Define the dataset types
dataset_types = ['test', 'train', 'val']

# Loop through each dataset type
for dataset_type in dataset_types:
    # Define source and destination directories for labels
    source_dir = os.path.join(base_dir, dataset_type, 'labels/')
    destination_dir = os.path.join(base_dir, dataset_type, 'images_converted/')

    # Ensure the destination directory exists, create if it doesn't
    os.makedirs(destination_dir, exist_ok=True)

    # List all label files in the source directory
    label_files = [f for f in os.listdir(source_dir) if f.endswith('.txt')]

    # Copy each label file to the destination directory
    for file_name in label_files:
        source_file_path = os.path.join(source_dir, file_name)
        destination_file_path = os.path.join(destination_dir, file_name)
        shutil.copy(source_file_path, destination_file_path)
        print(f'Copied {file_name} from {source_dir} to {destination_dir}')

    print(f'All label files for {dataset_type} have been copied successfully.')

print('Label transfer for all datasets completed successfully.')


Copied IMG_1318.txt from /content/drive/My Drive/Colab Notebooks/CV/test/labels/ to /content/drive/My Drive/Colab Notebooks/CV/test/images_converted/
Copied IMG_1319.txt from /content/drive/My Drive/Colab Notebooks/CV/test/labels/ to /content/drive/My Drive/Colab Notebooks/CV/test/images_converted/
Copied IMG_1322.txt from /content/drive/My Drive/Colab Notebooks/CV/test/labels/ to /content/drive/My Drive/Colab Notebooks/CV/test/images_converted/
Copied IMG_1333.txt from /content/drive/My Drive/Colab Notebooks/CV/test/labels/ to /content/drive/My Drive/Colab Notebooks/CV/test/images_converted/
Copied IMG_1356.txt from /content/drive/My Drive/Colab Notebooks/CV/test/labels/ to /content/drive/My Drive/Colab Notebooks/CV/test/images_converted/
Copied IMG_1373.txt from /content/drive/My Drive/Colab Notebooks/CV/test/labels/ to /content/drive/My Drive/Colab Notebooks/CV/test/images_converted/
All label files for test have been copied successfully.
Copied IMG_1316.txt from /content/drive/My D

### 4. Set Up YOLO and Train Your Model

With your data organized, you can now set up YOLO and begin training your model. This step is crucial as it involves setting up the YOLO environment, configuring the training settings, and initiating the training process on your dataset.

#### Set Up YOLO Environment
- **Use YOLOv5 or YOLOv8** in Python. For environments that support high-performance computing, consider using Google Colab, which provides free GPU access for faster training.

#### Training Settings:


*   Confidence Threshold: 0.5 (this is how sure YOLO needs to be before it confirms an object detection).
* Image Size: 416x416 pixels (YOLO uses this resolution to process images).

* Epochs: 20–50 (each epoch is one complete pass through the training dataset).

#### Train the Model:
Run the training code on your dataset. You will see loss metrics (numbers that show how well the model is learning) as training progresses.

#### Goal:
As training continues, these loss metrics should decrease, which means the model is improving in recognizing your object.


In [11]:
# Step 4: Set Up YOLO and Train Your Model

import torch
from yolov5 import train  # import the training module from YOLOv5

def train_yolov5():
    # Configure the training
    config = '/content/yolov5/data/yolov5_config.yaml'  # Adjust the path based on your actual directory structure
    epochs = 50  # set the number of training epochs
    img_size = 416  # define the image size for training
    batch_size = 16  # determine batch size based on your GPU capacity

    # Start the training process
    train.run(data=config,
              imgsz=img_size,
              batch=batch_size,
              epochs=epochs,
              weights='yolov5s.pt',  # start from pre-trained weights
              cache_images=True)

train_yolov5()  # Call the training function


[34m[1mtrain: [0mweights=yolov5s.pt, cfg=, data=/content/yolov5/data/yolov5_config.yaml, hyp=data/hyps/hyp.scratch-low.yaml, epochs=50, batch_size=16, imgsz=416, rect=False, resume=False, nosave=False, noval=False, noautoanchor=False, noplots=False, evolve=None, evolve_population=data/hyps, resume_evolve=None, bucket=, cache=None, image_weights=False, device=, multi_scale=False, single_cls=False, optimizer=SGD, sync_bn=False, workers=8, project=runs/train, name=exp, exist_ok=False, quad=False, cos_lr=False, label_smoothing=0.0, patience=100, freeze=[0], save_period=-1, seed=0, local_rank=-1, entity=None, upload_dataset=False, bbox_interval=-1, artifact_alias=latest, ndjson_console=False, ndjson_file=False, batch=16, cache_images=True
[34m[1mgithub: [0mup to date with https://github.com/ultralytics/yolov5 ✅
YOLOv5 🚀 v7.0-388-g882c35fc Python-3.10.12 torch-2.5.1+cu121 CUDA:0 (Tesla T4, 15102MiB)

[34m[1mhyperparameters: [0mlr0=0.01, lrf=0.01, momentum=0.937, weight_decay=0.0005,

## 5. Evaluate Model Performance

### Validation Set
- **Dataset**: Use images from the `val` folder (20% of your dataset) that were not seen by the model during training.

### Key Metrics
1. **Precision**:  
   - Definition: The percentage of detections that were correct.  
   - Example: How many of the detected objects were actually your chosen object.
2. **Recall**:  
   - Definition: The percentage of actual objects in the images that the model detected.  
   - Example: How many true objects were found out of all possible objects.

### Observations
- **Challenges**:
  - Did the model struggle with certain backgrounds or lighting conditions?
  - Were small or overlapping objects difficult for the model to detect?
  - Did specific angles affect detection accuracy?

- **Patterns**:
  - Note down recurring trends in model performance (e.g., improved accuracy on plain backgrounds but reduced performance in cluttered scenes).


In [12]:
# Step 5: Evaluate Model Performance

from yolov5 import val  # import the validation module from YOLOv5

def evaluate_yolov5():
    # Set up validation
    config = '/content/yolov5/data/yolov5_config.yaml'  # path to your dataset config file
    weights = '/content/yolov5/runs/train/exp/weights/best.pt'  # path to the trained weights
    img_size = 416  # the image size should be the same as during training
    batch_size = 32  # can be larger if only validating because it requires less GPU memory

    # Run validation
    results = val.run(data=config,
                      weights=weights,
                      imgsz=img_size,
                      batch_size=batch_size,  # Corrected argument here
                      conf_thres=0.5,  # confidence threshold
                      iou_thres=0.5)  # IOU threshold for metrics

    # Print results (precision, recall, mAP, etc.)
    print(results)

evaluate_yolov5()


YOLOv5 🚀 v7.0-388-g882c35fc Python-3.10.12 torch-2.5.1+cu121 CUDA:0 (Tesla T4, 15102MiB)

Fusing layers... 
Model summary: 157 layers, 7012822 parameters, 0 gradients, 15.8 GFLOPs
[34m[1mval: [0mScanning /content/drive/My Drive/Colab Notebooks/CV/val/images_converted.cache... 11 images, 0 backgrounds, 0 corrupt: 100%|██████████| 11/11 [00:00<?, ?it/s]
                 Class     Images  Instances          P          R      mAP50   mAP50-95: 100%|██████████| 1/1 [00:04<00:00,  4.67s/it]
                   all         11         20          1          1      0.995      0.716
Speed: 0.1ms pre-process, 4.0ms inference, 4.0ms NMS per image at shape (32, 3, 416, 416)
Results saved to [1mruns/val/exp[0m


((1.0, 1.0, 0.995, 0.7156004385964911, 0.0, 0.0, 0.0), array([     0.7156]), (0.1110597090287642, 3.958680412986062, 3.9808316664262247))
