# 2024 Coding Challenge: Automated Object Detection and Counting

**Objective:** Develop a Python-based application that uses computer vision techniques to count and classify different objects from a set of images. The system should be able to differentiate at least three types of objects (e.g., cars, bicycles, pedestrians) in diverse lighting and background conditions. 

**Resources:** 
- http://cocodataset.org/
- https://github.com/cocodataset/cocoapi/tree/master
- https://github.com/ultralytics/yolov5
- https://docs.ultralytics.com/yolov5/tutorials/train_custom_data/#clearml-logging-and-automation-new
- https://github.com/ultralytics/yolov5/blob/master/tutorial.ipynb

## Data Preparation

### 1. Download Data:

Run the download_data.py script to download the dataset:

In [1]:
# run the download_data.py script to download the COCO dataset
# examine 5 images
%run ./../scripts/download_data.py

**Transfer to Google Drive**

In [None]:
!cp -r data/raw/ "/Users/taishajoseph/Library/CloudStorage/GoogleDrive-taishavj28@gmail.com/My Drive/computer-vision-project/data/raw/"

### 2. Prep Data:

1.1. Filter COCO Annotations
- We'll create a Python script to filter the annotations for the classes person, car, and bicycle. 
- In this section, we will update the annotations and copy the relevant images to a new directory.

- *Load the COCO annotations using COCO(input_annotation_file).*

    - Get category IDs for the classes of interest using getCatIds(catNms=categories_to_keep).
    - Get image IDs associated with the filtered categories using getImgIds(catIds=category_ids).

- *Filtering Process:*

    - Iterate over the filtered image IDs.
    - Load image information and corresponding annotations.
    - Append filtered images and annotations to the respective lists.
    - Copy relevant images to the output directory.

- *Saving Filtered Data:*

    - Save the filtered images, annotations, and categories to the output annotation file in COCO format.
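
The steps above can be sketched with the standard library alone. This is not the project's `filter_coco_annotations` (its source is not shown here); the function name `filter_coco_json` and the choice to keep any image containing at least one of the requested classes are assumptions made for illustration. One point worth checking in a pycocotools-based version: to my understanding, `getImgIds(catIds=[...])` with several IDs returns only images that contain all of those categories, so collecting a union may require querying each category separately.

```python
import json
import shutil
from pathlib import Path


def filter_coco_json(in_ann, out_ann, in_img_dir, out_img_dir, keep_names):
    """Hypothetical stand-in for filter_coco_annotations (stdlib only)."""
    with open(in_ann) as f:
        coco = json.load(f)

    # Categories of interest and their COCO ids.
    categories = [c for c in coco["categories"] if c["name"] in keep_names]
    keep_ids = {c["id"] for c in categories}

    # Keep annotations of those categories, then the images they refer to.
    annotations = [a for a in coco["annotations"] if a["category_id"] in keep_ids]
    image_ids = {a["image_id"] for a in annotations}
    images = [im for im in coco["images"] if im["id"] in image_ids]

    # Copy the relevant image files (missing files are skipped).
    out_img_dir = Path(out_img_dir)
    out_img_dir.mkdir(parents=True, exist_ok=True)
    for im in images:
        src = Path(in_img_dir) / im["file_name"]
        if src.exists():
            shutil.copy(src, out_img_dir / im["file_name"])

    # Save the filtered subset in COCO format.
    Path(out_ann).parent.mkdir(parents=True, exist_ok=True)
    with open(out_ann, "w") as f:
        json.dump({"images": images, "annotations": annotations,
                   "categories": categories}, f)
    return len(images), len(annotations)
```

The function returns the number of images and annotations kept, which makes it easy to sanity-check the filtered subset before converting it.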

In [5]:
import os
import shutil
import json
from pycocotools.coco import COCO
from src.data_preprocessing import filter_coco_annotations, convert_coco_to_yolo


In [6]:
categories_to_keep = ['person', 'car', 'bicycle']

filter_coco_annotations(
    './../data/raw/coco_annotations/annotations/instances_train2017.json',
    './../data/filtered/annotations/filtered_instances_train2017.json',
    './../data/raw/coco_train2017/train2017',
    './../data/filtered/images/train2017',
    categories_to_keep
)

loading annotations into memory...
Done (t=31.63s)
creating index...
index created!


In [8]:
filter_coco_annotations(
    './../data/raw/coco_annotations/annotations/instances_val2017.json',
    './../data/filtered/annotations/filtered_instances_val2017.json',
    './../data/raw/coco_val2017/val2017/val2017',
    './../data/filtered/images/val2017',
    categories_to_keep
)

loading annotations into memory...
Done (t=0.77s)
creating index...
index created!


2.1. Organize Dataset Directory

After filtering, we'll organize the dataset directory structure to be compatible with YOLOv5.
At this point, the dataset directory should look like this:

- data/
  - filtered/
    - images/
      - train2017/
      - val2017/
    - annotations/
      - filtered_instances_train2017.json
      - filtered_instances_val2017.json

2.2. Convert COCO Annotations to YOLO Format

We need to convert the filtered COCO annotations to YOLO format (i.e., one .txt label file per image, with one line per object).

In [11]:
convert_coco_to_yolo(
    './../data/filtered/annotations/filtered_instances_train2017.json',
    # 'data/filtered/images/train2017',
    './../data/filtered/labels/train2017'
)

In [12]:
convert_coco_to_yolo(
    './../data/filtered/annotations/filtered_instances_val2017.json',
    # 'data/filtered/images/val2017',
    './../data/filtered/labels/val2017'
)
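
The core of the conversion is a change of box convention: COCO stores `[x_min, y_min, width, height]` in pixels, while a YOLO label line is `class x_center y_center width height` with coordinates normalized by the image size. The sketch below shows only that arithmetic; the helper names and the class order in `CLASS_INDEX` are assumptions, not the contents of `convert_coco_to_yolo`.

```python
def coco_bbox_to_yolo(bbox, img_w, img_h):
    """Convert a COCO box [x_min, y_min, width, height] in pixels to
    YOLO (x_center, y_center, width, height), normalized to 0..1."""
    x_min, y_min, w, h = bbox
    return ((x_min + w / 2) / img_w, (y_min + h / 2) / img_h,
            w / img_w, h / img_h)


def yolo_label_line(class_index, bbox, img_w, img_h):
    """Format one row of a YOLO .txt label file."""
    xc, yc, w, h = coco_bbox_to_yolo(bbox, img_w, img_h)
    return f"{class_index} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}"


# Assumed class order; it must match the order of `names` in the data config.
CLASS_INDEX = {"person": 0, "car": 1, "bicycle": 2}

print(yolo_label_line(CLASS_INDEX["car"], [100, 50, 200, 100], 400, 200))
# prints: 1 0.500000 0.500000 0.500000 0.500000
```

Note that COCO category IDs are not zero-based, so they need to be remapped to contiguous indices starting at 0 before being written to the label files.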

Right now, the dataset directory should look like this:

- data/
  - filtered/
    - images/
      - train2017/
      - val2017/
    - annotations/
      - filtered_instances_train2017.json
      - filtered_instances_val2017.json
    - labels/
      - train2017/
      - val2017/
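
A quick way to confirm the layout above is to check that every image has a label file with the same stem. This is a minimal sketch; the helper name `find_unlabeled_images` and the `.jpg` suffix are assumptions.

```python
from pathlib import Path


def find_unlabeled_images(images_dir, labels_dir, image_suffix=".jpg"):
    """Return image file names that have no matching .txt label file."""
    labels = {p.stem for p in Path(labels_dir).glob("*.txt")}
    return sorted(
        p.name
        for p in Path(images_dir).glob(f"*{image_suffix}")
        if p.stem not in labels
    )
```

For example, `find_unlabeled_images('./../data/filtered/images/val2017', './../data/filtered/labels/val2017')` should return an empty list if the conversion covered every filtered image.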

## **Before moving on to training:**

**In a different directory of your choosing** (e.g. *~/Documents/Projects/*):

- Clone the YOLOv5 repository (https://github.com/ultralytics/yolov5)
- Install the requirements

In [None]:
# !git clone https://github.com/ultralytics/yolov5
# %cd yolov5
# %pip install -r requirements.txt

**YOLOv5 Data Configuration**

- Create a YOLOv5 data configuration file named "coco_person_car_bicycle.yaml"
- Copy the configuration file to the base directory of the YOLOv5 repo, where the train.py script lives
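
The notebook does not show the contents of the configuration file, so the following is only a sketch of what it might contain, generated from Python. The `path`, `train`, `val`, and `names` keys follow the layout of the dataset configs shipped with YOLOv5; the dataset root and the class order are assumptions and must match your own directory and label indices.

```python
def build_data_yaml(dataset_root, class_names):
    """Return the text of a YOLOv5 dataset config for the given classes."""
    lines = [
        f"path: {dataset_root}  # dataset root (assumed location)",
        "train: images/train2017  # relative to path",
        "val: images/val2017  # relative to path",
        "names:",
    ]
    # Class indices must agree with the first column of the label files.
    lines += [f"  {i}: {name}" for i, name in enumerate(class_names)]
    return "\n".join(lines) + "\n"


print(build_data_yaml("/path/to/computer-vision-project/data/filtered",
                      ["person", "car", "bicycle"]))
```

YOLOv5 locates label files by swapping `images` for `labels` in each image path, which is why the labels directory mirrors the images directory.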

In [None]:
# !cp coco_person_car_bicycle.yaml ~/Documents/Projects/yolov5/coco_person_car_bicycle.yaml

### Checkpoint 1: Clear Memory

In [None]:
# Clear variables
%reset -f

# Import necessary libraries
import gc

# Run garbage collector
gc.collect()

## Train (unable to complete locally, see Google Colab version of notebook for remaining steps)

**2 Options for training a model with YOLOv5:**

- Fine-tune a pre-trained model with pre-existing weights
- Train a model from scratch

### Train Model (smallest one is yolov5n) from scratch
Run these steps from the *~/Documents/Projects/yolov5/* directory.

- Navigate to the models folder in the cloned yolov5 repo
- Modify the "yolov5n.yaml" file by changing the number of classes (nc) from 80 to 3
- Adjust other parameters for each layer of the model as needed. (Let's start with the default parameters)
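
If you prefer not to edit the file by hand, the class count can be changed with a small text substitution. This is a sketch under the assumption that the model config contains a single top-level `nc: 80` line; the helper name `set_num_classes` is hypothetical.

```python
import re


def set_num_classes(yaml_text, num_classes):
    """Replace the value of the top-level `nc:` entry, keeping any trailing comment."""
    new_text, count = re.subn(r"(?m)^nc:\s*\d+", f"nc: {num_classes}", yaml_text)
    if count != 1:
        raise ValueError(f"expected exactly one 'nc:' line, found {count}")
    return new_text


sample = "# Parameters\nnc: 80 # number of classes\ndepth_multiple: 0.33\n"
print(set_num_classes(sample, 3))
```

To apply it to the real file, read `models/yolov5n.yaml` into a string, pass it through the function, and write the result back (ideally to a copy, so the original config stays untouched).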

**Define parameters for training:**

- img 640: Use 640x640 images for training.
- batch 16: Use a batch size of 16.
- epochs 50: Train for 50 epochs.
- data coco_person_car_bicycle.yaml: Use custom data configuration file with fewer classes.
- weights '': Train from scratch by not using any pre-trained weights.
- cfg yolov5n.yaml: Use the configuration file for YOLOv5n (the smallest model). --> "models/yolov5n.yaml"
- device 0: Use the first GPU for training.
- project: Directory where results will be saved. ("/Users/taishajoseph/Documents/Projects/computer-vision-project/yolov5n/")
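
The parameters above can be collected into an argument list, which avoids quoting problems (especially with the empty `--weights` value) when launching training from Python. This is a sketch; `build_train_command` is a hypothetical helper, and the clone location in the commented call is a placeholder.

```python
import subprocess
import sys


def build_train_command(data, cfg, project, img=640, batch=16, epochs=50,
                        weights="", device="0"):
    """Assemble the argument list for YOLOv5's train.py."""
    return [
        sys.executable, "train.py",
        "--img", str(img),
        "--batch", str(batch),
        "--epochs", str(epochs),
        "--data", data,
        "--weights", weights,   # empty string: train from scratch
        "--cfg", cfg,
        "--device", device,     # "0" for the first GPU, "cpu" for CPU
        "--project", project,
    ]


cmd = build_train_command("coco_person_car_bicycle.yaml", "yolov5n.yaml",
                          "yolov5n/train", device="cpu")
print(" ".join(cmd))
# To launch it from the cloned repo (placeholder path):
# subprocess.run(cmd, cwd="/path/to/yolov5", check=True)
```

Passing `cwd` to `subprocess.run` makes the working directory explicit, so the command does not depend on where the notebook kernel was started.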

**Recommended:** run in terminal

**If running in Jupyter notebook, be sure to clear the notebook memory first**

In [13]:
import os
#Define parameters for training:
data_config="coco_person_car_bicycle.yaml"
config="yolov5n.yaml"
out_dir = "/Users/taishajoseph/Documents/Projects/computer-vision-project/yolov5n/train/"
os.makedirs(out_dir, exist_ok=True)

In [None]:
# #example run in terminal (don't run):
# cd .. #navigate to computer vision project root directory

# mkdir yolov5n #create yolov5n directory

# mkdir /Users/taishajoseph/Documents/Projects/computer-vision-project/yolov5n/train/ #create train subdirectory

# cd ~/Documents/Projects/yolov5 #navigate to yolov5 project directory

# python train.py --img 640 --batch 16 --epochs 50 --data coco_person_car_bicycle.yaml --weights '' --cfg yolov5n.yaml --device 0 --project /Users/taishajoseph/Documents/Projects/computer-vision-project/yolov5n/train/

In [None]:
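# run inside the notebook from the YOLOv5 directory
# (%cd persists across cells; "!cd" only changes directory in a throwaway subshell)
%cd ~/Documents/Projects/yolov5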
%run train.py --img 640 --batch 16 --epochs 50 --data coco_person_car_bicycle.yaml --weights '' --cfg {config} --device cpu --project {out_dir}

In [None]:
# finetune pre-trained model (don't run --> training model from scratch instead)
# python train.py --img 640 --batch 8 --epochs 10 --data coco_person_car_bicycle.yaml --weights yolov5n.pt --cfg yolov5n.yaml --device cpu --project /Users/taishajoseph/Documents/Projects/computer-vision-project/yolov5n/train/

## Checkpoint 2: Clear Memory

In [None]:
# Clear variables
%reset -f

# Import necessary libraries
import gc

# Run garbage collector
gc.collect()

## Evaluate
Run these steps from the *~/Documents/Projects/yolov5/* directory.

**Recommended:** run in terminal

**If running in Jupyter notebook, be sure to clear the notebook memory first**


**Evaluate the trained model on the validation dataset:**

In [None]:
import os

out_dir = "/Users/taishajoseph/Documents/Projects/computer-vision-project/yolov5n/train/"
os.makedirs(out_dir, exist_ok=True)
data_config="coco_person_car_bicycle.yaml"
wts = os.path.join(out_dir, "exp/weights/best.pt")
wts
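
The `wts` path above assumes the run was saved under `exp`. YOLOv5 increments the run name on repeated runs (`exp`, `exp2`, `exp3`, ...), so a later run may live elsewhere. The sketch below picks the highest-numbered run directory; `latest_run_dir` is a hypothetical helper, not part of YOLOv5.

```python
import re
from pathlib import Path


def latest_run_dir(project_dir, name="exp"):
    """Return the highest-numbered run directory (exp, exp2, exp3, ...) or None."""
    pattern = re.compile(rf"^{re.escape(name)}(\d*)$")
    runs = []
    for p in Path(project_dir).iterdir():
        m = pattern.match(p.name)
        if p.is_dir() and m:
            # A bare "exp" counts as run number 1.
            runs.append((int(m.group(1) or 1), p))
    return max(runs, key=lambda r: r[0])[1] if runs else None
```

With this helper, the weights path could be built as `latest_run_dir(out_dir) / "weights" / "best.pt"` once at least one run exists.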

In [None]:
# !python val.py --data "coco_person_car_bicycle.yaml" --weights "runs/train/exp/weights/best.pt" --cache

In [None]:
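# run inside the notebook from the YOLOv5 directory
# (%cd persists across cells; "!cd" only changes directory in a throwaway subshell)
%cd ~/Documents/Projects/yolov5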
%run val.py --data {data_config} --weights {wts}

## Checkpoint 3: Clear Memory

In [None]:
# Clear variables
%reset -f

# Import necessary libraries
import gc

# Run garbage collector
gc.collect()

## Visualize Results

In [None]:
out_dir = "/Users/taishajoseph/Documents/Projects/computer-vision-project/yolov5n/train/"

In [None]:
import os
# train.py writes each run into <project>/exp by default, next to weights/best.pt
results = os.path.join(out_dir, "exp", "results.csv")
results

In [None]:
# utils.plots is importable only when the working directory is the YOLOv5 repo
%cd ~/Documents/Projects/yolov5
from utils.plots import plot_results

plot_results(results)  # plot 'results.csv' as 'results.png'
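
To read the numbers behind the plot, the CSV can also be parsed directly. This is a sketch using only the standard library; it assumes the file has an `epoch` column and a `metrics/mAP_0.5` column and that header names may be padded with spaces, so check the header of your own `results.csv` before relying on it. `best_epoch` is a hypothetical helper.

```python
import csv


def best_epoch(results_csv, metric="metrics/mAP_0.5"):
    """Return (epoch, value) for the row with the highest `metric`."""
    with open(results_csv, newline="") as f:
        reader = csv.reader(f)
        # Strip padding so column names can be looked up directly.
        header = [h.strip() for h in next(reader)]
        rows = [dict(zip(header, (v.strip() for v in row)))
                for row in reader if row]
    best = max(rows, key=lambda r: float(r[metric]))
    return int(float(best["epoch"])), float(best[metric])
```

Calling `best_epoch(results)` returns the epoch with the highest validation mAP at IoU 0.5 along with its value, which is a quick check on whether 50 epochs were enough.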