# NAME: RAJA HAIDER ALI
# CMS ID: 346900
# GROUP: 2

# Lab 8: Object Detection

Object detection is a fundamental task in computer vision that involves identifying and locating objects of interest within images or video frames. It plays a crucial role in various real-world applications, such as autonomous vehicles, video surveillance, image-based search, robotics, medical image analysis, and augmented reality. By accurately detecting and localizing objects, computer vision systems can better understand the visual world and make informed decisions based on the detected objects' positions, sizes, and relationships.

# Dataset

We will use the Vehicles-OpenImages dataset from Roboflow. The dataset contains images of vehicles in varied traffic conditions and scenes, collected from the Open Images dataset. It contains 5 classes in total: Car, Bus, Motorcycle, Truck, and Ambulance.


### Download and unzip the Dataset

For each of the train, validation, and test set folders within the dataset, you need to create two subfolders:

1. "images": This folder should contain all the images corresponding to the specific set (train, validation, or test).
2. "labels": This folder should have a ".txt" file for every image in the "images" folder. Each label file will include one line per bounding box, with the bounding box information in the YOLO format as follows:

object_class  x  y  width  height

* "object_class": This is an integer representing the object's class. Start with 0 for the first class and increment by 1 for each additional unique class in the dataset.
* "x" and "y": These values represent the normalized coordinates of the bounding box's center, with respect to the image width and height, respectively. Ensure that the coordinates are within the range of [0, 1].
* "width" and "height": These values represent the normalized width and height of the bounding box, with respect to the image width and height, respectively. Ensure that these dimensions are within the range of [0, 1].
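As an illustration of the label format described above, the following minimal sketch converts a bounding box given in pixel corner coordinates into one YOLO label line. The function name `to_yolo_line` and the example box are hypothetical and not part of the dataset tooling.

```python
def to_yolo_line(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a pixel-space corner box to one YOLO label line."""
    # Box centre and size in pixels.
    x_c = (x_min + x_max) / 2.0
    y_c = (y_min + y_max) / 2.0
    w = x_max - x_min
    h = y_max - y_min
    # Normalise by the image dimensions so every value lies in [0, 1].
    values = (x_c / img_w, y_c / img_h, w / img_w, h / img_h)
    if not all(0.0 <= v <= 1.0 for v in values):
        raise ValueError("box lies outside the image")
    return f"{class_id} " + " ".join(f"{v:.6f}" for v in values)


# A 200x100 px box with its top-left corner at (100, 50) in a 640x480 image.
print(to_yolo_line(2, 100, 50, 300, 150, 640, 480))
# prints: 2 0.312500 0.208333 0.312500 0.208333
```

Each line of a label file would be produced this way, one line per bounding box in the image.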

In [None]:
!gdown https://drive.google.com/file/d/16BZkCbPQNUwTEOYXsA2cPavUPQys0Acr/view?usp=sharing --fuzzy

Downloading...
From: https://drive.google.com/uc?id=16BZkCbPQNUwTEOYXsA2cPavUPQys0Acr
To: /content/Vehicles-OpenImages.zip
100% 40.3M/40.3M [00:00<00:00, 198MB/s]


In [None]:
!unzip /content/Vehicles-OpenImages.zip -d ./vehicles

Archive:  /content/Vehicles-OpenImages.zip
replace ./vehicles/README.dataset.txt? [y]es, [n]o, [A]ll, [N]one, [r]ename: A
 extracting: ./vehicles/README.dataset.txt  
 extracting: ./vehicles/README.roboflow.txt  
 extracting: ./vehicles/data.yaml    
 extracting: ./vehicles/test/images/00dea1edf14f09ab_jpg.rf.3f17c8790a68659d03b1939a59ccda80.jpg  
 extracting: ./vehicles/test/images/00dea1edf14f09ab_jpg.rf.KJ730oDTFPdXdJxvSLnX.jpg  
 extracting: ./vehicles/test/images/00e481ea1a520175_jpg.rf.6e6a8b3b45c9a11d106958f88ff714ea.jpg  
 extracting: ./vehicles/test/images/00e481ea1a520175_jpg.rf.MV6sZ8QCFwFeMYaI2tHm.jpg  
 extracting: ./vehicles/test/images/08c8b73e0c2e296e_jpg.rf.7IkYAamjZhnwsoXSrwKt.jpg  
 extracting: ./vehicles/test/images/08c8b73e0c2e296e_jpg.rf.effa65856584463c08848031cab357b9.jpg  
 extracting: ./vehicles/test/images/10c26c6598677a1f_jpg.rf.USCbBYVcUICkLhuq07Lw.jpg  
 extracting: ./vehicles/test/images/10c26c6598677a1f_jpg.rf.f72b2b91e750909f68fffeee777e9350.jpg  
 extr

# Annotation
To annotate your own dataset, use the LabelImg tool.

# YOLOv5

Clone the YOLOv5 Repository

In [None]:
!git clone https://github.com/ultralytics/yolov5

fatal: destination path 'yolov5' already exists and is not an empty directory.


Install the dependencies from the requirements.txt file

In [None]:
!pip install -r yolov5/requirements.txt



## Training Options
* img: Size of the input image, which is square. The original image is resized while maintaining its aspect ratio: the longer side is resized to this number and the shorter side is padded with grey.

* batch: The batch size

* epochs: Number of epochs to train for

* data: Data YAML file that contains information about the dataset (path of images, labels)

* workers: Number of CPU workers

* cfg: Model architecture. There are 4 choices available: yolov5s.yaml, yolov5m.yaml, yolov5l.yaml, yolov5x.yaml. The size and complexity of these models increase in that order, so you can choose a model that suits the complexity of your object detection task. If you want to work with a custom architecture, you will have to define a YAML file in the models folder specifying the network architecture.

* weights: Pretrained weights you want to start training from. If you want to train from scratch, use --weights ''

* name: Name of the training run. Outputs such as training logs and weights are stored in a folder named runs/train/name

* hyp: YAML file that describes hyperparameter choices. If unspecified, a default file is used; the training log in this notebook reports it as yolov5/data/hyps/hyp.scratch-low.yaml, which also serves as an example of how to define hyperparameters.
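The resizing rule described for the img option can be sketched as follows. This is a simplified illustration of the description above (scale the longer side to the target size, pad the shorter side), not YOLOv5's own resizing routine; the function name `letterbox_shape` is hypothetical.

```python
def letterbox_shape(width, height, img_size=640):
    """Return the resized (w, h) and the total (x, y) padding to reach a square."""
    # Scale so that the longer side becomes exactly img_size.
    scale = img_size / max(width, height)
    new_w = round(width * scale)
    new_h = round(height * scale)
    # Whatever is missing on the shorter side is filled with grey padding.
    pad_x = img_size - new_w
    pad_y = img_size - new_h
    return (new_w, new_h), (pad_x, pad_y)


# A 1280x720 frame resized for --img 640.
print(letterbox_shape(1280, 720))
# prints: ((640, 360), (0, 280))
```

In this example the image keeps its 16:9 aspect ratio, and 280 rows of padding in total are added to make it square.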

## Data Config File

Details for the dataset you want to train your model on are defined by the data config YAML file. The following parameters have to be defined in a data config file:

* train, test, and val: Locations of train, test, and validation images.

* nc: Number of classes in the dataset.

* names: Names of the classes in the dataset. The index of each class in this list is used as its identifier in the code.
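To make the parameters above concrete, here is a minimal sketch that renders the text of a data config file. The helper `render_data_config` is hypothetical, and both the train path and the class order shown are assumptions for illustration; check the data.yaml shipped with the dataset for the real values.

```python
def render_data_config(train, val, test, names):
    """Build the text of a YOLO data config from split paths and class names."""
    lines = [
        f"train: {train}",
        f"val: {val}",
        f"test: {test}",
        f"nc: {len(names)}",  # number of classes follows from the names list
        "names: [" + ", ".join(f"'{n}'" for n in names) + "]",
    ]
    return "\n".join(lines) + "\n"


# Assumed paths and class order, for illustration only.
config_text = render_data_config(
    "/content/vehicles/train/images",
    "/content/vehicles/valid/images",
    "/content/vehicles/test/images",
    ["Ambulance", "Bus", "Car", "Motorcycle", "Truck"],
)
print(config_text)
```

With this class list, a label line starting with 0 would refer to Ambulance and one starting with 4 to Truck.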

## Hyperparameter Config File
The hyperparameter config file defines the hyperparameters for our neural network. We are going to use the default one; the training log in this notebook reports it as yolov5/data/hyps/hyp.scratch-low.yaml.

## Custom Network Architecture
YOLOv5 also allows you to define your own custom architecture and anchors if none of the predefined networks fits your task. For this you will have to define a custom model config file. For this example, we use yolov5s.yaml.

## Train Model from pretrained model
The models range from nano to large as follows:


* yolov5n
* yolov5s
* yolov5m
* yolov5l



### Important Note
Please provide absolute paths for the train and validation datasets in the "data.yaml" file of your dataset.
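One way to apply this note without editing the file by hand is sketched below. The helper `absolutize_splits` is hypothetical; it assumes the config uses simple `key: value` lines and that relative entries are relative to the dataset folder, which may not match every exported data.yaml.

```python
from pathlib import Path


def absolutize_splits(yaml_text, dataset_root, keys=("train", "val", "test")):
    """Rewrite the train/val/test entries of a data config as absolute paths."""
    root = Path(dataset_root).resolve()
    out = []
    for line in yaml_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() in keys and value.strip():
            path = Path(value.strip().strip("'\""))
            # Assumption: relative entries are relative to the dataset root.
            if not path.is_absolute():
                path = (root / path).resolve()
            line = f"{key}: {path}"
        out.append(line)
    return "\n".join(out) + "\n"


example = "train: train/images\nval: valid/images\nnc: 5\n"
print(absolutize_splits(example, "/content/vehicles"))

# To update the real file (path taken from the training command):
# cfg = Path("./vehicles/data.yaml")
# cfg.write_text(absolutize_splits(cfg.read_text(), "./vehicles"))
```

Lines that are not split entries, such as nc and names, pass through unchanged.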

To train the model we use the "train.py" file and provide various arguments listed above.

In [None]:
!python ./yolov5/train.py --img 640  --batch 32 --epochs 10 --data ./vehicles/data.yaml --weights 'yolov5n.pt' --name vehicles

2023-11-02 10:34:56.624718: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2023-11-02 10:34:56.624772: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2023-11-02 10:34:56.624815: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
[34m[1mtrain: [0mweights=yolov5n.pt, cfg=, data=./vehicles/data.yaml, hyp=yolov5/data/hyps/hyp.scratch-low.yaml, epochs=10, batch_size=32, imgsz=640, rect=False, resume=False, nosave=False, noval=False, noautoanchor=False, noplots=False, evolve=None, bucket=, cache=None, image_weights=False, device=, multi_scale=False, single_cls=False, optimizer=SGD, sync_

## Perform validation on test/val set

To validate your model on the test set, change the "val" path in your "data.yaml" file to the test set directory. To validate on the validation set again, leave the "val" entry pointing at the validation path.

In [None]:
!python ./yolov5/val.py --img 640  --batch 32 --data ./vehicles/data.yaml --weights '/content/yolov5/runs/train/vehicles8/weights/best.pt' --name vehicles

[34m[1mval: [0mdata=./vehicles/data.yaml, weights=['/content/yolov5/runs/train/vehicles8/weights/best.pt'], batch_size=32, imgsz=640, conf_thres=0.001, iou_thres=0.6, max_det=300, task=val, device=, workers=8, single_cls=False, augment=False, verbose=False, save_txt=False, save_hybrid=False, save_conf=False, save_json=False, project=yolov5/runs/val, name=vehicles, exist_ok=False, half=False, dnn=False
YOLOv5 🚀 v7.0-231-gc2f131a Python-3.10.12 torch-2.1.0+cu118 CUDA:0 (Tesla T4, 15102MiB)

Fusing layers... 
Model summary: 157 layers, 1765930 parameters, 0 gradients, 4.1 GFLOPs
[34m[1mval: [0mScanning /content/vehicles/valid/labels.cache... 250 images, 0 backgrounds, 0 corrupt: 100% 250/250 [00:00<?, ?it/s]
                 Class     Images  Instances          P          R      mAP50   mAP50-95: 100% 8/8 [00:06<00:00,  1.26it/s]
                   all        250        454       0.49      0.496      0.482       0.31
             Ambulance        250         64      0.558      0.632

## Perform inference using trained model

To perform inference with our models, we use the "detect.py" file. Given a source path to a folder, a video, or a single image, it runs detection on all the images in the folder, the complete video, or the single image.

In [None]:
!python ./yolov5/detect.py --weights '/content/yolov5/runs/train/vehicles8/weights/best.pt' --source /content/vehicles/test/images

[34m[1mdetect: [0mweights=['/content/yolov5/runs/train/vehicles8/weights/best.pt'], source=/content/vehicles/test/images, data=yolov5/data/coco128.yaml, imgsz=[640, 640], conf_thres=0.25, iou_thres=0.45, max_det=1000, device=, view_img=False, save_txt=False, save_csv=False, save_conf=False, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=yolov5/runs/detect, name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, half=False, dnn=False, vid_stride=1
YOLOv5 🚀 v7.0-231-gc2f131a Python-3.10.12 torch-2.1.0+cu118 CUDA:0 (Tesla T4, 15102MiB)

Fusing layers... 
Model summary: 157 layers, 1765930 parameters, 0 gradients, 4.1 GFLOPs
image 1/126 /content/vehicles/test/images/00dea1edf14f09ab_jpg.rf.3f17c8790a68659d03b1939a59ccda80.jpg: 640x640 1 Ambulance, 1 Truck, 6.4ms
image 2/126 /content/vehicles/test/images/00dea1edf14f09ab_jpg.rf.KJ730oDTFPdXdJxvSLnX.jpg: 640x640 1 Ambulance, 1 Truck, 6.3ms
imag

# YOLOv8

YOLOv8 is available as a Python package for ease of use. You can use TensorBoard with each of the YOLO models given in this repo; you only have to provide the path of the directory that contains the training experiments. In the cell below, we create the experiment directories ourselves before training so that TensorBoard can be started first and track model learning in real time.

In [None]:
!mkdir runs2
!mkdir runs2/detect

mkdir: cannot create directory ‘runs2’: File exists
mkdir: cannot create directory ‘runs2/detect’: File exists


In [None]:
import tensorboard
%load_ext tensorboard
%tensorboard --logdir /content/runs/detect
# Don't worry if TensorBoard is not working for you. You may ignore this error and move on to the next cells.

<IPython.core.display.Javascript object>

In [None]:
# This is the ultralytics library for using Ultralytics' various tools for object detection.
!pip install ultralytics



## Train YOLOv8 Model
YOLOv8 can be trained, validated, and used for inference through either the CLI or its Python API.

1. Start model training of nano model from scratch using command line:

In [None]:
!yolo task=detect mode=train model=yolov8n.yaml imgsz=640 data=/content/vehicles/data.yaml epochs=10 batch=8 name=yolov8n_scratch


                   from  n    params  module                                       arguments                     
  0                  -1  1       464  ultralytics.nn.modules.conv.Conv             [3, 16, 3, 2]                 
  1                  -1  1      4672  ultralytics.nn.modules.conv.Conv             [16, 32, 3, 2]                
  2                  -1  1      7360  ultralytics.nn.modules.block.C2f             [32, 32, 1, True]             
  3                  -1  1     18560  ultralytics.nn.modules.conv.Conv             [32, 64, 3, 2]                
  4                  -1  2     49664  ultralytics.nn.modules.block.C2f             [64, 64, 2, True]             
  5                  -1  1     73984  ultralytics.nn.modules.conv.Conv             [64, 128, 3, 2]               
  6                  -1  2    197632  ultralytics.nn.modules.block.C2f             [128, 128, 2, True]           
  7                  -1  1    295424  ultralytics.nn.modules.conv.Conv             [128

2. Start training by using the "ultralytics" package in python.

In [None]:
from ultralytics import YOLO
# Load the model.
model = YOLO('yolov8n.pt')

# Training.
results = model.train(
   data='/content/vehicles/data.yaml',
   imgsz=640,
   epochs=10,
   batch=64,
   name='vehicles_pretrained')

Ultralytics YOLOv8.0.203 🚀 Python-3.10.12 torch-2.1.0+cu118 CUDA:0 (Tesla T4, 15102MiB)
[34m[1mengine/trainer: [0mtask=detect, mode=train, model=yolov8n.pt, data=/content/vehicles/data.yaml, epochs=10, patience=50, batch=64, imgsz=640, save=True, save_period=-1, cache=False, device=None, workers=8, project=None, name=vehicles_pretrained3, exist_ok=False, pretrained=True, optimizer=auto, verbose=True, seed=0, deterministic=True, single_cls=False, rect=False, cos_lr=False, close_mosaic=10, resume=False, amp=True, fraction=1.0, profile=False, freeze=None, overlap_mask=True, mask_ratio=4, dropout=0.0, val=True, split=val, save_json=False, save_hybrid=False, conf=None, iou=0.7, max_det=300, half=False, dnn=False, plots=True, source=None, show=False, save_txt=False, save_conf=False, save_crop=False, show_labels=True, show_conf=True, vid_stride=1, stream_buffer=False, line_width=None, visualize=False, augment=False, agnostic_nms=False, classes=None, retina_masks=False, boxes=True, format=t

## Validate YOLOv8 Model
Validate the model on the val or test set. To validate on the test set, change the "val" path in the "data.yaml" file to the test set.

In [None]:
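# Load the model. 'yolov8n.pt' is the COCO-pretrained checkpoint, which is why
# the output below reports COCO class names; to evaluate the fine-tuned model,
# load the best.pt saved by your training run instead.
model = YOLO('/content/yolov8n.pt')

# Validation (epochs is a training option, so it is not passed here).
results = model.val(
   data='/content/vehicles/data.yaml',
   imgsz=640,
   batch=64,
   name='vehicles_pretrained')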

Ultralytics YOLOv8.0.203 🚀 Python-3.10.12 torch-2.1.0+cu118 CUDA:0 (Tesla T4, 15102MiB)
YOLOv8n summary (fused): 168 layers, 3151904 parameters, 0 gradients, 8.7 GFLOPs
[34m[1mval: [0mScanning /content/vehicles/valid/labels.cache... 250 images, 0 backgrounds, 0 corrupt: 100%|██████████| 250/250 [00:00<?, ?it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 4/4 [00:07<00:00,  1.93s/it]
                   all        250        454      0.422      0.257      0.241      0.173
                person        250         64          0          0   7.94e-05   4.76e-05
               bicycle        250         46          0          0          0          0
                   car        250        238      0.511      0.546      0.481      0.335
            motorcycle        250         46      0.598      0.739       0.69      0.499
              airplane        250         60          1          0     0.0335     0.0287
Speed: 0.3ms p

## Perform Inference Using YOLOv8
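We use the CLI to run inference on the images in the test set. Note that the command below loads /content/yolov8n.pt, the COCO-pretrained checkpoint, so its output lists COCO class names; pass the best.pt from your training run to use the fine-tuned model.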

In [None]:
!yolo predict model=/content/yolov8n.pt source='/content/vehicles/test/images'

Ultralytics YOLOv8.0.203 🚀 Python-3.10.12 torch-2.1.0+cu118 CUDA:0 (Tesla T4, 15102MiB)
YOLOv8n summary (fused): 168 layers, 3151904 parameters, 0 gradients, 8.7 GFLOPs

image 1/126 /content/vehicles/test/images/00dea1edf14f09ab_jpg.rf.3f17c8790a68659d03b1939a59ccda80.jpg: 640x640 1 car, 1 bus, 8.5ms
image 2/126 /content/vehicles/test/images/00dea1edf14f09ab_jpg.rf.KJ730oDTFPdXdJxvSLnX.jpg: 640x640 1 car, 1 bus, 9.4ms
image 3/126 /content/vehicles/test/images/00e481ea1a520175_jpg.rf.6e6a8b3b45c9a11d106958f88ff714ea.jpg: 640x640 3 buss, 11.5ms
image 4/126 /content/vehicles/test/images/00e481ea1a520175_jpg.rf.MV6sZ8QCFwFeMYaI2tHm.jpg: 640x640 3 buss, 9.5ms
image 5/126 /content/vehicles/test/images/08c8b73e0c2e296e_jpg.rf.7IkYAamjZhnwsoXSrwKt.jpg: 640x640 5 persons, 2 bicycles, 1 car, 3 buss, 1 stop sign, 8.1ms
image 6/126 /content/vehicles/test/images/08c8b73e0c2e296e_jpg.rf.effa65856584463c08848031cab357b9.jpg: 640x640 5 persons, 2 bicycles, 1 car, 3 buss, 1 stop sign, 8.3ms
image 7/126

You can also use the Python API for this; it returns the predictions as a list.

In [None]:
model = YOLO('/content/yolov8n.pt')
results = model.predict(
   source='/content/vehicles/test/images')



image 1/126 /content/vehicles/test/images/00dea1edf14f09ab_jpg.rf.3f17c8790a68659d03b1939a59ccda80.jpg: 640x640 1 car, 1 bus, 13.1ms
image 2/126 /content/vehicles/test/images/00dea1edf14f09ab_jpg.rf.KJ730oDTFPdXdJxvSLnX.jpg: 640x640 1 car, 1 bus, 12.5ms
image 3/126 /content/vehicles/test/images/00e481ea1a520175_jpg.rf.6e6a8b3b45c9a11d106958f88ff714ea.jpg: 640x640 3 buss, 12.2ms
image 4/126 /content/vehicles/test/images/00e481ea1a520175_jpg.rf.MV6sZ8QCFwFeMYaI2tHm.jpg: 640x640 3 buss, 13.2ms
image 5/126 /content/vehicles/test/images/08c8b73e0c2e296e_jpg.rf.7IkYAamjZhnwsoXSrwKt.jpg: 640x640 5 persons, 2 bicycles, 1 car, 3 buss, 1 stop sign, 12.7ms
image 6/126 /content/vehicles/test/images/08c8b73e0c2e296e_jpg.rf.effa65856584463c08848031cab357b9.jpg: 640x640 5 persons, 2 bicycles, 1 car, 3 buss, 1 stop sign, 13.3ms
image 7/126 /content/vehicles/test/images/10c26c6598677a1f_jpg.rf.USCbBYVcUICkLhuq07Lw.jpg: 640x640 10 persons, 1 motorcycle, 12.1ms
image 8/126 /content/vehicles/test/images/
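Since the Python API returns a list, each entry can be inspected programmatically. The sketch below assumes each entry exposes `names` and `boxes` with `cls`, `conf`, and `xyxy` attributes, as the ultralytics result objects do; the helper `summarize` is hypothetical, and a stand-in object is used so the example runs without a model.

```python
from types import SimpleNamespace


def summarize(result):
    """Return (class_name, confidence, [x1, y1, x2, y2]) for each detected box."""
    rows = []
    boxes = result.boxes
    for cls, conf, xyxy in zip(boxes.cls, boxes.conf, boxes.xyxy):
        name = result.names[int(cls)]  # map the class index to its name
        rows.append((name, round(float(conf), 3),
                     [round(float(v), 1) for v in xyxy]))
    return rows


# Stand-in with the same attribute layout as one prediction result.
fake = SimpleNamespace(
    names={0: "car", 1: "bus"},
    boxes=SimpleNamespace(
        cls=[0.0, 1.0],
        conf=[0.91, 0.58],
        xyxy=[[10.0, 20.0, 110.0, 220.0], [5.0, 5.0, 50.0, 60.0]],
    ),
)
print(summarize(fake))
# prints: [('car', 0.91, [10.0, 20.0, 110.0, 220.0]), ('bus', 0.58, [5.0, 5.0, 50.0, 60.0])]

# With real predictions from the cell above:
# for r in results:
#     print(summarize(r))
```

A per-image summary like this is convenient when comparing the predictions of two models on the same test images.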

# Tasks:


1. Fully train the YOLOv5s (small) and YOLOv8s (small) models on the provided dataset.
2. Evaluate your models on the test set and provide the mAP values.
3. Prepare a report in which you compare the performance of both models. Study the architectures of both YOLOv5 and YOLOv8 and illustrate them in your report.
4. In your report, discuss which model performs better on the test set and what the reason might be. If YOLOv8 performs better, what are the innovations that allow it to outperform YOLOv5?
5. Show predictions of both models on 4-5 test images in the report and compare them side by side.



# Source

https://github.com/Vision-At-SEECS/Pytorch_Labs/blob/main/Lab5_CV_Object_Detection.ipynb

Prepared by: Bostan Khan, Team Lead, Machine Vision and Intelligent Systems Lab