<a href="https://colab.research.google.com/github/Ceciliawangwang/object_detection_tutorial/blob/main/Tutorial_Object_detection.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Object detection tutorial

## 1 Introduction
In this tutorial, we will use You Only Look Once (YOLO) to train an object detection model in Python.

The objectives of this tutorial are:

*   Prepare object detection annotation data
*   Understand the fundamentals of the YOLO model
*   Train an object detection model


## 2 Configure environment

### 2.1 Get started with Google Colab

In this tutorial, we will use a pre-configured Jupyter notebook running on *Google Colaboratory*: https://colab.research.google.com/

Colab is a free, cloud-based development environment provided by Google. It lets users write and run Python code in a web browser, with access to computing resources such as GPUs and TPUs.



### 2.2 Mount Google Drive


In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


### 2.3 Import libraries

Install the libraries that YOLO depends on by installing `ultralytics`.




In [None]:
!pip install ultralytics

Collecting ultralytics
  Downloading ultralytics-8.0.173-py3-none-any.whl (614 kB)
Installing collected packages: ultralytics
Successfully installed ultralytics-8.0.173


In [None]:
import os

import torch  # used to check whether a GPU is available

from ultralytics import YOLO


In this tutorial, we will use a GPU to speed up training of the object detection model. Enable it in Google Colaboratory under:
`Runtime >> Change runtime type >> Hardware accelerator >> GPU`

To check that a GPU is available, call `torch.cuda.is_available()`.

In [None]:
import torch
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"  # fall back to the CPU when no GPU is attached

## 3 Data Preparation and Annotation

### 3.1 Data annotation

Here are some tools that can be used to annotate data:

*   CVAT
*   LabelMe

### 3.2 Dataset
In this tutorial, we will use the data from: https://www.kaggle.com/datasets/sakshamjn/vehicle-detection-8-classes-object-detection

## 4 YOLO

You Only Look Once (YOLO) is a popular real-time object detection algorithm proposed by Joseph Redmon et al. in 2015. It uses a single convolutional neural network (CNN) and offers fast inference, high detection accuracy, open-source implementations, and good generalization.
It works by dividing an image into a grid and predicting bounding boxes and class probabilities for each grid cell.
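To make the grid idea concrete, here is a minimal sketch, assuming the original YOLO formulation in which the cell that contains an object's center is responsible for predicting that object. The function name `responsible_cell` and the 7 × 7 grid are illustrative assumptions; later versions such as YOLOv8 differ in their details, so treat this only as intuition.

```python
def responsible_cell(x_center, y_center, img_w, img_h, grid_size=7):
    """Return the (row, col) of the grid cell that contains the box center.

    Coordinates are in pixels. `grid_size` is a hypothetical value chosen
    for illustration, not a setting read from any YOLO model.
    """
    # Scale the center into grid units, then clamp so that a center lying
    # exactly on the right or bottom edge still falls into the last cell.
    col = min(int(x_center / img_w * grid_size), grid_size - 1)
    row = min(int(y_center / img_h * grid_size), grid_size - 1)
    return row, col


# A box centered at pixel (320, 240) in a 640x480 image
print(responsible_cell(320, 240, 640, 480))  # prints (3, 3)
```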


### 4.1 YOLO Data Format

*   `*.jpg`: the image files
*   `*.txt`: the label files, one per image, with the same base name as the image
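In the Ultralytics detection format, each line of a `.txt` label file describes one object as `class_id x_center y_center width height`, with the four box values normalized to the range 0 to 1 by the image width and height. The sketch below converts a pixel box given as corner coordinates into such a line. The function name `to_yolo_line` and the example numbers are illustrative assumptions.

```python
def to_yolo_line(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a pixel box (corner coordinates) into one YOLO label line."""
    # Box center and size, normalized by the image dimensions
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# A box from (100, 200) to (300, 400) in a 640x480 image, class 0
print(to_yolo_line(0, 100, 200, 300, 400, 640, 480))
# prints: 0 0.312500 0.625000 0.312500 0.416667
```

Annotation tools can usually export this format directly, so a conversion like this is mainly useful when your annotations come from another source.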

We also need to add a custom dataset file, `data.yaml`:



```
train: path1
val: path2

nc: 2  # number of classes
names: ['class1', 'class2']
```
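If you prefer to generate this file from Python, the sketch below writes it using only the standard library. The helper name `write_data_yaml`, the directory names, and the class names are placeholders for illustration; substitute the paths and classes of your own dataset.

```python
from pathlib import Path


def write_data_yaml(path, train_dir, val_dir, class_names):
    """Write a minimal YOLO dataset file and return its text."""
    text = (
        f"train: {train_dir}\n"
        f"val: {val_dir}\n"
        "\n"
        # Derive nc from the list so the two values cannot disagree
        f"nc: {len(class_names)}  # number of classes\n"
        f"names: {class_names}\n"
    )
    Path(path).write_text(text)
    return text


# Placeholder values; this writes data.yaml to the current working directory
print(write_data_yaml("data.yaml", "images/train", "images/val", ["class1", "class2"]))
```

Computing `nc` from the length of the names list avoids a common mistake where the class count and the list of names get out of sync.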

### 4.2 Folder structure


Set up the folder structure as follows:

```
root/
    images/
        train/
            image1.jpg
            image2.jpg
            ...
        val/
            image1.jpg
            image2.jpg
            ...

    labels/
        train/
            image1.txt
            image2.txt
            ...
        val/
            image1.txt
            image2.txt
            ...
```
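Each label file is matched to its image by base name, so `images/train/image1.jpg` pairs with `labels/train/image1.txt`. The sketch below lists images that have no matching label file, assuming the layout shown above and `.jpg` images. The function name `find_unlabeled_images` is an illustrative assumption, and the root path is the Drive folder used later in this tutorial.

```python
from pathlib import Path


def find_unlabeled_images(root, split):
    """Return names of .jpg images in `split` that have no matching .txt label."""
    image_dir = Path(root) / "images" / split
    label_dir = Path(root) / "labels" / split
    if not image_dir.is_dir():
        return []  # nothing to check if the folder does not exist
    missing = []
    for image_path in sorted(image_dir.glob("*.jpg")):
        # The label must share the image's base name, e.g. image1.jpg -> image1.txt
        if not (label_dir / f"{image_path.stem}.txt").exists():
            missing.append(image_path.name)
    return missing


for split in ("train", "val"):
    print(split, find_unlabeled_images("/content/drive/MyDrive/od_dataset", split))
```

An image without a label file is not an error in itself (it is treated as a background image with no objects), but in a fully annotated dataset it often points to a naming mistake.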


## 5 YOLO in Python
### 5.1 Load a model
Models can be found in [YOLO models](https://github.com/ultralytics/ultralytics).


*   YOLOv8n (Nano model)
*   YOLOv8s (Small model)
*   YOLOv8m (Medium model)
*   YOLOv8l (Large model)
*   YOLOv8x (Extra large model)

Here we choose the smallest and most lightweight one, YOLOv8n, to showcase the training process.

In [None]:
ROOT_DIR = '/content/drive/MyDrive/od_dataset'

In [None]:
# Load a model
model = YOLO('yolov8n.yaml')  # build a new nano model from scratch (untrained weights)
# model = YOLO('yolov8n.pt')  # alternatively, load a pretrained nano model


                   from  n    params  module                                       arguments                     
  0                  -1  1       464  ultralytics.nn.modules.conv.Conv             [3, 16, 3, 2]                 
  1                  -1  1      4672  ultralytics.nn.modules.conv.Conv             [16, 32, 3, 2]                
  2                  -1  1      7360  ultralytics.nn.modules.block.C2f             [32, 32, 1, True]             
  3                  -1  1     18560  ultralytics.nn.modules.conv.Conv             [32, 64, 3, 2]                
  4                  -1  2     49664  ultralytics.nn.modules.block.C2f             [64, 64, 2, True]             
  5                  -1  1     73984  ultralytics.nn.modules.conv.Conv             [64, 128, 3, 2]               
  6                  -1  2    197632  ultralytics.nn.modules.block.C2f             [128, 128, 2, True]           
  7                  -1  1    295424  ultralytics.nn.modules.conv.Conv             [128

### 5.2 Train the model

In [None]:
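# Train the model
results = model.train(
    data=os.path.join(ROOT_DIR, 'data.yaml'),  # dataset configuration file
    # imgsz=...,  # input image size (defaults to 640, as the log shows)
    epochs=3,  # short run for demonstration only
    # batch=...,  # batch size (defaults to 16, as the log shows)
)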

Ultralytics YOLOv8.0.173 🚀 Python-3.10.12 torch-2.0.1+cu118 CUDA:0 (Tesla T4, 15102MiB)
engine/trainer: task=detect, mode=train, model=yolov8n.yaml, data=/content/drive/MyDrive/od_dataset/data.yaml, epochs=3, patience=50, batch=16, imgsz=640, save=True, save_period=-1, cache=False, device=None, workers=8, project=None, name=None, exist_ok=False, pretrained=True, optimizer=auto, verbose=True, seed=0, deterministic=True, single_cls=False, rect=False, cos_lr=False, close_mosaic=10, resume=False, amp=True, fraction=1.0, profile=False, freeze=None, overlap_mask=True, mask_ratio=4, dropout=0.0, val=True, split=val, save_json=False, save_hybrid=False, conf=None, iou=0.7, max_det=300, half=False, dnn=False, plots=True, source=None, show=False, save_txt=False, save_conf=False, save_crop=False, show_labels=True, show_conf=True, vid_stride=1, stream_buffer=False, line_width=None, visualize=False, augment=False, agnostic_nms=False, classes=None, retina_masks=False, boxes=True, format=

In [None]:
metrics = model.val() # evaluate model performance on the validation set

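### 5.3 Evaluate the performance
Evaluate the model performance on the validation dataset. Note that every metric in the output below is 0: the model was built from scratch and trained for only 3 epochs, which is most likely far too little for it to detect anything.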

In [None]:
results = model.val()

Ultralytics YOLOv8.0.173 🚀 Python-3.10.12 torch-2.0.1+cu118 CUDA:0 (Tesla T4, 15102MiB)
YOLOv8n summary (fused): 168 layers, 3007208 parameters, 0 gradients
val: Scanning /content/drive/MyDrive/od_dataset/labels/val.cache... 13 images, 0 backgrounds, 0 corrupt: 100%|██████████| 13/13 [00:00<?, ?it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:00<00:00,  1.76it/s]
                   all         13         41          0          0          0          0
Speed: 0.2ms preprocess, 11.5ms inference, 0.0ms loss, 3.1ms postprocess per image
Results saved to runs/detect/val
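
The `mAP50` column counts a predicted box as correct when its intersection over union (IoU) with a ground-truth box is at least 0.5, while `mAP50-95` averages over IoU thresholds from 0.5 to 0.95. The sketch below shows how IoU is computed for two boxes in `(x_min, y_min, x_max, y_max)` form. The function name `box_iou` is an illustrative assumption, and this is not the Ultralytics implementation.

```python
def box_iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    # Corners of the overlapping rectangle (if any)
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Width and height are clamped at 0 so disjoint boxes give no overlap
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Two 10x10 boxes that overlap on half their width: IoU = 50 / 150
print(box_iou((0, 0, 10, 10), (5, 0, 15, 10)))
```

In this example the IoU is one third, so the prediction would not count as a match at the 0.5 threshold used by `mAP50`.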


### 5.4 Predict
You can run prediction on your own image with the following code.

In [None]:
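model_train = YOLO('/content/runs/detect/train/weights/best.pt')  # load the best weights saved during training
results = model_train('image_path')  # replace 'image_path' with the path to your image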

YOLO also supports classification, segmentation, and pose estimation tasks on images and videos. Please check the official tutorial for more details.

## 6 Summary
In this tutorial, we walked through training a YOLOv8 model on a custom dataset: preparing annotations, organizing the data in YOLO format, training the model, evaluating it on the validation set, and running prediction.

The short training run here (3 epochs with the Nano model built from scratch) is only a demonstration, and the validation metrics were all 0. For useful results, expect to train for many more epochs, start from pretrained weights such as `yolov8n.pt`, or try a larger model such as YOLOv8m.

## Reference

https://colab.research.google.com/github/ultralytics/ultralytics/blob/main/examples/tutorial.ipynb