<a href="https://colab.research.google.com/github/nyp-sit/it3103-2024s2/blob/main/yolo8_custom_train.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Object Detection using YOLO

Welcome to this week's hands-on lab. In this lab, we are going to learn how to train a balloon detector!

At the end of this exercise, you will be able to:

- create an object detection dataset in YOLO format
- finetune a YOLOv8 pretrained model with the custom dataset
- monitor the training progress and evaluation metrics
- deploy the trained model for object detection

## Create an object detection dataset

We will use a sample balloon dataset to illustrate the process of annotating images and packaging the dataset in one of the formats used for object detection (e.g. YOLO, Pascal VOC, COCO).

Many annotation tools are available, from the very basic [LabelImg](https://github.com/HumanSignal/labelImg) to the more feature-packed [Label Studio](https://labelstud.io/) and online services such as [Roboflow](https://roboflow.com/).

### Raw Image Dataset

You can download the balloon images (without annotations) from this link:

https://github.com/nyp-sit/iti107-2024S2/raw/refs/heads/main/data/balloon_raw_dataset.zip

Unzip the file to a local folder.

There are a total of 74 images. You should divide them into a training set and a validation set (e.g. 80%-20%, i.e. 59 images for training and 15 for validation).
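
One way to make the split reproducible is to shuffle the file list with a fixed seed and cut it at 80%. The sketch below is illustrative only: the function name `split_train_val`, the seed and the synthetic file names are assumptions, not part of the lab materials.

```python
import random


def split_train_val(items, train_fraction=0.8, seed=42):
    """Shuffle a copy of `items` and split it into (train, val) lists."""
    shuffled = sorted(items)  # sort first so the result does not depend on input order
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical file names standing in for the 74 raw balloon images.
image_names = [f"balloon_{i:03d}.jpg" for i in range(74)]
train_files, val_files = split_train_val(image_names)
print(len(train_files), "training images,", len(val_files), "validation images")
```

With 74 images and a fraction of 0.8 this gives the 59/15 split mentioned above.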


### Option 1: Label Studio

You can follow the [steps](https://labelstud.io/guide/quick_start) here to set up Label Studio on your PC. It is recommended to set up a conda environment before you install Label Studio.

Here are the steps that need to be done:
1. Create a new Project
2. Import the images into Label Studio
3. Set up the Labelling UI template (choose the Object Detection with Bounding Boxes template)
4. Export the dataset in YOLO format.

The exported dataset will have the following folder structure:
```
<root folder>
classes.txt    --> contains the labels, with each class label on a new line
--images --> contains the images
--labels --> contains the annotations (i.e. bbox coordinates)
notes.json --> some info about this dataset (i.e. not used)
```
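
Each file in `labels` holds one line per object in the YOLO format: the class index followed by the box center x, center y, width and height, all normalized to the 0-1 range by the image size. The sketch below converts a pixel-space box to such a line; the function name and the example numbers are made up for illustration.

```python
def to_yolo_line(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a pixel-space corner box to a YOLO label line."""
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Made-up example: a 200x300 px balloon box in a 640x480 image.
yolo_line = to_yolo_line(0, 100, 90, 300, 390, 640, 480)
print(yolo_line)  # 0 0.312500 0.500000 0.312500 0.625000
```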

For training with Ultralytics, you need to organize the files into train and valid (and optionally test) folders, and create a data.yaml file that tells the trainer where the training and validation (and optional test) sets are located:

```
<root folder>
--train
----images
----labels
--valid
----images
----labels
data.yaml
```
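
Moving the exported files into this layout can be scripted. The following is a minimal sketch, assuming the Label Studio export keeps images and labels under `images/` and `labels/` with matching file stems; `copy_split` and the paths in the usage note are hypothetical names.

```python
import shutil
from pathlib import Path


def copy_split(export_root, out_root, split, image_names):
    """Copy images and their matching label files into <out_root>/<split>/{images,labels}."""
    export_root, out_root = Path(export_root), Path(out_root)
    img_dir = out_root / split / "images"
    lbl_dir = out_root / split / "labels"
    img_dir.mkdir(parents=True, exist_ok=True)
    lbl_dir.mkdir(parents=True, exist_ok=True)
    copied = 0
    for name in image_names:
        shutil.copy2(export_root / "images" / name, img_dir / name)
        label = export_root / "labels" / (Path(name).stem + ".txt")
        if label.exists():  # an image without any object may have no label file
            shutil.copy2(label, lbl_dir / label.name)
        copied += 1
    return copied
```

For example, `copy_split("label_studio_export", "datasets/balloon", "train", train_files)` followed by the same call with `"valid"` and the validation file names would produce the structure above.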

The data.yaml file should specify the following:
```
path: ../datasets/balloon
train: train/images
val: valid/images
test: test/images

names:
    0: balloon
```

If you have more than one class of object to detect, list the remaining class names under the `names` field.
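
As an illustration of a multi-class file, the sketch below builds the data.yaml text from a list of class names. The helper `build_data_yaml` and the second class `kite` are hypothetical; only `balloon` exists in this lab's dataset.

```python
def build_data_yaml(root, class_names, train="train/images", val="valid/images"):
    """Return the text of a data.yaml file for the given dataset root and class names."""
    lines = [f"path: {root}", f"train: {train}", f"val: {val}", "", "names:"]
    lines += [f"    {index}: {name}" for index, name in enumerate(class_names)]
    return "\n".join(lines) + "\n"


# "kite" is a made-up second class, used only to show the numbering.
yaml_text = build_data_yaml("../datasets/balloon", ["balloon", "kite"])
print(yaml_text)
```

The class indices must match the class ids used in the label files, so the order of the names matters.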


### Option 2: Roboflow

You can also use [Roboflow](https://roboflow.com/). Roboflow integrates very well with Ultralytics, and you can easily export the dataset in a format recognized by the Ultralytics trainer (for YOLO models).

As with Label Studio, you create a new account, upload all the raw images, annotate them and then export.

When exporting, choose YOLOv8 as the format and download the dataset to a local directory instead of publishing it to Roboflow Universe.

Here is an [introductory blog](https://blog.roboflow.com/getting-started-with-roboflow/) on using Roboflow to annotate.





## Auto Labelling using Grounding DINO

Both Label Studio and Roboflow support the use of Grounding DINO to auto-label the dataset.

Grounding DINO is an open-set object detector that marries the Transformer-based detector DINO with grounded pre-training. It can detect arbitrary objects from human inputs (prompts) such as category names or referring expressions.

### Label Studio

You can follow the instructions [here](https://labelstud.io/blog/using-text-prompts-for-image-annotation-with-grounding-dino-and-label-studio/) to set up the Grounding DINO ML backend and integrate it with your Label Studio.

### Roboflow

Here is a [video tutorial](https://youtu.be/SDV6Gz0suAk) on using Grounding DINO with Roboflow.


### Download Annotated Dataset

To save you time for this lab, you can download a pre-annotated balloon dataset [here](https://github.com/nyp-sit/iti107-2024S2/raw/refs/heads/main/data/balloon_annotated_dataset.zip).

The following cell downloads the archive and unzips it into a directory called `datasets`.



In [11]:
%%capture
%%bash
wget https://github.com/nyp-sit/iti107-2024S2/raw/refs/heads/main/data/balloon_annotated_dataset.zip
mkdir -p datasets
unzip balloon_annotated_dataset.zip -d datasets/
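
Before training, it can be worth checking that every image has a matching label file. The sketch below assumes the unzipped dataset follows the `train`/`valid` layout with `images` and `labels` subfolders described earlier; `find_unlabelled` is a hypothetical helper name.

```python
from pathlib import Path

IMAGE_EXTS = {".jpg", ".jpeg", ".png"}


def find_unlabelled(split_dir):
    """Return image file names in <split_dir>/images with no matching .txt in <split_dir>/labels."""
    split_dir = Path(split_dir)
    label_stems = {p.stem for p in (split_dir / "labels").glob("*.txt")}
    images = sorted(p for p in (split_dir / "images").iterdir()
                    if p.suffix.lower() in IMAGE_EXTS)
    return [p.name for p in images if p.stem not in label_stems]


for split in ("train", "valid"):
    split_dir = Path("datasets") / split
    if (split_dir / "images").is_dir():  # skip silently if the dataset is not there
        print(split, "images without labels:", find_unlabelled(split_dir))
```

An empty list for both splits means every image is paired with an annotation file.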

In [7]:
%%capture
!pip install ultralytics
!pip install comet_ml

## Training the Model

YOLOv8 comes in several pretrained model sizes: yolov8n, yolov8s, yolov8m, yolov8l and yolov8x. They differ in size, inference speed and precision:

<img src="https://github.com/nyp-sit/iti107-2024S2/blob/main/assets/yolo-models.png?raw=true" width="70%"/>


We will use the small pretrained model, yolov8s, and finetune it on our custom dataset.


### Set up logging

Ultralytics supports logging to wandb, comet.ml and tensorboard out of the box. Here we only enable wandb.

You need to create an account at [wandb](https://wandb.ai) and get the API key from https://wandb.ai/authorize.
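
If you prefer not to paste the key into the interactive prompt during training, wandb also reads it from the `WANDB_API_KEY` environment variable. The sketch below stores the key there after a basic length check. The helper name `set_wandb_key` is hypothetical, and the 40-character check is an assumption based on the message printed by the wandb version used in this lab; other wandb versions may accept different key formats.

```python
import os


def set_wandb_key(key):
    """Store the key in WANDB_API_KEY after a basic sanity check."""
    key = key.strip()
    if len(key) != 40:  # assumption: key length expected by the wandb version used here
        raise ValueError(f"expected a 40-character API key, got {len(key)} characters")
    os.environ["WANDB_API_KEY"] = key


# Typical use in a notebook (commented out so nothing prompts on import):
# from getpass import getpass
# set_wandb_key(getpass("wandb API key: "))
```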


In [1]:
from ultralytics import settings

settings.update({"wandb": True,
                 "comet": False,
                 "tensorboard": False})

View Ultralytics Settings with 'yolo settings' or at '/root/.config/Ultralytics/settings.json'
Update Settings with 'yolo settings key=value', i.e. 'yolo settings runs_dir=path/to/dir'. For help see https://docs.ultralytics.com/quickstart/#ultralytics-settings.


### Training

We specify the path to the data.yaml file and train with a batch size of 16, saving a checkpoint at every epoch (`save_period=1`). We assume you are connected to a GPU, so we set `device=0` to select the first GPU. We set the project name to `balloon`, which creates a folder called `balloon` to store the weights and various training artifacts such as the F1 and PR curves, the confusion matrix and the training results (loss, mAP, etc.).

For a complete listing of train settings, you can see [here](https://docs.ultralytics.com/modes/train/#train-settings).

You can also specify the type of data [augmentation](https://docs.ultralytics.com/modes/train/#augmentation-settings-and-hyperparameters)  you want as part of the train pipeline.
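
Augmentation settings are passed to `train()` as ordinary keyword arguments. The sketch below collects a few of them in a dictionary; the argument names (`degrees`, `fliplr`, `hsv_v`, `mosaic`) are Ultralytics train settings, while the values and the wrapper function `train_with_augmentation` are illustrative assumptions rather than tuned recommendations.

```python
# Illustrative values only; see the augmentation settings page for the defaults.
AUGMENTATION_OVERRIDES = {
    "degrees": 10.0,  # random rotation range, in degrees
    "fliplr": 0.5,    # probability of a horizontal flip
    "hsv_v": 0.4,     # brightness (value) jitter fraction
    "mosaic": 1.0,    # probability of mosaic augmentation
}


def train_with_augmentation(weights="yolov8s.pt", data="datasets/data.yaml", epochs=30):
    """Finetune with the augmentation overrides above (hypothetical wrapper)."""
    from ultralytics import YOLO  # imported lazily so the dict can be inspected on its own

    model = YOLO(weights)
    return model.train(data=data, epochs=epochs, project="balloon",
                       **AUGMENTATION_OVERRIDES)
```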

You can monitor your training progress at wandb (the link is given in the train output below)


In [2]:
from ultralytics import YOLO
from ultralytics import settings

model = YOLO("yolov8s.pt")  # Load a pre-trained YOLO model
result = model.train(data="datasets/data.yaml",
                     epochs=30,
                     save_period=1,
                     batch=16,
                     device=0,
                     project='balloon',
                     plots=True)

Ultralytics 8.3.17 🚀 Python-3.10.12 torch-2.4.1+cu121 CUDA:0 (Tesla T4, 15102MiB)
engine/trainer: task=detect, mode=train, model=yolov8s.pt, data=datasets/data.yaml, epochs=30, time=None, patience=100, batch=16, imgsz=640, save=True, save_period=1, cache=False, device=0, workers=8, project=balloon, name=train6, exist_ok=False, pretrained=True, optimizer=auto, verbose=True, seed=0, deterministic=True, single_cls=False, rect=False, cos_lr=False, close_mosaic=10, resume=False, amp=True, fraction=1.0, profile=False, freeze=None, multi_scale=False, overlap_mask=True, mask_ratio=4, dropout=0.0, val=True, split=val, save_json=False, save_hybrid=False, conf=None, iou=0.7, max_det=300, half=False, dnn=False, plots=True, source=None, vid_stride=1, stream_buffer=False, visualize=False, augment=False, agnostic_nms=False, classes=None, retina_masks=False, embed=None, show=False, save_frames=False, save_txt=False, save_conf=False, save_crop=False, show_labels=True, show_conf=True, show_

wandb: Using wandb-core as the SDK backend. Please refer to https://wandb.me/wandb-core for more information.

wandb: Logging into wandb.ai. (Learn how to deploy a W&B server locally: https://wandb.me/wandb-server)
wandb: You can find your API key in your browser here: https://wandb.ai/authorize
wandb: Paste an API key from your profile and hit enter, or press ctrl+c to quit:

 ··········

wandb: ERROR API key must be 40 characters long, yours was 9505541

wandb: Appending key for api.wandb.ai to your netrc file: /root/.netrc

Freezing layer 'model.22.dfl.conv.weight'
AMP: running Automatic Mixed Precision (AMP) checks with YOLO11n...
AMP: checks passed ✅

train: Scanning /content/datasets/train/labels... 59 images, 0 backgrounds, 0 corrupt: 100%|██████████| 59/59 [00:00<00:00, 1885.05it/s]

train: New cache created: /content/datasets/train/labels.cache

albumentations: Blur(p=0.01, blur_limit=(3, 7)), MedianBlur(p=0.01, blur_limit=(3, 7)), ToGray(p=0.01, num_output_channels=3, method='weighted_average'), CLAHE(p=0.01, clip_limit=(1, 4.0), tile_grid_size=(8, 8))

val: Scanning /content/datasets/valid/labels... 15 images, 0 backgrounds, 0 corrupt: 100%|██████████| 15/15 [00:00<00:00, 608.01it/s]

val: New cache created: /content/datasets/valid/labels.cache

Plotting labels to balloon/train6/labels.jpg... 
optimizer: 'optimizer=auto' found, ignoring 'lr0=0.01' and 'momentum=0.937' and determining best 'optimizer', 'lr0' and 'momentum' automatically... 
optimizer: AdamW(lr=0.002, momentum=0.9) with parameter groups 57 weight(decay=0.0), 64 weight(decay=0.0005), 63 bias(decay=0.0)
Image sizes 640 train, 640 val
Using 2 dataloader workers
Logging results to balloon/train6
Starting training for 30 epochs...

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


       1/30      3.93G     0.7897      3.007      1.137         59        640: 100%|██████████| 4/4 [00:06<00:00,  1.61s/it]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:00<00:00,  3.89it/s]

                   all         15         71      0.228      0.423      0.204      0.128






      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


       2/30      3.67G     0.7005      2.128      1.067         90        640: 100%|██████████| 4/4 [00:01<00:00,  3.10it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:00<00:00,  5.36it/s]


                   all         15         71      0.838      0.817      0.882      0.788

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


       3/30      3.85G     0.6625      1.387      1.021         83        640: 100%|██████████| 4/4 [00:01<00:00,  3.27it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:00<00:00,  4.70it/s]

                   all         15         71       0.64        0.8      0.676      0.586






      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


       4/30      3.87G     0.6503     0.9087     0.9763         94        640: 100%|██████████| 4/4 [00:01<00:00,  2.52it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:00<00:00,  2.08it/s]

                   all         15         71      0.812      0.789      0.858      0.748






      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


       5/30      3.79G     0.5968     0.7561      0.912         53        640: 100%|██████████| 4/4 [00:01<00:00,  2.55it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:00<00:00,  2.14it/s]

                   all         15         71       0.86      0.803      0.822      0.739






      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


       6/30      4.03G      0.611     0.7922      0.967         44        640: 100%|██████████| 4/4 [00:01<00:00,  3.44it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:00<00:00,  3.63it/s]

                   all         15         71      0.913      0.741      0.811      0.707






      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


       7/30      4.05G     0.6659     0.7912     0.9886        111        640: 100%|██████████| 4/4 [00:01<00:00,  3.00it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:00<00:00,  5.69it/s]

                   all         15         71      0.883      0.743      0.782      0.677






      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


       8/30      4.06G     0.6585      0.747      0.947        112        640: 100%|██████████| 4/4 [00:01<00:00,  3.66it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:00<00:00,  5.89it/s]


                   all         15         71      0.862      0.746      0.803      0.688

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


       9/30      4.09G     0.6141     0.6651     0.9689         77        640: 100%|██████████| 4/4 [00:01<00:00,  3.59it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:00<00:00,  4.61it/s]

                   all         15         71      0.888      0.761      0.815      0.722






      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


      10/30      4.06G     0.6297     0.6564     0.9819         90        640: 100%|██████████| 4/4 [00:01<00:00,  2.55it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:00<00:00,  2.92it/s]

                   all         15         71      0.885      0.761      0.828      0.733






      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


      11/30      3.89G     0.6067     0.6336     0.9676         52        640: 100%|██████████| 4/4 [00:02<00:00,  1.52it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:00<00:00,  1.32it/s]

                   all         15         71      0.863      0.746      0.791      0.674






      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


      12/30      3.99G     0.6669     0.6267     0.9624         81        640: 100%|██████████| 4/4 [00:01<00:00,  2.28it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:00<00:00,  1.79it/s]

                   all         15         71      0.846      0.718      0.771      0.676






      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


      13/30      4.01G     0.5664     0.6078     0.9251         48        640: 100%|██████████| 4/4 [00:01<00:00,  3.39it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:00<00:00,  6.33it/s]


                   all         15         71      0.933      0.704      0.789      0.699

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


      14/30      3.93G       0.56     0.5767     0.9179         84        640: 100%|██████████| 4/4 [00:01<00:00,  3.31it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:00<00:00,  4.44it/s]

                   all         15         71      0.912      0.704      0.783      0.689






      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


      15/30      3.77G     0.5661     0.5579     0.9286         47        640: 100%|██████████| 4/4 [00:01<00:00,  2.58it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:00<00:00,  2.92it/s]

                   all         15         71      0.624      0.655      0.639      0.559






      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


      16/30      3.86G     0.5401      0.551     0.9069        111        640: 100%|██████████| 4/4 [00:01<00:00,  2.34it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:00<00:00,  2.80it/s]

                   all         15         71      0.944       0.69      0.775      0.683






      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


      17/30      4.04G     0.5538     0.5637     0.9642         45        640: 100%|██████████| 4/4 [00:01<00:00,  3.70it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:00<00:00,  5.94it/s]


                   all         15         71       0.87      0.756      0.783      0.688

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


      18/30         4G     0.5382     0.5403     0.9255         57        640: 100%|██████████| 4/4 [00:01<00:00,  3.79it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:00<00:00,  6.79it/s]

                   all         15         71      0.792      0.704      0.689      0.607






      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


      19/30         4G     0.5109     0.4758     0.8856         72        640: 100%|██████████| 4/4 [00:01<00:00,  3.69it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:00<00:00,  5.62it/s]

                   all         15         71       0.89      0.732      0.758      0.662






      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


      20/30      4.07G     0.5427     0.5748      0.925        112        640: 100%|██████████| 4/4 [00:01<00:00,  3.83it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:00<00:00,  5.98it/s]


                   all         15         71      0.928      0.731      0.771      0.673
Closing dataloader mosaic
albumentations: Blur(p=0.01, blur_limit=(3, 7)), MedianBlur(p=0.01, blur_limit=(3, 7)), ToGray(p=0.01, num_output_channels=3, method='weighted_average'), CLAHE(p=0.01, clip_limit=(1, 4.0), tile_grid_size=(8, 8))

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


      21/30      4.02G      0.526     0.6225     0.9353         30        640: 100%|██████████| 4/4 [00:03<00:00,  1.04it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:00<00:00,  1.59it/s]

                   all         15         71      0.929      0.761      0.796      0.686






      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


      22/30      3.87G     0.5432     0.5353     0.8838         35        640: 100%|██████████| 4/4 [00:01<00:00,  3.19it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:00<00:00,  4.79it/s]

                   all         15         71      0.961      0.775      0.828      0.722






      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


      23/30      4.03G      0.484     0.4786     0.8436         19        640: 100%|██████████| 4/4 [00:01<00:00,  3.35it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:00<00:00,  3.79it/s]

                   all         15         71      0.951      0.775      0.833       0.73






      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


      24/30      3.87G     0.4795     0.4558     0.8538         52        640: 100%|██████████| 4/4 [00:01<00:00,  3.59it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:00<00:00,  6.62it/s]

                   all         15         71      0.949      0.785      0.839      0.735






      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


      25/30      3.99G     0.4812     0.4509     0.8575         35        640: 100%|██████████| 4/4 [00:01<00:00,  3.48it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:00<00:00,  4.32it/s]

                   all         15         71       0.94      0.789      0.845      0.749






      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


      26/30      3.86G     0.4413     0.4173     0.8291         26        640: 100%|██████████| 4/4 [00:01<00:00,  2.76it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:00<00:00,  2.65it/s]

                   all         15         71      0.943      0.789      0.856      0.757






      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


      27/30      3.87G     0.4616     0.4293      0.871         53        640: 100%|██████████| 4/4 [00:01<00:00,  2.63it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:00<00:00,  2.08it/s]

                   all         15         71      0.949      0.787      0.862      0.767






      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


      28/30      4.03G     0.4243     0.4187     0.8657         53        640: 100%|██████████| 4/4 [00:01<00:00,  3.38it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:00<00:00,  6.44it/s]


                   all         15         71      0.929      0.789      0.871      0.771

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


      29/30      4.03G     0.4422      0.408     0.8517         62        640: 100%|██████████| 4/4 [00:01<00:00,  3.36it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:00<00:00,  5.53it/s]


                   all         15         71      0.933      0.788      0.874       0.77

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


      30/30         4G     0.4215     0.3859      0.836         44        640: 100%|██████████| 4/4 [00:01<00:00,  3.38it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:00<00:00,  6.43it/s]

                   all         15         71      0.919      0.799      0.878      0.785






30 epochs completed in 0.028 hours.
Optimizer stripped from balloon/train6/weights/last.pt, 22.5MB
Optimizer stripped from balloon/train6/weights/best.pt, 22.5MB

Validating balloon/train6/weights/best.pt...
Ultralytics 8.3.17 🚀 Python-3.10.12 torch-2.4.1+cu121 CUDA:0 (Tesla T4, 15102MiB)
Model summary (fused): 168 layers, 11,125,971 parameters, 0 gradients, 28.4 GFLOPs


                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:00<00:00,  2.74it/s]


                   all         15         71      0.839      0.817      0.883      0.789
Speed: 0.3ms preprocess, 6.1ms inference, 0.0ms loss, 2.3ms postprocess per image
Results saved to balloon/train6


Run history:
lr/pg0,▁▂▃▄▄▅▆▆▇▇▇████████▇▇▇▆▆▅▅▄▃▂▁
lr/pg1,▁▂▃▄▄▅▆▆▇▇▇████████▇▇▇▆▆▅▅▄▃▂▁
lr/pg2,▁▂▃▄▄▅▆▆▇▇▇████████▇▇▇▆▆▅▅▄▃▂▁
metrics/mAP50(B),▁█▆█▇▇▇▇▇▇▇▇▇▇▅▇▇▆▇▇▇▇▇███████
metrics/mAP50-95(B),▁█▆█▇▇▇▇▇▇▇▇▇▇▆▇▇▆▇▇▇▇▇▇██████
metrics/precision(B),▁▇▅▇▇█▇▇▇▇▇▇██▅█▇▆▇██████████▇
metrics/recall(B),▁██▇█▇▇▇▇▇▇▆▆▆▅▆▇▆▆▆▇▇▇▇▇▇▇▇▇█
model/GFLOPs,▁
model/parameters,▁
model/speed_PyTorch(ms),▁

Run summary:
lr/pg0,9e-05
lr/pg1,9e-05
lr/pg2,9e-05
metrics/mAP50(B),0.8832
metrics/mAP50-95(B),0.78937
metrics/precision(B),0.83909
metrics/recall(B),0.8169
model/GFLOPs,28.647
model/parameters,11135987.0
model/speed_PyTorch(ms),10.14


You can see the various graphs in your wandb dashboard, for example:

*metrics*

<img src="https://github.com/nyp-sit/iti107-2024S2/blob/main/assets/wandb-metrics.png?raw=true"/>

*Train and validation loss*

<img src="https://github.com/nyp-sit/iti107-2024S2/blob/main/assets/wandb-loss.png?raw=true"/>

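You can go to the folder `balloon/train/weights`, where you will see files like epoch0.pt, epoch1.pt, ..., as well as last.pt and best.pt. (If you have trained more than once, Ultralytics adds a number to the run folder name; the log above was written to `balloon/train6`.)
The epoch0.pt, epoch1.pt, ... files are the checkpoints saved every `save_period` epochs (in our case, every epoch). The best.pt file contains the best-performing checkpoint.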
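
Ultralytics also writes a `results.csv` file with the per-epoch metrics into the run folder, which can be used to see which epoch produced the best checkpoint. The sketch below ranks epochs by a weighted sum of mAP50 and mAP50-95. The 0.1/0.9 weighting is an assumption about how Ultralytics scores detection checkpoints, `best_epoch` is a hypothetical helper, and the three sample rows are copied from epochs 1, 2 and 30 of the training log above.

```python
import csv
import io


def best_epoch(csv_text, w_map50=0.1, w_map=0.9):
    """Return (epoch, score) for the row with the highest weighted mAP score."""
    best = None
    for row in csv.DictReader(io.StringIO(csv_text)):
        row = {key.strip(): value.strip() for key, value in row.items()}  # headers may be padded
        score = (w_map50 * float(row["metrics/mAP50(B)"])
                 + w_map * float(row["metrics/mAP50-95(B)"]))
        if best is None or score > best[1]:
            best = (int(float(row["epoch"])), score)
    return best


# Three rows taken from the log above (epochs 1, 2 and 30).
SAMPLE = "\n".join([
    "epoch,metrics/mAP50(B),metrics/mAP50-95(B)",
    "1,0.204,0.128",
    "2,0.882,0.788",
    "30,0.878,0.785",
])
print(best_epoch(SAMPLE))
```

On these rows the early epoch 2 scores slightly higher than epoch 30, which is consistent with best.pt validating at about 0.88 mAP50 and 0.79 mAP50-95 in the log. To use the helper on a real run, pass the text of `balloon/train/results.csv` (adjust the run folder name as needed).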

We can run the best model (using the best checkpoint) against the validation dataset to see the overall model performance on validation set.  

You should see around `0.88` for `mAP50`, and `0.78` for `mAP50-95`.

In [7]:
from ultralytics import YOLO

model = YOLO("balloon/train/weights/best.pt")
validation_results = model.val(data="datasets/data.yaml", device="0")

Ultralytics 8.3.17 🚀 Python-3.10.12 torch-2.4.1+cu121 CUDA:0 (Tesla T4, 15102MiB)
Model summary (fused): 168 layers, 11,125,971 parameters, 0 gradients, 28.4 GFLOPs

val: Scanning /content/datasets/valid/labels.cache... 15 images, 0 backgrounds, 0 corrupt: 100%|██████████| 15/15 [00:00<?, ?it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:00<00:00,  1.10it/s]


                   all         15         71      0.838      0.817      0.882      0.783
Speed: 0.3ms preprocess, 23.2ms inference, 0.0ms loss, 6.0ms postprocess per image
Results saved to runs/detect/val
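
The object returned by `model.val()` exposes the same numbers programmatically through its `box` attribute (`mp`, `mr`, `map50` and `map` for mean precision, mean recall, mAP50 and mAP50-95). The sketch below gathers them into a dictionary; `summarize_detection_metrics` is a hypothetical helper, and the stand-in object only mimics the attributes so the example runs without a trained model.

```python
from types import SimpleNamespace


def summarize_detection_metrics(metrics):
    """Collect the headline box metrics from a validation result into a plain dict."""
    box = metrics.box
    return {
        "precision": float(box.mp),
        "recall": float(box.mr),
        "mAP50": float(box.map50),
        "mAP50-95": float(box.map),
    }


# Stand-in with the values printed above; in the notebook, pass validation_results instead.
stand_in = SimpleNamespace(box=SimpleNamespace(mp=0.838, mr=0.817, map50=0.882, map=0.783))
summary = summarize_detection_metrics(stand_in)
print(summary)
```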


## Export and Deployment

Your model is in PyTorch format (.pt). You can export it to various formats, e.g. TorchScript, ONNX, OpenVINO, TensorRT, etc., depending on your use case and deployment platform (e.g. CPU or GPU).

You can see the list of [supported formats](https://docs.ultralytics.com/modes/export/#export-formats) and the options each one supports for further optimization (such as image size, int8, half-precision, etc.) on the Ultralytics site.

Ultralytics provides a utility function that automatically benchmarks your model in each of the supported formats. Run the following code cell to see the benchmark results. The cell benchmarks on the CPU (`device='cpu'`); to benchmark on the GPU instead, change the argument to `device=0`.

**Beware: it will take quite a while to complete the benchmark**

In [10]:
from ultralytics.utils.benchmarks import benchmark

# Benchmark on CPU; use device=0 to benchmark on the first GPU instead
benchmark(model="balloon/train/weights/best.pt", data="datasets/data.yaml", imgsz=640, half=False, device='cpu')

Setup complete ✅ (2 CPUs, 12.7 GB RAM, 36.5/112.6 GB disk)

Benchmarks complete for best.pt on datasets/data.yaml at imgsz=640 (756.31s)
                   Format Status❔  Size (MB)  metrics/mAP50-95(B)  Inference time (ms/im)    FPS
0                 PyTorch       ✅       21.5               0.7787                  406.94   2.46
1             TorchScript       ✅       42.9               0.7874                  592.69   1.69
2                    ONNX       ✅       42.7               0.7874                  365.91   2.73
3                OpenVINO       ✅       42.8               0.7874                  331.79   3.01
4                TensorRT       ❌        0.0                  NaN                     NaN    NaN
5                  CoreML       ❎       21.4                  NaN                     NaN    NaN
6   TensorFlow SavedModel       ✅      106.7               0.7874                   62.75  15.93
7     TensorFlow GraphDef       ✅       42.7               0.7874                  111.57   8.96
8         TensorFlow Lite       ✅       42.7               0.7874                  585.54   1.71
9     TensorFlow Edge TPU       ❎       11.0                  NaN                     NaN    NaN
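
To compare the formats, it helps to rank the rows that actually produced timings. The sketch below does this with the numbers copied from the benchmark table above; the tuple layout and variable names are assumptions made for illustration.

```python
# (format, mAP50-95, inference ms per image, FPS) copied from the benchmark table above;
# None marks formats for which no timing was reported in this run.
ROWS = [
    ("PyTorch", 0.7787, 406.94, 2.46),
    ("TorchScript", 0.7874, 592.69, 1.69),
    ("ONNX", 0.7874, 365.91, 2.73),
    ("OpenVINO", 0.7874, 331.79, 3.01),
    ("TensorRT", None, None, None),
    ("CoreML", None, None, None),
    ("TensorFlow SavedModel", 0.7874, 62.75, 15.93),
    ("TensorFlow GraphDef", 0.7874, 111.57, 8.96),
    ("TensorFlow Lite", 0.7874, 585.54, 1.71),
    ("TensorFlow Edge TPU", None, None, None),
]

# Keep only formats with a measured inference time, fastest first.
ranked = sorted((row for row in ROWS if row[2] is not None), key=lambda row: row[2])
for name, map50_95, ms, fps in ranked:
    print(f"{name:<22} {ms:8.2f} ms/im  {fps:6.2f} FPS  mAP50-95={map50_95}")
```

These timings come from a 2-CPU Colab machine, so the ranking may differ on other hardware.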



In the following code, we export it as OpenVINO.

After export, you can find the OpenVINO model in the `balloon/train/weights/best_openvino_model` directory.

In [8]:
model = YOLO("balloon/train/weights/best.pt")
exported_path = model.export(format="openvino")

Ultralytics 8.3.17 🚀 Python-3.10.12 torch-2.4.1+cu121 CPU (Intel Xeon 2.00GHz)
Model summary (fused): 168 layers, 11,125,971 parameters, 0 gradients, 28.4 GFLOPs

PyTorch: starting from 'balloon/train/weights/best.pt' with input shape (1, 3, 640, 640) BCHW and output shape(s) (1, 5, 8400) (21.5 MB)
requirements: Ultralytics requirement ['openvino>=2024.0.0'] not found, attempting AutoUpdate...
Collecting openvino>=2024.0.0
  Downloading openvino-2024.4.0-16579-cp310-cp310-manylinux2014_x86_64.whl.metadata (8.3 kB)
Collecting openvino-telemetry>=2023.2.1 (from openvino>=2024.0.0)
  Downloading openvino_telemetry-2024.1.0-py3-none-any.whl.metadata (2.3 kB)
Downloading openvino-2024.4.0-16579-cp310-cp310-manylinux2014_x86_64.whl (42.6 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 42.6/42.6 MB 37.3 MB/s eta 0:00:00
Downloading openvino_telemetry-2024.1.0-py3-none-any.whl (23 kB)
Installing collected packages: openvino-telemetry, openvino
Successfully installed o

## Inference

Let's test our model on some sample pictures. You can optionally specify the confidence threshold (e.g. `conf=0.5`) and the IoU threshold (e.g. `iou=0.6`) used for non-maximum suppression (NMS). The model only outputs detections whose confidence exceeds the confidence threshold; among boxes that overlap each other by more than the IoU threshold, NMS keeps only the highest-scoring one.
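
To make the roles of the two thresholds concrete, here is a plain-Python sketch of confidence filtering followed by greedy NMS. It illustrates the general technique and is not the Ultralytics implementation; the function names and the example boxes are made up.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def filter_detections(detections, conf=0.5, iou_thr=0.6):
    """Drop low-confidence boxes, then greedily suppress boxes overlapping a better one."""
    candidates = sorted((d for d in detections if d[1] >= conf),
                        key=lambda d: d[1], reverse=True)
    kept = []
    for box, score in candidates:
        if all(iou(box, kept_box) <= iou_thr for kept_box, _ in kept):
            kept.append((box, score))
    return kept


# Made-up detections: two near-duplicates, one separate box, one low-confidence box.
detections = [
    ((10, 10, 110, 110), 0.9),
    ((12, 12, 112, 112), 0.8),    # IoU with the first box is about 0.92, so it is suppressed
    ((200, 200, 260, 260), 0.7),
    ((300, 300, 340, 340), 0.3),  # below conf=0.5, so it is dropped
]
kept = filter_detections(detections)
print(kept)
```

Raising `iou` lets more overlapping boxes survive (useful for tightly clustered balloons), while raising `conf` trades recall for precision.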

In [12]:
import ultralytics
from ultralytics import YOLO
from PIL import Image

source = 'https://raw.githubusercontent.com/nyp-sit/iti107-2024S2/refs/heads/main/session-3/samples/sample_balloon.jpeg'
#source = './samples/sample_balloon.jpeg'
model = YOLO("balloon/train/weights/best_openvino_model", task='detect')
result = model(source, conf=0.5, iou=0.6)

# Visualize the results
for i, r in enumerate(result):
    print(r)
    # Plot results image
    im_bgr = r.plot()  # BGR-order numpy array
    im_rgb = Image.fromarray(im_bgr[..., ::-1])  # RGB-order PIL image

    # Show results to screen (in supported environments)
    r.show()

    # Save results to disk
    r.save(filename=f"results{i}.jpg")

Loading balloon/train/weights/best_openvino_model for OpenVINO inference...
Using OpenVINO LATENCY mode for batch=1 inference...

1/1: https://github.com/nyp-sit/iti107-2024S2/blob/main/session-3/samples/sample_balloon.jpeg?raw=true... Success ✅ (inf frames of shape 640x427 at 25.00 FPS)


errors for large sources or long-running streams and videos. See https://docs.ultralytics.com/modes/predict/ for help.

Example:
    results = model(source=..., stream=True)  # generator of Results objects
    for r in results:
        boxes = r.boxes  # Boxes object for bbox outputs
        masks = r.masks  # Masks object for segment masks outputs
        probs = r.probs  # Class probabilities for classification outputs

0: 640x640 9 balloons, 280.9ms
0: 640x640 9 balloons, 273.2ms
0: 640x640 9 balloons, 272.4ms
0: 640x640 9 balloons, 278.5ms
0: 640x640 9 balloons, 278.6ms
0: 640x640 9 balloons, 262.7ms
0: 640x640 9 balloons, 270.7ms
0: 640x640 9 balloons, 266.4ms
0: 640x640 9 balloons, 405.2ms
0: 6

KeyboardInterrupt: 

https://docs.ultralytics.com/modes/predict/#streaming-source-for-loop
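
When the source is a video or stream, passing `stream=True` makes the model return a generator of `Results` objects, so frames are processed one at a time instead of being accumulated in memory. The sketch below flattens such results into plain dictionaries using the `boxes.xyxy`, `boxes.conf`, `boxes.cls` and `names` attributes of each result; `collect_boxes` is a hypothetical helper.

```python
def collect_boxes(results):
    """Flatten an iterable of Results-like objects into a list of detection dicts."""
    rows = []
    for frame_index, r in enumerate(results):
        boxes = r.boxes
        for xyxy, conf, cls in zip(boxes.xyxy.tolist(), boxes.conf.tolist(), boxes.cls.tolist()):
            rows.append({
                "frame": frame_index,
                "label": r.names[int(cls)],  # class id -> class name
                "confidence": conf,
                "xyxy": xyxy,                # [x1, y1, x2, y2] in pixels
            })
    return rows


# Typical use with the model loaded earlier (commented out; needs ultralytics and a source):
# detections = collect_boxes(model("balloon.mp4", stream=True, conf=0.5, iou=0.6))
```

Because the function only iterates once over `results`, it works both with the list returned by a normal call and with the generator returned when `stream=True`.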

In [None]:
!conda install opencv

Channels:
 - defaults
Platform: osx-arm64
Collecting package metadata (repodata.json): done
Solving environment: done

# All requested packages already installed.



## Streaming

In [None]:
from ultralytics import YOLO
import cv2

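# Load the exported YOLO model (assumes best.pt was exported to ONNX and saved as balloon.onnx)
model = YOLO("balloon.onnx", task="detect")

# Open the video file and fail early if it cannot be read
video_path = "balloon.mp4"
cap = cv2.VideoCapture(video_path)
if not cap.isOpened():
    raise FileNotFoundError(f"Could not open video: {video_path}")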

# Loop through the video frames
while cap.isOpened():
    # Read a frame from the video
    success, frame = cap.read()

    if success:
        # Run YOLO inference on the frame
        results = model(frame)

        # Visualize the results on the frame
        annotated_frame = results[0].plot()

        # Display the annotated frame (needs a local desktop session; cv2.imshow does not work in Colab)
        cv2.imshow("YOLO Inference", annotated_frame)

        # Break the loop if 'q' is pressed
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    else:
        # Break the loop if the end of the video is reached
        break

# Release the video capture object and close the display window
cap.release()
cv2.destroyAllWindows()

Loading balloon.onnx for ONNX Runtime inference...



TypeError: Unsupported image type. For supported types see https://docs.ultralytics.com/modes/predict