# Overview
This file trains the card detection component of our APS360 project. We use a YOLOv8 model from the Ultralytics library and train it on our custom dataset.

The file is designed to run end to end by selecting "Run All", and is organized into 5 main sections:
1.   Setup
2.   Dataset Loading
3.   Training
4.   Saving Models
5.   Testing

# Section 1: Setup

In [1]:
# Options
# Configuration values used throughout the notebook go here.
# Each team member should point this at their own copy of the project folder.
PROJECT_FOLDER_PATH = '/content/gdrive/MyDrive/UofT/APS360/APS360 Project/Card Detection' # Nathan

In [2]:
# Mount to Google Drive
from google.colab import drive
drive.mount('/content/gdrive', force_remount=True)

Mounted at /content/gdrive


In [3]:
# Install and import the Ultralytics library
%pip install ultralytics
from ultralytics import YOLO

Collecting ultralytics
  Downloading ultralytics-8.2.77-py3-none-any.whl.metadata (41 kB)
Collecting ultralytics-thop>=2.0.0 (from ultralytics)
  Downloading ultralytics_thop-2.0.0-py3-none-any.whl.metadata (8.5 kB)
Collecting nvidia-cuda-nvrtc-cu12==12.1.105 (from torch>=1.8.0->ultralytics)
  Using cached nvidia_cuda_nvrtc_cu12-12.1.105-py3-none-manylinux1_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-runtime-cu12==12.1.105 (from torch>=1.8.0->ultralytics)
  Using cached nvidia_cuda_runtime_cu12-12.1.105-py3-none-manylinux1_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-cupti-cu12==12.1.105 (from torch>=1.8.0->ultralytics)
  Using cached nvidia_cuda_cupti_cu12-12.1.105-py3-none-manylinux1_x86_64.whl.metadata (1.6 kB)
Collecting nvidia-cudnn-cu1

In [4]:
# Import other relevant libraries
import os
import os.path as path

import shutil

# Section 2: Dataset Loading

This section prepares the dataset in the local runtime by copying it from Google Drive and unzipping it to /content.

The heavy data processing (merging datasets and writing a custom YAML file) is done ahead of time, so this section only needs to unzip the prepared dataset.
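For context, an Ultralytics detection dataset is described by a small YAML file that lists the image folders and the class names. The sketch below writes one for a 52-class card dataset. The class naming (rank followed by a lowercase suit letter, sorted as strings) is inferred from the validation table in Section 3 and is an assumption, as are the helper name `build_data_yaml` and the placeholder `train/images` and `valid/images` entries. The actual `data.yaml` inside `dataset.zip` may differ.

```python
# Sketch only: the class order and folder entries are assumptions,
# not the project's actual data.yaml.
RANKS = ['2', '3', '4', '5', '6', '7', '8', '9', '10', 'J', 'Q', 'K', 'A']
SUITS = ['c', 'd', 'h', 's']  # clubs, diamonds, hearts, spades


def build_data_yaml(train_dir='train/images', val_dir='valid/images'):
    """Return the text of a minimal Ultralytics-style dataset YAML."""
    # Sorting as strings gives 10c, 10d, 10h, 10s, 2c, ... (assumed class order)
    names = sorted(rank + suit for rank in RANKS for suit in SUITS)
    lines = [
        f'train: {train_dir}',
        f'val: {val_dir}',
        f'nc: {len(names)}',
        'names:',
    ]
    # Quote each name so YAML reads values such as 10c as strings
    lines += [f"  {index}: '{name}'" for index, name in enumerate(names)]
    return '\n'.join(lines) + '\n'


yaml_text = build_data_yaml()
print(yaml_text.splitlines()[2])  # nc: 52
```

Generating the names programmatically keeps the class count in step with the `nc=52` override reported by the trainer.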

In [None]:
print('Unzipping dataset file to local runtime, this can take a while...')
dataset_zip_path = path.join(PROJECT_FOLDER_PATH, 'dataset.zip')
dataset_folder_path = 'datasets'

shutil.unpack_archive(dataset_zip_path, dataset_folder_path)

Unzipping dataset file to local runtime, this can take a while...
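Because training takes over an hour, it can be worth confirming that the archive unpacked into the expected layout before starting. The following is a minimal sketch using only the standard library. The helper name `missing_dataset_parts` is hypothetical, and the expected entries are taken from the paths that appear in the training log (`train/labels`, `valid/labels`, `valid/images`, `data.yaml`), with `train/images` assumed by symmetry.

```python
import os.path as path


def missing_dataset_parts(dataset_root):
    """Return the expected files/folders that are absent under dataset_root."""
    expected = [
        'data.yaml',
        path.join('train', 'images'),
        path.join('train', 'labels'),
        path.join('valid', 'images'),
        path.join('valid', 'labels'),
    ]
    return [rel for rel in expected if not path.exists(path.join(dataset_root, rel))]


# 'datasets/dataset' is where the archive is assumed to land after unpacking
missing = missing_dataset_parts(path.join('datasets', 'dataset'))
if missing:
    print('Missing from unpacked dataset:', missing)
else:
    print('Dataset layout looks complete.')
```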


# Section 3: Training

In [None]:
# Start from the pretrained YOLOv8 nano checkpoint
model = YOLO('yolov8n.pt')

# Sanity check: list a few validation images to confirm the dataset was unpacked
print(os.listdir('datasets/dataset/valid/images')[:10])

#yaml_path = path.join(PROJECT_FOLDER_PATH, 'dataset', 'formatted_data.yaml')
yaml_path = path.join('datasets', 'dataset', 'data.yaml')
results = model.train(data=yaml_path, epochs=5, batch=64)

Downloading https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8n.pt to 'yolov8n.pt'...


100%|██████████| 6.25M/6.25M [00:00<00:00, 168MB/s]


['594046321_jpg.rf.cc4a06633f9a3931d17a5f8ea1439b33.jpg', '144236322_jpg.rf.027ab44c35a8d442a3fdf99d8f99d421.jpg', '861294994_jpg.rf.abb02e22d2f027a4a9fdde7bae314288.jpg', '830780958_jpg.rf.c3e1933e25d76450f7539db2659de1e5.jpg', '303237274_jpg.rf.c41effb268a6a52aefa062e66311ff32.jpg', '212788355_jpg.rf.6f81ac2ac018b709795f323bffd3ef2d.jpg', '285154979_jpg.rf.9d4149e094311630949ead8d2a5c79bb.jpg', '860604500_jpg.rf.7217a3e41f34d91090702c092e15433f.jpg', '954068760_jpg.rf.9f46b7bcf190f693c66ce3e42518bf66.jpg', '629236490_jpg.rf.c119f8d2991dfff8ae3bafae051874b1.jpg']
Ultralytics YOLOv8.2.73 🚀 Python-3.10.12 torch-2.3.1+cu121 CUDA:0 (Tesla T4, 15102MiB)
[34m[1mengine/trainer: [0mtask=detect, mode=train, model=yolov8n.pt, data=datasets/dataset/data.yaml, epochs=5, time=None, patience=100, batch=64, imgsz=640, save=True, save_period=-1, cache=False, device=None, workers=8, project=None, name=train, exist_ok=False, pretrained=True, optimizer=auto, verbose=True, seed=0, deterministic=True, 

100%|██████████| 755k/755k [00:00<00:00, 43.6MB/s]


Overriding model.yaml nc=80 with nc=52

                   from  n    params  module                                       arguments                     
  0                  -1  1       464  ultralytics.nn.modules.conv.Conv             [3, 16, 3, 2]                 
  1                  -1  1      4672  ultralytics.nn.modules.conv.Conv             [16, 32, 3, 2]                
  2                  -1  1      7360  ultralytics.nn.modules.block.C2f             [32, 32, 1, True]             
  3                  -1  1     18560  ultralytics.nn.modules.conv.Conv             [32, 64, 3, 2]                
  4                  -1  2     49664  ultralytics.nn.modules.block.C2f             [64, 64, 2, True]             
  5                  -1  1     73984  ultralytics.nn.modules.conv.Conv             [64, 128, 3, 2]               
  6                  -1  2    197632  ultralytics.nn.modules.block.C2f             [128, 128, 2, True]           
  7                  -1  1    295424  ultralytic

[34m[1mtrain: [0mScanning /content/datasets/dataset/train/labels... 35203 images, 0 backgrounds, 0 corrupt: 100%|██████████| 35203/35203 [00:50<00:00, 690.75it/s] 


[34m[1mtrain: [0mNew cache created: /content/datasets/dataset/train/labels.cache
[34m[1malbumentations: [0mBlur(p=0.01, blur_limit=(3, 7)), MedianBlur(p=0.01, blur_limit=(3, 7)), ToGray(p=0.01), CLAHE(p=0.01, clip_limit=(1, 4.0), tile_grid_size=(8, 8))


  self.pid = os.fork()
[34m[1mval: [0mScanning /content/datasets/dataset/valid/labels... 6020 images, 0 backgrounds, 0 corrupt: 100%|██████████| 6020/6020 [00:07<00:00, 776.34it/s]


[34m[1mval: [0mNew cache created: /content/datasets/dataset/valid/labels.cache
Plotting labels to runs/detect/train/labels.jpg... 
[34m[1moptimizer:[0m 'optimizer=auto' found, ignoring 'lr0=0.01' and 'momentum=0.937' and determining best 'optimizer', 'lr0' and 'momentum' automatically... 
[34m[1moptimizer:[0m AdamW(lr=0.000179, momentum=0.9) with parameter groups 57 weight(decay=0.0), 64 weight(decay=0.0005), 63 bias(decay=0.0)
[34m[1mTensorBoard: [0mmodel graph visualization added ✅
Image sizes 640 train, 640 val
Using 2 dataloader workers
Logging results to [1mruns/detect/train[0m
Starting training for 5 epochs...

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


        1/5      9.45G      1.453      4.202       1.07         17        640: 100%|██████████| 551/551 [13:52<00:00,  1.51s/it]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 48/48 [01:24<00:00,  1.75s/it]


                   all       6020      23239      0.177      0.402      0.188      0.136

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


        2/5      8.47G      1.165      2.362     0.9765         14        640: 100%|██████████| 551/551 [13:27<00:00,  1.47s/it]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 48/48 [01:08<00:00,  1.43s/it]


                   all       6020      23239      0.576      0.739      0.697      0.551

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


        3/5      8.63G      1.098      1.611     0.9634          7        640: 100%|██████████| 551/551 [13:03<00:00,  1.42s/it]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 48/48 [01:06<00:00,  1.38s/it]


                   all       6020      23239      0.771      0.877      0.886      0.671

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


        4/5      8.62G      1.044      1.254     0.9537         25        640: 100%|██████████| 551/551 [12:36<00:00,  1.37s/it]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 48/48 [01:07<00:00,  1.41s/it]


                   all       6020      23239      0.842      0.921      0.933      0.754

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


        5/5      8.43G      1.007       1.08      0.948         12        640: 100%|██████████| 551/551 [12:40<00:00,  1.38s/it]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 48/48 [01:04<00:00,  1.34s/it]


                   all       6020      23239      0.873      0.938      0.953      0.798

5 epochs completed in 1.199 hours.
Optimizer stripped from runs/detect/train/weights/last.pt, 6.3MB
Optimizer stripped from runs/detect/train/weights/best.pt, 6.3MB

Validating runs/detect/train/weights/best.pt...
Ultralytics YOLOv8.2.73 🚀 Python-3.10.12 torch-2.3.1+cu121 CUDA:0 (Tesla T4, 15102MiB)
Model summary (fused): 168 layers, 3,015,788 parameters, 0 gradients, 8.1 GFLOPs


                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 48/48 [01:29<00:00,  1.86s/it]


                   all       6020      23239      0.874      0.937      0.953      0.798
                   10c        309        454      0.827      0.893      0.935      0.733
                   10d        312        451      0.842      0.913      0.947      0.756
                   10h        304        470      0.798      0.828      0.892      0.724
                   10s        277        403      0.823      0.903      0.918      0.747
                    2c        299        439       0.93      0.979      0.988      0.858
                    2d        284        424      0.936      0.958      0.984      0.833
                    2h        323        477      0.919      0.941      0.978       0.83
                    2s        294        436      0.957      0.952       0.99      0.849
                    3c        323        484      0.869      0.979      0.984      0.865
                    3d        273        395      0.935       0.91      0.969      0.834
                    3
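
The log reports precision and recall separately for each epoch. As a single-number summary, the F1 score (the harmonic mean of the two) can be computed from the "all" rows above. This is a small illustrative calculation, not something Ultralytics prints in this table; the helper name `f1_score` is hypothetical.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) over all classes after each epoch, copied from the training log
epoch_metrics = [
    (0.177, 0.402),
    (0.576, 0.739),
    (0.771, 0.877),
    (0.842, 0.921),
    (0.873, 0.938),
]

for epoch, (p, r) in enumerate(epoch_metrics, start=1):
    print(f'epoch {epoch}: F1 = {f1_score(p, r):.3f}')
```

For the final epoch this works out to roughly 0.90, consistent with the steady rise in mAP50 across the five epochs.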

# Section 4: Saving Models

In [None]:
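def save_run(src_folder, dest_folder):
    """Copy a local training run from runs/detect/ into the project's Drive folder."""
    src_path = path.join('runs', 'detect', src_folder)
    dest_path = path.join(PROJECT_FOLDER_PATH, 'runs', dest_folder)
    print(f'Copying from {src_path} to {dest_path}...')
    # copytree raises FileExistsError if dest_path already exists, so use a new dest_folder per run
    shutil.copytree(src_path, dest_path)
    print('Done!')

save_run('train', 'train1')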

Copying from runs/detect/train to /content/gdrive/MyDrive/UofT/APS360/APS360 Project/Card Detection/runs/train1...
Done!
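
With its default arguments, `shutil.copytree` fails when the destination folder already exists, so re-running this cell with the same `'train1'` name raises `FileExistsError`. One way around this, sketched below under the assumption that saved runs are named `train1`, `train2`, and so on, is to pick the first unused name automatically. The helper `next_run_name` is hypothetical and not part of the original notebook.

```python
import os.path as path


def next_run_name(runs_root, prefix='train'):
    """Return the first '<prefix><n>' (n = 1, 2, ...) not yet present in runs_root."""
    n = 1
    while path.exists(path.join(runs_root, f'{prefix}{n}')):
        n += 1
    return f'{prefix}{n}'


# Example (commented out because it needs the notebook's PROJECT_FOLDER_PATH and save_run):
# save_run('train', next_run_name(path.join(PROJECT_FOLDER_PATH, 'runs')))
```

This keeps earlier runs intact instead of overwriting them, which matters when comparing weights from different training sessions.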


# Section 5: Testing

In [None]:
from ultralytics.utils.benchmarks import benchmark

# Benchmark the saved weights on the validation set across export formats
model_path = path.join(PROJECT_FOLDER_PATH, 'runs', 'train1', 'weights', 'best.pt')
yaml_path = path.join('datasets', 'dataset', 'data.yaml')
benchmark(model=model_path, data=yaml_path, imgsz=640, half=False)


image 1/1 /usr/local/lib/python3.10/dist-packages/ultralytics/assets/bus.jpg: 640x480 (no detections), 792.3ms
Speed: 24.8ms preprocess, 792.3ms inference, 25.0ms postprocess per image at shape (1, 3, 640, 480)
Downloading https://ultralytics.com/assets/Arial.ttf to '/root/.config/Ultralytics/Arial.ttf'...


100%|██████████| 755k/755k [00:00<00:00, 119MB/s]
[34m[1mval: [0mScanning /content/datasets/dataset/valid/labels... 6020 images, 0 backgrounds, 0 corrupt: 100%|██████████| 6020/6020 [00:02<00:00, 2117.19it/s]


[34m[1mval: [0mNew cache created: /content/datasets/dataset/valid/labels.cache


                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 6020/6020 [23:00<00:00,  4.36it/s]


                   all       6020      23239      0.874      0.938      0.953        0.8
Speed: 5.8ms preprocess, 211.7ms inference, 0.0ms loss, 1.0ms postprocess per image

[34m[1mPyTorch:[0m starting from '/content/gdrive/MyDrive/UofT/APS360/APS360 Project/Card Detection/runs/train1/weights/best.pt' with input shape (1, 3, 640, 640) BCHW and output shape(s) (1, 56, 8400) (6.0 MB)

[34m[1mTorchScript:[0m starting export with torch 2.3.1+cu121...
[34m[1mTorchScript:[0m export success ✅ 2.3s, saved as '/content/gdrive/MyDrive/UofT/APS360/APS360 Project/Card Detection/runs/train1/weights/best.torchscript' (11.9 MB)

Export complete (2.7s)
Results saved to [1m/content/gdrive/.shortcut-targets-by-id/1DsQfGNPcKchvFxCyS5CsfyAnvFYoU-0-/APS360 Project/Card Detection/runs/train1/weights[0m
Predict:         yolo predict task=detect model=/content/gdrive/MyDrive/UofT/APS360/APS360 Project/Card Detection/runs/train1/weights/best.torchscript imgsz=640  
Validate:        yolo val task=det

[34m[1mval: [0mScanning /content/datasets/dataset/valid/labels.cache... 6020 images, 0 backgrounds, 0 corrupt: 100%|██████████| 6020/6020 [00:00<?, ?it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 6020/6020 [18:52<00:00,  5.31it/s]


                   all       6020      23239      0.872      0.938      0.953      0.773
Speed: 1.5ms preprocess, 177.4ms inference, 0.0ms loss, 1.0ms postprocess per image

[34m[1mPyTorch:[0m starting from '/content/gdrive/MyDrive/UofT/APS360/APS360 Project/Card Detection/runs/train1/weights/best.pt' with input shape (1, 3, 640, 640) BCHW and output shape(s) (1, 56, 8400) (6.0 MB)
[31m[1mrequirements:[0m Ultralytics requirement ['onnx>=1.12.0'] not found, attempting AutoUpdate...
Collecting onnx>=1.12.0
  Downloading onnx-1.16.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (16 kB)
Downloading onnx-1.16.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (15.9 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 15.9/15.9 MB 281.1 MB/s eta 0:00:00
Installing collected packages: onnx
Successfully installed onnx-1.16.2

[31m[1mrequirements:[0m AutoUpdate success ✅ 6.7s, installed 1 package: ['onnx>=1.12.0']
[31m[1mrequirements:[0m ⚠️ [1mRestart run

[34m[1mval: [0mScanning /content/datasets/dataset/valid/labels.cache... 6020 images, 0 backgrounds, 0 corrupt: 100%|██████████| 6020/6020 [00:00<?, ?it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95):  59%|█████▊    | 3533/6020 [10:11<10:06,  4.10it/s]

In [13]:
# Load the best weights from the saved run
model_path = path.join(PROJECT_FOLDER_PATH, 'runs', 'train1', 'weights', 'best.pt')
model = YOLO(model_path)

In [18]:
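# Run inference on a sample screenshot; save=True also writes an annotated copy under runs/detect/
image_path = path.join(PROJECT_FOLDER_PATH, 'media', 'screenie_cropped.png')
results = model.predict(image_path, save=True)

# Save the annotated image to the project's media folder on Drive
save_path = path.join(PROJECT_FOLDER_PATH, 'media', 'screenie_cropped_results.png')
for result in results:
    # result.save() renders the detections onto the image and writes it to save_path
    result.save(filename=save_path)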


image 1/1 /content/gdrive/MyDrive/UofT/APS360/APS360 Project/Card Detection/media/screenie_cropped.png: 576x640 2 10ds, 1 2s, 3 9cs, 2 Qds, 1 Qh, 7.3ms
Speed: 2.4ms preprocess, 7.3ms inference, 1.3ms postprocess per image at shape (1, 3, 576, 640)
Results saved to runs/detect/predict2
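
The prediction summary above lists how many of each card were found. To get the same tally programmatically, the predicted class ids can be mapped to labels and counted. In the sketch below, `count_cards` is a hypothetical helper, and the ids and names are made-up values chosen to mirror the summary line. With Ultralytics, the real inputs would come from `result.boxes.cls.tolist()` and `result.names`.

```python
from collections import Counter


def count_cards(class_ids, names):
    """Map predicted class ids to labels and count each label."""
    return Counter(names[int(class_id)] for class_id in class_ids)


# Hypothetical ids and names, chosen to mirror the prediction summary above.
# With Ultralytics these would come from result.boxes.cls.tolist() and result.names.
example_names = {1: '10d', 7: '2s', 32: '9c', 49: 'Qd', 50: 'Qh'}
example_ids = [1.0, 1.0, 7.0, 32.0, 32.0, 32.0, 49.0, 49.0, 50.0]

counts = count_cards(example_ids, example_names)
print(dict(counts))
print('total cards:', sum(counts.values()))  # total cards: 9
```

Keeping the counting logic separate from the model call makes it easy to check without a GPU or the trained weights.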
