# **Video Analytics - YOLO Model Training**

**Author:** [Yi-Jie Wong](https://www.linkedin.com/in/wongyijie/) & [Wingates Voon](https://www.linkedin.com/in/wingates-voon-5858391a0/) & [Yan-Chai Hum](https://www2.utar.edu.my/cv/index.jsp?cv=humyc&reqPageId=aboutMe)<br>
**GitHub:** [Video Analytics](https://github.com/AIProjectsOrg/VARepo)<br>
**Date created:** 2025/06/17<br>
**Last modified:** 2025/07/19<br>
**Description:** Train YOLOv8 for smoking detection in CCTV footage

Pipeline Definition
1.   **Setup Dependencies**
2.   **Setup Dataset**
3.   **Training on Original Dataset**
4.   **Synthetic Dataset**
5.   **Training with Synthetic Dataset**
6.   **Evaluation on All Models**
7.   **Export**

## **Step 1: Setup Dependencies**

In [None]:
!pip install ultralytics==8.3.168

In [None]:
!pip install roboflow==1.2.1
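
The two installs above pin exact versions. As a rough sketch (not part of the original notebook), the pins could be verified after installation with the standard-library `importlib.metadata`. The `PINNED` dictionary and the `check_pins` helper are illustrative names introduced here.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions pinned in the install cells above
PINNED = {"ultralytics": "8.3.168", "roboflow": "1.2.1"}


def check_pins(pins):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, expected in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


# An empty dict means every pinned package is installed at the expected version
print(check_pins(PINNED))
```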

## **Step 2: Setup Dataset**

In [None]:
# Dataset 1: CCTV Smoking Dataset

from roboflow import Roboflow
rf = Roboflow(api_key="rvc5pEYx6sd3cZ8EBcDW")
project = rf.workspace("smoking-gqlqh").project("smoking-cctv-detection-x4fjr")
version = project.version(4)
dataset = version.download("yolov8")

loading Roboflow workspace...
loading Roboflow project...
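
The download cell passes the Roboflow API key as a string literal. One possible alternative, sketched here under the assumption that the key is stored in an environment variable, is to read it at runtime. The variable name `ROBOFLOW_API_KEY` and the helper `get_roboflow_api_key` are hypothetical and not part of the Roboflow package.

```python
import os


def get_roboflow_api_key(env=os.environ, var_name="ROBOFLOW_API_KEY"):
    """Read the API key from an environment mapping (hypothetical variable name)."""
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable before downloading.")
    return key


# Possible usage in the download cell:
#   rf = Roboflow(api_key=get_roboflow_api_key())
```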


In [None]:
# Dataset 2: Diverse Angle Smoking Dataset

from roboflow import Roboflow
rf = Roboflow(api_key="rvc5pEYx6sd3cZ8EBcDW")
project = rf.workspace("smoking-gqlqh").project("smoking-person-detection-h0a2x-zvlip")
version = project.version(1)
dataset = version.download("yolov8")

loading Roboflow workspace...
loading Roboflow project...


In [None]:
# Combined Dataset = Dataset 1 + Dataset 2

import shutil, os

# make a copy of Dataset 1 as combined-dataset
# (dirs_exist_ok=True lets this cell be re-run without raising FileExistsError)
shutil.copytree("/content/Smoking-CCTV-Detection-4", "/content/combined-dataset", dirs_exist_ok=True)

# copy Dataset 2 to combined-dataset
!cp -r "/content/Smoking-Person-Detection-1/train/images" "/content/combined-dataset/train/"
!cp -r "/content/Smoking-Person-Detection-1/train/labels" "/content/combined-dataset/train/"

!cp -r "/content/Smoking-Person-Detection-1/valid/images" "/content/combined-dataset/valid/"
!cp -r "/content/Smoking-Person-Detection-1/valid/labels" "/content/combined-dataset/valid/"

!cp -r "/content/Smoking-Person-Detection-1/test/images" "/content/combined-dataset/test/"
!cp -r "/content/Smoking-Person-Detection-1/test/labels" "/content/combined-dataset/test/"
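
After merging two datasets by copying folders, it can be worth confirming that every image still has a matching label file and vice versa. The sketch below is one way to do that with the standard library only. `find_unpaired` and `IMAGE_EXTS` are illustrative names, and the check assumes the usual YOLO layout in which `images/name.jpg` pairs with `labels/name.txt`.

```python
from pathlib import Path

IMAGE_EXTS = {".jpg", ".jpeg", ".png"}


def find_unpaired(images_dir, labels_dir):
    """Return (image stems without a label, label stems without an image)."""
    image_stems = {
        p.stem for p in Path(images_dir).iterdir() if p.suffix.lower() in IMAGE_EXTS
    }
    label_stems = {p.stem for p in Path(labels_dir).glob("*.txt")}
    return sorted(image_stems - label_stems), sorted(label_stems - image_stems)


# Possible usage on the merged training split:
#   no_label, no_image = find_unpaired("/content/combined-dataset/train/images",
#                                      "/content/combined-dataset/train/labels")
#   print(len(no_label), "images without labels;", len(no_image), "labels without images")
```

Images without a label file are not necessarily an error, since Ultralytics treats them as background images, but an unexpectedly large count usually points to a copy problem.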

## **Step 3: Training YOLOv8n**

STRATEGY 1: Naive Training

In [None]:
from ultralytics import YOLO

# Load a model
model = YOLO("yolov8n.pt")  # load a pretrained model (recommended for training)

# Train the model
results = model.train(data="/content/combined-dataset/data.yaml", epochs=50, imgsz=640, batch=16, val=False)

STRATEGY 2: Balanced Sampling

In [None]:
from ultralytics import YOLO
from ultralytics.data.dataset import YOLODataset
import ultralytics.data.build as build
import numpy as np

class YOLOWeightedDataset(YOLODataset):
    def __init__(self, *args, mode="train", **kwargs):
        """
        Initialize the weighted dataset.

        Positional and remaining keyword arguments are forwarded to YOLODataset;
        `mode` is accepted but unused. Class weights are derived from inverse
        class frequency rather than passed in.
        """

        super(YOLOWeightedDataset, self).__init__(*args, **kwargs)

        self.train_mode = "train" in self.prefix

        # You can also specify weights manually instead
        self.count_instances()
        class_weights = np.sum(self.counts) / self.counts
        self.agg_func = np.mean

        self.class_weights = np.array(class_weights)
        self.weights = self.calculate_weights()
        self.probabilities = self.calculate_probabilities()

    def count_instances(self):
        """
        Count the number of instances per class and store them in self.counts.

        Classes with zero instances are given a count of 1 to avoid
        division by zero when the class weights are computed.
        """
        self.counts = [0 for i in range(len(self.data["names"]))]
        for label in self.labels:
            cls = label['cls'].reshape(-1).astype(int)
            for class_id in cls:
                self.counts[class_id] += 1

        self.counts = np.array(self.counts)
        self.counts = np.where(self.counts == 0, 1, self.counts)

    def calculate_weights(self):
        """
        Calculate the aggregated weight for each label based on class weights.

        Returns:
            list: A list of aggregated weights corresponding to each label.
        """
        weights = []
        for label in self.labels:
            cls = label['cls'].reshape(-1).astype(int)

            # Give a default weight to background class
            if cls.size == 0:
                weights.append(1)
                continue

            # Take mean of weights
            # You can change this weight aggregation function to aggregate weights differently
            # weight = np.mean(self.class_weights[cls])
            # weight = np.max(self.class_weights[cls])
            weight = self.agg_func(self.class_weights[cls])
            weights.append(weight)
        return weights

    def calculate_probabilities(self):
        """
        Calculate and store the sampling probabilities based on the weights.

        Returns:
            list: A list of sampling probabilities corresponding to each label.
        """
        total_weight = sum(self.weights)
        probabilities = [w / total_weight for w in self.weights]
        return probabilities

    def __getitem__(self, index):
        """
        Return transformed label information based on the sampled index.
        """
        # Don't use for validation
        if not self.train_mode:
            return self.transforms(self.get_image_and_label(index))
        else:
            index = np.random.choice(len(self.labels), p=self.probabilities)
            return self.transforms(self.get_image_and_label(index))

# Monkey patch method
build.YOLODataset = YOLOWeightedDataset

# Load a model
model = YOLO("yolov8n.pt")  # load a pretrained model (recommended for training)

# Train the model
results = model.train(data="/content/combined-dataset/data.yaml", epochs=50, imgsz=640, batch=16, val=False)
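
To make the weighting in the class above concrete, here is a small pure-Python sketch of the same arithmetic: inverse-frequency class weights, a per-image mean of those weights, and normalisation into sampling probabilities. The toy label lists and the function name `sampling_probabilities` are made up for illustration; the notebook itself does this with NumPy inside `YOLOWeightedDataset`.

```python
from statistics import mean


def sampling_probabilities(image_classes, num_classes):
    """Mirror the weighting logic: inverse class frequency, mean per image."""
    counts = [0] * num_classes
    for classes in image_classes:
        for class_id in classes:
            counts[class_id] += 1
    counts = [max(c, 1) for c in counts]  # avoid division by zero
    total = sum(counts)
    class_weights = [total / c for c in counts]

    weights = []
    for classes in image_classes:
        if not classes:  # background image gets a default weight of 1
            weights.append(1.0)
        else:
            weights.append(mean(class_weights[c] for c in classes))

    weight_sum = sum(weights)
    return [w / weight_sum for w in weights]


# Toy data: class 1 appears in every image, classes 0 and 2 appear once each
toy_labels = [[1], [1], [1, 0], [1, 2]]
probs = sampling_probabilities(toy_labels, num_classes=3)
print([round(p, 3) for p in probs])
```

In this toy case the two images containing a rare class are each drawn about 2.5 times as often as the images containing only the common class, which is the intended effect of balanced sampling.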

In [None]:
# Restart the runtime
import os
os.kill(os.getpid(), 9)

## **Step 4: Synthetic Dataset**

In [None]:
import os
import requests

# Directory to save the .jpg files
directory = "backgrounds"
os.makedirs(directory, exist_ok=True)

# Indices to skip
skip_indices = [2, 9, 10, 13, 15, 16, 19, 24, 32, 33, 36, 38,
                44, 47, 49, 50, 53, 57, 60, 61, 64, 69, 70, 72,
                73, 75, 77, 78, 90, 92, 93, 95, 96, 97]

# Loop through numbers 1 to 100
for i in range(1, 101):
    if i in skip_indices:
        print(f"Skipping {i}.jpg")
        continue

    url = f"http://graphics.cs.cmu.edu/projects/whatMakesParis/testimgs/{i}.jpg"
    file_path = os.path.join(directory, f"{i}.jpg")

    try:
        response = requests.get(url, timeout=30)  # fail instead of hanging on a stalled connection
        response.raise_for_status()
        with open(file_path, 'wb') as f:
            f.write(response.content)
        print(f"Downloaded {i}.jpg")
    except requests.RequestException as e:
        print(f"Failed to download {i}.jpg: {e}")


In [None]:
import os
len(os.listdir("/content/combined-dataset/train/labels"))

In [None]:
import os
import cv2
import numpy as np
import random

def load_yolo_labels(label_path, img_width, img_height):
    bboxes = []
    with open(label_path, 'r') as f:
        for line in f:
            parts = line.strip().split()
            if len(parts) != 5:
                continue
            _, x_center, y_center, width, height = map(float, parts)
            x1 = int((x_center - width / 2) * img_width)
            y1 = int((y_center - height / 2) * img_height)
            x2 = int((x_center + width / 2) * img_width)
            y2 = int((y_center + height / 2) * img_height)
            bboxes.append((x1, y1, x2, y2))
    return bboxes

def create_mask(img_shape, bboxes):
    mask = np.zeros(img_shape[:2], dtype=np.uint8)
    for (x1, y1, x2, y2) in bboxes:
        cv2.rectangle(mask, (x1, y1), (x2, y2), 255, -1)
    return mask

def extract_objects(img, bboxes):
    objects = []
    for (x1, y1, x2, y2) in bboxes:
        obj = img[y1:y2, x1:x2]
        objects.append(((x1, y1), obj))
    return objects

def paste_objects_on_background(bg, objects):
    for (x, y), obj in objects:
        h, w = obj.shape[:2]
        if y + h <= bg.shape[0] and x + w <= bg.shape[1]:  # simple bounds check
            bg[y:y+h, x:x+w] = obj
    return bg

def generate_synthetic_dataset(image_dir, label_dir, bg_dir):
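    # Skip images written by a previous run so re-running does not create synthetic_synthetic_* copies
    image_files = [f for f in os.listdir(image_dir)
                   if f.endswith(('.jpg', '.png')) and not f.startswith('synthetic_')]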
    bg_files = [os.path.join(bg_dir, f) for f in os.listdir(bg_dir) if f.endswith(('.jpg', '.png'))]

    for img_file in image_files:
        image_path = os.path.join(image_dir, img_file)
        label_file = os.path.splitext(img_file)[0] + '.txt'
        label_path = os.path.join(label_dir, label_file)
        if not os.path.exists(label_path):
            continue

        img = cv2.imread(image_path)
        if img is None:  # cv2.imread returns None for unreadable or corrupt files
            continue
        img_h, img_w = img.shape[:2]

        bboxes = load_yolo_labels(label_path, img_w, img_h)
        if not bboxes:
            continue

        mask = create_mask(img.shape, bboxes)
        objects = extract_objects(img, bboxes)

        bg_img_path = random.choice(bg_files)
        bg = cv2.imread(bg_img_path)
        bg = cv2.resize(bg, (img_w, img_h))

        new_img = paste_objects_on_background(bg.copy(), objects)

        # Save synthetic image and label in the same directories with prefix
        synthetic_img_name = f"synthetic_{img_file}"
        synthetic_label_name = f"synthetic_{label_file}"

        cv2.imwrite(os.path.join(image_dir, synthetic_img_name), new_img)
        with open(label_path, 'r') as lf, open(os.path.join(label_dir, synthetic_label_name), 'w') as out_lf:
            out_lf.write(lf.read())

        print(f"Generated: {synthetic_img_name} and {synthetic_label_name}")

# Example usage
generate_synthetic_dataset('/content/combined-dataset/train/images', '/content/combined-dataset/train/labels', '/content/backgrounds')
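
The conversion in `load_yolo_labels` maps normalised YOLO boxes to pixel corners without clamping. If a label extends slightly past the image border, a corner can become negative, and a negative start index in a NumPy slice counts from the end of the axis, so the crop would silently be wrong or empty. The helper below is a minimal sketch of a clamped variant; `yolo_to_pixel_box` is an illustrative name, not a function from the notebook.

```python
def yolo_to_pixel_box(x_center, y_center, width, height, img_width, img_height):
    """Convert a normalised YOLO box to pixel corners clamped to the image."""
    x1 = int((x_center - width / 2) * img_width)
    y1 = int((y_center - height / 2) * img_height)
    x2 = int((x_center + width / 2) * img_width)
    y2 = int((y_center + height / 2) * img_height)

    # Clamp each corner into [0, img_width] or [0, img_height]
    x1 = min(max(x1, 0), img_width)
    y1 = min(max(y1, 0), img_height)
    x2 = min(max(x2, 0), img_width)
    y2 = min(max(y2, 0), img_height)
    return x1, y1, x2, y2


# A centred box covering half of each dimension of a 640x480 image
print(yolo_to_pixel_box(0.5, 0.5, 0.5, 0.5, 640, 480))
```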


In [None]:
# Randomly select an image/label pair from the combined training set and plot it with its bounding boxes

import random
import matplotlib.pyplot as plt
import cv2
import os

# Define the directories
image_dir = "/content/combined-dataset/train/images"
label_dir = "/content/combined-dataset/train/labels"

# Get list of image files
image_files = [f for f in os.listdir(image_dir) if f.lower().endswith((".jpg", ".png", ".jpeg"))]

if not image_files:
    print("No image files found in the directory.")
else:
    # Randomly select an image file
    random_image_file = random.choice(image_files)
    image_path = os.path.join(image_dir, random_image_file)

    # Construct the corresponding label file path
    label_file_name = os.path.splitext(random_image_file)[0] + ".txt"
    label_path = os.path.join(label_dir, label_file_name)

    # Read the image
    image = cv2.imread(image_path)
    if image is None:
        print(f"Could not read image: {image_path}")
    else:
        # Convert BGR to RGB for displaying with matplotlib
        image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

        # Get image dimensions
        img_h, img_w, _ = image.shape

        # Read the label file and draw bounding boxes
        if os.path.exists(label_path):
            with open(label_path, "r") as f:
                for line in f:
                    parts = line.strip().split()
                    if len(parts) == 5:
                        class_id = int(parts[0])
                        xc, yc, bw, bh = map(float, parts[1:])

                        # Convert YOLO format to pixel coordinates
                        x1 = int((xc - bw / 2) * img_w)
                        y1 = int((yc - bh / 2) * img_h)
                        x2 = int((xc + bw / 2) * img_w)
                        y2 = int((yc + bh / 2) * img_h)

                        # Draw the bounding box (red, since the image is in RGB order; thickness 2)
                        cv2.rectangle(image_rgb, (x1, y1), (x2, y2), (255, 0, 0), 2)

                        # Optionally, put class text on the box (simple example)
                        # cv2.putText(image_rgb, str(class_id), (x1, y1 - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)

        # Plot the image with bounding boxes
        plt.figure(figsize=(10, 10))
        plt.imshow(image_rgb)
        plt.title(f"Image: {random_image_file}\nLabel: {label_file_name}")
        plt.axis('off')  # Hide axes
        plt.show()

## **Step 5: Training with Synthetic Dataset**

In [None]:
from ultralytics import YOLO

# Load a model
model = YOLO("yolov8n.pt")  # load a pretrained model (recommended for training)

# Train the model
results = model.train(data="/content/combined-dataset/data.yaml", epochs=50, imgsz=640, batch=16, val=False)

Ultralytics 8.3.168 🚀 Python-3.11.13 torch-2.6.0+cu124 CUDA:0 (NVIDIA A100-SXM4-40GB, 40507MiB)
engine/trainer: agnostic_nms=False, amp=True, augment=False, auto_augment=randaugment, batch=16, bgr=0.0, box=7.5, cache=False, cfg=None, classes=None, close_mosaic=10, cls=0.5, conf=None, copy_paste=0.0, copy_paste_mode=flip, cos_lr=False, cutmix=0.0, data=/content/combined-dataset/data.yaml, degrees=0.0, deterministic=True, device=None, dfl=1.5, dnn=False, dropout=0.0, dynamic=False, embed=None, epochs=50, erasing=0.4, exist_ok=False, fliplr=0.5, flipud=0.0, format=torchscript, fraction=1.0, freeze=None, half=False, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, imgsz=640, int8=False, iou=0.7, keras=False, kobj=1.0, line_width=None, lr0=0.01, lrf=0.01, mask_ratio=4, max_det=300, mixup=0.0, mode=train, model=yolov8n.pt, momentum=0.937, mosaic=1.0, multi_scale=False, name=train3, nbs=64, nms=False, opset=None, optimize=False, optimizer=auto, overlap_mask=True, patience=100, perspective=0.0,

train: Scanning /content/combined-dataset/train/labels... 4858 images, 0 backgrounds, 0 corrupt: 100%|██████████| 4858/4858 [00:03<00:00, 1462.03it/s]
train: New cache created: /content/combined-dataset/train/labels.cache
albumentations: Blur(p=0.01, blur_limit=(3, 7)), MedianBlur(p=0.01, blur_limit=(3, 7)), ToGray(p=0.01, method='weighted_average', num_output_channels=3), CLAHE(p=0.01, clip_limit=(1.0, 4.0), tile_grid_size=(8, 8))
val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 1871.9±1209.0 MB/s, size: 101.9 KB)
val: Scanning /content/combined-dataset/valid/labels.cache... 411 images, 1 backgrounds, 0 corrupt: 100%|██████████| 411/411 [00:00<?, ?it/s]
Plotting labels to runs/detect/train3/labels.jpg... 
optimizer: 'optimizer=auto' found, ignoring 'lr0=0.01' and 'momentum=0.937' and determining best 'optimizer', 'lr0' and 'momentum' automatically... 
optimizer: AdamW(lr=0.001429, momentum=0.9) with parameter groups 57 weight(decay=0.0), 64 weight(decay=0.0005), 63 bias(decay=0.0)
Image sizes 640 train, 640 val
Using 8 dataloader workers
Logging results to runs/detect/train3
Starting training for 50 epochs...

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size
       1/50      2.09G     0.9508      1.582      1.155         38        640: 100%|██████████| 304/304 [00:30<00:00,  9.81it/s]
       2/50      2.11G     0.9594      1.132      1.153         41        640: 100%|██████████| 304/304 [00:27<00:00, 10.89it/s]
       3/50      2.11G     0.9558      1.013      1.149         35        640: 100%|██████████| 304/304 [00:26<00:00, 11.27it/s]
       4/50      2.11G     0.9492      0.952      1.141         34        640: 100%|██████████| 304/304 [00:26<00:00, 11.30it/s]
       5/50      2.11G     0.9082     0.8846      1.128         53        640: 100%|██████████| 304/304 [00:26<00:00, 11.28it/s]
       6/50      2.12G     0.8733     0.8275      1.108         51        640: 100%|██████████| 304/304 [00:26<00:00, 11.28it/s]
       7/50      2.13G     0.8594     0.7931      1.098         38        640: 100%|██████████| 304/304 [00:27<00:00, 11.16it/s]
       8/50      2.13G     0.8295     0.7598      1.082         49        640: 100%|██████████| 304/304 [00:26<00:00, 11.37it/s]
       9/50      2.15G     0.8106     0.7305      1.071         37        640: 100%|██████████| 304/304 [00:27<00:00, 11.21it/s]
      10/50      2.15G     0.8051     0.7132      1.067         45        640: 100%|██████████| 304/304 [00:26<00:00, 11.34it/s]
      11/50      2.15G     0.8043     0.7205      1.074         41        640: 100%|██████████| 304/304 [00:27<00:00, 11.19it/s]
      12/50      2.15G     0.7962     0.6933      1.067         65        640: 100%|██████████| 304/304 [00:27<00:00, 11.24it/s]
      13/50      2.15G     0.7668     0.6647      1.051         46        640: 100%|██████████| 304/304 [00:26<00:00, 11.27it/s]
      14/50      2.16G     0.7685     0.6588       1.05         40        640: 100%|██████████| 304/304 [00:27<00:00, 11.19it/s]
      15/50      2.16G     0.7516     0.6444       1.04         44        640: 100%|██████████| 304/304 [00:26<00:00, 11.30it/s]
      16/50      2.16G     0.7519     0.6302      1.037         62        640: 100%|██████████| 304/304 [00:26<00:00, 11.31it/s]
      17/50      2.16G     0.7424     0.6225      1.035         38        640: 100%|██████████| 304/304 [00:27<00:00, 11.20it/s]
      18/50      2.16G     0.7236     0.6016      1.027         38        640: 100%|██████████| 304/304 [00:26<00:00, 11.34it/s]
      19/50      2.16G     0.7286     0.6025      1.029         56        640: 100%|██████████| 304/304 [00:26<00:00, 11.31it/s]
      20/50      2.16G      0.714     0.5941      1.018         52        640: 100%|██████████| 304/304 [00:26<00:00, 11.31it/s]
      21/50      2.16G     0.7174     0.5848      1.026         63        640: 100%|██████████| 304/304 [00:26<00:00, 11.47it/s]
      22/50      2.16G     0.6902     0.5668      1.009         58        640: 100%|██████████| 304/304 [00:26<00:00, 11.28it/s]
      23/50      2.16G     0.7026     0.5709      1.017         51        640: 100%|██████████| 304/304 [00:27<00:00, 11.23it/s]
      24/50      2.16G      0.694      0.567      1.017         47        640: 100%|██████████| 304/304 [00:26<00:00, 11.40it/s]
      25/50      2.16G     0.6835     0.5418      1.007         68        640: 100%|██████████| 304/304 [00:27<00:00, 11.21it/s]
      26/50      2.16G     0.6832     0.5409      1.005         39        640: 100%|██████████| 304/304 [00:26<00:00, 11.52it/s]
      27/50      2.16G     0.6773     0.5401      1.005         43        640: 100%|██████████| 304/304 [00:26<00:00, 11.30it/s]
      28/50      2.16G      0.679     0.5311      1.003         31        640: 100%|██████████| 304/304 [00:26<00:00, 11.26it/s]
      29/50      2.16G     0.6647     0.5217     0.9978         30        640: 100%|██████████| 304/304 [00:27<00:00, 11.26it/s]
      30/50      2.16G     0.6556     0.5058     0.9915         41        640: 100%|██████████| 304/304 [00:26<00:00, 11.30it/s]
      31/50      2.16G     0.6547     0.4989     0.9886         41        640: 100%|██████████| 304/304 [00:26<00:00, 11.47it/s]
      32/50      2.16G      0.659      0.503     0.9903         34        640: 100%|██████████| 304/304 [00:27<00:00, 11.24it/s]
      33/50      2.16G     0.6503     0.4973     0.9857         49        640: 100%|██████████| 304/304 [00:26<00:00, 11.37it/s]
      34/50      2.16G     0.6542     0.5044     0.9922         42        640: 100%|██████████| 304/304 [00:26<00:00, 11.36it/s]
      35/50      2.16G     0.6368     0.4852     0.9821         51        640: 100%|██████████| 304/304 [00:27<00:00, 11.23it/s]
      36/50      2.16G     0.6336      0.475     0.9811         32        640: 100%|██████████| 304/304 [00:26<00:00, 11.32it/s]
      37/50      2.16G     0.6336     0.4692      0.978         40        640: 100%|██████████| 304/304 [00:26<00:00, 11.30it/s]
      38/50      2.16G     0.6241     0.4684     0.9786         67        640: 100%|██████████| 304/304 [00:26<00:00, 11.31it/s]
      39/50      2.16G     0.6249     0.4614     0.9762         30        640: 100%|██████████| 304/304 [00:26<00:00, 11.27it/s]
      40/50      2.16G      0.626      0.458     0.9753         51        640: 100%|██████████| 304/304 [00:26<00:00, 11.27it/s]
Closing dataloader mosaic
albumentations: Blur(p=0.01, blur_limit=(3, 7)), MedianBlur(p=0.01, blur_limit=(3, 7)), ToGray(p=0.01, method='weighted_average', num_output_channels=3), CLAHE(p=0.01, clip_limit=(1.0, 4.0), tile_grid_size=(8, 8))
      41/50      2.16G     0.6478     0.4304     0.9591         27        640: 100%|██████████| 304/304 [00:27<00:00, 11.04it/s]
      42/50      2.16G     0.6397     0.4161     0.9502         27        640: 100%|██████████| 304/304 [00:26<00:00, 11.46it/s]
      43/50      2.16G     0.6296     0.4057     0.9476         21        640: 100%|██████████| 304/304 [00:26<00:00, 11.31it/s]
      44/50      2.16G     0.6185     0.3954     0.9424         22        640: 100%|██████████| 304/304 [00:26<00:00, 11.52it/s]
      45/50      2.16G      0.613     0.3897     0.9422         24        640: 100%|██████████| 304/304 [00:26<00:00, 11.46it/s]
      46/50      2.16G     0.6056     0.3802      0.936         20        640: 100%|██████████| 304/304 [00:26<00:00, 11.34it/s]
      47/50      2.16G     0.5955     0.3724     0.9287         26        640: 100%|██████████| 304/304 [00:26<00:00, 11.43it/s]
      48/50      2.16G     0.5952     0.3695     0.9314         22        640: 100%|██████████| 304/304 [00:26<00:00, 11.48it/s]
      49/50      2.16G     0.5846     0.3639     0.9246         21        640: 100%|██████████| 304/304 [00:26<00:00, 11.49it/s]
      50/50      2.16G     0.5806     0.3582     0.9212         24        640: 100%|██████████| 304/304 [00:26<00:00, 11.48it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 13/13 [00:02<00:00,  4.35it/s]


                   all        411        946      0.848      0.758      0.817      0.528

50 epochs completed in 0.380 hours.
Optimizer stripped from runs/detect/train3/weights/last.pt, 6.2MB
Optimizer stripped from runs/detect/train3/weights/best.pt, 6.2MB

Validating runs/detect/train3/weights/best.pt...
Ultralytics 8.3.168 🚀 Python-3.11.13 torch-2.6.0+cu124 CUDA:0 (NVIDIA A100-SXM4-40GB, 40507MiB)
Model summary (fused): 72 layers, 3,006,233 parameters, 0 gradients, 8.1 GFLOPs


                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 13/13 [00:02<00:00,  5.47it/s]


                   all        411        946      0.848      0.758      0.817      0.528
             cigarette        375        377      0.908      0.756      0.855      0.407
                person        394        526      0.981      0.983       0.99      0.856
                 smoke         33         43      0.655      0.535      0.606       0.32
Speed: 0.1ms preprocess, 0.9ms inference, 0.0ms loss, 1.7ms postprocess per image
Results saved to runs/detect/train3
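
The `all` row in the validation table above can be reproduced as the unweighted mean of the three per-class rows, which helps explain why the overall score sits well below the `person` score: the `smoke` class, with only 43 validation instances, counts as much as the other two. The short check below uses the numbers printed in the log; the variable names are illustrative.

```python
# Per-class (P, R, mAP50, mAP50-95) copied from the validation log above
per_class = {
    "cigarette": (0.908, 0.756, 0.855, 0.407),
    "person": (0.981, 0.983, 0.990, 0.856),
    "smoke": (0.655, 0.535, 0.606, 0.320),
}
metric_names = ("P", "R", "mAP50", "mAP50-95")

# Unweighted (macro) mean over classes for each metric
macro = {
    name: sum(values[i] for values in per_class.values()) / len(per_class)
    for i, name in enumerate(metric_names)
}

for name in metric_names:
    print(f"{name}: {macro[name]:.3f}")
```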


In [None]:
from ultralytics import YOLO
from ultralytics.data.dataset import YOLODataset
import ultralytics.data.build as build
import numpy as np

class YOLOWeightedDataset(YOLODataset):
    def __init__(self, *args, mode="train", **kwargs):
        """
        Initialize the weighted dataset.

        Positional and remaining keyword arguments are forwarded to YOLODataset;
        `mode` is accepted but unused. Class weights are derived from inverse
        class frequency rather than passed in.
        """

        super(YOLOWeightedDataset, self).__init__(*args, **kwargs)

        self.train_mode = "train" in self.prefix

        # You can also specify weights manually instead
        self.count_instances()
        class_weights = np.sum(self.counts) / self.counts
        self.agg_func = np.mean

        self.class_weights = np.array(class_weights)
        self.weights = self.calculate_weights()
        self.probabilities = self.calculate_probabilities()

    def count_instances(self):
        """
        Count the number of instances per class and store them in self.counts.

        Classes with zero instances are given a count of 1 to avoid
        division by zero when the class weights are computed.
        """
        self.counts = [0 for i in range(len(self.data["names"]))]
        for label in self.labels:
            cls = label['cls'].reshape(-1).astype(int)
            for class_id in cls:
                self.counts[class_id] += 1

        self.counts = np.array(self.counts)
        self.counts = np.where(self.counts == 0, 1, self.counts)

    def calculate_weights(self):
        """
        Calculate the aggregated weight for each label based on class weights.

        Returns:
            list: A list of aggregated weights corresponding to each label.
        """
        weights = []
        for label in self.labels:
            cls = label['cls'].reshape(-1).astype(int)

            # Give a default weight to background class
            if cls.size == 0:
                weights.append(1)
                continue

            # Take mean of weights
            # You can change this weight aggregation function to aggregate weights differently
            # weight = np.mean(self.class_weights[cls])
            # weight = np.max(self.class_weights[cls])
            weight = self.agg_func(self.class_weights[cls])
            weights.append(weight)
        return weights

    def calculate_probabilities(self):
        """
        Calculate and store the sampling probabilities based on the weights.

        Returns:
            list: A list of sampling probabilities corresponding to each label.
        """
        total_weight = sum(self.weights)
        probabilities = [w / total_weight for w in self.weights]
        return probabilities

    def __getitem__(self, index):
        """
        Return transformed label information based on the sampled index.
        """
        # Don't use for validation
        if not self.train_mode:
            return self.transforms(self.get_image_and_label(index))
        else:
            index = np.random.choice(len(self.labels), p=self.probabilities)
            return self.transforms(self.get_image_and_label(index))

# Monkey patch method
build.YOLODataset = YOLOWeightedDataset

# Load a model
model = YOLO("yolov8n.pt")  # load a pretrained model (recommended for training)

# Train the model
results = model.train(data="/content/combined-dataset/data.yaml", epochs=50, imgsz=640, batch=16, val=False)

Ultralytics 8.3.168 🚀 Python-3.11.13 torch-2.6.0+cu124 CUDA:0 (NVIDIA A100-SXM4-40GB, 40507MiB)
engine/trainer: agnostic_nms=False, amp=True, augment=False, auto_augment=randaugment, batch=16, bgr=0.0, box=7.5, cache=False, cfg=None, classes=None, close_mosaic=10, cls=0.5, conf=None, copy_paste=0.0, copy_paste_mode=flip, cos_lr=False, cutmix=0.0, data=/content/combined-dataset/data.yaml, degrees=0.0, deterministic=True, device=None, dfl=1.5, dnn=False, dropout=0.0, dynamic=False, embed=None, epochs=50, erasing=0.4, exist_ok=False, fliplr=0.5, flipud=0.0, format=torchscript, fraction=1.0, freeze=None, half=False, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, imgsz=640, int8=False, iou=0.7, keras=False, kobj=1.0, line_width=None, lr0=0.01, lrf=0.01, mask_ratio=4, max_det=300, mixup=0.0, mode=train, model=yolov8n.pt, momentum=0.937, mosaic=1.0, multi_scale=False, name=train4, nbs=64, nms=False, opset=None, optimize=False, optimizer=auto, overlap_mask=True, patience=100, perspective=0.0,

train: Scanning /content/combined-dataset/train/labels.cache... 4858 images, 0 backgrounds, 0 corrupt: 100%|██████████| 4858/4858 [00:00<?, ?it/s]
albumentations: Blur(p=0.01, blur_limit=(3, 7)), MedianBlur(p=0.01, blur_limit=(3, 7)), ToGray(p=0.01, method='weighted_average', num_output_channels=3), CLAHE(p=0.01, clip_limit=(1.0, 4.0), tile_grid_size=(8, 8))
val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 407.1±94.3 MB/s, size: 101.9 KB)
val: Scanning /content/combined-dataset/valid/labels.cache... 411 images, 1 backgrounds, 0 corrupt: 100%|██████████| 411/411 [00:00<?, ?it/s]
Plotting labels to runs/detect/train4/labels.jpg... 
optimizer: 'optimizer=auto' found, ignoring 'lr0=0.01' and 'momentum=0.937' and determining best 'optimizer', 'lr0' and 'momentum' automatically... 
optimizer: AdamW(lr=0.001429, momentum=0.9) with parameter groups 57 weight(decay=0.0), 64 weight(decay=0.0005), 63 bias(decay=0.0)
Image sizes 640 train, 640 val
Using 8 dataloader workers
Logging results to runs/detect/train4
Starting training for 50 epochs...

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size
       1/50      2.11G     0.9804      1.692      1.195         38        640: 100%|██████████| 304/304 [00:30<00:00, 10.10it/s]
       2/50      2.13G      0.976      1.213      1.189         46        640: 100%|██████████| 304/304 [00:27<00:00, 10.99it/s]
       3/50      2.13G      0.987      1.117      1.197         36        640: 100%|██████████| 304/304 [00:27<00:00, 11.15it/s]
       4/50      2.14G     0.9745      1.037      1.189         34        640: 100%|██████████| 304/304 [00:27<00:00, 11.25it/s]
       5/50      2.15G      0.925      0.974      1.162         50        640: 100%|██████████| 304/304 [00:26<00:00, 11.27it/s]
       6/50      2.15G     0.8788     0.8912      1.135         63        640: 100%|██████████| 304/304 [00:26<00:00, 11.28it/s]
       7/50      2.15G     0.8784     0.8781      1.129         37        640: 100%|██████████| 304/304 [00:26<00:00, 11.29it/s]
       8/50      2.15G     0.8575     0.8416      1.121         57        640: 100%|██████████| 304/304 [00:27<00:00, 11.25it/s]
       9/50      2.15G     0.8513     0.8017      1.112         50        640: 100%|██████████| 304/304 [00:27<00:00, 11.25it/s]
      10/50      2.15G     0.8109     0.7714      1.096         42        640: 100%|██████████| 304/304 [00:26<00:00, 11.34it/s]
      11/50      2.15G       0.81     0.7624      1.097         50        640: 100%|██████████| 304/304 [00:26<00:00, 11.26it/s]
      12/50      2.15G     0.7991     0.7378      1.088         51        640: 100%|██████████| 304/304 [00:26<00:00, 11.29it/s]
      13/50      2.15G      0.787     0.7191      1.081         41        640: 100%|██████████| 304/304 [00:27<00:00, 11.15it/s]
      14/50      2.15G     0.7772     0.7096      1.077         34        640: 100%|██████████| 304/304 [00:26<00:00, 11.30it/s]
      15/50      2.15G     0.7726     0.6838      1.066         36        640: 100%|██████████| 304/304 [00:27<00:00, 11.17it/s]
      16/50      2.16G     0.7465      0.663      1.061         54        640: 100%|██████████| 304/304 [00:26<00:00, 11.26it/s]
      17/50      2.17G     0.7371     0.6409       1.05         38        640: 100%|██████████| 304/304 [00:27<00:00, 11.11it/s]
      18/50      2.17G     0.7365     0.6329      1.055         38        640: 100%|██████████| 304/304 [00:27<00:00, 11.11it/s]
      19/50      2.17G     0.7265     0.6289      1.047         62        640: 100%|██████████| 304/304 [00:27<00:00, 11.12it/s]
      20/50      2.17G     0.7257     0.6188      1.044         69        640: 100%|██████████| 304/304 [00:27<00:00, 11.19it/s]
      21/50      2.17G     0.7096     0.5997      1.037         46        640: 100%|██████████| 304/304 [00:27<00:00, 11.18it/s]
      22/50      2.17G     0.7146     0.6065      1.049         53        640: 100%|██████████| 304/304 [00:27<00:00, 11.20it/s]
      23/50      2.17G     0.7052     0.5896      1.032         50        640: 100%|██████████| 304/304 [00:27<00:00, 11.18it/s]
      24/50      2.17G     0.6958     0.5753      1.028         44        640: 100%|██████████| 304/304 [00:27<00:00, 11.23it/s]
      25/50      2.17G      0.695     0.5662      1.031         57        640: 100%|██████████| 304/304 [00:27<00:00, 11.13it/s]
      26/50      2.17G     0.6863     0.5615      1.025         42        640: 100%|██████████| 304/304 [00:26<00:00, 11.27it/s]
      27/50      2.17G     0.6743     0.5399      1.013         30        640: 100%|██████████| 304/304 [00:27<00:00, 11.04it/s]
      28/50      2.17G     0.6701     0.5524      1.016         37        640: 100%|██████████| 304/304 [00:27<00:00, 11.16it/s]
      29/50      2.17G     0.6618     0.5303      1.009         37        640: 100%|██████████| 304/304 [00:27<00:00, 11.19it/s]
      30/50      2.17G     0.6642     0.5285      1.011         53        640: 100%|██████████| 304/304 [00:27<00:00, 11.16it/s]
      31/50      2.17G      0.651     0.5251      1.005         49        640: 100%|██████████| 304/304 [00:27<00:00, 11.10it/s]
      32/50      2.17G      0.642     0.5117      1.004         42        640: 100%|██████████| 304/304 [00:27<00:00, 11.25it/s]
      33/50      2.17G     0.6335     0.5027     0.9987         58        640: 100%|██████████| 304/304 [00:27<00:00, 11.19it/s]
      34/50      2.17G      0.636     0.4997     0.9952         56        640: 100%|██████████| 304/304 [00:27<00:00, 11.13it/s]
      35/50      2.17G     0.6302     0.4901     0.9956         61        640: 100%|██████████| 304/304 [00:26<00:00, 11.27it/s]
      36/50      2.17G     0.6222     0.4793     0.9922         34        640: 100%|██████████| 304/304 [00:27<00:00, 11.15it/s]
      37/50      2.17G     0.6312      0.495     0.9953         55        640: 100%|██████████| 304/304 [00:27<00:00, 11.08it/s]
      38/50      2.17G     0.6156     0.4669     0.9845         59        640: 100%|██████████| 304/304 [00:27<00:00, 11.25it/s]
      39/50      2.17G     0.6064     0.4616     0.9862         33        640: 100%|██████████| 304/304 [00:27<00:00, 11.18it/s]
      40/50      2.17G     0.6041     0.4593     0.9807         54        640: 100%|██████████| 304/304 [00:27<00:00, 11.18it/s]


Closing dataloader mosaic
albumentations: Blur(p=0.01, blur_limit=(3, 7)), MedianBlur(p=0.01, blur_limit=(3, 7)), ToGray(p=0.01, method='weighted_average', num_output_channels=3), CLAHE(p=0.01, clip_limit=(1.0, 4.0), tile_grid_size=(8, 8))

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size
      41/50      2.17G     0.6159     0.4147      0.961         26        640: 100%|██████████| 304/304 [00:27<00:00, 10.94it/s]
      42/50      2.17G      0.616     0.3961     0.9598         28        640: 100%|██████████| 304/304 [00:26<00:00, 11.48it/s]
      43/50      2.17G     0.5961     0.3872     0.9492         20        640: 100%|██████████| 304/304 [00:26<00:00, 11.31it/s]
      44/50      2.17G      0.605      0.388     0.9529         23        640: 100%|██████████| 304/304 [00:26<00:00, 11.34it/s]
      45/50      2.17G     0.5735     0.3683     0.9367         19        640: 100%|██████████| 304/304 [00:26<00:00, 11.40it/s]
      46/50      2.17G     0.5775     0.3619     0.9336         25        640: 100%|██████████| 304/304 [00:27<00:00, 11.20it/s]
      47/50      2.17G     0.5859     0.3672      0.941         38        640: 100%|██████████| 304/304 [00:26<00:00, 11.46it/s]
      48/50      2.17G      0.561     0.3449     0.9291         25        640: 100%|██████████| 304/304 [00:26<00:00, 11.28it/s]
      49/50      2.17G       0.55     0.3467     0.9217         25        640: 100%|██████████| 304/304 [00:26<00:00, 11.37it/s]
      50/50      2.17G     0.5513     0.3388     0.9227         23        640: 100%|██████████| 304/304 [00:26<00:00, 11.45it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 13/13 [00:02<00:00,  4.86it/s]
                   all        411        946      0.826       0.77      0.798      0.511

50 epochs completed in 0.382 hours.
Optimizer stripped from runs/detect/train4/weights/last.pt, 6.2MB
Optimizer stripped from runs/detect/train4/weights/best.pt, 6.2MB

Validating runs/detect/train4/weights/best.pt...
Ultralytics 8.3.168 🚀 Python-3.11.13 torch-2.6.0+cu124 CUDA:0 (NVIDIA A100-SXM4-40GB, 40507MiB)
Model summary (fused): 72 layers, 3,006,233 parameters, 0 gradients, 8.1 GFLOPs
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 13/13 [00:02<00:00,  5.53it/s]
                   all        411        946      0.825       0.77      0.796      0.509
             cigarette        375        377      0.903      0.743      0.832       0.39
                person        394        526      0.981      0.987       0.99      0.853
                 smoke         33         43       0.59      0.581      0.567      0.284
Speed: 0.1ms preprocess, 1.0ms inference, 0.0ms loss, 1.7ms postprocess per image
Results saved to runs/detect/train4
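
The weighted dataset used in this training run replaces the index requested by the dataloader with one drawn from `self.probabilities`. The toy sketch below illustrates the effect of that draw. It is only an illustration: the probabilities are made up (they are not the values computed in this notebook), `sample_indices` is a hypothetical helper, and `random.choices` from the standard library stands in for `np.random.choice`.

```python
import random
from collections import Counter


def sample_indices(probabilities, n_draws, seed=0):
    """Draw dataset indices so that index i is picked with probability probabilities[i]."""
    rng = random.Random(seed)
    population = range(len(probabilities))
    return rng.choices(population, weights=probabilities, k=n_draws)


# Hypothetical probabilities: image 2 is assumed to contain a rare class,
# so it receives most of the sampling mass.
probabilities = [0.1, 0.1, 0.8]
counts = Counter(sample_indices(probabilities, n_draws=10_000))

for index in sorted(counts):
    print(f"image {index}: drawn {counts[index]} times")
```

In the validation log above, `smoke` has 43 instances against 526 for `person`, which is the kind of imbalance this sampling scheme is meant to counteract.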


## **Step 6: Evaluation on All Models**

In [None]:
# Load the last-epoch weights (last.pt) from the training run
model = YOLO("/content/runs/detect/train/weights/last.pt")

# Validate the model on the validation and test sets
results = model.val(data="/content/combined-dataset/data.yaml", split="val", conf=0.10)
results = model.val(data="/content/combined-dataset/data.yaml", split="test", conf=0.10)

Ultralytics 8.3.168 🚀 Python-3.11.13 torch-2.6.0+cu124 CUDA:0 (NVIDIA A100-SXM4-40GB, 40507MiB)
Model summary (fused): 72 layers, 3,006,233 parameters, 0 gradients, 8.1 GFLOPs
val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 1615.9±1070.1 MB/s, size: 73.2 KB)
val: Scanning /content/combined-dataset/valid/labels.cache... 411 images, 1 backgrounds, 0 corrupt: 100%|██████████| 411/411 [00:00<?, ?it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 26/26 [00:02<00:00,  8.94it/s]
                   all        411        946      0.813      0.745      0.821      0.529
             cigarette        375        377      0.881      0.766      0.856      0.425
                person        394        526      0.982      0.981       0.99      0.866
                 smoke         33         43      0.576      0.488      0.619      0.296
Speed: 0.4ms preprocess, 1.6ms inference, 0.0ms loss, 1.9ms postprocess per image
Results saved to runs/detect/val8
Ultralytics 8.3.168 🚀 Python-3.11.13 torch-2.6.0+cu124 CUDA:0 (NVIDIA A100-SXM4-40GB, 40507MiB)
val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 2040.3±882.2 MB/s, size: 92.6 KB)
val: Scanning /content/combined-dataset/test/labels... 156 images, 0 backgrounds, 0 corrupt: 100%|██████████| 156/156 [00:00<00:00, 1478.07it/s]
val: New cache created: /content/combined-dataset/test/labels.cache
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 10/10 [00:01<00:00,  5.29it/s]
                   all        156        388      0.847      0.833      0.867      0.552
             cigarette        146        146      0.815      0.755      0.839      0.434
                person        153        226      0.942      0.996      0.991      0.856
                 smoke         13         16      0.785       0.75      0.772      0.365
Speed: 0.9ms preprocess, 1.1ms inference, 0.0ms loss, 4.8ms postprocess per image
Results saved to runs/detect/val9
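
When reading these tables, note that the `all` row is the unweighted mean of the per-class rows, so a small class such as `smoke` influences it as much as `person`. The sketch below checks this against the validation-split numbers printed above; `macro_average` is a hypothetical helper, and agreement is only expected up to the three-decimal rounding of the log.

```python
# Per-class (P, R, mAP50, mAP50-95) copied from the validation-split table above.
per_class = {
    "cigarette": (0.881, 0.766, 0.856, 0.425),
    "person": (0.982, 0.981, 0.990, 0.866),
    "smoke": (0.576, 0.488, 0.619, 0.296),
}
reported_all = (0.813, 0.745, 0.821, 0.529)


def macro_average(rows):
    """Average each metric column over classes, giving every class equal weight."""
    columns = zip(*rows.values())
    return tuple(sum(column) / len(rows) for column in columns)


averaged = macro_average(per_class)
for name, ours, reported in zip(("P", "R", "mAP50", "mAP50-95"), averaged, reported_all):
    print(f"{name}: recomputed {ours:.3f}, reported {reported:.3f}")
```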


In [None]:
# Load the last-epoch weights (last.pt) from the training run
model = YOLO("/content/runs/detect/train2/weights/last.pt")

# Validate the model on the validation and test sets
results = model.val(data="/content/combined-dataset/data.yaml", split="val", conf=0.10)
results = model.val(data="/content/combined-dataset/data.yaml", split="test", conf=0.10)

Ultralytics 8.3.168 🚀 Python-3.11.13 torch-2.6.0+cu124 CUDA:0 (NVIDIA A100-SXM4-40GB, 40507MiB)
Model summary (fused): 72 layers, 3,006,233 parameters, 0 gradients, 8.1 GFLOPs
val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 2046.5±1200.7 MB/s, size: 70.3 KB)
val: Scanning /content/combined-dataset/valid/labels.cache... 411 images, 1 backgrounds, 0 corrupt: 100%|██████████| 411/411 [00:00<?, ?it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 26/26 [00:02<00:00,  9.45it/s]
                   all        411        946      0.831      0.763      0.801      0.518
             cigarette        375        377      0.868      0.745      0.826      0.415
                person        394        526      0.979      0.986      0.991      0.865
                 smoke         33         43      0.645      0.558      0.587      0.275
Speed: 0.4ms preprocess, 0.8ms inference, 0.0ms loss, 1.7ms postprocess per image
Results saved to runs/detect/val10
Ultralytics 8.3.168 🚀 Python-3.11.13 torch-2.6.0+cu124 CUDA:0 (NVIDIA A100-SXM4-40GB, 40507MiB)
val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 3152.3±1034.0 MB/s, size: 109.2 KB)
val: Scanning /content/combined-dataset/test/labels.cache... 156 images, 0 backgrounds, 0 corrupt: 100%|██████████| 156/156 [00:00<?, ?it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 10/10 [00:01<00:00,  5.55it/s]
                   all        156        388      0.836      0.846      0.853      0.558
             cigarette        146        146      0.792      0.788      0.827       0.43
                person        153        226      0.943          1      0.991      0.858
                 smoke         13         16      0.771       0.75      0.742      0.384
Speed: 0.9ms preprocess, 2.2ms inference, 0.0ms loss, 2.9ms postprocess per image
Results saved to runs/detect/val11


In [None]:
# Load the last-epoch weights (last.pt) from the training run
model = YOLO("/content/runs/detect/train3/weights/last.pt")

# Validate the model on the validation and test sets
results = model.val(data="/content/combined-dataset/data.yaml", split="val", conf=0.10)
results = model.val(data="/content/combined-dataset/data.yaml", split="test", conf=0.10)

Ultralytics 8.3.168 🚀 Python-3.11.13 torch-2.6.0+cu124 CUDA:0 (NVIDIA A100-SXM4-40GB, 40507MiB)
Model summary (fused): 72 layers, 3,006,233 parameters, 0 gradients, 8.1 GFLOPs
val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 1932.4±1133.7 MB/s, size: 67.2 KB)
val: Scanning /content/combined-dataset/valid/labels.cache... 411 images, 1 backgrounds, 0 corrupt: 100%|██████████| 411/411 [00:00<?, ?it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 26/26 [00:02<00:00,  9.71it/s]
                   all        411        946      0.847      0.757      0.824      0.554
             cigarette        375        377      0.905      0.755      0.871      0.449
                person        394        526       0.98      0.981      0.991      0.872
                 smoke         33         43      0.656      0.535      0.608      0.343
Speed: 0.3ms preprocess, 0.8ms inference, 0.0ms loss, 1.5ms postprocess per image
Results saved to runs/detect/val12
Ultralytics 8.3.168 🚀 Python-3.11.13 torch-2.6.0+cu124 CUDA:0 (NVIDIA A100-SXM4-40GB, 40507MiB)
val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 2139.0±812.7 MB/s, size: 88.6 KB)
val: Scanning /content/combined-dataset/test/labels.cache... 156 images, 0 backgrounds, 0 corrupt: 100%|██████████| 156/156 [00:00<?, ?it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 10/10 [00:01<00:00,  5.54it/s]
                   all        156        388      0.898       0.82       0.89      0.577
             cigarette        146        146      0.905      0.781      0.901      0.495
                person        153        226      0.971      0.991      0.991      0.862
                 smoke         13         16      0.817      0.688      0.779      0.374
Speed: 0.8ms preprocess, 0.8ms inference, 0.0ms loss, 2.2ms postprocess per image
Results saved to runs/detect/val13


In [None]:
# Load the last-epoch weights (last.pt) from the training run
model = YOLO("/content/runs/detect/train4/weights/last.pt")

# Validate the model on the validation and test sets
results = model.val(data="/content/combined-dataset/data.yaml", split="val", conf=0.10)
results = model.val(data="/content/combined-dataset/data.yaml", split="test", conf=0.10)

Ultralytics 8.3.168 🚀 Python-3.11.13 torch-2.6.0+cu124 CUDA:0 (NVIDIA A100-SXM4-40GB, 40507MiB)
Model summary (fused): 72 layers, 3,006,233 parameters, 0 gradients, 8.1 GFLOPs
val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 1823.1±1111.7 MB/s, size: 73.0 KB)
val: Scanning /content/combined-dataset/valid/labels.cache... 411 images, 1 backgrounds, 0 corrupt: 100%|██████████| 411/411 [00:00<?, ?it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 26/26 [00:02<00:00,  9.83it/s]
                   all        411        946      0.818       0.77      0.813      0.541
             cigarette        375        377        0.9      0.741      0.852      0.434
                person        394        526      0.979      0.987      0.991      0.871
                 smoke         33         43      0.575      0.581      0.596      0.319
Speed: 0.4ms preprocess, 0.8ms inference, 0.0ms loss, 2.1ms postprocess per image
Results saved to runs/detect/val14
Ultralytics 8.3.168 🚀 Python-3.11.13 torch-2.6.0+cu124 CUDA:0 (NVIDIA A100-SXM4-40GB, 40507MiB)
val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 2581.6±728.7 MB/s, size: 110.7 KB)
val: Scanning /content/combined-dataset/test/labels.cache... 156 images, 0 backgrounds, 0 corrupt: 100%|██████████| 156/156 [00:00<?, ?it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 10/10 [00:01<00:00,  5.53it/s]
                   all        156        388      0.811      0.829      0.843      0.563
             cigarette        146        146      0.834      0.829      0.867      0.458
                person        153        226      0.961      0.996      0.993      0.853
                 smoke         13         16      0.639      0.664      0.671      0.377
Speed: 1.0ms preprocess, 0.8ms inference, 0.0ms loss, 2.0ms postprocess per image
Results saved to runs/detect/val15
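
To compare the four runs at a glance, the `all` rows for the test split of the combined dataset can be gathered into one structure and ranked. The numbers below are copied from the logs in this step; `rank_runs` is a hypothetical helper, not part of Ultralytics.

```python
# "all" row on the combined-dataset test split (156 images), one entry per training run.
test_metrics = {
    "train": {"mAP50": 0.867, "mAP50-95": 0.552},
    "train2": {"mAP50": 0.853, "mAP50-95": 0.558},
    "train3": {"mAP50": 0.890, "mAP50-95": 0.577},
    "train4": {"mAP50": 0.843, "mAP50-95": 0.563},
}


def rank_runs(metrics, key):
    """Return run names sorted from best to worst on the given metric."""
    return sorted(metrics, key=lambda run: metrics[run][key], reverse=True)


for key in ("mAP50", "mAP50-95"):
    ranking = rank_runs(test_metrics, key)
    print(key, "->", ", ".join(f"{run} ({test_metrics[run][key]:.3f})" for run in ranking))
```

On this split `train3` ranks first on both metrics. The test set holds only 16 `smoke` instances, so small gaps between runs should be read with caution.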


## **Step 6.5: Evaluate on CCTV dataset**

In [None]:
# Load the last-epoch weights (last.pt) from the training run
model = YOLO("/content/runs/detect/train/weights/last.pt")

# Validate the model on the validation and test sets
results = model.val(data="/content/Smoking-CCTV-Detection-4/data.yaml", split="val", conf=0.10)
results = model.val(data="/content/Smoking-CCTV-Detection-4/data.yaml", split="test", conf=0.10)

Ultralytics 8.3.168 🚀 Python-3.11.13 torch-2.6.0+cu124 CUDA:0 (NVIDIA A100-SXM4-40GB, 40507MiB)
Model summary (fused): 72 layers, 3,006,233 parameters, 0 gradients, 8.1 GFLOPs
val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 1896.6±439.5 MB/s, size: 117.2 KB)
val: Scanning /content/Smoking-CCTV-Detection-4/valid/labels... 41 images, 0 backgrounds, 0 corrupt: 100%|██████████| 41/41 [00:00<00:00, 1503.15it/s]
val: New cache created: /content/Smoking-CCTV-Detection-4/valid/labels.cache
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 3/3 [00:01<00:00,  2.67it/s]
                   all         41        195      0.726      0.705       0.76      0.448
             cigarette         32         33      0.774      0.364      0.577      0.202
                person         41        145      0.956      0.986      0.993      0.817
                 smoke         15         17      0.449      0.765      0.711      0.324
Speed: 0.9ms preprocess, 9.0ms inference, 0.0ms loss, 1.6ms postprocess per image
Results saved to runs/detect/val16
Ultralytics 8.3.168 🚀 Python-3.11.13 torch-2.6.0+cu124 CUDA:0 (NVIDIA A100-SXM4-40GB, 40507MiB)
val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 2299.3±543.5 MB/s, size: 113.7 KB)
val: Scanning /content/Smoking-CCTV-Detection-4/test/labels... 21 images, 0 backgrounds, 0 corrupt: 100%|██████████| 21/21 [00:00<00:00, 1465.88it/s]
val: New cache created: /content/Smoking-CCTV-Detection-4/test/labels.cache
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 2/2 [00:00<00:00,  2.90it/s]
                   all         21        116      0.768      0.717      0.751      0.453
             cigarette         15         15      0.728        0.4      0.477      0.164
                person         21         85      0.944          1      0.994      0.821
                 smoke         13         16      0.632       0.75      0.783      0.373
Speed: 2.2ms preprocess, 4.3ms inference, 0.0ms loss, 12.0ms postprocess per image
Results saved to runs/detect/val17


In [None]:
# Load the last-epoch weights (last.pt) from the training run
model = YOLO("/content/runs/detect/train2/weights/last.pt")

# Validate the model on the validation and test sets
results = model.val(data="/content/Smoking-CCTV-Detection-4/data.yaml", split="val", conf=0.10)
results = model.val(data="/content/Smoking-CCTV-Detection-4/data.yaml", split="test", conf=0.10)

Ultralytics 8.3.168 🚀 Python-3.11.13 torch-2.6.0+cu124 CUDA:0 (NVIDIA A100-SXM4-40GB, 40507MiB)
Model summary (fused): 72 layers, 3,006,233 parameters, 0 gradients, 8.1 GFLOPs
val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 2797.0±728.4 MB/s, size: 111.9 KB)
val: Scanning /content/Smoking-CCTV-Detection-4/valid/labels.cache... 41 images, 0 backgrounds, 0 corrupt: 100%|██████████| 41/41 [00:00<?, ?it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 3/3 [00:01<00:00,  2.70it/s]
                   all         41        195      0.829      0.619      0.686       0.42
             cigarette         32         33      0.906      0.212       0.41      0.106
                person         41        145      0.973      0.998      0.994      0.821
                 smoke         15         17      0.609      0.647      0.655      0.331
Speed: 1.8ms preprocess, 1.1ms inference, 0.0ms loss, 8.3ms postprocess per image
Results saved to runs/detect/val18
Ultralytics 8.3.168 🚀 Python-3.11.13 torch-2.6.0+cu124 CUDA:0 (NVIDIA A100-SXM4-40GB, 40507MiB)
val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 1777.7±795.8 MB/s, size: 108.6 KB)
val: Scanning /content/Smoking-CCTV-Detection-4/test/labels.cache... 21 images, 0 backgrounds, 0 corrupt: 100%|██████████| 21/21 [00:00<?, ?it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 2/2 [00:00<00:00,  4.65it/s]
                   all         21        116      0.836      0.709      0.793      0.478
             cigarette         15         15      0.849      0.377       0.59      0.192
                person         21         85      0.931          1      0.995      0.829
                 smoke         13         16      0.729       0.75      0.794      0.412
Speed: 2.1ms preprocess, 1.9ms inference, 0.0ms loss, 1.2ms postprocess per image
Results saved to runs/detect/val19


In [None]:
# Load the last-epoch weights (last.pt) from the training run
model = YOLO("/content/runs/detect/train3/weights/last.pt")

# Validate the model on the validation and test sets
results = model.val(data="/content/Smoking-CCTV-Detection-4/data.yaml", split="val", conf=0.10)
results = model.val(data="/content/Smoking-CCTV-Detection-4/data.yaml", split="test", conf=0.10)

Ultralytics 8.3.168 🚀 Python-3.11.13 torch-2.6.0+cu124 CUDA:0 (NVIDIA A100-SXM4-40GB, 40507MiB)
Model summary (fused): 72 layers, 3,006,233 parameters, 0 gradients, 8.1 GFLOPs
val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 2010.0±793.7 MB/s, size: 114.3 KB)
val: Scanning /content/Smoking-CCTV-Detection-4/valid/labels.cache... 41 images, 0 backgrounds, 0 corrupt: 100%|██████████| 41/41 [00:00<?, ?it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 3/3 [00:01<00:00,  2.64it/s]
                   all         41        195      0.667      0.849      0.803      0.483
             cigarette         32         33      0.714      0.606      0.706      0.224
                person         41        145      0.948          1      0.991      0.823
                 smoke         15         17       0.34      0.941      0.711      0.402
Speed: 1.9ms preprocess, 1.0ms inference, 0.0ms loss, 8.2ms postprocess per image
Results saved to runs/detect/val20
Ultralytics 8.3.168 🚀 Python-3.11.13 torch-2.6.0+cu124 CUDA:0 (NVIDIA A100-SXM4-40GB, 40507MiB)
val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 1927.6±863.7 MB/s, size: 103.2 KB)
val: Scanning /content/Smoking-CCTV-Detection-4/test/labels.cache... 21 images, 0 backgrounds, 0 corrupt: 100%|██████████| 21/21 [00:00<?, ?it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 2/2 [00:00<00:00,  3.24it/s]
                   all         21        116      0.718      0.782       0.79      0.486
             cigarette         15         15       0.62      0.533      0.589      0.242
                person         21         85      0.966          1      0.995      0.837
                 smoke         13         16      0.568      0.812      0.787      0.378
Speed: 1.8ms preprocess, 1.5ms inference, 0.0ms loss, 1.2ms postprocess per image
Results saved to runs/detect/val21


In [None]:
# Load the last-epoch weights (last.pt) from the training run
model = YOLO("/content/runs/detect/train4/weights/last.pt")

# Validate the model on the validation and test sets
results = model.val(data="/content/Smoking-CCTV-Detection-4/data.yaml", split="val", conf=0.10)
results = model.val(data="/content/Smoking-CCTV-Detection-4/data.yaml", split="test", conf=0.10)

Ultralytics 8.3.168 🚀 Python-3.11.13 torch-2.6.0+cu124 CUDA:0 (NVIDIA A100-SXM4-40GB, 40507MiB)
Model summary (fused): 72 layers, 3,006,233 parameters, 0 gradients, 8.1 GFLOPs
val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 2305.4±633.9 MB/s, size: 109.6 KB)
val: Scanning /content/Smoking-CCTV-Detection-4/valid/labels.cache... 41 images, 0 backgrounds, 0 corrupt: 100%|██████████| 41/41 [00:00<?, ?it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 3/3 [00:01<00:00,  2.67it/s]
                   all         41        195      0.696      0.829      0.771      0.463
             cigarette         32         33      0.705      0.545      0.641      0.206
                person         41        145      0.954          1      0.992      0.833
                 smoke         15         17      0.429      0.941       0.68      0.351
Speed: 2.1ms preprocess, 1.0ms inference, 0.0ms loss, 1.5ms postprocess per image
Results saved to runs/detect/val22
Ultralytics 8.3.168 🚀 Python-3.11.13 torch-2.6.0+cu124 CUDA:0 (NVIDIA A100-SXM4-40GB, 40507MiB)
val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 2611.8±739.4 MB/s, size: 106.7 KB)
val: Scanning /content/Smoking-CCTV-Detection-4/test/labels.cache... 21 images, 0 backgrounds, 0 corrupt: 100%|██████████| 21/21 [00:00<?, ?it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 2/2 [00:00<00:00,  4.61it/s]
                   all         21        116      0.862      0.713      0.749      0.478
             cigarette         15         15          1      0.453      0.575      0.239
                person         21         85      0.979          1      0.995      0.814
                 smoke         13         16      0.609      0.688      0.678      0.382
Speed: 2.3ms preprocess, 1.8ms inference, 0.0ms loss, 1.2ms postprocess per image
Results saved to runs/detect/val23
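
On the CCTV validation split the `cigarette` class shows a wide gap between precision and recall in some runs (for example 0.906 against 0.212 for `train2`). As a rough way to fold both into one number, the sketch below computes the F1 score from the logged values; `f1_score` is a small helper written for this illustration and the figures are copied from the tables above.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) of the cigarette class on the CCTV validation split (33 instances).
cigarette_val = {
    "train": (0.774, 0.364),
    "train2": (0.906, 0.212),
    "train3": (0.714, 0.606),
    "train4": (0.705, 0.545),
}

f1 = {run: f1_score(p, r) for run, (p, r) in cigarette_val.items()}
best_run = max(f1, key=f1.get)

for run, score in f1.items():
    print(f"{run}: F1 = {score:.3f}")
print("best:", best_run)
```

With only 33 cigarette instances in this split, a handful of detections can move these scores noticeably.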


## **Step 7: Export**

In [None]:
from ultralytics import YOLO

model = YOLO("/content/last.pt")
path = model.export(format='onnx', opset=12, simplify=True, dynamic=False, batch=1, imgsz=320)
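
Because the export above sets `dynamic=False`, `batch=1` and `imgsz=320`, the ONNX model is expected to accept a fixed 1×3×320×320 input, so camera frames of other sizes have to be resized by the calling code. One common choice is letterboxing: scale the frame while preserving its aspect ratio, then pad to a square. The sketch below only computes the geometry (no image library involved); the helper names and the 1920×1080 frame size are assumptions made for illustration.

```python
def letterbox_geometry(width, height, target=320):
    """Return the scale, resized size and (left, top, right, bottom) padding for a square target."""
    scale = min(target / width, target / height)
    new_w, new_h = round(width * scale), round(height * scale)
    pad_x, pad_y = target - new_w, target - new_h
    left, top = pad_x // 2, pad_y // 2
    return {
        "scale": scale,
        "resized": (new_w, new_h),
        "padding": (left, top, pad_x - left, pad_y - top),
    }


def to_original(box, geometry):
    """Map an (x1, y1, x2, y2) box from letterboxed coordinates back to the source frame."""
    left, top, _, _ = geometry["padding"]
    scale = geometry["scale"]
    x1, y1, x2, y2 = box
    return ((x1 - left) / scale, (y1 - top) / scale, (x2 - left) / scale, (y2 - top) / scale)


# Assumed Full-HD CCTV frame.
geometry = letterbox_geometry(1920, 1080)
print(geometry["resized"], geometry["padding"])

# A box covering the whole un-padded area maps back to the full frame.
full_frame = to_original((0, 70, 320, 250), geometry)
print(full_frame)
```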

In [None]:
from ultralytics import YOLO

# Load the last-epoch weights (last.pt) from the working directory
model = YOLO("last.pt")

# Validate the model on the validation and test sets
results = model.val(data="/content/Smoking-Person-Detection-1/data.yaml", split="val", conf=0.10)
results = model.val(data="/content/Smoking-Person-Detection-1/data.yaml", split="test", conf=0.10)

In [None]:
from ultralytics import YOLO

# Load the last-epoch weights (last.pt) from the working directory
model = YOLO("last.pt")

# Validate the model on the test set
results = model.val(data="/content/Smoking-CCTV-Detection-4/data.yaml", split="test", conf=0.10)
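
The validation cells in this notebook differ only in the weights file, the dataset YAML and the split, so they could be folded into one loop. The following is a sketch under that assumption: `build_jobs` and `run_jobs` are hypothetical helpers, the paths follow the `/content/...` layout used above, and `ultralytics` is imported lazily so the job list can be built without it.

```python
from itertools import product


def build_jobs(runs, datasets, splits=("val", "test")):
    """List every (weights, data, split) combination to evaluate."""
    return [
        {
            "weights": f"/content/runs/detect/{run}/weights/last.pt",
            "data": f"/content/{dataset}/data.yaml",
            "split": split,
        }
        for run, dataset, split in product(runs, datasets, splits)
    ]


def run_jobs(jobs, conf=0.10):
    """Validate each job and collect the overall mAP50 and mAP50-95."""
    from ultralytics import YOLO  # lazy import: only needed when actually validating

    rows = []
    for job in jobs:
        metrics = YOLO(job["weights"]).val(data=job["data"], split=job["split"], conf=conf)
        rows.append({**job, "mAP50": metrics.box.map50, "mAP50-95": metrics.box.map})
    return rows


jobs = build_jobs(
    runs=["train", "train2", "train3", "train4"],
    datasets=["combined-dataset", "Smoking-CCTV-Detection-4"],
)
print(len(jobs), "evaluation jobs")
```

Calling `run_jobs(jobs)` in the Colab session would then return one row per combination, which can be printed or loaded into a table for side-by-side comparison.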