---
author: Zeel B Patel
badges: true
categories:
  - ML
  - CV
date: "2025-02-10"
description: Compare your performance with a random baseline.
title: Object Detection Random Baseline
toc: true
---


# Why Random Baseline?
On a standard dataset with established benchmark models, comparing models is straightforward. But when a model scores well below the best models, the more basic question is whether it is better than random predictions at all. A random baseline answers that question.

# Proposed Idea
* To formalize the problem, suppose that for an arbitrary image the model predicts $k$ bounding boxes with sizes $(h_1, w_1), (h_2, w_2), \ldots, (h_k, w_k)$.
* A simple random baseline is to place $k$ bounding boxes at random locations in that image, with the same sizes $(h_1, w_1), (h_2, w_2), \ldots, (h_k, w_k)$.
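
The idea above can be sketched with NumPy alone. This is a minimal illustration, not the code used later in this post: the function name `random_boxes`, the `(H, W)` image size argument, and the example sizes are all assumptions made for the sketch, and each box is assumed to fit inside the image.

```python
import numpy as np

def random_boxes(sizes, image_hw, rng):
    """Place one box per (h, w) size uniformly at random inside the image.

    sizes: array-like of shape (k, 2) holding (h_i, w_i) for each predicted box.
    image_hw: (H, W) of the image; every box is assumed to fit inside it.
    Returns an array of shape (k, 4) in (x1, y1, x2, y2) format.
    """
    sizes = np.asarray(sizes, dtype=float).reshape(-1, 2)
    H, W = image_hw
    h, w = sizes[:, 0], sizes[:, 1]
    # The top-left corner is uniform over all positions that keep the box inside.
    x1 = rng.random(len(sizes)) * (W - w)
    y1 = rng.random(len(sizes)) * (H - h)
    return np.stack([x1, y1, x1 + w, y1 + h], axis=1)

rng = np.random.default_rng(0)
sizes = [(50, 30), (120, 80), (640, 640)]  # hypothetical (h, w) pairs
boxes = random_boxes(sizes, image_hw=(640, 640), rng=rng)
print(boxes.shape)
```

Only the locations are random here; the number of boxes and their sizes are taken from the model, so the baseline isolates how much of the score comes from localization.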

# Imports


In [59]:
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '1'

import numpy as np
from tqdm.notebook import tqdm
import supervision as sv
from roboflow import Roboflow
from dotenv import load_dotenv
from ultralytics import YOLO
from copy import deepcopy
from PIL import Image

load_dotenv()

True

# Dataset

In [2]:
data_location = "/tmp/poker-cards-fmjio"

rf = Roboflow(api_key=os.getenv("ROBOFLOW_API_KEY"))
project = rf.workspace("roboflow-jvuqo").project("poker-cards-fmjio")
version = project.version(4)
dataset = version.download("yolov8", location=data_location)

loading Roboflow workspace...
loading Roboflow project...


# Train Model

In [3]:
model = YOLO("yolo11m")

model.train(data=f"{data_location}/data.yaml", epochs=1, project="/tmp/poker-cards-fmjio", exist_ok=True)

Ultralytics 8.3.55 🚀 Python-3.10.15 torch-2.5.0+cu124 CUDA:0 (NVIDIA A100-SXM4-80GB, 81156MiB)
[34m[1mengine/trainer: [0mtask=detect, mode=train, model=yolo11m.pt, data=/tmp/poker-cards-fmjio/data.yaml, epochs=1, time=None, patience=100, batch=16, imgsz=640, save=True, save_period=-1, cache=False, device=None, workers=8, project=/tmp/poker-cards-fmjio, name=train, exist_ok=True, pretrained=True, optimizer=auto, verbose=True, seed=0, deterministic=True, single_cls=False, rect=False, cos_lr=False, close_mosaic=10, resume=False, amp=True, fraction=1.0, profile=False, freeze=None, multi_scale=False, overlap_mask=True, mask_ratio=4, dropout=0.0, val=True, split=val, save_json=False, save_hybrid=False, conf=None, iou=0.7, max_det=300, half=False, dnn=False, plots=True, source=None, vid_stride=1, stream_buffer=False, visualize=False, augment=False, agnostic_nms=False, classes=None, retina_m

[34m[1mtrain: [0mScanning /tmp/poker-cards-fmjio/train/labels.cache... 811 images, 0 backgrounds, 0 corrupt: 100%|██████████| 811/811 [00:00<?, ?it/s]
[34m[1mval: [0mScanning /tmp/poker-cards-fmjio/valid/labels.cache... 44 images, 0 backgrounds, 0 corrupt: 100%|██████████| 44/44 [00:00<?, ?it/s]


Plotting labels to /tmp/poker-cards-fmjio/train/labels.jpg... 
[34m[1moptimizer:[0m 'optimizer=auto' found, ignoring 'lr0=0.01' and 'momentum=0.937' and determining best 'optimizer', 'lr0' and 'momentum' automatically... 
[34m[1moptimizer:[0m AdamW(lr=0.000179, momentum=0.9) with parameter groups 106 weight(decay=0.0), 113 weight(decay=0.0005), 112 bias(decay=0.0)
Image sizes 640 train, 640 val
Using 8 dataloader workers
Logging results to [1m/tmp/poker-cards-fmjio/train[0m
Starting training for 1 epochs...

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


        1/1      8.56G     0.8313      3.612      1.244         66        640: 100%|██████████| 51/51 [00:08<00:00,  5.69it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 2/2 [00:01<00:00,  1.56it/s]


                   all         44        197      0.289      0.514      0.256      0.228

1 epochs completed in 0.004 hours.
Optimizer stripped from /tmp/poker-cards-fmjio/train/weights/last.pt, 40.6MB
Optimizer stripped from /tmp/poker-cards-fmjio/train/weights/best.pt, 40.6MB

Validating /tmp/poker-cards-fmjio/train/weights/best.pt...
Ultralytics 8.3.55 🚀 Python-3.10.15 torch-2.5.0+cu124 CUDA:0 (NVIDIA A100-SXM4-80GB, 81156MiB)
YOLO11m summary (fused): 303 layers, 20,070,124 parameters, 0 gradients, 67.9 GFLOPs


                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 2/2 [00:00<00:00,  6.48it/s]


                   all         44        197       0.29      0.516      0.255      0.227
           10 of clubs          3          3     0.0908          1      0.583      0.541
        10 of diamonds          7          7     0.0933          1      0.613      0.558
          10 of hearts          7          7      0.364          1      0.738      0.629
          10 of spades          4          4          1          0       0.19      0.147
            2 of clubs          2          2          1          0          0          0
         2 of diamonds          2          2          1          0          0          0
           2 of hearts          1          1          0          0     0.0191     0.0153
           2 of spades          4          4      0.321      0.477      0.314      0.272
            3 of clubs          2          2          0          0     0.0421     0.0337
         3 of diamonds          2          2     0.0793      0.238      0.101     0.0993
           3 of heart


# Evaluate Model

In [4]:
test_dataset = sv.DetectionDataset.from_yolo(f"{data_location}/test/images", f"{data_location}/test/labels", f"{data_location}/data.yaml")
len(test_dataset)

44

In [51]:
annotations_list = []
detections_list = []
for _, img, annotations in tqdm(test_dataset):
    results = model.predict(img, verbose=False)[0]
    detections = sv.Detections.from_ultralytics(results)
    annotations_list.append(annotations)
    detections_list.append(detections)


In [53]:
mAP = sv.metrics.MeanAveragePrecision().update(detections_list, annotations_list).compute()
mAP.map50

0.1417744455736285
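
For context, mAP50 treats a detection as correct only when it overlaps a ground-truth box of the same class with an intersection-over-union (IoU) of at least 0.5. The sketch below is a plain-Python illustration of IoU, not the `supervision` implementation; the helper name `box_iou` and the example boxes are made up for this example.

```python
def box_iou(a, b):
    """IoU of two boxes given in (x1, y1, x2, y2) format."""
    # Corners of the intersection rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

gt = (100, 100, 200, 200)       # hypothetical ground-truth box
shifted = (150, 100, 250, 200)  # same size, moved 50 px to the right

print(f"{box_iou(gt, gt):.2f}")       # 1.00
print(f"{box_iou(gt, shifted):.2f}")  # 0.33
```

A box of the right size that is off by half its width already falls below the 0.5 threshold, which is why randomly placed boxes are expected to score poorly.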

# Random Baseline

Under this assumption, we only need to move each predicted bounding box to a random location while keeping its size fixed, subject to one constraint:
* The bounding box must stay within the image boundaries.

In [73]:
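min_size = 0  # unused; the lower image bound is taken as 0 below
max_size = model.args['imgsz']  # assumes every image is imgsz x imgsz

mAPs = []
for random_seed in tqdm(range(100)):
    np.random.seed(random_seed)
    random_detections_list = []
    for detections in detections_list:
        random_detections = deepcopy(detections)
        # One uniform sample per box. The same offset is added to all four
        # coordinates, so each box moves along the diagonal and keeps its size.
        shift = np.random.rand(len(detections))
        # Offset range that keeps every coordinate inside (0, max_size)
        lower_limit = - detections.xyxy.min(axis=1) + 1e-6
        upper_limit = max_size - detections.xyxy.max(axis=1) - 1e-6
        transformed_shift = lower_limit + shift * (upper_limit - lower_limit)
        random_detections.xyxy = random_detections.xyxy + transformed_shift.reshape(-1, 1)
        random_detections_list.append(random_detections)
        
    mAP = sv.metrics.MeanAveragePrecision().update(random_detections_list, annotations_list).compute()
    mAPs.append(mAP.map50)
    
# Note: mAP.map50 is the value from the last seed; np.mean(mAPs) gives the mean over all seeds.
print(f"mAP50: {mAP.map50:.2f} +/- {np.std(mAPs):.2f}")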


mAP50: 0.04 +/- 0.01
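
The cell above adds a single offset per box to all four coordinates and bounds it by `imgsz` on both axes. A possible variant, sketched below under stated assumptions, draws independent horizontal and vertical offsets and uses each image's own width and height. The function name `shift_boxes_within_image` and the example boxes are hypothetical; in the notebook the image size could presumably be read from `img.shape` inside the evaluation loop. This variant was not used to produce the numbers in this post.

```python
import numpy as np

def shift_boxes_within_image(xyxy, image_wh, rng):
    """Move each (x1, y1, x2, y2) box to a random location inside the image.

    Box sizes are preserved; dx and dy are drawn independently per box.
    Boxes are assumed to lie inside the image to begin with.
    """
    xyxy = np.asarray(xyxy, dtype=float).reshape(-1, 4)
    W, H = image_wh
    n = len(xyxy)
    # Valid horizontal offsets lie in [-x1, W - x2]; vertical in [-y1, H - y2].
    dx_low, dx_high = -xyxy[:, 0], W - xyxy[:, 2]
    dy_low, dy_high = -xyxy[:, 1], H - xyxy[:, 3]
    dx = dx_low + rng.random(n) * (dx_high - dx_low)
    dy = dy_low + rng.random(n) * (dy_high - dy_low)
    return xyxy + np.stack([dx, dy, dx, dy], axis=1)

rng = np.random.default_rng(0)
xyxy = np.array([[10.0, 20.0, 110.0, 70.0],
                 [300.0, 400.0, 640.0, 480.0]])  # hypothetical detections
moved = shift_boxes_within_image(xyxy, image_wh=(640, 480), rng=rng)
print(moved.round(1))
```

With independent offsets a box can land anywhere in the image rather than only along its diagonal, which is closer to the "random location, fixed size" baseline described earlier.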


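We can also replace the confidence scores with uniform random values, so that the ranking of detections used in the average precision computation is random as well.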

In [74]:
min_size = 0  # unused; the lower image bound is taken as 0 below
max_size = model.args['imgsz']  # assumes every image is imgsz x imgsz

mAPs = []
for random_seed in tqdm(range(100)):
    np.random.seed(random_seed)
    random_detections_list = []
    for detections in detections_list:
        random_detections = deepcopy(detections)
        # Same diagonal shift as before: one offset per box, box size unchanged
        shift = np.random.rand(len(detections))
        lower_limit = - detections.xyxy.min(axis=1) + 1e-6
        upper_limit = max_size - detections.xyxy.max(axis=1) - 1e-6
        transformed_shift = lower_limit + shift * (upper_limit - lower_limit)
        random_detections.xyxy = random_detections.xyxy + transformed_shift.reshape(-1, 1)
        # Replace the model's confidence scores with uniform random values in [0, 1)
        random_detections.confidence = np.random.rand(len(detections))
        random_detections_list.append(random_detections)
        
    mAP = sv.metrics.MeanAveragePrecision().update(random_detections_list, annotations_list).compute()
    mAPs.append(mAP.map50)
    
# Note: mAP.map50 is the value from the last seed; np.mean(mAPs) gives the mean over all seeds.
print(f"mAP50: {mAP.map50:.2f} +/- {np.std(mAPs):.2f}")


mAP50: 0.03 +/- 0.01
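
To decide whether the model beats the random baseline, one option is to compare its score against the whole distribution of baseline scores rather than a single run. The sketch below is one way to do that, not part of the original analysis: `compare_to_baseline` is a hypothetical helper, and the baseline scores are synthetic values generated purely for illustration. In the notebook, the list `mAPs` would take their place.

```python
import numpy as np

def compare_to_baseline(model_score, baseline_scores):
    """Summarize baseline scores and compute an empirical p-value.

    The p-value is the fraction of random runs that match or beat the model,
    with add-one smoothing so it is never exactly zero.
    """
    baseline_scores = np.asarray(baseline_scores, dtype=float)
    mean = baseline_scores.mean()
    std = baseline_scores.std()
    n_better = np.sum(baseline_scores >= model_score)
    p_value = (1 + n_better) / (1 + len(baseline_scores))
    return mean, std, p_value

# Synthetic stand-in for the 100 baseline mAP50 values (illustration only).
rng = np.random.default_rng(0)
synthetic_baseline = rng.normal(loc=0.04, scale=0.01, size=100)

mean, std, p_value = compare_to_baseline(0.14, synthetic_baseline)
print(f"baseline mAP50: {mean:.2f} +/- {std:.2f}, empirical p-value: {p_value:.3f}")
```

With 100 seeds the smallest attainable p-value under this scheme is 1/101, so more seeds would be needed to make a stronger claim.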
