Correct confusion matrix calculation-function evaluate_detection_batch #1853


Open · wants to merge 6 commits into base: develop

Conversation


@panagiotamoraiti commented May 27, 2025

Description

This fixes the issue where predicted bounding boxes were matched to ground truth boxes solely based on IoU, without considering class agreement during the matching process. Currently, if a predicted box has a higher IoU but the wrong class, it gets matched first, and the correct prediction with the right class but lower IoU is discarded. This leads to miscounting true positives and false positives, resulting in an inaccurate confusion matrix.

The change modifies the matching logic (method evaluate_detection_batch) to consider IoU and class agreement simultaneously, ensuring that only predictions meeting both the IoU threshold and the class requirement are matched to ground truths. This results in a correct confusion matrix.
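
As a rough illustration of the idea (a standalone sketch, not the code in this PR; the function name and inputs are made up for the example), a greedy matcher that requires both IoU and class agreement could look like this:

import numpy as np

def match_class_aware(ious, gt_classes, det_classes, iou_threshold=0.5):
    # ious: (num_gt, num_det) IoU matrix; gt_classes: (num_gt,); det_classes: (num_det,)
    # Returns a list of (gt_idx, det_idx) pairs.
    matches = []
    used_dets = set()
    for gt_idx, gt_class in enumerate(gt_classes):
        # Candidates must clear the IoU threshold AND predict the same class
        candidates = np.where(
            (ious[gt_idx] > iou_threshold) & (det_classes == gt_class)
        )[0]
        candidates = [d for d in candidates if d not in used_dets]
        if not candidates:
            continue  # this ground truth stays unmatched (FN)
        best = max(candidates, key=lambda d: ious[gt_idx, d])
        matches.append((gt_idx, int(best)))
        used_dets.add(int(best))
    return matches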

Type of change

  • Bug fix (non-breaking change which fixes an issue)

How has this change been tested? Please provide a test case or example of how you tested the change.

I had an image with 2 TP and 1 FP detections, but the confusion matrix reported 1 TP, 2 FP, and 1 FN. The FP bbox with the wrong class had a higher overlap, so the TP was discarded; in the end that bbox was also discarded because of its wrong class id. With this fix, the confusion matrix correctly reports 2 TP and 1 FP.

I also ran this on a large dataset. Another script that I developed and used extensively in a previous project gives the following results, which now match the confusion matrix; before the fix they did not match.

Test Set:
Ground Truth Objects: 481
True Positives: 469
False Positives: 11
False Negatives: 12

Validation Set:
Ground Truth Objects: 1073
True Positives: 1037
False Positives: 23
False Negatives: 36

Train Set:
Ground Truth Objects: 3716
True Positives: 3674
False Positives: 52
False Negatives: 42

@CLAassistant commented May 27, 2025

CLA assistant check
All committers have signed the CLA.

@panagiotamoraiti (Author) commented Jun 6, 2025

To evaluate the confusion matrix, you can use the following code:

import numpy as np
import supervision as sv

# Define class names
class_names = ['cat', 'dog', 'rabbit']

# Ground truth detections (5 objects: one cat, one dog, three rabbits)
gt = sv.Detections(
    xyxy=np.array([
        [0, 0, 2, 2],   # cat
        [3, 3, 5, 5],   # dog
        [6, 6, 8, 8],   # rabbit
        [6, 15, 9, 16], # rabbit
        [2, 2, 3, 3],   # rabbit
    ]),
    class_id=np.array([0, 1, 2, 2, 2])
)

# Predicted detections (6 predictions)
preds = sv.Detections(
    xyxy=np.array([
        [0, 0, 2, 2],
        [3, 3, 5, 5], 
        [6, 6, 8, 8], 
        [9, 9, 11, 11], # FP 
        [10, 10, 12, 12], # FP
        [2, 2, 3, 3],  # rabbit GT box, predicted as dog (misclassified)
    ]),
    class_id=np.array([0, 1, 2, 0, 1, 1]),  # note: the last rabbit GT is predicted as dog (misclassified)
    confidence=np.array([0.9, 0.7, 0.8, 0.6, 0.7, 0.7])
)

# Generate confusion matrix
cm = sv.ConfusionMatrix.from_detections(
    predictions=[preds],
    targets=[gt],
    classes=class_names,
    conf_threshold=0.5,
    iou_threshold=0.5
)

print("Confusion Matrix:\n", cm.matrix)

I've confirmed that it works with many examples.
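
For reference, supervision's matrix layout puts ground-truth classes on the rows, predicted classes on the columns, and uses the extra last column/row for unmatched ground truths (FN) and unmatched predictions (FP). Assuming that layout, and continuing from the snippet above, a rough way to recover aggregate counts like the ones reported earlier is shown below (whether wrong-class matches should count as FP depends on the convention of the comparison script):

m = cm.matrix                    # shape: (num_classes + 1, num_classes + 1)
class_block = m[:-1, :-1]        # true class (rows) vs. predicted class (cols)

tp = int(np.trace(class_block))              # matched with the correct class
wrong_class = int(class_block.sum() - tp)    # matched, but with the wrong class
fp = int(m[-1, :-1].sum()) + wrong_class     # unmatched predictions + wrong-class matches
fn = int(m[:-1, -1].sum())                   # ground truths with no match

print(f"TP={tp}  FP={fp}  FN={fn}")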

@soumik12345 (Contributor) left a comment

Hi @panagiotamoraiti, thanks for the PR!
I have a few comments regarding your logic; it would be great if you could address them and add some unit tests.

        # For each GT, find best matching detection (highest IoU > threshold)
        for gt_idx, gt_class in enumerate(true_classes):
            candidate_det_idxs = np.where(iou_batch[gt_idx] > iou_threshold)[0]

The selection of the best match is happening based solely on IoU, which means a wrong-class prediction can still be chosen over a right-class one if it has a higher IoU.
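
One possible way to address this, sketched with the same variable names as the quoted snippet (not a drop-in patch): prefer same-class candidates when picking the best match, so that a wrong-class detection with a higher IoU cannot shadow a correct-class one.

# Prefer candidates whose predicted class matches the ground truth class;
# fall back to the remaining candidates only so misclassifications can still be counted.
same_class_idxs = candidate_det_idxs[
    detection_classes[candidate_det_idxs] == gt_class
]
pool = same_class_idxs if len(same_class_idxs) > 0 else candidate_det_idxs
best_det_idx = pool[np.argmax(iou_batch[gt_idx, pool])]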

Comment on lines +306 to +328
+        # For each GT, find best matching detection (highest IoU > threshold)
+        for gt_idx, gt_class in enumerate(true_classes):
+            candidate_det_idxs = np.where(iou_batch[gt_idx] > iou_threshold)[0]
+
-        for i, true_class_value in enumerate(true_classes):
-            j = matched_true_idx == i
-            if matches.shape[0] > 0 and sum(j) == 1:
-                result_matrix[
-                    true_class_value, detection_classes[matched_detection_idx[j]]
-                ] += 1  # TP
+            if len(candidate_det_idxs) == 0:
+                # No matching detection → FN for this GT
+                result_matrix[gt_class, num_classes] += 1
+                continue
+
+            best_det_idx = candidate_det_idxs[
+                np.argmax(iou_batch[gt_idx, candidate_det_idxs])
+            ]
+            det_class = detection_classes[best_det_idx]
+
+            if best_det_idx not in matched_det_idx:
+                # Count as matched regardless of class:
+                # same class → TP, different class → misclassification
+                result_matrix[gt_class, det_class] += 1
+                matched_gt_idx.add(gt_idx)
+                matched_det_idx.add(best_det_idx)
+            else:
-                result_matrix[true_class_value, num_classes] += 1  # FN
+                # Detection already matched, GT is FN
+                result_matrix[gt_class, num_classes] += 1

It seems that this logic iterates through the ground truth boxes and, for each one, finds the best-matching detection box, i.e., the one with the highest IoU above the threshold that hasn't been matched yet.

The issue with this logic is that the matching process depends on the order of the ground truth boxes in true_classes. So, if a single detection box has a high IoU with multiple ground truth boxes, it will be matched with the first ground truth box that is processed, which can lead to inconsistent and incorrect confusion matrices, as the result will vary depending on the order of the ground truths in the input data.
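
For what it's worth, a common order-independent alternative (sketched with the variable names from the snippet above; not a ready-made patch) is to rank all candidate (ground truth, detection) pairs globally by IoU and match them greedily:

# Collect all pairs above the IoU threshold and process them in descending
# IoU order, so the outcome does not depend on the order of the inputs.
gt_idxs, det_idxs = np.where(iou_batch > iou_threshold)
order = np.argsort(-iou_batch[gt_idxs, det_idxs])

matched_gt_idx, matched_det_idx = set(), set()
for gt_idx, det_idx in zip(gt_idxs[order], det_idxs[order]):
    if gt_idx in matched_gt_idx or det_idx in matched_det_idx:
        continue
    matched_gt_idx.add(gt_idx)
    matched_det_idx.add(det_idx)
    # same class → TP cell, different class → misclassification cell
    result_matrix[true_classes[gt_idx], detection_classes[det_idx]] += 1

# Anything left unmatched becomes FN (ground truths) or FP (detections).
for gt_idx, gt_class in enumerate(true_classes):
    if gt_idx not in matched_gt_idx:
        result_matrix[gt_class, num_classes] += 1
for det_idx, det_class in enumerate(detection_classes):
    if det_idx not in matched_det_idx:
        result_matrix[num_classes, det_class] += 1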
