fix(eval): use MOT distractor classes in GT preprocessing for TrackEval parity #466

Merged
Borda merged 3 commits into roboflow:develop from RubenHaisma:fix/mot-distractor-classes on Jun 22, 2026
Conversation

@RubenHaisma

Contributor

Description

The MOT ground-truth preprocessing in mot.py parsed the class column but never used it. Valid GT and distractor regions were both derived from the confidence flag alone:

valid_mask = frame_data.confidences > 0
...                       # valid GT
~valid_mask               # "distractor" == conf == 0

TrackEval's mot_challenge_2d_box.py keys off the class instead:

  • GT kept for scoring = (conf != 0) AND (class == pedestrian, 1)
  • distractor regions (tracker dets best-matching them are suppressed) = class ∈ {2, 7, 8, 12} (person_on_vehicle, static_person, distractor, reflection)
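
As a rough illustration of the two rules above, here is a minimal NumPy sketch. The function names `valid_gt_mask` and `distractor_mask` and the three-row frame are hypothetical and are not taken from the PR's code; only the class ids and the predicates come from the description.

```python
import numpy as np

# Class ids as listed in the description above.
PEDESTRIAN_CLASS = 1
DISTRACTOR_CLASSES = (2, 7, 8, 12)  # person_on_vehicle, static_person, distractor, reflection


def valid_gt_mask(confidences, classes):
    """Rows kept for scoring: conf != 0 AND class == pedestrian (hypothetical helper)."""
    return (confidences != 0) & (classes == PEDESTRIAN_CLASS)


def distractor_mask(classes):
    """Rows treated as distractor regions, decided by class only (hypothetical helper)."""
    return np.isin(classes, DISTRACTOR_CLASSES)


# Made-up frame: a pedestrian, a distractor-class row with conf == 1,
# and an ignored pedestrian (conf == 0).
confidences = np.array([1.0, 1.0, 0.0])
classes = np.array([1, 8, 1])

old_valid = confidences > 0                        # conf-only rule
new_valid = valid_gt_mask(confidences, classes)    # class-aware rule
new_distractor = distractor_mask(classes)

print(old_valid, new_valid, new_distractor)
```

On this frame the conf-only rule keeps the class-8 row as ground truth, while the class-aware rule drops it from scoring and marks it as a distractor region instead.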

Why it matters

On MOT17/MOT20 ground truth, distractor-class rows carry conf == 1. The old conf-only logic therefore:

  • counted distractor-class boxes as ground truth → inflated FN / IDFN, and
  • let tracker detections overlapping a distractor be scored as false positives (instead of being suppressed),

shifting MOTA / HOTA / IDF1 away from the official TrackEval numbers. The file documents TrackEval parity and _remove_distractor_matches cites the TrackEval MOT17 line range, so this is a parity gap rather than a design choice.

The existing _remove_distractor_matches machinery was already correct — it just needs the distractor mask to be class-based.
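
To make the suppression step concrete, the sketch below shows one way a class-based distractor mask could feed it. This is not the body of _remove_distractor_matches: the helper names, the (x, y, w, h) box layout, the 0.5 IoU threshold, and the per-detection argmax (standing in for whatever assignment the real code uses) are all assumptions made for illustration.

```python
import numpy as np


def iou_matrix(a, b):
    """Pairwise IoU for (x, y, w, h) boxes with positive area; shape (len(a), len(b))."""
    ax2, ay2 = a[:, 0] + a[:, 2], a[:, 1] + a[:, 3]
    bx2, by2 = b[:, 0] + b[:, 2], b[:, 1] + b[:, 3]
    iw = np.minimum(ax2[:, None], bx2[None, :]) - np.maximum(a[:, 0][:, None], b[:, 0][None, :])
    ih = np.minimum(ay2[:, None], by2[None, :]) - np.maximum(a[:, 1][:, None], b[:, 1][None, :])
    inter = np.clip(iw, 0, None) * np.clip(ih, 0, None)
    union = (a[:, 2] * a[:, 3])[:, None] + (b[:, 2] * b[:, 3])[None, :] - inter
    return inter / union


def keep_tracker_mask(gt_boxes, gt_is_distractor, tracker_boxes, threshold=0.5):
    """True for tracker detections that survive suppression (hypothetical helper)."""
    if len(gt_boxes) == 0 or len(tracker_boxes) == 0:
        return np.ones(len(tracker_boxes), dtype=bool)
    iou = iou_matrix(tracker_boxes, gt_boxes)
    best_gt = iou.argmax(axis=1)   # simplification: best overlap per detection
    best_iou = iou.max(axis=1)
    suppressed = (best_iou >= threshold) & gt_is_distractor[best_gt]
    return ~suppressed


# Made-up frame: GT row 0 is a pedestrian, GT row 1 is a distractor-class box.
gt_boxes = np.array([[0.0, 0.0, 10.0, 10.0], [100.0, 100.0, 10.0, 10.0]])
gt_is_distractor = np.array([False, True])
tracker_ids = np.array([10, 20, 30])
tracker_boxes = np.array(
    [[0.0, 0.0, 10.0, 10.0], [100.0, 100.0, 10.0, 10.0], [200.0, 200.0, 10.0, 10.0]]
)

keep = keep_tracker_mask(gt_boxes, gt_is_distractor, tracker_boxes)
surviving_ids = tracker_ids[keep]
print(surviving_ids)
```

Only the detection sitting on the distractor box is dropped; the unmatched detection stays and would still count as a false positive.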

Scope / safety

This is invisible to the current integration tests, which cover only SportsMOT and DanceTrack (single pedestrian class, conf == 1). For that data the new and old masks are identical, so those results are unchanged. num_tracker_ids is intentionally left as-is (built before suppression); a never-matched id is metric-neutral.
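
The equivalence claim for single-class data can be checked with a small sketch. The five-row frame is made up, and the mask expressions restate the rules from the description rather than quoting the PR's code.

```python
import numpy as np

# Made-up SportsMOT/DanceTrack-style frame: every row is a pedestrian with conf == 1.
confidences = np.ones(5)
classes = np.ones(5, dtype=int)

# Old conf-only masks.
old_valid = confidences > 0
old_distractor = ~old_valid

# New class-aware masks (pedestrian == 1, distractors == {2, 7, 8, 12}).
new_valid = (confidences != 0) & (classes == 1)
new_distractor = np.isin(classes, (2, 7, 8, 12))

masks_agree = np.array_equal(old_valid, new_valid) and np.array_equal(
    old_distractor, new_distractor
)
print(masks_agree)
```

Both rules keep every row and flag no distractors here, which is why the existing integration results do not move.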

Tests

tests/io/test_mot.py (hermetic, no downloads):

  • test_distractor_class_excluded_and_matching_tracker_removed — a distractor-class GT (conf==1) and a conf==0 pedestrian are excluded from GT, and a tracker det overlapping the distractor is suppressed. Fails on the pre-fix code (num_gt_dets == 2), passes after.
  • test_single_class_sequence_unaffected — SportsMOT/DanceTrack-style data passes through unchanged.

…al parity

The MOT ground-truth preprocessing parsed the class column but never used
it: valid GT was `conf > 0` and distractors were `conf <= 0`. TrackEval's
mot_challenge_2d_box.py instead keys off the class:

  - ground truth kept for scoring = (conf != 0) AND (class == pedestrian, 1)
  - distractor regions (tracker dets matched to them are suppressed) =
    class in {2, 7, 8, 12} (person_on_vehicle, static_person, distractor,
    reflection)

On MOT17/MOT20 GT, distractor-class rows carry conf == 1, so the old
conf-only logic counted them as ground truth (inflating FN/IDFN) and let
tracker detections overlapping them be scored as false positives --
shifting MOTA / HOTA / IDF1 away from the official TrackEval numbers.

This gap is invisible to the existing integration tests, which only cover
SportsMOT and DanceTrack (single pedestrian class, conf == 1); for that
data the new and old logic are identical, so those results are unchanged.

Adds a hermetic unit test for the distractor + ignored-pedestrian case
and a single-class regression test.

Copilot AI left a comment

Contributor

Pull request overview

This PR updates MOT-format ground-truth preprocessing to match TrackEval’s MOTChallenge behavior by using the GT class column to decide which GT rows are scored and which regions act as distractors for suppression, improving metric parity (e.g., MOTA/HOTA/IDF1) on MOT-style datasets.

Changes:

  • Add class-based ground-truth masks to keep only pedestrian-class, non-ignored GT rows for scoring.
  • Change distractor handling to use MOT distractor classes for suppressing tracker detections matched to distractor regions.
  • Add unit tests to cover distractor-class GT with conf==1 and ensure single-class sequences remain unchanged.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| src/trackers/io/mot.py | Introduces helper masks for valid GT vs distractor GT and wires them into sequence preparation for TrackEval parity. |
| tests/io/test_mot.py | Adds hermetic tests validating distractor-class exclusion/suppression and unchanged behavior for single-class data. |

Comment thread src/trackers/io/mot.py Outdated
Comment thread src/trackers/io/mot.py Outdated
Comment thread tests/io/test_mot.py Outdated
Borda and others added 2 commits June 22, 2026 15:52
…nnotations

Address 8 review findings (items 1–8) from PR roboflow#466 /oss:resolve pass:

- fix: conf predicate `> 0` → `!= 0` in _valid_ground_truth_mask for literal
  TrackEval parity (negative-conf rows now kept, matching `gt_to_keep_mask`)
- fix: replace misleading MOT20 class-6 comment with explicit TODO so readers
  know class-6 support is unimplemented, not accidentally omitted
- types: add `from numpy.typing import NDArray`; replace all bare `np.ndarray`
  annotations with typed NDArray[np.bool_/np.float64/np.intp] across
  _MOTFrameData, _MOTSequenceData, and all private helper functions
- test: assert surviving tracker IDs (10, 30) not just count, proving id20 was
  the specific detection suppressed by _remove_distractor_matches
- test: parametrize distractor-class test across all four values {2, 7, 8, 12}
  so a mistyped constant cannot silently pass the suite
- test: cover conf==0 non-distractor behavioral change (class=5 row must NOT
  suppress overlapping tracker det under new class-based mask)
- test: add all-distractor frame, missing GT frame, and multi-frame accumulation
  edge cases (+8 test cases, 10 total passing)
- style: rename TestMOTDistractorPreprocessing → TestMotDistractorPreprocessing
  to match repo convention (Mot not MOT)

Challenge log: all 8 items evidence=VALID suggestion=VALID resolution=as-suggested
(items 4–8 pre-validated by /oss:review adversarial pass; items 1–3 confirmed
by Copilot inline review and /oss:review [HIGH-2, MEDIUM-4])

[resolve roboflow#1,roboflow#2,roboflow#3,roboflow#4,roboflow#5,roboflow#6,roboflow#7,roboflow#8] PR roboflow#466 — @Copilot + /review findings

---
Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
Co-authored-by: OpenAI Codex <codex@openai.com>
@Borda Borda merged commit 96938d1 into roboflow:develop Jun 22, 2026
18 checks passed
Borda added a commit that referenced this pull request Jun 23, 2026
- Stamp CHANGELOG [Unreleased] → [2.5.0] — 2026-06-22; add missing entries: CBIoUTracker (#417), py.typed, Tuner params (#427), HOTA eval fixes (#462, #466)
- Bump version 2.4.0 → 2.5.0 in pyproject.toml
- Add C-BIoU row to README algorithms table and intro text (MOT17=63.0, SportsMOT=73.1, SoccerNet=82.6, DanceTrack=56.7)

---

Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>

3 participants