fix(eval): use MOT distractor classes in GT preprocessing for TrackEval parity #466
Merged
…al parity
The MOT ground-truth preprocessing parsed the class column but never used
it: valid GT was `conf > 0` and distractors were `conf <= 0`. TrackEval's
mot_challenge_2d_box.py instead keys off the class:
- ground truth kept for scoring = (conf != 0) AND (class == pedestrian, 1)
- distractor regions (tracker dets matched to them are suppressed) =
class in {2, 7, 8, 12} (person_on_vehicle, static_person, distractor,
reflection)
On MOT17/MOT20 GT, distractor-class rows carry conf == 1, so the old
conf-only logic counted them as ground truth (inflating FN/IDFN) and let
tracker detections overlapping them be scored as false positives --
shifting MOTA / HOTA / IDF1 away from the official TrackEval numbers.
This gap is invisible to the existing integration tests, which only cover
SportsMOT and DanceTrack (single pedestrian class, conf == 1); for that
data the new and old logic are identical, so those results are unchanged.
Adds a hermetic unit test for the distractor + ignored-pedestrian case
and a single-class regression test.
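The two predicates described in this commit message can be compared side by side. This is a minimal sketch in plain Python so that it runs without NumPy; the row layout and the names `GtRow`, `old_masks`, and `new_masks` are hypothetical and do not mirror `src/trackers/io/mot.py`. Only the predicates and the class IDs are taken from the message above.

```python
from typing import NamedTuple

PEDESTRIAN_CLASS = 1
# person_on_vehicle, static_person, distractor, reflection
DISTRACTOR_CLASSES = frozenset({2, 7, 8, 12})


class GtRow(NamedTuple):
    """Hypothetical, reduced view of one MOT ground-truth row."""
    track_id: int
    conf: float
    cls: int


def old_masks(rows):
    """Pre-fix behaviour: both masks derived from conf alone."""
    valid = [r.conf > 0 for r in rows]
    distractor = [r.conf <= 0 for r in rows]
    return valid, distractor


def new_masks(rows):
    """Class-based behaviour described for TrackEval parity."""
    valid = [r.conf != 0 and r.cls == PEDESTRIAN_CLASS for r in rows]
    distractor = [r.cls in DISTRACTOR_CLASSES for r in rows]
    return valid, distractor


rows = [
    GtRow(track_id=1, conf=1, cls=1),  # scored pedestrian
    GtRow(track_id=2, conf=1, cls=8),  # distractor class, conf == 1
    GtRow(track_id=3, conf=0, cls=1),  # ignored pedestrian
]

print(old_masks(rows))  # ([True, True, False], [False, False, True])
print(new_masks(rows))  # ([True, False, False], [False, True, False])
```

With the old masks the distractor-class row counts as ground truth and the ignored pedestrian acts as a distractor region; with the new masks only the first row is scored and only the class-8 row suppresses tracker detections.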
Pull request overview
This PR updates MOT-format ground-truth preprocessing to match TrackEval’s MOTChallenge behavior by using the GT class column to decide which GT rows are scored and which regions act as distractors for suppression, improving metric parity (e.g., MOTA/HOTA/IDF1) on MOT-style datasets.
Changes:
- Add class-based ground-truth masks to keep only pedestrian-class, non-ignored GT rows for scoring.
- Change distractor handling to use MOT distractor classes for suppressing tracker detections matched to distractor regions.
- Add unit tests to cover distractor-class GT with `conf==1` and ensure single-class sequences remain unchanged.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| `src/trackers/io/mot.py` | Introduces helper masks for valid GT vs distractor GT and wires them into sequence preparation for TrackEval parity. |
| `tests/io/test_mot.py` | Adds hermetic tests validating distractor-class exclusion/suppression and unchanged behavior for single-class data. |
…nnotations
Address 8 review findings (items 1–8) from PR roboflow#466 /oss:resolve pass:
- fix: conf predicate `> 0` → `!= 0` in _valid_ground_truth_mask for literal TrackEval parity (negative-conf rows now kept, matching `gt_to_keep_mask`)
- fix: replace misleading MOT20 class-6 comment with explicit TODO so readers know class-6 support is unimplemented, not accidentally omitted
- types: add `from numpy.typing import NDArray`; replace all bare `np.ndarray` annotations with typed NDArray[np.bool_/np.float64/np.intp] across _MOTFrameData, _MOTSequenceData, and all private helper functions
- test: assert surviving tracker IDs (10, 30) not just count, proving id20 was the specific detection suppressed by _remove_distractor_matches
- test: parametrize distractor-class test across all four values {2, 7, 8, 12} so a mistyped constant cannot silently pass the suite
- test: cover conf==0 non-distractor behavioral change (class=5 row must NOT suppress overlapping tracker det under new class-based mask)
- test: add all-distractor frame, missing GT frame, and multi-frame accumulation edge cases (+8 test cases, 10 total passing)
- style: rename TestMOTDistractorPreprocessing → TestMotDistractorPreprocessing to match repo convention (Mot not MOT)
Challenge log: all 8 items evidence=VALID suggestion=VALID resolution=as-suggested (items 4–8 pre-validated by /oss:review adversarial pass; items 1–3 confirmed by Copilot inline review and /oss:review [HIGH-2, MEDIUM-4])
[resolve roboflow#1,roboflow#2,roboflow#3,roboflow#4,roboflow#5,roboflow#6,roboflow#7,roboflow#8] PR roboflow#466 - @Copilot + /review findings
---
Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
Co-authored-by: OpenAI Codex <codex@openai.com>
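The conf predicate fix listed above only changes the outcome for rows with a negative confidence value. A minimal sketch of the difference, using hypothetical function names rather than the actual `_valid_ground_truth_mask` implementation:

```python
PEDESTRIAN_CLASS = 1


def keeps_row_strict(conf: float, cls: int) -> bool:
    """Predicate before the review fix: conf > 0."""
    return conf > 0 and cls == PEDESTRIAN_CLASS


def keeps_row_parity(conf: float, cls: int) -> bool:
    """Predicate after the review fix: conf != 0."""
    return conf != 0 and cls == PEDESTRIAN_CLASS


# Columns printed: conf, strict result, parity result.
for conf in (1.0, 0.0, -1.0):
    print(conf, keeps_row_strict(conf, 1), keeps_row_parity(conf, 1))
```

The two predicates agree for `conf == 1` and `conf == 0`; only the `-1.0` row is dropped by the strict version and kept by the parity version.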
Borda approved these changes Jun 22, 2026
Borda added a commit that referenced this pull request Jun 23, 2026
- Stamp CHANGELOG [Unreleased] → [2.5.0] - 2026-06-22; add missing entries: CBIoUTracker (#417), py.typed, Tuner params (#427), HOTA eval fixes (#462, #466)
- Bump version 2.4.0 → 2.5.0 in pyproject.toml
- Add C-BIoU row to README algorithms table and intro text (MOT17=63.0, SportsMOT=73.1, SoccerNet=82.6, DanceTrack=56.7)
---
Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
Description
The MOT ground-truth preprocessing parsed the `class` column (`mot.py`) but never used it. Valid GT and distractor regions were both derived from the confidence flag alone:
- valid GT = `conf > 0`
- distractors = `conf <= 0`

TrackEval's `mot_challenge_2d_box.py` keys off the class instead:
- ground truth kept for scoring = `(conf != 0) AND (class == pedestrian, 1)`
- distractor regions (tracker dets matched to them are suppressed) = `class ∈ {2, 7, 8, 12}`: `person_on_vehicle`, `static_person`, `distractor`, `reflection`

Why it matters
On MOT17/MOT20 ground truth, distractor-class rows carry `conf == 1`. The old conf-only logic therefore:
- counted them as ground truth (inflating FN/IDFN), and
- let tracker detections overlapping them be scored as false positives,

shifting MOTA / HOTA / IDF1 away from the official TrackEval numbers. The file documents TrackEval parity and `_remove_distractor_matches` cites the TrackEval MOT17 line range, so this is a parity gap rather than a design choice.
The existing `_remove_distractor_matches` machinery was already correct; it just needs the distractor mask to be class-based.

Scope / safety
This is invisible to the current integration tests, which cover only SportsMOT and DanceTrack (single pedestrian class, `conf == 1`). For that data the new and old masks are identical, so those results are unchanged. `num_tracker_ids` is intentionally left as-is (built before suppression); a never-matched id is metric-neutral.

Tests
`tests/io/test_mot.py` (hermetic, no downloads):
- `test_distractor_class_excluded_and_matching_tracker_removed`: a distractor-class GT (`conf==1`) and a `conf==0` pedestrian are excluded from GT, and a tracker det overlapping the distractor is suppressed. Fails on the pre-fix code (`num_gt_dets == 2`), passes after.
- `test_single_class_sequence_unaffected`: SportsMOT/DanceTrack-style data passes through unchanged.
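The scenario in the first test can be illustrated with a toy frame. This is a dependency-free sketch, not the repository's `_remove_distractor_matches`: the function `preprocess_frame`, the tuple layouts, and the box coordinates are all invented for illustration, and it uses greedy best-IoU matching in place of the optimal assignment that TrackEval computes. The 0.5 IoU threshold is likewise an assumption of this sketch.

```python
PEDESTRIAN_CLASS = 1
DISTRACTOR_CLASSES = frozenset({2, 7, 8, 12})


def iou_xywh(a, b):
    """IoU of two (x, y, w, h) boxes."""
    ax2, ay2 = a[0] + a[2], a[1] + a[3]
    bx2, by2 = b[0] + b[2], b[1] + b[3]
    iw = max(0.0, min(ax2, bx2) - max(a[0], b[0]))
    ih = max(0.0, min(ay2, by2) - max(a[1], b[1]))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def preprocess_frame(gt, tracker, threshold=0.5):
    """gt: list of (conf, cls, box); tracker: list of (track_id, box).

    Returns (scored GT rows, surviving tracker rows).
    """
    # All GT/tracker pairs, best overlap first. Greedy matching is a
    # simplification of this sketch, not an optimal assignment.
    pairs = sorted(
        ((iou_xywh(g[2], t[1]), gi, ti)
         for gi, g in enumerate(gt)
         for ti, t in enumerate(tracker)),
        reverse=True,
    )
    used_gt, used_tr, suppressed = set(), set(), set()
    for score, gi, ti in pairs:
        if score < threshold:
            break
        if gi in used_gt or ti in used_tr:
            continue
        used_gt.add(gi)
        used_tr.add(ti)
        # A tracker det matched to a distractor-class GT is dropped.
        if gt[gi][1] in DISTRACTOR_CLASSES:
            suppressed.add(ti)
    kept_gt = [g for g in gt if g[0] != 0 and g[1] == PEDESTRIAN_CLASS]
    kept_tr = [t for i, t in enumerate(tracker) if i not in suppressed]
    return kept_gt, kept_tr


gt = [
    (1, 1, (0, 0, 10, 20)),    # scored pedestrian
    (1, 8, (100, 0, 10, 20)),  # distractor class with conf == 1
    (0, 1, (200, 0, 10, 20)),  # ignored pedestrian (conf == 0)
]
tracker = [
    (10, (0, 0, 10, 20)),      # overlaps the scored pedestrian
    (20, (100, 0, 10, 20)),    # overlaps the distractor
    (30, (300, 0, 10, 20)),    # overlaps nothing
]

kept_gt, kept_tr = preprocess_frame(gt, tracker)
print(len(kept_gt), [t[0] for t in kept_tr])  # 1 [10, 30]
```

Matching runs against all GT rows before the scoring mask is applied, so the distractor box can still absorb tracker ID 20 even though it never counts as ground truth itself.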