perf(eval): vectorize HOTA per-frame alpha loop and id remapping by RubenHaisma · Pull Request #462 · roboflow/trackers

RubenHaisma · 2026-06-21T11:01:10Z

Description

compute_hota_metrics is the slowest of the eval metrics, and the cost is per-frame Python overhead rather than the actual math. Two things were done one element/threshold at a time inside the per-frame loops:

- The 19 alpha thresholds were scored in a Python `for a, alpha in ...` loop (per frame), each iteration fancy-indexing and scattering into a separate `matches_counts[a]` matrix.
- The gt/tracker id → row-index mapping was rebuilt every frame with a Python list comprehension over dict lookups (`np.array([gt_id_to_idx[int(id_)] for id_ in ...])`), twice: once in each pass.

Both are invariant transforms that numpy can do in a single vectorized op:

- Alpha loop → broadcast. Score all thresholds at once: `matched_sim[None, :] >= (ALPHA_THRESHOLDS[:, None] - EPS)`, then accumulate TP/FN/FP/LocA with array reductions and scatter the per-alpha co-occurrence counts into one `(num_alphas, num_gt, num_tracker)` array. (Each (gt, tracker) pair is unique within a frame, so the advanced-index in-place add has no duplicate destinations.)
- Dict lookup → `np.searchsorted`. `np.unique` already returns sorted ids, so an id's index is its position by binary search, and no per-frame dict is needed.
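To make the broadcast step concrete, here is a minimal sketch of scoring one frame against all thresholds at once. It is not the code in `src/trackers/eval/hota.py`: the function name `score_frame`, its arguments, and the values chosen for `ALPHA_THRESHOLDS` and `EPS` are assumptions made for illustration.

```python
import numpy as np

# Assumed values for illustration: 19 thresholds (0.05 .. 0.95) and a
# machine-epsilon tolerance. The real constants live in the library.
ALPHA_THRESHOLDS = np.arange(0.05, 0.99, 0.05)
EPS = np.finfo(float).eps


def score_frame(matched_sim, gt_rows, tr_cols, num_gt_dets, num_tr_dets,
                matches_counts):
    """Accumulate per-alpha TP/FN/FP/LocA for one frame (hypothetical helper).

    matched_sim: similarity of each matched (gt, tracker) pair, shape (M,)
    gt_rows, tr_cols: row/column index of each matched pair, shape (M,)
    matches_counts: (num_alphas, num_gt, num_tracker) array, updated in place
    """
    # Boolean mask of shape (num_alphas, M): which matches pass each alpha.
    passed = matched_sim[None, :] >= (ALPHA_THRESHOLDS[:, None] - EPS)

    tp = passed.sum(axis=1)                              # per-alpha true positives
    fn = num_gt_dets - tp                                # unmatched ground truth
    fp = num_tr_dets - tp                                # unmatched tracker output
    loc_a = (passed * matched_sim[None, :]).sum(axis=1)  # summed similarity of TPs

    # Scatter one count per (alpha, gt, tracker) triple that passed. This plain
    # in-place add is only safe because no triple repeats within a frame.
    a_idx, m_idx = np.nonzero(passed)
    matches_counts[a_idx, gt_rows[m_idx], tr_cols[m_idx]] += 1
    return tp, fn, fp, loc_a


# Example: three matched pairs in a frame with 4 GT and 3 tracker detections.
counts = np.zeros((len(ALPHA_THRESHOLDS), 4, 3), dtype=int)
tp, fn, fp, loc_a = score_frame(
    np.array([0.92, 0.52, 0.32]), np.array([0, 1, 2]), np.array([1, 0, 2]),
    num_gt_dets=4, num_tr_dets=3, matches_counts=counts,
)
```

The single `(num_alphas, M)` mask replaces 19 separate per-threshold masks, so the per-frame cost is a handful of NumPy calls rather than 19 Python loop iterations.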

No public API or output change.

Correctness

Output is identical to the previous implementation. Verified by a differential test against the pre-change code across 600 randomized sequences, including empty / single-detection / no-GT / no-tracker frames and both contiguous and non-contiguous ids (rtol=atol=1e-11, all fields incl. the per-alpha arrays).
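The shape of such a differential test can be sketched as follows. This is a simplified stand-in rather than the test used for the PR: it compares only per-alpha TP counts and summed similarities on random inputs, and `loop_reference`, `vectorized`, and the threshold constants are assumed names and values.

```python
import numpy as np

# Assumed constants for illustration (19 thresholds, machine-epsilon tolerance).
ALPHAS = np.arange(0.05, 0.99, 0.05)
EPS = np.finfo(float).eps


def loop_reference(sim):
    """Sequential reference: one Python iteration per alpha threshold."""
    tp = np.zeros(len(ALPHAS))
    loc = np.zeros(len(ALPHAS))
    for a, alpha in enumerate(ALPHAS):
        mask = sim >= alpha - EPS
        tp[a] = mask.sum()
        loc[a] = sim[mask].sum()
    return tp, loc


def vectorized(sim):
    """Broadcast version: all thresholds scored in one comparison."""
    passed = sim[None, :] >= (ALPHAS[:, None] - EPS)
    tp = passed.sum(axis=1).astype(float)
    loc = (passed * sim[None, :]).sum(axis=1)
    return tp, loc


rng = np.random.default_rng(0)
for _ in range(600):
    n = rng.integers(0, 50)   # n == 0 covers the empty-frame case
    sim = rng.random(n)
    for ref, new in zip(loop_reference(sim), vectorized(sim)):
        np.testing.assert_allclose(new, ref, rtol=1e-11, atol=1e-11)
```

A tolerance is used rather than exact equality for the floating-point sums because the two versions may add the same values in a different order.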

- Existing `tests/eval/test_hota.py` and the `compute_hota_metrics` doctest pass unchanged (HOTA/DetA/AssA = 0.745/0.816/0.691).
- Adds `test_metrics_invariant_to_id_relabeling`: HOTA must be invariant to a non-monotonic id relabeling (ids unsorted within a frame), which directly exercises the new `searchsorted` remapping.
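The remapping that this test exercises can be illustrated with a small sketch. The helper name `remap_ids` and the sample ids are hypothetical; the only library behaviour relied on is that `np.unique` returns sorted values and that `np.searchsorted` accepts unsorted query values.

```python
import numpy as np


def remap_ids(frame_ids, unique_ids):
    """Map raw ids to row indices by binary search (hypothetical helper).

    Precondition: unique_ids is sorted and contains every id in frame_ids.
    If an id were missing, searchsorted would silently return an insertion
    point instead of raising, so the precondition matters.
    """
    return np.searchsorted(unique_ids, frame_ids)


# Non-contiguous ids that are unsorted within each frame.
frames = [np.array([42, 7, 1000]), np.array([1000, 7]), np.array([42])]
unique_ids = np.unique(np.concatenate(frames))  # sorted: 7, 42, 1000

for ids in frames:
    idx = remap_ids(ids, unique_ids)
    # Round trip: indexing the sorted ids with the result recovers the input.
    assert np.array_equal(unique_ids[idx], ids)
```

The round-trip assertion doubles as a cheap check of the precondition, since it fails for any id that is absent from `unique_ids`.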

Benchmark

500 frames, ~40 GT / ~45 tracker per frame, non-contiguous ids (20 iters, warm):

| | ms/call |
| --- | --- |
| before | 79.7 |
| after | 27.5 |
| speedup | ~2.9× |

Remaining time is the per-frame Hungarian solve and first-pass IoU, which are inherently per-frame.
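For readers who want to run a comparable measurement, a warm, fixed-iteration timing harness might look like the sketch below. It is not the benchmark script behind the numbers above: it times only the id-remapping step on synthetic data, and `time_call`, `remap_dict`, and `remap_searchsorted` are hypothetical names.

```python
import time

import numpy as np


def time_call(fn, *args, iters=20, warmup=1):
    """Return the mean wall-clock time per call in milliseconds."""
    for _ in range(warmup):          # warm caches before measuring
        fn(*args)
    start = time.perf_counter()
    for _ in range(iters):
        fn(*args)
    return (time.perf_counter() - start) / iters * 1e3


def remap_dict(frames, unique_ids):
    """Per-id Python dict lookups, one list comprehension per frame."""
    lookup = {int(i): k for k, i in enumerate(unique_ids)}
    return [np.array([lookup[int(i)] for i in ids]) for ids in frames]


def remap_searchsorted(frames, unique_ids):
    """One binary-search call per frame."""
    return [np.searchsorted(unique_ids, ids) for ids in frames]


# Synthetic data: 500 frames, 40 ids per frame, drawn from a sparse id pool.
rng = np.random.default_rng(0)
pool = rng.choice(10_000, size=200, replace=False)
frames = [rng.choice(pool, size=40, replace=False) for _ in range(500)]
unique_ids = np.unique(np.concatenate(frames))

print(f"dict lookup:  {time_call(remap_dict, frames, unique_ids):.2f} ms/call")
print(f"searchsorted: {time_call(remap_searchsorted, frames, unique_ids):.2f} ms/call")
```

Absolute timings depend on the machine and NumPy build, so only the ratio between the two lines is meaningful.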

compute_hota_metrics did two things per frame that numpy can do at once:

1. scored the 19 alpha thresholds in a Python for-loop
2. rebuilt a Python dict comprehension to map gt/tracker ids to row indices (twice per frame)

Replace (1) by broadcasting the matched-pair similarities against ALPHA_THRESHOLDS and scattering the per-alpha co-occurrence counts into a single (num_alphas, num_gt, num_tracker) array; replace (2) with np.searchsorted on the already-sorted unique-id arrays.

Output is identical to the previous implementation (verified across 600 randomized sequences incl. empty/single/no-gt/no-tracker frames and both contiguous and non-contiguous ids). ~2.9x faster on a 500-frame, 40x45 sequence.

Also adds an id-relabeling invariance test exercising non-contiguous, within-frame-unsorted ids.

CLAassistant · 2026-06-21T11:01:19Z

All committers have signed the CLA.

Copilot

Pull request overview

This PR speeds up HOTA evaluation by removing per-frame Python overhead in compute_hota_metrics, while keeping metric outputs unchanged. It does so by vectorizing the per-alpha threshold scoring and replacing per-frame ID→index dict lookups with np.searchsorted on the globally-unique sorted ID lists.

Changes:

- Vectorized per-frame alpha-threshold evaluation via broadcasting to score all 19 thresholds at once and accumulate TP/FN/FP/LocA.
- Replaced per-frame Python dict ID remapping with `np.searchsorted` on `np.unique`’s sorted ID outputs.
- Added a regression test ensuring metrics are invariant under non-monotonic ID relabeling (including unsorted IDs within a frame).

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| `src/trackers/eval/hota.py` | Vectorizes per-alpha scoring and uses `np.searchsorted` for ID remapping to reduce per-frame Python overhead. |
| `tests/eval/test_hota.py` | Adds invariance test to validate correctness of the new ID remapping logic. |

- Add test_output_matches_sequential_reference (4 parametrized cases: contiguous ids, id-switch, non-monotonic ids, partial asymmetric match): runs vectorized implementation against pre-PR sequential reference (dict map + per-alpha Python loop) and asserts bit-identical per-alpha arrays; guards hot path from future silent drift (resolves M1 finding from /oss:review)
- Add _sequential_hota_reference module-level helper implementing pre-vectorization logic for use by the differential test
- Extend test_metrics_invariant_to_id_relabeling allclose loop with AssRe_array and AssPr_array (previously checked at scalar level only)

---
Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>

- Reword 'no duplicate destinations' comment (hota.py:214): replaces false premise ('each pair is unique') with accurate MOT-convention invariant ('each id appears at most once per frame'); prevents future maintainer from silently switching to np.add.at believing it changes behavior
- Add searchsorted precondition comment (hota.py:136): states that all per-frame IDs are guaranteed present in unique_*_ids (silent wrong-index impossible)
- Tighten TrackEval reference range from hota.py:72-101 to hota.py:72-88 (the alpha loop at 91-101 no longer exists after vectorization)

---
Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
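The distinction behind the reworded comment is general NumPy behaviour and can be shown in isolation. The arrays below are toy data, not taken from `hota.py`: an in-place add through a fancy index applies each destination once, while `np.add.at` accumulates repeated destinations.

```python
import numpy as np

rows = np.array([0, 1, 1])           # destination 1 appears twice

buffered = np.zeros(3, dtype=int)
buffered[rows] += 1                  # repeated destination is written once
print(buffered)                      # -> [1 1 0]

unbuffered = np.zeros(3, dtype=int)
np.add.at(unbuffered, rows, 1)       # repeated destination accumulates
print(unbuffered)                    # -> [1 2 0]

# With unique destinations the two forms agree, which is the situation the
# comment describes when each id appears at most once per frame.
unique_rows = np.array([0, 2])
a = np.zeros(3, dtype=int)
b = np.zeros(3, dtype=int)
a[unique_rows] += 1
np.add.at(b, unique_rows, 1)
assert np.array_equal(a, b)
```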

- Stamp CHANGELOG [Unreleased] → [2.5.0] (2026-06-22); add missing entries: CBIoUTracker (#417), py.typed, Tuner params (#427), HOTA eval fixes (#462, #466)
- Bump version 2.4.0 → 2.5.0 in pyproject.toml
- Add C-BIoU row to README algorithms table and intro text (MOT17=63.0, SportsMOT=73.1, SoccerNet=82.6, DanceTrack=56.7)

---
Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>

RubenHaisma requested a review from SkalskiP as a code owner June 21, 2026 11:01

Borda requested a review from Copilot June 22, 2026 10:40

Copilot started reviewing on behalf of Borda June 22, 2026 10:41

Copilot AI reviewed Jun 22, 2026


Borda and others added 3 commits June 22, 2026 14:40

Merge branch 'develop' into perf/vectorize-hota

b6334e9

Borda approved these changes Jun 22, 2026


Borda merged commit e42d40a into roboflow:develop Jun 22, 2026
18 checks passed

Borda mentioned this pull request Jun 22, 2026

v2.5.0: Pluggable IoU variants + C-BIoU tracker #471

Merged

Borda merged 4 commits into roboflow:develop from RubenHaisma:perf/vectorize-hota
