Skip to content

Add appearance-based ReID for lost track recovery#53

Merged
Devnil434 merged 2 commits into
Devnil434:mainfrom
2024itb047samata:feature/reid-track-recovery
May 15, 2026
Merged

Add appearance-based ReID for lost track recovery#53
Devnil434 merged 2 commits into
Devnil434:mainfrom
2024itb047samata:feature/reid-track-recovery

Conversation

@2024itb047samata
Copy link
Copy Markdown
Contributor

@2024itb047samata 2024itb047samata commented May 15, 2026

Closes #45

Changes

  • Added appearance-based ReID using cosine similarity
  • Added lost-track embedding storage
  • Restores original IDs after short occlusions
  • Added configurable similarity threshold
  • Added embedding expiry cleanup
  • Added tests for recovery, rejection, and expiry

Summary by CodeRabbit

Release Notes

  • New Features

    • Tracker can now re-identify previously lost tracks when they reappear, maintaining ID continuity.
    • Added configurable similarity threshold for track matching (default: 0.85).
  • Tests

    • Enhanced test suite with comprehensive coverage for track re-identification scenarios.

Review Change Stack

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 15, 2026

Warning

Rate limit exceeded

@2024itb047samata has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 31 minutes and 24 seconds before requesting another review.

You’ve run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: aae6e128-f60d-4517-9f1a-a859b6836103

📥 Commits

Reviewing files that changed from the base of the PR and between 8e5f401 and e77be97.

📒 Files selected for processing (2)
  • services/tracking/tracker.py
  • tests/test_tracker.py
📝 Walkthrough

Walkthrough

This PR adds appearance-based re-identification (ReID) to the tracker to recover track IDs after temporary occlusions. A new configurable similarity threshold stores embeddings for lost tracks, matches them against new tracks using cosine similarity, and automatically expires stale embeddings beyond the max_age window.

Changes

ReID Track Recovery

Layer / File(s) Summary
ReID configuration and storage initialization
services/tracking/tracker.py
Tracker.__init__ accepts configurable reid_similarity_threshold (default 0.85) and initializes _lost_embeddings to store appearance vectors for lost track re-identification.
Cosine similarity computation
services/tracking/tracker.py
New _cosine_similarity(a, b) helper method computes cosine similarity via dot product and vector norms for embedding comparison.
Lost embedding capture and expiration
services/tracking/tracker.py
When tracks transition to LOST, embeddings are captured from DeepSort raw_tracks with last_seen timestamp; expired entries older than max_age are periodically cleaned up.
ReID matching in update path
services/tracking/tracker.py
Confirmed DeepSort tracks with appearance features are matched against stored lost embeddings; threshold-passing matches restore the original track ID and remove the used embedding.
REID behavior test suite
tests/test_tracker.py
Mock track helper extended with optional embeddings; three integration tests verify correct restoration on same appearance, rejection on different appearance, and expiration after max_age; BIRTH event assertion validates track ID logging.

Sequence Diagram

sequenceDiagram
  participant Tracker
  participant RawTracks as DeepSort<br/>RawTracks
  participant LostEmbeddings as _lost_embeddings<br/>Store
  participant Similarity as _cosine_similarity

  Tracker->>RawTracks: iterate confirmed tracks
  RawTracks-->>Tracker: track with features
  Tracker->>LostEmbeddings: iterate stored lost IDs
  Tracker->>Similarity: compute similarity(features, embedding)
  Similarity-->>Tracker: similarity score
  alt similarity > threshold
    Tracker->>Tracker: overwrite track_id with lost ID
    Tracker->>LostEmbeddings: remove used embedding
  end

  Note over Tracker: Track becomes LOST
  Tracker->>RawTracks: find corresponding DeepSort track
  RawTracks-->>Tracker: track with latest features
  Tracker->>LostEmbeddings: store embedding + last_seen
  
  Tracker->>LostEmbeddings: iterate all stored embeddings
  alt last_seen > max_age
    Tracker->>LostEmbeddings: delete expired entry
  end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 A track once lost now finds its way,
Through embeddings that glow and play,
Cosine whispers, "I know thee well,"
ReID magic breaks the ID spell,
With threshold set and max_age's chime,
Lost tracks hop back home through time! 🐇

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 30.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: adding appearance-based ReID for recovering lost tracks through track re-identification.
Linked Issues check ✅ Passed The PR implementation addresses all coding requirements from issue #45: ReID with configurable cosine similarity threshold (0.85 default), embedding storage for lost tracks, max_age expiry, and comprehensive test coverage including recovery, rejection, and expiry scenarios.
Out of Scope Changes check ✅ Passed All changes are directly aligned with issue #45 objectives. The modifications to tracker.py implement core ReID functionality, and test additions cover the specified acceptance criteria without introducing unrelated changes.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
services/tracking/tracker.py (1)

199-215: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Critical indentation bug: LOST is emitted every frame and track is referenced outside its guard.

Lines 205 and 211 are de-indented to the same level as if frames_since == 1:, which moves them out of that branch:

  • Line 205 if track is not None and ...: runs on every iteration where the track is missing, but track is only assigned inside if frames_since == 1: (line 203). On subsequent frames you'll either hit NameError or — worse — silently consume a stale track left over from a previous iteration of the outer for tid, prev_obj loop, leading to wrong-track embedding storage.
  • Line 211 self._emit_lifecycle(TrackState.LOST, ...) now fires on every frame the track is absent, not once at transition, violating the BORN/LOST/DEAD lifecycle contract and spamming downstream consumers (event logger, drain queue, tests asserting a single LOST event).
🛠️ Proposed fix: nest the embedding capture and LOST emission under `if frames_since == 1:`
             if tid not in current_ids:
                 frames_since = self._frame_id - prev_obj.last_seen_frame
                 if frames_since == 1:
                     track = next((t for t in raw_tracks if int(t.track_id) == tid), None)
-
-                if track is not None and hasattr(track, "features") and track.features:
-                    self._lost_embeddings[tid] = {
-                        "embedding": track.features[-1],
-                        "last_seen": self._frame_id,
-                }
-
-                self._emit_lifecycle(
-                    TrackState.LOST, tid,
-                    prev_obj.zones_present,
-                    prev_obj.dwell_time_seconds,
-                )
+                    if track is not None and hasattr(track, "features") and track.features:
+                        self._lost_embeddings[tid] = {
+                            "embedding": track.features[-1],
+                            "last_seen": self._frame_id,
+                        }
+                    self._emit_lifecycle(
+                        TrackState.LOST, tid,
+                        prev_obj.zones_present,
+                        prev_obj.dwell_time_seconds,
+                    )
                 if frames_since > self._tracker.max_age:
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@services/tracking/tracker.py` around lines 199 - 215, The code emits LOST
every frame and uses `track` outside its guard because the embedding capture and
`self._emit_lifecycle` calls were de-indented; fix by moving the block that
references `track` and the LOST emission inside the `if frames_since == 1:`
branch within the `for tid, prev_obj in list(self._active_tracks.items())` loop
so `track` is only referenced after being assigned, and ensure
`self._lost_embeddings[tid] = {...}` and `self._emit_lifecycle(TrackState.LOST,
tid, prev_obj.zones_present, prev_obj.dwell_time_seconds)` execute only when
`frames_since == 1` (and `track` exists with features).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@services/tracking/tracker.py`:
- Around line 270-278: Normalize the indentation of the method
_cosine_similarity to consistent 4-space class/method blocks and add a guard for
zero-norm vectors: compute norm_a = np.linalg.norm(a) and norm_b =
np.linalg.norm(b), if either norm is below a small epsilon (e.g. 1e-8) return
0.0 (or a defined sentinel) instead of performing the division, otherwise return
np.dot(a, b) / (norm_a * norm_b); use the function name _cosine_similarity to
find and update the method and ensure the signature parameters and body align
with the surrounding class style.
- Around line 125-155: The ReID matching block mis-indents key statements so
similarity calculation, threshold check, restoration, deletion and break are
outside the inner candidate loop and/or outside the features check, causing
NameError and incorrect control flow; fix by moving the call to
self._cosine_similarity(new_embedding, data["embedding"]), the if similarity >
self.REID_SIMILARITY_THRESHOLD check, the restoration of t.track_id (assign to
lost_id), deletion from self._lost_embeddings, the logger.info call, and the
break so they all sit inside the for lost_id, data in
list(self._lost_embeddings.items()) loop and also ensure that whole loop is
inside the if hasattr(t, "features") and t.features: block, keep the age check
as currently written (continue when age > self.max_age) so each candidate is
evaluated independently, and ensure the break only exits the inner loop (not the
outer track loop).

In `@tests/test_tracker.py`:
- Around line 382-487: The two test functions test_reid_rejects_low_similarity
and test_reid_expires_after_max_age are incorrectly indented inside another test
(making them local functions and undiscoverable); move each function so its def
line (and its `@patch` decorator) is at module level (left-aligned with other
top-level tests) and adjust the body indentation accordingly so pytest can
discover them, keeping the same function bodies and decorators (e.g., maintain
`@patch`("services.tracking.tracker.DeepSort"), Tracker usage, and mock_ds setup).

---

Outside diff comments:
In `@services/tracking/tracker.py`:
- Around line 199-215: The code emits LOST every frame and uses `track` outside
its guard because the embedding capture and `self._emit_lifecycle` calls were
de-indented; fix by moving the block that references `track` and the LOST
emission inside the `if frames_since == 1:` branch within the `for tid, prev_obj
in list(self._active_tracks.items())` loop so `track` is only referenced after
being assigned, and ensure `self._lost_embeddings[tid] = {...}` and
`self._emit_lifecycle(TrackState.LOST, tid, prev_obj.zones_present,
prev_obj.dwell_time_seconds)` execute only when `frames_since == 1` (and `track`
exists with features).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 2194a800-e17e-49ee-9e2f-751744c6da99

📥 Commits

Reviewing files that changed from the base of the PR and between c73f3cb and 8e5f401.

📒 Files selected for processing (2)
  • services/tracking/tracker.py
  • tests/test_tracker.py

Comment on lines +125 to +155
# ── ReID matching ─────────────────────────────────────
if hasattr(t, "features") and t.features:

new_embedding = t.features[-1]

for lost_id, data in list(self._lost_embeddings.items()):

age = self._frame_id - data["last_seen"]

if age > self.max_age:
continue

similarity = self._cosine_similarity(
new_embedding,
data["embedding"],
)

if similarity > self.REID_SIMILARITY_THRESHOLD:

# Restore original ID
tid = lost_id
t.track_id = lost_id

del self._lost_embeddings[lost_id]

logger.info(
f"ReID matched: restored track #{lost_id}"
)

break

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Critical indentation bugs in ReID matching break correctness and raise NameError.

The block has several de-indented statements that take logic out of its intended scope:

  • Line 137 similarity = self._cosine_similarity(...) is at the same level as the for lost_id loop, so it runs once per track using only the last iteration's data, ignoring all earlier candidates. This is exactly why Ruff B007 flags lost_id as unused — the loop body is effectively empty except for the age skip.
  • Line 142 if similarity > self.REID_SIMILARITY_THRESHOLD: is outside the if hasattr(t, "features") and t.features: block. When a track has no features (or _lost_embeddings is empty so data/lost_id are never assigned), similarity / data / lost_id are undefined → NameError crashes update() on common paths.
  • Line 154 break sits inside if similarity > ...: but its nearest enclosing loop is for t in raw_tracks, so a successful ReID match aborts processing of all remaining tracks in the frame, not just the inner candidate loop.
🛠️ Proposed fix: move the similarity computation, threshold check, and break inside the inner loop
             tid  = int(t.track_id)
-            # ── ReID matching ─────────────────────────────────────
-            if hasattr(t, "features") and t.features:
-
-                new_embedding = t.features[-1]
-
-                for lost_id, data in list(self._lost_embeddings.items()):
-
-                    age = self._frame_id - data["last_seen"]
-
-                    if age > self.max_age:
-                        continue
-
-                similarity = self._cosine_similarity(
-                    new_embedding,
-                    data["embedding"],
-                )
-
-            if similarity > self.REID_SIMILARITY_THRESHOLD:
-
-            # Restore original ID
-                tid = lost_id
-                t.track_id = lost_id
-
-                del self._lost_embeddings[lost_id]
-
-                logger.info(
-                f"ReID matched: restored track #{lost_id}"
-            )
-
-                break
-        
+            # ── ReID matching ─────────────────────────────────────
+            if hasattr(t, "features") and t.features:
+                new_embedding = t.features[-1]
+                for lost_id, data in list(self._lost_embeddings.items()):
+                    age = self._frame_id - data["last_seen"]
+                    if age > self.max_age:
+                        continue
+                    similarity = self._cosine_similarity(
+                        new_embedding,
+                        data["embedding"],
+                    )
+                    if similarity > self.REID_SIMILARITY_THRESHOLD:
+                        # Restore original ID
+                        tid = lost_id
+                        t.track_id = lost_id
+                        del self._lost_embeddings[lost_id]
+                        logger.info(f"ReID matched: restored track #{lost_id}")
+                        break
🧰 Tools
🪛 Ruff (0.15.12)

[warning] 130-130: Loop control variable lost_id not used within loop body

(B007)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@services/tracking/tracker.py` around lines 125 - 155, The ReID matching block
mis-indents key statements so similarity calculation, threshold check,
restoration, deletion and break are outside the inner candidate loop and/or
outside the features check, causing NameError and incorrect control flow; fix by
moving the call to self._cosine_similarity(new_embedding, data["embedding"]),
the if similarity > self.REID_SIMILARITY_THRESHOLD check, the restoration of
t.track_id (assign to lost_id), deletion from self._lost_embeddings, the
logger.info call, and the break so they all sit inside the for lost_id, data in
list(self._lost_embeddings.items()) loop and also ensure that whole loop is
inside the if hasattr(t, "features") and t.features: block, keep the age check
as currently written (continue when age > self.max_age) so each candidate is
evaluated independently, and ensure the break only exits the inner loop (not the
outer track loop).

Comment thread services/tracking/tracker.py Outdated
Comment thread tests/test_tracker.py Outdated
@2024itb047samata
Copy link
Copy Markdown
Contributor Author

Fixed the indentation and control-flow issues in the ReID block, added zero-norm protection in cosine similarity, and corrected test indentation so pytest discovers all ReID tests properly.
@Devnil434 Kindly review

@Devnil434 Devnil434 merged commit 8469d4a into Devnil434:main May 15, 2026
1 check passed
@2024itb047samata
Copy link
Copy Markdown
Contributor Author

Kindly add GSSoC approved tag and the level of difficulty please in all my prs. Thanks @Devnil434

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Feature] — Implement track re-identification after occlusion

2 participants