fix: stop sending video frames to realtime LLMs and stop processors when participant leaves #283

Merged
dangusev merged 11 commits into main from fix/video-frames on Jan 12, 2026

Conversation


@d3xvn (Contributor) commented on Jan 9, 2026

When a user leaves a call without ending it, the agent was continuing to forward the last-seen video frame to the Realtime LLM (e.g., Gemini, OpenAI).

This fix ensures that when all video tracks are removed:

  • The agent calls _stop_watching_video_track() on the LLM
  • Each realtime LLM implementation removes its frame handler from the VideoForwarder
  • No more frames are sent to the LLM until a new participant joins

Summary by CodeRabbit

  • Bug Fixes

    • Video teardown now reliably stops senders and processors when tracks end, reducing resource leaks and improving stability.
  • New Features

    • Public stop/cleanup hooks added across LLMs, VLMs, and processors to explicitly halt video forwarding and processing when participants leave.
    • RTC manager now exposes a stop action to cease video sending.
  • Style/Logs

    • Added informational logs indicating when video forwarding/processing stops.
  • Tests

    • Tests and mocks updated to exercise the new stop/teardown hooks.



Changes:
- agents.py: Call llm._stop_watching_video_track() when no tracks remain
- openai_realtime.py: Implement _stop_watching_video_track() to stop video sender
- rtc_manager.py: Add stop_video_sender() method to stop video forwarding

Fixes AI-378 sub-issue 2/6
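
A minimal sketch of the agent-side flow described in the change list above, assuming hypothetical names (handle_track_removed, video_tracks, is_processed) and an assumed import path; note that later commits in this PR promote the private _stop_watching_video_track() to a public stop_watching_video_track():

    # Sketch only, not the actual agents.py code; everything except
    # _stop_watching_video_track() is an illustrative assumption.
    from vision_agents.core.llm.llm import VideoLLM  # assumed import path

    async def handle_track_removed(agent) -> None:
        remaining = [t for t in agent.video_tracks if not t.is_processed]
        if not remaining:
            # No video tracks left: stop forwarding frames to the realtime LLM
            # until a new participant joins.
            if isinstance(agent.llm, VideoLLM):
                await agent.llm._stop_watching_video_track()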

coderabbitai bot commented Jan 9, 2026

Warning

Rate limit exceeded

@dangusev has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 9 minutes and 32 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between baf8abf and f460054.

📒 Files selected for processing (5)
  • plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py
  • plugins/openai/tests/test_chat_completions.py
  • plugins/qwen/vision_agents/plugins/qwen/qwen_realtime.py
  • plugins/ultralytics/tests/test_ultralytics.py
  • tests/test_agent_tracks.py
📝 Walkthrough


When the agent detects no remaining non-processed video tracks, it now awaits the LLM's stop_watching_video_track() (if a video LLM) and iterates its video processors to await processor.stop_processing(). Multiple realtime/RTC and processor implementations expose corresponding public stop hooks and updated teardown flows.

Changes

Cohort / File(s) | Summary

  • Agent track-change logic (agents-core/vision_agents/core/agents/agents.py)
    _on_track_change now awaits self.llm.stop_watching_video_track() (if a video LLM) and iterates self.video_processors to await processor.stop_processing() when no non-processed tracks remain.
  • VideoLLM / Realtime API (agents-core/vision_agents/core/llm/llm.py, agents-core/vision_agents/core/llm/realtime.py)
    Added public/abstract stop_watching_video_track() hook on VideoLLM/Realtime (renamed from prior private variant) for implementations to override.
  • RTC manager (plugins/openai/vision_agents/plugins/openai/rtc_manager.py)
    Added async def stop_video_sender() to remove frame handler from current video forwarder, clear forwarder reference, and log the stop.
  • OpenAI realtime (plugins/openai/vision_agents/plugins/openai/openai_realtime.py)
    Promoted _stop_watching_video_track to stop_watching_video_track() and implemented it to call await self.rtc.stop_video_sender(); added docstring.
  • Realtime / VLM plugins, promoted stop (plugins/gemini/.../gemini_realtime.py, plugins/qwen/.../qwen_realtime.py, plugins/moondream/.../vlm/*, plugins/huggingface/.../huggingface_vlm.py, plugins/nvidia/.../nvidia_vlm.py, plugins/openai/.../chat_completions/chat_completions_vlm.py)
    Replaced private _stop_watching_video_track with public stop_watching_video_track() across plugins; updated call sites and added logging in teardown.
  • Processor base API (agents-core/vision_agents/core/processors/base_processor.py)
    Added async def stop_processing(self) -> None hook on VideoProcessor for cleanup.
  • Processor implementations (plugins/moondream/.../detection/*, plugins/roboflow/.../*, plugins/ultralytics/.../yolo_pose_processor.py, plugins/decart/.../decart_restyling_processor.py)
    Implemented stop_processing() to detach frame handlers, clear forwarders, log stop events; updated close() to await stop_processing() where applicable.
  • OpenAI RTC integration (plugins/openai/vision_agents/plugins/openai/openai_realtime.py, plugins/openai/vision_agents/plugins/openai/rtc_manager.py)
    New public stop path ties realtime class to RTC manager's stop_video_sender() to stop forwarding video frames.
  • Tests & examples (plugins/openai/tests/*, plugins/gemini/tests/*, plugins/qwen/tests/*, plugins/ultralytics/tests/test_ultralytics.py, tests/test_agent_tracks.py, examples/01_simple_agent_example/simple_agent_example.py)
    Tests updated to call public stop APIs; mocks extended with stop_processing / stop_watching_video_track; fixture teardown adjusted for async closure; minor import reordering in example.

Sequence Diagram(s)

sequenceDiagram
    participant Agent
    participant VideoLLM
    participant Processor as VideoProcessor
    participant RTC as RTCManager

    Agent->>Agent: _on_track_change()
    Agent->>Agent: compute remaining non-processed tracks
    alt no non-processed tracks
        Agent->>VideoLLM: await stop_watching_video_track()
        alt VideoLLM is Realtime implementation
            VideoLLM->>RTC: await stop_video_sender()
            RTC->>RTC: remove frame handler & clear forwarder
            RTC-->>VideoLLM: return
        end
        loop for each processor
            Agent->>Processor: await stop_processing()
            Processor->>Processor: remove frame handler & clear forwarder
            Processor-->>Agent: return
        end
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested labels

bug

Suggested reviewers

  • Nash0x7E2
  • yarikdevcom
  • tschellenbach

Poem

The sender unthreads like a throat unwinding,
cold plastic fingers fall away from light.
I watch the forwarder empty into silence;
the machine exhales and closes its small eyes.
A mute room learns how to be a room again.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
  • Docstring Coverage ⚠️ Warning: Docstring coverage is 71.43%, which is insufficient; the required threshold is 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)
  • Description Check ✅ Passed: Check skipped because CodeRabbit’s high-level summary is enabled.
  • Title Check ✅ Passed: The title clearly and accurately summarizes the main change: stopping video frame delivery to realtime LLMs and halting processors when participants leave, which is the core purpose of this changeset.





@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (1)
agents-core/vision_agents/core/agents/agents.py (1)

1014-1017: Well-implemented early exit when video tracks are removed.

The logic correctly stops video watching when all non-processed tracks are removed, preventing unnecessary frame forwarding to the LLM.

📝 Optional: Add debug logging for observability

Consider adding a log statement before stopping video watching to aid debugging:

 if not non_processed_tracks:
     # No more video tracks, stop sending video to the LLM
     if _is_video_llm(self.llm):
+        self.logger.debug("No video tracks remain, stopping video watching")
         await self.llm._stop_watching_video_track()
     return

This would provide visibility into when the agent transitions from active video watching to idle state.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 1740881 and ac65207.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (3)
  • agents-core/vision_agents/core/agents/agents.py
  • plugins/openai/vision_agents/plugins/openai/openai_realtime.py
  • plugins/openai/vision_agents/plugins/openai/rtc_manager.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (.cursor/rules/python.mdc)

**/*.py: Never adjust sys.path in Python code
Never write except Exception as e - use specific exception handling
Avoid using getattr, hasattr, delattr and setattr; prefer normal attribute access in Python
Docstrings should follow the Google style guide for docstrings

Files:

  • plugins/openai/vision_agents/plugins/openai/openai_realtime.py
  • agents-core/vision_agents/core/agents/agents.py
  • plugins/openai/vision_agents/plugins/openai/rtc_manager.py
🧠 Learnings (2)
📓 Common learnings
Learnt from: dangusev
Repo: GetStream/Vision-Agents PR: 249
File: agents-core/vision_agents/core/agents/agents.py:1032-1042
Timestamp: 2025-12-10T19:35:39.238Z
Learning: In `agents-core/vision_agents/core/agents/agents.py`, when using `video_track_override_path`, creating a new `VideoFileTrack` for each participant (each call to `_on_track_added`) is intentional to maintain proper track lifecycle semantics tied to each participant.
📚 Learning: 2025-12-10T19:35:39.238Z
Learnt from: dangusev
Repo: GetStream/Vision-Agents PR: 249
File: agents-core/vision_agents/core/agents/agents.py:1032-1042
Timestamp: 2025-12-10T19:35:39.238Z
Learning: In `agents-core/vision_agents/core/agents/agents.py`, when using `video_track_override_path`, creating a new `VideoFileTrack` for each participant (each call to `_on_track_added`) is intentional to maintain proper track lifecycle semantics tied to each participant.

Applied to files:

  • agents-core/vision_agents/core/agents/agents.py
🧬 Code graph analysis (3)
plugins/openai/vision_agents/plugins/openai/openai_realtime.py (1)
plugins/openai/vision_agents/plugins/openai/rtc_manager.py (1)
  • stop_video_sender (152-159)
agents-core/vision_agents/core/agents/agents.py (5)
plugins/openai/vision_agents/plugins/openai/openai_realtime.py (1)
  • _stop_watching_video_track (375-377)
plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (1)
  • _stop_watching_video_track (278-280)
plugins/moondream/vision_agents/plugins/moondream/vlm/moondream_local_vlm.py (1)
  • _stop_watching_video_track (353-358)
plugins/moondream/vision_agents/plugins/moondream/vlm/moondream_cloud_vlm.py (1)
  • _stop_watching_video_track (227-232)
agents-core/vision_agents/core/llm/realtime.py (1)
  • _stop_watching_video_track (63-65)
plugins/openai/vision_agents/plugins/openai/rtc_manager.py (1)
agents-core/vision_agents/core/utils/video_forwarder.py (1)
  • remove_frame_handler (87-107)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: unit / Test "not integration"
  • GitHub Check: unit / Mypy
  • GitHub Check: unit / Ruff
  • GitHub Check: unit / Validate extra dependencies in "agents-core/pyproject.toml"
  • GitHub Check: unit / Test "not integration"
  • GitHub Check: unit / Validate extra dependencies in "agents-core/pyproject.toml"
  • GitHub Check: unit / Ruff
  • GitHub Check: unit / Mypy
🔇 Additional comments (2)
plugins/openai/vision_agents/plugins/openai/openai_realtime.py (1)

375-377: LGTM: Proper teardown implementation.

The method correctly delegates to rtc.stop_video_sender() to halt video forwarding when tracks are removed, replacing the previous no-op behavior with concrete cleanup.

plugins/openai/vision_agents/plugins/openai/rtc_manager.py (1)

152-159: Clean implementation of video sender shutdown.

The method properly removes the frame handler and clears the forwarder reference. Note that according to video_forwarder.py (lines 86-106), when the last handler is removed, the forwarder automatically stops itself, ensuring complete cleanup.
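
A hedged sketch of the stop path these two comments describe, approximating rtc_manager.stop_video_sender() (lines 152-159) and the realtime class's delegation to it; attribute names such as _video_forwarder and _send_video_frame are assumptions, not the repository's exact code:

    import logging

    logger = logging.getLogger(__name__)

    class RTCManagerSketch:
        def __init__(self) -> None:
            self._video_forwarder = None   # set when video sending starts
            self._send_video_frame = None  # frame handler registered with the forwarder

        async def stop_video_sender(self) -> None:
            if self._video_forwarder is not None:
                # Removing the last handler also stops the forwarder itself,
                # per the remove_frame_handler behaviour noted above.
                await self._video_forwarder.remove_frame_handler(self._send_video_frame)
                self._video_forwarder = None
                logger.info("Stopped video sender")

    class OpenAIRealtimeSketch:
        def __init__(self, rtc: RTCManagerSketch) -> None:
            self.rtc = rtc

        async def _stop_watching_video_track(self) -> None:
            """Stop forwarding video frames to the realtime session."""
            # Later commits in this PR promote this to a public stop_watching_video_track().
            await self.rtc.stop_video_sender()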

Fixes mypy type error by declaring the method on the base class
with a default no-op implementation. Subclasses that need actual
cleanup behavior already override this method.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (1)
agents-core/vision_agents/core/llm/llm.py (1)

444-446: LGTM! Clean lifecycle hook implementation.

The default no-op implementation is appropriate for this optional lifecycle method. The async signature correctly matches the overrides in subclasses that perform actual cleanup operations.

However, the docstring could be more comprehensive following Google style guidelines to help subclass implementers understand when this lifecycle hook is invoked and what cleanup actions they should perform.

📝 Enhanced docstring example
 async def _stop_watching_video_track(self) -> None:
-    """Stop watching the video track. Override in subclasses if needed."""
+    """Stop watching the video track.
+    
+    Called by the agent when all video tracks have been removed from the call.
+    Subclasses that implement video watching should override this method to perform
+    cleanup such as stopping video forwarders, removing frame handlers, and releasing
+    video resources.
+    
+    The default implementation is a no-op, suitable for LLMs that don't require cleanup.
+    """
     pass
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between ac65207 and 1dabce6.

📒 Files selected for processing (1)
  • agents-core/vision_agents/core/llm/llm.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (.cursor/rules/python.mdc)

**/*.py: Never adjust sys.path in Python code
Never write except Exception as e - use specific exception handling
Avoid using getattr, hasattr, delattr and setattr; prefer normal attribute access in Python
Docstrings should follow the Google style guide for docstrings

Files:

  • agents-core/vision_agents/core/llm/llm.py
🧠 Learnings (2)
📓 Common learnings
Learnt from: dangusev
Repo: GetStream/Vision-Agents PR: 249
File: agents-core/vision_agents/core/agents/agents.py:1032-1042
Timestamp: 2025-12-10T19:35:39.238Z
Learning: In `agents-core/vision_agents/core/agents/agents.py`, when using `video_track_override_path`, creating a new `VideoFileTrack` for each participant (each call to `_on_track_added`) is intentional to maintain proper track lifecycle semantics tied to each participant.
📚 Learning: 2025-12-10T19:35:39.238Z
Learnt from: dangusev
Repo: GetStream/Vision-Agents PR: 249
File: agents-core/vision_agents/core/agents/agents.py:1032-1042
Timestamp: 2025-12-10T19:35:39.238Z
Learning: In `agents-core/vision_agents/core/agents/agents.py`, when using `video_track_override_path`, creating a new `VideoFileTrack` for each participant (each call to `_on_track_added`) is intentional to maintain proper track lifecycle semantics tied to each participant.

Applied to files:

  • agents-core/vision_agents/core/llm/llm.py
🧬 Code graph analysis (1)
agents-core/vision_agents/core/llm/llm.py (4)
plugins/openai/vision_agents/plugins/openai/openai_realtime.py (1)
  • _stop_watching_video_track (375-377)
plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (1)
  • _stop_watching_video_track (278-280)
plugins/moondream/vision_agents/plugins/moondream/vlm/moondream_local_vlm.py (1)
  • _stop_watching_video_track (353-358)
agents-core/vision_agents/core/llm/realtime.py (1)
  • _stop_watching_video_track (63-65)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: unit / Mypy
  • GitHub Check: unit / Test "not integration"
  • GitHub Check: unit / Mypy
  • GitHub Check: unit / Test "not integration"

- Add stop_processing() method to VideoProcessor base class
- Call stop_processing on all video processors when no tracks remain
- Implement stop_processing in all processor plugins:
  - YOLOPoseProcessor (ultralytics)
  - MoondreamLocalProcessor, MoondreamCloudProcessor
  - RoboflowLocalProcessor, RoboflowCloudProcessor
  - DecartRestylingProcessor
- Standardize logging to INFO level for all stop messages (LLMs and processors)
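
A hedged sketch of the shared stop_processing() pattern those processors implement, per the descriptions in this PR; the class and the _process_frame stub are illustrative, not any plugin's actual code:

    import logging

    logger = logging.getLogger(__name__)

    class ProcessorSketch:
        """Sketch of the stop_processing()/close() pattern described above."""

        def __init__(self) -> None:
            self._video_forwarder = None  # assigned when the processor starts watching a track

        async def _process_frame(self, frame) -> None:
            ...  # per-plugin inference would happen here

        async def stop_processing(self) -> None:
            # Detach this processor's frame handler and drop the forwarder reference
            # so no further frames are processed after the participant leaves.
            if self._video_forwarder is not None:
                await self._video_forwarder.remove_frame_handler(self._process_frame)
                self._video_forwarder = None
                logger.info("Stopped video processing (participant left)")

        async def close(self) -> None:
            # close() delegates to stop_processing() before releasing other resources.
            await self.stop_processing()
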
@d3xvn changed the title from "fix: stop sending video frames to realtime LLMs when participant leavesw" to "fix: stop sending video frames to realtime LLMs and stop processors when participant leaves" on Jan 9, 2026

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🤖 Fix all issues with AI agents
In @agents-core/vision_agents/core/agents/agents.py:
- Around line 1014-1020: When cleaning up video processors in the branch where
non_processed_tracks is empty, wrap each call to processor.stop_processing() in
its own try/except so one failing processor doesn't prevent others from being
stopped; inside the except log the exception (use an existing logger like
self.logger if available or re-raise after attempting all stops) and continue,
keeping the existing call to await self.llm._stop_watching_video_track() for
video LLMs and returning afterward.

In @plugins/ultralytics/tests/test_ultralytics.py:
- Line 30: The fixture function pose_processor currently declares a return type
of YOLOPoseProcessor but it yields a value, so update its annotation to
Iterator[YOLOPoseProcessor] and ensure typing.Iterator is imported; modify the
def signature of pose_processor to use Iterator[YOLOPoseProcessor] and add "from
typing import Iterator" if missing so the fixture type correctly reflects the
yield-based generator.
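
A hedged sketch of the per-processor error isolation the agents.py prompt above asks for; the exception types and the logger attribute are placeholders (the repo's guidelines require specific exceptions rather than a bare except Exception), not the repository's actual choices:

    async def stop_all_processors(agent) -> None:
        # Stop each processor independently so one failure does not prevent
        # the others from being stopped.
        for processor in agent.video_processors:
            try:
                await processor.stop_processing()
            except (RuntimeError, ConnectionError) as exc:  # placeholder exception types
                agent.logger.exception("Failed to stop processor %r: %s", processor, exc)
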
🧹 Nitpick comments (1)
plugins/ultralytics/tests/test_ultralytics.py (1)

32-32: Consider moving the import to the top of the file.

Importing asyncio inside the fixture function works but deviates from the standard practice of placing imports at the module level. Moving it to the top would improve consistency and readability.

♻️ Proposed refactor

At the top of the file (around line 5):

 from pathlib import Path
 from typing import Iterator
+import asyncio
 
 import numpy as np

Then in the fixture:

     def pose_processor(self) -> Iterator[YOLOPoseProcessor]:
         """Create and manage YOLOPoseProcessor lifecycle."""
-        import asyncio
         processor = YOLOPoseProcessor(device="cpu")
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 1dabce6 and d61fddf.

📒 Files selected for processing (12)
  • agents-core/vision_agents/core/agents/agents.py
  • agents-core/vision_agents/core/processors/base_processor.py
  • plugins/decart/vision_agents/plugins/decart/decart_restyling_processor.py
  • plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py
  • plugins/moondream/vision_agents/plugins/moondream/detection/moondream_cloud_processor.py
  • plugins/moondream/vision_agents/plugins/moondream/detection/moondream_local_processor.py
  • plugins/openai/vision_agents/plugins/openai/rtc_manager.py
  • plugins/qwen/vision_agents/plugins/qwen/qwen_realtime.py
  • plugins/roboflow/vision_agents/plugins/roboflow/roboflow_cloud_processor.py
  • plugins/roboflow/vision_agents/plugins/roboflow/roboflow_local_processor.py
  • plugins/ultralytics/tests/test_ultralytics.py
  • plugins/ultralytics/vision_agents/plugins/ultralytics/yolo_pose_processor.py
✅ Files skipped from review due to trivial changes (1)
  • plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • plugins/openai/vision_agents/plugins/openai/rtc_manager.py
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit inference engine (.cursor/rules/python.mdc)

**/*.py: Never adjust sys.path in Python code
Never write except Exception as e - use specific exception handling
Avoid using getattr, hasattr, delattr and setattr; prefer normal attribute access in Python
Docstrings should follow the Google style guide for docstrings

Files:

  • plugins/moondream/vision_agents/plugins/moondream/detection/moondream_local_processor.py
  • plugins/ultralytics/vision_agents/plugins/ultralytics/yolo_pose_processor.py
  • plugins/decart/vision_agents/plugins/decart/decart_restyling_processor.py
  • plugins/roboflow/vision_agents/plugins/roboflow/roboflow_local_processor.py
  • plugins/moondream/vision_agents/plugins/moondream/detection/moondream_cloud_processor.py
  • plugins/roboflow/vision_agents/plugins/roboflow/roboflow_cloud_processor.py
  • plugins/ultralytics/tests/test_ultralytics.py
  • agents-core/vision_agents/core/agents/agents.py
  • plugins/qwen/vision_agents/plugins/qwen/qwen_realtime.py
  • agents-core/vision_agents/core/processors/base_processor.py
**/*test*.py

📄 CodeRabbit inference engine (.cursor/rules/python.mdc)

**/*test*.py: Never mock in tests; use pytest for testing
Mark integration tests with @pytest.mark.integration decorator
@pytest.mark.asyncio is not needed - it is automatic

Files:

  • plugins/ultralytics/tests/test_ultralytics.py
🧠 Learnings (2)
📓 Common learnings
Learnt from: dangusev
Repo: GetStream/Vision-Agents PR: 249
File: agents-core/vision_agents/core/agents/agents.py:1032-1042
Timestamp: 2025-12-10T19:35:39.238Z
Learning: In `agents-core/vision_agents/core/agents/agents.py`, when using `video_track_override_path`, creating a new `VideoFileTrack` for each participant (each call to `_on_track_added`) is intentional to maintain proper track lifecycle semantics tied to each participant.
📚 Learning: 2025-12-10T19:35:39.238Z
Learnt from: dangusev
Repo: GetStream/Vision-Agents PR: 249
File: agents-core/vision_agents/core/agents/agents.py:1032-1042
Timestamp: 2025-12-10T19:35:39.238Z
Learning: In `agents-core/vision_agents/core/agents/agents.py`, when using `video_track_override_path`, creating a new `VideoFileTrack` for each participant (each call to `_on_track_added`) is intentional to maintain proper track lifecycle semantics tied to each participant.

Applied to files:

  • agents-core/vision_agents/core/agents/agents.py
  • plugins/qwen/vision_agents/plugins/qwen/qwen_realtime.py
🧬 Code graph analysis (7)
plugins/moondream/vision_agents/plugins/moondream/detection/moondream_local_processor.py (4)
agents-core/vision_agents/core/processors/base_processor.py (2)
  • stop_processing (76-81)
  • close (29-32)
plugins/moondream/vision_agents/plugins/moondream/detection/moondream_cloud_processor.py (3)
  • stop_processing (234-239)
  • _process_and_add_frame (203-232)
  • close (241-246)
plugins/roboflow/vision_agents/plugins/roboflow/roboflow_local_processor.py (2)
  • stop_processing (197-202)
  • close (204-210)
plugins/ultralytics/vision_agents/plugins/ultralytics/yolo_pose_processor.py (2)
  • stop_processing (422-429)
  • close (431-435)
plugins/decart/vision_agents/plugins/decart/decart_restyling_processor.py (4)
agents-core/vision_agents/core/processors/base_processor.py (2)
  • stop_processing (76-81)
  • close (29-32)
plugins/moondream/vision_agents/plugins/moondream/detection/moondream_cloud_processor.py (2)
  • stop_processing (234-239)
  • close (241-246)
plugins/ultralytics/vision_agents/plugins/ultralytics/yolo_pose_processor.py (2)
  • stop_processing (422-429)
  • close (431-435)
plugins/openai/vision_agents/plugins/openai/rtc_manager.py (1)
  • close (307-321)
plugins/roboflow/vision_agents/plugins/roboflow/roboflow_local_processor.py (4)
agents-core/vision_agents/core/processors/base_processor.py (2)
  • stop_processing (76-81)
  • close (29-32)
plugins/moondream/vision_agents/plugins/moondream/detection/moondream_local_processor.py (2)
  • stop_processing (321-326)
  • close (328-337)
plugins/roboflow/vision_agents/plugins/roboflow/roboflow_cloud_processor.py (3)
  • stop_processing (202-207)
  • _process_frame (227-286)
  • close (209-215)
agents-core/vision_agents/core/utils/video_forwarder.py (1)
  • remove_frame_handler (87-107)
plugins/moondream/vision_agents/plugins/moondream/detection/moondream_cloud_processor.py (6)
agents-core/vision_agents/core/processors/base_processor.py (2)
  • stop_processing (76-81)
  • close (29-32)
plugins/moondream/vision_agents/plugins/moondream/detection/moondream_local_processor.py (3)
  • stop_processing (321-326)
  • _process_and_add_frame (294-319)
  • close (328-337)
plugins/roboflow/vision_agents/plugins/roboflow/roboflow_cloud_processor.py (2)
  • stop_processing (202-207)
  • close (209-215)
plugins/roboflow/vision_agents/plugins/roboflow/roboflow_local_processor.py (2)
  • stop_processing (197-202)
  • close (204-210)
plugins/ultralytics/vision_agents/plugins/ultralytics/yolo_pose_processor.py (2)
  • stop_processing (422-429)
  • close (431-435)
agents-core/vision_agents/core/agents/agents.py (1)
  • close (697-721)
plugins/ultralytics/tests/test_ultralytics.py (1)
plugins/ultralytics/vision_agents/plugins/ultralytics/yolo_pose_processor.py (2)
  • YOLOPoseProcessor (24-435)
  • close (431-435)
agents-core/vision_agents/core/agents/agents.py (4)
agents-core/vision_agents/core/llm/llm.py (1)
  • _stop_watching_video_track (444-446)
plugins/openai/vision_agents/plugins/openai/openai_realtime.py (1)
  • _stop_watching_video_track (375-377)
agents-core/vision_agents/core/llm/realtime.py (1)
  • _stop_watching_video_track (63-65)
agents-core/vision_agents/core/processors/base_processor.py (1)
  • stop_processing (76-81)
agents-core/vision_agents/core/processors/base_processor.py (6)
plugins/decart/vision_agents/plugins/decart/decart_restyling_processor.py (1)
  • stop_processing (271-276)
plugins/moondream/vision_agents/plugins/moondream/detection/moondream_cloud_processor.py (1)
  • stop_processing (234-239)
plugins/moondream/vision_agents/plugins/moondream/detection/moondream_local_processor.py (1)
  • stop_processing (321-326)
plugins/roboflow/vision_agents/plugins/roboflow/roboflow_cloud_processor.py (1)
  • stop_processing (202-207)
plugins/roboflow/vision_agents/plugins/roboflow/roboflow_local_processor.py (1)
  • stop_processing (197-202)
plugins/ultralytics/vision_agents/plugins/ultralytics/yolo_pose_processor.py (1)
  • stop_processing (422-429)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: unit / Mypy
  • GitHub Check: unit / Test "not integration"
  • GitHub Check: unit / Test "not integration"
  • GitHub Check: unit / Mypy
🔇 Additional comments (15)
agents-core/vision_agents/core/processors/base_processor.py (1)

76-81: LGTM! Well-designed lifecycle hook.

The stop_processing() method provides a clean lifecycle hook for processors to clean up resources when video tracks are removed. The default no-op implementation is appropriate for the base class and allows subclasses to override as needed.

plugins/ultralytics/tests/test_ultralytics.py (1)

37-37: LGTM! Async teardown properly handled.

The use of asyncio.get_event_loop().run_until_complete() correctly awaits the async close() method in the synchronous fixture teardown context.
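
A hedged approximation of what that fixture might look like (not a verbatim copy of the test; the import path is an assumption):

    import asyncio
    from typing import Iterator

    import pytest

    from vision_agents.plugins.ultralytics import YOLOPoseProcessor  # assumed import path

    @pytest.fixture
    def pose_processor() -> Iterator[YOLOPoseProcessor]:
        """Create and manage YOLOPoseProcessor lifecycle."""
        processor = YOLOPoseProcessor(device="cpu")
        yield processor
        # Synchronous teardown: drive the async close() to completion so the
        # frame handler and executor are released between tests.
        asyncio.get_event_loop().run_until_complete(processor.close())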

plugins/qwen/vision_agents/plugins/qwen/qwen_realtime.py (1)

204-204: LGTM! Observability enhancement.

The logging statement provides clear visibility into the video forwarding lifecycle, consistent with similar patterns in other realtime LLM implementations.

plugins/roboflow/vision_agents/plugins/roboflow/roboflow_cloud_processor.py (2)

202-208: LGTM! Clean implementation of the lifecycle hook.

The stop_processing() method correctly removes the frame handler, clears the forwarder reference, and logs the stop event. This implementation follows the established pattern across other processors in the codebase.


209-215: LGTM! Proper delegation to stop_processing.

The close() method correctly delegates cleanup to stop_processing() before shutting down the executor and video track, following the DRY principle while ensuring orderly resource cleanup.

plugins/ultralytics/vision_agents/plugins/ultralytics/yolo_pose_processor.py (2)

422-429: LGTM - Clean lifecycle hook implementation.

The stop_processing() method correctly detaches the frame handler from the video forwarder and clears the reference. The null-safety check and INFO-level logging align with the PR's standardization goals.


431-434: LGTM - Proper cleanup sequencing.

Calling stop_processing() before shutting down the executor ensures frame processing is halted before resources are torn down. This prevents potential race conditions during shutdown.

plugins/moondream/vision_agents/plugins/moondream/detection/moondream_cloud_processor.py (2)

234-239: LGTM - Consistent with the established pattern.

This implementation mirrors the approach in other processors (YOLO, Roboflow), ensuring uniform lifecycle management across the codebase. The null-check and logging are appropriate.


241-245: LGTM - Clean shutdown flow.

The sequence (shutdown flag → stop_processing → executor shutdown) ensures graceful termination without hanging tasks or resource leaks.

plugins/decart/vision_agents/plugins/decart/decart_restyling_processor.py (2)

271-276: LGTM - Correctly adapted for Decart's architecture.

Unlike other processors that use VideoForwarder frame handlers, this implementation appropriately calls _disconnect_from_decart() to terminate the RealtimeClient websocket connection. Clearing _current_track prevents reconnection attempts after the participant leaves.


278-290: LGTM - Comprehensive cleanup sequence.

The updated close() method now delegates to stop_processing() for initial teardown, then systematically cleans up video tracks, cancels tasks, and closes the Decart client. This ensures all resources are released in the correct order.

plugins/roboflow/vision_agents/plugins/roboflow/roboflow_local_processor.py (2)

197-202: LGTM - Consistent implementation.

The stop_processing() method follows the established pattern and correctly manages the VideoForwarder lifecycle. The implementation is identical to other processors, which promotes maintainability.


204-210: LGTM - Proper resource cleanup.

The updated close() method ensures video processing stops before marking the processor as closed and shutting down the executor. This prevents processing attempts on a closed processor.

plugins/moondream/vision_agents/plugins/moondream/detection/moondream_local_processor.py (2)

321-326: LGTM - Completes the consistent pattern across all processors.

This implementation matches the CloudDetectionProcessor and other video processors in the codebase. All processors now have a uniform stop_processing() hook for graceful participant departure handling.


328-337: LGTM - Complete cleanup with model memory management.

The close sequence (shutdown flag → stop_processing → executor shutdown → model cleanup) ensures all resources are properly released. Explicitly deleting the model reference is particularly important for freeing GPU/memory resources.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
plugins/moondream/vision_agents/plugins/moondream/vlm/moondream_local_vlm.py (1)

353-358: Potential issue: Calling stop() may incorrectly terminate shared forwarders.

When shared_forwarder is provided to watch_video_track, only a frame handler is added (lines 194-196). However, stop_watching_video_track calls stop() unconditionally, which will terminate the entire forwarder—affecting all other consumers that share it.

Other implementations (Gemini, Qwen) correctly use remove_frame_handler() when using shared forwarders. Consider differentiating between owned and shared forwarders:

Proposed fix
     async def stop_watching_video_track(self) -> None:
         """Stop video forwarding."""
         if self._video_forwarder is not None:
-            await self._video_forwarder.stop()
+            # Only remove our handler if using shared forwarder; stop() if we own it
+            await self._video_forwarder.remove_frame_handler(self._on_frame_received)
             self._video_forwarder = None
             logger.info("Stopped video forwarding")

Note: VideoForwarder.remove_frame_handler() already calls stop() automatically when no handlers remain (see video_forwarder.py lines 103-105), so this approach handles both cases correctly.
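
For context, a hedged approximation of the forwarder behaviour that note relies on (the real implementation is in agents-core/vision_agents/core/utils/video_forwarder.py, lines 87-107; this sketch only illustrates why remove_frame_handler() is safe for both owned and shared forwarders):

    class VideoForwarderSketch:
        def __init__(self) -> None:
            self._frame_handlers = []

        async def remove_frame_handler(self, handler) -> None:
            if handler in self._frame_handlers:
                self._frame_handlers.remove(handler)
            if not self._frame_handlers:
                # Last consumer gone: stop the forwarding loop entirely.
                await self.stop()

        async def stop(self) -> None:
            ...  # cancel the forwarding task and release the source track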

plugins/moondream/vision_agents/plugins/moondream/vlm/moondream_cloud_vlm.py (1)

227-232: Same shared forwarder concern as LocalVLM.

This implementation has the same potential issue: calling stop() on a shared forwarder will terminate it for all consumers. Consider using remove_frame_handler() instead:

Proposed fix
     async def stop_watching_video_track(self) -> None:
         """Stop video forwarding."""
         if self._video_forwarder is not None:
-            await self._video_forwarder.stop()
+            await self._video_forwarder.remove_frame_handler(self._on_frame_received)
             self._video_forwarder = None
             logger.info("Stopped video forwarding")
🤖 Fix all issues with AI agents
In @plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py:
- Around line 243-246: The conditional block checking self._video_forwarder
after await self.stop_watching_video_track() is dead code because
stop_watching_video_track() nulls self._video_forwarder; remove the if
self._video_forwarder is not None: await
self._video_forwarder.remove_frame_handler(self._send_video_frame) lines from
the caller, or alternatively move the
remove_frame_handler(self._send_video_frame) call into
stop_watching_video_track() before it sets self._video_forwarder to None so that
_send_video_frame is deregistered while the forwarder still exists.

In @plugins/huggingface/vision_agents/plugins/huggingface/huggingface_vlm.py:
- Around line 216-222: The method stop_watching_video_track currently calls the
async method remove_frame_handler synchronously; change
stop_watching_video_track to async def and await
self._video_forwarder.remove_frame_handler(self._frame_buffer.append) (then set
self._video_forwarder = None and log) so the handler is actually removed; ensure
you preserve the None-check on self._video_forwarder and that the await happens
before clearing the forwarder and logging.

In @plugins/nvidia/vision_agents/plugins/nvidia/nvidia_vlm.py:
- Around line 373-379: stop_watching_video_track is defined synchronously but
must be async and must await the async remove_frame_handler coroutine; change
the method signature to async def stop_watching_video_track(self) -> None, call
await self._video_forwarder.remove_frame_handler(self._frame_buffer.append),
then set self._video_forwarder = None and emit the logger.info message afterward
so the handler is actually removed; ensure the method matches the base class
VideoLLM.stop_watching_video_track signature.

In
@plugins/openai/vision_agents/plugins/openai/chat_completions/chat_completions_vlm.py:
- Around line 243-249: stop_watching_video_track is currently synchronous but
calls the async VideoForwarder.remove_frame_handler without awaiting, leaving
the coroutine unexecuted; change the method signature to async def
stop_watching_video_track(self) -> None and await
self._video_forwarder.remove_frame_handler(self._frame_buffer.append) (then set
self._video_forwarder = None and log as before) so the handler is actually
removed; mirror the pattern used in Qwen/Gemini implementations to ensure proper
cleanup.

In @plugins/qwen/vision_agents/plugins/qwen/qwen_realtime.py:
- Around line 158-160: The block that redundantly calls
_video_forwarder.remove_frame_handler(self._send_video_frame) should be removed
because stop_watching_video_track() already performs that cleanup when
_video_forwarder exists; locate the duplicate call after await
self.stop_watching_video_track() and delete the if self._video_forwarder is not
None: await self._video_forwarder.remove_frame_handler(self._send_video_frame)
lines, leaving stop_watching_video_track() as the single source of teardown for
the frame handler in the qwen_realtime.py class.
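
The HuggingFace, NVIDIA, and chat-completions prompts above converge on the same shape; a hedged sketch of that async teardown (attribute names such as _frame_buffer follow the prompts, everything else is an assumption):

    import logging

    logger = logging.getLogger(__name__)

    class VLMSketch:
        """Sketch of the async stop_watching_video_track() the prompts describe."""

        def __init__(self) -> None:
            self._video_forwarder = None
            self._frame_buffer = []  # frames appended by the registered handler

        async def stop_watching_video_track(self) -> None:
            if self._video_forwarder is not None:
                # Await the coroutine so the handler is actually removed, then
                # clear the reference and log, in that order.
                await self._video_forwarder.remove_frame_handler(self._frame_buffer.append)
                self._video_forwarder = None
                logger.info("Stopped watching video track")
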
🧹 Nitpick comments (2)
agents-core/vision_agents/core/processors/base_processor.py (1)

76-82: Remove the redundant pass statement from the abstract method.

The pass statement serves no purpose in an abstract method—it will never execute since subclasses must provide their own implementation.

♻️ Suggested fix
     @abc.abstractmethod
     async def stop_processing(self) -> None:
         """
         Stop processing video. Called when all video tracks are removed.
         Override this to clean up frame handlers and stop output tracks.
         """
-        pass
plugins/qwen/vision_agents/plugins/qwen/qwen_realtime.py (1)

201-204: Consider resetting _video_forwarder to None for consistency.

Unlike the Gemini implementation (which sets self._video_forwarder = None after removal), this method leaves the forwarder reference intact. While watch_video_track() reassigns the forwarder anyway, clearing the reference would align with other VLM implementations and prevent potential stale state if stop_watching_video_track() is called multiple times.

♻️ Suggested enhancement
     async def stop_watching_video_track(self) -> None:
         if self._video_forwarder is not None:
             await self._video_forwarder.remove_frame_handler(self._send_video_frame)
             logger.info("🛑 Stopped video forwarding to Qwen (participant left)")
+            self._video_forwarder = None
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between d61fddf and c0e3ed1.

📒 Files selected for processing (21)
  • agents-core/vision_agents/core/agents/agents.py
  • agents-core/vision_agents/core/llm/llm.py
  • agents-core/vision_agents/core/llm/realtime.py
  • agents-core/vision_agents/core/processors/base_processor.py
  • examples/01_simple_agent_example/simple_agent_example.py
  • plugins/gemini/tests/test_gemini_realtime.py
  • plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py
  • plugins/huggingface/vision_agents/plugins/huggingface/huggingface_vlm.py
  • plugins/moondream/vision_agents/plugins/moondream/detection/moondream_cloud_processor.py
  • plugins/moondream/vision_agents/plugins/moondream/detection/moondream_local_processor.py
  • plugins/moondream/vision_agents/plugins/moondream/vlm/moondream_cloud_vlm.py
  • plugins/moondream/vision_agents/plugins/moondream/vlm/moondream_local_vlm.py
  • plugins/nvidia/vision_agents/plugins/nvidia/nvidia_vlm.py
  • plugins/openai/tests/test_chat_completions.py
  • plugins/openai/tests/test_openai_realtime.py
  • plugins/openai/vision_agents/plugins/openai/chat_completions/chat_completions_vlm.py
  • plugins/openai/vision_agents/plugins/openai/openai_realtime.py
  • plugins/qwen/tests/test_qwen_realtime.py
  • plugins/qwen/vision_agents/plugins/qwen/qwen_realtime.py
  • plugins/ultralytics/tests/test_ultralytics.py
  • tests/test_agent_tracks.py
🚧 Files skipped from review as they are similar to previous changes (3)
  • plugins/ultralytics/tests/test_ultralytics.py
  • plugins/openai/vision_agents/plugins/openai/openai_realtime.py
  • agents-core/vision_agents/core/agents/agents.py
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit inference engine (.cursor/rules/python.mdc)

**/*.py: Never adjust sys.path in Python code
Never write except Exception as e - use specific exception handling
Avoid using getattr, hasattr, delattr and setattr; prefer normal attribute access in Python
Docstrings should follow the Google style guide for docstrings

Files:

  • plugins/openai/vision_agents/plugins/openai/chat_completions/chat_completions_vlm.py
  • plugins/moondream/vision_agents/plugins/moondream/detection/moondream_cloud_processor.py
  • tests/test_agent_tracks.py
  • plugins/qwen/tests/test_qwen_realtime.py
  • agents-core/vision_agents/core/processors/base_processor.py
  • examples/01_simple_agent_example/simple_agent_example.py
  • plugins/openai/tests/test_openai_realtime.py
  • plugins/moondream/vision_agents/plugins/moondream/vlm/moondream_local_vlm.py
  • plugins/openai/tests/test_chat_completions.py
  • plugins/huggingface/vision_agents/plugins/huggingface/huggingface_vlm.py
  • plugins/moondream/vision_agents/plugins/moondream/detection/moondream_local_processor.py
  • plugins/moondream/vision_agents/plugins/moondream/vlm/moondream_cloud_vlm.py
  • plugins/nvidia/vision_agents/plugins/nvidia/nvidia_vlm.py
  • plugins/qwen/vision_agents/plugins/qwen/qwen_realtime.py
  • plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py
  • agents-core/vision_agents/core/llm/llm.py
  • plugins/gemini/tests/test_gemini_realtime.py
  • agents-core/vision_agents/core/llm/realtime.py
**/*test*.py

📄 CodeRabbit inference engine (.cursor/rules/python.mdc)

**/*test*.py: Never mock in tests; use pytest for testing
Mark integration tests with @pytest.mark.integration decorator
@pytest.mark.asyncio is not needed - it is automatic

Files:

  • tests/test_agent_tracks.py
  • plugins/qwen/tests/test_qwen_realtime.py
  • plugins/openai/tests/test_openai_realtime.py
  • plugins/openai/tests/test_chat_completions.py
  • plugins/gemini/tests/test_gemini_realtime.py
🧠 Learnings (2)
📓 Common learnings
Learnt from: dangusev
Repo: GetStream/Vision-Agents PR: 249
File: agents-core/vision_agents/core/agents/agents.py:1032-1042
Timestamp: 2025-12-10T19:35:39.238Z
Learning: In `agents-core/vision_agents/core/agents/agents.py`, when using `video_track_override_path`, creating a new `VideoFileTrack` for each participant (each call to `_on_track_added`) is intentional to maintain proper track lifecycle semantics tied to each participant.
📚 Learning: 2025-12-10T19:35:39.238Z
Learnt from: dangusev
Repo: GetStream/Vision-Agents PR: 249
File: agents-core/vision_agents/core/agents/agents.py:1032-1042
Timestamp: 2025-12-10T19:35:39.238Z
Learning: In `agents-core/vision_agents/core/agents/agents.py`, when using `video_track_override_path`, creating a new `VideoFileTrack` for each participant (each call to `_on_track_added`) is intentional to maintain proper track lifecycle semantics tied to each participant.

Applied to files:

  • tests/test_agent_tracks.py
  • agents-core/vision_agents/core/processors/base_processor.py
  • plugins/huggingface/vision_agents/plugins/huggingface/huggingface_vlm.py
  • plugins/qwen/vision_agents/plugins/qwen/qwen_realtime.py
  • plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py
  • agents-core/vision_agents/core/llm/llm.py
🧬 Code graph analysis (10)
plugins/openai/vision_agents/plugins/openai/chat_completions/chat_completions_vlm.py (4)
plugins/huggingface/vision_agents/plugins/huggingface/huggingface_vlm.py (1)
  • stop_watching_video_track (216-222)
plugins/nvidia/vision_agents/plugins/nvidia/nvidia_vlm.py (1)
  • stop_watching_video_track (373-379)
plugins/qwen/vision_agents/plugins/qwen/qwen_realtime.py (1)
  • stop_watching_video_track (201-204)
agents-core/vision_agents/core/utils/video_forwarder.py (1)
  • remove_frame_handler (87-107)
tests/test_agent_tracks.py (4)
agents-core/vision_agents/core/processors/base_processor.py (1)
  • stop_processing (77-82)
agents-core/vision_agents/core/llm/llm.py (1)
  • stop_watching_video_track (445-447)
agents-core/vision_agents/core/llm/realtime.py (1)
  • stop_watching_video_track (63-65)
plugins/openai/vision_agents/plugins/openai/openai_realtime.py (1)
  • stop_watching_video_track (375-377)
plugins/qwen/tests/test_qwen_realtime.py (11)
agents-core/vision_agents/core/llm/llm.py (1)
  • stop_watching_video_track (445-447)
agents-core/vision_agents/core/llm/realtime.py (1)
  • stop_watching_video_track (63-65)
plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (1)
  • stop_watching_video_track (278-282)
plugins/huggingface/vision_agents/plugins/huggingface/huggingface_vlm.py (1)
  • stop_watching_video_track (216-222)
plugins/moondream/vision_agents/plugins/moondream/vlm/moondream_cloud_vlm.py (1)
  • stop_watching_video_track (227-232)
plugins/moondream/vision_agents/plugins/moondream/vlm/moondream_local_vlm.py (1)
  • stop_watching_video_track (353-358)
plugins/nvidia/vision_agents/plugins/nvidia/nvidia_vlm.py (1)
  • stop_watching_video_track (373-379)
plugins/openai/vision_agents/plugins/openai/chat_completions/chat_completions_vlm.py (1)
  • stop_watching_video_track (243-249)
plugins/openai/vision_agents/plugins/openai/openai_realtime.py (1)
  • stop_watching_video_track (375-377)
plugins/qwen/vision_agents/plugins/qwen/qwen_realtime.py (1)
  • stop_watching_video_track (201-204)
tests/test_agent_tracks.py (1)
  • stop_watching_video_track (81-82)
plugins/openai/tests/test_openai_realtime.py (5)
agents-core/vision_agents/core/llm/llm.py (1)
  • stop_watching_video_track (445-447)
agents-core/vision_agents/core/llm/realtime.py (1)
  • stop_watching_video_track (63-65)
plugins/openai/vision_agents/plugins/openai/chat_completions/chat_completions_vlm.py (1)
  • stop_watching_video_track (243-249)
plugins/openai/vision_agents/plugins/openai/openai_realtime.py (1)
  • stop_watching_video_track (375-377)
tests/test_agent_tracks.py (1)
  • stop_watching_video_track (81-82)
plugins/moondream/vision_agents/plugins/moondream/vlm/moondream_local_vlm.py (9)
agents-core/vision_agents/core/llm/realtime.py (1)
  • stop_watching_video_track (63-65)
plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (1)
  • stop_watching_video_track (278-282)
plugins/huggingface/vision_agents/plugins/huggingface/huggingface_vlm.py (1)
  • stop_watching_video_track (216-222)
plugins/moondream/vision_agents/plugins/moondream/vlm/moondream_cloud_vlm.py (1)
  • stop_watching_video_track (227-232)
plugins/nvidia/vision_agents/plugins/nvidia/nvidia_vlm.py (1)
  • stop_watching_video_track (373-379)
plugins/openai/vision_agents/plugins/openai/chat_completions/chat_completions_vlm.py (1)
  • stop_watching_video_track (243-249)
plugins/openai/vision_agents/plugins/openai/openai_realtime.py (1)
  • stop_watching_video_track (375-377)
plugins/qwen/vision_agents/plugins/qwen/qwen_realtime.py (1)
  • stop_watching_video_track (201-204)
tests/test_agent_tracks.py (1)
  • stop_watching_video_track (81-82)
plugins/openai/tests/test_chat_completions.py (10)
agents-core/vision_agents/core/llm/realtime.py (1)
  • stop_watching_video_track (63-65)
plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (1)
  • stop_watching_video_track (278-282)
plugins/huggingface/vision_agents/plugins/huggingface/huggingface_vlm.py (1)
  • stop_watching_video_track (216-222)
plugins/moondream/vision_agents/plugins/moondream/vlm/moondream_cloud_vlm.py (1)
  • stop_watching_video_track (227-232)
plugins/moondream/vision_agents/plugins/moondream/vlm/moondream_local_vlm.py (1)
  • stop_watching_video_track (353-358)
plugins/nvidia/vision_agents/plugins/nvidia/nvidia_vlm.py (1)
  • stop_watching_video_track (373-379)
plugins/openai/vision_agents/plugins/openai/chat_completions/chat_completions_vlm.py (1)
  • stop_watching_video_track (243-249)
plugins/openai/vision_agents/plugins/openai/openai_realtime.py (1)
  • stop_watching_video_track (375-377)
plugins/qwen/vision_agents/plugins/qwen/qwen_realtime.py (1)
  • stop_watching_video_track (201-204)
tests/test_agent_tracks.py (1)
  • stop_watching_video_track (81-82)
plugins/moondream/vision_agents/plugins/moondream/detection/moondream_local_processor.py (9)
agents-core/vision_agents/core/processors/base_processor.py (2)
  • stop_processing (77-82)
  • close (29-32)
plugins/decart/vision_agents/plugins/decart/decart_restyling_processor.py (2)
  • stop_processing (271-276)
  • close (278-291)
plugins/roboflow/vision_agents/plugins/roboflow/roboflow_cloud_processor.py (2)
  • stop_processing (202-207)
  • close (209-215)
plugins/roboflow/vision_agents/plugins/roboflow/roboflow_local_processor.py (2)
  • stop_processing (197-202)
  • close (204-210)
plugins/ultralytics/vision_agents/plugins/ultralytics/yolo_pose_processor.py (2)
  • stop_processing (422-429)
  • close (431-435)
agents-core/vision_agents/core/utils/video_forwarder.py (1)
  • remove_frame_handler (87-107)
plugins/moondream/vision_agents/plugins/moondream/vlm/moondream_cloud_vlm.py (1)
  • close (248-253)
plugins/qwen/vision_agents/plugins/qwen/qwen_realtime.py (1)
  • close (130-141)
plugins/openai/vision_agents/plugins/openai/rtc_manager.py (1)
  • close (307-321)
plugins/nvidia/vision_agents/plugins/nvidia/nvidia_vlm.py (8)
agents-core/vision_agents/core/llm/llm.py (1)
  • stop_watching_video_track (445-447)
plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (1)
  • stop_watching_video_track (278-282)
plugins/huggingface/vision_agents/plugins/huggingface/huggingface_vlm.py (1)
  • stop_watching_video_track (216-222)
plugins/moondream/vision_agents/plugins/moondream/vlm/moondream_cloud_vlm.py (1)
  • stop_watching_video_track (227-232)
plugins/moondream/vision_agents/plugins/moondream/vlm/moondream_local_vlm.py (1)
  • stop_watching_video_track (353-358)
plugins/openai/vision_agents/plugins/openai/chat_completions/chat_completions_vlm.py (1)
  • stop_watching_video_track (243-249)
plugins/qwen/vision_agents/plugins/qwen/qwen_realtime.py (1)
  • stop_watching_video_track (201-204)
agents-core/vision_agents/core/utils/video_forwarder.py (1)
  • remove_frame_handler (87-107)
plugins/gemini/tests/test_gemini_realtime.py (9)
plugins/aws/tests/test_aws_realtime.py (2)
  • realtime (25-41)
  • realtime (152-166)
agents-core/vision_agents/core/llm/llm.py (1)
  • stop_watching_video_track (445-447)
agents-core/vision_agents/core/llm/realtime.py (1)
  • stop_watching_video_track (63-65)
plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (1)
  • stop_watching_video_track (278-282)
plugins/moondream/vision_agents/plugins/moondream/vlm/moondream_cloud_vlm.py (1)
  • stop_watching_video_track (227-232)
plugins/moondream/vision_agents/plugins/moondream/vlm/moondream_local_vlm.py (1)
  • stop_watching_video_track (353-358)
plugins/openai/vision_agents/plugins/openai/chat_completions/chat_completions_vlm.py (1)
  • stop_watching_video_track (243-249)
plugins/openai/vision_agents/plugins/openai/openai_realtime.py (1)
  • stop_watching_video_track (375-377)
plugins/qwen/vision_agents/plugins/qwen/qwen_realtime.py (1)
  • stop_watching_video_track (201-204)
agents-core/vision_agents/core/llm/realtime.py (10)
agents-core/vision_agents/core/llm/llm.py (1)
  • stop_watching_video_track (445-447)
plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (1)
  • stop_watching_video_track (278-282)
plugins/huggingface/vision_agents/plugins/huggingface/huggingface_vlm.py (1)
  • stop_watching_video_track (216-222)
plugins/moondream/vision_agents/plugins/moondream/vlm/moondream_cloud_vlm.py (1)
  • stop_watching_video_track (227-232)
plugins/moondream/vision_agents/plugins/moondream/vlm/moondream_local_vlm.py (1)
  • stop_watching_video_track (353-358)
plugins/nvidia/vision_agents/plugins/nvidia/nvidia_vlm.py (1)
  • stop_watching_video_track (373-379)
plugins/openai/vision_agents/plugins/openai/chat_completions/chat_completions_vlm.py (1)
  • stop_watching_video_track (243-249)
plugins/openai/vision_agents/plugins/openai/openai_realtime.py (1)
  • stop_watching_video_track (375-377)
plugins/qwen/vision_agents/plugins/qwen/qwen_realtime.py (1)
  • stop_watching_video_track (201-204)
tests/test_agent_tracks.py (1)
  • stop_watching_video_track (81-82)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: unit / Validate extra dependencies in "agents-core/pyproject.toml"
  • GitHub Check: unit / Ruff
  • GitHub Check: unit / Test "not integration"
  • GitHub Check: unit / Mypy
  • GitHub Check: unit / Validate extra dependencies in "agents-core/pyproject.toml"
  • GitHub Check: unit / Ruff
  • GitHub Check: unit / Test "not integration"
  • GitHub Check: unit / Mypy
🔇 Additional comments (19)
examples/01_simple_agent_example/simple_agent_example.py (3)

8-15: LGTM!

The expanded import structure is clean and well-organized. The addition of openai and roboflow plugins aligns with the new processor configuration and commented alternative below.


35-36: LGTM!

The commented alternative provides useful guidance for users exploring different LLM options. The reference to 02_golf_coach_example in the docstring above helps direct users to a more complete realtime implementation.


42-44: The processor configuration appropriately demonstrates the new lifecycle capabilities with a clean zero-config instantiation.

The inline comment helpfully clarifies the processor's purpose, making this example accessible to users learning about processor lifecycle and stop_processing hooks.

plugins/qwen/tests/test_qwen_realtime.py (1)

87-87: LGTM!

The test correctly adopts the new public stop_watching_video_track() API and properly awaits the async call. This aligns with the Qwen implementation at qwen_realtime.py:200-203.

plugins/openai/tests/test_chat_completions.py (1)

84-85: Update needed if implementation becomes async.

The call to stop_watching_video_track() is not awaited. If the implementation is corrected to be async (as suggested in chat_completions_vlm.py), this test will need to be updated:

-        vlm.stop_watching_video_track()
+        await vlm.stop_watching_video_track()
agents-core/vision_agents/core/llm/realtime.py (1)

63-65: LGTM!

The promotion from private to public API is appropriate. The default no-op implementation allows providers that don't support video input to inherit without overriding, while those that do (Qwen, Gemini, OpenAI) provide their cleanup logic.

Minor: return None is implicit in Python and could be removed, but this is purely stylistic.

tests/test_agent_tracks.py (2)

65-66: LGTM!

The mock stub correctly matches the async signature from VideoProcessor.stop_processing.


81-82: LGTM!

The mock stub correctly matches the async signature from VideoLLM.stop_watching_video_track.

plugins/moondream/vision_agents/plugins/moondream/detection/moondream_cloud_processor.py (1)

234-248: LGTM!

The stop_processing method correctly awaits the async remove_frame_handler, clears the forwarder reference, and integrates cleanly with close(). This is the proper pattern—other plugins (NvidiaVLM, HuggingFaceVLM) should follow this async approach.
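To make the pattern concrete, here is a hedged sketch of its general shape; the class and handler names are placeholders, and only remove_frame_handler is taken from the VideoForwarder API referenced elsewhere in this review:

import logging

logger = logging.getLogger(__name__)


class ProcessorSketch:
    """Illustrative processor with an explicit stop_processing lifecycle hook."""

    def __init__(self) -> None:
        self._video_forwarder = None  # assigned when a video track is watched

    async def _on_frame(self, frame) -> None:
        ...  # per-frame work (detection, annotation, etc.)

    async def stop_processing(self) -> None:
        # Detach only this processor's handler so a shared forwarder keeps
        # serving other consumers, then drop the reference to avoid stale state.
        if self._video_forwarder is not None:
            await self._video_forwarder.remove_frame_handler(self._on_frame)
            self._video_forwarder = None
            logger.info("🛑 Stopped video processing (participant left)")

    async def close(self) -> None:
        # Stop frame handling before releasing heavier resources
        # (executors, models, HTTP clients).
        await self.stop_processing()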

plugins/openai/tests/test_openai_realtime.py (1)

113-113: LGTM!

Test updated to call the now-public stop_watching_video_track() API, aligning with the promoted method in Realtime.

agents-core/vision_agents/core/llm/llm.py (1)

444-448: LGTM! Abstract lifecycle hook for video track cleanup.

The addition of stop_watching_video_track as an abstract method enforces a consistent teardown API across all VideoLLM implementations. This aligns well with the PR objective of ensuring proper cleanup when participants leave.

Minor note: The pass statement is technically redundant here; the @abc.abstractmethod decorator already marks the method as requiring an override, so a docstring or ellipsis body would suffice. It doesn't cause any issues, though.
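As a reference point, a small sketch of such an abstract hook using the standard abc machinery (not the literal contents of llm.py):

import abc


class VideoLLMSketch(abc.ABC):
    """Illustrative stand-in for the video-capable LLM base class."""

    @abc.abstractmethod
    async def stop_watching_video_track(self) -> None:
        """Stop forwarding video frames to this LLM.

        Implementations detach their frame handler from the VideoForwarder so
        no further frames are delivered after a participant leaves.
        """
        ...  # the decorator alone is enough; a trailing pass adds nothing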

plugins/moondream/vision_agents/plugins/moondream/vlm/moondream_local_vlm.py (1)

186-186: LGTM! Call site updated to use public method.

plugins/moondream/vision_agents/plugins/moondream/detection/moondream_local_processor.py (2)

321-330: LGTM! Correct implementation of stop_processing lifecycle hook.

The method properly uses remove_frame_handler() to detach this processor's callback without affecting other consumers of a shared forwarder. This aligns with the pattern used in other processors (Roboflow, Ultralytics, Decart).


335-335: LGTM! close() now properly calls stop_processing() before cleanup.

This ensures frame handling is stopped before the executor is shut down and model resources are released.

plugins/moondream/vision_agents/plugins/moondream/vlm/moondream_cloud_vlm.py (1)

76-76: LGTM! Call site updated to use public method.

plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (2)

278-282: LGTM! Correct implementation using remove_frame_handler().

This properly removes only this component's handler, allowing shared forwarders to continue serving other consumers. The log message clearly indicates the cleanup reason.


315-315: LGTM! close() now properly calls stop_watching_video_track() for cleanup.

plugins/gemini/tests/test_gemini_realtime.py (1)

94-94: LGTM!

The method call now correctly references the public stop_watching_video_track() API, aligning with the broader refactor to expose cleanup methods publicly across all LLM implementations.

plugins/qwen/vision_agents/plugins/qwen/qwen_realtime.py (1)

130-141: LGTM!

The close() method now properly invokes the public stop_watching_video_track() for cleanup before shutting down the processing task and client.
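For illustration, a sketch of that teardown ordering; the attribute names (_video_forwarder, _processing_task, _client) are assumptions for the example rather than the exact Qwen fields:

import asyncio
from typing import Optional


class RealtimeCloseSketch:
    """Illustrates the teardown ordering only."""

    def __init__(self) -> None:
        self._video_forwarder = None
        self._processing_task: Optional[asyncio.Task] = None
        self._client = None

    async def _on_frame(self, frame) -> None:
        ...

    async def stop_watching_video_track(self) -> None:
        if self._video_forwarder is not None:
            await self._video_forwarder.remove_frame_handler(self._on_frame)
            self._video_forwarder = None

    async def close(self) -> None:
        # 1. Detach video forwarding so no frames arrive mid-teardown.
        await self.stop_watching_video_track()
        # 2. Cancel the background processing task.
        if self._processing_task is not None:
            self._processing_task.cancel()
        # 3. Close the underlying realtime client.
        if self._client is not None:
            await self._client.close()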

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (3)
plugins/huggingface/vision_agents/plugins/huggingface/huggingface_vlm.py (1)

216-222: Implementation looks correct; consider adding a docstring for consistency.

The method correctly mirrors the pattern used in other VLM plugins (nvidia_vlm, chat_completions_vlm). It properly awaits the async remove_frame_handler call and clears the forwarder reference.

For consistency with watch_video_track (which has a docstring) and per the coding guidelines requiring Google-style docstrings, consider adding one here.

📝 Suggested docstring addition
     async def stop_watching_video_track(self) -> None:
+        """
+        Stop video forwarding and remove the frame handler.
+
+        Called when a participant leaves the call to ensure no more frames
+        are sent to the LLM.
+        """
         if self._video_forwarder is not None:
             await self._video_forwarder.remove_frame_handler(self._frame_buffer.append)
             self._video_forwarder = None
             logger.info(
                 f"🛑 Stopped video forwarding to {PLUGIN_NAME} (participant left)"
             )
plugins/nvidia/vision_agents/plugins/nvidia/nvidia_vlm.py (2)

373-379: Add a Google-style docstring to this new public method.

The method is clear, yet a brief docstring would align with the coding guidelines and match the documentation style of watch_video_track above.

📝 Suggested docstring
 async def stop_watching_video_track(self) -> None:
+    """
+    Stop forwarding video frames to this plugin.
+
+    Removes the frame handler from the current VideoForwarder and clears
+    the forwarder reference. Called when a participant leaves the call.
+    """
     if self._video_forwarder is not None:

373-379: Call stop_watching_video_track() from close() for complete cleanup.

The close() method tears down the HTTP client but does not remove the frame handler from the video forwarder. If close() is called without a prior stop_watching_video_track(), the handler may remain registered, potentially causing issues if the forwarder outlives this instance. Other realtime implementations like Qwen and Gemini follow this pattern in their close() methods.

♻️ Proposed change in close()
 async def close(self) -> None:
     """Close the HTTP client if we own it."""
+    await self.stop_watching_video_track()
     if self._own_client and self._client:
         await self._client.close()
         self._client = None
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 93d07a7 and baf8abf.

📒 Files selected for processing (3)
  • plugins/huggingface/vision_agents/plugins/huggingface/huggingface_vlm.py
  • plugins/nvidia/vision_agents/plugins/nvidia/nvidia_vlm.py
  • plugins/openai/vision_agents/plugins/openai/chat_completions/chat_completions_vlm.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (.cursor/rules/python.mdc)

**/*.py: Never adjust sys.path in Python code
Never write except Exception as e - use specific exception handling
Avoid using getattr, hasattr, delattr and setattr; prefer normal attribute access in Python
Docstrings should follow the Google style guide for docstrings

Files:

  • plugins/huggingface/vision_agents/plugins/huggingface/huggingface_vlm.py
  • plugins/nvidia/vision_agents/plugins/nvidia/nvidia_vlm.py
  • plugins/openai/vision_agents/plugins/openai/chat_completions/chat_completions_vlm.py
🧠 Learnings (2)
📓 Common learnings
Learnt from: dangusev
Repo: GetStream/Vision-Agents PR: 249
File: agents-core/vision_agents/core/agents/agents.py:1032-1042
Timestamp: 2025-12-10T19:35:39.238Z
Learning: In `agents-core/vision_agents/core/agents/agents.py`, when using `video_track_override_path`, creating a new `VideoFileTrack` for each participant (each call to `_on_track_added`) is intentional to maintain proper track lifecycle semantics tied to each participant.
📚 Learning: 2025-12-10T19:35:39.238Z
Learnt from: dangusev
Repo: GetStream/Vision-Agents PR: 249
File: agents-core/vision_agents/core/agents/agents.py:1032-1042
Timestamp: 2025-12-10T19:35:39.238Z
Learning: In `agents-core/vision_agents/core/agents/agents.py`, when using `video_track_override_path`, creating a new `VideoFileTrack` for each participant (each call to `_on_track_added`) is intentional to maintain proper track lifecycle semantics tied to each participant.

Applied to files:

  • plugins/huggingface/vision_agents/plugins/huggingface/huggingface_vlm.py
  • plugins/openai/vision_agents/plugins/openai/chat_completions/chat_completions_vlm.py
🧬 Code graph analysis (2)
plugins/huggingface/vision_agents/plugins/huggingface/huggingface_vlm.py (4)
plugins/nvidia/vision_agents/plugins/nvidia/nvidia_vlm.py (1)
  • stop_watching_video_track (373-379)
plugins/openai/vision_agents/plugins/openai/chat_completions/chat_completions_vlm.py (1)
  • stop_watching_video_track (243-249)
plugins/qwen/vision_agents/plugins/qwen/qwen_realtime.py (1)
  • stop_watching_video_track (201-204)
agents-core/vision_agents/core/utils/video_forwarder.py (1)
  • remove_frame_handler (87-107)
plugins/openai/vision_agents/plugins/openai/chat_completions/chat_completions_vlm.py (9)
plugins/huggingface/vision_agents/plugins/huggingface/huggingface_vlm.py (1)
  • stop_watching_video_track (216-222)
plugins/nvidia/vision_agents/plugins/nvidia/nvidia_vlm.py (1)
  • stop_watching_video_track (373-379)
agents-core/vision_agents/core/llm/realtime.py (1)
  • stop_watching_video_track (63-65)
plugins/moondream/vision_agents/plugins/moondream/vlm/moondream_cloud_vlm.py (1)
  • stop_watching_video_track (227-232)
plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (1)
  • stop_watching_video_track (278-282)
plugins/moondream/vision_agents/plugins/moondream/vlm/moondream_local_vlm.py (1)
  • stop_watching_video_track (353-358)
plugins/openai/vision_agents/plugins/openai/openai_realtime.py (1)
  • stop_watching_video_track (375-377)
plugins/qwen/vision_agents/plugins/qwen/qwen_realtime.py (1)
  • stop_watching_video_track (201-204)
agents-core/vision_agents/core/utils/video_forwarder.py (1)
  • remove_frame_handler (87-107)
🔇 Additional comments (2)
plugins/openai/vision_agents/plugins/openai/chat_completions/chat_completions_vlm.py (1)

243-249: LGTM!

The new stop_watching_video_track method mirrors the established pattern across sibling plugins (HuggingFace, NVIDIA). The approach correctly:

  • Guards against null forwarder before cleanup
  • Removes only this VLM's handler (preserving shared forwarder semantics)
  • Clears the reference to avoid stale state
  • Logs the teardown for observability

The remove_frame_handler call is apt here rather than stop(), since the forwarder may be shared and will auto-stop when its last handler departs.

plugins/nvidia/vision_agents/plugins/nvidia/nvidia_vlm.py (1)

375-375: The asymmetry is intentional and correct. remove_frame_handler is indeed async (line 87 in agents-core/vision_agents/core/utils/video_forwarder.py), while add_frame_handler is synchronous. This design is justified: remove_frame_handler awaits self.stop() when all handlers are removed, requiring it to be async, while add_frame_handler only appends to a list and calls self.start(). The await at line 375 is necessary and correct.
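A toy model of that asymmetry may help readers who have not opened video_forwarder.py; this is illustrative only and far simpler than the real class:

from typing import Any, Callable, List

FrameHandler = Callable[[Any], Any]


class ForwarderSketch:
    """Toy model of the add/remove asymmetry, not the real VideoForwarder."""

    def __init__(self) -> None:
        self._handlers: List[FrameHandler] = []
        self._running = False

    def add_frame_handler(self, handler: FrameHandler) -> None:
        # Synchronous: only appends and makes sure forwarding is running.
        self._handlers.append(handler)
        self.start()

    async def remove_frame_handler(self, handler: FrameHandler) -> None:
        # Asynchronous: removing the last handler awaits stop(), which is why
        # every caller (VLMs, processors, realtime LLMs) must await this call.
        if handler in self._handlers:
            self._handlers.remove(handler)
        if not self._handlers:
            await self.stop()

    def start(self) -> None:
        self._running = True

    async def stop(self) -> None:
        self._running = False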

@dangusev dangusev merged commit d34f3d7 into main Jan 12, 2026
10 checks passed
@dangusev dangusev deleted the fix/video-frames branch January 12, 2026 13:08