Conversation
- Added `Warmable` and `WarmupCache` to load resources once and share them between instances of the same classes in runtime. - Updated existing plugins
- SileroVAD -> SileroVADSession and SileroVADSessionPool - Updated smart_turn, vogent and aws realtime - Enabled Vogent tests
📝 WalkthroughWalkthroughCentralizes warmup with a Warmable/WarmupCache framework and await_or_run helper; AgentLauncher now orchestrates component warmup before returning agents. Core base warmup hooks removed; plugins and tests migrated to Warmable on_warmup/on_warmed_up and Silero VAD session pooling. Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant Launcher as AgentLauncher
participant Cache as WarmupCache
participant Agent as Agent
participant Comp as WarmableComponent
rect rgb(240,248,255)
Note over User,Launcher: Launch flow with centralized warmup
end
User->>Launcher: launch()
activate Launcher
Launcher->>Launcher: create_agent via await_or_run
Launcher->>Agent: returns agent instance
activate Agent
Agent-->>Launcher: agent
deactivate Agent
Launcher->>Launcher: _warmup_agent(agent)
Launcher->>Cache: cache.warmup(component)
activate Cache
alt resource cached
Cache->>Comp: on_warmed_up(resource)
Comp->>Comp: apply resource
else not cached
Cache->>Cache: acquire per-class lock
Cache->>Comp: on_warmup()
activate Comp
Comp-->>Cache: resource (when ready)
deactivate Comp
Cache->>Cache: store resource
Cache->>Comp: on_warmed_up(resource)
Comp->>Comp: apply resource
end
deactivate Cache
Launcher-->>User: warmed agent
deactivate Launcher
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested reviewers
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
plugins/moondream/vision_agents/plugins/moondream/vlm/moondream_local_vlm.py (1)
158-176: Broad exception handling violates coding guidelines.The
except Exception as epattern should be replaced with specific exception types. For HuggingFace model loading, consider catchingOSError,ValueError, or the specific HF exceptions.🔎 Proposed fix
- except Exception as e: + except (OSError, ValueError, RuntimeError) as e: error_msg = str(e)As per coding guidelines: "Never write
except Exception as e- use specific exception handling."plugins/moondream/vision_agents/plugins/moondream/detection/moondream_local_processor.py (1)
179-197: Broad exception handling violates coding guidelines.Same issue as in
LocalVLM. Use specific exception types.🔎 Proposed fix
- except Exception as e: + except (OSError, ValueError, RuntimeError) as e: error_msg = str(e)As per coding guidelines: "Never write
except Exception as e- use specific exception handling."
🧹 Nitpick comments (9)
plugins/moondream/tests/test_moondream_local.py (1)
60-60: Docstring should referencewarmup()instead ofload().The test docstring mentions "load()" but the actual method called on line 65 is
warmup(). For accuracy and clarity, the docstring should reference the correct method name.🔎 Suggested fix
- """Test that load() successfully loads the model.""" + """Test that warmup() successfully loads the model."""tests/test_warmup.py (1)
64-86: Good concurrent warmup test, but missingon_warmed_upcall count assertions.The test validates resource sharing and that only one instance triggers
on_warmup, but it doesn't verify thaton_warmed_upwas called for all three instances. Consider adding:# Verify that on_warmup() was called only for the first instance assert warmable1.on_warmup_calls_count == 1 assert warmable2.on_warmup_calls_count == 0 assert warmable3.on_warmup_calls_count == 0 + + # Verify that on_warmed_up() was called for all instances + assert warmable1.on_warmed_up_calls_count == 1 + assert warmable2.on_warmed_up_calls_count == 1 + assert warmable3.on_warmed_up_calls_count == 1plugins/fast_whisper/vision_agents/plugins/fast_whisper/stt.py (1)
196-206: Avoidhasattrusage; prefer direct attribute access.Per coding guidelines,
hasattrshould be avoided. SinceSegmentandTranscriptionInfotypes fromfaster_whispershould have consistent attributes, consider accessing them directly or usinggetattrwith a default if truly optional.🔎 Proposed fix
response = TranscriptResponse( - confidence=segment.no_speech_prob - if hasattr(segment, "no_speech_prob") - else None, - language=info.language - if hasattr(info, "language") - else self.language, + confidence=segment.no_speech_prob, + language=info.language if info.language else self.language, processing_time_ms=processing_time_ms, audio_duration_ms=buffer_to_process.duration_ms, model_name=f"faster-whisper-{self.model_size}", )plugins/moondream/vision_agents/plugins/moondream/vlm/moondream_local_vlm.py (1)
30-30: Missing type parameter forWarmablebase class.The
Warmablegeneric should specify the resource type for clarity and type safety.🔎 Proposed fix
-class LocalVLM(llm.VideoLLM, Warmable): +class LocalVLM(llm.VideoLLM, Warmable[Any]):Or preferably, define a proper type alias for the model type.
plugins/moondream/vision_agents/plugins/moondream/detection/moondream_local_processor.py (2)
30-30: Missing type parameter forWarmable.Same as
LocalVLM, specify the generic type parameter.🔎 Proposed fix
-class LocalDetectionProcessor(VideoProcessorPublisher, Warmable): +class LocalDetectionProcessor(VideoProcessorPublisher, Warmable[Any]):
137-197: Consider extracting shared model loading logic.
_load_model_syncis nearly identical to the one inLocalVLM. Consider extracting this to a shared utility inmoondream_utils.pyto reduce duplication.agents-core/vision_agents/core/vad/silero.py (1)
18-67: Excellent pool-based architecture for resource sharing.The
SileroVADSessionPoolcleanly separates the expensive ONNX model loading from per-track session state. The asyncload()factory correctly offloads blocking I/O and session creation to background threads.Minor typo in docstring at line 62: "instace" → "instance".
🔎 Proposed fix
Returns: - an instace of `SileroVADSession` + an instance of `SileroVADSession`plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (1)
200-204: Avoid catching broadException.Per coding guidelines, use specific exception handling rather than
except Exception as e. Consider catching more specific exceptions or at minimum re-raising after logging if this is a last-resort handler.🔎 Proposed fix
except asyncio.TimeoutError: # Timeout is expected - continue loop to check shutdown continue - except Exception as e: - logger.error(f"Error processing audio: {e}") + except (ValueError, RuntimeError) as e: + logger.error("Error processing audio: %s", e, exc_info=True)plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (1)
208-212: Avoid catching broadException.Per coding guidelines, use specific exception handling rather than
except Exception as e. This mirrors the same issue in SmartTurnDetection.🔎 Proposed fix
except asyncio.TimeoutError: # Timeout is expected - continue loop to check shutdown continue - except Exception as e: - logger.error(f"Error processing audio: {e}") + except (ValueError, RuntimeError) as e: + logger.error("Error processing audio: %s", e, exc_info=True)
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (23)
agents-core/vision_agents/core/agents/agent_launcher.pyagents-core/vision_agents/core/agents/agents.pyagents-core/vision_agents/core/cli/cli_runner.pyagents-core/vision_agents/core/llm/llm.pyagents-core/vision_agents/core/stt/stt.pyagents-core/vision_agents/core/tts/tts.pyagents-core/vision_agents/core/turn_detection/turn_detection.pyagents-core/vision_agents/core/utils/utils.pyagents-core/vision_agents/core/vad/silero.pyagents-core/vision_agents/core/warmup.pyplugins/aws/tests/test_aws_realtime.pyplugins/aws/vision_agents/plugins/aws/aws_realtime.pyplugins/fast_whisper/vision_agents/plugins/fast_whisper/stt.pyplugins/moondream/tests/test_moondream_local.pyplugins/moondream/vision_agents/plugins/moondream/detection/moondream_local_processor.pyplugins/moondream/vision_agents/plugins/moondream/vlm/moondream_local_vlm.pyplugins/roboflow/vision_agents/plugins/roboflow/roboflow_local_processor.pyplugins/smart_turn/tests/test_smart_turn.pyplugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.pyplugins/vogent/tests/test_vogent.pyplugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.pytests/test_silero_vad.pytests/test_warmup.py
💤 Files with no reviewable changes (4)
- agents-core/vision_agents/core/turn_detection/turn_detection.py
- agents-core/vision_agents/core/tts/tts.py
- agents-core/vision_agents/core/llm/llm.py
- agents-core/vision_agents/core/stt/stt.py
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit inference engine (.cursor/rules/python.mdc)
**/*.py: Never adjust sys.path in Python code
Never writeexcept Exception as e- use specific exception handling
Avoid using getattr, hasattr, delattr and setattr; prefer normal attribute access in Python
Docstrings should follow the Google style guide for docstrings
Files:
agents-core/vision_agents/core/cli/cli_runner.pyagents-core/vision_agents/core/utils/utils.pyagents-core/vision_agents/core/agents/agents.pyplugins/moondream/tests/test_moondream_local.pyplugins/aws/tests/test_aws_realtime.pyplugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.pyagents-core/vision_agents/core/vad/silero.pyplugins/roboflow/vision_agents/plugins/roboflow/roboflow_local_processor.pyagents-core/vision_agents/core/agents/agent_launcher.pytests/test_warmup.pyplugins/aws/vision_agents/plugins/aws/aws_realtime.pyplugins/smart_turn/tests/test_smart_turn.pyagents-core/vision_agents/core/warmup.pyplugins/fast_whisper/vision_agents/plugins/fast_whisper/stt.pytests/test_silero_vad.pyplugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.pyplugins/vogent/tests/test_vogent.pyplugins/moondream/vision_agents/plugins/moondream/detection/moondream_local_processor.pyplugins/moondream/vision_agents/plugins/moondream/vlm/moondream_local_vlm.py
**/*test*.py
📄 CodeRabbit inference engine (.cursor/rules/python.mdc)
**/*test*.py: Never mock in tests; use pytest for testing
Mark integration tests with @pytest.mark.integration decorator
@pytest.mark.asyncio is not needed - it is automatic
Files:
plugins/moondream/tests/test_moondream_local.pyplugins/aws/tests/test_aws_realtime.pytests/test_warmup.pyplugins/smart_turn/tests/test_smart_turn.pytests/test_silero_vad.pyplugins/vogent/tests/test_vogent.py
🧠 Learnings (1)
📚 Learning: 2025-12-10T19:35:39.238Z
Learnt from: dangusev
Repo: GetStream/Vision-Agents PR: 249
File: agents-core/vision_agents/core/agents/agents.py:1032-1042
Timestamp: 2025-12-10T19:35:39.238Z
Learning: In `agents-core/vision_agents/core/agents/agents.py`, when using `video_track_override_path`, creating a new `VideoFileTrack` for each participant (each call to `_on_track_added`) is intentional to maintain proper track lifecycle semantics tied to each participant.
Applied to files:
plugins/roboflow/vision_agents/plugins/roboflow/roboflow_local_processor.py
🧬 Code graph analysis (15)
agents-core/vision_agents/core/cli/cli_runner.py (2)
agents-core/vision_agents/core/agents/agent_launcher.py (2)
warmup(39-54)launch(56-68)agents-core/vision_agents/core/warmup.py (2)
warmup(24-42)warmup(96-105)
agents-core/vision_agents/core/agents/agents.py (1)
agents-core/vision_agents/core/utils/utils.py (1)
await_or_run(91-101)
plugins/aws/tests/test_aws_realtime.py (1)
agents-core/vision_agents/core/agents/agent_types.py (1)
AgentOptions(15-25)
plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (3)
agents-core/vision_agents/core/turn_detection/turn_detection.py (2)
TurnDetector(20-82)start(76-78)agents-core/vision_agents/core/vad/silero.py (5)
SileroVADSession(69-135)SileroVADSessionPool(18-66)load(33-51)session(53-66)predict_speech(90-95)agents-core/vision_agents/core/warmup.py (1)
Warmable(45-105)
agents-core/vision_agents/core/vad/silero.py (1)
agents-core/vision_agents/core/utils/utils.py (1)
ensure_model(45-88)
plugins/roboflow/vision_agents/plugins/roboflow/roboflow_local_processor.py (2)
agents-core/vision_agents/core/warmup.py (5)
warmup(24-42)warmup(96-105)Warmable(45-105)on_warmup(76-82)on_warmed_up(85-94)tests/test_warmup.py (2)
on_warmup(16-20)on_warmed_up(22-24)
agents-core/vision_agents/core/agents/agent_launcher.py (2)
agents-core/vision_agents/core/utils/utils.py (1)
await_or_run(91-101)agents-core/vision_agents/core/warmup.py (4)
warmup(24-42)warmup(96-105)WarmupCache(13-42)Warmable(45-105)
tests/test_warmup.py (1)
agents-core/vision_agents/core/warmup.py (6)
warmup(24-42)warmup(96-105)Warmable(45-105)WarmupCache(13-42)on_warmup(76-82)on_warmed_up(85-94)
plugins/aws/vision_agents/plugins/aws/aws_realtime.py (2)
agents-core/vision_agents/core/vad/silero.py (5)
SileroVADSession(69-135)SileroVADSessionPool(18-66)load(33-51)session(53-66)predict_speech(90-95)agents-core/vision_agents/core/warmup.py (5)
warmup(24-42)warmup(96-105)Warmable(45-105)on_warmup(76-82)on_warmed_up(85-94)
agents-core/vision_agents/core/warmup.py (5)
plugins/aws/vision_agents/plugins/aws/aws_realtime.py (2)
on_warmed_up(200-202)on_warmup(197-198)plugins/moondream/vision_agents/plugins/moondream/detection/moondream_local_processor.py (2)
on_warmed_up(134-135)on_warmup(122-132)plugins/moondream/vision_agents/plugins/moondream/vlm/moondream_local_vlm.py (2)
on_warmed_up(112-113)on_warmup(99-110)plugins/roboflow/vision_agents/plugins/roboflow/roboflow_local_processor.py (2)
on_warmed_up(216-217)on_warmup(212-214)tests/test_warmup.py (2)
on_warmed_up(22-24)on_warmup(16-20)
plugins/fast_whisper/vision_agents/plugins/fast_whisper/stt.py (2)
agents-core/vision_agents/core/warmup.py (5)
warmup(24-42)warmup(96-105)Warmable(45-105)on_warmup(76-82)on_warmed_up(85-94)agents-core/vision_agents/core/stt/stt.py (1)
STT(16-165)
tests/test_silero_vad.py (3)
agents-core/vision_agents/core/events/base.py (1)
AudioFormat(23-30)agents-core/vision_agents/core/vad/silero.py (3)
SileroVADSessionPool(18-66)load(33-51)session(53-66)conftest.py (1)
skip_blockbuster(29-46)
plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (4)
agents-core/vision_agents/core/edge/types.py (2)
Participant(22-24)stop(47-47)agents-core/vision_agents/core/turn_detection/turn_detection.py (3)
TurnDetector(20-82)start(76-78)stop(80-82)agents-core/vision_agents/core/vad/silero.py (5)
SileroVADSession(69-135)SileroVADSessionPool(18-66)load(33-51)session(53-66)predict_speech(90-95)agents-core/vision_agents/core/warmup.py (1)
Warmable(45-105)
plugins/vogent/tests/test_vogent.py (2)
agents-core/vision_agents/core/agents/conversation.py (1)
InMemoryConversation(230-237)plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (3)
start(157-165)stop(167-178)process_audio(180-191)
plugins/moondream/vision_agents/plugins/moondream/vlm/moondream_local_vlm.py (3)
agents-core/vision_agents/core/llm/llm.py (1)
LLMResponseEvent(39-43)agents-core/vision_agents/core/warmup.py (5)
warmup(24-42)warmup(96-105)Warmable(45-105)on_warmup(76-82)on_warmed_up(85-94)plugins/moondream/vision_agents/plugins/moondream/detection/moondream_local_processor.py (3)
on_warmup(122-132)_load_model_sync(137-197)on_warmed_up(134-135)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: unit / Ruff
- GitHub Check: unit / Mypy
- GitHub Check: unit / Validate extra dependencies in "agents-core/pyproject.toml"
- GitHub Check: unit / Test "not integration"
- GitHub Check: unit / Mypy
- GitHub Check: unit / Test "not integration"
🔇 Additional comments (53)
agents-core/vision_agents/core/cli/cli_runner.py (1)
104-108: LGTM! Warmup and launch flow correctly implemented.The two-step warmup and launch approach aligns with the new warmup framework. The
call_typeandcall_idparameters are appropriately used later for demo UI and joining the call, rather than during agent creation.agents-core/vision_agents/core/warmup.py (3)
1-11: LGTM! Clean module structure.Imports and public API exposure are appropriate for the warmup framework.
13-42: LGTM! Correct double-check locking pattern.The concurrent-safe lazy loading implementation correctly uses per-class locking and caching. The double-check pattern prevents race conditions when multiple instances of the same class are warmed up concurrently.
45-105: LGTM! Well-designed warmup abstraction.The
Warmablegeneric interface provides a clean contract for resource loading with strong typing. The lifecycle separation (on_warmupfor loading,on_warmed_upfor applying) enables effective resource sharing viaWarmupCache. The docstring example is clear and helpful.agents-core/vision_agents/core/agents/agents.py (2)
63-63: LGTM! Correct import for unified callable handling.The
await_or_runutility import supports the refactored_applymethod below.
639-639: LGTM! Cleaner unified callable handling.Using
await_or_runsimplifies the_applymethod and centralizes sync/async handling logic. This improves maintainability and consistency across the codebase.agents-core/vision_agents/core/utils/utils.py (2)
3-14: LGTM! Appropriate imports and type variables.The imports and type variables correctly support the generic
await_or_runfunction with proper parameter forwarding and return type preservation.
91-101: LGTM! Solid utility for sync/async unification.The implementation correctly handles both synchronous and asynchronous callables with proper type safety through
ParamSpecandTypeVar. This follows established patterns for maybe-async function handling.tests/test_silero_vad.py (3)
3-4: LGTM! Imports updated for session pool API.The imports correctly reflect the migration from
prepare_silero_vadto the newSileroVADSessionPoolAPI.
8-8: LGTM! Decorator updated to pytest convention.The change from
@skip_blockbusterto@pytest.mark.skip_blockbusterfollows pytest marking conventions more explicitly.
15-16: LGTM! Correct adoption of session pool pattern.The two-step initialization (load pool, create session) enables efficient resource sharing while maintaining per-test state isolation. This aligns with the new warmup framework design.
Also applies to: 32-33
plugins/roboflow/vision_agents/plugins/roboflow/roboflow_local_processor.py (3)
26-26: LGTM! Correct import for Warmable integration.The import enables the class to implement the
Warmable[RFDETR]interface.
53-53: LGTM! Correct Warmable integration.The class signature properly extends both
VideoProcessorPublisherandWarmable[RFDETR], with the type parameter accurately indicating the model resource type.
212-217: LGTM! Proper Warmable lifecycle implementation.The
on_warmupmethod correctly loads the model asynchronously in the executor and returns it for caching. Theon_warmed_upmethod cleanly separates resource application from loading, enabling effective resource sharing viaWarmupCache.plugins/vogent/tests/test_vogent.py (4)
2-2: LGTM! Logging import for test observability.Adding logging support helps with debugging integration test behavior.
13-21: LGTM! Well-structured fixture with proper lifecycle management.The fixture correctly orchestrates the warmup/start/stop lifecycle and ensures cleanup via try/finally. This provides consistent test setup and teardown.
24-26: LGTM! Appropriate test markers and organization.The class decorators correctly mark the test as an integration test that makes blocking calls, following pytest conventions for test categorization.
27-62: LGTM! Test correctly migrated to fixture-based setup.All references to the turn detection instance are consistently updated to use the
vogent_turn_detectionfixture, while preserving the original test logic. The fixture ensures proper warmup and cleanup.tests/test_warmup.py (2)
1-25: LGTM! Clean test infrastructure setup.The
DummyWarmableimplementation correctly follows theWarmableprotocol, tracking call counts to verify caching behavior. The use ofrandom.random()for simulating async work is appropriate for concurrency tests.One minor note: consider adding a type hint to
on_warmed_up(self, resource: Resource)for consistency with the generic parameter inWarmable[Resource].
27-45: Well-structured caching verification test.The test correctly validates that
on_warmupis called only once whileon_warmed_upis called on each warmup invocation with the same cache. The identity check (is) properly verifies resource reuse.plugins/aws/tests/test_aws_realtime.py (2)
22-39: LGTM! Proper lifecycle management with warmup integration.The fixture correctly uses
tmp_pathfor isolated test model directories, callswarmup()before use, and ensuresclose()is called viatry/finally. This aligns well with the newWarmablelifecycle.
148-163: Consistent fixture implementation for Nova2.Both test class fixtures now follow the same pattern. The lifecycle handling mirrors
TestBedrockRealtimeappropriately.plugins/fast_whisper/vision_agents/plugins/fast_whisper/stt.py (1)
73-92: LGTM! Correct Warmable implementation.The
on_warmupmethod properly loads the model in a thread pool when not provided at init, andon_warmed_upcorrectly guards against overwriting an already-provided model.plugins/moondream/vision_agents/plugins/moondream/vlm/moondream_local_vlm.py (2)
99-113: LGTM! Clean warmup lifecycle implementation.The
on_warmupcorrectly offloads model loading to a thread and returns the model for caching. Theon_warmed_upassignment is straightforward.Consider adding type hints for better IDE support:
async def on_warmup(self) -> Any: def on_warmed_up(self, model: Any) -> None:
188-189: LGTM! Clear precondition check.The
ValueErrorwith a descriptive message properly guards against using the VLM before the model is loaded.plugins/smart_turn/tests/test_smart_turn.py (3)
13-19: LGTM! Clean fixture for SmartTurnDetection lifecycle.The fixture properly calls
warmup(),start(), andstop()in sequence. Theyield-based cleanup pattern ensures proper teardown.Consider whether this fixture should be module-scoped (
@pytest.fixture(scope="module")) ifSmartTurnDetectioninitialization is expensive, since multiple tests share the same configuration.
24-32: LGTM! Correctly uses new SileroVADSessionPool API.The test properly loads the pool asynchronously and creates a session for prediction. The chunk-based iteration with size validation is appropriate.
34-62: LGTM! Good event-driven test pattern.The subscription-based event capture and subsequent assertion on event order effectively validates turn detection behavior.
plugins/moondream/vision_agents/plugins/moondream/detection/moondream_local_processor.py (2)
122-135: LGTM! Proper Warmable implementation.The implementation correctly offloads model loading to a thread and returns the model for caching by
WarmupCache.
216-217: LGTM! Clear precondition enforcement.Properly guards the
process_videomethod against unloaded model state.agents-core/vision_agents/core/vad/silero.py (2)
1-16: LGTM! Clean module setup with proper exports.The
__all__export list explicitly defines the public API. Constants are well-named and documented.
69-135: LGTM! Clean session implementation with proper state management.The
SileroVADSessioncorrectly manages per-session state (context, hidden state) while reusing the shared ONNX model. The periodic reset via_maybe_reset()prevents drift accumulation.agents-core/vision_agents/core/agents/agent_launcher.py (4)
1-11: LGTM! Clean imports for warmup infrastructure.The imports correctly bring in the new
WarmupCache,Warmable, andawait_or_runutilities.
22-37: LGTM! Proper type hints and initialization.The flexible type hints for
create_agentandjoin_callaccommodate both sync and async factories. TheWarmupCacheinstance is correctly scoped to the launcher for sharing across warmups.
39-54: Verify:warmup()creates an agent that is discarded.The comment says "dry-run Agent instance" but the agent is created and its components warmed up, then discarded. If
create_agenthas side effects (e.g., establishing connections, allocating resources), this could cause issues.Is this intentional? The cached resources from
_warmup_cachepersist, but the agent instance itself is lost.
70-108: Excellent parallel warmup orchestration.The
_warmup_agenthelper elegantly:
- Collects warmup tasks for all
Warmablecomponents- Executes them in parallel via
asyncio.gather- Uses the shared
_warmup_cachefor resource deduplicationThe
isinstance(component, Warmable)checks are appropriate here for duck-typing detection.plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (6)
45-50: LGTM on the Warmable integration!The multiple inheritance from
TurnDetectorandWarmablewith the appropriately typed tuple resource is well-structured. The generic type parameter clearly documents what resources are managed.
121-124: Clean initialization of optional state fields.Properly typed as
| Nonewith explicitNonedefaults. This aligns with the warmup lifecycle where resources are set viaon_warmed_up.
125-143: Well-structured async warmup with proper non-blocking I/O.The use of
asyncio.to_threadfor blocking operations (directory creation, WhisperFeatureExtractor init, ONNX session building) prevents event loop starvation. The sequential loading order is correct: VAD pool → model path → feature extractor → ONNX session.
145-154: Correct resource binding in on_warmed_up.The tuple unpacking and session creation from pool follows the Warmable contract. Each instance gets its own stateful
SileroVADSessionwhile sharing the underlying ONNX model.
227-228: Good defensive check for uninitialized VAD session.The explicit
ValueErrorwith a descriptive message ensures fast failure if the warmup lifecycle is violated. This is the right pattern for catching misuse early.
372-377: Consistent null checks for required resources.The explicit checks for
_whisper_extractorand_smart_turnbefore use provide clear error messages. This matches the pattern used for_vad_session.plugins/aws/vision_agents/plugins/aws/aws_realtime.py (4)
87-87: Correct Warmable integration with SileroVADSessionPool.The class signature properly inherits from both
realtime.RealtimeandWarmable[SileroVADSessionPool], enabling the warmup lifecycle for VAD pool management.
167-171: Clean state initialization for connection tracking and VAD session.The new instance attributes for
connected,last_connected_at,_last_audio_at, and_vad_sessionare properly typed and initialized to safe defaults.
200-202: LGTM on on_warmed_up implementation.Correctly creates a per-instance VAD session from the shared pool, following the Warmable contract.
393-394: Good defensive check for VAD initialization.The explicit
ValueErrorwhen_vad_sessionisNoneprovides a clear error message for lifecycle violations.plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (7)
33-35: Correct Warmable integration with typed resource tuple.The class properly inherits from
TurnDetectorandWarmablewith a clear 3-tuple type for the pooled VAD, Whisper model, and Vogent detector.
116-149: Well-structured async warmup pipeline.The sequential loading of VAD pool → Whisper model → Vogent detector with proper
asyncio.to_threadwrapping prevents blocking the event loop. The lambda wrapping is necessary for the thread context.
151-155: Correct resource binding pattern.The tuple unpacking directly assigns
_whisperand_vogentwhile creating a session fromvad_pool. This follows the Warmable contract appropriately.
157-165: LGTM on start() implementation.Launches background processing and calls parent's start. Mirrors the pattern in SmartTurnDetection.
167-178: Good cleanup in stop().Properly cancels the background task and handles
CancelledError. The shutdown event is set before cancellation.
236-237: Good defensive check for VAD session.Consistent with the pattern used in other files, raising
ValueErrorfor uninitialized resources.
397-398: Correct error handling for uninitialized Vogent.The explicit
ValueErrormatches the expected pattern for warmup lifecycle violations.
plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py
Outdated
Show resolved
Hide resolved
plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py
Outdated
Show resolved
Hide resolved
Good point @Nash0x7E2 . |
There was a problem hiding this comment.
Actionable comments posted: 1
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
DEVELOPMENT.md
🧰 Additional context used
🪛 LanguageTool
DEVELOPMENT.md
[grammar] ~509-~509: Ensure spelling is correct
Context: ...essors using YOLO models In order to standartise how these resources are loaded and to m...
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: unit / Validate extra dependencies in "agents-core/pyproject.toml"
- GitHub Check: unit / Mypy
- GitHub Check: unit / Ruff
- GitHub Check: unit / Test "not integration"
- GitHub Check: unit / Test "not integration"
- GitHub Check: unit / Validate extra dependencies in "agents-core/pyproject.toml"
- GitHub Check: unit / Mypy
- GitHub Check: unit / Ruff
🔇 Additional comments (1)
DEVELOPMENT.md (1)
501-551: Well-documented warmup pattern with clear example.The new section effectively explains the
Warmableframework and provides a concrete, working example. The code demonstrates proper inheritance, method signatures, and lifecycle documentation. The explanation helps plugin developers understand when each hook is called and how resource sharing works.
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (1)
137-146: Comment contradicts code:compile_modelvalue.Line 137 comment states
compile_model=False, but line 140 actually passescompile_model=True. Update the comment to match the implementation or explain why compilation is enabled despite potential edge case issues.🔎 Proposed fix
logger.info("Loading Vogent turn detection model") - # Note: compile_model=False to avoid torch.compile issues with edge cases + # Note: compile_model=True to improve inference performance vogent = await asyncio.to_thread( lambda: VogentDetector( compile_model=True,
♻️ Duplicate comments (1)
plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (1)
156-161: Start method relies on downstream validation.Per the past review comment, this method lacks an explicit warmup guard. However, the downstream methods
_process_audio_packet(line 227-228) and_blocking_predict_turn_completed(lines 373-377) validate initialization state and raise clear ValueErrors. Given thatAgentLauncher.launch()manages the warmup lifecycle (per PR context), this approach is acceptable in practice. Consider adding a docstring noting the warmup prerequisite if you prefer explicit documentation.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (5)
DEVELOPMENT.mdagents-core/vision_agents/core/vad/silero.pyplugins/fast_whisper/vision_agents/plugins/fast_whisper/stt.pyplugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.pyplugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (.cursor/rules/python.mdc)
**/*.py: Never adjust sys.path in Python code
Never writeexcept Exception as e- use specific exception handling
Avoid using getattr, hasattr, delattr and setattr; prefer normal attribute access in Python
Docstrings should follow the Google style guide for docstrings
Files:
agents-core/vision_agents/core/vad/silero.pyplugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.pyplugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.pyplugins/fast_whisper/vision_agents/plugins/fast_whisper/stt.py
🧠 Learnings (1)
📚 Learning: 2025-10-13T22:00:34.300Z
Learnt from: dangusev
Repo: GetStream/Vision-Agents PR: 98
File: plugins/deepgram/vision_agents/plugins/deepgram/stt.py:135-150
Timestamp: 2025-10-13T22:00:34.300Z
Learning: In the Deepgram STT plugin (plugins/deepgram/vision_agents/plugins/deepgram/stt.py), the `started()` method is designed to wait for the connection attempt to complete, not to guarantee a successful connection. It's acceptable for the connection attempt to fail, and downstream code handles the case where `self.dg_connection` is `None`. The `_connected_once` event is set in the `finally` block intentionally to signal attempt completion.
Applied to files:
plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py
🧬 Code graph analysis (3)
agents-core/vision_agents/core/vad/silero.py (1)
agents-core/vision_agents/core/utils/utils.py (1)
ensure_model(45-88)
plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (7)
agents-core/vision_agents/core/agents/agent_types.py (2)
AgentOptions(15-25)default_agent_options(32-33)agents-core/vision_agents/core/edge/types.py (1)
Participant(22-24)agents-core/vision_agents/core/vad/silero.py (5)
SileroVADSession(69-135)SileroVADSessionPool(18-66)load(33-51)session(53-66)predict_speech(90-95)agents-core/vision_agents/core/warmup.py (1)
Warmable(45-105)plugins/fast_whisper/vision_agents/plugins/fast_whisper/stt.py (2)
on_warmup(73-88)on_warmed_up(90-92)plugins/aws/vision_agents/plugins/aws/aws_realtime.py (2)
on_warmup(197-198)on_warmed_up(200-202)plugins/moondream/vision_agents/plugins/moondream/detection/moondream_local_processor.py (2)
on_warmup(122-132)on_warmed_up(134-135)
plugins/fast_whisper/vision_agents/plugins/fast_whisper/stt.py (3)
agents-core/vision_agents/core/warmup.py (5)
warmup(24-42)warmup(96-105)Warmable(45-105)on_warmup(76-82)on_warmed_up(85-94)plugins/fast_whisper/tests/test_fast_whisper_stt.py (1)
stt(14-21)tests/test_warmup.py (2)
on_warmup(16-20)on_warmed_up(22-24)
🪛 LanguageTool
DEVELOPMENT.md
[style] ~508-~508: Consider a more concise word here.
Context: ...- Video processors using YOLO models In order to standardise how these resources are loa...
(IN_ORDER_TO_PREMIUM)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: unit / Mypy
- GitHub Check: unit / Test "not integration"
- GitHub Check: unit / Validate extra dependencies in "agents-core/pyproject.toml"
- GitHub Check: unit / Ruff
- GitHub Check: unit / Test "not integration"
- GitHub Check: unit / Ruff
- GitHub Check: unit / Validate extra dependencies in "agents-core/pyproject.toml"
- GitHub Check: unit / Mypy
🔇 Additional comments (21)
DEVELOPMENT.md (1)
500-551: Warmup documentation is clear and helpful.The new section effectively explains the
Warmablepattern with good motivation (Silero VAD, YOLO models) and a concrete, well-commented code example. The example correctly demonstrates generic typing (Warmable[WhisperModel]), proper method signatures (on_warmupasync,on_warmed_upsync), and the lifecycle. This should help contributors understand the pattern quickly.Minor style note: Line 509 could use "To standardise" instead of "In order to standardise" for conciseness, but the current phrasing is perfectly acceptable.
plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (9)
1-29: LGTM: Imports and constants align with warmup pattern.The additions properly support the new Warmable lifecycle and VAD session pooling. The public constants for the ONNX model (filename and URL) follow established patterns seen in other warmup-enabled components.
45-50: LGTM: Class signature correctly implements Warmable.The generic type parameter accurately describes the three-resource tuple (VAD pool, feature extractor, ONNX session) that will be loaded during warmup and bound during initialization. This matches the pattern established in other warmup-enabled plugins.
121-123: LGTM: State fields properly initialized.The three resource fields correctly mirror the Warmable type parameter and follow the pattern of being None until
on_warmed_upis called. The optional typing is appropriate for the lifecycle model.
125-143: LGTM: on_warmup implementation follows best practices.Properly uses
asyncio.to_threadfor all blocking operations (directory creation, feature extractor initialization, ONNX session loading) to avoid blocking the event loop. Resource loading is well-structured and the return type matches the Warmable contract.
145-154: LGTM: on_warmed_up correctly binds resources.Properly unpacks the warmed resources and creates a new VAD session from the pool (line 152). This per-instance session pattern is correct, enabling independent state management while reusing the shared ONNX model from the pool.
227-228: LGTM: Excellent defensive validation.Clear guards with actionable error messages ensure that any attempt to process audio without proper warmup fails fast with explicit guidance ("call warmup() first"). This defensive approach catches misconfiguration early and provides developers with clear remediation steps.
Also applies to: 373-377
250-252: LGTM: VAD inference properly offloaded.Correctly uses
asyncio.to_threadfor the potentially CPU-intensive VAD prediction, avoiding event loop blocking. The session is validated earlier in the method (line 227-228).
398-398: LGTM: ONNX inference uses warmed-up session.Correctly references
self._smart_turnwhich is initialized during warmup and validated earlier in the method (lines 376-377).
405-418: LGTM: ONNX session builder uses best practices.Loading the model into memory (lines 409-410) before creating the session is the correct approach for multi-worker scenarios, avoiding file access contention. Session options are appropriately configured for inference workloads.
agents-core/vision_agents/core/vad/silero.py (3)
1-11: LGTM: Imports and exports properly structured.The
typing_extensions.Selfimport addresses Python 3.10 compatibility (per past review). The updated__all__appropriately exposes the new pool/session pattern and theSILERO_CHUNKconstant for consumers.
18-66: LGTM: Pool pattern correctly implements resource sharing.The
SileroVADSessionPoolproperly:
- Loads the ONNX model once during async initialization (lines 46-50)
- Uses
asyncio.to_threadto avoid blocking the event loop during ONNX session creation- Provides a
session()factory that creates stateful instances sharing the underlying model- Configures appropriate ONNX session options (single inter-op thread)
The
type: ignorecomments on lines 46-47 are acceptable given mypy's current limitations withasyncio.to_threadreturn types.
69-135: LGTM: Session refactoring maintains thread-safety.The
SileroVADSessioncorrectly:
- Accepts a shared ONNX model in the constructor (line 74)
- Maintains per-instance state (
_state,_context) for thread-safe concurrent usage- Consistently references the new field names throughout (
_model,_context_size,_reset_interval_seconds)- Preserves the public
predict_speechAPI for backward compatibilityThis refactoring enables efficient resource sharing while maintaining independent state per session.
plugins/fast_whisper/vision_agents/plugins/fast_whisper/stt.py (1)
14-14: LGTM: Warmable integration looks correct.The import and class signature changes properly integrate the Warmable protocol with the appropriate generic type parameter.
Also applies to: 27-27
plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (7)
2-2: LGTM: Imports and Warmable integration are correct.The new imports support the Warmable lifecycle, and the class signature properly declares the tuple of resources (VAD pool, Whisper model, Vogent detector) that will be warmed up.
Also applies to: 6-6, 8-8, 16-18, 33-35
53-55: LGTM: Type-safe model size parameter.The
Literaltype annotation correctly constrains the Whisper model size parameter to valid values.
105-114: LGTM: Attributes properly initialized.The new attributes for Whisper configuration and runtime resources are appropriately initialized to
Noneand will be populated during the warmup lifecycle.
151-155: LGTM: Resources properly bound to instance.The
on_warmed_up()method correctly unpacks the warmed resources and obtains a session from the VAD pool for this instance.
157-178: LGTM: Lifecycle management is robust.The
start()andstop()methods properly manage the background processing task with appropriate cancellation handling and cleanup.
236-237: LGTM: Initialization checks are consistent.All three resources (VAD, Whisper, Vogent) now have explicit initialization checks that raise
ValueErrorwith clear messages. This addresses the past review concern about silent failures and ensures consistent error handling.Also applies to: 361-362, 397-398
366-368: LGTM: Configurable language parameter.The transcription now uses
self._whisper_languageinstead of a hardcoded"en", making the language setting configurable per instance.
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (1)
193-213: Use specific exception handling instead of genericException.Line 211 violates the coding guideline: "Never write
except Exception as e- use specific exception handling." The generic catch could mask unexpected errors. Consider catching specific exceptions likeValueError,RuntimeError, or model-specific errors that might occur during audio processing.🔎 Proposed refactor
except asyncio.TimeoutError: # Timeout is expected - continue loop to check shutdown continue - except Exception as e: - logger.error(f"Error processing audio: {e}") + except (ValueError, RuntimeError, OSError) as e: + logger.error(f"Error processing audio: {e}", exc_info=True) + except Exception as e: + # Catch-all for truly unexpected errors, but log with full traceback + logger.exception(f"Unexpected error processing audio: {e}") + raise # Re-raise unexpected errorsNote: Adjust the specific exception types based on what
_process_audio_packet, VAD, Whisper, and Vogent can raise.Based on coding guidelines, ...
🧹 Nitpick comments (1)
plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (1)
101-102: Consider more specific type annotations.The
asyncio.Queue[Any]andOptional[asyncio.Task[Any]]use genericAnytypes. For better type safety, consider:self._audio_queue: asyncio.Queue[tuple[PcmData, Participant, Optional[Conversation]]] = asyncio.Queue() self._processing_task: Optional[asyncio.Task[None]] = None
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (.cursor/rules/python.mdc)
**/*.py: Never adjust sys.path in Python code
Never writeexcept Exception as e- use specific exception handling
Avoid using getattr, hasattr, delattr and setattr; prefer normal attribute access in Python
Docstrings should follow the Google style guide for docstrings
Files:
plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: unit / Mypy
- GitHub Check: unit / Test "not integration"
- GitHub Check: unit / Ruff
- GitHub Check: unit / Test "not integration"
- GitHub Check: unit / Validate extra dependencies in "agents-core/pyproject.toml"
- GitHub Check: unit / Mypy
🔇 Additional comments (5)
plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (5)
1-24: LGTM!The imports and constants are well-organized and appropriate for the Warmable-based audio processing pipeline.
214-338: LGTM!The audio processing pipeline is well-structured: VAD-based speech detection, intelligent segment assembly with silence tracking, and proper offloading of CPU-intensive operations to the thread pool. The initialization check at line 236 provides clear error messaging.
349-378: Excellent improvement!The past review comment about silent failure has been properly addressed. Lines 362-364 now raise a clear
ValueErrorwhen the Whisper model is uninitialized, providing consistent error handling with_predict().
380-445: LGTM!Consistent initialization checks and proper thread-pool offloading for CPU-intensive Vogent predictions. The
_get_previous_linehelper cleanly extracts conversational context with appropriate text preprocessing.
157-165: Lifecycle guarantee is provided by the warmup framework.The
AgentLauncherandWarmableframework ensure thaton_warmed_up()completes beforestart()can be called.WarmupCache.warmup()(line 42 in warmup.py) explicitly callson_warmed_up(resource)and waits for completion before returning.AgentLauncher.launch()awaits_warmup_agent()before returning the agent instance. The defensive check at line 236 is sound practice but the error condition will not occur with correct API usage.
plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (1)
136-146: A whisper of contradiction in the margins.The comment on line 137 states "compile_model=False to avoid torch.compile issues" but the code sets
compile_model=Trueon line 140. This discrepancy could confuse future maintainers. Either update the comment to reflect the current choice, or revert tocompile_model=Falseif the edge cases remain a concern.🔎 Proposed fix
logger.info("Loading Vogent turn detection model") - # Note: compile_model=False to avoid torch.compile issues with edge cases + # Note: compile_model=True enables torch.compile for better performance vogent = await asyncio.to_thread( lambda: VogentDetector( compile_model=True,
🧹 Nitpick comments (4)
plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (2)
235-236: A message from another era, still speaking of the old ways.The error message references
warmup()but the new lifecycle useson_warmup/on_warmed_upmanaged byWarmupCache. Consider updating to clarify the actual mechanism (e.g., "VAD model not initialized; ensure warmup completed via AgentLauncher").🔎 Proposed fix
if self._vad_session is None: - raise ValueError("The VAD model is not initialized, call warmup() first") + raise ValueError("VAD model not initialized; ensure warmup completed via AgentLauncher")
210-211: The net cast too wide catches all manner of darkness.Per coding guidelines, avoid
except Exception as e. In a background loop, broad catches prevent task death, but consider catching specific expected exceptions (e.g.,ValueError,RuntimeError) and letting unexpected ones propagate or at least logging them at a higher severity level.🔎 Proposed fix
except asyncio.TimeoutError: # Timeout is expected - continue loop to check shutdown continue - except Exception as e: - logger.error(f"Error processing audio: {e}") + except (ValueError, RuntimeError) as e: + logger.error(f"Error processing audio: {e}") + except Exception: + logger.exception("Unexpected error in audio processing loop") + raiseBased on coding guidelines, specific exception handling is preferred.
plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (2)
226-227: The guard stands, though its words echo an older tongue.The
ValueErrorensures fail-fast if warmup hasn't completed. Same observation as Vogent: the message references "warmup()" but the actual mechanism ison_warmup/on_warmed_upviaAgentLauncher. Consider updating for clarity.
202-203: The net again cast too wide, swallowing all that stirs.Per coding guidelines, avoid
except Exception as e. Consider catching specific expected exceptions and letting unexpected ones propagate with proper logging. See the similar comment on Vogent for a suggested fix pattern.Based on coding guidelines, specific exception handling is preferred.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (3)
agents-core/vision_agents/core/turn_detection/turn_detection.pyplugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.pyplugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (.cursor/rules/python.mdc)
**/*.py: Never adjust sys.path in Python code
Never writeexcept Exception as e- use specific exception handling
Avoid using getattr, hasattr, delattr and setattr; prefer normal attribute access in Python
Docstrings should follow the Google style guide for docstrings
Files:
agents-core/vision_agents/core/turn_detection/turn_detection.pyplugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.pyplugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py
🧠 Learnings (1)
📚 Learning: 2025-10-13T22:00:34.300Z
Learnt from: dangusev
Repo: GetStream/Vision-Agents PR: 98
File: plugins/deepgram/vision_agents/plugins/deepgram/stt.py:135-150
Timestamp: 2025-10-13T22:00:34.300Z
Learning: In the Deepgram STT plugin (plugins/deepgram/vision_agents/plugins/deepgram/stt.py), the `started()` method is designed to wait for the connection attempt to complete, not to guarantee a successful connection. It's acceptable for the connection attempt to fail, and downstream code handles the case where `self.dg_connection` is `None`. The `_connected_once` event is set in the `finally` block intentionally to signal attempt completion.
Applied to files:
plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.pyplugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py
🧬 Code graph analysis (1)
plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (7)
agents-core/vision_agents/core/agents/agent_types.py (2)
AgentOptions(15-25)default_agent_options(32-33)agents-core/vision_agents/core/edge/types.py (1)
Participant(22-24)agents-core/vision_agents/core/turn_detection/turn_detection.py (2)
TurnDetector(20-82)start(74-78)agents-core/vision_agents/core/turn_detection/events.py (1)
TurnStartedEvent(11-25)agents-core/vision_agents/core/utils/utils.py (1)
ensure_model(45-88)agents-core/vision_agents/core/vad/silero.py (5)
SileroVADSession(69-135)SileroVADSessionPool(18-66)load(33-51)session(53-66)predict_speech(90-95)agents-core/vision_agents/core/warmup.py (1)
Warmable(45-105)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: unit / Test "not integration"
- GitHub Check: unit / Mypy
- GitHub Check: unit / Mypy
- GitHub Check: unit / Test "not integration"
🔇 Additional comments (11)
agents-core/vision_agents/core/turn_detection/turn_detection.py (1)
74-78: The double-start guard emerges, a sentinel at the gate.This addition prevents orphaned background tasks and duplicate initialization. The base class now shoulders this burden, so subclasses (Vogent, SmartTurn) can call
await super().start()and trust the protection is in place. TheValueErrorwithselfin the message provides useful debugging context.plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (5)
33-35: The class declaration, now bearing the weight of warmth.The dual inheritance from
TurnDetectorandWarmable[tuple[...]]correctly expresses the lifecycle contract. The tuple type parameter clearly documents what resources are loaded and shared.
151-155: Resources descend, finding their proper vessels.The binding correctly extracts a per-instance
SileroVADSessionfrom the shared pool while directly assigning the statelessWhisperModelandVogentDetector. This pattern enables model sharing across instances without state contamination.
157-177: The lifecycle, now guarded by the ancestor's vigilance.The call to
super().start()now inherits the double-start protection from the base class, addressing the prior review concern about orphaned_processing_task. Thestop()method properly signals shutdown, cancels the task, and awaits its termination.
359-377: The silent void now speaks its truth.The
ValueErroron uninitialized_whisperaddresses the prior review concern about silent failures. The language parameter is now properly configurable viaself._whisper_language. Consider aligning the error message with the new lifecycle terminology as noted for the VAD check.
397-421: The oracle speaks, but only when prepared.The
ValueErrorguard ensures fail-fast behavior if warmup hasn't completed. The multimodal prediction correctly combines audio features with text context. Implementation is sound.plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (5)
45-50: The class declaration mirrors its sibling, bearing warmth.The Warmable type parameter clearly specifies the three resources (VAD pool, Whisper extractor, ONNX session) that are loaded and cached. Consistent with the Vogent implementation pattern.
125-143: Resources summoned from the depths, each in its proper thread.The warmup correctly offloads blocking I/O (directory creation, model download, feature extractor init, ONNX session build) to threads. The
ensure_modelutility handles the async download. Resources are returned as a tuple for caching.
145-154: The binding complete, resources anchored to their vessel.Per-instance VAD session extraction from the shared pool, with stateless resources (feature extractor, ONNX session) assigned directly. Pattern consistent with Vogent implementation.
156-159: The ancestor's guard now shields this threshold.The call to
super().start()inherits the double-start protection, addressing the prior review concern about explicit guards. The docstring requirement from the past comment is now unnecessary since the base class enforces the contract.
371-397: Twin sentinels guard the gate of prediction.The dual
ValueErrorguards for_whisper_extractorand_smart_turnensure fail-fast behavior. The ONNX inference is correctly executed after validation.
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (1)
156-159: Add explicit guard or docstring to clarify warmup requirement.The
start()method lacks an explicit check that warmup has completed. While_process_audio_packetraises clear ValueErrors if models are uninitialized (line 225), catching this requirement earlier instart()would improve defensive programming. Either add an explicit guard (e.g.,if self._vad_session is None: raise RuntimeError(...)) or add a docstring documenting thatwarmup()must be called beforestart().
🧹 Nitpick comments (1)
plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (1)
175-203: Log full traceback in background task exception handler.The broad
except Exceptionat line 201 contradicts the coding guideline to use specific exception handling. While a catch-all might be justified in a background task loop to prevent the task from dying, consider at minimum logging the full traceback usinglogger.exception()instead oflogger.error()to aid debugging.🔎 Proposed improvement
except asyncio.TimeoutError: # Timeout is expected - continue loop to check shutdown continue except Exception as e: - logger.error(f"Error processing audio: {e}") + logger.exception(f"Error processing audio: {e}")Or, if specific exceptions can be anticipated (e.g.,
ValueErrorfrom uninitialized models), handle them specifically:except asyncio.TimeoutError: continue + except ValueError as e: + logger.error(f"Configuration error in audio processing: {e}") except Exception as e: - logger.error(f"Error processing audio: {e}") + logger.exception(f"Unexpected error processing audio: {e}")Based on coding guidelines.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (.cursor/rules/python.mdc)
**/*.py: Never adjust sys.path in Python code
Never writeexcept Exception as e- use specific exception handling
Avoid using getattr, hasattr, delattr and setattr; prefer normal attribute access in Python
Docstrings should follow the Google style guide for docstrings
Files:
plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py
🧠 Learnings (1)
📚 Learning: 2025-10-13T22:00:34.300Z
Learnt from: dangusev
Repo: GetStream/Vision-Agents PR: 98
File: plugins/deepgram/vision_agents/plugins/deepgram/stt.py:135-150
Timestamp: 2025-10-13T22:00:34.300Z
Learning: In the Deepgram STT plugin (plugins/deepgram/vision_agents/plugins/deepgram/stt.py), the `started()` method is designed to wait for the connection attempt to complete, not to guarantee a successful connection. It's acceptable for the connection attempt to fail, and downstream code handles the case where `self.dg_connection` is `None`. The `_connected_once` event is set in the `finally` block intentionally to signal attempt completion.
Applied to files:
plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py
🧬 Code graph analysis (1)
plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (5)
agents-core/vision_agents/core/agents/agent_types.py (1)
AgentOptions(15-25)agents-core/vision_agents/core/edge/types.py (1)
Participant(22-24)agents-core/vision_agents/core/turn_detection/turn_detection.py (2)
TurnDetector(20-82)start(74-78)agents-core/vision_agents/core/vad/silero.py (5)
SileroVADSession(69-135)SileroVADSessionPool(18-66)load(33-51)session(53-66)predict_speech(90-95)agents-core/vision_agents/core/warmup.py (1)
Warmable(45-105)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: unit / Test "not integration"
- GitHub Check: unit / Mypy
- GitHub Check: unit / Mypy
- GitHub Check: unit / Test "not integration"
🔇 Additional comments (2)
plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (2)
125-143: LGTM! Warmup implementation follows the pattern correctly.The
on_warmupmethod correctly:
- Loads shared resources (VAD pool, ONNX model) that can be cached.
- Uses
asyncio.to_threadfor blocking I/O operations (WhisperFeatureExtractor initialization, ONNX session building).- Returns the pool (not a session), which aligns with the cache-and-share design.
403-416: LGTM! Loading ONNX model from memory prevents multi-worker file access issues.Building the session from
model_bytes(line 416) rather than a file path is a good practice for avoiding file access conflicts in multi-worker deployments. The session configuration (sequential execution, single thread, full optimization) is appropriate.
What's changed
WarmableandWarmupCacheto load resources once and share them between instances of the same classes in runtime.SileroVADwrapper to separate the model (which can be shared between calls) and the track state (which must be isolated).aws.RealtimeNew
warmup()approach.Previously, some plugins had
warmup()methods defined, which were called byAgentLauncher.However, there was no way to share the warmed-up state between different instances of the same class.
E.g., a warmed-up TurnDetection would load the model every time a new call starts.
We need the warmup to be done only once during the lifetime of the program.
This PR adds
WarmableandWarmupCacheto standardize how the warmup is done and shared within the same process.How it works:
AgentLauncherholds a single instance ofWarmupCache, which stores the warmup data like VAD models, etc.Warmableobjects define two methods:on_warmup- called only once for the sameWarmupCacheinstance for the same class. Here you load the resources and return themon_warmed_up- called every time a new Agent is spawned. Here you receive the warmed up resources and set them back to your object.AgentLauncher.launch()handles the warmup process under the hood.Example:
Summary by CodeRabbit
New Features
Refactor
Tests / Docs
✏️ Tip: You can customize this high-level summary in your review settings.