Add more metrics #143
base: main
Conversation
Walkthrough

Adds OpenTelemetry-based observability: a metrics module with a reusable Timer and many histograms/counters; instruments LLM/STT/TTS/turn-detection entry points to record latency and errors; refactors public plugin LLM/STT methods to internal hooks; adds an observability stack, dashboards, init scripts, and Timer unit tests.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    actor Client
    participant Core as Core API
    participant Timer as Timer
    participant Impl as Implementation (plugin/_hook)
    participant OT as OpenTelemetry
    Client->>Core: call public API (e.g. simple_response / process_audio / send)
    Core->>Timer: start(metric, base_attrs)
    Timer->>Impl: invoke underlying _simple_response / _process_audio / detect_turn
    alt success
        Impl-->>Timer: return result
        Timer->>OT: record histogram with merged attributes
        Core-->>Client: return result
    else error
        Impl-->>Timer: raise error
        Timer->>OT: increment error counter (provider, error_type)
        Timer-->>Core: re-raise
        Core-->>Client: propagate exception
    end
```
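The diagram corresponds to a thin wrapper in code: the public method owns a Timer, and the provider-specific work lives in an underscore hook. A minimal sketch of that shape (the `llm_latency_ms` histogram name and the wrapper placement are illustrative assumptions; the real metrics and wrappers live in `agents-core/vision_agents/core/observability/metrics.py` and the core base classes):

```python
# Illustrative sketch of the public-API -> Timer -> _hook flow shown above.
# `llm_latency_ms` is an assumed placeholder metric; Timer's context-manager
# usage with an `attributes` dict mirrors the examples elsewhere in this review.
from vision_agents.core.observability.metrics import Timer, llm_latency_ms


class BaseLLM:
    provider_name = "base"

    async def simple_response(self, text: str):
        # Public entry point: the Timer records a latency histogram sample when
        # the block exits normally, and (per the diagram) an error counter with
        # (provider, error_type) attributes when the hook raises.
        with Timer(llm_latency_ms) as timer:
            timer.attributes["provider"] = self.provider_name
            return await self._simple_response(text)

    async def _simple_response(self, text: str):
        raise NotImplementedError  # provider plugins override this hook
```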
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Pre-merge checks and finishing touches

❌ Failed checks (1 inconclusive)
✅ Passed checks (1 passed)
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
plugins/krisp/vision_agents/plugins/krisp/turn_detection.py (1)
111-117: Fix type mismatch when dispatching PCM frames
`process_pcm_turn_taking` expects a `PcmData` object (it reads `sample_rate` and `samples`), but `detect_turn` passes a raw `ndarray`. The first call to `pcm.sample_rate` will throw an `AttributeError`, so turn detection never runs. Keep the `PcmData` wrapper when resampling.

```diff
-        resampled_pcm = audio_data.resample(16_000, 1).samples
+        resampled_pcm = audio_data.resample(16_000, 1)
         try:
             loop = asyncio.get_event_loop()
             await loop.run_in_executor(
                 None, self.process_pcm_turn_taking, resampled_pcm, user_id, metadata
             )
```

plugins/fish/vision_agents/plugins/fish/stt.py (1)
63-70: Align docstring with the actual signature

The docstring still references `user_metadata` and returning a list of tuples, but the method only takes `participant` and returns `None`. Please update the Args/Returns sections to reflect reality so the Google-style doc stays accurate.
🧹 Nitpick comments (3)
plugins/ultralytics/vision_agents/plugins/ultralytics/yolo_pose_processor.py (1)
32-35: Clean up duplicate default resolution assignments.

Line 32 redefines `DEFAULT_WIDTH`/`DEFAULT_HEIGHT` twice, leaving both 640x480 and 1920x1080 literals in place. Please keep only the intended defaults so the module state is unambiguous.

plugins/wizper/vision_agents/plugins/wizper/stt.py (1)
96-96: TODO indicates known service limitation.

Line 96 documents a bug in the fal-ai/wizper service where only the "transcribe" task works. Consider tracking this with an issue if not already done, as the `task` and `target_language` constructor parameters suggest broader intended functionality.

Would you like me to create an issue to track this limitation and monitor for when the fal-ai/wizper service fixes the bug?
observability/grafana/init-home-dashboard.sh (1)
26-30: Hardcoded credentials acceptable for local development.

The use of `admin:admin` credentials is appropriate for the local Docker Compose development setup described in the PR. For production deployments, these credentials should be externalized via environment variables. Based on learnings.

To make this more flexible, consider this enhancement:

```diff
+GRAFANA_USER="${GRAFANA_USER:-admin}"
+GRAFANA_PASSWORD="${GRAFANA_PASSWORD:-admin}"
+
 echo ""
 echo "Setting admin user home dashboard..."
 curl -X PUT \
-  -u "admin:admin" \
+  -u "${GRAFANA_USER}:${GRAFANA_PASSWORD}" \
   -H "Content-Type: application/json" \
```
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (27)
- `agents-core/tests/test_timer.py` (1 hunks)
- `agents-core/vision_agents/core/llm/llm.py` (9 hunks)
- `agents-core/vision_agents/core/observability/metrics.py` (2 hunks)
- `agents-core/vision_agents/core/stt/stt.py` (5 hunks)
- `agents-core/vision_agents/core/tts/tts.py` (1 hunks)
- `agents-core/vision_agents/core/turn_detection/turn_detection.py` (4 hunks)
- `examples/01_simple_agent_example/simple_agent_example.py` (5 hunks)
- `observability/.gitignore` (1 hunks)
- `observability/README.md` (1 hunks)
- `observability/grafana/dashboards/vision-agents.json` (1 hunks)
- `observability/grafana/init-home-dashboard.sh` (1 hunks)
- `observability/grafana/provisioning/dashboards/default.yml` (1 hunks)
- `observability/grafana/provisioning/datasources/prometheus.yml` (1 hunks)
- `observability/prometheus/prometheus.yml` (1 hunks)
- `plugins/anthropic/vision_agents/plugins/anthropic/anthropic_llm.py` (15 hunks)
- `plugins/aws/vision_agents/plugins/aws/aws_llm.py` (18 hunks)
- `plugins/deepgram/vision_agents/plugins/deepgram/deepgram_stt.py` (7 hunks)
- `plugins/fish/vision_agents/plugins/fish/stt.py` (2 hunks)
- `plugins/gemini/vision_agents/plugins/gemini/gemini_llm.py` (12 hunks)
- `plugins/krisp/vision_agents/plugins/krisp/turn_detection.py` (1 hunks)
- `plugins/openai/vision_agents/plugins/openai/openai_llm.py` (2 hunks)
- `plugins/openrouter/vision_agents/plugins/openrouter/openrouter_llm.py` (4 hunks)
- `plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py` (10 hunks)
- `plugins/ultralytics/vision_agents/plugins/ultralytics/yolo_pose_processor.py` (2 hunks)
- `plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py` (16 hunks)
- `plugins/wizper/vision_agents/plugins/wizper/stt.py` (1 hunks)
- `plugins/xai/vision_agents/plugins/xai/llm.py` (6 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (.cursor/rules/python.mdc)
**/*.py: Do not modify sys.path in Python code
Docstrings must follow the Google style guide
Files:
- `plugins/fish/vision_agents/plugins/fish/stt.py`
- `plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py`
- `plugins/wizper/vision_agents/plugins/wizper/stt.py`
- `plugins/openrouter/vision_agents/plugins/openrouter/openrouter_llm.py`
- `agents-core/vision_agents/core/tts/tts.py`
- `plugins/xai/vision_agents/plugins/xai/llm.py`
- `plugins/ultralytics/vision_agents/plugins/ultralytics/yolo_pose_processor.py`
- `plugins/krisp/vision_agents/plugins/krisp/turn_detection.py`
- `agents-core/tests/test_timer.py`
- `agents-core/vision_agents/core/turn_detection/turn_detection.py`
- `examples/01_simple_agent_example/simple_agent_example.py`
- `agents-core/vision_agents/core/stt/stt.py`
- `plugins/gemini/vision_agents/plugins/gemini/gemini_llm.py`
- `plugins/anthropic/vision_agents/plugins/anthropic/anthropic_llm.py`
- `plugins/openai/vision_agents/plugins/openai/openai_llm.py`
- `plugins/deepgram/vision_agents/plugins/deepgram/deepgram_stt.py`
- `agents-core/vision_agents/core/observability/metrics.py`
- `agents-core/vision_agents/core/llm/llm.py`
- `plugins/aws/vision_agents/plugins/aws/aws_llm.py`
- `plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py`
🧠 Learnings (1)
📚 Learning: 2025-10-13T22:00:34.300Z
Learnt from: dangusev
Repo: GetStream/Vision-Agents PR: 98
File: plugins/deepgram/vision_agents/plugins/deepgram/stt.py:135-150
Timestamp: 2025-10-13T22:00:34.300Z
Learning: In the Deepgram STT plugin (plugins/deepgram/vision_agents/plugins/deepgram/stt.py), the `started()` method is designed to wait for the connection attempt to complete, not to guarantee a successful connection. It's acceptable for the connection attempt to fail, and downstream code handles the case where `self.dg_connection` is `None`. The `_connected_once` event is set in the `finally` block intentionally to signal attempt completion.
Applied to files:
plugins/deepgram/vision_agents/plugins/deepgram/deepgram_stt.py
🧬 Code graph analysis (17)
plugins/fish/vision_agents/plugins/fish/stt.py (3)
- agents-core/vision_agents/core/stt/stt.py (2): `_process_audio` (151-166), `_emit_transcript_event` (41-63)
- plugins/deepgram/vision_agents/plugins/deepgram/deepgram_stt.py (1): `_process_audio` (78-114)
- plugins/wizper/vision_agents/plugins/wizper/stt.py (1): `_process_audio` (60-128)

plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (3)
- agents-core/vision_agents/core/observability/metrics.py (1): `Timer` (62-199)
- agents-core/vision_agents/core/turn_detection/turn_detection.py (1): `detect_turn` (72-84)
- plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (3): `detect_turn` (182-193), `predict_speech` (518-546), `_predict_turn_completed` (405-456)

plugins/wizper/vision_agents/plugins/wizper/stt.py (3)
- agents-core/vision_agents/core/stt/stt.py (1): `_process_audio` (151-166)
- plugins/deepgram/vision_agents/plugins/deepgram/deepgram_stt.py (1): `_process_audio` (78-114)
- plugins/fish/vision_agents/plugins/fish/stt.py (1): `_process_audio` (52-138)

plugins/xai/vision_agents/plugins/xai/llm.py (4)
- agents-core/vision_agents/core/llm/events.py (2): `LLMResponseChunkEvent` (90-105), `LLMResponseCompletedEvent` (109-115)
- agents-core/vision_agents/core/llm/llm.py (2): `_simple_response` (88-98), `LLMResponseEvent` (35-39)
- plugins/openai/vision_agents/plugins/openai/openai_llm.py (1): `_simple_response` (81-109)
- plugins/xai/vision_agents/plugins/xai/events.py (1): `XAIChunkEvent` (7-10)

plugins/ultralytics/vision_agents/plugins/ultralytics/yolo_pose_processor.py (1)
- agents-core/vision_agents/core/observability/metrics.py (1): `Timer` (62-199)

plugins/krisp/vision_agents/plugins/krisp/turn_detection.py (3)
- agents-core/vision_agents/core/turn_detection/turn_detection.py (1): `detect_turn` (72-84)
- plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (1): `detect_turn` (158-170)
- plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (1): `detect_turn` (182-193)

agents-core/tests/test_timer.py (2)
- agents-core/vision_agents/core/observability/metrics.py (4): `Timer` (62-199), `elapsed_ms` (143-144), `stop` (125-141), `_restart` (146-150)
- agents-core/vision_agents/core/stt/stt.py (1): `process_audio` (116-148)

agents-core/vision_agents/core/turn_detection/turn_detection.py (4)
- agents-core/vision_agents/core/observability/metrics.py (1): `Timer` (62-199)
- plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (1): `detect_turn` (158-170)
- plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (1): `detect_turn` (182-193)
- plugins/krisp/vision_agents/plugins/krisp/turn_detection.py (1): `detect_turn` (78-119)

examples/01_simple_agent_example/simple_agent_example.py (1)
- plugins/krisp/vision_agents/plugins/krisp/turn_detection.py (1): `TurnDetection` (34-202)

agents-core/vision_agents/core/stt/stt.py (4)
- agents-core/vision_agents/core/observability/metrics.py (1): `Timer` (62-199)
- agents-core/vision_agents/core/tts/tts.py (1): `send` (238-343)
- agents-core/vision_agents/core/turn_detection/turn_detection.py (1): `process_audio` (44-69)
- plugins/deepgram/vision_agents/plugins/deepgram/deepgram_stt.py (1): `_process_audio` (78-114)

plugins/gemini/vision_agents/plugins/gemini/gemini_llm.py (6)
- agents-core/vision_agents/core/llm/events.py (2): `LLMResponseCompletedEvent` (109-115), `LLMResponseChunkEvent` (90-105)
- agents-core/vision_agents/core/llm/llm.py (5): `_simple_response` (88-98), `LLMResponseEvent` (35-39), `_dedup_and_execute` (379-413), `_convert_tools_to_provider_format` (139-153), `_extract_tool_calls_from_response` (155-169)
- agents-core/vision_agents/core/agents/conversation.py (1): `Message` (14-24)
- plugins/gemini/vision_agents/plugins/gemini/events.py (1): `GeminiResponseEvent` (35-38)
- plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (2): `_convert_tools_to_provider_format` (509-532), `_extract_tool_calls_from_response` (534-569)
- agents-core/vision_agents/core/llm/llm_types.py (2): `ToolSchema` (64-67), `NormalizedToolCallItem` (107-111)

plugins/anthropic/vision_agents/plugins/anthropic/anthropic_llm.py (3)
- agents-core/vision_agents/core/llm/events.py (2): `LLMResponseChunkEvent` (90-105), `LLMResponseCompletedEvent` (109-115)
- agents-core/vision_agents/core/llm/llm.py (7): `_simple_response` (88-98), `_dedup_and_execute` (379-413), `_sanitize_tool_output` (415-426), `_convert_tools_to_provider_format` (139-153), `_extract_tool_calls_from_response` (155-169), `LLMResponseEvent` (35-39), `_create_tool_result_message` (187-202)
- agents-core/vision_agents/core/llm/llm_types.py (2): `NormalizedToolCallItem` (107-111), `ToolSchema` (64-67)

plugins/openai/vision_agents/plugins/openai/openai_llm.py (5)
- agents-core/vision_agents/core/llm/llm.py (1): `_simple_response` (88-98)
- plugins/anthropic/vision_agents/plugins/anthropic/anthropic_llm.py (1): `_simple_response` (75-95)
- plugins/aws/vision_agents/plugins/aws/aws_llm.py (1): `_simple_response` (81-101)
- plugins/gemini/vision_agents/plugins/gemini/gemini_llm.py (1): `_simple_response` (69-86)
- plugins/xai/vision_agents/plugins/xai/llm.py (1): `_simple_response` (71-96)

plugins/deepgram/vision_agents/plugins/deepgram/deepgram_stt.py (1)
- agents-core/vision_agents/core/stt/stt.py (1): `_process_audio` (151-166)

agents-core/vision_agents/core/llm/llm.py (5)
- agents-core/vision_agents/core/observability/metrics.py (1): `Timer` (62-199)
- plugins/openai/vision_agents/plugins/openai/openai_llm.py (4): `_simple_response` (81-109), `_convert_tools_to_provider_format` (372-404), `_extract_tool_calls_from_response` (406-433), `_create_tool_result_message` (435-465)
- agents-core/vision_agents/core/llm/llm_types.py (2): `ToolSchema` (64-67), `NormalizedToolCallItem` (107-111)
- agents-core/vision_agents/core/llm/function_registry.py (1): `get_tool_schemas` (108-114)
- agents-core/vision_agents/core/llm/events.py (2): `ToolStartEvent` (119-125), `ToolEndEvent` (129-138)

plugins/aws/vision_agents/plugins/aws/aws_llm.py (5)
- plugins/aws/tests/test_aws.py (1): `llm` (37-44)
- agents-core/vision_agents/core/llm/events.py (2): `LLMResponseChunkEvent` (90-105), `LLMResponseCompletedEvent` (109-115)
- agents-core/vision_agents/core/llm/llm.py (4): `_simple_response` (88-98), `LLMResponseEvent` (35-39), `_dedup_and_execute` (379-413), `_sanitize_tool_output` (415-426)
- agents-core/vision_agents/core/llm/llm_types.py (1): `NormalizedToolCallItem` (107-111)
- plugins/aws/vision_agents/plugins/aws/events.py (1): `AWSStreamEvent` (7-10)

plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (2)
- agents-core/vision_agents/core/observability/metrics.py (1): `Timer` (62-199)
- plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (4): `_prepare_silero_vad` (150-156), `SileroVAD` (419-486), `detect_turn` (158-170), `predict_speech` (454-455)
🪛 Gitleaks (8.28.0)
observability/grafana/init-home-dashboard.sh
[high] 26-27: Discovered a potential basic authorization token provided in a curl command, which could compromise the curl accessed resource.
(curl-auth-user)
🪛 markdownlint-cli2 (0.18.1)
observability/README.md
19-19: Bare URL used
(MD034, no-bare-urls)
20-20: Bare URL used
(MD034, no-bare-urls)
21-21: Bare URL used
(MD034, no-bare-urls)
38-38: Bare URL used
(MD034, no-bare-urls)
51-51: Bare URL used
(MD034, no-bare-urls)
59-59: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
69-69: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
137-137: Bare URL used
(MD034, no-bare-urls)
138-138: Bare URL used
(MD034, no-bare-urls)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: unit / Ruff & mypy
- GitHub Check: unit / Test "not integration"
🔇 Additional comments (15)
plugins/deepgram/vision_agents/plugins/deepgram/deepgram_stt.py (1)
78-115: LGTM! Method rename aligns with observability pattern.

The rename from `process_audio` to `_process_audio` is consistent with the broader refactoring to support metrics collection at the base class level. The docstring follows Google style, and the logic correctly handles connection readiness, resampling, and audio transmission.

plugins/anthropic/vision_agents/plugins/anthropic/anthropic_llm.py (4)
68-68: Provider name attribute added correctly.

The addition of `provider_name = "anthropic"` enables consistent provider identification in metrics and events across the observability stack.
75-95: LGTM! Method visibility change aligns with base class pattern.

The rename to `_simple_response` is consistent with the standardized approach where the base class provides a public wrapper with metrics collection, delegating to this private implementation method.
132-217: Tool calling logic is well-structured.

The multi-hop tool execution with deduplication (via `_dedup_and_execute`) and bounded iteration (`MAX_ROUNDS=3`) provides robust handling of Claude's tool use pattern. The formatting improvements enhance readability without altering behavior.
218-349: Streaming tool execution correctly implemented.

The streaming path maintains consistency with the non-streaming approach, properly accumulating tool calls across chunks and executing them in multi-hop fashion. Event emissions correctly identify the provider.

plugins/xai/vision_agents/plugins/xai/llm.py (2)
62-62: LGTM! Provider standardization implemented correctly.

The addition of `provider_name = "xai"` and the rename to `_simple_response` align with the cross-provider standardization pattern. The unused `participant` parameter in the method signature matches the base class interface requirements.

Also applies to: 71-96
148-152: Event emissions correctly formatted.

The multi-line event emission formatting improves readability, and all events consistently include `plugin_name="xai"` for proper observability tracking.

Also applies to: 184-193, 199-205
observability/grafana/provisioning/dashboards/default.yml (2)
8-8: Verify dashboard deletion setting aligns with PR guidance.

Line 8 sets `disableDeletion: false`, allowing dashboards to be deleted via the UI. However, the PR description states "Do not modify Grafana directly; changes should be requested via AI." Consider setting this to `true` to prevent accidental deletions and enforce the file-based workflow.

If the intention is to allow UI modifications only for experimentation (not persistence), apply this diff:

```diff
-  disableDeletion: false
+  disableDeletion: true
```
1-13: Provisioning configuration is well-structured.

The dashboard provider configuration correctly points to the standard Grafana dashboards directory with folder structure preservation enabled. The 10-second update interval provides quick feedback during development.
plugins/wizper/vision_agents/plugins/wizper/stt.py (1)
60-129: LGTM! Implementation follows STT plugin pattern.

The rename to `_process_audio` is consistent with the base class design. The implementation properly handles temporary file lifecycle with cleanup in a `finally` block, and error handling emits appropriate events.

examples/01_simple_agent_example/simple_agent_example.py (2)
60-94: Observability stack integration is well-implemented.

The telemetry setup correctly configures OpenTelemetry with Prometheus metrics exposition on port 9464 and OTLP tracing to localhost:4317. The resource identification and graceful shutdown handling via `atexit` are appropriate.
6-6: Verification complete: Migration from `vogent.TurnDetection` to `smart_turn.TurnDetection` is successful.

The verification confirms no references to `vogent.TurnDetection` remain in the codebase. The file has been correctly updated to import and use `smart_turn.TurnDetection()` on line 25, with the corresponding import statement on line 6. This aligns with the intended module organization where VogentTurnDetection remains available as a separate plugin while SmartTurnDetection becomes the default in examples.

agents-core/vision_agents/core/tts/tts.py (1)
341-343: Improved metric identification with fully qualified class paths.

The change from `self.__class__.__name__` to the fully qualified path (module.qualname) provides better metric disambiguation when multiple TTS providers might share simple class names. This aligns with the observability enhancements throughout the codebase.
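For illustration, the usual Python idiom for such a fully qualified name looks like this (a sketch; the exact string format used for the metric attribute is defined in the PR, not here):

```python
# Inside a TTS method: build "package.module.ClassName" instead of the bare
# class name, so e.g. two providers both named `TTS` stay distinguishable.
qualified_name = f"{self.__class__.__module__}.{self.__class__.__qualname__}"
```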
observability/grafana/init-home-dashboard.sh (1)

1-33: Script correctly configures home dashboard.

The initialization script properly waits for Grafana health, then sets the Vision Agents dashboard as the default home view for both the organization and the admin user. The dashboard UID reference aligns with the provisioned dashboard configuration.
agents-core/tests/test_timer.py (1)
18-440: Comprehensive Timer coverage

Love how these tests hammer the context-manager, decorator, inheritance, and async paths—this suite should catch regressions fast.
Review threads:

- plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py
- plugins/ultralytics/vision_agents/plugins/ultralytics/yolo_pose_processor.py
Actionable comments posted: 1
🧹 Nitpick comments (6)
plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (2)
131-146: Consider warning callers when ignored parameters are used.

The method accepts `system_prompt`, `temperature`, and `max_tokens` but silently ignores them. While this appears to be for interface compliance (matching the OpenAI Realtime pattern), callers who pass these parameters may expect them to have an effect.

Consider logging a warning when these parameters are provided:

```diff
 async def _simple_response(
     self,
     text: str,
     system_prompt: Optional[str] = None,
     temperature: float = 0.7,
     max_tokens: Optional[int] = None,
     **kwargs: Any,
 ) -> str:
     """
     Internal simple response implementation required by LLM base class.

     Note: Gemini Realtime is event-driven and doesn't return responses directly.
     This implementation sends the text and returns a placeholder.
     """
+    if system_prompt or max_tokens or temperature != 0.7:
+        self.logger.warning(
+            "Parameters system_prompt, temperature, and max_tokens are not supported "
+            "by Gemini Realtime and will be ignored"
+        )
     await self.send_realtime_input(text=text)
     return ""  # Realtime API doesn't return text synchronously
```
148-164: Consider adding parameter warnings and clarifying the generator pattern.

Similar to `_simple_response`, this method silently ignores `system_prompt`, `temperature`, and `max_tokens`. Additionally, the `return` followed by `yield` pattern (lines 163-164) creates an empty generator, which is technically correct but may be confusing.

Consider adding a warning for ignored parameters and a comment explaining the pattern:

```diff
 async def _simple_response_stream(
     self,
     text: str,
     system_prompt: Optional[str] = None,
     temperature: float = 0.7,
     max_tokens: Optional[int] = None,
     **kwargs: Any,
 ):
     """
     Internal simple response stream implementation required by LLM base class.

     Note: Gemini Realtime is event-driven and doesn't stream responses in this manner.
     This implementation sends the text but yields nothing.
     """
+    if system_prompt or max_tokens or temperature != 0.7:
+        self.logger.warning(
+            "Parameters system_prompt, temperature, and max_tokens are not supported "
+            "by Gemini Realtime and will be ignored"
+        )
     await self.send_realtime_input(text=text)
+    # Return immediately to create an empty generator (yield makes this a generator function)
     return
     yield  # Make this a generator
```

plugins/openai/vision_agents/plugins/openai/openai_realtime.py (1)
138-154: Unreachable yield statement after return.

The `yield` on line 154 is unreachable because it follows the `return` statement on line 153. While this still makes the function a generator (which is the intent), the unreachable code is a code smell.

Apply this diff to make the intent clearer:

```diff
-    await self.rtc.send_text(text)
-    return
-    yield  # Make this a generator
+    await self.rtc.send_text(text)
+    # Make this a generator that yields nothing
+    if False:
+        yield
```

Or alternatively:

```diff
-    await self.rtc.send_text(text)
-    return
-    yield  # Make this a generator
+    await self.rtc.send_text(text)
+    yield  # Generator that yields nothing
```

Note: This pattern is also present in the Gemini Realtime implementation (`plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py`, lines 147-163).

docker-compose.yml (3)
17-17: Pin Grafana and Prometheus to specific versions for reproducibility.

Using `:latest` tags can lead to non-deterministic behavior across environments and time, potentially breaking observability setups unexpectedly. Pin these images to specific versions for consistent deployments.

```diff
-    image: prom/prometheus:latest
+    image: prom/prometheus:v2.52.0
```

```diff
-    image: grafana/grafana:latest
+    image: grafana/grafana:11.2.0
```

Check Prometheus releases and Grafana releases for appropriate versions to match your requirements.
Also applies to: 35-35
50-51: Add health checks to prevent race conditions during startup.

The `depends_on` declarations ensure service startup order but do not wait for readiness. The `grafana-init` service may attempt to initialize dashboards before Grafana is fully ready, and Grafana may start before Prometheus is accepting connections.

Consider adding health checks to ensure dependent services are truly ready before proceeding:

```diff
 prometheus:
   image: prom/prometheus:v2.52.0
   ...
+  healthcheck:
+    test: ["CMD", "wget", "--spider", "-q", "http://localhost:9090/-/healthy"]
+    interval: 5s
+    timeout: 3s
+    retries: 5

 grafana:
   ...
   depends_on:
-    - prometheus
+    prometheus:
+      condition: service_healthy
+  healthcheck:
+    test: ["CMD", "wget", "--spider", "-q", "http://localhost:3000/api/health"]
+    interval: 5s
+    timeout: 3s
+    retries: 5

 grafana-init:
   ...
   depends_on:
-    - grafana
+    grafana:
+      condition: service_healthy
```
56-66: Grafana-init script robustness.The
grafana-initservice uses a simple curl image to run a shell script. Consider adding error handling and explicit success/failure indicators in the script itself to make debugging easier if initialization fails.Additionally, ensure the
init-home-dashboard.shscript includes proper error handling and exits with appropriate status codes.Please review the
observability/grafana/init-home-dashboard.shscript and confirm it:
- Exits with a non-zero status on failure
- Has informative logging for debugging
- Includes reasonable retry logic or timeout handling for API calls
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (4)
- `docker-compose.yml` (1 hunks)
- `plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py` (1 hunks)
- `plugins/openai/vision_agents/plugins/openai/openai_realtime.py` (1 hunks)
- `tests/test_function_calling.py` (10 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit inference engine (.cursor/rules/python.mdc)
**/*.py: Do not modify sys.path in Python code
Docstrings must follow the Google style guide
Files:
- `plugins/openai/vision_agents/plugins/openai/openai_realtime.py`
- `tests/test_function_calling.py`
- `plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py`
tests/**/*.py
📄 CodeRabbit inference engine (.cursor/rules/python.mdc)
tests/**/*.py: Never use mocking utilities (e.g., unittest.mock, pytest-mock) in test files
Write tests using pytest (avoid unittest.TestCase or other frameworks)
Mark integration tests with @pytest.mark.integration
Do not use @pytest.mark.asyncio; async support is automatic
Files:
tests/test_function_calling.py
🧬 Code graph analysis (3)
plugins/openai/vision_agents/plugins/openai/openai_realtime.py (3)
- plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (2): `_simple_response` (131-146), `_simple_response_stream` (148-164)
- plugins/openai/vision_agents/plugins/openai/openai_llm.py (1): `_simple_response` (81-109)
- plugins/openai/vision_agents/plugins/openai/rtc_manager.py (1): `send_text` (419-440)

tests/test_function_calling.py (3)
- agents-core/vision_agents/core/llm/llm.py (1): `LLM` (46-426)
- plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (2): `_simple_response` (131-146), `_simple_response_stream` (148-164)
- plugins/openai/vision_agents/plugins/openai/openai_realtime.py (2): `_simple_response` (121-136), `_simple_response_stream` (138-154)

plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (2)
- plugins/openai/vision_agents/plugins/openai/openai_realtime.py (2): `_simple_response` (121-136), `_simple_response_stream` (138-154)
- plugins/gemini/vision_agents/plugins/gemini/gemini_llm.py (1): `_simple_response` (69-86)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: unit / Test "not integration"
🔇 Additional comments (4)
plugins/openai/vision_agents/plugins/openai/openai_realtime.py (1)
121-136: Implementation correctly aligns with base LLM API.

The method properly delegates to `rtc.send_text` and returns a placeholder empty string, consistent with the event-driven nature of the Realtime API. The unused parameters (`system_prompt`, `temperature`, `max_tokens`) are accepted for interface compatibility.

tests/test_function_calling.py (2)
16-41: Well-structured test LLM implementation.

The TestLLM class provides appropriate mock implementations of the required abstract methods. The `_simple_response` returns a mock string, and `_simple_response_stream` properly yields (not unreachable like in the production code).
162-162: Correctly replaces abstract class instantiation with test implementation.

Replacing `LLM()` with `TestLLM()` throughout the tests properly addresses the issue of instantiating an abstract base class. The test implementation provides the required concrete methods.

Also applies to: 176-176, 392-392, 416-416, 435-435, 481-481, 511-511, 544-544
docker-compose.yml (1)
22-22: All referenced configuration files are present and correctly mounted.

Verification confirms that all observability configuration files and scripts referenced in the docker-compose.yml file exist in the repository:

- observability/prometheus/prometheus.yml ✓
- observability/grafana/init-home-dashboard.sh ✓
- observability/grafana/provisioning/ ✓
- observability/grafana/dashboards/ ✓

No issues found.
…m.py Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Actionable comments posted: 2
🧹 Nitpick comments (1)
plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (1)
272-278: Replace magic numbers with defined constants

The calculation uses literals `512` and `16000`, but `CHUNK` and `RATE` constants are defined at lines 41-42 and used elsewhere in the file (e.g., lines 238, 242, 243).

Apply this diff:

```diff
 trailing_silence_ms = (
     self._silence.trailing_silence_chunks
-    * 512
-    / 16000
+    * CHUNK
+    / RATE
     * 1000
     * 5  # DTX correction
 )
```
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (5)
- `plugins/anthropic/vision_agents/plugins/anthropic/anthropic_llm.py` (15 hunks)
- `plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py` (2 hunks)
- `plugins/openai/vision_agents/plugins/openai/openai_realtime.py` (1 hunks)
- `plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py` (10 hunks)
- `tests/test_function_calling.py` (7 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- plugins/anthropic/vision_agents/plugins/anthropic/anthropic_llm.py
- plugins/openai/vision_agents/plugins/openai/openai_realtime.py
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit inference engine (.cursor/rules/python.mdc)
**/*.py: Do not modify sys.path in Python code
Docstrings must follow the Google style guide
Files:
- `plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py`
- `plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py`
- `tests/test_function_calling.py`
tests/**/*.py
📄 CodeRabbit inference engine (.cursor/rules/python.mdc)
tests/**/*.py: Never use mocking utilities (e.g., unittest.mock, pytest-mock) in test files
Write tests using pytest (avoid unittest.TestCase or other frameworks)
Mark integration tests with @pytest.mark.integration
Do not use @pytest.mark.asyncio; async support is automatic
Files:
tests/test_function_calling.py
🧬 Code graph analysis (3)
plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (4)
- plugins/openai/vision_agents/plugins/openai/openai_realtime.py (2): `_simple_response` (121-137), `simple_response` (100-119)
- plugins/gemini/vision_agents/plugins/gemini/gemini_llm.py (1): `_simple_response` (69-86)
- agents-core/vision_agents/core/llm/llm.py (1): `LLMResponseEvent` (35-39)
- agents-core/vision_agents/core/agents/agents.py (1): `simple_response` (266-279)

plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (3)
- agents-core/vision_agents/core/observability/metrics.py (1): `Timer` (62-199)
- agents-core/vision_agents/core/turn_detection/turn_detection.py (1): `detect_turn` (72-84)
- plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (4): `detect_turn` (182-193), `_process_audio_packet` (216-342), `predict_speech` (518-546), `_predict_turn_completed` (405-456)

tests/test_function_calling.py (4)
- agents-core/vision_agents/core/llm/llm.py (6): `LLM` (46-426), `LLMResponseEvent` (35-39), `register_function` (218-231), `get_available_functions` (233-235), `_dedup_and_execute` (379-413), `_run_one_tool` (280-352)
- agents-core/vision_agents/core/llm/function_registry.py (3): `FunctionRegistry` (35-237), `get_tool_schemas` (108-114), `get_callable` (144-160)
- agents-core/vision_agents/core/llm/events.py (2): `ToolStartEvent` (119-125), `ToolEndEvent` (129-138)
- agents-core/vision_agents/core/events/manager.py (1): `wait` (470-484)
🔇 Additional comments (7)
tests/test_function_calling.py (1)
16-30: LGTM: Clean test double implementation.

The TestLLM class provides a concrete implementation of the now-abstract LLM base class, which is necessary after `_simple_response` became abstract. This is a legitimate test double that doesn't rely on mocking frameworks.

plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (2)
131-147: LGTM! Internal method aligns with Realtime API pattern.

The `_simple_response` implementation correctly follows the pattern established in openai_realtime.py. Returning an empty `LLMResponseEvent` is appropriate for event-driven APIs where actual responses arrive through the event system rather than return values.

The inline import of `LLMResponseEvent` mirrors the OpenAI implementation and likely avoids circular dependencies.
362-364: LGTM! Formatting improvement.

The multi-line formatting improves readability without changing behavior.
plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (4)
16-20: LGTM: Observability imports

The metrics imports are clean and consistent with the broader instrumentation patterns across the codebase.
177-208: Excellent lifecycle management with proper cleanup

The nested try/finally structure elegantly resolves the shutdown race condition from the previous review. The outer finally block (lines 205-208) ensures both `_processing_active` and `_shutdown_event` are cleared even when cancellation occurs mid-processing, preventing hung waits and enabling clean restart cycles.
252-256: LGTM: VAD latency instrumentation

The Timer wrapping captures essential context (samples, implementation) and follows the established observability pattern seen in vogent.
294-303: LGTM: Comprehensive turn-end detection metrics

The Timer captures rich context (audio duration, samples, trailing silence, prediction probability, and turn-end decision), enabling effective observability for turn detection behavior.
| """ | ||
|
|
||
| import pytest | ||
| from unittest.mock import Mock, patch |
Remove unittest.mock usage per coding guidelines.
This file violates the coding guideline: "Never use mocking utilities (e.g., unittest.mock, pytest-mock) in test files" for tests/**/*.py. The Mock and patch imports are used extensively in TestOpenAIFunctionCalling, TestClaudeFunctionCalling, and TestGeminiFunctionCalling classes.
Consider rewriting these tests without mocking frameworks—use real instances with test configurations, in-memory fakes, or integration test markers if external dependencies are required.
As per coding guidelines
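If these tests move off `unittest.mock`, a concrete fake in the spirit of this PR's TestLLM is one option. A minimal sketch (class and return details are illustrative assumptions, not the PR's actual TestLLM; the real base-class hook may return an `LLMResponseEvent`):

```python
# Illustrative, mock-free test double: a concrete subclass that satisfies the
# abstract LLM hooks without unittest.mock. Names and return types are assumed.
from vision_agents.core.llm.llm import LLM


class FakeLLM(LLM):
    provider_name = "fake"

    async def _simple_response(self, text: str, **kwargs):
        # Deterministic canned output instead of Mock()
        return f"echo: {text}"

    async def _simple_response_stream(self, text: str, **kwargs):
        # A real async generator, so streaming code paths execute for real
        yield f"echo: {text}"
```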
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (1)
182-193: Update call sites in vogent plugin to use `detect_turn` method.

The refactoring is incomplete. Tests and examples within the vogent plugin still call the old `process_audio` method, which will cause failures:

- `plugins/vogent/tests/test_vogent_td.py` lines 45, 46
- `plugins/vogent/example/basic_usage.py` lines 90, 166
🧹 Nitpick comments (4)
plugins/aws/vision_agents/plugins/aws/aws_llm.py (1)
164-168: Consider extracting duplicated constants.

The `max_concurrency=8` and `timeout_s=30` parameters appear identically in both tool execution loops (lines 164-168 and 408-412).

Apply this diff to extract the constants:

```diff
 def __init__(
     self,
     model: str,
     region_name: str = "us-east-1",
     aws_access_key_id: Optional[str] = None,
     aws_secret_access_key: Optional[str] = None,
     aws_session_token: Optional[str] = None,
 ):
     """
     Initialize the BedrockLLM class.

     Args:
         model: The Bedrock model ID (e.g., "anthropic.claude-3-5-sonnet-20241022-v2:0")
         region_name: AWS region name (default: "us-east-1")
         aws_access_key_id: Optional AWS access key ID
         aws_secret_access_key: Optional AWS secret access key
         aws_session_token: Optional AWS session token
     """
     super().__init__()
     self.events.register_events_from_module(events)
     self.model = model
     self._pending_tool_uses_by_index: Dict[int, Dict[str, Any]] = {}
+    self._tool_max_concurrency = 8
+    self._tool_timeout_s = 30
```

Then reference these constants in both locations:

```diff
 while current_calls and rounds < MAX_ROUNDS:
     # Execute calls concurrently with dedup
     triples, seen = await self._dedup_and_execute(
         cast(List[Dict[str, Any]], current_calls),
         seen=seen,
-        max_concurrency=8,
-        timeout_s=30,
+        max_concurrency=self._tool_max_concurrency,
+        timeout_s=self._tool_timeout_s,
     )
```

And similarly at lines 408-412.
Also applies to: 408-412
plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (3)
149-153: Simplify model initialization to match smart_turn pattern.

The lambda wrappers around model constructors add unnecessary complexity. The `asyncio.to_thread` function accepts any callable, so you can pass constructors directly as shown in the smart_turn plugin.

Apply this diff to simplify `_prepare_silero_vad`:

```diff
-        self.vad = await asyncio.to_thread(  # type: ignore[func-returns-value]
-            lambda: SileroVAD(  # type: ignore[arg-type]
-                path, reset_interval_seconds=self.vad_reset_interval_seconds
-            )
+        self.vad = await asyncio.to_thread(
+            SileroVAD, path, reset_interval_seconds=self.vad_reset_interval_seconds
         )
```

Apply similar changes to `_prepare_whisper`:

```diff
-        self.whisper = await asyncio.to_thread(  # type: ignore[func-returns-value]
-            lambda: WhisperModel(
-                self.whisper_model_size, device="cpu", compute_type="int8"
-            )
+        self.whisper = await asyncio.to_thread(
+            WhisperModel, self.whisper_model_size, device="cpu", compute_type="int8"
         )
```

And to `_prepare_vogent`:

```diff
-        self.vogent = await asyncio.to_thread(  # type: ignore[func-returns-value]
-            lambda: VogentDetector(
-                compile_model=True,
-                warmup=True,
-                device=None,
-                model_name="vogent/Vogent-Turn-80M",
-                revision="main",
-            )
+        self.vogent = await asyncio.to_thread(
+            VogentDetector,
+            compile_model=True,
+            warmup=True,
+            device=None,
+            model_name="vogent/Vogent-Turn-80M",
+            revision="main",
         )
```

Note: The `type: ignore` comments may become unnecessary after this refactor. If they remain, investigate the underlying type issues rather than suppressing them.

Also applies to: 159-162, 171-179
376-403: Consider checking self.whisper before starting the timer.

The Timer currently records latency even when `self.whisper is None` and an empty string is returned immediately. This could skew your metrics if the early-exit path is taken frequently.

Apply this diff to avoid recording no-op latency:

```diff
 async def _transcribe_segment(self, pcm: PcmData) -> str:
     """
     Transcribe audio segment using faster-whisper.

     Args:
         pcm: PcmData containing audio samples

     Returns:
         Transcribed text
     """
+    if self.whisper is None:
+        return ""
+
     with Timer(vogent_whisper_latency_ms) as timer:
         # Ensure it's 16khz and f32 format
         pcm = pcm.resample(16000).to_float32()
         audio_array = pcm.samples

         timer.attributes["audio_duration_ms"] = pcm.duration_ms
         timer.attributes["samples"] = len(audio_array)

-        if self.whisper is None:
-            return ""
-
         # Run transcription in thread pool to avoid blocking
         segments, info = await asyncio.to_thread(
```
422-456: Consider checking self.vogent before starting the timer.

Similar to the transcription method, the Timer records latency even when `self.vogent is None` and `False` is returned immediately. Moving the guard clause outside the Timer context would provide more accurate metrics.

Apply this diff:

```diff
 async def _predict_turn_completed(
     self,
     pcm: PcmData,
     prev_line: str,
     curr_line: str,
 ) -> bool:
     """
     Predict whether the current turn is complete using Vogent.

     Args:
         pcm: PcmData containing audio samples
         prev_line: Previous speaker's text (for context)
         curr_line: Current speaker's text

     Returns:
         True if turn is complete, False otherwise
     """
+    if self.vogent is None:
+        return False
+
     with Timer(turn_end_detection_latency_ms) as timer:
         # Ensure it's 16khz and f32 format
         pcm = pcm.resample(16000).to_float32()

         # Truncate to 8 seconds
         audio_array = pcm.tail(8, False).samples

         timer.attributes["implementation"] = "vogent"
         timer.attributes["audio_duration_ms"] = len(audio_array) / 16000 * 1000
         timer.attributes["prev_line_length"] = len(prev_line)
         timer.attributes["curr_line_length"] = len(curr_line)

-        if self.vogent is None:
-            return False
-
         # Run vogent prediction in thread pool
```
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (3)
- `plugins/aws/vision_agents/plugins/aws/aws_llm.py` (3 hunks)
- `plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py` (10 hunks)
- `plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py` (15 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (.cursor/rules/python.mdc)
**/*.py: Do not modify sys.path in Python code
Docstrings must follow the Google style guide
Files:
- `plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py`
- `plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py`
- `plugins/aws/vision_agents/plugins/aws/aws_llm.py`
🧬 Code graph analysis (3)
plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (2)
- agents-core/vision_agents/core/observability/metrics.py (1): `Timer` (62-199)
- plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (4): `detect_turn` (182-193), `_process_audio_packet` (216-342), `predict_speech` (518-546), `_predict_turn_completed` (405-456)

plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (3)
- agents-core/vision_agents/core/observability/metrics.py (1): `Timer` (62-199)
- agents-core/vision_agents/core/utils/utils.py (1): `ensure_model` (115-157)
- plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (3): `_prepare_silero_vad` (151-157), `SileroVAD` (429-496), `detect_turn` (159-171)

plugins/aws/vision_agents/plugins/aws/aws_llm.py (2)
- plugins/gemini/vision_agents/plugins/gemini/gemini_llm.py (1): `_simple_response` (69-86)
- agents-core/vision_agents/core/llm/llm.py (2): `_simple_response` (88-98), `LLMResponseEvent` (35-39)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: unit / Ruff & mypy
- GitHub Check: unit / Test "not integration"
- GitHub Check: unit / Ruff & mypy
- GitHub Check: unit / Test "not integration"
🔇 Additional comments (6)
plugins/aws/vision_agents/plugins/aws/aws_llm.py (1)
82-87: LGTM - Method signature aligns with the refactoring pattern.

The change from public `simple_response` to private `_simple_response` with explicit return type annotation is consistent with the broader refactoring across LLM implementations. The signature correctly matches the base class abstract method.

plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (3)
19-24: LGTM! Clean metrics imports and definition.

The observability imports are well-organized, and the new `vogent_whisper_latency_ms` histogram follows proper naming conventions with clear metadata.

Also applies to: 31-36
259-264: LGTM! VAD instrumentation follows best practices.

The Timer usage correctly wraps the VAD prediction, and the attributes provide useful context for performance analysis (sample count and implementation type).
476-476: LGTM! Punctuation stripping improves model input.

Using `rstrip(".!?")` to remove terminal punctuation before passing text to Vogent is a sensible preprocessing step that should improve turn detection accuracy.

plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (2)
178-210: Lifecycle cleanup looks solid.

Resetting `_processing_active` and `_shutdown_event` in the shutdown path prevents the cancellation hang we saw earlier and keeps the detector restartable.
253-305: Telemetry instrumentation is on point.

Wrapping the VAD and turn-completion calls in `Timer` gives us per-call latency with helpful context attributes, which will drop straight into the new dashboards.
How to play with this
```shell
uv add --dev opentelemetry-exporter-prometheus prometheus-client opentelemetry-sdk opentelemetry-exporter-otlp
```

Note: Do not change Grafana directly; ask AI to change it for you (everything lives under `observability/` and has a README)!

See `turn_detection.py` for an example of metrics collection.
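For orientation, here is a minimal wiring sketch matching the setup the review describes for the example app (Prometheus exposition on port 9464, OTLP traces to localhost:4317); `examples/01_simple_agent_example/simple_agent_example.py` is the source of truth and may differ in detail:

```python
# Minimal OpenTelemetry wiring sketch (assumptions: Prometheus scrape endpoint
# on :9464, OTLP collector on localhost:4317, service name is illustrative).
import atexit

from opentelemetry import metrics, trace
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.exporter.prometheus import PrometheusMetricReader
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from prometheus_client import start_http_server

resource = Resource.create({"service.name": "vision-agents-example"})

# Metrics: expose a Prometheus scrape endpoint on :9464
start_http_server(port=9464)
meter_provider = MeterProvider(
    resource=resource, metric_readers=[PrometheusMetricReader()]
)
metrics.set_meter_provider(meter_provider)

# Traces: ship spans to a local OTLP collector on :4317
tracer_provider = TracerProvider(resource=resource)
tracer_provider.add_span_processor(
    BatchSpanProcessor(OTLPSpanExporter(endpoint="localhost:4317", insecure=True))
)
trace.set_tracer_provider(tracer_provider)

# Flush telemetry on interpreter shutdown
atexit.register(meter_provider.shutdown)
atexit.register(tracer_provider.shutdown)
```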
Summary by CodeRabbit
New Features
Improvements
Tests
Documentation