tschellenbach (Member) commented Oct 17, 2025

Implements Bedrock LLM support and Nova Sonic realtime support, plus some improvements to how the LLM and realtime classes work and are documented. Future improvements will be handled in a separate PR:

  • Can interrupt handling be improved? (INTERRUPTED with contentEnd should flush the audio buffer)
  • Function calling for LLM needs to be tested
  • Function calling for Realtime
  • Cleanup function calling codebase
  • Chat integration

Summary by CodeRabbit

  • New Features

    • AWS Bedrock plugin (LLM + realtime) added with examples, README and a new AWS stream event type.
  • Documentation

    • Added realtime/LLM/STT/TTS guides and concise plugin examples; removed legacy AI overview and several instruction guides.
  • API Changes

    • Realtime now exposes session_id and connected; LLM responses may carry an optional exception; unified public close lifecycle.
  • Tests

    • Added AWS integration tests; removed selected realtime base and event-logging tests.
  • Chores

    • Packaging, example pyprojects, .env templates, workspace and CI updates.
  • Bug Fixes

    • Suppressed noisy runtime handler logging.

coderabbitai bot commented Oct 17, 2025

Walkthrough

Core event-manager logging removed; LLMResponseEvent now accepts an optional exception; Realtime base exposes public session_id/connected and requires a public close(); a new AWS Bedrock plugin (LLM + Realtime, streaming, multi‑hop tool support) and tests/examples/docs/CI/workspace updates were added; several docs deleted or rewritten.

Changes

Cohort / File(s) Summary
Core framework
agents-core/vision_agents/core/events/manager.py, agents-core/vision_agents/core/llm/llm.py, agents-core/vision_agents/core/llm/realtime.py
Removed runtime INFO logging in event manager; added optional exception to LLMResponseEvent constructor; Realtime now exposes session_id and public connected flag, removed _is_connected/is_connected, and made close() a required public API.
AWS plugin (implementation & exports)
plugins/aws/vision_agents/plugins/aws/aws_llm.py, plugins/aws/vision_agents/plugins/aws/aws_realtime.py, plugins/aws/vision_agents/plugins/aws/events.py, plugins/aws/vision_agents/plugins/aws/__init__.py
Added BedrockLLM with converse/converse_stream (streaming, multi‑hop tool execution) and a Bedrock Realtime (bidirectional text/audio streaming, event dispatch, audio decoding); added AWSStreamEvent; re-exported LLM and Realtime.
AWS packaging, examples, tests
plugins/aws/pyproject.toml, plugins/aws/README.md, plugins/aws/example/*, plugins/aws/tests/*
New plugin packaging, README, example projects/scripts (Qwen, Nova realtime), example pyproject/.env, and integration tests for Bedrock LLM and Realtime.
Docs & instructions
docs/ai/* (multiple files)
Removed several high‑level docs (ai-overview, websocket, function-calling) and replaced/rewrote instruction pages to client-driven streaming examples (ai-llm, ai-realtime-llm, ai-stt, ai-tts, ai-plugin).
Tests & fixtures
conftest.py, tests/test_events.py, tests/test_realtime_base.py
Added golf_swing_image pytest fixture; removed test_silent_suppresses_handler_logging from tests/test_events.py; deleted tests/test_realtime_base.py.
Other realtime plugins
plugins/gemini/.../gemini_realtime.py, plugins/openai/.../openai_realtime.py
Unified teardown API by replacing internal _close_impl with public close on Realtime subclasses; added CancelledError handling in Gemini receive loop.
Workspace config
pyproject.toml
Added vision-agents-plugins-aws to tool.uv.sources and included plugins/aws in workspace members.
Examples (misc)
plugins/example/example/*
Added an example plugin project, env template, and run script demonstrating Agent composition using AWS LLM, TTS, STT, and TurnDetection.
CI & Actions
.github/actions/python-uv-setup/action.yml, .github/workflows/run_tests.yml, .github/workflows/ci.yml
Upgraded setup-uv action to v7, expanded cache globs and cache logging; exposed AWS_BEARER_TOKEN_BEDROCK secret in test job env and added verification; propagated secrets to reusable workflow.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant User
  participant Agent
  participant Realtime
  participant Bedrock
  participant AudioTrack

  User->>Agent: request simple_response / simple_audio_response
  Agent->>Realtime: connect()
  Realtime->>Bedrock: open bidirectional stream / sessionStart
  User->>Realtime: simple_response(text) or simple_audio_response(pcm)
  Realtime->>Bedrock: send_event(contentStart -> input -> contentEnd)
  loop streaming chunks
    Bedrock-->>Realtime: textOutput / audioOutput (chunks)
    Realtime->>Agent: emit LLMResponseChunkEvent
    Realtime->>AudioTrack: decode audio -> emit RealtimeAudioOutputEvent
  end
  alt tool invocation
    Bedrock-->>Realtime: toolUse
    Realtime->>Realtime: execute tool(s) -> send toolResult
    Realtime->>Bedrock: send_event(toolResult)
  end
  Bedrock-->>Realtime: contentEnd / completionEnd
  Realtime->>Agent: emit LLMResponseCompletedEvent (original, text[, exception])
  User->>Realtime: close()
  Realtime->>Bedrock: send promptEnd/sessionEnd, close stream
  Realtime->>Agent: set connected = False

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested reviewers

  • maxkahan
  • d3xvn
  • Nash0x7E2

Poem

I press my ear to the machine: a bright, blunt mouth—
it threads a careful current, stitches dark into speech.
The session blooms and hardens, text piled like winter glass;
I pry the seams and let the small electric river go.
Even the silence keeps its newly learned shape.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name: Out of Scope Changes Check
Status: ⚠️ Warning
Explanation: Several changes appear to be out of scope or not directly justified by the AWS integration objectives. The event manager logging removal (commenting out handler invocation logging) is unrelated to AWS/Bedrock implementation. Complete deletions of documentation files (ai-overview.md, websocket.md, ai-function-calling.md) and the entire test_realtime_base.py test suite are significant removals that are not clearly tied to AWS integration work and lack explanation. Additionally, the event manager and documentation deletions should typically be addressed in separate PRs focused on documentation/testing cleanup rather than feature implementation. The changes to gemini_realtime.py and openai_realtime.py modifying the close method signature are related to the core Realtime refactoring, but the scope of all these modifications together makes it difficult to assess whether they represent necessary supporting work or unrelated changes.
Resolution: Consider separating out-of-scope changes into dedicated PRs: move the event manager logging removal to a PR focused on logging improvements, move documentation file deletions to a documentation cleanup PR, and move test file deletions to a testing cleanup PR. Keep this PR focused on the AWS/Bedrock integration and necessary core class refactoring. If these deletions are intentional cleanups that must ship with this feature, add explicit justification to the PR description explaining why they are necessary dependencies for the AWS integration work.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The pull request title "[AI-192] - Bedrock, AWS & Nova" directly and clearly describes the primary changes in the changeset. The title specifically references Bedrock (the AWS service being integrated), AWS (Amazon Web Services), and Nova (the model family supported), all of which are central to the implementation. The title is concise, specific enough for a teammate scanning the history to understand the main purpose, and avoids vague or generic phrasing.
Linked Issues Check ✅ Passed The code changes substantially meet the objectives stated in the PR: the implementation includes a new BedrockLLM class with full tool-calling and streaming support, a new AWS Realtime class supporting Nova models with bidirectional streaming and audio handling, comprehensive tests for both components, documentation updates explaining the new LLM and Realtime patterns, and example scripts demonstrating AWS integration. Supporting changes to core LLM and Realtime base classes (exception field addition, public attributes, abstract close method) are necessary to enable the AWS plugin implementations. The PR delivers on the stated objectives of implementing Bedrock LLM and Nova Sonic realtime support.
Docstring Coverage ✅ Passed No functions found in the changes. Docstring coverage check skipped.
@tschellenbach tschellenbach changed the title start of bedrock branch Bedrock, AWS & Nova Oct 20, 2025
@tschellenbach tschellenbach changed the title Bedrock, AWS & Nova [AI-192] - Bedrock, AWS & Nova Oct 21, 2025
@tschellenbach tschellenbach marked this pull request as ready for review October 21, 2025 16:19
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 22

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (1)

268-282: Handle CancelledError when awaiting cancelled task.

After calling cancel() on line 272, awaiting the task on line 273 will raise CancelledError. This should be caught and suppressed as it's expected during shutdown.

Apply this diff to handle the expected exception:

         if hasattr(self, '_receive_task') and self._receive_task:
             self._receive_task.cancel()
-            await self._receive_task
+            try:
+                await self._receive_task
+            except asyncio.CancelledError:
+                pass  # Expected during shutdown
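
The shutdown pattern in this diff can be tried in isolation. The following is a minimal sketch only: `Receiver`, `_receive_loop`, and `demo` are hypothetical stand-ins, not code from the Gemini plugin.

```python
import asyncio
from typing import Optional


class Receiver:
    """Hypothetical stand-in for a realtime client that owns a background task."""

    def __init__(self) -> None:
        self._receive_task: Optional[asyncio.Task] = None
        self.connected = False

    async def _receive_loop(self) -> None:
        # Simulates waiting on the network indefinitely.
        while True:
            await asyncio.sleep(3600)

    async def connect(self) -> None:
        self._receive_task = asyncio.create_task(self._receive_loop())
        self.connected = True

    async def close(self) -> None:
        if self._receive_task:
            self._receive_task.cancel()
            try:
                # Awaiting a cancelled task re-raises CancelledError here.
                await self._receive_task
            except asyncio.CancelledError:
                pass  # Expected during shutdown
            finally:
                self._receive_task = None
        self.connected = False


async def demo() -> bool:
    receiver = Receiver()
    await receiver.connect()
    await asyncio.sleep(0)  # give the background loop a chance to start
    await receiver.close()
    return receiver.connected
```

One caveat with this sketch: if `close()` itself is cancelled by its caller while awaiting the task, the `except` clause swallows that cancellation too.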
🧹 Nitpick comments (28)
pyproject.toml (1)

8-8: Alphabetical ordering is already consistent.

The AWS plugin entries are correctly configured and sit between Anthropic and Cartesia, which is already alphabetical (anthropic, aws, cartesia). Moving them after Cartesia would break the ordering the rest of the list follows, so no change is needed:

 [tool.uv.sources]
 vision-agents = { workspace = true }
 vision-agents-plugins-anthropic = { workspace = true }
 vision-agents-plugins-aws = { workspace = true }
 vision-agents-plugins-cartesia = { workspace = true }

The same holds for the workspace members:

 members = [
     "agents-core",
     "plugins/anthropic",
     "plugins/aws",
     "plugins/cartesia",

Also applies to: 29-29

plugins/example/example/pyproject.toml (1)

2-2: Consider naming consistency for the example project.

The project name "my-example" resides in plugins/example/example/, creating a nested structure. Per the plugin development guide in docs/ai/instructions/ai-plugin.md, plugin names should match their folder structure. Consider either:

  • Renaming to "example" to match the parent plugin folder, or
  • Documenting why this nested structure exists if it's intentional for template purposes
agents-core/vision_agents/core/llm/realtime.py (2)

40-40: Remove redundant class attribute declaration.

The session_id : str declaration on line 40 is redundant since session_id is always initialized in __init__ (line 50). Class-level type annotations without default values can be confusing and serve no purpose when the attribute is set in the constructor.

Apply this diff to remove the redundant declaration:

-    session_id : str # UUID to identify this session
-
     def __init__(

196-197: Error message can be clarified for better developer experience.

Verification confirms all three Realtime implementations (OpenAI, Gemini, AWS) properly override close(). The change from abstract to NotImplementedError works correctly since implementations exist.

The error message "llm.close isn't implemented" could still be improved to "Realtime.close() must be implemented by subclass" for clarity, though this is a minor cosmetic improvement.
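
To illustrate the trade-off, here is a hedged sketch of a base class whose `close()` raises `NotImplementedError` with the suggested message. `RealtimeBase`, `EchoRealtime`, `ForgetfulRealtime`, and `close_error` are hypothetical names, not the actual vision_agents classes.

```python
import asyncio


class RealtimeBase:
    """Hypothetical base class; not the actual vision_agents Realtime."""

    def __init__(self) -> None:
        self.connected = False

    async def close(self) -> None:
        # Fails when close() is called, not when the subclass is instantiated.
        raise NotImplementedError(
            f"{type(self).__name__}.close() must be implemented by subclass"
        )


class EchoRealtime(RealtimeBase):
    async def close(self) -> None:
        self.connected = False


class ForgetfulRealtime(RealtimeBase):
    """Forgets to override close()."""


def close_error(rt: RealtimeBase) -> str:
    """Return the message raised by close(), or '' if close() succeeds."""
    try:
        asyncio.run(rt.close())
    except NotImplementedError as exc:
        return str(exc)
    return ""
```

With `abc.abstractmethod` the failure would instead surface when `ForgetfulRealtime()` is constructed; with `NotImplementedError` it only surfaces when `close()` is actually called.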

plugins/example/example/.env.example (1)

1-8: Consider alphabetically ordering AWS keys.

Static analysis suggests ordering AWS_ACCESS_KEY_ID before AWS_BEARER_TOKEN_BEDROCK for consistency. This is purely a style preference and doesn't affect functionality.

plugins/aws/example/.env.example (2)

10-10: Add blank line at end of file.

The file is missing a trailing newline, which is a common convention and flagged by linters.

Add a blank line after line 10.


1-10: Consider alphabetically ordering environment variable keys.

Static analysis suggests alphabetically ordering keys within each logical group (AWS keys together, API keys together). This improves maintainability but is optional.

plugins/aws/pyproject.toml (2)

13-17: Dependency bounds and package validity.

  • Consider upper‑bounding boto3 to avoid future breaking changes.
  • Please verify the package name/version for the Bedrock runtime SDK used here matches what you vendor elsewhere in the repo.
 dependencies = [
-    "vision-agents",
-    "boto3>=1.28.0",
-    "aws-sdk-bedrock-runtime>=0.1.0",
+    "vision-agents",
+    "boto3>=1.28.0,<2.0",
+    "aws-sdk-bedrock-runtime>=0.1.0",
 ]

37-41: Pin dev pytest to latest patch for reproducibility.

Align with current 8.4.x and pin to a patch to reduce flakiness.

-[dependency-groups]
-dev = [
-    "pytest>=8.4.1",
-    "pytest-asyncio>=1.0.0",
-]
+[dependency-groups]
+dev = [
+    "pytest==8.4.2",
+    "pytest-asyncio>=1.0.0",
+]

Based on learnings

plugins/aws/vision_agents/plugins/aws/events.py (1)

6-10: Event type/name OK; consider docstring expansion.

Looks fine. Optionally add field docs for event_data to match internal event conventions.

plugins/aws/example/aws_realtime_nova_example.py (1)

17-23: Add a short docstring to start_agent.

Keep examples consistent with project docstring guidance.

-async def start_agent() -> None:
+async def start_agent() -> None:
+    """Run a minimal Bedrock Realtime demo call and send a greeting."""
plugins/aws/example/aws_qwen_example.py (1)

17-27: Add a short docstring to start_agent.

Improves readability; aligns with internal style.

-async def start_agent() -> None:
+async def start_agent() -> None:
+    """Run a Qwen (Bedrock) demo with TTS, STT, and turn detection."""
plugins/aws/tests/test_aws_realtime.py (3)

30-47: Async test not marked with @pytest.mark.asyncio. Also assert that audio arrived.

Mark as asyncio and add a minimal assertion to catch regressions.

-@pytest.mark.integration
+@pytest.mark.asyncio
+@pytest.mark.integration
 async def test_simple_response_flow(self, realtime):
@@
-        # Wait for response
-        await asyncio.sleep(10.0)
+        # Wait for response
+        await asyncio.sleep(10.0)
+        assert len(events) > 0, "expected at least one audio event"

48-69: Async mark and a stronger check.

Optional: assert we received some audio after sending input.

-@pytest.mark.integration
+@pytest.mark.asyncio
+@pytest.mark.integration
 async def test_audio_first(self, realtime, mia_audio_16khz):
@@
-        # Test passes if no exceptions are raised
-        assert True
+        # Received at least one audio event or we time out
+        assert len(events) > 0
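
As an alternative to a fixed `asyncio.sleep(10.0)`, a test could poll for the first event and give up after a timeout. This is a sketch under assumptions: `wait_for_events` and `demo_wait` are hypothetical helpers, and `events` is taken to be a plain list that an event handler appends to, as the diffs above imply.

```python
import asyncio


async def wait_for_events(events: list, minimum: int = 1, timeout: float = 10.0) -> bool:
    """Poll until `events` holds at least `minimum` items or `timeout` elapses."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    while len(events) < minimum:
        if loop.time() >= deadline:
            return False
        await asyncio.sleep(0.01)
    return True


async def demo_wait() -> tuple:
    received = []
    # Simulate an audio event arriving shortly after the request is sent.
    asyncio.get_running_loop().call_later(0.05, received.append, b"\x00\x00")
    got_audio = await wait_for_events(received, timeout=1.0)
    # Nothing is ever appended to this list, so the wait times out.
    got_nothing = await wait_for_events([], timeout=0.1)
    return got_audio, got_nothing
```

Inside a test this would read `assert await wait_for_events(events), "no audio events received"`, which returns as soon as audio arrives and still fails if none does.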

70-84: Async mark for lifecycle test.

Mark as asyncio to avoid collector/runtime issues.

-@pytest.mark.integration
+@pytest.mark.asyncio
+@pytest.mark.integration
 async def test_connection_lifecycle(self, realtime):
plugins/aws/vision_agents/plugins/aws/aws_realtime.py (5)

143-149: Use self.logger consistently.

Mixed use of module logger and self.logger. Keep one (prefer self.logger) to ensure consistent context and handlers.

-        logger.info("Connecting to AWS Bedrock for model %s", self.model)
+        self.logger.info("Connecting to AWS Bedrock for model %s", self.model)

347-353: Await canceled task to suppress warnings and finish cleanly.

Ensure cancellation is observed before returning; also clear reference.

-        if self._stream_task:
-            self._stream_task.cancel()
-
-        self.connected = False
+        if self._stream_task:
+            self._stream_task.cancel()
+            try:
+                await self._stream_task
+            except asyncio.CancelledError:
+                pass
+            finally:
+                self._stream_task = None
+        self.connected = False

390-403: Emit consistent sample rate and use self.logger.

Keep logging consistent; explicitly pass the 24k sample rate (matches prompt config).

-                                    audio_content = json_data['event']['audioOutput']['content']
+                                    audio_content = json_data['event']['audioOutput']['content']
                                     audio_bytes = base64.b64decode(audio_content)
@@
-                                    audio_event = RealtimeAudioOutputEvent(
-                                        plugin_name="aws",
-                                        audio_data=audio_bytes,
-                                        sample_rate=24000
-                                    )
-                                    self.events.send(audio_event)
+                                    audio_event = RealtimeAudioOutputEvent(
+                                        plugin_name="aws",
+                                        audio_data=audio_bytes,
+                                        sample_rate=24000,
+                                    )
+                                    self.events.send(audio_event)
@@
-                                    await self.output_track.write(audio_bytes)
+                                    await self.output_track.write(audio_bytes)
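
The decoding step in this diff can be exercised on its own. The sketch below assumes only the `event.audioOutput.content` path visible in the diff; `decode_audio_output` and the sample message are hypothetical.

```python
import base64
import json
from typing import Optional


def decode_audio_output(raw_message: str) -> Optional[bytes]:
    """Return decoded PCM bytes from an audioOutput event, or None for other events."""
    json_data = json.loads(raw_message)
    audio = json_data.get("event", {}).get("audioOutput")
    if audio is None:
        return None
    return base64.b64decode(audio["content"])


# 480 16-bit samples of silence, used as a stand-in payload.
silence = bytes(480 * 2)
message = json.dumps(
    {"event": {"audioOutput": {"content": base64.b64encode(silence).decode("ascii")}}}
)
```

Returning `None` for non-audio events keeps the caller's dispatch explicit instead of raising `KeyError` on `textOutput` or `contentEnd` messages.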

371-434: Minor: unify logging, remove prints, and tidy message handling.

Prefer self.logger everywhere; avoid print; current mixed usage is noisy and can break structured logging.

-                                    logger.info(f"Content start from AWS Bedrock: {content_start}")
+                                    self.logger.info("Content start from AWS Bedrock: %s", content_start)
@@
-                                    logger.info(f"Text output from AWS Bedrock: {text_content}")
+                                    self.logger.info("Text output from AWS Bedrock: %s", text_content)
@@
-                                    logger.info("TODO: should flush audio buffer")
-                                    logger.info(f"Content end from AWS Bedrock {stopReason}: {json_data['event']['contentEnd']}")
+                                    self.logger.info("TODO: should flush audio buffer")
+                                    self.logger.info("Content end from AWS Bedrock %s: %s", stopReason, json_data['event']['contentEnd'])
@@
-                    logger.error("Stop async iteration exception")
+                    self.logger.error("Stop async iteration exception")
@@
-                    logger.error("Error, %s", e)
+                    self.logger.error("Error, %s", e)
@@
-                        print(f"Validation error: {error_message}")
+                        self.logger.warning("Validation error: %s", error_message)
                     else:
-                        print(f"Error receiving response: {e}")
+                        self.logger.error("Error receiving response: %s", e)
@@
-            logger.error("Error, %s", e)
-            print(f"Response processing error: {e}")
+            self.logger.error("Response processing error: %s", e)

Also applies to: 450-467
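
The difference between `%s`-style logging arguments, f-strings, and `print` can be shown with a small standalone sketch. The logger name and the `Expensive` class are hypothetical and exist only for illustration.

```python
import io
import logging


def make_logger(stream: io.StringIO) -> logging.Logger:
    """Build an isolated logger that writes 'LEVEL message' lines to `stream`."""
    logger = logging.getLogger("aws_realtime_sketch")  # hypothetical logger name
    logger.handlers.clear()
    logger.propagate = False
    logger.setLevel(logging.WARNING)
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


class Expensive:
    """Counts how often it is rendered to a string."""

    def __init__(self) -> None:
        self.renders = 0

    def __str__(self) -> str:
        self.renders += 1
        return "payload"


buffer = io.StringIO()
log = make_logger(buffer)
payload = Expensive()

# INFO is below the WARNING threshold: the %s argument is never formatted.
log.info("Content start from AWS Bedrock: %s", payload)
# An f-string is rendered eagerly, even though the record is then dropped.
log.info(f"Content start from AWS Bedrock: {payload}")
# WARNING passes the threshold and goes through the handler, unlike print().
log.warning("Validation error: %s", "bad input")
```

After this runs, the payload has been rendered once (by the f-string only) and the buffer holds just the warning line.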


24-26: Sample rate mismatch (constant vs usage/comments).

DEFAULT_SAMPLE_RATE=16000 but you configure 24kHz throughout. Pick one and use consistently (Bedrock Nova Sonic samples show 24k). Not a blocker, but worth aligning names/comments.

-DEFAULT_SAMPLE_RATE = 16000
+# Nova Sonic config uses 24kHz I/O
+DEFAULT_SAMPLE_RATE = 24000
@@
-        # Audio output track - Bedrock typically outputs at 16kHz
+        # Audio output track - Nova Sonic configured for 24kHz
         self.output_track = AudioStreamTrack(
-            framerate=24000, stereo=False, format="s16"
+            framerate=DEFAULT_SAMPLE_RATE, stereo=False, format="s16"
         )

Also applies to: 110-113
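
A short worked calculation shows why the constant and the track configuration need to agree. This sketch assumes mono 16-bit PCM (`stereo=False`, `format="s16"` as in the diff); the helper names are hypothetical.

```python
DEFAULT_SAMPLE_RATE = 24000  # Hz; the value the review suggests for Nova Sonic
BYTES_PER_SAMPLE = 2         # "s16" means 16-bit signed samples
CHANNELS = 1                 # stereo=False


def frame_bytes(sample_rate: int, frame_ms: int = 20) -> int:
    """Bytes in one mono s16 frame lasting `frame_ms` milliseconds."""
    return sample_rate * frame_ms // 1000 * BYTES_PER_SAMPLE * CHANNELS


def playback_ms(num_bytes: int, assumed_rate: int) -> float:
    """How long `num_bytes` of PCM lasts if played back at `assumed_rate`."""
    return num_bytes * 1000 / (assumed_rate * BYTES_PER_SAMPLE * CHANNELS)


one_second_at_24k = frame_bytes(DEFAULT_SAMPLE_RATE, frame_ms=1000)
```

One second of 24 kHz audio is 48,000 bytes; if a consumer treats those bytes as 16 kHz audio, they play for 1.5 seconds, so speech sounds slowed down and lower in pitch.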

plugins/aws/vision_agents/plugins/aws/aws_llm.py (8)

153-160: Preserve tool result JSON structure instead of stringifying.

_sanitize_tool_output() returns str, so isinstance(payload, dict) is never true. This wraps structured outputs as {"result": "..."} strings, losing type fidelity.

Apply this diff:

-                    payload = self._sanitize_tool_output(res)
-                    tool_result_blocks.append({
-                        "toolUseId": tc["id"],
-                        "content": [{"json": payload if isinstance(payload, dict) else {"result": payload}}],
-                    })
+                    raw = res
+                    sanitized = self._sanitize_tool_output(res)
+                    json_payload = raw if isinstance(raw, (dict, list, int, float, bool)) or raw is None else {"result": sanitized}
+                    tool_result_blocks.append({
+                        "toolUseId": tc["id"],
+                        "content": [{"json": json_payload}],
+                    })

And the same change in the streaming path.

Also applies to: 263-270
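
The suggested branching can be sketched as a standalone helper. `sanitize` is a hypothetical stand-in for `_sanitize_tool_output` (its real behavior is not shown in this review), and whether the service accepts non-object values under `"json"` is not verified here.

```python
import json
from typing import Any


def sanitize(value: Any, max_chars: int = 60_000) -> str:
    """Hypothetical stand-in for _sanitize_tool_output: stringify and truncate."""
    text = value if isinstance(value, str) else json.dumps(value, default=str)
    return text[:max_chars]


def to_tool_result_block(tool_use_id: str, result: Any) -> dict:
    """Wrap a tool's return value, keeping structured values structured."""
    json_native = isinstance(result, (dict, list, int, float, bool)) or result is None
    # Structured values pass through unchanged; everything else (including
    # plain strings) is stringified and wrapped as {"result": "..."}.
    payload = result if json_native else {"result": sanitize(result)}
    return {"toolUseId": tool_use_id, "content": [{"json": payload}]}
```

The key point is that the type check runs on the raw result, before sanitizing, so a `dict` returned by a tool reaches the model as a JSON object rather than as a quoted string.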


238-249: Close the ConverseStream event stream to avoid leaking connections.

boto3 returns the converse_stream response as a plain dict, which is not a context manager; the streaming body is the botocore EventStream under the "stream" key, and it exposes close(). Wrap that stream with contextlib.closing (or try/finally) so the underlying HTTP stream is closed even on exceptions; apply similarly to follow‑ups and final pass.

Example skeleton (adjust indentation accordingly):

from contextlib import closing

response = self.client.converse_stream(**kwargs)
with closing(response["stream"]) as stream:
    for event in stream:
        ...

Also applies to: 290-296, 306-314
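
One way to guarantee the stream is closed is `contextlib.closing`, which works for any object with a `close()` method whether or not it is itself a context manager. The sketch below uses `FakeEventStream`, a hypothetical stand-in, so it runs without AWS credentials; the `contentBlockDelta` event shape is assumed for illustration.

```python
from contextlib import closing
from typing import Iterator


class FakeEventStream:
    """Hypothetical stand-in for a streaming response body with close()."""

    def __init__(self, events: list) -> None:
        self._events = events
        self.closed = False

    def __iter__(self) -> Iterator[dict]:
        return iter(self._events)

    def close(self) -> None:
        self.closed = True


def collect_text(stream: FakeEventStream) -> str:
    """Concatenate text deltas; the stream is closed even if iteration raises."""
    parts = []
    with closing(stream) as events:
        for event in events:
            delta = event["contentBlockDelta"]["delta"]  # KeyError on other events
            parts.append(delta["text"])
    return "".join(parts)


ok = FakeEventStream([
    {"contentBlockDelta": {"delta": {"text": "Hel"}}},
    {"contentBlockDelta": {"delta": {"text": "lo"}}},
])
bad = FakeEventStream([{"unexpected": {}}])

text = collect_text(ok)
try:
    collect_text(bad)
except KeyError:
    pass  # the stream was still closed by closing()
```

Both streams end up closed: the first after normal iteration, the second after the `KeyError` propagates out of the `with` block.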


73-73: Add client timeouts and bounded retries.

Harden external calls: set connect/read timeouts and standard retries to prevent indefinite hangs.

Apply this diff:

+from botocore.config import Config
@@
-        self.client = boto3.client("bedrock-runtime", **session_kwargs)
+        self.client = boto3.client(
+            "bedrock-runtime",
+            config=Config(
+                connect_timeout=5,
+                read_timeout=60,
+                retries={"mode": "standard", "total_max_attempts": 3},
+            ),
+            **session_kwargs,
+        )

142-142: Type annotation for 'seen' should allow Optional id.

_tc_key() returns Tuple[Optional[str], str, str]. Align the local type to avoid mypy noise.

Apply this diff:

-                seen: set[tuple[str, str, str]] = set()
+                seen: set[tuple[Optional[str], str, str]] = set()
@@
-            seen: set[tuple[str, str, str]] = set()
+            seen: set[tuple[Optional[str], str, str]] = set()

Also applies to: 254-254


321-322: Propagate exception in streaming error response.

Include the exception object in LLMResponseEvent for parity with converse().

Apply this diff:

-            llm_response = LLMResponseEvent(None, error_msg)
+            llm_response = LLMResponseEvent(None, error_msg, exception=e)
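
A minimal sketch of why carrying the exception object matters for callers. `ResponseEvent` is a hypothetical mirror of the `LLMResponseEvent(original, text, exception=None)` shape described in this PR, and `run_stream` is an illustrative helper, not the plugin's code.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class ResponseEvent:
    """Hypothetical mirror of LLMResponseEvent(original, text, exception=None)."""

    original: Any
    text: str
    exception: Optional[Exception] = None


def run_stream(call: Callable[[], str]) -> ResponseEvent:
    """Invoke `call` and convert any failure into an error-carrying event."""
    try:
        return ResponseEvent(original=None, text=call())
    except Exception as e:
        error_msg = f"Error in stream: {e}"
        return ResponseEvent(None, error_msg, exception=e)


def failing_call() -> str:
    raise TimeoutError("read timed out")


ok_event = run_stream(lambda: "hello")
err_event = run_stream(failing_call)
```

With the exception attached, a caller can branch on `isinstance(event.exception, TimeoutError)` to decide whether to retry, instead of parsing the error text.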

237-237: Clear pending tool-use map per stream to avoid cross-call leakage.

If a stream errors mid‑tool, entries can persist across calls. Clear at start.

Apply this diff:

         try:
+            self._pending_tool_uses_by_index.clear()
             response = self.client.converse_stream(**kwargs)

326-351: Chunk event metadata: consider forwarding real IDs/indices.

LLMResponseChunkEvent uses empty item_id and zeroes for indices/sequence. If AWS events include these, forward them to help UIs assemble deltas deterministically.


20-36: Docstrings and return annotations.

Adopt Google-style docstrings consistently and add explicit return types (e.g., -> LLMResponseEvent[Any] on async methods) to match repo guidelines.

As per coding guidelines

Also applies to: 77-98, 209-214, 326-333, 448-481

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 1025a42 and 4b540c9.

⛔ Files ignored due to path filters (3)
  • plugins/aws/example/uv.lock is excluded by !**/*.lock
  • plugins/example/example/uv.lock is excluded by !**/*.lock
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (32)
  • agents-core/vision_agents/core/events/manager.py (1 hunks)
  • agents-core/vision_agents/core/llm/llm.py (2 hunks)
  • agents-core/vision_agents/core/llm/realtime.py (4 hunks)
  • conftest.py (1 hunks)
  • docs/ai/ai-overview.md (0 hunks)
  • docs/ai/instructions/ai-function-calling.md (0 hunks)
  • docs/ai/instructions/ai-llm.md (1 hunks)
  • docs/ai/instructions/ai-plugin.md (3 hunks)
  • docs/ai/instructions/ai-realtime-llm.md (1 hunks)
  • docs/ai/instructions/ai-stt.md (1 hunks)
  • docs/ai/instructions/ai-tts.md (1 hunks)
  • docs/ai/instructions/projects/websocket.md (0 hunks)
  • plugins/aws/README.md (1 hunks)
  • plugins/aws/example/.env.example (1 hunks)
  • plugins/aws/example/aws_qwen_example.py (1 hunks)
  • plugins/aws/example/aws_realtime_nova_example.py (1 hunks)
  • plugins/aws/example/pyproject.toml (1 hunks)
  • plugins/aws/pyproject.toml (1 hunks)
  • plugins/aws/tests/test_aws.py (1 hunks)
  • plugins/aws/tests/test_aws_realtime.py (1 hunks)
  • plugins/aws/vision_agents/plugins/aws/__init__.py (1 hunks)
  • plugins/aws/vision_agents/plugins/aws/aws_llm.py (1 hunks)
  • plugins/aws/vision_agents/plugins/aws/aws_realtime.py (1 hunks)
  • plugins/aws/vision_agents/plugins/aws/events.py (1 hunks)
  • plugins/example/example/.env.example (1 hunks)
  • plugins/example/example/my_example.py (1 hunks)
  • plugins/example/example/pyproject.toml (1 hunks)
  • plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (3 hunks)
  • plugins/openai/vision_agents/plugins/openai/openai_realtime.py (1 hunks)
  • pyproject.toml (2 hunks)
  • tests/test_events.py (0 hunks)
  • tests/test_realtime_base.py (0 hunks)
💤 Files with no reviewable changes (5)
  • tests/test_events.py
  • docs/ai/instructions/projects/websocket.md
  • docs/ai/instructions/ai-function-calling.md
  • docs/ai/ai-overview.md
  • tests/test_realtime_base.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (.cursor/rules/python.mdc)

**/*.py: Do not modify sys.path in Python code
Docstrings must follow the Google style guide

Files:

  • conftest.py
  • agents-core/vision_agents/core/events/manager.py
  • plugins/openai/vision_agents/plugins/openai/openai_realtime.py
  • plugins/aws/vision_agents/plugins/aws/__init__.py
  • plugins/aws/tests/test_aws.py
  • agents-core/vision_agents/core/llm/llm.py
  • plugins/aws/vision_agents/plugins/aws/events.py
  • agents-core/vision_agents/core/llm/realtime.py
  • plugins/aws/example/aws_realtime_nova_example.py
  • plugins/aws/tests/test_aws_realtime.py
  • plugins/aws/example/aws_qwen_example.py
  • plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py
  • plugins/aws/vision_agents/plugins/aws/aws_realtime.py
  • plugins/example/example/my_example.py
  • plugins/aws/vision_agents/plugins/aws/aws_llm.py
🧬 Code graph analysis (12)
plugins/aws/vision_agents/plugins/aws/__init__.py (3)
plugins/aws/vision_agents/plugins/aws/aws_llm.py (1)
  • BedrockLLM (20-480)
agents-core/vision_agents/core/llm/llm.py (1)
  • LLM (36-363)
plugins/aws/vision_agents/plugins/aws/aws_realtime.py (1)
  • Realtime (35-466)
plugins/aws/tests/test_aws.py (6)
tests/test_conversation.py (1)
  • conversation (69-79)
agents-core/vision_agents/core/agents/conversation.py (1)
  • InMemoryConversation (208-268)
agents-core/vision_agents/core/llm/events.py (1)
  • LLMResponseChunkEvent (100-115)
plugins/aws/vision_agents/plugins/aws/aws_llm.py (4)
  • _normalize_message (449-480)
  • simple_response (77-97)
  • converse (99-207)
  • converse_stream (209-324)
conftest.py (1)
  • golf_swing_image (81-88)
agents-core/vision_agents/core/llm/llm.py (1)
  • _set_instructions (164-168)
agents-core/vision_agents/core/llm/llm.py (1)
agents-core/vision_agents/core/utils/utils.py (1)
  • parse_instructions (41-90)
plugins/aws/vision_agents/plugins/aws/events.py (1)
agents-core/vision_agents/core/events/base.py (1)
  • PluginBaseEvent (43-45)
agents-core/vision_agents/core/llm/realtime.py (1)
plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (1)
  • close (268-282)
plugins/aws/example/aws_realtime_nova_example.py (3)
agents-core/vision_agents/core/edge/types.py (1)
  • User (15-18)
agents-core/vision_agents/core/agents/agents.py (2)
  • Agent (53-1021)
  • finish (271-291)
plugins/aws/vision_agents/plugins/aws/aws_realtime.py (2)
  • Realtime (35-466)
  • simple_response (181-192)
plugins/aws/tests/test_aws_realtime.py (3)
plugins/aws/vision_agents/plugins/aws/aws_realtime.py (5)
  • Realtime (35-466)
  • close (327-352)
  • connect (125-161)
  • simple_response (181-192)
  • simple_audio_response (163-179)
agents-core/vision_agents/core/llm/events.py (1)
  • RealtimeAudioOutputEvent (35-42)
agents-core/vision_agents/core/llm/llm.py (1)
  • _set_instructions (164-168)
plugins/aws/example/aws_qwen_example.py (2)
agents-core/vision_agents/core/edge/types.py (1)
  • User (15-18)
plugins/aws/vision_agents/plugins/aws/aws_llm.py (1)
  • simple_response (77-97)
plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (2)
agents-core/vision_agents/core/llm/realtime.py (1)
  • close (196-197)
agents-core/vision_agents/core/agents/agents.py (1)
  • close (293-366)
plugins/aws/vision_agents/plugins/aws/aws_realtime.py (5)
agents-core/vision_agents/core/utils/video_forwarder.py (1)
  • VideoForwarder (13-188)
agents-core/vision_agents/core/edge/types.py (2)
  • PcmData (37-183)
  • resample (129-183)
agents-core/vision_agents/core/llm/events.py (1)
  • RealtimeAudioOutputEvent (35-42)
agents-core/vision_agents/core/llm/llm.py (1)
  • _build_enhanced_instructions (61-85)
agents-core/vision_agents/core/llm/realtime.py (1)
  • _emit_audio_input_event (99-110)
plugins/example/example/my_example.py (1)
agents-core/vision_agents/core/edge/types.py (1)
  • User (15-18)
plugins/aws/vision_agents/plugins/aws/aws_llm.py (6)
plugins/aws/tests/test_aws.py (1)
  • llm (34-44)
agents-core/vision_agents/core/llm/llm.py (4)
  • LLM (36-363)
  • LLMResponseEvent (25-29)
  • _build_enhanced_instructions (61-85)
  • _dedup_and_execute (318-350)
agents-core/vision_agents/core/llm/llm_types.py (2)
  • ToolSchema (64-67)
  • NormalizedToolCallItem (107-111)
agents-core/vision_agents/core/llm/events.py (2)
  • LLMResponseChunkEvent (100-115)
  • LLMResponseCompletedEvent (119-123)
agents-core/vision_agents/core/processors/base_processor.py (1)
  • Processor (32-40)
plugins/aws/vision_agents/plugins/aws/events.py (1)
  • AWSStreamEvent (7-10)
🪛 dotenv-linter (4.0.0)
plugins/example/example/.env.example

[warning] 6-6: [UnorderedKey] The AWS_ACCESS_KEY_ID key should go before the AWS_BEARER_TOKEN_BEDROCK key

(UnorderedKey)

plugins/aws/example/.env.example

[warning] 5-5: [UnorderedKey] The AWS_ACCESS_KEY_ID key should go before the AWS_BEARER_TOKEN_BEDROCK key

(UnorderedKey)


[warning] 9-9: [UnorderedKey] The CARTESIA_API_KEY key should go before the FAL_KEY key

(UnorderedKey)


[warning] 10-10: [EndingBlankLine] No blank line at the end of the file

(EndingBlankLine)


[warning] 10-10: [UnorderedKey] The DEEPGRAM_API_KEY key should go before the FAL_KEY key

(UnorderedKey)

🪛 LanguageTool
docs/ai/instructions/ai-plugin.md

[style] ~9-~9: In American English, abbreviations like “etc.” require a period.
Context: ....tomland update the name, description etc - Update the event types in yourevent...

(ETC_PERIOD)

🪛 markdownlint-cli2 (0.18.1)
plugins/aws/README.md

48-48: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🔇 Additional comments (12)
docs/ai/instructions/ai-stt.md (1)

1-22: Documentation example is incomplete and missing required imports.

The example references types (Optional, Union, Dict, Any, PcmData, Participant, AsyncDeepgramClient) that aren't imported, which makes it not directly runnable and potentially confusing for users trying to implement the pattern.

Additionally, the constructor parameters api_key and client are accepted but never used in the shown code, and the _process_audio_impl method only contains pass, providing no guidance on expected behavior or return value handling.

To verify this documentation accurately reflects the new STT plugin interface, please:

  1. Confirm the actual stt.STT base class signature and required methods in the codebase
  2. Clarify whether api_key and client should be stored/used within the class
  3. Provide either a complete working example or clearly document what _process_audio_impl should do (e.g., integrate with Deepgram, process PCM data, etc.)

Suggested improvements:

  • Add all necessary imports at the top of the code block
  • Show how api_key and client are utilized (or remove them if not needed)
  • Include a minimal but realistic implementation of _process_audio_impl that demonstrates the expected behavior, or add a comment explaining the return type and typical use case
conftest.py (1)

80-88: LGTM!

The fixture correctly loads the image file using a context manager and returns the bytes. The implementation follows the established pattern and handles resource cleanup properly.

docs/ai/instructions/ai-plugin.md (2)

22-23: Good documentation enhancement.

Adding the tests and example directories provides a more complete picture of the plugin structure and aligns with the actual implementation.


51-51: Documentation improvements align well with PR scope.

The references to the new LLM and Realtime LLM guides, plus the expanded pyproject.toml example, provide clearer guidance for plugin developers.

Also applies to: 57-67

agents-core/vision_agents/core/llm/llm.py (2)

26-29: Excellent addition for error tracking.

Adding the optional exception parameter to LLMResponseEvent enables better error handling and observability while maintaining backward compatibility.


161-161: Good refactoring for maintainability.

Extracting instruction setting into _set_instructions centralizes the logic and improves code organization. This makes it easier to modify instruction handling behavior in the future.

Also applies to: 164-168
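
As a rough illustration of the optional exception parameter praised above, the sketch below carries a failure inside the response object instead of raising it. `ResponseEvent` and `call_provider` are hypothetical stand-ins invented for this example; they are not the actual `LLMResponseEvent` class from llm.py, and only the idea of an optional, default-`None` exception field comes from the review.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ResponseEvent:
    """Hypothetical stand-in for an LLM response wrapper."""

    original: Any  # provider-specific response object
    text: str  # normalized text output
    exception: Optional[Exception] = None  # set only when the call failed


def call_provider(prompt: str, fail: bool = False) -> ResponseEvent:
    """Simulate a provider call, capturing a failure instead of raising it."""
    try:
        if fail:
            raise RuntimeError("provider unavailable")
        return ResponseEvent(original={"prompt": prompt}, text=prompt.upper())
    except RuntimeError as exc:
        # Existing callers that ignore `exception` keep working unchanged.
        return ResponseEvent(original=None, text="", exception=exc)
```

Because the field defaults to `None`, call sites that construct the event with only the first two arguments would not need to change.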

docs/ai/instructions/ai-realtime-llm.md (2)

45-75: Clear guidance for streaming implementation.

The mynativemethod example effectively demonstrates the streaming pattern with proper event emission (chunks and completion), instruction enhancement, and text accumulation.


92-104: Effective guidance summary.

The "Things to get right" section provides clear, actionable requirements and helpful references to concrete implementations.

plugins/openai/vision_agents/plugins/openai/openai_realtime.py (1)

128-130: LGTM!

The refactor from _close_impl to a public close() method is clean and aligns with the new API pattern. The delegation to self.rtc.close() is appropriate.

docs/ai/instructions/ai-llm.md (1)

1-81: Documentation improvements look good.

The updated LLM plugin development guide provides clear patterns for implementing streaming responses, event emission, and message normalization. The examples align well with the actual plugin implementations.

plugins/aws/vision_agents/plugins/aws/__init__.py (1)

1-4: Clean public API surface.

The exports are clear and follow the established pattern of aliasing the provider-specific LLM class to a generic LLM name. The __all__ declaration properly documents the public interface.

plugins/aws/tests/test_aws.py (1)

63-117: Async tests missing @pytest.mark.asyncio.

Several async tests are not marked and will error under strict asyncio mode.

-@pytest.mark.integration
+@pytest.mark.asyncio
+@pytest.mark.integration
 async def test_simple(self, llm: BedrockLLM):
@@
-@pytest.mark.integration
+@pytest.mark.asyncio
+@pytest.mark.integration
 async def test_native_api(self, llm: BedrockLLM):
@@
-@pytest.mark.integration
+@pytest.mark.asyncio
+@pytest.mark.integration
 async def test_stream(self, llm: BedrockLLM):
@@
-@pytest.mark.integration
+@pytest.mark.asyncio
+@pytest.mark.integration
 async def test_memory(self, llm: BedrockLLM):
@@
-@pytest.mark.integration
+@pytest.mark.asyncio
+@pytest.mark.integration
 async def test_native_memory(self, llm: BedrockLLM):
@@
-@pytest.mark.integration
+@pytest.mark.asyncio
+@pytest.mark.integration
 async def test_image_description(self, golf_swing_image):
@@
-@pytest.mark.integration
+@pytest.mark.asyncio
+@pytest.mark.integration
 async def test_instruction_following(self, llm: BedrockLLM):

Also applies to: 118-163

⛔ Skipped due to learnings
Learnt from: CR
PR: GetStream/Vision-Agents#0
File: .cursor/rules/python.mdc:0-0
Timestamp: 2025-10-20T19:23:41.251Z
Learning: Applies to tests/**/*.py : Do not use pytest.mark.asyncio; async support is automatic

Comment on lines 512 to 515
#module_name = getattr(handler, '__module__', 'unknown')
if event.type not in self._silent_events:
logger.info(f"Called handler {handler.__name__} from {module_name} for event {event.type}")
pass
#logger.info(f"Called handler {handler.__name__} from {module_name} for event {event.type}")

🛠️ Refactor suggestion | 🟠 Major

Remove commented-out code and clean up the now-empty conditional block.

The commented code and pass statement should be removed entirely. The if block on line 513 now serves no purpose since it only contains pass, and the module_name retrieval on line 512 is dead code.

Apply this diff to clean up the code:

     async def _process_single_event(self, event):
         """Process a single event."""
         for handler in self._handlers.get(event.type, []):
-            #module_name = getattr(handler, '__module__', 'unknown')
-            if event.type not in self._silent_events:
-                pass
-                #logger.info(f"Called handler {handler.__name__} from {module_name} for event {event.type}")
-
             loop = asyncio.get_running_loop()
             handler_task = loop.create_task(self._run_handler(handler, event))
             self._handler_tasks[uuid.uuid4()] = handler_task

Alternatively, if the _silent_events check is planned for future use, document the intent and still remove the commented code:

     async def _process_single_event(self, event):
         """Process a single event."""
         for handler in self._handlers.get(event.type, []):
-            #module_name = getattr(handler, '__module__', 'unknown')
-            if event.type not in self._silent_events:
-                pass
-                #logger.info(f"Called handler {handler.__name__} from {module_name} for event {event.type}")
+            # Silent events handling reserved for future use
+            if event.type not in self._silent_events:
+                pass
 
             loop = asyncio.get_running_loop()
             handler_task = loop.create_task(self._run_handler(handler, event))
             self._handler_tasks[uuid.uuid4()] = handler_task
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
#module_name = getattr(handler, '__module__', 'unknown')
if event.type not in self._silent_events:
logger.info(f"Called handler {handler.__name__} from {module_name} for event {event.type}")
pass
#logger.info(f"Called handler {handler.__name__} from {module_name} for event {event.type}")
async def _process_single_event(self, event):
"""Process a single event."""
for handler in self._handlers.get(event.type, []):
loop = asyncio.get_running_loop()
handler_task = loop.create_task(self._run_handler(handler, event))
self._handler_tasks[uuid.uuid4()] = handler_task
🤖 Prompt for AI Agents
In agents-core/vision_agents/core/events/manager.py around lines 512 to 515,
remove the dead/commented code and the empty conditional: delete the
commented-out module_name retrieval, the if event.type check that only contains
pass, and the pass itself so the block is gone; if the _silent_events check is
intended for future use, replace the empty body with a brief TODO comment
documenting intent and remove the commented code instead.
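
For orientation, the cleaned-up dispatch loop can be exercised in isolation. The sketch below is a hypothetical, stripped-down dispatcher (`MiniEventManager` is an invented name and this is not the real event manager); it only assumes the shape visible in the diff above: one task per handler, tracked in a dict keyed by UUID, with no per-call logging.

```python
import asyncio
import uuid
from collections import defaultdict
from typing import Any, Awaitable, Callable, Dict, List

Handler = Callable[[Any], Awaitable[None]]


class MiniEventManager:
    """Hypothetical, stripped-down dispatcher used only for illustration."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = defaultdict(list)
        self._handler_tasks: Dict[uuid.UUID, asyncio.Task] = {}

    def subscribe(self, event_type: str, handler: Handler) -> None:
        self._handlers[event_type].append(handler)

    async def _run_handler(self, handler: Handler, event: Any) -> None:
        await handler(event)

    async def process_single_event(self, event_type: str, event: Any) -> None:
        """Schedule one task per registered handler, without logging."""
        loop = asyncio.get_running_loop()
        for handler in self._handlers.get(event_type, []):
            task = loop.create_task(self._run_handler(handler, event))
            self._handler_tasks[uuid.uuid4()] = task

    async def wait(self) -> None:
        """Wait for every scheduled handler task to finish."""
        await asyncio.gather(*self._handler_tasks.values())
```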

An example plugin is located in `plugins/example`. Copying the example is the best way to create a new plugin. After copying the example be sure to:

- Update the folder name "example" to your plugin's name
- Open `pyproject.toml` and update the name, description etc

⚠️ Potential issue | 🟡 Minor

Add period after "etc." for proper formatting.

In American English, the abbreviation "etc." requires a period.

Apply this diff:

-- Open `pyproject.toml` and update the name, description etc
+- Open `pyproject.toml` and update the name, description, etc.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-- Open `pyproject.toml` and update the name, description etc
+- Open `pyproject.toml` and update the name, description, etc.
🧰 Tools
🪛 LanguageTool

[style] ~9-~9: In American English, abbreviations like “etc.” require a period.
Context: ....tomland update the name, description etc - Update the event types in yourevent...

(ETC_PERIOD)

🤖 Prompt for AI Agents
In docs/ai/instructions/ai-plugin.md around line 9, the sentence "Open
`pyproject.toml` and update the name, description etc" is missing the period in
the abbreviation "etc."; update the line to read "Open `pyproject.toml` and
update the name, description, etc." adding a comma before "etc." and a period
immediately after it, and preserve the remaining punctuation and spacing.

from vision_agents.core.llm import realtime

class MyRealtime(realtime.Realtime):
    def __init__(self, model: str, client: Optional[ClientType]):

⚠️ Potential issue | 🟡 Minor

Clarify undefined type hint.

ClientType is used in the type hint but not imported or defined. For documentation clarity, consider:

  • Using Any as a placeholder type, or
  • Adding a comment explaining this represents the provider's client type, or
  • Adding an import example

Example fix:

+from typing import Optional, Any
+
 class MyRealtime(realtime.Realtime):
-    def __init__(self, model: str, client: Optional[ClientType]):
+    def __init__(self, model: str, client: Optional[Any]):
         # it should be possible to pass the client (makes it easier for users to customize things)

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In docs/ai/instructions/ai-realtime-llm.md around line 12, the constructor
signature uses ClientType in a type hint but ClientType is not defined or
imported; update the doc example to clarify the type by either replacing
ClientType with typing.Any, adding an import from typing (from typing import
Any) and using Any, or include a short inline comment stating "ClientType
represents the provider client (e.g., OpenAI client) and should be
replaced/imported by the integration code"; ensure the chosen approach is shown
in the example so readers understand the intended type.
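
One further way to make the hint self-explanatory is to define the placeholder type in the example itself. The sketch below assumes a `typing.Protocol` with a single `close` method; both the protocol body and the plain `MyRealtime` class (which would extend `realtime.Realtime` in the real guide) are hypothetical and only illustrate the idea.

```python
from typing import Any, Optional, Protocol


class ClientType(Protocol):
    """Hypothetical structural type describing a provider client."""

    async def close(self) -> None: ...


class MyRealtime:
    """Illustrative only; the real class would extend realtime.Realtime."""

    def __init__(self, model: str, client: Optional[ClientType] = None) -> None:
        self.model = model
        # Accept a caller-supplied client so users can customize it.
        self.client: Optional[Any] = client
```

A protocol keeps the documentation free of provider imports while still telling readers what the client is expected to support.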

Comment on lines +8 to +31
 class MyTTS(tts.TTS):
     def __init__(
         self,
         voice_id: str = "VR6AewLTigWG4xSOukaG",  # Default ElevenLabs voice
         model_id: str = "eleven_multilingual_v2",
         client: Optional[MyClient] = None,
     ):
         # it should be possible to pass the client (makes it easier for users to customize things)
         # settings that are common to change, like voice id or model id should be configurable as well
         super().__init__()
-        # Register custom events
-        self.events.register_events_from_module(events)
-
-    async def stream_synthesis(self, text: str):
-        # Emit start event
-        self.events.send(MyTTSStreamEvent(
-            plugin_name="mytts",
-            status="started",
-            text=text
-        ))
-
-        async for chunk in self._stream_api(text):
-            # Emit audio chunks
-            self.events.send(TTSAudioEvent(
-                plugin_name="mytts",
-                audio_data=chunk.data,
-                is_final_chunk=False
-            ))
-
-        # Emit completion event
-        self.events.send(MyTTSStreamEvent(
-            plugin_name="mytts",
-            status="completed"
-        ))
-```
         self.voice_id = voice_id
         self.client = client if client is not None else MyClient(api_key=api_key)

-### Event Subscription
     async def stream_audio(self, text: str, *_, **__) -> AsyncIterator[bytes]:

-**Subscribe to LLM Events:**
-```python
-@self.events.subscribe
-async def handle_llm_response(event: LLMResponseEvent):
-    """Synthesize LLM responses."""
-    audio = await self.synthesize(event.text)
-```
         audio_stream = self.client.text_to_speech.stream(
             text=text,
             voice_id=self.voice_id,
             output_format=self.output_format,
             model_id=self.model_id,
             request_options={"chunk_size": 64000},
         )

-**Subscribe to Agent Events:**
-```python
-@self.events.subscribe
-async def handle_agent_say(event: AgentSayEvent):
-    """Handle agent speech requests."""
-    await self.synthesize(event.text)
-```
         return audio_stream

⚠️ Potential issue | 🔴 Critical

Fix undefined variables and missing attribute assignments in the example code.

This documentation example has three runtime errors that will prevent it from working:

  1. Line 19: api_key is undefined. Either add it as a parameter or define it in the scope.
  2. Line 27: self.model_id is referenced but never assigned. The model_id parameter on line 12 is not stored as an instance variable.
  3. Line 26: self.output_format is referenced but never initialized in __init__.

Apply this diff to fix the issues:

 class MyTTS(tts.TTS):
     def __init__(
         self,
         voice_id: str = "VR6AewLTigWG4xSOukaG",  # Default ElevenLabs voice
         model_id: str = "eleven_multilingual_v2",
         client: Optional[MyClient] = None,
+        api_key: str = "your-api-key",
+        output_format: str = "mp3_44100_128",
     ):
         # it should be possible to pass the client (makes it easier for users to customize things)
         # settings that are common to change, like voice id or model id should be configurable as well
         super().__init__()
         self.voice_id = voice_id
+        self.model_id = model_id
+        self.output_format = output_format
         self.client = client if client is not None else MyClient(api_key=api_key)
🤖 Prompt for AI Agents
In docs/ai/instructions/ai-tts.md around lines 8 to 31, the example references
undefined variables and misses instance assignments; update the constructor to
accept an api_key: Optional[str] = None parameter (or require it), pass that
api_key into MyClient when creating the default client, and assign self.model_id
= model_id and self.output_format = "<appropriate default>" (or add
output_format as a constructor parameter with a sensible default) so
stream_audio can reference self.model_id and self.output_format without errors.
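
A self-contained variant of the corrected constructor might look like the sketch below. `MyClient`, its `stream` method, and the default voice and model names are hypothetical placeholders invented for this example, and the class does not extend the real `tts.TTS` base; the point is only that every setting read later is stored on the instance and that `api_key` is an explicit parameter.

```python
from typing import Iterator, Optional


class MyClient:
    """Hypothetical provider client used only for this sketch."""

    def __init__(self, api_key: Optional[str] = None) -> None:
        self.api_key = api_key

    def stream(self, text: str, **options: str) -> Iterator[bytes]:
        # Pretend each word is one audio chunk.
        for word in text.split():
            yield word.encode("utf-8")


class MyTTS:
    """Illustrative only; the real class would extend tts.TTS."""

    def __init__(
        self,
        voice_id: str = "example-voice",
        model_id: str = "example-model",
        output_format: str = "mp3_44100_128",
        api_key: Optional[str] = None,
        client: Optional[MyClient] = None,
    ) -> None:
        # Store every setting that later methods read.
        self.voice_id = voice_id
        self.model_id = model_id
        self.output_format = output_format
        self.client = client if client is not None else MyClient(api_key=api_key)

    def stream_audio(self, text: str) -> Iterator[bytes]:
        return self.client.stream(
            text,
            voice_id=self.voice_id,
            model_id=self.model_id,
            output_format=self.output_format,
        )
```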

6. **Voice management**: Support multiple voices and voice selection
7. **Performance**: Use streaming for real-time applications
8. **Testing**: Verify event flow and audio quality in tests
TOOD: no good test suite yet

⚠️ Potential issue | 🟡 Minor

Fix typo in TODO comment.

Line 40 has "TOOD" which should be "TODO".

-TOOD: no good test suite yet
+TODO: no good test suite yet
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-TOOD: no good test suite yet
+TODO: no good test suite yet
🤖 Prompt for AI Agents
docs/ai/instructions/ai-tts.md line 40: fix the typo in the comment by changing
"TOOD: no good test suite yet" to "TODO: no good test suite yet" so the standard
TODO marker is used and will be picked up by linters and editors.

Comment on lines 163 to 179
async def simple_audio_response(self, pcm: PcmData):
    """Send audio data to the model for processing."""
    if not self.connected:
        self.logger.warning("realtime is not active. can't call simple_audio_response")

    # Resample from 48kHz to 24kHz if needed
    pcm = pcm.resample(24000)

    content_name = str(uuid.uuid4())

    await self.audio_content_start(content_name)
    self._emit_audio_input_event(pcm.samples, sample_rate=pcm.sample_rate)
    # Convert PcmData to base64 encoded bytes
    audio_base64 = base64.b64encode(pcm.samples).decode('utf-8')
    await self.audio_input(content_name, audio_base64)

    await self.content_end(content_name)

⚠️ Potential issue | 🔴 Critical

Don’t proceed when disconnected; encode numpy as bytes.

Avoid sending when not connected. Base64 needs raw bytes; also emit bytes to input event.

 async def simple_audio_response(self, pcm: PcmData):
@@
-    if not self.connected:
-        self.logger.warning("realtime is not active. can't call simple_audio_response")
+    if not self.connected:
+        self.logger.warning("realtime is not active; ignoring simple_audio_response")
+        return
@@
-    self._emit_audio_input_event(pcm.samples, sample_rate=pcm.sample_rate)
+    self._emit_audio_input_event(pcm.samples.tobytes(), sample_rate=pcm.sample_rate)
     # Convert PcmData to base64 encoded bytes
-    audio_base64 = base64.b64encode(pcm.samples).decode('utf-8')
+    audio_base64 = base64.b64encode(pcm.samples.tobytes()).decode("utf-8")
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-async def simple_audio_response(self, pcm: PcmData):
-    """Send audio data to the model for processing."""
-    if not self.connected:
-        self.logger.warning("realtime is not active. can't call simple_audio_response")
-    # Resample from 48kHz to 24kHz if needed
-    pcm = pcm.resample(24000)
-    content_name = str(uuid.uuid4())
-    await self.audio_content_start(content_name)
-    self._emit_audio_input_event(pcm.samples, sample_rate=pcm.sample_rate)
-    # Convert PcmData to base64 encoded bytes
-    audio_base64 = base64.b64encode(pcm.samples).decode('utf-8')
-    await self.audio_input(content_name, audio_base64)
-    await self.content_end(content_name)
+async def simple_audio_response(self, pcm: PcmData):
+    """Send audio data to the model for processing."""
+    if not self.connected:
+        self.logger.warning("realtime is not active; ignoring simple_audio_response")
+        return
+    # Resample from 48kHz to 24kHz if needed
+    pcm = pcm.resample(24000)
+    content_name = str(uuid.uuid4())
+    await self.audio_content_start(content_name)
+    self._emit_audio_input_event(pcm.samples.tobytes(), sample_rate=pcm.sample_rate)
+    # Convert PcmData to base64 encoded bytes
+    audio_base64 = base64.b64encode(pcm.samples.tobytes()).decode("utf-8")
+    await self.audio_input(content_name, audio_base64)
+    await self.content_end(content_name)
🤖 Prompt for AI Agents
In plugins/aws/vision_agents/plugins/aws/aws_realtime.py around lines 163 to
179, the method currently logs when disconnected but continues and also treats
pcm.samples as if it were raw bytes; fix by returning early if not
self.connected (so none of the subsequent content_* or emit calls run), convert
numpy samples to raw bytes once (e.g. samples_bytes = pcm.samples.tobytes()) and
use samples_bytes for both _emit_audio_input_event(samples_bytes,
sample_rate=...) and for base64 encoding
(base64.b64encode(samples_bytes).decode('utf-8')), ensuring all audio send/emit
calls use the bytes not the numpy array.
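
The two behaviours requested here (return early when disconnected, convert the samples to bytes once and reuse them) can be checked with a small stand-alone sketch. `FakeRealtime` is a hypothetical class, and the standard-library `array` type stands in for the numpy sample buffer so the example runs without extra dependencies.

```python
import asyncio
import base64
from array import array
from typing import List


class FakeRealtime:
    """Hypothetical stand-in for the realtime class; records what it sends."""

    def __init__(self, connected: bool) -> None:
        self.connected = connected
        self.sent: List[str] = []

    async def simple_audio_response(self, samples: array) -> None:
        if not self.connected:
            # Nothing below runs while disconnected.
            return
        # Convert once and reuse the same bytes for every consumer.
        samples_bytes = samples.tobytes()
        self.sent.append(base64.b64encode(samples_bytes).decode("utf-8"))
```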

Comment on lines +320 to +326
async def send_event(self, event_data: Dict[str, Any]) -> None:
    event_json = json.dumps(event_data)
    event = InvokeModelWithBidirectionalStreamInputChunk(
        value=BidirectionalInputPayloadPart(bytes_=event_json.encode('utf-8'))
    )
    await self.stream.input_stream.send(event)


⚠️ Potential issue | 🟠 Major

Guard send_event when not connected/stream missing.

Prevents AttributeError if called before connect/after close.

 async def send_event(self, event_data: Dict[str, Any]) -> None:
-    event_json = json.dumps(event_data)
-    event = InvokeModelWithBidirectionalStreamInputChunk(
-        value=BidirectionalInputPayloadPart(bytes_=event_json.encode('utf-8'))
-    )
-    await self.stream.input_stream.send(event)
+    if not self.connected or not getattr(self, "stream", None):
+        self.logger.warning("send_event called while disconnected; dropping event: %s", event_data.get("event", {}).keys())
+        return
+    event_json = json.dumps(event_data)
+    event = InvokeModelWithBidirectionalStreamInputChunk(
+        value=BidirectionalInputPayloadPart(bytes_=event_json.encode("utf-8"))
+    )
+    await self.stream.input_stream.send(event)
🤖 Prompt for AI Agents
In plugins/aws/vision_agents/plugins/aws/aws_realtime.py around lines 320 to
326, send_event can raise AttributeError if called when the stream or its
input_stream is missing (not connected or already closed); guard the method by
checking the stream and input_stream before using them (e.g. if not self.stream
or not getattr(self.stream, "input_stream", None): return or raise a clear
exception), and return/raise early so send_event only attempts to create and
send the chunk when the connection is present; keep it asynchronous and
non-blocking so callers get a predictable no-op or informative error when
disconnected.
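
A minimal sketch of such a guard follows. `Session` and `FakeInputStream` are hypothetical names, the stream simply records raw bytes rather than Bedrock chunk objects, and returning a boolean (instead of `None`, as in the reviewed method) is a choice made here so callers and tests can tell whether the event was dropped.

```python
import asyncio
import json
import logging
from typing import Any, Dict, List, Optional

logger = logging.getLogger(__name__)


class FakeInputStream:
    """Hypothetical stream that just records the payloads it receives."""

    def __init__(self) -> None:
        self.payloads: List[bytes] = []

    async def send(self, payload: bytes) -> None:
        self.payloads.append(payload)


class Session:
    """Hypothetical session object holding the connection state."""

    def __init__(self) -> None:
        self.connected = False
        self.input_stream: Optional[FakeInputStream] = None

    async def send_event(self, event_data: Dict[str, Any]) -> bool:
        """Send an event; return False if it was dropped."""
        if not self.connected or self.input_stream is None:
            logger.warning("send_event called while disconnected; dropping event")
            return False
        await self.input_stream.send(json.dumps(event_data).encode("utf-8"))
        return True
```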

Comment on lines +384 to +391
elif 'textOutput' in json_data['event']:
    text_content = json_data['event']['textOutput']['content']
    #role = json_data['event']['textOutput']['role']
    logger.info(f"Text output from AWS Bedrock: {text_content}")
elif 'completionStart' in json_data['event']:
    logger.info("Completion start from AWS Bedrock", json_data['event']['completionStart'])
elif 'audioOutput' in json_data['event']:
    audio_content = json_data['event']['audioOutput']['content']

⚠️ Potential issue | 🔴 Critical

Fix logging format: the second argument has no placeholder, so it is never rendered.

Use a placeholder or merge into the message.

-                                elif 'completionStart' in json_data['event']:
-                                    logger.info("Completion start from AWS Bedrock", json_data['event']['completionStart'])
+                                elif 'completionStart' in json_data['event']:
+                                    self.logger.info("Completion start from AWS Bedrock: %s", json_data['event']['completionStart'])

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In plugins/aws/vision_agents/plugins/aws/aws_realtime.py around lines 384 to
391, the logger.info call for "Completion start from AWS Bedrock" passes the
event as a second argument without a placeholder, so the payload never appears
in the log output (and a non-mapping argument would make logging report a
formatting error); change the call to include the value in the message (either
with a format placeholder and the event as an argument, or by building a single
string via f-string/concatenation) so the logger receives a properly formatted
message, and ensure other nearby logs follow the same pattern for consistency.
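
The effect of the missing placeholder can be observed with the standard `logging` module alone. In the sketch below, `render` is a hypothetical helper that logs one record to an in-memory stream, and the payload dict is made up; it assumes the event payload is a dict, in which case the argument is dropped without any error being raised.

```python
import io
import logging


def render(msg: str, *args: object) -> str:
    """Log one record to an in-memory stream and return what was written."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    log = logging.getLogger("placeholder_demo")
    log.setLevel(logging.INFO)
    log.propagate = False
    log.addHandler(handler)
    try:
        log.info(msg, *args)
    finally:
        log.removeHandler(handler)
    return stream.getvalue().strip()


payload = {"completionId": "abc"}  # hypothetical event payload

# With a %s placeholder the payload appears in the message.
with_placeholder = render("Completion start: %s", payload)

# Without a placeholder, a dict argument is left out of the message.
without_placeholder = render("Completion start", payload)
```

Passing the value as an argument with `%s`, rather than formatting it eagerly, also defers the string conversion until a handler actually emits the record.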

Comment on lines +1 to +40
import asyncio
import logging
from uuid import uuid4

from dotenv import load_dotenv

from vision_agents.core import User
from vision_agents.core.agents import Agent
from vision_agents.plugins import aws, getstream, cartesia, deepgram, smart_turn

load_dotenv()

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s [call_id=%(call_id)s] %(name)s: %(message)s")
logger = logging.getLogger(__name__)


async def start_agent() -> None:
    agent = Agent(
        edge=getstream.Edge(),
        agent_user=User(name="Friendly AI"),
        instructions="Be nice to the user",
        llm=aws.LLM(model="qwen.qwen3-32b-v1:0"),
        tts=cartesia.TTS(),
        stt=deepgram.STT(),
        turn_detection=smart_turn.TurnDetection(buffer_duration=2.0, confidence_threshold=0.5),
        # Enable turn detection with FAL/ Smart turn
    )
    await agent.create_user()

    call = agent.edge.client.video.call("default", str(uuid4()))
    await agent.edge.open_demo(call)

    with await agent.join(call):
        await asyncio.sleep(5)
        await agent.llm.simple_response(text="Say hi")
        await agent.finish()


if __name__ == "__main__":
    asyncio.run(start_agent())

🛠️ Refactor suggestion | 🟠 Major

Consider relocating this AWS example or using a different LLM.

This example resides in plugins/example/example/ but demonstrates AWS Bedrock LLM integration (line 22). This creates confusion about whether it's a generic example or an AWS-specific example. Consider either:

  • Moving this to plugins/aws/example/ if it's meant to showcase AWS, or
  • Replacing aws.LLM with a more generic plugin like gemini.LLM if this is meant to be a template example

This would align better with the plugin structure guidance and make the examples clearer for users.

🤖 Prompt for AI Agents
plugins/example/example/my_example.py lines 1-40: this example uses aws.LLM
(line ~22) but lives under plugins/example/, which is confusing; either move the
file to plugins/aws/example/ if it is intended as an AWS-specific sample, or
replace aws.LLM with a generic LLM plugin (e.g., gemini.LLM or another
non-provider-specific LLM) so the example remains a template; update imports and
any provider-specific configuration accordingly and adjust README/path
references to reflect the new location or plugin choice.

Comment on lines +245 to +247
except CancelledError:
    logger.error("Stop async iteration exception")
    return

🛠️ Refactor suggestion | 🟠 Major

Log task cancellation at info level, not error.

CancelledError is the expected behavior when close() cancels the receive task. Logging this at error level creates false alarms during normal shutdown.

Apply this diff to log at info level with a clearer message:

         except CancelledError:
-            logger.error("Stop async iteration exception")
+            self.logger.info("Receive loop cancelled during shutdown")
             return

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py around lines
245-247, the exception handler currently logs CancelledError at error level;
change it to log at info level with a clearer message such as "Receive task
cancelled during shutdown" (or similar) and then return, to avoid treating
expected cancellation as an error.
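
The shutdown path described above can be reproduced with plain asyncio. In this sketch `receive_loop` and `run_and_close` are hypothetical names and the loop only appends counters to a list; it assumes, as the comment does, that cancellation comes from a `close()`-style call and is therefore expected.

```python
import asyncio
import logging
from typing import List

logger = logging.getLogger("receive_loop_demo")


async def receive_loop(received: List[int]) -> None:
    """Hypothetical receive loop that runs until it is cancelled."""
    try:
        counter = 0
        while True:
            await asyncio.sleep(0.01)
            received.append(counter)
            counter += 1
    except asyncio.CancelledError:
        # Cancellation is the normal shutdown path, so log at info level.
        logger.info("Receive task cancelled during shutdown")
        return


async def run_and_close() -> List[int]:
    received: List[int] = []
    task = asyncio.create_task(receive_loop(received))
    await asyncio.sleep(0.05)
    task.cancel()  # what a close() method would typically do
    await task  # returns normally because the loop handled the cancellation
    return received
```

Returning after the log mirrors the suggestion above; code whose callers need to observe the cancellation would re-raise instead of returning.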

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

♻️ Duplicate comments (2)
agents-core/vision_agents/core/events/manager.py (1)

539-542: Remove commented-out code and empty conditional block.

The commented-out module_name retrieval and logger.info call, along with the empty if block containing only pass, should be removed entirely. This dead code reduces maintainability with no functional benefit.

Apply this diff to clean up:

     async def _process_single_event(self, event):
         """Process a single event."""
         for handler in self._handlers.get(event.type, []):
-            #module_name = getattr(handler, '__module__', 'unknown')
-            if event.type not in self._silent_events:
-                pass
-                #logger.info(f"Called handler {handler.__name__} from {module_name} for event {event.type}")
-
             loop = asyncio.get_running_loop()
             handler_task = loop.create_task(self._run_handler(handler, event))
             self._handler_tasks[uuid.uuid4()] = handler_task

Alternatively, if the _silent_events check is reserved for future functionality, document that intent:

     async def _process_single_event(self, event):
         """Process a single event."""
         for handler in self._handlers.get(event.type, []):
-            #module_name = getattr(handler, '__module__', 'unknown')
-            if event.type not in self._silent_events:
-                pass
-                #logger.info(f"Called handler {handler.__name__} from {module_name} for event {event.type}")
+            # TODO: Add conditional logging based on _silent_events if needed
 
             loop = asyncio.get_running_loop()
             handler_task = loop.create_task(self._run_handler(handler, event))
             self._handler_tasks[uuid.uuid4()] = handler_task
plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (1)

345-347: Log task cancellation at info level, not error.

CancelledError is expected when close() cancels the receive task during normal shutdown. Logging at error level creates false alarms.

Apply this diff:

         except CancelledError:
-            logger.error("Stop async iteration exception")
+            self.logger.info("Receive task cancelled during shutdown")
             return
🧹 Nitpick comments (1)
agents-core/vision_agents/core/llm/realtime.py (1)

40-41: Fix type annotation spacing to follow PEP 8.

PEP 8 calls for no space before the colon in a variable annotation and a single space after it. The current spacing is inconsistent with that convention.

Apply this diff:

-    fps : int = 1
-    session_id : str # UUID to identify this session
+    fps: int = 1
+    session_id: str  # UUID to identify this session
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 4b540c9 and 186053f.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (4)
  • agents-core/vision_agents/core/events/manager.py (1 hunks)
  • agents-core/vision_agents/core/llm/realtime.py (4 hunks)
  • plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (3 hunks)
  • plugins/openai/vision_agents/plugins/openai/openai_realtime.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (.cursor/rules/python.mdc)

**/*.py: Do not modify sys.path in Python code
Docstrings must follow the Google style guide

Files:

  • plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py
  • plugins/openai/vision_agents/plugins/openai/openai_realtime.py
  • agents-core/vision_agents/core/events/manager.py
  • agents-core/vision_agents/core/llm/realtime.py
🧬 Code graph analysis (3)
plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (5)
agents-core/vision_agents/core/llm/realtime.py (1)
  • close (179-180)
plugins/openai/vision_agents/plugins/openai/openai_realtime.py (1)
  • close (145-146)
agents-core/vision_agents/core/agents/agents.py (1)
  • close (432-503)
plugins/getstream/vision_agents/plugins/getstream/stream_edge_transport.py (2)
  • close (40-41)
  • close (325-327)
plugins/openai/vision_agents/plugins/openai/rtc_manager.py (1)
  • close (765-789)
plugins/openai/vision_agents/plugins/openai/openai_realtime.py (2)
agents-core/vision_agents/core/llm/realtime.py (1)
  • close (179-180)
plugins/openai/vision_agents/plugins/openai/rtc_manager.py (1)
  • close (765-789)
agents-core/vision_agents/core/llm/realtime.py (5)
plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (1)
  • close (369-383)
plugins/openai/vision_agents/plugins/openai/openai_realtime.py (1)
  • close (145-146)
agents-core/vision_agents/core/agents/agents.py (1)
  • close (432-503)
plugins/getstream/vision_agents/plugins/getstream/stream_edge_transport.py (2)
  • close (40-41)
  • close (325-327)
plugins/openai/vision_agents/plugins/openai/rtc_manager.py (1)
  • close (765-789)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: unit / Ruff & mypy
  • GitHub Check: unit / Test "not integration"
  • GitHub Check: unit / Ruff & mypy
  • GitHub Check: unit / Test "not integration"
🔇 Additional comments (7)
plugins/openai/vision_agents/plugins/openai/openai_realtime.py (1)

145-146: LGTM! Clean delegation to RTCManager.

The implementation correctly delegates cleanup to self.rtc.close(), which handles WebRTC resources, data channels, and tracks.

agents-core/vision_agents/core/llm/realtime.py (3)

48-48: Good refactor to public attribute.

Replacing the internal _is_connected tracking with a public connected flag improves API clarity and simplifies state management.


78-78: LGTM! Proper state synchronization.

The connected attribute is correctly updated in both connect and disconnect event emitters, ensuring consistent state tracking.

Also applies to: 94-94


178-180: Excellent API evolution.

Replacing _close_impl with an abstract public close method provides a cleaner, more explicit shutdown contract for subclasses.
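
As a rough illustration of that contract, here is a sketch with hypothetical class names (`RealtimeBase`, `DemoRealtime`); it is not the actual `vision_agents` base class, only the general shape of an abstract public `close` that subclasses must provide.

```python
import abc
import asyncio


class RealtimeBase(abc.ABC):
    """Hypothetical base class, not the real vision_agents Realtime."""

    def __init__(self) -> None:
        self.connected = True

    @abc.abstractmethod
    async def close(self) -> None:
        """Release provider resources; every subclass must implement this."""


class DemoRealtime(RealtimeBase):
    async def close(self) -> None:
        # A real plugin would cancel tasks and close sockets here.
        self.connected = False


demo = DemoRealtime()
asyncio.run(demo.close())

try:
    RealtimeBase()  # abstract class: instantiation is rejected
    base_instantiable = True
except TypeError:
    base_instantiable = False
```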

plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (3)

3-3: LGTM! Necessary import for exception handling.

The CancelledError import is properly used in the receive loop exception handling at line 345.


344-344: LGTM! Cleaner log formatting.

Single-line format improves log readability for unrecognized event warnings.


369-383: LGTM! Robust shutdown implementation.

The close method properly:

  • Sets connection state to False
  • Cancels and awaits the receive task
  • Closes the session context using the async context manager protocol
  • Handles exceptions gracefully during cleanup

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

♻️ Duplicate comments (3)
plugins/aws/vision_agents/plugins/aws/aws_realtime.py (3)

163-181: Return early when disconnected; convert numpy to bytes.

Line 168 logs a warning but execution continues, which will fail on subsequent operations. Additionally, pcm.samples is a numpy array but is passed to functions expecting bytes (lines 176, 178).

Apply this diff to fix both issues:

     async def simple_audio_response(
         self, pcm: PcmData, participant: Optional[Participant] = None
     ):
         """Send audio data to the model for processing."""
         if not self.connected:
-            self.logger.warning("realtime is not active. can't call simple_audio_response")
+            self.logger.warning("realtime is not active; ignoring simple_audio_response")
+            return
 
         # Resample from 48kHz to 24kHz if needed
         pcm = pcm.resample(24000)
         
         content_name = str(uuid.uuid4())
 
         await self.audio_content_start(content_name)
-        self._emit_audio_input_event(pcm.samples, sample_rate=pcm.sample_rate)
+        samples_bytes = pcm.samples.tobytes()
+        self._emit_audio_input_event(samples_bytes, sample_rate=pcm.sample_rate)
         # Convert PcmData to base64 encoded bytes
-        audio_base64 = base64.b64encode(pcm.samples).decode('utf-8')
+        audio_base64 = base64.b64encode(samples_bytes).decode('utf-8')
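
A standard-library sketch of the two fixes, under stated assumptions: `array.array("h", ...)` stands in for the numpy int16 array held by `pcm.samples`, and `encode_pcm` is a hypothetical helper, not a function in the plugin.

```python
import array
import base64
from typing import Optional


def encode_pcm(samples: array.array, connected: bool) -> Optional[str]:
    """Return base64 text for 16-bit PCM samples, or None when disconnected."""
    if not connected:
        # Return early instead of continuing with a dead session.
        return None
    samples_bytes = samples.tobytes()  # raw bytes, analogous to ndarray.tobytes()
    return base64.b64encode(samples_bytes).decode("utf-8")


pcm = array.array("h", [0, 1, -1, 32767])  # "h" = signed 16-bit integers
encoded = encode_pcm(pcm, connected=True)
skipped = encode_pcm(pcm, connected=False)
```

Converting once and reusing `samples_bytes` keeps the emitted event and the base64 payload in sync.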

322-327: Guard against missing stream.

The method can raise AttributeError if called when self.stream or its input_stream is not available (before connect or after close).

Apply this diff:

     async def send_event(self, event_data: Dict[str, Any]) -> None:
+        """
+        Send event to Bedrock stream.
+        
+        Args:
+            event_data: Event data to send.
+        """
+        if not self.connected or not getattr(self, "stream", None):
+            self.logger.warning("send_event called while disconnected; dropping event")
+            return
+            
         event_json = json.dumps(event_data)
         event = InvokeModelWithBidirectionalStreamInputChunk(
             value=BidirectionalInputPayloadPart(bytes_=event_json.encode('utf-8'))
         )
         await self.stream.input_stream.send(event)
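
The guard can be exercised in isolation. In this sketch `FakeInputStream` and `Sender` are hypothetical stand-ins for the Bedrock stream and the plugin class, and the boolean return value is added only so the behaviour is observable.

```python
import asyncio
import json
from typing import Any, Dict, List, Optional


class FakeInputStream:
    """Hypothetical stand-in that records what would be sent."""

    def __init__(self) -> None:
        self.sent: List[bytes] = []

    async def send(self, payload: bytes) -> None:
        self.sent.append(payload)


class Sender:
    def __init__(self) -> None:
        self.connected = False
        self.input_stream: Optional[FakeInputStream] = None

    async def send_event(self, event_data: Dict[str, Any]) -> bool:
        """Send one JSON event; return False if it was dropped."""
        if not self.connected or self.input_stream is None:
            return False  # dropped: no stream before connect or after close
        await self.input_stream.send(json.dumps(event_data).encode("utf-8"))
        return True


async def demo():
    sender = Sender()
    dropped = await sender.send_event({"event": {}})  # not connected yet
    sender.input_stream = FakeInputStream()
    sender.connected = True
    delivered = await sender.send_event({"event": {"ping": 1}})
    return dropped, delivered, sender.input_stream.sent


dropped, delivered, sent = asyncio.run(demo())
```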

391-391: Fix logging format error.

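Line 391 passes a second argument without a format placeholder, so the payload is not rendered in the message; depending on the argument's type, this either silently drops it or triggers a logging formatting error.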

Apply this diff:

                                 elif 'completionStart' in json_data['event']:
-                                    logger.info("Completion start from AWS Bedrock", json_data['event']['completionStart'])
+                                    logger.info("Completion start from AWS Bedrock: %s", json_data['event']['completionStart'])
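
A small standalone check of the corrected call, using only the standard `logging` module. The logger name and the payload are made up for illustration; the handler writes to an in-memory buffer so the rendered message can be inspected.

```python
import io
import logging

stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(message)s"))

logger = logging.getLogger("bedrock_sketch")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False

payload = {"sessionId": "abc"}  # made-up payload for illustration
# The %s placeholder gives the extra argument somewhere to go.
logger.info("Completion start from AWS Bedrock: %s", payload)
output = stream.getvalue()
```

Passing the argument separately also defers string formatting until the record is actually emitted.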
🧹 Nitpick comments (11)
plugins/aws/vision_agents/plugins/aws/aws_realtime.py (11)

125-161: Use specific exception type.

Line 160 raises a generic Exception. Use a more specific exception type like ValueError or create a custom exception class.

         if not system_instructions:
-            raise Exception("AWS Bedrock requires system instructions before sending regular user input")
+            raise ValueError("AWS Bedrock requires system instructions before sending regular user input")

206-216: Add docstring and clarify parameter name.

The method lacks a docstring, and the parameter name audio_bytes is misleading since it expects a base64-encoded string, not bytes.

Based on coding guidelines.

-    async def audio_input(self, content_name: str, audio_bytes: str):
+    async def audio_input(self, content_name: str, audio_base64: str):
+        """
+        Send base64-encoded audio input.
+        
+        Args:
+            content_name: Unique identifier for this audio content.
+            audio_base64: Base64-encoded audio data.
+        """
         audio_event = {
             "event": {
                 "audioInput": {
                     "promptName": self.session_id,
                     "contentName": content_name,
-                    "content": audio_bytes
+                    "content": audio_base64
                 }
             }
         }

218-238: Add docstring.

The method lacks a docstring explaining its purpose and parameters.

Based on coding guidelines.

     async def audio_content_start(self, content_name: str, role: str="USER"):
+        """
+        Send audio content start event.
+        
+        Args:
+            content_name: Unique identifier for this audio content.
+            role: Role for the content (default: "USER").
+        """

240-254: Add docstring.

The method lacks a docstring. Also, consider whether the TODO for configurable inference parameters should be addressed now or tracked separately.

Based on coding guidelines.

     async def start_session(self):
-        # subclass this to change the session start
+        """
+        Send session start event with inference configuration.
+        
+        Subclass this method to customize inference parameters.
+        """

256-278: Add docstring.

The method lacks a docstring.

Based on coding guidelines.

     async def start_prompt(self):
+        """Send prompt start event with output configuration."""

282-297: Add docstring.

The method lacks a docstring.

Based on coding guidelines.

     async def text_content_start(self, content_name: str, role: str):
+        """
+        Send text content start event.
+        
+        Args:
+            content_name: Unique identifier for this text content.
+            role: Role for the content (e.g., "USER", "SYSTEM").
+        """

299-309: Add docstring.

The method lacks a docstring.

Based on coding guidelines.

     async def text_input(self, content_name: str, content: str):
+        """
+        Send text input event.
+        
+        Args:
+            content_name: Unique identifier for this text content.
+            content: The text content to send.
+        """

311-320: Add docstring.

The method lacks a docstring.

Based on coding guidelines.

     async def content_end(self, content_name: str):
+        """
+        Send content end event.
+        
+        Args:
+            content_name: Unique identifier for the content.
+        """

329-354: Add docstring.

The method lacks a docstring, but the cleanup logic is correct.

Based on coding guidelines.

     async def close(self):
+        """Close the Bedrock realtime session and clean up resources."""

357-468: Expand docstring.

The existing one-line docstring does not explain which events the method handles.

Based on coding guidelines.

     async def _handle_events(self):
-        """Process incoming responses from AWS Bedrock."""
+        """
+        Process incoming responses from AWS Bedrock.
+        
+        Handles event types: contentStart, textOutput, audioOutput, toolUse, 
+        contentEnd, completionEnd, and usageEvent. Runs until stream ends or
+        an error occurs.
+        """

373-443: Standardize logging format for consistency.

Multiple logging calls use f-strings (lines 373, 389, 408, 420, 434, 437, 443) while others use % formatting (lines 455, 465). For consistency and better log aggregation, use % formatting throughout.

Example for line 373:

-                                    logger.info(f"Content start from AWS Bedrock: {content_start}")
+                                    logger.info("Content start from AWS Bedrock: %s", content_start)

Apply similar changes to lines 389, 408, 420, 434, 437, and 443.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 186053f and 7da96db.

📒 Files selected for processing (1)
  • plugins/aws/vision_agents/plugins/aws/aws_realtime.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (.cursor/rules/python.mdc)

**/*.py: Do not modify sys.path in Python code
Docstrings must follow the Google style guide

Files:

  • plugins/aws/vision_agents/plugins/aws/aws_realtime.py
🧬 Code graph analysis (1)
plugins/aws/vision_agents/plugins/aws/aws_realtime.py (6)
agents-core/vision_agents/core/utils/video_forwarder.py (1)
  • VideoForwarder (13-188)
agents-core/vision_agents/core/processors/base_processor.py (1)
  • Processor (32-40)
agents-core/vision_agents/core/edge/types.py (2)
  • PcmData (37-183)
  • resample (129-183)
agents-core/vision_agents/core/llm/events.py (1)
  • RealtimeAudioOutputEvent (37-45)
agents-core/vision_agents/core/llm/llm.py (1)
  • _build_enhanced_instructions (61-85)
agents-core/vision_agents/core/llm/realtime.py (1)
  • _emit_audio_input_event (103-114)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: unit / Test "not integration"
  • GitHub Check: unit / Ruff & mypy
  • GitHub Check: unit / Test "not integration"
  • GitHub Check: unit / Ruff & mypy
🔇 Additional comments (4)
plugins/aws/vision_agents/plugins/aws/aws_realtime.py (4)

1-21: LGTM!

Imports are well-organized and follow standard conventions.


24-25: LGTM!

Constants are clearly defined and appropriately named.


28-32: TODO is acknowledged in PR objectives.

The planned function calling and chat integration work is appropriately tracked here and in the PR description.


196-204: LGTM!

The three-step wrapping pattern correctly handles Nova's event requirements.
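
For readers unfamiliar with the pattern, a sketch of the start / input / end triple. The `promptName`, `contentName` and `content` fields mirror the `audioInput` event quoted earlier in this review; the `contentStart`, `textInput` and `contentEnd` shapes and the `wrapped_text_events` helper are assumptions for illustration and omit most of the real Nova schema.

```python
import uuid
from typing import Any, Dict, List


def wrapped_text_events(prompt_name: str, text: str, role: str = "USER") -> List[Dict[str, Any]]:
    """Build the start / input / end triple for one piece of text content."""
    content_name = str(uuid.uuid4())  # client-generated id shared by all three events
    ids = {"promptName": prompt_name, "contentName": content_name}
    return [
        {"event": {"contentStart": {**ids, "role": role}}},
        {"event": {"textInput": {**ids, "content": text}}},
        {"event": {"contentEnd": dict(ids)}},
    ]


events = wrapped_text_events("session-123", "hello")
```

Each event would then be sent in order through `send_event`.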

Comment on lines +35 to +81
class Realtime(realtime.Realtime):
    """
    Realtime on AWS with support for audio/video streaming (uses AWS Bedrock).

    A few things are different about Nova compared to other STS solutions

        1. two init events. there is a session start and a prompt start
        2. promptName basically works like a unique identifier. it's created client side and sent to nova
        3. input/text events are wrapped. so its common to do start event, text event, stop event
        4. on close there is an session and a prompt end event

    AWS Nova samples are the best docs:

        simple: https://github.com/aws-samples/amazon-nova-samples/blob/main/speech-to-speech/sample-codes/console-python/nova_sonic_simple.py
        full: https://github.com/aws-samples/amazon-nova-samples/blob/main/speech-to-speech/sample-codes/console-python/nova_sonic.py
        tool use: https://github.com/aws-samples/amazon-nova-samples/blob/main/speech-to-speech/sample-codes/console-python/nova_sonic_tool_use.py

    Input event docs: https://docs.aws.amazon.com/nova/latest/userguide/input-events.html
    Available voices are documented here:
    https://docs.aws.amazon.com/nova/latest/userguide/available-voices.html

    Resumption example:
    https://github.com/aws-samples/amazon-nova-samples/tree/main/speech-to-speech/repeatable-patterns/resume-conversation



    Examples:

        from vision_agents.plugins import aws

        llm = aws.Realtime(
            model="us.amazon.nova-sonic-v1:0",
            region_name="us-east-1"
        )

        # Connect to the session
        await llm.connect()

        # Simple text response
        await llm.simple_response("Describe what you see and say hi")

        # Send audio
        await llm.simple_audio_response(pcm_data)

        # Close when done
        await llm.close()
    """

⚠️ Potential issue | 🟡 Minor

Docstring should follow Google style guide.

The docstring provides excellent information but doesn't follow the required Google style format with sections like Attributes: and Examples:.

Based on coding guidelines.

Apply this diff to restructure the docstring:

     """
-    Realtime on AWS with support for audio/video streaming (uses AWS Bedrock).
+    Realtime on AWS with support for audio/video streaming using AWS Bedrock.
 
-    A few things are different about Nova compared to other STS solutions
+    Nova Sonic has several differences compared to other STS solutions:
+    1. Two initialization events: session start and prompt start
+    2. promptName acts as a unique client-side identifier sent to Nova
+    3. Input/text events are wrapped in start/stop events
+    4. Close requires both session and prompt end events
 
-        1. two init events. there is a session start and a prompt start
-        2. promptName basically works like a unique identifier. it's created client side and sent to nova
-        3. input/text events are wrapped. so its common to do start event, text event, stop event
-        4. on close there is an session and a prompt end event
+    See AWS Nova samples for reference:
+    - Simple: https://github.com/aws-samples/amazon-nova-samples/blob/main/speech-to-speech/sample-codes/console-python/nova_sonic_simple.py
+    - Full: https://github.com/aws-samples/amazon-nova-samples/blob/main/speech-to-speech/sample-codes/console-python/nova_sonic.py
+    - Tool use: https://github.com/aws-samples/amazon-nova-samples/blob/main/speech-to-speech/sample-codes/console-python/nova_sonic_tool_use.py
 
-    AWS Nova samples are the best docs:
+    Input event docs: https://docs.aws.amazon.com/nova/latest/userguide/input-events.html
+    Available voices: https://docs.aws.amazon.com/nova/latest/userguide/available-voices.html
+    Resumption example: https://github.com/aws-samples/amazon-nova-samples/tree/main/speech-to-speech/repeatable-patterns/resume-conversation
 
-        simple: https://github.com/aws-samples/amazon-nova-samples/blob/main/speech-to-speech/sample-codes/console-python/nova_sonic_simple.py
-        full: https://github.com/aws-samples/amazon-nova-samples/blob/main/speech-to-speech/sample-codes/console-python/nova_sonic.py
-        tool use: https://github.com/aws-samples/amazon-nova-samples/blob/main/speech-to-speech/sample-codes/console-python/nova_sonic_tool_use.py
+    Attributes:
+        connected: Connection status flag.
+        voice_id: Voice identifier for audio output.
 
-    Input event docs: https://docs.aws.amazon.com/nova/latest/userguide/input-events.html
-    Available voices are documented here:
-    https://docs.aws.amazon.com/nova/latest/userguide/available-voices.html
-
-    Resumption example:
-    https://github.com/aws-samples/amazon-nova-samples/tree/main/speech-to-speech/repeatable-patterns/resume-conversation
-
-
-
-    Examples:
-    
+    Examples::
+        
         from vision_agents.plugins import aws
         
         llm = aws.Realtime(

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In plugins/aws/vision_agents/plugins/aws/aws_realtime.py around lines 35 to 81,
reformat the existing long docstring to follow the Google Python style: start
with a one-line summary, a blank line, then an expanded description section that
briefly lists the Nova-specific differences, add an Attributes: section
enumerating key init parameters (e.g., model, region_name) with short
types/descriptions, include an Examples: section containing the provided usage
snippet, and move reference links into a Notes or References subsection;
preserve the same content but restructure headings and bullet points to match
Google style.

Comment on lines +85 to +123
    def __init__(
            self,
            model: str = DEFAULT_MODEL,
            region_name: str = "us-east-1",
            voice_id: str = "matthew",
            **kwargs
    ) -> None:
        """
        """
        super().__init__(**kwargs)
        self.model = model
        self.region_name = region_name
        self.sample_rate = 24000
        self.voice_id = voice_id

        # Initialize Bedrock Runtime client with SDK
        config = Config(
            endpoint_uri=f"https://bedrock-runtime.{region_name}.amazonaws.com",
            region=region_name,
            aws_credentials_identity_resolver=EnvironmentCredentialsResolver(),
        )
        self.client = BedrockRuntimeClient(config=config)
        self.logger = logging.getLogger(__name__)

        # Audio output track - Bedrock typically outputs at 16kHz
        self.output_track = AudioStreamTrack(
            framerate=24000, stereo=False, format="s16"
        )

        self._video_forwarder: Optional[VideoForwarder] = None
        self._stream_task: Optional[asyncio.Task[Any]] = None
        self._is_connected = False
        self._message_queue: asyncio.Queue[Dict[str, Any]] = asyncio.Queue()
        self._conversation_messages: List[Dict[str, Any]] = []
        self._pending_tool_uses: Dict[int, Dict[str, Any]] = {}  # Track tool calls across stream events

        # Audio streaming configuration
        self.prompt_name = self.session_id

⚠️ Potential issue | 🟡 Minor

Add docstring following Google style.

The __init__ method's docstring is empty. Per coding guidelines, it should document the parameters.

Based on coding guidelines.

Apply this diff:

     def __init__(
             self,
             model: str = DEFAULT_MODEL,
             region_name: str = "us-east-1",
             voice_id: str = "matthew",
             **kwargs
     ) -> None:
         """
+        Initialize AWS Bedrock Realtime client.
+
+        Args:
+            model: Bedrock model ID to use for realtime inference.
+            region_name: AWS region name for Bedrock service.
+            voice_id: Voice identifier for audio output (e.g., "matthew").
+            **kwargs: Additional arguments passed to parent Realtime class.
         """
🤖 Prompt for AI Agents
In plugins/aws/vision_agents/plugins/aws/aws_realtime.py around lines 85 to 123,
the __init__ method is missing a docstring; add a Google-style docstring that
starts with a one-line summary of the constructor, followed by an Args section
describing each parameter (model: str, region_name: str, voice_id: str,
**kwargs), their types and defaults, and any important instance attributes
initialized (client, logger, output_track, sample_rate, prompt_name), and end
with a Returns: None line; keep it concise and follow Google docstring
formatting conventions.

        super().__init__(**kwargs)
        self.model = model
        self.region_name = region_name
        self.sample_rate = 24000

⚠️ Potential issue | 🟡 Minor

DEFAULT_SAMPLE_RATE constant is unused and misleading.

Line 98 hardcodes 24000 Hz, while DEFAULT_SAMPLE_RATE (line 25) is 16000 Hz and never used. This inconsistency could confuse readers.

Consider either using the constant consistently or documenting why 24000 Hz is required for Bedrock:

-        self.sample_rate = 24000
+        # AWS Bedrock Nova requires 24kHz sample rate
+        self.sample_rate = 24000
🤖 Prompt for AI Agents
In plugins/aws/vision_agents/plugins/aws/aws_realtime.py around line 98, the
sample rate is hardcoded to 24000 while DEFAULT_SAMPLE_RATE (defined at line 25
as 16000) is unused and misleading; replace the hardcoded value by using the
DEFAULT_SAMPLE_RATE constant (or rename/update the constant to
DEFAULT_BEDROCK_SAMPLE_RATE and use that) so the constant is referenced
consistently, or if 24000 is required for Bedrock, update the constant value
and/or its name and add a brief inline comment explaining why 24000 Hz is
necessary.

Comment on lines +183 to +194
    async def simple_response(self, text: str, processors: Optional[List[Processor]] = None,
                              participant: Optional[Participant] = None):
        """
        Simple response standardizes how to send a text instruction to this LLM.

        Example:
            llm.simple_response("tell me a poem about Boulder")

        For more advanced use cases you can use the native send_realtime_input
        """
        self.logger.info("Simple response called with text: %s", text)
        await self.content_input(content=text, role="USER")

⚠️ Potential issue | 🟡 Minor

Unused parameter: processors.

The processors parameter is declared but never used in the method body. Either implement processor support or remove the parameter.

-    async def simple_response(self, text: str, processors: Optional[List[Processor]] = None,
-                              participant: Optional[Participant] = None):
+    async def simple_response(self, text: str, participant: Optional[Participant] = None):

Alternatively, document why it's reserved for future use:

     async def simple_response(self, text: str, processors: Optional[List[Processor]] = None,
                               participant: Optional[Participant] = None):
         """
-        Simple response standardizes how to send a text instruction to this LLM.
+        Send a text instruction to this LLM.
+        
+        Args:
+            text: The text message to send.
+            processors: Reserved for future use (currently ignored).
+            participant: Optional participant information.
 
-        Example:
+        Example::
+            
             llm.simple_response("tell me a poem about Boulder")
-
-        For more advanced use cases you can use the native send_realtime_input
         """
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-    async def simple_response(self, text: str, processors: Optional[List[Processor]] = None,
-                              participant: Optional[Participant] = None):
-        """
-        Simple response standardizes how to send a text instruction to this LLM.
-
-        Example:
-            llm.simple_response("tell me a poem about Boulder")
-
-        For more advanced use cases you can use the native send_realtime_input
-        """
-        self.logger.info("Simple response called with text: %s", text)
-        await self.content_input(content=text, role="USER")
+    async def simple_response(self, text: str, participant: Optional[Participant] = None):
+        """
+        Send a text instruction to this LLM.
+
+        Args:
+            text: The text message to send.
+            participant: Optional participant information.
+
+        Example::
+
+            llm.simple_response("tell me a poem about Boulder")
+        """
+        self.logger.info("Simple response called with text: %s", text)
+        await self.content_input(content=text, role="USER")

Suggested change
-    async def simple_response(self, text: str, processors: Optional[List[Processor]] = None,
-                              participant: Optional[Participant] = None):
-        """
-        Simple response standardizes how to send a text instruction to this LLM.
-
-        Example:
-            llm.simple_response("tell me a poem about Boulder")
-
-        For more advanced use cases you can use the native send_realtime_input
-        """
-        self.logger.info("Simple response called with text: %s", text)
-        await self.content_input(content=text, role="USER")
+    async def simple_response(self, text: str, processors: Optional[List[Processor]] = None,
+                              participant: Optional[Participant] = None):
+        """
+        Send a text instruction to this LLM.
+
+        Args:
+            text: The text message to send.
+            processors: Reserved for future use (currently ignored).
+            participant: Optional participant information.
+
+        Example::
+
+            llm.simple_response("tell me a poem about Boulder")
+        """
+        self.logger.info("Simple response called with text: %s", text)
+        await self.content_input(content=text, role="USER")

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (1)
.github/workflows/run_tests.yml (1)

53-60: Redundant heredoc export; environment variable is already set.

The env: section at lines 34–46 already exports AWS_BEARER_TOKEN_BEDROCK for all subsequent steps. Re-exporting it to GITHUB_ENV via heredoc is redundant unless you need to modify or re-evaluate the value mid-workflow. Simplify by removing this step or explain why the re-export is necessary.

-      - name: Export AWS_BEARER_TOKEN_BEDROCK (heredoc)
-        shell: bash
-        run: |
-          {
-            echo 'AWS_BEARER_TOKEN_BEDROCK<<EOF'
-            echo "${{ secrets.AWS_BEARER_TOKEN_BEDROCK }}"
-            echo 'EOF'
-          } >> "$GITHUB_ENV"
-
       - name: Verify presence
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 6893968 and b5482da.

⛔ Files ignored due to path filters (1)
  • examples/02_golf_coach_example/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (2)
  • .github/workflows/run_tests.yml (2 hunks)
  • plugins/aws/tests/test_aws.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • plugins/aws/tests/test_aws.py
🧰 Additional context used
🪛 actionlint (1.7.8)
.github/workflows/run_tests.yml

28-28: property "environment" is not defined in object type {check_run_id: number; container: {id: string; network: string}; services: {string => {id: string; network: string; ports: {string => string}}}; status: string}

(expression)


46-46: property "aws_bearer_token_bedrock" is not defined in object type {actions_runner_debug: string; actions_step_debug: string; github_token: string}

(expression)


47-47: property "aws_bearer_token_bedrock" is not defined in object type {actions_runner_debug: string; actions_step_debug: string; github_token: string}

(expression)


55-55: property "aws_bearer_token_bedrock" is not defined in object type {actions_runner_debug: string; actions_step_debug: string; github_token: string}

(expression)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: unit / Ruff & mypy
  • GitHub Check: unit / Test "not integration"
  • GitHub Check: unit / Ruff & mypy
  • GitHub Check: unit / Test "not integration"
🔇 Additional comments (1)
.github/workflows/run_tests.yml (1)

46-46: AWS_BEARER_TOKEN_BEDROCK secret is conditionally required for AWS integration tests.

The secret reference is valid and necessary, but only for AWS Bedrock integration tests. The workflow automatically verifies its presence via the "Verify presence" step. This secret is:

  • Required only when running AWS tests (controlled by pytest markers)
  • Documented in plugins/aws/README.md
  • Explicitly checked by test code in plugins/aws/tests/test_aws.py

If you run tests without the AWS marker, this secret is not needed. Configure it in repository settings (Settings → Secrets and variables → Actions) only if you intend to run AWS integration tests.

Comment on lines +27 to +29
      - name: Show environment name
        run: |
          echo "Environment: ${{ job.environment }}"

⚠️ Potential issue | 🔴 Critical

❓ Verification inconclusive

Line 29: job.environment is not a valid GitHub Actions context property.

The job context in GitHub Actions does not expose an environment property, so this step will print an empty value instead of the environment name. If you're trying to display the job's current environment or runner info, consider using runner.os, github.job, or other valid context properties instead.

If your intent is to debug environment setup, consider replacing this step with:

-      - name: Show environment name
+      - name: Show runner info
         run: |
-          echo "Environment: ${{ job.environment }}"
+          echo "Job: ${{ github.job }}, OS: ${{ runner.os }}"

job.environment is not a valid property in the GitHub Actions job context and will output an empty string.

The job context does not have environment name information available. When you attempt to dereference a nonexistent property, it will evaluate to an empty string, making this step output "Environment: " with no value. Consider using valid context properties like ${{ github.job }} or ${{ runner.os }} if you need to display debugging information.

🧰 Tools
🪛 actionlint (1.7.8)

28-28: property "environment" is not defined in object type {check_run_id: number; container: {id: string; network: string}; services: {string => {id: string; network: string; ports: {string => string}}}; status: string}

(expression)

🤖 Prompt for AI Agents
In .github/workflows/run_tests.yml around lines 27 to 29, the step echoes `${{
job.environment }}` which is not a valid GitHub Actions job context and will
print an empty string; replace that reference with a valid context variable (for
example `${{ github.job }}` to show the job name or `${{ runner.os }}` to show
the runner OS) so the echo prints meaningful information, and update the step to
use the chosen valid property.

Comment on lines +46 to +47
AWS_BEARER_TOKEN_BEDROCK: "${{ secrets.AWS_BEARER_TOKEN_BEDROCK }}"
_BEARER_TOKEN_BEDROCK: "${{ secrets.AWS_BEARER_TOKEN_BEDROCK }}"

⚠️ Potential issue | 🟠 Major

Line 47: Redundant/incorrect environment variable _BEARER_TOKEN_BEDROCK.

The second env var with a leading underscore (_BEARER_TOKEN_BEDROCK) appears to be a typo or mistake. Both variables reference the same secret, but the underscore-prefixed name is unusual and unclear. Either remove it or clarify its purpose.

-      AWS_BEARER_TOKEN_BEDROCK: "${{ secrets.AWS_BEARER_TOKEN_BEDROCK }}"
-      _BEARER_TOKEN_BEDROCK: "${{ secrets.AWS_BEARER_TOKEN_BEDROCK }}"
+      AWS_BEARER_TOKEN_BEDROCK: "${{ secrets.AWS_BEARER_TOKEN_BEDROCK }}"
🧰 Tools
🪛 actionlint (1.7.8)

46-46: property "aws_bearer_token_bedrock" is not defined in object type {actions_runner_debug: string; actions_step_debug: string; github_token: string}

(expression)


47-47: property "aws_bearer_token_bedrock" is not defined in object type {actions_runner_debug: string; actions_step_debug: string; github_token: string}

(expression)

🤖 Prompt for AI Agents
In .github/workflows/run_tests.yml around lines 46-47, the environment variable
_BEARER_TOKEN_BEDROCK is a redundant/likely-typo duplicate of
AWS_BEARER_TOKEN_BEDROCK; remove the underscore-prefixed entry (or rename it to
the intended variable if it was meant to be distinct) and ensure only the
correct env var (AWS_BEARER_TOKEN_BEDROCK) is exported from secrets so the
workflow uses a single, clearly named token variable.
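
The kind of duplication flagged above can also be caught mechanically. The following is a minimal sketch, assuming the workflow's `env` block has already been loaded into a plain mapping of variable names to their string values; the function name `duplicate_secret_refs` is hypothetical, and the check is not part of actionlint or any existing tool.

```python
import re
from collections import defaultdict
from typing import Dict, List, Mapping

# Matches expressions such as "${{ secrets.AWS_BEARER_TOKEN_BEDROCK }}".
SECRET_REF = re.compile(r"\$\{\{\s*secrets\.([A-Za-z_][A-Za-z0-9_]*)\s*\}\}")


def duplicate_secret_refs(env: Mapping[str, str]) -> Dict[str, List[str]]:
    """Map each secret name to the env vars that reference it, keeping only
    secrets referenced by more than one variable."""
    by_secret: Dict[str, List[str]] = defaultdict(list)
    for var_name, value in env.items():
        # set() so a value mentioning the same secret twice counts once.
        for secret in set(SECRET_REF.findall(value)):
            by_secret[secret].append(var_name)
    return {s: names for s, names in by_secret.items() if len(names) > 1}
```

Run against the two entries from lines 46 to 47, this would report that both `AWS_BEARER_TOKEN_BEDROCK` and `_BEARER_TOKEN_BEDROCK` read from the same secret, which is the signal to remove or rename one of them.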

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
.github/workflows/ci.yml (1)

13-13: ✓ Enables secrets for AWS/Bedrock CI tests.

The secrets: inherit allows the reusable workflow to access credentials needed for the new Bedrock integrations. Syntax and placement are correct.

Consider being more explicit: instead of inheriting all repository secrets, explicitly pass only the AWS credentials required by the tests. This follows the principle of least privilege and reduces the blast radius if a test or workflow is compromised.

    with:
      marker: 'not integration'
-   secrets: inherit
+   secrets:
+     AWS_BEARER_TOKEN_BEDROCK: ${{ secrets.AWS_BEARER_TOKEN_BEDROCK }}

(Adjust secret names as needed for your actual AWS authentication setup.)

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between b5482da and c71da10.

📒 Files selected for processing (1)
  • .github/workflows/ci.yml (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: unit / Test "not integration"
  • GitHub Check: unit / Ruff & mypy
  • GitHub Check: unit / Ruff & mypy
  • GitHub Check: unit / Test "not integration"

@tschellenbach tschellenbach merged commit c954409 into main Oct 21, 2025
5 checks passed
@tschellenbach tschellenbach deleted the bedrock branch October 21, 2025 20:18
Nash0x7E2 added a commit to Nash0x7E2/Vision-Agents that referenced this pull request Oct 28, 2025
commit ec32383
Author: Neevash Ramdial (Nash) <mail@neevash.dev>
Date:   Mon Oct 27 15:51:53 2025 -0600

    mypy clean up (GetStream#130)

commit c52fe4c
Author: Neevash Ramdial (Nash) <mail@neevash.dev>
Date:   Mon Oct 27 15:28:00 2025 -0600

    remove turn keeping from example (GetStream#129)

commit e1072e8
Merge: 5bcffa3 fea101a
Author: Yarik <43354956+yarikdevcom@users.noreply.github.com>
Date:   Mon Oct 27 14:28:05 2025 +0100

    Merge pull request GetStream#106 from tjirab/feat/20251017_gh-labeler

    feat: Github pull request labeler

commit 5bcffa3
Merge: 406673c bfe888f
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Sat Oct 25 10:56:27 2025 -0600

    Merge pull request GetStream#119 from GetStream/fix-screensharing

    Fix screensharing

commit bfe888f
Merge: 8019c14 406673c
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Sat Oct 25 10:56:15 2025 -0600

    Merge branch 'main' into fix-screensharing

commit 406673c
Author: Stefan Blos <stefan.blos@gmail.com>
Date:   Sat Oct 25 03:03:10 2025 +0200

    Update README (GetStream#118)

    * Changed README to LaRaes version

    * Remove arrows from table

    * Add table with people & projects to follow

    * Update images and links in README.md

commit 3316908
Author: Tommaso Barbugli <tbarbugli@gmail.com>
Date:   Fri Oct 24 23:48:06 2025 +0200

    Simplify TTS plugin and audio utils (GetStream#123)

    - Simplified TTS plugin
    - AWS Polly TTS plugin
    - OpenAI TTS plugin
    - Improved audio utils

commit 8019c14
Author: Max Kahan <max.kahan@getstream.io>
Date:   Fri Oct 24 17:32:26 2025 +0100

    remove video forwarder lazy init

commit ca62d37
Author: Max Kahan <max.kahan@getstream.io>
Date:   Thu Oct 23 16:44:03 2025 +0100

    use correct codec

commit 8cf8788
Author: Max Kahan <max.kahan@getstream.io>
Date:   Thu Oct 23 14:27:18 2025 +0100

    rename variable to fix convention

commit 33fd70d
Author: Max Kahan <max.kahan@getstream.io>
Date:   Thu Oct 23 14:24:42 2025 +0100

    unsubscribe from events

commit 3692131
Author: Max Kahan <max.kahan@getstream.io>
Date:   Thu Oct 23 14:19:53 2025 +0100

    remove nonexistent type

commit c5f68fe
Author: Max Kahan <max.kahan@getstream.io>
Date:   Thu Oct 23 14:10:07 2025 +0100

    cleanup tests to fit style

commit 8b3c61a
Author: Max Kahan <max.kahan@getstream.io>
Date:   Thu Oct 23 13:55:08 2025 +0100

    clean up resources when track cancelled

commit d8e08cb
Author: Max Kahan <max.kahan@getstream.io>
Date:   Thu Oct 23 13:24:55 2025 +0100

    fix track republishing in agent

commit 0f8e116
Author: Max Kahan <max.kahan@getstream.io>
Date:   Wed Oct 22 15:37:11 2025 +0100

    add tests

commit 08e6133
Author: Max Kahan <max.kahan@getstream.io>
Date:   Wed Oct 22 15:25:37 2025 +0100

    ensure video track dimensions are an even number

commit 6a725b0
Merge: 5f001e0 5088709
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Thu Oct 23 15:23:58 2025 -0600

    Merge pull request GetStream#122 from GetStream/cleanup_stt

    Cleanup STT

commit 5088709
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Thu Oct 23 15:23:34 2025 -0600

    cleanup of stt

commit f185120
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Thu Oct 23 15:08:42 2025 -0600

    more cleanup

commit 05ccbfd
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Thu Oct 23 14:51:48 2025 -0600

    cleanup

commit bb834ca
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Thu Oct 23 14:28:53 2025 -0600

    more cleanup for stt

commit 7a3f2d2
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Thu Oct 23 14:11:35 2025 -0600

    more test cleanup

commit ad7f4fe
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Thu Oct 23 14:10:57 2025 -0600

    cleanup test

commit 9e50cdd
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Thu Oct 23 14:03:45 2025 -0600

    large cleanup

commit 5f001e0
Merge: 95a03e4 5d204f3
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Thu Oct 23 12:01:52 2025 -0600

    Merge pull request GetStream#121 from GetStream/fish_stt

    [AI-201] Fish speech to text (partial)

commit 5d204f3
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Thu Oct 23 11:48:16 2025 -0600

    remove ugly tests

commit ee9a241
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Thu Oct 23 11:46:19 2025 -0600

    cleanup

commit 6eb8270
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Thu Oct 23 11:23:00 2025 -0600

    fix 48khz support

commit 3b90548
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Thu Oct 23 10:59:08 2025 -0600

    first attempt at fish stt, doesnt entirely work just yet

commit 95a03e4
Merge: b90c9e3 b4c0da8
Author: Tommaso Barbugli <tbarbugli@gmail.com>
Date:   Thu Oct 23 10:11:39 2025 +0200

    Merge branch 'main' of github.com:GetStream/Vision-Agents

commit b90c9e3
Author: Tommaso Barbugli <tbarbugli@gmail.com>
Date:   Wed Oct 22 23:28:28 2025 +0200

    remove print and double event handling

commit b4c0da8
Merge: 3d06446 a426bc2
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Wed Oct 22 15:08:51 2025 -0600

    Merge pull request GetStream#117 from GetStream/openrouter

    [AI-194] Openrouter

commit a426bc2
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Wed Oct 22 15:03:10 2025 -0600

    skip broken test

commit ba6c027
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Wed Oct 22 14:50:23 2025 -0600

    almost working openrouter

commit 0b1c873
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Wed Oct 22 14:47:12 2025 -0600

    almost working, just no instruction following

commit ce63233
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Wed Oct 22 14:35:53 2025 -0600

    working memory for openai

commit 149e886
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Wed Oct 22 13:32:43 2025 -0600

    todo

commit e0df1f6
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Wed Oct 22 13:20:38 2025 -0600

    first pass at adding openrouter

commit 3d06446
Merge: 4eb8ef4 ef55d66
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Wed Oct 22 13:20:11 2025 -0600

    Merge branch 'main' of github.com:GetStream/Vision-Agents

commit 4eb8ef4
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Wed Oct 22 13:20:01 2025 -0600

    cleanup ai plugin instructions

commit ef55d66
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Wed Oct 22 12:54:33 2025 -0600

    Add link to stash_pomichter for spatial memory

commit 9c9737f
Merge: c954409 390c45b
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Tue Oct 21 19:45:09 2025 -0600

    Merge pull request GetStream#115 from GetStream/fish

    [AI-195] Fish support

commit 390c45b
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Tue Oct 21 19:44:37 2025 -0600

    cleannup

commit 1cc1cf1
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Tue Oct 21 19:42:03 2025 -0600

    happy tests

commit 8163d32
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Tue Oct 21 19:39:21 2025 -0600

    fix gemini rule following

commit ada3ac9
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Tue Oct 21 19:20:18 2025 -0600

    fish tts

commit 61a26cf
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Tue Oct 21 16:44:03 2025 -0600

    attempt at fish

commit c954409
Merge: ab27e48 c71da10
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Tue Oct 21 14:18:15 2025 -0600

    Merge pull request GetStream#104 from GetStream/bedrock

    [AI-192] - Bedrock, AWS & Nova

commit c71da10
Author: Tommaso Barbugli <tbarbugli@gmail.com>
Date:   Tue Oct 21 22:00:25 2025 +0200

    maybe

commit b5482da
Author: Tommaso Barbugli <tbarbugli@gmail.com>
Date:   Tue Oct 21 21:46:15 2025 +0200

    debugging

commit 9a36e45
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Tue Oct 21 13:14:58 2025 -0600

    echo environment name

commit 6893968
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Tue Oct 21 12:53:58 2025 -0600

    more debugging

commit c35fc47
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Tue Oct 21 12:45:44 2025 -0600

    add some debug info

commit 0d6d3fd
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Tue Oct 21 12:03:13 2025 -0600

    run test fix

commit c3a31bd
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Tue Oct 21 11:52:25 2025 -0600

    log cache hit

commit 04554ae
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Tue Oct 21 11:48:03 2025 -0600

    fix glob

commit 7da96db
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Tue Oct 21 11:33:56 2025 -0600

    mypy

commit 186053f
Merge: 4b540c9 ab27e48
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Tue Oct 21 11:17:17 2025 -0600

    happy tests

commit 4b540c9
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Tue Oct 21 10:20:04 2025 -0600

    happy tests

commit b05a60a
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Tue Oct 21 09:17:45 2025 -0600

    add readme

commit 71affcc
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Tue Oct 21 09:13:01 2025 -0600

    rename to aws

commit d2eeba7
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Mon Oct 20 21:32:01 2025 -0600

    ai tts instructions

commit 98a4f9d
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Mon Oct 20 16:49:00 2025 -0600

    small edits

commit ab27e48
Author: Tommaso Barbugli <tbarbugli@gmail.com>
Date:   Mon Oct 20 21:42:04 2025 +0200

    Ensure user agent is initialized before joining the call (GetStream#113)

    * ensure user agent is initialized before joining the call

    * wip

commit 3cb339b
Author: Tommaso Barbugli <tbarbugli@gmail.com>
Date:   Mon Oct 20 21:22:57 2025 +0200

    New conversation API (GetStream#102)

    * trying to resurrect

    * test transcription events for openai

    * more tests for openai and gemini llm

    * more tests for openai and gemini llm

    * update py-client

    * wip

    * ruff

    * wip

    * ruff

    * snap

    * another way

    * another way, a better way

    * ruff

    * ruff

    * rev

    * ruffit

    * mypy everything

    * brief

    * tests

    * openai dep bump

    * snap - broken

    * nothingfuckingworks

    * message id

    * fix test

    * ruffit

commit cb6f00a
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Mon Oct 20 13:18:03 2025 -0600

    use qwen

commit f84b2ad
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Mon Oct 20 13:02:24 2025 -0600

    fix tests

commit e61acca
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Mon Oct 20 12:50:40 2025 -0600

    testing and linting

commit 5f4d353
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Mon Oct 20 12:34:14 2025 -0600

    working

commit c2a15a9
Merge: a310771 1025a42
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Mon Oct 20 11:40:00 2025 -0600

    Merge branch 'main' of github.com:GetStream/Vision-Agents into bedrock

commit a310771
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Mon Oct 20 11:39:48 2025 -0600

    wip

commit b4370f4
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Mon Oct 20 11:22:43 2025 -0600

    something isn't quite working

commit 2dac975
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Mon Oct 20 10:30:04 2025 -0600

    add the examples

commit 6885289
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Sun Oct 19 20:19:42 2025 -0600

    ai realtime docs

commit a0fa3cc
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Sun Oct 19 18:48:06 2025 -0600

    wip

commit b914fc3
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Sun Oct 19 18:40:22 2025 -0600

    fix ai llm

commit b5b00a7
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Sun Oct 19 17:11:26 2025 -0600

    work audio input

commit ac72260
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Sun Oct 19 16:47:19 2025 -0600

    fix model id

commit 2b5863c
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Sun Oct 19 16:32:54 2025 -0600

    wip on bedrock

commit 8bb4162
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Fri Oct 17 15:22:03 2025 -0600

    next up the connect method

commit 7a21e4e
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Fri Oct 17 14:12:00 2025 -0600

    nova progress

commit 16e8ba0
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Fri Oct 17 13:16:00 2025 -0600

    docs for bedrock nova

commit 1025a42
Author: Bart Schuijt <schuijt.bart@gmail.com>
Date:   Fri Oct 17 21:05:45 2025 +0200

    fix: Update .env.example for Gemini Live (GetStream#108)

commit e12112d
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Fri Oct 17 11:49:07 2025 -0600

    wip

commit fea101a
Author: Bart Schuijt <schuijt.bart@gmail.com>
Date:   Fri Oct 17 09:25:55 2025 +0200

    workflow file update

commit bb2d74c
Author: Bart Schuijt <schuijt.bart@gmail.com>
Date:   Fri Oct 17 09:22:33 2025 +0200

    initial commit

commit d2853cd
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Thu Oct 16 19:44:59 2025 -0600

    always remember pep 420

commit 30a8eca
Author: Thierry Schellenbach <thierry@getstream.io>
Date:   Thu Oct 16 19:36:58 2025 -0600

    start of bedrock branch

commit fc032bf
Author: Tommaso Barbugli <tbarbugli@gmail.com>
Date:   Thu Oct 16 09:17:42 2025 +0200

    Remove cli handler from examples (GetStream#101)

commit 39a821d
Author: Dan Gusev <dangusev92@gmail.com>
Date:   Tue Oct 14 12:20:41 2025 +0200

    Update Deepgram plugin to use SDK v5.0.0 (GetStream#98)

    * Update Deepgram plugin to use SDK v5.0.0

    * Merge test_realtime and test_stt and update the remaining tests

    * Make deepgram.STT.start() idempotent

    * Clean up unused import

    * Use uv as the default package manager > pip

    ---------

    Co-authored-by: Neevash Ramdial (Nash) <mail@neevash.dev>

commit 2013be5
Author: Tommaso Barbugli <tbarbugli@gmail.com>
Date:   Mon Oct 13 16:57:37 2025 +0200

    ensure chat works with default types (GetStream#99)