[AI-192] - Bedrock, AWS & Nova #104

tschellenbach · 2025-10-17T01:42:48Z

Implements Bedrock LLM support & Nova Sonic realtime support. Also some improvements to how LLM & realtime classes work and are documented. Future improvements will be done on a different PR:

- Can interrupt handling be improved? (INTERRUPTED with contentEnd should flush the audio buffer)
- Function calling for LLM needs to be tested
- Function calling for Realtime
- Clean up the function calling codebase
- Chat integration

Summary by CodeRabbit

New Features
- AWS Bedrock plugin (LLM + realtime) added with examples, README and a new AWS stream event type.
Documentation
- Added realtime/LLM/STT/TTS guides and concise plugin examples; removed legacy AI overview and several instruction guides.
API Changes
- Realtime now exposes session_id and connected; LLM responses may carry an optional exception; unified public close lifecycle.
Tests
- Added AWS integration tests; removed selected realtime base and event-logging tests.
Chores
- Packaging, example pyprojects, .env templates, workspace and CI updates.
Bug Fixes
- Suppressed noisy runtime handler logging.

coderabbitai · 2025-10-17T01:42:57Z

Walkthrough

Core event-manager logging removed; LLMResponseEvent now accepts an optional exception; Realtime base exposes public session_id/connected and requires a public close(); a new AWS Bedrock plugin (LLM + Realtime, streaming, multi‑hop tool support) and tests/examples/docs/CI/workspace updates were added; several docs deleted or rewritten.

Changes

Cohort / File(s)	Summary
Core framework `agents-core/vision_agents/core/events/manager.py`, `agents-core/vision_agents/core/llm/llm.py`, `agents-core/vision_agents/core/llm/realtime.py`	Removed runtime INFO logging in event manager; added optional `exception` to `LLMResponseEvent` constructor; Realtime now exposes `session_id` and public `connected` flag, removed `_is_connected`/`is_connected`, and made `close()` a required public API.
AWS plugin (implementation & exports) `plugins/aws/vision_agents/plugins/aws/aws_llm.py`, `plugins/aws/vision_agents/plugins/aws/aws_realtime.py`, `plugins/aws/vision_agents/plugins/aws/events.py`, `plugins/aws/vision_agents/plugins/aws/__init__.py`	Added `BedrockLLM` with converse/converse_stream (streaming, multi‑hop tool execution) and a Bedrock `Realtime` (bidirectional text/audio streaming, event dispatch, audio decoding); added `AWSStreamEvent`; re-exported `LLM` and `Realtime`.
AWS packaging, examples, tests `plugins/aws/pyproject.toml`, `plugins/aws/README.md`, `plugins/aws/example/`, `plugins/aws/tests/`	New plugin packaging, README, example projects/scripts (Qwen, Nova realtime), example pyproject/.env, and integration tests for Bedrock LLM and Realtime.
Docs & instructions `docs/ai/*` (multiple files)	Removed several high‑level docs (ai-overview, websocket, function-calling) and replaced/rewrote instruction pages to client-driven streaming examples (ai-llm, ai-realtime-llm, ai-stt, ai-tts, ai-plugin).
Tests & fixtures `conftest.py`, `tests/test_events.py`, `tests/test_realtime_base.py`	Added `golf_swing_image` pytest fixture; removed `test_silent_suppresses_handler_logging` from `tests/test_events.py`; deleted `tests/test_realtime_base.py`.
Other realtime plugins `plugins/gemini/.../gemini_realtime.py`, `plugins/openai/.../openai_realtime.py`	Unified teardown API by replacing internal `_close_impl` with public `close` on Realtime subclasses; added CancelledError handling in Gemini receive loop.
Workspace config `pyproject.toml`	Added `vision-agents-plugins-aws` to tool.uv.sources and included `plugins/aws` in workspace members.
Examples (misc) `plugins/example/example/*`	Added an example plugin project, env template, and run script demonstrating Agent composition using AWS LLM, TTS, STT, and TurnDetection.
CI & Actions `.github/actions/python-uv-setup/action.yml`, `.github/workflows/run_tests.yml`, `.github/workflows/ci.yml`	Upgraded setup-uv action to v7, expanded cache globs and cache logging; exposed `AWS_BEARER_TOKEN_BEDROCK` secret in test job env and added verification; propagated secrets to reusable workflow.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant User
  participant Agent
  participant Realtime
  participant Bedrock
  participant AudioTrack

  User->>Agent: request simple_response / simple_audio_response
  Agent->>Realtime: connect()
  Realtime->>Bedrock: open bidirectional stream / sessionStart
  User->>Realtime: simple_response(text) or simple_audio_response(pcm)
  Realtime->>Bedrock: send_event(contentStart -> input -> contentEnd)
  loop streaming chunks
    Bedrock-->>Realtime: textOutput / audioOutput (chunks)
    Realtime->>Agent: emit LLMResponseChunkEvent
    Realtime->>AudioTrack: decode audio -> emit RealtimeAudioOutputEvent
  end
  alt tool invocation
    Bedrock-->>Realtime: toolUse
    Realtime->>Realtime: execute tool(s) -> send toolResult
    Realtime->>Bedrock: send_event(toolResult)
  end
  Bedrock-->>Realtime: contentEnd / completionEnd
  Realtime->>Agent: emit LLMResponseCompletedEvent (original, text[, exception])
  User->>Realtime: close()
  Realtime->>Bedrock: send promptEnd/sessionEnd, close stream
  Realtime->>Agent: set connected = False

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

New conversation API #102 — overlapping Realtime/LLM API and event-shape changes (streaming, close surface).
Add CI secrets #72 — CI/workflow edits adding test environment secrets and related workflow changes.
Refactor project structure to replace 'stream_agents' with 'vision_ag… #69 — project-namespace/import refactor that aligns with updated import paths and docs.

Suggested reviewers

maxkahan
d3xvn
Nash0x7E2

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Out of Scope Changes Check	⚠️ Warning	Several changes appear to be out of scope or not directly justified by the AWS integration objectives. The event manager logging removal (commenting out handler invocation logging) is unrelated to AWS/Bedrock implementation. Complete deletions of documentation files (ai-overview.md, websocket.md, ai-function-calling.md) and the entire test_realtime_base.py test suite are significant removals that are not clearly tied to AWS integration work and lack explanation. Additionally, the event manager and documentation deletions should typically be addressed in separate PRs focused on documentation/testing cleanup rather than feature implementation. The changes to gemini_realtime.py and openai_realtime.py modifying the close method signature are related to the core Realtime refactoring, but the scope of all these modifications together makes it difficult to assess whether they represent necessary supporting work or unrelated changes.	Consider separating out-of-scope changes into dedicated PRs: move the event manager logging removal to a PR focused on logging improvements, move documentation file deletions to a documentation cleanup PR, and move test file deletions to a testing cleanup PR. Keep this PR focused on the AWS/Bedrock integration and necessary core class refactoring. If these deletions are intentional cleanups that must ship with this feature, add explicit justification to the PR description explaining why they are necessary dependencies for the AWS integration work.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check	✅ Passed	The pull request title "[AI-192] - Bedrock, AWS & Nova" directly and clearly describes the primary changes in the changeset. The title specifically references Bedrock (the AWS service being integrated), AWS (Amazon Web Services), and Nova (the model family supported), all of which are central to the implementation. The title is concise, specific enough for a teammate scanning the history to understand the main purpose, and avoids vague or generic phrasing.
Linked Issues Check	✅ Passed	The code changes substantially meet the objectives stated in the PR: the implementation includes a new BedrockLLM class with full tool-calling and streaming support, a new AWS Realtime class supporting Nova models with bidirectional streaming and audio handling, comprehensive tests for both components, documentation updates explaining the new LLM and Realtime patterns, and example scripts demonstrating AWS integration. Supporting changes to core LLM and Realtime base classes (exception field addition, public attributes, abstract close method) are necessary to enable the AWS plugin implementations. The PR delivers on the stated objectives of implementing Bedrock LLM and Nova Sonic realtime support.
Docstring Coverage	✅ Passed	No functions found in the changes. Docstring coverage check skipped.


coderabbitai

Actionable comments posted: 22

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (1)
268-282: Handle CancelledError when awaiting cancelled task.

After calling cancel() on line 272, awaiting the task on line 273 will raise CancelledError. This should be caught and suppressed as it's expected during shutdown.

Apply this diff to handle the expected exception:
         if hasattr(self, '_receive_task') and self._receive_task:
             self._receive_task.cancel()
-            await self._receive_task
+            try:
+                await self._receive_task
+            except asyncio.CancelledError:
+                pass  # Expected during shutdown
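
The shutdown pattern in the diff above can be tried in isolation. This is a minimal sketch, not the plugin's code: `receive_loop` and `shutdown` are hypothetical names standing in for the Gemini receive task and `close()`.

```python
import asyncio


async def receive_loop() -> None:
    """Hypothetical stand-in for a long-running receive loop."""
    while True:
        await asyncio.sleep(0.01)


async def shutdown(task: asyncio.Task) -> bool:
    """Cancel the task and wait until the cancellation has been observed."""
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass  # Expected: we requested the cancellation ourselves.
    return task.cancelled()


async def main() -> bool:
    task = asyncio.create_task(receive_loop())
    await asyncio.sleep(0.05)  # Let the loop start running.
    return await shutdown(task)
```

Calling `asyncio.run(main())` returns whether the task ended in the cancelled state. `contextlib.suppress(asyncio.CancelledError)` is an equivalent, shorter spelling of the try/except.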

🧹 Nitpick comments (28)

pyproject.toml (1)
8-8: Consider alphabetical ordering for consistency.

The AWS plugin entries are correctly configured but placed after Anthropic instead of being alphabetically ordered. While this works fine, maintaining alphabetical order (as the rest of the list follows) improves maintainability.

Apply this diff to maintain alphabetical ordering:
 [tool.uv.sources]
 vision-agents = { workspace = true }
 vision-agents-plugins-anthropic = { workspace = true }
-vision-agents-plugins-aws = { workspace = true }
 vision-agents-plugins-cartesia = { workspace = true }
+vision-agents-plugins-aws = { workspace = true }
And in the workspace members:
 members = [
     "agents-core",
     "plugins/anthropic",
-    "plugins/aws",
     "plugins/cartesia",
+    "plugins/aws",
Also applies to: 29-29
plugins/example/example/pyproject.toml (1)

2-2: Consider naming consistency for the example project.

The project name "my-example" resides in plugins/example/example/, creating a nested structure. Per the plugin development guide in docs/ai/instructions/ai-plugin.md, plugin names should match their folder structure. Consider either:

Renaming to "example" to match the parent plugin folder, or

Documenting why this nested structure exists if it's intentional for template purposes
agents-core/vision_agents/core/llm/realtime.py (2)
40-40: Remove redundant class attribute declaration.

The session_id : str declaration on line 40 is redundant since session_id is always initialized in __init__ (line 50). Class-level type annotations without default values can be confusing and serve no purpose when the attribute is set in the constructor.

Apply this diff to remove the redundant declaration:
-    session_id : str # UUID to identify this session
-
     def __init__(
196-197: Error message can be clarified for better developer experience.

Verification confirms all three Realtime implementations (OpenAI, Gemini, AWS) properly override close(). The change from abstract to NotImplementedError works correctly since implementations exist.

The error message "llm.close isn't implemented" could still be improved to "Realtime.close() must be implemented by subclass" for clarity, though this is a minor cosmetic improvement.
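
A sketch of what a clearer message could look like, assuming a simplified base class. `RealtimeBase`, `DemoRealtime` and `BrokenRealtime` are hypothetical names, and including the subclass name in the message is a variation on the wording suggested above rather than the project's actual text.

```python
import asyncio


class RealtimeBase:
    """Hypothetical base class; only the close() contract is modeled."""

    async def close(self) -> None:
        # Name the concrete class so the traceback points at the offender.
        raise NotImplementedError(
            f"{type(self).__name__}.close() must be implemented by subclass"
        )


class DemoRealtime(RealtimeBase):
    def __init__(self) -> None:
        self.connected = True

    async def close(self) -> None:
        self.connected = False


class BrokenRealtime(RealtimeBase):
    """Forgets to override close()."""
```

Unlike an `abc.abstractmethod`, this check fires only when `close()` is called, not when the subclass is instantiated.
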
plugins/example/example/.env.example (1)

1-8: Consider alphabetically ordering AWS keys.

Static analysis suggests ordering AWS_ACCESS_KEY_ID before AWS_BEARER_TOKEN_BEDROCK for consistency. This is purely a style preference and doesn't affect functionality.

plugins/aws/example/.env.example (2)

10-10: Add blank line at end of file.

The file is missing a trailing newline, which is a common convention and flagged by linters.

Add a blank line after line 10.

1-10: Consider alphabetically ordering environment variable keys.

Static analysis suggests alphabetically ordering keys within each logical group (AWS keys together, API keys together). This improves maintainability but is optional.
plugins/aws/pyproject.toml (2)
13-17: Dependency bounds and package validity.

Consider upper‑bounding boto3 to avoid future breaking changes.

Please verify the package name/version for the Bedrock runtime SDK used here matches what you vendor elsewhere in the repo.
 dependencies = [
     "vision-agents",
-    "boto3>=1.28.0",
+    "boto3>=1.28.0,<2.0",
     "aws-sdk-bedrock-runtime>=0.1.0",
 ]
37-41: Pin dev pytest to latest patch for reproducibility.

Align with current 8.4.x and pin to a patch to reduce flakiness.
 [dependency-groups]
 dev = [
-    "pytest>=8.4.1",
+    "pytest==8.4.2",
     "pytest-asyncio>=1.0.0",
 ]
Based on learnings
plugins/aws/vision_agents/plugins/aws/events.py (1)

6-10: Event type/name OK; consider docstring expansion.

Looks fine. Optionally add field docs for event_data to match internal event conventions.
plugins/aws/example/aws_realtime_nova_example.py (1)
17-23: Add a short docstring to start_agent.

Keep examples consistent with project docstring guidance.
 async def start_agent() -> None:
+    """Run a minimal Bedrock Realtime demo call and send a greeting."""
plugins/aws/example/aws_qwen_example.py (1)
17-27: Add a short docstring to start_agent.

Improves readability; aligns with internal style.
 async def start_agent() -> None:
+    """Run a Qwen (Bedrock) demo with TTS, STT, and turn detection."""
plugins/aws/tests/test_aws_realtime.py (3)
30-47: Async test not marked with @pytest.mark.asyncio. Also assert that audio arrived.

Mark as asyncio and add a minimal assertion to catch regressions.
-@pytest.mark.integration
+@pytest.mark.asyncio
+@pytest.mark.integration
 async def test_simple_response_flow(self, realtime):
@@
         # Wait for response
         await asyncio.sleep(10.0)
+        assert len(events) > 0  # at least one audio event must have arrived
48-69: Async mark and a stronger check.

Optional: assert we received some audio after sending input.
-@pytest.mark.integration
+@pytest.mark.asyncio
+@pytest.mark.integration
 async def test_audio_first(self, realtime, mia_audio_16khz):
@@
-        # Test passes if no exceptions are raised
-        assert True
+        # At least one audio event must have arrived
+        assert len(events) > 0
70-84: Async mark for lifecycle test.

Mark as asyncio to avoid collector/runtime issues.
-@pytest.mark.integration
+@pytest.mark.asyncio
+@pytest.mark.integration
 async def test_connection_lifecycle(self, realtime):
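
The tests above wait with a fixed `asyncio.sleep(10.0)`. One alternative, sketched here under assumptions, is to wait for the first event with a timeout so the test finishes as soon as audio arrives. `EventCollector` and `fake_audio_source` are hypothetical and not wired to the plugin's event system.

```python
import asyncio


class EventCollector:
    """Collects events and signals as soon as the first one arrives."""

    def __init__(self) -> None:
        self.events: list[bytes] = []
        self._first = asyncio.Event()

    def on_audio(self, chunk: bytes) -> None:
        self.events.append(chunk)
        self._first.set()

    async def wait_for_first(self, timeout: float) -> bool:
        """Return True if an event arrived before the timeout."""
        try:
            await asyncio.wait_for(self._first.wait(), timeout)
            return True
        except asyncio.TimeoutError:
            return False


async def fake_audio_source(collector: EventCollector) -> None:
    """Hypothetical producer standing in for the realtime connection."""
    await asyncio.sleep(0.01)
    collector.on_audio(b"\x00\x00")


async def demo() -> tuple[bool, int]:
    collector = EventCollector()
    producer = asyncio.create_task(fake_audio_source(collector))
    arrived = await collector.wait_for_first(timeout=1.0)
    await producer
    return arrived, len(collector.events)
```

In a real test, `on_audio` would be registered as the handler for the audio output event, and the assertion would be on the return value of `wait_for_first`.
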
plugins/aws/vision_agents/plugins/aws/aws_realtime.py (5)
143-149: Use self.logger consistently.

Mixed use of module logger and self.logger. Keep one (prefer self.logger) to ensure consistent context and handlers.
-        logger.info("Connecting to AWS Bedrock for model %s", self.model)
+        self.logger.info("Connecting to AWS Bedrock for model %s", self.model)
347-353: Await canceled task to suppress warnings and finish cleanly.

Ensure cancellation is observed before returning; also clear reference.
-        if self._stream_task:
-            self._stream_task.cancel()
-
-        self.connected = False
+        if self._stream_task:
+            self._stream_task.cancel()
+            try:
+                await self._stream_task
+            except asyncio.CancelledError:
+                pass
+            finally:
+                self._stream_task = None
+        self.connected = False
390-403: Emit consistent sample rate and use self.logger.

Keep logging consistent; explicitly pass the 24k sample rate (matches prompt config).
-                                    audio_content = json_data['event']['audioOutput']['content']
+                                    audio_content = json_data['event']['audioOutput']['content']
                                     audio_bytes = base64.b64decode(audio_content)
@@
-                                    audio_event = RealtimeAudioOutputEvent(
-                                        plugin_name="aws",
-                                        audio_data=audio_bytes,
-                                        sample_rate=24000
-                                    )
-                                    self.events.send(audio_event)
+                                    audio_event = RealtimeAudioOutputEvent(
+                                        plugin_name="aws",
+                                        audio_data=audio_bytes,
+                                        sample_rate=24000,
+                                    )
+                                    self.events.send(audio_event)
@@
-                                    await self.output_track.write(audio_bytes)
+                                    await self.output_track.write(audio_bytes)
371-434: Minor: unify logging, remove prints, and tidy message handling.

Prefer self.logger everywhere; avoid print; current mixed usage is noisy and can break structured logging.
-                                    logger.info(f"Content start from AWS Bedrock: {content_start}")
+                                    self.logger.info("Content start from AWS Bedrock: %s", content_start)
@@
-                                    logger.info(f"Text output from AWS Bedrock: {text_content}")
+                                    self.logger.info("Text output from AWS Bedrock: %s", text_content)
@@
-                                    logger.info("TODO: should flush audio buffer")
-                                    logger.info(f"Content end from AWS Bedrock {stopReason}: {json_data['event']['contentEnd']}")
+                                    self.logger.info("TODO: should flush audio buffer")
+                                    self.logger.info("Content end from AWS Bedrock %s: %s", stopReason, json_data['event']['contentEnd'])
@@
-                    logger.error("Stop async iteration exception")
+                    self.logger.error("Stop async iteration exception")
@@
-                    logger.error("Error, %s", e)
+                    self.logger.error("Error, %s", e)
@@
-                        print(f"Validation error: {error_message}")
+                        self.logger.warning("Validation error: %s", error_message)
                     else:
-                        print(f"Error receiving response: {e}")
+                        self.logger.error("Error receiving response: %s", e)
@@
-            logger.error("Error, %s", e)
-            print(f"Response processing error: {e}")
+            self.logger.error("Response processing error: %s", e)
Also applies to: 450-467
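
The diff above also replaces f-strings with %-style arguments. One practical difference, sketched below, is that %-style arguments are only rendered when the record is actually emitted. The `Expensive` class and the logger name are made up for the demonstration.

```python
import logging


class Expensive:
    """Counts how often it is rendered to a string."""

    def __init__(self) -> None:
        self.renders = 0

    def __str__(self) -> str:
        self.renders += 1
        return "payload"


logger = logging.getLogger("aws_realtime_demo")  # hypothetical logger name
logger.setLevel(logging.WARNING)  # INFO records are dropped

payload = Expensive()

# %-style: the argument is never converted because INFO is disabled.
logger.info("Content start from AWS Bedrock: %s", payload)
lazy_renders = payload.renders

# f-string: the string is built before logging decides to drop the record.
logger.info(f"Content start from AWS Bedrock: {payload}")
eager_renders = payload.renders
```
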

24-26: Sample rate mismatch (constant vs usage/comments).

DEFAULT_SAMPLE_RATE=16000 but you configure 24kHz throughout. Pick one and use consistently (Bedrock Nova Sonic samples show 24k). Not a blocker, but worth aligning names/comments.
-DEFAULT_SAMPLE_RATE = 16000
+# Nova Sonic config uses 24kHz I/O
+DEFAULT_SAMPLE_RATE = 24000
@@
-        # Audio output track - Bedrock typically outputs at 16kHz
+        # Audio output track - Nova Sonic configured for 24kHz
         self.output_track = AudioStreamTrack(
-            framerate=24000, stereo=False, format="s16"
+            framerate=DEFAULT_SAMPLE_RATE, stereo=False, format="s16"
         )
Also applies to: 110-113
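
To see why the constant matters, here is a small worked example of the arithmetic for mono s16 PCM. The helper name is hypothetical; the numbers follow directly from bytes = rate × channels × bytes per sample × seconds.

```python
def pcm_duration_seconds(
    num_bytes: int,
    sample_rate: int,
    channels: int = 1,
    bytes_per_sample: int = 2,  # s16 = 2 bytes per sample
) -> float:
    """Duration of a raw PCM buffer when interpreted at the given rate."""
    return num_bytes / (sample_rate * channels * bytes_per_sample)


# One second of mono s16 audio produced at 24 kHz.
one_second_24k = 24000 * 2

correct = pcm_duration_seconds(one_second_24k, 24000)
wrong = pcm_duration_seconds(one_second_24k, 16000)
```

Interpreting 24 kHz audio as 16 kHz stretches it by a factor of 1.5, so playback would be slower and lower in pitch.
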
plugins/aws/vision_agents/plugins/aws/aws_llm.py (8)
153-160: Preserve tool result JSON structure instead of stringifying.

_sanitize_tool_output() returns str, so isinstance(payload, dict) is never true. This wraps structured outputs as {"result": "..."} strings, losing type fidelity.

Apply this diff:
-                    payload = self._sanitize_tool_output(res)
-                    tool_result_blocks.append({
-                        "toolUseId": tc["id"],
-                        "content": [{"json": payload if isinstance(payload, dict) else {"result": payload}}],
-                    })
+                    raw = res
+                    sanitized = self._sanitize_tool_output(res)
+                    json_payload = raw if isinstance(raw, (dict, list, int, float, bool)) or raw is None else {"result": sanitized}
+                    tool_result_blocks.append({
+                        "toolUseId": tc["id"],
+                        "content": [{"json": json_payload}],
+                    })
And the same change in the streaming path.

Also applies to: 263-270
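
The selection logic in the diff above can be sketched as a standalone helper. This is illustrative only: `to_json_payload` is a hypothetical name, and `str()` stands in for `_sanitize_tool_output()`. Whether the Converse API accepts non-object values (lists, numbers) in a `json` block is not verified here.

```python
import json
from typing import Any


def to_json_payload(raw: Any) -> Any:
    """Keep JSON-native values as-is; wrap everything else as {"result": str}."""
    if raw is None or isinstance(raw, (dict, list, int, float, bool)):
        return raw
    # Strings and arbitrary objects fall back to a wrapped string form.
    return {"result": str(raw)}


structured = to_json_payload({"temp_c": 21, "ok": True})
wrapped = to_json_payload("sunny")
encoded = json.dumps({"json": structured})
```

With this shape a dict returned by a tool keeps its keys and value types, while a plain string still arrives as `{"result": "..."}`.
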

238-249: Close the ConverseStream event stream explicitly to avoid leaking connections.

The dict returned by boto3's converse_stream() is not a context manager; the iterable under its "stream" key is a botocore EventStream, which exposes close(). Wrap that stream with contextlib.closing (or use try/finally) so the underlying HTTP stream is closed even on exceptions; apply the same pattern to the follow‑ups and final pass.

Example skeleton (adjust indentation accordingly):
from contextlib import closing

response = self.client.converse_stream(**kwargs)
with closing(response["stream"]) as stream:
    for event in stream:
        ...
Also applies to: 290-296, 306-314

73-73: Add client timeouts and bounded retries.

Harden external calls: set connect/read timeouts and standard retries to prevent indefinite hangs.

Apply this diff:
+from botocore.config import Config
@@
-        self.client = boto3.client("bedrock-runtime", **session_kwargs)
+        self.client = boto3.client(
+            "bedrock-runtime",
+            config=Config(
+                connect_timeout=5,
+                read_timeout=60,
+                retries={"mode": "standard", "total_max_attempts": 3},
+            ),
+            **session_kwargs,
+        )
142-142: Type annotation for 'seen' should allow Optional id.

_tc_key() returns Tuple[Optional[str], str, str]. Align the local type to avoid mypy noise.

Apply this diff:
-                seen: set[tuple[str, str, str]] = set()
+                seen: set[tuple[Optional[str], str, str]] = set()
@@
-            seen: set[tuple[str, str, str]] = set()
+            seen: set[tuple[Optional[str], str, str]] = set()
Also applies to: 254-254
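
A minimal sketch of why the key type needs `Optional[str]`. The review only states the return type of `_tc_key()`; the key layout used here (id, name, canonical JSON of the arguments) and the names `tc_key` and `dedup` are assumptions for illustration.

```python
import json
from typing import Any, Optional

ToolCallKey = tuple[Optional[str], str, str]


def tc_key(call: dict[str, Any]) -> ToolCallKey:
    """Hypothetical key: (id or None, name, canonical JSON of arguments)."""
    return (
        call.get("id"),  # None when the provider sent no id
        call["name"],
        json.dumps(call.get("arguments", {}), sort_keys=True),
    )


def dedup(calls: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Drop repeated tool calls while preserving order."""
    seen: set[ToolCallKey] = set()
    unique = []
    for call in calls:
        key = tc_key(call)
        if key not in seen:
            seen.add(key)
            unique.append(call)
    return unique
```

Annotating `seen` as `set[tuple[str, str, str]]` would make a type checker reject the `seen.add(key)` line, since the first element can be `None`.
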

321-322: Propagate exception in streaming error response.

Include the exception object in LLMResponseEvent for parity with converse().

Apply this diff:
-            llm_response = LLMResponseEvent(None, error_msg)
+            llm_response = LLMResponseEvent(None, error_msg, exception=e)
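
The idea behind forwarding the exception can be sketched with a stand-in type. `ResponseEvent` and `run_stream` are hypothetical; only the `(original, text, exception=...)` call shape is taken from the diff above.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ResponseEvent:
    """Hypothetical stand-in for LLMResponseEvent(original, text, exception)."""

    original: Any
    text: str
    exception: Optional[Exception] = None


def run_stream(fail: bool) -> ResponseEvent:
    """Return a normal response, or an error response carrying the exception."""
    try:
        if fail:
            raise TimeoutError("stream timed out")
        return ResponseEvent(original={"ok": True}, text="hello")
    except Exception as e:
        # Keep the exception object so callers can branch on its type.
        return ResponseEvent(None, f"Error: {e}", exception=e)
```

Callers can then check `response.exception` instead of parsing the error text.
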
237-237: Clear pending tool-use map per stream to avoid cross-call leakage.

If a stream errors mid‑tool, entries can persist across calls. Clear at start.

Apply this diff:
         try:
+            self._pending_tool_uses_by_index.clear()
             response = self.client.converse_stream(**kwargs)
326-351: Chunk event metadata: consider forwarding real IDs/indices.

LLMResponseChunkEvent uses empty item_id and zeroes for indices/sequence. If AWS events include these, forward them to help UIs assemble deltas deterministically.

20-36: Docstrings and return annotations.

Adopt Google-style docstrings consistently and add explicit return types (e.g., -> LLMResponseEvent[Any] on async methods) to match repo guidelines.

As per coding guidelines

Also applies to: 77-98, 209-214, 326-333, 448-481

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 1025a42 and 4b540c9.

⛔ Files ignored due to path filters (3)

plugins/aws/example/uv.lock is excluded by !**/*.lock
plugins/example/example/uv.lock is excluded by !**/*.lock
uv.lock is excluded by !**/*.lock

📒 Files selected for processing (32)

agents-core/vision_agents/core/events/manager.py (1 hunks)
agents-core/vision_agents/core/llm/llm.py (2 hunks)
agents-core/vision_agents/core/llm/realtime.py (4 hunks)
conftest.py (1 hunks)
docs/ai/ai-overview.md (0 hunks)
docs/ai/instructions/ai-function-calling.md (0 hunks)
docs/ai/instructions/ai-llm.md (1 hunks)
docs/ai/instructions/ai-plugin.md (3 hunks)
docs/ai/instructions/ai-realtime-llm.md (1 hunks)
docs/ai/instructions/ai-stt.md (1 hunks)
docs/ai/instructions/ai-tts.md (1 hunks)
docs/ai/instructions/projects/websocket.md (0 hunks)
plugins/aws/README.md (1 hunks)
plugins/aws/example/.env.example (1 hunks)
plugins/aws/example/aws_qwen_example.py (1 hunks)
plugins/aws/example/aws_realtime_nova_example.py (1 hunks)
plugins/aws/example/pyproject.toml (1 hunks)
plugins/aws/pyproject.toml (1 hunks)
plugins/aws/tests/test_aws.py (1 hunks)
plugins/aws/tests/test_aws_realtime.py (1 hunks)
plugins/aws/vision_agents/plugins/aws/__init__.py (1 hunks)
plugins/aws/vision_agents/plugins/aws/aws_llm.py (1 hunks)
plugins/aws/vision_agents/plugins/aws/aws_realtime.py (1 hunks)
plugins/aws/vision_agents/plugins/aws/events.py (1 hunks)
plugins/example/example/.env.example (1 hunks)
plugins/example/example/my_example.py (1 hunks)
plugins/example/example/pyproject.toml (1 hunks)
plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (3 hunks)
plugins/openai/vision_agents/plugins/openai/openai_realtime.py (1 hunks)
pyproject.toml (2 hunks)
tests/test_events.py (0 hunks)
tests/test_realtime_base.py (0 hunks)

💤 Files with no reviewable changes (5)

tests/test_events.py
docs/ai/instructions/projects/websocket.md
docs/ai/instructions/ai-function-calling.md
docs/ai/ai-overview.md
tests/test_realtime_base.py

🧰 Additional context used

📓 Path-based instructions (1)

**/*.py

📄 CodeRabbit inference engine (.cursor/rules/python.mdc)

**/*.py: Do not modify sys.path in Python code
Docstrings must follow the Google style guide

Files:

conftest.py
agents-core/vision_agents/core/events/manager.py
plugins/openai/vision_agents/plugins/openai/openai_realtime.py
plugins/aws/vision_agents/plugins/aws/__init__.py
plugins/aws/tests/test_aws.py
agents-core/vision_agents/core/llm/llm.py
plugins/aws/vision_agents/plugins/aws/events.py
agents-core/vision_agents/core/llm/realtime.py
plugins/aws/example/aws_realtime_nova_example.py
plugins/aws/tests/test_aws_realtime.py
plugins/aws/example/aws_qwen_example.py
plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py
plugins/aws/vision_agents/plugins/aws/aws_realtime.py
plugins/example/example/my_example.py
plugins/aws/vision_agents/plugins/aws/aws_llm.py

🧬 Code graph analysis (12)

plugins/aws/vision_agents/plugins/aws/__init__.py (3)

plugins/aws/vision_agents/plugins/aws/aws_llm.py (1)

BedrockLLM (20-480)

agents-core/vision_agents/core/llm/llm.py (1)

LLM (36-363)

plugins/aws/vision_agents/plugins/aws/aws_realtime.py (1)

Realtime (35-466)

plugins/aws/tests/test_aws.py (6)

tests/test_conversation.py (1)

conversation (69-79)

agents-core/vision_agents/core/agents/conversation.py (1)

InMemoryConversation (208-268)

agents-core/vision_agents/core/llm/events.py (1)

LLMResponseChunkEvent (100-115)

plugins/aws/vision_agents/plugins/aws/aws_llm.py (4)

_normalize_message (449-480)

simple_response (77-97)

converse (99-207)

converse_stream (209-324)

conftest.py (1)

golf_swing_image (81-88)

agents-core/vision_agents/core/llm/llm.py (1)

_set_instructions (164-168)

agents-core/vision_agents/core/llm/llm.py (1)

agents-core/vision_agents/core/utils/utils.py (1)

parse_instructions (41-90)

plugins/aws/vision_agents/plugins/aws/events.py (1)

agents-core/vision_agents/core/events/base.py (1)

PluginBaseEvent (43-45)

agents-core/vision_agents/core/llm/realtime.py (1)

plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (1)

close (268-282)

plugins/aws/example/aws_realtime_nova_example.py (3)

agents-core/vision_agents/core/edge/types.py (1)

User (15-18)

agents-core/vision_agents/core/agents/agents.py (2)

Agent (53-1021)

finish (271-291)

plugins/aws/vision_agents/plugins/aws/aws_realtime.py (2)

Realtime (35-466)

simple_response (181-192)

plugins/aws/tests/test_aws_realtime.py (3)

plugins/aws/vision_agents/plugins/aws/aws_realtime.py (5)

Realtime (35-466)

close (327-352)

connect (125-161)

simple_response (181-192)

simple_audio_response (163-179)

agents-core/vision_agents/core/llm/events.py (1)

RealtimeAudioOutputEvent (35-42)

agents-core/vision_agents/core/llm/llm.py (1)

_set_instructions (164-168)

plugins/aws/example/aws_qwen_example.py (2)

agents-core/vision_agents/core/edge/types.py (1)

User (15-18)

plugins/aws/vision_agents/plugins/aws/aws_llm.py (1)

simple_response (77-97)

plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (2)

agents-core/vision_agents/core/llm/realtime.py (1)

close (196-197)

agents-core/vision_agents/core/agents/agents.py (1)

close (293-366)

plugins/aws/vision_agents/plugins/aws/aws_realtime.py (5)

agents-core/vision_agents/core/utils/video_forwarder.py (1)

VideoForwarder (13-188)

agents-core/vision_agents/core/edge/types.py (2)

PcmData (37-183)

resample (129-183)

agents-core/vision_agents/core/llm/events.py (1)

RealtimeAudioOutputEvent (35-42)

agents-core/vision_agents/core/llm/llm.py (1)

_build_enhanced_instructions (61-85)

agents-core/vision_agents/core/llm/realtime.py (1)

_emit_audio_input_event (99-110)

plugins/example/example/my_example.py (1)

agents-core/vision_agents/core/edge/types.py (1)

User (15-18)

plugins/aws/vision_agents/plugins/aws/aws_llm.py (6)

plugins/aws/tests/test_aws.py (1)

llm (34-44)

agents-core/vision_agents/core/llm/llm.py (4)

LLM (36-363)

LLMResponseEvent (25-29)

_build_enhanced_instructions (61-85)

_dedup_and_execute (318-350)

agents-core/vision_agents/core/llm/llm_types.py (2)

ToolSchema (64-67)

NormalizedToolCallItem (107-111)

agents-core/vision_agents/core/llm/events.py (2)

LLMResponseChunkEvent (100-115)

LLMResponseCompletedEvent (119-123)

agents-core/vision_agents/core/processors/base_processor.py (1)

Processor (32-40)

plugins/aws/vision_agents/plugins/aws/events.py (1)

AWSStreamEvent (7-10)

🪛 dotenv-linter (4.0.0)

plugins/example/example/.env.example

[warning] 6-6: [UnorderedKey] The AWS_ACCESS_KEY_ID key should go before the AWS_BEARER_TOKEN_BEDROCK key

(UnorderedKey)

plugins/aws/example/.env.example

[warning] 5-5: [UnorderedKey] The AWS_ACCESS_KEY_ID key should go before the AWS_BEARER_TOKEN_BEDROCK key

(UnorderedKey)

[warning] 9-9: [UnorderedKey] The CARTESIA_API_KEY key should go before the FAL_KEY key

(UnorderedKey)

[warning] 10-10: [EndingBlankLine] No blank line at the end of the file

(EndingBlankLine)

[warning] 10-10: [UnorderedKey] The DEEPGRAM_API_KEY key should go before the FAL_KEY key

(UnorderedKey)

🪛 LanguageTool

docs/ai/instructions/ai-plugin.md

[style] ~9-~9: In American English, abbreviations like “etc.” require a period.
Context: ....tomland update the name, description etc - Update the event types in yourevent...

(ETC_PERIOD)

🪛 markdownlint-cli2 (0.18.1)

plugins/aws/README.md

48-48: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🔇 Additional comments (12)

docs/ai/instructions/ai-stt.md (1)

1-22: Documentation example is incomplete and missing required imports.

The example references types (Optional, Union, Dict, Any, PcmData, Participant, AsyncDeepgramClient) that aren't imported, which makes it not directly runnable and potentially confusing for users trying to implement the pattern.

Additionally, the constructor parameters api_key and client are accepted but never used in the shown code, and the _process_audio_impl method only contains pass, providing no guidance on expected behavior or return value handling.

To verify this documentation accurately reflects the new STT plugin interface, please:

Confirm the actual stt.STT base class signature and required methods in the codebase

Clarify whether api_key and client should be stored/used within the class

Provide either a complete working example or clearly document what _process_audio_impl should do (e.g., integrate with Deepgram, process PCM data, etc.)

Suggested improvements:

Add all necessary imports at the top of the code block

Show how api_key and client are utilized (or remove them if not needed)

Include a minimal but realistic implementation of _process_audio_impl that demonstrates the expected behavior, or add a comment explaining the return type and typical use case

conftest.py (1)

80-88: LGTM!

The fixture correctly loads the image file using a context manager and returns the bytes. The implementation follows the established pattern and handles resource cleanup properly.

docs/ai/instructions/ai-plugin.md (2)

22-23: Good documentation enhancement.

Adding the tests and example directories provides a more complete picture of the plugin structure and aligns with the actual implementation.

51-51: Documentation improvements align well with PR scope.

The references to the new LLM and Realtime LLM guides, plus the expanded pyproject.toml example, provide clearer guidance for plugin developers.

Also applies to: 57-67

agents-core/vision_agents/core/llm/llm.py (2)

26-29: Excellent addition for error tracking.

Adding the optional exception parameter to LLMResponseEvent enables better error handling and observability while maintaining backward compatibility.

161-161: Good refactoring for maintainability.

Extracting instruction setting into _set_instructions centralizes the logic and improves code organization. This makes it easier to modify instruction handling behavior in the future.

Also applies to: 164-168

docs/ai/instructions/ai-realtime-llm.md (2)

45-75: Clear guidance for streaming implementation.

The mynativemethod example effectively demonstrates the streaming pattern with proper event emission (chunks and completion), instruction enhancement, and text accumulation.

92-104: Effective guidance summary.

The "Things to get right" section provides clear, actionable requirements and helpful references to concrete implementations.

plugins/openai/vision_agents/plugins/openai/openai_realtime.py (1)

128-130: LGTM!

The refactor from _close_impl to a public close() method is clean and aligns with the new API pattern. The delegation to self.rtc.close() is appropriate.

docs/ai/instructions/ai-llm.md (1)

1-81: Documentation improvements look good.

The updated LLM plugin development guide provides clear patterns for implementing streaming responses, event emission, and message normalization. The examples align well with the actual plugin implementations.

plugins/aws/vision_agents/plugins/aws/__init__.py (1)

1-4: Clean public API surface.

The exports are clear and follow the established pattern of aliasing the provider-specific LLM class to a generic LLM name. The __all__ declaration properly documents the public interface.
plugins/aws/tests/test_aws.py (1)
63-117: Async tests missing @pytest.mark.asyncio.

Several async tests are not marked and will error under strict asyncio mode.
-@pytest.mark.integration
+@pytest.mark.asyncio
+@pytest.mark.integration
 async def test_simple(self, llm: BedrockLLM):
@@
-@pytest.mark.integration
+@pytest.mark.asyncio
+@pytest.mark.integration
 async def test_native_api(self, llm: BedrockLLM):
@@
-@pytest.mark.integration
+@pytest.mark.asyncio
+@pytest.mark.integration
 async def test_stream(self, llm: BedrockLLM):
@@
-@pytest.mark.integration
+@pytest.mark.asyncio
+@pytest.mark.integration
 async def test_memory(self, llm: BedrockLLM):
@@
-@pytest.mark.integration
+@pytest.mark.asyncio
+@pytest.mark.integration
 async def test_native_memory(self, llm: BedrockLLM):
@@
-@pytest.mark.integration
+@pytest.mark.asyncio
+@pytest.mark.integration
 async def test_image_description(self, golf_swing_image):
@@
-@pytest.mark.integration
+@pytest.mark.asyncio
+@pytest.mark.integration
 async def test_instruction_following(self, llm: BedrockLLM):
Also applies to: 118-163
⛔ Skipped due to learnings
Learnt from: CR
PR: GetStream/Vision-Agents#0
File: .cursor/rules/python.mdc:0-0
Timestamp: 2025-10-20T19:23:41.251Z
Learning: Applies to tests/**/*.py : Do not use pytest.mark.asyncio; async support is automatic

coderabbitai · 2025-10-21T16:28:17Z

agents-core/vision_agents/core/events/manager.py

+            #module_name = getattr(handler, '__module__', 'unknown')
            if event.type not in self._silent_events:
-                logger.info(f"Called handler {handler.__name__} from {module_name} for event {event.type}")
+                pass
+                #logger.info(f"Called handler {handler.__name__} from {module_name} for event {event.type}")


🛠️ Refactor suggestion | 🟠 Major

Remove commented-out code and clean up the now-empty conditional block.

The commented code and pass statement should be removed entirely. The if block on line 513 now serves no purpose since it only contains pass, and the module_name retrieval on line 512 is dead code.

Apply this diff to clean up the code:

     async def _process_single_event(self, event):
         """Process a single event."""
         for handler in self._handlers.get(event.type, []):
-            #module_name = getattr(handler, '__module__', 'unknown')
-            if event.type not in self._silent_events:
-                pass
-                #logger.info(f"Called handler {handler.__name__} from {module_name} for event {event.type}")
-
             loop = asyncio.get_running_loop()
             handler_task = loop.create_task(self._run_handler(handler, event))
             self._handler_tasks[uuid.uuid4()] = handler_task

Alternatively, if the _silent_events check is planned for future use, document the intent and still remove the commented code:

     async def _process_single_event(self, event):
         """Process a single event."""
         for handler in self._handlers.get(event.type, []):
-            #module_name = getattr(handler, '__module__', 'unknown')
-            if event.type not in self._silent_events:
-                pass
-                #logger.info(f"Called handler {handler.__name__} from {module_name} for event {event.type}")
+            # Silent events handling reserved for future use
+            if event.type not in self._silent_events:
+                pass
             loop = asyncio.get_running_loop()
             handler_task = loop.create_task(self._run_handler(handler, event))
             self._handler_tasks[uuid.uuid4()] = handler_task

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

-            #module_name = getattr(handler, '__module__', 'unknown')
-            if event.type not in self._silent_events:
-                logger.info(f"Called handler {handler.__name__} from {module_name} for event {event.type}")
-                pass
-                #logger.info(f"Called handler {handler.__name__} from {module_name} for event {event.type}")
+    async def _process_single_event(self, event):
+        """Process a single event."""
+        for handler in self._handlers.get(event.type, []):
+            loop = asyncio.get_running_loop()
+            handler_task = loop.create_task(self._run_handler(handler, event))
+            self._handler_tasks[uuid.uuid4()] = handler_task

🤖 Prompt for AI Agents

In agents-core/vision_agents/core/events/manager.py around lines 512 to 515, remove the dead/commented code and the empty conditional: delete the commented-out module_name retrieval, the if event.type check that only contains pass, and the pass itself so the block is gone; if the _silent_events check is intended for future use, replace the empty body with a brief TODO comment documenting intent and remove the commented code instead.
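
For reference, a minimal self-contained sketch of what the dispatch loop looks like once the dead code is gone. `MiniEventManager` and `Event` are hypothetical stand-ins written for this comment, not the project's actual classes; only the body of `_process_single_event` mirrors the suggestion above.

```python
import asyncio
import uuid
from dataclasses import dataclass


@dataclass
class Event:
    # Hypothetical minimal event; the real event classes carry more fields.
    type: str


class MiniEventManager:
    """Hypothetical stand-in for the event manager, reduced to dispatch only."""

    def __init__(self):
        self._handlers = {}
        self._handler_tasks = {}

    def subscribe(self, event_type, handler):
        self._handlers.setdefault(event_type, []).append(handler)

    async def _run_handler(self, handler, event):
        await handler(event)

    async def _process_single_event(self, event):
        """Process a single event."""
        for handler in self._handlers.get(event.type, []):
            # Each handler runs as its own task so a slow one does not block the rest.
            loop = asyncio.get_running_loop()
            handler_task = loop.create_task(self._run_handler(handler, event))
            self._handler_tasks[uuid.uuid4()] = handler_task


async def demo():
    seen = []

    async def on_ping(event):
        seen.append(event.type)

    manager = MiniEventManager()
    manager.subscribe("ping", on_ping)
    await manager._process_single_event(Event(type="ping"))
    # Wait for the scheduled handler tasks to finish before inspecting results.
    await asyncio.gather(*manager._handler_tasks.values())
    return seen
```

Running `asyncio.run(demo())` dispatches one event to one handler; nothing in the loop depends on `_silent_events` any more.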

coderabbitai · 2025-10-21T16:28:17Z

docs/ai/instructions/ai-plugin.md

+An example plugin is located in `plugins/example`. Copying the example is the best way to create a new plugin. After copying the example be sure to:
+
+- Update the folder name "example" to your plugin's name
+- Open `pyproject.toml` and update the name, description etc


⚠️ Potential issue | 🟡 Minor

Add period after "etc." for proper formatting.

In American English, the abbreviation "etc." requires a period.

Apply this diff:

-- Open `pyproject.toml` and update the name, description etc
+- Open `pyproject.toml` and update the name, description, etc.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

-- Open `pyproject.toml` and update the name, description etc
+- Open `pyproject.toml` and update the name, description, etc.

🧰 Tools

🪛 LanguageTool

[style] ~9-~9: In American English, abbreviations like “etc.” require a period.
Context: ....toml and update the name, description etc - Update the event types in your event...

(ETC_PERIOD)

🤖 Prompt for AI Agents

In docs/ai/instructions/ai-plugin.md around line 9, the sentence "Open `pyproject.toml` and update the name, description etc" is missing the period after the abbreviation "etc."; update the line to read "Open `pyproject.toml` and update the name, description, etc.", adding the comma before "etc." and the period immediately after it, and leave the rest of the line unchanged.

coderabbitai · 2025-10-21T16:28:17Z

docs/ai/instructions/ai-realtime-llm.md

+from vision_agents.core.llm import realtime
+
+class MyRealtime(realtime.Realtime):
+    def __init__(self, model: str, client: Optional[ClientType]):


⚠️ Potential issue | 🟡 Minor

Clarify undefined type hint.

ClientType is used in the type hint but not imported or defined. For documentation clarity, consider:

Using Any as a placeholder type, or

Adding a comment explaining this represents the provider's client type, or

Adding an import example

Example fix:

+from typing import Optional, Any
+
 class MyRealtime(realtime.Realtime):
-    def __init__(self, model: str, client: Optional[ClientType]):
+    def __init__(self, model: str, client: Optional[Any]):
         # it should be possible to pass the client (makes it easier for users to customize things)

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents

In docs/ai/instructions/ai-realtime-llm.md around line 12, the constructor signature uses ClientType in a type hint but ClientType is not defined or imported; update the doc example to clarify the type by either replacing ClientType with typing.Any, adding an import from typing (from typing import Any) and using Any, or include a short inline comment stating "ClientType represents the provider client (e.g., OpenAI client) and should be replaced/imported by the integration code"; ensure the chosen approach is shown in the example so readers understand the intended type.
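
One more way to keep the `ClientType` name in the doc while making the snippet importable is a placeholder alias. The sketch below is an assumption for illustration only: the `Realtime` base class here is a local stand-in for `realtime.Realtime`, and the `= None` default on `client` is not in the original example.

```python
from typing import Any, Optional

# Placeholder alias: replace Any with the provider SDK's client class.
ClientType = Any


class Realtime:
    """Hypothetical stand-in for vision_agents.core.llm.realtime.Realtime."""


class MyRealtime(Realtime):
    def __init__(self, model: str, client: Optional[ClientType] = None):
        # Accepting a client lets users pass in a preconfigured one.
        super().__init__()
        self.model = model
        self.client = client


rt = MyRealtime(model="my-model")
```

The alias keeps the signature readable and gives plugin authors a single line to change when they wire in a real client type.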

coderabbitai · 2025-10-21T16:28:17Z

docs/ai/instructions/ai-tts.md

+class MyTTS(tts.TTS):
+    def __init__(
+        self,
+        voice_id: str = "VR6AewLTigWG4xSOukaG",  # Default ElevenLabs voice
+        model_id: str = "eleven_multilingual_v2",
+        client: Optional[MyClient] = None,
+    ):
+        # it should be possible to pass the client (makes it easier for users to customize things)
+        # settings that are common to change, like voice id or model id should be configurable as well
        super().__init__()
-        # Register custom events
-        self.events.register_events_from_module(events)
-
-    async def stream_synthesis(self, text: str):
-        # Emit start event
-        self.events.send(MyTTSStreamEvent(
-            plugin_name="mytts",
-            status="started",
-            text=text
-        ))
-
-        async for chunk in self._stream_api(text):
-            # Emit audio chunks
-            self.events.send(TTSAudioEvent(
-                plugin_name="mytts",
-                audio_data=chunk.data,
-                is_final_chunk=False
-            ))
-
-        # Emit completion event
-        self.events.send(MyTTSStreamEvent(
-            plugin_name="mytts",
-            status="completed"
-        ))
-```
+        self.voice_id = voice_id
+        self.client = client if client is not None else MyClient(api_key=api_key)

-### Event Subscription
+    async def stream_audio(self, text: str, *_, **__) -> AsyncIterator[bytes]:

-**Subscribe to LLM Events:**
-```python
-@self.events.subscribe
-async def handle_llm_response(event: LLMResponseEvent):
-    """Synthesize LLM responses."""
-    audio = await self.synthesize(event.text)
-```
+        audio_stream = self.client.text_to_speech.stream(
+            text=text,
+            voice_id=self.voice_id,
+            output_format=self.output_format,
+            model_id=self.model_id,
+            request_options={"chunk_size": 64000},
+        )

-**Subscribe to Agent Events:**
-```python
-@self.events.subscribe
-async def handle_agent_say(event: AgentSayEvent):
-    """Handle agent speech requests."""
-    await self.synthesize(event.text)
-```
+        return audio_stream


⚠️ Potential issue | 🔴 Critical

Fix undefined variables and missing attribute assignments in the example code.

This documentation example has three runtime errors that will prevent it from working:

Line 19: api_key is undefined. Either add it as a parameter or define it in the scope.

Line 27: self.model_id is referenced but never assigned. The model_id parameter on line 12 is not stored as an instance variable.

Line 26: self.output_format is referenced but never initialized in __init__.

Apply this diff to fix the issues:

 class MyTTS(tts.TTS):
     def __init__(
         self,
         voice_id: str = "VR6AewLTigWG4xSOukaG",  # Default ElevenLabs voice
         model_id: str = "eleven_multilingual_v2",
         client: Optional[MyClient] = None,
+        api_key: str = "your-api-key",
+        output_format: str = "mp3_44100_128",
     ):
         # it should be possible to pass the client (makes it easier for users to customize things)
         # settings that are common to change, like voice id or model id should be configurable as well
         super().__init__()
         self.voice_id = voice_id
+        self.model_id = model_id
+        self.output_format = output_format
         self.client = client if client is not None else MyClient(api_key=api_key)

🤖 Prompt for AI Agents

In docs/ai/instructions/ai-tts.md around lines 8 to 31, the example references undefined variables and misses instance assignments; update the constructor to accept an api_key: Optional[str] = None parameter (or require it), pass that api_key into MyClient when creating the default client, and assign self.model_id = model_id and self.output_format = "<appropriate default>" (or add output_format as a constructor parameter with a sensible default) so stream_audio can reference self.model_id and self.output_format without errors.
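
To check that the corrected constructor and `stream_audio` fit together, here is a runnable sketch under stated assumptions: `MyClient` and `TTS` are hypothetical stubs (the stub client fakes audio by echoing the text as bytes, and its `stream` method is invented for this example), and `api_key` defaults to `None` rather than a placeholder string.

```python
import asyncio
from typing import AsyncIterator, Optional


class MyClient:
    """Hypothetical provider client; a real SDK client would call the network."""

    def __init__(self, api_key: Optional[str] = None):
        self.api_key = api_key

    async def stream(self, text: str, voice_id: str, model_id: str,
                     output_format: str) -> AsyncIterator[bytes]:
        # Fake audio: yield the encoded text in two chunks.
        data = text.encode("utf-8")
        mid = len(data) // 2
        yield data[:mid]
        yield data[mid:]


class TTS:
    """Hypothetical stand-in for the tts.TTS base class."""


class MyTTS(TTS):
    def __init__(
        self,
        voice_id: str = "VR6AewLTigWG4xSOukaG",
        model_id: str = "eleven_multilingual_v2",
        output_format: str = "mp3_44100_128",
        api_key: Optional[str] = None,
        client: Optional[MyClient] = None,
    ):
        super().__init__()
        # Every attribute used by stream_audio is assigned here.
        self.voice_id = voice_id
        self.model_id = model_id
        self.output_format = output_format
        self.client = client if client is not None else MyClient(api_key=api_key)

    async def stream_audio(self, text: str, *_, **__) -> AsyncIterator[bytes]:
        return self.client.stream(
            text=text,
            voice_id=self.voice_id,
            model_id=self.model_id,
            output_format=self.output_format,
        )


async def demo() -> bytes:
    tts = MyTTS(api_key="test-key")
    stream = await tts.stream_audio("hello")
    return b"".join([chunk async for chunk in stream])
```

`asyncio.run(demo())` collects the chunks back into the original bytes, which confirms that no attribute lookup fails along the way.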

coderabbitai · 2025-10-21T16:28:18Z

docs/ai/instructions/ai-tts.md

-6. **Voice management**: Support multiple voices and voice selection
-7. **Performance**: Use streaming for real-time applications
-8. **Testing**: Verify event flow and audio quality in tests
+TOOD: no good test suite yet


⚠️ Potential issue | 🟡 Minor

Fix typo in TODO comment.

Line 40 has "TOOD" which should be "TODO".

-TOOD: no good test suite yet
+TODO: no good test suite yet

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

-TOOD: no good test suite yet
+TODO: no good test suite yet

🤖 Prompt for AI Agents

docs/ai/instructions/ai-tts.md line 40: fix the typo in the comment by changing "TOOD: no good test suite yet" to "TODO: no good test suite yet" so the standard TODO marker is used and will be picked up by linters and editors.

coderabbitai · 2025-10-21T16:28:20Z

plugins/aws/vision_agents/plugins/aws/aws_realtime.py

+    async def simple_audio_response(self, pcm: PcmData):
+        """Send audio data to the model for processing."""
+        if not self.connected:
+            self.logger.warning("realtime is not active. can't call simple_audio_response")
+
+        # Resample from 48kHz to 24kHz if needed
+        pcm = pcm.resample(24000)
+
+        content_name = str(uuid.uuid4())
+
+        await self.audio_content_start(content_name)
+        self._emit_audio_input_event(pcm.samples, sample_rate=pcm.sample_rate)
+        # Convert PcmData to base64 encoded bytes
+        audio_base64 = base64.b64encode(pcm.samples).decode('utf-8')
+        await self.audio_input(content_name, audio_base64)
+
+        await self.content_end(content_name)


⚠️ Potential issue | 🔴 Critical

Don’t proceed when disconnected; encode numpy as bytes.

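Return early when not connected instead of only logging a warning. Convert the numpy samples to raw bytes with tobytes() before base64-encoding them, and pass the same bytes to the input event.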

     async def simple_audio_response(self, pcm: PcmData):
@@
-        if not self.connected:
-            self.logger.warning("realtime is not active. can't call simple_audio_response")
+        if not self.connected:
+            self.logger.warning("realtime is not active; ignoring simple_audio_response")
+            return
@@
-        self._emit_audio_input_event(pcm.samples, sample_rate=pcm.sample_rate)
+        self._emit_audio_input_event(pcm.samples.tobytes(), sample_rate=pcm.sample_rate)
         # Convert PcmData to base64 encoded bytes
-        audio_base64 = base64.b64encode(pcm.samples).decode('utf-8')
+        audio_base64 = base64.b64encode(pcm.samples.tobytes()).decode("utf-8")

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

-    async def simple_audio_response(self, pcm: PcmData):
-        """Send audio data to the model for processing."""
-        if not self.connected:
-            self.logger.warning("realtime is not active. can't call simple_audio_response")
-
-        # Resample from 48kHz to 24kHz if needed
-        pcm = pcm.resample(24000)
-
-        content_name = str(uuid.uuid4())
-
-        await self.audio_content_start(content_name)
-        self._emit_audio_input_event(pcm.samples, sample_rate=pcm.sample_rate)
-        # Convert PcmData to base64 encoded bytes
-        audio_base64 = base64.b64encode(pcm.samples).decode('utf-8')
-        await self.audio_input(content_name, audio_base64)
-
-        await self.content_end(content_name)
+    async def simple_audio_response(self, pcm: PcmData):
+        """Send audio data to the model for processing."""
+        if not self.connected:
+            self.logger.warning("realtime is not active; ignoring simple_audio_response")
+            return
+
+        # Resample from 48kHz to 24kHz if needed
+        pcm = pcm.resample(24000)
+
+        content_name = str(uuid.uuid4())
+
+        await self.audio_content_start(content_name)
+        self._emit_audio_input_event(pcm.samples.tobytes(), sample_rate=pcm.sample_rate)
+        # Convert PcmData to base64 encoded bytes
+        audio_base64 = base64.b64encode(pcm.samples.tobytes()).decode("utf-8")
+        await self.audio_input(content_name, audio_base64)
+
+        await self.content_end(content_name)

🤖 Prompt for AI Agents

In plugins/aws/vision_agents/plugins/aws/aws_realtime.py around lines 163 to 179, the method currently logs when disconnected but continues and also treats pcm.samples as if it were raw bytes; fix by returning early if not self.connected (so none of the subsequent content_* or emit calls run), convert numpy samples to raw bytes once (e.g. samples_bytes = pcm.samples.tobytes()) and use samples_bytes for both _emit_audio_input_event(samples_bytes, sample_rate=...) and for base64 encoding (base64.b64encode(samples_bytes).decode('utf-8')), ensuring all audio send/emit calls use the bytes not the numpy array.
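
The encoding step can be checked in isolation. The sketch below is an illustration, not the plugin's code: it uses the standard library `array` type as a stand-in for `pcm.samples` (an assumption made so the example runs without numpy; numpy arrays expose a `tobytes()` method as well), and `encode_pcm` is a hypothetical helper name.

```python
import base64
from array import array


def encode_pcm(samples) -> str:
    """Return base64 text for a buffer of PCM samples."""
    # Convert once to raw bytes, then reuse the bytes for every consumer.
    samples_bytes = samples.tobytes()
    return base64.b64encode(samples_bytes).decode("utf-8")


# Stand-in for pcm.samples: three signed 16-bit samples.
samples = array("h", [0, 1, -1])
encoded = encode_pcm(samples)

# Round trip: decode the base64 text back into 16-bit samples.
decoded = array("h")
decoded.frombytes(base64.b64decode(encoded))
```

The round trip returns the original samples, so the payload sent to the model carries exactly the bytes that were emitted in the input event.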

coderabbitai · 2025-10-21T16:28:20Z

plugins/aws/vision_agents/plugins/aws/aws_realtime.py

+    async def send_event(self, event_data: Dict[str, Any]) -> None:
+        event_json = json.dumps(event_data)
+        event = InvokeModelWithBidirectionalStreamInputChunk(
+            value=BidirectionalInputPayloadPart(bytes_=event_json.encode('utf-8'))
+        )
+        await self.stream.input_stream.send(event)
+


⚠️ Potential issue | 🟠 Major

Guard send_event when not connected/stream missing.

Prevents AttributeError if called before connect/after close.

     async def send_event(self, event_data: Dict[str, Any]) -> None:
-        event_json = json.dumps(event_data)
-        event = InvokeModelWithBidirectionalStreamInputChunk(
-            value=BidirectionalInputPayloadPart(bytes_=event_json.encode('utf-8'))
-        )
-        await self.stream.input_stream.send(event)
+        if not self.connected or not getattr(self, "stream", None):
+            self.logger.warning("send_event called while disconnected; dropping event: %s", event_data.get("event", {}).keys())
+            return
+        event_json = json.dumps(event_data)
+        event = InvokeModelWithBidirectionalStreamInputChunk(
+            value=BidirectionalInputPayloadPart(bytes_=event_json.encode("utf-8"))
+        )
+        await self.stream.input_stream.send(event)

🤖 Prompt for AI Agents

In plugins/aws/vision_agents/plugins/aws/aws_realtime.py around lines 320 to 326, send_event can raise AttributeError if called when the stream or its input_stream is missing (not connected or already closed); guard the method by checking the stream and input_stream before using them (e.g. if not self.stream or not getattr(self.stream, "input_stream", None): return or raise a clear exception), and return/raise early so send_event only attempts to create and send the chunk when the connection is present; keep it asynchronous and non-blocking so callers get a predictable no-op or informative error when disconnected.
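
A small sketch of the guard, assuming fake stream classes in place of the Bedrock SDK types. `FakeInputStream`, `FakeStream`, and `GuardedSender` are hypothetical names, and returning a bool (rather than `None` as in the plugin) is a choice made here so the drop is observable in a test.

```python
import asyncio
import json
import logging
from typing import Any, Dict, List, Optional

logger = logging.getLogger("send_event_demo")


class FakeInputStream:
    """Hypothetical stand-in for the bidirectional input stream."""

    def __init__(self) -> None:
        self.sent: List[bytes] = []

    async def send(self, payload: bytes) -> None:
        self.sent.append(payload)


class FakeStream:
    def __init__(self) -> None:
        self.input_stream = FakeInputStream()


class GuardedSender:
    def __init__(self) -> None:
        self.connected = False
        self.stream: Optional[FakeStream] = None

    async def send_event(self, event_data: Dict[str, Any]) -> bool:
        """Send the event; return False if it was dropped."""
        if not self.connected or self.stream is None:
            logger.warning(
                "send_event called while disconnected; dropping event: %s",
                list(event_data.get("event", {})),
            )
            return False
        await self.stream.input_stream.send(json.dumps(event_data).encode("utf-8"))
        return True


async def demo():
    sender = GuardedSender()
    # Before connect: the event is dropped instead of raising AttributeError.
    dropped = await sender.send_event({"event": {"sessionStart": {}}})
    sender.stream = FakeStream()
    sender.connected = True
    sent = await sender.send_event({"event": {"sessionEnd": {}}})
    return dropped, sent, sender.stream.input_stream.sent
```

Whether a dropped event should be a silent no-op or a raised exception is a design decision for the plugin; the sketch only shows the no-op variant from the suggestion.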

coderabbitai · 2025-10-21T16:28:20Z

plugins/aws/vision_agents/plugins/aws/aws_realtime.py

+                                elif 'textOutput' in json_data['event']:
+                                    text_content = json_data['event']['textOutput']['content']
+                                    #role = json_data['event']['textOutput']['role']
+                                    logger.info(f"Text output from AWS Bedrock: {text_content}")
+                                elif 'completionStart' in json_data['event']:
+                                    logger.info("Completion start from AWS Bedrock", json_data['event']['completionStart'])
+                                elif 'audioOutput' in json_data['event']:
+                                    audio_content = json_data['event']['audioOutput']['content']


⚠️ Potential issue | 🔴 Critical

Fix logging format: the second argument has no placeholder, so it is never formatted into the message.

Use a placeholder or merge into the message.

-                                elif 'completionStart' in json_data['event']:
-                                    logger.info("Completion start from AWS Bedrock", json_data['event']['completionStart'])
+                                elif 'completionStart' in json_data['event']:
+                                    self.logger.info("Completion start from AWS Bedrock: %s", json_data['event']['completionStart'])

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents

In plugins/aws/vision_agents/plugins/aws/aws_realtime.py around lines 384 to 391, the logger.info call for "Completion start from AWS Bedrock" passes the event as a second argument without a placeholder, so the value cannot be formatted into the message; change the call to include the value in the message (either with a %s placeholder and the event as an argument, or by building a single string via f-string/concatenation) so the logger receives a properly formatted message, and ensure other nearby logs follow the same pattern for consistency.
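
The placeholder form can be verified with the standard `logging` module alone. In this sketch the logger name, the `ListHandler` class, and the payload are hypothetical; only the message string comes from the code under review.

```python
import logging


class ListHandler(logging.Handler):
    """Collects formatted messages so the result can be inspected."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        # getMessage() merges record.args into record.msg using %-formatting.
        self.messages.append(record.getMessage())


logger = logging.getLogger("bedrock_logging_demo")
logger.setLevel(logging.INFO)
logger.propagate = False
handler = ListHandler()
logger.addHandler(handler)

payload = {"sessionId": "abc"}  # hypothetical event payload
logger.info("Completion start from AWS Bedrock: %s", payload)
```

With the `%s` placeholder the payload ends up in the logged message, and the formatting is deferred until a handler actually processes the record.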

coderabbitai · 2025-10-21T16:28:20Z

plugins/example/example/my_example.py

+import asyncio
+import logging
+from uuid import uuid4
+
+from dotenv import load_dotenv
+
+from vision_agents.core import User
+from vision_agents.core.agents import Agent
+from vision_agents.plugins import aws, getstream, cartesia, deepgram, smart_turn
+
+load_dotenv()
+
+logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s [call_id=%(call_id)s] %(name)s: %(message)s")
+logger = logging.getLogger(__name__)
+
+
+async def start_agent() -> None:
+    agent = Agent(
+        edge=getstream.Edge(),
+        agent_user=User(name="Friendly AI"),
+        instructions="Be nice to the user",
+        llm=aws.LLM(model="qwen.qwen3-32b-v1:0"),
+        tts=cartesia.TTS(),
+        stt=deepgram.STT(),
+        turn_detection=smart_turn.TurnDetection(buffer_duration=2.0, confidence_threshold=0.5),
+        # Enable turn detection with FAL/ Smart turn
+    )
+    await agent.create_user()
+
+    call = agent.edge.client.video.call("default", str(uuid4()))
+    await agent.edge.open_demo(call)
+
+    with await agent.join(call):
+        await asyncio.sleep(5)
+        await agent.llm.simple_response(text="Say hi")
+        await agent.finish()
+
+
+if __name__ == "__main__":
+    asyncio.run(start_agent())


🛠️ Refactor suggestion | 🟠 Major

Consider relocating this AWS example or using a different LLM.

This example resides in plugins/example/example/ but demonstrates AWS Bedrock LLM integration (line 22). This creates confusion about whether it's a generic example or an AWS-specific example. Consider either:

Moving this to plugins/aws/example/ if it's meant to showcase AWS, or

Replacing aws.LLM with a more generic plugin like gemini.LLM if this is meant to be a template example

This would align better with the plugin structure guidance and make the examples clearer for users.

🤖 Prompt for AI Agents

plugins/example/example/my_example.py lines 1-40: this example uses aws.LLM (line ~22) but lives under plugins/example/, which is confusing; either move the file to plugins/aws/example/ if it is intended as an AWS-specific sample, or replace aws.LLM with a generic LLM plugin (e.g., gemini.LLM or another non-provider-specific LLM) so the example remains a template; update imports and any provider-specific configuration accordingly and adjust README/path references to reflect the new location or plugin choice.

coderabbitai · 2025-10-21T16:28:20Z

plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py

+        except CancelledError:
+            logger.error("Stop async iteration exception")
+            return


🛠️ Refactor suggestion | 🟠 Major

Log task cancellation at info level, not error.

CancelledError is the expected behavior when close() cancels the receive task. Logging this at error level creates false alarms during normal shutdown.

Apply this diff to log at info level with a clearer message:

         except CancelledError:
-            logger.error("Stop async iteration exception")
+            self.logger.info("Receive loop cancelled during shutdown")
             return

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents

In plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py around lines 245-247, the exception handler currently logs CancelledError at error level; change it to log at info level with a clearer message such as "Receive task cancelled during shutdown" (or similar) and then return, to avoid treating expected cancellation as an error.
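
To illustrate why cancellation is the normal shutdown path, here is a minimal sketch of a receive loop that is cancelled by its owner. `receive_loop` and the queue-based transport are assumptions for this example and do not reflect the Gemini plugin's actual receive code.

```python
import asyncio
import logging

logger = logging.getLogger("receive_loop_demo")


async def receive_loop(queue: asyncio.Queue, received: list) -> None:
    """Hypothetical receive loop: drain the queue until cancelled."""
    try:
        while True:
            received.append(await queue.get())
    except asyncio.CancelledError:
        # Expected when close() cancels the task, so log at info level.
        logger.info("Receive task cancelled during shutdown")
        return


async def demo():
    queue: asyncio.Queue = asyncio.Queue()
    received: list = []
    task = asyncio.create_task(receive_loop(queue, received))
    await queue.put("chunk")
    await asyncio.sleep(0.01)  # give the loop time to pick up the item
    task.cancel()
    await task  # completes normally because the loop handled the cancellation
    return received, task.cancelled()
```

Because the handler returns instead of re-raising, the task finishes normally and `task.cancelled()` is false. That matches the reviewed code's `return`; callers that need to observe the cancellation would have to re-raise instead.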

coderabbitai

Actionable comments posted: 0

♻️ Duplicate comments (2)

agents-core/vision_agents/core/events/manager.py (1)

539-542: Remove commented-out code and empty conditional block.

The commented-out module_name retrieval and logger.info call, along with the empty if block containing only pass, should be removed entirely. This dead code reduces maintainability with no functional benefit.

Apply this diff to clean up:

     async def _process_single_event(self, event):
         """Process a single event."""
         for handler in self._handlers.get(event.type, []):
-            #module_name = getattr(handler, '__module__', 'unknown')
-            if event.type not in self._silent_events:
-                pass
-                #logger.info(f"Called handler {handler.__name__} from {module_name} for event {event.type}")
-
             loop = asyncio.get_running_loop()
             handler_task = loop.create_task(self._run_handler(handler, event))
             self._handler_tasks[uuid.uuid4()] = handler_task

Alternatively, if the _silent_events check is reserved for future functionality, document that intent:

     async def _process_single_event(self, event):
         """Process a single event."""
         for handler in self._handlers.get(event.type, []):
-            #module_name = getattr(handler, '__module__', 'unknown')
-            if event.type not in self._silent_events:
-                pass
-                #logger.info(f"Called handler {handler.__name__} from {module_name} for event {event.type}")
+            # TODO: Add conditional logging based on _silent_events if needed
 
             loop = asyncio.get_running_loop()
             handler_task = loop.create_task(self._run_handler(handler, event))
             self._handler_tasks[uuid.uuid4()] = handler_task

plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (1)

345-347: Log task cancellation at info level, not error.

CancelledError is expected when close() cancels the receive task during normal shutdown. Logging at error level creates false alarms.

Apply this diff:
         except CancelledError:
-            logger.error("Stop async iteration exception")
+            self.logger.info("Receive task cancelled during shutdown")
             return

🧹 Nitpick comments (1)

agents-core/vision_agents/core/llm/realtime.py (1)
40-41: Fix type annotation spacing to follow PEP 8.

PEP 8 calls for no space before the colon in a variable annotation and a single space after it. The current spacing is inconsistent with standard Python conventions.

Apply this diff:
-    fps : int = 1
-    session_id : str # UUID to identify this session
+    fps: int = 1
+    session_id: str  # UUID to identify this session

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 4b540c9 and 186053f.

⛔ Files ignored due to path filters (1)

uv.lock is excluded by !**/*.lock

📒 Files selected for processing (4)

agents-core/vision_agents/core/events/manager.py (1 hunks)
agents-core/vision_agents/core/llm/realtime.py (4 hunks)
plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (3 hunks)
plugins/openai/vision_agents/plugins/openai/openai_realtime.py (1 hunks)

🧰 Additional context used

📓 Path-based instructions (1)

**/*.py

📄 CodeRabbit inference engine (.cursor/rules/python.mdc)

**/*.py: Do not modify sys.path in Python code
Docstrings must follow the Google style guide

Files:

plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py
plugins/openai/vision_agents/plugins/openai/openai_realtime.py
agents-core/vision_agents/core/events/manager.py
agents-core/vision_agents/core/llm/realtime.py

🧬 Code graph analysis (3)

plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (5)

agents-core/vision_agents/core/llm/realtime.py (1)

close (179-180)

plugins/openai/vision_agents/plugins/openai/openai_realtime.py (1)

close (145-146)

agents-core/vision_agents/core/agents/agents.py (1)

close (432-503)

plugins/getstream/vision_agents/plugins/getstream/stream_edge_transport.py (2)

close (40-41)

close (325-327)

plugins/openai/vision_agents/plugins/openai/rtc_manager.py (1)

close (765-789)

plugins/openai/vision_agents/plugins/openai/openai_realtime.py (2)

agents-core/vision_agents/core/llm/realtime.py (1)

close (179-180)

plugins/openai/vision_agents/plugins/openai/rtc_manager.py (1)

close (765-789)

agents-core/vision_agents/core/llm/realtime.py (5)

plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (1)

close (369-383)

plugins/openai/vision_agents/plugins/openai/openai_realtime.py (1)

close (145-146)

agents-core/vision_agents/core/agents/agents.py (1)

close (432-503)

plugins/getstream/vision_agents/plugins/getstream/stream_edge_transport.py (2)

close (40-41)

close (325-327)

plugins/openai/vision_agents/plugins/openai/rtc_manager.py (1)

close (765-789)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)

GitHub Check: unit / Ruff & mypy
GitHub Check: unit / Test "not integration"
GitHub Check: unit / Ruff & mypy
GitHub Check: unit / Test "not integration"

🔇 Additional comments (7)

plugins/openai/vision_agents/plugins/openai/openai_realtime.py (1)

145-146: LGTM! Clean delegation to RTCManager.

The implementation correctly delegates cleanup to self.rtc.close(), which handles WebRTC resources, data channels, and tracks.

agents-core/vision_agents/core/llm/realtime.py (3)

48-48: Good refactor to public attribute.

Replacing the internal _is_connected tracking with a public connected flag improves API clarity and simplifies state management.

78-78: LGTM! Proper state synchronization.

The connected attribute is correctly updated in both connect and disconnect event emitters, ensuring consistent state tracking.

Also applies to: 94-94

178-180: Excellent API evolution.

Replacing _close_impl with an abstract public close method provides a cleaner, more explicit shutdown contract for subclasses.
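The contract described in these comments can be sketched roughly as follows. This is a minimal illustration only: `RealtimeBase` and `DummyRealtime` are hypothetical names, not the classes in `realtime.py`, and the real base class carries far more state.

```python
import abc
import asyncio
import uuid


class RealtimeBase(abc.ABC):
    """Hypothetical stand-in for the core Realtime base class."""

    def __init__(self) -> None:
        self.session_id: str = str(uuid.uuid4())  # UUID to identify this session
        self.connected: bool = False  # public flag instead of a private _is_connected

    @abc.abstractmethod
    async def close(self) -> None:
        """Release provider resources; every subclass must implement this."""


class DummyRealtime(RealtimeBase):
    """Hypothetical provider that only flips the connection flag."""

    async def close(self) -> None:
        self.connected = False


async def main() -> bool:
    rt = DummyRealtime()
    rt.connected = True
    await rt.close()
    return rt.connected


still_connected = asyncio.run(main())
```

Because `close` is abstract, instantiating a subclass that forgets to implement it fails immediately with `TypeError`, which is the more explicit shutdown contract the comment refers to.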

plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (3)

3-3: LGTM! Necessary import for exception handling.

The CancelledError import is properly used in the receive loop exception handling at line 345.

344-344: LGTM! Cleaner log formatting.

Single-line format improves log readability for unrecognized event warnings.

369-383: LGTM! Robust shutdown implementation.

The close method properly:

Sets connection state to False

Cancels and awaits the receive task

Closes the session context using the async context manager protocol

Handles exceptions gracefully during cleanup
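For readers unfamiliar with the pattern, the four shutdown steps listed above might look roughly like this. It is a sketch under assumptions, not the Gemini plugin's code: `Session` and `Client` are hypothetical stand-ins, and the receive loop only sleeps.

```python
import asyncio
import contextlib


class Session:
    """Hypothetical async context manager standing in for a provider session."""

    def __init__(self) -> None:
        self.closed = False

    async def __aenter__(self) -> "Session":
        return self

    async def __aexit__(self, exc_type, exc, tb) -> None:
        self.closed = True


class Client:
    def __init__(self) -> None:
        self.connected = False
        self._session = None
        self._receive_task = None

    async def connect(self) -> None:
        self._session = Session()
        await self._session.__aenter__()
        self._receive_task = asyncio.create_task(self._receive_loop())
        self.connected = True

    async def _receive_loop(self) -> None:
        while True:
            await asyncio.sleep(3600)  # placeholder for reading provider events

    async def close(self) -> None:
        # 1. Mark the connection as closed first so other callers stop sending.
        self.connected = False
        # 2. Cancel the receive task and wait for it to finish.
        if self._receive_task is not None:
            self._receive_task.cancel()
            with contextlib.suppress(asyncio.CancelledError):
                await self._receive_task
            self._receive_task = None
        # 3. Exit the session context; 4. tolerate errors during cleanup.
        if self._session is not None:
            try:
                await self._session.__aexit__(None, None, None)
            except Exception:
                pass
            finally:
                self._session = None


async def main():
    client = Client()
    await client.connect()
    session, task = client._session, client._receive_task
    await client.close()
    return client.connected, task.cancelled(), session.closed


state = asyncio.run(main())
```

Awaiting the cancelled task before closing the session avoids the receive loop touching a session that is already gone.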

coderabbitai

Actionable comments posted: 4

♻️ Duplicate comments (3)

plugins/aws/vision_agents/plugins/aws/aws_realtime.py (3)

163-181: Return early when disconnected; convert numpy to bytes.

Line 168 logs a warning but execution continues, which will fail on subsequent operations. Additionally, pcm.samples is a numpy array but is passed to functions expecting bytes (lines 176, 178).

Apply this diff to fix both issues:

     async def simple_audio_response(
         self, pcm: PcmData, participant: Optional[Participant] = None
     ):
         """Send audio data to the model for processing."""
         if not self.connected:
-            self.logger.warning("realtime is not active. can't call simple_audio_response")
+            self.logger.warning("realtime is not active; ignoring simple_audio_response")
+            return
 
         # Resample from 48kHz to 24kHz if needed
         pcm = pcm.resample(24000)
         
         content_name = str(uuid.uuid4())
 
         await self.audio_content_start(content_name)
-        self._emit_audio_input_event(pcm.samples, sample_rate=pcm.sample_rate)
+        samples_bytes = pcm.samples.tobytes()
+        self._emit_audio_input_event(samples_bytes, sample_rate=pcm.sample_rate)
         # Convert PcmData to base64 encoded bytes
-        audio_base64 = base64.b64encode(pcm.samples).decode('utf-8')
+        audio_base64 = base64.b64encode(samples_bytes).decode('utf-8')
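The conversion in this diff can be illustrated without the plugin. The sketch below uses the standard library's `array` module as a stand-in for the numpy array in `pcm.samples` (an assumption made to keep the example dependency-free; numpy's `ndarray.tobytes()` plays the same role in the suggested fix).

```python
import array
import base64

# Hypothetical stand-in for pcm.samples: signed 16-bit mono PCM samples.
samples = array.array("h", [0, 1000, -1000, 32767, -32768])

# Serialize the samples to raw bytes once and reuse the result.
samples_bytes = samples.tobytes()

# Base64-encode the raw bytes and turn them into text for a JSON payload.
audio_base64 = base64.b64encode(samples_bytes).decode("utf-8")

# Round trip: decoding restores the original samples.
decoded = array.array("h")
decoded.frombytes(base64.b64decode(audio_base64))
```

Converting once and passing the same `bytes` object to both the event emitter and the encoder keeps the two consumers in agreement about what was sent.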

322-327: Guard against missing stream.

The method can raise AttributeError if called when self.stream or its input_stream is not available (before connect or after close).

Apply this diff:

     async def send_event(self, event_data: Dict[str, Any]) -> None:
+        """
+        Send event to Bedrock stream.
+        
+        Args:
+            event_data: Event data to send.
+        """
+        if not self.connected or not getattr(self, "stream", None):
+            self.logger.warning("send_event called while disconnected; dropping event")
+            return
+            
         event_json = json.dumps(event_data)
         event = InvokeModelWithBidirectionalStreamInputChunk(
             value=BidirectionalInputPayloadPart(bytes_=event_json.encode('utf-8'))
         )
         await self.stream.input_stream.send(event)
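A minimal sketch of the guard, assuming a fake stream in place of the Bedrock SDK objects. `FakeStream`, `FakeInputStream`, and `Sender` are hypothetical, and the boolean return value is added only to make the behavior observable.

```python
import asyncio
import json
import logging
from typing import Any, Dict, List, Optional


class FakeInputStream:
    """Hypothetical stand-in for the SDK's input stream."""

    def __init__(self) -> None:
        self.sent: List[bytes] = []

    async def send(self, payload: bytes) -> None:
        self.sent.append(payload)


class FakeStream:
    def __init__(self) -> None:
        self.input_stream = FakeInputStream()


class Sender:
    def __init__(self) -> None:
        self.connected = False
        self.stream: Optional[FakeStream] = None
        self.logger = logging.getLogger("send_event_demo")

    async def send_event(self, event_data: Dict[str, Any]) -> bool:
        # Drop the event instead of raising AttributeError on a missing stream.
        if not self.connected or self.stream is None:
            self.logger.warning("send_event called while disconnected; dropping event")
            return False
        await self.stream.input_stream.send(json.dumps(event_data).encode("utf-8"))
        return True


async def main():
    sender = Sender()
    before = await sender.send_event({"event": {"sessionStart": {}}})
    sender.stream = FakeStream()
    sender.connected = True
    after = await sender.send_event({"event": {"sessionStart": {}}})
    return before, after, len(sender.stream.input_stream.sent)


results = asyncio.run(main())
```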

391-391: Fix logging format error.

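Line 391 passes a second argument without a format placeholder, so the payload never reaches the log: logging either reports a formatting error or, when the argument is a non-empty dict, silently drops it.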

Apply this diff:

                                 elif 'completionStart' in json_data['event']:
-                                    logger.info("Completion start from AWS Bedrock", json_data['event']['completionStart'])
+                                    logger.info("Completion start from AWS Bedrock: %s", json_data['event']['completionStart'])
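How the standard `logging` module treats the extra argument can be checked directly by building a `LogRecord` and calling `getMessage()`, which is the formatting step a handler performs. The payload below is hypothetical.

```python
import logging


def render(msg, *args):
    """Format a message the way logging does when a record is emitted."""
    record = logging.LogRecord("demo", logging.INFO, "demo.py", 0, msg, args, None)
    return record.getMessage()


payload = {"sessionId": "abc"}  # hypothetical completionStart payload

# With a %s placeholder the payload is rendered into the message.
rendered = render("Completion start from AWS Bedrock: %s", payload)

# Without a placeholder, a single non-empty dict is treated as a mapping of
# named arguments, so the message is emitted unchanged and the payload is lost.
dropped = render("Completion start from AWS Bedrock", payload)

# Any other extra argument fails during formatting with TypeError; a handler
# would catch this and print a "Logging error" traceback to stderr.
try:
    render("Completion start from AWS Bedrock", "payload")
    raised = False
except TypeError:
    raised = True
```

Either way the information is missing from the log, which is why the placeholder matters.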

🧹 Nitpick comments (11)

plugins/aws/vision_agents/plugins/aws/aws_realtime.py (11)

125-161: Use specific exception type.

Line 160 raises a generic Exception. Use a more specific exception type like ValueError or create a custom exception class.

         if not system_instructions:
-            raise Exception("AWS Bedrock requires system instructions before sending regular user input")
+            raise ValueError("AWS Bedrock requires system instructions before sending regular user input")

206-216: Add docstring and clarify parameter name.

The method lacks a docstring, and the parameter name audio_bytes is misleading since it expects a base64-encoded string, not bytes.

Based on coding guidelines.

-    async def audio_input(self, content_name: str, audio_bytes: str):
+    async def audio_input(self, content_name: str, audio_base64: str):
+        """
+        Send base64-encoded audio input.
+        
+        Args:
+            content_name: Unique identifier for this audio content.
+            audio_base64: Base64-encoded audio data.
+        """
         audio_event = {
             "event": {
                 "audioInput": {
                     "promptName": self.session_id,
                     "contentName": content_name,
-                    "content": audio_bytes
+                    "content": audio_base64
                 }
             }
         }

218-238: Add docstring.

The method lacks a docstring explaining its purpose and parameters.

Based on coding guidelines.

     async def audio_content_start(self, content_name: str, role: str="USER"):
+        """
+        Send audio content start event.
+        
+        Args:
+            content_name: Unique identifier for this audio content.
+            role: Role for the content (default: "USER").
+        """

240-254: Add docstring.

The method lacks a docstring. Also, consider whether the TODO for configurable inference parameters should be addressed now or tracked separately.

Based on coding guidelines.

     async def start_session(self):
-        # subclass this to change the session start
+        """
+        Send session start event with inference configuration.
+        
+        Subclass this method to customize inference parameters.
+        """

256-278: Add docstring.

The method lacks a docstring.

Based on coding guidelines.

     async def start_prompt(self):
+        """Send prompt start event with output configuration."""

282-297: Add docstring.

The method lacks a docstring.

Based on coding guidelines.

     async def text_content_start(self, content_name: str, role: str):
+        """
+        Send text content start event.
+        
+        Args:
+            content_name: Unique identifier for this text content.
+            role: Role for the content (e.g., "USER", "SYSTEM").
+        """

299-309: Add docstring.

The method lacks a docstring.

Based on coding guidelines.

     async def text_input(self, content_name: str, content: str):
+        """
+        Send text input event.
+        
+        Args:
+            content_name: Unique identifier for this text content.
+            content: The text content to send.
+        """

311-320: Add docstring.

The method lacks a docstring.

Based on coding guidelines.

     async def content_end(self, content_name: str):
+        """
+        Send content end event.
+        
+        Args:
+            content_name: Unique identifier for the content.
+        """

329-354: Add docstring.

The method lacks a docstring, but the cleanup logic is correct.

Based on coding guidelines.

     async def close(self):
+        """Close the Bedrock realtime session and clean up resources."""

357-468: Expand docstring.

The method's one-line docstring does not explain which events it handles.

Based on coding guidelines.

     async def _handle_events(self):
-        """Process incoming responses from AWS Bedrock."""
+        """
+        Process incoming responses from AWS Bedrock.
+        
+        Handles event types: contentStart, textOutput, audioOutput, toolUse, 
+        contentEnd, completionEnd, and usageEvent. Runs until stream ends or
+        an error occurs.
+        """

373-443: Standardize logging format for consistency.

Multiple logging calls use f-strings (lines 373, 389, 408, 420, 434, 437, 443) while others use % formatting (lines 455, 465). For consistency and better log aggregation, use % formatting throughout.

Example for line 373:

-                                    logger.info(f"Content start from AWS Bedrock: {content_start}")
+                                    logger.info("Content start from AWS Bedrock: %s", content_start)

Apply similar changes to lines 389, 408, 420, 434, 437, and 443.
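Beyond consistency, %-style arguments are formatted lazily, only if the record is actually emitted. The sketch below shows the difference with a hypothetical `Expensive` object that counts how often it is converted to a string.

```python
import logging


class Expensive:
    """Counts how often it is converted to a string."""

    def __init__(self) -> None:
        self.calls = 0

    def __str__(self) -> str:
        self.calls += 1
        return "payload"


logger = logging.getLogger("lazy_demo")
logger.setLevel(logging.WARNING)  # INFO records are discarded
logger.propagate = False

# f-string: the object is formatted before logger.info is even called.
eager = Expensive()
logger.info(f"Content start from AWS Bedrock: {eager}")

# %-style: formatting is deferred, and skipped because INFO is disabled.
lazy = Expensive()
logger.info("Content start from AWS Bedrock: %s", lazy)
```

After running this, `eager.calls` is 1 and `lazy.calls` is 0, so the %-style call avoids the formatting cost whenever the level is disabled.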

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 186053f and 7da96db.

📒 Files selected for processing (1)

plugins/aws/vision_agents/plugins/aws/aws_realtime.py (1 hunks)

🧰 Additional context used

📓 Path-based instructions (1)

**/*.py

📄 CodeRabbit inference engine (.cursor/rules/python.mdc)

**/*.py: Do not modify sys.path in Python code
Docstrings must follow the Google style guide

Files:

plugins/aws/vision_agents/plugins/aws/aws_realtime.py

🧬 Code graph analysis (1)

plugins/aws/vision_agents/plugins/aws/aws_realtime.py (6)

agents-core/vision_agents/core/utils/video_forwarder.py (1)

VideoForwarder (13-188)

agents-core/vision_agents/core/processors/base_processor.py (1)

Processor (32-40)

agents-core/vision_agents/core/edge/types.py (2)

PcmData (37-183)

resample (129-183)

agents-core/vision_agents/core/llm/events.py (1)

RealtimeAudioOutputEvent (37-45)

agents-core/vision_agents/core/llm/llm.py (1)

_build_enhanced_instructions (61-85)

agents-core/vision_agents/core/llm/realtime.py (1)

_emit_audio_input_event (103-114)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)

GitHub Check: unit / Test "not integration"
GitHub Check: unit / Ruff & mypy
GitHub Check: unit / Test "not integration"
GitHub Check: unit / Ruff & mypy

🔇 Additional comments (4)

plugins/aws/vision_agents/plugins/aws/aws_realtime.py (4)

1-21: LGTM!

Imports are well-organized and follow standard conventions.

24-25: LGTM!

Constants are clearly defined and appropriately named.

28-32: TODO is acknowledged in PR objectives.

The planned function calling and chat integration work is appropriately tracked here and in the PR description.

196-204: LGTM!

The three-step wrapping pattern correctly handles Nova's event requirements.

coderabbitai · 2025-10-21T17:38:47Z

plugins/aws/vision_agents/plugins/aws/aws_realtime.py

+class Realtime(realtime.Realtime):
+    """
+    Realtime on AWS with support for audio/video streaming (uses AWS Bedrock).
+
+    A few things are different about Nova compared to other STS solutions
+
+        1. two init events. there is a session start and a prompt start
+        2. promptName basically works like a unique identifier. it's created client side and sent to nova
+        3. input/text events are wrapped. so its common to do start event, text event, stop event
+        4. on close there is an session and a prompt end event
+
+    AWS Nova samples are the best docs:
+
+        simple: https://github.com/aws-samples/amazon-nova-samples/blob/main/speech-to-speech/sample-codes/console-python/nova_sonic_simple.py
+        full: https://github.com/aws-samples/amazon-nova-samples/blob/main/speech-to-speech/sample-codes/console-python/nova_sonic.py
+        tool use: https://github.com/aws-samples/amazon-nova-samples/blob/main/speech-to-speech/sample-codes/console-python/nova_sonic_tool_use.py
+
+    Input event docs: https://docs.aws.amazon.com/nova/latest/userguide/input-events.html
+    Available voices are documented here:
+    https://docs.aws.amazon.com/nova/latest/userguide/available-voices.html
+
+    Resumption example:
+    https://github.com/aws-samples/amazon-nova-samples/tree/main/speech-to-speech/repeatable-patterns/resume-conversation
+
+
+
+    Examples:
+    
+        from vision_agents.plugins import aws
+        
+        llm = aws.Realtime(
+            model="us.amazon.nova-sonic-v1:0",
+            region_name="us-east-1"
+        )
+        
+        # Connect to the session
+        await llm.connect()
+        
+        # Simple text response
+        await llm.simple_response("Describe what you see and say hi")
+        
+        # Send audio
+        await llm.simple_audio_response(pcm_data)
+        
+        # Close when done
+        await llm.close()
+    """


⚠️ Potential issue | 🟡 Minor

Docstring should follow Google style guide.

The docstring provides excellent information but doesn't follow the required Google style format with sections like Attributes: and Examples:.

Based on coding guidelines.

Apply this diff to restructure the docstring:

     """
-    Realtime on AWS with support for audio/video streaming (uses AWS Bedrock).
+    Realtime on AWS with support for audio/video streaming using AWS Bedrock.
-    A few things are different about Nova compared to other STS solutions
+    Nova Sonic has several differences compared to other STS solutions:
+        1. Two initialization events: session start and prompt start
+        2. promptName acts as a unique client-side identifier sent to Nova
+        3. Input/text events are wrapped in start/stop events
+        4. Close requires both session and prompt end events
-        1. two init events. there is a session start and a prompt start
-        2. promptName basically works like a unique identifier. it's created client side and sent to nova
-        3. input/text events are wrapped. so its common to do start event, text event, stop event
-        4. on close there is an session and a prompt end event
+    See AWS Nova samples for reference:
+        - Simple: https://github.com/aws-samples/amazon-nova-samples/blob/main/speech-to-speech/sample-codes/console-python/nova_sonic_simple.py
+        - Full: https://github.com/aws-samples/amazon-nova-samples/blob/main/speech-to-speech/sample-codes/console-python/nova_sonic.py
+        - Tool use: https://github.com/aws-samples/amazon-nova-samples/blob/main/speech-to-speech/sample-codes/console-python/nova_sonic_tool_use.py
-    AWS Nova samples are the best docs:
+    Input event docs: https://docs.aws.amazon.com/nova/latest/userguide/input-events.html
+    Available voices: https://docs.aws.amazon.com/nova/latest/userguide/available-voices.html
+    Resumption example: https://github.com/aws-samples/amazon-nova-samples/tree/main/speech-to-speech/repeatable-patterns/resume-conversation
-        simple: https://github.com/aws-samples/amazon-nova-samples/blob/main/speech-to-speech/sample-codes/console-python/nova_sonic_simple.py
-        full: https://github.com/aws-samples/amazon-nova-samples/blob/main/speech-to-speech/sample-codes/console-python/nova_sonic.py
-        tool use: https://github.com/aws-samples/amazon-nova-samples/blob/main/speech-to-speech/sample-codes/console-python/nova_sonic_tool_use.py
+    Attributes:
+        connected: Connection status flag.
+        voice_id: Voice identifier for audio output.
-    Input event docs: https://docs.aws.amazon.com/nova/latest/userguide/input-events.html
-    Available voices are documented here:
-    https://docs.aws.amazon.com/nova/latest/userguide/available-voices.html
-
-    Resumption example:
-    https://github.com/aws-samples/amazon-nova-samples/tree/main/speech-to-speech/repeatable-patterns/resume-conversation
-
-
-
-    Examples:
-
+    Examples::
+
         from vision_agents.plugins import aws

         llm = aws.Realtime(

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents

In plugins/aws/vision_agents/plugins/aws/aws_realtime.py around lines 35 to 81, reformat the existing long docstring to follow the Google Python style: start with a one-line summary, a blank line, then an expanded description section that briefly lists the Nova-specific differences, add an Attributes: section enumerating key init parameters (e.g., model, region_name) with short types/descriptions, include an Examples: section containing the provided usage snippet, and move reference links into a Notes or References subsection; preserve the same content but restructure headings and bullet points to match Google style.
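As a rough picture of the target layout, a Google-style class docstring has a one-line summary, a blank line, a longer description, and then named sections. The class below is a hypothetical skeleton, not the plugin's `Realtime` class, and the attribute descriptions are placeholders.

```python
class ExampleRealtime:
    """Realtime speech-to-speech client (hypothetical skeleton).

    A longer description goes here, for example the provider-specific
    differences and links to reference material.

    Attributes:
        connected: True while the session is open.
        voice_id: Voice identifier used for audio output.

    Examples:
        llm = ExampleRealtime(voice_id="matthew")
    """

    def __init__(self, voice_id: str = "matthew") -> None:
        """Initialize the client.

        Args:
            voice_id: Voice identifier used for audio output.
        """
        self.connected = False
        self.voice_id = voice_id
```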

coderabbitai · 2025-10-21T17:38:47Z

plugins/aws/vision_agents/plugins/aws/aws_realtime.py

+    def __init__(
+            self,
+            model: str = DEFAULT_MODEL,
+            region_name: str = "us-east-1",
+            voice_id: str = "matthew",
+            **kwargs
+    ) -> None:
+        """
+
+        """
+        super().__init__(**kwargs)
+        self.model = model
+        self.region_name = region_name
+        self.sample_rate = 24000
+        self.voice_id = voice_id
+
+        # Initialize Bedrock Runtime client with SDK
+        config = Config(
+            endpoint_uri=f"https://bedrock-runtime.{region_name}.amazonaws.com",
+            region=region_name,
+            aws_credentials_identity_resolver=EnvironmentCredentialsResolver(),
+        )
+        self.client = BedrockRuntimeClient(config=config)
+        self.logger = logging.getLogger(__name__)
+
+        # Audio output track - Bedrock typically outputs at 16kHz
+        self.output_track = AudioStreamTrack(
+            framerate=24000, stereo=False, format="s16"
+        )
+
+        self._video_forwarder: Optional[VideoForwarder] = None
+        self._stream_task: Optional[asyncio.Task[Any]] = None
+        self._is_connected = False
+        self._message_queue: asyncio.Queue[Dict[str, Any]] = asyncio.Queue()
+        self._conversation_messages: List[Dict[str, Any]] = []
+        self._pending_tool_uses: Dict[int, Dict[str, Any]] = {}  # Track tool calls across stream events
+
+        # Audio streaming configuration
+        self.prompt_name = self.session_id


⚠️ Potential issue | 🟡 Minor

Add docstring following Google style.

The __init__ method lacks a docstring. Per coding guidelines, it should document the parameters.

Based on coding guidelines.

Apply this diff:

     def __init__(
             self,
             model: str = DEFAULT_MODEL,
             region_name: str = "us-east-1",
             voice_id: str = "matthew",
             **kwargs
     ) -> None:
         """
+        Initialize AWS Bedrock Realtime client.
+
+        Args:
+            model: Bedrock model ID to use for realtime inference.
+            region_name: AWS region name for Bedrock service.
+            voice_id: Voice identifier for audio output (e.g., "matthew").
+            **kwargs: Additional arguments passed to parent Realtime class.
         """

🤖 Prompt for AI Agents

In plugins/aws/vision_agents/plugins/aws/aws_realtime.py around lines 85 to 123, the __init__ method is missing a docstring; add a Google-style docstring that starts with a one-line summary of the constructor, followed by an Args section describing each parameter (model: str, region_name: str, voice_id: str, **kwargs), their types and defaults, and any important instance attributes initialized (client, logger, output_track, sample_rate, prompt_name), and end with a Returns: None line; keep it concise and follow Google docstring formatting conventions.

coderabbitai · 2025-10-21T17:38:47Z

plugins/aws/vision_agents/plugins/aws/aws_realtime.py

+        super().__init__(**kwargs)
+        self.model = model
+        self.region_name = region_name
+        self.sample_rate = 24000


⚠️ Potential issue | 🟡 Minor

DEFAULT_SAMPLE_RATE constant is unused and misleading.

Line 98 hardcodes 24000 Hz, while DEFAULT_SAMPLE_RATE (line 25) is 16000 Hz and never used. This inconsistency could confuse readers.

Consider either using the constant consistently or documenting why 24000 Hz is required for Bedrock:

-        self.sample_rate = 24000
+        # AWS Bedrock Nova requires 24kHz sample rate
+        self.sample_rate = 24000

🤖 Prompt for AI Agents

In plugins/aws/vision_agents/plugins/aws/aws_realtime.py around line 98, the sample rate is hardcoded to 24000 while DEFAULT_SAMPLE_RATE (defined at line 25 as 16000) is unused and misleading; replace the hardcoded value by using the DEFAULT_SAMPLE_RATE constant (or rename/update the constant to DEFAULT_BEDROCK_SAMPLE_RATE and use that) so the constant is referenced consistently, or if 24000 is required for Bedrock, update the constant value and/or its name and add a brief inline comment explaining why 24000 Hz is necessary.

coderabbitai · 2025-10-21T17:38:48Z

plugins/aws/vision_agents/plugins/aws/aws_realtime.py

+    async def simple_response(self, text: str, processors: Optional[List[Processor]] = None,
+                              participant: Optional[Participant] = None):
+        """
+        Simple response standardizes how to send a text instruction to this LLM.
+
+        Example:
+            llm.simple_response("tell me a poem about Boulder")
+
+        For more advanced use cases you can use the native send_realtime_input
+        """
+        self.logger.info("Simple response called with text: %s", text)
+        await self.content_input(content=text, role="USER")


⚠️ Potential issue | 🟡 Minor

Unused parameter: processors.

The processors parameter is declared but never used in the method body. Either implement processor support or remove the parameter.

-    async def simple_response(self, text: str, processors: Optional[List[Processor]] = None,
-                              participant: Optional[Participant] = None):
+    async def simple_response(self, text: str, participant: Optional[Participant] = None):

Alternatively, document why it's reserved for future use:

     async def simple_response(self, text: str, processors: Optional[List[Processor]] = None,
                               participant: Optional[Participant] = None):
         """
-        Simple response standardizes how to send a text instruction to this LLM.
+        Send a text instruction to this LLM.
+
+        Args:
+            text: The text message to send.
+            processors: Reserved for future use (currently ignored).
+            participant: Optional participant information.

-        Example:
+        Example::
+
             llm.simple_response("tell me a poem about Boulder")
-
-        For more advanced use cases you can use the native send_realtime_input
         """

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

-    async def simple_response(self, text: str, processors: Optional[List[Processor]] = None,
-                              participant: Optional[Participant] = None):
-        """
-        Simple response standardizes how to send a text instruction to this LLM.
-
-        Example:
-            llm.simple_response("tell me a poem about Boulder")
-
-        For more advanced use cases you can use the native send_realtime_input
-        """
-        self.logger.info("Simple response called with text: %s", text)
-        await self.content_input(content=text, role="USER")
+    async def simple_response(self, text: str, participant: Optional[Participant] = None):
+        """
+        Send a text instruction to this LLM.
+
+        Args:
+            text: The text message to send.
+            participant: Optional participant information.
+
+        Example::
+
+            llm.simple_response("tell me a poem about Boulder")
+        """
+        self.logger.info("Simple response called with text: %s", text)
+        await self.content_input(content=text, role="USER")

Suggested change

-    async def simple_response(self, text: str, processors: Optional[List[Processor]] = None,
-                              participant: Optional[Participant] = None):
-        """
-        Simple response standardizes how to send a text instruction to this LLM.
-
-        Example:
-            llm.simple_response("tell me a poem about Boulder")
-
-        For more advanced use cases you can use the native send_realtime_input
-        """
-        self.logger.info("Simple response called with text: %s", text)
-        await self.content_input(content=text, role="USER")
+    async def simple_response(self, text: str, processors: Optional[List[Processor]] = None,
+                              participant: Optional[Participant] = None):
+        """
+        Send a text instruction to this LLM.
+
+        Args:
+            text: The text message to send.
+            processors: Reserved for future use (currently ignored).
+            participant: Optional participant information.
+
+        Example::
+
+            llm.simple_response("tell me a poem about Boulder")
+        """
+        self.logger.info("Simple response called with text: %s", text)
+        await self.content_input(content=text, role="USER")

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (1)

.github/workflows/run_tests.yml (1)
53-60: Redundant heredoc export; environment variable is already set.

The env: section at lines 34–46 already exports AWS_BEARER_TOKEN_BEDROCK for all subsequent steps. Re-exporting it to GITHUB_ENV via heredoc is redundant unless you need to modify or re-evaluate the value mid-workflow. Simplify by removing this step or explain why the re-export is necessary.
-      - name: Export AWS_BEARER_TOKEN_BEDROCK (heredoc)
-        shell: bash
-        run: |
-          {
-            echo 'AWS_BEARER_TOKEN_BEDROCK<<EOF'
-            echo "${{ secrets.AWS_BEARER_TOKEN_BEDROCK }}"
-            echo 'EOF'
-          } >> "$GITHUB_ENV"
-
       - name: Verify presence

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 6893968 and b5482da.

⛔ Files ignored due to path filters (1)

examples/02_golf_coach_example/uv.lock is excluded by !**/*.lock

📒 Files selected for processing (2)

.github/workflows/run_tests.yml (2 hunks)
plugins/aws/tests/test_aws.py (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

plugins/aws/tests/test_aws.py

🧰 Additional context used

🪛 actionlint (1.7.8)

.github/workflows/run_tests.yml

28-28: property "environment" is not defined in object type {check_run_id: number; container: {id: string; network: string}; services: {string => {id: string; network: string; ports: {string => string}}}; status: string}

(expression)

46-46: property "aws_bearer_token_bedrock" is not defined in object type {actions_runner_debug: string; actions_step_debug: string; github_token: string}

(expression)

47-47: property "aws_bearer_token_bedrock" is not defined in object type {actions_runner_debug: string; actions_step_debug: string; github_token: string}

(expression)

55-55: property "aws_bearer_token_bedrock" is not defined in object type {actions_runner_debug: string; actions_step_debug: string; github_token: string}

(expression)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)

GitHub Check: unit / Ruff & mypy
GitHub Check: unit / Test "not integration"
GitHub Check: unit / Ruff & mypy
GitHub Check: unit / Test "not integration"

🔇 Additional comments (1)

.github/workflows/run_tests.yml (1)

46-46: AWS_BEARER_TOKEN_BEDROCK secret is conditionally required for AWS integration tests.

The secret reference is valid and necessary, but only for AWS Bedrock integration tests. The workflow automatically verifies its presence via the "Verify presence" step. This secret is:

Required only when running AWS tests (controlled by pytest markers)

Documented in plugins/aws/README.md

Explicitly checked by test code in plugins/aws/tests/test_aws.py

If you run tests without the AWS marker, this secret is not needed. Configure it in repository settings (Settings → Secrets and variables → Actions) only if you intend to run AWS integration tests.

coderabbitai · 2025-10-21T19:53:54Z

.github/workflows/run_tests.yml

+      - name: Show environment name
+        run: |
+          echo "Environment: ${{ job.environment }}"


⚠️ Potential issue | 🔴 Critical

❓ Verification inconclusive

Line 29: job.environment is not a valid GitHub Actions context property.

The job context in GitHub Actions does not expose an environment property, so the expression evaluates to an empty string and this step prints "Environment: " with no value. If you're trying to display the job's current environment or runner info, consider using runner.os, github.job, or other valid context properties instead.

If your intent is to debug environment setup, consider replacing this step with:

-      - name: Show environment name
+      - name: Show runner info
         run: |
-          echo "Environment: ${{ job.environment }}"
+          echo "Job: ${{ github.job }}, OS: ${{ runner.os }}"

job.environment is not a valid property in the GitHub Actions job context and will output an empty string.

The job context does not have environment name information available. When you attempt to dereference a nonexistent property, it will evaluate to an empty string, making this step output "Environment: " with no value. Consider using valid context properties like ${{ github.job }} or ${{ runner.os }} if you need to display debugging information.

🧰 Tools

🪛 actionlint (1.7.8)

28-28: property "environment" is not defined in object type {check_run_id: number; container: {id: string; network: string}; services: {string => {id: string; network: string; ports: {string => string}}}; status: string}

(expression)

🤖 Prompt for AI Agents

In .github/workflows/run_tests.yml around lines 27 to 29, the step echoes `${{ job.environment }}` which is not a valid GitHub Actions job context and will print an empty string; replace that reference with a valid context variable (for example `${{ github.job }}` to show the job name or `${{ runner.os }}` to show the runner OS) so the echo prints meaningful information, and update the step to use the chosen valid property.

coderabbitai · 2025-10-21T19:53:54Z

.github/workflows/run_tests.yml

+      AWS_BEARER_TOKEN_BEDROCK: "${{ secrets.AWS_BEARER_TOKEN_BEDROCK }}"
+      _BEARER_TOKEN_BEDROCK: "${{ secrets.AWS_BEARER_TOKEN_BEDROCK }}"


⚠️ Potential issue | 🟠 Major

Line 47: Redundant/incorrect environment variable _BEARER_TOKEN_BEDROCK.

The second env var with a leading underscore (_BEARER_TOKEN_BEDROCK) appears to be a typo or mistake. Both variables reference the same secret, but the underscore-prefixed name is unusual and unclear. Either remove it or clarify its purpose.

-      AWS_BEARER_TOKEN_BEDROCK: "${{ secrets.AWS_BEARER_TOKEN_BEDROCK }}"
-      _BEARER_TOKEN_BEDROCK: "${{ secrets.AWS_BEARER_TOKEN_BEDROCK }}"
+      AWS_BEARER_TOKEN_BEDROCK: "${{ secrets.AWS_BEARER_TOKEN_BEDROCK }}"

🧰 Tools

🪛 actionlint (1.7.8)

46-46: property "aws_bearer_token_bedrock" is not defined in object type {actions_runner_debug: string; actions_step_debug: string; github_token: string}

(expression)

47-47: property "aws_bearer_token_bedrock" is not defined in object type {actions_runner_debug: string; actions_step_debug: string; github_token: string}

(expression)

🤖 Prompt for AI Agents

In .github/workflows/run_tests.yml around lines 46-47, the environment variable _BEARER_TOKEN_BEDROCK is a redundant/likely-typo duplicate of AWS_BEARER_TOKEN_BEDROCK; remove the underscore-prefixed entry (or rename it to the intended variable if it was meant to be distinct) and ensure only the correct env var (AWS_BEARER_TOKEN_BEDROCK) is exported from secrets so the workflow uses a single, clearly named token variable.
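
A duplicate like this can be spotted mechanically. The sketch below is an illustration only, assuming env entries are written one per line in the `NAME: "${{ secrets.X }}"` form shown above; it uses a regular expression rather than a YAML parser, and the function name `secrets_with_multiple_env_names` is hypothetical.

```python
import re
from collections import defaultdict

# Matches an env entry such as `NAME: "${{ secrets.SECRET }}"` (quotes optional)
# and captures the env var name and the secret name.
ENV_SECRET = re.compile(
    r"""^\s*([A-Za-z_]\w*)\s*:\s*["']?\$\{\{\s*secrets\.([A-Za-z_]\w*)\s*\}\}"""
)


def secrets_with_multiple_env_names(workflow_text: str) -> dict[str, list[str]]:
    """Map each secret to its env var names when more than one name uses it."""
    by_secret = defaultdict(list)
    for line in workflow_text.splitlines():
        match = ENV_SECRET.match(line)
        if match:
            env_name, secret_name = match.groups()
            by_secret[secret_name].append(env_name)
    return {secret: names for secret, names in by_secret.items() if len(names) > 1}


if __name__ == "__main__":
    env_block = (
        '      AWS_BEARER_TOKEN_BEDROCK: "${{ secrets.AWS_BEARER_TOKEN_BEDROCK }}"\n'
        '      _BEARER_TOKEN_BEDROCK: "${{ secrets.AWS_BEARER_TOKEN_BEDROCK }}"\n'
    )
    for secret, names in secrets_with_multiple_env_names(env_block).items():
        print(f"{secret} is exported as: {', '.join(names)}")
```

Two names for one secret is not always a mistake, so the result is best read as a prompt for review rather than an error.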

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

.github/workflows/ci.yml (1)
13-13: ✓ Enables secrets for AWS/Bedrock CI tests.

Setting secrets: inherit allows the reusable workflow to access the credentials needed for the new Bedrock integrations. Syntax and placement are correct.

Consider being more explicit: instead of inheriting all repository secrets, explicitly pass only the AWS credentials required by the tests. This follows the principle of least privilege and reduces the blast radius if a test or workflow is compromised.
    with:
      marker: 'not integration'
-   secrets: inherit
+   secrets:
+     AWS_BEARER_TOKEN_BEDROCK: ${{ secrets.AWS_BEARER_TOKEN_BEDROCK }}
(Adjust secret names as needed for your actual AWS authentication setup.)
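
One way to find out which secrets need to be passed explicitly is to list the ones the reusable workflow actually references. The following is a rough sketch under that assumption: it scans the workflow text for `secrets.<NAME>` references and prints a `secrets:` block for the caller. The function name `explicit_secrets_block` and the indentation default are hypothetical.

```python
import re

# Captures the secret name from references such as `${{ secrets.NAME }}`.
SECRET_REF = re.compile(r"\$\{\{\s*secrets\.([A-Za-z_][A-Za-z0-9_]*)")


def explicit_secrets_block(reusable_workflow_text: str, indent: str = "    ") -> str:
    """Build a caller-side `secrets:` mapping for every secret the text references."""
    names = sorted({m.group(1) for m in SECRET_REF.finditer(reusable_workflow_text)})
    lines = [f"{indent}secrets:"]
    # Doubled braces in the f-string produce the literal `${{ ... }}` syntax.
    lines += [f"{indent}  {name}: ${{{{ secrets.{name} }}}}" for name in names]
    return "\n".join(lines)


if __name__ == "__main__":
    reusable = (
        "env:\n"
        '  AWS_BEARER_TOKEN_BEDROCK: "${{ secrets.AWS_BEARER_TOKEN_BEDROCK }}"\n'
    )
    print(explicit_secrets_block(reusable))
```

If secrets are passed by name like this, the called workflow most likely also needs to declare them under its `on.workflow_call.secrets` section; that declaration is not generated here.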

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between b5482da and c71da10.

📒 Files selected for processing (1)

.github/workflows/ci.yml (1 hunks)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)

GitHub Check: unit / Test "not integration"
GitHub Check: unit / Ruff & mypy
GitHub Check: unit / Ruff & mypy
GitHub Check: unit / Test "not integration"

commit ec32383 Author: Neevash Ramdial (Nash) <mail@neevash.dev> Date: Mon Oct 27 15:51:53 2025 -0600 mypy clean up (GetStream#130)
commit c52fe4c Author: Neevash Ramdial (Nash) <mail@neevash.dev> Date: Mon Oct 27 15:28:00 2025 -0600 remove turn keeping from example (GetStream#129)
commit e1072e8 Merge: 5bcffa3 fea101a Author: Yarik <43354956+yarikdevcom@users.noreply.github.com> Date: Mon Oct 27 14:28:05 2025 +0100 Merge pull request GetStream#106 from tjirab/feat/20251017_gh-labeler feat: Github pull request labeler
commit 5bcffa3 Merge: 406673c bfe888f Author: Thierry Schellenbach <thierry@getstream.io> Date: Sat Oct 25 10:56:27 2025 -0600 Merge pull request GetStream#119 from GetStream/fix-screensharing Fix screensharing
commit bfe888f Merge: 8019c14 406673c Author: Thierry Schellenbach <thierry@getstream.io> Date: Sat Oct 25 10:56:15 2025 -0600 Merge branch 'main' into fix-screensharing
commit 406673c Author: Stefan Blos <stefan.blos@gmail.com> Date: Sat Oct 25 03:03:10 2025 +0200 Update README (GetStream#118) * Changed README to LaRaes version * Remove arrows from table * Add table with people & projects to follow * Update images and links in README.md
commit 3316908 Author: Tommaso Barbugli <tbarbugli@gmail.com> Date: Fri Oct 24 23:48:06 2025 +0200 Simplify TTS plugin and audio utils (GetStream#123) - Simplified TTS plugin - AWS Polly TTS plugin - OpenAI TTS plugin - Improved audio utils
commit 8019c14 Author: Max Kahan <max.kahan@getstream.io> Date: Fri Oct 24 17:32:26 2025 +0100 remove video forwarder lazy init
commit ca62d37 Author: Max Kahan <max.kahan@getstream.io> Date: Thu Oct 23 16:44:03 2025 +0100 use correct codec
commit 8cf8788 Author: Max Kahan <max.kahan@getstream.io> Date: Thu Oct 23 14:27:18 2025 +0100 rename variable to fix convention
commit 33fd70d Author: Max Kahan <max.kahan@getstream.io> Date: Thu Oct 23 14:24:42 2025 +0100 unsubscribe from events
commit 3692131 Author: Max Kahan <max.kahan@getstream.io> Date: Thu Oct 23 14:19:53 2025 +0100 remove nonexistent type
commit c5f68fe Author: Max Kahan <max.kahan@getstream.io> Date: Thu Oct 23 14:10:07 2025 +0100 cleanup tests to fit style
commit 8b3c61a Author: Max Kahan <max.kahan@getstream.io> Date: Thu Oct 23 13:55:08 2025 +0100 clean up resources when track cancelled
commit d8e08cb Author: Max Kahan <max.kahan@getstream.io> Date: Thu Oct 23 13:24:55 2025 +0100 fix track republishing in agent
commit 0f8e116 Author: Max Kahan <max.kahan@getstream.io> Date: Wed Oct 22 15:37:11 2025 +0100 add tests
commit 08e6133 Author: Max Kahan <max.kahan@getstream.io> Date: Wed Oct 22 15:25:37 2025 +0100 ensure video track dimensions are an even number
commit 6a725b0 Merge: 5f001e0 5088709 Author: Thierry Schellenbach <thierry@getstream.io> Date: Thu Oct 23 15:23:58 2025 -0600 Merge pull request GetStream#122 from GetStream/cleanup_stt Cleanup STT
commit 5088709 Author: Thierry Schellenbach <thierry@getstream.io> Date: Thu Oct 23 15:23:34 2025 -0600 cleanup of stt
commit f185120 Author: Thierry Schellenbach <thierry@getstream.io> Date: Thu Oct 23 15:08:42 2025 -0600 more cleanup
commit 05ccbfd Author: Thierry Schellenbach <thierry@getstream.io> Date: Thu Oct 23 14:51:48 2025 -0600 cleanup
commit bb834ca Author: Thierry Schellenbach <thierry@getstream.io> Date: Thu Oct 23 14:28:53 2025 -0600 more cleanup for stt
commit 7a3f2d2 Author: Thierry Schellenbach <thierry@getstream.io> Date: Thu Oct 23 14:11:35 2025 -0600 more test cleanup
commit ad7f4fe Author: Thierry Schellenbach <thierry@getstream.io> Date: Thu Oct 23 14:10:57 2025 -0600 cleanup test
commit 9e50cdd Author: Thierry Schellenbach <thierry@getstream.io> Date: Thu Oct 23 14:03:45 2025 -0600 large cleanup
commit 5f001e0 Merge: 95a03e4 5d204f3 Author: Thierry Schellenbach <thierry@getstream.io> Date: Thu Oct 23 12:01:52 2025 -0600 Merge pull request GetStream#121 from GetStream/fish_stt [AI-201] Fish speech to text (partial)
commit 5d204f3 Author: Thierry Schellenbach <thierry@getstream.io> Date: Thu Oct 23 11:48:16 2025 -0600 remove ugly tests
commit ee9a241 Author: Thierry Schellenbach <thierry@getstream.io> Date: Thu Oct 23 11:46:19 2025 -0600 cleanup
commit 6eb8270 Author: Thierry Schellenbach <thierry@getstream.io> Date: Thu Oct 23 11:23:00 2025 -0600 fix 48khz support
commit 3b90548 Author: Thierry Schellenbach <thierry@getstream.io> Date: Thu Oct 23 10:59:08 2025 -0600 first attempt at fish stt, doesnt entirely work just yet
commit 95a03e4 Merge: b90c9e3 b4c0da8 Author: Tommaso Barbugli <tbarbugli@gmail.com> Date: Thu Oct 23 10:11:39 2025 +0200 Merge branch 'main' of github.com:GetStream/Vision-Agents
commit b90c9e3 Author: Tommaso Barbugli <tbarbugli@gmail.com> Date: Wed Oct 22 23:28:28 2025 +0200 remove print and double event handling
commit b4c0da8 Merge: 3d06446 a426bc2 Author: Thierry Schellenbach <thierry@getstream.io> Date: Wed Oct 22 15:08:51 2025 -0600 Merge pull request GetStream#117 from GetStream/openrouter [AI-194] Openrouter
commit a426bc2 Author: Thierry Schellenbach <thierry@getstream.io> Date: Wed Oct 22 15:03:10 2025 -0600 skip broken test
commit ba6c027 Author: Thierry Schellenbach <thierry@getstream.io> Date: Wed Oct 22 14:50:23 2025 -0600 almost working openrouter
commit 0b1c873 Author: Thierry Schellenbach <thierry@getstream.io> Date: Wed Oct 22 14:47:12 2025 -0600 almost working, just no instruction following
commit ce63233 Author: Thierry Schellenbach <thierry@getstream.io> Date: Wed Oct 22 14:35:53 2025 -0600 working memory for openai
commit 149e886 Author: Thierry Schellenbach <thierry@getstream.io> Date: Wed Oct 22 13:32:43 2025 -0600 todo
commit e0df1f6 Author: Thierry Schellenbach <thierry@getstream.io> Date: Wed Oct 22 13:20:38 2025 -0600 first pass at adding openrouter
commit 3d06446 Merge: 4eb8ef4 ef55d66 Author: Thierry Schellenbach <thierry@getstream.io> Date: Wed Oct 22 13:20:11 2025 -0600 Merge branch 'main' of github.com:GetStream/Vision-Agents
commit 4eb8ef4 Author: Thierry Schellenbach <thierry@getstream.io> Date: Wed Oct 22 13:20:01 2025 -0600 cleanup ai plugin instructions
commit ef55d66 Author: Thierry Schellenbach <thierry@getstream.io> Date: Wed Oct 22 12:54:33 2025 -0600 Add link to stash_pomichter for spatial memory
commit 9c9737f Merge: c954409 390c45b Author: Thierry Schellenbach <thierry@getstream.io> Date: Tue Oct 21 19:45:09 2025 -0600 Merge pull request GetStream#115 from GetStream/fish [AI-195] Fish support
commit 390c45b Author: Thierry Schellenbach <thierry@getstream.io> Date: Tue Oct 21 19:44:37 2025 -0600 cleannup
commit 1cc1cf1 Author: Thierry Schellenbach <thierry@getstream.io> Date: Tue Oct 21 19:42:03 2025 -0600 happy tests
commit 8163d32 Author: Thierry Schellenbach <thierry@getstream.io> Date: Tue Oct 21 19:39:21 2025 -0600 fix gemini rule following
commit ada3ac9 Author: Thierry Schellenbach <thierry@getstream.io> Date: Tue Oct 21 19:20:18 2025 -0600 fish tts
commit 61a26cf Author: Thierry Schellenbach <thierry@getstream.io> Date: Tue Oct 21 16:44:03 2025 -0600 attempt at fish
commit c954409 Merge: ab27e48 c71da10 Author: Thierry Schellenbach <thierry@getstream.io> Date: Tue Oct 21 14:18:15 2025 -0600 Merge pull request GetStream#104 from GetStream/bedrock [AI-192] - Bedrock, AWS & Nova
commit c71da10 Author: Tommaso Barbugli <tbarbugli@gmail.com> Date: Tue Oct 21 22:00:25 2025 +0200 maybe
commit b5482da Author: Tommaso Barbugli <tbarbugli@gmail.com> Date: Tue Oct 21 21:46:15 2025 +0200 debugging
commit 9a36e45 Author: Thierry Schellenbach <thierry@getstream.io> Date: Tue Oct 21 13:14:58 2025 -0600 echo environment name
commit 6893968 Author: Thierry Schellenbach <thierry@getstream.io> Date: Tue Oct 21 12:53:58 2025 -0600 more debugging
commit c35fc47 Author: Thierry Schellenbach <thierry@getstream.io> Date: Tue Oct 21 12:45:44 2025 -0600 add some debug info
commit 0d6d3fd Author: Thierry Schellenbach <thierry@getstream.io> Date: Tue Oct 21 12:03:13 2025 -0600 run test fix
commit c3a31bd Author: Thierry Schellenbach <thierry@getstream.io> Date: Tue Oct 21 11:52:25 2025 -0600 log cache hit
commit 04554ae Author: Thierry Schellenbach <thierry@getstream.io> Date: Tue Oct 21 11:48:03 2025 -0600 fix glob
commit 7da96db Author: Thierry Schellenbach <thierry@getstream.io> Date: Tue Oct 21 11:33:56 2025 -0600 mypy
commit 186053f Merge: 4b540c9 ab27e48 Author: Thierry Schellenbach <thierry@getstream.io> Date: Tue Oct 21 11:17:17 2025 -0600 happy tests
commit 4b540c9 Author: Thierry Schellenbach <thierry@getstream.io> Date: Tue Oct 21 10:20:04 2025 -0600 happy tests
commit b05a60a Author: Thierry Schellenbach <thierry@getstream.io> Date: Tue Oct 21 09:17:45 2025 -0600 add readme
commit 71affcc Author: Thierry Schellenbach <thierry@getstream.io> Date: Tue Oct 21 09:13:01 2025 -0600 rename to aws
commit d2eeba7 Author: Thierry Schellenbach <thierry@getstream.io> Date: Mon Oct 20 21:32:01 2025 -0600 ai tts instructions
commit 98a4f9d Author: Thierry Schellenbach <thierry@getstream.io> Date: Mon Oct 20 16:49:00 2025 -0600 small edits
commit ab27e48 Author: Tommaso Barbugli <tbarbugli@gmail.com> Date: Mon Oct 20 21:42:04 2025 +0200 Ensure user agent is initialized before joining the call (GetStream#113) * ensure user agent is initialized before joining the call * wip
commit 3cb339b Author: Tommaso Barbugli <tbarbugli@gmail.com> Date: Mon Oct 20 21:22:57 2025 +0200 New conversation API (GetStream#102) * trying to resurrect * test transcription events for openai * more tests for openai and gemini llm * more tests for openai and gemini llm * update py-client * wip * ruff * wip * ruff * snap * another way * another way, a better way * ruff * ruff * rev * ruffit * mypy everything * brief * tests * openai dep bump * snap - broken * nothingfuckingworks * message id * fix test * ruffit
commit cb6f00a Author: Thierry Schellenbach <thierry@getstream.io> Date: Mon Oct 20 13:18:03 2025 -0600 use qwen
commit f84b2ad Author: Thierry Schellenbach <thierry@getstream.io> Date: Mon Oct 20 13:02:24 2025 -0600 fix tests
commit e61acca Author: Thierry Schellenbach <thierry@getstream.io> Date: Mon Oct 20 12:50:40 2025 -0600 testing and linting
commit 5f4d353 Author: Thierry Schellenbach <thierry@getstream.io> Date: Mon Oct 20 12:34:14 2025 -0600 working
commit c2a15a9 Merge: a310771 1025a42 Author: Thierry Schellenbach <thierry@getstream.io> Date: Mon Oct 20 11:40:00 2025 -0600 Merge branch 'main' of github.com:GetStream/Vision-Agents into bedrock
commit a310771 Author: Thierry Schellenbach <thierry@getstream.io> Date: Mon Oct 20 11:39:48 2025 -0600 wip
commit b4370f4 Author: Thierry Schellenbach <thierry@getstream.io> Date: Mon Oct 20 11:22:43 2025 -0600 something isn't quite working
commit 2dac975 Author: Thierry Schellenbach <thierry@getstream.io> Date: Mon Oct 20 10:30:04 2025 -0600 add the examples
commit 6885289 Author: Thierry Schellenbach <thierry@getstream.io> Date: Sun Oct 19 20:19:42 2025 -0600 ai realtime docs
commit a0fa3cc Author: Thierry Schellenbach <thierry@getstream.io> Date: Sun Oct 19 18:48:06 2025 -0600 wip
commit b914fc3 Author: Thierry Schellenbach <thierry@getstream.io> Date: Sun Oct 19 18:40:22 2025 -0600 fix ai llm
commit b5b00a7 Author: Thierry Schellenbach <thierry@getstream.io> Date: Sun Oct 19 17:11:26 2025 -0600 work audio input
commit ac72260 Author: Thierry Schellenbach <thierry@getstream.io> Date: Sun Oct 19 16:47:19 2025 -0600 fix model id
commit 2b5863c Author: Thierry Schellenbach <thierry@getstream.io> Date: Sun Oct 19 16:32:54 2025 -0600 wip on bedrock
commit 8bb4162 Author: Thierry Schellenbach <thierry@getstream.io> Date: Fri Oct 17 15:22:03 2025 -0600 next up the connect method
commit 7a21e4e Author: Thierry Schellenbach <thierry@getstream.io> Date: Fri Oct 17 14:12:00 2025 -0600 nova progress
commit 16e8ba0 Author: Thierry Schellenbach <thierry@getstream.io> Date: Fri Oct 17 13:16:00 2025 -0600 docs for bedrock nova
commit 1025a42 Author: Bart Schuijt <schuijt.bart@gmail.com> Date: Fri Oct 17 21:05:45 2025 +0200 fix: Update .env.example for Gemini Live (GetStream#108)
commit e12112d Author: Thierry Schellenbach <thierry@getstream.io> Date: Fri Oct 17 11:49:07 2025 -0600 wip
commit fea101a Author: Bart Schuijt <schuijt.bart@gmail.com> Date: Fri Oct 17 09:25:55 2025 +0200 workflow file update
commit bb2d74c Author: Bart Schuijt <schuijt.bart@gmail.com> Date: Fri Oct 17 09:22:33 2025 +0200 initial commit
commit d2853cd Author: Thierry Schellenbach <thierry@getstream.io> Date: Thu Oct 16 19:44:59 2025 -0600 always remember pep 420
commit 30a8eca Author: Thierry Schellenbach <thierry@getstream.io> Date: Thu Oct 16 19:36:58 2025 -0600 start of bedrock branch
commit fc032bf Author: Tommaso Barbugli <tbarbugli@gmail.com> Date: Thu Oct 16 09:17:42 2025 +0200 Remove cli handler from examples (GetStream#101)
commit 39a821d Author: Dan Gusev <dangusev92@gmail.com> Date: Tue Oct 14 12:20:41 2025 +0200 Update Deepgram plugin to use SDK v5.0.0 (GetStream#98) * Update Deepgram plugin to use SDK v5.0.0 * Merge test_realtime and test_stt and update the remaining tests * Make deepgram.STT.start() idempotent * Clean up unused import * Use uv as the default package manager > pip --------- Co-authored-by: Neevash Ramdial (Nash) <mail@neevash.dev>
commit 2013be5 Author: Tommaso Barbugli <tbarbugli@gmail.com> Date: Mon Oct 13 16:57:37 2025 +0200 ensure chat works with default types (GetStream#99)

start of bedrock branch

30a8eca

tschellenbach added 15 commits October 16, 2025 19:44

always remember pep 420

d2853cd

wip

e12112d

docs for bedrock nova

16e8ba0

nova progress

7a21e4e

next up the connect method

8bb4162

wip on bedrock

2b5863c

fix model id

ac72260

work audio input

b5b00a7

fix ai llm

b914fc3

wip

a0fa3cc

ai realtime docs

6885289

add the examples

2dac975

something isn't quite working

b4370f4

wip

a310771

Merge branch 'main' of github.com:GetStream/Vision-Agents into bedrock

c2a15a9

tschellenbach changed the title ~~start of bedrock branch~~ Bedrock, AWS & Nova Oct 20, 2025

tschellenbach added 8 commits October 20, 2025 12:34

working

5f4d353

testing and linting

e61acca

fix tests

f84b2ad

use qwen

cb6f00a

small edits

98a4f9d

ai tts instructions

d2eeba7

rename to aws

71affcc

add readme

b05a60a

tschellenbach changed the title ~~Bedrock, AWS & Nova~~ [AI-192] - Bedrock, AWS & Nova Oct 21, 2025

tschellenbach marked this pull request as ready for review October 21, 2025 16:19

happy tests

4b540c9

coderabbitai bot reviewed Oct 21, 2025

happy tests

186053f

coderabbitai bot reviewed Oct 21, 2025

mypy

7da96db

coderabbitai bot reviewed Oct 21, 2025

tschellenbach and others added 7 commits October 21, 2025 11:48

fix glob

04554ae

log cache hit

c3a31bd

run test fix

0d6d3fd

add some debug info

c35fc47

more debugging

6893968

echo environment name

9a36e45

debugging

b5482da

coderabbitai bot reviewed Oct 21, 2025

maybe

c71da10

coderabbitai bot reviewed Oct 21, 2025

tschellenbach merged commit c954409 into main Oct 21, 2025
5 checks passed

tschellenbach deleted the bedrock branch October 21, 2025 20:18

coderabbitai bot mentioned this pull request Oct 27, 2025

WIP - Vogent + New Smart TURN + Audio utils usage #128

Merged

This was referenced Oct 31, 2025

feat: add AWS Bedrock function calling implementation #120

Merged

Add more metrics #143

Open
