Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds an Inworld Realtime (WebRTC) integration: new Realtime LLM class, RTCManager for WebRTC signaling/transport, tool-calling utilities, examples and tests, README/CHANGELOG updates, and new runtime dependencies for aiortc and OpenAI realtime support. (≤50 words) Changes
Sequence DiagramsequenceDiagram
participant Agent as Agent
participant Realtime as Realtime
participant RTCManager as RTCManager
participant InworldAPI as Inworld API
participant WebRTC as WebRTC
Agent->>Realtime: create(model, voice, api_key, instructions)
Realtime->>Realtime: build realtime_session config
Realtime->>RTCManager: instantiate with session params
Agent->>Realtime: connect()
Realtime->>RTCManager: connect()
RTCManager->>InworldAPI: fetch ICE/TURN servers
RTCManager->>WebRTC: create RTCPeerConnection & data channel
RTCManager->>InworldAPI: post SDP offer → receive SDP answer
WebRTC-->>RTCManager: data channel open
RTCManager-->>Realtime: callback(events/audio)
Realtime-->>Agent: emit connected event
Agent->>Realtime: simple_response("Hello")
Realtime->>RTCManager: send_text("Hello")
RTCManager->>WebRTC: send conversation.item.create + response.create
WebRTC-->>RTCManager: incoming audio frames & events
RTCManager-->>Realtime: dispatch event callback
Realtime->>Realtime: handle event (transcript/func-call/etc.)
Realtime->>Realtime: if function call → stream args, run tool
Realtime->>RTCManager: send_event(function_call_output)
RTCManager->>WebRTC: deliver via data channel
Agent->>Realtime: close()
Realtime->>RTCManager: close()
RTCManager->>WebRTC: close peer connection
Realtime-->>Agent: emit disconnected event
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 10
🧹 Nitpick comments (3)
plugins/inworld/vision_agents/plugins/inworld/tool_utils.py (1)
8-15: Replace old typing aliases with modern Python annotations throughout this helper.Lines 8, 14-15, 34, and 56 still use
Any,Dict, andListfrom the typing module. Update these to built-in generics with concrete types:
- Remove
from typing import Any, Dict, List- Change
List[ToolSchema]→list[ToolSchema]- Change
List[Dict[str, Any]]→list[dict[str, object]]- Change
Dict[str, Any]→dict[str, object]- Update
parse_tool_argumentssignature:dict→dict[str, object]🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@plugins/inworld/vision_agents/plugins/inworld/tool_utils.py` around lines 8 - 15, Replace old typing aliases with built-in generics: remove the line importing Any, Dict, List from typing; update the signature of convert_tools_to_openai_format to use list[ToolSchema] and its return type to list[dict[str, object]]; update any intermediate variables or annotations that use Dict[str, Any] to dict[str, object]; update parse_tool_arguments' parameter and return annotations from plain dict to dict[str, object]; ensure references to NormalizedToolCallItem and ToolSchema remain unchanged (e.g., convert_tools_to_openai_format, parse_tool_arguments, NormalizedToolCallItem, ToolSchema).plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py (2)
34-34: Avoidload_dotenv()at module import.Calling
load_dotenv()on import mutates process env as a side effect of merely importing the plugin, which surprises library consumers and makes test isolation awkward. Let the application entrypoint (or example script) own that, or move it behind an explicitconfigure()call. The example inplugins/inworld/example/inworld_realtime_example.pyis the right place for it.♻️ Proposed change
import json import logging import os from typing import Any, Optional import aiortc.mediastreams import httpx -from dotenv import load_dotenv from getstream.video.rtc.track_util import PcmData ... -load_dotenv() - logger = logging.getLogger(__name__)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py` at line 34, Remove the top-level load_dotenv() call so importing the module has no side effects; instead add an explicit configure function (e.g., configure_env() or initialize()) in this module that calls load_dotenv() when the application wants to initialize env, and update callers/examples (such as the example in plugins/inworld/example/inworld_realtime_example.py) to call that configure function before using any other symbols from this module; ensure all references to module-level environment access in functions like any realtime client constructors read from os.environ at runtime so they work after configure_env() is run.
38-54: Fold the trailing exceptions into_IGNORED_EVENT_TYPES.
"session.updated"and"rate_limits.updated"are tacked on asor-chained string compares at line 283 instead of living in the frozenset a few lines above. Moving them keeps dispatch O(1) and the ignore list in one spot.♻️ Proposed change
_IGNORED_EVENT_TYPES = frozenset( { "conversation.item.created", ... "input_audio_buffer.speech_stopped", "input_audio_buffer.committed", + "session.updated", + "rate_limits.updated", } ) ... - elif et in _IGNORED_EVENT_TYPES or et == "session.updated" or et == "rate_limits.updated": + elif et in _IGNORED_EVENT_TYPES: passAlso applies to: 283-284
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py` around lines 38 - 54, Add "session.updated" and "rate_limits.updated" to the existing _IGNORED_EVENT_TYPES frozenset and remove the separate or-chained string comparisons in the event dispatch logic that currently checks those two values (the check near the dispatch/event handling code in inworld_realtime.py). This keeps all ignored event types centralized in _IGNORED_EVENT_TYPES and allows the dispatch membership test to remain a single O(1) containment check against _IGNORED_EVENT_TYPES.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@CHANGELOG.md`:
- Around line 65-67: Update the changelog entry for "Inworld Realtime plugin
(WebRTC)" / "inworld.Realtime" to include the PR reference for this user-facing
feature by appending the PR number or link (e.g., " (PR #<number>)" or "
([#<number>](link))") to the end of the entry; ensure it remains under the
appropriate version heading and "New Features" section and uses the same
phrasing as the existing entry so only the PR reference is added.
In `@plugins/inworld/README.md`:
- Around line 49-61: The example omits the required agent_user parameter for
Agent.__init__, causing a TypeError; update the Agent(...) call in the README
example to pass a valid agent_user (e.g., a user id or user object) when
constructing Agent so the inworld.Realtime usage (inworld.Realtime) works;
ensure the parameter name matches the constructor signature (agent_user) and
include a brief placeholder value in the snippet so the example runs as-is.
In `@plugins/inworld/tests/test_realtime.py`:
- Around line 1-4: Add a guard to skip these live integration tests when the
INWORLD_API_KEY is not set: import os and pytest at top of
plugins/inworld/tests/test_realtime.py, then in the session/test fixture used
for realtime integration (the fixture defined around lines ~152-154 in this
file) check if os.environ.get("INWORLD_API_KEY") is falsy and call
pytest.skip("INWORLD_API_KEY not set - skipping live integration tests") from
the fixture setup path so tests are cleanly skipped instead of raising during
setup.
In `@plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py`:
- Line 138: Replace the type ignore by ensuring tools_for_inworld is typed to
match the RealtimeSessionCreateRequestParam TypedDict's "tools" field: update
the convert_tools_to_openai_format signature to return the concrete list item
type expected by RealtimeSessionCreateRequestParam (or explicitly wrap
tools_for_inworld with typing.cast to that list type) and then assign it to
self.realtime_session["tools"] without a "# type: ignore"; reference the
convert_tools_to_openai_format function and the
RealtimeSessionCreateRequestParam TypedDict so the assignment
self.realtime_session["tools"] = tools_for_inworld type-checks correctly.
- Around line 271-282: Replace the raise in _handle_inworld_event for et ==
"response.done" with emitting an error event and returning: detect
response.get("status") == "failed" inside the response.done branch and call
self._emit_error_event(InworldRealtimeError(json.dumps(response)),
context="response_failed") (mirroring the "error" branch), then return so the
dispatcher does not raise from the background task; keep the rest of the
response.done handling intact if any.
- Around line 102-115: The code currently always overwrites caller-supplied
realtime_session fields (realtime_session["model"] and
realtime_session["audio"]["output"]["voice"]) with default kwargs; change the
logic in the RealtimeSessionCreateRequestParam handling so you only set
realtime_session["model"] if the caller explicitly passed a model kwarg (use a
sentinel or check if model is not None) and only set
realtime_session["audio"]["output"]["voice"] if the caller explicitly passed a
voice kwarg (or if audio/output is missing and you need to create defaults). In
practice, update the block that assigns self.realtime_session["model"] and the
block that assigns self.realtime_session["audio"]["output"]["voice"] to first
honor existing keys on self.realtime_session (and only overwrite when the
corresponding kwarg was explicitly provided), while still creating missing
audio/output objects when realtime_session lacks them using
RealtimeAudioConfigParam/RealtimeAudioConfigOutputParam.
In `@plugins/inworld/vision_agents/plugins/inworld/rtc_manager.py`:
- Around line 196-203: The on_message handler currently calls
asyncio.create_task(self._handle_event(data)) without keeping a strong
reference, so tasks may be garbage-collected mid-flight; modify the RTCManager
(or the class containing on_message) to maintain a set attribute (e.g.,
self._pending_tasks) and when creating the task from on_message add it to that
set, attach a done callback that removes the task from self._pending_tasks, and
ensure the callback also logs exceptions from the task if needed; specifically
update the on_message closure that calls asyncio.create_task and the class
initializer to create and manage self._pending_tasks for tasks spawned from
self._handle_event.
- Around line 243-259: The close() method fires-and-forgets the peer-connection
close via asyncio.create_task(_safe_close()) causing RTCManager.close() to
return before pc.close() finishes; change it to await pc.close() directly
(inside close()) and handle expected teardown exceptions explicitly (catch
asyncio.CancelledError, ConnectionError and aiortc/InvalidStateError-like
exceptions) rather than silently spawning a background task; if non-blocking
behavior is required instead, provide a separate helper that returns the task
and store/await it from callers (e.g., in inworld_realtime.Realtime.close()) so
the disconnect event is emitted only after the task completes.
In `@plugins/inworld/vision_agents/plugins/inworld/tool_utils.py`:
- Around line 27-32: The code currently assigns params =
t.get("parameters_schema") or t.get("parameters") or {} then calls
params.setdefault(...), which mutates the caller/registry-owned schema; instead,
create a copy of the resolved schema before mutating (e.g., shallow or deep
copy) so the original t["parameters_schema"] / t["parameters"] is untouched,
then perform params.setdefault("type", "object"),
params.setdefault("properties", {}), and
params.setdefault("additionalProperties", False) on that copy; update any
downstream use to reference the new copied variable rather than mutating t in
place.
- Around line 56-65: parse_tool_arguments currently swallows malformed JSON and
non-object results by returning {} which can cause tools to run with missing or
incorrect arguments; change it so that when args is a dict it is returned, when
args is falsy keep returning {} (if you want to preserve that behavior), but if
args is a string then attempt json.loads and if json.loads raises
json.JSONDecodeError or returns a value that is not a dict raise a ValueError
(or a specific exception) describing "tool arguments must be a JSON object";
update parse_tool_arguments to validate the parsed value is a dict and raise on
errors instead of returning {}.
---
Nitpick comments:
In `@plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py`:
- Line 34: Remove the top-level load_dotenv() call so importing the module has
no side effects; instead add an explicit configure function (e.g.,
configure_env() or initialize()) in this module that calls load_dotenv() when
the application wants to initialize env, and update callers/examples (such as
the example in plugins/inworld/example/inworld_realtime_example.py) to call that
configure function before using any other symbols from this module; ensure all
references to module-level environment access in functions like any realtime
client constructors read from os.environ at runtime so they work after
configure_env() is run.
- Around line 38-54: Add "session.updated" and "rate_limits.updated" to the
existing _IGNORED_EVENT_TYPES frozenset and remove the separate or-chained
string comparisons in the event dispatch logic that currently checks those two
values (the check near the dispatch/event handling code in inworld_realtime.py).
This keeps all ignored event types centralized in _IGNORED_EVENT_TYPES and
allows the dispatch membership test to remain a single O(1) containment check
against _IGNORED_EVENT_TYPES.
In `@plugins/inworld/vision_agents/plugins/inworld/tool_utils.py`:
- Around line 8-15: Replace old typing aliases with built-in generics: remove
the line importing Any, Dict, List from typing; update the signature of
convert_tools_to_openai_format to use list[ToolSchema] and its return type to
list[dict[str, object]]; update any intermediate variables or annotations that
use Dict[str, Any] to dict[str, object]; update parse_tool_arguments' parameter
and return annotations from plain dict to dict[str, object]; ensure references
to NormalizedToolCallItem and ToolSchema remain unchanged (e.g.,
convert_tools_to_openai_format, parse_tool_arguments, NormalizedToolCallItem,
ToolSchema).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 47653894-e65b-4370-a2e3-6981f67c5e8b
⛔ Files ignored due to path filters (1)
uv.lockis excluded by!**/*.lock
📒 Files selected for processing (10)
CHANGELOG.mdREADME.mdplugins/inworld/README.mdplugins/inworld/example/inworld_realtime_example.pyplugins/inworld/pyproject.tomlplugins/inworld/tests/test_realtime.pyplugins/inworld/vision_agents/plugins/inworld/__init__.pyplugins/inworld/vision_agents/plugins/inworld/inworld_realtime.pyplugins/inworld/vision_agents/plugins/inworld/rtc_manager.pyplugins/inworld/vision_agents/plugins/inworld/tool_utils.py
| import asyncio | ||
|
|
||
| import pytest | ||
| from dotenv import load_dotenv |
There was a problem hiding this comment.
Skip live integration tests when INWORLD_API_KEY is absent.
The fixture currently raises during setup when the secret is unavailable, which makes integration-enabled CI fail instead of reporting a clean skip.
🧪 Proposed fixture guard
import asyncio
+import os
import pytest
from dotenv import load_dotenv
@@
`@pytest.fixture`
async def live_realtime(self):
+ if not os.getenv("INWORLD_API_KEY"):
+ pytest.skip("INWORLD_API_KEY is required for Inworld Realtime integration tests")
rt = inworld.Realtime()
try:
yield rtAlso applies to: 152-154
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@plugins/inworld/tests/test_realtime.py` around lines 1 - 4, Add a guard to
skip these live integration tests when the INWORLD_API_KEY is not set: import os
and pytest at top of plugins/inworld/tests/test_realtime.py, then in the
session/test fixture used for realtime integration (the fixture defined around
lines ~152-154 in this file) check if os.environ.get("INWORLD_API_KEY") is falsy
and call pytest.skip("INWORLD_API_KEY not set - skipping live integration
tests") from the fixture setup path so tests are cleanly skipped instead of
raising during setup.
| tools_for_inworld = convert_tools_to_openai_format( | ||
| available_tools, for_realtime=True | ||
| ) | ||
| self.realtime_session["tools"] = tools_for_inworld # type: ignore[typeddict-item] |
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major
Drop the # type: ignore; use a properly typed assignment.
Repo guidelines explicitly say to avoid # type: ignore. RealtimeSessionCreateRequestParam is a TypedDict with a tools key — cast the list to the expected item type rather than silencing the checker. If convert_tools_to_openai_format returns a concrete schema type, annotate its return accordingly so the assignment type-checks without a suppression.
As per coding guidelines: "Avoid # type: ignore comments."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py` at line
138, Replace the type ignore by ensuring tools_for_inworld is typed to match the
RealtimeSessionCreateRequestParam TypedDict's "tools" field: update the
convert_tools_to_openai_format signature to return the concrete list item type
expected by RealtimeSessionCreateRequestParam (or explicitly wrap
tools_for_inworld with typing.cast to that list type) and then assign it to
self.realtime_session["tools"] without a "# type: ignore"; reference the
convert_tools_to_openai_format function and the
RealtimeSessionCreateRequestParam TypedDict so the assignment
self.realtime_session["tools"] = tools_for_inworld type-checks correctly.
| def parse_tool_arguments(args: str | dict) -> dict: | ||
| """Parse tool arguments from string or dict.""" | ||
| if isinstance(args, dict): | ||
| return args | ||
| if not args: | ||
| return {} | ||
| try: | ||
| return json.loads(args) | ||
| except json.JSONDecodeError: | ||
| return {} |
There was a problem hiding this comment.
Reject malformed or non-object tool arguments instead of defaulting to {}.
Returning {} for invalid JSON can execute a tool with omitted arguments, and json.loads() may return a list/string while the function promises a dict.
🛠️ Proposed fix
-def parse_tool_arguments(args: str | dict) -> dict:
+def parse_tool_arguments(args: str | dict[str, object]) -> dict[str, object]:
"""Parse tool arguments from string or dict."""
if isinstance(args, dict):
return args
if not args:
return {}
try:
- return json.loads(args)
- except json.JSONDecodeError:
- return {}
+ parsed = json.loads(args)
+ except json.JSONDecodeError as exc:
+ raise ValueError("Tool arguments must be valid JSON.") from exc
+ if not isinstance(parsed, dict):
+ raise ValueError("Tool arguments must decode to a JSON object.")
+ return parsed🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@plugins/inworld/vision_agents/plugins/inworld/tool_utils.py` around lines 56
- 65, parse_tool_arguments currently swallows malformed JSON and non-object
results by returning {} which can cause tools to run with missing or incorrect
arguments; change it so that when args is a dict it is returned, when args is
falsy keep returning {} (if you want to preserve that behavior), but if args is
a string then attempt json.loads and if json.loads raises json.JSONDecodeError
or returns a value that is not a dict raise a ValueError (or a specific
exception) describing "tool arguments must be a JSON object"; update
parse_tool_arguments to validate the parsed value is a dict and raise on errors
instead of returning {}.
There was a problem hiding this comment.
Actionable comments posted: 4
♻️ Duplicate comments (1)
plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py (1)
109-122:⚠️ Potential issue | 🟡 MinorPreserve caller-provided
realtime_sessionmodel and voice.This still overwrites advanced session config with default kwargs, so a caller-provided
modeloraudio.output.voiceis silently lost unless they also pass matching constructor args. Consider usingNone/sentinel defaults and only applying kwargs when explicitly provided.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py` around lines 109 - 122, The constructor currently forces caller-provided session fields by unconditionally setting self.realtime_session["model"] = model and self.realtime_session["audio"]["output"]["voice"] = voice; change this to only apply the provided model and voice when those kwargs are not None (or use a sentinel) and only if the keys are missing on the caller-supplied realtime_session. Specifically, when initializing self.realtime_session (and when creating defaults like RealtimeAudioConfigParam/RealtimeAudioConfigOutputParam), check for the presence of "model" and the nested "audio"->"output"->"voice" before assigning; if model or voice parameters are None, preserve whatever the caller passed, and only set defaults when keys are absent. Ensure checks use the existing self.realtime_session dict/object access patterns so you don't clobber nested caller-provided config.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@plugins/inworld/example/inworld_realtime_tools_example.py`:
- Around line 51-52: get_time currently returns a naive ISO timestamp; make it
timezone-aware by calling datetime.datetime.now with an explicit tz (e.g.,
datetime.timezone.utc or another desired tz) so get_time() returns an ISO string
that includes the UTC offset (use datetime.datetime.now(tz=...) and then
.isoformat(timespec="seconds")).
In `@plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py`:
- Around line 149-168: The code logs when the Inworld plan disallows tool
calling but fails to remove previously-set tools from the session; update the
else branch in the block using
get_available_functions/convert_tools_to_openai_format and
_plan_supports_tool_calling to explicitly drop any existing tools from
self.realtime_session (e.g., pop/"del" the "tools" key) before returning so
stale tool config is not sent in session.update, and keep the warning log
intact.
- Around line 363-369: The current logging in the tool execution path logs raw
arguments and full error details; change it so that logger.info only records the
tool name (from name) and does not include arguments or payloads, and change the
logger.error call in the _run_one_tool handling to record a generic failure
message (e.g., "Tool call %s failed") without printing the raw error details; if
you need to preserve the full arguments or exception for troubleshooting, emit
them at debug level (logger.debug) or write them to a secure audit sink instead,
and keep response_data as {"error": "tool call failed"} or similarly generic to
avoid leaking PII/secrets.
- Around line 470-478: The update_tools() path currently sends tools without
applying the same preflight that connect() uses, which can re-add disallowed
tools and trigger the server-error state from _plan_supports_tool_calling();
modify update_tools() to run the same preflight/validation before converting and
sending tools (reuse _plan_supports_tool_calling() or the same validation logic
used in connect()), and if the preflight fails set the session to the
server-error state consistently (the same behavior as in connect()), then only
call convert_tools_to_openai_format(...) and await self.rtc.send_event(...) when
the plan check passes.
---
Duplicate comments:
In `@plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py`:
- Around line 109-122: The constructor currently forces caller-provided session
fields by unconditionally setting self.realtime_session["model"] = model and
self.realtime_session["audio"]["output"]["voice"] = voice; change this to only
apply the provided model and voice when those kwargs are not None (or use a
sentinel) and only if the keys are missing on the caller-supplied
realtime_session. Specifically, when initializing self.realtime_session (and
when creating defaults like
RealtimeAudioConfigParam/RealtimeAudioConfigOutputParam), check for the presence
of "model" and the nested "audio"->"output"->"voice" before assigning; if model
or voice parameters are None, preserve whatever the caller passed, and only set
defaults when keys are absent. Ensure checks use the existing
self.realtime_session dict/object access patterns so you don't clobber nested
caller-provided config.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: e669cbd3-6c04-494e-8524-698bd3e5cb73
📒 Files selected for processing (3)
plugins/inworld/example/inworld_realtime_example.pyplugins/inworld/example/inworld_realtime_tools_example.pyplugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py
| logger.info("Executing tool call: %s with args: %s", name, arguments) | ||
|
|
||
| tc, result, error = await self._run_one_tool(tool_call, timeout_s=30) | ||
|
|
||
| if error: | ||
| response_data: dict[str, Any] = {"error": str(error)} | ||
| logger.error("Tool call %s failed: %s", name, error) |
There was a problem hiding this comment.
Avoid logging raw tool arguments and error details at info/error level.
Tool arguments can contain user PII or secrets. Log the tool name and keep payload details out of default logs.
🛡️ Proposed fix
- logger.info("Executing tool call: %s with args: %s", name, arguments)
+ logger.info("Executing tool call: %s", name)
tc, result, error = await self._run_one_tool(tool_call, timeout_s=30)
if error:
response_data: dict[str, Any] = {"error": str(error)}
- logger.error("Tool call %s failed: %s", name, error)
+ logger.error("Tool call %s failed", name)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py` around
lines 363 - 369, The current logging in the tool execution path logs raw
arguments and full error details; change it so that logger.info only records the
tool name (from name) and does not include arguments or payloads, and change the
logger.error call in the _run_one_tool handling to record a generic failure
message (e.g., "Tool call %s failed") without printing the raw error details; if
you need to preserve the full arguments or exception for troubleshooting, emit
them at debug level (logger.debug) or write them to a secure audit sink instead,
and keep response_data as {"error": "tool call failed"} or similarly generic to
avoid leaking PII/secrets.
| tools_for_inworld = convert_tools_to_openai_format( | ||
| available_tools, for_realtime=True | ||
| ) | ||
| await self.rtc.send_event( | ||
| { | ||
| "type": "session.update", | ||
| "session": {"tools": tools_for_inworld}, | ||
| } | ||
| ) |
There was a problem hiding this comment.
Apply the tool-calling preflight in update_tools() too.
connect() may drop tools for restricted plans, but update_tools() can add them back later and put the session into the server-error state described in _plan_supports_tool_calling().
🛡️ Proposed fix
tools_for_inworld = convert_tools_to_openai_format(
available_tools, for_realtime=True
)
+ if not await self._plan_supports_tool_calling():
+ self.realtime_session.pop("tools", None)
+ await self.rtc.send_event(
+ {
+ "type": "session.update",
+ "session": {"tools": []},
+ }
+ )
+ logger.warning(
+ "Inworld plan does not permit tool calling; not updating %d tools",
+ len(tools_for_inworld),
+ )
+ return
+
await self.rtc.send_event(
{
"type": "session.update",
"session": {"tools": tools_for_inworld},
}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py` around
lines 470 - 478, The update_tools() path currently sends tools without applying
the same preflight that connect() uses, which can re-add disallowed tools and
trigger the server-error state from _plan_supports_tool_calling(); modify
update_tools() to run the same preflight/validation before converting and
sending tools (reuse _plan_supports_tool_calling() or the same validation logic
used in connect()), and if the preflight fails set the session to the
server-error state consistently (the same behavior as in connect()), then only
call convert_tools_to_openai_format(...) and await self.rtc.send_event(...) when
the plan check passes.
- Fix ruff format on inworld_realtime.py and tools example - Remove top-level load_dotenv() side effect from module import - Consolidate ignored event types into _IGNORED_EVENT_TYPES frozenset - Preserve caller-provided model/voice in realtime_session (use setdefault) - Drop stale tools from session when plan disallows tool calling - Add preflight check to update_tools() matching connect() behavior - Keep strong references to asyncio tasks in RTCManager (prevent GC) - Await pc.close() in RTCManager.close() instead of fire-and-forget - Shallow-copy tool params in convert_tools_to_openai_format to avoid mutation - Modernize type annotations (remove Dict/List/Any, use builtins and unions) - Reduce logging of raw arguments/error details in tool execution - Add agent_user to README example snippet - Add PR reference (#502) to CHANGELOG - Make get_time timezone-aware in tools example Made-with: Cursor
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (1)
plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py (1)
153-160: 🛠️ Refactor suggestion | 🟠 Major
# type: ignoresuppressions still present — replace with a typed cast.Both
self.realtime_session["tools"] = tools_for_inworld # type: ignore[typeddict-item]on line 153 andself.realtime_session.pop("tools", None) # type: ignore[misc]on line 160 still silence the type checker. As per coding guidelines: "Avoid# type: ignorecomments." Tightenconvert_tools_to_openai_format's return type (ortyping.castthe assignment to the concreteRealtimeSessionCreateRequestParam["tools"]item type) so the assignment and pop type-check cleanly.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py` around lines 153 - 160, The assignment and pop use "# type: ignore" to bypass typing; instead update convert_tools_to_openai_format to return the exact RealtimeSessionCreateRequestParam["tools"] type (or wrap tools_for_inworld with typing.cast(RealtimeSessionCreateRequestParam["tools"], tools_for_inworld)) so the line self.realtime_session["tools"] = tools_for_inworld type-checks without ignores, and change the pop to use a typed deletion such as: cast the session to the proper TypedDict before popping or delete the key with a typed helper so the self.realtime_session.pop("tools", None) no longer requires "# type: ignore".
🧹 Nitpick comments (1)
plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py (1)
125-125: Type_pending_tool_callswith aTypedDictto avoidobject-typed list mutation.
self._pending_tool_calls: dict[str, dict[str, object]]meanspending["argument_parts"]isobject, yet line 293 calls.append(delta)on it and line 350 does"".join(pending["argument_parts"]). This only works becauseobjectgets narrowed at runtime — static checkers will reject it, and any future attempt to drop the remaining# type: ignores will stall here too.♻️ Proposed refactor
+from typing import TypedDict + + +class _PendingToolCall(TypedDict): + call_id: str | None + name: str + argument_parts: list[str] @@ - self._pending_tool_calls: dict[str, dict[str, object]] = {} + self._pending_tool_calls: dict[str, _PendingToolCall] = {}Then
pending["argument_parts"]/.append(...)/"".join(...)all check without casts.Also applies to: 278-300, 342-350
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py` at line 125, Define a TypedDict (e.g., PendingToolCall) describing the exact shape of each pending call (including argument_parts: list[str], any other keys like args/metadata with their proper types) and replace the current untyped map annotation on self._pending_tool_calls in inworld_realtime.py with dict[str, PendingToolCall]; update code paths that mutate pending["argument_parts"] (the .append(delta) call) and later read it (the "".join(pending["argument_parts"]) call) to rely on the TypedDict so static checkers accept list operations without casts or # type: ignore.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py`:
- Around line 103-110: The constructor currently assigns self.model = model and
self.voice = voice which can diverge from a caller-provided realtime_session;
change the assignment to derive the effective values from the session after you
call self.realtime_session.setdefault("model", model) (e.g. self.model =
self.realtime_session["model"]) and read the voice from the session's nested
audio.output (e.g. self.voice = self.realtime_session.get("audio",
{}).get("output", {}).get("voice", voice)); update any other places in the
constructor that set/assume model/voice from kwargs (the blocks around the
earlier assignments and the later uses that call _emit_connected_event and
_plan_supports_tool_calling) so they all use the effective session-derived
self.model/self.voice instead of the original kwarg defaults.
- Around line 395-448: The _plan_supports_tool_calling probe is called
repeatedly and can silently miss reworded error messages; cache its boolean
result on the instance (e.g., add self._plan_tool_support: bool | None = None in
__init__), return the cached value unless api key (self._api_key) or model
(self.model) changed (invalidate cache when those change), and on a 400 response
inspect resp.json() for structured fields like "error.code" or "error.type" in
addition to the message; if none of those indicate plan restriction, emit a
processLogger.warning (or self.logger.warning) including the full
resp.json()/resp.text so operators see unexpected 400s, and only fall back to
True when the checks explicitly indicate tools are allowed or when a
network/parse error occurs.
---
Duplicate comments:
In `@plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py`:
- Around line 153-160: The assignment and pop use "# type: ignore" to bypass
typing; instead update convert_tools_to_openai_format to return the exact
RealtimeSessionCreateRequestParam["tools"] type (or wrap tools_for_inworld with
typing.cast(RealtimeSessionCreateRequestParam["tools"], tools_for_inworld)) so
the line self.realtime_session["tools"] = tools_for_inworld type-checks without
ignores, and change the pop to use a typed deletion such as: cast the session to
the proper TypedDict before popping or delete the key with a typed helper so the
self.realtime_session.pop("tools", None) no longer requires "# type: ignore".
---
Nitpick comments:
In `@plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py`:
- Line 125: Define a TypedDict (e.g., PendingToolCall) describing the exact
shape of each pending call (including argument_parts: list[str], any other keys
like args/metadata with their proper types) and replace the current untyped map
annotation on self._pending_tool_calls in inworld_realtime.py with dict[str,
PendingToolCall]; update code paths that mutate pending["argument_parts"] (the
.append(delta) call) and later read it (the "".join(pending["argument_parts"])
call) to rely on the TypedDict so static checkers accept list operations without
casts or # type: ignore.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 69bc7fb5-ab05-4201-a932-7d4ca9494bb7
📒 Files selected for processing (6)
CHANGELOG.mdplugins/inworld/README.mdplugins/inworld/example/inworld_realtime_tools_example.pyplugins/inworld/vision_agents/plugins/inworld/inworld_realtime.pyplugins/inworld/vision_agents/plugins/inworld/rtc_manager.pyplugins/inworld/vision_agents/plugins/inworld/tool_utils.py
✅ Files skipped from review due to trivial changes (3)
- CHANGELOG.md
- plugins/inworld/example/inworld_realtime_tools_example.py
- plugins/inworld/vision_agents/plugins/inworld/rtc_manager.py
🚧 Files skipped from review as they are similar to previous changes (2)
- plugins/inworld/README.md
- plugins/inworld/vision_agents/plugins/inworld/tool_utils.py
| async def _plan_supports_tool_calling(self) -> bool: | ||
| """Probe whether the account's plan permits tool calling. | ||
|
|
||
| Inworld's Realtime API silently returns ``server_error`` on every | ||
| response when a tool-restricted plan has tools in the session, which | ||
| leaves the agent unable to respond at all. The REST completions | ||
| endpoint rejects the same request up-front with a validation 400 | ||
| carrying ``"Tool calling is currently restricted on your plan."`` — | ||
| we use that as a cheap preflight so we can strip tools and keep the | ||
| session functional when the plan disallows them. | ||
|
|
||
| Returns True on any unexpected response so we fall back to the old | ||
| behaviour rather than silently disabling tool calling. | ||
| """ | ||
| if self._force_tool_calling: | ||
| return True | ||
| try: | ||
| async with httpx.AsyncClient(timeout=10.0) as client: | ||
| resp = await client.post( | ||
| f"{INWORLD_API_BASE}/v1/chat/completions", | ||
| headers={ | ||
| "Authorization": f"Bearer {self._api_key}", | ||
| "Content-Type": "application/json", | ||
| }, | ||
| json={ | ||
| "model": self.model, | ||
| "messages": [{"role": "user", "content": "hi"}], | ||
| "tools": [ | ||
| { | ||
| "type": "function", | ||
| "function": { | ||
| "name": "_preflight", | ||
| "description": "preflight", | ||
| "parameters": { | ||
| "type": "object", | ||
| "properties": {}, | ||
| }, | ||
| }, | ||
| } | ||
| ], | ||
| "max_tokens": 1, | ||
| }, | ||
| ) | ||
| except httpx.RequestError: | ||
| return True | ||
|
|
||
| if resp.status_code != 400: | ||
| return True | ||
| try: | ||
| message = resp.json().get("error", {}).get("message", "") | ||
| except ValueError: | ||
| message = resp.text | ||
| msg_lower = message.lower() | ||
| return not ("tool calling" in msg_lower and "restricted" in msg_lower) |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
Does the Inworld Realtime API expose an error code or type field (not just a human-readable message) when a request is rejected because the account's plan restricts tool calling?
💡 Result:
Yes, the Inworld Realtime API exposes both an error type field (error category, e.g., "server_error") and an error code field (specific error code) in the error event when a request is rejected. These are machine-readable fields separate from the human-readable message. Although no source explicitly documents a plan restriction error for tool calling, the general error schema includes these fields for all errors, including those from account plan limits (e.g., rate limits, concurrency), as shown in official API reference documentation. Tool calling is configured via session.tools and tool_choice, and plan limits apply to Realtime API usage including concurrent sessions.
Citations:
- 1: https://docs.inworld.ai/api-reference/realtimeAPI/realtime/realtime-websocket
- 2: https://dev.docs.inworld.ai/api-reference/realtimeAPI/realtime/realtime-websocket
- 3: https://docs.inworld.ai/api-reference/realtimeAPI/realtime/realtime-webrtc
- 4: https://dev.docs.inworld.ai/api-reference/realtimeAPI/realtime/realtime-webrtc
- 5: https://docs.inworld.ai/docs/realtime/usage/using-realtime-models
- 6: https://docs.inworld.ai/realtime/usage/using-realtime-models
🏁 Script executed:
# First, locate the file mentioned in the review
find . -name "inworld_realtime.py" -type fRepository: GetStream/Vision-Agents
Length of output: 134
🏁 Script executed:
# Examine the file around lines 395-448 to see the current implementation
wc -l plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.pyRepository: GetStream/Vision-Agents
Length of output: 136
🏁 Script executed:
# Look for usages of _plan_supports_tool_calling to see if it's called repeatedly
rg "_plan_supports_tool_calling" -A 2 -B 2Repository: GetStream/Vision-Agents
Length of output: 1687
🏁 Script executed:
# Check the connect() and update_tools() methods to confirm the concern
rg "def (connect|update_tools)" -A 10Repository: GetStream/Vision-Agents
Length of output: 27312
🏁 Script executed:
# Look at __init__ to see current instance attributes
rg "def __init__" plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py -A 30Repository: GetStream/Vision-Agents
Length of output: 1250
🏁 Script executed:
# Look at the connect() method in detail
sed -n '348,450p' plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.pyRepository: GetStream/Vision-Agents
Length of output: 4152
🏁 Script executed:
# Look at the update_tools() method
sed -n '450,483p' plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.pyRepository: GetStream/Vision-Agents
Length of output: 1354
🏁 Script executed:
# Search for any existing _plan_tool_support variable
rg "_plan_tool_support"Repository: GetStream/Vision-Agents
Length of output: 49
🏁 Script executed:
# Find the connect() method in inworld_realtime.py
sed -n '265,350p' plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.pyRepository: GetStream/Vision-Agents
Length of output: 4149
🏁 Script executed:
# Check where _plan_supports_tool_calling is called - look for the line numbers
grep -n "_plan_supports_tool_calling" plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.pyRepository: GetStream/Vision-Agents
Length of output: 249
🏁 Script executed:
# Look at the connect() method around line 152
sed -n '120,200p' plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.pyRepository: GetStream/Vision-Agents
Length of output: 3459
Cache preflight probe and log unmatched error responses to avoid silent tool-calling failures.
Two issues in _plan_supports_tool_calling (called on every connect() and update_tools()):
-
Redundant HTTP requests. Each call fires a live
/v1/chat/completionspreflight request, adding roundtrip latency and token usage to session startup. Cache the result on the instance, invalidating only whenapi_keyormodelchanges. -
Silent failure on error message rephrasing. Lines 447–448 match the tool-restriction response by checking for both
"tool calling"and"restricted"substrings. If Inworld rewords the message, the check fails silently and the session falls into theserver_errortrap the preflight was meant to avoid. Log unmatched 400 responses atwarninglevel so operators see the regression. Better still, also check the errorcodeortypefield (confirmed available in Inworld's error schema) rather than relying on substring matching alone.
Suggested changes
+ if self._plan_tool_support is not None:
+ return self._plan_tool_support
+
if resp.status_code != 400:
- return True
+ self._plan_tool_support = True
+ return True
try:
message = resp.json().get("error", {}).get("message", "")
except ValueError:
message = resp.text
msg_lower = message.lower()
- return not ("tool calling" in msg_lower and "restricted" in msg_lower)
+ restricted = "tool calling" in msg_lower and "restricted" in msg_lower
+ if not restricted:
+ logger.warning(
+ "Inworld preflight returned 400 but did not match the known "
+ "tool-restriction message; assuming tools are allowed. body=%s",
+ message,
+ )
+ self._plan_tool_support = not restricted
+ return self._plan_tool_supportInitialize self._plan_tool_support: bool | None = None in __init__.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py` around
lines 395 - 448, The _plan_supports_tool_calling probe is called repeatedly and
can silently miss reworded error messages; cache its boolean result on the
instance (e.g., add self._plan_tool_support: bool | None = None in __init__),
return the cached value unless api key (self._api_key) or model (self.model)
changed (invalidate cache when those change), and on a 400 response inspect
resp.json() for structured fields like "error.code" or "error.type" in addition
to the message; if none of those indicate plan restriction, emit a
processLogger.warning (or self.logger.warning) including the full
resp.json()/resp.text so operators see unexpected 400s, and only fall back to
True when the checks explicitly indicate tools are allowed or when a
network/parse error occurs.
…callbacks Made-with: Cursor
There was a problem hiding this comment.
Actionable comments posted: 4
🧹 Nitpick comments (1)
plugins/inworld/vision_agents/plugins/inworld/rtc_manager.py (1)
82-86:AudioForwarderinstance is not retained anywhere.The forwarder is created as a local inside
on_trackandstart()spawns its task internally; nothing inRTCManagerholds a reference or callsstop()inclose(). Today it self-terminates when the track ends, but that is a load-bearing implicit contract — any future change toAudioForwarder.start()that depends on an explicitstop()(or any exception in its internal task) will silently become un-observable here. Store the forwarder onselfand stop it inclose().🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@plugins/inworld/vision_agents/plugins/inworld/rtc_manager.py` around lines 82 - 86, The AudioForwarder created in the on_track handler is only a local variable and not retained, so create and store it on the RTCManager instance (e.g., self._audio_forwarders as a list or dict) when you instantiate it in on_track (where AudioForwarder(track, self._audio_callback) and await audio_forwarder.start() are called), ensure you remove it when the forwarder/track ends, and update the close() method to iterate stored forwarders and call their stop() (or await stop() if async) to cleanly shut them down; reference the on_track handler, AudioForwarder.start(), and RTCManager.close() when making these changes.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py`:
- Around line 198-206: The connect() flow currently swallows a session.updated
timeout (in the block using self.rtc.send_event and await
asyncio.wait_for(self._session_updated.wait(), timeout=10.0)) and only logs a
warning; change this to surface the failure: when asyncio.TimeoutError occurs,
emit a RealtimeErrorEvent with context="session_update_timeout" (so subscribers
can react) and then raise InworldRealtimeError instead of continuing to fire
RealtimeConnectedEvent; ensure the timeout branch references _session_updated,
RealtimeErrorEvent, RealtimeConnectedEvent, and InworldRealtimeError so callers
get a clear failure rather than a silent warning.
- Around line 450-455: The code assumes resp.json().get("error",
{}).get("message", "") where "error" is always a dict; to fix, parse resp.json()
into a variable (resp_json), check if resp_json.get("error") is a dict before
calling .get("message") and otherwise treat error as a string (or use
resp_json.get("error") directly), falling back to resp.text on JSON decode
errors; update the logic that sets message (and then msg_lower) to handle both
dict and non-dict error payloads so the function (using resp, message,
msg_lower) no longer raises AttributeError for string errors.
In `@plugins/inworld/vision_agents/plugins/inworld/rtc_manager.py`:
- Around line 259-262: The except block around awaiting pc.close() currently
suppresses asyncio.CancelledError and breaks cooperative cancellation; update
the handler so that only ConnectionError is caught (or re-raise CancelledError)
instead of swallowing asyncio.CancelledError. Locate the await pc.close() call
in rtc_manager.py (in the close/shutdown logic managing the Inworld peer
connection) and change the except tuple to catch only ConnectionError (or
explicitly except asyncio.CancelledError and re-raise it) so cancellation
propagates to the caller.
- Around line 65-94: After creating the RTCPeerConnection in RTCManager.connect,
wrap all subsequent work (calls to _setup_connection_logging, _add_data_channel,
addTrack, createOffer, setLocalDescription, _exchange_sdp, setRemoteDescription
and any started AudioForwarder) in a try/except (or try/finally) block so that
if any of those calls fail you await self.pc.close() and clear self.pc before
re-raising the exception; ensure you await self.pc.close() (not just call close)
and set self.pc = None (or a cleared state) so the leaked ICE/DTLS state machine
is terminated when _add_data_channel, createOffer, _exchange_sdp or
setRemoteDescription raise.
---
Nitpick comments:
In `@plugins/inworld/vision_agents/plugins/inworld/rtc_manager.py`:
- Around line 82-86: The AudioForwarder created in the on_track handler is only
a local variable and not retained, so create and store it on the RTCManager
instance (e.g., self._audio_forwarders as a list or dict) when you instantiate
it in on_track (where AudioForwarder(track, self._audio_callback) and await
audio_forwarder.start() are called), ensure you remove it when the
forwarder/track ends, and update the close() method to iterate stored forwarders
and call their stop() (or await stop() if async) to cleanly shut them down;
reference the on_track handler, AudioForwarder.start(), and RTCManager.close()
when making these changes.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: bb549e7b-2daf-4792-85ad-9108f3c21a1a
📒 Files selected for processing (2)
plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.pyplugins/inworld/vision_agents/plugins/inworld/rtc_manager.py
| await self.rtc.send_event( | ||
| {"type": "session.update", "session": dict(self.realtime_session)} | ||
| ) | ||
| try: | ||
| await asyncio.wait_for(self._session_updated.wait(), timeout=10.0) | ||
| except asyncio.TimeoutError: | ||
| logger.warning( | ||
| "Timed out waiting for Inworld session.updated; proceeding anyway" | ||
| ) |
There was a problem hiding this comment.
Treat session.updated timeout as a real error, not a log warning.
The docstring on connect() (lines 141-152) explicitly states that the method "waits for the server's session.updated confirmation before returning so callers can trust the session is fully configured." The current implementation contradicts that: on timeout, it logs a warning and proceeds to emit RealtimeConnectedEvent anyway, so callers have no way to distinguish a healthy connect from a silently-degraded one where tools/voice/instructions never took effect. At minimum, emit a RealtimeErrorEvent with context="session_update_timeout" so subscribers can react; consider raising InworldRealtimeError to honour the contract the docstring advertises.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py` around
lines 198 - 206, The connect() flow currently swallows a session.updated timeout
(in the block using self.rtc.send_event and await
asyncio.wait_for(self._session_updated.wait(), timeout=10.0)) and only logs a
warning; change this to surface the failure: when asyncio.TimeoutError occurs,
emit a RealtimeErrorEvent with context="session_update_timeout" (so subscribers
can react) and then raise InworldRealtimeError instead of continuing to fire
RealtimeConnectedEvent; ensure the timeout branch references _session_updated,
RealtimeErrorEvent, RealtimeConnectedEvent, and InworldRealtimeError so callers
get a clear failure rather than a silent warning.
| async def connect(self) -> None: | ||
| """Establish the WebRTC connection to Inworld. | ||
|
|
||
| Pre-fetches TURN/STUN servers from Inworld so the peer connection can | ||
| be created with them upfront — required because Inworld's media relay | ||
| sits behind NAT (private IPs in ICE candidates) and cannot be reached | ||
| without the server-issued TURN credentials. | ||
| """ | ||
| ice_servers = await self._fetch_ice_servers() | ||
| self.pc = RTCPeerConnection( | ||
| configuration=RTCConfiguration(iceServers=ice_servers) | ||
| ) | ||
| self._setup_connection_logging() | ||
|
|
||
| await self._add_data_channel() | ||
| self.pc.addTrack(self._audio_to_inworld_track) | ||
|
|
||
| @self.pc.on("track") | ||
| async def on_track(track): | ||
| if track.kind == "audio" and self._audio_callback: | ||
| audio_forwarder = AudioForwarder(track, self._audio_callback) | ||
| await audio_forwarder.start() | ||
|
|
||
| offer = await self.pc.createOffer() | ||
| await self.pc.setLocalDescription(offer) | ||
|
|
||
| answer_sdp = await self._exchange_sdp(offer.sdp) | ||
|
|
||
| answer = RTCSessionDescription(sdp=answer_sdp, type="answer") | ||
| await self.pc.setRemoteDescription(answer) |
There was a problem hiding this comment.
Clean up self.pc if connect() fails partway through.
Once RTCPeerConnection is instantiated on line 74, any failure in _add_data_channel, createOffer, _exchange_sdp, or setRemoteDescription leaves a live peer connection attached to self.pc. The caller in inworld_realtime.Realtime.connect() re-raises without invoking RTCManager.close(), so the pc (and its ICE/DTLS state machines) leaks until process exit. Wrap the post-instantiation body in try/except and close self.pc before re-raising.
🛡️ Sketch
ice_servers = await self._fetch_ice_servers()
self.pc = RTCPeerConnection(
configuration=RTCConfiguration(iceServers=ice_servers)
)
- self._setup_connection_logging()
-
- await self._add_data_channel()
- self.pc.addTrack(self._audio_to_inworld_track)
- ...
- answer = RTCSessionDescription(sdp=answer_sdp, type="answer")
- await self.pc.setRemoteDescription(answer)
+ try:
+ self._setup_connection_logging()
+ await self._add_data_channel()
+ self.pc.addTrack(self._audio_to_inworld_track)
+ ...
+ answer = RTCSessionDescription(sdp=answer_sdp, type="answer")
+ await self.pc.setRemoteDescription(answer)
+ except BaseException:
+ await self.close()
+ raise🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@plugins/inworld/vision_agents/plugins/inworld/rtc_manager.py` around lines 65
- 94, After creating the RTCPeerConnection in RTCManager.connect, wrap all
subsequent work (calls to _setup_connection_logging, _add_data_channel,
addTrack, createOffer, setLocalDescription, _exchange_sdp, setRemoteDescription
and any started AudioForwarder) in a try/except (or try/finally) block so that
if any of those calls fail you await self.pc.close() and clear self.pc before
re-raising the exception; ensure you await self.pc.close() (not just call close)
and set self.pc = None (or a cleared state) so the leaked ICE/DTLS state machine
is terminated when _add_data_channel, createOffer, _exchange_sdp or
setRemoteDescription raise.
- Derive self.model/self.voice from realtime_session after setdefault - Guard against non-dict error payloads in preflight 400 response - Log warning for unmatched 400 bodies in preflight probe - Clean up peer connection on partial connect() failure - Don't swallow asyncio.CancelledError in RTCManager.close() Made-with: Cursor
There was a problem hiding this comment.
Actionable comments posted: 3
♻️ Duplicate comments (4)
plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py (4)
489-493:⚠️ Potential issue | 🟠 MajorClear local and remote tools when the plan check fails in
update_tools().This branch skips the update but leaves any existing local/session tools intact. If tools were previously configured, the server can continue failing responses.
🛠️ Proposed fix
if not await self._plan_supports_tool_calling(): + self.realtime_session.pop("tools", None) + await self.rtc.send_event( + { + "type": "session.update", + "session": {"tools": []}, + } + ) logger.warning( "Inworld plan does not permit tool calling; skipping tool update" ) return🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py` around lines 489 - 493, In update_tools(), when the early-return branch triggers because await self._plan_supports_tool_calling() is False, clear any existing configured tools both locally and remotely before returning; call the methods or operations that remove session/local tools (e.g., reset or clear the internal tools store on the agent instance) and send the corresponding remote unregistration/clear request so that previous tool state is not retained when the plan disallows tool calling. Ensure you reference update_tools and _plan_supports_tool_calling to locate the branch and invoke the existing tool-clearing mechanisms (or add clear_tools/clear_remote_tools helper calls) prior to the return.
202-208:⚠️ Potential issue | 🟡 MinorSurface
session.updatedtimeout as a connection failure.The docstring says
connect()waits until the session is configured, but this timeout path still emitsRealtimeConnectedEvent. Callers cannot distinguish a configured session from one where model/voice/tools were not applied.🛠️ Proposed fix
try: await asyncio.wait_for(self._session_updated.wait(), timeout=10.0) except asyncio.TimeoutError: - logger.warning( - "Timed out waiting for Inworld session.updated; proceeding anyway" + exc = InworldRealtimeError( + "Timed out waiting for Inworld session.updated" ) + self._emit_error_event(exc, context="session_update_timeout") + raise exc🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py` around lines 202 - 208, The connect() path currently swallows asyncio.TimeoutError from await self._session_updated.wait() and only logs a warning, which causes callers to still receive RealtimeConnectedEvent even though session configuration may have failed; update the connect() implementation so that when asyncio.wait_for(self._session_updated.wait(), timeout=10.0) times out you treat it as a connection/configuration failure: either raise a clear exception (e.g., ConnectionError) from connect() or emit a RealtimeConnectionFailedEvent and return/stop further success-path processing so RealtimeConnectedEvent is not emitted; modify the timeout except block in connect() (the code referencing self._session_updated.wait()) to perform that failure signaling and ensure any downstream code that emits RealtimeConnectedEvent is not executed after the timeout.
371-373:⚠️ Potential issue | 🟠 MajorDo not send raw tool exception text back to the model.
str(error)can contain secrets, PII, stack fragments, or backend details from tools. Return a generic failure payload and keep detailed diagnostics out of the model-visible channel.🛡️ Proposed fix
tc, result, error = await self._run_one_tool(tool_call, timeout_s=30) if error: - response_data: dict[str, object] = {"error": str(error)} + response_data: dict[str, object] = {"error": "tool call failed"} logger.error("Tool call %s failed", name)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py` around lines 371 - 373, The current block constructs response_data with response_data = {"error": str(error)} which exposes raw exception text to the model; change this to return a generic, non-sensitive payload (e.g., {"error": "tool_call_failed"} or {"error": "Tool call failed"} and optionally an opaque error_id) and ensure the detailed exception is only written to server logs via logger.error("Tool call %s failed: %s", name, error) or logger.exception(...) so diagnostics remain out of the model-visible response; update the code paths that consume response_data so they expect the generic message/optional error_id instead of raw exception text.
155-175:⚠️ Potential issue | 🟠 MajorRun the tool-plan preflight for preconfigured session tools too.
If a caller passes
realtime_sessionwith existing"tools"but no functions are registered locally, this block skips the preflight and still sends those tools insession.update. That re-enters the restricted-plan failure path. Track whether tools are effectively enabled and use that when emitting capabilities.🛡️ Sketch
self._session_updated.clear() available_tools = self.get_available_functions() - if available_tools: - tools_for_inworld = convert_tools_to_openai_format( - available_tools, for_realtime=True - ) + configured_tools = self.realtime_session.get("tools") + function_calling_enabled = False + if available_tools or configured_tools: + tools_for_inworld = ( + convert_tools_to_openai_format(available_tools, for_realtime=True) + if available_tools + else configured_tools + ) if await self._plan_supports_tool_calling(): - self.realtime_session["tools"] = tools_for_inworld # type: ignore[typeddict-item] + self.realtime_session["tools"] = tools_for_inworld + function_calling_enabled = True logger.info( "Added %d tools to Inworld session config: %s", len(tools_for_inworld), @@ self._emit_connected_event( session_config={"model": self.model, "voice": self.voice}, - capabilities=["text", "audio", "function_calling"], + capabilities=[ + "text", + "audio", + *(["function_calling"] if function_calling_enabled else []), + ], )Also applies to: 199-212
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py` around lines 155 - 175, The current logic skips the plan preflight when there are no locally registered functions but the incoming realtime_session already contains "tools", causing restricted-plan failures later; change the block around get_available_functions/convert_tools_to_openai_format/_plan_supports_tool_calling so you always run the plan preflight for any candidate tools (either converted from available_tools or the preconfigured self.realtime_session.get("tools")), e.g., compute tools_candidate = converted_tools if available_tools else self.realtime_session.get("tools"), call await self._plan_supports_tool_calling() for that candidate, and then set or pop self.realtime_session["tools"] and emit the same logger.info/logger.warning based on the preflight result so capabilities reflect whether tools are actually enabled (refer to get_available_functions, convert_tools_to_openai_format, _plan_supports_tool_calling, and self.realtime_session["tools"]).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@plugins/inworld/vision_agents/plugins/inworld/rtc_manager.py`:
- Around line 83-87: The on_track handler creates AudioForwarder instances as
local variables and spawns background tasks that are never awaited or inspected,
so modify the RTC manager to track and properly shut down these workers: store
each AudioForwarder (or any created background Task) on the instance (e.g., add
to self._audio_forwarders or push created asyncio.Tasks into
self._pending_tasks) when you create them in the on_track callback and ensure
tasks remove themselves from that collection on completion; update close() to
iterate over tracked AudioForwarder instances and call their stop/shutdown
method (or cancel and await tasks in self._pending_tasks), await their
termination, and gather exceptions (use asyncio.gather(...,
return_exceptions=True)) so no task exceptions remain unobserved; also apply the
same tracking and shutdown pattern to the other locations mentioned (the other
AudioForwarder creations and any event callbacks referenced around the same
areas).
- Around line 203-208: The data-channel handler currently json.loads(message)
and immediately dispatches via asyncio.create_task(self._handle_event(data));
add a guard to reject non-object payloads by checking isinstance(data, dict) (or
type(data) is dict) after parsing, and if it is not a dict log a clear message
(e.g., logger.warning or logger.debug) and return without creating the task;
update the block around json.loads, logger.exception, and
asyncio.create_task(self._handle_event) in rtc_manager.py so only dicts are
forwarded to _handle_event.
- Around line 225-244: The connection-state callbacks in
_setup_connection_logging reference self.pc which can be cleared by close()
before final events fire; capture the PeerConnection into a local variable
(e.g., pc = self.pc) when registering the handlers and use that local pc inside
on_connectionstatechange and on_iceconnectionstatechange so the callbacks
reference the captured connection instance instead of self.pc (also keep the
existing asserts/guards if desired).
---
Duplicate comments:
In `@plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py`:
- Around line 489-493: In update_tools(), when the early-return branch triggers
because await self._plan_supports_tool_calling() is False, clear any existing
configured tools both locally and remotely before returning; call the methods or
operations that remove session/local tools (e.g., reset or clear the internal
tools store on the agent instance) and send the corresponding remote
unregistration/clear request so that previous tool state is not retained when
the plan disallows tool calling. Ensure you reference update_tools and
_plan_supports_tool_calling to locate the branch and invoke the existing
tool-clearing mechanisms (or add clear_tools/clear_remote_tools helper calls)
prior to the return.
- Around line 202-208: The connect() path currently swallows
asyncio.TimeoutError from await self._session_updated.wait() and only logs a
warning, which causes callers to still receive RealtimeConnectedEvent even
though session configuration may have failed; update the connect()
implementation so that when asyncio.wait_for(self._session_updated.wait(),
timeout=10.0) times out you treat it as a connection/configuration failure:
either raise a clear exception (e.g., ConnectionError) from connect() or emit a
RealtimeConnectionFailedEvent and return/stop further success-path processing so
RealtimeConnectedEvent is not emitted; modify the timeout except block in
connect() (the code referencing self._session_updated.wait()) to perform that
failure signaling and ensure any downstream code that emits
RealtimeConnectedEvent is not executed after the timeout.
- Around line 371-373: The current block constructs response_data with
response_data = {"error": str(error)} which exposes raw exception text to the
model; change this to return a generic, non-sensitive payload (e.g., {"error":
"tool_call_failed"} or {"error": "Tool call failed"} and optionally an opaque
error_id) and ensure the detailed exception is only written to server logs via
logger.error("Tool call %s failed: %s", name, error) or logger.exception(...) so
diagnostics remain out of the model-visible response; update the code paths that
consume response_data so they expect the generic message/optional error_id
instead of raw exception text.
- Around line 155-175: The current logic skips the plan preflight when there are
no locally registered functions but the incoming realtime_session already
contains "tools", causing restricted-plan failures later; change the block
around
get_available_functions/convert_tools_to_openai_format/_plan_supports_tool_calling
so you always run the plan preflight for any candidate tools (either converted
from available_tools or the preconfigured self.realtime_session.get("tools")),
e.g., compute tools_candidate = converted_tools if available_tools else
self.realtime_session.get("tools"), call await
self._plan_supports_tool_calling() for that candidate, and then set or pop
self.realtime_session["tools"] and emit the same logger.info/logger.warning
based on the preflight result so capabilities reflect whether tools are actually
enabled (refer to get_available_functions, convert_tools_to_openai_format,
_plan_supports_tool_calling, and self.realtime_session["tools"]).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: c7c06809-8ca4-4bd5-806c-1575f570ba92
📒 Files selected for processing (2)
plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.pyplugins/inworld/vision_agents/plugins/inworld/rtc_manager.py
| @self.pc.on("track") | ||
| async def on_track(track): | ||
| if track.kind == "audio" and self._audio_callback: | ||
| audio_forwarder = AudioForwarder(track, self._audio_callback) | ||
| await audio_forwarder.start() |
There was a problem hiding this comment.
Close all background work started by the RTC manager.
AudioForwarder instances are only local variables, and _pending_tasks are retained but never awaited or inspected during shutdown. close() can return while audio/event callbacks are still running, and task exceptions remain unobserved.
🛠️ Proposed lifecycle fix
self._event_callback: Callable[[dict], Awaitable[None]] | None = None
self._data_channel_open_event: asyncio.Event = asyncio.Event()
self._pending_tasks: set[asyncio.Task[None]] = set()
+ self._audio_forwarders: set[AudioForwarder] = set()
@@
async def on_track(track):
if track.kind == "audio" and self._audio_callback:
audio_forwarder = AudioForwarder(track, self._audio_callback)
+ self._audio_forwarders.add(audio_forwarder)
await audio_forwarder.start()
@@
task = asyncio.create_task(self._handle_event(data))
self._pending_tasks.add(task)
- task.add_done_callback(self._pending_tasks.discard)
+ task.add_done_callback(self._on_pending_task_done)
+
+ def _on_pending_task_done(self, task: asyncio.Task[None]) -> None:
+ self._pending_tasks.discard(task)
+ if task.cancelled():
+ return
+ exc = task.exception()
+ if exc is not None:
+ logger.error(
+ "Inworld event handler task failed",
+ exc_info=(type(exc), exc, exc.__traceback__),
+ )
@@
async def close(self) -> None:
if self.data_channel is not None:
self.data_channel.close()
self.data_channel = None
self._audio_to_inworld_track.stop()
+
+ for audio_forwarder in tuple(self._audio_forwarders):
+ await audio_forwarder.stop()
+ self._audio_forwarders.clear()
+
+ if self._pending_tasks:
+ await asyncio.gather(*self._pending_tasks, return_exceptions=True)
+ self._pending_tasks.clear()
pc = self.pcAlso applies to: 208-210, 252-266
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@plugins/inworld/vision_agents/plugins/inworld/rtc_manager.py` around lines 83
- 87, The on_track handler creates AudioForwarder instances as local variables
and spawns background tasks that are never awaited or inspected, so modify the
RTC manager to track and properly shut down these workers: store each
AudioForwarder (or any created background Task) on the instance (e.g., add to
self._audio_forwarders or push created asyncio.Tasks into self._pending_tasks)
when you create them in the on_track callback and ensure tasks remove themselves
from that collection on completion; update close() to iterate over tracked
AudioForwarder instances and call their stop/shutdown method (or cancel and
await tasks in self._pending_tasks), await their termination, and gather
exceptions (use asyncio.gather(..., return_exceptions=True)) so no task
exceptions remain unobserved; also apply the same tracking and shutdown pattern
to the other locations mentioned (the other AudioForwarder creations and any
event callbacks referenced around the same areas).
| try: | ||
| data = json.loads(message) | ||
| except json.JSONDecodeError: | ||
| logger.exception("Failed to decode data-channel message") | ||
| return | ||
| task = asyncio.create_task(self._handle_event(data)) |
There was a problem hiding this comment.
Reject non-object data-channel payloads before dispatch.
json.loads() can return a list/string/number, but downstream handlers call event.get(...). A malformed server message would raise in the background dispatcher.
🛡️ Proposed guard
try:
data = json.loads(message)
except json.JSONDecodeError:
logger.exception("Failed to decode data-channel message")
return
+ if not isinstance(data, dict):
+ logger.warning("Ignoring non-object Inworld data-channel message")
+ return
task = asyncio.create_task(self._handle_event(data))🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@plugins/inworld/vision_agents/plugins/inworld/rtc_manager.py` around lines
203 - 208, The data-channel handler currently json.loads(message) and
immediately dispatches via asyncio.create_task(self._handle_event(data)); add a
guard to reject non-object payloads by checking isinstance(data, dict) (or
type(data) is dict) after parsing, and if it is not a dict log a clear message
(e.g., logger.warning or logger.debug) and return without creating the task;
update the block around json.loads, logger.exception, and
asyncio.create_task(self._handle_event) in rtc_manager.py so only dicts are
forwarded to _handle_event.
| def _setup_connection_logging(self) -> None: | ||
| assert self.pc is not None | ||
|
|
||
| @self.pc.on("connectionstatechange") | ||
| async def on_connectionstatechange(): | ||
| assert self.pc is not None | ||
| state = self.pc.connectionState | ||
| if state == "failed": | ||
| logger.error("Inworld RTC connection failed") | ||
| elif state == "disconnected": | ||
| logger.warning("Inworld RTC connection disconnected") | ||
| elif state == "connected": | ||
| logger.info("Inworld RTC connection established") | ||
| elif state == "closed": | ||
| logger.info("Inworld RTC connection closed") | ||
|
|
||
| @self.pc.on("iceconnectionstatechange") | ||
| async def on_iceconnectionstatechange(): | ||
| assert self.pc is not None | ||
| state = self.pc.iceConnectionState |
There was a problem hiding this comment.
Capture pc in connection-state callbacks.
These callbacks dereference self.pc, but close() clears self.pc before pc.close() can emit final state changes. Capture the peer connection when registering handlers.
🛠️ Proposed fix
def _setup_connection_logging(self) -> None:
assert self.pc is not None
+ pc = self.pc
- `@self.pc.on`("connectionstatechange")
+ `@pc.on`("connectionstatechange")
async def on_connectionstatechange():
- assert self.pc is not None
- state = self.pc.connectionState
+ state = pc.connectionState
if state == "failed":
logger.error("Inworld RTC connection failed")
@@
- `@self.pc.on`("iceconnectionstatechange")
+ `@pc.on`("iceconnectionstatechange")
async def on_iceconnectionstatechange():
- assert self.pc is not None
- state = self.pc.iceConnectionState
+ state = pc.iceConnectionState
if state == "failed":
logger.error("Inworld ICE connection failed")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@plugins/inworld/vision_agents/plugins/inworld/rtc_manager.py` around lines
225 - 244, The connection-state callbacks in _setup_connection_logging reference
self.pc which can be cleared by close() before final events fire; capture the
PeerConnection into a local variable (e.g., pc = self.pc) when registering the
handlers and use that local pc inside on_connectionstatechange and
on_iceconnectionstatechange so the callbacks reference the captured connection
instance instead of self.pc (also keep the existing asserts/guards if desired).
This pull request introduces the Inworld Realtime plugin, adding low-latency speech-to-speech conversational capabilities to Vision Agents via Inworld's WebRTC-based Realtime API. The update includes a new
Realtimeclass, comprehensive documentation, an example, and tests. It also updates dependencies and project metadata to support the new features.New Inworld Realtime (WebRTC) plugin:
inworld.Realtimeclass, enabling speech-to-speech conversations over WebRTC with support for OpenAI-compatible function calling, turn detection, and multiple upstream models. [1] [2]README.mdwith detailed usage instructions, configuration options for both TTS and Realtime, tool registration, and requirements for the new plugin.Documentation and examples:
example/inworld_realtime_example.py, demonstrating how to use the Inworld Realtime agent, including registering a tool and joining a call.Testing and quality assurance:
test_realtime.pywith unit and integration tests for theRealtimeclass, covering API key handling, configuration, event dispatch, tool registration, and WebRTC connection.Project metadata and dependencies:
pyproject.tomlto reflect the new plugin’s capabilities and added required dependencies:aiortc,openai[realtime], and updatedhttpxandavversions.Changelog:
CHANGELOG.md.Summary by CodeRabbit
New Features
Documentation
Examples
Infrastructure
Exports
Tests