Skip to content

add inworld realtime api#502

Open
maxkahan wants to merge 6 commits intomainfrom
add-inworld-realtime-voice
Open

add inworld realtime api#502
maxkahan wants to merge 6 commits intomainfrom
add-inworld-realtime-voice

Conversation

@maxkahan
Copy link
Copy Markdown
Contributor

@maxkahan maxkahan commented Apr 21, 2026

This pull request introduces the Inworld Realtime plugin, adding low-latency speech-to-speech conversational capabilities to Vision Agents via Inworld's WebRTC-based Realtime API. The update includes a new Realtime class, comprehensive documentation, an example, and tests. It also updates dependencies and project metadata to support the new features.

New Inworld Realtime (WebRTC) plugin:

  • Added the inworld.Realtime class, enabling speech-to-speech conversations over WebRTC with support for OpenAI-compatible function calling, turn detection, and multiple upstream models. [1] [2]
  • Updated the README.md with detailed usage instructions, configuration options for both TTS and Realtime, tool registration, and requirements for the new plugin.

Documentation and examples:

  • Added a new example script, example/inworld_realtime_example.py, demonstrating how to use the Inworld Realtime agent, including registering a tool and joining a call.

Testing and quality assurance:

  • Introduced test_realtime.py with unit and integration tests for the Realtime class, covering API key handling, configuration, event dispatch, tool registration, and WebRTC connection.

Project metadata and dependencies:

  • Updated pyproject.toml to reflect the new plugin’s capabilities and added required dependencies: aiortc, openai[realtime], and updated httpx and av versions.

Changelog:

  • Documented the addition of the Inworld Realtime plugin in CHANGELOG.md.

Summary by CodeRabbit

  • New Features

    • Inworld Realtime (WebRTC) integration: low-latency speech-to-speech with function/tool calling, turn detection, and selectable upstream models.
  • Documentation

    • Expanded docs and README with Realtime configuration, usage, requirements, and changelog entry.
  • Examples

    • Added runnable Realtime examples: interviewer and tool-enabled call flows.
  • Infrastructure

    • New WebRTC/session manager and tool conversion utilities to support realtime transport and function calling.
  • Exports

    • Realtime capability added to plugin exports for easier discovery.
  • Tests

    • New unit and integration tests covering Realtime connections, events, and tool calling.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 21, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds an Inworld Realtime (WebRTC) integration: new Realtime LLM class, RTCManager for WebRTC signaling/transport, tool-calling utilities, examples and tests, README/CHANGELOG updates, and new runtime dependencies for aiortc and OpenAI realtime support. (≤50 words)

Changes

Cohort / File(s) Summary
Changelog & Top-level Docs
CHANGELOG.md, README.md
Added “Inworld Realtime plugin (WebRTC)” entry and appended Inworld to Realtime provider list.
Plugin README & Metadata
plugins/inworld/README.md, plugins/inworld/pyproject.toml
README expanded to cover TTS + Realtime (usage, options, requirements); pyproject.toml updated and dependencies aiortc and openai[realtime] added.
Realtime core
plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py
New Realtime class and InworldRealtimeError: session creation, connect/close, event routing, streaming function-calling, instruction/tool management, and lifecycle events. Logic-dense — review carefully.
WebRTC manager
plugins/inworld/vision_agents/plugins/inworld/rtc_manager.py
New RTCManager: ICE/TURN fetch, RTCPeerConnection lifecycle, SDP offer/answer exchange with Inworld, audio track handling, ordered data-channel send/receive, and send helpers with open-state gating.
Tool utilities
plugins/inworld/vision_agents/plugins/inworld/tool_utils.py
New helpers: convert_tools_to_openai_format, tool_call_dedup_key, parse_tool_arguments — used for function-calling conversion, deduplication, and argument parsing.
Exports
plugins/inworld/vision_agents/plugins/inworld/__init__.py
__all__ updated to export Realtime alongside TTS.
Examples
plugins/inworld/example/inworld_realtime_example.py, plugins/inworld/example/inworld_realtime_tools_example.py
New runnable examples demonstrating a Realtime interviewer and a forced tool-calling flow (agent factory, join_call, tool registration).
Tests
plugins/inworld/tests/test_realtime.py
New pytest module with unit and integration tests covering lifecycle, events, tool registration, instruction updates, and data-channel readiness.

Sequence Diagram

sequenceDiagram
    participant Agent as Agent
    participant Realtime as Realtime
    participant RTCManager as RTCManager
    participant InworldAPI as Inworld API
    participant WebRTC as WebRTC

    Agent->>Realtime: create(model, voice, api_key, instructions)
    Realtime->>Realtime: build realtime_session config
    Realtime->>RTCManager: instantiate with session params

    Agent->>Realtime: connect()
    Realtime->>RTCManager: connect()
    RTCManager->>InworldAPI: fetch ICE/TURN servers
    RTCManager->>WebRTC: create RTCPeerConnection & data channel
    RTCManager->>InworldAPI: post SDP offer → receive SDP answer
    WebRTC-->>RTCManager: data channel open
    RTCManager-->>Realtime: callback(events/audio)
    Realtime-->>Agent: emit connected event

    Agent->>Realtime: simple_response("Hello")
    Realtime->>RTCManager: send_text("Hello")
    RTCManager->>WebRTC: send conversation.item.create + response.create
    WebRTC-->>RTCManager: incoming audio frames & events
    RTCManager-->>Realtime: dispatch event callback
    Realtime->>Realtime: handle event (transcript/func-call/etc.)

    Realtime->>Realtime: if function call → stream args, run tool
    Realtime->>RTCManager: send_event(function_call_output)
    RTCManager->>WebRTC: deliver via data channel

    Agent->>Realtime: close()
    Realtime->>RTCManager: close()
    RTCManager->>WebRTC: close peer connection
    Realtime-->>Agent: emit disconnected event
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

The throat a bell of wires, bright and accusatory,
I teach it short sentences like small, obedient knives,
A data channel breathes and mouths a tool I cannot touch,
It asks the weather and returns my own precise silence,
Then closes, tidy as a throat stitched shut.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 45.76% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'add inworld realtime api' accurately reflects the main change: introducing Inworld Realtime (WebRTC) plugin for low-latency speech-to-speech conversations.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch add-inworld-realtime-voice

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 10

🧹 Nitpick comments (3)
plugins/inworld/vision_agents/plugins/inworld/tool_utils.py (1)

8-15: Replace old typing aliases with modern Python annotations throughout this helper.

Lines 8, 14-15, 34, and 56 still use Any, Dict, and List from the typing module. Update these to built-in generics with concrete types:

  • Remove from typing import Any, Dict, List
  • Change List[ToolSchema]list[ToolSchema]
  • Change List[Dict[str, Any]]list[dict[str, object]]
  • Change Dict[str, Any]dict[str, object]
  • Update parse_tool_arguments signature: dictdict[str, object]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/inworld/vision_agents/plugins/inworld/tool_utils.py` around lines 8 -
15, Replace old typing aliases with built-in generics: remove the line importing
Any, Dict, List from typing; update the signature of
convert_tools_to_openai_format to use list[ToolSchema] and its return type to
list[dict[str, object]]; update any intermediate variables or annotations that
use Dict[str, Any] to dict[str, object]; update parse_tool_arguments' parameter
and return annotations from plain dict to dict[str, object]; ensure references
to NormalizedToolCallItem and ToolSchema remain unchanged (e.g.,
convert_tools_to_openai_format, parse_tool_arguments, NormalizedToolCallItem,
ToolSchema).
plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py (2)

34-34: Avoid load_dotenv() at module import.

Calling load_dotenv() on import mutates process env as a side effect of merely importing the plugin, which surprises library consumers and makes test isolation awkward. Let the application entrypoint (or example script) own that, or move it behind an explicit configure() call. The example in plugins/inworld/example/inworld_realtime_example.py is the right place for it.

♻️ Proposed change
 import json
 import logging
 import os
 from typing import Any, Optional

 import aiortc.mediastreams
 import httpx
-from dotenv import load_dotenv
 from getstream.video.rtc.track_util import PcmData
 ...
-load_dotenv()
-
 logger = logging.getLogger(__name__)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py` at line
34, Remove the top-level load_dotenv() call so importing the module has no side
effects; instead add an explicit configure function (e.g., configure_env() or
initialize()) in this module that calls load_dotenv() when the application wants
to initialize env, and update callers/examples (such as the example in
plugins/inworld/example/inworld_realtime_example.py) to call that configure
function before using any other symbols from this module; ensure all references
to module-level environment access in functions like any realtime client
constructors read from os.environ at runtime so they work after configure_env()
is run.

38-54: Fold the trailing exceptions into _IGNORED_EVENT_TYPES.

"session.updated" and "rate_limits.updated" are tacked on as or-chained string compares at line 283 instead of living in the frozenset a few lines above. Moving them keeps dispatch O(1) and the ignore list in one spot.

♻️ Proposed change
 _IGNORED_EVENT_TYPES = frozenset(
     {
         "conversation.item.created",
         ...
         "input_audio_buffer.speech_stopped",
         "input_audio_buffer.committed",
+        "session.updated",
+        "rate_limits.updated",
     }
 )
...
-        elif et in _IGNORED_EVENT_TYPES or et == "session.updated" or et == "rate_limits.updated":
+        elif et in _IGNORED_EVENT_TYPES:
             pass

Also applies to: 283-284

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py` around
lines 38 - 54, Add "session.updated" and "rate_limits.updated" to the existing
_IGNORED_EVENT_TYPES frozenset and remove the separate or-chained string
comparisons in the event dispatch logic that currently checks those two values
(the check near the dispatch/event handling code in inworld_realtime.py). This
keeps all ignored event types centralized in _IGNORED_EVENT_TYPES and allows the
dispatch membership test to remain a single O(1) containment check against
_IGNORED_EVENT_TYPES.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@CHANGELOG.md`:
- Around line 65-67: Update the changelog entry for "Inworld Realtime plugin
(WebRTC)" / "inworld.Realtime" to include the PR reference for this user-facing
feature by appending the PR number or link (e.g., " (PR #<number>)" or "
([#<number>](link))") to the end of the entry; ensure it remains under the
appropriate version heading and "New Features" section and uses the same
phrasing as the existing entry so only the PR reference is added.

In `@plugins/inworld/README.md`:
- Around line 49-61: The example omits the required agent_user parameter for
Agent.__init__, causing a TypeError; update the Agent(...) call in the README
example to pass a valid agent_user (e.g., a user id or user object) when
constructing Agent so the inworld.Realtime usage (inworld.Realtime) works;
ensure the parameter name matches the constructor signature (agent_user) and
include a brief placeholder value in the snippet so the example runs as-is.

In `@plugins/inworld/tests/test_realtime.py`:
- Around line 1-4: Add a guard to skip these live integration tests when the
INWORLD_API_KEY is not set: import os and pytest at top of
plugins/inworld/tests/test_realtime.py, then in the session/test fixture used
for realtime integration (the fixture defined around lines ~152-154 in this
file) check if os.environ.get("INWORLD_API_KEY") is falsy and call
pytest.skip("INWORLD_API_KEY not set - skipping live integration tests") from
the fixture setup path so tests are cleanly skipped instead of raising during
setup.

In `@plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py`:
- Line 138: Replace the type ignore by ensuring tools_for_inworld is typed to
match the RealtimeSessionCreateRequestParam TypedDict's "tools" field: update
the convert_tools_to_openai_format signature to return the concrete list item
type expected by RealtimeSessionCreateRequestParam (or explicitly wrap
tools_for_inworld with typing.cast to that list type) and then assign it to
self.realtime_session["tools"] without a "# type: ignore"; reference the
convert_tools_to_openai_format function and the
RealtimeSessionCreateRequestParam TypedDict so the assignment
self.realtime_session["tools"] = tools_for_inworld type-checks correctly.
- Around line 271-282: Replace the raise in _handle_inworld_event for et ==
"response.done" with emitting an error event and returning: detect
response.get("status") == "failed" inside the response.done branch and call
self._emit_error_event(InworldRealtimeError(json.dumps(response)),
context="response_failed") (mirroring the "error" branch), then return so the
dispatcher does not raise from the background task; keep the rest of the
response.done handling intact if any.
- Around line 102-115: The code currently always overwrites caller-supplied
realtime_session fields (realtime_session["model"] and
realtime_session["audio"]["output"]["voice"]) with default kwargs; change the
logic in the RealtimeSessionCreateRequestParam handling so you only set
realtime_session["model"] if the caller explicitly passed a model kwarg (use a
sentinel or check if model is not None) and only set
realtime_session["audio"]["output"]["voice"] if the caller explicitly passed a
voice kwarg (or if audio/output is missing and you need to create defaults). In
practice, update the block that assigns self.realtime_session["model"] and the
block that assigns self.realtime_session["audio"]["output"]["voice"] to first
honor existing keys on self.realtime_session (and only overwrite when the
corresponding kwarg was explicitly provided), while still creating missing
audio/output objects when realtime_session lacks them using
RealtimeAudioConfigParam/RealtimeAudioConfigOutputParam.

In `@plugins/inworld/vision_agents/plugins/inworld/rtc_manager.py`:
- Around line 196-203: The on_message handler currently calls
asyncio.create_task(self._handle_event(data)) without keeping a strong
reference, so tasks may be garbage-collected mid-flight; modify the RTCManager
(or the class containing on_message) to maintain a set attribute (e.g.,
self._pending_tasks) and when creating the task from on_message add it to that
set, attach a done callback that removes the task from self._pending_tasks, and
ensure the callback also logs exceptions from the task if needed; specifically
update the on_message closure that calls asyncio.create_task and the class
initializer to create and manage self._pending_tasks for tasks spawned from
self._handle_event.
- Around line 243-259: The close() method fires-and-forgets the peer-connection
close via asyncio.create_task(_safe_close()) causing RTCManager.close() to
return before pc.close() finishes; change it to await pc.close() directly
(inside close()) and handle expected teardown exceptions explicitly (catch
asyncio.CancelledError, ConnectionError and aiortc/InvalidStateError-like
exceptions) rather than silently spawning a background task; if non-blocking
behavior is required instead, provide a separate helper that returns the task
and store/await it from callers (e.g., in inworld_realtime.Realtime.close()) so
the disconnect event is emitted only after the task completes.

In `@plugins/inworld/vision_agents/plugins/inworld/tool_utils.py`:
- Around line 27-32: The code currently assigns params =
t.get("parameters_schema") or t.get("parameters") or {} then calls
params.setdefault(...), which mutates the caller/registry-owned schema; instead,
create a copy of the resolved schema before mutating (e.g., shallow or deep
copy) so the original t["parameters_schema"] / t["parameters"] is untouched,
then perform params.setdefault("type", "object"),
params.setdefault("properties", {}), and
params.setdefault("additionalProperties", False) on that copy; update any
downstream use to reference the new copied variable rather than mutating t in
place.
- Around line 56-65: parse_tool_arguments currently swallows malformed JSON and
non-object results by returning {} which can cause tools to run with missing or
incorrect arguments; change it so that when args is a dict it is returned, when
args is falsy keep returning {} (if you want to preserve that behavior), but if
args is a string then attempt json.loads and if json.loads raises
json.JSONDecodeError or returns a value that is not a dict raise a ValueError
(or a specific exception) describing "tool arguments must be a JSON object";
update parse_tool_arguments to validate the parsed value is a dict and raise on
errors instead of returning {}.

---

Nitpick comments:
In `@plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py`:
- Line 34: Remove the top-level load_dotenv() call so importing the module has
no side effects; instead add an explicit configure function (e.g.,
configure_env() or initialize()) in this module that calls load_dotenv() when
the application wants to initialize env, and update callers/examples (such as
the example in plugins/inworld/example/inworld_realtime_example.py) to call that
configure function before using any other symbols from this module; ensure all
references to module-level environment access in functions like any realtime
client constructors read from os.environ at runtime so they work after
configure_env() is run.
- Around line 38-54: Add "session.updated" and "rate_limits.updated" to the
existing _IGNORED_EVENT_TYPES frozenset and remove the separate or-chained
string comparisons in the event dispatch logic that currently checks those two
values (the check near the dispatch/event handling code in inworld_realtime.py).
This keeps all ignored event types centralized in _IGNORED_EVENT_TYPES and
allows the dispatch membership test to remain a single O(1) containment check
against _IGNORED_EVENT_TYPES.

In `@plugins/inworld/vision_agents/plugins/inworld/tool_utils.py`:
- Around line 8-15: Replace old typing aliases with built-in generics: remove
the line importing Any, Dict, List from typing; update the signature of
convert_tools_to_openai_format to use list[ToolSchema] and its return type to
list[dict[str, object]]; update any intermediate variables or annotations that
use Dict[str, Any] to dict[str, object]; update parse_tool_arguments' parameter
and return annotations from plain dict to dict[str, object]; ensure references
to NormalizedToolCallItem and ToolSchema remain unchanged (e.g.,
convert_tools_to_openai_format, parse_tool_arguments, NormalizedToolCallItem,
ToolSchema).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 47653894-e65b-4370-a2e3-6981f67c5e8b

📥 Commits

Reviewing files that changed from the base of the PR and between 48e32da and d42064a.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (10)
  • CHANGELOG.md
  • README.md
  • plugins/inworld/README.md
  • plugins/inworld/example/inworld_realtime_example.py
  • plugins/inworld/pyproject.toml
  • plugins/inworld/tests/test_realtime.py
  • plugins/inworld/vision_agents/plugins/inworld/__init__.py
  • plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py
  • plugins/inworld/vision_agents/plugins/inworld/rtc_manager.py
  • plugins/inworld/vision_agents/plugins/inworld/tool_utils.py

Comment thread CHANGELOG.md Outdated
Comment thread plugins/inworld/README.md
Comment on lines +1 to +4
import asyncio

import pytest
from dotenv import load_dotenv
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Skip live integration tests when INWORLD_API_KEY is absent.

The fixture currently raises during setup when the secret is unavailable, which makes integration-enabled CI fail instead of reporting a clean skip.

🧪 Proposed fixture guard
 import asyncio
+import os
 
 import pytest
 from dotenv import load_dotenv
@@
     `@pytest.fixture`
     async def live_realtime(self):
+        if not os.getenv("INWORLD_API_KEY"):
+            pytest.skip("INWORLD_API_KEY is required for Inworld Realtime integration tests")
         rt = inworld.Realtime()
         try:
             yield rt

Also applies to: 152-154

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/inworld/tests/test_realtime.py` around lines 1 - 4, Add a guard to
skip these live integration tests when the INWORLD_API_KEY is not set: import os
and pytest at top of plugins/inworld/tests/test_realtime.py, then in the
session/test fixture used for realtime integration (the fixture defined around
lines ~152-154 in this file) check if os.environ.get("INWORLD_API_KEY") is falsy
and call pytest.skip("INWORLD_API_KEY not set - skipping live integration
tests") from the fixture setup path so tests are cleanly skipped instead of
raising during setup.

Comment thread plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py Outdated
tools_for_inworld = convert_tools_to_openai_format(
available_tools, for_realtime=True
)
self.realtime_session["tools"] = tools_for_inworld # type: ignore[typeddict-item]
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major

Drop the # type: ignore; use a properly typed assignment.

Repo guidelines explicitly say to avoid # type: ignore. RealtimeSessionCreateRequestParam is a TypedDict with a tools key — cast the list to the expected item type rather than silencing the checker. If convert_tools_to_openai_format returns a concrete schema type, annotate its return accordingly so the assignment type-checks without a suppression.

As per coding guidelines: "Avoid # type: ignore comments."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py` at line
138, Replace the type ignore by ensuring tools_for_inworld is typed to match the
RealtimeSessionCreateRequestParam TypedDict's "tools" field: update the
convert_tools_to_openai_format signature to return the concrete list item type
expected by RealtimeSessionCreateRequestParam (or explicitly wrap
tools_for_inworld with typing.cast to that list type) and then assign it to
self.realtime_session["tools"] without a "# type: ignore"; reference the
convert_tools_to_openai_format function and the
RealtimeSessionCreateRequestParam TypedDict so the assignment
self.realtime_session["tools"] = tools_for_inworld type-checks correctly.

Comment thread plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py
Comment thread plugins/inworld/vision_agents/plugins/inworld/rtc_manager.py Outdated
Comment thread plugins/inworld/vision_agents/plugins/inworld/rtc_manager.py Outdated
Comment thread plugins/inworld/vision_agents/plugins/inworld/tool_utils.py Outdated
Comment on lines +56 to +65
def parse_tool_arguments(args: str | dict) -> dict:
"""Parse tool arguments from string or dict."""
if isinstance(args, dict):
return args
if not args:
return {}
try:
return json.loads(args)
except json.JSONDecodeError:
return {}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Reject malformed or non-object tool arguments instead of defaulting to {}.

Returning {} for invalid JSON can execute a tool with omitted arguments, and json.loads() may return a list/string while the function promises a dict.

🛠️ Proposed fix
-def parse_tool_arguments(args: str | dict) -> dict:
+def parse_tool_arguments(args: str | dict[str, object]) -> dict[str, object]:
     """Parse tool arguments from string or dict."""
     if isinstance(args, dict):
         return args
     if not args:
         return {}
     try:
-        return json.loads(args)
-    except json.JSONDecodeError:
-        return {}
+        parsed = json.loads(args)
+    except json.JSONDecodeError as exc:
+        raise ValueError("Tool arguments must be valid JSON.") from exc
+    if not isinstance(parsed, dict):
+        raise ValueError("Tool arguments must decode to a JSON object.")
+    return parsed
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/inworld/vision_agents/plugins/inworld/tool_utils.py` around lines 56
- 65, parse_tool_arguments currently swallows malformed JSON and non-object
results by returning {} which can cause tools to run with missing or incorrect
arguments; change it so that when args is a dict it is returned, when args is
falsy keep returning {} (if you want to preserve that behavior), but if args is
a string then attempt json.loads and if json.loads raises json.JSONDecodeError
or returns a value that is not a dict raise a ValueError (or a specific
exception) describing "tool arguments must be a JSON object"; update
parse_tool_arguments to validate the parsed value is a dict and raise on errors
instead of returning {}.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

♻️ Duplicate comments (1)
plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py (1)

109-122: ⚠️ Potential issue | 🟡 Minor

Preserve caller-provided realtime_session model and voice.

This still overwrites advanced session config with default kwargs, so a caller-provided model or audio.output.voice is silently lost unless they also pass matching constructor args. Consider using None/sentinel defaults and only applying kwargs when explicitly provided.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py` around
lines 109 - 122, The constructor currently forces caller-provided session fields
by unconditionally setting self.realtime_session["model"] = model and
self.realtime_session["audio"]["output"]["voice"] = voice; change this to only
apply the provided model and voice when those kwargs are not None (or use a
sentinel) and only if the keys are missing on the caller-supplied
realtime_session. Specifically, when initializing self.realtime_session (and
when creating defaults like
RealtimeAudioConfigParam/RealtimeAudioConfigOutputParam), check for the presence
of "model" and the nested "audio"->"output"->"voice" before assigning; if model
or voice parameters are None, preserve whatever the caller passed, and only set
defaults when keys are absent. Ensure checks use the existing
self.realtime_session dict/object access patterns so you don't clobber nested
caller-provided config.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@plugins/inworld/example/inworld_realtime_tools_example.py`:
- Around line 51-52: get_time currently returns a naive ISO timestamp; make it
timezone-aware by calling datetime.datetime.now with an explicit tz (e.g.,
datetime.timezone.utc or another desired tz) so get_time() returns an ISO string
that includes the UTC offset (use datetime.datetime.now(tz=...) and then
.isoformat(timespec="seconds")).

In `@plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py`:
- Around line 149-168: The code logs when the Inworld plan disallows tool
calling but fails to remove previously-set tools from the session; update the
else branch in the block using
get_available_functions/convert_tools_to_openai_format and
_plan_supports_tool_calling to explicitly drop any existing tools from
self.realtime_session (e.g., pop/"del" the "tools" key) before returning so
stale tool config is not sent in session.update, and keep the warning log
intact.
- Around line 363-369: The current logging in the tool execution path logs raw
arguments and full error details; change it so that logger.info only records the
tool name (from name) and does not include arguments or payloads, and change the
logger.error call in the _run_one_tool handling to record a generic failure
message (e.g., "Tool call %s failed") without printing the raw error details; if
you need to preserve the full arguments or exception for troubleshooting, emit
them at debug level (logger.debug) or write them to a secure audit sink instead,
and keep response_data as {"error": "tool call failed"} or similarly generic to
avoid leaking PII/secrets.
- Around line 470-478: The update_tools() path currently sends tools without
applying the same preflight that connect() uses, which can re-add disallowed
tools and trigger the server-error state from _plan_supports_tool_calling();
modify update_tools() to run the same preflight/validation before converting and
sending tools (reuse _plan_supports_tool_calling() or the same validation logic
used in connect()), and if the preflight fails set the session to the
server-error state consistently (the same behavior as in connect()), then only
call convert_tools_to_openai_format(...) and await self.rtc.send_event(...) when
the plan check passes.

---

Duplicate comments:
In `@plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py`:
- Around line 109-122: The constructor currently forces caller-provided session
fields by unconditionally setting self.realtime_session["model"] = model and
self.realtime_session["audio"]["output"]["voice"] = voice; change this to only
apply the provided model and voice when those kwargs are not None (or use a
sentinel) and only if the keys are missing on the caller-supplied
realtime_session. Specifically, when initializing self.realtime_session (and
when creating defaults like
RealtimeAudioConfigParam/RealtimeAudioConfigOutputParam), check for the presence
of "model" and the nested "audio"->"output"->"voice" before assigning; if model
or voice parameters are None, preserve whatever the caller passed, and only set
defaults when keys are absent. Ensure checks use the existing
self.realtime_session dict/object access patterns so you don't clobber nested
caller-provided config.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: e669cbd3-6c04-494e-8524-698bd3e5cb73

📥 Commits

Reviewing files that changed from the base of the PR and between d42064a and 4e04bce.

📒 Files selected for processing (3)
  • plugins/inworld/example/inworld_realtime_example.py
  • plugins/inworld/example/inworld_realtime_tools_example.py
  • plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py

Comment thread plugins/inworld/example/inworld_realtime_tools_example.py Outdated
Comment thread plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py
Comment on lines +363 to +369
logger.info("Executing tool call: %s with args: %s", name, arguments)

tc, result, error = await self._run_one_tool(tool_call, timeout_s=30)

if error:
response_data: dict[str, Any] = {"error": str(error)}
logger.error("Tool call %s failed: %s", name, error)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Avoid logging raw tool arguments and error details at info/error level.

Tool arguments can contain user PII or secrets. Log the tool name and keep payload details out of default logs.

🛡️ Proposed fix
-        logger.info("Executing tool call: %s with args: %s", name, arguments)
+        logger.info("Executing tool call: %s", name)
 
         tc, result, error = await self._run_one_tool(tool_call, timeout_s=30)
 
         if error:
             response_data: dict[str, Any] = {"error": str(error)}
-            logger.error("Tool call %s failed: %s", name, error)
+            logger.error("Tool call %s failed", name)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py` around
lines 363 - 369, The current logging in the tool execution path logs raw
arguments and full error details; change it so that logger.info only records the
tool name (from name) and does not include arguments or payloads, and change the
logger.error call in the _run_one_tool handling to record a generic failure
message (e.g., "Tool call %s failed") without printing the raw error details; if
you need to preserve the full arguments or exception for troubleshooting, emit
them at debug level (logger.debug) or write them to a secure audit sink instead,
and keep response_data as {"error": "tool call failed"} or similarly generic to
avoid leaking PII/secrets.

Comment on lines +470 to +478
tools_for_inworld = convert_tools_to_openai_format(
available_tools, for_realtime=True
)
await self.rtc.send_event(
{
"type": "session.update",
"session": {"tools": tools_for_inworld},
}
)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Apply the tool-calling preflight in update_tools() too.

connect() may drop tools for restricted plans, but update_tools() can add them back later and put the session into the server-error state described in _plan_supports_tool_calling().

🛡️ Proposed fix
         tools_for_inworld = convert_tools_to_openai_format(
             available_tools, for_realtime=True
         )
+        if not await self._plan_supports_tool_calling():
+            self.realtime_session.pop("tools", None)
+            await self.rtc.send_event(
+                {
+                    "type": "session.update",
+                    "session": {"tools": []},
+                }
+            )
+            logger.warning(
+                "Inworld plan does not permit tool calling; not updating %d tools",
+                len(tools_for_inworld),
+            )
+            return
+
         await self.rtc.send_event(
             {
                 "type": "session.update",
                 "session": {"tools": tools_for_inworld},
             }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py` around
lines 470 - 478, The update_tools() path currently sends tools without applying
the same preflight that connect() uses, which can re-add disallowed tools and
trigger the server-error state from _plan_supports_tool_calling(); modify
update_tools() to run the same preflight/validation before converting and
sending tools (reuse _plan_supports_tool_calling() or the same validation logic
used in connect()), and if the preflight fails set the session to the
server-error state consistently (the same behavior as in connect()), then only
call convert_tools_to_openai_format(...) and await self.rtc.send_event(...) when
the plan check passes.

- Fix ruff format on inworld_realtime.py and tools example
- Remove top-level load_dotenv() side effect from module import
- Consolidate ignored event types into _IGNORED_EVENT_TYPES frozenset
- Preserve caller-provided model/voice in realtime_session (use setdefault)
- Drop stale tools from session when plan disallows tool calling
- Add preflight check to update_tools() matching connect() behavior
- Keep strong references to asyncio tasks in RTCManager (prevent GC)
- Await pc.close() in RTCManager.close() instead of fire-and-forget
- Shallow-copy tool params in convert_tools_to_openai_format to avoid mutation
- Modernize type annotations (remove Dict/List/Any, use builtins and unions)
- Reduce logging of raw arguments/error details in tool execution
- Add agent_user to README example snippet
- Add PR reference (#502) to CHANGELOG
- Make get_time timezone-aware in tools example

Made-with: Cursor
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py (1)

153-160: 🛠️ Refactor suggestion | 🟠 Major

# type: ignore suppressions still present — replace with a typed cast.

Both self.realtime_session["tools"] = tools_for_inworld # type: ignore[typeddict-item] on line 153 and self.realtime_session.pop("tools", None) # type: ignore[misc] on line 160 still silence the type checker. As per coding guidelines: "Avoid # type: ignore comments." Tighten convert_tools_to_openai_format's return type (or typing.cast the assignment to the concrete RealtimeSessionCreateRequestParam["tools"] item type) so the assignment and pop type-check cleanly.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py` around
lines 153 - 160, The assignment and pop use "# type: ignore" to bypass typing;
instead update convert_tools_to_openai_format to return the exact
RealtimeSessionCreateRequestParam["tools"] type (or wrap tools_for_inworld with
typing.cast(RealtimeSessionCreateRequestParam["tools"], tools_for_inworld)) so
the line self.realtime_session["tools"] = tools_for_inworld type-checks without
ignores, and change the pop to use a typed deletion such as: cast the session to
the proper TypedDict before popping or delete the key with a typed helper so the
self.realtime_session.pop("tools", None) no longer requires "# type: ignore".
🧹 Nitpick comments (1)
plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py (1)

125-125: Type _pending_tool_calls with a TypedDict to avoid object-typed list mutation.

self._pending_tool_calls: dict[str, dict[str, object]] means pending["argument_parts"] is object, yet line 293 calls .append(delta) on it and line 350 does "".join(pending["argument_parts"]). This only works because object gets narrowed at runtime — static checkers will reject it, and any future attempt to drop the remaining # type: ignores will stall here too.

♻️ Proposed refactor
+from typing import TypedDict
+
+
+class _PendingToolCall(TypedDict):
+    call_id: str | None
+    name: str
+    argument_parts: list[str]
@@
-        self._pending_tool_calls: dict[str, dict[str, object]] = {}
+        self._pending_tool_calls: dict[str, _PendingToolCall] = {}

Then pending["argument_parts"] / .append(...) / "".join(...) all check without casts.

Also applies to: 278-300, 342-350

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py` at line
125, Define a TypedDict (e.g., PendingToolCall) describing the exact shape of
each pending call (including argument_parts: list[str], any other keys like
args/metadata with their proper types) and replace the current untyped map
annotation on self._pending_tool_calls in inworld_realtime.py with dict[str,
PendingToolCall]; update code paths that mutate pending["argument_parts"] (the
.append(delta) call) and later read it (the "".join(pending["argument_parts"])
call) to rely on the TypedDict so static checkers accept list operations without
casts or # type: ignore.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py`:
- Around line 103-110: The constructor currently assigns self.model = model and
self.voice = voice which can diverge from a caller-provided realtime_session;
change the assignment to derive the effective values from the session after you
call self.realtime_session.setdefault("model", model) (e.g. self.model =
self.realtime_session["model"]) and read the voice from the session's nested
audio.output (e.g. self.voice = self.realtime_session.get("audio",
{}).get("output", {}).get("voice", voice)); update any other places in the
constructor that set/assume model/voice from kwargs (the blocks around the
earlier assignments and the later uses that call _emit_connected_event and
_plan_supports_tool_calling) so they all use the effective session-derived
self.model/self.voice instead of the original kwarg defaults.
- Around line 395-448: The _plan_supports_tool_calling probe is called
repeatedly and can silently miss reworded error messages; cache its boolean
result on the instance (e.g., add self._plan_tool_support: bool | None = None in
__init__), return the cached value unless api key (self._api_key) or model
(self.model) changed (invalidate cache when those change), and on a 400 response
inspect resp.json() for structured fields like "error.code" or "error.type" in
addition to the message; if none of those indicate plan restriction, emit a
processLogger.warning (or self.logger.warning) including the full
resp.json()/resp.text so operators see unexpected 400s, and only fall back to
True when the checks explicitly indicate tools are allowed or when a
network/parse error occurs.

---

Duplicate comments:
In `@plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py`:
- Around line 153-160: The assignment and pop use "# type: ignore" to bypass
typing; instead update convert_tools_to_openai_format to return the exact
RealtimeSessionCreateRequestParam["tools"] type (or wrap tools_for_inworld with
typing.cast(RealtimeSessionCreateRequestParam["tools"], tools_for_inworld)) so
the line self.realtime_session["tools"] = tools_for_inworld type-checks without
ignores, and change the pop to use a typed deletion such as: cast the session to
the proper TypedDict before popping or delete the key with a typed helper so the
self.realtime_session.pop("tools", None) no longer requires "# type: ignore".

---

Nitpick comments:
In `@plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py`:
- Line 125: Define a TypedDict (e.g., PendingToolCall) describing the exact
shape of each pending call (including argument_parts: list[str], any other keys
like args/metadata with their proper types) and replace the current untyped map
annotation on self._pending_tool_calls in inworld_realtime.py with dict[str,
PendingToolCall]; update code paths that mutate pending["argument_parts"] (the
.append(delta) call) and later read it (the "".join(pending["argument_parts"])
call) to rely on the TypedDict so static checkers accept list operations without
casts or # type: ignore.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 69bc7fb5-ab05-4201-a932-7d4ca9494bb7

📥 Commits

Reviewing files that changed from the base of the PR and between 4e04bce and 569dd19.

📒 Files selected for processing (6)
  • CHANGELOG.md
  • plugins/inworld/README.md
  • plugins/inworld/example/inworld_realtime_tools_example.py
  • plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py
  • plugins/inworld/vision_agents/plugins/inworld/rtc_manager.py
  • plugins/inworld/vision_agents/plugins/inworld/tool_utils.py
✅ Files skipped from review due to trivial changes (3)
  • CHANGELOG.md
  • plugins/inworld/example/inworld_realtime_tools_example.py
  • plugins/inworld/vision_agents/plugins/inworld/rtc_manager.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • plugins/inworld/README.md
  • plugins/inworld/vision_agents/plugins/inworld/tool_utils.py

Comment thread plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py Outdated
Comment on lines +395 to +448
async def _plan_supports_tool_calling(self) -> bool:
"""Probe whether the account's plan permits tool calling.

Inworld's Realtime API silently returns ``server_error`` on every
response when a tool-restricted plan has tools in the session, which
leaves the agent unable to respond at all. The REST completions
endpoint rejects the same request up-front with a validation 400
carrying ``"Tool calling is currently restricted on your plan."`` —
we use that as a cheap preflight so we can strip tools and keep the
session functional when the plan disallows them.

Returns True on any unexpected response so we fall back to the old
behaviour rather than silently disabling tool calling.
"""
if self._force_tool_calling:
return True
try:
async with httpx.AsyncClient(timeout=10.0) as client:
resp = await client.post(
f"{INWORLD_API_BASE}/v1/chat/completions",
headers={
"Authorization": f"Bearer {self._api_key}",
"Content-Type": "application/json",
},
json={
"model": self.model,
"messages": [{"role": "user", "content": "hi"}],
"tools": [
{
"type": "function",
"function": {
"name": "_preflight",
"description": "preflight",
"parameters": {
"type": "object",
"properties": {},
},
},
}
],
"max_tokens": 1,
},
)
except httpx.RequestError:
return True

if resp.status_code != 400:
return True
try:
message = resp.json().get("error", {}).get("message", "")
except ValueError:
message = resp.text
msg_lower = message.lower()
return not ("tool calling" in msg_lower and "restricted" in msg_lower)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

Does the Inworld Realtime API expose an error code or type field (not just a human-readable message) when a request is rejected because the account's plan restricts tool calling?

💡 Result:

Yes, the Inworld Realtime API exposes both an error type field (error category, e.g., "server_error") and an error code field (specific error code) in the error event when a request is rejected. These are machine-readable fields separate from the human-readable message. Although no source explicitly documents a plan restriction error for tool calling, the general error schema includes these fields for all errors, including those from account plan limits (e.g., rate limits, concurrency), as shown in official API reference documentation. Tool calling is configured via session.tools and tool_choice, and plan limits apply to Realtime API usage including concurrent sessions.

Citations:


🏁 Script executed:

# First, locate the file mentioned in the review
find . -name "inworld_realtime.py" -type f

Repository: GetStream/Vision-Agents

Length of output: 134


🏁 Script executed:

# Examine the file around lines 395-448 to see the current implementation
wc -l plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py

Repository: GetStream/Vision-Agents

Length of output: 136


🏁 Script executed:

# Look for usages of _plan_supports_tool_calling to see if it's called repeatedly
rg "_plan_supports_tool_calling" -A 2 -B 2

Repository: GetStream/Vision-Agents

Length of output: 1687


🏁 Script executed:

# Check the connect() and update_tools() methods to confirm the concern
rg "def (connect|update_tools)" -A 10

Repository: GetStream/Vision-Agents

Length of output: 27312


🏁 Script executed:

# Look at __init__ to see current instance attributes
rg "def __init__" plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py -A 30

Repository: GetStream/Vision-Agents

Length of output: 1250


🏁 Script executed:

# Look at the connect() method in detail
sed -n '348,450p' plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py

Repository: GetStream/Vision-Agents

Length of output: 4152


🏁 Script executed:

# Look at the update_tools() method
sed -n '450,483p' plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py

Repository: GetStream/Vision-Agents

Length of output: 1354


🏁 Script executed:

# Search for any existing _plan_tool_support variable
rg "_plan_tool_support"

Repository: GetStream/Vision-Agents

Length of output: 49


🏁 Script executed:

# Find the connect() method in inworld_realtime.py
sed -n '265,350p' plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py

Repository: GetStream/Vision-Agents

Length of output: 4149


🏁 Script executed:

# Check where _plan_supports_tool_calling is called - look for the line numbers
grep -n "_plan_supports_tool_calling" plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py

Repository: GetStream/Vision-Agents

Length of output: 249


🏁 Script executed:

# Look at the connect() method around line 152
sed -n '120,200p' plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py

Repository: GetStream/Vision-Agents

Length of output: 3459


Cache preflight probe and log unmatched error responses to avoid silent tool-calling failures.

Two issues in _plan_supports_tool_calling (called on every connect() and update_tools()):

  1. Redundant HTTP requests. Each call fires a live /v1/chat/completions preflight request, adding roundtrip latency and token usage to session startup. Cache the result on the instance, invalidating only when api_key or model changes.

  2. Silent failure on error message rephrasing. Lines 447–448 match the tool-restriction response by checking for both "tool calling" and "restricted" substrings. If Inworld rewords the message, the check fails silently and the session falls into the server_error trap the preflight was meant to avoid. Log unmatched 400 responses at warning level so operators see the regression. Better still, also check the error code or type field (confirmed available in Inworld's error schema) rather than relying on substring matching alone.

Suggested changes
+        if self._plan_tool_support is not None:
+            return self._plan_tool_support
+
         if resp.status_code != 400:
-            return True
+            self._plan_tool_support = True
+            return True
         try:
             message = resp.json().get("error", {}).get("message", "")
         except ValueError:
             message = resp.text
         msg_lower = message.lower()
-        return not ("tool calling" in msg_lower and "restricted" in msg_lower)
+        restricted = "tool calling" in msg_lower and "restricted" in msg_lower
+        if not restricted:
+            logger.warning(
+                "Inworld preflight returned 400 but did not match the known "
+                "tool-restriction message; assuming tools are allowed. body=%s",
+                message,
+            )
+        self._plan_tool_support = not restricted
+        return self._plan_tool_support

Initialize self._plan_tool_support: bool | None = None in __init__.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py` around
lines 395 - 448, The _plan_supports_tool_calling probe is called repeatedly and
can silently miss reworded error messages; cache its boolean result on the
instance (e.g., add self._plan_tool_support: bool | None = None in __init__),
return the cached value unless api key (self._api_key) or model (self.model)
changed (invalidate cache when those change), and on a 400 response inspect
resp.json() for structured fields like "error.code" or "error.type" in addition
to the message; if none of those indicate plan restriction, emit a
processLogger.warning (or self.logger.warning) including the full
resp.json()/resp.text so operators see unexpected 400s, and only fall back to
True when the checks explicitly indicate tools are allowed or when a
network/parse error occurs.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (1)
plugins/inworld/vision_agents/plugins/inworld/rtc_manager.py (1)

82-86: AudioForwarder instance is not retained anywhere.

The forwarder is created as a local inside on_track and start() spawns its task internally; nothing in RTCManager holds a reference or calls stop() in close(). Today it self-terminates when the track ends, but that is a load-bearing implicit contract — any future change to AudioForwarder.start() that depends on an explicit stop() (or any exception in its internal task) will silently become un-observable here. Store the forwarder on self and stop it in close().

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/inworld/vision_agents/plugins/inworld/rtc_manager.py` around lines 82
- 86, The AudioForwarder created in the on_track handler is only a local
variable and not retained, so create and store it on the RTCManager instance
(e.g., self._audio_forwarders as a list or dict) when you instantiate it in
on_track (where AudioForwarder(track, self._audio_callback) and await
audio_forwarder.start() are called), ensure you remove it when the
forwarder/track ends, and update the close() method to iterate stored forwarders
and call their stop() (or await stop() if async) to cleanly shut them down;
reference the on_track handler, AudioForwarder.start(), and RTCManager.close()
when making these changes.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py`:
- Around line 198-206: The connect() flow currently swallows a session.updated
timeout (in the block using self.rtc.send_event and await
asyncio.wait_for(self._session_updated.wait(), timeout=10.0)) and only logs a
warning; change this to surface the failure: when asyncio.TimeoutError occurs,
emit a RealtimeErrorEvent with context="session_update_timeout" (so subscribers
can react) and then raise InworldRealtimeError instead of continuing to fire
RealtimeConnectedEvent; ensure the timeout branch references _session_updated,
RealtimeErrorEvent, RealtimeConnectedEvent, and InworldRealtimeError so callers
get a clear failure rather than a silent warning.
- Around line 450-455: The code assumes resp.json().get("error",
{}).get("message", "") where "error" is always a dict; to fix, parse resp.json()
into a variable (resp_json), check if resp_json.get("error") is a dict before
calling .get("message") and otherwise treat error as a string (or use
resp_json.get("error") directly), falling back to resp.text on JSON decode
errors; update the logic that sets message (and then msg_lower) to handle both
dict and non-dict error payloads so the function (using resp, message,
msg_lower) no longer raises AttributeError for string errors.

In `@plugins/inworld/vision_agents/plugins/inworld/rtc_manager.py`:
- Around line 259-262: The except block around awaiting pc.close() currently
suppresses asyncio.CancelledError and breaks cooperative cancellation; update
the handler so that only ConnectionError is caught (or re-raise CancelledError)
instead of swallowing asyncio.CancelledError. Locate the await pc.close() call
in rtc_manager.py (in the close/shutdown logic managing the Inworld peer
connection) and change the except tuple to catch only ConnectionError (or
explicitly except asyncio.CancelledError and re-raise it) so cancellation
propagates to the caller.
- Around line 65-94: After creating the RTCPeerConnection in RTCManager.connect,
wrap all subsequent work (calls to _setup_connection_logging, _add_data_channel,
addTrack, createOffer, setLocalDescription, _exchange_sdp, setRemoteDescription
and any started AudioForwarder) in a try/except (or try/finally) block so that
if any of those calls fail you await self.pc.close() and clear self.pc before
re-raising the exception; ensure you await self.pc.close() (not just call close)
and set self.pc = None (or a cleared state) so the leaked ICE/DTLS state machine
is terminated when _add_data_channel, createOffer, _exchange_sdp or
setRemoteDescription raise.

---

Nitpick comments:
In `@plugins/inworld/vision_agents/plugins/inworld/rtc_manager.py`:
- Around line 82-86: The AudioForwarder created in the on_track handler is only
a local variable and not retained, so create and store it on the RTCManager
instance (e.g., self._audio_forwarders as a list or dict) when you instantiate
it in on_track (where AudioForwarder(track, self._audio_callback) and await
audio_forwarder.start() are called), ensure you remove it when the
forwarder/track ends, and update the close() method to iterate stored forwarders
and call their stop() (or await stop() if async) to cleanly shut them down;
reference the on_track handler, AudioForwarder.start(), and RTCManager.close()
when making these changes.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: bb549e7b-2daf-4792-85ad-9108f3c21a1a

📥 Commits

Reviewing files that changed from the base of the PR and between 569dd19 and 819cdad.

📒 Files selected for processing (2)
  • plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py
  • plugins/inworld/vision_agents/plugins/inworld/rtc_manager.py

Comment on lines +198 to +206
await self.rtc.send_event(
{"type": "session.update", "session": dict(self.realtime_session)}
)
try:
await asyncio.wait_for(self._session_updated.wait(), timeout=10.0)
except asyncio.TimeoutError:
logger.warning(
"Timed out waiting for Inworld session.updated; proceeding anyway"
)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Treat session.updated timeout as a real error, not a log warning.

The docstring on connect() (lines 141-152) explicitly states that the method "waits for the server's session.updated confirmation before returning so callers can trust the session is fully configured." The current implementation contradicts that: on timeout, it logs a warning and proceeds to emit RealtimeConnectedEvent anyway, so callers have no way to distinguish a healthy connect from a silently-degraded one where tools/voice/instructions never took effect. At minimum, emit a RealtimeErrorEvent with context="session_update_timeout" so subscribers can react; consider raising InworldRealtimeError to honour the contract the docstring advertises.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py` around
lines 198 - 206, The connect() flow currently swallows a session.updated timeout
(in the block using self.rtc.send_event and await
asyncio.wait_for(self._session_updated.wait(), timeout=10.0)) and only logs a
warning; change this to surface the failure: when asyncio.TimeoutError occurs,
emit a RealtimeErrorEvent with context="session_update_timeout" (so subscribers
can react) and then raise InworldRealtimeError instead of continuing to fire
RealtimeConnectedEvent; ensure the timeout branch references _session_updated,
RealtimeErrorEvent, RealtimeConnectedEvent, and InworldRealtimeError so callers
get a clear failure rather than a silent warning.

Comment thread plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py Outdated
Comment on lines +65 to +94
async def connect(self) -> None:
"""Establish the WebRTC connection to Inworld.

Pre-fetches TURN/STUN servers from Inworld so the peer connection can
be created with them upfront — required because Inworld's media relay
sits behind NAT (private IPs in ICE candidates) and cannot be reached
without the server-issued TURN credentials.
"""
ice_servers = await self._fetch_ice_servers()
self.pc = RTCPeerConnection(
configuration=RTCConfiguration(iceServers=ice_servers)
)
self._setup_connection_logging()

await self._add_data_channel()
self.pc.addTrack(self._audio_to_inworld_track)

@self.pc.on("track")
async def on_track(track):
if track.kind == "audio" and self._audio_callback:
audio_forwarder = AudioForwarder(track, self._audio_callback)
await audio_forwarder.start()

offer = await self.pc.createOffer()
await self.pc.setLocalDescription(offer)

answer_sdp = await self._exchange_sdp(offer.sdp)

answer = RTCSessionDescription(sdp=answer_sdp, type="answer")
await self.pc.setRemoteDescription(answer)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Clean up self.pc if connect() fails partway through.

Once RTCPeerConnection is instantiated on line 74, any failure in _add_data_channel, createOffer, _exchange_sdp, or setRemoteDescription leaves a live peer connection attached to self.pc. The caller in inworld_realtime.Realtime.connect() re-raises without invoking RTCManager.close(), so the pc (and its ICE/DTLS state machines) leaks until process exit. Wrap the post-instantiation body in try/except and close self.pc before re-raising.

🛡️ Sketch
         ice_servers = await self._fetch_ice_servers()
         self.pc = RTCPeerConnection(
             configuration=RTCConfiguration(iceServers=ice_servers)
         )
-        self._setup_connection_logging()
-
-        await self._add_data_channel()
-        self.pc.addTrack(self._audio_to_inworld_track)
-        ...
-        answer = RTCSessionDescription(sdp=answer_sdp, type="answer")
-        await self.pc.setRemoteDescription(answer)
+        try:
+            self._setup_connection_logging()
+            await self._add_data_channel()
+            self.pc.addTrack(self._audio_to_inworld_track)
+            ...
+            answer = RTCSessionDescription(sdp=answer_sdp, type="answer")
+            await self.pc.setRemoteDescription(answer)
+        except BaseException:
+            await self.close()
+            raise
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/inworld/vision_agents/plugins/inworld/rtc_manager.py` around lines 65
- 94, After creating the RTCPeerConnection in RTCManager.connect, wrap all
subsequent work (calls to _setup_connection_logging, _add_data_channel,
addTrack, createOffer, setLocalDescription, _exchange_sdp, setRemoteDescription
and any started AudioForwarder) in a try/except (or try/finally) block so that
if any of those calls fail you await self.pc.close() and clear self.pc before
re-raising the exception; ensure you await self.pc.close() (not just call close)
and set self.pc = None (or a cleared state) so the leaked ICE/DTLS state machine
is terminated when _add_data_channel, createOffer, _exchange_sdp or
setRemoteDescription raise.

Comment thread plugins/inworld/vision_agents/plugins/inworld/rtc_manager.py
- Derive self.model/self.voice from realtime_session after setdefault
- Guard against non-dict error payloads in preflight 400 response
- Log warning for unmatched 400 bodies in preflight probe
- Clean up peer connection on partial connect() failure
- Don't swallow asyncio.CancelledError in RTCManager.close()

Made-with: Cursor
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (4)
plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py (4)

489-493: ⚠️ Potential issue | 🟠 Major

Clear local and remote tools when the plan check fails in update_tools().

This branch skips the update but leaves any existing local/session tools intact. If tools were previously configured, the server can continue failing responses.

🛠️ Proposed fix
         if not await self._plan_supports_tool_calling():
+            self.realtime_session.pop("tools", None)
+            await self.rtc.send_event(
+                {
+                    "type": "session.update",
+                    "session": {"tools": []},
+                }
+            )
             logger.warning(
                 "Inworld plan does not permit tool calling; skipping tool update"
             )
             return
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py` around
lines 489 - 493, In update_tools(), when the early-return branch triggers
because await self._plan_supports_tool_calling() is False, clear any existing
configured tools both locally and remotely before returning; call the methods or
operations that remove session/local tools (e.g., reset or clear the internal
tools store on the agent instance) and send the corresponding remote
unregistration/clear request so that previous tool state is not retained when
the plan disallows tool calling. Ensure you reference update_tools and
_plan_supports_tool_calling to locate the branch and invoke the existing
tool-clearing mechanisms (or add clear_tools/clear_remote_tools helper calls)
prior to the return.

202-208: ⚠️ Potential issue | 🟡 Minor

Surface session.updated timeout as a connection failure.

The docstring says connect() waits until the session is configured, but this timeout path still emits RealtimeConnectedEvent. Callers cannot distinguish a configured session from one where model/voice/tools were not applied.

🛠️ Proposed fix
         try:
             await asyncio.wait_for(self._session_updated.wait(), timeout=10.0)
         except asyncio.TimeoutError:
-            logger.warning(
-                "Timed out waiting for Inworld session.updated; proceeding anyway"
+            exc = InworldRealtimeError(
+                "Timed out waiting for Inworld session.updated"
             )
+            self._emit_error_event(exc, context="session_update_timeout")
+            raise exc
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py` around
lines 202 - 208, The connect() path currently swallows asyncio.TimeoutError from
await self._session_updated.wait() and only logs a warning, which causes callers
to still receive RealtimeConnectedEvent even though session configuration may
have failed; update the connect() implementation so that when
asyncio.wait_for(self._session_updated.wait(), timeout=10.0) times out you treat
it as a connection/configuration failure: either raise a clear exception (e.g.,
ConnectionError) from connect() or emit a RealtimeConnectionFailedEvent and
return/stop further success-path processing so RealtimeConnectedEvent is not
emitted; modify the timeout except block in connect() (the code referencing
self._session_updated.wait()) to perform that failure signaling and ensure any
downstream code that emits RealtimeConnectedEvent is not executed after the
timeout.

371-373: ⚠️ Potential issue | 🟠 Major

Do not send raw tool exception text back to the model.

str(error) can contain secrets, PII, stack fragments, or backend details from tools. Return a generic failure payload and keep detailed diagnostics out of the model-visible channel.

🛡️ Proposed fix
         tc, result, error = await self._run_one_tool(tool_call, timeout_s=30)
 
         if error:
-            response_data: dict[str, object] = {"error": str(error)}
+            response_data: dict[str, object] = {"error": "tool call failed"}
             logger.error("Tool call %s failed", name)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py` around
lines 371 - 373, The current block constructs response_data with response_data =
{"error": str(error)} which exposes raw exception text to the model; change this
to return a generic, non-sensitive payload (e.g., {"error": "tool_call_failed"}
or {"error": "Tool call failed"} and optionally an opaque error_id) and ensure
the detailed exception is only written to server logs via logger.error("Tool
call %s failed: %s", name, error) or logger.exception(...) so diagnostics remain
out of the model-visible response; update the code paths that consume
response_data so they expect the generic message/optional error_id instead of
raw exception text.

155-175: ⚠️ Potential issue | 🟠 Major

Run the tool-plan preflight for preconfigured session tools too.

If a caller passes realtime_session with existing "tools" but no functions are registered locally, this block skips the preflight and still sends those tools in session.update. That re-enters the restricted-plan failure path. Track whether tools are effectively enabled and use that when emitting capabilities.

🛡️ Sketch
         self._session_updated.clear()
         available_tools = self.get_available_functions()
-        if available_tools:
-            tools_for_inworld = convert_tools_to_openai_format(
-                available_tools, for_realtime=True
-            )
+        configured_tools = self.realtime_session.get("tools")
+        function_calling_enabled = False
+        if available_tools or configured_tools:
+            tools_for_inworld = (
+                convert_tools_to_openai_format(available_tools, for_realtime=True)
+                if available_tools
+                else configured_tools
+            )
             if await self._plan_supports_tool_calling():
-                self.realtime_session["tools"] = tools_for_inworld  # type: ignore[typeddict-item]
+                self.realtime_session["tools"] = tools_for_inworld
+                function_calling_enabled = True
                 logger.info(
                     "Added %d tools to Inworld session config: %s",
                     len(tools_for_inworld),
@@
         self._emit_connected_event(
             session_config={"model": self.model, "voice": self.voice},
-            capabilities=["text", "audio", "function_calling"],
+            capabilities=[
+                "text",
+                "audio",
+                *(["function_calling"] if function_calling_enabled else []),
+            ],
         )

Also applies to: 199-212

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py` around
lines 155 - 175, The current logic skips the plan preflight when there are no
locally registered functions but the incoming realtime_session already contains
"tools", causing restricted-plan failures later; change the block around
get_available_functions/convert_tools_to_openai_format/_plan_supports_tool_calling
so you always run the plan preflight for any candidate tools (either converted
from available_tools or the preconfigured self.realtime_session.get("tools")),
e.g., compute tools_candidate = converted_tools if available_tools else
self.realtime_session.get("tools"), call await
self._plan_supports_tool_calling() for that candidate, and then set or pop
self.realtime_session["tools"] and emit the same logger.info/logger.warning
based on the preflight result so capabilities reflect whether tools are actually
enabled (refer to get_available_functions, convert_tools_to_openai_format,
_plan_supports_tool_calling, and self.realtime_session["tools"]).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@plugins/inworld/vision_agents/plugins/inworld/rtc_manager.py`:
- Around line 83-87: The on_track handler creates AudioForwarder instances as
local variables and spawns background tasks that are never awaited or inspected,
so modify the RTC manager to track and properly shut down these workers: store
each AudioForwarder (or any created background Task) on the instance (e.g., add
to self._audio_forwarders or push created asyncio.Tasks into
self._pending_tasks) when you create them in the on_track callback and ensure
tasks remove themselves from that collection on completion; update close() to
iterate over tracked AudioForwarder instances and call their stop/shutdown
method (or cancel and await tasks in self._pending_tasks), await their
termination, and gather exceptions (use asyncio.gather(...,
return_exceptions=True)) so no task exceptions remain unobserved; also apply the
same tracking and shutdown pattern to the other locations mentioned (the other
AudioForwarder creations and any event callbacks referenced around the same
areas).
- Around line 203-208: The data-channel handler currently json.loads(message)
and immediately dispatches via asyncio.create_task(self._handle_event(data));
add a guard to reject non-object payloads by checking isinstance(data, dict) (or
type(data) is dict) after parsing, and if it is not a dict log a clear message
(e.g., logger.warning or logger.debug) and return without creating the task;
update the block around json.loads, logger.exception, and
asyncio.create_task(self._handle_event) in rtc_manager.py so only dicts are
forwarded to _handle_event.
- Around line 225-244: The connection-state callbacks in
_setup_connection_logging reference self.pc which can be cleared by close()
before final events fire; capture the PeerConnection into a local variable
(e.g., pc = self.pc) when registering the handlers and use that local pc inside
on_connectionstatechange and on_iceconnectionstatechange so the callbacks
reference the captured connection instance instead of self.pc (also keep the
existing asserts/guards if desired).

---

Duplicate comments:
In `@plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py`:
- Around line 489-493: In update_tools(), when the early-return branch triggers
because await self._plan_supports_tool_calling() is False, clear any existing
configured tools both locally and remotely before returning; call the methods or
operations that remove session/local tools (e.g., reset or clear the internal
tools store on the agent instance) and send the corresponding remote
unregistration/clear request so that previous tool state is not retained when
the plan disallows tool calling. Ensure you reference update_tools and
_plan_supports_tool_calling to locate the branch and invoke the existing
tool-clearing mechanisms (or add clear_tools/clear_remote_tools helper calls)
prior to the return.
- Around line 202-208: The connect() path currently swallows
asyncio.TimeoutError from await self._session_updated.wait() and only logs a
warning, which causes callers to still receive RealtimeConnectedEvent even
though session configuration may have failed; update the connect()
implementation so that when asyncio.wait_for(self._session_updated.wait(),
timeout=10.0) times out you treat it as a connection/configuration failure:
either raise a clear exception (e.g., ConnectionError) from connect() or emit a
RealtimeConnectionFailedEvent and return/stop further success-path processing so
RealtimeConnectedEvent is not emitted; modify the timeout except block in
connect() (the code referencing self._session_updated.wait()) to perform that
failure signaling and ensure any downstream code that emits
RealtimeConnectedEvent is not executed after the timeout.
- Around line 371-373: The current block constructs response_data with
response_data = {"error": str(error)} which exposes raw exception text to the
model; change this to return a generic, non-sensitive payload (e.g., {"error":
"tool_call_failed"} or {"error": "Tool call failed"} and optionally an opaque
error_id) and ensure the detailed exception is only written to server logs via
logger.error("Tool call %s failed: %s", name, error) or logger.exception(...) so
diagnostics remain out of the model-visible response; update the code paths that
consume response_data so they expect the generic message/optional error_id
instead of raw exception text.
- Around line 155-175: The current logic skips the plan preflight when there are
no locally registered functions but the incoming realtime_session already
contains "tools", causing restricted-plan failures later; change the block
around
get_available_functions/convert_tools_to_openai_format/_plan_supports_tool_calling
so you always run the plan preflight for any candidate tools (either converted
from available_tools or the preconfigured self.realtime_session.get("tools")),
e.g., compute tools_candidate = converted_tools if available_tools else
self.realtime_session.get("tools"), call await
self._plan_supports_tool_calling() for that candidate, and then set or pop
self.realtime_session["tools"] and emit the same logger.info/logger.warning
based on the preflight result so capabilities reflect whether tools are actually
enabled (refer to get_available_functions, convert_tools_to_openai_format,
_plan_supports_tool_calling, and self.realtime_session["tools"]).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: c7c06809-8ca4-4bd5-806c-1575f570ba92

📥 Commits

Reviewing files that changed from the base of the PR and between 819cdad and 6eafeae.

📒 Files selected for processing (2)
  • plugins/inworld/vision_agents/plugins/inworld/inworld_realtime.py
  • plugins/inworld/vision_agents/plugins/inworld/rtc_manager.py

Comment on lines +83 to +87
@self.pc.on("track")
async def on_track(track):
if track.kind == "audio" and self._audio_callback:
audio_forwarder = AudioForwarder(track, self._audio_callback)
await audio_forwarder.start()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Close all background work started by the RTC manager.

AudioForwarder instances are only local variables, and _pending_tasks are retained but never awaited or inspected during shutdown. close() can return while audio/event callbacks are still running, and task exceptions remain unobserved.

🛠️ Proposed lifecycle fix
         self._event_callback: Callable[[dict], Awaitable[None]] | None = None
         self._data_channel_open_event: asyncio.Event = asyncio.Event()
         self._pending_tasks: set[asyncio.Task[None]] = set()
+        self._audio_forwarders: set[AudioForwarder] = set()
@@
             async def on_track(track):
                 if track.kind == "audio" and self._audio_callback:
                     audio_forwarder = AudioForwarder(track, self._audio_callback)
+                    self._audio_forwarders.add(audio_forwarder)
                     await audio_forwarder.start()
@@
             task = asyncio.create_task(self._handle_event(data))
             self._pending_tasks.add(task)
-            task.add_done_callback(self._pending_tasks.discard)
+            task.add_done_callback(self._on_pending_task_done)
+
+    def _on_pending_task_done(self, task: asyncio.Task[None]) -> None:
+        self._pending_tasks.discard(task)
+        if task.cancelled():
+            return
+        exc = task.exception()
+        if exc is not None:
+            logger.error(
+                "Inworld event handler task failed",
+                exc_info=(type(exc), exc, exc.__traceback__),
+            )
@@
     async def close(self) -> None:
         if self.data_channel is not None:
             self.data_channel.close()
             self.data_channel = None
         self._audio_to_inworld_track.stop()
+
+        for audio_forwarder in tuple(self._audio_forwarders):
+            await audio_forwarder.stop()
+        self._audio_forwarders.clear()
+
+        if self._pending_tasks:
+            await asyncio.gather(*self._pending_tasks, return_exceptions=True)
+            self._pending_tasks.clear()
 
         pc = self.pc

Also applies to: 208-210, 252-266

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/inworld/vision_agents/plugins/inworld/rtc_manager.py` around lines 83
- 87, The on_track handler creates AudioForwarder instances as local variables
and spawns background tasks that are never awaited or inspected, so modify the
RTC manager to track and properly shut down these workers: store each
AudioForwarder (or any created background Task) on the instance (e.g., add to
self._audio_forwarders or push created asyncio.Tasks into self._pending_tasks)
when you create them in the on_track callback and ensure tasks remove themselves
from that collection on completion; update close() to iterate over tracked
AudioForwarder instances and call their stop/shutdown method (or cancel and
await tasks in self._pending_tasks), await their termination, and gather
exceptions (use asyncio.gather(..., return_exceptions=True)) so no task
exceptions remain unobserved; also apply the same tracking and shutdown pattern
to the other locations mentioned (the other AudioForwarder creations and any
event callbacks referenced around the same areas).

Comment on lines +203 to +208
try:
data = json.loads(message)
except json.JSONDecodeError:
logger.exception("Failed to decode data-channel message")
return
task = asyncio.create_task(self._handle_event(data))
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Reject non-object data-channel payloads before dispatch.

json.loads() can return a list/string/number, but downstream handlers call event.get(...). A malformed server message would raise in the background dispatcher.

🛡️ Proposed guard
             try:
                 data = json.loads(message)
             except json.JSONDecodeError:
                 logger.exception("Failed to decode data-channel message")
                 return
+            if not isinstance(data, dict):
+                logger.warning("Ignoring non-object Inworld data-channel message")
+                return
             task = asyncio.create_task(self._handle_event(data))
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/inworld/vision_agents/plugins/inworld/rtc_manager.py` around lines
203 - 208, The data-channel handler currently json.loads(message) and
immediately dispatches via asyncio.create_task(self._handle_event(data)); add a
guard to reject non-object payloads by checking isinstance(data, dict) (or
type(data) is dict) after parsing, and if it is not a dict log a clear message
(e.g., logger.warning or logger.debug) and return without creating the task;
update the block around json.loads, logger.exception, and
asyncio.create_task(self._handle_event) in rtc_manager.py so only dicts are
forwarded to _handle_event.

Comment on lines +225 to +244
def _setup_connection_logging(self) -> None:
assert self.pc is not None

@self.pc.on("connectionstatechange")
async def on_connectionstatechange():
assert self.pc is not None
state = self.pc.connectionState
if state == "failed":
logger.error("Inworld RTC connection failed")
elif state == "disconnected":
logger.warning("Inworld RTC connection disconnected")
elif state == "connected":
logger.info("Inworld RTC connection established")
elif state == "closed":
logger.info("Inworld RTC connection closed")

@self.pc.on("iceconnectionstatechange")
async def on_iceconnectionstatechange():
assert self.pc is not None
state = self.pc.iceConnectionState
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Capture pc in connection-state callbacks.

These callbacks dereference self.pc, but close() clears self.pc before pc.close() can emit final state changes. Capture the peer connection when registering handlers.

🛠️ Proposed fix
     def _setup_connection_logging(self) -> None:
         assert self.pc is not None
+        pc = self.pc
 
-        `@self.pc.on`("connectionstatechange")
+        `@pc.on`("connectionstatechange")
         async def on_connectionstatechange():
-            assert self.pc is not None
-            state = self.pc.connectionState
+            state = pc.connectionState
             if state == "failed":
                 logger.error("Inworld RTC connection failed")
@@
-        `@self.pc.on`("iceconnectionstatechange")
+        `@pc.on`("iceconnectionstatechange")
         async def on_iceconnectionstatechange():
-            assert self.pc is not None
-            state = self.pc.iceConnectionState
+            state = pc.iceConnectionState
             if state == "failed":
                 logger.error("Inworld ICE connection failed")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/inworld/vision_agents/plugins/inworld/rtc_manager.py` around lines
225 - 244, The connection-state callbacks in _setup_connection_logging reference
self.pc which can be cleared by close() before final events fire; capture the
PeerConnection into a local variable (e.g., pc = self.pc) when registering the
handlers and use that local pc inside on_connectionstatechange and
on_iceconnectionstatechange so the callbacks reference the captured connection
instance instead of self.pc (also keep the existing asserts/guards if desired).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants