Fix SIP example runner #286

Merged
Nash0x7E2 merged 4 commits into main from chore/fix-phone-ex
Jan 13, 2026

Conversation

Nash0x7E2 (Member) commented Jan 12, 2026

Note

Modernizes telephony examples and refreshes docs while upgrading dependencies.

  • Refactors inbound_phone_and_rag_example.py and outbound_phone_example.py to return (agent, phone_user, stream_call) from prepare, and use async with agent.join(..., participant_wait_timeout=0) instead of managing a separate agent_session
  • README: fixes features table formatting, adds Phone and RAG feature and a new demo row; updates GeoGuesser link
  • Dependency updates in uv.lock: bumps core libs (e.g., aiohttp, fastapi, openai, uvicorn, urllib3, etc.), adjusts getstream[webrtc] optionals (adds av, removes torch/torchaudio), pins deepgram-sdk range, and expands extras (adds nvidia, turbopuffer, twilio)
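
The refactor described in the first bullet can be sketched roughly as follows. This is a minimal, self-contained illustration: `FakeAgent`, `prepare_call`, and the placeholder strings are stand-ins invented for this sketch and are not the Vision Agents API. Only the overall shape is taken from the PR description: a three-element return from the prepare step, followed by an `async with agent.join(..., participant_wait_timeout=0)` block.

```python
import asyncio
from contextlib import asynccontextmanager


class FakeAgent:
    """Stand-in for the real agent; records lifecycle events for illustration."""

    def __init__(self) -> None:
        self.events: list[str] = []

    @asynccontextmanager
    async def join(self, call: str, participant_wait_timeout: float = 0):
        # Hypothetical join: enter the call, yield control, always leave on exit.
        self.events.append(f"joined {call} (wait={participant_wait_timeout})")
        try:
            yield self
        finally:
            self.events.append(f"left {call}")


async def prepare_call() -> tuple[FakeAgent, str, str]:
    """Build everything needed for the call without joining it yet."""
    agent = FakeAgent()
    phone_user = "phone-user-1"  # placeholder value
    stream_call = "call-123"  # placeholder value
    return agent, phone_user, stream_call


async def main() -> list[str]:
    agent, phone_user, stream_call = await prepare_call()
    # Joining happens here, in the caller, not inside prepare_call.
    async with agent.join(stream_call, participant_wait_timeout=0):
        agent.events.append(f"attached {phone_user}")
    return agent.events


events = asyncio.run(main())
print(events)
```

Because the join is scoped by the `async with` block, the caller no longer has to carry a separate session object around or remember to close it.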

Written by Cursor Bugbot for commit 3981dd8. This will update automatically on new commits.

Summary by CodeRabbit

  • Documentation

    • Added "Phone and RAG" feature to the README.
    • Reformatted and improved readability of feature tables with consistent alignment and spacing.
  • Refactor

    • Updated code examples to reflect changes in how agent participation is managed in phone and RAG scenarios.

coderabbitai bot commented Jan 12, 2026

Walkthrough

This PR introduces a new "Phone and RAG" feature to the documentation and refactors the phone example code to manage agent session joining through async context managers instead of pre-obtained session objects, simplifying the control flow.

Changes

| Cohort / File(s) | Summary |
|:-----|:-----|
| Documentation: README.md | Added new "Phone and RAG" feature row to Features table; reflowed and normalized Markdown table formatting for improved consistency and readability across Features and Demo Applications sections. |
| Phone Examples Refactoring: examples/03_phone_and_rag_example/inbound_phone_and_rag_example.py, examples/03_phone_and_rag_example/outbound_phone_example.py | Removed agent_session from prepare_call return tuple; replaced direct session object usage with async with agent.join(stream_call, participant_wait_timeout=0) context manager pattern. Updated parameter from wait_for_participant=False to participant_wait_timeout=0. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Poem

A session shed like snakeskin, cast aside,
Now context managers hold what once was bound—
Async hearts entwine in tighter coils,
Where joining blooms through manager's controlled hand,
And RAG whispers phone-ward through the dark.

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (2 warnings)

| Check name | Status | Explanation | Resolution |
|:-----|:-----|:-----|:-----|
| Title check | ⚠️ Warning | The PR title 'Fix SIP example runner' does not accurately reflect the primary changes. The PR primarily updates Twilio phone examples, adds documentation, and refactors session handling, but the title misleadingly references 'SIP' which is not mentioned in the actual changes. | Change the title to more accurately describe the main changes, such as 'Simplify Twilio phone example session handling and update docs' or 'Refactor phone examples and update README with Phone and RAG feature'. |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (1 passed)

| Check name | Status | Explanation |
|:-----|:-----|:-----|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
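
For readers unfamiliar with the docstring coverage metric, one plausible way to compute such a percentage is sketched below using only the standard library. This is an assumption for illustration, not CodeRabbit's actual method: it counts functions and classes (not modules), and `docstring_coverage` and `SAMPLE` are names made up for the sketch.

```python
import ast


def docstring_coverage(source: str) -> float:
    """Return the percentage of functions and classes in source with docstrings."""
    tree = ast.parse(source)
    nodes = [
        node
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
    ]
    if not nodes:
        # Nothing to document, so treat the source as fully covered.
        return 100.0
    documented = sum(1 for node in nodes if ast.get_docstring(node) is not None)
    return 100.0 * documented / len(nodes)


SAMPLE = '''
def documented():
    """Say hello."""
    return "hello"

async def undocumented():
    return "hi"
'''

print(f"{docstring_coverage(SAMPLE):.2f}%")
```

With one of the two sample functions documented, the sketch reports 50.00%, which would fall below an 80.00% threshold.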

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In @README.md:
- Around line 158-165: Update the GeoGuesser demo text to use a hyphenated
adjective for consistency: locate the GeoGuesser row in the Demo Applications
table (the cell containing "GeoGuesser" and the sentence "together with OpenAI
Realtime and Vision Agents, we can take GeoGuesser to the next level by asking
it to identify places in our real world surroundings") and change "real world
surroundings" to "real-world surroundings".
🧹 Nitpick comments (1)
examples/03_phone_and_rag_example/inbound_phone_and_rag_example.py (1)

137-164: Consider adding Google-style docstrings to nested functions.

The create_rag_from_directory function has a brief docstring, but it accesses private attributes (_indexed_files, _uploaded_files) for logging. While this is acceptable for example code, note that these private attributes may change without notice.
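
One common way to avoid depending on private attributes for logging is to expose a read-only public view. The sketch below shows the idea with a hypothetical `DirectoryRag` class; it is not the RAG class used in the example, and whether the real class offers a public accessor is not stated in this review.

```python
class DirectoryRag:
    """Hypothetical RAG index, used only to illustrate the point."""

    def __init__(self) -> None:
        self._indexed_files: list[str] = []

    def add(self, path: str) -> None:
        """Record a file path as indexed."""
        self._indexed_files.append(path)

    @property
    def indexed_files(self) -> tuple[str, ...]:
        """Read-only snapshot of the indexed file paths."""
        return tuple(self._indexed_files)


rag = DirectoryRag()
rag.add("docs/faq.md")
# Logging goes through the public property, not the private list.
print(f"Indexed {len(rag.indexed_files)} file(s)")
```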

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c433fe4 and 3981dd8.

⛔ Files ignored due to path filters (2)
  • assets/demo_gifs/va_phone.png is excluded by !**/*.png
  • examples/03_phone_and_rag_example/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (3)
  • README.md
  • examples/03_phone_and_rag_example/inbound_phone_and_rag_example.py
  • examples/03_phone_and_rag_example/outbound_phone_example.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (.cursor/rules/python.mdc)

**/*.py: Never adjust sys.path in Python code
Never write except Exception as e - use specific exception handling
Avoid using getattr, hasattr, delattr and setattr; prefer normal attribute access in Python
Docstrings should follow the Google style guide for docstrings

Files:

  • examples/03_phone_and_rag_example/outbound_phone_example.py
  • examples/03_phone_and_rag_example/inbound_phone_and_rag_example.py
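
As an illustration of two of the rules listed above (specific exception handling and Google-style docstrings), here is a small sketch. The function `load_port` and its JSON config format are invented for this example and do not come from the repository.

```python
import json


def load_port(raw: str, default: int = 8000) -> int:
    """Parse a port number from a JSON config string.

    Args:
        raw: JSON text expected to contain an integer "port" key.
        default: Value returned when the config is malformed or incomplete.

    Returns:
        The configured port, or ``default`` if it cannot be read.
    """
    try:
        config = json.loads(raw)
        return int(config["port"])
    except (json.JSONDecodeError, KeyError, TypeError, ValueError):
        # Only the failures this function knows how to handle are caught;
        # anything unexpected still propagates to the caller.
        return default


print(load_port('{"port": 9000}'))
print(load_port("not json"))
```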
🧬 Code graph analysis (2)
examples/03_phone_and_rag_example/outbound_phone_example.py (1)
plugins/getstream/vision_agents/plugins/getstream/stream_edge_transport.py (1)
  • join (328-383)
examples/03_phone_and_rag_example/inbound_phone_and_rag_example.py (1)
plugins/getstream/vision_agents/plugins/getstream/stream_edge_transport.py (1)
  • join (328-383)
🪛 LanguageTool
README.md

[uncategorized] ~163-~163: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ... by asking it to identify places in our real world surroundings.

• Real-world locat...

(EN_COMPOUND_ADJECTIVE_INTERNAL)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: unit / Validate extra dependencies in "agents-core/pyproject.toml"
  • GitHub Check: unit / Mypy
  • GitHub Check: unit / Test "not integration"
  • GitHub Check: unit / Ruff
  • GitHub Check: unit / Ruff
  • GitHub Check: unit / Validate extra dependencies in "agents-core/pyproject.toml"
  • GitHub Check: unit / Test "not integration"
  • GitHub Check: unit / Mypy
  • GitHub Check: Cursor Bugbot
🔇 Additional comments (5)
README.md (1)

91-102: Features table updated with Phone and RAG entry — LGTM!

The new feature row cleanly documents the Twilio and TurboPuffer integration. Table formatting is consistent.

examples/03_phone_and_rag_example/inbound_phone_and_rag_example.py (2)

98-98: Simplified return tuple — LGTM!

Removing agent_session from the return and deferring the join to the async context manager is a cleaner approach. The flow now correctly separates preparation from joining.


128-132: Async context manager for session management — clean refactor.

Using async with agent.join(...) properly manages the session lifecycle and ensures cleanup on exit. The participant_wait_timeout=0 parameter allows the agent to join immediately without waiting for other participants (0 = do not wait), which is appropriate for phone call scenarios where the phone user is attached separately.
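
The cleanup guarantee mentioned here is a general property of async context managers and can be demonstrated without the library. In the sketch below, `join` is a hypothetical stand-in written with `contextlib.asynccontextmanager`, not the `join` from `stream_edge_transport.py`; it shows that the exit step runs even when the body raises.

```python
import asyncio
from contextlib import asynccontextmanager

log: list[str] = []


@asynccontextmanager
async def join(call_id: str):
    """Hypothetical join: acquire a session and release it however the block exits."""
    log.append(f"join {call_id}")
    try:
        yield call_id
    finally:
        log.append(f"leave {call_id}")


async def run_call() -> None:
    try:
        async with join("call-123"):
            # Simulate the call failing partway through.
            raise ConnectionError("caller hung up")
    except ConnectionError:
        log.append("handled hangup")


asyncio.run(run_call())
print(log)
```

The `leave` entry is logged before the error handler runs, which is the behaviour a manually managed session object would need an explicit `try`/`finally` to match.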

examples/03_phone_and_rag_example/outbound_phone_example.py (2)

52-52: Consistent with inbound example — LGTM!

The return tuple change mirrors the inbound example, maintaining consistency across both phone examples.


89-93: Session handling refactored to async context manager — LGTM!

The change from the previous session-based approach to async with agent.join(...) is consistent with the inbound example. This ensures proper resource cleanup and simplifies the control flow. The participant_wait_timeout=0 allows immediate joining without blocking for participants.
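
To make the "0 = do not wait" reading concrete, the following sketch shows how a timeout parameter of this kind could behave. It is an assumption about the semantics described in the review, not the library's implementation; `wait_for_participant` and `demo` are hypothetical helpers.

```python
import asyncio


async def wait_for_participant(joined: asyncio.Event, timeout: float) -> bool:
    """Return True if a participant is present, waiting at most `timeout` seconds.

    A timeout of 0 is treated as "do not wait": only the current state is checked.
    """
    if timeout <= 0:
        return joined.is_set()
    try:
        await asyncio.wait_for(joined.wait(), timeout)
        return True
    except asyncio.TimeoutError:
        return False


async def demo() -> tuple[bool, bool]:
    joined = asyncio.Event()
    # Nobody has joined yet; with timeout=0 the call returns at once.
    immediate = await wait_for_participant(joined, timeout=0)
    joined.set()
    later = await wait_for_participant(joined, timeout=0.1)
    return immediate, later


result = asyncio.run(demo())
print(result)
```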

Comment on lines +158 to 165
| 🔮 Demo Applications | |
|:-----|--------------------------------------------------------------------------------|
| <br><h3>Cartesia</h3>Using Cartesia's Sonic 3 model to visually look at what's in the frame and tell a story with emotion.<br><br>• Real-time visual understanding<br>• Emotional storytelling<br>• Frame-by-frame analysis<br><br> [>Source Code and tutorial](https://github.com/GetStream/Vision-Agents/tree/main/plugins/cartesia/example) | <img src="assets/demo_gifs/cartesia.gif" width="320" alt="Cartesia Demo"> |
| <br><h3>Realtime Stable Diffusion</h3>Realtime stable diffusion using Vision Agents and Decart's Mirage 2 model to create interactive scenes and stories.<br><br>• Real-time video restyling<br>• Interactive scene generation<br>• Stable diffusion integration<br><br> [>Source Code and tutorial](https://github.com/GetStream/Vision-Agents/tree/main/plugins/decart/example) | <img src="assets/demo_gifs/mirage.gif" width="320" alt="Mirage Demo"> |
| <br><h3>Golf Coach</h3>Using Gemini Live together with Vision Agents and Ultralytics YOLO, we're able to track the user's pose and provide realtime actionable feedback on their golf game.<br><br>• Real-time pose tracking<br>• Actionable coaching feedback<br>• YOLO pose detection<br>• Gemini Live integration<br><br> [>Source Code and tutorial](https://github.com/GetStream/Vision-Agents/tree/main/examples/02_golf_coach_example) | <img src="assets/demo_gifs/golf.gif" width="320" alt="Golf Coach Demo"> |
| <br><h3>GeoGuesser</h3>Together with OpenAI Realtime and Vision Agents, we can take GeoGuesser to the next level by asking it to identify places in our real world surroundings.<br><br>• Real-world location identification<br>• OpenAI Realtime integration<br>• Visual scene understanding<br><br> [>Source Code and tutorial](https://visionagents.ai/integrations/openai#openai-realtime)| <img src="assets/demo_gifs/geoguesser.gif" width="320" alt="GeoGuesser Demo"> |
| <br><h3>Phone and RAG</h3>Interact with your Agent over the phone using Twilio. This example demonstrates how to use TurboPuffer for Retrieval Augmented Generation (RAG) to give your agent specialized knowledge.<br><br>• Inbound/Outbound telephony<br>• Twilio Media Streams integration<br>• Vector search with TurboPuffer<br>• Retrieval Augmented Generation<br><br> [>Source Code and tutorial](https://github.com/GetStream/Vision-Agents/tree/main/examples/03_phone_and_rag_example) | <img src="assets/demo_gifs/va_phone.png" width="320" alt="Phone and RAG Demo"> |

⚠️ Potential issue | 🟡 Minor

Demo Applications table updated — minor text inconsistency.

At line 163, "real world surroundings" appears without a hyphen, while "Real-world" is hyphenated correctly later in the same line. Consider using "real-world surroundings" for consistency.

📝 Suggested fix
-|  <br><h3>GeoGuesser</h3>Together with OpenAI Realtime and Vision Agents, we can take GeoGuesser to the next level by asking it to identify places in our real world surroundings.<br><br>• Real-world location identification<br>• OpenAI Realtime integration<br>• Visual scene understanding<br><br> [>Source Code and tutorial](https://visionagents.ai/integrations/openai#openai-realtime)| <img src="assets/demo_gifs/geoguesser.gif" width="320" alt="GeoGuesser Demo">  |
+|  <br><h3>GeoGuesser</h3>Together with OpenAI Realtime and Vision Agents, we can take GeoGuesser to the next level by asking it to identify places in our real-world surroundings.<br><br>• Real-world location identification<br>• OpenAI Realtime integration<br>• Visual scene understanding<br><br> [>Source Code and tutorial](https://visionagents.ai/integrations/openai#openai-realtime)| <img src="assets/demo_gifs/geoguesser.gif" width="320" alt="GeoGuesser Demo">  |

@Nash0x7E2 Nash0x7E2 merged commit d2175fa into main Jan 13, 2026
11 checks passed
@Nash0x7E2 Nash0x7E2 deleted the chore/fix-phone-ex branch January 13, 2026 02:43