📝 Walkthrough

This PR introduces a new "Phone and RAG" feature to the documentation and refactors the phone example code to manage agent session joining through async context managers instead of pre-obtained session objects, simplifying the control flow.
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~8 minutes
🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)

✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In @README.md:
- Around line 158-165: Update the GeoGuesser demo text to use a hyphenated
adjective for consistency: locate the GeoGuesser row in the Demo Applications
table (the cell containing "GeoGuesser" and the sentence "together with OpenAI
Realtime and Vision Agents, we can take GeoGuesser to the next level by asking
it to identify places in our real world surroundings") and change "real world
surroundings" to "real-world surroundings".
🧹 Nitpick comments (1)
examples/03_phone_and_rag_example/inbound_phone_and_rag_example.py (1)
137-164: Consider adding Google-style docstrings to nested functions.

The `create_rag_from_directory` function has a brief docstring, but it accesses private attributes (`_indexed_files`, `_uploaded_files`) for logging. While this is acceptable for example code, note that these private attributes may change without notice.
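As a sketch of what the suggested Google-style docstring could look like (the signature and body here are hypothetical stand-ins for illustration, not the example's actual implementation):

```python
from pathlib import Path


def create_rag_from_directory(directory: str) -> list[str]:
    """Collect the files under a directory for RAG indexing.

    Args:
        directory: Path to the directory containing documents to index.

    Returns:
        A sorted list of file paths found under ``directory``.

    Raises:
        FileNotFoundError: If ``directory`` does not exist.
    """
    root = Path(directory)
    if not root.is_dir():
        raise FileNotFoundError(directory)
    return sorted(str(p) for p in root.rglob("*") if p.is_file())
```

The `Args`/`Returns`/`Raises` sections are the core of the Google docstring convention the path-based rules below call for.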
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
⛔ Files ignored due to path filters (2)
- `assets/demo_gifs/va_phone.png` is excluded by `!**/*.png`
- `examples/03_phone_and_rag_example/uv.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (3)
- README.md
- examples/03_phone_and_rag_example/inbound_phone_and_rag_example.py
- examples/03_phone_and_rag_example/outbound_phone_example.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (.cursor/rules/python.mdc)
**/*.py: Never adjust sys.path in Python code
Never write `except Exception as e`; use specific exception handling
Avoid using getattr, hasattr, delattr and setattr; prefer normal attribute access in Python
Docstrings should follow the Google style guide for docstrings
Files:
- examples/03_phone_and_rag_example/outbound_phone_example.py
- examples/03_phone_and_rag_example/inbound_phone_and_rag_example.py
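To illustrate the specific-exception rule above, a minimal sketch (the config-loading function is a made-up example for this guideline, not code from the repo):

```python
import json
import logging

logger = logging.getLogger(__name__)


def load_config(raw: str) -> dict:
    """Parse a JSON config string, falling back to an empty config."""
    try:
        return json.loads(raw)
    # Catch the specific failure mode rather than a blanket `except Exception as e`,
    # so genuine bugs (TypeError, KeyError, ...) still surface loudly.
    except json.JSONDecodeError as exc:
        logger.warning("Invalid config JSON: %s", exc)
        return {}
```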
🧬 Code graph analysis (2)
examples/03_phone_and_rag_example/outbound_phone_example.py (1)
plugins/getstream/vision_agents/plugins/getstream/stream_edge_transport.py (1)
join(328-383)
examples/03_phone_and_rag_example/inbound_phone_and_rag_example.py (1)
plugins/getstream/vision_agents/plugins/getstream/stream_edge_transport.py (1)
join(328-383)
🪛 LanguageTool
README.md
[uncategorized] ~163-~163: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ... by asking it to identify places in our real world surroundings.
• Real-world locat...
(EN_COMPOUND_ADJECTIVE_INTERNAL)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: unit / Validate extra dependencies in "agents-core/pyproject.toml"
- GitHub Check: unit / Mypy
- GitHub Check: unit / Test "not integration"
- GitHub Check: unit / Ruff
- GitHub Check: unit / Ruff
- GitHub Check: unit / Validate extra dependencies in "agents-core/pyproject.toml"
- GitHub Check: unit / Test "not integration"
- GitHub Check: unit / Mypy
- GitHub Check: Cursor Bugbot
🔇 Additional comments (5)
README.md (1)
91-102: Features table updated with Phone and RAG entry — LGTM!

The new feature row cleanly documents the Twilio and TurboPuffer integration. Table formatting is consistent.
examples/03_phone_and_rag_example/inbound_phone_and_rag_example.py (2)
98-98: Simplified return tuple — LGTM!

Removing `agent_session` from the return and deferring the join to the async context manager is a cleaner approach. The flow now correctly separates preparation from joining.
128-132: Async context manager for session management — clean refactor.

Using `async with agent.join(...)` properly manages the session lifecycle and ensures cleanup on exit. The `participant_wait_timeout=0` parameter allows the agent to join immediately without waiting for other participants (0 = do not wait), which is appropriate for phone call scenarios where the phone user is attached separately.

examples/03_phone_and_rag_example/outbound_phone_example.py (2)
52-52: Consistent with inbound example — LGTM!The return tuple change mirrors the inbound example, maintaining consistency across both phone examples.
89-93: Session handling refactored to async context manager — LGTM!

The change from the previous session-based approach to `async with agent.join(...)` is consistent with the inbound example. This ensures proper resource cleanup and simplifies the control flow. The `participant_wait_timeout=0` allows immediate joining without blocking for participants.
| 🔮 Demo Applications | |
|:-----|--------------------------------------------------------------------------------|
| <br><h3>Cartesia</h3>Using Cartesia's Sonic 3 model to visually look at what's in the frame and tell a story with emotion.<br><br>• Real-time visual understanding<br>• Emotional storytelling<br>• Frame-by-frame analysis<br><br> [>Source Code and tutorial](https://github.com/GetStream/Vision-Agents/tree/main/plugins/cartesia/example) | <img src="assets/demo_gifs/cartesia.gif" width="320" alt="Cartesia Demo"> |
| <br><h3>Realtime Stable Diffusion</h3>Realtime stable diffusion using Vision Agents and Decart's Mirage 2 model to create interactive scenes and stories.<br><br>• Real-time video restyling<br>• Interactive scene generation<br>• Stable diffusion integration<br><br> [>Source Code and tutorial](https://github.com/GetStream/Vision-Agents/tree/main/plugins/decart/example) | <img src="assets/demo_gifs/mirage.gif" width="320" alt="Mirage Demo"> |
| <br><h3>Golf Coach</h3>Using Gemini Live together with Vision Agents and Ultralytics YOLO, we're able to track the user's pose and provide realtime actionable feedback on their golf game.<br><br>• Real-time pose tracking<br>• Actionable coaching feedback<br>• YOLO pose detection<br>• Gemini Live integration<br><br> [>Source Code and tutorial](https://github.com/GetStream/Vision-Agents/tree/main/examples/02_golf_coach_example) | <img src="assets/demo_gifs/golf.gif" width="320" alt="Golf Coach Demo"> |
| <br><h3>GeoGuesser</h3>Together with OpenAI Realtime and Vision Agents, we can take GeoGuesser to the next level by asking it to identify places in our real world surroundings.<br><br>• Real-world location identification<br>• OpenAI Realtime integration<br>• Visual scene understanding<br><br> [>Source Code and tutorial](https://visionagents.ai/integrations/openai#openai-realtime)| <img src="assets/demo_gifs/geoguesser.gif" width="320" alt="GeoGuesser Demo"> |
| <br><h3>Phone and RAG</h3>Interact with your Agent over the phone using Twilio. This example demonstrates how to use TurboPuffer for Retrieval Augmented Generation (RAG) to give your agent specialized knowledge.<br><br>• Inbound/Outbound telephony<br>• Twilio Media Streams integration<br>• Vector search with TurboPuffer<br>• Retrieval Augmented Generation<br><br> [>Source Code and tutorial](https://github.com/GetStream/Vision-Agents/tree/main/examples/03_phone_and_rag_example) | <img src="assets/demo_gifs/va_phone.png" width="320" alt="Phone and RAG Demo"> |
Demo Applications table updated — minor text inconsistency.
At line 163, "real world surroundings" appears without a hyphen, while "Real-world" is hyphenated correctly later in the same line. Consider using "real-world surroundings" for consistency.
📝 Suggested fix
-| <br><h3>GeoGuesser</h3>Together with OpenAI Realtime and Vision Agents, we can take GeoGuesser to the next level by asking it to identify places in our real world surroundings.<br><br>• Real-world location identification<br>• OpenAI Realtime integration<br>• Visual scene understanding<br><br> [>Source Code and tutorial](https://visionagents.ai/integrations/openai#openai-realtime)| <img src="assets/demo_gifs/geoguesser.gif" width="320" alt="GeoGuesser Demo"> |
+| <br><h3>GeoGuesser</h3>Together with OpenAI Realtime and Vision Agents, we can take GeoGuesser to the next level by asking it to identify places in our real-world surroundings.<br><br>• Real-world location identification<br>• OpenAI Realtime integration<br>• Visual scene understanding<br><br> [>Source Code and tutorial](https://visionagents.ai/integrations/openai#openai-realtime)| <img src="assets/demo_gifs/geoguesser.gif" width="320" alt="GeoGuesser Demo"> |
Note
Modernizes telephony examples and refreshes docs while upgrading dependencies.
- Refactors `inbound_phone_and_rag_example.py` and `outbound_phone_example.py` to return `(agent, phone_user, stream_call)` from prepare, and to use `async with agent.join(..., participant_wait_timeout=0)` instead of managing a separate `agent_session`
- README: adds a `Phone and RAG` feature and a new demo row; updates the GeoGuesser link
- `uv.lock`: bumps core libs (e.g., `aiohttp`, `fastapi`, `openai`, `uvicorn`, `urllib3`, etc.), adjusts `getstream[webrtc]` optionals (adds `av`, removes `torch`/`torchaudio`), pins the `deepgram-sdk` range, and expands extras (adds `nvidia`, `turbopuffer`, `twilio`)

Written by Cursor Bugbot for commit 3981dd8. This will update automatically on new commits. Configure here.
Summary by CodeRabbit
Documentation
Refactor
✏️ Tip: You can customize this high-level summary in your review settings.