
[AI-269] Eager turn taking #169

Merged
tschellenbach merged 15 commits into main from simplify_turn_events
Nov 11, 2025

Conversation


@tschellenbach tschellenbach commented Nov 11, 2025

  • Eager turn taking support
  • Fix screensharing support for Gemini & OpenAI
  • Agent tests for wait_for_participant
  • Agent tests for screensharing
  • Remove duplicated STT and turn keeping events

Summary by CodeRabbit

Release Notes

  • New Features

    • Introduced eager end-of-turn detection for faster turn transitions in speech-to-text processing
    • Enhanced LLM turn lifecycle management with explicit turn state tracking and completion handling
    • Added turn detection capability flag to STT implementations
  • Improvements

    • Refined video track forwarding logic with better priority handling
    • Strengthened event system with runtime validation for async handlers
    • Improved event messaging for unknown event types
  • Documentation

    • Added documentation on turn detection configuration and eager turn-end event emission
  • Tests

    • Added comprehensive test coverage for agent track handling and participant presence detection
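
As a rough illustration of the "turn detection capability flag" item above, the sketch below shows one way an agent could decide whether turn detection is available. Only the names `turn_detection` and `turn_detection_enabled` come from this PR; the classes and the rule combining an STT flag with a separate detector are assumptions made for the example, not the actual vision_agents code.

```python
from typing import Optional


class SketchSTT:
    """Hypothetical stand-in for an STT plugin base class."""

    # Capability flag named in the PR; the default value is an assumption.
    turn_detection: bool = False


class SketchTurnAwareSTT(SketchSTT):
    """Hypothetical STT whose provider reports end-of-turn itself."""

    turn_detection = True


class SketchAgent:
    """Toy agent; constructor arguments are illustrative only."""

    def __init__(
        self,
        stt: Optional[SketchSTT] = None,
        turn_detector: Optional[object] = None,
    ) -> None:
        self.stt = stt
        self.turn_detector = turn_detector

    @property
    def turn_detection_enabled(self) -> bool:
        # Assumed rule: turn detection is on if either the STT advertises it
        # or a dedicated turn detector was configured.
        stt_detects = self.stt is not None and self.stt.turn_detection
        return stt_detects or self.turn_detector is not None


plain = SketchAgent(stt=SketchSTT())
aware = SketchAgent(stt=SketchTurnAwareSTT())
external = SketchAgent(stt=SketchSTT(), turn_detector=object())
```

Under these assumptions, `plain` has no turn detection, while `aware` and `external` both do.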


coderabbitai bot commented Nov 11, 2025

Caution

Review failed

The pull request is closed.

Walkthrough

This PR introduces eager turn detection for vision agents. It relocates AgentOptions to a new types module with TrackInfo and LLMTurn dataclasses, implements pending LLM turn coordination in the Agent class, adds eager turn detection fields to STT and turn detection events, refactors event emission signatures, updates plugin implementations, and adds comprehensive test coverage.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Core types restructuring: agents-core/vision_agents/core/agents/agent_options.py, agents-core/vision_agents/core/agents/agent_types.py | AgentOptions moved from agent_options.py to agent_types.py; new TrackInfo and LLMTurn dataclasses added to agent_types.py with fields for track metadata and LLM turn lifecycle management. |
| Agent implementation: agents-core/vision_agents/core/agents/agents.py | Added pending LLM turn handling (_pending_turn, _finish_llm_turn), async event handlers (_on_vad_audio, _on_turn_event), turn_detection_enabled property, enhanced event subscriptions with on_turn_ended handling, and turn-based transcript accumulation logic. |
| Event system: agents-core/vision_agents/core/events/manager.py | Added runtime validation to enforce async handlers; improved warning messaging for unknown events during preparation. |
| STT enhancements: agents-core/vision_agents/core/stt/events.py, agents-core/vision_agents/core/stt/stt.py | Added eager_end_of_turn boolean field to STTTranscriptEvent; added turn_detection attribute and _emit_turn_ended_event method to STT class. |
| Turn detection: agents-core/vision_agents/core/turn_detection/events.py, agents-core/vision_agents/core/turn_detection/turn_detection.py | Added eager_end_of_turn field to TurnEndedEvent; refactored _emit_end_turn_event signature to accept individual parameters (participant, confidence, trailing_silence_ms, duration_ms, eager_end_of_turn) instead of event object. |
| Plugin import updates: plugins/moondream/vision_agents/plugins/moondream/detection/moondream_local_processor.py, plugins/moondream/vision_agents/plugins/moondream/vlm/moondream_local_vlm.py, plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py | Updated imports of AgentOptions and default_agent_options from agent_options to agent_types; updated _emit_end_turn_event call sites to use keyword arguments. |
| Plugin STT implementation: plugins/deepgram/vision_agents/plugins/deepgram/deepgram_stt.py, plugins/deepgram/tests/test_deepgram_stt.py | Added turn_detection attribute and eager_turn_detection parameter to Deepgram STT; implemented eager end-of-turn detection via EagerEndOfTurn event; integrated _emit_turn_ended_event calls with eager_end_of_turn flag. |
| Video forwarder management: plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py, plugins/openai/vision_agents/plugins/openai/rtc_manager.py | Simplified watch_video_track to remove and reattach handlers instead of stopping/restarting forwarders; added _current_video_forwarder tracking in RTCManager for safe forwarder switching. |
| Vogent turn detection: plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py | Removed TurnEndedEvent import; updated _emit_end_turn_event call to use keyword arguments (participant, confidence, trailing_silence_ms, duration_ms). |
| Minor updates: plugins/getstream/vision_agents/plugins/getstream/stream_edge_transport.py, plugins/openai/vision_agents/plugins/openai/openai_realtime.py | Removed empty custom field from ChannelMemberRequest; updated TODO comments with additional implementation notes. |
| Documentation and examples: docs/ai/instructions/ai-stt.md, docs/ai/instructions/ai-turn-detector.md, examples/01_simple_agent_example/simple_agent_example.py | Added turn keeping documentation section with eager_end_of_turn example; updated turn detector docs to reflect new _emit_end_turn_event signature; updated simple example to use eager_turn_detection and gemini-2.5-flash-lite. |
| Test coverage: tests/test_agent.py, tests/test_agent_tracks.py | Added comprehensive test suite for Agent.wait_for_participant with mock implementations; added test suite validating track handling, priority-based forwarding, and processor/LLM integration. |
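
The `_emit_end_turn_event` refactor listed under "Turn detection" can be pictured with the minimal sketch below. The parameter names are the ones given in the summary; the defaults, the type hints, the simplified `TurnEndedEvent` stand-in and the `sent` list are assumptions for illustration, and the real method presumably dispatches through the event manager instead.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class TurnEndedEvent:
    """Simplified stand-in; the real event class may carry more fields."""

    participant: Any
    confidence: Optional[float] = None
    trailing_silence_ms: Optional[float] = None
    duration_ms: Optional[float] = None
    eager_end_of_turn: bool = False


class SketchTurnDetector:
    """Hypothetical detector base class, not the vision_agents one."""

    def __init__(self) -> None:
        # Assumption: collect events in a list so the sketch runs standalone.
        self.sent: List[TurnEndedEvent] = []

    def _emit_end_turn_event(
        self,
        participant: Any,
        confidence: Optional[float] = None,
        trailing_silence_ms: Optional[float] = None,
        duration_ms: Optional[float] = None,
        eager_end_of_turn: bool = False,
    ) -> None:
        # The detector builds the event itself, so call sites only pass values.
        event = TurnEndedEvent(
            participant=participant,
            confidence=confidence,
            trailing_silence_ms=trailing_silence_ms,
            duration_ms=duration_ms,
            eager_end_of_turn=eager_end_of_turn,
        )
        self.sent.append(event)


detector = SketchTurnDetector()
detector._emit_end_turn_event(
    participant="user-1",
    confidence=0.9,
    trailing_silence_ms=400.0,
    duration_ms=2100.0,
)
```

With keyword arguments, a plugin such as the Vogent detector no longer needs to import `TurnEndedEvent`, which matches the import removal noted in the table.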

Sequence Diagram

sequenceDiagram
    participant STT as STT Module
    participant Agent as Agent
    participant LLM as LLM
    participant TTS as TTS
    
    rect rgb(200, 220, 255)
        Note over STT,TTS: Eager Turn Detection Flow
        STT->>STT: Detect eager end-of-turn (EagerEndOfTurn event)
        STT->>Agent: Emit TurnEndedEvent (eager_end_of_turn=true)
        Agent->>Agent: Store pending LLM turn (_pending_turn)
        Agent->>LLM: Request response
        LLM->>Agent: Return response (async)
        Agent->>Agent: _finish_llm_turn()
        Agent->>TTS: Trigger speech synthesis
    end
    
    rect rgb(220, 255, 220)
        Note over STT,TTS: Normal Turn Detection Flow
        STT->>STT: Detect normal end-of-turn
        STT->>Agent: Emit TurnEndedEvent (eager_end_of_turn=false)
        Agent->>Agent: Process turn completion
        Agent->>LLM: Request response
        LLM->>Agent: Return response
        Agent->>TTS: Trigger speech synthesis
    end
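
The eager flow in the diagram could be sketched with plain asyncio as follows. This is an illustration under stated assumptions, not the Agent implementation: `PendingTurn` stands in for the PR's LLMTurn dataclass, `_llm` fakes a model call, and the rule that an eager response is only spoken once a non-eager end-of-turn confirms the same transcript is an assumption about how the pending turn is resolved.

```python
import asyncio
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PendingTurn:
    """Hypothetical stand-in for the PR's LLMTurn dataclass."""

    transcript: str
    task: "asyncio.Task[str]"


class SketchAgent:
    """Toy agent; none of these names are the real vision_agents API."""

    def __init__(self) -> None:
        self.pending: Optional[PendingTurn] = None
        self.spoken: List[str] = []
        self.llm_calls = 0

    async def _llm(self, transcript: str) -> str:
        self.llm_calls += 1
        await asyncio.sleep(0.01)  # simulated model latency
        return f"reply to: {transcript}"

    async def on_turn_ended(self, transcript: str, eager_end_of_turn: bool) -> None:
        if eager_end_of_turn:
            # Eager path: start the LLM early but do not speak yet.
            if self.pending is not None:
                self.pending.task.cancel()
            task = asyncio.create_task(self._llm(transcript))
            self.pending = PendingTurn(transcript, task)
            return
        # Normal path: reuse the eager result if the transcript is unchanged.
        pending, self.pending = self.pending, None
        if pending is not None and pending.transcript == transcript:
            reply = await pending.task
        else:
            if pending is not None:
                pending.task.cancel()  # user kept talking; discard stale work
            reply = await self._llm(transcript)
        self.spoken.append(reply)  # stands in for handing the reply to TTS


async def demo() -> SketchAgent:
    agent = SketchAgent()
    await agent.on_turn_ended("hello there", eager_end_of_turn=True)
    await agent.on_turn_ended("hello there", eager_end_of_turn=False)
    return agent


agent = asyncio.run(demo())
```

In this sketch the confirmed turn reuses the response that was started eagerly, so the model is called once rather than twice.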

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Areas requiring extra attention:

  • Agent pending turn logic (agents.py _pending_turn, _finish_llm_turn, event handler coordination): Verify async state transitions and ensure no race conditions in turn completion flow.
  • Turn detection event refactoring (turn_detection.py signature change): Validate all call sites pass correct parameters and that existing behavior is preserved across all turn detector implementations.
  • Eager turn detection implementation (deepgram_stt.py): Confirm proper handling of EagerEndOfTurn detection, threshold logic, and event emission with eager_end_of_turn flag.
  • Video forwarder switching (gemini_realtime.py, rtc_manager.py): Ensure handler removal/reattachment logic prevents frame loss or double-forwarding during transitions.
  • Event handler async enforcement (events/manager.py): Review validation logic and ensure error messages guide users correctly.
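
For the last item, a minimal sketch of runtime validation that rejects non-async handlers is shown below. It relies on the standard library's `inspect.iscoroutinefunction`; the class name, the `subscribe` method, the exception type and the error message are assumptions and may differ from what events/manager.py does.

```python
import inspect
from typing import Any, Awaitable, Callable, Dict, List

Handler = Callable[[Any], Awaitable[None]]


class SketchEventManager:
    """Toy event manager; not the real vision_agents EventManager."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = {}

    def subscribe(self, event_type: str, handler: Handler) -> None:
        # Reject plain functions up front so the mistake surfaces at
        # registration time instead of when the event is dispatched.
        if not inspect.iscoroutinefunction(handler):
            name = getattr(handler, "__name__", repr(handler))
            raise RuntimeError(
                f"Handler '{name}' must be declared with 'async def'"
            )
        self._handlers.setdefault(event_type, []).append(handler)


async def good_handler(event: Any) -> None:
    """Accepted: a coroutine function."""


def bad_handler(event: Any) -> None:
    """Rejected: a regular function."""


manager = SketchEventManager()
manager.subscribe("turn_ended", good_handler)
try:
    manager.subscribe("turn_ended", bad_handler)
    rejected = False
except RuntimeError:
    rejected = True
```

An error message that names the offending handler and states the fix is one way to address the "guide users correctly" concern raised in the review note.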

Possibly related PRs

  • PR #70: Both modify Agent turn/turn-detection and transcript-to-LLM flow with pending turn state and turn-detection emission handling.
  • PR #157: Both relocate AgentOptions to agent_types.py and update call sites across Moondream and other plugins.
  • PR #163: Both modify Agent._on_turn_event handling and pass event.participant into LLM response logic.

Suggested labels

tests, turn-detection, eager-turn, events, refactor

Suggested reviewers

  • yarikdevcom
  • maxkahan

Poem

Bell jars of silence shatter—
eager turns spring forth unbidden,
pending dreams float through circuits,
while forwarders dance their careful waltz.
The agent breathes in sync,
transcripts bloom like dark flowers.


📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro


📥 Commits

Reviewing files that changed from the base of the PR and between ec457b9 and 71883ac.

📒 Files selected for processing (23)
  • agents-core/vision_agents/core/agents/agent_options.py (0 hunks)
  • agents-core/vision_agents/core/agents/agent_types.py (1 hunks)
  • agents-core/vision_agents/core/agents/agents.py (9 hunks)
  • agents-core/vision_agents/core/events/manager.py (2 hunks)
  • agents-core/vision_agents/core/stt/events.py (1 hunks)
  • agents-core/vision_agents/core/stt/stt.py (4 hunks)
  • agents-core/vision_agents/core/turn_detection/events.py (1 hunks)
  • agents-core/vision_agents/core/turn_detection/turn_detection.py (1 hunks)
  • docs/ai/instructions/ai-stt.md (1 hunks)
  • docs/ai/instructions/ai-turn-detector.md (1 hunks)
  • examples/01_simple_agent_example/simple_agent_example.py (2 hunks)
  • plugins/deepgram/tests/test_deepgram_stt.py (1 hunks)
  • plugins/deepgram/vision_agents/plugins/deepgram/deepgram_stt.py (4 hunks)
  • plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (1 hunks)
  • plugins/getstream/vision_agents/plugins/getstream/stream_edge_transport.py (0 hunks)
  • plugins/moondream/vision_agents/plugins/moondream/detection/moondream_local_processor.py (1 hunks)
  • plugins/moondream/vision_agents/plugins/moondream/vlm/moondream_local_vlm.py (1 hunks)
  • plugins/openai/vision_agents/plugins/openai/openai_realtime.py (1 hunks)
  • plugins/openai/vision_agents/plugins/openai/rtc_manager.py (2 hunks)
  • plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (2 hunks)
  • plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (1 hunks)
  • tests/test_agent.py (1 hunks)
  • tests/test_agent_tracks.py (1 hunks)


@tschellenbach tschellenbach marked this pull request as ready for review November 11, 2025 19:42
@tschellenbach tschellenbach merged commit f5d37ea into main Nov 11, 2025
5 of 6 checks passed
@tschellenbach tschellenbach deleted the simplify_turn_events branch November 11, 2025 19:42
@coderabbitai coderabbitai bot mentioned this pull request Nov 17, 2025

2 participants