/run_live Endpoint Missing RunConfig Options #4263

@bashimr

Description

ADK Bug Report:

Summary

The ADK's built-in /run_live WebSocket endpoint does not expose RunConfig options for proactivity, enable_affective_dialog, or session_resumption. This prevents developers from using Gemini Live's proactive audio features through the standard ADK endpoint.

ADK Version

  • Version: 1.23.0
  • File: src/google/adk/cli/adk_web_server.py
  • Line: 1655

Current Behavior

The /run_live endpoint hardcodes a minimal RunConfig:

@app.websocket("/run_live")
async def run_agent_live(
    websocket: WebSocket,
    app_name: str,
    user_id: str,
    session_id: str,
    modalities: List[Literal["TEXT", "AUDIO"]] = Query(default=["AUDIO"]),
) -> None:
    # ...
    run_config = RunConfig(response_modalities=modalities)  # Line 1655

This means:

  1. Model waits for user input - Without proactivity=ProactivityConfig(proactive_audio=True), the model will not speak first
  2. No emotional awareness - enable_affective_dialog is not enabled
  3. Session resumption not configured - session_resumption is not passed

Expected Behavior

The /run_live endpoint should accept optional query parameters (or an initial WebSocket message) to configure these RunConfig options:

@app.websocket("/run_live")
async def run_agent_live(
    websocket: WebSocket,
    app_name: str,
    user_id: str,
    session_id: str,
    modalities: List[Literal["TEXT", "AUDIO"]] = Query(default=["AUDIO"]),
    proactive_audio: bool = Query(default=False),
    enable_affective_dialog: bool = Query(default=False),
    enable_session_resumption: bool = Query(default=False),
) -> None:
    # ...
    # types here is google.genai.types, as used elsewhere in the ADK
    run_config = RunConfig(
        response_modalities=modalities,
        proactivity=types.ProactivityConfig(proactive_audio=proactive_audio) if proactive_audio else None,
        enable_affective_dialog=enable_affective_dialog or None,
        session_resumption=types.SessionResumptionConfig() if enable_session_resumption else None,
    )
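With query parameters like these in place, a client could opt in per connection simply by extending the URL. A minimal sketch of the client side, assuming the server runs on localhost:8000 and the parameter names proposed above are adopted:

```python
from urllib.parse import urlencode

# Hypothetical query parameters matching the proposal above.
params = {
    "app_name": "my_app",
    "user_id": "user1",
    "session_id": "sess1",
    "proactive_audio": "true",
    "enable_affective_dialog": "true",
    "enable_session_resumption": "true",
}
url = "ws://localhost:8000/run_live?" + urlencode(params)
print(url)
```

Existing clients that omit the new parameters would keep today's behavior, since every new parameter defaults to False.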

Impact

Without these options exposed:

  1. Agents cannot greet users first - The agent waits silently until the user speaks, creating an awkward UX
  2. Emotional awareness disabled - The model cannot adapt to user emotions
  3. Session resumption broken - Multi-connection sessions don't work properly

Workaround

We created a custom /julia_run_live endpoint that mirrors /run_live but passes the full RunConfig:

run_config = RunConfig(
    response_modalities=modalities,
    proactivity=types.ProactivityConfig(proactive_audio=True),
    enable_affective_dialog=True,
    session_resumption=types.SessionResumptionConfig(),
    input_audio_transcription=types.AudioTranscriptionConfig(),
    output_audio_transcription=types.AudioTranscriptionConfig(),
)

See: routers/julia_run_live.py in our codebase.

Suggested Fix

Option A: Query Parameters (Minimal Change)

Add query parameters to the existing endpoint:

@app.websocket("/run_live")
async def run_agent_live(
    websocket: WebSocket,
    app_name: str,
    user_id: str,
    session_id: str,
    modalities: List[Literal["TEXT", "AUDIO"]] = Query(default=["AUDIO"]),
    proactive_audio: bool = Query(default=False),
    affective_dialog: bool = Query(default=False),
    session_resumption: bool = Query(default=False),
) -> None:

Option B: RunConfig in Initial Message

Allow passing RunConfig as the first WebSocket message before streaming begins:

{
  "run_config": {
    "proactivity": {"proactive_audio": true},
    "enable_affective_dialog": true,
    "session_resumption": {}
  }
}
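On the server side, Option B could be handled by reading one optional message before entering the streaming loop and translating it into RunConfig keyword arguments. A minimal sketch of that translation step (plain dicts stand in for the ADK's types.ProactivityConfig / types.SessionResumptionConfig; field names follow the JSON above):

```python
def parse_run_config_message(msg: dict) -> dict:
    """Translate an optional first-message run_config into RunConfig kwargs.

    Sketch only: dicts stand in for the real google.genai config types.
    """
    cfg = msg.get("run_config", {})
    kwargs = {}
    if cfg.get("proactivity", {}).get("proactive_audio"):
        kwargs["proactivity"] = {"proactive_audio": True}
    if cfg.get("enable_affective_dialog"):
        kwargs["enable_affective_dialog"] = True
    if "session_resumption" in cfg:
        kwargs["session_resumption"] = {}
    return kwargs

msg = {
    "run_config": {
        "proactivity": {"proactive_audio": True},
        "enable_affective_dialog": True,
        "session_resumption": {},
    }
}
print(parse_run_config_message(msg))
```

A missing or empty first message would yield no kwargs, preserving the current default behavior.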

Option C: Agent-Level Default Config

Allow agents to specify default RunConfig in their agent.py that gets used by /run_live:

root_agent = Agent(
    # ...
    default_live_config=RunConfig(
        proactivity=types.ProactivityConfig(proactive_audio=True),
        enable_affective_dialog=True,
    )
)
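Inside the endpoint, Option C would reduce to a small lookup that prefers the agent's default over the hardcoded config. A sketch with a stand-in agent class (default_live_config is the hypothetical attribute proposed above, not an existing Agent field):

```python
class FakeAgent:
    # Hypothetical attribute from the Option C proposal above.
    default_live_config = {"proactivity": {"proactive_audio": True}}

def resolve_run_config(agent, fallback):
    """Prefer the agent's default live config when present, else fall back."""
    return getattr(agent, "default_live_config", None) or fallback

print(resolve_run_config(FakeAgent(), {"response_modalities": ["AUDIO"]}))
```

Agents without the attribute would fall through to the current minimal RunConfig, so this change would be backward compatible.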

Related Code

The RunConfig class already supports these options (added in a recent release):

# From src/google/adk/agents/run_config.py

class RunConfig(BaseModel):
    # ...
    enable_affective_dialog: Optional[bool] = None
    proactivity: Optional[types.ProactivityConfig] = None
    session_resumption: Optional[types.SessionResumptionConfig] = None

And they are correctly passed to the LLM in basic.py:

# From src/google/adk/flows/llm_flows/basic.py:79-81
llm_request.live_connect_config.proactivity = (
    invocation_context.run_config.proactivity
)

The infrastructure is in place - it just needs to be exposed through the /run_live endpoint.

Environment

  • ADK Version: 1.23.0
  • Python: 3.11+
  • Gemini Model: gemini-2.0-flash-live (native audio)

Steps to Reproduce

  1. Create an ADK agent with Gemini Live support
  2. Connect to /run_live WebSocket endpoint
  3. Observe that the model waits silently for user input
  4. Expected: Model should greet the user if proactive_audio=True

Additional Context

The ADK documentation (Part 5: Bidi-streaming) shows how to use proactivity:

run_config = RunConfig(
    proactivity=types.ProactivityConfig(proactive_audio=True),
    enable_affective_dialog=True,
)

But there's no way to pass these options through the built-in /run_live endpoint, forcing developers to create custom endpoints.

Metadata

Labels: live (Component: related to live, voice and video chat)