@livekit/agents-plugin-google 1.2.1 fails with gemini-3.1-flash-live-preview #1179

@bhanusharma

Summary

@livekit/agents-plugin-google@1.2.1 appears incompatible with gemini-3.1-flash-live-preview on the Gemini Live API.

On the same code path and API key:

  • gemini-2.5-flash-native-audio-preview-12-2025 works
  • gemini-3.1-flash-live-preview fails at the plugin/API boundary

Environment

  • @livekit/agents: 1.2.1
  • @livekit/agents-plugin-google: 1.2.1
  • @livekit/agents-plugin-silero: 1.2.1
  • @livekit/rtc-node: 0.13.24
  • @google/genai: 1.45.0
  • Node 20

What we tested

We built a minimal repro that bypasses our room/session app wiring and talks directly to google.beta.realtime.RealtimeModel().session().

We ran two scenarios against the same Gemini API key:

  1. greet: direct generateReply() / greeting-style generation path
  2. speech: push realtime PCM audio and wait for input transcription / generation

We explicitly set both:

  • inputAudioTranscription: {}
  • outputAudioTranscription: {}
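For reference, the test matrix described above can be written down as plain data. This is only an illustrative sketch: `buildRunMatrix` and the `RunConfig` shape are hypothetical names and are not part of the plugin. Only the model names, the two scenario names, and the two transcription options come from the repro.

```typescript
// Hypothetical description of the repro matrix; none of these names are plugin API.
type Scenario = 'greet' | 'speech';

interface RunConfig {
  model: string;
  scenario: Scenario;
  // Both transcription options are set explicitly to empty objects, as in the repro.
  inputAudioTranscription: Record<string, never>;
  outputAudioTranscription: Record<string, never>;
}

const MODELS: string[] = [
  'gemini-2.5-flash-native-audio-preview-12-2025',
  'gemini-3.1-flash-live-preview',
];

const SCENARIOS: Scenario[] = ['greet', 'speech'];

// Builds one run per (model, scenario) pair so that the model is the only variable.
function buildRunMatrix(): RunConfig[] {
  const runs: RunConfig[] = [];
  for (const model of MODELS) {
    for (const scenario of SCENARIOS) {
      runs.push({
        model,
        scenario,
        inputAudioTranscription: {},
        outputAudioTranscription: {},
      });
    }
  }
  return runs;
}
```

Each entry would then be passed to whatever harness drives google.beta.realtime.RealtimeModel().session(); that wiring is omitted here.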

Actual results

gemini-2.5-flash-native-audio-preview-12-2025

  • greet: works
    • generation_created fires
    • text chunks arrive
    • audio frames arrive
  • speech: input transcription works
    • input_audio_transcription_completed fires and reaches a final transcript

gemini-3.1-flash-live-preview

  • greet: fails
    • no generation_created
    • server error: Request contains an invalid argument.
  • speech: fails
    • no input transcription
    • server error: realtime_input.media_chunks is deprecated. Use audio, video, or text instead.

Likely cause

In the current plugin source, realtime audio is still sent through the older path:

  • pushAudio() queues realtime_input messages with mediaChunks
  • sendTask() calls session.sendRealtimeInput({ media: mediaChunk })

Google's current Live API docs expose realtime input fields such as audio, video, and text, and the 3.1 API explicitly rejects media_chunks in our repro.
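To make the suspected difference concrete, here is a minimal sketch of the two payload shapes. It is not the plugin's code: `buildRealtimeAudioInput` and the `AudioBlob` type are hypothetical, and the `audio/pcm;rate=...` MIME type is an assumption about how the PCM chunk is labeled. The only parts taken from the report are the field names `media` (older path) and `audio` (the field named in the 3.1 error message).

```typescript
// Assumed blob shape: base64-encoded PCM plus a MIME type.
interface AudioBlob {
  data: string;
  mimeType: string;
}

// Either the older `media` field or the newer `audio` field.
type RealtimeAudioInput = { media: AudioBlob } | { audio: AudioBlob };

// Hypothetical helper: wraps one base64 PCM chunk in the chosen field.
function buildRealtimeAudioInput(
  base64Pcm: string,
  sampleRate: number,
  useLegacyMediaField = false,
): RealtimeAudioInput {
  const blob: AudioBlob = {
    data: base64Pcm,
    mimeType: `audio/pcm;rate=${sampleRate}`, // assumed label for raw PCM
  };
  // The older path sends { media: ... }; the 3.1 error asks for { audio: ... }.
  return useLegacyMediaField ? { media: blob } : { audio: blob };
}
```

If this diagnosis is right, the fix in sendTask() would amount to passing the `{ audio: ... }` shape to session.sendRealtimeInput() instead of `{ media: ... }`, assuming the installed @google/genai version accepts that field.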

Why this looks plugin-specific

This is not just app-level behavior:

  • the repro bypasses our LiveKit rooms / VAD / pause-resume logic
  • the same API key and same code path work on 2.5
  • only the model changes to gemini-3.1-flash-live-preview

Request

Could you confirm whether agents-plugin-google needs a Gemini 3.1 compatibility update for:

  1. realtime audio input (media / mediaChunks -> audio)
  2. the current generateReply() / client-content generation path for 3.1

If useful, I can also provide the minimal repro script in a follow-up comment.
