@livekit/agents-plugin-google 1.2.1 fails with gemini-3.1-flash-live-preview #1179
Description
Summary
@livekit/agents-plugin-google@1.2.1 appears incompatible with gemini-3.1-flash-live-preview on the Gemini Live API.
On the same code path and API key:
- gemini-2.5-flash-native-audio-preview-12-2025 works
- gemini-3.1-flash-live-preview fails at the plugin/API boundary
Environment
- @livekit/agents: 1.2.1
- @livekit/agents-plugin-google: 1.2.1
- @livekit/agents-plugin-silero: 1.2.1
- @livekit/rtc-node: 0.13.24
- @google/genai: 1.45.0
- Node 20
What we tested
We built a minimal repro that bypasses our room/session app wiring and talks directly to google.beta.realtime.RealtimeModel().session().
We ran two scenarios against the same Gemini API key:
- greet: direct generateReply() / greeting-style generation path
- speech: push realtime PCM audio and wait for input transcription / generation
We explicitly set both:
- inputAudioTranscription: {}
- outputAudioTranscription: {}
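To give a sense of the shape of the greet scenario, here is a rough sketch rather than the actual repro script. `SessionLike` is a local stand-in for the object returned by `RealtimeModel().session()`, `runGreet` and `GreetOutcome` are hypothetical names, and the `error` event name is an assumption; only `generateReply()` and `generation_created` come from our repro.

```typescript
// Local stand-in for the parts of the realtime session this sketch touches.
// Assumption: the real session exposes on() and generateReply() in roughly this form.
interface SessionLike {
  on(event: string, listener: (payload: unknown) => void): void;
  generateReply(): unknown;
}

type GreetOutcome = { ok: true } | { ok: false; reason: string };

// Hypothetical helper: request a reply and wait for generation_created,
// a reported error, or a timeout, whichever comes first.
function runGreet(session: SessionLike, timeoutMs = 10000): Promise<GreetOutcome> {
  return new Promise((resolve) => {
    const timer = setTimeout(
      () => resolve({ ok: false, reason: 'timeout waiting for generation_created' }),
      timeoutMs,
    );
    const finish = (outcome: GreetOutcome): void => {
      clearTimeout(timer);
      resolve(outcome);
    };

    session.on('generation_created', () => finish({ ok: true }));
    // Assumption: server-side failures surface as an 'error' event.
    session.on('error', (err) => finish({ ok: false, reason: String(err) }));

    try {
      // generateReply() may return a promise; cover sync throws and rejections.
      Promise.resolve(session.generateReply()).catch((err) =>
        finish({ ok: false, reason: String(err) }),
      );
    } catch (err) {
      finish({ ok: false, reason: String(err) });
    }
  });
}
```

Running the same helper against both model names, with nothing else changed, is what isolates the model as the only variable.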
Actual results
gemini-2.5-flash-native-audio-preview-12-2025
greet: works
- generation_created fires
- text chunks arrive
- audio frames arrive
speech: input transcription works
- input_audio_transcription_completed fires and reaches a final transcript
gemini-3.1-flash-live-preview
greet: fails
- no generation_created
- server error: Request contains an invalid argument.
- no
speech: fails
- no input transcription
- server error: realtime_input.media_chunks is deprecated. Use audio, video, or text instead.
Likely cause
In the current plugin source, realtime audio is still sent through the older path:
- pushAudio() queues realtime_input messages with mediaChunks
- sendTask() calls session.sendRealtimeInput({ media: mediaChunk })
Google's current Live API docs now expose realtime input fields like audio, video, and text, and the 3.1 API is explicitly rejecting media_chunks in our repro.
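For illustration, the payload change we have in mind looks roughly like the sketch below. This is not the plugin's code: `MediaBlob` is a local stand-in type and `toAudioRealtimeInput` is a hypothetical helper. Only the field names `media` and `audio` are taken from the plugin call and the server error above.

```typescript
// Local stand-in for the blob shape sent to the Live API
// (assumption: base64 data plus a MIME type).
interface MediaBlob {
  data: string;     // base64-encoded payload
  mimeType: string; // e.g. "audio/pcm;rate=16000"
}

// Legacy shape, as currently passed to sendRealtimeInput() per this issue.
interface LegacyRealtimeInput {
  media: MediaBlob;
}

// Newer shape using the explicit audio field named in the server error.
interface AudioRealtimeInput {
  audio: MediaBlob;
}

// Hypothetical helper: move an audio chunk from `media` to `audio`.
// Non-audio chunks are rejected here rather than guessed at.
function toAudioRealtimeInput(input: LegacyRealtimeInput): AudioRealtimeInput {
  if (!input.media.mimeType.startsWith('audio/')) {
    throw new Error(`Expected an audio MIME type, got: ${input.media.mimeType}`);
  }
  return { audio: input.media };
}
```

If this is the right direction, the result would be passed where `{ media: mediaChunk }` is passed today; whether 2.5 models accept the `audio` field equally well is something we have not verified.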
Why this looks plugin-specific
This is not just app-level behavior:
- the repro bypasses our LiveKit rooms / VAD / pause-resume logic
- the same API key and same code path work on 2.5
- only the model changes to gemini-3.1-flash-live-preview
Request
Could you confirm whether agents-plugin-google needs a Gemini 3.1 compatibility update for:
- realtime audio input (media / mediaChunks -> audio)
- the current generateReply() / client-content generation path for 3.1
If useful, I can also provide the minimal repro script in a follow-up comment.