
fix(google realtime): support gemini-3.1-flash-live-preview#5251

Open
Hormold wants to merge 2 commits into main from fix/gemini-31-live-compat

Conversation

Contributor

@Hormold commented Mar 27, 2026

Adds working support for gemini-3.1-flash-live-preview.

Gemini 3.1 changed how send_client_content works. It's now only allowed for initial history seeding (with history_config.initial_history_in_client_content=true).

After the first model turn, all text input must go through send_realtime_input. This broke generate_reply, and any mid-session LiveClientContent (from update_instructions, update_chat_ctx, etc.) gets rejected with a 1007 error.

What this PR does:

  • Introduces a RESTRICTED_CLIENT_CONTENT_MODELS set in api_proto.py to identify models with this limitation; generate_reply() uses LiveClientRealtimeInput(text=...) instead of LiveClientContent for these models
  • _send_task drops LiveClientContent events mid-session to prevent 1007 crashes. With reconnect_on_update=True, it triggers a reconnect instead so updates take effect via the new session's setup config
  • _build_connect_config adds history_config.initial_history_in_client_content=True so initial history seeding still works on connect/reconnect
  • _handle_server_content skips the empty server_content messages that 3.1 sends (only usage_metadata, no actual content); without this, the first generation's audio gets dropped (see #5234: [livekit-plugins-google] does not support gemini-3.1-flash-live-preview)
  • Adds reconnect_on_update parameter to RealtimeModel — opt-in reconnect when update_instructions/update_chat_ctx is called on restricted models. Off by default since it interrupts the session.
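The model gating in the first bullet can be sketched roughly like this. This is a minimal illustration, not the PR's actual code: the dataclasses are stand-ins for the real google.genai types, and `build_generate_reply_event` is a hypothetical name for the branch inside generate_reply():

```python
from dataclasses import dataclass, field


# Stand-ins for google.genai's LiveClientContent / LiveClientRealtimeInput,
# just enough shape for the sketch below.
@dataclass
class LiveClientContent:
    turns: list = field(default_factory=list)


@dataclass
class LiveClientRealtimeInput:
    text: str = ""


# Mirrors the RESTRICTED_CLIENT_CONTENT_MODELS frozenset added in api_proto.py.
RESTRICTED_CLIENT_CONTENT_MODELS = frozenset({"gemini-3.1-flash-live-preview"})


def build_generate_reply_event(model: str, instructions: str):
    """Choose the wire event for generate_reply based on the model.

    Restricted models reject send_client_content after the first model turn,
    so text has to be routed through send_realtime_input instead.
    """
    if model in RESTRICTED_CLIENT_CONTENT_MODELS:
        # Fall back to "." when no instructions are given, matching the
        # prompt fallback shown in the review snippet below.
        return LiveClientRealtimeInput(text=instructions or ".")
    # Non-restricted models keep the original LiveClientContent framing.
    return LiveClientContent(
        turns=[{"role": "model", "parts": [{"text": instructions}]}]
    )
```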

Tested locally: greeting via generate_reply, bidirectional audio, function tool calling, update_instructions and update_chat_ctx with reconnect all work.

Known limitation: mid-session update_instructions and update_chat_ctx require a reconnect on 3.1 - there's no way around this since the model simply doesn't accept send_client_content after the first turn. Google's migration guide confirms this is by design.

Ref: https://ai.google.dev/gemini-api/docs/models/gemini-3.1-flash-live-preview#migrating
Related: #5234
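The _send_task behavior described above (drop vs. reconnect) can be sketched as follows. This is a simplified illustration with stand-in types and injected callbacks; apart from LiveClientContent and reconnect_on_update, the names here are hypothetical:

```python
import logging

logger = logging.getLogger(__name__)


class LiveClientContent:
    """Stand-in for google.genai's LiveClientContent type."""


def handle_send_event(event, *, model_restricted: bool, reconnect_on_update: bool,
                      mark_restart_needed, send) -> None:
    """Sketch of the mid-session branch in _send_task.

    On restricted models, LiveClientContent is never forwarded after the
    first turn; it either triggers a reconnect or is dropped with a warning.
    """
    if model_restricted and isinstance(event, LiveClientContent):
        if reconnect_on_update:
            # Reconnect: the update takes effect via the new session's
            # setup config (system_instruction / initial history).
            mark_restart_needed()
        else:
            # Drop: avoids the 1007 close the server would otherwise send.
            logger.warning(
                "dropping mid-session LiveClientContent on restricted model"
            )
        return
    # Everything else (audio frames, realtime input, tool responses) passes through.
    send(event)
```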

Gemini 3.1 rejects send_client_content after the first model turn. Route
generate_reply through send_realtime_input, drop mid-session
LiveClientContent to prevent 1007 errors, add history_config for initial
context seeding, and skip empty server_content events.

New reconnect_on_update option enables session restart on
update_instructions/update_chat_ctx for restricted models.
@chenghao-mou chenghao-mou requested a review from a team March 27, 2026 20:50

@devin-ai-integration bot left a comment


✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 4 additional findings.



@JiwaniZakir JiwaniZakir left a comment


In generate_reply, the restricted-model branch sends instructions directly as a LiveClientRealtimeInput text prompt, but the original non-restricted path framed those same instructions as a role="model" content turn followed by a role="user" placeholder turn. These have meaningfully different semantics — the instructions were intended to prime the model's perspective, not arrive as user speech — so the behavior diverges silently for callers passing instructions to generate_reply on a restricted model.

In _send_task, when reconnect_on_update=True triggers _mark_restart_needed() upon receiving a LiveClientContent, the content in msg.turns is discarded. The reconnect re-seeds history via system_instruction/initial_history, but any turn data specific to that particular LiveClientContent message (e.g., a one-off model-role instruction) is lost without any log warning, making this a silent data loss path that could be hard to debug.

The RESTRICTED_CLIENT_CONTENT_MODELS membership check is now scattered across at least three call sites (generate_reply, _send_task, and implicitly _build_connect_config per the comment). Centralizing this behind a small helper method like _is_restricted_client_content_model() would reduce the risk of a future model being added to the frozenset but missing one of the branching locations.
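The suggested helper is small; a sketch of what centralizing the check might look like (class and attribute names here are hypothetical, modeled on the session object described in the PR):

```python
# Mirrors the frozenset the PR adds in api_proto.py.
RESTRICTED_CLIENT_CONTENT_MODELS = frozenset({"gemini-3.1-flash-live-preview"})


class RealtimeSession:
    """Minimal stand-in showing only the centralized restriction check."""

    def __init__(self, model: str) -> None:
        self._model = model

    def _is_restricted_client_content_model(self) -> bool:
        # Single source of truth for the gemini-3.1 client-content
        # restriction, shared by generate_reply, _send_task, and
        # _build_connect_config, so a future model added to the frozenset
        # cannot miss one of the branching locations.
        return self._model in RESTRICTED_CLIENT_CONTENT_MODELS
```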


@davidzhao left a comment


I don't think this type of workaround is a good idea. Let's engage the DeepMind folks to understand the best path forward.

```python
    conn_options: APIConnectOptions = DEFAULT_API_CONNECT_OPTIONS,
    http_options: NotGivenOr[types.HttpOptions] = NOT_GIVEN,
    thinking_config: NotGivenOr[types.ThinkingConfig] = NOT_GIVEN,
    reconnect_on_update: bool = False,
```

Should this be a user-specified option? When would you want this to be True unless the model requires it?

```python
# See: https://ai.google.dev/gemini-api/docs/models/gemini-3.1-flash-live-preview#migrating
if self._opts.model in RESTRICTED_CLIENT_CONTENT_MODELS:
    prompt = instructions if is_given(instructions) else "."
    self._send_client_event(types.LiveClientRealtimeInput(text=prompt))
```

This isn't going to work. Realtime input comes from the end user, but generate_reply instructions need to come from the model itself.
