fix(gemini): forward stream chunks that carry only UsageMetadata #2848

Merged
dgageot merged 1 commit into docker:main from kenijkawada:fix/gemini-stream-usage-metadata
May 21, 2026

@kenijkawada
Contributor

Summary

Gemini 3 family models (e.g. gemini-3-flash-preview,
gemini-3.1-flash-lite-preview) emit usageMetadata on stream chunks that
contain no text and no function calls. The current gemini.StreamAdapter
drops those chunks before they reach the runtime, so all token counts and
costs come back as zero. Gemini 2.5 happens to fit its usageMetadata into
the same chunk as the final text, which is why it has worked so far.

Root cause

Two places in pkg/model/provider/gemini/adapter.go:

  1. NewStreamAdapter forwarded only chunks with hasText || hasFuncs.
    Usage-only chunks were silently discarded before reaching the channel.
  2. Recv() extracted UsageMetadata only in the else if res.resp != nil
    branch, so even when the synthesised "done" event carried the last
    response, the usage was not surfaced.

Fix

  • Forward chunks that have UsageMetadata even when text and tool calls are
    absent (hasUsage check added to the OR).
  • Move Usage extraction out of the else if branch so the done event
    also exposes the usage of the last response.
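
The widened forwarding condition can be sketched as follows. This is a minimal, self-contained illustration, not the code in adapter.go: the chunk and usageMetadata types and the shouldForward and forward helpers are hypothetical stand-ins for the SDK response types and the adapter's internal loop.

```go
package main

import "fmt"

// usageMetadata is a hypothetical stand-in for the SDK's usage payload.
type usageMetadata struct {
	PromptTokens int
	OutputTokens int
}

// chunk is a hypothetical, simplified view of one streamed response.
type chunk struct {
	Text      string
	FuncCalls []string
	Usage     *usageMetadata
}

// shouldForward mirrors the condition described in the fix:
// hasText || hasFuncs || hasUsage.
func shouldForward(c chunk) bool {
	hasText := c.Text != ""
	hasFuncs := len(c.FuncCalls) > 0
	hasUsage := c.Usage != nil
	return hasText || hasFuncs || hasUsage
}

// forward keeps only the chunks that pass the filter, in order.
func forward(in []chunk) []chunk {
	var out []chunk
	for _, c := range in {
		if shouldForward(c) {
			out = append(out, c)
		}
	}
	return out
}

func main() {
	stream := []chunk{
		{Text: "Hello"},
		{}, // carries nothing: still dropped
		{Usage: &usageMetadata{PromptTokens: 12, OutputTokens: 3}}, // usage only: now forwarded
	}
	fmt.Println(len(forward(stream))) // prints 2
}
```

With the earlier hasText || hasFuncs condition, the third chunk in this example would have been discarded along with the empty one.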

Reproduction

The Vertex AI response for gemini-3-flash-preview contains usageMetadata
(verified with a direct curl against :generateContent and
:streamGenerateContent), but docker agent run --exec --yolo records
usage = {input:0, output:0} in session_items.message_json and
sessions.cost = 0 in the SQLite store.

After the fix, the same run records non-zero token counts and a non-zero
cost, which the runtime computes against the models.dev catalog.

model                  | before                     | after
gemini-3-flash-preview | usage = all zero, cost = 0 | usage populated, cost > 0
gemini-2.5-flash       | already worked             | still works (regression test)

Tests

pkg/model/provider/gemini/adapter_test.go gains three sub-tests under
TestStreamAdapter_GeminiUsageMetadata:

  • forwards chunks containing only UsageMetadata — directly guards the
    dropped-chunk regression.
  • done event carries usage from last response — guards the second fix
    (extracting usage on the done branch).
  • trackUsage=false suppresses usage extraction — confirms the existing
    opt-out still works.
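
The behaviour checked by the second and third sub-tests can be illustrated with a simplified model. This is a sketch under stated assumptions, not the contents of adapter_test.go: the usage, response, result, and event types and the toEvent helper are hypothetical, and trackUsage is modelled as a plain boolean argument.

```go
package main

import "fmt"

// usage is a hypothetical stand-in for the token counts exposed to the runtime.
type usage struct {
	InputTokens  int64
	OutputTokens int64
}

// response is a hypothetical, simplified provider response.
type response struct {
	Usage *usage
}

// result models one item read from the stream; done marks the synthesised
// final event, which still carries the last response.
type result struct {
	resp *response
	done bool
}

// event is what the hypothetical adapter hands to its caller.
type event struct {
	Done  bool
	Usage *usage
}

// toEvent extracts usage whenever a response is present, regardless of
// whether the result is the done event, and only if trackUsage is set.
func toEvent(res result, trackUsage bool) event {
	ev := event{Done: res.done}
	if trackUsage && res.resp != nil && res.resp.Usage != nil {
		ev.Usage = res.resp.Usage
	}
	return ev
}

func main() {
	last := &response{Usage: &usage{InputTokens: 12, OutputTokens: 3}}

	// Done event with tracking enabled: usage is surfaced.
	on := toEvent(result{resp: last, done: true}, true)
	fmt.Println(on.Done, on.Usage != nil) // prints: true true

	// Same event with tracking disabled: usage is suppressed.
	off := toEvent(result{resp: last, done: true}, false)
	fmt.Println(off.Usage == nil) // prints: true
}
```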

The existing TestStreamAdapter_FunctionCalls continues to pass.

@kenijkawada kenijkawada marked this pull request as ready for review May 21, 2026 08:28
@kenijkawada kenijkawada requested a review from a team as a code owner May 21, 2026 08:28
@dgageot dgageot merged commit c6848bd into docker:main May 21, 2026
8 checks passed