
fix: record token usage once per stream to prevent inflated telemetry #1855

Merged
dgageot merged 1 commit into docker:main from dgageot:fix-gemini-cost
Feb 26, 2026

Conversation

@dgageot
Member

@dgageot dgageot commented Feb 26, 2026

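Providers like Gemini emit usage metadata on every streaming chunk, and each snapshot carries cumulative token counts rather than a per-chunk delta. The previous code ran telemetry.RecordTokenUsage (which uses +=) on every chunk, so tokens already reported by earlier chunks were added again each time, massively inflating the telemetry token counts and cost.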

Move the session token updates and telemetry call out of the per-chunk loop into a recordUsage helper that runs exactly once, right before handleStream returns. The latest usage snapshot is still captured (last wins) so token counts remain accurate.
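
To make the inflation concrete, here is a minimal sketch of the two recording strategies. It is not the code from this PR: the `Usage`, `Chunk`, and `Telemetry` types and both functions are hypothetical stand-ins, and the only behavior taken from the description is that recording is additive (`+=`) and that each chunk carries a cumulative snapshot.

```go
package main

import "fmt"

// Usage is a hypothetical cumulative usage snapshot attached to a chunk.
type Usage struct {
	InputTokens  int
	OutputTokens int
}

// Chunk is a hypothetical streaming chunk; Usage is nil when none was sent.
type Chunk struct {
	Text  string
	Usage *Usage
}

// Telemetry stands in for an additive counter, mirroring the += behavior
// the PR description attributes to telemetry.RecordTokenUsage.
type Telemetry struct {
	TotalTokens int
}

// Record adds the snapshot's tokens to the running total.
func (t *Telemetry) Record(u Usage) {
	t.TotalTokens += u.InputTokens + u.OutputTokens
}

// recordPerChunk mimics the old behavior: every cumulative snapshot is
// added, so earlier tokens are counted again on each chunk.
func recordPerChunk(chunks []Chunk, t *Telemetry) {
	for _, c := range chunks {
		if c.Usage != nil {
			t.Record(*c.Usage)
		}
	}
}

// recordOnce keeps only the latest snapshot (last wins) and records it a
// single time after the loop.
func recordOnce(chunks []Chunk, t *Telemetry) {
	var last *Usage
	for _, c := range chunks {
		if c.Usage != nil {
			last = c.Usage
		}
	}
	if last != nil {
		t.Record(*last)
	}
}

// sampleChunks returns three chunks whose output token count grows
// cumulatively: 1, then 2, then 3.
func sampleChunks() []Chunk {
	return []Chunk{
		{Text: "Hel", Usage: &Usage{InputTokens: 10, OutputTokens: 1}},
		{Text: "lo ", Usage: &Usage{InputTokens: 10, OutputTokens: 2}},
		{Text: "world", Usage: &Usage{InputTokens: 10, OutputTokens: 3}},
	}
}

func main() {
	var perChunk, once Telemetry
	recordPerChunk(sampleChunks(), &perChunk)
	recordOnce(sampleChunks(), &once)
	fmt.Println("per chunk:", perChunk.TotalTokens) // 11 + 12 + 13 = 36
	fmt.Println("once:", once.TotalTokens)          // 13
}
```

With only three chunks the per-chunk total is already almost three times the real figure, and the gap grows with the number of chunks in the stream.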

Assisted-By: cagent


@docker-agent docker-agent bot left a comment


Review Summary

Approved - This PR correctly fixes the token usage inflation issue by moving the telemetry recording out of the per-chunk loop.

The refactoring is well-structured with the recordUsage helper function that uses a guard flag to ensure telemetry is recorded exactly once per stream. The logic properly handles multiple exit paths (early return on FinishReason and fallback at end of loop), and the latest usage snapshot is correctly captured for providers like Gemini that emit cumulative token counts on every chunk.

No bugs found in the changed code.
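
The guard-flag pattern mentioned in the review above can be sketched roughly as follows. This is an illustration under assumptions, not the merged code: the `Recorder` type, the `Chunk` fields, and the signature of `handleStream` are hypothetical, and only the overall shape (a `recordUsage` helper, an early return on a finish reason, and a fallback at the end of the loop) comes from the review text.

```go
package main

import "fmt"

// Usage is a hypothetical cumulative usage snapshot.
type Usage struct{ InputTokens, OutputTokens int }

// Chunk is a hypothetical streaming chunk. A non-empty FinishReason marks
// the provider's final chunk.
type Chunk struct {
	Usage        *Usage
	FinishReason string
}

// Recorder is a hypothetical additive telemetry sink that also counts how
// many times it was called.
type Recorder struct{ Calls, TotalTokens int }

// Record adds one snapshot to the running totals.
func (r *Recorder) Record(u Usage) {
	r.Calls++
	r.TotalTokens += u.InputTokens + u.OutputTokens
}

// handleStream consumes chunks, remembers the latest usage snapshot, and
// records it at most once, whichever exit path is taken.
func handleStream(chunks []Chunk, rec *Recorder) {
	var latest *Usage
	recorded := false

	// recordUsage is guarded so that calling it from several exit paths
	// can never record the same stream twice.
	recordUsage := func() {
		if recorded || latest == nil {
			return
		}
		recorded = true
		rec.Record(*latest)
	}

	for _, c := range chunks {
		if c.Usage != nil {
			latest = c.Usage // last wins
		}
		if c.FinishReason != "" {
			recordUsage() // early exit path
			return
		}
	}
	recordUsage() // fallback when no finish reason was seen
}

func main() {
	finished := []Chunk{
		{Usage: &Usage{InputTokens: 10, OutputTokens: 1}},
		{Usage: &Usage{InputTokens: 10, OutputTokens: 2}, FinishReason: "stop"},
	}
	var rec Recorder
	handleStream(finished, &rec)
	fmt.Println(rec.Calls, rec.TotalTokens) // 1 12
}
```

In Go, a `defer recordUsage()` at the top of the function would be another way to cover every return path. The explicit calls are used here only because they match the two exit paths the review names.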

@rumpl
Member

rumpl commented Feb 26, 2026

Wasn't this already fixed once?

@dgageot dgageot merged commit 4c16dda into docker:main Feb 26, 2026
8 checks passed