🤖 fix: correct context usage display for multi-step tool calls #893
Conversation
The Context Usage UI was showing inflated `cachedInputTokens` for plan messages with multi-step tool calls (e.g., ~150k instead of ~50k). Root cause: the context display fell back to cumulative usage (summed across all steps) when `contextUsage` was undefined. For multi-step requests, `cachedInputTokens` gets summed because each step reads from cache, but the actual context window only holds one step's worth.

Changes:
- Backend: Refactor `getStreamMetadata()` to fetch `totalUsage` (for costs) and `contextUsage` (last step, for the context window) separately from the AI SDK
- Backend: Add `contextProviderMetadata` from `streamResult.providerMetadata` for accurate cache creation token display
- Frontend: Remove the fallback from `contextUsage` to `usage` - only use `contextUsage` for the context window display

The fix ensures the context window shows the last step's `inputTokens` (the actual context size) while cost calculation still uses cumulative totals.
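The inflation can be illustrated with a small sketch (the step values below are made up, and the `Usage` shape is a simplified stand-in for the AI SDK's per-step usage objects):

```typescript
// Illustrative only: per-step usage for a three-step tool-call request.
// Each step re-reads roughly the same ~50k-token prefix from cache, so
// summing cachedInputTokens across steps triples the number shown if the
// cumulative total is used for the context window.
interface Usage {
  inputTokens: number;
  cachedInputTokens: number;
}

const steps: Usage[] = [
  { inputTokens: 50_000, cachedInputTokens: 48_000 },
  { inputTokens: 50_500, cachedInputTokens: 49_000 },
  { inputTokens: 51_000, cachedInputTokens: 50_000 },
];

// Cumulative usage: correct for cost, wrong for the context window.
const totalUsage: Usage = steps.reduce((acc, s) => ({
  inputTokens: acc.inputTokens + s.inputTokens,
  cachedInputTokens: acc.cachedInputTokens + s.cachedInputTokens,
}));

// Context usage: the last step alone reflects the actual context size.
const contextUsage = steps[steps.length - 1];

console.log(totalUsage.cachedInputTokens);   // 147000 - inflated if shown as context
console.log(contextUsage.cachedInputTokens); // 50000 - the actual cached context
```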
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Each promise (totalUsage, contextUsage, contextProviderMetadata) now has independent timeout + error handling. If providerMetadata rejects or times out, totalUsage is still returned for cost calculation.
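A minimal sketch of that pattern, assuming a `withTimeout` helper and a 5-second budget (both are illustrative, not the actual `streamManager.ts` implementation):

```typescript
// Resolve to undefined on rejection or timeout, so one slow or failed
// promise never blocks or poisons the others.
function withTimeout<T>(p: Promise<T>, ms: number): Promise<T | undefined> {
  return Promise.race([
    p.catch(() => undefined),
    new Promise<undefined>((resolve) => setTimeout(() => resolve(undefined), ms)),
  ]);
}

// Each value is awaited independently: if providerMetadata rejects or
// times out, totalUsage still comes back for cost calculation.
async function getStreamMetadata(result: {
  totalUsage: Promise<unknown>;
  usage: Promise<unknown>;
  providerMetadata: Promise<unknown>;
}) {
  const [totalUsage, contextUsage, contextProviderMetadata] = await Promise.all([
    withTimeout(result.totalUsage, 5_000),
    withTimeout(result.usage, 5_000),
    withTimeout(result.providerMetadata, 5_000),
  ]);
  return { totalUsage, contextUsage, contextProviderMetadata };
}
```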
@codex review
Codex Review: Didn't find any major issues. You're on a roll.
Problem

The Context Usage UI was sometimes showing inflated token counts for multi-step tool calls (e.g., plan messages showing ~150k `cachedInputTokens` instead of ~50k).

Root Cause
Two issues combined:

1. Backend: On stream-end, we only used `lastStepUsage` tracked from `finish-step` events. If no `finish-step` event was received (edge cases), `contextUsage` would be undefined. We weren't fetching `streamResult.usage` (which contains the last step's usage) as the primary source.
2. Frontend: When `contextUsage` was undefined, it fell back to cumulative `usage`. For multi-step requests, cumulative usage sums `cachedInputTokens` across all steps (each step reads from cache), inflating the context window display.

Fix
Backend (`streamManager.ts`):
- Fetch `contextUsage` from `streamResult.usage` (last step) on stream-end as the primary source
- Fall back to `lastStepUsage` only if `streamResult.usage` times out

Frontend (`WorkspaceStore.ts`):
- Remove the fallback from `contextUsage` to `usage` - only use `contextUsage` for the context window display

Schema (`message.ts`):
- Add `contextUsage` and `contextProviderMetadata` to the Zod schema to match the TypeScript types

Migration Note
Messages persisted before this fix won't have `contextUsage` stored in their message history. These workspaces will show no context usage until the next message is sent.

Generated with mux
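The frontend side of the fix can be sketched as follows (a hypothetical selector, not the actual `WorkspaceStore.ts` code; the `Usage` and `MessageMetadata` shapes are simplified):

```typescript
interface Usage {
  inputTokens: number;
  cachedInputTokens?: number;
}

interface MessageMetadata {
  usage?: Usage;        // cumulative across steps - drives cost display
  contextUsage?: Usage; // last step only - drives context window display
}

// Before: when contextUsage was missing, cumulative usage leaked into
// the context window, inflating multi-step numbers.
function contextWindowUsageBefore(m: MessageMetadata): Usage | undefined {
  return m.contextUsage ?? m.usage;
}

// After: only contextUsage is used; older messages without it simply
// show no context usage until the next message is sent.
function contextWindowUsageAfter(m: MessageMetadata): Usage | undefined {
  return m.contextUsage;
}
```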