Conversation

@ethanndickson commented Dec 4, 2025

Problem

The Context Usage UI was sometimes showing inflated token counts for multi-step tool calls (e.g., plan messages showing ~150k cachedInputTokens instead of ~50k).

Root Cause

Two issues combined:

  1. Backend: On stream-end, we only used lastStepUsage, tracked from finish-step events. If no finish-step event was received (an edge case), contextUsage would be undefined, because we weren't fetching streamResult.usage (which contains the last step's usage) as the primary source.

  2. Frontend: When contextUsage was undefined, the UI fell back to cumulative usage. For multi-step requests, cumulative usage sums cachedInputTokens across all steps (each step reads from the cache), inflating the context window display; the sketch below makes this concrete.
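
To make the inflation concrete, here is a minimal sketch (the step values and object shape are hypothetical; real usage objects come from the AI SDK) of how summing cachedInputTokens across steps diverges from the last step's actual context size:

```typescript
// Hypothetical numbers for a 3-step tool-calling request: each step
// re-reads roughly the same ~50k-token prefix from the provider cache.
const steps = [
  { inputTokens: 52_000, cachedInputTokens: 50_000 },
  { inputTokens: 53_500, cachedInputTokens: 50_000 },
  { inputTokens: 55_000, cachedInputTokens: 50_000 },
];

// Cumulative usage (right for cost accounting): sums every step.
const cumulativeCached = steps.reduce((n, s) => n + s.cachedInputTokens, 0);
console.log(cumulativeCached); // 150000 — what the UI wrongly displayed

// Context usage (right for the context window): the last step only.
const lastStep = steps[steps.length - 1];
console.log(lastStep.cachedInputTokens); // 50000 — the actual context size
```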


Fix

Backend (streamManager.ts):

  • Fetch contextUsage from streamResult.usage (the last step's usage) on stream-end as the primary source
  • Fall back to the tracked lastStepUsage only if streamResult.usage times out
  • Give each metadata fetch independent timeout/error handling so one failure doesn't mask the others (see the sketch below)
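
As a rough sketch of the shape of this change (the withTimeout helper, the 5s timeout, and the exact return shape are illustrative assumptions, not the actual streamManager.ts code; totalUsage, usage, and providerMetadata are the AI SDK promises named above):

```typescript
type Usage = { inputTokens?: number; outputTokens?: number; cachedInputTokens?: number };

// Each fetch resolves independently to undefined on timeout or rejection,
// so a slow or rejected providerMetadata can't mask totalUsage or usage.
const withTimeout = <T>(p: Promise<T>, ms: number): Promise<T | undefined> =>
  Promise.race([
    p.catch(() => undefined),
    new Promise<undefined>((resolve) => setTimeout(() => resolve(undefined), ms)),
  ]);

async function getStreamMetadata(
  streamResult: {
    totalUsage: Promise<Usage>; // cumulative across all steps (costs)
    usage: Promise<Usage>; // last step only (context window)
    providerMetadata: Promise<unknown>;
  },
  lastStepUsage: Usage | undefined, // tracked from finish-step events
) {
  const [totalUsage, lastUsage, contextProviderMetadata] = await Promise.all([
    withTimeout(streamResult.totalUsage, 5_000),
    withTimeout(streamResult.usage, 5_000),
    withTimeout(streamResult.providerMetadata, 5_000),
  ]);
  return {
    totalUsage,
    // Primary source is streamResult.usage; fall back to the tracked
    // lastStepUsage only if that fetch timed out or rejected.
    contextUsage: lastUsage ?? lastStepUsage,
    contextProviderMetadata,
  };
}
```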

Frontend (WorkspaceStore.ts):

  • Remove the fallback from contextUsage to usage; use only contextUsage for the context window display (see the sketch below)
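
A minimal sketch of the frontend side (the Message shape here is an illustrative stand-in for the WorkspaceStore.ts internals, which this description doesn't show):

```typescript
interface Message {
  usage?: { cachedInputTokens?: number };        // cumulative across steps
  contextUsage?: { cachedInputTokens?: number }; // last step only
}

// Before: an undefined contextUsage silently fell back to cumulative usage.
// const contextWindowUsage = (m: Message) => m.contextUsage ?? m.usage;

// After: only contextUsage feeds the context window display; if it is
// missing (e.g., pre-fix message history), show no context usage at all.
const contextWindowUsage = (m: Message) => m.contextUsage;
```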

Schema (message.ts):

  • Add contextUsage and contextProviderMetadata to the Zod schema to match the TypeScript types (see the sketch below)
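
A sketch of the message.ts schema addition (the usage field shapes are assumptions mirrored from the description above, not the repo's actual definitions):

```typescript
import { z } from "zod";

const UsageSchema = z.object({
  inputTokens: z.number().optional(),
  outputTokens: z.number().optional(),
  cachedInputTokens: z.number().optional(),
});

const MessageMetadataSchema = z.object({
  // ...existing fields...
  usage: UsageSchema.optional(), // cumulative, for cost calculation
  contextUsage: UsageSchema.optional(), // last step, for the context window
  contextProviderMetadata: z.record(z.unknown()).optional(),
});
```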

Migration Note

⚠️ Old workspaces created before this fix may not have contextUsage stored in their message history. These workspaces will show no context usage until the next message is sent.


Generated with mux

@ethanndickson force-pushed the fix-cached-tokens-context-usage branch from b358450 to 8883615 on December 4, 2025 at 05:08
The Context Usage UI was showing inflated cachedInputTokens for plan
messages with multi-step tool calls (e.g., ~150k instead of ~50k).

Root cause: when contextUsage was undefined, the UI fell back to
cumulative usage (summed across all steps). For multi-step requests,
cachedInputTokens gets summed because each step reads from the cache,
but the actual context window only sees one step's worth.

Changes:
- Backend: Refactor getStreamMetadata() to fetch totalUsage (for costs)
  and contextUsage (last step, for context window) separately from AI SDK
- Backend: Add contextProviderMetadata from streamResult.providerMetadata
  for accurate cache creation token display
- Frontend: Remove fallback from contextUsage to usage - only use
  contextUsage for context window display

The fix ensures context window shows last step's inputTokens (actual
context size) while cost calculation still uses cumulative totals.
@ethanndickson force-pushed the fix-cached-tokens-context-usage branch from 8883615 to 51e533d on December 4, 2025 at 05:11
@ethanndickson

@codex review

@chatgpt-codex-connector (bot) left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.


Each promise (totalUsage, contextUsage, contextProviderMetadata) now has
independent timeout + error handling. If providerMetadata rejects or times
out, totalUsage is still returned for cost calculation.
@ethanndickson

@codex review

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. You're on a roll.


@ethanndickson added this pull request to the merge queue Dec 4, 2025
@ethanndickson removed this pull request from the merge queue due to a manual request Dec 4, 2025
@ethanndickson added this pull request to the merge queue Dec 4, 2025
Merged via the queue into main with commit f5e1d6c Dec 4, 2025
16 checks passed
@ethanndickson deleted the fix-cached-tokens-context-usage branch on December 4, 2025 at 05:34