fix(cost): Opus-accurate context ring + cost ledger + stop button by stansalvatec · Pull Request #2273 · pingdotgg/t3code

stansalvatec · 2026-04-21T19:53:37Z

Summary

Five related fixes stacked on the token-cost-meter branch, now merged together:

Stop button stuck active after model done (96768f1) — client optimistically reconciles session.status in thread.message-sent and guards onInterrupt when the latest turn is in a terminal state.
Bot-review cleanups (76a3495) — dead useInvalidateCostSummary removed, duplicate formatUsd re-exported from @t3tools/shared/pricing, no-op ternaries in sanitizePersistedFile simplified.
Cost ledger over-counting (b027c89) — ProviderRuntimeIngestion now gates recordUsage on the presence of any lastXxxTokens field (the canonical "turn-final" signal). Mid-turn Claude snapshots only flow to the context-window activity, not the ledger. Also normalises the model slug before ledger writes so the byModel breakdown stays stable.
Context-window ring over-reporting (step 1) (b027c89) — both adapters redefined usedTokens as input-side only (input + cache-read + cache-creation for Claude, last.inputTokens + last.cachedInputTokens for Codex). totalProcessedTokens keeps its billing-side semantic. ContextWindowSnapshot now carries cacheCreationInputTokens / lastCacheCreationInputTokens.
Opus-accurate context ring (step 2) (d46b444) — the step-1 fix still over-reported on Opus because we fell back to result.usage (which is session-cumulative across every API call, not per-turn), and the task_progress.usage SDK field only exposes an opaque total_tokens. Now we capture input + cache_read + cache_creation from every SDKAssistantMessage.message.usage (Anthropic-native per-call breakdown) and use it as the top-priority usedTokens source, emit mid-turn ring updates on each assistant frame, and drop the session-cumulative fallback entirely.
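
The ledger gate described in the third fix can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `UsageSnapshot`, `isTurnFinal`, and `ingest` are hypothetical names, and the snapshot shape is trimmed to the fields the summary mentions.

```typescript
// Hypothetical, trimmed-down snapshot shape; the real ThreadTokenUsageSnapshot
// carries more fields than shown here.
interface UsageSnapshot {
  usedTokens: number;
  lastInputTokens?: number;
  lastCachedInputTokens?: number;
  lastCacheCreationInputTokens?: number;
  lastOutputTokens?: number;
}

// A snapshot is treated as turn-final when any lastXxxTokens field is present.
function isTurnFinal(s: UsageSnapshot): boolean {
  return (
    s.lastInputTokens !== undefined ||
    s.lastCachedInputTokens !== undefined ||
    s.lastCacheCreationInputTokens !== undefined ||
    s.lastOutputTokens !== undefined
  );
}

// Every snapshot feeds the context ring; only turn-final ones reach the ledger.
function ingest(
  s: UsageSnapshot,
  sinks: { ring: (s: UsageSnapshot) => void; ledger: (s: UsageSnapshot) => void },
): void {
  sinks.ring(s);
  if (isTurnFinal(s)) {
    sinks.ledger(s);
  }
}
```

With this shape, a mid-turn Claude snapshot (no `lastXxxTokens`) updates the ring but leaves the ledger untouched.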

Migration: existing ledger files are polluted and can't be repaired in-place. CostTrackerLive writes a .schema-v2 sentinel in the usage dir on boot; when absent it wipes the known ledger files (session_*.json, YYYY-MM.json, alltime.json) and writes the sentinel. Stray non-ledger files are left alone. Bumping LEDGER_SCHEMA_VERSION is the single line needed for future reducer-incompatible changes.
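
The first-boot decision can be modelled as a pure function over a directory listing. This is a sketch under assumptions: `planLedgerMigration` and `MigrationPlan` are hypothetical names, the value `2` for `LEDGER_SCHEMA_VERSION` is inferred from the `.schema-v2` sentinel name, and the actual `CostTrackerLive` performs real filesystem I/O rather than returning a plan.

```typescript
// Assumed value: the PR names the constant and the ".schema-v2" sentinel only.
const LEDGER_SCHEMA_VERSION = 2;

function sentinelName(version: number): string {
  return `.schema-v${version}`;
}

// Patterns for the three known ledger file kinds named in the PR.
const LEDGER_FILE_PATTERNS: RegExp[] = [
  /^session_.+\.json$/,
  /^\d{4}-\d{2}\.json$/,
  /^alltime\.json$/,
];

interface MigrationPlan {
  wipe: string[];               // ledger files to delete
  writeSentinel: string | null; // sentinel to create, or null when already present
}

function planLedgerMigration(filesInUsageDir: string[]): MigrationPlan {
  const sentinel = sentinelName(LEDGER_SCHEMA_VERSION);
  if (filesInUsageDir.some((f) => f === sentinel)) {
    // Sentinel present: a later boot, nothing to wipe.
    return { wipe: [], writeSentinel: null };
  }
  // Only known ledger files are wiped; stray files are left alone.
  const wipe = filesInUsageDir.filter((f) =>
    LEDGER_FILE_PATTERNS.some((p) => p.test(f)),
  );
  return { wipe, writeSentinel: sentinel };
}
```

In this model, bumping `LEDGER_SCHEMA_VERSION` changes the sentinel name, so the next boot finds no matching sentinel and wipes again.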

Existing threads: context-window.updated activities sit in the orchestration event log per-thread. The ring reads the latest such activity, so existing threads keep their pre-fix (inflated) values until a new turn lands. New chats → correct immediately. Old threads → self-heal on next turn.

Files

apps/server/src/orchestration/Layers/ProviderRuntimeIngestion.ts — turn-final filter + model slug normalisation.
apps/server/src/provider/Layers/ClaudeAdapter.ts — usedTokens = input-side; per-call capture from SDKAssistantMessage.message.usage; priority for lastApiCallInputSide.
apps/server/src/provider/Layers/CodexAdapter.ts — usedTokens = last.inputTokens + last.cachedInputTokens.
apps/server/src/cost/Layers/CostTracker.ts — schema sentinel + first-boot wipe.
apps/web/src/lib/contextWindow.ts — carry through cacheCreationInputTokens / lastCacheCreationInputTokens.
packages/contracts/src/providerRuntime.ts — JSDoc for ThreadTokenUsageSnapshot (two dimensions, turn-final signal).
apps/web/src/store.ts + apps/web/src/components/ChatView.tsx — stop-button reconciliation.
apps/web/src/lib/costQuery.ts + apps/server/src/cost/Reducer.ts — bot-review cleanups.

Test plan

Migration note for users

On first server boot after merging, the usage ledger at <T3CODE_HOME>/<state>/usage/ is wiped (session + month + all-time files) to clear totals polluted by the pre-fix reducer. A .schema-v2 sentinel is written to prevent re-wipes. Month + all-time totals rebuild from subsequent turns; per-thread session files show $0/0 until a new turn lands.

Existing thread rings may keep their old wrong values until the next turn generates a new activity.

🤖 Generated with Claude Code

Note

Add per-turn cost ledger, accurate context-window ring, and stop-button guard for Claude/Codex

Adds a CostTrackerLive service that atomically persists per-session, per-month (local timezone), and all-time cost ledgers as JSON files under usageDir, exposed via GET /api/cost/summary.
Adds a pricing.ts module in packages/shared with model pricing for Claude and Codex models, computeTurnCost, and formatUsd for UI display.
Reworks ClaudeAdapter and CodexAdapter to report usedTokens as input-side tokens only (not including output), emit mid-turn thread.token-usage.updated events from assistant frames, and include per-class deltas (last* fields) on turn completion.
Adds a CostMeter ring in the chat composer footer showing session and month-to-date spend with a per-model breakdown popover, invalidating on each context-window.updated activity.
ProviderRuntimeIngestion now records costs to CostTracker only for turn-final token-usage events (those with last* fields); mid-turn snapshots are ignored.
Fixes ChatView.onInterrupt to skip dispatching a stop command when the latest turn is already in a terminal state (completed, interrupted, or error).
Behavioral Change: usedTokens in token-usage snapshots now reflects input-side tokens only and is no longer capped to maxTokens; totalProcessedTokens carries the full cumulative billing total.

^{Macroscope summarized bd0fc3b.}

Seed rates for Claude (sonnet-4.6, opus-4.6/4.7/4.5, haiku-4.5) and Codex (gpt-5.4, 5.3-codex, spark, mini) in USD per 1M tokens. getPricing() resolves via provider aliases with zero-rate fallback. computeTurnCost() splits input / cached / output / reasoning spend. Prep for session + MTD cost meter.

localStorage-persisted zustand store at t3code:cost-store:v1. Pure reducers accumulate token + USD spend per thread (session) and per YYYY-MM in local tz (month-to-date). sanitize*() guards garbage payloads; selectors expose session/month buckets and avg cost per turn. Tests: 17 pass.

useCostTracking hook observes activeThread activities and records each new context-window.updated event (with lastXxxTokens deltas) into the cost store. Seeds seen-set on mount / thread switch so historical activity is not retroactively charged to this month. Pure processActivitiesForCost reducer is unit-tested; the hook is a thin ref+effect wrapper. Tests: 9 pass.

CostMeter mirrors ContextWindowMeter's ring + Popover style. Fill ratio uses VITE_MONTHLY_BUDGET_USD if set, else a compressed log scale. Popover shows session/MTD totals, budget %, turn count, avg cost per turn, and per-model breakdown. Turns destructive color when over budget. useCostSummary zustand hook reads sessions + months slices and recomputes summary; cheap enough to recompute per render since selector is O(models). Composer wires useCostTracking side-effect + passes summary to ComposerFooterPrimaryActions next to ContextWindowMeter.

Let dev mode point at the installed app's "userdata" state for history continuity, and pave the way for a server-side usage/ JSON store that both dev and prod reuse.

- deriveServerPaths accepts optional stateSubdir; env wins over the default (dev/userdata selection via devUrl).
- Adds usageDir (<stateDir>/usage) to derived paths + ensures it exists at startup.
- dev-runner: new --state-subdir flag + --use-userdata shortcut; forwards to T3CODE_STATE_SUBDIR. Startup logs warn loudly when dev is aimed at userdata.
- Tests: dev-runner env matrix (22 pass), cli-config subdir override + usageDir derivation (10 pass).

- Add cacheCreationInputTokens + lastCacheCreationInputTokens to ThreadTokenUsageSnapshot. Anthropic charges cache-write at 1.25x input; reporting it separately lets the cost meter bill correctly.
- Add optional model field to ThreadTokenUsageUpdatedPayload so the server-side cost tracker can resolve pricing without a lookup against thread state.

Anthropic bills cache-writes at 1.25x input; OpenAI has no separate write tier. Model a distinct cacheCreationInputPerMTok rate (with provider-aware defaults) so the cost meter no longer conflates cache hits, cache writes, and fresh input.

- ModelPricing gains cacheCreationInputPerMTok; Claude auto-applies the 1.25x multiplier, OpenAI defaults to inputPerMTok.
- TurnTokenDeltas + TurnCostBreakdown gain cacheCreation slots; zero for providers that don't distinguish the tier.
- computeTurnCost bills each class additively.
- Client extractDeltas reads lastCacheCreationInputTokens; helpers + fixtures carry the new field through.
- Tests: +2 cases covering Anthropic cache-write premium and the OpenAI default.
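
The additive billing with a provider-aware cache-write rate might look like the sketch below. The type names follow the commit message, but their fields, the `Provider` labels, the `makePricing` helper, and the rates used in the usage note are all illustrative assumptions, not the values in `packages/shared`. Reasoning tokens are left out for brevity.

```typescript
// Hypothetical provider labels; the real ProviderKind values may differ.
type Provider = "claude" | "codex";

interface ModelPricing {
  inputPerMTok: number;
  cachedInputPerMTok: number;
  cacheCreationInputPerMTok: number;
  outputPerMTok: number;
}

interface TurnTokenDeltas {
  input: number;
  cachedInput: number;
  cacheCreation: number;
  output: number;
}

interface TurnCostBreakdown extends TurnTokenDeltas {
  total: number;
}

// Claude applies the 1.25x cache-write premium; other providers default to
// the plain input rate.
function makePricing(
  provider: Provider,
  base: { inputPerMTok: number; cachedInputPerMTok: number; outputPerMTok: number },
): ModelPricing {
  const cacheCreationInputPerMTok =
    provider === "claude" ? base.inputPerMTok * 1.25 : base.inputPerMTok;
  return { ...base, cacheCreationInputPerMTok };
}

// Each token class is billed at its own rate and the results are summed.
function computeTurnCost(d: TurnTokenDeltas, p: ModelPricing): TurnCostBreakdown {
  const usd = (tokens: number, perMTok: number): number => (tokens / 1e6) * perMTok;
  const input = usd(d.input, p.inputPerMTok);
  const cachedInput = usd(d.cachedInput, p.cachedInputPerMTok);
  const cacheCreation = usd(d.cacheCreation, p.cacheCreationInputPerMTok);
  const output = usd(d.output, p.outputPerMTok);
  return { input, cachedInput, cacheCreation, output, total: input + cachedInput + cacheCreation + output };
}
```

For example, with a made-up input rate of 4 USD per 1M tokens, one million cache-write tokens cost 5 USD on the Claude pricing and 4 USD on the default pricing.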

… usage

The Claude adapter lumped cache_read / cache_creation / fresh input into a single inputTokens field and emitted no per-turn deltas, leaving the cost meter silently $0 for every Claude turn and over-charging cached contexts by ~10x when it did fire. It also clamped usedTokens at maxTokens on cumulative totals, pinning the context ring at 100% once totalProcessedTokens exceeded the window.

Changes:
- Extract parseClaudeUsageBreakdown: splits SDK usage into four tiers (input / cachedInput / cacheCreationInput / output) with an explicit totalTokens.
- normalizeClaudeTokenUsage emits all four tiers and drops the min(total, max) cap; callers decide how to render overflow.
- Add buildClaudeTurnCompleteUsage: maintains a per-session lastTurnCumulativeUsage accumulator, subtracts from each result.usage to produce lastInputTokens / lastCachedInputTokens / lastCacheCreationInputTokens / lastOutputTokens deltas for the cost tracker. usedTokens prefers the task snapshot (real current context) over the cumulative total.
- Context state gains lastTurnCumulativeUsage; initialized at session start, advanced on each turn-complete emission.

Tests:
- New ClaudeAdapter.usage.test.ts: 10 unit tests cover parseBreakdown semantics, first-turn vs second-turn deltas, clamp behaviour, task-snapshot fallback, and negative-delta guards.
- ClaudeAdapter.test.ts updated: three existing cases now assert the split tiers + uncapped usedTokens (what the SDK actually reports).
- Full server suite: 894 pass.

Introduces a server-owned cost ledger that writes three atomic JSON files per recorded turn:
- session_<threadId>.json per-thread cumulative
- YYYY-MM.json month bucket (local tz)
- alltime.json running total since install

Works across dev, installed app, and standalone binaries because persistence lives next to the server's existing SQLite state at <T3CODE_HOME>/<state>/usage/. Atomic writes mirror serverSettings: write .tmp, rename into place; errors log and swallow so orchestration never blocks on FS failure.

Components:
- types.ts: plain-TS interfaces + local-tz month key helper + empty-bucket constructors.
- Reducer.ts: pure deriveTurnDeltas / processTurn / isTurnNoOp / sanitizePersistedFile. Prefers lastXxxTokens from the payload (Codex + post-fix Claude); falls back to delta-vs-lastCumulative for older providers. Zero-cost unknown models still record their token usage.
- Services/CostTracker.ts: Effect Context.Service API (recordUsage / getSummary / updates stream).
- Layers/CostTracker.ts: FS-backed live layer; semaphore-serialized writes; PubSub exposes live updates for WS broadcast.
- shared/pricing: re-export ProviderKind so server consumers don't reach into contracts for it.

Tests: 14 pure reducer cases + 5 live-layer cases (record, idempotent no-op, accumulate, stream emission, zero-summary). All green.
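
The "prefer `lastXxxTokens`, else diff against the last cumulative" rule in Reducer.ts can be illustrated with a two-tier sketch. The function name comes from the commit message, but its signature, the `Tiers` and `UsagePayload` shapes, and the clamp to zero on negative diffs are assumptions made for this example.

```typescript
// Hypothetical two-tier view; the real reducer tracks more token classes.
interface Tiers {
  input: number;
  output: number;
}

interface UsagePayload {
  inputTokens: number;   // cumulative
  outputTokens: number;  // cumulative
  lastInputTokens?: number;
  lastOutputTokens?: number;
}

function deriveTurnDeltas(payload: UsagePayload, lastCumulative: Tiers): Tiers {
  const hasExplicitLast =
    payload.lastInputTokens !== undefined || payload.lastOutputTokens !== undefined;
  if (hasExplicitLast) {
    // Provider supplied per-turn deltas: trust them directly.
    return {
      input: payload.lastInputTokens ?? 0,
      output: payload.lastOutputTokens ?? 0,
    };
  }
  // Older providers: diff cumulative totals, clamping negatives (assumed guard).
  return {
    input: Math.max(0, payload.inputTokens - lastCumulative.input),
    output: Math.max(0, payload.outputTokens - lastCumulative.output),
  };
}
```

The fallback branch is only meaningful when `lastCumulative` really is a running total, which is why feeding it per-call snapshots produces arbitrary deltas.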

Wire the runtime event stream into the new CostTracker and expose the ledger over HTTP so web + desktop + standalone binaries all share the same authoritative cost data.

Server (c11 + c12)
- ProviderRuntimeIngestion now calls CostTracker.recordUsage after appending the context-window.updated activity. Errors are logged and swallowed so orchestration is never blocked by FS faults.
- Model comes from event.payload.model (set by adapters) with a fallback to thread.modelSelection.model.
- CostTrackerLive added to the server composition root + wired into test + integration layers (stub mock for server.test.ts).
- New GET /api/cost/summary?threadId=X route returns the freshest session + month + all-time summary. CORS handled via the existing browserApi layer.

Client (c13)
- Drop zustand + localStorage. The old costStore.ts / useCostTracking.ts (plus their tests) are gone; the server is now the source of truth.
- New lib/costQuery.ts: react-query queryOptions + sanitizer for the HTTP response, plus formatUsd utility. Invalidation helper bumps the cache whenever the active thread receives a new context-window.updated activity, so the ring updates within one render of the server write.
- ChatComposer replaces useCostTracking/useCostSummary with a useQuery subscription and a tiny effect that invalidates on new usage activities. Plumbs activeProvider through to the meter.
- CostMeter: rebuild around the new {thread, month, allTime} shape. Popover now shows session ⋅ MTD ⋅ all-time and gracefully renders "—" for providers without token-usage telemetry (cursor / opencode) instead of a misleading $0.

Tests: 913 server pass, 906 web pass (26 old localStorage tests deleted, replaced by server-owned CostTracker coverage from c10).

When the final `thread.message-sent` (streaming:false) arrives, the client marks `latestTurn.state` as "completed" but leaves `session.status === "running"` until the separate `thread.session-set` event (emitted server-side on `turn.completed`) arrives. In that gap:
- The stop button stays red because visibility is derived from `derivePhase(session)` → `"running"` via `session.status`.
- Clicking it dispatches `thread.turn.interrupt`; the server has no active turn so the command is a no-op, and the UI stays stuck until the late `thread.session-set` lands.

Fix:
- `store.ts` `thread.message-sent` handler: when the final assistant message for the currently active turn arrives and `latestTurn` resolves to "completed", optimistically flip `session.status` / `orchestrationStatus` to "ready" and clear `activeTurnId`. The later server-sent `thread.session-set` overwrites session via `mapSession` and is idempotent over this change. Interrupted and errored turns are excluded (checked via `latestTurn.state === "completed"` and the `activeTurnId === event.turnId` guard).
- `ChatView.tsx` `onInterrupt`: defensive guard. If `latestTurn` is already in a terminal state (completed / interrupted / error), skip the dispatch. This closes the small window where a click lands before React re-renders the composer.

Tests:
- Updated the existing replay-batch test: after a final assistant `message-sent` for the active turn, `session.status` is now "ready" and `activeTurnId` is cleared.
- Added a test that a mismatched turnId (active turn ≠ streaming:false message turn) does NOT reconcile; the server's session-set remains authoritative.
- Added a test that an interrupted turn's final message does NOT reconcile session to "ready".

All 908 web tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
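
The two client-side guards can be sketched as pure functions. This is an illustration only: `SessionView`, `reconcileOnFinalMessage`, and `shouldDispatchInterrupt` are hypothetical names, and the real store handler also updates `orchestrationStatus` and works on the full thread state.

```typescript
type TurnState = "running" | "completed" | "interrupted" | "error";

// Hypothetical, trimmed-down session shape.
interface SessionView {
  status: "running" | "ready";
  activeTurnId: string | null;
}

function isTerminal(state: TurnState): boolean {
  return state === "completed" || state === "interrupted" || state === "error";
}

// Optimistic reconcile: only when the final (non-streaming) message belongs to
// the active turn AND that turn resolved to "completed".
function reconcileOnFinalMessage(
  session: SessionView,
  event: { turnId: string; streaming: boolean },
  latestTurnState: TurnState,
): SessionView {
  const isFinalForActiveTurn = !event.streaming && session.activeTurnId === event.turnId;
  if (isFinalForActiveTurn && latestTurnState === "completed") {
    return { status: "ready", activeTurnId: null };
  }
  return session; // leave it to the server's session-set event
}

// Defensive guard for the stop button: skip the dispatch on terminal turns.
function shouldDispatchInterrupt(latestTurnState: TurnState | undefined): boolean {
  return latestTurnState === undefined || !isTerminal(latestTurnState);
}
```

Keeping both checks pure makes the three test cases from the commit message (matching turn, mismatched turn, interrupted turn) straightforward to express.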

coderabbitai · 2026-04-21T19:53:45Z

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: cca1ac92-e77a-4f6b-8156-6c4070c32b09

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


macroscopeapp · 2026-04-21T19:57:35Z

Approvability

Verdict: Needs human review

This PR introduces a significant new cost-tracking feature with a server-side ledger, pricing calculations, HTTP endpoints, and UI components. Unresolved HIGH severity review comments identify potential bugs in the Codex adapter token calculation (NaN from undefined addition) that could cause incorrect cost reporting. Because these changes affect billing and cost tracking, and the identified bugs are unresolved, careful human review is warranted.


Address Cursor Bugbot + Macroscope findings on pingdotgg#2273:

- apps/server/src/cost/Reducer.ts: drop the no-op ternaries in sanitizePersistedFile (`r.version === 1 ? 1 : 1` and `r.kind === expectedKind ? expectedKind : expectedKind`). Both always returned the right-hand value regardless of the stored value, so they were silently forcing the expected defaults, which is actually the intended sanitize-on-mismatch behaviour. Simplify to the constants directly and add a comment explaining the intent. (Macroscope, Reducer.ts:325-326.)
- apps/web/src/lib/costQuery.ts: stop duplicating `formatUsd` and instead re-export it from `@t3tools/shared/pricing` (the shared package was already a workspace dep and owns computeTurnCost next to the formatter). Keeping the re-export so CostMeter and any future consumer continue to import from `~/lib/costQuery` as the single cost-UI utility module. (Cursor, duplicated-function.)
- apps/web/src/lib/costQuery.ts: remove the dead `useInvalidateCostSummary` hook. The ChatComposer calls `invalidateCostSummary` directly with its own `useQueryClient`, so the hook wrapper was unused surface area. (Cursor, dead-code.)

Verified: web typecheck clean, web tests 908/908 pass, server cost tests 19/19 pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Olympicx · 2026-04-21T20:58:16Z

updated the PR

Two independent bugs in the token-usage pipeline, both user-visible and both rooted in the same conflation between the context-window dimension (what fills the ring) and the billing dimension (what lands in the cost ledger).

## 1. Cost ledger over-counting (CRITICAL)

Claude emits `thread.token-usage.updated` events from three places per turn: every `task_progress`, every `task_notification`, and the final `completeTurn`. The mid-turn snapshots carry per-API-call breakdowns *without* `lastXxxTokens` fields, while the turn-complete snapshot carries cumulative totals *with* `lastXxx` deltas.

`ProviderRuntimeIngestion` fed every one of these events into `CostTracker.recordUsage`. For the mid-turn events, the Reducer's `hasExplicitLast=false` branch subtracts the payload's cumulative against the session's `lastCumulative`, but what gets stored in `lastCumulative` between mid-turn events is one API call's breakdown, not the session running total, so the resulting "deltas" are arbitrary diffs between per-call snapshots. Net effect: cost over/undercounted unpredictably every turn, and `turnCount` inflated by 3–10× because every mid-turn snapshot with any positive delta bumped it.

Fix: gate `recordUsage` in `ProviderRuntimeIngestion` on the presence of any `lastXxxTokens` field. Mid-turn snapshots still flow to the `context-window.updated` activity for the ring, they just skip the ledger. Codex only emits one snapshot per turn (and always with `lastXxx`) so it's unaffected.

While here, normalise the model slug (`resolveModelSlugForProvider`) before passing it to the ledger so aliased/canonical variants collapse to a single `byModel` key.

## 2. Context-window ring over-reporting

Both adapters set `usedTokens = totalTokens`, which for the cost dimension meant *every* billed token including outputs. But the ring consumes `usedTokens / maxTokens`, and output tokens are generated *out* of the model: they don't live in the prompt window, so including them inflated the ring (especially on long-output turns). Reasoning tokens have the same property (ephemeral, not persisted into next-turn context).

Fix: redefine `usedTokens` as the input-side total only (`input + cache-read + cache-creation`), in both `normalizeClaudeTokenUsage`/`buildClaudeTurnCompleteUsage` and `normalizeCodexTokenUsage` (`last.inputTokens + last.cachedInputTokens`; Codex V2 has no cache-creation tier). `totalProcessedTokens` keeps the original semantic ("tokens processed so far", billing-side). Added a contract-level JSDoc on `ThreadTokenUsageSnapshot` that spells out the two dimensions and the `lastXxxTokens` "turn-final" signal.

Also: the client's `deriveLatestContextWindowSnapshot` was silently dropping `cacheCreationInputTokens` / `lastCacheCreationInputTokens` from the `ContextWindowSnapshot` shape even though the payload carries them. Wire them through.

## 3. Migration

Existing ledger files are polluted and can't be repaired in-place. Added a `.schema-v2` sentinel in the usage dir: `CostTrackerLive` boots, sees no sentinel, wipes only the known ledger files (`session_*.json`, `YYYY-MM.json`, `alltime.json`; any stray files are left alone), writes the sentinel, and subsequent boots skip. Bumping `LEDGER_SCHEMA_VERSION` is the single line needed for any future reducer-incompatible change.

## Tests

- Reworked Claude/Codex adapter assertions for the new input-side `usedTokens` semantic (24542 → 23863 for the Claude cumulative case, 126 → 120 for Codex, etc.); explanatory comments added.
- New ProviderRuntimeIngestion test: mid-turn snapshot (no `lastXxx`) projects into the activity stream but does NOT bump the ledger; turn-final snapshot records exactly one turn.
- New CostTrackerLive tests: first boot wipes pre-v2 ledger files (including a `.json` stray, which survives); subsequent boot with sentinel present leaves ledger files intact.
- Existing ingestion tests retargeted at a temp-dir base so the first-boot wipe can't touch the developer's real `<cwd>/userdata/usage/` directory.

All 203 server tests pass in the changed files; 908 web tests pass; 126 shared tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

cursor · 2026-04-21T22:40:45Z

+  const usedTokens = inputSideTokens > 0 ? inputSideTokens : usage.last.totalTokens;
+  if (usedTokens <= 0) {
+    return undefined;
+  }


Missing undefined guard in Codex token usage fallback

Medium Severity

The old code guarded against usage.last.totalTokens being undefined with an explicit usedTokens === undefined || usedTokens <= 0 check. The new code removed the undefined check. When inputSideTokens is 0, usedTokens falls back to usage.last.totalTokens. If that value is undefined, the guard usedTokens <= 0 evaluates to false (because undefined coerces to NaN, and NaN <= 0 is false), so the function proceeds to return a snapshot with usedTokens: undefined instead of returning undefined to signal no valid usage data.

^{Reviewed by Cursor Bugbot for commit b027c89. Configure here.}

…accuracy

The earlier switch to input-side `usedTokens` still showed inflated values for Claude Opus (and any multi-call turn) because the two signals we trusted are both unreliable sources of current context size:

1. `result.usage` is **session-cumulative** across every API call on the thread, not just this turn. Summing its input-side classes grows linearly with turn count, which is exactly what users saw on Opus, which makes many API calls per turn.
2. `task_progress.usage` only carries an opaque SDK `total_tokens`; the Anthropic-native per-class breakdown (`input_tokens` / `cache_read_input_tokens` / `cache_creation_input_tokens`) is **not present** on `SDKTaskProgressMessage.usage`. Parsing it always falls through to `total_tokens`.

The only source that carries the *exact per-call prompt breakdown* is `SDKAssistantMessage.message.usage`, which is `BetaUsage` from the Anthropic API, refreshed on every assistant frame.

Fix:
- New `context.lastApiCallInputSideTokens` tracks `input_tokens + cache_read_input_tokens + cache_creation_input_tokens` captured from each `SDKAssistantMessage.message.usage`. Refreshed per frame, cleared after the turn-completion emission so the next turn starts clean.
- `handleAssistantMessage` also emits a `thread.token-usage.updated` event on each assistant frame with this input-side sum as `usedTokens`, so the mid-turn ring tracks real prompt size (not the SDK's opaque total).
- `buildClaudeTurnCompleteUsage` now takes an optional `lastApiCallInputSide` and uses it as the top-priority `usedTokens` source. Priority:
  1. `lastApiCallInputSide`: exact current context.
  2. `taskSnapshot.usedTokens`: SDK opaque (fallback).
  3. Per-turn *delta* input-side: last-ditch when neither above is present.
  The old session-cumulative fallback has been removed; it inflated any multi-call turn.
- `lastUsedTokens` mirrors `usedTokens` when the per-turn input-side delta is zero, so we never fall back to the session-cumulative sum.

Tests:
- Updated the "preserves oversized result totals after task progress" test: `lastUsedTokens` is now `190_000` (mirrors `usedTokens`), not `535_000` (the removed cumulative fallback).
- New `prefers lastApiCallInputSide over the task snapshot for usedTokens`: when both are present, per-call wins.
- New `does NOT fall back to cumulative input-side for usedTokens`: with a real prior cumulative, fallback now returns the per-turn delta, not the session-wide sum.
- New adapter-level test verifying an assistant frame with Anthropic-native usage emits a `thread.token-usage.updated` event with `usedTokens = input + cache_read + cache_creation`.

Important: existing threads retain their pre-fix `usedTokens` values in stored `context-window.updated` activities until the next turn generates a new activity. The ring self-heals on the first new turn; old turns in-history keep their stale numbers.

Verified: 206/206 targeted server tests pass (3 new), 908/908 web tests pass, typecheck + oxlint clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
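
The per-call capture and the three-level priority can be sketched as below. `AssistantUsage`, `inputSideFromUsage`, and `pickUsedTokens` are hypothetical names, and the usage shape is a trimmed stand-in for the SDK's type. Note one deliberate difference: this sketch skips a zero task snapshot with `> 0`, which follows a reviewer suggestion rather than the `??` in the reviewed commit.

```typescript
// Trimmed stand-in for the per-call usage object on an assistant frame.
interface AssistantUsage {
  input_tokens: number;
  cache_read_input_tokens?: number | null;
  cache_creation_input_tokens?: number | null;
}

// Input-side prompt size for one API call: fresh input + cache read + cache write.
function inputSideFromUsage(u: AssistantUsage): number {
  return (
    u.input_tokens +
    (u.cache_read_input_tokens ?? 0) +
    (u.cache_creation_input_tokens ?? 0)
  );
}

interface UsedTokensSources {
  lastApiCallInputSide?: number;    // 1. exact current context
  taskSnapshotUsedTokens?: number;  // 2. opaque SDK total
  perTurnDeltaInputSide: number;    // 3. last-ditch fallback
}

// Pick the first positive source in priority order; never the session-cumulative sum.
function pickUsedTokens(s: UsedTokensSources): number {
  if (s.lastApiCallInputSide !== undefined && s.lastApiCallInputSide > 0) {
    return s.lastApiCallInputSide;
  }
  if (s.taskSnapshotUsedTokens !== undefined && s.taskSnapshotUsedTokens > 0) {
    return s.taskSnapshotUsedTokens;
  }
  return s.perTurnDeltaInputSide;
}
```

Because the per-call figure is refreshed on every assistant frame, the same helper can serve both the mid-turn ring updates and the turn-complete snapshot.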

Local rebuild for personal distribution off the feat/token-cost-meter branch. Keeps the app bundle identifier (`com.t3tools.t3code`) untouched so existing auto-update channels aren't disturbed, but changes the user-facing name, dev launcher label, and artifact filename.

- apps/desktop/package.json: productName → "T3 by Stan".
- apps/desktop/scripts/electron-launcher.mjs: APP_DISPLAY_NAME follows the new name (dev / prod variants).
- scripts/build-desktop-artifact.ts: artifactName → `T3-by-Stan-${version}-${arch}.${ext}` so the DMG / zip / blockmap files land as `release/T3-by-Stan-0.0.21-arm64.dmg` etc.
- apps/{desktop,server,web}/package.json + bun.lock: version bump 0.0.20 → 0.0.21.

The legacy user-data migration constant in `apps/desktop/src/main.ts` (`LEGACY_USER_DATA_DIR_NAME = "T3 Code (Alpha)"`) is intentionally left alone so this build still picks up data from the prior install.

Built macOS arm64 DMG sits at release/T3-by-Stan-0.0.21-arm64.dmg (136 MB, unsigned / ad-hoc; Gatekeeper first-launch warning expected). Signing / notarization not configured; would require Apple Developer credentials.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

cursor · 2026-04-21T23:14:22Z

+  // size.  Fall back to the raw `last.totalTokens` only when the
+  // breakdown is zero (defensive — shouldn't happen for any real turn).
+  const inputSideTokens = inputTokens + cachedInputTokens;
+  const usedTokens = inputSideTokens > 0 ? inputSideTokens : usage.last.totalTokens;


NaN from undefined addition defeats input-side-only fix

High Severity

In normalizeCodexTokenUsage, inputTokens and cachedInputTokens can be undefined (the code itself checks !== undefined when conditionally spreading them into the snapshot a few lines later). Adding undefined + undefined or number + undefined produces NaN, and NaN > 0 is false, so the fallback to usage.last.totalTokens (which includes output + reasoning tokens) silently kicks in — exactly the over-reporting this PR is meant to fix. The values need a nullish coalesce to zero before addition.
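
A possible shape for the fix, coalescing before the addition and restoring the undefined guard on the fallback, is sketched here. `CodexLastUsage` and `deriveCodexUsedTokens` are hypothetical names; the real `normalizeCodexTokenUsage` builds a full snapshot rather than returning a single number.

```typescript
// Hypothetical, trimmed-down view of usage.last; every field may be missing.
interface CodexLastUsage {
  inputTokens?: number;
  cachedInputTokens?: number;
  totalTokens?: number;
}

// Returns the input-side token count, or undefined when there is no valid usage.
function deriveCodexUsedTokens(last: CodexLastUsage): number | undefined {
  // Coalesce BEFORE adding so a missing field cannot turn the sum into NaN.
  const inputSideTokens = (last.inputTokens ?? 0) + (last.cachedInputTokens ?? 0);
  const usedTokens = inputSideTokens > 0 ? inputSideTokens : last.totalTokens;
  // Explicit undefined / non-finite guard: `undefined <= 0` is false, so a bare
  // `usedTokens <= 0` check would let undefined through.
  if (usedTokens === undefined || !Number.isFinite(usedTokens) || usedTokens <= 0) {
    return undefined;
  }
  return usedTokens;
}
```

With `inputTokens: 100` and `cachedInputTokens: 20`, this yields 120 even when `totalTokens` is 126, so output and reasoning tokens no longer leak into the ring.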

^{Reviewed by Cursor Bugbot for commit 1790ec5. Configure here.}

cursor · 2026-04-21T23:14:23Z

+    kind,
+    key,
+    bucket: emptyCostBucket(now),
+  });


Unused emptyBucketFile function is dead code

Low Severity

The emptyBucketFile helper is defined inside the make generator but never called anywhere. The loadFile function already handles missing files via sanitizePersistedFile, which returns an empty bucket when the raw input is undefined. This is dead code that can be removed.

^{Reviewed by Cursor Bugbot for commit 1790ec5. Configure here.}

macroscopeapp · 2026-04-21T23:18:13Z

+  const usedTokens =
+    input.lastApiCallInputSide !== undefined && input.lastApiCallInputSide > 0
+      ? input.lastApiCallInputSide
+      : (input.taskSnapshot?.usedTokens ?? deltaUsedFallback);


🟢 Low Layers/ClaudeAdapter.ts:517

Line 520 uses ?? so when input.taskSnapshot.usedTokens is 0, the code keeps that 0 instead of falling through to deltaUsedFallback. The comment on lines 521–524 states the intent is to "never emit 0 for a turn that clearly had activity", but ?? only falls back on undefined/null, not on 0. If the SDK reports usedTokens: 0 while cumulative indicates activity, usedTokens becomes 0, violating the stated intent. Consider using a ternary that checks > 0 instead of ??.

- const usedTokens =
-   input.lastApiCallInputSide !== undefined && input.lastApiCallInputSide > 0
-     ? input.lastApiCallInputSide
-     : (input.taskSnapshot?.usedTokens ?? deltaUsedFallback);
+ const usedTokens =
+   input.lastApiCallInputSide !== undefined && input.lastApiCallInputSide > 0
+     ? input.lastApiCallInputSide
+     : (input.taskSnapshot?.usedTokens ?? deltaUsedFallback) || deltaUsedFallback;

🤖 Copy this AI Prompt to have your agent fix this:

In file apps/server/src/provider/Layers/ClaudeAdapter.ts around lines 517-520: Line 520 uses `??` so when `input.taskSnapshot.usedTokens` is `0`, the code keeps that `0` instead of falling through to `deltaUsedFallback`. The comment on lines 521–524 states the intent is to "never emit 0 for a turn that clearly had activity", but `??` only falls back on `undefined`/`null`, not on `0`. If the SDK reports `usedTokens: 0` while `cumulative` indicates activity, `usedTokens` becomes `0`, violating the stated intent. Consider using a ternary that checks `> 0` instead of `??`. Evidence trail: apps/server/src/provider/Layers/ClaudeAdapter.ts lines 505-525 at REVIEWED_COMMIT. Line 520 shows `(input.taskSnapshot?.usedTokens ?? deltaUsedFallback)` using nullish coalescing. Lines 521-524 contain the comment stating "so we never emit 0 for a turn that clearly had activity". Line 505-506 shows `deltaUsedFallback = lastInputSideTokens > 0 ? lastInputSideTokens : cumulative.totalTokens` which would provide a non-zero fallback when cumulative indicates activity.

Rebuilds the personal T3-by-Stan DMG to pick up the per-call input-side usedTokens fix (d46b444) so the context ring shows accurate values on Opus + multi-call turns. No behavioural change beyond version; bun.lock re-synced. Artifact: release/T3-by-Stan-0.0.22-arm64.dmg (136 MB, unsigned). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

cursor

Cursor Bugbot has reviewed your changes and found 2 potential issues.

There are 5 total unresolved issues (including 3 from previous reviews).


^{Reviewed by Cursor Bugbot for commit bd0fc3b. Configure here.}

cursor · 2026-04-21T23:28:07Z

+
+function sanitizeNumber(value: unknown): number {
+  return typeof value === "number" && Number.isFinite(value) && value >= 0 ? value : 0;
+}


Duplicate identical functions in same file

Low Severity

sanitizeNumber and finiteNonNeg are identical functions defined in the same file — both accept unknown, check for a finite non-negative number, and return 0 otherwise. One of them can be removed and all call sites pointed at the surviving function.

Additional Locations (1)

apps/server/src/cost/Reducer.ts#L21-L24

cursor · 2026-04-21T23:28:07Z

+  thread: null,
+  month: emptyBucket(),
+  allTime: emptyBucket(),
+};


Stale monthKey in shared singleton constant

Low Severity

EMPTY_COST_SUMMARY computes monthKey via monthKeyNow() once at module-load time and reuses it as a frozen constant. If the browser tab stays open across a month boundary, the placeholder and fallback monthKey becomes stale (e.g., shows "2026-03" in April). Converting EMPTY_COST_SUMMARY to a function or computing monthKey lazily would avoid the stale value.
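One way to avoid the stale value is to build the placeholder on demand. The sketch below is an assumption-laden illustration: the `CostBucket` and `CostSummary` shapes, the `monthKeyAt` helper, and the use of UTC are all hypothetical stand-ins, since the real `emptyBucket()` and `monthKeyNow()` implementations are not shown here.

```typescript
// Hypothetical shapes; the real bucket fields are not shown in the review.
interface CostBucket {
  totalUsd: number;
  turns: number;
}

interface CostSummary {
  monthKey: string;
  thread: CostBucket | null;
  month: CostBucket;
  allTime: CostBucket;
}

function emptyBucket(): CostBucket {
  return { totalUsd: 0, turns: 0 };
}

// Formats a date as "YYYY-MM". UTC is an assumption made for this sketch.
function monthKeyAt(date: Date): string {
  const month = date.getUTCMonth() + 1;
  return `${date.getUTCFullYear()}-${month < 10 ? "0" + month : String(month)}`;
}

// A function instead of a module-level constant: monthKey is computed on each call,
// so a tab left open across a month boundary gets the current month.
function emptyCostSummary(now: Date = new Date()): CostSummary {
  return {
    monthKey: monthKeyAt(now),
    thread: null,
    month: emptyBucket(),
    allTime: emptyBucket(),
  };
}
```

Taking `now` as a parameter also makes the month rollover testable without mocking the clock.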

Olympicx and others added 11 commits April 21, 2026 19:30

github-actions Bot added the vouch:unvouched (PR author is not yet trusted in the VOUCHED list) and size:XXL (1,000+ changed lines, additions + deletions) labels Apr 21, 2026

cursor Bot reviewed Apr 21, 2026

Comment thread apps/web/src/lib/costQuery.ts Outdated

Comment thread apps/web/src/lib/costQuery.ts Outdated

macroscopeapp Bot reviewed Apr 21, 2026

Comment thread apps/server/src/cost/Reducer.ts Outdated

stansalvatec changed the title ~~fix(web): stop button stays active after model response completes~~ fix(cost): context ring + cost ledger accuracy + stop button Apr 21, 2026

cursor Bot reviewed Apr 21, 2026

Olympicx and others added 2 commits April 22, 2026 01:11

cursor Bot reviewed Apr 21, 2026

macroscopeapp Bot reviewed Apr 21, 2026

stansalvatec changed the title ~~fix(cost): context ring + cost ledger accuracy + stop button~~ fix(cost): Opus-accurate context ring + cost ledger + stop button Apr 21, 2026

cursor Bot reviewed Apr 21, 2026

fix(cost): Opus-accurate context ring + cost ledger + stop button #2273
stansalvatec wants to merge 16 commits into pingdotgg:main from stansalvatec:feat/token-cost-meter

2 participants

stansalvatec commented Apr 21, 2026 • edited by macroscopeapp Bot

Summary

Files

Test plan

Migration note for users

Add per-turn cost ledger, accurate context-window ring, and stop-button guard for Claude/Codex

coderabbitai Bot commented Apr 21, 2026 • edited

Review skipped

macroscopeapp Bot commented Apr 21, 2026 • edited

Approvability

Olympicx commented Apr 21, 2026

cursor Bot Apr 21, 2026

Missing undefined guard in Codex token usage fallback

cursor Bot Apr 21, 2026

NaN from undefined addition defeats input-side-only fix

cursor Bot Apr 21, 2026

Unused emptyBucketFile function is dead code

macroscopeapp Bot Apr 21, 2026

cursor Bot left a comment

cursor Bot Apr 21, 2026

Duplicate identical functions in same file

cursor Bot Apr 21, 2026

Stale monthKey in shared singleton constant

Unused `emptyBucketFile` function is dead code