feat(billing): rework usage tracking + context breakdown #2373
Prompt To Fix All With AI
Fix the following 3 code review issues. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 3
apps/code/src/renderer/features/billing/hooks/useUsage.ts:31-37
The `refetch` callback depends on the entire `refreshMutation` result object, which contains stateful fields (`isPending`, `data`, etc.) that change on every render. This causes a new `refetch` function reference on every render. Depend on just the stable `mutateAsync` function to avoid unnecessary re-renders downstream.
```suggestion
const mutateAsync = refreshMutation.mutateAsync;
const refetch = useCallback(async () => {
const fresh = await mutateAsync();
if (fresh) {
queryClient.setQueryData(trpc.usageMonitor.getLatest.queryKey(), fresh);
}
return fresh;
}, [mutateAsync, queryClient, trpc.usageMonitor.getLatest]);
```
### Issue 2 of 3
apps/code/src/renderer/features/billing/utils.test.ts:54-86
**Prefer parameterised tests**
The `formatResetTime` suite has six structurally identical tests that each supply different inputs and expected outputs. Grouping them into a single `it.each` block would make the intent clearer and make it easier to add new cases. The same pattern appears in `service.test.ts` for the threshold-crossing assertions and in `context-breakdown.test.ts` for the various estimator functions.
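As a rough illustration of the table-driven shape, here is a minimal sketch. `formatResetsIn` is a hypothetical stand-in that models only the relative "Resets in ..." branch, not the real `formatResetTime`, and the case names and `runCases` helper are assumptions. The runner (Vitest or Jest) is also assumed, so the `it.each` call appears only as a comment and a plain loop keeps the sketch runnable on its own.

```typescript
// Hypothetical stand-in for the real `formatResetTime`; only the relative
// ("Resets in ...") branch is modelled here.
function formatResetsIn(seconds: number): string {
  const totalMinutes = Math.floor(seconds / 60);
  const hours = Math.floor(totalMinutes / 60);
  const minutes = totalMinutes % 60;
  return hours > 0 ? `Resets in ${hours}h ${minutes}m` : `Resets in ${minutes}m`;
}

interface ResetCase {
  name: string;
  seconds: number;
  expected: string;
}

// One row per former test; adding a case means adding a row.
const resetCases: ResetCase[] = [
  { name: "sub-hour", seconds: 45 * 60, expected: "Resets in 45m" },
  { name: "sub-day", seconds: 4 * 3600 + 30 * 60, expected: "Resets in 4h 30m" },
  { name: "exact hour", seconds: 3600, expected: "Resets in 1h 0m" },
];

// With Vitest or Jest (assumed), the table would be handed to it.each:
//   it.each(resetCases)("$name", ({ seconds, expected }) => {
//     expect(formatResetsIn(seconds)).toBe(expected);
//   });
// The loop below does the same comparison without a test runner and
// returns the names of any failing cases.
function runCases(cases: ResetCase[]): string[] {
  return cases
    .filter((c) => formatResetsIn(c.seconds) !== c.expected)
    .map((c) => c.name);
}
```

The same table shape should carry over to the threshold-crossing and estimator suites, with the row type adjusted to their inputs.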
### Issue 3 of 3
apps/code/src/renderer/features/sessions/components/ContextBreakdownPopover.tsx:68-90
**Segment widths can exceed 100% when estimation overestimates stable tokens**
`SegmentedBar` renders each segment as `(value / total) * 100%` where `total` is `used` (the real token count). `buildBreakdown` floors `conversation` at 0, so when `stableSum > currentInputTokens` the stable categories are emitted unchanged while `conversation = 0`. The segments therefore sum to more than `total`, and each individual width exceeds its true share. The `overflow-hidden` CSS clips the bar, but the relative proportions shown to the user become misleading when estimates drift significantly above actual usage.
Reviews (1): Last reviewed commit: "feat(billing): rework usage tracking + c..."
      );
    }

    function SegmentedBar({
      breakdown,
      total,
      fallback,
    }: {
      breakdown: NonNullable<ContextUsage["breakdown"]>;
      total: number;
      fallback: string;
    }) {
      if (total <= 0) {
        return <div className="h-1.5 w-full rounded-full bg-(--gray-4)" />;
      }
      return (
        <div className="flex h-1.5 w-full overflow-hidden rounded-full bg-(--gray-4)">
          {CONTEXT_CATEGORIES.map((cat) => {
            const value = breakdown[cat.key];
            if (value <= 0) return null;
            return (
              <div
                key={cat.key}
**Segment widths can exceed 100% when estimation overestimates stable tokens**
`SegmentedBar` renders each segment as `(value / total) * 100%` where `total` is `used` (the real token count). `buildBreakdown` floors `conversation` at 0, so when `stableSum > currentInputTokens` the stable categories are emitted unchanged while `conversation = 0`. The segments therefore sum to more than `total`, and each individual width exceeds its true share. The `overflow-hidden` CSS clips the bar, but the relative proportions shown to the user become misleading when estimates drift significantly above actual usage.
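One possible direction, sketched under assumptions: divide by whichever is larger, the real total or the sum of the segments, so the widths can never add up to more than 100%. `segmentWidthsSketch` and its plain `Record<string, number>` input are hypothetical names for illustration and do not match the component's actual props.

```typescript
// Hypothetical helper: computes percentage widths for a segmented bar.
// `breakdown` maps category keys to estimated token counts; `total` is the
// real token count reported for the context window.
function segmentWidthsSketch(
  breakdown: Record<string, number>,
  total: number,
): Record<string, number> {
  // Sum only positive values, mirroring the `value <= 0` skip in the component.
  const sum = Object.values(breakdown).reduce(
    (acc, value) => acc + Math.max(0, value),
    0,
  );
  // If estimates overshoot the real total, normalise by the estimate sum instead.
  const denominator = Math.max(total, sum);
  const widths: Record<string, number> = {};
  if (denominator <= 0) return widths;
  for (const [key, value] of Object.entries(breakdown)) {
    if (value <= 0) continue;
    widths[key] = (value * 100) / denominator;
  }
  return widths;
}
```

When the estimates stay at or below `total`, this gives the same widths as the current `(value / total) * 100` expression. Rescaling the stable categories inside `buildBreakdown` would be an alternative, at the cost of changing the per-category counts shown in the rows.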
jonathanlab left a comment
would recommend running this against the newly merged AGENTS.md

Problem
Users have no visibility into how much of their LLM usage quota they've consumed until they hit a hard limit, and the existing polling-based usage fetch was inefficient and prone to drift. Free-tier users lacked a reset-time label on the sidebar bar, and the context window indicator showed only a tooltip with aggregate numbers rather than a breakdown by source.
Changes
Usage monitoring service (`UsageMonitorService`)
- Adds a `UsageMonitorService` in the main process that replaces the renderer's 30-second polling loop. The service listens for `LlmActivity` events emitted once per completed agent turn (both Claude and Codex adapters) and coalesces bursts into a single trailing fetch per 5-second window. A 30-minute backstop timer handles idle periods and billing-period rollovers.
- Emits `ThresholdCrossed` events at 50/75/90/100% for both `burst` and `sustained` buckets. Crossed thresholds are deduplicated per user/product/bucket/billing-window anchor and persisted to `electron-store` so notifications don't re-fire after relaunch within the same window. Stale entries are pruned on boot.
- Adds `reset_at` (absolute UTC timestamp) and `billing_period_end` to the gateway schemas so the anchor logic doesn't drift with rolling `resets_in_seconds` values.

tRPC surface
- Replaces the `llmGateway.usage` query with a dedicated `usageMonitor` router exposing `getLatest`, `refresh`, `onUsageUpdated` (subscription), and `onThresholdCrossed` (subscription).
- `useUsage` now subscribes to `onUsageUpdated` and seeds the query cache from `getLatest` instead of polling on a timer.

Threshold toast notifications
- `initializeUsageThresholdToast` subscribes to `onThresholdCrossed` and shows a warning toast at 50/75/90% with a reset-time label and a "View usage" action, or triggers the `UsageLimitModal` at 100% when a session is active.
- `useUsageLimitDetection` hook.

Sidebar usage bar
- Reset-time label (`Resets in 4h 30m`, `Resets Jun 1 at 12:00 AM PDT`, etc.) derived from the new `formatResetTime` utility, which prefers `reset_at` over `resets_in_seconds`.

Context breakdown popover
- Adds a `ContextBreakdownPopover` that replaces the plain tooltip on the `ContextUsageIndicator`. It shows a segmented bar and per-category token counts (System prompt, Tools, Rules, Skills, MCP, Subagents, Conversation).
- `_posthog/usage_update` notifications. Claude estimates system prompt, CLAUDE.md rules, slash-command skills, and MCP tool metadata at session init and on changes; Codex uses a constant baseline plus the injected system prompt.
- Moves `formatTokensCompact`, `getOverallUsageColor`, and `CONTEXT_CATEGORIES` into a shared `contextColors.ts` utility.

Codex session-state fix
- Replaces `this.sessionState = createSessionState(...)` reassignments with a new `resetSessionState()` mutation so the codex-client closure always writes `contextUsed`/`contextSize` to the same object reference across `newSession`/`loadSession`/`resumeSession`/`forkSession` calls.

How did you test this?
- Tests for `UsageMonitorService` covering threshold deduplication, cross-relaunch persistence, independent burst/sustained tracking, `isPro` detection, error resilience, `UsageUpdated` change detection, coalesce debouncing, and `stop()` cleanup.
- Tests for `context-breakdown.ts` covering all estimator functions and `buildBreakdown` edge cases.
- Tests for `useContextUsage` covering aggregate extraction, breakdown merging, and the double-underscore method prefix variant.
- Tests for `ContextBreakdownPopover` covering header rendering, placeholder copy, and per-category row filtering.
- Tests for `createCodexClient` verifying that `contextUsed` writes land on the correct object after `resetSessionState`.
- Tests for `formatResetTime` covering sub-hour, sub-day, multi-day, `reset_at` preference, and past-reset cases.

Publish to changelog?
no
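
For readers skimming the description, the threshold deduplication exercised by the `UsageMonitorService` tests can be pictured with a minimal sketch. Everything here is an assumption for illustration: the names `CrossingKey`, `crossingId`, and `newlyCrossed` are hypothetical, an in-memory `Set` stands in for the `electron-store` persistence, and firing every not-yet-seen threshold at or below the current percentage is a guess at the behaviour rather than the service's confirmed logic.

```typescript
type Bucket = "burst" | "sustained";

// Thresholds named in the description.
const THRESHOLDS = [50, 75, 90, 100] as const;

// Hypothetical key: one dedup scope per user/product/bucket/billing-window anchor.
interface CrossingKey {
  userId: string;
  product: string;
  bucket: Bucket;
  windowAnchor: string; // e.g. an absolute reset timestamp such as `reset_at`
}

function crossingId(key: CrossingKey, threshold: number): string {
  return [key.userId, key.product, key.bucket, key.windowAnchor, threshold].join("|");
}

// Returns the thresholds crossed for the first time in this scope and
// records them in `seen`, so repeated fetches do not fire them again.
function newlyCrossed(
  seen: Set<string>,
  key: CrossingKey,
  usedPercent: number,
): number[] {
  const fired: number[] = [];
  for (const threshold of THRESHOLDS) {
    if (usedPercent < threshold) continue;
    const id = crossingId(key, threshold);
    if (seen.has(id)) continue;
    seen.add(id);
    fired.push(threshold);
  }
  return fired;
}
```

Because the window anchor is part of the key, a new billing window starts with a clean slate, and persisting `seen` between launches is what would keep notifications from re-firing within the same window.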