Summary
Today HAPI surfaces context usage in the composer status bar for some flavors (Claude/Codex per #645, #750, #813). It does not surface quota / budget — i.e. "how much of your 5-hour window / weekly limit / monthly spend cap is gone, and when does it reset?" An operator running many sessions has no in-UI signal that a session is about to die from quota until it actually dies.
This issue proposes an umbrella for cross-flavor agent budget/quota gauges in the composer. The work splits cleanly into a small renderer/schema PR plus per-flavor data-extractor PRs.
The Codex piece is essentially #537 by @dsus4wang — this issue is structured so #537 (or its successor) is the natural seed and the other flavors slot in alongside it.
Why this is more tractable than it looks
The two providers HAPI most commonly wraps now ship structured quota data on documented surfaces:
Claude Code: rate_limits in statusline JSON stdin
Since Claude Code v2.1+ (some shape since v1.2.80) the JSON object piped to statusLine.command on every render contains:
```json
{
  "model": { "display_name": "Claude Sonnet 4.6" },
  "context_window": { "used_percentage": 12.4 },
  "rate_limits": {
    "five_hour": { "used_percentage": 42, "resets_at": 1742651200 },
    "seven_day": { "used_percentage": 18, "resets_at": 1743120000 }
  }
}
```
Documented at code.claude.com/docs/en/statusline. rate_limits is present for Claude.ai subscribers (Pro/Max) after the first API response. The underlying source is the anthropic-ratelimit-unified-* response headers (with anthropic-beta: oauth-2025-04-20); same five-hour / seven-day windows. The same data is what Claude Code's own /usage slash command reads. Third-party tools confirming the contract: nsanden/claude-rate-monitor, wakamex/ccusage, z80020100/claude-code-statusline.
So for Claude HAPI sessions: HAPI's CLI wrapper installs (or composes with) a statusline command, captures the JSON stream, extracts the rate_limits fields, forwards as a typed agentUsage SSE patch. No reverse-engineering, no new API auth.
Codex: already addressed by #537
PR #537 ("[codex] add Codex usage indicator") by @dsus4wang already builds the Codex side: captures token_count events, stores session.metadata.codexUsage, renders a "compact usage ring beside the send button, with popover details for context window, rate limits, and token breakdown." This umbrella treats #537 as the natural seed and proposes the unified schema be shaped so #537's codexUsage slots in cleanly (or is renamed to agentUsage of flavor: 'codex').
Cursor: has quota data via Admin API (Enterprise only)
The official Cursor Admin API exposes /teams/spend, /teams/daily-usage-data, /teams/filtered-usage-events, /teams/user-spend-limit. Auth via admin API key (Basic Auth). Rate-limited at 20 req/min; hourly polling is the documented best-practice cadence.
Caveat: Enterprise plan only. Pro/Team plans have no documented quota API. That's a real gap (see "Out of scope" below).
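A minimal sketch of the hourly cadence described above, assuming the actual Admin API call is wrapped in a caller-supplied function. PollState, msUntilNextPoll and pollOnce are hypothetical names for illustration, not part of HAPI or the Cursor Admin API, and the request itself (URL, auth header, response shape) is deliberately left to the caller.

```typescript
// Hypothetical helper names; nothing here is an existing HAPI or Cursor API.
const HOUR_MS = 60 * 60 * 1000; // hourly cadence, as suggested in the text

interface PollState {
  lastSuccessMs?: number; // wall-clock time of the last successful poll
}

// Milliseconds to wait before the next poll; 0 means "poll now".
function msUntilNextPoll(
  state: PollState,
  nowMs: number,
  intervalMs: number = HOUR_MS,
): number {
  if (state.lastSuccessMs === undefined) return 0;
  return Math.max(0, state.lastSuccessMs + intervalMs - nowMs);
}

// Runs the supplied fetcher only when a poll is due. Failures are swallowed
// so a missing or misconfigured data source never breaks the session.
async function pollOnce<T>(
  state: PollState,
  fetchSpend: () => Promise<T>, // assumed to wrap an authenticated Admin API request
  nowMs: number,
): Promise<T | undefined> {
  if (msUntilNextPoll(state, nowMs) > 0) return undefined; // not due yet
  try {
    const result = await fetchSpend();
    state.lastSuccessMs = nowMs;
    return result;
  } catch {
    return undefined; // degrade quietly; the caller may log at debug level
  }
}
```

Only successful polls advance the timestamp, so a failed request is retried on the next tick rather than an hour later; whether that retry behaviour is desirable against a 20 req/min limit is a design choice left open here.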
Other flavors
Gemini, Kimi, OpenCode: data sources vary, mostly turn-end token counts on response objects. Tractable but flavor-specific. Not blocking for v1 — schema is open to extension.
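To make the Claude path concrete, here is a rough sketch of pulling rate_limits out of the statusline JSON shown earlier. The names RateWindow, ClaudeRateLimits and extractRateLimits are hypothetical, and treating resets_at as Unix epoch seconds is an assumption based on the sample values, not something the text confirms.

```typescript
// Hypothetical types and function names, for illustration only.
interface RateWindow {
  usedPercentage: number; // as reported in used_percentage
  resetsAt: Date; // assumes resets_at is Unix epoch seconds
}

interface ClaudeRateLimits {
  fiveHour?: RateWindow;
  sevenDay?: RateWindow;
}

function parseWindow(raw: unknown): RateWindow | undefined {
  if (typeof raw !== "object" || raw === null) return undefined;
  const w = raw as Record<string, unknown>;
  const used = w["used_percentage"];
  const resets = w["resets_at"];
  if (typeof used !== "number" || typeof resets !== "number") return undefined;
  return { usedPercentage: used, resetsAt: new Date(resets * 1000) };
}

// Returns undefined when the payload is not JSON or carries no usable
// rate_limits (for example for non-subscribers), so callers can no-op.
function extractRateLimits(statuslineJson: string): ClaudeRateLimits | undefined {
  let parsed: unknown;
  try {
    parsed = JSON.parse(statuslineJson);
  } catch {
    return undefined;
  }
  if (typeof parsed !== "object" || parsed === null) return undefined;
  const limits = (parsed as Record<string, unknown>)["rate_limits"];
  if (typeof limits !== "object" || limits === null) return undefined;
  const l = limits as Record<string, unknown>;
  const result: ClaudeRateLimits = {};
  const fiveHour = parseWindow(l["five_hour"]);
  const sevenDay = parseWindow(l["seven_day"]);
  if (fiveHour) result.fiveHour = fiveHour;
  if (sevenDay) result.sevenDay = sevenDay;
  return fiveHour || sevenDay ? result : undefined;
}
```

Returning undefined for every unusable input keeps the extractor a no-op whenever the data is missing, which matches the quiet-degradation behaviour the proposal asks for.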
Proposed shape
1. Unified schema (shared/src/schemas.ts)
Stored on session.metadata.agentUsage. Broadcast as SSE metadata patches.
2. Renderer (web/src/components/AssistantChat/AgentUsageGauge.tsx)
One component, discriminates on flavor, delegates to flavor-specific subcomponents.
Slot: beside the send button in the composer (same neighborhood as #537's ring, and where StatusBar.tsx's ctx N/M segment lives today).
3. Per-flavor extractors
[...] rate_limits into the CLI wrapper (cli/src/claude/)
Hourly poll of Admin API behind opt-in cursorAdminApiKey config (hub/src/cursor/)
Acceptance criteria
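session.metadata.agentUsage field added with schema above (or compatible)
Renderer slot in composer that gracefully renders nothing when agentUsage absent (no UI jitter for free-tier Claude users, OpenCode, Gemini, etc. where data isn't available)
Claude statusline-JSON extractor working for Claude.ai Pro/Max sessions; no-op for non-subscribers
Cursor Enterprise Admin API poller behind opt-in config; correctly maps per-user spend when HAPI user email matches Cursor user email
No new XSS / auth-token-leak surface; admin API keys never logged or sent to the client
Each flavor's extractor degrades quietly (logs at debug, not error) when the underlying data source is unavailable
Out of scope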
Cursor Pro/Team quota: no official API exists. Operators can wire a workaround via the reverse-engineered dashboard endpoints (WorkosCursorSessionToken cookie + GET /api/usage-summary + POST /api/dashboard/get-filtered-usage-events, documented at https://gist.github.com/dmwyatt/1e9359b1862e7cbfe1e754fe4c8db764, with prior art in dmwyatt/cursor-usage CLI and kdosiodjinud/cursor-chrome-extension). That's not appropriate for upstream: undocumented endpoints, brittle, the cookie is httpOnly and must be manually extracted, JWT expiry must be handled. But the renderer + schema proposed here are exactly the surface a Pro/Team operator would need to render their own scraped data, so this issue stays useful for them. If Cursor ships a Pro/Team-level usage API in the future, a follow-up issue swaps the data source.
OpenCode / Gemini / Kimi gauges: extensible-by-design; ship after the unified surface lands.
Setting spend limits via the Cursor Admin API (it's available via POST /teams/user-spend-limit, 250 req/min). Read-only for v1.
Generalising the Cursor failure-mode classifier (stderr parsing for out of usage etc.): orthogonal; that's about post-mortem signal, this issue is about pre-mortem visibility.
Related
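#537: Codex usage indicator (open, the natural seed; this umbrella unblocks generalisation after it lands)
#750: OpenCode status bar context-usage refresh (closed, same surface neighborhood)
#645: Claude remote-mode ctx stuck at 0 (open, context not quota but adjacent code path)
#813: Mermaid render-error SVG suppression (recently merged, in passing because the composer touches the same neighborhood)
#818: Cursor stream-json wrapper silent-drop on agent exit (closed, fixed the failure mode but not the visibility surface)
Notes for maintainers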
This issue is intentionally framed as umbrella + slot-shape, not a single fat PR. Suggested PR sequence:
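1. Unified schema + renderer skeleton: adds AgentUsageSchema, empty AgentUsageGauge that renders nothing when data absent. Tiny diff, no behavior change.
2. Claude statusline extractor: implements Fix-5-shape; opt-in to start if you'd prefer.
Happy to start at (1) if you'd like. Will follow CONTRIBUTING.md including the AI disclosure block in PR bodies.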
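As a starting point for step (1), the flavor-discriminated shape and the "render nothing when absent" rule could be sketched roughly as follows. Apart from the flavor discriminator, every field name here (windows, label, usedPercentage, spendCents, limitCents) is an assumption made for illustration, and gaugeLabel is a hypothetical stand-in for the component's display logic rather than the proposed AgentUsageGauge itself.

```typescript
// Hypothetical shape: only `flavor` comes from the proposal; all other
// fields are assumptions for the sake of the example.
interface UsageWindow {
  label: string; // e.g. "5h" or "7d"
  usedPercentage: number;
  resetsAtMs?: number;
}

type AgentUsage =
  | { flavor: "claude"; windows: UsageWindow[] }
  | { flavor: "codex"; windows: UsageWindow[] }
  | { flavor: "cursor"; spendCents: number; limitCents?: number };

// Returns the text a gauge might show, or null when there is nothing to
// render, so the composer layout stays unchanged for flavors without data.
function gaugeLabel(usage: AgentUsage | undefined): string | null {
  if (!usage) return null;
  switch (usage.flavor) {
    case "claude":
    case "codex": {
      if (usage.windows.length === 0) return null;
      // Surface the window closest to exhaustion.
      const worst = usage.windows.reduce((a, b) =>
        b.usedPercentage > a.usedPercentage ? b : a,
      );
      return `${worst.label} ${Math.round(worst.usedPercentage)}%`;
    }
    case "cursor":
      return usage.limitCents
        ? `spend ${Math.round((100 * usage.spendCents) / usage.limitCents)}%`
        : `spend $${(usage.spendCents / 100).toFixed(2)}`;
  }
}
```

A discriminated union keeps each flavor's fields separate, so a later extractor can add its own variant without touching the existing ones.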