feat(web+cli+hub): cross-flavor agent budget/quota gauges in composer (umbrella) #846

@heavygee

Summary

Today HAPI surfaces context usage in the composer status bar for some flavors (Claude/Codex per #645, #750, #813). It does not surface quota / budget — i.e. "how much of your 5-hour window / weekly limit / monthly spend cap is gone, and when does it reset?" An operator running many sessions has no in-UI signal that a session is about to die from quota until it actually dies.

This issue proposes an umbrella for cross-flavor agent budget/quota gauges in the composer. The work splits cleanly into a small renderer/schema PR plus per-flavor data-extractor PRs.

The Codex piece is essentially #537 by @dsus4wang — this issue is structured so #537 (or its successor) is the natural seed and the other flavors slot in alongside it.

Why this is more tractable than it looks

The two providers HAPI most commonly wraps now ship structured quota data on documented surfaces:

Claude Code — rate_limits in statusline JSON stdin

Since Claude Code v2.1 (and in some shape since v1.2.80), the JSON object piped to statusLine.command on every render contains:

{
  "model": { "display_name": "Claude Sonnet 4.6" },
  "context_window": { "used_percentage": 12.4 },
  "rate_limits": {
    "five_hour":  { "used_percentage": 42, "resets_at": 1742651200 },
    "seven_day":  { "used_percentage": 18, "resets_at": 1743120000 }
  }
}

Documented at code.claude.com/docs/en/statusline. rate_limits is present for Claude.ai subscribers (Pro/Max) after the first API response. The underlying source is the anthropic-ratelimit-unified-* response headers (with anthropic-beta: oauth-2025-04-20) — same five-hour / seven-day windows. The same data is what Claude Code's own /usage slash command reads. Third-party tools confirming the contract: nsanden/claude-rate-monitor, wakamex/ccusage, z80020100/claude-code-statusline.

So for Claude HAPI sessions, HAPI's CLI wrapper installs (or composes with) a statusline command, captures the JSON stream, extracts the rate_limits fields, and forwards them as a typed agentUsage SSE patch. No reverse-engineering, no new API auth.
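The extraction step described above could look roughly like the sketch below. It assumes the statusline payload has the shape shown in the sample JSON; the names extractClaudeUsage, readWindow, and ClaudeAgentUsage are hypothetical and not part of HAPI or Claude Code. Wiring into the CLI wrapper and the SSE patch is left out.

```typescript
// Hypothetical types mirroring the claude variant of the proposed agentUsage shape.
interface UsageWindow {
  usedPercentage: number;
  resetsAt: number; // copied as-is from resets_at in the payload
}

interface ClaudeAgentUsage {
  flavor: 'claude';
  fiveHour: UsageWindow | null;
  sevenDay: UsageWindow | null;
  updatedAt: number;
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

// Reads one rate-limit window; returns null when it is missing or malformed.
function readWindow(value: unknown): UsageWindow | null {
  if (!isRecord(value)) return null;
  const used = value['used_percentage'];
  const resets = value['resets_at'];
  if (typeof used !== 'number' || typeof resets !== 'number') return null;
  return { usedPercentage: used, resetsAt: resets };
}

// Returns null when the payload is not JSON or carries no usable rate_limits,
// which is the expected case for non-subscribers.
function extractClaudeUsage(
  statuslineJson: string,
  nowMs: number = Date.now(),
): ClaudeAgentUsage | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(statuslineJson);
  } catch {
    return null;
  }
  if (!isRecord(parsed)) return null;
  const limits = parsed['rate_limits'];
  if (!isRecord(limits)) return null;
  const fiveHour = readWindow(limits['five_hour']);
  const sevenDay = readWindow(limits['seven_day']);
  if (fiveHour === null && sevenDay === null) return null;
  return { flavor: 'claude', fiveHour, sevenDay, updatedAt: nowMs };
}
```

Returning null rather than throwing keeps the no-data case (free tier, first render before any API response) indistinguishable from "nothing to show", which is what the renderer wants.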

Codex — already addressed by #537

PR #537 ("[codex] add Codex usage indicator") by @dsus4wang already builds the Codex side: it captures token_count events, stores session.metadata.codexUsage, and renders a "compact usage ring beside the send button, with popover details for context window, rate limits, and token breakdown." This umbrella treats #537 as the natural seed and proposes the unified schema be shaped so #537's codexUsage slots in cleanly (or is renamed to agentUsage with flavor: 'codex').

Cursor — has quota data via Admin API (Enterprise only)

The official Cursor Admin API exposes /teams/spend, /teams/daily-usage-data, /teams/filtered-usage-events, /teams/user-spend-limit. Auth via admin API key (Basic Auth). Rate-limited at 20 req/min — hourly polling is the documented best-practice cadence.

Caveat: Enterprise plan only. Pro/Team plans have no documented quota API. That's a real gap (see "Out of scope" below).
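A minimal sketch of the poller's pure parts, assuming hourly polling and a per-member spend list already fetched from the Admin API. The CursorSpendRow shape, the function names, and the idea that billingCycleEnd arrives alongside the rows are assumptions for illustration; the real response fields may differ, and the HTTP call and auth are deliberately omitted.

```typescript
// Hypothetical per-member row; field names follow the proposed schema,
// not a confirmed Admin API response.
interface CursorSpendRow {
  email: string;
  spendCents: number;
  overallSpendCents: number;
  spendLimitCents: number | null;
}

interface CursorAgentUsage {
  flavor: 'cursor';
  tier: 'enterprise';
  spendCents: number;
  overallSpendCents: number;
  spendLimitCents: number | null;
  billingCycleEnd: number;
  updatedAt: number;
}

// Hourly cadence, well under the 20 req/min limit mentioned above.
const POLL_INTERVAL_MS = 60 * 60 * 1000;

// True when no poll has happened yet or the last one is at least an hour old.
function isPollDue(lastPollMs: number | null, nowMs: number): boolean {
  return lastPollMs === null || nowMs - lastPollMs >= POLL_INTERVAL_MS;
}

// Picks the row whose email matches the HAPI user (case-insensitive);
// returns null when there is no match so the gauge renders nothing.
function toCursorUsage(
  rows: CursorSpendRow[],
  hapiEmail: string,
  billingCycleEnd: number,
  nowMs: number,
): CursorAgentUsage | null {
  const wanted = hapiEmail.trim().toLowerCase();
  const row = rows.find((r) => r.email.trim().toLowerCase() === wanted);
  if (row === undefined) return null;
  return {
    flavor: 'cursor',
    tier: 'enterprise',
    spendCents: row.spendCents,
    overallSpendCents: row.overallSpendCents,
    spendLimitCents: row.spendLimitCents,
    billingCycleEnd,
    updatedAt: nowMs,
  };
}
```

Keeping the mapping separate from the fetch means the same function could serve operator-supplied data on other plans without touching the poller.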

Other flavors

Gemini, Kimi, OpenCode: data sources vary, mostly turn-end token counts on response objects. Tractable but flavor-specific. Not blocking for v1 — schema is open to extension.

Proposed shape

1. Unified schema (shared/src/schemas.ts)

import { z } from 'zod'

export const AgentUsageSchema = z.discriminatedUnion('flavor', [
  z.object({
    flavor: z.literal('claude'),
    fiveHour: z.object({ usedPercentage: z.number(), resetsAt: z.number() }).nullish(),
    sevenDay: z.object({ usedPercentage: z.number(), resetsAt: z.number() }).nullish(),
    updatedAt: z.number(),
  }),
  z.object({
    flavor: z.literal('codex'),
    // shape mirrors #537's session.metadata.codexUsage to avoid divergence
    contextWindow: z.object({ used: z.number(), total: z.number() }).optional(),
    rateLimits: z.unknown().optional(),
    tokens: z.unknown().optional(),
    updatedAt: z.number(),
  }),
  z.object({
    flavor: z.literal('cursor'),
    tier: z.enum(['enterprise', 'pro', 'team']),
    spendCents: z.number(),
    overallSpendCents: z.number(),
    spendLimitCents: z.number().nullable(),
    billingCycleEnd: z.number(),
    updatedAt: z.number(),
  }),
])

Stored on session.metadata.agentUsage. Broadcast as SSE metadata patches.

2. Renderer (web/src/components/AssistantChat/AgentUsageGauge.tsx)

One component, discriminates on flavor, delegates to flavor-specific subcomponents:

  • Claude: two-arc gauge (5h + 7d), each showing used % + relative reset time
  • Codex: the ring widget proposed in #537
  • Cursor: spend bar + dollar amount + days-until-cycle-end

Slot: beside the send button in the composer (same neighborhood as #537's ring, and where StatusBar.tsx's ctx N/M segment lives today).
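To illustrate the flavor dispatch and the "used % + relative reset time" text, here is a framework-free sketch of the label logic a gauge component might call. It assumes resetsAt and billingCycleEnd are Unix epoch seconds (the schema does not state the unit), and the names usageLabel and formatRelative are hypothetical.

```typescript
interface GaugeWindow {
  usedPercentage: number;
  resetsAt: number; // assumed to be epoch seconds
}

// Trimmed-down mirror of the proposed union, just enough for rendering text.
type GaugeUsage =
  | { flavor: 'claude'; fiveHour?: GaugeWindow | null; sevenDay?: GaugeWindow | null }
  | { flavor: 'codex' }
  | { flavor: 'cursor'; spendCents: number; spendLimitCents: number | null; billingCycleEnd: number };

// Coarse relative time: whole days, else whole hours, else minutes rounded up.
function formatRelative(targetSec: number, nowSec: number): string {
  const delta = Math.max(0, targetSec - nowSec);
  const days = Math.floor(delta / 86400);
  if (days >= 1) return `${days}d`;
  const hours = Math.floor(delta / 3600);
  if (hours >= 1) return `${hours}h`;
  return `${Math.ceil(delta / 60)}m`;
}

// Returns null when there is nothing to show, so the slot can render nothing.
function usageLabel(usage: GaugeUsage | undefined, nowSec: number): string | null {
  if (usage === undefined) return null;
  if (usage.flavor === 'claude') {
    const parts: string[] = [];
    if (usage.fiveHour) {
      parts.push(`5h ${Math.round(usage.fiveHour.usedPercentage)}% (resets in ${formatRelative(usage.fiveHour.resetsAt, nowSec)})`);
    }
    if (usage.sevenDay) {
      parts.push(`7d ${Math.round(usage.sevenDay.usedPercentage)}% (resets in ${formatRelative(usage.sevenDay.resetsAt, nowSec)})`);
    }
    return parts.length > 0 ? parts.join(', ') : null;
  }
  if (usage.flavor === 'cursor') {
    const spent = `$${(usage.spendCents / 100).toFixed(2)}`;
    const limit = usage.spendLimitCents === null ? '' : ` / $${(usage.spendLimitCents / 100).toFixed(2)}`;
    return `${spent}${limit} (cycle ends in ${formatRelative(usage.billingCycleEnd, nowSec)})`;
  }
  // Codex is left to the ring widget from #537 in this sketch.
  return null;
}
```

Passing nowSec in rather than reading the clock inside keeps the function deterministic, which makes the "renders nothing when absent" criterion easy to unit-test.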

3. Per-flavor extractors

| Flavor | Extractor | Where |
| --- | --- | --- |
| Claude | Install (or chain with) a statusline command that pipes rate_limits into the CLI wrapper | cli/src/claude/ |
| Codex | Already in #537 | as proposed in #537 |
| Cursor (Enterprise) | Hourly poll of Admin API behind opt-in cursorAdminApiKey config | hub/src/cursor/ |

Acceptance criteria

  • session.metadata.agentUsage field added with schema above (or compatible)
  • Renderer slot in composer that gracefully renders nothing when agentUsage absent (no UI jitter for free-tier Claude users, OpenCode, Gemini, etc. where data isn't available)
  • Claude statusline-JSON extractor working for Claude.ai Pro/Max sessions; no-op for non-subscribers
  • Codex extractor in place (via #537 merging, or an aligned-shape successor)
  • Cursor Enterprise Admin API poller behind opt-in config; correctly maps per-user spend when HAPI user email matches Cursor user email
  • No new XSS / auth-token-leak surface; admin API keys never logged or sent to the client
  • Each flavor's extractor degrades quietly (logs at debug, not error) when the underlying data source is unavailable
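The last two criteria (no key leakage, quiet degradation) could be enforced by a small wrapper around each extractor. The sketch below is one possible shape, with hypothetical names (tryExtract, redactSecrets) and a synchronous extractor for brevity; an async poller would need an awaiting variant.

```typescript
type DebugLog = (message: string) => void;

// Replaces every occurrence of each known secret with a placeholder.
function redactSecrets(message: string, secrets: string[]): string {
  let out = message;
  for (const secret of secrets) {
    if (secret.length > 0) out = out.split(secret).join('[redacted]');
  }
  return out;
}

// Runs an extractor; on failure logs at debug level (never error) with
// secrets scrubbed, and returns null so the gauge simply stays hidden.
function tryExtract<T>(
  name: string,
  extract: () => T | null,
  debug: DebugLog,
  secrets: string[] = [],
): T | null {
  try {
    return extract();
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    debug(redactSecrets(`[agentUsage:${name}] data source unavailable: ${reason}`, secrets));
    return null;
  }
}
```

Usage would be along the lines of tryExtract('cursor', () => pollOnce(), logger.debug, [adminKey]), where pollOnce, logger, and adminKey are placeholders for whatever the hub actually provides.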

Out of scope

  • Cursor Pro/Team quota: no official API exists. Operators can wire a workaround via the reverse-engineered dashboard endpoints (WorkosCursorSessionToken cookie + GET /api/usage-summary + POST /api/dashboard/get-filtered-usage-events, documented at https://gist.github.com/dmwyatt/1e9359b1862e7cbfe1e754fe4c8db764, with prior art in dmwyatt/cursor-usage CLI and kdosiodjinud/cursor-chrome-extension). That's not appropriate for upstream — undocumented endpoints, brittle, the cookie is httpOnly and must be manually extracted, JWT expiry must be handled. But the renderer + schema proposed here are exactly the surface a Pro/Team operator would need to render their own scraped data, so this issue stays useful for them. If Cursor ships a Pro/Team-level usage API in the future, a follow-up issue swaps the data source.
  • OpenCode / Gemini / Kimi gauges: extensible-by-design; ship after the unified surface lands.
  • Setting spend limits via the Cursor Admin API (it's available via POST /teams/user-spend-limit, 250 req/min). Read-only for v1.
  • Generalising the Cursor failure-mode classifier (stderr parsing for out of usage etc.) — orthogonal; that's about post-mortem signal, this issue is about pre-mortem visibility.

Related

  • #537: Codex usage indicator (open, the natural seed; this umbrella unblocks generalisation after it lands)
  • #750: OpenCode status bar context-usage refresh (closed, same surface neighborhood)
  • #645: Claude remote-mode ctx stuck at 0 (open, context not quota but adjacent code path)
  • #813: Mermaid render-error SVG suppression (recently merged; mentioned in passing because it touches the same composer neighborhood)
  • #818: Cursor stream-json wrapper silent-drop on agent exit (closed, fixed the failure mode but not the visibility surface)

Notes for maintainers

This issue is intentionally framed as umbrella + slot-shape, not a single fat PR. Suggested PR sequence:

  1. Unified schema + renderer skeleton: adds AgentUsageSchema and an empty AgentUsageGauge that renders nothing when data is absent. Tiny diff, no behavior change.
  2. Claude statusline extractor: implements Fix-5-shape; opt-in to start if you'd prefer.
  3. Cursor Enterprise poller: implements Fix-7-shape behind a config flag.
  4. #537 alignment: either #537 merges and we adjust, or we propose a shape-aligned variant in coordination with @dsus4wang.

Happy to start at (1) if you'd like. Will follow CONTRIBUTING.md including the AI disclosure block in PR bodies.

Labels: enhancement
