feat(web+cli+hub): cross-flavor agent budget/quota gauges in composer (umbrella)

## Summary

Today HAPI surfaces **context** usage in the composer status bar for some flavors (Claude/Codex per #645, #750, #813). It does **not** surface **quota / budget** — i.e. "how much of your 5-hour window / weekly limit / monthly spend cap is gone, and when does it reset?" An operator running many sessions has no in-UI signal that a session is about to die from quota until it actually dies.

This issue proposes an umbrella for **cross-flavor agent budget/quota gauges** in the composer. The work splits cleanly into a small renderer/schema PR plus per-flavor data-extractor PRs.

The Codex piece is essentially [#537](https://github.com/tiann/hapi/pull/537) by @dsus4wang — this issue is structured so #537 (or its successor) is the natural seed and the other flavors slot in alongside it.

## Why this is more tractable than it looks

The two providers HAPI most commonly wraps now ship structured quota data on documented surfaces:

### Claude Code — `rate_limits` in statusline JSON stdin

Since Claude Code v2.1+ (some shape since v1.2.80) the JSON object piped to `statusLine.command` on every render contains:

```json
{
  "model": { "display_name": "Claude Sonnet 4.6" },
  "context_window": { "used_percentage": 12.4 },
  "rate_limits": {
    "five_hour":  { "used_percentage": 42, "resets_at": 1742651200 },
    "seven_day":  { "used_percentage": 18, "resets_at": 1743120000 }
  }
}
```

Documented at code.claude.com/docs/en/statusline. `rate_limits` is present for Claude.ai subscribers (Pro/Max) after the first API response. The underlying source is the `anthropic-ratelimit-unified-*` response headers (with `anthropic-beta: oauth-2025-04-20`) — same five-hour / seven-day windows. The same data is what Claude Code's own `/usage` slash command reads. Third-party tools confirming the contract: `nsanden/claude-rate-monitor`, `wakamex/ccusage`, `z80020100/claude-code-statusline`.

So for Claude HAPI sessions: HAPI's CLI wrapper installs (or composes with) a statusline command, captures the JSON stream, extracts the `rate_limits` fields, forwards as a typed `agentUsage` SSE patch. No reverse-engineering, no new API auth.

### Codex — already addressed by #537

[PR #537](https://github.com/tiann/hapi/pull/537) ("[codex] add Codex usage indicator") by @dsus4wang already builds the Codex side: captures `token_count` events, stores `session.metadata.codexUsage`, renders a "compact usage ring beside the send button, with popover details for context window, rate limits, and token breakdown." This umbrella treats #537 as the natural seed and proposes the unified schema be shaped so #537's `codexUsage` slots in cleanly (or is renamed to `agentUsage` of `flavor: 'codex'`).

### Cursor — has quota data via Admin API (Enterprise only)

The official Cursor [Admin API](https://cursor.com/docs/account/teams/admin-api) exposes `/teams/spend`, `/teams/daily-usage-data`, `/teams/filtered-usage-events`, `/teams/user-spend-limit`. Auth via admin API key (Basic Auth). Rate-limited at 20 req/min — hourly polling is the documented best-practice cadence.

Caveat: **Enterprise plan only**. Pro/Team plans have no documented quota API. That's a real gap (see "Out of scope" below).

### Other flavors

Gemini, Kimi, OpenCode: data sources vary, mostly turn-end token counts on response objects. Tractable but flavor-specific. Not blocking for v1 — schema is open to extension.

## Proposed shape

### 1. Unified schema (`shared/src/schemas.ts`)

```ts
export const AgentUsageSchema = z.discriminatedUnion('flavor', [
  z.object({
    flavor: z.literal('claude'),
    fiveHour: z.object({ usedPercentage: z.number(), resetsAt: z.number() }).nullish(),
    sevenDay: z.object({ usedPercentage: z.number(), resetsAt: z.number() }).nullish(),
    updatedAt: z.number(),
  }),
  z.object({
    flavor: z.literal('codex'),
    // shape mirrors #537's session.metadata.codexUsage to avoid divergence
    contextWindow: z.object({ used: z.number(), total: z.number() }).optional(),
    rateLimits: z.unknown().optional(),
    tokens: z.unknown().optional(),
    updatedAt: z.number(),
  }),
  z.object({
    flavor: z.literal('cursor'),
    tier: z.enum(['enterprise', 'pro', 'team']),
    spendCents: z.number(),
    overallSpendCents: z.number(),
    spendLimitCents: z.number().nullable(),
    billingCycleEnd: z.number(),
    updatedAt: z.number(),
  }),
])
```

Stored on `session.metadata.agentUsage`. Broadcast as SSE metadata patches.

### 2. Renderer (`web/src/components/AssistantChat/AgentUsageGauge.tsx`)

One component, discriminates on `flavor`, delegates to flavor-specific subcomponents:

- **Claude:** two-arc gauge (5h + 7d), each showing used % + relative reset time
- **Codex:** the ring widget already shipped in #537
- **Cursor:** spend bar + dollar amount + days-until-cycle-end

Slot: beside the send button in the composer (same neighborhood as #537's ring, and where `StatusBar.tsx`'s `ctx N/M` segment lives today).

### 3. Per-flavor extractors

| Flavor | Extractor | Where |
|---|---|---|
| Claude | Install (or chain with) a statusline command that pipes `rate_limits` into the CLI wrapper | `cli/src/claude/` |
| Codex | Already in #537 | as proposed in #537 |
| Cursor (Enterprise) | Hourly poll of Admin API behind opt-in `cursorAdminApiKey` config | `hub/src/cursor/` |

## Acceptance criteria

- [ ] `session.metadata.agentUsage` field added with schema above (or compatible)
- [ ] Renderer slot in composer that gracefully renders nothing when `agentUsage` absent (no UI jitter for free-tier Claude users, OpenCode, Gemini, etc. where data isn't available)
- [ ] Claude statusline-JSON extractor working for Claude.ai Pro/Max sessions; no-op for non-subscribers
- [ ] Codex extractor in place (via #537 merging, or aligned-shape successor)
- [ ] Cursor Enterprise Admin API poller behind opt-in config; correctly maps per-user spend when HAPI user email matches Cursor user email
- [ ] No new XSS / auth-token-leak surface; admin API keys never logged or sent to the client
- [ ] Each flavor's extractor degrades quietly (logs at debug, not error) when the underlying data source is unavailable

## Out of scope

- **Cursor Pro/Team quota:** no official API exists. Operators can wire a workaround via the reverse-engineered dashboard endpoints (`WorkosCursorSessionToken` cookie + `GET /api/usage-summary` + `POST /api/dashboard/get-filtered-usage-events`, documented at https://gist.github.com/dmwyatt/1e9359b1862e7cbfe1e754fe4c8db764, with prior art in `dmwyatt/cursor-usage` CLI and `kdosiodjinud/cursor-chrome-extension`). That's **not appropriate for upstream** — undocumented endpoints, brittle, the cookie is `httpOnly` and must be manually extracted, JWT expiry must be handled. But the *renderer + schema* proposed here are exactly the surface a Pro/Team operator would need to render their own scraped data, so this issue stays useful for them. If Cursor ships a Pro/Team-level usage API in the future, a follow-up issue swaps the data source.
- **OpenCode / Gemini / Kimi gauges:** extensible-by-design; ship after the unified surface lands.
- **Setting** spend limits via the Cursor Admin API (it's available via `POST /teams/user-spend-limit`, 250 req/min). Read-only for v1.
- **Generalising the Cursor failure-mode classifier** (stderr parsing for `out of usage` etc.) — orthogonal; that's about post-mortem signal, this issue is about pre-mortem visibility.

## Related

- [#537](https://github.com/tiann/hapi/pull/537) — Codex usage indicator (open, the natural seed; this umbrella unblocks generalisation after it lands)
- [#750](https://github.com/tiann/hapi/issues/750) — OpenCode status bar context-usage refresh (closed, same surface neighborhood)
- [#645](https://github.com/tiann/hapi/issues/645) — Claude remote-mode `ctx` stuck at 0 (open, context not quota but adjacent code path)
- [#813](https://github.com/tiann/hapi/pull/813) — Mermaid render-error SVG suppression (recently merged, in passing because the composer touches the same neighborhood)
- [#818](https://github.com/tiann/hapi/issues/818) — Cursor stream-json wrapper silent-drop on agent exit (closed, fixed the failure mode but not the visibility surface)

## Notes for maintainers

This issue is intentionally framed as **umbrella + slot-shape**, not a single fat PR. Suggested PR sequence:

1. **Unified schema + renderer skeleton** — adds `AgentUsageSchema`, empty `AgentUsageGauge` that renders nothing when data absent. Tiny diff, no behavior change.
2. **Claude statusline extractor** — implements Fix-5-shape; opt-in to start if you'd prefer.
3. **Cursor Enterprise poller** — implements Fix-7-shape behind config flag.
4. **#537 alignment** — either #537 merges and we adjust, or we propose a shape-aligned variant in coordination with @dsus4wang.

Happy to start at (1) if you'd like. Will follow [CONTRIBUTING.md](https://github.com/tiann/hapi/blob/main/CONTRIBUTING.md) including the AI disclosure block in PR bodies.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat(web+cli+hub): cross-flavor agent budget/quota gauges in composer (umbrella) #846

Summary

Why this is more tractable than it looks

Claude Code — `rate_limits` in statusline JSON stdin

Codex — already addressed by #537

Cursor — has quota data via Admin API (Enterprise only)

Other flavors

Proposed shape

1. Unified schema (`shared/src/schemas.ts`)

2. Renderer (`web/src/components/AssistantChat/AgentUsageGauge.tsx`)

3. Per-flavor extractors

Acceptance criteria

Out of scope

Related

Notes for maintainers

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Flavor	Extractor	Where
Claude	Install (or chain with) a statusline command that pipes `rate_limits` into the CLI wrapper	`cli/src/claude/`
Codex	Already in #537	as proposed in #537
Cursor (Enterprise)	Hourly poll of Admin API behind opt-in `cursorAdminApiKey` config	`hub/src/cursor/`

Uh oh!

feat(web+cli+hub): cross-flavor agent budget/quota gauges in composer (umbrella) #846

Description

Summary

Why this is more tractable than it looks

Claude Code — rate_limits in statusline JSON stdin

Codex — already addressed by #537

Cursor — has quota data via Admin API (Enterprise only)

Other flavors

Proposed shape

1. Unified schema (shared/src/schemas.ts)

2. Renderer (web/src/components/AssistantChat/AgentUsageGauge.tsx)

3. Per-flavor extractors

Acceptance criteria

Out of scope

Related

Notes for maintainers

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions

Claude Code — `rate_limits` in statusline JSON stdin

1. Unified schema (`shared/src/schemas.ts`)

2. Renderer (`web/src/components/AssistantChat/AgentUsageGauge.tsx`)