feat(billing): revert reserved message slot on failed/aborted chats #51
Merged
Both the Studio chat handler and the Conversation API handler reserve
a message slot up-front via `incrementAgentUsageIfAllowed` /
`incrementAPIUsageIfAllowed` so concurrent requests can't double-spend
the last cap slot. Previously every reservation was terminal — a
pre-AI DB error, brain-cache failure, Anthropic auth failure, or
client abort before any tokens streamed back still consumed a slot
even though no LLM cost was incurred.
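The reserve-then-maybe-refund semantics can be modeled with a minimal in-memory sketch (the `UsageStore` class, its cap, and its method names are illustrative stand-ins, not the real Supabase-backed helpers):

```typescript
// Minimal in-memory model of increment-if-allowed semantics: the cap check
// and the increment happen as one step, so two requests racing for the last
// slot cannot both succeed. In production this is a single atomic RPC, not
// app-level read-then-write logic.
class UsageStore {
  private counts = new Map<string, number>();

  constructor(private readonly cap: number) {}

  /** Reserve one message slot; returns false when the cap is exhausted. */
  incrementIfAllowed(key: string): boolean {
    const current = this.counts.get(key) ?? 0;
    if (current >= this.cap) return false;
    this.counts.set(key, current + 1);
    return true;
  }

  /** Refund a reservation, clamped at zero (mirrors GREATEST(count - 1, 0)). */
  decrement(key: string): void {
    const current = this.counts.get(key) ?? 0;
    this.counts.set(key, Math.max(current - 1, 0));
  }

  count(key: string): number {
    return this.counts.get(key) ?? 0;
  }
}
```

The clamp in `decrement` is what makes a double revert harmless: a second call on an already-refunded key is a no-op rather than a negative count.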
The runtime rule established in review: a message becomes billable
the moment Anthropic streams its first real provider event (`text` or
`tool_use`). `tool_result` is internal (engine-emitted post tool
exec), `done` is synthetic, and `error` can fire before any tokens —
none of those qualify. Any failure before the first billable event
refunds the slot. After the first billable event we keep the
reservation regardless of downstream outcome (mid-stream error,
client abort, DB save failure) — we paid Anthropic for those tokens.
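A sketch of that commit gate, shaped like the handler's stream loop (the `ProviderEvent` type, `runStream`, and the `tryRevert` signature are stand-ins, not the actual handler code):

```typescript
// Hypothetical sketch of the first-billable-event commit gate.
type ProviderEvent = { type: "text" | "tool_use" | "tool_result" | "done" | "error" };

// Only real provider output commits the reservation; tool_result, done,
// and error never do.
const BILLABLE = new Set(["text", "tool_use"]);

/** Returns true iff the reservation was committed (first billable event seen). */
async function runStream(
  events: AsyncIterable<ProviderEvent>,
  tryRevert: (reason: string) => Promise<void>,
): Promise<boolean> {
  let committed = false;
  try {
    for await (const ev of events) {
      if (!committed && BILLABLE.has(ev.type)) {
        committed = true; // slot is spent for good; meter event may now be written
      }
      // ... forward ev to the SSE pipeline ...
    }
  } finally {
    // Any exit path that never saw a billable event refunds the slot,
    // including mid-loop throws (finally runs before the error propagates).
    if (!committed) await tryRevert("post-reserve");
  }
  return committed;
}
```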
Migration 007 adds `decrement_agent_usage` and `decrement_api_usage`
SECURITY DEFINER RPCs with a `GREATEST(message_count - 1, 0)` clamp
so a double-revert race (caller's outer catch + inner finally both
calling through best-effort wrappers) can't drive the counter
negative.
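For illustration, such an RPC might look roughly like this (table and parameter names are assumptions; the actual 007 migration may differ in naming, grants, and arguments):

```sql
-- Sketch of a revert RPC (illustrative, not the real migration).
-- SECURITY DEFINER lets the restricted API role decrement a row it cannot
-- otherwise update; GREATEST clamps a double-revert race at zero.
create or replace function decrement_agent_usage(
  p_workspace_id uuid,
  p_month text
) returns void
language sql
security definer
as $$
  update agent_usage
     set message_count = greatest(message_count - 1, 0)
   where workspace_id = p_workspace_id
     and month = p_month;
$$;
```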
Handler restructure mirrors the rule:
- `reserved` / `committed` flags scoped above the request body.
- `tryRevert(reason)` helper logs RPC failures rather than swallowing
them (silent revert failure makes quota drift hard to debug, per
review feedback).
- Studio chat: outer try wraps reserve + all pre-AI work + processChat
launch; outer catch hits `tryRevert('pre-AI')` then rethrows.
ProcessChat's own finally hits `tryRevert('post-reserve')` for any
failure once the SSE pipeline is live.
- Conversation API: single try/finally — non-SSE request/response, so
one scope covers both pre-AI and AI paths.
- `recordAIUsage` / `recordAPIUsage` outbox writes moved into the
for-await loop, gated on the same first-event commit so failed
requests never emit a meter event to Polar.
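The scoping above can be reduced to a simplified sketch (helper names are hypothetical; the `committed` gate and processChat's own inner finally are elided for brevity):

```typescript
// Hypothetical shape of the handler's revert scoping; reserveSlot,
// revertSlot, and processChat stand in for the real helpers.
async function handleChat(deps: {
  reserveSlot: () => Promise<boolean>;
  revertSlot: () => Promise<void>;
  processChat: () => Promise<void>;
}) {
  let reserved = false;

  // Best-effort revert: log failures instead of swallowing them, so a stuck
  // counter shows up in logs rather than as silent quota drift.
  const tryRevert = async (reason: string) => {
    if (!reserved) return;
    reserved = false; // guard against a double revert from nested scopes
    try {
      await deps.revertSlot();
    } catch (err) {
      console.error(`usage revert failed (${reason})`, err);
    }
  };

  try {
    reserved = await deps.reserveSlot();
    if (!reserved) throw new Error("message cap reached");
    // ... all pre-AI work: load conversation, history, brain, branches ...
    await deps.processChat();
  } catch (err) {
    await tryRevert("pre-AI"); // refund any failure that reached this scope un-reverted
    throw err; // rethrow so the framework converts it to an HTTP error
  }
}
```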
Three new integration tests cover the matrix:
- Provider errors before any event → revert called, no meter event.
- Provider yields a text event then errors → no revert, meter event
written (committed = true).
- Pre-AI step (loadConversationMessages) throws after reservation →
outer catch fires revert, no meter event.
Token accounting on aborted/partial streams stays out of scope — the
engine still only captures totals on `message_end`, so a stream that
errors after first token writes 0 tokens to the usage row. That's a
separate observability follow-up.
Summary
Both chat handlers reserve a message slot up-front for race-free cap enforcement, but every reservation was previously terminal. A pre-AI failure (DB error, brain-cache failure, Anthropic auth failure, abort before first token) consumed a slot even though no LLM cost was incurred — workspaces lost monthly quota to work that never happened.
Billing semantics (review-agreed)
- `tool_result` is internal (engine-emitted post tool exec) → not a commit trigger
- `done` is synthetic → not a commit trigger
- `error` may fire before any tokens → not a commit trigger

Pre-first-event failure → refund slot. Post-first-event failure (mid-stream error, abort, save fail) → keep slot, we paid Anthropic.
Changes
Migration
`007_agent_usage_revert_rpcs.sql`

Adds `decrement_agent_usage` and `decrement_api_usage` SECURITY DEFINER RPCs with a `GREATEST(message_count - 1, 0)` clamp — a double-revert race (caller's outer catch + inner finally both invoking best-effort wrappers) cannot drive the counter negative.

Provider surface

- `decrementAgentUsage({ workspaceId, userId, month, source })`
- `decrementAPIUsage({ workspaceId, apiKeyId, month })`

Handlers
Per review feedback (a `finally` inside processChat would miss pre-AI failures, because the conversation/history/brain/branches load happens at handler scope), the structure now scopes the revert correctly:

- `reserved` / `committed` declared above the request body; outer try wraps reserve + all pre-AI work + processChat launch + `eventStream.send()`; outer catch invokes `tryRevert('pre-AI')` then rethrows so h3 converts it to an HTTP error. ProcessChat's own finally invokes `tryRevert('post-reserve')` for failures once the SSE pipeline is live.
- `recordAIUsage` / `recordAPIUsage` moved into the for-await loop, gated on the same first-event commit so failed/aborted requests never emit a meter event to Polar.
- `tryRevert(reason)` logs RPC failures (console.error) rather than silently swallowing them — silent revert failure makes quota drift hard to debug.

Out of scope
Partial token accounting on aborted streams. The engine still only captures `inputTokens` / `outputTokens` on `message_end`, so a stream that errors after the first token writes 0 tokens to the usage row. Anthropic billed us for those tokens but the dashboard underreports. Separate observability follow-up — a different concern (cost capture vs. quota refund).

Test plan
- `pnpm test` — 590 passed (587 + 3 new integration tests)
- `pnpm typecheck` — clean
- `pnpm lint` — 0 errors on changed files

Three new integration tests in `chat-route.integration.test.ts`:

- refunds the reserved slot when the provider errors before yielding any event (provider `next()` throws): `decrementAgentUsage` called once, `recordAIUsage` NOT called
- keeps the reservation when the provider yields at least one event before erroring: `decrementAgentUsage` NOT called, `recordAIUsage` called once
- refunds the reserved slot when a pre-AI step fails after reservation (`loadConversationMessages` rejects): `decrementAgentUsage` called once, `recordAIUsage` NOT called, HTTP 500
The Conversation API path uses the symmetric helper (`decrementAPIUsage`); its integration coverage will follow the same E2E harness pattern once the Conversation API integration suite lands (separate PR for the harness).