release: charge credits per chat turn (#612) #613
…run (#609)

* fix(chat-workflow): persist the assistant message after a successful run

Closes the silent-data-loss gap that the open-agents → recoup-api cutover introduced: the chat workflow streamed the final assistant message to the client over SSE but never wrote it to `chat_messages`, so a page refresh after a successful exchange wiped the reply.

Changes:
- New `lib/chat/persistAssistantMessage.ts` step (mirrors open-agents' `app/workflows/chat-post-finish.ts` helper of the same name). Fire-and-forget upsert + chat `updated_at` touch on fresh inserts; idempotent on workflow replay; never throws.
- `runAgentStep` now wires an `onFinish` callback into `toUIMessageStream` to capture the assembled assistant message, and returns it alongside `finishReason` as part of the new `RunAgentStepResult` type.
- `runAgentWorkflow` calls `persistAssistantMessage(chatId, responseMessage)` after a successful `runAgentStep` (in the try block, BEFORE the existing `clearChatActiveStream` + `closeChatStream` finally). On throw, no message is persisted (nothing was generated); cleanup still runs.

Tests:
- `persistAssistantMessage.test.ts`: 6 cases (insert + touch, duplicate skip, wrong-role guard, DB-error swallow, exception swallow, role assertion).
- `runAgentStep.test.ts`: 3 new cases (onFinish wired, captured responseMessage returned, undefined when onFinish never fires).
- `runAgentWorkflow.test.ts`: 3 new cases (persist called on success, not called when responseMessage undefined, not called on throw while cleanup still runs).

Full suite: 3159 → 3171 passing.
Tracking: #605 (Tier 1, item 1)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(chat-workflow): loosen AssistantMessage type to accept UIMessage

The over-strict `& Record<string, unknown>` intersection on the outer shape required an index signature that `UIMessage` from `ai` doesn't carry, so wiring runAgentStep's UIMessage return into persistAssistantMessage failed the Vercel build with TS2345. Switched to a minimal duck-typed shape (id/role/parts), which matches both UIMessage and the in-test fixtures structurally. The `chat_messages.parts` column is jsonb so persistence doesn't care about the part subtypes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(chat-workflow): mark persistAssistantMessage as a "use step"

Vercel Workflow blocks `fetch()` in workflow-body code; the Supabase JS client uses fetch under the hood, so `upsertChatMessage` inside `persistAssistantMessage` failed at runtime with:

    Global "fetch" is unavailable in workflow functions. Use the "fetch" step function from "workflow" to make HTTP requests.

`"use step"` directive moves the function into step-context where fetch is legal. Mirrors open-agents' `persistAssistantMessage` step in `app/workflows/chat-post-finish.ts` (which carries the same directive). Caught via runtime log inspection on the PR preview before merge.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* debug(chat-workflow): TEMP diagnostic logs in persistAssistantMessage

Hard-refresh of a chat that ran on PR #609's preview showed the assistant message NOT in chat_messages, meaning silence in the existing error log is NOT the same as "row was written." Adding explicit logs at entry, after upsert, and after updateChat so the runtime tail can disambiguate:
- "skip: not assistant role" branch
- upsert result shape (ok / isDuplicate / rowPresent)
- "persisted + touched chat" success line

Will be reverted before merge.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(chat-workflow): pass generateMessageId to toUIMessageStream

Diagnostic logs revealed every assistant message was arriving at persistAssistantMessage with `messageId: ''`, the AI SDK's default when `generateMessageId` isn't provided. Supabase's `chat_messages.id` PK then treated every workflow run after the first as a duplicate (`onConflict: "id", ignoreDuplicates: true` → isDuplicate: true, rowPresent: false) so no assistant row landed.

Generating a stable id once per `runAgentStep` invocation via `generateId()` from `ai`, then plumbing it into `result.toUIMessageStream({ generateMessageId: () => ... })` so:
- the streamed chunks carry the id (existing wire format),
- `onFinish.responseMessage.id` carries the id,
- `persistAssistantMessage` sees a real id and the upsert lands a fresh row.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(chat-workflow): move assistantMessageId generation to workflow body

Match open-agents' structural pattern instead of generating the id inline inside runAgentStep. Rationale (which I should have applied the first time, per review feedback):

1. **Multi-step support**: when the Tier 2 outer loop lands, each runAgentStep call needs the SAME assistantMessageId so chunks accumulate under one chat_messages row instead of fragmenting per tool-call iteration. Generating inside the step gives every call a fresh id; generating in the workflow body and threading through makes the upgrade path one-line.
2. **Resume-after-tool-call**: open-agents reuses the latest message's id when `latestMessage.role === "assistant"` so the in-progress assistant turn re-attaches instead of starting a new row. Ported now to avoid a future surprise.
3. **Determinism**: `generateId()` is non-deterministic; the workflow body's WDK constraint forbids that. Wrapping it in a `"use step"` (`generateAssistantMessageId.ts`) makes the value durable across workflow replays.

Changes:
- New `app/lib/workflows/generateAssistantMessageId.ts` step (mirrors open-agents' local `generateId` step in `apps/web/app/workflows/chat.ts`).
- `RunAgentStepInput` gains `assistantMessageId: string`. The inline `generateId()` call is removed.
- `runAgentWorkflow` reads `latestMessage`; reuses its id when it's an assistant message, otherwise awaits the step. Threads the result into `runAgentStep`.
- Tests: 2 new for the step, 1 new for runAgentStep forwarding, 2 new for the resume-aware branch in runAgentWorkflow.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(chat-workflow): revert temp diagnostic logs in persistAssistantMessage

Logs served their purpose: they surfaced the empty-messageId bug (fixed in 8974a37 by threading a workflow-generated id through toUIMessageStream's generateMessageId). UI verification on the PR preview confirmed the assistant row now persists. Reverting the debug logs so production runtime stays quiet.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(chat-workflow): bump last_assistant_message_at on persist (unread badge parity)

Match open-agents' `updateChatAssistantActivity` which sets BOTH `updated_at` and `last_assistant_message_at` to the same timestamp. The recoup-api sidebar's `hasUnread` badge is computed in `lib/sessions/chats/getChatSummaries.ts` as `lastAssistantMessageAt > lastReadAt`, mirroring open-agents' identical query in `apps/web/lib/db/sessions.ts:201`.

Without this column bump, an assistant message persisted by the workflow streams to the client, lands in `chat_messages`, but never lights up the unread badge for any other tabs/devices the user has open. The column already exists in `api`'s `chats` schema and `updateChat` already accepts it via `ChatMutableFields`; this is purely a "we forgot to set it" fix.
Added two new unit tests:
- bumps `last_assistant_message_at` on fresh insert
- uses the same timestamp for both columns

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: format-fix workflow files (prettier --write)

Resolves the format/lint CI failures on df312db; purely whitespace collapsing per the repo's prettier config (no behavior change).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
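
The empty-id duplicate bug and the resume-aware id reuse described in the commits above can be illustrated with a minimal sketch. This is not the code from this PR: `InMemoryChatMessages` and `resolveAssistantMessageId` are hypothetical names, and the in-memory map only imitates an upsert on `id` that ignores duplicates.

```typescript
type ChatRole = "user" | "assistant" | "system";

interface MinimalMessage {
  id: string;
  role: ChatRole;
  parts: unknown[];
}

// Hypothetical in-memory stand-in for a table keyed by `id`.
class InMemoryChatMessages {
  private rows = new Map<string, MinimalMessage>();

  // Imitates an upsert that skips rows whose id already exists.
  upsertIgnoreDuplicates(message: MinimalMessage): { isDuplicate: boolean } {
    if (this.rows.has(message.id)) {
      return { isDuplicate: true };
    }
    this.rows.set(message.id, message);
    return { isDuplicate: false };
  }

  count(): number {
    return this.rows.size;
  }
}

// Reuse the in-progress assistant turn's id; otherwise mint a new one.
// `generate` stands in for whatever id source the workflow uses.
function resolveAssistantMessageId(
  latest: MinimalMessage | undefined,
  generate: () => string,
): string {
  if (latest !== undefined && latest.role === "assistant") {
    return latest.id;
  }
  return generate();
}
```

With an empty string as the id, the second insert is treated as a duplicate and silently skipped, which matches the symptom described above (only the first run lands a row). Passing the id source in as a callback keeps the helper deterministic in tests.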
…dit) (#612)

* feat(credits): port computeCreditsDeductedCents + estimateModelUsageCost from open-agents

First piece of the chat-workflow billing path. Ports the per-turn cost math from open-agents' `apps/web/lib/credits/compute-credits-deducted-cents.ts` and `apps/web/lib/models.ts:estimateModelUsageCost` so the same billing logic runs on both sides during the cutover.

Resolution order matches open-agents exactly:
1. gateway-reported cost on responseMessage.metadata.totalMessageCost (the same number the chat UI shows next to the response)
2. token-based estimate against the model catalog's cost entry
3. 1c floor when no pricing is available, so a successful turn never lands as a free run

Three new files (per api's one-exported-function-per-file SRP):
- AvailableModelCost.ts: shape mirroring open-agents' richer cost type (input, output, cache_read, context_over_200k) so the same estimator runs against either catalog
- estimateModelUsageCost.ts: token-based USD estimator including the 200k+ context tier swap and cache_read pricing
- computeCreditsDeductedCents.ts: top-level orchestrator (gateway cost → token estimate → 1c floor) using api's getAvailableModels directly (no HTTP self-fetch like open-agents does)

Test coverage: 27 new unit tests across the two test files. All pricing edge cases covered (NaN/Infinity/negative gateway cost, cached-tokens-exceeding-input clamping, context_over_200k tier swap with partial overrides, catalog miss / fetch failure fallbacks).

Unblocks step 3 (deductCreditsWithAudit TS wrapper) of the chat credits gap in #605.

Full suite: 3191 → 3205 passing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(credits): charge credits per chat turn (atomic wallet debit + audit)

Closes the silent revenue-loss gap tracked in #605: every successful chat workflow turn now debits the account's wallet AND records a usage_events audit row, in a single atomic transaction.

End-to-end flow:
1. runAgentStep's onFinish captures responseMessage.metadata ({totalMessageCost, totalMessageUsage}), the same number the chat UI shows next to the response.
2. runAgentWorkflow calls recordChatUsage(accountId, modelId, message) after persistAssistantMessage.
3. recordChatUsage → computeCreditsDeductedCents (gateway cost OR token estimate OR 1c floor) → deductCreditsWithAudit (supabase.rpc('deduct_credits_with_audit')).
4. The Postgres function (recoupable/database#26) runs the wallet UPDATE and the usage_events INSERT in one implicit transaction: either both land or neither does. Matches open-agents' db.transaction(...) atomicity guarantee.

Threads accountId through RunAgentWorkflowInput from validateChatWorkflow (auth-derived; never trusted from the request body).

New files:
- lib/supabase/credits_usage/deductCreditsWithAudit.ts (+ tests)
  Thin supabase.rpc wrapper; fire-and-forget (returns ok/error instead of throwing). Lives in lib/supabase/ per CLAUDE.md SRP.
- app/lib/workflows/recordChatUsage.ts (+ tests)
  "use step" function that ties the two together with entry/skip/success/error logs and graceful handling of missing metadata, catalog failures, and RPC errors.

Updated:
- app/lib/workflows/runAgentWorkflow.ts
  + accountId field on RunAgentWorkflowInput
  + recordChatUsage call after successful persistAssistantMessage
- lib/chat/handleChatWorkflowStream.ts
  + passes validated.accountId into start(runAgentWorkflow, ...)
- app/lib/workflows/__tests__/runAgentWorkflow.test.ts
  + 3 new tests (records on success, skips when no responseMessage, skips when runAgentStep throws)

TDD: each new file went red → minimal impl → green. Suite: 3205 → 3220 passing.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(credits): use flat interface for DeductCreditsWithAuditResult

Next.js 16's type checker wasn't narrowing the discriminated union `{ ok: true } | { ok: false; error: string }` through `if (!result.ok)`, breaking the production build at `recordChatUsage.ts:90`. Vitest's own type config tolerated it, so this only surfaced on the preview deploy.

Flat interface with optional `error?: string` avoids the narrowing requirement entirely; the caller can read `result.error` directly when `result.ok` is false. Slight type-safety loss (compiler doesn't enforce that `error` is present when ok is false) is worth the build stability.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(credits): regenerate supabase RPC type for deduct_credits_with_audit

The previous deploy failed because:
1. `types/database.types.ts` was stale: it didn't include the `deduct_credits_with_audit` RPC that landed in recoupable/database#26 (and was manually applied via the MCP after Supabase's GitHub App 502'd post-merge). Without that entry, `supabase.rpc("deduct_credits_with_audit", ...)` failed Next.js's stricter type check.
2. Even with the entry, the typed `Args.p_event: Json` couldn't accept our `DeductCreditsAuditEvent` interface directly; TS doesn't infer interface → index-signature assignment.

Fixes:
- Added the `deduct_credits_with_audit` entry to the Functions block of types/database.types.ts (matches the upstream regen via mcp__plugin_supabase_supabase__generate_typescript_types).
- Cast `params.event as unknown as Json` at the supabase boundary in deductCreditsWithAudit.ts. The runtime payload is unchanged and the interface keeps its strong typing for callers.

Verified locally: `pnpm exec tsc --noEmit` shows no errors in any file this PR touches.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(credits): consolidate chat-workflow billing into handleChatCredits (DRY)

Addresses the user's PR review: my new files duplicated existing infrastructure. Consolidates everything into the existing pattern (handleChatCredits → getCreditUsage + recordCreditDeduction) so chat workflow billing uses the SAME orchestrator that the streaming chat path (handleChatStream) already uses.

Changes:
1. lib/credits/getCreditUsage.ts
   - Added optional `gatewayCostUsd?: number` parameter
   - When positive, returns it directly (skips catalog lookup)
   - Otherwise existing token-math path is unchanged (backwards compat)
2. lib/credits/handleChatCredits.ts
   - Added `gatewayCostUsd?: number` (threaded to getCreditUsage)
   - Added `source?: "web" | "api"` (defaults to "web" for backwards compat; chat workflow passes "api" so admin dashboards can distinguish surface in spend rollups)
3. lib/credits/recordCreditDeduction.ts
   - Switched from `deductCredits + insertUsageEvent` (two separate Supabase calls, non-atomic; could leave wallet/meter drifted on partial failure) to the single `deduct_credits_with_audit` RPC.
   - Now atomic for ALL callers (chat workflow + research handlers), not just the new chat-workflow path.
   - Return shape simplified: `{ success: boolean }` instead of `{ success, newBalance }` (no caller was reading newBalance).
4. app/lib/workflows/runAgentWorkflow.ts
   - Imports handleChatCredits instead of recordChatUsage.
   - Reads gatewayCostUsd + token counts from responseMessage.metadata.{totalMessageCost, totalMessageUsage}.
5. Deleted (consolidated into existing infrastructure):
   - app/lib/workflows/recordChatUsage.ts
   - lib/credits/computeCreditsDeductedCents.ts
   - lib/credits/estimateModelUsageCost.ts
   - lib/credits/AvailableModelCost.ts
   - lib/credits/resolveCostTier.ts
   - All their test files

Net delta: -7 files, +0 new orchestrator function. Plus the atomicity guarantee now applies to research handlers too.
TDD: each change went RED → minimum impl → GREEN, with all 3195 tests passing at the end.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(credits): mark recordCreditDeduction as 'use step' for workflow runtime

Vercel Workflow's build-time detector flagged `nanoid` as a Node.js module that can't run inside the workflow body. Marking recordCreditDeduction as 'use step' moves it into the step runtime where Node modules are allowed.

Backwards compatible for the existing research-handler callers (regular API routes): 'use step' functions execute immediately when called from non-workflow contexts.

Also matches open-agents' pattern: their recordWorkflowUsage (which contains the equivalent nanoid call) is a 'use step' function.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(workflow): collapse inline metadata duck-type (KISS)

PR review feedback: the 14-line inline type assertion for `result.responseMessage` was needless boilerplate. Replaced with:
1. Import the existing `AgentMessageMetadata` type (already used by `runAgentStep`'s `messageMetadata` callback; single source of truth for the shape).
2. Hoist a module-level `ZERO_USAGE` default so the fallback when metadata is missing is a named constant, not an inline literal.
3. Cast `result.responseMessage.metadata` once (`as AgentMessageMetadata | undefined`).

Net delta: 14 lines → 5 lines inside the workflow body, no behavior change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
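
The cost resolution order described in these commits (gateway-reported cost, then token estimate, then a 1c floor) could look roughly like the sketch below. It is an illustration under stated assumptions, not the contents of `getCreditUsage`: the names `resolveCostUsd`, `toCreditsCents`, and `ModelPricing` are hypothetical, per-token pricing is simplified to input/output only, and rounding up to whole cents is an assumption the commits do not state.

```typescript
interface TokenUsage {
  inputTokens: number;
  outputTokens: number;
}

// Hypothetical, simplified pricing shape (no cache or context tiers).
interface ModelPricing {
  inputUsdPerToken: number;
  outputUsdPerToken: number;
}

// Prefer a positive, finite gateway cost; otherwise estimate from tokens.
// Returns undefined when neither source is available.
function resolveCostUsd(
  usage: TokenUsage,
  pricing?: ModelPricing,
  gatewayCostUsd?: number,
): number | undefined {
  if (
    gatewayCostUsd !== undefined &&
    Number.isFinite(gatewayCostUsd) &&
    gatewayCostUsd > 0
  ) {
    return gatewayCostUsd;
  }
  if (pricing === undefined) {
    return undefined;
  }
  return (
    usage.inputTokens * pricing.inputUsdPerToken +
    usage.outputTokens * pricing.outputUsdPerToken
  );
}

// Convert USD to cents with a 1-cent minimum so a turn is never free.
// Rounding up (Math.ceil) is an assumption made for this sketch.
function toCreditsCents(costUsd: number | undefined): number {
  if (costUsd === undefined || !Number.isFinite(costUsd) || costUsd <= 0) {
    return 1;
  }
  return Math.max(1, Math.ceil(costUsd * 100));
}
```

Keeping the USD resolution separate from the cents conversion means the floor applies uniformly, whichever source produced the cost.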
Summary
Releases PR #612 from `test` → `main`.
Test plan
🤖 Generated with Claude Code
Summary by cubic
Charges credits for every successful chat workflow turn with an atomic wallet debit + audit, closing the chat-workflow billing gap. Consolidates billing for both `/api/chat` and `/api/chat/workflow`; closes #605.

New Features
- `deduct_credits_with_audit` Postgres RPC for each turn (no wallet/meter drift).
- `runAgentWorkflow` persists the assistant message, then bills using `handleChatCredits` with `accountId`, model, gateway cost, and token usage from message metadata.
- `getCreditUsage` accepts a gateway-reported USD cost and short-circuits token math; 1¢ minimum still applies.
- `source="api"`; streaming path defaults to `source="web"`.

Refactors
- `handleChatCredits` for `/api/chat` and `/api/chat/workflow`.
- `deductCredits + insertUsageEvent` with a single RPC wrapper: `lib/supabase/credits_usage/deductCreditsWithAudit`.
- `accountId` into the workflow; updated tests and Supabase RPC types.

Written for commit d56aa31. Summary will update on new commits.
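
The "both land or neither does" property of the combined debit + audit can be illustrated in memory. The sketch below is not the Postgres function or its wrapper: `Ledger`, `UsageEvent`, and `deductCreditsWithAuditSketch` are hypothetical names, and the two failure conditions (invalid amount, duplicate event id) are assumptions chosen only to show that a failed call leaves the prior state untouched. The flat `{ ok, error? }` result mirrors the shape the commits describe.

```typescript
interface UsageEvent {
  id: string;
  accountId: string;
  costCents: number;
  source: "web" | "api";
}

interface Ledger {
  balanceCents: number;
  usageEvents: readonly UsageEvent[];
}

// Flat result shape: `error` is only set when `ok` is false.
interface DeductResult {
  ok: boolean;
  error?: string;
  ledger: Ledger;
}

// Pure function: returns a new ledger with BOTH changes applied,
// or the original ledger unchanged when anything fails.
function deductCreditsWithAuditSketch(
  ledger: Ledger,
  event: UsageEvent,
): DeductResult {
  if (!Number.isInteger(event.costCents) || event.costCents <= 0) {
    return { ok: false, error: "invalid amount", ledger };
  }
  // Assumed failure mode standing in for an INSERT that cannot land.
  if (ledger.usageEvents.some((existing) => existing.id === event.id)) {
    return { ok: false, error: "duplicate event id", ledger };
  }
  return {
    ok: true,
    ledger: {
      balanceCents: ledger.balanceCents - event.costCents,
      usageEvents: [...ledger.usageEvents, event],
    },
  };
}
```

Because the function never mutates its input, a caller cannot observe a debited balance without the matching audit row, which is the drift the single RPC is meant to rule out.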