
feat(credits): port computeCreditsDeductedCents from open-agents (step 2/6) #611

Closed
sweetmantech wants to merge 1 commit into `test` from `feat/charge-credits-per-chat-turn`


@sweetmantech

@sweetmantech sweetmantech commented May 25, 2026

Summary

Step 2 of 6 in closing the chat-workflow credits gap tracked in #605 (silent revenue loss — every chat turn on `/api/chat/workflow` is currently free).

Ports the per-turn cost math from open-agents so the same billing math runs on both sides of the chat cutover. No behavior change yet — this PR only adds the foundation that step 3 (the `deductCreditsWithAudit` wrapper) will use.

What landed

Three new files in `lib/credits/` (one exported function each, matching api's SRP convention):

| File | Purpose |
| --- | --- |
| `AvailableModelCost.ts` | Cost-catalog type. Mirrors open-agents' `AvailableModelCost` so the same estimator handles either catalog. api's current catalog (via `getAvailableModels` → `models.dev`) only emits `{ input, output }`; the richer `cache_read` / `context_over_200k` fields are typed but undefined for now, ready for when the catalog expands. |
| `estimateModelUsageCost.ts` | Token-based USD estimator. Ports open-agents `apps/web/lib/models.ts:estimateModelUsageCost` line-for-line. Handles the 200k+ context tier swap with partial-override fallback, and bills cached input tokens at `cache_read` price when present (falls back to base input price otherwise). |
| `computeCreditsDeductedCents.ts` | Top-level orchestrator. Ports open-agents' `compute-credits-deducted-cents.ts` with one improvement: calls api's `getAvailableModels()` directly instead of HTTP-fetching `/api/ai/models` from itself. |
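
For readers without the diff at hand, here is a minimal sketch of the estimator behaviour described in the table. It is not the ported code. The field names (`input`, `output`, `cache_read`, `context_over_200k`) come from this description; the `TurnUsage` shape, the `resolveCostTier` body, and the assumption that prices are USD per million tokens are illustrative guesses.

```typescript
// Sketch only: field names follow the PR description; units and shapes are assumptions.
type CostTier = { input?: number; output?: number; cache_read?: number };
type AvailableModelCost = CostTier & { context_over_200k?: CostTier };
type TurnUsage = { inputTokens: number; outputTokens: number; cachedInputTokens?: number };

const CONTEXT_TIER_THRESHOLD = 200_000;
const TOKENS_PER_PRICE_UNIT = 1_000_000; // assumption: prices are USD per million tokens

// Pick the base tier or the 200k+ override, filling gaps in a partial override from the base.
function resolveCostTier(cost: AvailableModelCost, inputTokens: number): CostTier {
  const over = cost.context_over_200k;
  // The swap triggers only strictly above 200k input tokens.
  if (over === undefined || inputTokens <= CONTEXT_TIER_THRESHOLD) return cost;
  // An override that sets neither input nor output price is not treated as a real tier.
  if (over.input === undefined && over.output === undefined) return cost;
  return {
    input: over.input ?? cost.input,
    output: over.output ?? cost.output,
    cache_read: over.cache_read ?? cost.cache_read,
  };
}

function estimateModelUsageCost(usage: TurnUsage, cost?: AvailableModelCost): number | undefined {
  if (!cost) return undefined;
  const inputTokens = Math.max(0, usage.inputTokens);
  const tier = resolveCostTier(cost, inputTokens);
  // Guard rails: without both an input and an output price there is nothing to estimate.
  if (tier.input === undefined || tier.output === undefined) return undefined;

  // Clamp so the uncached portion can never go negative.
  const cached = Math.min(Math.max(0, usage.cachedInputTokens ?? 0), inputTokens);
  const uncached = inputTokens - cached;
  const outputTokens = Math.max(0, usage.outputTokens);
  const cacheReadPrice = tier.cache_read ?? tier.input; // fall back to the input price

  const total = uncached * tier.input + cached * cacheReadPrice + outputTokens * tier.output;
  return total / TOKENS_PER_PRICE_UNIT;
}
```

Under these assumptions, 1,000 input tokens (400 of them cached) and 500 output tokens at `{ input: 3, output: 15, cache_read: 0.3 }` come to (600 × 3 + 400 × 0.3 + 500 × 15) / 1,000,000 = 0.00942 USD.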

Resolution order (unchanged from open-agents)

  1. Gateway-reported cost on `responseMessage.metadata.totalMessageCost` — the same number the chat UI shows next to the assistant response. Used directly so the wallet debit converges with the cost label.
  2. Token-based estimate against the catalog's `cost` entry. Catalog comes from `getAvailableModels()` (gateway + models.dev pipeline).
  3. 1c floor — every successful turn moves the wallet by at least 1c so a catalog outage can't make a turn free.
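
The three-step resolution order above can be sketched as follows. This is an illustration, not the code in this PR: the catalog loader and the estimator are passed in as parameters (`loadCatalog` and `estimate` are hypothetical names standing in for `getAvailableModels()` and `estimateModelUsageCost`), and rounding with `Math.round` is an assumption, since the description only says the value "rounds correctly".

```typescript
// Sketch only: parameter names and the rounding rule are assumptions.
type ModelCost = { input?: number; output?: number };
type ModelEntry = { id: string; cost?: ModelCost };
type TurnUsage = { inputTokens: number; outputTokens: number };

const MIN_CENTS = 1; // every successful turn moves the wallet by at least 1c

// Convert USD to whole cents, never below the floor.
const toCents = (usd: number): number => Math.max(MIN_CENTS, Math.round(usd * 100));

// A cost is usable only if it is a positive, finite number.
const isUsable = (usd: number | undefined): usd is number =>
  typeof usd === "number" && Number.isFinite(usd) && usd > 0;

async function computeCreditsDeductedCentsSketch(params: {
  usage: TurnUsage;
  modelId: string;
  gatewayCostUsd?: number;
  loadCatalog: () => Promise<ModelEntry[]>; // stand-in for getAvailableModels()
  estimate: (usage: TurnUsage, cost: ModelCost) => number | undefined;
}): Promise<number> {
  // 1. Gateway-reported cost wins, and the catalog is never touched.
  if (isUsable(params.gatewayCostUsd)) return toCents(params.gatewayCostUsd);

  try {
    // 2. Token-based estimate against the exact-match catalog entry.
    const models = await params.loadCatalog();
    const model = models.find((m) => m.id === params.modelId);
    const estimated = model?.cost ? params.estimate(params.usage, model.cost) : undefined;
    // 3. Floor when the model, its cost entry, or a usable estimate is missing.
    return isUsable(estimated) ? toCents(estimated) : MIN_CENTS;
  } catch (error) {
    // A catalog outage must not make the turn free.
    console.error("catalog lookup failed, charging the 1c floor", error);
    return MIN_CENTS;
  }
}
```

Injecting the loader keeps the sketch self-contained and makes the "does not call the catalog when gateway cost is usable" property easy to check with a counting stub.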

Tests

27 new unit tests:

  • `estimateModelUsageCost.test.ts` — 13 tests:

    • guard rails (missing cost / missing input price / missing output price → undefined)
    • base tier math, cached + uncached portions
    • `cache_read` priced correctly when present; falls back to input price when absent
    • clamps negative cached / negative output to 0
    • clamps `cachedInputTokens > inputTokens` so uncached portion can't go negative
    • `context_over_200k` tier swap triggers at strictly > 200k input
    • tier swap ignored when only `cache_read` is overridden (not a real tier)
    • falls back to base tier values when override is partial
  • `computeCreditsDeductedCents.test.ts` — 14 tests:

    • gateway cost path: positive number → cents, rounds correctly, 1c floor
    • does NOT call catalog when gateway cost is usable (perf check)
    • falls back to estimate for undefined / 0 / negative / NaN / Infinity gateway cost
    • returns 1c when model not in catalog / cost missing / fetch rejects / estimate ≤ 0
    • matches modelId exactly (no fuzzy matching)

Full suite: 3191 → 3205 passing.

What's next

After this merges:

  • Step 3: `lib/credits/deductCreditsWithAudit.ts`, a thin `supabase.rpc('deduct_credits_with_audit', ...)` wrapper using the RPC merged in database#26 ("feat: deduct_credits_with_audit RPC for atomic wallet + meter writes")
  • Step 4: `"use step"` `recordChatUsage.ts` reading `responseMessage.metadata.{totalMessageCost, totalMessageUsage}` and calling the wrapper
  • Step 5: Wire `recordChatUsage` into `runAgentWorkflow` after `persistAssistantMessage`
  • Step 6: Preview verification — trigger a chat, then `GET /api/accounts/{id}/credits` (balance drops) and `/api/admins/credits/events` (new audit row visible)

Test plan

  • 27 new unit tests pass
  • Full suite passes (3205/3205)
  • Lint / format clean
  • After merge to `test`: confirm gateway-cost path with a live chat turn (`responseMessage.metadata.totalMessageCost` is currently emitted by `buildMessageMetadataCallback` and would now also be the input to billing)

🤖 Generated with Claude Code


Summary by cubic

Ports the per-turn billing math from open-agents to compute chat turn charges consistently across the cutover. Step 2/6 for #605; no behavior change yet.

  • New Features
    • Added lib/credits/computeCreditsDeductedCents.ts to compute cents per turn: gateway-reported cost → token estimate → 1c floor.
    • Added lib/credits/estimateModelUsageCost.ts for token-based USD estimates, including 200k+ context tier and cache_read pricing.
    • Added lib/credits/AvailableModelCost.ts types mirroring open-agents; uses the getAvailableModels() catalog.

Written for commit 319d2e6. Summary will update on new commits.

…ost from open-agents

First piece of the chat-workflow billing path. Ports the per-turn cost
math from open-agents' `apps/web/lib/credits/compute-credits-deducted-cents.ts`
and `apps/web/lib/models.ts:estimateModelUsageCost` so the same billing
logic runs on both sides during the cutover.

Resolution order matches open-agents exactly:
  1. gateway-reported cost on responseMessage.metadata.totalMessageCost
     (the same number the chat UI shows next to the response)
  2. token-based estimate against the model catalog's cost entry
  3. 1c floor when no pricing is available — so a successful turn
     never lands as a free run

Three new files (per api's one-exported-function-per-file SRP):
  - AvailableModelCost.ts — shape mirroring open-agents' richer cost
    type (input, output, cache_read, context_over_200k) so the same
    estimator runs against either catalog
  - estimateModelUsageCost.ts — token-based USD estimator including
    the 200k+ context tier swap and cache_read pricing
  - computeCreditsDeductedCents.ts — top-level orchestrator (gateway
    cost → token estimate → 1c floor) using api's getAvailableModels
    directly (no HTTP self-fetch like open-agents does)

Test coverage: 27 new unit tests across the two test files. All pricing
edge cases covered (NaN/Infinity/negative gateway cost, cached-tokens-
exceeding-input clamping, context_over_200k tier swap with partial
overrides, catalog miss / fetch failure fallbacks).

Unblocks step 3 (deductCreditsWithAudit TS wrapper) of the chat credits
gap in #605.

Full suite: 3191 → 3205 passing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@vercel

vercel Bot commented May 25, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| api | Ready | Preview | May 25, 2026 7:42pm |


@coderabbitai

coderabbitai Bot commented May 25, 2026

Warning

Review limit reached

@sweetmantech, we couldn't start this review because you've used your available PR reviews for now.

Your plan includes 1 review of capacity. Refill in 46 seconds.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more review capacity refills, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than trial, open-source, and free plans. In all cases, review capacity refills continuously over time.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: b2f9a910-9f0d-4ee4-a337-433cf78e480c

📥 Commits

Reviewing files that changed from the base of the PR and between 5ec88cc and 319d2e6.

⛔ Files ignored due to path filters (2)
  • lib/credits/__tests__/computeCreditsDeductedCents.test.ts is excluded by !**/*.test.*, !**/__tests__/** and included by lib/**
  • lib/credits/__tests__/estimateModelUsageCost.test.ts is excluded by !**/*.test.*, !**/__tests__/** and included by lib/**
📒 Files selected for processing (3)
  • lib/credits/AvailableModelCost.ts
  • lib/credits/computeCreditsDeductedCents.ts
  • lib/credits/estimateModelUsageCost.ts


@cubic-dev-ai cubic-dev-ai Bot left a comment


2 issues found across 5 files

Confidence score: 4/5

  • This PR looks safe to merge with low functional risk, since both findings are test-organization/style issues rather than runtime logic defects.
  • The most severe issue is in lib/credits/__tests__/computeCreditsDeductedCents.test.ts (6/10): exceeding the 100-line limit and mixing concerns can make tests harder to maintain and debug over time.
  • lib/credits/__tests__/estimateModelUsageCost.test.ts has a similar but lower-severity length violation (3/10), which is mainly a housekeeping concern and not likely to cause regression by itself.
  • Pay close attention to lib/credits/__tests__/computeCreditsDeductedCents.test.ts and lib/credits/__tests__/estimateModelUsageCost.test.ts - both exceed repository test file size/style expectations and may benefit from splitting for maintainability.
Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="lib/credits/__tests__/computeCreditsDeductedCents.test.ts">

<violation number="1" location="lib/credits/__tests__/computeCreditsDeductedCents.test.ts:1">
P2: Custom agent: **Enforce Clear Code Style and Maintainability Practices**

Test file exceeds the repository’s 100-line limit and combines multiple concerns in one module.</violation>
</file>

<file name="lib/credits/__tests__/estimateModelUsageCost.test.ts">

<violation number="1" location="lib/credits/__tests__/estimateModelUsageCost.test.ts:1">
P3: Custom agent: **Enforce Clear Code Style and Maintainability Practices**

This new test file exceeds the repository’s 100-line limit.</violation>
</file>
Architecture diagram
sequenceDiagram
    participant UI as Chat UI
    participant API as Chat API (runAgentWorkflow)
    participant Cost as computeCreditsDeductedCents
    participant Est as estimateModelUsageCost
    participant Catalog as getAvailableModels
    participant Models as models.dev / Gateway

    Note over UI,Models: NEW: Per-turn credit charge computation

    API->>Cost: NEW: computeCreditsDeductedCents(usage, modelId, gatewayCostUsd)

    alt Gateway cost available (positive finite number)
        Cost->>Cost: NEW: Round to cents, apply 1c floor
        Cost-->>API: cents (integer ≥ 1)
    else Gateway cost missing/0/negative/NaN/Infinity
        Cost->>Catalog: NEW: fetch model catalog
        Catalog->>Models: getAvailableModels()
        Models-->>Catalog: [{ id, cost }, ...]

        alt Model found with cost entry
            Catalog-->>Cost: model cost tier
            Cost->>Est: NEW: estimateModelUsageCost(usage, cost)

            Est->>Est: NEW: resolveCostTier (base vs 200k+ override)
            alt Base tier (input ≤ 200k)
                Est->>Est: NEW: compute uncached input + cached input + output
            else Context over 200k tier (input > 200k AND override present)
                Est->>Est: NEW: use overridden input/output/cache_read prices
            end
            Est-->>Cost: USD estimate or undefined

            alt Estimate is valid positive number
                Cost->>Cost: NEW: Round to cents, apply 1c floor
            else Estimate undefined or ≤ 0
                Cost->>Cost: NEW: return 1c floor
            end
        else Model not found or no cost entry
            Cost->>Cost: NEW: return 1c floor
        end

        alt Catalog fetch fails (error)
            Cost->>Cost: NEW: log error, return 1c floor
        end
        Cost-->>API: cents (integer ≥ 1)
    end

    Note over API,Cost: Result used by step 3 (deductCreditsWithAudit)

@@ -0,0 +1,183 @@
import { describe, it, expect, vi, beforeEach } from "vitest";

P2: Custom agent: Enforce Clear Code Style and Maintainability Practices

Test file exceeds the repository’s 100-line limit and combines multiple concerns in one module.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At lib/credits/__tests__/computeCreditsDeductedCents.test.ts, line 1:

<comment>Test file exceeds the repository’s 100-line limit and combines multiple concerns in one module.</comment>

<file context>
@@ -0,0 +1,183 @@
+import { describe, it, expect, vi, beforeEach } from "vitest";
+import { computeCreditsDeductedCents } from "@/lib/credits/computeCreditsDeductedCents";
+import { getAvailableModels } from "@/lib/ai/getAvailableModels";
</file context>

@@ -0,0 +1,165 @@
import { describe, it, expect } from "vitest";

P3: Custom agent: Enforce Clear Code Style and Maintainability Practices

This new test file exceeds the repository’s 100-line limit.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At lib/credits/__tests__/estimateModelUsageCost.test.ts, line 1:

<comment>This new test file exceeds the repository’s 100-line limit.</comment>

<file context>
@@ -0,0 +1,165 @@
+import { describe, it, expect } from "vitest";
+import { estimateModelUsageCost } from "@/lib/credits/estimateModelUsageCost";
+
</file context>

@sweetmantech

Closing — this slice was pure library code with no integration point, which made the preview deployment untestable (no route or workflow exercises the new functions yet). Per feedback on the "step 2/6" framing, reshaping the work into a single user-testable PR on the same branch (`feat/charge-credits-per-chat-turn`). The next PR will include:

  1. The lib code from this PR (`AvailableModelCost`, `estimateModelUsageCost`, `computeCreditsDeductedCents`)
  2. `lib/credits/deductCreditsWithAudit.ts` — `supabase.rpc('deduct_credits_with_audit', ...)` wrapper
  3. `app/lib/workflows/recordChatUsage.ts` — `"use step"` that ties the two together
  4. Wiring into `runAgentWorkflow` after `persistAssistantMessage`

So one merge = chat turns actually charge credits, demonstrable on the preview by triggering a workflow and watching `GET /api/accounts/{id}/credits` drop.
