fix: use actual token counts for /stats cost calculation instead of estimates (#106)
Previously, `/stats` expense calculations used estimated input tokens (`body.length / 4`) and `max_tokens` for output, then applied a 20% buffer. This caused logged costs to significantly exceed actual costs.

Changes:
- Capture `completion_tokens` from API responses (streaming + non-streaming)
- Use actual input/output token counts when available, fall back to estimates
- Remove the 20% buffer from the logged cost (the buffer remains in `estimateAmount()` for pre-payment, which is correct)
- Add an `outputTokens` field to the `UsageEntry` type and log entries

The pre-payment 20% buffer in `estimateAmount()` and the 1.5x `BALANCE_CHECK_BUFFER` are intentionally unchanged — those are conservative safeguards for payment, not for reporting.

Closes BlockRunAI#36
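The before/after of the logged-cost calculation can be sketched as follows. The prices and `calcCost` helper here are hypothetical stand-ins (the real function in this repo is `calculateModelCost()`), used only to show the shape of the change:

```typescript
// Hypothetical per-token prices; calcCost stands in for calculateModelCost().
const IN_PRICE = 3e-6;   // $3 per 1M input tokens (assumed)
const OUT_PRICE = 15e-6; // $15 per 1M output tokens (assumed)
const calcCost = (inTok: number, outTok: number) =>
  inTok * IN_PRICE + outTok * OUT_PRICE;

const body = JSON.stringify({ messages: [{ role: "user", content: "hi" }] });
const maxTokens = 4096;
const usage = { prompt_tokens: 12, completion_tokens: 20 }; // from the API response

// Before: estimated input, maxTokens as output, plus a 20% buffer on the logged value.
const loggedBefore = calcCost(Math.ceil(body.length / 4), maxTokens) * 1.2;

// After: actual counts from the response's usage object, no buffer.
const loggedAfter = calcCost(usage.prompt_tokens, usage.completion_tokens);

console.log(loggedBefore > loggedAfter); // the old path overstated the logged cost
```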
**Walkthrough**

The changes introduce output token tracking alongside existing input token tracking and refactor cost calculations to use actual token counts from provider responses instead of estimates based on body length. A 20% buffer previously applied to cost estimates is removed. The logging payload is extended to include output token data when available.
**1bcMax** left a comment:
Fix is correct and well-scoped. Both issues accurately diagnosed: `maxTokens` ≠ `completion_tokens`, and the 1.2x buffer belongs only in pre-payment estimation, not in logged costs. The pre-payment path (`estimateAmount`, `BALANCE_CHECK_BUFFER`) was correctly left untouched. No conflicts with recent main. Merging.
## Problem
Fixes #36
The `/stats` endpoint overstates costs by ~2.5x compared to on-chain charges. Two compounding issues:

**1. `maxTokens` used instead of actual output tokens**

`calculateModelCost()` receives `maxTokens` (the maximum allowed output, e.g. 4096) rather than the actual `completion_tokens` from the response. A response generating 200 tokens is costed as if 4096 were produced.

**2. 20% buffer applied to logged cost**

The pre-payment estimation correctly applies a 1.2x buffer (you overpay slightly because the exact cost is unknown upfront). However, this same buffer was also applied to the logged cost shown in `/stats`, inflating displayed expenses.

## Fix
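To see how the two issues compound, here is a worked example with hypothetical per-token prices and a hypothetical request shape (none of these numbers come from the repo):

```typescript
// Hypothetical prices: $3 / 1M input tokens, $15 / 1M output tokens.
const inputPrice = 3 / 1_000_000;
const outputPrice = 15 / 1_000_000;

// A request with 1,000 input tokens that produced 2,000 output tokens,
// but was sent with maxTokens = 4096.
const actualCost = 1000 * inputPrice + 2000 * outputPrice;
// Old logging: maxTokens costed as output, then a 1.2x buffer on top.
const loggedCost = (1000 * inputPrice + 4096 * outputPrice) * 1.2;

console.log((loggedCost / actualCost).toFixed(1) + "x overstatement"); // ≈ 2.3x
```

The shorter the actual completion relative to `maxTokens`, the larger the overstatement, which is why the observed factor varies around ~2.5x.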
Changes in `src/proxy.ts`:

- Capture `completion_tokens` from upstream responses (both streaming and non-streaming paths) into a new `responseOutputTokens` variable
- Use actual token counts (`responseInputTokens`, `responseOutputTokens`) for cost calculation when available, falling back to estimates only when the upstream provider doesn't return usage data
- `UsageEntry` — log the actual cost, not the pre-payment estimate
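A minimal sketch of the capture-and-fallback logic described above. The variable names (`responseInputTokens`, `responseOutputTokens`) come from the PR; the response shape assumes an OpenAI-style `usage` object, and the actual code in `src/proxy.ts` may differ:

```typescript
// OpenAI-style usage payload (assumed shape).
interface Usage {
  prompt_tokens?: number;
  completion_tokens?: number;
}

let responseInputTokens: number | undefined;
let responseOutputTokens: number | undefined;

// Non-streaming: usage arrives on the final JSON body.
// (Streaming providers typically send it on a final chunk instead.)
function captureUsage(body: { usage?: Usage }): void {
  responseInputTokens = body.usage?.prompt_tokens;
  responseOutputTokens = body.usage?.completion_tokens;
}

// Cost calculation prefers actual counts, falling back to estimates
// only when the provider returned no usage data.
function tokensForCost(estimatedInput: number, maxTokens: number) {
  return {
    input: responseInputTokens ?? estimatedInput,
    output: responseOutputTokens ?? maxTokens,
  };
}
```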
Changes in `src/logger.ts`:

- Add `outputTokens` field to the `UsageEntry` type for richer `/stats` reporting

What was NOT changed:
- `estimateAmount()` — pre-payment 20% buffer is correct and stays
- `BALANCE_CHECK_BUFFER` (1.5x) — balance-checking buffer is correct and stays
- `calculateModelCost()` — the function is correct; it was just receiving wrong inputs

## Testing
- `tsc --noEmit`
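For reference, the logger-side type change can be sketched as follows. Only `outputTokens` is the field this PR adds; the other `UsageEntry` fields shown here are hypothetical placeholders, since the real type lives in `src/logger.ts`:

```typescript
// Hypothetical shape; only `outputTokens` is the field added by this PR.
interface UsageEntry {
  timestamp: number;
  model: string;
  inputTokens: number;
  outputTokens?: number; // new: actual completion_tokens when the provider reports them
  cost: number;          // actual cost, without the pre-payment buffer
}

// Serialize one entry as a JSON log line.
function logUsage(entry: UsageEntry): string {
  return JSON.stringify(entry);
}
```

Keeping `outputTokens` optional preserves the fallback path: entries logged when the provider returned no usage data simply omit the field.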