
fix: use actual token counts for /stats cost calculation instead of estimates#106

Merged
1bcMax merged 1 commit into BlockRunAI:main from kagurachen28-prog:fix/stats-expense-calculation on Mar 18, 2026

Conversation

kagurachen28-prog (Contributor) commented Mar 18, 2026

Problem

Fixes #36

The /stats endpoint overstates costs by ~2.5x compared to on-chain charges. Two compounding issues:

1. maxTokens used instead of actual output tokens

calculateModelCost() receives maxTokens (the maximum allowed output, e.g. 4096) rather than the actual completion_tokens from the response. A response generating 200 tokens is costed as if 4096 were produced.

2. 20% buffer applied to logged cost

The pre-payment estimation correctly applies a 1.2x buffer (you overpay slightly because exact cost is unknown upfront). However, this same buffer was also applied to the logged cost shown in /stats, inflating displayed expenses.
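To make the compounding concrete, here is an illustrative calculation with made-up per-token prices and token counts (none of these numbers come from the project's actual pricing tables):

```typescript
// Illustrative only: hypothetical prices and token counts, not the
// project's actual pricing. Shows how costing max_tokens as output and
// then applying the 1.2x buffer compounds into a large overstatement.
const INPUT_PRICE = 0.000003;   // hypothetical $/input token
const OUTPUT_PRICE = 0.000015;  // hypothetical $/output token

const estimatedInput = 1200;    // body.length / 4 style estimate
const actualInput = 1000;       // provider-reported prompt_tokens
const maxTokens = 4096;         // maximum allowed output
const actualOutput = 200;       // provider-reported completion_tokens

// Old logged cost: estimated input, max_tokens as output, 1.2x buffer.
const oldLogged = (estimatedInput * INPUT_PRICE + maxTokens * OUTPUT_PRICE) * 1.2;

// Fixed logged cost: actual counts, no buffer.
const newLogged = actualInput * INPUT_PRICE + actualOutput * OUTPUT_PRICE;

console.log(oldLogged > newLogged); // the shorter the completion, the larger the gap
```

The exact inflation ratio depends on how short the completion is relative to max_tokens, which is why the observed mismatch varies around the reported ~2.5x.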

Fix

Changes in src/proxy.ts:

  • Capture completion_tokens from upstream responses (both streaming and non-streaming paths) into a new responseOutputTokens variable
  • Use actual token counts (responseInputTokens, responseOutputTokens) for cost calculation when available, falling back to estimates only when the upstream provider doesn't return usage data
  • Remove the 1.2x buffer from the logged UsageEntry — log actual cost, not the pre-payment estimate
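The fallback logic in the first two bullets can be sketched as follows, assuming a hypothetical shape for the upstream usage object; the real wiring in src/proxy.ts is more involved:

```typescript
// Sketch of the fix, not the actual src/proxy.ts code. The field names
// mirror OpenAI-compatible usage payloads; the fallback values are the
// pre-existing estimates the PR keeps as a last resort.
interface UpstreamUsage {
  prompt_tokens?: number;
  completion_tokens?: number;
}

function resolveTokens(
  usage: UpstreamUsage | undefined,
  estimatedInput: number, // body.length / 4 style estimate
  maxTokens: number       // maximum allowed output
): { inputTokens: number; outputTokens: number } {
  // Prefer provider-reported counts; fall back to estimates only when
  // the upstream response carries no usage data.
  const inputTokens = usage?.prompt_tokens ?? estimatedInput;
  const outputTokens = usage?.completion_tokens ?? maxTokens;
  return { inputTokens, outputTokens };
}
```

With usage present, the actual counts flow into the cost calculation; with usage absent, behavior degrades to the old estimates rather than failing.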

Changes in src/logger.ts:

  • Add optional outputTokens field to UsageEntry type for richer /stats reporting
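The type change can be sketched like this; only the new optional outputTokens field comes from the PR, the remaining fields are placeholders:

```typescript
// Hypothetical shape of UsageEntry after the change; field names other
// than outputTokens are illustrative, not the project's actual type.
interface UsageEntry {
  model: string;
  inputTokens: number;
  outputTokens?: number; // provider-reported completion_tokens, when available
  cost: number;          // actual cost, no 1.2x buffer
  timestamp: string;
}
```

Making the field optional keeps old log entries (and responses without usage data) valid while letting /stats report richer metrics when counts are present.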

What was NOT changed:

  • estimateAmount() — pre-payment 20% buffer is correct and stays
  • BALANCE_CHECK_BUFFER (1.5x) — balance checking buffer is correct and stays
  • calculateModelCost() — function is correct, it was just receiving wrong inputs
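The division of responsibilities this list preserves can be sketched as two separate paths (helper bodies are illustrative; only the 1.2x factor and its placement come from the PR):

```typescript
// Sketch, not the project's actual implementation: the 1.2x buffer is a
// pre-payment safeguard (exact cost is unknown upfront), so it applies
// only when reserving funds, never when logging what a request cost.
const PREPAY_BUFFER = 1.2;

// Pre-payment path: overpay slightly against an estimate.
function prepayAmount(estimatedCost: number): number {
  return estimatedCost * PREPAY_BUFFER;
}

// Reporting path: log the exact cost, with no buffer applied.
function reportedCost(actualCost: number): number {
  return actualCost;
}
```

The bug was effectively routing the reporting path through the pre-payment logic; keeping the two paths distinct is what makes the fix well-scoped.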

Testing

  • All 331 tests pass, 3 skipped (pre-existing)
  • TypeScript compiles cleanly (tsc --noEmit)

Summary by CodeRabbit

Release Notes

  • New Features

    • Added output token tracking to provide more detailed usage metrics alongside input tokens.
  • Bug Fixes

    • Improved usage and cost calculations to use actual token counts from provider responses instead of estimates, removing inaccurate approximation buffers for greater precision.

Previously, /stats expense calculations used estimated input tokens
(body.length / 4) and max_tokens for output, then applied a 20% buffer.
This caused logged costs to significantly exceed actual costs.

Changes:
- Capture completion_tokens from API responses (streaming + non-streaming)
- Use actual input/output token counts when available, fall back to estimates
- Remove 20% buffer from logged cost (buffer remains in estimateAmount()
  for pre-payment, which is correct)
- Add outputTokens field to UsageEntry type and log entries

The pre-payment 20% buffer in estimateAmount() and the 1.5x
BALANCE_CHECK_BUFFER are intentionally unchanged — those are
conservative safeguards for payment, not for reporting.

Closes BlockRunAI#36

coderabbitai bot commented Mar 18, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration
  • Configuration used: .coderabbit.yaml
  • Review profile: CHILL
  • Plan: Pro
  • Run ID: 22372e1f-3910-42fd-8b3b-19331ec9f423

📥 Commits
  • Reviewing files that changed from the base of the PR and between 8365e29 and 9c7bbfd.

📒 Files selected for processing (2)
  • src/logger.ts
  • src/proxy.ts

📝 Walkthrough

The changes introduce output token tracking alongside existing input token tracking and refactor cost calculations to use actual token counts from provider responses instead of estimates based on body length. A 20% buffer previously applied to cost estimates is removed. The logging payload is extended to include output token data when available.

Changes

Type Definition Updates (src/logger.ts)
  • Adds an optional outputTokens?: number field to the UsageEntry type to support logging of provider-reported completion tokens.

Token Tracking & Cost Calculation (src/proxy.ts)
  • Tracks responseOutputTokens from upstream completion_tokens data in both streaming and non-streaming response paths.
  • Replaces cost estimation based on body length with calculations derived from actual token counts (actualInputTokens, actualOutputTokens).
  • Removes the 20% cost buffer and uses exact calculated values.
  • Updates the logging payload to include outputTokens when available.
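For the streaming path, the capture can be sketched roughly as below, assuming the provider emits an OpenAI-style final SSE chunk carrying a usage object; the parsing details are illustrative, not the project's actual code:

```typescript
// Sketch of pulling completion_tokens out of a streaming (SSE) response.
// Assumes OpenAI-compatible framing: "data: <json>" lines, terminated by
// "data: [DONE]", with usage reported in a late chunk.
function extractOutputTokens(sseLines: string[]): number | undefined {
  let outputTokens: number | undefined;
  for (const line of sseLines) {
    if (!line.startsWith("data: ") || line === "data: [DONE]") continue;
    try {
      const chunk = JSON.parse(line.slice("data: ".length));
      if (typeof chunk?.usage?.completion_tokens === "number") {
        outputTokens = chunk.usage.completion_tokens;
      }
    } catch {
      // Ignore keep-alive comments and partial or malformed lines.
    }
  }
  return outputTokens;
}
```

Returning undefined when no chunk carries usage data is what triggers the estimate fallback in the non-streaming logic.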

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks (5 passed)

  • Description Check: Passed. Check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check: Passed. The title clearly summarizes the main change: replacing cost estimates with actual token counts from upstream responses.
  • Linked Issues Check: Passed. The PR directly addresses issue #36 by capturing actual completion tokens and removing the pre-payment buffer from logged costs, fixing the dashboard/on-chain cost mismatch.
  • Out of Scope Changes Check: Passed. All changes are scoped to the /stats cost calculation fix: adding outputTokens to UsageEntry and updating the cost logic in proxy.ts.
  • Docstring Coverage: Passed. Coverage is 100.00%, above the 80.00% threshold.


1bcMax (Member) left a comment:

Fix is correct and well-scoped. Both issues accurately diagnosed: maxTokens ≠ completion_tokens, and the 1.2x buffer belongs only in pre-payment estimation, not in logged costs. Pre-payment path (estimateAmount, BALANCE_CHECK_BUFFER) correctly left untouched. No conflicts with recent main. Merging.

@1bcMax 1bcMax merged commit 993fe1e into BlockRunAI:main Mar 18, 2026
1 check passed


Development

Successfully merging this pull request may close these issues.

/stats Expense Calculation Mismatch

2 participants