Fix usage & billing for custom model aliases and cached/reasoning tokens #4222
|
@rekram1-node, we fixed the same bug, but this also fixes the spent calculations. |
|
@melihmucuk can u explain the cached input stuff? Cached input isn't free |
|
@rekram1-node Yes, cached input isn’t free. The SDK returns total input tokens, but we need to know how many of those were uncached to calculate For example, suppose the SDK reports 1M input tokens: 100K |
|
okay cool, I think I misread ur original comment which got me confused so apologies |
|
@melihmucuk can u resolve the conflict, I think ur changes are correct for the input tokens |
|
@rekram1-node done. |
|
@rekram1-node ok for revert, but definitely add reasoning token cost to total cost. Almost all providers calculate the reasoning tokens as output tokens. |
|
Yeah for sure, I'll add it in
|
I'll merge this in, sorry, got distracted
…ens (anomalyco#4222) Co-authored-by: Melih Mucuk <melih@monkeysteam.com> Co-authored-by: Aiden Cline <aidenpcline@gmail.com>
Summary:
Changed files:
parsed.models[model.id] for aliases.
getUsage: compute uncachedInput = inputTokens - cachedInputTokens, bill cached/uncached input separately, include cache.write, and charge reasoning tokens at output rate.
Tests: ran session tests locally (2 passed).
Fixes #4162
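
For readers following the getUsage change described in the summary, here is a minimal sketch of the billing arithmetic. It is not the code from this PR. The `TokenUsage` and `ModelCost` shapes, the `estimateCost` name, the per-million-token pricing unit, and the example prices are all assumptions made for illustration. It also assumes `outputTokens` does not already include reasoning tokens, which may differ by provider or SDK.

```typescript
// Hypothetical shapes for illustration only; not the project's actual types.
interface TokenUsage {
  inputTokens: number;       // total input tokens as reported, cached ones included
  cachedInputTokens: number; // subset of inputTokens that was read from cache
  cacheWriteTokens: number;  // tokens written to the cache
  outputTokens: number;      // assumed here to exclude reasoning tokens
  reasoningTokens: number;
}

// Assumed pricing unit: USD per one million tokens.
interface ModelCost {
  input: number;
  output: number;
  cacheRead: number;
  cacheWrite: number;
}

function estimateCost(usage: TokenUsage, cost: ModelCost): number {
  // The reported input total includes cached tokens, so subtract them to
  // avoid billing the same tokens at both the full and the cached rate.
  // Clamp at zero in case a provider reports more cached than total tokens.
  const uncachedInput = Math.max(0, usage.inputTokens - usage.cachedInputTokens);

  const scaled =
    uncachedInput * cost.input +
    usage.cachedInputTokens * cost.cacheRead +
    usage.cacheWriteTokens * cost.cacheWrite +
    // Reasoning tokens are charged at the output rate.
    (usage.outputTokens + usage.reasoningTokens) * cost.output;

  return scaled / 1_000_000;
}

// Made-up numbers: 1M input tokens, of which 400K were cache reads.
const usage: TokenUsage = {
  inputTokens: 1_000_000,
  cachedInputTokens: 400_000,
  cacheWriteTokens: 50_000,
  outputTokens: 20_000,
  reasoningTokens: 5_000,
};
const cost: ModelCost = { input: 2, output: 8, cacheRead: 0.5, cacheWrite: 2.5 };

const total = estimateCost(usage, cost);
console.log(total.toFixed(3));
```

With these made-up prices, the 600K uncached tokens are billed at the full input rate and the 400K cached tokens at the cheaper cache-read rate, so cached input is neither free nor counted twice.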