feat: Improve lib with cachedTokens and reasoningTokens#1
Conversation
| prompt_rate = pricing["prompt"] | ||
| completion_rate = pricing["completion"] | ||
| cache_read_rate = pricing.get("cache_read", prompt_rate) | ||
| cache_creation_rate = pricing.get("cache_creation", prompt_rate) |
There was a problem hiding this comment.
[CORRECTNESS] Reported reasoning token count can differ from billed count.
Why: Billing clamps with effective_reasoning = min(reasoning_tokens, output_tokens) but CostResult.reasoning_tokens keeps the original value, so CLI output can over-report reasoning usage.
Suggestion: Store/report the clamped reasoning value (or include both raw and billed values with clear labels) so displayed usage matches billed usage.
There was a problem hiding this comment.
Technically valid observation, but I'd push back on the fix. The clamping (min(reasoning_tokens, output_tokens)) is a defensive guard, not an expected case. If reasoning > output, the caller passed bad data. Reporting the raw value in CostResult is more useful for debugging — the caller can see their own input. Silently changing it would be confusing. The total_cost_usd is always correct regardless.
|
|
||
| results["tokencost"] = _tokencost_source(model, input_tokens, output_tokens) | ||
|
|
||
| # Preserve order: litellm -> openrouter -> tokencost |
There was a problem hiding this comment.
[BEHAVIOR] tokencost path silently drops cache and reasoning token inputs.
Why: The tokencost branch only calls _tokencost_source(model, input_tokens, output_tokens), so cached, cache-creation, and reasoning tokens are ignored while other sources include them.
Suggestion: Either include equivalent token components in tokencost calculations or emit an explicit warning/flag that those token types are excluded for that source.
There was a problem hiding this comment.
Already addressed. We added a comment in calculator.py:107-110 explaining this, and a note in the README. The tokencost library simply doesn't have these fields — we can't invent pricing it doesn't provide. Adding a warning/flag would add complexity for a known limitation that's already documented.
robertomier
left a comment
There was a problem hiding this comment.
Compatibilidad/API: al introducir * antes de source en calculate_cost, el 4o argumento posicional deja de funcionar. Impacto: llamadas existentes como calculate_cost(m, i, o, "openrouter") pasan a fallar con TypeError. Sugerencia: manten source como cuarto parametro posicional y mueve los nuevos campos a keyword-only (..., output_tokens, source="litellm", *, cached_input_tokens=0, ...). El resto del ajuste de formula y documentacion va en buena direccion.
No description provided.