What happened?
https://docs.litellm.ai/docs/completion/prompt_caching says:
"prompt_tokens: These are the non-cached prompt tokens"
but in reality (tested with litellm 1.79.0), this field does include cached prompt tokens.
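A minimal way to observe this (the model name and prompt are illustrative; field names follow the OpenAI-style usage object):

```python
import litellm

# Illustrative repro sketch: any prompt long enough to be cached on a repeat call.
response = litellm.completion(
    model="openai/gpt-5",
    messages=[{"role": "user", "content": "<a prompt long enough to trigger caching on a repeat call>"}],
)

usage = response.usage
print(usage.prompt_tokens)                        # in practice: total prompt tokens, cached included
print(usage.prompt_tokens_details.cached_tokens)  # cache-hit portion of the prompt
print(litellm.completion_cost(completion_response=response))
```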
Example from a litellm response (after using litellm completion_cost to compute the cost) using openai/gpt-5:
"metrics": {
"prompt_tokens": 9126,
"completion_tokens": 3197,
"cached_tokens": 4864,
"cost_usd": 0.03790550000000001
}
This only aligns with gpt-5 pricing if prompt_tokens also includes cached_tokens.
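Assuming gpt-5 list pricing of $1.25 per 1M input tokens, $0.125 per 1M cached input tokens, and $10 per 1M output tokens, the reported cost only works out if prompt_tokens is the gross total including the cached portion:

(9126 - 4864) * 1.25/1e6 + 4864 * 0.125/1e6 + 3197 * 10/1e6
= 0.0053275 + 0.000608 + 0.03197
= 0.0379055  (matches cost_usd)

If prompt_tokens excluded the cached tokens, the same formula would give 9126 * 1.25/1e6 + 0.000608 + 0.03197 ≈ 0.0439855, which does not match.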
This test also implies that prompt_tokens includes cached_tokens:
litellm/tests/local_testing/test_prompt_caching.py, lines 13 to 17 at c0890e7:

    def _usage_format_tests(usage: litellm.Usage):
        """
        OpenAI prompt caching
        - prompt_tokens = sum of non-cache hit tokens + cache-hit tokens
        - total_tokens = prompt_tokens + completion_tokens
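A hedged restatement of that invariant as code (illustrative only, not the actual test body; field names follow the OpenAI-style usage object):

```python
import litellm

# Illustrative restatement of the docstring's invariant, not the real test code:
def check_prompt_caching_usage(usage: litellm.Usage) -> None:
    cached = usage.prompt_tokens_details.cached_tokens or 0
    # cache-hit tokens are counted inside prompt_tokens, not on top of it
    assert usage.prompt_tokens >= cached
    assert usage.total_tokens == usage.prompt_tokens + usage.completion_tokens
```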
This code also implies prompt_tokens includes cached tokens:
litellm/litellm/litellm_core_utils/llm_cost_calc/utils.py, lines 535 to 542 at c0890e7:

    if prompt_tokens_details["text_tokens"] == 0:
        text_tokens = (
            usage.prompt_tokens
            - prompt_tokens_details["cache_hit_tokens"]
            - prompt_tokens_details["audio_tokens"]
            - prompt_tokens_details["cache_creation_tokens"]
        )
        prompt_tokens_details["text_tokens"] = text_tokens
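Here text_tokens (the portion billed at the regular input rate) is derived by subtracting the cache-hit, audio, and cache-creation counts from usage.prompt_tokens, which only makes sense if prompt_tokens already includes them. Under that reading the cost split is roughly the following (a sketch; the key names input_cost_per_token, cache_read_input_token_cost, and output_cost_per_token follow LiteLLM's model pricing config, and the audio/cache-creation terms are omitted for brevity):

```python
# Rough sketch of the resulting cost split (illustrative, not the exact LiteLLM code path):
def estimate_cost(usage, input_cost_per_token, cache_read_input_token_cost, output_cost_per_token):
    cached = usage.prompt_tokens_details.cached_tokens or 0
    text = usage.prompt_tokens - cached  # non-negative only if prompt_tokens is the gross total
    return (
        text * input_cost_per_token
        + cached * cache_read_input_token_cost
        + usage.completion_tokens * output_cost_per_token
    )
```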
So it seems the documentation is wrong: prompt_tokens is the total prompt token count, with cached_tokens counted inside it, not the non-cached count.
Relevant log output
Are you a ML Ops Team?
Yes
What LiteLLM version are you on?
v1.79.0
Twitter / LinkedIn details
No response