
[Bug]: prompt_tokens incorrectly documented as non-cached tokens only #15945

@li-boxuan

Description

What happened?

https://docs.litellm.ai/docs/completion/prompt_caching says

prompt_tokens: These are the non-cached prompt tokens

but in reality (using litellm 1.79.0), this field does include cached prompt tokens.
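
This is straightforward to observe (a minimal sketch; it assumes OPENAI_API_KEY is set, a shared prompt prefix of at least 1024 tokens so OpenAI's automatic prompt caching kicks in, and the OpenAI-style usage fields that litellm mirrors; the prompt below is a placeholder):

    import litellm

    # Placeholder prompt: a real repro needs a shared prefix of >= 1024 tokens.
    messages = [{"role": "user", "content": "<long shared prefix> What is the answer?"}]

    # First call warms the cache; the second call should report cache hits.
    litellm.completion(model="openai/gpt-5", messages=messages)
    resp = litellm.completion(model="openai/gpt-5", messages=messages)

    usage = resp.usage
    print(usage.prompt_tokens)                        # includes the cached tokens
    print(usage.prompt_tokens_details.cached_tokens)  # cached portion only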

Example metrics from a litellm response for openai/gpt-5 (cost computed with litellm's completion_cost):

      "metrics": {
        "prompt_tokens": 9126,
        "completion_tokens": 3197,
        "cached_tokens": 4864,
        "cost_usd": 0.03790550000000001
      }

This only aligns with gpt-5 pricing if prompt_tokens also includes cached_tokens.
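
As a sanity check (a sketch under assumed gpt-5 list pricing of $1.25 per 1M input tokens, $0.125 per 1M cached input tokens, and $10 per 1M output tokens), the reported cost_usd only works out if the 4,864 cached_tokens are counted inside the 9,126 prompt_tokens:

    # Assumed gpt-5 list pricing (USD per token).
    INPUT, CACHED_INPUT, OUTPUT = 1.25e-06, 0.125e-06, 10e-06

    prompt_tokens, completion_tokens, cached_tokens = 9126, 3197, 4864

    # Treat prompt_tokens as the TOTAL (cached + non-cached) prompt count.
    non_cached = prompt_tokens - cached_tokens  # 4262
    cost = non_cached * INPUT + cached_tokens * CACHED_INPUT + completion_tokens * OUTPUT
    print(cost)  # 0.0379055, matching the reported cost_usd

    # If prompt_tokens were non-cached tokens only, the cost would be ~0.0440 instead.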

This test also implies that prompt_tokens includes cached_tokens:

    def _usage_format_tests(usage: litellm.Usage):
        """
        OpenAI prompt caching
        - prompt_tokens = sum of non-cache hit tokens + cache-hit tokens
        - total_tokens = prompt_tokens + completion_tokens
        """

This code also implies that prompt_tokens includes cached_tokens:

    if prompt_tokens_details["text_tokens"] == 0:
        text_tokens = (
            usage.prompt_tokens
            - prompt_tokens_details["cache_hit_tokens"]
            - prompt_tokens_details["audio_tokens"]
            - prompt_tokens_details["cache_creation_tokens"]
        )
        prompt_tokens_details["text_tokens"] = text_tokens
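
For callers who actually want the documented "non-cached" number, a hedged sketch (the helper name is hypothetical; it assumes prompt_tokens already includes cached tokens, as observed above, and that prompt_tokens_details.cached_tokens is populated):

    def non_cached_prompt_tokens(usage) -> int:
        """Prompt tokens that were NOT served from the cache.

        Assumes usage.prompt_tokens includes cached tokens (observed behavior)
        and that usage.prompt_tokens_details.cached_tokens is set for OpenAI models.
        """
        details = getattr(usage, "prompt_tokens_details", None)
        cached = getattr(details, "cached_tokens", 0) or 0
        return usage.prompt_tokens - cached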

So it seems the documentation is wrong: in practice, prompt_tokens is the total prompt token count, including cached tokens.

Relevant log output

Are you an ML Ops Team?

Yes

What LiteLLM version are you on?

v1.79.0

Twitter / LinkedIn details

No response
