v1.3.1 - Detailed token usage: cache + reasoning
Highlights
Patch release adding upstream-aligned detailed token-usage span attributes for cache and reasoning, ready ahead of semantic-conventions-genai#76 (detailed token usage) and the metric-dimension counterpart at PR #96.
What's new
Detailed token-usage span attributes
When the provider supplies the corresponding data, LLM-call spans now emit three additional attributes — all matching the upstream-canonical names already in the gen-ai registry:
| Attribute | Provider source | Upstream registry status |
|---|---|---|
| `gen_ai.usage.cache_read.input_tokens` | Anthropic `usage.cache_read_input_tokens` · OpenAI `usage.prompt_tokens_details.cached_tokens` | Stable in registry; aligned with PR #96 mapping |
| `gen_ai.usage.cache_creation.input_tokens` | Anthropic `usage.cache_creation_input_tokens` | Stable in registry; aligned with PR #96 mapping |
| `gen_ai.usage.reasoning.output_tokens` | OpenAI `usage.completion_tokens_details.reasoning_tokens` (o1 / o3) | Development stability in registry since semantic-conventions#3194, migrated to semantic-conventions-genai |
Anthropic prompt-caching dashboard
gen_ai.usage.prompt_tokens = 4096
gen_ai.usage.completion_tokens = 256
gen_ai.usage.total_tokens = 4352
gen_ai.usage.cache_creation.input_tokens = 1024 # cache write
gen_ai.usage.cache_read.input_tokens = 3072 # cache hit
OpenAI o1/o3 reasoning visibility
gen_ai.usage.prompt_tokens = 200
gen_ai.usage.completion_tokens = 1500
gen_ai.usage.total_tokens = 1700
gen_ai.usage.reasoning.output_tokens = 1200 # hidden reasoning tokens
gen_ai.usage.cache_read.input_tokens = 180 # OpenAI prompt cache hit
Zero / missing values are skipped, so non-cache / non-reasoning spans aren't littered with = 0 attributes.
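The skip rule can be illustrated with a minimal sketch. It is not the library's actual code: the helper name `detailed_usage_attributes`, the mapping constant, and the `reasoning_tokens` usage-dict key are assumptions made for illustration. Only the span attribute names and the two cache keys come from these notes.

```python
# Hypothetical mapping from usage-dict keys to span attribute names.
# The "reasoning_tokens" key is an assumption; the notes do not name it.
DETAILED_USAGE_ATTRIBUTES = {
    "cache_read_input_tokens": "gen_ai.usage.cache_read.input_tokens",
    "cache_creation_input_tokens": "gen_ai.usage.cache_creation.input_tokens",
    "reasoning_tokens": "gen_ai.usage.reasoning.output_tokens",
}


def detailed_usage_attributes(usage: dict) -> dict:
    """Return only the detailed attributes that carry a positive count."""
    attrs = {}
    for key, attr_name in DETAILED_USAGE_ATTRIBUTES.items():
        value = usage.get(key)
        # Missing keys, None, and 0 are all skipped.
        if isinstance(value, int) and value > 0:
            attrs[attr_name] = value
    return attrs


# A cache hit with no cache write and no reasoning tokens
# yields a single attribute.
attrs = detailed_usage_attributes(
    {"cache_read_input_tokens": 3072, "cache_creation_input_tokens": 0}
)
```

With an OpenTelemetry span in hand, the resulting dict could then be passed to `span.set_attributes(attrs)`.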
Implementation note
`anthropic_instrumentor` already extracted `cache_read_input_tokens` and `cache_creation_input_tokens` into the usage dict for cost calculation in earlier releases. This release surfaces them as span attributes. `openai_instrumentor._extract_usage` now also pulls `prompt_tokens_details.cached_tokens` (OpenAI prompt-cache hits) into the same canonical `cache_read_input_tokens` key, so the span-side attribute is uniform across providers.
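As a rough sketch of the normalization described above, the function below folds an OpenAI-style usage payload into canonical keys. It is an assumption-laden illustration, not the real `_extract_usage`: the name `extract_usage_sketch` is hypothetical, the `reasoning_tokens` output key is a guess, and the payload is modelled as plain dicts even though the OpenAI SDK returns objects.

```python
from typing import Any, Mapping


def extract_usage_sketch(usage: Mapping[str, Any]) -> dict:
    """Normalize an OpenAI-style usage payload (as dicts) into canonical keys."""
    result = {
        "prompt_tokens": usage.get("prompt_tokens", 0),
        "completion_tokens": usage.get("completion_tokens", 0),
    }

    # The detail sub-objects may be absent or None on older responses.
    prompt_details = usage.get("prompt_tokens_details") or {}
    completion_details = usage.get("completion_tokens_details") or {}

    # OpenAI prompt-cache hits share the key Anthropic cache reads use.
    cached = prompt_details.get("cached_tokens")
    if cached:
        result["cache_read_input_tokens"] = cached

    # Assumed key name for hidden reasoning tokens.
    reasoning = completion_details.get("reasoning_tokens")
    if reasoning:
        result["reasoning_tokens"] = reasoning

    return result


normalized = extract_usage_sketch(
    {
        "prompt_tokens": 200,
        "completion_tokens": 1500,
        "prompt_tokens_details": {"cached_tokens": 180},
        "completion_tokens_details": {"reasoning_tokens": 1200},
    }
)
```

Keeping one canonical key per concept means the span-attribute code does not need to know which provider produced the usage data.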
Compatibility
- API surface: no Python API changes.
- Wire format: purely additive. The three new attributes are only set when the provider returns the corresponding usage detail; existing attributes are unchanged.
Tests
86/86 base + openai + anthropic tests pass, including two new tests covering presence and zero-value-skip behaviour.