v1.3.1 - Detailed token usage: cache + reasoning

@Mandark-droid Mandark-droid released this 12 May 09:57
· 3 commits to main since this release

Highlights

Patch release that adds detailed token-usage span attributes for cache and reasoning tokens. The attribute names are aligned with upstream, ahead of semantic-conventions-genai#76 (detailed token usage) and its metric-dimension counterpart, PR #96.

What's new

Detailed token-usage span attributes

When the provider supplies the corresponding data, LLM-call spans now emit three additional attributes, all of which use the upstream-canonical names already in the gen-ai registry:

| Attribute | Provider source | Upstream registry status |
| --- | --- | --- |
| gen_ai.usage.cache_read.input_tokens | Anthropic usage.cache_read_input_tokens · OpenAI usage.prompt_tokens_details.cached_tokens | Stable in registry; aligned with PR #96 mapping |
| gen_ai.usage.cache_creation.input_tokens | Anthropic usage.cache_creation_input_tokens | Stable in registry; aligned with PR #96 mapping |
| gen_ai.usage.reasoning.output_tokens | OpenAI usage.completion_tokens_details.reasoning_tokens (o1 / o3) | Development stability in registry since semantic-conventions#3194, migrated to semantic-conventions-genai |

Anthropic prompt-caching dashboard

gen_ai.usage.prompt_tokens                = 4096
gen_ai.usage.completion_tokens            = 256
gen_ai.usage.total_tokens                 = 4352
gen_ai.usage.cache_creation.input_tokens  = 1024    # cache write
gen_ai.usage.cache_read.input_tokens      = 3072    # cache hit

OpenAI o1/o3 reasoning visibility

gen_ai.usage.prompt_tokens                 = 200
gen_ai.usage.completion_tokens             = 1500
gen_ai.usage.total_tokens                  = 1700
gen_ai.usage.reasoning.output_tokens       = 1200   # hidden reasoning tokens
gen_ai.usage.cache_read.input_tokens       = 180    # OpenAI prompt cache hit

Zero / missing values are skipped, so non-cache / non-reasoning spans aren't littered with = 0 attributes.
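
The skip rule can be sketched roughly as follows. This is a minimal illustration, not the library's actual code: the helper name detailed_usage_attributes and the usage-dict key reasoning_tokens are assumptions made for the example, while the attribute names and the two cache keys are taken from these notes.

```python
from typing import Any, Dict, Mapping

# Usage-dict key -> span attribute name.
# "reasoning_tokens" is an assumed key name, used here for illustration only.
DETAILED_USAGE_ATTRIBUTES = {
    "cache_read_input_tokens": "gen_ai.usage.cache_read.input_tokens",
    "cache_creation_input_tokens": "gen_ai.usage.cache_creation.input_tokens",
    "reasoning_tokens": "gen_ai.usage.reasoning.output_tokens",
}


def detailed_usage_attributes(usage: Mapping[str, Any]) -> Dict[str, int]:
    """Return only the detailed attributes that carry a non-zero value."""
    attributes: Dict[str, int] = {}
    for usage_key, attribute_name in DETAILED_USAGE_ATTRIBUTES.items():
        value = usage.get(usage_key)
        if value:  # skips missing keys, None and 0
            attributes[attribute_name] = int(value)
    return attributes


# A cache hit with no cache write and no reasoning tokens yields one attribute.
print(detailed_usage_attributes(
    {"cache_read_input_tokens": 3072, "cache_creation_input_tokens": 0}
))
```

The returned dict could then be applied to an OpenTelemetry span with one span.set_attribute(name, value) call per entry, so a span without cache or reasoning data gains no extra attributes.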

Implementation note

  • anthropic_instrumentor has extracted cache_read_input_tokens and cache_creation_input_tokens into the usage dict for cost calculation since earlier releases. This release also surfaces them as span attributes.
  • openai_instrumentor._extract_usage now also pulls prompt_tokens_details.cached_tokens (OpenAI prompt-cache hits) into the same canonical cache_read_input_tokens key, so the span-side attribute is uniform across providers.
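
The OpenAI-side normalization described above might look something like the sketch below. It is a hypothetical stand-in for openai_instrumentor._extract_usage, not its real body: it assumes the usage payload has already been converted to a plain dict, and the reasoning_tokens output key is an assumed name. Only the nested OpenAI field names and the canonical cache_read_input_tokens key come from these notes.

```python
from typing import Any, Dict, Mapping


def extract_openai_usage(usage: Mapping[str, Any]) -> Dict[str, int]:
    """Flatten an OpenAI-style usage payload into provider-neutral keys."""
    normalized: Dict[str, int] = {
        "prompt_tokens": usage.get("prompt_tokens") or 0,
        "completion_tokens": usage.get("completion_tokens") or 0,
        "total_tokens": usage.get("total_tokens") or 0,
    }

    # Prompt-cache hits are stored under the same key the Anthropic path uses.
    prompt_details = usage.get("prompt_tokens_details") or {}
    cached = prompt_details.get("cached_tokens")
    if cached:
        normalized["cache_read_input_tokens"] = cached

    # Reasoning tokens (o1 / o3); the output key name is an assumption.
    completion_details = usage.get("completion_tokens_details") or {}
    reasoning = completion_details.get("reasoning_tokens")
    if reasoning:
        normalized["reasoning_tokens"] = reasoning

    return normalized


openai_usage = {
    "prompt_tokens": 200,
    "completion_tokens": 1500,
    "total_tokens": 1700,
    "prompt_tokens_details": {"cached_tokens": 180},
    "completion_tokens_details": {"reasoning_tokens": 1200},
}
print(extract_openai_usage(openai_usage))
```

Writing both providers into one key means the span-attribute code needs a single lookup rather than a per-provider branch.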

Compatibility

  • API surface: no Python API changes.
  • Wire format: purely additive. The three new attributes are only set when the provider returns the corresponding usage detail; existing attributes are unchanged.
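
Because the new attributes are absent rather than zero when the provider reports no detail, code that consumes spans should probably treat a missing key as 0. A minimal consumer-side sketch, assuming span attributes are available as a plain mapping (the function name reasoning_share is hypothetical):

```python
from typing import Any, Mapping


def reasoning_share(attributes: Mapping[str, Any]) -> float:
    """Fraction of completion tokens spent on hidden reasoning."""
    completion = attributes.get("gen_ai.usage.completion_tokens", 0)
    # Absent on non-reasoning spans, so default to 0 instead of raising.
    reasoning = attributes.get("gen_ai.usage.reasoning.output_tokens", 0)
    return reasoning / completion if completion else 0.0


o1_span = {
    "gen_ai.usage.completion_tokens": 1500,
    "gen_ai.usage.reasoning.output_tokens": 1200,
}
older_span = {"gen_ai.usage.completion_tokens": 256}

print(reasoning_share(o1_span))
print(reasoning_share(older_span))
```

Spans recorded before this release, or from models that report no reasoning tokens, pass through the same code path unchanged.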

Tests

86/86 base + openai + anthropic tests pass, including two new tests covering presence and zero-value-skip behaviour.

Commit list

  • a466ed7 feat: emit detailed token-usage attributes (cache, reasoning) per semconv-genai#76
  • 1822d5f fix: rename reasoning token attribute to upstream-standardised name
  • 012feaf chore: cut v1.3.1