Problem
The May 21 telemetry triage flagged "Anthropic tokens_input=0 cost telemetry broken" across 54k Sonnet generations. Investigation shows the field is working correctly — but the dashboard semantics are unintuitive enough that the analyst (and likely others) read it as a bug.
Root cause (it's a measurement-semantics issue)
tokens_input is normalized across providers to mean uncached input tokens only:
- Anthropic returns this directly (their
inputTokens excludes cache).
- OpenAI/OpenRouter return inclusive totals;
getUsage subtracts cache_read/cache_write to derive the uncached portion.
On a full cache hit, inputTokens=0 is correct — the model literally processed zero new input tokens. The cached input was served from tokens_cache_read. The 54k "broken" generations were Anthropic cache hits where this is the expected value.
The companion field tokens_input_total carries the inclusive total (uncached + cache_read + cache_write) — the value the dashboard actually wanted. But it was emitted conditionally: only when it differed from tokens_input. For non-cache-using providers it was absent, which made dashboard queries that assumed presence yield nulls.
Fix
Two structural changes to remove ambiguity:
- Always emit
tokens_input_total (remove the conditional). Cost: ~12 bytes per generation event, negligible. Benefit: dashboards can query the field unconditionally; no null-coalesce needed.
- Tighten the schema from
tokens_input_total?: number to tokens_input_total: number. Type now reflects the always-present contract.
- Document the semantics in a doc-block above the
generation event type. Future query-writers should read it before assuming what tokens_input means.
Plus 7 new unit tests pinning the cross-provider normalization invariant: tokens.input + tokens.cache.read + tokens.cache.write === tokens.inputTotal.
Impact
- Dashboard queries that previously returned null for
tokens_input_total on non-cache-using providers now return a real number.
- The "Anthropic tokens_input=0" false-positive disappears once dashboards switch to
tokens_input_total for input volume / cost analysis.
- No behavioral change for cost calculation (cost was already computed correctly using the components).
Problem
The May 21 telemetry triage flagged "Anthropic tokens_input=0 cost telemetry broken" across 54k Sonnet generations. Investigation shows the field is working correctly — but the dashboard semantics are unintuitive enough that the analyst (and likely others) read it as a bug.
Root cause (it's a measurement-semantics issue)
tokens_inputis normalized across providers to mean uncached input tokens only:inputTokensexcludes cache).getUsagesubtracts cache_read/cache_write to derive the uncached portion.On a full cache hit,
inputTokens=0is correct — the model literally processed zero new input tokens. The cached input was served fromtokens_cache_read. The 54k "broken" generations were Anthropic cache hits where this is the expected value.The companion field
tokens_input_totalcarries the inclusive total (uncached + cache_read + cache_write) — the value the dashboard actually wanted. But it was emitted conditionally: only when it differed fromtokens_input. For non-cache-using providers it was absent, which made dashboard queries that assumed presence yield nulls.Fix
Two structural changes to remove ambiguity:
tokens_input_total(remove the conditional). Cost: ~12 bytes per generation event, negligible. Benefit: dashboards can query the field unconditionally; no null-coalesce needed.tokens_input_total?: numbertotokens_input_total: number. Type now reflects the always-present contract.generationevent type. Future query-writers should read it before assuming whattokens_inputmeans.Plus 7 new unit tests pinning the cross-provider normalization invariant:
tokens.input + tokens.cache.read + tokens.cache.write === tokens.inputTotal.Impact
tokens_input_totalon non-cache-using providers now return a real number.tokens_input_totalfor input volume / cost analysis.