You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Anthropic prompt caching.splitPrompt translates structured system
messages with cache: '5m' | '1h' markers into Anthropic's typed cache_control: { type: 'ephemeral', ttl?: '1h' } blocks. The adapter
now extracts cache_creation_input_tokens and cache_read_input_tokens
from message_start.usage and surfaces them on AnswerResult and via costFor.
OpenAI / chat-completions cache reporting. OpenAI Responses,
cerebras, fireworks, and xAI adapters extract prompt_tokens_details.cached_tokens and convert the OpenAI "subset
of prompt_tokens" convention into mohdel's additive cacheReadInputTokens field. Caller code sees one consistent shape
regardless of provider semantics.
AnswerResult.cacheWriteInputTokens / cacheReadInputTokens
added to the type definition. Symmetric write/read pair matching
catalog cacheWritePrice/cacheReadPrice; Anthropic's cache_creation_input_tokens is normalized into cacheWriteInputTokens
at the adapter boundary.
computeCost honours cacheWritePrice and cacheReadPrice in
catalog specs, falling back to inputPrice when absent so non-caching
providers degrade gracefully. Pricing is additive: i*ip + cw*cwp + cr*crp + o*op + t*tp.