mohdel v0.114.0

Latest

Latest

clbrge released this 14 Jun 11:49

c5e0e31

Added

Anthropic per-TTL cache-creation accounting. The streaming adapter now reads
usage.cache_creation.ephemeral_1h_input_tokens and exposes
cacheWrite1hInputTokens on the done result. computeCost prices the 1h write
tier via a new optional spec field cacheWrite1hPrice; the 5m portion
(cacheWriteInputTokens − cacheWrite1hInputTokens) stays on cacheWritePrice.
A request may mix TTLs, so both tiers can be non-zero.
Gemini implicit-cache read accounting. The adapter now reads
usageMetadata.cachedContentTokenCount and exposes cacheReadInputTokens on
the done result, priced via the existing cacheReadPrice spec field.

Fixed

Gemini cached input was billed at full inputPrice. Cached tokens are now
subtracted from inputTokens (subset convention, matching the OpenAI adapter)
and priced at cacheReadPrice, so reported cost reflects the provider's
cached-token discount.
Anthropic 1h cache writes were priced at the 5m rate. They now bill at
cacheWrite1hPrice when the spec provides it.

Notes

Additive and backward-compatible. When cacheWrite1hInputTokens is absent or a
spec omits cacheWrite1hPrice, the whole write bills at cacheWritePrice —
prior behaviour. Realizing the corrected rates requires catalog specs to carry
cacheWrite1hPrice (Anthropic) and cacheReadPrice (Gemini); without them the
new fields are observability-only and cost is unchanged.

Assets 2