feat(cost): split cache creation pricing by 5m/1h duration #1221
Claude Code JSONL includes a `cache_creation` breakdown with `ephemeral_5m_input_tokens` and `ephemeral_1h_input_tokens`. Previously all cache creation tokens were priced at the 5-minute rate (1.25x base input), but Claude Code predominantly uses 1-hour caching (2x base input).

- Parse `cache_creation.ephemeral_5m_input_tokens` and `ephemeral_1h_input_tokens` from JSONL usage records
- Price 5m tokens at the existing `cache_create` rate, 1h tokens at 2x base input (matching Anthropic pricing)
- Fall back to flat `cache_creation_input_tokens` when the breakdown is absent (older records, non-Claude agents)
- Fix token count aggregation to include breakdown tokens

Closes #899
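The pricing rule described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the `Rates` and `CacheCreation` types, the `cache_creation_cost` function, and the constant name are hypothetical, and the example rates in `main` are assumed values chosen only to make the arithmetic concrete.

```rust
/// Hypothetical per-token rates in USD; not ccusage's actual pricing type.
#[derive(Debug, Clone, Copy)]
struct Rates {
    /// Base input rate per token.
    input: f64,
    /// 5-minute cache write rate per token (1.25x base input).
    cache_create: f64,
}

/// 1-hour cache writes are priced at 2x base input, per the PR description.
const CACHE_CREATE_1H_MULTIPLIER: f64 = 2.0;

/// Optional 5m/1h breakdown found under `usage.cache_creation`.
#[derive(Debug, Clone, Copy)]
struct CacheCreation {
    ephemeral_5m_input_tokens: u64,
    ephemeral_1h_input_tokens: u64,
}

/// Cost of cache creation tokens: split rates when the breakdown exists,
/// otherwise the flat `cache_creation_input_tokens` count at the 5m rate.
fn cache_creation_cost(flat_tokens: u64, breakdown: Option<CacheCreation>, rates: Rates) -> f64 {
    match breakdown {
        Some(b) => {
            b.ephemeral_5m_input_tokens as f64 * rates.cache_create
                + b.ephemeral_1h_input_tokens as f64 * rates.input * CACHE_CREATE_1H_MULTIPLIER
        }
        None => flat_tokens as f64 * rates.cache_create,
    }
}

fn main() {
    // Assumed example rates: $3 per million input tokens, 1.25x for 5m cache writes.
    let rates = Rates { input: 3.0e-6, cache_create: 3.75e-6 };
    let breakdown = CacheCreation {
        ephemeral_5m_input_tokens: 0,
        ephemeral_1h_input_tokens: 23_566,
    };
    println!("with breakdown: {:.6}", cache_creation_cost(23_566, Some(breakdown), rates));
    println!("flat fallback:  {:.6}", cache_creation_cost(23_566, None, rates));
}
```

Keeping the 1-hour multiplier in a single named constant means a future pricing change touches one place, and the `None` arm preserves the old behavior for records that lack the breakdown.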
@coderabbitai review
Deploying with

| Status | Name | Latest Commit | Preview URL | Updated (UTC) |
|---|---|---|---|---|
| ✅ Deployment successful! View logs | ccusage-guide | 2d20874 | Commit Preview URL, Branch Preview URL | Jun 08 2026, 09:13 PM |

@coderabbitai review it
Tip: For best results, initiate chat on the files or code changes.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Get the PR diff
gh pr diff 1221 --repo ryoppippi/ccusage
echo "=== PR FILES CHANGED ==="
gh pr view 1221 --repo ryoppippi/ccusage --json files --jq '.files[].path'

Length of output: 32147

🏁 Script executed:

#!/bin/bash
# Check if TokenUsageRaw derives Copy and if it's serialized anywhere
echo "=== TokenUsageRaw full definition ==="
rg -n "struct TokenUsageRaw" -A 20 rust/crates/ccusage/src/types.rs
echo ""
echo "=== Speed type definition ==="
rg -n "struct Speed|enum Speed" -A 10 rust/crates/ccusage/src/types.rs
echo ""
echo "=== Serialize on TokenUsageRaw ==="
rg -n "Serialize" rust/crates/ccusage/src/types.rs
echo ""
echo "=== Serialize usage on TokenUsageRaw ==="
rg -rn "serde::Serialize|impl Serialize" rust/crates/ccusage/src/
echo ""
echo "=== Claude JSONL parsing of cache_creation field ==="
rg -n "cache_creation" rust/crates/ccusage/src/adapter/claude/ -A 3 -B 3
echo ""
echo "=== CacheCreationRaw test coverage ==="
rg -n "CacheCreationRaw\|ephemeral_5m\|ephemeral_1h" rust/crates/ccusage/src/ -A 3
echo ""
echo "=== cost.rs full function ==="
cat rust/crates/ccusage/src/cost.rs

Length of output: 12468
Review:
|
Add focused regression coverage for Claude Code cache creation duration handling.

The tests pin the three important cases from #899: records with 5-minute and 1-hour cache creation breakdowns use separate rates, older records without the breakdown keep the flat cache creation fallback, and serde parses the nested `cache_creation` usage fields.

Update the cost mode guide so the documented formula matches the new 5m/1h calculation. Also make `cache_creation_token_count` borrow `self` and name the 1-hour multiplier so future pricing changes have a single obvious constant to inspect.
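As a rough illustration of the borrowing helper and the fallback case mentioned in this commit message, the sketch below shows one way such a method could look. The type and method names are taken from the PR text, but the field layout and the method body are assumptions; the real definitions in `types.rs` may differ.

```rust
/// Illustrative stand-in for the nested `cache_creation` usage object.
#[derive(Debug, Clone, Default)]
struct CacheCreationRaw {
    ephemeral_5m_input_tokens: u64,
    ephemeral_1h_input_tokens: u64,
}

/// Illustrative stand-in for a raw usage record; only the relevant fields are shown.
#[derive(Debug, Clone, Default)]
struct TokenUsageRaw {
    cache_creation_input_tokens: u64,
    cache_creation: Option<CacheCreationRaw>,
}

impl TokenUsageRaw {
    /// Borrows `self` rather than consuming it: sums the 5m/1h breakdown when
    /// present, otherwise falls back to the flat cache creation count.
    fn cache_creation_token_count(&self) -> u64 {
        match &self.cache_creation {
            Some(c) => c.ephemeral_5m_input_tokens + c.ephemeral_1h_input_tokens,
            None => self.cache_creation_input_tokens,
        }
    }
}

fn main() {
    // Newer record: breakdown present.
    let with_breakdown = TokenUsageRaw {
        cache_creation_input_tokens: 23_566,
        cache_creation: Some(CacheCreationRaw {
            ephemeral_5m_input_tokens: 0,
            ephemeral_1h_input_tokens: 23_566,
        }),
    };
    // Older record: only the flat count is available.
    let flat_only = TokenUsageRaw {
        cache_creation_input_tokens: 1_000,
        cache_creation: None,
    };
    println!("{}", with_breakdown.cache_creation_token_count());
    println!("{}", flat_only.cache_creation_token_count());
    // Both values remain usable afterwards because the method only borrows.
    println!("{:?} {:?}", with_breakdown, flat_only);
}
```

Taking `&self` lets summaries, daily reports, and cost calculators all call the same helper on a shared record without cloning it.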
…che-creation-pricing # Conflicts: # .github/workflows/pullfrog.yml
ccusage
@ccusage/ccusage-darwin-arm64
@ccusage/ccusage-darwin-x64
@ccusage/ccusage-linux-arm64
@ccusage/ccusage-linux-x64
@ccusage/ccusage-win32-arm64
@ccusage/ccusage-win32-x64
commit: |
ccusage performance comparison

PR SHA:

This compares the PR package against the configured base package on the same CI runner.

Package runner startup

Execution setup measures any pre-benchmark package materialization used by the execution benchmark. Bunx temp cache measures one

Cached bunx execution performance

Runs the same large fixture through

Fixtures: Claude

Package runtime diagnostics

Compares the PR package wrapper, the installed native optional dependency binary, and the workspace release binary on the same large fixture. This identifies whether slow package results come from JavaScript wrapper overhead, the published native binary build, or the Rust core itself.

Fixtures: Claude

Committed fixture performance

Committed small fixtures for stable PR-to-PR feedback and explicit Claude/Codex command coverage.

Fixtures: Claude

Large real-world-shaped fixture performance

Generated fixtures shaped from aggregate local log statistics: thousands of JSONL files, many small sessions, and a long tail of larger sessions. No real prompts, paths, or outputs are stored in the fixtures.

Fixtures: Claude

Artifact size

Lower medians and smaller artifacts are better. CI runner noise still applies; use same-run ratios as directional PR feedback, not release guarantees.
ccusage performance comparison

PR SHA:

This compares the Rust PR release binary against the configured base package on the same CI runner.

Package runner startup

Execution setup measures any pre-benchmark package materialization used by the execution benchmark. Bunx temp cache measures one

Cached bunx execution performance

Runs the same large fixture through

Fixtures: Claude

Package runtime diagnostics

Compares the PR package wrapper, the installed native optional dependency binary, and the workspace release binary on the same large fixture. This identifies whether slow package results come from JavaScript wrapper overhead, the published native binary build, or the Rust core itself.

Fixtures: Claude

Committed fixture performance

Committed small fixtures for stable PR-to-PR feedback and explicit Claude/Codex command coverage.

Fixtures: Claude

Large real-world-shaped fixture performance

Generated fixtures shaped from aggregate local log statistics: thousands of JSONL files, many small sessions, and a long tail of larger sessions. No real prompts, paths, or outputs are stored in the fixtures.

Fixtures: Claude

Artifact size

Lower medians and smaller artifacts are better. CI runner noise still applies; use same-run ratios as directional PR feedback, not release guarantees.

Closes #899
Problem
ccusage prices all `cache_creation_input_tokens` at a single 1.25x base input rate (5-minute cache write), but Claude Code predominantly uses 1-hour caching, which Anthropic prices at 2x base input.

What Claude Code JSONL provides

    {
      "usage": {
        "cache_creation_input_tokens": 23566,
        "cache_creation": {
          "ephemeral_5m_input_tokens": 0,
          "ephemeral_1h_input_tokens": 23566
        }
      }
    }

Fix

- Parse `cache_creation.ephemeral_5m_input_tokens` and `ephemeral_1h_input_tokens` from JSONL usage records
- Price 5m tokens at the existing `cache_create` rate (1.25x), 1h tokens at `input * 2.0` (matching Anthropic pricing)
- Fall back to flat `cache_creation_input_tokens` when the breakdown is absent (older records, non-Claude agents)
- Fix token count aggregation to include breakdown tokens (`total_usage_tokens`, `TokenCounts::add_usage`, `usage_token_total`, `daily_usage_token_total`, both accumulator breakdowns)

Impact
For users with mixed 5m/1h cache creation (the vast majority of Claude Code users), costs will be ~20-60% higher than previously reported, matching actual billing.
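The quoted range can be sanity-checked with a short derivation. As an assumption for illustration, let $f$ be the fraction of cache creation tokens written with the 1-hour duration, $N$ the total cache creation tokens, and $r$ the base input rate per token (these symbols are not from the PR). The ratio below covers only the cache creation line item, not the whole bill.

```latex
\begin{aligned}
C_{\text{old}} &= 1.25\, r\, N \\
C_{\text{new}} &= \bigl(1.25\,(1 - f) + 2f\bigr)\, r\, N \\
\frac{C_{\text{new}}}{C_{\text{old}}} &= \frac{1.25 + 0.75 f}{1.25} = 1 + 0.6 f
\end{aligned}
```

Under these assumptions, $f = 1/3$ gives a 20% increase and $f = 1$ (all 1-hour writes) gives the 60% upper bound. The increase in total reported cost would be smaller, in proportion to how much of the total is cache creation.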
DeepSeek Pro (free via Pullfrog for OSS) | 𝕏

Summary by cubic
Split cache creation pricing into 5-minute and 1-hour durations to match Anthropic prompt caching rates, aligning Claude Code costs with actual billing. Closes #899 by charging 1h cache writes at 2x base input.
New Features
- Add `usage.cache_creation` with `ephemeral_5m_input_tokens` and `ephemeral_1h_input_tokens`; parse from Claude Code JSONL and default to `None` in other adapters.
- Price 5m tokens at the existing `cache_create` rate (1.25x); price 1h tokens at `input * 2.0`, including above-200k tiers.
- Fall back to flat `cache_creation_input_tokens` when the breakdown is missing (older records, non-Claude agents).

Bug Fixes
- Use `cache_creation_token_count()` across summaries, daily reports, and calculators.

Written for commit 2d20874. Summary will update on new commits.