fix(codex): show non-cached input tokens#1067

Merged: ryoppippi merged 3 commits into main from codex/fix-codex-input-token-display on May 19, 2026

@ryoppippi
Owner

@ryoppippi ryoppippi commented May 19, 2026

Summary

  • normalise Codex report `inputTokens` to exclude cached input tokens
  • apply the same normalisation when Codex usage is shown in the all-agent report
  • keep `cachedInputTokens`, `totalTokens`, and pricing calculation semantics intact
  • add regression coverage for Codex report JSON and all-agent Codex rows

Root Cause

Codex usage events use OpenAI-style input token counts where cached prompt tokens are included in `input_tokens`. The native reports forwarded that inclusive value to the user-facing Input column while also showing cached tokens separately, which made heavy cache-read days look like far more paid input than the reported cost supported.
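As a rough illustration of the root cause, the normalisation can be sketched as a single saturating subtraction. The function name and signature match the helper described later in this PR; the token counts in the usage note are illustrative only.

```rust
/// Non-cached input: the inclusive OpenAI-style count minus cached prompt
/// tokens, clamped at zero if the cached count is ever larger.
fn non_cached_input_tokens(input_tokens: u64, cached_input_tokens: u64) -> u64 {
    input_tokens.saturating_sub(cached_input_tokens)
}

/// Illustrative helper (not from the PR): the pair of values a report row
/// would show in its Input and Cache Read columns.
fn displayed_columns(input_tokens: u64, cached_input_tokens: u64) -> (u64, u64) {
    (
        non_cached_input_tokens(input_tokens, cached_input_tokens),
        cached_input_tokens,
    )
}
```

For an event with 100 inclusive input tokens of which 90 were cache reads, `displayed_columns(100, 90)` gives an Input of 10 and a Cache Read of 90, rather than the previous 100 and 90.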

Validation

  • `direnv exec . pnpm run format`
  • `direnv exec . pnpm typecheck`
  • `direnv exec . pnpm run test`
  • pre-push hook: clippy + oxfmt

Fixes #1056


Summary by cubic

Show non-cached Codex input tokens in Codex and all-agent reports so the Input column reflects paid input and matches cost. Cached reads stay visible, pricing is unchanged, and per-model rows and totals are normalized. Fixes #1056.

  • Bug Fixes

    • Subtract cachedInputTokens from Codex inputTokens in group rows, per-model entries, and totals (including all-agent Codex rows).
    • Keep cachedInputTokens, totalTokens, and costUSD semantics.
    • Add regression tests for Codex report JSON, per-model JSON, totals, and all-agent Codex rows.
  • Refactors

    • Remove direct Serialize from CodexModelUsage and serialize via a normalizing helper so all user-facing JSON shows non-cached input.

Written for commit 665500d. Summary will update on new commits. Review in cubic

Summary by CodeRabbit

  • Bug Fixes

    • Codex usage reporting now separates non-cached input tokens from cached input tokens. "Input" metrics (row, per-model, and totals) report non-cached counts while cached tokens are reported separately for clearer token consumption.
  • Tests

    • Updated tests to validate the new token reporting and per-model JSON output.

Codex usage events report OpenAI-style input tokens that include cached prompt tokens. The native all-agent and Codex reports were forwarding that inclusive value directly, so days with heavy prompt caching could show an Input column that looked close to Cache Read even though cost calculation had already treated cached tokens separately.

Normalise Codex report output to expose non-cached input tokens while preserving cachedInputTokens, totalTokens, and the raw internal values used for pricing. Apply the same normalisation when Codex rows are folded into the all-agent report, and cover the behaviour with regression tests for both report paths.

Fixes #1056
@coderabbitai

coderabbitai Bot commented May 19, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 8d2687e4-6a36-4615-bf67-7817345c5471

📥 Commits

Reviewing files that changed from the base of the PR and between 47e6aa7 and 665500d.

📒 Files selected for processing (3)
  • rust/crates/ccusage/src/adapter/all.rs
  • rust/crates/ccusage/src/adapter/codex.rs
  • rust/crates/ccusage/src/types.rs
🚧 Files skipped from review as they are similar to previous changes (2)
  • rust/crates/ccusage/src/adapter/all.rs
  • rust/crates/ccusage/src/adapter/codex.rs

📝 Walkthrough

Walkthrough

Compute Codex non-cached input tokens (saturating subtraction of cached from total) and use that value in JSON output and AllRow generation; update tests to assert non-cached daily and per-model inputTokens while preserving cachedInputTokens.

Changes

Codex non-cached input token reporting

| Layer / File(s) | Summary |
| --- | --- |
| Non-cached input token helpers and per-model JSON serialization (`rust/crates/ccusage/src/adapter/codex.rs`) | Adds `pub(crate) fn non_cached_input_tokens(input_tokens: u64, cached_input_tokens: u64) -> u64` and `model_usage_json` to emit `inputTokens` (non-cached) and `cachedInputTokens` per model. |
| Group and totals JSON output using non-cached logic (`rust/crates/ccusage/src/adapter/codex.rs`) | Row-level `inputTokens` and aggregated `totals.inputTokens` are computed using the non-cached helper instead of raw input token sums. |
| Codex JSON reporting test coverage (`rust/crates/ccusage/src/adapter/codex.rs`) | Unit test verifies daily row and per-model JSON entries report non-cached `inputTokens` and retain `cachedInputTokens`. |

AllRow integration and testing

| Layer / File(s) | Summary |
| --- | --- |
| AllRow Codex row computation (`rust/crates/ccusage/src/adapter/all.rs`) | `codex_group_row` now calls `codex::non_cached_input_tokens(...)` to compute `input_tokens` instead of using the raw group input value. |
| AllRow test updates (`rust/crates/ccusage/src/adapter/all.rs`) | Adds import for `CodexModelUsage` and a unit test constructing a `CodexGroup` with cached input tokens, asserting `AllRow.input_tokens` equals non-cached value and `cache_read_tokens` equals cached amount. |
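The all-agent mapping described above might look roughly like the sketch below. `CodexGroup` and `AllRow` here are trimmed-down, hypothetical stand-ins; the real structs carry more fields, and only `input_tokens` and `cache_read_tokens` are named in the walkthrough.

```rust
/// Hypothetical stand-in for the Codex group; `input_tokens` is inclusive
/// of cached prompt tokens.
struct CodexGroup {
    input_tokens: u64,
    cached_input_tokens: u64,
    output_tokens: u64,
}

/// Hypothetical stand-in for a row in the all-agent report.
#[derive(Debug, PartialEq)]
struct AllRow {
    input_tokens: u64,
    cache_read_tokens: u64,
    output_tokens: u64,
}

fn non_cached_input_tokens(input_tokens: u64, cached_input_tokens: u64) -> u64 {
    input_tokens.saturating_sub(cached_input_tokens)
}

/// Folds a Codex group into an all-agent row, normalising the input count
/// and forwarding the cached count as cache reads.
fn codex_group_row(group: &CodexGroup) -> AllRow {
    AllRow {
        input_tokens: non_cached_input_tokens(group.input_tokens, group.cached_input_tokens),
        cache_read_tokens: group.cached_input_tokens,
        output_tokens: group.output_tokens,
    }
}
```

With a group of 100 inclusive input tokens, 90 cached, and 5 output tokens, the resulting row would carry 10 input tokens and 90 cache-read tokens.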

Integration test updates

| Layer / File(s) | Summary |
| --- | --- |
| Daily report test expectations (`rust/crates/ccusage/src/main.rs`) | Updates `builds_codex_daily_json_report` to expect `daily[0].inputTokens == 90` (non-cached) and adds assertions for per-model `inputTokens` and `cachedInputTokens` under `daily[0].models["gpt-5"]`. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 I nibble numbers, subtract the cached heap,
Input tokens clarified—no more counting cheap.
Rows and models now sing straight and true,
Cached reads kept safe, while the totals renew.
Hoppity hop — the report shines through!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 58.33% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: updating Codex to report non-cached input tokens instead of total input tokens. |
| Linked Issues check | ✅ Passed | The PR correctly addresses issue #1056 by excluding cached input tokens from the displayed Input value and providing per-model breakdowns. |
| Out of Scope Changes check | ✅ Passed | All changes directly support the goal of normalizing Codex input token reporting; no unrelated modifications are present. |

@ryoppippi
Owner Author

@coderabbitai review

Please review this PR. The key change is that Codex input token display now separates non-cached input from cached input while preserving total token and pricing semantics.

@cloudflare-workers-and-pages

cloudflare-workers-and-pages Bot commented May 19, 2026

Deploying with Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

| Status | Name | Latest Commit | Preview URL | Updated (UTC) |
| --- | --- | --- | --- | --- |
| ✅ Deployment successful! (View logs) | ccusage-guide | 665500d | Commit Preview URL, Branch Preview URL | May 19 2026, 04:25 PM |

@ryoppippi changed the title from "[codex] Show non-cached Codex input tokens" to "fix(codex): show non-cached input tokens" on May 19, 2026
@coderabbitai

coderabbitai Bot commented May 19, 2026

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Get the PR diff
gh pr view 1067 --repo ryoppippi/ccusage --json files,additions,deletions,changedFiles

# Get the actual diff
gh pr diff 1067 --repo ryoppippi/ccusage

Length of output: 7140


🏁 Script executed:

#!/bin/bash
# Check the CodexModelUsage struct and its serde/serialization
rg -n "CodexModelUsage" rust/crates/ccusage/src/ -A 5 -B 2

# Check if there are other places where input_tokens from Codex is used without normalization
rg -n "input_tokens" rust/crates/ccusage/src/adapter/codex.rs -C 2

# Look at the full codex.rs totals_json and group_json functions
rg -n "totals_json\|group_json\|non_cached" rust/crates/ccusage/src/adapter/codex.rs -A 3

Length of output: 11096


@ryoppippi: The fix is sound and covers all the relevant display paths. Here's the full review:


Review

What's correct ✅

  • Core logic is correct and complete. non_cached_input_tokens is applied consistently in all three places where inputTokens is surfaced to users: group_json, totals_json, and codex_group_row in all.rs.
  • Per-model entries are also normalized. The new model_usage_json() helper correctly replaces the previous direct struct serialization (which would have leaked inclusive tokens).
  • Pricing is unaffected. calculate_model_cost and calculate_group_cost remain untouched — they already subtracted cached tokens independently.
  • totalTokens and cachedInputTokens semantics are preserved as stated in the PR objectives.
  • Tests are well-structured and cover both group-level and per-model assertions.

Minor observations

1. Redundant guard in non_cached_input_tokens

// rust/crates/ccusage/src/adapter/codex.rs (new function)
pub(crate) fn non_cached_input_tokens(input_tokens: u64, cached_input_tokens: u64) -> u64 {
    input_tokens.saturating_sub(cached_input_tokens.min(input_tokens))
}

The .min(input_tokens) is redundant: saturating_sub on u64 already clamps to zero if cached > input. This means the two guards are saying the same thing. The existing calculate_model_cost function (line 516) uses the simpler and more idiomatic form:

// Already in the codebase – consistent pattern
let non_cached_input = usage.input_tokens.saturating_sub(usage.cached_input_tokens);

Simplifying to input_tokens.saturating_sub(cached_input_tokens) would be consistent with that precedent, though the current form is not wrong.
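To make the redundancy concrete, here is a small sketch comparing the two forms side by side. The function names `with_min_guard`, `without_guard`, and `forms_agree` are illustrative only and do not appear in the codebase.

```rust
/// Form from the first revision of the PR, with the extra `.min` guard.
fn with_min_guard(input_tokens: u64, cached_input_tokens: u64) -> u64 {
    input_tokens.saturating_sub(cached_input_tokens.min(input_tokens))
}

/// Simplified form suggested in the review.
fn without_guard(input_tokens: u64, cached_input_tokens: u64) -> u64 {
    input_tokens.saturating_sub(cached_input_tokens)
}

/// Returns true when both forms agree on every (input, cached) sample.
fn forms_agree(samples: &[(u64, u64)]) -> bool {
    samples
        .iter()
        .all(|&(input, cached)| with_min_guard(input, cached) == without_guard(input, cached))
}
```

When `cached <= input`, `.min` leaves `cached` unchanged and both forms subtract the same value. When `cached > input`, the guarded form computes `input - input` and the unguarded form saturates, so both yield zero.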


2. CodexModelUsage still derives Serialize with the inclusive field name

// rust/crates/ccusage/src/types.rs
#[derive(Debug, Clone, Default, Serialize)]
#[serde(rename_all = "camelCase")]
pub(crate) struct CodexModelUsage {
    pub(crate) input_tokens: u64,   // ← still serializes as raw inclusive value
    ...
}

Now that model_usage_json() exists as the canonical serialization path, a future caller that directly serializes a CodexModelUsage (e.g. in a new endpoint or debug log) would silently get the old inclusive semantics. Since the struct is pub(crate), the risk is limited, but it might be worth either:

  • Removing Serialize from CodexModelUsage if it's no longer needed directly, or
  • Adding a doc comment like /// Note: input_tokens is the OpenAI-style inclusive value; use model_usage_json() for display
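The first option could be sketched as follows: the struct carries no `Serialize` derive, so the only way to produce display output is through a normalising helper. This is a standard-library-only approximation. The real `model_usage_json` presumably builds a JSON value, whereas the hypothetical `model_usage_fields` below returns a map, and the `outputTokens` key is an assumption.

```rust
use std::collections::BTreeMap;

/// Hypothetical, trimmed-down struct. There is deliberately no `Serialize`
/// derive, so the raw inclusive `input_tokens` cannot be emitted by accident.
struct CodexModelUsage {
    input_tokens: u64, // OpenAI-style inclusive value
    cached_input_tokens: u64,
    output_tokens: u64,
}

/// Single display path: camelCase keys with a non-cached `inputTokens`.
fn model_usage_fields(usage: &CodexModelUsage) -> BTreeMap<&'static str, u64> {
    BTreeMap::from([
        (
            "inputTokens",
            usage.input_tokens.saturating_sub(usage.cached_input_tokens),
        ),
        ("cachedInputTokens", usage.cached_input_tokens),
        ("outputTokens", usage.output_tokens),
    ])
}
```

The trade-off is that every new output path has to call the helper explicitly, which is the intended friction here.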

3. Small test coverage gap — totals.inputTokens not asserted

The new test reports_non_cached_codex_input_separately_from_cached_input checks daily[0] and per-model values, but skips totals:

// Missing assertion in the new test:
assert_eq!(report["totals"]["inputTokens"], 10);

Since totals_json was also patched to use non_cached_input_tokens, adding this assertion would complete the regression coverage promised in the PR objectives.
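For context, a totals computation of the kind this assertion would exercise might look like the sketch below. It assumes totals are formed by summing the raw counts across models and then applying the saturating subtraction once; the actual `totals_json` may aggregate differently, and `ModelUsage` and `display_totals` are hypothetical names.

```rust
/// Hypothetical per-model counts; `input_tokens` is the inclusive value.
struct ModelUsage {
    input_tokens: u64,
    cached_input_tokens: u64,
}

/// Totals as they would be displayed: (non-cached input, cached input).
fn display_totals(models: &[ModelUsage]) -> (u64, u64) {
    let input: u64 = models.iter().map(|m| m.input_tokens).sum();
    let cached: u64 = models.iter().map(|m| m.cached_input_tokens).sum();
    (input.saturating_sub(cached), cached)
}
```

With the single-model fixture values used in the new test (100 inclusive input tokens, 90 cached), this yields a total input of 10, matching the suggested assertion.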


Summary

  • Bug fix correctness
  • All display paths covered
  • Pricing unchanged
  • Test regression for Codex JSON
  • Test regression for all-agent rows
  • Totals assertion in new test: ⚠️ Missing
  • Redundant .min(): ⚠️ Nit
  • Serialize derive on CodexModelUsage: ⚠️ Minor future-safety risk

The PR is in good shape — the three items above are nits/minor suggestions, not blockers.

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.


@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
rust/crates/ccusage/src/adapter/codex.rs (1)

717-725: ⚡ Quick win

Add an explicit totals input assertion for regression completeness.

Please also assert report["totals"]["inputTokens"] (expected 10) so the totals path patched at Line 490 is directly covered.

✅ Suggested test addition
         assert_eq!(report["daily"][0]["inputTokens"], 10);
         assert_eq!(report["daily"][0]["cachedInputTokens"], 90);
         assert_eq!(report["daily"][0]["totalTokens"], 105);
         assert_eq!(report["daily"][0]["models"]["gpt-5"]["inputTokens"], 10);
         assert_eq!(
             report["daily"][0]["models"]["gpt-5"]["cachedInputTokens"],
             90
         );
+        assert_eq!(report["totals"]["inputTokens"], 10);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@rust/crates/ccusage/src/adapter/codex.rs` around lines 717 - 725, Add an
explicit assertion that verifies the totals path: after the existing per-day and
per-model assertions on the `report` JSON, add an `assert_eq!` that checks
`report["totals"]["inputTokens"]` equals 10 so the totals path patched at the
earlier change (around the totals handling at Line 490) is directly covered;
locate the assertion block using the `report` variable in this test in codex.rs
and insert the totals assertion alongside the other `assert_eq!` checks for
`daily` and `models`.
ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 4a5d53d2-703d-41ec-8ff9-d2f50cee7b40

📥 Commits

Reviewing files that changed from the base of the PR and between 34f9c68 and 47e6aa7.

📒 Files selected for processing (3)
  • rust/crates/ccusage/src/adapter/all.rs
  • rust/crates/ccusage/src/adapter/codex.rs
  • rust/crates/ccusage/src/main.rs

Address CodeRabbit review feedback by covering the totals path that was normalised with the Codex row and per-model JSON output. This keeps the regression test aligned with the display bug fixed in #1067.

Also simplify the non-cached input helper to match the existing pricing code and remove direct Serialize support from CodexModelUsage so future user-facing JSON must go through the normalised serializer.
@ryoppippi
Owner Author

@coderabbitai addressed your review feedback in 9a4afc2.

Changes made:

  • added totals assertions for normalised Codex input and cached input
  • simplified the non-cached input helper to use direct saturating subtraction
  • removed direct Serialize from CodexModelUsage so display JSON uses the normalised serializer

Validation after the update:

  • direnv exec . pnpm run format
  • direnv exec . pnpm typecheck
  • direnv exec . pnpm run test
  • pre-push clippy + oxfmt

@coderabbitai review

@coderabbitai

coderabbitai Bot commented May 19, 2026

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Get the diff for commit 9a4afc2
gh api repos/ryoppippi/ccusage/commits/9a4afc2 --jq '.files[] | "=== \(.filename) ===\n\(.patch)"'

Length of output: 1320


@ryoppippi: All three items from the previous review are confirmed addressed in 9a4afc2:

| Feedback | Resolution |
| --- | --- |
| Redundant `.min(input_tokens)` in `non_cached_input_tokens` | ✅ Simplified to `input_tokens.saturating_sub(cached_input_tokens)` |
| `Serialize` derive on `CodexModelUsage` | ✅ Removed `Serialize` and `#[serde(rename_all = "camelCase")]` from the struct |
| Missing `totals.inputTokens` assertion in test | ✅ Added assertions for `totals.inputTokens`, `totals.cachedInputTokens`, and `totals.totalTokens` |

The fix is correct, complete, and the regression tests now cover all patched paths.

(ʳ ´•̥̥̥ ω•̥̥̥`)っ🐇

[approve]

@github-actions

ccusage performance comparison

PR SHA: 47e6aa778c0a
Base SHA: 34f9c68d7c61

This compares the Rust PR release binary against the configured base package on the same CI runner.

Package runner startup

Execution setup measures any pre-benchmark package materialization used by the execution benchmark. Bunx temp cache measures one bunx -p <url> ccusage --version run with an empty Bun install cache. Warm reuses that cache and reports the median of repeated runs.

| Package | SHA | Execution setup | Bunx temp cache | Bunx warm median | Warm samples |
| --- | --- | --- | --- | --- | --- |
| Base pkg.pr.new | 34f9c68d7c61 | 562.9ms | 824.1ms | 33.9ms | 3 |
| PR pkg.pr.new | 47e6aa778c0a | 2.701s | 581.3ms | 34.0ms | 3 |

Cached bunx execution performance

Runs the same large fixture through bunx -p <pkg.pr.new URL> ccusage after the Bun install cache has already been populated by the startup measurement. This separates cached package-runner execution from first-fetch package materialization.

Fixtures: Claude /home/runner/work/_temp/ccusage-large-fixture (1.01 GiB, 2,597 files), Codex /home/runner/work/_temp/ccusage-large-codex-fixture (1.01 GiB, 2,597 files)
Base package: 34f9c68d7c61; PR package: 47e6aa778c0a. Both run through bunx -p <pkg.pr.new URL> ccusage using the warmed Bun install cache from package runner startup, measured by hyperfine with 0 warmups and 1 runs.

| Command | Input | Base median | PR median | PR vs base | Base throughput | PR throughput |
| --- | --- | --- | --- | --- | --- | --- |
| `bunx -p <pkg> ccusage claude --offline --json` | 1.01 GiB | 678.5ms | 685.2ms | 0.99x | 1.48 GiB/s | 1.47 GiB/s |
| `bunx -p <pkg> ccusage codex --offline --json` | 1.01 GiB | 873.0ms | 848.2ms | 1.03x | 1.15 GiB/s | 1.19 GiB/s |

Package runtime diagnostics

Compares the PR package wrapper, the installed native optional dependency binary, and the workspace release binary on the same large fixture. This identifies whether slow package results come from JavaScript wrapper overhead, the published native binary build, or the Rust core itself.

Fixtures: Claude /home/runner/work/_temp/ccusage-large-fixture (1.01 GiB, 2,597 files), Codex /home/runner/work/_temp/ccusage-large-codex-fixture (1.01 GiB, 2,597 files)
All rows run --offline --json, measured by hyperfine with 0 warmups and 1 runs. This isolates wrapper overhead from the installed native optional dependency and the workspace release binary built on the runner.

| Command | Runtime | Input | Median | Throughput | Samples |
| --- | --- | --- | --- | --- | --- |
| `claude --offline --json` | Package wrapper | 1.01 GiB | 668.9ms | 1.51 GiB/s | 1 |
| `claude --offline --json` | Installed native binary | 1.01 GiB | 638.5ms | 1.58 GiB/s | 1 |
| `claude --offline --json` | Workspace release binary | 1.01 GiB | 709.8ms | 1.42 GiB/s | 1 |
| `codex --offline --json` | Package wrapper | 1.01 GiB | 837.6ms | 1.20 GiB/s | 1 |
| `codex --offline --json` | Installed native binary | 1.01 GiB | 808.9ms | 1.24 GiB/s | 1 |
| `codex --offline --json` | Workspace release binary | 1.01 GiB | 818.5ms | 1.23 GiB/s | 1 |

Committed fixture performance

Committed small fixtures for stable PR-to-PR feedback and explicit Claude/Codex command coverage.

Fixtures: Claude apps/ccusage/test/fixtures/claude (0.00 MiB, 2 files), Codex apps/ccusage/test/fixtures/codex (0.00 MiB, 1 files)
Base runs the published ccusage package from pkg.pr.new, installed before measurement; PR runs rust/target/release/ccusage directly. Both run --offline --json, measured by hyperfine with 2 warmups and 7 runs.

| Command | Input | Base median | PR median | PR vs base | Base throughput | PR throughput |
| --- | --- | --- | --- | --- | --- | --- |
| `claude daily --offline --json` | 0.00 MiB | 31.2ms | 4.6ms | 6.82x | 0.05 MiB/s | 0.34 MiB/s |
| `claude session --offline --json` | 0.00 MiB | 31.6ms | 4.8ms | 6.60x | 0.05 MiB/s | 0.32 MiB/s |
| `codex daily --offline --json` | 0.00 MiB | 31.0ms | 4.4ms | 7.10x | 0.03 MiB/s | 0.20 MiB/s |
| `codex session --offline --json` | 0.00 MiB | 31.9ms | 4.5ms | 7.03x | 0.03 MiB/s | 0.19 MiB/s |

Large real-world-shaped fixture performance

Generated fixtures shaped from aggregate local log statistics: thousands of JSONL files, many small sessions, and a long tail of larger sessions. No real prompts, paths, or outputs are stored in the fixtures.

Fixtures: Claude /home/runner/work/_temp/ccusage-large-fixture (1.01 GiB, 2,597 files), Codex /home/runner/work/_temp/ccusage-large-codex-fixture (1.01 GiB, 2,597 files)
Base runs the published ccusage package from pkg.pr.new, installed before measurement; PR runs rust/target/release/ccusage directly. Both run --offline --json, measured by hyperfine with 0 warmups and 1 runs.

| Command | Input | Base median | PR median | PR vs base | Base throughput | PR throughput |
| --- | --- | --- | --- | --- | --- | --- |
| `claude --offline --json` | 1.01 GiB | 690.3ms | 717.6ms | 0.96x | 1.46 GiB/s | 1.40 GiB/s |
| `codex --offline --json` | 1.01 GiB | 873.0ms | 822.3ms | 1.06x | 1.15 GiB/s | 1.22 GiB/s |

Artifact size

| Artifact | Base | PR | Delta | Ratio |
| --- | --- | --- | --- | --- |
| packed `ccusage-*.tgz` | 9.06 KiB | 9.05 KiB | -0.00 KiB | 1.00x |
| installed native package binary | 3160.24 KiB | 3160.24 KiB | +0.00 KiB | 1.00x |
| Rust release binary `rust/target/release/ccusage` | - | 2827.68 KiB | - | - |

Lower medians and smaller artifacts are better. CI runner noise still applies; use same-run ratios as directional PR feedback, not release guarantees.
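The "PR vs base" column is not defined in the report, but the large-fixture rows are consistent with base median divided by PR median, so values above 1.00x mean the PR ran faster. A minimal sketch under that assumption follows; the function names are illustrative and not part of the benchmark tooling.

```rust
/// Speedup ratio as the tables appear to report it (an inference from the
/// published numbers): base median divided by PR median.
fn pr_vs_base(base_median_ms: f64, pr_median_ms: f64) -> f64 {
    base_median_ms / pr_median_ms
}

/// Formats a ratio in the style of the "PR vs base" column, e.g. "1.03x".
fn format_ratio(ratio: f64) -> String {
    format!("{ratio:.2}x")
}
```

For the cached bunx Codex row, 873.0ms against 848.2ms formats as 1.03x. The small-fixture rows do not reproduce exactly from the displayed medians, which suggests the report computes ratios from unrounded timings.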

@github-actions

ccusage performance comparison

PR SHA: 47e6aa778c0a
Base SHA: 34f9c68d7c61

This compares the PR package against the configured base package on the same CI runner.

Package runner startup

Execution setup measures any pre-benchmark package materialization used by the execution benchmark. Bunx temp cache measures one bunx -p <url> ccusage --version run with an empty Bun install cache. Warm reuses that cache and reports the median of repeated runs.

| Package | SHA | Execution setup | Bunx temp cache | Bunx warm median | Warm samples |
| --- | --- | --- | --- | --- | --- |
| Base pkg.pr.new | 34f9c68d7c61 | 632.1ms | 646.3ms | 35.1ms | 3 |
| PR pkg.pr.new | 47e6aa778c0a | 973.3ms | 5.161s | 35.6ms | 3 |

Cached bunx execution performance

Runs the same large fixture through bunx -p <pkg.pr.new URL> ccusage after the Bun install cache has already been populated by the startup measurement. This separates cached package-runner execution from first-fetch package materialization.

Fixtures: Claude /home/runner/work/_temp/ccusage-large-fixture (1.01 GiB, 2,597 files), Codex /home/runner/work/_temp/ccusage-large-codex-fixture (1.01 GiB, 2,597 files)
Base package: 34f9c68d7c61; PR package: 47e6aa778c0a. Both run through bunx -p <pkg.pr.new URL> ccusage using the warmed Bun install cache from package runner startup, measured by hyperfine with 0 warmups and 1 runs.

| Command | Input | Base median | PR median | PR vs base | Base throughput | PR throughput |
| --- | --- | --- | --- | --- | --- | --- |
| `bunx -p <pkg> ccusage claude --offline --json` | 1.01 GiB | 686.4ms | 687.6ms | 1.00x | 1.47 GiB/s | 1.46 GiB/s |
| `bunx -p <pkg> ccusage codex --offline --json` | 1.01 GiB | 893.3ms | 905.7ms | 0.99x | 1.13 GiB/s | 1.11 GiB/s |

Package runtime diagnostics

Compares the PR package wrapper, the installed native optional dependency binary, and the workspace release binary on the same large fixture. This identifies whether slow package results come from JavaScript wrapper overhead, the published native binary build, or the Rust core itself.

Fixtures: Claude /home/runner/work/_temp/ccusage-large-fixture (1.01 GiB, 2,597 files), Codex /home/runner/work/_temp/ccusage-large-codex-fixture (1.01 GiB, 2,597 files)
All rows run --offline --json, measured by hyperfine with 0 warmups and 1 runs. This isolates wrapper overhead from the installed native optional dependency and the workspace release binary built on the runner.

| Command | Runtime | Input | Median | Throughput | Samples |
| --- | --- | --- | --- | --- | --- |
| `claude --offline --json` | Package wrapper | 1.01 GiB | 670.7ms | 1.50 GiB/s | 1 |
| `claude --offline --json` | Installed native binary | 1.01 GiB | 639.1ms | 1.58 GiB/s | 1 |
| `claude --offline --json` | Workspace release binary | 1.01 GiB | 725.2ms | 1.39 GiB/s | 1 |
| `codex --offline --json` | Package wrapper | 1.01 GiB | 870.5ms | 1.16 GiB/s | 1 |
| `codex --offline --json` | Installed native binary | 1.01 GiB | 850.4ms | 1.18 GiB/s | 1 |
| `codex --offline --json` | Workspace release binary | 1.01 GiB | 858.1ms | 1.17 GiB/s | 1 |

Committed fixture performance

Committed small fixtures for stable PR-to-PR feedback and explicit Claude/Codex command coverage.

Fixtures: Claude apps/ccusage/test/fixtures/claude (0.00 MiB, 2 files), Codex apps/ccusage/test/fixtures/codex (0.00 MiB, 1 files)
Base runs the published ccusage package from pkg.pr.new, installed before measurement; PR runs the published ccusage package from pkg.pr.new, installed before measurement. Both run --offline --json, measured by hyperfine with 2 warmups and 7 runs.

| Command | Input | Base median | PR median | PR vs base | Base throughput | PR throughput |
| --- | --- | --- | --- | --- | --- | --- |
| `claude daily --offline --json` | 0.00 MiB | 33.6ms | 33.7ms | 1.00x | 0.05 MiB/s | 0.05 MiB/s |
| `claude session --offline --json` | 0.00 MiB | 33.9ms | 33.7ms | 1.01x | 0.05 MiB/s | 0.05 MiB/s |
| `codex daily --offline --json` | 0.00 MiB | 33.0ms | 32.7ms | 1.01x | 0.03 MiB/s | 0.03 MiB/s |
| `codex session --offline --json` | 0.00 MiB | 32.3ms | 33.2ms | 0.97x | 0.03 MiB/s | 0.03 MiB/s |

Large real-world-shaped fixture performance

Generated fixtures shaped from aggregate local log statistics: thousands of JSONL files, many small sessions, and a long tail of larger sessions. No real prompts, paths, or outputs are stored in the fixtures.

Fixtures: Claude /home/runner/work/_temp/ccusage-large-fixture (1.01 GiB, 2,597 files), Codex /home/runner/work/_temp/ccusage-large-codex-fixture (1.01 GiB, 2,597 files)
Base runs the published ccusage package from pkg.pr.new, installed before measurement; PR runs the published ccusage package from pkg.pr.new, installed before measurement. Both run --offline --json, measured by hyperfine with 0 warmups and 1 runs.

| Command | Input | Base median | PR median | PR vs base | Base throughput | PR throughput |
| --- | --- | --- | --- | --- | --- | --- |
| `claude --offline --json` | 1.01 GiB | 704.6ms | 691.4ms | 1.02x | 1.43 GiB/s | 1.46 GiB/s |
| `codex --offline --json` | 1.01 GiB | 923.4ms | 874.6ms | 1.06x | 1.09 GiB/s | 1.15 GiB/s |

Artifact size

| Artifact | Base | PR | Delta | Ratio |
| --- | --- | --- | --- | --- |
| packed `ccusage-*.tgz` | 9.06 KiB | 9.05 KiB | -0.00 KiB | 1.00x |
| installed native package binary | 3160.24 KiB | 3160.24 KiB | +0.00 KiB | 1.00x |
| Rust release binary `rust/target/release/ccusage` | - | 2827.68 KiB | - | - |

Lower medians and smaller artifacts are better. CI runner noise still applies; use same-run ratios as directional PR feedback, not release guarantees.

@pkg-pr-new

pkg-pr-new Bot commented May 19, 2026

Open in StackBlitz

ccusage

npx https://pkg.pr.new/ryoppippi/ccusage@1067

@ccusage/ccusage-darwin-arm64

npx https://pkg.pr.new/ryoppippi/ccusage/@ccusage/ccusage-darwin-arm64@1067

@ccusage/ccusage-darwin-x64

npx https://pkg.pr.new/ryoppippi/ccusage/@ccusage/ccusage-darwin-x64@1067

@ccusage/ccusage-linux-arm64

npx https://pkg.pr.new/ryoppippi/ccusage/@ccusage/ccusage-linux-arm64@1067

@ccusage/ccusage-linux-x64

npx https://pkg.pr.new/ryoppippi/ccusage/@ccusage/ccusage-linux-x64@1067

@ccusage/ccusage-win32-arm64

npx https://pkg.pr.new/ryoppippi/ccusage/@ccusage/ccusage-win32-arm64@1067

@ccusage/ccusage-win32-x64

npx https://pkg.pr.new/ryoppippi/ccusage/@ccusage/ccusage-win32-x64@1067

commit: 665500d

Resolve the all-agent adapter test conflict by keeping the main branch's parallel loader coverage and this branch's Codex non-cached input token regression import.

Validation after the merge:

  • direnv exec . pnpm run format
  • direnv exec . pnpm typecheck
  • direnv exec . pnpm run test
@github-actions

ccusage performance comparison

PR SHA: 9a4afc278075
Base SHA: 34f9c68d7c61

This compares the PR package against the configured base package on the same CI runner.

Package runner startup

Execution setup measures any pre-benchmark package materialization used by the execution benchmark. Bunx temp cache measures one bunx -p <url> ccusage --version run with an empty Bun install cache. Warm reuses that cache and reports the median of repeated runs.

| Package | SHA | Execution setup | Bunx temp cache | Bunx warm median | Warm samples |
| --- | --- | --- | --- | --- | --- |
| Base pkg.pr.new | 34f9c68d7c61 | 921.6ms | 464.6ms | 35.5ms | 3 |
| PR pkg.pr.new | 9a4afc278075 | 794.9ms | 605.9ms | 35.7ms | 3 |

Cached bunx execution performance

Runs the same large fixture through bunx -p <pkg.pr.new URL> ccusage after the Bun install cache has already been populated by the startup measurement. This separates cached package-runner execution from first-fetch package materialization.

Fixtures: Claude /home/runner/work/_temp/ccusage-large-fixture (1.01 GiB, 2,597 files), Codex /home/runner/work/_temp/ccusage-large-codex-fixture (1.01 GiB, 2,597 files)
Base package: 34f9c68d7c61; PR package: 9a4afc278075. Both run through bunx -p <pkg.pr.new URL> ccusage using the warmed Bun install cache from package runner startup, measured by hyperfine with 0 warmups and 1 runs.

| Command | Input | Base median | PR median | PR vs base | Base throughput | PR throughput |
| --- | --- | --- | --- | --- | --- | --- |
| `bunx -p <pkg> ccusage claude --offline --json` | 1.01 GiB | 700.2ms | 679.4ms | 1.03x | 1.44 GiB/s | 1.48 GiB/s |
| `bunx -p <pkg> ccusage codex --offline --json` | 1.01 GiB | 885.5ms | 886.3ms | 1.00x | 1.14 GiB/s | 1.14 GiB/s |

Package runtime diagnostics

Compares the PR package wrapper, the installed native optional dependency binary, and the workspace release binary on the same large fixture. This identifies whether slow package results come from JavaScript wrapper overhead, the published native binary build, or the Rust core itself.

Fixtures: Claude /home/runner/work/_temp/ccusage-large-fixture (1.01 GiB, 2,597 files), Codex /home/runner/work/_temp/ccusage-large-codex-fixture (1.01 GiB, 2,597 files)
All rows run --offline --json, measured by hyperfine with 0 warmups and 1 runs. This isolates wrapper overhead from the installed native optional dependency and the workspace release binary built on the runner.

| Command | Runtime | Input | Median | Throughput | Samples |
| --- | --- | --- | --- | --- | --- |
| `claude --offline --json` | Package wrapper | 1.01 GiB | 675.7ms | 1.49 GiB/s | 1 |
| `claude --offline --json` | Installed native binary | 1.01 GiB | 644.7ms | 1.56 GiB/s | 1 |
| `claude --offline --json` | Workspace release binary | 1.01 GiB | 735.8ms | 1.37 GiB/s | 1 |
| `codex --offline --json` | Package wrapper | 1.01 GiB | 879.4ms | 1.14 GiB/s | 1 |
| `codex --offline --json` | Installed native binary | 1.01 GiB | 848.8ms | 1.19 GiB/s | 1 |
| `codex --offline --json` | Workspace release binary | 1.01 GiB | 838.3ms | 1.20 GiB/s | 1 |

Committed fixture performance

Committed small fixtures for stable PR-to-PR feedback and explicit Claude/Codex command coverage.

Fixtures: Claude apps/ccusage/test/fixtures/claude (0.00 MiB, 2 files), Codex apps/ccusage/test/fixtures/codex (0.00 MiB, 1 files)
Base runs the published ccusage package from pkg.pr.new, installed before measurement; PR runs the published ccusage package from pkg.pr.new, installed before measurement. Both run --offline --json, measured by hyperfine with 2 warmups and 7 runs.

| Command | Input | Base median | PR median | PR vs base | Base throughput | PR throughput |
| --- | --- | --- | --- | --- | --- | --- |
| `claude daily --offline --json` | 0.00 MiB | 33.4ms | 33.3ms | 1.01x | 0.05 MiB/s | 0.05 MiB/s |
| `claude session --offline --json` | 0.00 MiB | 33.3ms | 33.1ms | 1.01x | 0.05 MiB/s | 0.05 MiB/s |
| `codex daily --offline --json` | 0.00 MiB | 32.9ms | 32.7ms | 1.01x | 0.03 MiB/s | 0.03 MiB/s |
| `codex session --offline --json` | 0.00 MiB | 32.6ms | 32.8ms | 0.99x | 0.03 MiB/s | 0.03 MiB/s |

Large real-world-shaped fixture performance

Generated fixtures shaped from aggregate local log statistics: thousands of JSONL files, many small sessions, and a long tail of larger sessions. No real prompts, paths, or outputs are stored in the fixtures.

Fixtures: Claude /home/runner/work/_temp/ccusage-large-fixture (1.01 GiB, 2,597 files), Codex /home/runner/work/_temp/ccusage-large-codex-fixture (1.01 GiB, 2,597 files)
Base and PR both run the published ccusage package from pkg.pr.new, installed before measurement. Both run --offline --json, measured by hyperfine with 0 warmups and 1 run.

| Command | Input | Base median | PR median | PR vs base | Base throughput | PR throughput |
| --- | --- | --- | --- | --- | --- | --- |
| `claude --offline --json` | 1.01 GiB | 695.7ms | 687.5ms | 1.01x | 1.45 GiB/s | 1.46 GiB/s |
| `codex --offline --json` | 1.01 GiB | 899.8ms | 884.4ms | 1.02x | 1.12 GiB/s | 1.14 GiB/s |

Artifact size

| Artifact | Base | PR | Delta | Ratio |
| --- | --- | --- | --- | --- |
| packed `ccusage-*.tgz` | 9.06 KiB | 9.06 KiB | +0.00 KiB | 1.00x |
| installed native package binary | 3160.24 KiB | 3160.24 KiB | +0.00 KiB | 1.00x |
| Rust release binary `rust/target/release/ccusage` | - | 2827.68 KiB | - | - |

Lower medians and smaller artifacts are better. CI runner noise still applies; use same-run ratios as directional PR feedback, not release guarantees.
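
The Throughput and PR vs base columns in these tables appear to follow directly from the input size and the medians. Below is a minimal Rust sketch of that arithmetic, assuming throughput is the input size divided by the median runtime and the ratio is the base median divided by the PR median. The benchmark script is not shown in this thread, so both formulas and the function names `throughput_gib_per_s` and `pr_vs_base` are assumptions inferred from the reported values.

```rust
/// Throughput in GiB/s for `input_gib` GiB processed in `median_ms` milliseconds.
/// Assumed formula: size / time. Not taken from the benchmark script.
fn throughput_gib_per_s(input_gib: f64, median_ms: f64) -> f64 {
    input_gib / (median_ms / 1000.0)
}

/// "PR vs base" ratio. Assumed to be base median / PR median,
/// so values above 1.0 mean the PR run was faster.
fn pr_vs_base(base_median_ms: f64, pr_median_ms: f64) -> f64 {
    base_median_ms / pr_median_ms
}

fn main() {
    // Large-fixture Codex row: 1.01 GiB, 899.8ms base, 884.4ms PR.
    let input_gib = 1.01;
    let (base_ms, pr_ms) = (899.8, 884.4);

    println!("base throughput: {:.2} GiB/s", throughput_gib_per_s(input_gib, base_ms));
    println!("PR throughput:   {:.2} GiB/s", throughput_gib_per_s(input_gib, pr_ms));
    println!("PR vs base:      {:.2}x", pr_vs_base(base_ms, pr_ms));
}
```

With the large-fixture Codex row (1.01 GiB, 899.8ms base, 884.4ms PR), this reproduces the reported 1.12 GiB/s, 1.14 GiB/s, and 1.02x. Other rows can differ in the last digit because the displayed input size and medians are themselves rounded.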

@github-actions

ccusage performance comparison

PR SHA: 9a4afc278075
Base SHA: 34f9c68d7c61

This compares the Rust PR release binary against the configured base package on the same CI runner.

Package runner startup

Execution setup measures any pre-benchmark package materialization used by the execution benchmark. Bunx temp cache measures one bunx -p <url> ccusage --version run with an empty Bun install cache. Warm reuses that cache and reports the median of repeated runs.

| Package | SHA | Execution setup | Bunx temp cache | Bunx warm median | Warm samples |
| --- | --- | --- | --- | --- | --- |
| Base pkg.pr.new | 34f9c68d7c61 | 599.0ms | 670.9ms | 34.8ms | 3 |
| PR pkg.pr.new | 9a4afc278075 | 799.5ms | 767.0ms | 36.2ms | 3 |

Cached bunx execution performance

Runs the same large fixture through bunx -p <pkg.pr.new URL> ccusage after the Bun install cache has already been populated by the startup measurement. This separates cached package-runner execution from first-fetch package materialization.

Fixtures: Claude /home/runner/work/_temp/ccusage-large-fixture (1.01 GiB, 2,597 files), Codex /home/runner/work/_temp/ccusage-large-codex-fixture (1.01 GiB, 2,597 files)
Base package: 34f9c68d7c61; PR package: 9a4afc278075. Both run through bunx -p <pkg.pr.new URL> ccusage using the warmed Bun install cache from package runner startup, measured by hyperfine with 0 warmups and 1 run.

| Command | Input | Base median | PR median | PR vs base | Base throughput | PR throughput |
| --- | --- | --- | --- | --- | --- | --- |
| `bunx -p <pkg> ccusage claude --offline --json` | 1.01 GiB | 696.6ms | 683.3ms | 1.02x | 1.45 GiB/s | 1.47 GiB/s |
| `bunx -p <pkg> ccusage codex --offline --json` | 1.01 GiB | 902.3ms | 894.8ms | 1.01x | 1.12 GiB/s | 1.13 GiB/s |

Package runtime diagnostics

Compares the PR package wrapper, the installed native optional dependency binary, and the workspace release binary on the same large fixture. This identifies whether slow package results come from JavaScript wrapper overhead, the published native binary build, or the Rust core itself.

Fixtures: Claude /home/runner/work/_temp/ccusage-large-fixture (1.01 GiB, 2,597 files), Codex /home/runner/work/_temp/ccusage-large-codex-fixture (1.01 GiB, 2,597 files)
All rows run --offline --json, measured by hyperfine with 0 warmups and 1 run. This isolates wrapper overhead from the installed native optional dependency and the workspace release binary built on the runner.

| Command | Runtime | Input | Median | Throughput | Samples |
| --- | --- | --- | --- | --- | --- |
| `claude --offline --json` | Package wrapper | 1.01 GiB | 677.6ms | 1.49 GiB/s | 1 |
| `claude --offline --json` | Installed native binary | 1.01 GiB | 651.1ms | 1.55 GiB/s | 1 |
| `claude --offline --json` | Workspace release binary | 1.01 GiB | 753.4ms | 1.34 GiB/s | 1 |
| `codex --offline --json` | Package wrapper | 1.01 GiB | 893.5ms | 1.13 GiB/s | 1 |
| `codex --offline --json` | Installed native binary | 1.01 GiB | 866.9ms | 1.16 GiB/s | 1 |
| `codex --offline --json` | Workspace release binary | 1.01 GiB | 847.2ms | 1.19 GiB/s | 1 |

Committed fixture performance

Committed small fixtures for stable PR-to-PR feedback and explicit Claude/Codex command coverage.

Fixtures: Claude apps/ccusage/test/fixtures/claude (0.00 MiB, 2 files), Codex apps/ccusage/test/fixtures/codex (0.00 MiB, 1 file)
Base runs the published ccusage package from pkg.pr.new, installed before measurement; PR runs rust/target/release/ccusage directly. Both run --offline --json, measured by hyperfine with 2 warmups and 7 runs.

| Command | Input | Base median | PR median | PR vs base | Base throughput | PR throughput |
| --- | --- | --- | --- | --- | --- | --- |
| `claude daily --offline --json` | 0.00 MiB | 32.8ms | 4.9ms | 6.67x | 0.05 MiB/s | 0.31 MiB/s |
| `claude session --offline --json` | 0.00 MiB | 32.6ms | 5.0ms | 6.49x | 0.05 MiB/s | 0.31 MiB/s |
| `codex daily --offline --json` | 0.00 MiB | 32.1ms | 4.7ms | 6.84x | 0.03 MiB/s | 0.18 MiB/s |
| `codex session --offline --json` | 0.00 MiB | 32.5ms | 4.5ms | 7.20x | 0.03 MiB/s | 0.19 MiB/s |

Large real-world-shaped fixture performance

Generated fixtures shaped from aggregate local log statistics: thousands of JSONL files, many small sessions, and a long tail of larger sessions. No real prompts, paths, or outputs are stored in the fixtures.

Fixtures: Claude /home/runner/work/_temp/ccusage-large-fixture (1.01 GiB, 2,597 files), Codex /home/runner/work/_temp/ccusage-large-codex-fixture (1.01 GiB, 2,597 files)
Base runs the published ccusage package from pkg.pr.new, installed before measurement; PR runs rust/target/release/ccusage directly. Both run --offline --json, measured by hyperfine with 0 warmups and 1 run.

| Command | Input | Base median | PR median | PR vs base | Base throughput | PR throughput |
| --- | --- | --- | --- | --- | --- | --- |
| `claude --offline --json` | 1.01 GiB | 750.1ms | 734.1ms | 1.02x | 1.34 GiB/s | 1.37 GiB/s |
| `codex --offline --json` | 1.01 GiB | 893.7ms | 847.9ms | 1.05x | 1.13 GiB/s | 1.19 GiB/s |

Artifact size

| Artifact | Base | PR | Delta | Ratio |
| --- | --- | --- | --- | --- |
| packed `ccusage-*.tgz` | 9.06 KiB | 9.06 KiB | +0.00 KiB | 1.00x |
| installed native package binary | 3160.24 KiB | 3160.24 KiB | +0.00 KiB | 1.00x |
| Rust release binary `rust/target/release/ccusage` | - | 2827.68 KiB | - | - |

Lower medians and smaller artifacts are better. CI runner noise still applies; use same-run ratios as directional PR feedback, not release guarantees.

@ryoppippi
Owner Author

@coderabbitai I pushed 665500d to merge the latest main and resolve the all-agent adapter test conflict.

The conflict resolution kept both the main branch's parallel loader test coverage and this PR's Codex non-cached input token regression import.

Validation after the merge:

  • direnv exec . pnpm run format
  • direnv exec . pnpm typecheck
  • direnv exec . pnpm run test
  • pre-push oxfmt + clippy

@coderabbitai review

@coderabbitai

coderabbitai Bot commented May 19, 2026

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Get the diff for commit 665500d
gh api repos/ryoppippi/ccusage/commits/665500d --jq '.files[] | "=== \(.filename) ===\n\(.patch)"'

Length of output: 12749


🏁 Script executed:

#!/bin/bash
# Also check overall PR status
gh pr view 1067 --repo ryoppippi/ccusage --json state,mergeable,mergeStateStatus,commits

Length of output: 2483


@ryoppippi: The conflict resolution in 665500d is clean. Here's what I verified:

  • Main branch parallel loader (`AgentLoadSpec`, `LoadedAgentRows`, `load_agent_rows_parallel`, `loads_agent_rows_concurrently` test + associated `std` imports) is fully preserved.
  • This PR's `use crate::CodexModelUsage` import in the test module is retained alongside the new parallel-loader test imports, so neither contribution is lost.
  • The Codex non-cached input token logic in `load_codex_rows` / `codex_group_row` is untouched by the merge.

The merge state shows UNSTABLE, which is likely just CI still running rather than a code issue. No concerns with the resolution itself.
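
For readers following along, the non-cached input token logic mentioned in this review can be illustrated with a short sketch. Per the PR summary, Codex `input_tokens` includes cached prompt tokens, and the fix subtracts `cachedInputTokens` before the value is displayed. The struct `UsageSketch`, the function `non_cached_input`, and the choice of `saturating_sub` are assumptions made for illustration; this is not the actual `load_codex_rows` / `codex_group_row` code.

```rust
/// Hypothetical stand-in for one Codex usage row (field names are assumptions).
#[derive(Debug, Clone, Copy)]
struct UsageSketch {
    /// Inclusive count: cached prompt tokens are part of this number.
    input_tokens: u64,
    /// Tokens served from the prompt cache.
    cached_input_tokens: u64,
}

/// Input tokens for display: inclusive input minus cached reads.
/// `saturating_sub` clamps at zero if a malformed event reports
/// more cached tokens than input tokens (an assumed safeguard).
fn non_cached_input(usage: &UsageSketch) -> u64 {
    usage.input_tokens.saturating_sub(usage.cached_input_tokens)
}

fn main() {
    let usage = UsageSketch { input_tokens: 1_000, cached_input_tokens: 900 };
    // The cached count stays visible on its own; only the Input value changes.
    println!("input (non-cached): {}", non_cached_input(&usage));
    println!("cached input:       {}", usage.cached_input_tokens);
}
```

In this sketch a row with 1,000 inclusive input tokens and 900 cached tokens displays 100 input tokens, while the cached count is reported unchanged in its own column.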

(ฅ•ᴗ•ฅ) 🐇

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@github-actions

ccusage performance comparison

PR SHA: 665500d25cef
Base SHA: ba9ad765df04

This compares the PR package against the configured base package on the same CI runner.

Package runner startup

Execution setup measures any pre-benchmark package materialization used by the execution benchmark. Bunx temp cache measures one bunx -p <url> ccusage --version run with an empty Bun install cache. Warm reuses that cache and reports the median of repeated runs.

| Package | SHA | Execution setup | Bunx temp cache | Bunx warm median | Warm samples |
| --- | --- | --- | --- | --- | --- |
| Base pkg.pr.new | ba9ad765df04 | 538.1ms | 604.5ms | 34.2ms | 3 |
| PR pkg.pr.new | 665500d25cef | 613.4ms | 855.0ms | 34.3ms | 3 |

Cached bunx execution performance

Runs the same large fixture through bunx -p <pkg.pr.new URL> ccusage after the Bun install cache has already been populated by the startup measurement. This separates cached package-runner execution from first-fetch package materialization.

Fixtures: Claude /home/runner/work/_temp/ccusage-large-fixture (1.01 GiB, 2,597 files), Codex /home/runner/work/_temp/ccusage-large-codex-fixture (1.01 GiB, 2,597 files)
Base package: ba9ad765df04; PR package: 665500d25cef. Both run through bunx -p <pkg.pr.new URL> ccusage using the warmed Bun install cache from package runner startup, measured by hyperfine with 0 warmups and 1 run.

| Command | Input | Base median | PR median | PR vs base | Base throughput | PR throughput |
| --- | --- | --- | --- | --- | --- | --- |
| `bunx -p <pkg> ccusage claude --offline --json` | 1.01 GiB | 687.6ms | 686.4ms | 1.00x | 1.46 GiB/s | 1.47 GiB/s |
| `bunx -p <pkg> ccusage codex --offline --json` | 1.01 GiB | 840.5ms | 842.7ms | 1.00x | 1.20 GiB/s | 1.19 GiB/s |

Package runtime diagnostics

Compares the PR package wrapper, the installed native optional dependency binary, and the workspace release binary on the same large fixture. This identifies whether slow package results come from JavaScript wrapper overhead, the published native binary build, or the Rust core itself.

Fixtures: Claude /home/runner/work/_temp/ccusage-large-fixture (1.01 GiB, 2,597 files), Codex /home/runner/work/_temp/ccusage-large-codex-fixture (1.01 GiB, 2,597 files)
All rows run --offline --json, measured by hyperfine with 0 warmups and 1 run. This isolates wrapper overhead from the installed native optional dependency and the workspace release binary built on the runner.

| Command | Runtime | Input | Median | Throughput | Samples |
| --- | --- | --- | --- | --- | --- |
| `claude --offline --json` | Package wrapper | 1.01 GiB | 672.6ms | 1.50 GiB/s | 1 |
| `claude --offline --json` | Installed native binary | 1.01 GiB | 650.8ms | 1.55 GiB/s | 1 |
| `claude --offline --json` | Workspace release binary | 1.01 GiB | 708.4ms | 1.42 GiB/s | 1 |
| `codex --offline --json` | Package wrapper | 1.01 GiB | 836.9ms | 1.20 GiB/s | 1 |
| `codex --offline --json` | Installed native binary | 1.01 GiB | 801.9ms | 1.26 GiB/s | 1 |
| `codex --offline --json` | Workspace release binary | 1.01 GiB | 838.8ms | 1.20 GiB/s | 1 |

Committed fixture performance

Committed small fixtures for stable PR-to-PR feedback and explicit Claude/Codex command coverage.

Fixtures: Claude apps/ccusage/test/fixtures/claude (0.00 MiB, 2 files), Codex apps/ccusage/test/fixtures/codex (0.00 MiB, 1 file)
Base and PR both run the published ccusage package from pkg.pr.new, installed before measurement. Both run --offline --json, measured by hyperfine with 2 warmups and 7 runs.

| Command | Input | Base median | PR median | PR vs base | Base throughput | PR throughput |
| --- | --- | --- | --- | --- | --- | --- |
| `claude daily --offline --json` | 0.00 MiB | 32.5ms | 32.0ms | 1.02x | 0.05 MiB/s | 0.05 MiB/s |
| `claude session --offline --json` | 0.00 MiB | 32.3ms | 32.3ms | 1.00x | 0.05 MiB/s | 0.05 MiB/s |
| `codex daily --offline --json` | 0.00 MiB | 31.9ms | 32.1ms | 0.99x | 0.03 MiB/s | 0.03 MiB/s |
| `codex session --offline --json` | 0.00 MiB | 31.8ms | 31.3ms | 1.02x | 0.03 MiB/s | 0.03 MiB/s |

Large real-world-shaped fixture performance

Generated fixtures shaped from aggregate local log statistics: thousands of JSONL files, many small sessions, and a long tail of larger sessions. No real prompts, paths, or outputs are stored in the fixtures.

Fixtures: Claude /home/runner/work/_temp/ccusage-large-fixture (1.01 GiB, 2,597 files), Codex /home/runner/work/_temp/ccusage-large-codex-fixture (1.01 GiB, 2,597 files)
Base and PR both run the published ccusage package from pkg.pr.new, installed before measurement. Both run --offline --json, measured by hyperfine with 0 warmups and 1 run.

| Command | Input | Base median | PR median | PR vs base | Base throughput | PR throughput |
| --- | --- | --- | --- | --- | --- | --- |
| `claude --offline --json` | 1.01 GiB | 687.7ms | 678.1ms | 1.01x | 1.46 GiB/s | 1.48 GiB/s |
| `codex --offline --json` | 1.01 GiB | 853.6ms | 877.2ms | 0.97x | 1.18 GiB/s | 1.15 GiB/s |

Artifact size

| Artifact | Base | PR | Delta | Ratio |
| --- | --- | --- | --- | --- |
| packed `ccusage-*.tgz` | 9.06 KiB | 9.05 KiB | -0.00 KiB | 1.00x |
| installed native package binary | 3160.24 KiB | 3160.24 KiB | +0.00 KiB | 1.00x |
| Rust release binary `rust/target/release/ccusage` | - | 2891.68 KiB | - | - |

Lower medians and smaller artifacts are better. CI runner noise still applies; use same-run ratios as directional PR feedback, not release guarantees.

@github-actions

ccusage performance comparison

PR SHA: 665500d25cef
Base SHA: ba9ad765df04

This compares the Rust PR release binary against the configured base package on the same CI runner.

Package runner startup

Execution setup measures any pre-benchmark package materialization used by the execution benchmark. Bunx temp cache measures one bunx -p <url> ccusage --version run with an empty Bun install cache. Warm reuses that cache and reports the median of repeated runs.

| Package | SHA | Execution setup | Bunx temp cache | Bunx warm median | Warm samples |
| --- | --- | --- | --- | --- | --- |
| Base pkg.pr.new | ba9ad765df04 | 813.7ms | 831.9ms | 32.9ms | 3 |
| PR pkg.pr.new | 665500d25cef | 807.7ms | 787.2ms | 33.1ms | 3 |

Cached bunx execution performance

Runs the same large fixture through bunx -p <pkg.pr.new URL> ccusage after the Bun install cache has already been populated by the startup measurement. This separates cached package-runner execution from first-fetch package materialization.

Fixtures: Claude /home/runner/work/_temp/ccusage-large-fixture (1.01 GiB, 2,597 files), Codex /home/runner/work/_temp/ccusage-large-codex-fixture (1.01 GiB, 2,597 files)
Base package: ba9ad765df04; PR package: 665500d25cef. Both run through bunx -p <pkg.pr.new URL> ccusage using the warmed Bun install cache from package runner startup, measured by hyperfine with 0 warmups and 1 run.

| Command | Input | Base median | PR median | PR vs base | Base throughput | PR throughput |
| --- | --- | --- | --- | --- | --- | --- |
| `bunx -p <pkg> ccusage claude --offline --json` | 1.01 GiB | 678.9ms | 682.8ms | 0.99x | 1.48 GiB/s | 1.47 GiB/s |
| `bunx -p <pkg> ccusage codex --offline --json` | 1.01 GiB | 833.9ms | 823.3ms | 1.01x | 1.21 GiB/s | 1.22 GiB/s |

Package runtime diagnostics

Compares the PR package wrapper, the installed native optional dependency binary, and the workspace release binary on the same large fixture. This identifies whether slow package results come from JavaScript wrapper overhead, the published native binary build, or the Rust core itself.

Fixtures: Claude /home/runner/work/_temp/ccusage-large-fixture (1.01 GiB, 2,597 files), Codex /home/runner/work/_temp/ccusage-large-codex-fixture (1.01 GiB, 2,597 files)
All rows run --offline --json, measured by hyperfine with 0 warmups and 1 run. This isolates wrapper overhead from the installed native optional dependency and the workspace release binary built on the runner.

| Command | Runtime | Input | Median | Throughput | Samples |
| --- | --- | --- | --- | --- | --- |
| `claude --offline --json` | Package wrapper | 1.01 GiB | 678.5ms | 1.48 GiB/s | 1 |
| `claude --offline --json` | Installed native binary | 1.01 GiB | 649.6ms | 1.55 GiB/s | 1 |
| `claude --offline --json` | Workspace release binary | 1.01 GiB | 707.9ms | 1.42 GiB/s | 1 |
| `codex --offline --json` | Package wrapper | 1.01 GiB | 812.7ms | 1.24 GiB/s | 1 |
| `codex --offline --json` | Installed native binary | 1.01 GiB | 783.3ms | 1.29 GiB/s | 1 |
| `codex --offline --json` | Workspace release binary | 1.01 GiB | 816.2ms | 1.23 GiB/s | 1 |

Committed fixture performance

Committed small fixtures for stable PR-to-PR feedback and explicit Claude/Codex command coverage.

Fixtures: Claude apps/ccusage/test/fixtures/claude (0.00 MiB, 2 files), Codex apps/ccusage/test/fixtures/codex (0.00 MiB, 1 file)
Base runs the published ccusage package from pkg.pr.new, installed before measurement; PR runs rust/target/release/ccusage directly. Both run --offline --json, measured by hyperfine with 2 warmups and 7 runs.

| Command | Input | Base median | PR median | PR vs base | Base throughput | PR throughput |
| --- | --- | --- | --- | --- | --- | --- |
| `claude daily --offline --json` | 0.00 MiB | 30.9ms | 4.6ms | 6.70x | 0.05 MiB/s | 0.34 MiB/s |
| `claude session --offline --json` | 0.00 MiB | 31.0ms | 4.4ms | 6.97x | 0.05 MiB/s | 0.35 MiB/s |
| `codex daily --offline --json` | 0.00 MiB | 31.1ms | 4.3ms | 7.18x | 0.03 MiB/s | 0.20 MiB/s |
| `codex session --offline --json` | 0.00 MiB | 30.6ms | 4.3ms | 7.15x | 0.03 MiB/s | 0.20 MiB/s |

Large real-world-shaped fixture performance

Generated fixtures shaped from aggregate local log statistics: thousands of JSONL files, many small sessions, and a long tail of larger sessions. No real prompts, paths, or outputs are stored in the fixtures.

Fixtures: Claude /home/runner/work/_temp/ccusage-large-fixture (1.01 GiB, 2,597 files), Codex /home/runner/work/_temp/ccusage-large-codex-fixture (1.01 GiB, 2,597 files)
Base runs the published ccusage package from pkg.pr.new, installed before measurement; PR runs rust/target/release/ccusage directly. Both run --offline --json, measured by hyperfine with 0 warmups and 1 run.

| Command | Input | Base median | PR median | PR vs base | Base throughput | PR throughput |
| --- | --- | --- | --- | --- | --- | --- |
| `claude --offline --json` | 1.01 GiB | 677.4ms | 710.4ms | 0.95x | 1.49 GiB/s | 1.42 GiB/s |
| `codex --offline --json` | 1.01 GiB | 831.6ms | 840.5ms | 0.99x | 1.21 GiB/s | 1.20 GiB/s |

Artifact size

| Artifact | Base | PR | Delta | Ratio |
| --- | --- | --- | --- | --- |
| packed `ccusage-*.tgz` | 9.06 KiB | 9.05 KiB | -0.00 KiB | 1.00x |
| installed native package binary | 3160.24 KiB | 3160.24 KiB | +0.00 KiB | 1.00x |
| Rust release binary `rust/target/release/ccusage` | - | 2891.68 KiB | - | - |

Lower medians and smaller artifacts are better. CI runner noise still applies; use same-run ratios as directional PR feedback, not release guarantees.

@ryoppippi ryoppippi marked this pull request as ready for review May 19, 2026 16:38
@ryoppippi ryoppippi merged commit bc5dc3c into main May 19, 2026
38 checks passed
@ryoppippi ryoppippi deleted the codex/fix-codex-input-token-display branch May 19, 2026 16:38


Development

Successfully merging this pull request may close these issues.

Input tokens tracking is not meaningful with mixed model usage from same provider
