fix(codex): sort usage files before loading by ryoppippi · Pull Request #1152 · ryoppippi/ccusage

ryoppippi · 2026-05-25T18:43:44Z

Sorts Codex session files after recursive discovery so event loading no longer depends on filesystem read_dir ordering.

This matches Claude usage file ordering and keeps single-threaded and parallel Codex event loading on the same deterministic baseline.
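The discover-then-sort pattern described above can be sketched as follows. This is a minimal illustration, not the loader's actual code: `collect_jsonl_files` and `discover_sorted` are hypothetical names, and filtering on a `.jsonl` extension is an assumption made for the example.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Recursively collect every `.jsonl` file under `dir` (hypothetical helper).
/// `read_dir` makes no ordering promise, so `out` ends up in arbitrary order.
fn collect_jsonl_files(dir: &Path, out: &mut Vec<PathBuf>) -> io::Result<()> {
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.is_dir() {
            collect_jsonl_files(&path, out)?;
        } else if path.extension().is_some_and(|ext| ext == "jsonl") {
            out.push(path);
        }
    }
    Ok(())
}

/// Discover files, then sort them so every caller sees the same order.
fn discover_sorted(root: &Path) -> io::Result<Vec<PathBuf>> {
    let mut files = Vec::new();
    collect_jsonl_files(root, &mut files)?;
    // Same sort call as the one quoted in the review of this PR.
    files.sort_by_cached_key(|path| path.to_string_lossy().into_owned());
    Ok(files)
}
```

Sorting once at the end of discovery means neither the single-threaded nor the parallel consumer needs to know how the directory tree was walked.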

Fixes #1105

Testing:

direnv exec . env -u CFLAGS -u CPPFLAGS -u LDFLAGS cargo test --manifest-path rust/Cargo.toml -p ccusage --bin ccusage tests::loads_codex_token_count_events_in_parallel -- --exact
direnv exec . pnpm run format
direnv exec . env -u CFLAGS -u CPPFLAGS -u LDFLAGS cargo test --manifest-path rust/Cargo.toml --workspace
direnv exec . env -u CFLAGS -u CPPFLAGS -u LDFLAGS cargo clippy --manifest-path rust/Cargo.toml --workspace --all-targets -- -D warnings

Summary by cubic

Sort Codex usage files by path before loading to remove dependence on filesystem order and match the Claude loader, fixing #1105. Event processing is now deterministic across single-threaded and parallel modes.

Written for commit 81969e4. Summary will update on new commits.

Summary by CodeRabbit

Bug Fixes
- Made event loading deterministic by processing Codex usage files in a consistent order.

Sort Codex session files after recursive discovery so single-threaded and parallel event loading operate from a stable path order instead of filesystem read_dir order. This matches the Claude loader ordering and makes the existing parallel loader regression test deterministic without changing token parsing, dedupe keys, or aggregation semantics. Fixes #1105

coderabbitai · 2026-05-25T18:43:55Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e1f6b65a-eaaf-4df5-85c3-f42bb44b72c9

📥 Commits

Reviewing files that changed from the base of the PR and between bf84d4b and 81969e4.

📒 Files selected for processing (1)

rust/crates/ccusage/src/adapter/codex/loader.rs

📝 Walkthrough

This PR fixes nondeterministic Codex session file ordering by adding an explicit sort on collected file paths before event loading. The change aligns the Codex loader with the Claude loader's behavior and resolves test flakiness caused by filesystem read order variation.

Changes

Deterministic File Ordering

Layer / File(s)	Summary
File sorting before event loading `rust/crates/ccusage/src/adapter/codex/loader.rs`	Collected Codex session files are sorted by string path representation before single-threaded or parallel event loading, ensuring deterministic output order prior to deduplication.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

Possibly related PRs

ryoppippi/ccusage#989: Both PRs address Codex session aggregation correctness—this PR makes file load order deterministic before deduplication, while the related PR adds global token-usage event fingerprint deduplication.

Poem

🐰 A sorted path through session trees,
No more chaos in the breeze,
Files align in order true,
Tests pass red and tests pass blue,
Deterministic victory! 🎉

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately describes the main change: sorting Codex usage files before loading to ensure deterministic file ordering.
Linked Issues check	✅ Passed	The code change directly addresses issue `#1105` by sorting collected Codex session files before loading, making file ordering deterministic and resolving the test failure.
Out of Scope Changes check	✅ Passed	The single-line change is narrowly scoped to sorting files before loading, directly addressing the deterministic ordering issue without introducing unrelated modifications.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

ryoppippi · 2026-05-25T18:43:55Z

@coderabbitai review

@cubic-dev-ai review

cubic-dev-ai · 2026-05-25T18:44:01Z

@coderabbitai review

@cubic-dev-ai review

@ryoppippi I have started the AI code review. It will take a few minutes to complete.

cloudflare-workers-and-pages · 2026-05-25T18:44:17Z

Deploying with Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

Status	Name	Latest Commit	Preview URL	Updated (UTC)
✅ Deployment successful! View logs	ccusage-guide	`81969e4`	Commit Preview URL Branch Preview URL	May 25 2026, 06:44 PM

coderabbitai · 2026-05-25T18:45:09Z

@ryoppippi: This is a clean, well-scoped fix. Here's the full review:

Review: `fix(codex): sort usage files before loading`

Overall: ✅ Correct and ready to merge.

What the change does

files.sort_by_cached_key(|path| path.to_string_lossy().into_owned());

collect_usage_files uses read_dir under the hood, and read_dir makes no guarantee about the order of the entries it returns: the order depends on the platform and filesystem. Sorting before dispatch ensures both the single-threaded path (flat_map in order) and the parallel path (chunk_file_indexes_by_size → index-reassembly) see a stable, reproducible file sequence. This also keeps dedupe_codex_events deterministic when the same event appears across files.
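To see why file order can leak into deduplicated output, here is a small sketch. It assumes a first-occurrence-wins policy, which the PR does not confirm for dedupe_codex_events; the `Event` type and `dedupe_first_wins` are hypothetical and exist only for illustration.

```rust
use std::collections::HashSet;

/// Hypothetical event: `key` identifies the logical event,
/// `source` records which file it was read from.
#[derive(Debug, Clone, PartialEq)]
struct Event {
    key: String,
    source: String,
}

/// Keep the first occurrence of each key, preserving input order.
/// Assumption: first-wins; the real dedupe logic may differ.
fn dedupe_first_wins(events: Vec<Event>) -> Vec<Event> {
    let mut seen = HashSet::new();
    events
        .into_iter()
        // `insert` returns false when the key was already present.
        .filter(|event| seen.insert(event.key.clone()))
        .collect()
}
```

Under this policy, feeding the same two duplicate events in opposite orders keeps a different `source` each time, so any field that differs between copies makes the result depend on which file was loaded first.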

Implementation notes

- sort_by_cached_key is the right choice: the key (String from to_string_lossy().into_owned()) is computed at most once per element and cached, avoiding a heap allocation per comparison. ✅
- into_owned() is necessary because to_string_lossy() returns Cow<'_, str>, which may borrow from the path; the cached key has to be an owned value. ✅
- Consistency with other loaders: Droid and Codebuff use files.sort(), which relies on PathBuf's Ord and compares paths component by component rather than as raw strings. The two orders agree for most paths but can differ when one directory name is a prefix of a sibling's name and the next character sorts below the separator (for example s/ and s.bak/). Either order is deterministic, which is what this fix needs, and the explicit string sort here is arguably more readable.
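The caching behaviour mentioned in these notes can be observed directly by counting how often the key closure runs. This is a sketch for illustration only; `sort_and_count_key_calls` is a hypothetical helper, not part of the PR.

```rust
use std::cell::Cell;
use std::path::PathBuf;

/// Sort paths by their lossy string form and report how many times
/// the key closure was invoked (hypothetical helper).
fn sort_and_count_key_calls(files: &mut [PathBuf]) -> usize {
    let calls = Cell::new(0usize);
    files.sort_by_cached_key(|path| {
        // Each invocation allocates one String for the key.
        calls.set(calls.get() + 1);
        path.to_string_lossy().into_owned()
    });
    calls.get()
}
```

The standard library documents that sort_by_cached_key calls the key function at most once per element, so the count never exceeds the slice length. With sort_by_key the closure would run on every comparison, which is where the per-comparison allocation would come from.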

Minor observation (non-blocking)

files.sort() would be a marginally lighter alternative (avoids per-element String allocation). PathBuf: Ord compares paths component by component, so it is just as deterministic, though its order is not guaranteed to match the string sort when sibling names share a prefix:

files.sort();

But the current approach is correct and idiomatic Rust, so this is non-blocking.
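One caveat when comparing the two sorts: `PathBuf`'s `Ord` compares path components, while the string sort compares bytes, and the two can disagree. The sketch below uses hypothetical helper names and made-up directory names (`s` and `s.bak`) to show one such case; whether it matters for real Codex session trees depends on the directory names involved.

```rust
use std::path::PathBuf;

/// Order produced by `files.sort()`: `PathBuf`'s `Ord`, component by component.
fn sort_by_path_ord(mut files: Vec<PathBuf>) -> Vec<PathBuf> {
    files.sort();
    files
}

/// Order produced by the PR's call: compare each path's lossy string form.
fn sort_by_lossy_string(mut files: Vec<PathBuf>) -> Vec<PathBuf> {
    files.sort_by_cached_key(|path| path.to_string_lossy().into_owned());
    files
}
```

For the inputs `s/x.jsonl` and `s.bak/x.jsonl`, the component-wise order puts `s/x.jsonl` first because the component `s` is a prefix of `s.bak`, while the string order puts `s.bak/x.jsonl` first because `.` sorts below `/`. Both orders are stable from run to run, so either one satisfies the goal of this PR.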

Test coverage

The referenced integration test (loads_codex_token_count_events_in_parallel) exercises the parallel path. One thing worth noting is that there's no multi-file test asserting that events appear in a specific deterministic order after the sort — but the existing test is sufficient to catch regressions in the parallel loading path, and the deduplication logic provides a natural correctness check.
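A multi-file ordering check of the kind described here could look roughly like the sketch below. Everything in it is an assumption for illustration: `load_file` stands in for real parsing, and the chunking is by file count rather than by size as in chunk_file_indexes_by_size.

```rust
use std::thread;

/// Hypothetical stand-in for parsing one session file into events.
fn load_file(name: &str) -> Vec<String> {
    (0..2).map(|i| format!("{name}#{i}")).collect()
}

/// Load files one after another, in the given order.
fn load_sequential(files: &[String]) -> Vec<String> {
    files.iter().flat_map(|f| load_file(f)).collect()
}

/// Load files in chunks on scoped threads, then reassemble in chunk order.
fn load_parallel(files: &[String], chunk_size: usize) -> Vec<String> {
    thread::scope(|scope| {
        // Spawn one worker per chunk; collecting forces all spawns up front.
        let handles: Vec<_> = files
            .chunks(chunk_size.max(1))
            .map(|chunk| scope.spawn(move || load_sequential(chunk)))
            .collect();
        // Joining in spawn order restores input order regardless of
        // which worker happens to finish first.
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("worker panicked"))
            .collect()
    })
}
```

A test built on this would sort the file list once, then assert that `load_parallel` returns exactly the same event sequence as `load_sequential` for several chunk sizes.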

Verdict: LGTM 🐇

cubic-dev-ai

No issues found across 1 file

pkg-pr-new · 2026-05-25T18:56:36Z

ccusage

npx https://pkg.pr.new/ryoppippi/ccusage@1152

@ccusage/ccusage-darwin-arm64

npx https://pkg.pr.new/ryoppippi/ccusage/@ccusage/ccusage-darwin-arm64@1152

@ccusage/ccusage-darwin-x64

npx https://pkg.pr.new/ryoppippi/ccusage/@ccusage/ccusage-darwin-x64@1152

@ccusage/ccusage-linux-arm64

npx https://pkg.pr.new/ryoppippi/ccusage/@ccusage/ccusage-linux-arm64@1152

@ccusage/ccusage-linux-x64

npx https://pkg.pr.new/ryoppippi/ccusage/@ccusage/ccusage-linux-x64@1152

@ccusage/ccusage-win32-arm64

npx https://pkg.pr.new/ryoppippi/ccusage/@ccusage/ccusage-win32-arm64@1152

@ccusage/ccusage-win32-x64

npx https://pkg.pr.new/ryoppippi/ccusage/@ccusage/ccusage-win32-x64@1152

commit: 81969e4

github-actions · 2026-05-25T19:00:40Z

ccusage performance comparison

PR SHA: 81969e48b8a5
Base SHA: bf84d4bc1ce8

This compares the Rust PR release binary against the configured base package on the same CI runner.

Package runner startup

Execution setup measures any pre-benchmark package materialization used by the execution benchmark. Bunx temp cache measures one bunx -p <url> ccusage --version run with an empty Bun install cache. Warm reuses that cache and reports the median of repeated runs.

Package	SHA	Execution setup	Bunx temp cache	Bunx warm median	Warm samples
Base pkg.pr.new	`bf84d4bc1ce8`	951.5ms	418.5ms	34.6ms	3
PR pkg.pr.new	`81969e48b8a5`	421.6ms	454.2ms	35.5ms	3

Cached bunx execution performance

Runs the same large fixture through bunx -p <pkg.pr.new URL> ccusage after the Bun install cache has already been populated by the startup measurement. This separates cached package-runner execution from first-fetch package materialization.

Fixtures: Claude /home/runner/work/_temp/ccusage-large-fixture (1.01 GiB, 2,597 files), Codex /home/runner/work/_temp/ccusage-large-codex-fixture (1.01 GiB, 2,597 files)
Base package: bf84d4bc1ce8; PR package: 81969e48b8a5. Both run through bunx -p <pkg.pr.new URL> ccusage using the warmed Bun install cache from package runner startup, measured by hyperfine with 0 warmups and 1 run.
Peak RSS is measured separately with /usr/bin/time using 1 run. Lower RSS ratios are better.

Command	Input	Base median	PR median	PR vs base	Base peak RSS	PR peak RSS	PR/base RSS	Base throughput	PR throughput
`bunx -p <pkg> ccusage claude --offline --json`	1.01 GiB	503.7ms	500.7ms	1.01x	267.20 MiB	251.45 MiB	0.94x	2.00 GiB/s	2.01 GiB/s
`bunx -p <pkg> ccusage codex --offline --json`	1.01 GiB	383.5ms	363.1ms	1.06x	66.33 MiB	54.08 MiB	0.82x	2.63 GiB/s	2.77 GiB/s

Package runtime diagnostics

Compares the PR package wrapper, the installed native optional dependency binary, and the workspace release binary on the same large fixture. This identifies whether slow package results come from JavaScript wrapper overhead, the published native binary build, or the Rust core itself.

Fixtures: Claude /home/runner/work/_temp/ccusage-large-fixture (1.01 GiB, 2,597 files), Codex /home/runner/work/_temp/ccusage-large-codex-fixture (1.01 GiB, 2,597 files)
All rows run --offline --json, measured by hyperfine with 0 warmups and 1 run. This isolates wrapper overhead from the installed native optional dependency and the workspace release binary built on the runner.

Command	Runtime	Input	Median	Throughput	Samples
`claude --offline --json`	Package wrapper	1.01 GiB	499.0ms	2.02 GiB/s	1
`claude --offline --json`	Installed native binary	1.01 GiB	467.1ms	2.16 GiB/s	1
`codex --offline --json`	Package wrapper	1.01 GiB	363.6ms	2.77 GiB/s	1
`codex --offline --json`	Installed native binary	1.01 GiB	330.4ms	3.05 GiB/s	1

Committed fixture performance

Committed small fixtures for stable PR-to-PR feedback and explicit Claude/Codex command coverage.

Fixtures: Claude apps/ccusage/test/fixtures/claude (0.00 MiB, 2 files), Codex apps/ccusage/test/fixtures/codex (0.00 MiB, 1 file)
Base runs the published ccusage package from pkg.pr.new, installed before measurement; PR runs rust/target/release/ccusage directly. Both run --offline --json, measured by hyperfine with 2 warmups and 7 runs.
Peak RSS is measured separately with /usr/bin/time using 1 run. Lower RSS ratios are better.

Command	Base median	PR median	PR vs base	Base peak RSS	PR peak RSS	PR/base RSS	Base throughput	PR throughput
`claude daily --offline --json`	32.3ms	4.4ms	7.38x	-	2.83 MiB	-	0.05 MiB/s	0.35 MiB/s
`claude session --offline --json`	32.5ms	4.3ms	7.50x	-	2.83 MiB	-	0.05 MiB/s	0.36 MiB/s
`codex daily --offline --json`	31.9ms	4.0ms	7.97x	43.61 MiB	2.83 MiB	0.06x	0.03 MiB/s	0.21 MiB/s
`codex session --offline --json`	31.9ms	4.1ms	7.76x	-	2.83 MiB	-	0.03 MiB/s	0.21 MiB/s

Large real-world-shaped fixture performance

Generated fixtures shaped from aggregate local log statistics: thousands of JSONL files, many small sessions, and a long tail of larger sessions. No real prompts, paths, or outputs are stored in the fixtures.

Fixtures: Claude /home/runner/work/_temp/ccusage-large-fixture (1.01 GiB, 2,597 files), Codex /home/runner/work/_temp/ccusage-large-codex-fixture (1.01 GiB, 2,597 files)
Base runs the published ccusage package from pkg.pr.new, installed before measurement; PR runs rust/target/release/ccusage directly. Both run --offline --json, measured by hyperfine with 0 warmups and 1 run.
Peak RSS is measured separately with /usr/bin/time using 1 run. Lower RSS ratios are better.

Command	Input	Base median	PR median	PR vs base	Base peak RSS	PR peak RSS	PR/base RSS	Base throughput	PR throughput
`claude --offline --json`	1.01 GiB	497.4ms	467.6ms	1.06x	249.58 MiB	265.08 MiB	1.06x	2.02 GiB/s	2.15 GiB/s
`codex --offline --json`	1.01 GiB	360.9ms	325.4ms	1.11x	60.20 MiB	55.20 MiB	0.92x	2.79 GiB/s	3.09 GiB/s

Artifact size

Artifact	Base	PR	Delta	Ratio
packed `ccusage-*.tgz`	14.25 KiB	14.25 KiB	-0.00 KiB	1.00x
installed native package binary	3289.49 KiB	3289.49 KiB	+0.00 KiB	1.00x

Lower medians and smaller artifacts are better. CI runner noise still applies; use same-run ratios as directional PR feedback, not release guarantees.
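For readers checking the ratio columns by hand, the figures in the large-fixture rows are consistent with the arithmetic sketched below. This is an inference from the reported numbers, not the benchmark script itself, and the function names are hypothetical.

```rust
/// Speedup as the "PR vs base" column appears to report it:
/// base median divided by PR median, so values above 1 favour the PR.
fn speedup(base_ms: f64, pr_ms: f64) -> f64 {
    base_ms / pr_ms
}

/// Memory ratio as labelled in the "PR/base RSS" column:
/// PR peak RSS divided by base peak RSS, so values below 1 favour the PR.
fn rss_ratio(base_mib: f64, pr_mib: f64) -> f64 {
    pr_mib / base_mib
}

/// Format a ratio with two decimals and an `x` suffix, as in the tables.
fn format_ratio(ratio: f64) -> String {
    format!("{ratio:.2}x")
}
```

With the cached bunx Codex row above, 383.5ms over 363.1ms formats as 1.06x and 54.08 MiB over 66.33 MiB formats as 0.82x, matching the table. The small-fixture rows do not always reproduce exactly from the displayed medians (32.3ms over 4.4ms gives about 7.34x against a reported 7.38x), which suggests the ratios are computed before the medians are rounded for display.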

github-actions · 2026-05-25T19:00:50Z

ccusage performance comparison

PR SHA: 81969e48b8a5
Base SHA: bf84d4bc1ce8

This compares the PR package against the configured base package on the same CI runner.

Package runner startup

Execution setup measures any pre-benchmark package materialization used by the execution benchmark. Bunx temp cache measures one bunx -p <url> ccusage --version run with an empty Bun install cache. Warm reuses that cache and reports the median of repeated runs.

Package	SHA	Execution setup	Bunx temp cache	Bunx warm median	Warm samples
Base pkg.pr.new	`bf84d4bc1ce8`	421.4ms	434.7ms	36.3ms	3
PR pkg.pr.new	`81969e48b8a5`	699.2ms	591.5ms	35.8ms	3

Cached bunx execution performance

Runs the same large fixture through bunx -p <pkg.pr.new URL> ccusage after the Bun install cache has already been populated by the startup measurement. This separates cached package-runner execution from first-fetch package materialization.

Fixtures: Claude /home/runner/work/_temp/ccusage-large-fixture (1.01 GiB, 2,597 files), Codex /home/runner/work/_temp/ccusage-large-codex-fixture (1.01 GiB, 2,597 files)
Base package: bf84d4bc1ce8; PR package: 81969e48b8a5. Both run through bunx -p <pkg.pr.new URL> ccusage using the warmed Bun install cache from package runner startup, measured by hyperfine with 0 warmups and 1 run.
Peak RSS is measured separately with /usr/bin/time using 1 run. Lower RSS ratios are better.

Command	Input	Base median	PR median	PR vs base	Base peak RSS	PR peak RSS	PR/base RSS	Base throughput	PR throughput
`bunx -p <pkg> ccusage claude --offline --json`	1.01 GiB	514.4ms	505.5ms	1.02x	259.70 MiB	262.20 MiB	1.01x	1.96 GiB/s	1.99 GiB/s
`bunx -p <pkg> ccusage codex --offline --json`	1.01 GiB	374.0ms	378.7ms	0.99x	65.70 MiB	57.95 MiB	0.88x	2.69 GiB/s	2.66 GiB/s

Package runtime diagnostics

Compares the PR package wrapper, the installed native optional dependency binary, and the workspace release binary on the same large fixture. This identifies whether slow package results come from JavaScript wrapper overhead, the published native binary build, or the Rust core itself.

Fixtures: Claude /home/runner/work/_temp/ccusage-large-fixture (1.01 GiB, 2,597 files), Codex /home/runner/work/_temp/ccusage-large-codex-fixture (1.01 GiB, 2,597 files)
All rows run --offline --json, measured by hyperfine with 0 warmups and 1 run. This isolates wrapper overhead from the installed native optional dependency and the workspace release binary built on the runner.

Command	Runtime	Input	Median	Throughput	Samples
`claude --offline --json`	Package wrapper	1.01 GiB	496.4ms	2.03 GiB/s	1
`claude --offline --json`	Installed native binary	1.01 GiB	480.3ms	2.10 GiB/s	1
`codex --offline --json`	Package wrapper	1.01 GiB	369.9ms	2.72 GiB/s	1
`codex --offline --json`	Installed native binary	1.01 GiB	343.2ms	2.93 GiB/s	1

Committed fixture performance

Committed small fixtures for stable PR-to-PR feedback and explicit Claude/Codex command coverage.

Fixtures: Claude apps/ccusage/test/fixtures/claude (0.00 MiB, 2 files), Codex apps/ccusage/test/fixtures/codex (0.00 MiB, 1 file)
Base and PR both run the published ccusage package from pkg.pr.new, installed before measurement. Both run --offline --json, measured by hyperfine with 2 warmups and 7 runs.
Peak RSS is measured separately with /usr/bin/time using 1 run. Lower RSS ratios are better.

Command	Base median	PR median	PR vs base	Base peak RSS	PR peak RSS	PR/base RSS	Base throughput	PR throughput
`claude daily --offline --json`	32.7ms	32.8ms	1.00x	-	-	-	0.05 MiB/s	0.05 MiB/s
`claude session --offline --json`	33.0ms	32.3ms	1.02x	43.61 MiB	-	-	0.05 MiB/s	0.05 MiB/s
`codex daily --offline --json`	32.7ms	32.6ms	1.00x	43.48 MiB	43.48 MiB	1.00x	0.03 MiB/s	0.03 MiB/s
`codex session --offline --json`	32.8ms	32.3ms	1.02x	-	43.73 MiB	-	0.03 MiB/s	0.03 MiB/s

Large real-world-shaped fixture performance

Generated fixtures shaped from aggregate local log statistics: thousands of JSONL files, many small sessions, and a long tail of larger sessions. No real prompts, paths, or outputs are stored in the fixtures.

Fixtures: Claude /home/runner/work/_temp/ccusage-large-fixture (1.01 GiB, 2,597 files), Codex /home/runner/work/_temp/ccusage-large-codex-fixture (1.01 GiB, 2,597 files)
Base and PR both run the published ccusage package from pkg.pr.new, installed before measurement. Both run --offline --json, measured by hyperfine with 0 warmups and 1 run.
Peak RSS is measured separately with /usr/bin/time using 1 run. Lower RSS ratios are better.

Command	Input	Base median	PR median	PR vs base	Base peak RSS	PR peak RSS	PR/base RSS	Base throughput	PR throughput
`claude --offline --json`	1.01 GiB	502.5ms	496.4ms	1.01x	268.95 MiB	272.83 MiB	1.01x	2.00 GiB/s	2.03 GiB/s
`codex --offline --json`	1.01 GiB	367.9ms	364.9ms	1.01x	63.95 MiB	63.70 MiB	1.00x	2.74 GiB/s	2.76 GiB/s

Artifact size

Artifact	Base	PR	Delta	Ratio
packed `ccusage-*.tgz`	14.25 KiB	14.25 KiB	-0.00 KiB	1.00x
installed native package binary	3289.49 KiB	3289.49 KiB	+0.00 KiB	1.00x

Lower medians and smaller artifacts are better. CI runner noise still applies; use same-run ratios as directional PR feedback, not release guarantees.

cubic-dev-ai Bot reviewed May 25, 2026

ryoppippi merged commit 63cf17e into main May 25, 2026
41 checks passed

ryoppippi deleted the codex/deterministic-codex-loading branch May 25, 2026 19:01

This was referenced May 25, 2026:

fix(codex): dedupe copied branch history #1156 (Merged)
feat(codex): include archived sessions #1176 (Open)
