
[claude] benchmarks v3 cleanup#7745

Merged
connortsui20 merged 4 commits into ct/benchmarks-v3 from claude/benchmarks-v3-cleanup
May 1, 2026

Conversation

@connortsui20
Contributor

Summary

Closes: #000

Testing

claude added 4 commits May 1, 2026 02:38
…ult" comments

The four UI iterations landed on a layout where every group's
disclosure renders collapsed by default; only the *payload* for the
first group is inlined into the cold HTML. Several comments still
described the older "first group is opened by default" model. Update
them in api.rs (GROUP_ORDER), html.rs (LANDING_INLINE_N, UiQuery,
LandingGroup, the per-iteration shell-vs-inline branch),
chart-init.js (lazy-fetch comment), and the matching web_ui test
messages so a future reader doesn't chase a behaviour the code no
longer implements. No observable behaviour changes.

Signed-off-by: Claude <noreply@anthropic.com>
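The layout the comments now describe can be sketched roughly like this. This is an illustrative stand-in, not the real `html.rs`: the function name, the field shapes, and the `LANDING_INLINE_N = 1` value are all assumptions for the sketch.

```rust
// Hypothetical sketch of the shell-vs-inline branch: every group's
// disclosure renders collapsed (no `open` attribute anywhere), and only
// the first LANDING_INLINE_N groups get their payload inlined into the
// cold HTML; the rest are empty shells that chart-init.js lazy-fetches.
const LANDING_INLINE_N: usize = 1; // assumed value for illustration

fn render_landing(groups: &[(&str, &str)]) -> String {
    let mut html = String::new();
    for (i, (name, payload)) in groups.iter().enumerate() {
        // Collapsed by default: the <details> element carries no `open`.
        html.push_str(&format!("<details><summary>{name}</summary>"));
        if i < LANDING_INLINE_N {
            // Inline the payload for the first group only.
            html.push_str(&format!("<div class=\"chart\">{payload}</div>"));
        } else {
            // Empty shell; the client fetches the payload on first expand.
            html.push_str(&format!(
                "<div class=\"chart\" data-group=\"{name}\"></div>"
            ));
        }
        html.push_str("</details>");
    }
    html
}
```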
…rt wrapper, addressed orchestrator notes

`ChartQuery` accepted `?y=` and `?mode=` for an old client deep-link
shape; the live UI ships rendering hints purely client-side
(chart-init.js owns the slider, Y-axis, and filter chip state) and
nothing reads `q.y` / `q.mode`. Drop the fields and tighten the doc to
match.

`api::collect_chart` was a thin alias for `chart_payload` left for "old
callers"; the only in-tree caller is one site in `html.rs::chart_page`.
Inline it and delete the wrapper. The crate is explicitly outside the
workspace public-api lockfile set, so this is safe to remove.

`V3_ENGINES` and `V3_FORMATS` carried `ORCHESTRATOR NOTE: confirm
against vortex-bench's Engine enum / Format::name()` markers. Both
lists already mirror the live emitter (`vortex-bench/src/lib.rs`'s
`Engine` and `Format::name()`), so the notes are addressed; replace
them with a one-line "mirrors X" reference.

No observable behaviour changes. `cargo build -p vortex-bench-server
-p vortex-bench-migrate` is clean.

Signed-off-by: Claude <noreply@anthropic.com>
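The `ChartQuery` trim above has a nice property worth spelling out: dropping the fields does not break old `?y=`/`?mode=` deep links, because unknown query keys are simply ignored during deserialization. A minimal stdlib-only sketch (the `group` field name and the hand-rolled parser are assumptions; the real crate presumably uses serde):

```rust
// Hypothetical sketch: the server only models the keys it still reads,
// and unknown keys such as the retired `y` and `mode` fall through
// untouched, so stale deep links keep resolving.
#[derive(Debug, Default, PartialEq)]
pub struct ChartQuery {
    /// The only key the server still reads (assumed name).
    pub group: Option<String>,
}

pub fn parse_chart_query(qs: &str) -> ChartQuery {
    let mut q = ChartQuery::default();
    for pair in qs.split('&') {
        match pair.split_once('=') {
            Some(("group", v)) => q.group = Some(v.to_string()),
            // `y=`, `mode=`, and anything else are ignored.
            _ => {}
        }
    }
    q
}
```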
CLAUDE.md requires every public API definition to carry a doc
comment. Sweep the v3 server and migrator for `pub` items that
landed without one and add a one-liner that says what each is for —
no essays. Items touched:

- `vortex-bench-server::api`: GroupsResponse, Group, ChartLink,
  ChartResponse, CommitPoint, HealthResponse, RowCounts (every wire
  shape on the public read API now has a doc).
- `vortex-bench-migrate::lib`: per-`pub mod` summaries on classifier,
  commits, migrate, source, v2, verify.
- `vortex-bench-migrate::v2::V2Person`: author/committer block.
- `vortex-bench-migrate::verify::ChartDiff`: per-group chart-count
  divergence row.

No code changes. `cargo build -p vortex-bench-server -p
vortex-bench-migrate` is clean.

Signed-off-by: Claude <noreply@anthropic.com>
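The shape of the one-liners added is roughly the following. `CommitPoint` is taken from the list above, but its fields here are invented for the sketch; only the doc-comment style is the point.

```rust
// Illustrative only: field names and types are assumptions, not the
// real vortex-bench-server definition.
/// One commit's data point on a chart's time axis.
pub struct CommitPoint {
    /// Abbreviated commit SHA the point was measured at.
    pub sha: String,
    /// Measured value for the benchmark at this commit.
    pub value: f64,
}
```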
CLAUDE.md asks tests to return `Result<()>` and use `?` instead of
`unwrap`. Two test functions and one helper still leaned on
`unwrap()`:

- `migrate::tests::flush_all_does_not_overcount_on_failure` plus its
  `open_db_without` helper.
- `classifier::tests::random_access_bins_dataset_pattern`.
- `read_routes_serve_after_ingest` in
  `vortex-bench-server/tests/ingest.rs`.

The `axum::serve(listener, app).await.unwrap()` calls inside the
spawned background-server closures stay — they're inside
`tokio::spawn`'s unit-returning future, so `?` cannot propagate, and
panicking on a setup failure is the right shape there.

`cargo test -p vortex-bench-server -p vortex-bench-migrate` is green;
no snapshot rewrites needed.

Signed-off-by: Claude <noreply@anthropic.com>
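Both halves of that rule can be shown in a few lines. This sketch uses `std::thread` as a stand-in for `tokio::spawn` (and string parsing as a stand-in for test setup), since the shape of the argument is the same: `?` works where the function returns `Result`, and cannot work inside a unit-returning spawned closure.

```rust
use std::error::Error;
use std::thread;

// Test-shaped function per the CLAUDE.md rule: return Result and let
// `?` propagate failures instead of calling `unwrap()`.
fn parses_without_unwrap() -> Result<(), Box<dyn Error>> {
    let n: u32 = "42".parse()?; // stand-in for e.g. opening the test DB
    assert_eq!(n, 42);
    Ok(())
}

// A spawned closure returns `()`, so `?` has nowhere to propagate;
// panicking on a setup failure via `expect` is the right shape here,
// which is the same reasoning the PR applies to keeping the
// `axum::serve(listener, app).await.unwrap()` calls inside tokio::spawn.
fn spawn_background_setup() -> thread::JoinHandle<()> {
    thread::spawn(|| {
        let port: u16 = "8080".parse().expect("background setup failed");
        let _ = port;
    })
}
```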
@connortsui20 connortsui20 merged commit 6b76221 into ct/benchmarks-v3 May 1, 2026
55 of 58 checks passed
@connortsui20 connortsui20 deleted the claude/benchmarks-v3-cleanup branch May 1, 2026 02:52

codspeed-hq Bot commented May 1, 2026

Merging this PR will degrade performance by 61.46%

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

⚠️ Different runtime environments detected

Some benchmarks with significant performance changes were compared across different runtime environments, which may affect the accuracy of the results.

Open the report in CodSpeed to investigate

⚡ 10 improved benchmarks
❌ 14 regressed benchmarks
✅ 1174 untouched benchmarks

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

| Mode | Benchmark | BASE | HEAD | Efficiency |
|------|-----------|------|------|------------|
| WallTime | dict[10M_u64_values_u8_codes] | 251.3 µs | 216.2 µs | +16.21% |
| WallTime | dict[10M_u32_values_u16_codes] | 182.2 µs | 146.6 µs | +24.29% |
| Simulation | chunked_opt_bool_into_canonical[(100, 100)] | 259.6 µs | 223.1 µs | +16.38% |
| Simulation | chunked_opt_bool_into_canonical[(10, 1000)] | 1.4 ms | 1 ms | +35.77% |
| Simulation | chunked_opt_bool_into_canonical[(1000, 10)] | 68.4 µs | 62 µs | +10.34% |
| Simulation | chunked_varbinview_opt_into_canonical[(10, 1000)] | 2.9 ms | 2.5 ms | +16.18% |
| Simulation | chunked_varbinview_into_canonical[(10, 1000)] | 2.1 ms | 1.7 ms | +20.56% |
| Simulation | new_bp_prim_test_between[i16, 32768] | 134.2 µs | 121.5 µs | +10.44% |
| Simulation | new_bp_prim_test_between[i64, 32768] | 236.2 µs | 176.6 µs | +33.77% |
| Simulation | decompress[alp_for_bp_f64] | 1.8 ms | 2.8 ms | -36.92% |
| Simulation | decompress[datetime_for_bp] | 2.4 ms | 1.6 ms | +45.58% |
| Simulation | alp_rd_decompress_f64 | 1.1 ms | 2.4 ms | -54.94% |
| Simulation | decompress_rd[f32, (10000, 0.01)] | 82.2 µs | 165.8 µs | -50.44% |
| Simulation | decompress_rd[f32, (100000, 0.0)] | 495.9 µs | 1,286.7 µs | -61.46% |
| Simulation | decompress_rd[f32, (10000, 0.0)] | 86 µs | 165.8 µs | -48.14% |
| Simulation | decompress_rd[f32, (10000, 0.1)] | 82.1 µs | 165.5 µs | -50.38% |
| Simulation | decompress_rd[f32, (100000, 0.01)] | 582.8 µs | 1,374.8 µs | -57.61% |
| Simulation | decompress_rd[f64, (10000, 0.0)] | 122.4 µs | 257.7 µs | -52.49% |
| Simulation | decompress_rd[f32, (100000, 0.1)] | 582.8 µs | 1,374.3 µs | -57.59% |
| Simulation | decompress_rd[f64, (100000, 0.01)] | 1 ms | 2.3 ms | -56.31% |
| … | … | … | … | … |

ℹ️ Only the first 20 benchmarks are displayed. Go to the app to view all benchmarks.


Comparing claude/benchmarks-v3-cleanup (fda2b44) with ct/benchmarks-v3 (ab5592f)

Open in CodSpeed
