[claude] chore(benchmarks-migrate): post-alpha cleanup nits by connortsui20 · Pull Request #7671 · vortex-data/vortex

connortsui20 · 2026-04-27T15:18:05Z

Six small fixes left over from the v3 migration alpha. All paths
relative to benchmarks-website/migrate/ unless noted.

Fixes

Scale-factor canonicalization (src/classifier.rs::bin_compression_size,
src/migrate.rs::migrate_file_sizes, helper in src/v2.rs): both paths
now route the v2 SF string through canonical_scale_factor, which parses
to f64 and formats with no trailing zeros. Without this, "1" vs
"1.0" and "10" vs "10.0" would produce different dataset_variant
strings and prevent the data.json.gz and file-sizes-*.json.gz rows from
sharing a measurement_id.
Summary counter timing (src/migrate.rs::run): per-fact counters
used to be set from accumulator length before the flush, so a flush
failure would print a summary that lied. Refactored into a flush_all
helper that bumps summary.<fact>_inserted from the flushed
RecordBatch::num_rows() only after each Appender::append_record_batch
succeeds.
Empty-string normalization in commits (src/commits.rs,
benchmarks-website/server/src/schema.rs,
benchmarks-website/server/src/api.rs): message,
author_name/email, committer_name/email now bind as
Option<String> and store SQL NULL when v2 supplied an empty or
whitespace-only string. Schema columns made nullable; server reads
use COALESCE(c.message, '') so the existing String decoder still
works.
Orphan WAL cleanup (src/migrate.rs::open_target_db): the existing
code already attempts remove_if_exists on the .wal regardless of
whether the main file was present; pinned the behavior with a
regression test that stages an orphan .wal (no main file) and
asserts the orphan bytes don't survive open_target_db.
Random-access dataset extraction (src/classifier.rs::bin_random_access):
4-part records random-access/<dataset>/<pattern>/<format>-tokio-local-disk
continue to extract dataset/pattern from the raw name. 2-part legacy
records carry no dataset and used to render under the placeholder
"random access"; they're now dropped to keep the v3 dataset column
meaningful.
migrate_file_sizes dataset fallback (src/migrate.rs::migrate_file_sizes):
when the matrix id stripped from file-sizes-<id>.json.gz isn't on
the KNOWN_FILE_SIZES_SUITES allowlist, the fallback now emits
unknown:<id> so the UI clearly flags it instead of presenting it
as a real dataset.

Tests

Each fix has a focused regression test (rstest parametrization where
useful):

tests/classifier.rs::compression_size_scale_factor_canonicalizes
covering "1", "1.0", "10", "10.0", "0.1", whitespace, and "".
tests/classifier.rs::unmapped_records_yield_none extended with
random_access_2_part_legacy and random_access_3_part.
migrate::tests::flush_all_does_not_overcount_on_failure (private
unit test that drops compression_times to force the second flush
to fail and asserts only the queries counter is set).
tests/end_to_end.rs::summary_counts_match_actual_rows_on_success
(sister invariant for the success path).
tests/end_to_end.rs::empty_author_email_stored_as_null.
tests/end_to_end.rs::open_target_db_removes_orphan_wal.
tests/end_to_end.rs::file_sizes_unknown_id_falls_back_to_unknown_prefix
and file_sizes_known_id_uses_id_directly.
tests/end_to_end.rs::compression_size_data_and_file_sizes_merge_with_canonical_sf
(cross-path SF canonicalization end to end).

Verification

cargo build -p vortex-bench-migrate — clean.
cargo test -p vortex-bench-migrate — 7 unit + 46 classifier + 12
end-to-end tests all pass.
cargo test -p vortex-bench-server — 6 unit + 10 ingest + 6 web_ui
tests pass; schema and COALESCE changes are server-safe.
cargo clippy -p vortex-bench-migrate --all-targets — clean.
cargo fmt on changed files (nightly fmt unavailable in this
sandbox; ran with stable, which is a no-op for the imports-granularity
options the repo's rustfmt.toml gates on nightly).
Skipped ./scripts/public-api.sh: migrate is a leaf binary outside
the public-api lockfile set, and the only newly pub item is the
internal canonical_scale_factor helper.

Signed-off-by: Claude noreply@anthropic.com

Generated by Claude Code

Six small fixes left over from the v3 migration alpha: - Scale-factor canonicalization: a single `canonical_scale_factor` helper folds `"1"`, `"1.0"`, `"10"`, `"10.0"` into one form so the data.json.gz and file-sizes paths share `measurement_id`. - Flush-before-count: per-fact summary counters now bump only after the matching `Appender::append_record_batch` flush succeeds, so a partial failure no longer over-reports inserted rows. - Empty author/email/message → NULL: blank human-input strings now round-trip as SQL NULL instead of empty strings the UI rendered as blank cells. Schema columns made nullable; server reads coalesce. - Orphan WAL sweep: covered by a regression test pinning that `open_target_db` removes a `.wal` even when the main DB file was missing. - Random-access dataset: 2-part legacy records (no dataset in name) are now dropped instead of bucketed under a `"random access"` placeholder; 4-part records continue to extract `dataset/pattern` from the raw name. - File-sizes unknown id: matrix ids that don't appear in `KNOWN_FILE_SIZES_SUITES` now surface as `unknown:<id>` rather than masquerading as a real dataset name. Tests: covered each fix with an `rstest`-style or focused test in `tests/classifier.rs`, `tests/end_to_end.rs`, or a new private `migrate::tests` module. Verification: - cargo build -p vortex-bench-migrate - cargo test -p vortex-bench-migrate (all 65 tests pass) - cargo test -p vortex-bench-server (all 22 tests pass; schema / COALESCE change is server-safe) - cargo clippy -p vortex-bench-migrate --all-targets (clean) - cargo fmt on changed files (nightly fmt unavailable, stable run) Signed-off-by: Claude <noreply@anthropic.com>

The post-alpha cleanup made `bin_random_access` reject 2-part legacy shapes by returning None, but `classify_outcome` mapped any None bin to `Outcome::Unknown`, so those records counted against the 5% uncategorized gate. Intentional drops must Skip; only truly unrecognized shapes are Unknown. Reuses the existing `Skip::UnsupportedShape` variant, originally added for "get_group succeeded but classify_v2 didn't" — same semantic shape (recognized but unsupported). Regression test in `tests/classifier.rs` exercises `classify_outcome` (not just the top-level `classify`) so Skip vs Unknown is observable. Signed-off-by: Claude <noreply@anthropic.com>

…7681) ## Summary Brings the v3 benchmarks website to a demo-ready state focused on the historical-comparison use case (Vortex vs other engines on the same commit, HEAD vs N commits ago, latest vs first as % delta). Single process, single binary; SSR `maud` + inline JSON `<script>` + Chart.js — no client-side framework, no build step, no post-load API round-trips. > Branch note: this PR was developed on the harness-assigned branch > `claude/demo-ready-benchmarks-v3-H5ECI` rather than the > `claude/benchmarks-v3-ui-historical-comparison` branch the task > request mentioned, because the session's harness pins the working > branch (`Develop on branch …`, `NEVER push to a different branch > without explicit permission`). ## CI note The `Rust tests (windows-x64)` job is failing on this PR but the **same job is also failing on the merge commit at the tip of `ct/benchmarks-v3`** (PR #7671's run, job id `73229326105`, the commit `8697731` we branched from). The base branch shipped with that failure tolerated, and our diff only touches `benchmarks-website/server/` (no Windows-specific paths, no FFI, no new dependencies on Windows-fragile crates), so this failure is pre-existing and not caused by the PR. CodSpeed flagged two `varbinview_zip` regressions in `vortex-array/` — also untouched by this PR. ## What's new * **Scoped commit window** — `?n=25|50|100|250|all`, default 100, server-side clamp to `[1, 1000]`. SQL splices in a `LIMIT ?` filter and binds the value as a parameter (consistent with the rest of the file's `params!`-style use); the unbounded path is a separate query so the plan stays clean. * **Group page** — `GET /group/{slug}` renders every chart in one group on a single screen. Each card embeds its own `<script id="chart-data-N">` payload + sibling `<canvas data-chart-index="N">`. `IntersectionObserver` defers `Chart` construction until the canvas scrolls into view (mobile-friendly + cheap for 22-chart TPC-H groups). * **Toolbar** — same component on `/chart/{slug}` and `/group/{slug}`. Scope buttons + slider, linear/log Y-axis, absolute / `% of baseline` mode. URL query string is canonical state; subtitle mirrors active state. Slider step is `5` so it can land on every preset value (`25`, `50`, `100`, `250`). * **Rich tooltip** — custom external HTML tooltip with `<short-sha> · YYYY-MM-DD` title; per-series rows render value with friendly unit (ns→µs→ms→s, B→KiB→MiB→GiB) and a coloured `% delta` vs the prior visible commit; footer carries the truncated commit message + a GitHub link. Document-level click closes. * **Legend → URL** — clicking a legend item rewrites `?hidden=engine:format|…` via `history.replaceState` (no back-button hostility). Permalinks reproduce the view. Delimiter is `|` so series names can contain `:` and `,` without escaping. * **Mobile** — `@media (max-width: 768px)`: single-column chart grid, toolbar wraps with ≥ 40 px touch targets, slider expands to fill the row, legend pops to the *top* of the chart so it doesn't push the chart off-screen on a phone. * **Landing search** — client-side filter input above the group list. * **/api/group/{slug}** — JSON sibling to the HTML route, returns every chart in the group with payloads inlined. ## What was *not* picked up from `planning/components/web-ui.md`'s deferred list Done now (moved out of deferred): - mobile redesign basics (single column, ≥ 40 px tap targets, toolbar wrap) - engine + series toggling (legend ↔ URL) - deep-link state (every toolbar control is URL-canonical) - group landing with the start of "filters" (client-side search) Still deferred (intentional): - per-commit drill-down page - ad-hoc SQL page - LTTB downsampling - engine name lookup table + curated colour palettes - summary cards (geomean ratios, rankings) - full-screen modal / zoom-pan - `?mode=delta` (compare-to-main) — parser branch dropped pending data shape work; toolbar surface today is only `abs / rel` ## Repro INGEST_BEARER_TOKEN=$(openssl rand -hex 32) \ VORTEX_BENCH_DB=./bench.duckdb \ cargo run --release -p vortex-bench-server Then open `http://localhost:3000/`, click any group name (now a link to `/group/{slug}`), or any chart inside, and play with the toolbar. Toggle a series in the legend and notice `?hidden=…` appear in the URL. Resize to phone width to confirm single-column layout, sticky toolbar wrapping, and legend-on-top. ## Snapshot diffs Three `.snap` files refreshed by this PR: - `landing_page.snap` — group names now link to `/group/{slug}`, search input added, `data-group-name` for client filter. - `chart_page_query.snap` — toolbar + indexed `<script id="chart-data-0">` + tooltip host element. - `group_page_query.snap` (new) — group page rendered against the fixture DB, `?n=100` pinned for stability. Run `INSTA_UPDATE=always cargo test -p vortex-bench-server` (or `cargo insta accept`) to refresh. ## Test plan - [x] `cargo build -p vortex-bench-server` - [x] `cargo test -p vortex-bench-server` — 41 tests pass (22 unit + 10 ingest + 9 web_ui) - [x] `cargo clippy -p vortex-bench-server --all-targets -- -D warnings` — clean - [x] `cargo +nightly fmt` — no diff - [ ] `./scripts/public-api.sh` — skipped per CLAUDE.md (leaf binary, not in workspace public-api lockfile set) - [ ] Manual screenshots — couldn't capture from the sandbox; the reviewer or follow-up should record landing / single chart with toolbar / group desktop / group mobile / tooltip open / log+rel. ## Follow-up review fixes (commits `7042f0d` … `da668a4`) - `7042f0d` — `LIMIT` value travels as a bound parameter (`LIMIT ?`) via `params_from_iter` instead of being interpolated into SQL. - `9c80bce` — drop the unused `?mode=delta` parser branch in both `UiQuery::mode` and `chart-init.js::parseUrl`. - `d156ab8` — `?hidden=` delimiter is now `|`; new test pins the server/client wire agreement. - `da668a4` — slider `step` lowered to 5 so it can land on every preset (`25/50/100/250`). ## Things explicitly NOT changed - `/api/ingest`, auth, schema, write paths. - DB migration (none added). - Existing routes (no renames). - v2 site at `benchmarks-website/server.js` etc — untouched. - Single-chart page still works; reuses the same `chart-init.js`. https://claude.ai/code/session_015Nc73ihs9TUdx7QzLUZudK --------- Signed-off-by: Claude <claude@anthropic.com> Co-authored-by: Claude <claude@anthropic.com>

Six small fixes left over from the v3 migration alpha. All paths relative to `benchmarks-website/migrate/` unless noted. ## Fixes - **Scale-factor canonicalization** (`src/classifier.rs::bin_compression_size`, `src/migrate.rs::migrate_file_sizes`, helper in `src/v2.rs`): both paths now route the v2 SF string through `canonical_scale_factor`, which parses to `f64` and formats with no trailing zeros. Without this, `"1"` vs `"1.0"` and `"10"` vs `"10.0"` would produce different `dataset_variant` strings and prevent the data.json.gz and file-sizes-*.json.gz rows from sharing a `measurement_id`. - **Summary counter timing** (`src/migrate.rs::run`): per-fact counters used to be set from accumulator length *before* the flush, so a flush failure would print a summary that lied. Refactored into a `flush_all` helper that bumps `summary.<fact>_inserted` from the flushed `RecordBatch::num_rows()` only after each `Appender::append_record_batch` succeeds. - **Empty-string normalization in commits** (`src/commits.rs`, `benchmarks-website/server/src/schema.rs`, `benchmarks-website/server/src/api.rs`): `message`, `author_name`/`email`, `committer_name`/`email` now bind as `Option<String>` and store SQL `NULL` when v2 supplied an empty or whitespace-only string. Schema columns made nullable; server reads use `COALESCE(c.message, '')` so the existing `String` decoder still works. - **Orphan WAL cleanup** (`src/migrate.rs::open_target_db`): the existing code already attempts `remove_if_exists` on the `.wal` regardless of whether the main file was present; pinned the behavior with a regression test that stages an orphan `.wal` (no main file) and asserts the orphan bytes don't survive `open_target_db`. - **Random-access dataset extraction** (`src/classifier.rs::bin_random_access`): 4-part records `random-access/<dataset>/<pattern>/<format>-tokio-local-disk` continue to extract `dataset/pattern` from the raw name. 2-part legacy records carry no dataset and used to render under the placeholder `"random access"`; they're now dropped to keep the v3 dataset column meaningful. - **`migrate_file_sizes` dataset fallback** (`src/migrate.rs::migrate_file_sizes`): when the matrix id stripped from `file-sizes-<id>.json.gz` isn't on the `KNOWN_FILE_SIZES_SUITES` allowlist, the fallback now emits `unknown:<id>` so the UI clearly flags it instead of presenting it as a real dataset. ## Tests Each fix has a focused regression test (`rstest` parametrization where useful): - `tests/classifier.rs::compression_size_scale_factor_canonicalizes` covering `"1"`, `"1.0"`, `"10"`, `"10.0"`, `"0.1"`, whitespace, and `""`. - `tests/classifier.rs::unmapped_records_yield_none` extended with `random_access_2_part_legacy` and `random_access_3_part`. - `migrate::tests::flush_all_does_not_overcount_on_failure` (private unit test that drops `compression_times` to force the second flush to fail and asserts only the queries counter is set). - `tests/end_to_end.rs::summary_counts_match_actual_rows_on_success` (sister invariant for the success path). - `tests/end_to_end.rs::empty_author_email_stored_as_null`. - `tests/end_to_end.rs::open_target_db_removes_orphan_wal`. - `tests/end_to_end.rs::file_sizes_unknown_id_falls_back_to_unknown_prefix` and `file_sizes_known_id_uses_id_directly`. - `tests/end_to_end.rs::compression_size_data_and_file_sizes_merge_with_canonical_sf` (cross-path SF canonicalization end to end). ## Verification - `cargo build -p vortex-bench-migrate` — clean. - `cargo test -p vortex-bench-migrate` — 7 unit + 46 classifier + 12 end-to-end tests all pass. - `cargo test -p vortex-bench-server` — 6 unit + 10 ingest + 6 web_ui tests pass; schema and `COALESCE` changes are server-safe. - `cargo clippy -p vortex-bench-migrate --all-targets` — clean. - `cargo fmt` on changed files (nightly fmt unavailable in this sandbox; ran with stable, which is a no-op for the imports-granularity options the repo's `rustfmt.toml` gates on nightly). - Skipped `./scripts/public-api.sh`: migrate is a leaf binary outside the public-api lockfile set, and the only newly `pub` item is the internal `canonical_scale_factor` helper. Signed-off-by: Claude <noreply@anthropic.com> --- _Generated by [Claude Code](https://claude.ai/code/session_012XyYJRpcGFxmJXdTJuW8Ff)_ --------- Signed-off-by: Claude <noreply@anthropic.com> Co-authored-by: Claude <noreply@anthropic.com> Signed-off-by: Connor Tsui <connor.tsui20@gmail.com>

…7681) ## Summary Brings the v3 benchmarks website to a demo-ready state focused on the historical-comparison use case (Vortex vs other engines on the same commit, HEAD vs N commits ago, latest vs first as % delta). Single process, single binary; SSR `maud` + inline JSON `<script>` + Chart.js — no client-side framework, no build step, no post-load API round-trips. > Branch note: this PR was developed on the harness-assigned branch > `claude/demo-ready-benchmarks-v3-H5ECI` rather than the > `claude/benchmarks-v3-ui-historical-comparison` branch the task > request mentioned, because the session's harness pins the working > branch (`Develop on branch …`, `NEVER push to a different branch > without explicit permission`). ## CI note The `Rust tests (windows-x64)` job is failing on this PR but the **same job is also failing on the merge commit at the tip of `ct/benchmarks-v3`** (PR #7671's run, job id `73229326105`, the commit `8697731` we branched from). The base branch shipped with that failure tolerated, and our diff only touches `benchmarks-website/server/` (no Windows-specific paths, no FFI, no new dependencies on Windows-fragile crates), so this failure is pre-existing and not caused by the PR. CodSpeed flagged two `varbinview_zip` regressions in `vortex-array/` — also untouched by this PR. ## What's new * **Scoped commit window** — `?n=25|50|100|250|all`, default 100, server-side clamp to `[1, 1000]`. SQL splices in a `LIMIT ?` filter and binds the value as a parameter (consistent with the rest of the file's `params!`-style use); the unbounded path is a separate query so the plan stays clean. * **Group page** — `GET /group/{slug}` renders every chart in one group on a single screen. Each card embeds its own `<script id="chart-data-N">` payload + sibling `<canvas data-chart-index="N">`. `IntersectionObserver` defers `Chart` construction until the canvas scrolls into view (mobile-friendly + cheap for 22-chart TPC-H groups). * **Toolbar** — same component on `/chart/{slug}` and `/group/{slug}`. Scope buttons + slider, linear/log Y-axis, absolute / `% of baseline` mode. URL query string is canonical state; subtitle mirrors active state. Slider step is `5` so it can land on every preset value (`25`, `50`, `100`, `250`). * **Rich tooltip** — custom external HTML tooltip with `<short-sha> · YYYY-MM-DD` title; per-series rows render value with friendly unit (ns→µs→ms→s, B→KiB→MiB→GiB) and a coloured `% delta` vs the prior visible commit; footer carries the truncated commit message + a GitHub link. Document-level click closes. * **Legend → URL** — clicking a legend item rewrites `?hidden=engine:format|…` via `history.replaceState` (no back-button hostility). Permalinks reproduce the view. Delimiter is `|` so series names can contain `:` and `,` without escaping. * **Mobile** — `@media (max-width: 768px)`: single-column chart grid, toolbar wraps with ≥ 40 px touch targets, slider expands to fill the row, legend pops to the *top* of the chart so it doesn't push the chart off-screen on a phone. * **Landing search** — client-side filter input above the group list. * **/api/group/{slug}** — JSON sibling to the HTML route, returns every chart in the group with payloads inlined. ## What was *not* picked up from `planning/components/web-ui.md`'s deferred list Done now (moved out of deferred): - mobile redesign basics (single column, ≥ 40 px tap targets, toolbar wrap) - engine + series toggling (legend ↔ URL) - deep-link state (every toolbar control is URL-canonical) - group landing with the start of "filters" (client-side search) Still deferred (intentional): - per-commit drill-down page - ad-hoc SQL page - LTTB downsampling - engine name lookup table + curated colour palettes - summary cards (geomean ratios, rankings) - full-screen modal / zoom-pan - `?mode=delta` (compare-to-main) — parser branch dropped pending data shape work; toolbar surface today is only `abs / rel` ## Repro INGEST_BEARER_TOKEN=$(openssl rand -hex 32) \ VORTEX_BENCH_DB=./bench.duckdb \ cargo run --release -p vortex-bench-server Then open `http://localhost:3000/`, click any group name (now a link to `/group/{slug}`), or any chart inside, and play with the toolbar. Toggle a series in the legend and notice `?hidden=…` appear in the URL. Resize to phone width to confirm single-column layout, sticky toolbar wrapping, and legend-on-top. ## Snapshot diffs Three `.snap` files refreshed by this PR: - `landing_page.snap` — group names now link to `/group/{slug}`, search input added, `data-group-name` for client filter. - `chart_page_query.snap` — toolbar + indexed `<script id="chart-data-0">` + tooltip host element. - `group_page_query.snap` (new) — group page rendered against the fixture DB, `?n=100` pinned for stability. Run `INSTA_UPDATE=always cargo test -p vortex-bench-server` (or `cargo insta accept`) to refresh. ## Test plan - [x] `cargo build -p vortex-bench-server` - [x] `cargo test -p vortex-bench-server` — 41 tests pass (22 unit + 10 ingest + 9 web_ui) - [x] `cargo clippy -p vortex-bench-server --all-targets -- -D warnings` — clean - [x] `cargo +nightly fmt` — no diff - [ ] `./scripts/public-api.sh` — skipped per CLAUDE.md (leaf binary, not in workspace public-api lockfile set) - [ ] Manual screenshots — couldn't capture from the sandbox; the reviewer or follow-up should record landing / single chart with toolbar / group desktop / group mobile / tooltip open / log+rel. ## Follow-up review fixes (commits `7042f0d` … `da668a4`) - `7042f0d` — `LIMIT` value travels as a bound parameter (`LIMIT ?`) via `params_from_iter` instead of being interpolated into SQL. - `9c80bce` — drop the unused `?mode=delta` parser branch in both `UiQuery::mode` and `chart-init.js::parseUrl`. - `d156ab8` — `?hidden=` delimiter is now `|`; new test pins the server/client wire agreement. - `da668a4` — slider `step` lowered to 5 so it can land on every preset (`25/50/100/250`). ## Things explicitly NOT changed - `/api/ingest`, auth, schema, write paths. - DB migration (none added). - Existing routes (no renames). - v2 site at `benchmarks-website/server.js` etc — untouched. - Single-chart page still works; reuses the same `chart-init.js`. https://claude.ai/code/session_015Nc73ihs9TUdx7QzLUZudK --------- Signed-off-by: Claude <claude@anthropic.com> Co-authored-by: Claude <claude@anthropic.com> Signed-off-by: Connor Tsui <connor.tsui20@gmail.com>

connortsui20 changed the title ~~chore(benchmarks-migrate): post-alpha cleanup nits~~ [claude] chore(benchmarks-migrate): post-alpha cleanup nits Apr 27, 2026

connortsui20 added the changelog/skip Do not list PR in the changelog label Apr 27, 2026

connortsui20 merged commit 8697731 into ct/benchmarks-v3 Apr 27, 2026
57 of 58 checks passed

connortsui20 deleted the claude/benchmarks-v3-nits branch April 27, 2026 17:39

connortsui20 mentioned this pull request Apr 27, 2026

[claude] feat(benchmarks-website): historical comparison UX + mobile #7681

Merged

6 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[claude] chore(benchmarks-migrate): post-alpha cleanup nits#7671

[claude] chore(benchmarks-migrate): post-alpha cleanup nits#7671
connortsui20 merged 2 commits intoct/benchmarks-v3from
claude/benchmarks-v3-nits

connortsui20 commented Apr 27, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

connortsui20 commented Apr 27, 2026

Fixes

Tests

Verification

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants