fix(moarstats): harden bivariate header read against joined-CSV page-cache race#3873
Merged
Merged
Conversation
…cache race The `moarstats --join-inputs --bivariate` pipeline shells out to `qsv join` and `qsv stats` subprocesses that write temp CSVs, then re-reads them. Under heavy parallel CI load a follow-up read can intermittently see a short/stale view of the joined temp file, dropping a secondary column. `join_datasets_internal` and the joined-stats block already guard their reads with `sync_subprocess_output` + coverage validation + retry, but the bivariate header read (`csv_headers`) had no such guard. When a stats-CSV column was missing from that header, the field-pair loop silently `continue`d past it, producing "primary-only" join-corrupt bivariate output. Harden the joined-input path: fsync the joined CSV, validate its header covers every column the stats subprocess produced, and retry once after a re-sync. If a column is still missing, fail loud with a clear diagnostic instead of silently emitting primary-only output. The non-joined path is unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Up to standards ✅🟢 Issues
|
The first fix hardened the bivariate header read, but CI showed the test still failing with value2 missing — and the new fail-loud check never fired, proving the joined CSV header read was fine. The truncation was upstream. Two real gaps in the joined --bivariate path: 1. The joined-stats coverage check validated `qsv stats` output against a FRESH re-read of the joined CSV header. Under load that re-read can come back short, weakening the check to a trivial pass. 2. After the joined-stats block validated the stats CSV, `stats_csv_content` re-read the same temp file with an independent, unvalidated `fs::read_to_string` — which could observe a short file. Fixes: - `join_datasets_internal` now returns its already-validated joined header. The coverage check validates `qsv stats` output against that trusted header instead of a fragile re-read. - `run_stats_subprocess` reads the stats CSV content once and derives the field-coverage set from that same string, so the validated set and the content fed downstream are the same snapshot. The validated content is carried forward — no second unvalidated read. - Retry bumped from 1 to up to 3 attempts before failing loud. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Contributor
There was a problem hiding this comment.
Pull request overview
Hardens moarstats --join-inputs --bivariate against intermittent short/stale reads of joined/stats temp CSVs under heavy parallel load, turning previously-silent join corruption into validated retries and (if still inconsistent) a clear failure.
Changes:
join_datasets_internalnow returns the validated final joined CSV header to avoid downstream re-reads of the joined temp file for coverage checks.- Joined-input stats generation now reads and reuses a single coverage-validated stats CSV snapshot (string) instead of performing later unvalidated re-reads.
- Bivariate header read for joined inputs now fsyncs + validates header coverage against stats fields, retries once, and fails loudly if columns are still missing.
Two Copilot review findings on PR #3873: - The `stats_field_names` comment said "numeric/date/string field names" but the code collects every `field` value from the stats CSV (one row per column, any type). Corrected the comment. - `missing_cols` did an O(n*m) membership scan (`hdrs.iter().any(...)` per stats field). Build a `HashSet` of header names once for O(1) lookups. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes a flaky CI failure in
test_moarstats::moarstats_join_type_left_runs_and_writes_bivariate(example run):Root cause
The
moarstats --join-inputs --bivariatepipeline shells out toqsv joinandqsv statssubprocesses that write temp CSVs, then re-reads them. Under heavy parallel CI load a follow-up read intermittently sees a short/stale view of the joined temp file, dropping a secondary column.join_datasets_internaland the joined-stats block already guard their reads withsync_subprocess_output+ header-coverage validation + a retry. The bivariate header read (csv_headers) was the one link in that chain with no guard — and when a stats-CSV column was missing from that header, the field-pair loop silentlycontinued past it, producing "primary-only" join-corrupt bivariate output.Change
src/cmd/moarstats.rs— for the joined-input path:The non-joined path is unchanged (no new failure mode for users with a pre-existing stats cache).
Test plan
cargo test -F all_features test_moarstats::moarstats_join— 6/6 pass, including the previously-failing test.cargo clippy -F all_features --tests— no new warnings in the edited region.cargo check --features=lite— compiles clean (the feature set of the failing CI job).cargo +nightly fmt— applied via the repo's auto-format hook.🤖 Generated with Claude Code