fix(moarstats): gate per-pair bivariate skip diagnostics behind log::warn! #3894

Merged: jqnatividad merged 1 commit into master from fix/moarstats-bivariate-diagnostics-gate-stderr on May 23, 2026
@jqnatividad
Collaborator

Summary

  • All nine per-pair wwarn! calls in the bivariate field_pairs construction loop are switched back to log::warn!. They are now gated behind QSV_LOG_LEVEL (set QSV_LOG_LEVEL=warn to surface them in the qsv log file).
  • The post-loop summary stays as winfo! — one stderr line per run that carries the aggregate per-reason skip counts and the full csv_headers. Enough signal to spot the corruption mode without per-pair noise.
  • The hard-fail guard for joined-input corruption is unaffected.
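
The gating described above can be illustrated with a minimal, std-only sketch. It is not qsv's code: `Level`, `parse_level`, and `Diagnostics` are hypothetical names, and the real change relies on the `log` crate's `log::warn!` macro and qsv's own `winfo!` macro rather than on anything shown here. The sketch only models the behaviour: per-pair messages are recorded when the configured level allows warnings, while the summary line is always produced.

```rust
// Hypothetical model of level-gated diagnostics; not qsv's implementation.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub enum Level {
    Off,
    Error,
    Warn,
    Info,
}

/// Parse a level name, e.g. the value of an environment variable.
/// Treating missing or unknown values as `Off` is an assumption of this sketch.
pub fn parse_level(value: Option<&str>) -> Level {
    match value.map(|v| v.to_ascii_lowercase()).as_deref() {
        Some("error") => Level::Error,
        Some("warn") => Level::Warn,
        Some("info") => Level::Info,
        _ => Level::Off,
    }
}

/// Collects diagnostics in memory so the gating is easy to inspect.
pub struct Diagnostics {
    pub level: Level,
    pub log_lines: Vec<String>,    // stands in for the log file
    pub stderr_lines: Vec<String>, // stands in for stderr
}

impl Diagnostics {
    pub fn new(level: Level) -> Self {
        Diagnostics {
            level,
            log_lines: Vec::new(),
            stderr_lines: Vec::new(),
        }
    }

    /// Per-pair message: recorded only when warnings are enabled.
    pub fn per_pair_skip(&mut self, field1: &str, field2: &str, reason: &str) {
        if self.level >= Level::Warn {
            self.log_lines
                .push(format!("skipping pair ({field1}, {field2}): {reason}"));
        }
    }

    /// Summary message: always recorded, exactly one line per call.
    pub fn summary(&mut self, skipped: usize) {
        self.stderr_lines
            .push(format!("bivariate: skipped {skipped} pair(s)"));
    }
}
```

With the level unset, any number of `per_pair_skip` calls leaves `log_lines` empty and only the single `summary` line appears; with the level parsed from `"warn"`, each skipped pair adds one log line.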

Why

Codex review job 2383 (MEDIUM) flagged that PR #3892's instrumentation turned every per-pair skip into an unconditional wwarn! stderr write. Zero-variance, both-constant, cardinality==rowcount, and type-filter skips are routine for many datasets, and the field_pairs loop visits O(n²) pairs — so a healthy moarstats --bivariate run on a wide CSV would have flooded stderr.
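
To put the O(n²) growth in concrete terms, here is a small worked calculation. It assumes the loop visits each unordered pair of columns once, which this PR does not state; if ordered pairs are visited instead, the counts double. The function name is hypothetical.

```rust
/// Number of unordered column pairs among `n` columns: n * (n - 1) / 2.
/// Assumes each pair is visited once; this is an illustration, not qsv code.
pub fn unordered_pair_count(n: u64) -> u64 {
    // saturating_sub keeps n = 0 from underflowing.
    n * n.saturating_sub(1) / 2
}
```

Under that assumption a 100-column CSV yields 4,950 pairs and a 500-column CSV yields 124,750, so even a modest fraction of routine skips would mean thousands of stderr lines per run.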

Test plan

  • cargo test -F all_features test_moarstats:: — 80/80 passing
  • Local left-join repro: stderr is now clean of per-pair skipping lines; output still produces 5 pairs as expected

🤖 Generated with Claude Code

…warn!

Codex review (job 2383, MEDIUM) flagged that PR #3892's
instrumentation turned every per-pair skip into an unconditional
wwarn! stderr write. Zero-variance, both-constant,
cardinality==rowcount, and type-filter skips are routine for many
datasets, and the field_pairs loop visits O(n²) pairs — so a
healthy moarstats --bivariate run on a wide CSV would now flood
stderr.

Switch all nine per-pair skip warnings from wwarn! back to
log::warn!:
- field1_bad_type, field1_missing_in_csv
- field2_bad_type, field2_missing_in_csv
- zero_stddev, zero_variance (the two-branch case)
- both_constant, card_eq_rowcount, type_filter

These are now gated behind QSV_LOG_LEVEL — set QSV_LOG_LEVEL=warn
to surface them in the qsv log file when actually debugging a
flake.

The post-loop summary stays as winfo!: it's a single line that
already carries the aggregate per-reason skip counts and the
full csv_headers — enough signal to spot the corruption mode
without per-pair noise. The hard-fail guard for joined-input
corruption is unaffected.
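
As a rough illustration of why one summary line is enough, the sketch below tallies skip reasons and renders them together with the headers. The function name and the output format are assumptions made for this example; the actual winfo! message in moarstats may look different. The reason strings are taken from the list above.

```rust
use std::collections::BTreeMap;

/// Build a single summary line from the per-pair skip reasons.
/// Hypothetical helper; the real summary format is not shown in this PR.
pub fn summarize_skips(reasons: &[&str], csv_headers: &[&str]) -> String {
    // BTreeMap keeps reasons sorted, so the output is deterministic.
    let mut counts: BTreeMap<&str, usize> = BTreeMap::new();
    for reason in reasons {
        *counts.entry(*reason).or_insert(0) += 1;
    }
    let parts: Vec<String> = counts
        .iter()
        .map(|(reason, n)| format!("{reason}={n}"))
        .collect();
    format!(
        "skipped {} pair(s) [{}]; csv_headers={:?}",
        reasons.len(),
        parts.join(", "),
        csv_headers
    )
}
```

For the reasons `zero_variance`, `type_filter`, `zero_variance` and headers `id`, `price`, this returns `skipped 3 pair(s) [type_filter=1, zero_variance=2]; csv_headers=["id", "price"]`: a line dominated by one unexpected reason, or headers that look wrong, stands out without any per-pair output.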

Healthy left-join repro: stderr is now clean of per-pair
'skipping' lines, output still produces 5 pairs. All 80
moarstats tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@codacy-production

Up to standards ✅

🟢 Issues 0 issues

Results:
0 new issues


@jqnatividad merged commit b985131 into master on May 23, 2026
16 of 17 checks passed
@jqnatividad deleted the fix/moarstats-bivariate-diagnostics-gate-stderr branch on May 23, 2026 at 14:29