v0.5.0 — false-discovery-rate control (--fdr)
A principled significance basis for the point detector — a controlled error rate instead of a magic threshold.
Added
- scan/explain gain --fdr Q: per-column Benjamini–Hochberg false-discovery-rate control for point.modz. Each cell's modified z-score becomes a two-sided p-value, and the fixed point_threshold is replaced by a multiplicity-aware step-up cutoff bounding the expected proportion of false flags at Q (e.g. --fdr 0.05). Opt-in: if omitted, behavior is unchanged. The level is part of the config_version fingerprint (pfdr=).
- New ax_detect::fdr module: two_sided_p (normal tail via erfc) + benjamini_hochberg (deterministic step-up).
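For readers unfamiliar with the procedure, the two pieces named above can be sketched as follows. This is a minimal illustration, not the ax_detect::fdr source: the signatures, the return type, and the erfc_approx helper are assumptions made for the example. Rust's standard library has no erfc, so the sketch substitutes the Abramowitz and Stegun 7.1.26 approximation (absolute error around 1.5e-7), which is too coarse for far-tail p-values in real use.

```rust
/// Hypothetical stand-in for erfc: Abramowitz & Stegun formula 7.1.26.
/// Absolute error is about 1.5e-7, so very small p-values are imprecise.
fn erfc_approx(x: f64) -> f64 {
    let a = x.abs();
    let t = 1.0 / (1.0 + 0.3275911 * a);
    // Horner evaluation of a1*t + a2*t^2 + a3*t^3 + a4*t^4 + a5*t^5.
    let poly = t
        * (0.254829592
            + t * (-0.284496736
                + t * (1.421413741 + t * (-1.453152027 + t * 1.061405429))));
    let tail = poly * (-a * a).exp();
    // erfc(-x) = 2 - erfc(x)
    if x >= 0.0 { tail } else { 2.0 - tail }
}

/// Two-sided p-value of a standardized deviation z under N(0,1):
/// p = P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
/// (Name taken from the release notes; the signature is assumed.)
fn two_sided_p(z: f64) -> f64 {
    erfc_approx(z.abs() / std::f64::consts::SQRT_2)
}

/// Benjamini-Hochberg step-up at level q.
/// Returns one flag per input p-value, in the input order.
/// (Name taken from the release notes; the signature is assumed.)
fn benjamini_hochberg(p_values: &[f64], q: f64) -> Vec<bool> {
    let m = p_values.len();
    // Sort indices by ascending p-value. total_cmp is a total order and the
    // sort is stable, so ties keep input order and the result is deterministic.
    let mut order: Vec<usize> = (0..m).collect();
    order.sort_by(|&i, &j| p_values[i].total_cmp(&p_values[j]));

    // Step-up: find the LARGEST 1-based rank k with p_(k) <= k * q / m.
    let mut cutoff = 0usize;
    for (rank0, &idx) in order.iter().enumerate() {
        let k = (rank0 + 1) as f64;
        if p_values[idx] <= k * q / (m as f64) {
            cutoff = rank0 + 1;
        }
    }

    // Everything ranked at or below the cutoff is flagged, even if its own
    // rank failed the comparison.
    let mut flags = vec![false; m];
    for &idx in &order[..cutoff] {
        flags[idx] = true;
    }
    flags
}
```

With q = 0.05 and p-values [0.01, 0.04, 0.03, 0.045], ranks 2 and 3 fail their own thresholds (0.025 and 0.0375), but rank 4 passes (0.045 ≤ 0.05), so the step-up rule flags all four. That "largest passing rank wins" behavior is what distinguishes step-up from a simple per-cell comparison.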
Honest scope — correctness, not volume
FDR replaces an arbitrary cutoff with a principled error-rate guarantee and adapts to how many cells were tested: a noise column stops contributing chance flags, and the same outlier can be significant in a small column yet not in a large one. On genuinely heavy-tailed data it can flag more cells, not fewer. Those cells really are significant at Q; the old fixed cutoff was stringent in an uncalibrated way. On the real 127k-row MSHA parquet, point findings went from 32,893 to 40,079 at Q = 0.05. Capping output volume is a separate lever (column scoping today; severity / top-N next), and the two compose: "top-N by score, among the FDR-significant set."
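The column-size effect follows directly from the Benjamini–Hochberg threshold, which for the k-th smallest p-value among m tested cells is kQ/m. As an illustrative calculation (the numbers are made up, and it assumes every other cell in the column is unremarkable, so no higher rank passes its own threshold), take a single outlier with p = 0.004:

```latex
\[
p_{(k)} \le \frac{k}{m}\,Q, \qquad Q = 0.05,\quad p_{(1)} = 0.004
\]
\[
m = 10:\quad \frac{1}{10}\cdot 0.05 = 0.005 \ge 0.004 \;\Rightarrow\; \text{flagged}
\]
\[
m = 1000:\quad \frac{1}{1000}\cdot 0.05 = 0.00005 < 0.004 \;\Rightarrow\; \text{not flagged}
\]
```

The same deviation clears the bar when 10 cells were tested and misses it when 1000 were, because a value that extreme is far more likely to appear by chance in the larger column.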
Calibration
The p-value uses the consistent-σ standardized deviation (x − center)/scale (≈ N(0,1)), not the display-scaled modified z-score.
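Written out, and assuming the standardized deviation is treated as exactly standard normal, the conversion is:

```latex
\[
z = \frac{x - \text{center}}{\text{scale}}, \qquad
p = \operatorname{erfc}\!\left(\frac{|z|}{\sqrt{2}}\right) = 2\,\bigl(1 - \Phi(|z|)\bigr)
\]
```

Here Φ is the standard normal CDF. As reference points, |z| = 1.96 gives p ≈ 0.050 and |z| = 3.5 gives p ≈ 4.65 × 10⁻⁴. Feeding in a deviation on any other scale would shift every p-value and break the error-rate guarantee at Q, which is why the scaling matters.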
Gate
proptest + cargo-mutants: 0 missed mutants across all four changed files (the BH step-up, the (x − center) sign, and multiplicity adaptation are all pinned).
Install: cargo install anomalyx
Full changelog: v0.4.1...v0.5.0