release-25.4: testutils/floatcmp: increase CloseMargin to handle distributed execution errors #162223
blathers-crl[bot] wants to merge 1 commit into release-25.4
…ion errors

Previously, the `CloseMargin` constant was set to `CloseFraction * CloseFraction` (1e-28), which was far too small to handle real-world floating-point rounding errors that occur when comparing results from different query execution paths (distributed vs local). This caused the `unoptimized-query-oracle` roachtest to flake when aggregate functions like `stddev_pop` and `var_samp` produced values differing by ~1e-16 due to different aggregation ordering.

Additionally, the test only used approximate float matching (`FloatsMatchApprox`) on s390x architecture, while using exact significant-digit matching on other platforms, which didn't leverage the `CloseMargin` tolerance at all.

This commit makes two changes:

1. Increases `CloseMargin` from `CloseFraction * CloseFraction` (1e-28) to `CloseFraction / 10` (1e-15), which is appropriate for catching the numerical noise observed in distributed aggregate computations.
2. Changes `unsortedMatricesDiffWithFloatComp` to always use `FloatsMatchApprox` for float comparisons on all architectures, not just s390x. This ensures the `CloseMargin` tolerance is consistently applied.

Fixes #150902

Release note: none
Epic: None

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
b2489bc to 7f3c2a7
Thanks for opening a backport. Before merging, please confirm that it falls into one of the following categories (select one):
Add a brief release justification to the PR description explaining your selection. Also, confirm that the change does not break backward compatibility and complies with all aspects of the backport policy. All backports must be reviewed by the TL and EM for the owning area.
✅ PR #162223 is compliant with backport policy
Confidence: high
🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.
Backport 1/1 commits from #161907 on behalf of @mw5h.
Previously, the `CloseMargin` constant was set to `CloseFraction * CloseFraction` (1e-28), which was far too small to handle real-world floating-point rounding errors that occur when comparing results from different query execution paths (distributed vs local). This caused the `unoptimized-query-oracle` roachtest to flake when aggregate functions like `stddev_pop` and `var_samp` produced values differing by ~1e-16 due to different aggregation ordering.

Additionally, the test only used approximate float matching (`FloatsMatchApprox`) on s390x architecture, while using exact significant-digit matching on other platforms, which didn't leverage the `CloseMargin` tolerance at all.

This commit makes two changes:

1. Increases `CloseMargin` from `CloseFraction * CloseFraction` (1e-28) to `CloseFraction / 10` (1e-15), which is appropriate for catching the numerical noise observed in distributed aggregate computations.
2. Changes `unsortedMatricesDiffWithFloatComp` to always use `FloatsMatchApprox` for float comparisons on all architectures, not just s390x. This ensures the `CloseMargin` tolerance is consistently applied.

Fixes #150902
Release note: none
Epic: None
Release justification: