fix: use float64 accumulators in PearsonCorrelation to prevent catastrophic cancellation #3740
Open
tejasae-afk wants to merge 1 commit into pytorch:master from
Conversation
fix: use float64 accumulators in PearsonCorrelation to prevent catastrophic cancellation

The naive E[X²] - (E[X])² formula loses all precision when values have large magnitude relative to their variance: both terms are ~μ² ≈ 1e16 while their difference (the variance) is O(1), which falls below float32's unit in the last place at that scale. Switch all five accumulators and the incoming batches to float64 on non-MPS devices. MPS does not support float64 and keeps the previous float32 behaviour. The final result is still returned as a Python float, so the public API is unchanged.

Fixes pytorch#3662

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Fixes #3662.
The naive E[X²] - (E[X])² formula for variance is mathematically correct but numerically unstable. When values have large magnitude relative to their variance — for example an offset of 1e8 with small inter-sample differences — both E[X²] and (E[X])² are around 1e16 while their difference (the actual variance) is O(1). In float32, the unit in the last place at that scale is roughly 10^9, so the variance is completely lost and the metric returns 0.0.
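The failure mode is easy to reproduce outside the library. A minimal NumPy sketch (standalone illustration, not torcheval code; the offset here is 1e4 so that float32 can still store the samples but the cancellation at E[X²] ≈ 1e8 destroys the O(1) variance):

```python
import numpy as np

# Values with a large offset and O(1) spread: true variance is ~1.0,
# but E[X^2] and (E[X])^2 are both ~1e8, where float32's spacing is ~8.
rng = np.random.default_rng(0)
x = 1e4 + rng.standard_normal(10_000)

def naive_var(arr):
    # One-pass E[X^2] - (E[X])^2, computed in the array's own dtype.
    return (arr * arr).mean() - arr.mean() ** 2

print(naive_var(x))                     # float64: close to 1.0
print(naive_var(x.astype(np.float32)))  # float32: far from 1.0 (a multiple
                                        # of the ~8-unit spacing at 1e8)
```

The float32 result is quantized to float32's spacing near 1e8, so it cannot land anywhere near the true variance of 1.0.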
The fix is to accumulate in float64 on devices that support it. MPS does not support float64, so it falls back to float32 and retains the previous behaviour. The public return type (Python float) is unchanged.
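The device-dependent dtype choice can be sketched as a small helper (hypothetical name and signature, not the actual patch):

```python
def accumulator_dtype(device_type: str) -> str:
    # Hypothetical helper illustrating the selection rule described above:
    # MPS has no float64 support, so it keeps float32; every other device
    # accumulates in float64.
    return "float32" if device_type == "mps" else "float64"

print(accumulator_dtype("cpu"))  # float64
print(accumulator_dtype("mps"))  # float32
```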
A new test, test_numerical_stability_large_offset, covers this case. All existing non-distributed tests pass.

You know the codebase far better than I do; happy to adjust if a different approach (e.g., a shared Welford-based utility across PearsonCorrelation, R2Score, and FID, as discussed in the issue) is preferred.
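For reference, the Welford-style alternative mentioned above sidesteps the cancellation by accumulating deviations from a running mean rather than raw second moments. A minimal single-variable sketch (an illustration of the general technique, not a proposed implementation):

```python
def welford_variance(values):
    # One-pass Welford update: track n, the running mean, and M2 (the sum
    # of squared deviations from the current mean). Because only deviations
    # are accumulated, a large constant offset never enters the sums.
    n, mean, m2 = 0, 0.0, 0.0
    for x in values:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)  # note: uses the *updated* mean
    return m2 / n if n else float("nan")

# Huge offset, tiny spread: population variance is d^2 * (n^2 - 1) / 12.
data = [1e8 + 0.01 * i for i in range(1000)]
print(welford_variance(data))  # ≈ 8.33
```

Extending this to the metric would mean keeping per-variable (n, mean, M2) plus a co-moment for the cross term, merged across batches and ranks.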