Contributor

@eicchen eicchen commented Oct 19, 2025

This improves numerical stability for very large or very small values, such as the example given in the original issue.

Member

@Alvaro-Kothe Alvaro-Kothe left a comment

Can you run the benchmarks?

@eicchen
Contributor Author

eicchen commented Oct 20, 2025

image

I've already run the relevant benchmarks. Unsurprisingly, they show a performance decrease in stat_ops.Correlation. I've also looked into other ways of solving the issue while keeping the online Welford algorithm.

image image

The problem stems from the co-moment calculation at very large or very small values, combined with the asymmetric nature of Welford's algorithm. The three values above are mathematically the same. However, when they are computed for the values in the test case, two of them are always correct and one is not.
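For readers without access to the images, here is a minimal sketch of three textbook co-moment updates that are algebraically identical but round differently in floating point. It is an assumption that these correspond to the three values shown above; the function name `welford_comoments` and its variables are hypothetical and not taken from the pandas source.

```python
def welford_comoments(xs, ys):
    """Accumulate the co-moment sum((x - mean_x) * (y - mean_y)) online,
    using three update forms that are equal in exact arithmetic."""
    mean_x = mean_y = 0.0
    c_a = c_b = c_c = 0.0
    for n, (x, y) in enumerate(zip(xs, ys), start=1):
        # Deviations from the means *before* this observation is included.
        dx_old = x - mean_x
        dy_old = y - mean_y
        mean_x += dx_old / n
        mean_y += dy_old / n
        # Deviations from the means *after* this observation is included.
        dx_new = x - mean_x
        dy_new = y - mean_y
        c_a += dx_old * dy_new                # old x deviation, new y deviation
        c_b += dx_new * dy_old                # new x deviation, old y deviation
        c_c += dx_old * dy_old * (n - 1) / n  # symmetric form
    return c_a, c_b, c_c
```

The first two forms treat `x` and `y` asymmetrically, which is one way the asymmetry mentioned above can show up: on well-scaled data all three agree, while on extreme inputs they may differ in the last bits or worse.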

As a redundancy measure, we could pick the value on which two of the three agree. In theory this can only reduce the error relative to the current version, since all three values should be equal. But without a larger test pool, I wouldn't be confident enough to put it in a release. The two-pass approach provides the best numerical stability, at the cost of performance.
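The two options can be sketched roughly as follows. This is illustrative only and is not the code in this PR: `pick_agreeing` and `two_pass_corr` are hypothetical names, and the selection rule (take one value from the closest pair) is just one possible reading of "the pair that match".

```python
import math


def pick_agreeing(a, b, c):
    """Return one value from whichever pair of candidates is closest together."""
    pairs = [(abs(a - b), a), (abs(a - c), a), (abs(b - c), b)]
    _, value = min(pairs, key=lambda p: p[0])
    return value


def two_pass_corr(xs, ys):
    """Pearson correlation computed in two passes: means first, then deviations."""
    n = len(xs)
    # First pass: means, using fsum to limit accumulated rounding error.
    mean_x = math.fsum(xs) / n
    mean_y = math.fsum(ys) / n
    # Second pass: sums of products of deviations from the fixed means.
    sxy = math.fsum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = math.fsum((x - mean_x) ** 2 for x in xs)
    syy = math.fsum((y - mean_y) ** 2 for y in ys)
    # Take the square roots separately so the product sxx * syy cannot
    # overflow or underflow before the root is taken.
    return sxy / (math.sqrt(sxx) * math.sqrt(syy))
```

The two-pass version has to traverse the data twice, which is consistent with the slowdown seen in the benchmarks. Note that the result is not clamped here, so it can still land a rounding error outside [-1, 1].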

Development

Successfully merging this pull request may close these issues.

BUG: Pearson correlation outside expected range -1 to 1
