
Alvaro-Kothe
Contributor


This PR fixes a bug in Series.kurt where a fixed tolerance of 1e-14 caused low-variance arrays to be treated as constant, so the result was zero.

The tolerance is now computed based on the data magnitude, using an approach based on SciPy's implementation.
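
To illustrate the difference between a fixed and a magnitude-based tolerance, here is a minimal pure-Python sketch. It is not the code from this PR: the helper names, the use of the maximum absolute value as the scale, and the exact form of the bound (`n * (eps * scale)**2`) are assumptions made for the example.

```python
import sys

FIXED_TOL = 1e-14  # the absolute cutoff described above


def sum_sq_dev(values):
    """Sum of squared deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values)


def constant_by_fixed_tol(values):
    # An absolute cutoff ignores the scale of the data.
    return sum_sq_dev(values) < FIXED_TOL


def constant_by_scaled_tol(values):
    # Hypothetical bound: rounding error grows with the data magnitude,
    # so scale machine epsilon by the largest absolute value.
    eps = sys.float_info.epsilon
    max_abs = max(abs(x) for x in values)
    tol = len(values) * (eps * max_abs) ** 2
    return sum_sq_dev(values) <= tol


small = [1e-8, 2e-8, 3e-8]  # clearly not constant, but tiny in magnitude
print(constant_by_fixed_tol(small))   # True: wrongly flagged as constant
print(constant_by_scaled_tol(small))  # False
```

With the fixed cutoff, any data whose squared deviations sum to less than 1e-14 looks constant, regardless of how large the spread is relative to the values themselves.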


Additional notes:

  • I changed the assertion in test_constant_series because, when I deliberately broke the test, it only reported a bare AssertionError.
  • The expected value in the test differs from the one reported in the issue because pandas computes the unbiased kurtosis estimator.
    • st.kurtosis(data, bias=False): 18.087646853025614
    • st.kurtosis(data, bias=True): 14.916104870028551
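
For reference, the relationship between the two estimators can be sketched in plain Python. This uses the standard textbook formulas for excess kurtosis, not code taken from pandas or SciPy, and the function names are made up for the example.

```python
def biased_excess_kurtosis(values):
    """g2 = m4 / m2**2 - 3, using population (1/n) central moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return m4 / m2**2 - 3.0


def unbiased_excess_kurtosis(values):
    """G2 = (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6); needs n >= 4."""
    n = len(values)
    g2 = biased_excess_kurtosis(values)
    return (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6.0)


data = [1.0, 2.0, 3.0, 4.0, 5.0]
print(biased_excess_kurtosis(data))    # approximately -1.3
print(unbiased_excess_kurtosis(data))  # approximately -1.2
```

The second function corresponds to the `bias=False` case, which is why the test's expected value does not match a `bias=True` computation.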

Alvaro-Kothe changed the title from "Fix: zero result in Series.kurt for low variance arrays" to "BUG: zero result in Series.kurt for low variance arrays" on Sep 21, 2025
m4 = adjusted4.sum(axis, dtype=np.float64)

# Several floating point errors may occur during the summation due to rounding.
# We need to estimate an upper bound to the error to consider the data constant.
Member

Could you also mention and maybe link to where Scipy does a similar calculation?

Contributor Author

I added the permalink. Note that I adapted their code to scale by the maximum instead of the mean: with a zero mean, scaling by the mean would give tolerance=0. Let me know if it needs any further modifications.
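
A small sketch of the zero-mean concern, in plain Python. The function names and the exact tolerance formula are illustrative assumptions, not the PR's code, and "maximum" is taken here to mean the maximum absolute value.

```python
import sys

EPS = sys.float_info.epsilon


def tolerance_from_mean(values):
    # Scaling by the mean: collapses to zero for zero-mean data.
    n = len(values)
    mean = sum(values) / n
    return n * (EPS * mean) ** 2


def tolerance_from_max(values):
    # Scaling by the largest absolute value: zero only if all data are zero.
    n = len(values)
    max_abs = max(abs(x) for x in values)
    return n * (EPS * max_abs) ** 2


zero_mean = [-1.0, 1.0, -1.0, 1.0]
print(tolerance_from_mean(zero_mean))  # 0.0
print(tolerance_from_max(zero_mean))   # a small positive number
```

A tolerance of exactly zero leaves no room for accumulated rounding error, which is the situation the maximum-based scaling avoids.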

mroeschke added the "Reduction Operations" label (sum, mean, min, max, etc.) on Sep 22, 2025
mroeschke added this to the 3.0 milestone on Sep 22, 2025
mroeschke merged commit 4e92c63 into pandas-dev:main on Sep 22, 2025
42 checks passed
@mroeschke
Member

Thanks @Alvaro-Kothe

Alvaro-Kothe deleted the fix/low-variance-kurtosis branch on September 22, 2025, 17:28
jzwick pushed a commit to jzwick/pandas that referenced this pull request Oct 1, 2025
Labels
Reduction Operations sum, mean, min, max, etc.
Development

Successfully merging this pull request may close these issues.

BUG: Wrong kurtosis outcome due to inadequate fix to previous issues
2 participants