BUG: stats.skew: unreliable when all values in axis slice are identical #15554
Comments
Hi @ronzhou, thanks for reporting. I can reproduce and indeed there is an issue here. It's a resolution issue when we compute the moments with `_moment`:
    from scipy.stats._stats_py import _moment
    _moment(a[:, 0], 2, axis, mean=a[:, 0].mean(axis=0, keepdims=True))
    # 4.930380657631324e-32
    _moment(a, 2, axis, mean=a.mean(axis=0, keepdims=True))[0]
    # 1.7749370367472766e-30
And this slight difference makes it fail here:
    zero = (m2 <= (np.finfo(m2.dtype).resolution * mean.squeeze(axis))**2)
    vals = np.where(zero, 0, m3 / m2**1.5)
And the root cause is from NumPy actually.
    import numpy.testing as npt
    npt.assert_equal(a.mean(axis=0, keepdims=True)[0][0], a[:, 0].mean(axis=0, keepdims=True)[0])
    # Items are not equal:
    # ACTUAL: 1.0100000000000013
    # DESIRED: 1.0100000000000002
From NumPy's doc
So in the end we have to update our check. I am not sure where this ( @peterbell10 I see you touched this area recently. Would you know? Can we bump the tolerance a bit here? I always thought it does not make much sense to go below machine precision.
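To make the failing comparison concrete, here is a small sketch that applies the threshold formula quoted above to the two `m2` values reported in this comment. The helper name `zero_variance_check` is hypothetical, and the mean is rounded to 1.01 for illustration; only the threshold expression comes from the quoted SciPy code.

```python
import numpy as np


def zero_variance_check(m2, mean, dtype=np.float64):
    """Hypothetical helper mirroring the quoted check.

    Returns True when the second moment is negligible relative to the mean,
    i.e. when the slice would be treated as constant.
    """
    threshold = (np.finfo(dtype).resolution * mean) ** 2
    return m2 <= threshold


mean = 1.01                          # approximate column mean from the report
m2_slice = 4.930380657631324e-32     # second moment computed on a[:, 0]
m2_full = 1.7749370367472766e-30     # second moment computed on the full 2-D array

# float64 resolution is 1e-15, so the threshold is about 1.02e-30
print((np.finfo(np.float64).resolution * mean) ** 2)
print(zero_variance_check(m2_slice, mean))  # True: treated as constant
print(zero_variance_check(m2_full, mean))   # False: falls through to m3 / m2**1.5
```

The second value sits just above the threshold, which is why the same column is treated as constant in one call and not in the other.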
Hi @tupui, thanks for your response. In a 3-dimensional array, this is not an issue.
I use scipy.stats.skew to compute skewness along axis 0 of a 2-D array. Currently, I use the method below to avoid this issue,
Is there a better way to avoid it temporarily? Thanks for your help!
I don't know of a better one, but you can try playing with the resolution if your application allows that:
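A minimal sketch of that idea, not necessarily the snippet this comment had in mind: compute the biased skewness directly with NumPy and treat a slice as constant when its second moment is small relative to a user-chosen tolerance. The function name `skew_with_tolerance` and the default `rtol=1e-12` are assumptions for illustration, not part of SciPy.

```python
import numpy as np


def skew_with_tolerance(a, axis=0, rtol=1e-12):
    """Biased sample skewness with an adjustable 'constant slice' tolerance.

    Hypothetical workaround: slices whose second central moment is at most
    (rtol * |mean|)**2 are reported as having skewness 0.
    """
    a = np.asarray(a, dtype=float)
    mean = a.mean(axis=axis, keepdims=True)
    dev = a - mean
    m2 = np.mean(dev**2, axis=axis)   # second central moment
    m3 = np.mean(dev**3, axis=axis)   # third central moment
    constant = m2 <= (rtol * np.abs(mean).squeeze(axis=axis)) ** 2
    # Suppress divide-by-zero warnings for slices that are masked out anyway.
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(constant, 0.0, m3 / m2**1.5)


# First column is constant, second is symmetric, third is right-skewed.
x = np.column_stack([
    np.full(50, 1.01),
    np.arange(50.0),
    np.r_[np.ones(49), 100.0],
])
print(skew_with_tolerance(x, axis=0))
```

A looser `rtol` makes the check more forgiving of rounding noise in the mean, at the cost of also flattening slices that genuinely vary by less than that relative amount.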
Describe your issue.
Computing the skewness of a NumPy ndarray produces two different results for a column of the ndarray (x), depending on which of the following methods is used:
The issue occurs when every value in this column is the same float.
Reproducing Code Example
Error message
SciPy/NumPy/Python version information
1.7.3 1.22.0 sys.version_info(major=3, minor=8, micro=5, releaselevel='final', serial=0)