stats.sigma_clipped_stats raises numpy warning when called with MaskedColumn #13281

Closed
hamogu opened this issue May 27, 2022 · 2 comments · Fixed by #15844

@hamogu
Member

hamogu commented May 27, 2022

Description

astropy.stats.sigma_clipped_stats raises a warning that the mask is ignored when called with an astropy.table.MaskedColumn. However, that warning is wrong: the mask is used.

Not sure it's related: #5303

Expected behavior

No warning should be issued about something that does not happen.

Actual behavior

A warning is issued about something that does not happen.

Steps to Reproduce

from astropy.stats import sigma_clipped_stats
import numpy as np

arr = np.ma.array([1,1,1,1,1,1,1,1,1,3,3,3,3,3,3,3])
arr[:7] = np.ma.masked
print(sigma_clipped_stats(arr))

produces the following output:
(2.5555555555555554, 3.0, 0.8314794192830981)

The following code produces the same output, but raises a warning that the mask is ignored. The mask is most definitely not ignored, otherwise the output would be different.

from astropy.table import MaskedColumn
col = MaskedColumn(data=arr)
print(sigma_clipped_stats(col))

gives the following warning:

/Users/guenther/mambaforge/envs/kitchensink/lib/python3.10/site-packages/numpy/core/fromnumeric.py:758: UserWarning: Warning: 'partition' will ignore the 'mask' of the MaskedArray.
  a.partition(kth, axis=axis, kind=kind, order=order)
/Users/guenther/mambaforge/envs/kitchensink/lib/python3.10/site-packages/numpy/core/fromnumeric.py:758: UserWarning: Warning: 'partition' will ignore the 'mask' of the MaskedArray.
  a.partition(kth, axis=axis, kind=kind, order=order)

Curiously, isinstance(col, np.ma.MaskedArray) is True, so I would have expected arr and col not just to give the same numerical result, but also to either both raise the warning or both stay silent.
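
The claim that the mask is used can be cross-checked with plain numpy. The sketch below is an illustration only: plain_stats is a hypothetical helper, and it computes unclipped statistics, which is enough here on the assumption that sigma clipping rejects none of these values. It compares the statistics of the 9 unmasked values with those of all 16 values.

```python
import numpy as np

arr = np.ma.array([1, 1, 1, 1, 1, 1, 1, 1, 1, 3, 3, 3, 3, 3, 3, 3])
arr[:7] = np.ma.masked


def plain_stats(values):
    """Mean, median, and standard deviation with no clipping applied."""
    values = np.asarray(values, dtype=float)
    return float(np.mean(values)), float(np.median(values)), float(np.std(values))


# Only the 9 values that are not masked.
unmasked_only = plain_stats(arr.compressed())
# All 16 values, i.e. what one would get if the mask were really ignored.
mask_ignored = plain_stats(arr.data)

print(unmasked_only)
print(mask_ignored)
```

The first tuple agrees with the output reported above, while the second has a median of 1.0 rather than 3.0, which is consistent with the mask being applied in both calls.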

System Details

macOS-12.4-arm64-arm-64bit
Python 3.10.1 | packaged by conda-forge | (main, Dec 22 2021, 01:39:07) [Clang 11.1.0 ]
Numpy 1.22.0
pyerfa 2.0.0.1
astropy 5.1
Scipy 1.7.3
Matplotlib 3.5.1

CC: @mhvk

@mhvk
Contributor

mhvk commented May 29, 2022

Confirmed this happens on -dev and does not happen on 5.0.4. It is all rather weird, though, as it originates in older code. The warning itself definitely comes from numpy:

In [2]: np.nanmedian(arr)
/usr/lib/python3/dist-packages/numpy/core/fromnumeric.py:755: UserWarning: Warning: 'partition' will ignore the 'mask' of the MaskedArray.
  a.partition(kth, axis=axis, kind=kind, order=order)

And it is clear where the difference between MaskedArray and MaskedColumn arises:

# remove masked values and convert to ndarray
if isinstance(filtered_data, np.ma.MaskedArray):
    filtered_data = filtered_data.data[~filtered_data.mask]

Here, for a MaskedColumn, .data will give a MaskedArray, so the code does not actually return an unmasked array.
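
The effect can be imitated with numpy alone. This is a minimal sketch rather than the astropy code path: it assumes that indexing the masked array directly is a fair stand-in for the MaskedColumn case, where .data is itself a MaskedArray, and partition_warnings is a hypothetical helper introduced for illustration.

```python
import warnings

import numpy as np

arr = np.ma.array([1, 1, 1, 1, 1, 1, 1, 1, 1, 3, 3, 3, 3, 3, 3, 3])
arr[:7] = np.ma.masked


def partition_warnings(values):
    """Return the warning messages emitted by np.partition(values, 0)."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        np.partition(values, 0)
    return [str(w.message) for w in caught]


# For a plain MaskedArray, .data is an ndarray, so this strips the mask.
stripped = arr.data[~arr.mask]
# Indexing the masked array itself keeps the MaskedArray type; this stands in
# for the case where .data returns a MaskedArray instead of an ndarray.
still_masked = arr[~arr.mask]

print(type(stripped).__name__, partition_warnings(stripped))
print(type(still_masked).__name__, partition_warnings(still_masked))
```

Both arrays hold the same 9 values, so any statistic computed from them is identical; only the one that is still a MaskedArray triggers the 'partition' warning.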

A really simple change is to change .data to ._data, which will return a Column instead (and that does not give any warnings).

But let me ping @larrybradley since I'm wondering whether this is wanted.

p.s. For completeness, the Masked class works without warning, but keeps its Masked nature:

from astropy.stats import sigma_clipped_stats
import numpy as np
from astropy.utils.masked import Masked

arr = np.ma.array([1,1,1,1,1,1,1,1,1,3,3,3,3,3,3,3])
arr[:7] = np.ma.masked
print(sigma_clipped_stats(Masked(arr)))
# (MaskedNDArray(2.55555556), MaskedNDArray(3.), MaskedNDArray(0.83147942))

hamogu added a commit to hamogu/marxs that referenced this issue Jan 27, 2023
@larrybradley
Member

> A really simple change is to change .data to ._data, which will return a Column instead (and that does not give any warnings).

I think that makes sense.
