BUG: Regression in interaction between numpy.ma and pandas with 1.24.0 #22826

Closed
mwaskom opened this issue Dec 19, 2022 · 3 comments · Fixed by #22838

mwaskom commented Dec 19, 2022

Describe the issue:

Hello, this just popped up in my tests with the 1.24.0 release:

Reproduce the code example:

import numpy as np, pandas as pd
np.ma.masked_invalid(pd.Series([1., 2.]))

Error message:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In [35], line 2
      1 import numpy as np, pandas as pd
----> 2 np.ma.masked_invalid(pd.Series([1., 2.]))

File ~/miniconda/envs/py310/lib/python3.10/site-packages/numpy/ma/core.py:2360, in masked_invalid(a, copy)
   2332 def masked_invalid(a, copy=True):
   2333     """
   2334     Mask an array where invalid values occur (NaNs or infs).
   2335 
   (...)
   2357 
   2358     """
-> 2360     return masked_where(~(np.isfinite(getdata(a))), a, copy=copy)

TypeError: ufunc 'isfinite' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

NumPy/Python version information:

1.24.0 3.10.6 | packaged by conda-forge | (main, Aug 22 2022, 20:38:29) [Clang 13.0.1 ]

Pandas is 1.5.2

Context for the issue:

The issue seems to be with getdata, which is pulling out the index block manager rather than the values:

np.ma.getdata(pd.Series([1., 2.]))
SingleBlockManager
Items: RangeIndex(start=0, stop=2, step=1)
NumericBlock: 2 dtype: float64

Reporting to numpy rather than pandas as the error emerged with the numpy 1.24.0 update; perhaps it's an issue on the pandas side though.
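A possible caller-side workaround, offered as a hedged sketch rather than anything proposed in this thread: converting the input to a plain ndarray before calling masked_invalid means getdata never sees a `_data` attribute. The `SeriesLike` class below is a hypothetical stand-in for a pandas Series so that the example runs without pandas; with a real Series, `np.asarray(series)` or `series.to_numpy()` would play the same role.

```python
import numpy as np


class SeriesLike:
    """Hypothetical stand-in for pandas.Series (an assumption for illustration).

    It converts to an array via __array__, but also exposes a `_data`
    attribute that is NOT an array, similar to the block manager above.
    """

    def __init__(self, values):
        self._values = list(values)
        self._data = object()  # mimics a non-array internal container

    def __array__(self, dtype=None, copy=None):
        # Always build a fresh ndarray from the stored values.
        return np.array(self._values, dtype=dtype)


s = SeriesLike([1.0, np.nan, np.inf])

# Convert explicitly first, so only a plain ndarray reaches np.ma.
masked = np.ma.masked_invalid(np.asarray(s))
```

This only sidesteps the problem in code that calls np.ma directly; it does not help when a third-party library makes the call internally.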

@charris charris added this to the 1.24.1 release milestone Dec 19, 2022

seberg (Member) commented Dec 19, 2022

This is also related to gh-22720, although a different angle, since np.ma.getdata() didn't change, I think. Although, I guess there is some chance that pandas introduced series._data which causes getdata to misfire.

It seems strange that getdata relies on _data without type-checking first, but I do wonder if reverting is the easier solution for 1.24.x...
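To make the type-checking point concrete, here is a minimal sketch contrasting an attribute-first lookup with a type-checked one. Both functions are simplified illustrations written for this thread, not NumPy's actual source, and `getdata_checked` is only one hypothetical way such a check could look. `SeriesLike` is again a hypothetical stand-in for an object that carries an unrelated `_data` attribute.

```python
import numpy as np


class SeriesLike:
    """Hypothetical array-like whose `_data` is not an ndarray."""

    def __init__(self, values):
        self._values = list(values)
        self._data = object()  # unrelated internal container

    def __array__(self, dtype=None, copy=None):
        return np.array(self._values, dtype=dtype)


def getdata_unchecked(a):
    # Attribute-first lookup: trusts any object that happens to have `_data`.
    try:
        return a._data
    except AttributeError:
        return np.asanyarray(a)


def getdata_checked(a):
    # Hypothetical variant: only trust `_data` on genuine masked arrays.
    if isinstance(a, np.ma.MaskedArray):
        return a._data
    return np.asanyarray(a)


s = SeriesLike([1.0, 2.0])
unchecked = getdata_unchecked(s)  # the opaque container, not an array
checked = getdata_checked(s)      # a float ndarray built via __array__
```

The unchecked variant hands back whatever `_data` happens to be, which is how a ufunc such as isfinite can end up receiving an object it cannot handle.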

EDIT: xref gh-22046 which was the PR introducing the changes.


seberg (Member) commented Dec 19, 2022

Oh, the PR removed an a = np.array(a, copy=copy, subok=True) call, which I think we can just add back (it is called later anyway, and was previously there before getting the data). I wonder if that might also modify the behavior with respect to gh-22720.
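A rough sketch of the idea in this comment, under stated assumptions: `masked_invalid_sketch` is a hypothetical simplification, not the actual patch, and it always copies instead of honoring a `copy` argument. The point it illustrates is that converting to an array first means getdata only ever sees an ndarray (or a MaskedArray, because of subok=True).

```python
import numpy as np


class SeriesLike:
    """Hypothetical array-like whose `_data` is not an ndarray."""

    def __init__(self, values):
        self._values = list(values)
        self._data = object()

    def __array__(self, dtype=None, copy=None):
        return np.array(self._values, dtype=dtype)


def masked_invalid_sketch(a):
    # Convert up front; subok=True keeps MaskedArray inputs (and their masks).
    a = np.array(a, subok=True)
    # `a` is now an ndarray subclass instance, so getdata cannot pick up
    # a foreign `_data` attribute from the original object.
    condition = ~np.isfinite(np.ma.getdata(a))
    # copy=False: `a` is already a fresh copy made above.
    return np.ma.masked_where(condition, a, copy=False)


result = masked_invalid_sketch(SeriesLike([1.0, np.nan]))
```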


mwaskom (Author) commented Dec 19, 2022

I expect that marking 1.24.0 as non-compatible is the best short-term solution? (I'm not calling np.ma directly).

seberg added a commit to seberg/numpy that referenced this issue Dec 19, 2022
This is the minimal solution to fix gh-22826 with as little change
as possible.
We should fix `getdata()` but I don't want to do that in a bug-fix
release really.

IMO the alternative is to revert gh-22046 which would also revert
the behavior noticed in gh-22720 (which seems less harmful though).

Closes gh-22826
charris pushed a commit to charris/numpy that referenced this issue Dec 19, 2022
dweindl added a commit to ICB-DCM/pyPESTO that referenced this issue Dec 20, 2022