BUG?: .fill(np.nan) an int64 array raises ValueError on main #21784

Closed
mroeschke opened this issue Jun 16, 2022 · 4 comments

mroeschke commented Jun 16, 2022

Describe the issue:

We're seeing a change when filling a 0-len int64 array on our numpy dev build in pandas: https://github.com/pandas-dev/pandas/runs/6919574372?check_suite_focus=true

Is this an intended change?

Maybe related to #21437

Reproduce the code example:

In [2]: import numpy as np

In [3]: np.__version__
Out[3]: '1.22.4'

In [4]: subarr = np.empty(0, dtype=np.dtype("int64"))

In [5]: subarr.fill(np.nan)

In [6]: subarr
Out[6]: array([], dtype=int64)

vs

In [1]: np.__version__
Out[1]: '1.24.0.dev0+270.g0eb6865d2'

In [2]: subarr = np.empty(0, dtype=np.dtype("int64"))

In [3]: subarr.fill(np.nan)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Input In [3], in <cell line: 1>()
----> 1 subarr.fill(np.nan)

ValueError: cannot convert float NaN to integer

Error message:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Input In [3], in <cell line: 1>()
----> 1 subarr.fill(np.nan)

ValueError: cannot convert float NaN to integer

NumPy/Python version information:

In [1]: np.__version__
Out[1]: '1.24.0.dev0+270.g0eb6865d2'

(Python 3.9.12 if that matters)

@seberg
Member

seberg commented Jun 16, 2022

The change is unrelated to the array being empty. If it is not empty, the older release gives you:

In [5]: subarr.fill(np.nan)

In [6]: subarr
Out[6]: array([-9223372036854775808])

(where the value is undefined and platform dependent.) This is gh-20924 (not the casting change).

Yes, it was an intended change, because it now works the same way as arr1d[0] = np.nan or np.array([np.nan], dtype="int64"). Before, arr.fill had awkward custom logic used only by fill itself.

But, of course that doesn't mean that we can just get away with the change, unfortunately. Astropy may also have a problem with this, because apparently .fill() used to drop the unit, while now it does not.
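
To make the comparison above concrete, here is a minimal sketch that runs the three operations side by side and records which of them raise ValueError. The helper `raises_value_error` and the small wrapper functions are hypothetical names introduced only for this illustration. Item assignment and array construction are expected to raise; whether `.fill` raises depends on the installed NumPy version, so the sketch only reports it.

```python
import numpy as np


def raises_value_error(op):
    """Hypothetical helper: run op() and report whether it raised ValueError."""
    try:
        op()
    except ValueError:
        return True
    return False


arr1d = np.zeros(1, dtype="int64")


def assign_item():
    # Item assignment of NaN into an integer array.
    arr1d[0] = np.nan


def build_array():
    # Array construction from NaN with an integer dtype.
    np.array([np.nan], dtype="int64")


def fill_array():
    # .fill with NaN on an integer array; behavior differs between releases.
    np.zeros(1, dtype="int64").fill(np.nan)


item_raises = raises_value_error(assign_item)
array_raises = raises_value_error(build_array)
fill_raises = raises_value_error(fill_array)  # version dependent

print(np.__version__, item_raises, array_raises, fill_raises)
```

If `fill_raises` matches the other two flags, the installed build has the aligned behavior described in this comment.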

@seberg seberg changed the title from "BUG?: .fill(np.nan) a 0-len empty int64 array raises ValueError on main" to "BUG?: .fill(np.nan) an int64 array raises ValueError on main" Jun 16, 2022
@seberg seberg added this to the 1.24.0 release milestone Jun 17, 2022
@mroeschke
Author

From the pandas side, it was enough to adjust to this change, as it aligns with similar value-based dtype behavior changes on our end. So feel free to close or address as needed on your side.
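
For anyone else hitting this, one possible downstream adjustment is sketched below. It is an assumption for illustration only and not necessarily what pandas did: the helper name `fill_with_nan` is hypothetical, and it simply upcasts integer or boolean input to float64 before filling, so NaN is representable.

```python
import numpy as np


def fill_with_nan(arr):
    """Hypothetical helper: return an array of the same shape filled with NaN.

    Integer and boolean input is upcast to float64 first, because those
    dtypes cannot represent NaN. Other dtypes are copied unchanged.
    """
    if np.issubdtype(arr.dtype, np.integer) or arr.dtype == np.bool_:
        out = arr.astype(np.float64)  # astype returns a new array
    else:
        out = arr.copy()
    out.fill(np.nan)
    return out


empty_result = fill_with_nan(np.empty(0, dtype=np.dtype("int64")))
filled_result = fill_with_nan(np.zeros(3, dtype=np.dtype("int64")))
print(empty_result.dtype, filled_result)
```

The trade-off is that the result no longer has an integer dtype, which may or may not be acceptable for the calling code.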

@seberg
Member

seberg commented Jun 24, 2022

Thanks for the note! Let's wait a bit. Astropy also notices this in 1-2 tests, and I am not sure they looked at it thoroughly. But if pandas has a backported fix, I think we may be able to just go with it. (There is also some chance this becomes a NumPy 2.0 change, but let's not plan that yet...)

If anyone else runs into it, please make a note, because if pandas is OK with it I would slightly lean towards keeping the change.

@seberg
Member

seberg commented Oct 20, 2022

Nobody else has complained about this yet in the past months, so for the moment it seems we should be good for the release.

@seberg seberg closed this as completed Oct 20, 2022