Support NumPy 2.0 #16998

Closed
stinodego opened this issue Jun 16, 2024 · 9 comments · Fixed by #17384
Labels: A-interop-numpy (Area: interoperability with NumPy), accepted (Ready for implementation), blocked (Cannot be worked on due to external dependencies, or significant new internal features needed first), enhancement (New feature or an improvement of an existing feature)

Comments

@stinodego
Member

stinodego commented Jun 16, 2024

Polars already partially supports NumPy 2.0.

We do not support temporal types yet (e.g. datetime64) - using these will result in a segfault. Support must first be implemented in the PyO3 numpy crate:

When support is implemented and a new release is available, we can upgrade and that should take care of it.

For now, we will pin to numpy<2 in our pipelines.
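A pipeline that relies on this pin can also verify it at startup, so that a stray NumPy 2.x install fails loudly instead of segfaulting later. The following is only a minimal sketch using the standard library; the helper names (`installed_version`, `major_version`, `check_numpy_below_2`) are hypothetical and not part of Polars.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def major_version(version: str) -> int:
    """Return the leading integer of a version string such as '1.26.4'."""
    return int(version.split(".", 1)[0])


def check_numpy_below_2(version: Optional[str]) -> None:
    """Raise if the given NumPy version is 2.x or newer (hypothetical guard).

    A missing NumPy (version is None) is accepted, since NumPy is optional.
    """
    if version is not None and major_version(version) >= 2:
        raise RuntimeError(
            f"NumPy {version} found; temporal conversions may segfault, "
            "install 'numpy<2' instead"
        )
```

Calling `check_numpy_below_2(installed_version("numpy"))` early in a script turns the crash into an ordinary exception that can be caught or reported.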

@stinodego stinodego added enhancement New feature or an improvement of an existing feature accepted Ready for implementation blocked Cannot be worked on due to external dependencies, or significant new internal features needed first P-high Priority: high A-interop-numpy Area: interoperability with NumPy labels Jun 16, 2024
@ritchie46 ritchie46 removed the P-high Priority: high label Jun 17, 2024
@ritchie46
Member

If blocked, it isn't p-high at the moment. :D

@stinodego
Member Author

> If blocked, it isn't p-high at the moment. :D

Indeed. We should have something to mark "high importance" though - which this certainly is.

@braaannigan
Collaborator

I've encountered a segfault when another library converts a Polars DataFrame to NumPy with numpy 2.0.0 and polars 0.20.31.
The downstream issue is here: Nixtla/mlforecast#354

import polars as pl
df = pl.from_repr("""┌───────────┬─────────────────────┬────────────┐
│ unique_id ┆ ds                  ┆ y          │
│ ---       ┆ ---                 ┆ ---        │
│ cat       ┆ datetime[ns]        ┆ f64        │
╞═══════════╪═════════════════════╪════════════╡
│ id_00     ┆ 2000-01-01 00:00:00 ┆ 17.519167  │
│ id_00     ┆ 2000-01-02 00:00:00 ┆ 87.799695  │
│ id_00     ┆ 2000-01-03 00:00:00 ┆ 177.442975 │
│ id_00     ┆ 2000-01-04 00:00:00 ┆ 232.70411  │
│ id_00     ┆ 2000-01-05 00:00:00 ┆ 317.510474 │
│ id_19     ┆ 2000-03-09 00:00:00 ┆ 28.541985  │
│ id_19     ┆ 2000-03-10 00:00:00 ┆ 33.841384  │
│ id_19     ┆ 2000-03-11 00:00:00 ┆ 38.878866  │
│ id_19     ┆ 2000-03-12 00:00:00 ┆ 43.675512  │
│ id_19     ┆ 2000-03-13 00:00:00 ┆ 49.603114  │
└───────────┴─────────────────────┴────────────┘""")
df["ds"].to_numpy()

This gives a seg fault.

@braaannigan
Collaborator

@ritchie46
In fact this is enough to produce a seg fault:

import datetime

import polars as pl

pl.Series([datetime.datetime.now()]).to_numpy()

@stinodego
Member Author

Yes, we are familiar. Nothing we can do about it at the moment, see the linked issue.

@jmoralez

> Nothing we can do about it at the moment

You could also pin numpy here

numpy = ["numpy >= 1.16.0"]

in case you're planning a release soon, that'd ensure compatibility when installing polars[numpy]

@ritchie46
Member

Yes, we must pin to numpy 1.x (numpy<2) until then.

@lesteve

lesteve commented Jul 3, 2024

Bumped into this in scikit-learn #17380 (which I opened before realising this was a known issue). I have to admit I was quite surprised (a segmentation fault is never a nice feeling 😅), but I guess there is not much you can do since numpy is an optional dependency. You can only hope that people run pip install polars[numpy] and not the more naive (as I did) pip install polars numpy.

Edit: on the conda-forge side, it looks like polars depends on numpy and they pin numpy<2, which seems like the right thing to do: conda-forge/polars-feedstock#240. At least the situation will be a bit better with polars 1.0 and conda-forge.
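To see which NumPy constraint an installed package actually declares, its metadata can be inspected from Python. This is a rough sketch, not anything Polars ships: `numpy_requirements` and `declared_numpy_requirements` are hypothetical helper names, and the name filter is a simple regex rather than a full requirement parser. A constraint that appears only under an extra marker applies when that extra is requested, which is why installing the two packages separately is not constrained by it.

```python
import re
from importlib import metadata
from typing import List, Optional

# Matches requirement strings whose project name is exactly "numpy"
# (so "numpy-financial" or "numpydoc" are not picked up).
_NUMPY_NAME = re.compile(r"numpy(?![A-Za-z0-9._-])", re.IGNORECASE)


def numpy_requirements(requires: Optional[List[str]]) -> List[str]:
    """Filter a list of requirement strings down to those naming numpy."""
    return [req for req in (requires or []) if _NUMPY_NAME.match(req.strip())]


def declared_numpy_requirements(dist_name: str) -> List[str]:
    """Return the numpy requirements an installed distribution declares.

    Returns an empty list if the distribution is not installed.
    """
    try:
        return numpy_requirements(metadata.requires(dist_name))
    except metadata.PackageNotFoundError:
        return []
```

For example, `declared_numpy_requirements("polars")` lists whatever the installed polars version declares for numpy, including any extra marker attached to it.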

@pavelzw

pavelzw commented Jul 3, 2024

> At least the situation will be a bit better with polars 1.0 and conda-forge.

We can also do repodata patches to retroactively pin to numpy<2 for older polars versions.
