How to deal with upcoming Numpy 2.0 release and existing scikit-learn releases #227

Closed
betatim opened this issue Aug 22, 2023 · 9 comments · Fixed by #259

Comments

@betatim

betatim commented Aug 22, 2023

Once Numpy 2.0 has been released (end of 2023 or early 2024), versions of scikit-learn that were built against Numpy 1.x won't work with the latest version of Numpy (2.0) anymore. The reason is that Numpy 2 will contain a change to its ABI, which means you need to (at the very least) build your wheels against Numpy 2.0.

I think it isn't great that users who install scikit-learn and don't manually specify numpy<2 will end up with a setup that doesn't work. So far, scikit-learn has not specified an upper bound on the required Numpy version.

I'm wondering what can be done within conda-forge about this. In particular for releases that are in the past.

I can think of two options:

  1. patch the repodata for existing releases to include a numpy<2 requirement
  2. rebuild (some?) old releases against Numpy 2. As I understand it, you can use Numpy 2 to build your wheel and it will work with Numpy 1.x, but the reverse isn't true.
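
The compatibility rule behind option 2 can be written down as a tiny check. This is only a sketch of the rule as stated in this comment; the function name `build_works_with_runtime` is made up for illustration, and it ignores any minimum supported Numpy 1.x version a real build would also have.

```python
def build_works_with_runtime(built_against_major: int, runtime_major: int) -> bool:
    """Return True if a binary built against one Numpy major version
    is expected to load under the given runtime major version.

    Assumption (from the discussion above, simplified):
    - built against Numpy 2 -> usable with Numpy 1.x and 2.x
    - built against Numpy 1 -> usable with Numpy 1.x only
    """
    if built_against_major >= 2:
        return runtime_major in (1, 2)
    return runtime_major == built_against_major


# Existing releases (built against 1.x) break once Numpy 2 is installed.
print(build_works_with_runtime(1, 2))
# A rebuild against Numpy 2 would still serve users on Numpy 1.x.
print(build_works_with_runtime(2, 1))
```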

I'm not a conda-forge expert, so maybe there are more options?

What do people think? Is this a topic that is better addressed at the "top level" of conda-forge instead of package by package (I assume there are more packages that will have this problem)?

xref scikit-learn/scikit-learn#27075

betatim changed the title from "How to deal with Numpy 2.0 release and existing scikit-learn releases" to "How to deal with upcoming Numpy 2.0 release and existing scikit-learn releases" on Aug 22, 2023
@ocefpaf
Member

ocefpaf commented Aug 22, 2023

We can do both, BTW: patch the old builds and rebuild some of them (or, to save energy, just the last release) with np2.

@betatim
Author

betatim commented Aug 22, 2023

Is there something I can/should do, given that I have not the faintest idea how y'all get organised (e.g. will rebuilding just happen together with other packages? is the repodata patch something that happens in this repo or somewhere global?)

@ocefpaf
Member

ocefpaf commented Aug 22, 2023

> Is there something I can/should do, given that I have not the faintest idea how y'all get organised (e.g. will rebuilding just happen together with other packages? is the repodata patch something that happens in this repo or somewhere global?)

The new versions will be correctly pinned when np20 lands. For the older versions and the patch, we expect the maintainers to address the problem. Patching can be done already, and it should be a PR to https://github.com/conda-forge/conda-forge-repodata-patches-feedstock. However, given the scale of the problem, I wonder if the core group should consider a wider patch that touches all the packages...
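
To make the patching idea concrete: a repodata patch would, roughly, tighten the `numpy` dependency on builds that are already published. The following is a minimal standalone sketch and not the code used by conda-forge-repodata-patches-feedstock. The function name `add_numpy_upper_bound`, the `<2.0a0` bound, and the sample record are assumptions for illustration.

```python
def add_numpy_upper_bound(records, package="scikit-learn", bound="<2.0a0"):
    """Return patch instructions that cap numpy for builds of `package`.

    `records` maps artifact filenames to repodata-style records, each a
    dict with a "name" and a "depends" list of "name version" strings.
    Only numpy dependencies without an existing upper bound are changed.
    The input is not modified.
    """
    patched = {}
    for filename, record in records.items():
        if record.get("name") != package:
            continue
        new_depends = []
        changed = False
        for dep in record.get("depends", []):
            parts = dep.split()
            if parts and parts[0] == "numpy" and "<" not in dep:
                if len(parts) == 1:
                    # Bare "numpy" with no version constraint at all.
                    dep = f"numpy {bound}"
                else:
                    # Append the bound to the existing version constraint.
                    parts[1] = f"{parts[1]},{bound}"
                    dep = " ".join(parts)
                changed = True
            new_depends.append(dep)
        if changed:
            patched[filename] = {"depends": new_depends}
    return patched


# Hypothetical record, for illustration only.
sample = {
    "scikit-learn-1.3.0-example.conda": {
        "name": "scikit-learn",
        "depends": ["numpy >=1.23.5", "python >=3.11"],
    }
}
print(add_numpy_upper_bound(sample))
```

Builds that already carry an upper bound on numpy are left untouched, so running the function twice gives no further changes.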

Now, to rebuild the old versions, it will depend on the state of this feedstock and how many previous versions you want to rebuild. The steps would be:

  1. wait for np20 to land
  2. decide whether only the current version or some number of previous versions will be rebuilt
  3. create branches for the previous versions, rerender, and bump the build number
  4. if only the current one must be rebuilt, we hope that it will get addressed by a migration and no user action will be necessary. Note that, for the rerender in 3 to work, one must wait for the migration to complete first.
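
As an illustration of the "bump the build number" part of step 3, here is a small sketch that increments the build number in a recipe's text. It assumes the recipe stores it as a literal `number: N` line under `build:`; recipes that set it through a Jinja variable would need a different pattern. The helper name `bump_build_number` is hypothetical, and in practice this edit is made by hand on the version branch before rerendering.

```python
import re

# Matches a line such as "  number: 0", capturing the prefix and the digits.
_BUILD_NUMBER = re.compile(r"^([ \t]*number:[ \t]*)(\d+)[ \t]*$", re.MULTILINE)


def bump_build_number(meta_yaml_text: str) -> str:
    """Return the recipe text with the first `number: N` line incremented."""

    def _bump(match: re.Match) -> str:
        return f"{match.group(1)}{int(match.group(2)) + 1}"

    new_text, count = _BUILD_NUMBER.subn(_bump, meta_yaml_text, count=1)
    if count == 0:
        raise ValueError("no literal 'number: N' line found in recipe text")
    return new_text


recipe = "build:\n  number: 0\n  script: python -m pip install .\n"
print(bump_build_number(recipe))
```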

PS0: If you want to try the patching now you can use this example to get you started.

PS1: Some of the info above may change as we discuss the best strategy. Maybe you won't need to do anything. Stay tuned.

@jakirkham
Member

jakirkham commented Aug 22, 2023

Agree with Filipe

Also, I have raised an issue (conda-forge/conda-forge-repodata-patches-feedstock#516) to discuss repodata patching. More generally, I added an issue (conda-forge/conda-forge.github.io#1997) to discuss the general NumPy 2 bringup.

Adding to conda-forge agenda tomorrow (though it is kind of packed already so this may slip to a later meeting)

@jakirkham
Member

> Adding to conda-forge agenda tomorrow (though it is kind of packed already so this may slip to a later meeting)

Unfortunately we didn't get to it today. The agenda was pretty packed already

That said, there is some offline discussion and this can come up in the next meeting

@betatim
Author

betatim commented Aug 29, 2023

One thing I just realised is that it is unlikely that the latest released version of scikit-learn (1.3.0) will work with Numpy 2.0, even after a rebuild (because changes are needed to the actual code). So maybe there is no need for rebuilding and the repodata patch is the thing to do.

@jjerphan jjerphan mentioned this issue Jan 18, 2024
@ogrisel
Contributor

ogrisel commented Mar 29, 2024

The scikit-learn 1.4.1 release might work though.

@jakirkham
Member

There is now a package for NumPy 2.0.0rc1, which can be installed like so: conda-forge/numpy-feedstock#311 (comment)

Maybe the next step here is to test building with that package?

@jakirkham
Member

There is now a PR to upgrade to NumPy 2! 😄

xref: #259
