Skip to content

Conversation

@oscarkey
Copy link
Contributor

@oscarkey oscarkey commented Oct 9, 2025

The CI is currently failing, see
https://github.com/PriorLabs/TabPFN/actions/runs/18370843390/job/52333508672.

The issue is that we get numpy>2 but an older version of scikit-learn, which is incompatible. There are two reasons for this:

  1. Older versions of scikit-learn don't specify a maximum numpy version but are incompatible with numpy>2. When using uv's lowest-direct mode, as in the CI, scikit-learn is resolved to an old version (as it is a direct dependency) but numpy is resolved to the newest specified compatible version, which is >2. However, this is actually incompatible.
  2. We install onnx in a separate command on the CI, which upgrades numpy to >2 even if (1) is fixed.

I fix (1) by specifying numpy as a direct dependency with a minimum version <2. Thus the lowest-direct resolution now chooses a compatible version <2. I selected the minimum version as the current minimum version required by our dependencies, so there should be no effect on compatibility.

I fix (2) by creating a new "ci" optional dependency group, and installing this rather than using a separate uv pip install command.

Note that (1) could still be a problem for users, but this problem has existed forever and we haven't had any complaints. It's also common to any package that depends on scikit-learn.

Our code directly imports numpy but we don't delcare it as an explicit
dependency, so we should do this.

This has the side-effect of fixing the test failures seen in
https://github.com/PriorLabs/TabPFN/actions/runs/18370843390/job/52333508672
Older versions of scikit-learn don't specify a maximum numpy version but
are incompatible with numpy>2. When using uv's lowest-direct mode, as in
the CI, scikit-learn is resolved to an old version (as it is a direct
dependency) but numpy is resolved to the newest specified compatible
version, which is >2. However, this is actually incompatible. Adding the
explicit numpy dependency is a trick which results in it being resolved
to <2. This doesn't help real users, but this issue has existed forever
and we haven't had any problems reported, and is common to all packages
depending on scikit-learn.

I selected the minimum version as the current minimum version required
by our dependencies, so this should have no effect on compatibility.
Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request adds numpy as a direct dependency to address an incompatibility issue between older scikit-learn versions and numpy>=2.0. The change is well-motivated. However, the specified version constraint for numpy is <3, which allows numpy 2.x and does not fully resolve the issue described. I've suggested changing the constraint to <2 to ensure compatibility for all users.

This avoids the CI pipeline upgrading numpy again.
@oscarkey oscarkey changed the title Specify numpy as a direct dependency. Manage CI dependencies in pyproject and specify numpy dependency. Oct 9, 2025
@oscarkey oscarkey requested a review from LeoGrin October 9, 2025 10:19
Copy link
Collaborator

@LeoGrin LeoGrin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM thanks!

@oscarkey oscarkey merged commit 572156b into main Oct 9, 2025
10 checks passed
@oscarkey oscarkey deleted the ok-numpy-version branch October 9, 2025 10:29
@oscarkey
Copy link
Contributor Author

oscarkey commented Oct 9, 2025

Thank you!

@brendan-priorlabs : I created the "CI" dependency group in pyproject.toml as you suggested in #526. As this was blocking merges I decided to get it in quickly, but if you'd suggest any changes I'm happy to make them now :)

@brendan-priorlabs
Copy link
Contributor

Thanks, @oscarkey! For the record, we discussed how I was surprised that installing onnx separately forced an upgrade to numpy that apparently isn't necessary (given that installing them jointly doesn't pull it in), but maybe that's just a behavior of uv

oscarkey added a commit that referenced this pull request Nov 12, 2025
…mpy dependency. (#174)

* Record copied public PR 541

* Manage CI dependencies in pyproject and specify numpy dependency. (#541)

The CI is currently failing, see
https://github.com/PriorLabs/TabPFN/actions/runs/18370843390/job/52333508672.

The issue is that we get numpy>2 but an older version of scikit-learn, which is incompatible. There are two reasons for this:
1. Older versions of scikit-learn don't specify a maximum numpy version but are incompatible with numpy>2. When using uv's lowest-direct mode, as in the CI, scikit-learn is resolved to an old version (as it is a direct dependency) but numpy is resolved to the newest specified compatible version, which is >2. However, this is actually incompatible.
2. We install onnx in a separate command on the CI, which upgrades numpy to >2 even if (1) is fixed.

I fix (1) by specifying numpy as a direct dependency with a minimum version <2. Thus the lowest-direct resolution now chooses a compatible version <2. I selected the minimum version as the current minimum version required by our dependencies, so there should be no effect on compatibility.

I fix (2) by creating a new "ci" optional dependency group, and installing this rather than using a separate `uv pip install` command.

Note that (1) could still be a problem for users, but this problem has existed forever and we haven't had any complaints. It's also common to any package that depends on scikit-learn.

---------

Co-authored-by: mirror-bot <mirror-bot@users.noreply.github.com>
Co-authored-by: Oscar Key <oscar@priorlabs.ai>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants