Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Making the numpy test suite separately installable #26289

Open
rgommers opened this issue Apr 16, 2024 · 11 comments
Open

Making the numpy test suite separately installable #26289

rgommers opened this issue Apr 16, 2024 · 11 comments

Comments

@rgommers
Copy link
Member

No longer shipping the tests with the numpy wheels uploaded to PyPI was discussed in gh-25737. This is now possible after gh-26274. Here are some questions about how we want to go about this. My assumption is that we want to do this in a fairly non-disruptive way, meaning no moving of all tests files (that would yield a ton of merge conflicts on all open PRs).

I think these will be the next steps in a follow-up PR to gh-26274:

  • Drop the tests label from the default set of install tags in pyproject.toml or modify our cibuildwheel config to do that (this needs deciding)
  • Start producing an installable test suite (numpy-tests wheels).
    • Decision 2: do we want that to land in-tree inside a numpy install, or out-of-tree (e.g. in a separate numpy-tests directory)? The latter seems cleaner, but may require some more tweaks to test structure (e.g., import numpy._core._multiarray_tests needs modifying).
  • Tweak our CI for wheel building to make everything work again.
  • Ensure that python -c "import numpy; numpy.test()" gives a clear error message when the tests are not installed
  • Add docs for distro packagers and others who want to run the tests.

Thoughts?

@ngoldbaum
Copy link
Member

The latter seems cleaner, but may require some more tweaks to test structure (e.g., import numpy._core._multiarray_tests needs modifying).

I agree that having the tests come from a separate package but install in-tree is not great. Maybe it would be better to have a little pain from merge conflict fallout and just bite the bullet and move all the tests into a tests subfolder outside the numpy module? It would be strange and possibly lead to some confusion in debugging for the tests to show up in-tree for development installs, but then be installed out-of-tree for pip installs.

@seberg
Copy link
Member

seberg commented Apr 18, 2024

The biggest annoyance about moving the tests is maybe even that it makes backporting hard so yeah, would be nice to avoid. OTOH, it is also a bit odd to have a very different test layout for development and non-dev installs.
(I don't have a gut feeling for exactly how terrible an in-tree install is.)

import numpy._core._multiarray_tests needs modifying

Do you want to bother moving these C test files/helpers? I am not going to complain, but it doesn't really seem worth the build-setup hassle (of course I didn't check how big these .so are, but I would be surprised if they are large).

@charris
Copy link
Member

charris commented Apr 18, 2024

Maybe it would be better to have a little pain from merge conflict fallout

There was a discussion about moving the tests back when we switched to using pytest, as that was the recommended arrangement. If we do decide to do it, let's wait until 2.0.1 is out, I expect we will branch 2.1 soon after

@charris
Copy link
Member

charris commented Apr 18, 2024

I think the most important thing is that running tests remains easy.

@rgommers
Copy link
Member Author

Maybe it would be better to have a little pain from merge conflict fallout

This could be done, but I'm inclined to try to avoid it at least for the moment. Because it'd probably quite a bit of pain, and we don't want to do it now because of backports etc.

Do you want to bother moving these C test files/helpers?

About 300 kb installed, 100 kb wheel size. When the total wheel size is down to 3.5 MB (on macOS), that's still significant enough. So yes, I'd like to drop them unless it turns out to be too annoying to do so.

I think the most important thing is that running tests remains easy.

Agreed, that is a hard constraint.

or modify our cibuildwheel config to do that (this needs deciding)

This is probably the way to go for now. What it would mean:

  1. No change for spin usage, nor for editable installs
  2. No change when users build from source
  3. No change by default for distro packagers, they can opt in to drop test
  4. A minor change to our cibuildwheel setup to drop tests, and a hopefully only slightly larger change to still be able to run the tests in CI against an installed numpy wheel which itself contains no tests.

It's the second half of (4) that needs investigating.

We can discuss in the next community meeting perhaps?

@mhvk
Copy link
Contributor

mhvk commented Apr 18, 2024

Just a small 👍 to having the option not to include the tests, but a 👎 on having them install somewhere else than in the numpy directories. Things should just remain in the same place, with the only difference being that one does not have to install the tests; having two possible locations is just going to cause pain further down the line (if one does it, it should be for all of numpy, but this seems a lot of churn for very little benefit).

p.s. A possible alternative to a new numpy-test package would be that if you want the tests, you have to install the full thing. I.e., one could envision numpy and numpy-lean packages. This seems somewhat worse, though, than having two packages which install in interleaved directories.

@agriyakhetarpal
Copy link
Contributor

agriyakhetarpal commented Apr 19, 2024

Hi, for some more context, this is how it is being done for pandas right now: pandas-dev/pandas#53007. Since the test suite as a separate package would be pure Python, it would make sense to use extensible backends like setuptools or hatch for those and read the version from a file (pyproject.toml) dynamically – because matching the NumPy version for the package and the test suite and keeping them in sync would be important.

@tylerjereddy
Copy link
Contributor

tylerjereddy commented Apr 19, 2024

FWIW, I'm -0.5 on making test sources optionally associated with a project, even isolating to the case of distributed versions. I've never liked the complications that can arise when I've developed on other projects that do it, and prefer the size cost of "always there" to the mental overhead cost of "may not be there," or "may be from a mismatched test package vs. source package situation."

That said, sounds like maintainers are actually "ok" with it, so just making that somewhat obvious observation.

@rgommers
Copy link
Member Author

but a 👎 on having them install somewhere else than in the numpy directories.

That is my preference as well indeed. I need to test the install/uninstall experience to ensure there's no strange effects from targeting the same site-packages/numpy. I expect it works fine, since namespace packages do the same, but need to make sure.

Since the test suite as a separate package would be pure Python

It currently isn't, there are extension modules built that are test-only:

$ ls build/numpy/_core/*test*.so
build/numpy/_core/_multiarray_tests.cpython-311-x86_64-linux-gnu.so
build/numpy/_core/_operand_flag_tests.cpython-311-x86_64-linux-gnu.so
build/numpy/_core/_rational_tests.cpython-311-x86_64-linux-gnu.so
build/numpy/_core/_struct_ufunc_tests.cpython-311-x86_64-linux-gnu.so
build/numpy/_core/_umath_tests.cpython-311-x86_64-linux-gnu.so

That could be changed perhaps, but I'm really looking for the most minimal change possible here at the moment. Either way, we don't want to change build backends, that'd be very invasive for no real reason. We can do this with a one-line patch I believe.

and prefer the size cost of "always there" to the mental overhead cost of "may not be there

This belongs more on gh-25737 (this issue assumes we are doing this, unless it turns out to be too disruptive). I added context in #25737 (comment).

@rgommers
Copy link
Member Author

Okay, some testing results for the in-numpy-tree install of a separate numpy-tests package.

# build a numpy wheel with no tests
$ python -m build -wnx -Cbuild-dir=build -Cinstall-args="--tags=runtime,devel,python-runtime"
$ git stash apply  # apply patch below
$ # build a numpy-tests wheel
$ python -m build -wnx -Cbuild-dir=build -Cinstall-args="--tags=tests"

Patch needed for the numpy-tests wheel building:

diff --git a/pyproject.toml b/pyproject.toml
index b4df3c36d7..2d15cfb0de 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -6,7 +6,7 @@ requires = [
 ]
 
 [project]
-name = "numpy"
+name = "numpy-tests"
 version = "2.1.0.dev0"
 # TODO: add `license-files` once PEP 639 is accepted (see meson-python#88)
 license = {file = "LICENSE.txt"}
@@ -17,6 +17,7 @@ maintainers = [
     {name = "NumPy Developers", email="numpy-discussion@python.org"},
 ]
 requires-python = ">=3.10"
+dependencies = ["numpy"]
 readme = "README.md"
 classifiers = [
     'Development Status :: 5 - Production/Stable',
@@ -40,16 +41,6 @@ classifiers = [
     'Operating System :: MacOS',
 ]
 
-[project.scripts]
-f2py = 'numpy.f2py.f2py2e:main'
-numpy-config = 'numpy._configtool:main'
-
-[project.entry-points.array_api]
-numpy = 'numpy'
-
-[project.entry-points.pyinstaller40]
-hook-dirs = 'numpy:_pyinstaller_hooks_dir'
-
 [project.urls]
 homepage = "https://numpy.org"
 documentation = "https://numpy.org/doc/"

Testing:

$ cd dist
$ ls -lh
$ ls -lh
total 7,6M
-rw-r--r-- 1 rgommers rgommers 6,0M 22 apr 09:11 numpy-2.1.0.dev0-cp311-cp311-linux_x86_64.whl
-rw-r--r-- 1 rgommers rgommers 1,6M 22 apr 10:36 numpy_tests-2.1.0.dev0-cp311-cp311-linux_x86_64.whl

$ pip install numpy-2.1.0.dev0-cp311-cp311-linux_x86_64.whl
$ python -c "import numpy as np; print(np.arange(3))"  # smoke test, works
[0 1 2]
$ python -c "import numpy as np; np.test()"
...
ERROR (can be made informative)
$ pip install numpy_tests-2.1.0.dev0-cp311-cp311-linux_x86_64.whl
$ python -c "import numpy as np; np.test()"
...
47858 passed, 876 skipped, 2783 deselected, 34 xfailed, 5 xpassed in 200.21s (0:03:20)

So far so good. Now the uninstall behavior:

$ pip uninstall numpy-tests
$ python -c "import numpy as np; print(np.arange(3))"  # still works
[0 1 2]

$ pip uninstall numpy
$ python -c "import numpy as np; print(np.arange(3))"  # hitting a corner here 
...
AttributeError: module 'numpy' has no attribute 'arange'

That's not ideal, something gets left behind, turning the expected ImportError into AttributeError.

$ ls ~/mambaforge/envs/numpy-dev/lib/python3.11/site-packages/numpy
_core      f2py  lib     ma         polynomial   _pyinstaller  testing  typing
distutils  fft   linalg  matrixlib  __pycache__  random        tests
$ ls ~/mambaforge/envs/numpy-dev/lib/python3.11/site-packages/numpy/ma
tests
$ ls ~/mambaforge/envs/numpy-dev/lib/python3.11/site-packages/numpy/ma/tests/
__pycache__
$ ls ~/mambaforge/envs/numpy-dev/lib/python3.11/site-packages/numpy/ma/tests/__pycache__/
test_arrayobject.cpython-311-pytest-7.4.3.pyc   test_mrecords.cpython-311-pytest-7.4.3.pyc
test_core.cpython-311-pytest-7.4.3.pyc          test_old_ma.cpython-311-pytest-7.4.3.pyc
test_deprecations.cpython-311-pytest-7.4.3.pyc  test_regression.cpython-311-pytest-7.4.3.pyc
test_extras.cpython-311-pytest-7.4.3.pyc        test_subclassing.cpython-311-pytest-7.4.3.pyc

Conclusion: what gets left behind is .pyc files created by pytest (I verified that if we didn't run the tests, everything does get uninstalled as expected and we get ImportError as expected). There is discussion in pypa/pip#11835, which explains it. It is not a pip issue, the cause is pytest not just creating the .pyc file but modifying the file extension from .cpython-311.pyc to .cpython-311-pytest-7.4.3.pyc.

That is really weird behavior from pytest of course; pip cannot know it has to uninstall such renamed files. Checking this, it turns out we see the same behavior in our regular numpy packages:

$ ls ~/mambaforge/envs/numpy-dev/lib/python3.11/site-packages/numpy  # ensure we start from a clean slate
... No such file or directory
$ pip install numpy==1.26.4
$ python -c "import numpy as np; np.test()"
$ pip uninstall numpy
...
Successfully uninstalled numpy-1.26.4

$ ls ~/mambaforge/envs/numpy-dev/lib/python3.11/site-packages/numpy
array_api  core       f2py  lib     ma         polynomial   _pyinstaller  testing  typing
compat     distutils  fft   linalg  matrixlib  __pycache__  random        tests
$ python -c "import numpy as np; print(np.arange(3))"
...
AttributeError: module 'numpy' has no attribute 'arange'

So splitting off the tests doesn't change this. Given that this behavior is broken and I don't think we knew about this problem, that suggests that all this proves is that very few users ever run the tests.

tl;dr I think this approach works.

@rgommers
Copy link
Member Author

Discussed in the community meeting just now. In general people were 👍🏼 on moving ahead with this. Requests:

  • When running the tests and numpy-tests is installed, ensure that the versions of numpy and numpy-tests match exactly (e.g., check np.version for both version number and git hash) to avoid problems with spurious failures.
  • When trying to run the tests and numpy-tests is not installed, raise an informative error message.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants