Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Include tests in Source Tarball #965

Open
tommy-waltmann opened this issue May 18, 2022 · 6 comments
Open

Include tests in Source Tarball #965

tommy-waltmann opened this issue May 18, 2022 · 6 comments

Comments

@tommy-waltmann
Copy link
Collaborator

tommy-waltmann commented May 18, 2022

Is your feature request related to a problem? Please describe.

We would like to start unit testing the builds on the feedstock before they are deployed to conda-forge. Right now, we cannot do that because we do not include the tests in the source tarball we upload to PyPI.

Describe the solution you'd like

Update MANIFEST.in to include the tests directory in the source tarball.

Additional context

In our 2.10 release process (conda-forge/freud-feedstock#48), we tried to test builds before deploying, but that wasn't possible because of this issue.

@bdice
Copy link
Member

bdice commented May 19, 2022

This will roughly double the size of the tarball. Locally, I see that adding

recursive-include tests *

to MANIFEST.in takes the tarball from 3.2 MB to 5.9 MB. Just wanted to provide that data point for reference / curiosity. If we bundle the tests and corresponding test data files, we should be extremely cautious not to add large data files to the tests folder in the future.

We cannot include the doc folder because it would be prohibitively large (416 MB uncompressed), due to the freud-examples submodule and example data / notebooks.

@joaander
Copy link
Member

Does PyPI limit the size of source tarball? Do we expect users on metered connections to download the source tarball from PyPI repeatedly? Are you concerned about increasing the time needed to build binaries on conda-forge? Should we keep the PyPI source tarball small in order to segregate users into groups that a) use PyPI and are not allowed to run unit tests and b) those that use other distribution mechanisms that can run unit tests?

What alternate solution do you propose?

@bdice
Copy link
Member

bdice commented May 19, 2022

There are PyPI size limits of around 60 MB, which mostly affect packages with complex dependencies like CUDA: pypa/packaging-problems#86

I posted that size data purely for informational purposes, I don’t think the potential harm of increased size is enough to outweigh the potential benefit of including/running tests more widely.

As a note, I think this would also include the tests and test data in the binary wheels. It is a little awkward to bundle test code that cannot be executed from the wheel (at least not as easily as from the tarball). Some packages put their tests into a tests module inside the package, which would make it possible to execute from any import of the package. Alternatively we could use the tag (known by the conda-forge recipe) to pull the tests from GitHub and execute those on conda-forge when building, rather than including them in the tarball. That would let us keep the source distribution/wheels slim, which is appreciated by downstream consumers (less bandwidth, smaller Docker images, etc.).

I leave the decision of how to proceed to others — hope that the context and ideas are helpful.

@tommy-waltmann
Copy link
Collaborator Author

I support the idea of pulling the tests from github instead of including them in the source tarball if its not too much trouble to add that to the conda-forge scripts.

@joaander
Copy link
Member

This is technically feasible. I leave the details up to you.

In the spirit of Bradley's FYI: Note that conda-forge explicitly discourages using git checkouts during the build process: https://conda-forge.org/docs/maintainer/adding_pkgs.html#build-from-tarballs-not-repos, one of the reasons being size. Freud's git repository is 558M, so be sure to use a shallow checkout if you do go that route. You would be downloading then 3.2 MB for the source tarball + 4.78 MB for the source from git to get the tests. Compared to the option of installing the tests, this also prevents users from easily testing their installations: e.g.

conda install freud
python3 -m pytest --pyargs freud

That being said, it would be a non-trivial amount of effort to install the tests as they include test data input in addition to .py files. pip does not make these easy to install with the correct permissions, and you may need to update code that discovers the paths to these files. I have an example solving the 2nd issue in fresnel: https://github.com/glotzerlab/fresnel/tree/master/fresnel/test.

@bdice
Copy link
Member

bdice commented May 19, 2022

conda-forge explicitly discourages using git checkouts during the build process

Right. I’m talking about using commands in the test section of the recipe (or an equivalent bash script) to fetch and run the tests, not using git_url to point to the source files for the build. The commands could fetch the tagged GitHub tarball/zip to avoid the need for git entirely. The GitHub tarball/zip will include the repo contents, including tests, but not including submodules, if I recall correctly (the lack of submodules is why we can’t use it to build).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants