-
-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Docs] Improve mentions to MANIFEST.in
and instructions for including data files
#3148
Conversation
If using the ``include_package_data`` argument, files specified by | ||
``package_data`` will *not* be automatically added to the manifest unless | ||
they are listed in the |MANIFEST.in|_ file or by a plugin like | ||
:pypi:`setuptools-scm` or :pypi:`setuptools-svn`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if this affirmation still holds (after #2844), I have the impression that sometimes package_data
is enough and does not require MANIFEST.in
...
This does seem to be the case in the experiment performed in: https://github.com/abravalheri/experiment-setuptools-package-data
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
THANK YOU for that repo with the experiments, it is super helpful!
docs/userguide/datafiles.rst
Outdated
@@ -18,9 +18,10 @@ e.g.:: | |||
) | |||
|
|||
This tells setuptools to install any data files it finds in your packages. | |||
The data files must be specified via the distutils' ``MANIFEST.in`` file. | |||
The data files must be specified via the distutils' |MANIFEST.in|_ file. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we still want to call this is a distutils
file? I think it is confusing to mention distutils
at all since it is being largely hidden in setuptools
.
configuration and all C sources listed as part of extensions | ||
(it doesn't catch C headers, though). | ||
|
||
However, when building more complex packages (e.g. packages that include |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is worth calling out that wheels (and the final installed package) contain only the data files that are contained within listed package
directories, and only when include_package_data
is True. Specifically this can be used to include a top-level tests
folder in an sdist
using MANIFEST.in
. But by not including tests
in the packages
metadata (by using explicit packages or include
/exclude
with find_packages
, you avoid shipping the tests in the wheel and polluting site-packages
with a tests
folder. The tests in the sdist can be used by downstream packagers like Linux distros or conda-forge to test the package before re-packaging. Maybe this should go in the PyPUG itself?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is worth calling out that wheels (and the final installed package) contain only the data files that are contained within listed
package
directories, and only wheninclude_package_data
is True
Ummm... in theory wheels could contain files outside the package
directories via the .data
subdir in the wheel specification. That however is super tricky, because it will depend on the installation schema that varies with OS/platform/virtualenv etc...
So we want to move users away from that.
Maybe this should go in the PyPUG itself?
It is definitely a good idea to add a discussion about tests/docs in the sdist to PyPUB! Maybe at some point in the future I will give it a try and propose a PR there (but please also feel free to go ahead and open one if you have the time).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @blink1073, in 8252975 I tried to make these relationships more clear. Would you mind to have a look and see if the text is more clear/has improved?
There is also a link to the rendered version here: https://setuptools--3148.org.readthedocs.build/en/3148/userguide/miscellaneous.html#controlling-files-in-the-distribution
Co-authored-by: Steven Silvester <steven.silvester@ieee.org>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you very much for the kind review @blink1073!
I will try to think about something for the sdist
x wheel
behaviour.
Please note that, when using ``include_package_data=True``, only files **inside | ||
the package directory** are included in the final ``wheel``, by default. | ||
|
||
So for example, if you create a :term:`Python project <Project>` that uses |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about adding something like: "If you are using MANIFEST.in
, and you include a tests
directory that is not included in the package director(ies), the tests
will also be present in the sdist
but not in the wheel
. If using find_packages
, ensure that your glob pattern does not include the tests
directory, or you use explicit include
or exclude
config along with find_packages
to ensure that the tests
folder is not included. Note: if you accidentally include a top level tests
folder (or another top level folder like docs
) in your packages
config, it will end up in site-packages
and get merged with any other package that accidentally does the same."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However, I'm starting to think that all of this content should be in PyPUG and we should reference it from here so there is a single source of truth.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you very much @blink1073 for the text review.
I understand your concerns here, but I am starting to think that this is a non-trivial amount of information to be casually dropped in the middle of this section.
Maybe it would be best would be to add a separated section/page including the "most common configuration errors"?
That could live in the PyPUG, or even in setuptools docs if the maintainers find that too tool-specific.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I am already concerned about having 2 footnotes here... too much branching in the information flow)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point about being tool-specific. It seems like anything that isn't related to a PEP should go in these docs (same with flit-specific options for example).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But perhaps that needs to wait until setuptools
handles PEP 621 and PEP 660 so there is a uniform set of docs that can go in PyPUG and then refer out to tool-specific docs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I will stop the changes in this PR at this stage. Do you think it is in a good shape (although incomplete, I think it is an improvement over the current status of the docs...)
Co-authored-by: Steven Silvester <steven.silvester@ieee.org>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thank you!
Motivation
Using
MANIFEST.in
and adding non-Python files seems to be one of the most difficult aspects that beginners face when getting started withsetuptools
, as evidenced by a recent comment in pypa/packaging-problems#84 (comment).The current docs also lack an explanation for the syntax of
MANIFEST.in
that is not bundled together with the deprecated/distutils documentation.Summary of changes
MANIFEST.in
to existing docs,MANIFEST.in
miscellaneous.rst
about how to control files in the distribution that also links to the PyPUG document aboutMANIFEST.in
.MANIFEST.in
.pkg_resources
being the preferred way of accessing data files, in favour ofimportlib.resources
.Pull Request Checklist
changelog.d/
.(See documentation for details)