Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FR] Automatically include files mentioned in pyproject.toml (e.g. license = { file = ...}) into the sdist #3570

Closed
abitrolly opened this issue Aug 26, 2022 · 28 comments · Fixed by #3779

Comments

@abitrolly
Copy link

setuptools version

65.3.0

Python version

3.10

OS

Fedora 36

Additional environment information

No response

Description

Specifying license information as license = {file = "UNLICENSE"} gives warning warnings.warn(f"File {path!r} cannot be found") when building package.

Expected behavior

No warning.

How to Reproduce

  1. Rename license file to UNLICENSE
  2. In pyproject.toml refer to it as license = {file = "UNLICENSE"}
  3. Build with python -m build

or

git clone https://github.com/abitrolly/sqlite2png
cd sqlite2png
python -m build

Output

* Installing packages in isolated environment... (setuptools >= 40.8.0, wheel)
* Getting dependencies for wheel...
/tmp/build-env-h9x6pai5/lib64/python3.10/site-packages/setuptools/config/pyprojecttoml.py:108: _BetaConfiguration: Support for `[tool.setuptools]` in `pyproject.toml` is still *beta*.
  warnings.warn(msg, _BetaConfiguration)
/tmp/build-env-h9x6pai5/lib64/python3.10/site-packages/setuptools/config/expand.py:144: UserWarning: File '/tmp/build-via-sdist-a53010x_/sqlite2png-0.1.4/UNLICENSE' cannot be found
  warnings.warn(f"File {path!r} cannot be found")
@abitrolly abitrolly added bug Needs Triage Issues that need to be evaluated for severity and status. labels Aug 26, 2022
@abitrolly abitrolly changed the title [BUG] UNLICENSE references as file is not copied [BUG] UNLICENSE referenced as file is not copied Aug 26, 2022
@abravalheri
Copy link
Contributor

Hi @abitrolly, thank you very much for reporting this issue.

Quick question before investigating this issue deeper: have you tried to add the UNLICENSE file to a MANIFEST.in or use a plugin like setuptools-scm? The reason why I am asking this is that in order for the file to be included in the wheel, the file has to be included first in the sdist (build will create an sdist first and then create a wheel from that sdist).

By default setuptools will automatically include only files that match the LICENSE* glob pattern. Other then the very basic scenarios1, users are required to specify any other files that are part of the distribution via MANIFEST.in or to use a plugin that understands the version control system (e.g. setuptools-scm for git and mercurial or setuptools-svn for svn).

The MANIFEST.in file is documented by this page and the overall behaviour of setuptools is documented by this other page.

Footnotes

  1. For example, Python only files + a README.{rst,md} + a LICENSE* file

@abitrolly
Copy link
Author

@abravalheri I expected the file to be included automatically, because it is referenced. If it needed to included explicitly, how can I do this from pyproject.toml?

@abravalheri
Copy link
Contributor

 I expected the file to be included automatically, because it is referenced

This sounds like a reasonable expectation.

If I understood correctly, this issue would be better classified as a feature request. Currently setuptools does not offer this capability1.

(I might have seen some similar discussion but in the context of file: in setup.cfg)

 If it needed to included explicitly, how can I do this from pyproject.toml?

This specific functionality is only offered via MANIFEST.in. If you don't want to use MANIFEST.in, your best option is to use a version control system plugin like setuptools-scm:

[build-system]
requires = ["setuptools", "setuptools-scm"]
build-backend = "setuptools.build_meta"

Footnotes

  1. Feature requests have more chances of actually making into setuptools via contributions from the proponent/community.

@abitrolly
Copy link
Author

If I understood correctly, this issue would be better classified as a feature request. Currently setuptools does not offer this capability1.

Fell free to reclassify it. Is there anything that is needed from my side?

@abravalheri abravalheri changed the title [BUG] UNLICENSE referenced as file is not copied [FR] Automatically include files mentioned in pyproject.toml (e.g. license = { file = ...}) into the sdist Aug 31, 2022
@abravalheri abravalheri added help wanted and removed bug Needs Triage Issues that need to be evaluated for severity and status. Waiting User Feedback labels Aug 31, 2022
@abravalheri
Copy link
Contributor

This feature request is closely related to #2821.

Is there anything that is needed from my side?

Please feel free to submit a PR :)

@abitrolly
Copy link
Author

@abravalheri I am not even sure where to start with.

@abravalheri
Copy link
Contributor

@abravalheri I am not even sure where to start with.

We are probably in the same boat :P
Since this is not a feature setuptools was initially designed with I believe the best place to start is to study the sdist class and see how the files are selected to be included (or not in the distribution).

The biggest challenge here is that the configuration is solved before hand, so all the parameters are expanded. By the time the sdist command runs license = {file = ...} is already expanded to be logically equivalent to license = {text = <contents of the file>} and the information about the original file path is lost.

@abitrolly
Copy link
Author

If the license field is already expanded to include the file as text, then why it complains about it missing?

UserWarning: File '/tmp/build-via-sdist-a53010x_/sqlite2png-0.1.4/UNLICENSE' cannot be found
  warnings.warn(f"File {path!r} cannot be found")

@abravalheri
Copy link
Contributor

abravalheri commented Sep 1, 2022

Because the wheel is built from the sdist, not from the CWD.
If the file is not present in the sdist the configuration phase will fail to expand it.

It is a 2 stage process:

  • First the sdist is build from the CWD. In this stage, license = { file = ... } is expanded to the file contents.
  • Then the wheel is build from the sdist. In this stage, the build will warn if file does not exist.

@GergelyKalmar
Copy link

The same would be very useful for dynamic metadata too (e.g. entry-points, dependencies, optional-dependencies and so on, basically everything under https://setuptools.pypa.io/en/latest/userguide/pyproject_config.html#dynamic-metadata).

@abitrolly
Copy link
Author

@abravalheri if wheel building stage uses sdist then the line license = { file = ... } is already expanded and should not exist. Why setuptools complains about it then?

@abravalheri
Copy link
Contributor

abravalheri commented Sep 5, 2022

@abitrolly, there are a lot of things going on here. For example, I don't believe the packaging specs (at least the version that setuptools uses) prevents the backend to re-evaluate the configuration files (of course I might be wrong).

If setuptools decides to re-evaluate the configuration files, than we can see that a file is missing in the source distribution (which ideally contain all the files necessary to build the package from scratch).

So far this is the information in the setuptools documentation about how to control the files in the sdist:

For the most common use cases, setuptools will automatically find out which files are necessary for distributing the package <...>

However, <...> you might find that not all files present in your project folder are included in package distribution archive.

If you are using a Revision Control System, such as git or mercurial, <...> you can use a setuptools plugin, such as setuptools-scm <...> to automatically include all tracked files into the sdist.

Alternatively, <...> you can add a MANIFEST.in file at the root of your project, to specify any files that the default file location algorithm doesn’t catch.

This hopefully conveys the idea that if a use case deviates a bit from what setuptools adds to the sdist by default, the documentated/officially-supported approach is to use a VCS plugin or to add a MANIFEST.in file to the project.

I think we all agree that what you ask is a nice enhancement. That is why I invite anyone in the community that would like to work in this feature to send us a PR.

@abitrolly
Copy link
Author

abitrolly commented Sep 5, 2022

@abravalheri why isn't possible for python -m build to skip the sdist step and construct the package directly from pyproject.toml?

I am curious if there is a conversion graph that shows what field should be converted to what, and what role sdist plays there. Without it I doubt in my abilities to understand how it all works just by reading the code.

@GergelyKalmar
Copy link

This hopefully conveys the idea that if a use case deviates a bit from what setuptools adds to the sdist by default, the documentated/officially-supported approach is to use a VCS plugin or to add a MANIFEST.in file to the project.

I have to disagree with this point because it directly contradicts the previous paragraph that you quoted: "For the most common use cases, setuptools will automatically find out which files are necessary for distributing the package". Arguably, the license file as referred in a pyproject.toml is a standard use case in the strongest sense of that word, given that it is even explicitly defined in PEP 621 (see https://peps.python.org/pep-0621/#license). While the other setuptools-specific dynamic metadata is not "standard" in a PEP 621-sense, I think any reasonable person would expect setuptools to include files referenced in its own dynamic extensions in a pyproject.toml in the distribution. The way I see it setuptools support for PEP 621 is pretty incomplete without this. The fact that setuptools itself raises a warning about these missing files during a standard build with python -m build is just further proof of this.

@abravalheri
Copy link
Contributor

why isn't possible for python -m build to skip the sdist step and construct the package directly from pyproject.toml?

Hi @abitrolly, you can try to skip the sdist step by running python -m build --wheel. I believe that this will construct the package based on the directory tree directly.

I am curious if there is a conversion graph that shows what field should be converted to what, and what role sdist plays there. Without it I doubt in my abilities to understand how it all works just by reading the code.

I am not sure if I understand the question, but the field conversion is described in PEP 621. Unfortunately there is no graphical representation available in the community yet.

@abravalheri
Copy link
Contributor

abravalheri commented Sep 5, 2022

I have to disagree with this point because it directly contradicts the previous paragraph that you quoted: "For the most common use cases, setuptools will automatically find out which files are necessary for distributing the package".

Hi @GergelyKalmar, we can debate about a lot of aspects here...
For example I can present an argument that by far the most common use case is for developers to have a license file named LICENSE.<something> and that setuptools already covers that scenario. However, I don't know how useful this kind of discussion can be: in the end of the day all boils down to interpretation or opinion which are highly personal...

I think that everyone here aggress that it what has been proposed is a nice enhancement to setuptools. So the approach of marking this ticket as a feature request and adding the help-wanted label to it is a nice way of getting things started.

The way I see it setuptools support for PEP 621 is pretty incomplete without this. The fact that setuptools itself raises a warning about these missing files during a standard build with python -m build is just further proof of this.

If I am not wrong, PEP 621 does not dictate that files mentioned in pyproject.toml should be automatically included in the sdist, therefore we cannot say support is incomplete (this behaviour also predates PEP 621, as indicated in #2821).

@avalonv
Copy link

avalonv commented Nov 17, 2022

This hopefully conveys the idea that if a use case deviates a bit from what setuptools adds to the sdist by default, the documentated/officially-supported approach is to use a VCS plugin or to add a MANIFEST.in file to the project.

I have to disagree with this point because it directly contradicts the previous paragraph that you quoted: "For the most common use cases, setuptools will automatically find out which files are necessary for distributing the package". Arguably, the license file as referred in a pyproject.toml is a standard use case in the strongest sense of that word, given that it is even explicitly defined in PEP 621 (see https://peps.python.org/pep-0621/#license). While the other setuptools-specific dynamic metadata is not "standard" in a PEP 621-sense, I think any reasonable person would expect setuptools to include files referenced in its own dynamic extensions in a pyproject.toml in the distribution. The way I see it setuptools support for PEP 621 is pretty incomplete without this. The fact that setuptools itself raises a warning about these missing files during a standard build with python -m build is just further proof of this.

This also effects dynamic dependencies with requirements.txt, which is rather confusing for end-users (although that is at least described in the documentation). I think perhaps the warnings about missing files in sdist could be bolder, as currently they're quite easy to miss among the sea of output.

image

@fofoni
Copy link

fofoni commented Dec 27, 2022

why isn't possible for python -m build to skip the sdist step and construct the package directly from pyproject.toml?

Hi @abitrolly, you can try to skip the sdist step by running python -m build --wheel. I believe that this will construct the package based on the directory tree directly.

For completeness: I just tested, and apparently build --wheel doesn't change the sdist-then-wheel recipe. It just deletes the sdist afterwards.

@abitrolly
Copy link
Author

I don't like the complexity of layering wheel building logic on top of legacy sdist logic, but I am glad that the problem is fixed.

@abravalheri
Copy link
Contributor

I don't like the complexity of layering wheel building logic on top of legacy sdist logic

I don't think sdist is considered a legacy format. It is a perfect valid, non-platform specific, distribution format.

But I am afraid this decision has a higher scope than setuptools (as it's been debated in PEP's). Please feel free to open a new thread in the Python Discourse.

@abitrolly
Copy link
Author

@abravalheri logic != format.

@layday
Copy link
Member

layday commented Jan 21, 2023

Can you stop speaking in riddles - what logic is it you are referring to?

@abitrolly
Copy link
Author

@layday the logic.

  1. Given pyproject.toml
  2. Map fields and files in the pyproject.toml description
  3. To the contents of package.whl archive.

Instead of that, the logic goes through sdist and eggs, which is turn touches setup.py and other stuff that I call legacy, because it is not directly related to .toml + files => .whl map. You don't need to know about setup.py or eggs to do the mapping, but you have to, if you want to understand the code the underlying code. This is what I call "building logic on top of legacy logic".

HTH.

@abravalheri
Copy link
Contributor

@abitrolly setup.py is a valid configuration file (non-deprecated) and the workflow that goes through the sdist is described in PEP 517.

To ensure that wheels from different sources are built the same way, frontends may call build_sdist first, and then call build_wheel in the unpacked sdist.

The .egg-info directory is a directory that stores metadata that is used by setuptools to create wheels and in certain cases is important (think about setuptools-scm and the recording of the list of files that is tracked in the VCS).

I believe that in the long run, we would like to replace the .egg-info directory by another mechanism (please feel free to submit a PR!), but I think the decision of choosing which files to add to the sdist is orthogonal to the specific implementation.

@abitrolly
Copy link
Author

@abravalheri see, instead of discussing technical details of converting pyproject.toml, we are debating over how important and valuable setup.py os, that sdist even has a PEP 517 written behind it, and that .egg-info is important too, because "certain cases" which are not discussed are not covered by .whl and its metadata.

This is the legacy for me. I am not interested in that stuff anymore, and I don't have time to understand .egg-info and its "certain cases" to submit a PR. Just saying it out loud if anybody who reads this thread really thinks that I should or can do this in a good faith.

@GergelyKalmar
Copy link

I'm not sure I see any issues anymore, with this fix you can now use a pyproject.toml file and not worry about anything else when creating a package. I think it is useful to use the sdist as a basis for the wheels, because this way you go through the same process during building as a user would when installing from an sdist, which can reveal potential issues that can be fixed before uploading a broken source distribution package.

I agree that scattering the project directory with "random" folders like build or the .egg-info stuff is annoying, the output should definitely be better structured. I also agree though that it has nothing to do with this specific issue.

@layday
Copy link
Member

layday commented Jan 21, 2023

setup.py and the .egg-info folder are irrelevant. The "logic" needs to go through sdist simply to include the file in the sdist. That's exactly what the PR accomplishes. How do you propose that this should have been approached?

@abravalheri
Copy link
Contributor

@abitrolly I think I understood now what you mean. And it is a perfect valid opinion.

Thank you very much for the issue report. I hope the problems were fixed now.


To anyone in the community (not only you 😁):

If there is anything that you belive can be improved in setuptools (and you have the time/energy), please feel free to engage and contribute with PRs.

Setuptools is a community project and love contributors! The maintainers will try to help as much as we can.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants