Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

using setuptools_scm forces all SCM files to be packaged into sdist #190

Closed
gaborbernat opened this issue Nov 27, 2017 · 34 comments
Closed

using setuptools_scm forces all SCM files to be packaged into sdist #190

gaborbernat opened this issue Nov 27, 2017 · 34 comments
Labels

Comments

@gaborbernat
Copy link

@gaborbernat gaborbernat commented Nov 27, 2017

any way we can disable this?

@RonnyPfannschmidt
Copy link
Contributor

@RonnyPfannschmidt RonnyPfannschmidt commented Nov 27, 2017

right now not, its suggested to use a MANIFEST.in to add excludes, documentation is needed

Loading

@gaborbernat
Copy link
Author

@gaborbernat gaborbernat commented Nov 27, 2017

@RonnyPfannschmidt I've tried but fails, in what form should that be added?

Loading

@RonnyPfannschmidt
Copy link
Contributor

@RonnyPfannschmidt RonnyPfannschmidt commented Nov 27, 2017

Loading

@gaborbernat
Copy link
Author

@gaborbernat gaborbernat commented Nov 27, 2017

this also makes that we ignore find_packages, package_data, etc absolute as far as I see 🤔

Loading

@RonnyPfannschmidt
Copy link
Contributor

@RonnyPfannschmidt RonnyPfannschmidt commented Nov 27, 2017

@gaborbernat not sure what you mean by that

Loading

@gaborbernat
Copy link
Author

@gaborbernat gaborbernat commented Nov 27, 2017

once setuptools_scm is enabled we package everything, by adding all files to the default (via https://github.com/pypa/setuptools/blob/5ecd7575c9c09d4ec2d8f993c5fb405388c3f3c1/setuptools/command/egg_info.py#L563); so this basically has the effect that we ignore whatever was specified in find_packages, package_data, not?

Loading

@gaborbernat
Copy link
Author

@gaborbernat gaborbernat commented Nov 27, 2017

running

/usr/bin/python /home/bernat/git/borg/setup.py sdist --formats=zip --dist-dir /home/bernat/git/borg/.tox/dist | grep pyx 

even though running https://github.com/borgbackup/borg/blob/master/setup.py#L817 states please no pyx, h, c files:

warning: src/borg/crypto/low_level.pyx:742:22: local variable 'olen' referenced before assignment
warning: src/borg/crypto/low_level.pyx:745:22: local variable 'olen' referenced before assignment
warning: src/borg/crypto/low_level.pyx:767:22: local variable 'olen' referenced before assignment
warning: src/borg/crypto/low_level.pyx:773:22: local variable 'olen' referenced before assignment
copying src/borg/chunker.pyx -> borgbackup-1.2.0.dev296+g1bf8d43e/src/borg
copying src/borg/compress.pyx -> borgbackup-1.2.0.dev296+g1bf8d43e/src/borg
copying src/borg/hashindex.pyx -> borgbackup-1.2.0.dev296+g1bf8d43e/src/borg
copying src/borg/item.pyx -> borgbackup-1.2.0.dev296+g1bf8d43e/src/borg
copying src/borg/algorithms/checksums.pyx -> borgbackup-1.2.0.dev296+g1bf8d43e/src/borg/algorithms
copying src/borg/crypto/low_level.pyx -> borgbackup-1.2.0.dev296+g1bf8d43e/src/borg/crypto
copying src/borg/platform/darwin.pyx -> borgbackup-1.2.0.dev296+g1bf8d43e/src/borg/platform
copying src/borg/platform/freebsd.pyx -> borgbackup-1.2.0.dev296+g1bf8d43e/src/borg/platform
copying src/borg/platform/linux.pyx -> borgbackup-1.2.0.dev296+g1bf8d43e/src/borg/platform
copying src/borg/platform/posix.pyx -> borgbackup-1.2.0.dev296+g1bf8d43e/src/borg/platform
adding 'borgbackup-1.2.0.dev296+g1bf8d43e/src/borg/chunker.pyx'
adding 'borgbackup-1.2.0.dev296+g1bf8d43e/src/borg/item.pyx'
adding 'borgbackup-1.2.0.dev296+g1bf8d43e/src/borg/hashindex.pyx'
adding 'borgbackup-1.2.0.dev296+g1bf8d43e/src/borg/compress.pyx'
adding 'borgbackup-1.2.0.dev296+g1bf8d43e/src/borg/platform/freebsd.pyx'
adding 'borgbackup-1.2.0.dev296+g1bf8d43e/src/borg/platform/darwin.pyx'
adding 'borgbackup-1.2.0.dev296+g1bf8d43e/src/borg/platform/posix.pyx'
adding 'borgbackup-1.2.0.dev296+g1bf8d43e/src/borg/platform/linux.pyx'
adding 'borgbackup-1.2.0.dev296+g1bf8d43e/src/borg/algorithms/checksums.pyx'
adding 'borgbackup-1.2.0.dev296+g1bf8d43e/src/borg/crypto/low_level.pyx'

see how we just packaged pyx files there.

Loading

@RonnyPfannschmidt
Copy link
Contributor

@RonnyPfannschmidt RonnyPfannschmidt commented Nov 27, 2017

gah, does it also add them to a wheel? or does it just pollute the sdist?

Loading

@gaborbernat
Copy link
Author

@gaborbernat gaborbernat commented Nov 27, 2017

just the sdist 👍

Loading

@gaborbernat gaborbernat changed the title using setuptools_scm forces all SCM files to be packaged using setuptools_scm forces all SCM files to be packaged into sdist Nov 27, 2017
@axnsan12
Copy link

@axnsan12 axnsan12 commented Dec 12, 2017

Is there any update on this? This functionality is really not stated anywhere and is very undesirable for me. My manifest.in already lists everything that should be in the package and I would really love to use setuptools_scm just for the version detection.

Loading

@axnsan12
Copy link

@axnsan12 axnsan12 commented Dec 12, 2017

For what it's worth, I hacked around this by doing:

try:
    import setuptools_scm.integration
    setuptools_scm.integration.find_files = lambda _: []
except ImportError:
    pass

in my setup.py

Loading

@jaraco
Copy link
Member

@jaraco jaraco commented Dec 12, 2017

If it weren't for the bootstrapping challenges, I'd suggest that setuptools_scm should be split into two packages such that a project could selectively include the behavior they desire (and only that behavior). Unfortunately, the ecosystem (setuptools) assumes that if you've checked files into your source code repo that you intend to copy them into your sdist.

I think the issue here shouldn't be addressed by setuptools_scm, but should be addressed by setuptools (as the issue would apply to any plug-in supplying file finder capability).

Loading

@axnsan12
Copy link

@axnsan12 axnsan12 commented Dec 12, 2017

Would it not be possible to create a dummy package, say setuptools_scm_no_finder, which, if installed, would cause find_files to be a no-op? i.e., in integration.py

try:
    import setuptools_scm_no_finder
    
    def find_files(_):
        return []
except ImportError:
    def find_files(...):
        # current implementation
        ...

Then you could list that module as an extra_require to this module, and one would do, say pip install setuptools_scm[version_only] (or setup_requires=['setuptools_scm[version_only]']) to activate only the versioning component.

Edit: after further thought, scratch that, I just realised doing this would inadvertedly affect setup for all packages; oh well...

Loading

@axnsan12
Copy link

@axnsan12 axnsan12 commented Dec 12, 2017

Another idea: choose a special filename which, if encountered, would cause find_files to return an empty list? something like .no_scm_sdist?

Or try to read configuration settings from some standard files like tox.ini and setup.cfg?

Loading

@gaborbernat
Copy link
Author

@gaborbernat gaborbernat commented Dec 12, 2017

I looked into this in detail, being annoyed with it and I think it's fine the way it works now.

sdist is your source files distribution, and the only non desirable files are CI project files (for that setuptools could have an exclude list evaluated after or against find_files). Remember the Manifest or package_data is still relevant, as that governs what exactly then gets installed into the site package from the sdist. If you build wheels e.g. for PyPi you'll see that the source which are not part of Manifest, package_data or the packages are not packaged.

Tldr: files added by this find_files only will not be installed on the users machine. It's only the build start out material. Wheels distribution packages e.g. do not contain these as proof of this.

Loading

@RonnyPfannschmidt
Copy link
Contributor

@RonnyPfannschmidt RonnyPfannschmidt commented Dec 13, 2017

im happy to accept a pr that adds a take_out_file_finder option for setup.py which in turn would apply the monkeypatch

when setup is called, the file finding isnt diretly triggered, but we can at that point apply the evil monkeypatch in a reasonable manner

@jaraco oppinions?

Loading

@jaraco
Copy link
Member

@jaraco jaraco commented Dec 13, 2017

Gaborbernat has already indicated that the existing behavior is acceptable... so the remaining opinion is that of axnsan12:

My manifest.in already lists everything that should be in the package and I would really love to use setuptools_scm just for the version detection.

My recommendation is to eliminate the manifest.in and rely on the file finder. I've yet to see a compelling use-case where SCM-based file finders isn't a suitable approach if not exactly what a project needs. What is the reason that doesn't work for you, @axnsan12?

My opinion is that if it's necessary to support disabling file finders, that should be either implemented in Setuptools or it should be a hack added by the individual projects. Adding it at setuptools_scm feels like the wrong place as it leaves other file finder plugins without that support.

Loading

@RonnyPfannschmidt
Copy link
Contributor

@RonnyPfannschmidt RonnyPfannschmidt commented Dec 13, 2017

@jaraco thanks for elaborating, i will close this one as wontfix then

Loading

@webknjaz
Copy link
Member

@webknjaz webknjaz commented Apr 10, 2019

@jaraco I have a use case where we have blobs in the repo (historically, don't ask) and I'd like to rip them off. It seems to be impossible to do via exclude_package_data or find_packages(exclude=).

So I'm forced to resort to MANIFEST.in in this case. It would be nice, though, if this was supported natively.

P.S. Another similar integration claims that they support exclude_package_data... https://pypi.org/project/setuptools-git/#usage

Loading

webknjaz added a commit to webknjaz/molecule that referenced this issue Apr 10, 2019
Using ``MANIFEST.in`` because of setuptools limitation.
Ref:
pypa/setuptools_scm#190 (comment)
webknjaz added a commit to ansible-community/molecule that referenced this issue Apr 10, 2019
Using ``MANIFEST.in`` because of setuptools limitation.
Ref:
pypa/setuptools_scm#190 (comment)
webknjaz added a commit to ansible-community/molecule that referenced this issue Apr 10, 2019
Using ``MANIFEST.in`` because of setuptools limitation.
Ref:
pypa/setuptools_scm#190 (comment)
@ulope
Copy link

@ulope ulope commented May 14, 2019

@jaraco, @gaborbernat Could you elaborate on your thinking why this is desirable behaviour?

In the current project (where by the way it took me close to an hour to figure out that setuptools_scm was the culprit) we have a lot of stuff under source control that is not at all intended to be shipped in a distribution package (e.g. debugging tools, shell scripts, documentation, etc.).

Also that this leads to disregard of the developer intention expressed with find_packages()which I find highly surprising.

/edit: Also the argument that "it doesn't matter since only things listed in MANIFEST.in will get installed" doesn't hold water IMO since size of the sdist package size is also a concern. For the project I'm talking about above the size of the archive increases by almost 5x (~500 kB vs. ~2.5 MB) between including only required package files vs. all vcs files.

Loading

ulope added a commit to ulope/raiden that referenced this issue May 14, 2019
Setuptools_scm forcibly includes all files under version control into the sdist package, ignoring `MANIFEST.in` and `find_packages`.
This fixes this by replacing the `find_files` function with a dummy one (`see setup.py::EggInfo.__init__()`).

See pypa/setuptools_scm#190
@jaraco
Copy link
Member

@jaraco jaraco commented May 14, 2019

why is this desirable behavior?

@ulope - In my experience, the source repo tends to be the source of truth (no pun intended) for what is the source of that project. I think it's reasonable to expect when downloading a source distribution to get the debugging tools, shell scripts, documentation, etc. In my opinion the sdist is meant to be more than just a copy of the Python functionality, but is meant to be a distributable copy of the source code. I would expect someone to be able to download the sdist, extract it, and develop on the project much like they would if cloning the repo. These sdists are used by downstream packagers (Debian, RedHat) to run tests and other validation on them.

Also that this leads to disregard of the developer intention expressed with find_packages() which I find highly surprising.

I would expect find_packages() to be relevant mainly at the point of building/installing a package, after something using MANIFEST.in or setuptools_scm has discovered the sources.

I can see that your expectation was violated here, but I think your expectation varies from the dominant expectation (that all source files should be distributed with the source).

I see in your patch that you simply disabled file finding. It seems there are at least a couple other users of setuptools_scm who desire this behavior. Perhaps you would consider contributing a patch to setuptools to allow disabling of file finding, along the lines of what axnsan12 has suggested above (but to apply to any file finder and not just setuptools_scm).

Loading

ulope added a commit to raiden-network/raiden that referenced this issue May 15, 2019
Setuptools_scm forcibly includes all files under version control into the sdist package, ignoring `MANIFEST.in` and `find_packages`.
This fixes this by replacing the `find_files` function with a dummy one (`see setup.py::EggInfo.__init__()`).

See pypa/setuptools_scm#190
@webknjaz
Copy link
Member

@webknjaz webknjaz commented May 15, 2019

These sdists are used by downstream packagers (Debian, RedHat) to run tests and other validation on them.

Not really, I know a few folks doing RPM packaging for Fedora. They tend to use original Git repos for that because there's no standard convention on what maintainers include into sdists. + sdists are not verifiable.

Loading

@ulope
Copy link

@ulope ulope commented May 21, 2019

@jaraco Thanks for the expanded explanation and sorry for the somewhat abrupt tone in my previous message.

I'll try to find some time to work on the setuptools option you suggested.

Loading

@saimn
Copy link

@saimn saimn commented Dec 3, 2020

I also lost a bit of time because of this, trying to find why find_packages('.', exclude=['*test*']) was not working while it was working before. This on a project that does not use setuptools_scm, but it turns out that just having setuptools_scm installed is enough to trigger this behavior. That's a really unexpected side effect, and a pain to debug.

Loading

@RonnyPfannschmidt
Copy link
Contributor

@RonnyPfannschmidt RonnyPfannschmidt commented Dec 3, 2020

That one should be a setuptools bug, file finders are completely disconnected

Loading

@saimn
Copy link

@saimn saimn commented Dec 3, 2020

But it happens only if setuptools_scm is installed, i.e. exactly what is discussed here, it's just that you don't need to actually use setuptools_scm to trigger this behavior.

Loading

@RonnyPfannschmidt
Copy link
Contributor

@RonnyPfannschmidt RonnyPfannschmidt commented Dec 3, 2020

Problem is that setuptools has no way to tell if a file finder plugin should be active

They also can't get Metadata of the distribution.

So unless setuptools provides a better solution, it will be what it is

Loading

@embray
Copy link

@embray embray commented Apr 23, 2021

@saimn I ran into the same problem just now. I think that since just having setuptools_scm installed at all causes the file finder entrypoint to be invoked even for packages that don't use it, this should be an opt-in feature.

Loading

@RonnyPfannschmidt
Copy link
Contributor

@RonnyPfannschmidt RonnyPfannschmidt commented Apr 23, 2021

@jaraco is there any way to have setuptools do better there? without a better api setuptools_scm has to be a bad part of the ecosystem

one way could be to require setup plugins one wants to use in setup_requires and/or considering the pyproject config

additionally if the extras could be considered, then it could even be a proper opt-in later

the idea being that it would need pyproject.toml:

[build-system]
requires = [ "setuptools_scm[file_finder]" ] # to enable the file finder

Loading

@jaraco
Copy link
Member

@jaraco jaraco commented Apr 23, 2021

Adjusting the requirements doesn’t feel like the right abstraction to me. Instead, Setuptools should provide an opt-out for file finders or maybe make it opt in. Given that this implicit opt in has been present for over a decade (and longer if you consider the SVN behavior was embedded and implicit) with few complaints, I’d lean toward opt-out, at least initially.

Regardless, it’s a Setuptools issue. Would someone file a feature request to allow file finder functionality to be configurable?

Loading

@RonnyPfannschmidt
Copy link
Contributor

@RonnyPfannschmidt RonnyPfannschmidt commented Apr 23, 2021

Will do as soon as I get back to the computer

Loading

@embray
Copy link

@embray embray commented May 3, 2021

I still think this could be a setuptools_scm feature too. Maybe a bit hacky, but just have an environment variable that disables it, for example.

Loading

@RonnyPfannschmidt
Copy link
Contributor

@RonnyPfannschmidt RonnyPfannschmidt commented May 3, 2021

@embray i opened a new issue for that part, i just can't get to it quickly

Loading

@embray
Copy link

@embray embray commented May 3, 2021

@RonnyPfannschmidt Thanks! No worries. I'm just thinking out loud.

Loading

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
8 participants