using setuptools_scm forces all SCM files to be packaged into sdist #190

Closed
gaborbernat opened this issue Nov 27, 2017 · 23 comments

@gaborbernat
commented Nov 27, 2017

Is there any way we can disable this?

@RonnyPfannschmidt

Contributor

commented Nov 27, 2017

Right now, no. It's suggested to use a MANIFEST.in to add excludes; documentation is needed.

@gaborbernat

Author

commented Nov 27, 2017

@RonnyPfannschmidt I've tried that, but it fails. In what form should the excludes be added?

@RonnyPfannschmidt

This comment has been minimized.

@gaborbernat

Author

commented Nov 27, 2017

This also means that we ignore find_packages, package_data, etc. entirely, as far as I can see 🤔

@RonnyPfannschmidt

Contributor

commented Nov 27, 2017

@gaborbernat not sure what you mean by that

@gaborbernat

Author

commented Nov 27, 2017

Once setuptools_scm is enabled, we package everything by adding all files to the defaults (via https://github.com/pypa/setuptools/blob/5ecd7575c9c09d4ec2d8f993c5fb405388c3f3c1/setuptools/command/egg_info.py#L563), so this basically has the effect that we ignore whatever was specified in find_packages and package_data, no?

@gaborbernat

Author

commented Nov 27, 2017

Running

/usr/bin/python /home/bernat/git/borg/setup.py sdist --formats=zip --dist-dir /home/bernat/git/borg/.tox/dist | grep pyx

gives the output below, even though https://github.com/borgbackup/borg/blob/master/setup.py#L817 says to exclude pyx, h, and c files:

warning: src/borg/crypto/low_level.pyx:742:22: local variable 'olen' referenced before assignment
warning: src/borg/crypto/low_level.pyx:745:22: local variable 'olen' referenced before assignment
warning: src/borg/crypto/low_level.pyx:767:22: local variable 'olen' referenced before assignment
warning: src/borg/crypto/low_level.pyx:773:22: local variable 'olen' referenced before assignment
copying src/borg/chunker.pyx -> borgbackup-1.2.0.dev296+g1bf8d43e/src/borg
copying src/borg/compress.pyx -> borgbackup-1.2.0.dev296+g1bf8d43e/src/borg
copying src/borg/hashindex.pyx -> borgbackup-1.2.0.dev296+g1bf8d43e/src/borg
copying src/borg/item.pyx -> borgbackup-1.2.0.dev296+g1bf8d43e/src/borg
copying src/borg/algorithms/checksums.pyx -> borgbackup-1.2.0.dev296+g1bf8d43e/src/borg/algorithms
copying src/borg/crypto/low_level.pyx -> borgbackup-1.2.0.dev296+g1bf8d43e/src/borg/crypto
copying src/borg/platform/darwin.pyx -> borgbackup-1.2.0.dev296+g1bf8d43e/src/borg/platform
copying src/borg/platform/freebsd.pyx -> borgbackup-1.2.0.dev296+g1bf8d43e/src/borg/platform
copying src/borg/platform/linux.pyx -> borgbackup-1.2.0.dev296+g1bf8d43e/src/borg/platform
copying src/borg/platform/posix.pyx -> borgbackup-1.2.0.dev296+g1bf8d43e/src/borg/platform
adding 'borgbackup-1.2.0.dev296+g1bf8d43e/src/borg/chunker.pyx'
adding 'borgbackup-1.2.0.dev296+g1bf8d43e/src/borg/item.pyx'
adding 'borgbackup-1.2.0.dev296+g1bf8d43e/src/borg/hashindex.pyx'
adding 'borgbackup-1.2.0.dev296+g1bf8d43e/src/borg/compress.pyx'
adding 'borgbackup-1.2.0.dev296+g1bf8d43e/src/borg/platform/freebsd.pyx'
adding 'borgbackup-1.2.0.dev296+g1bf8d43e/src/borg/platform/darwin.pyx'
adding 'borgbackup-1.2.0.dev296+g1bf8d43e/src/borg/platform/posix.pyx'
adding 'borgbackup-1.2.0.dev296+g1bf8d43e/src/borg/platform/linux.pyx'
adding 'borgbackup-1.2.0.dev296+g1bf8d43e/src/borg/algorithms/checksums.pyx'
adding 'borgbackup-1.2.0.dev296+g1bf8d43e/src/borg/crypto/low_level.pyx'

See how we just packaged the pyx files there.
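The same check can be made against the finished archive rather than the build log. A minimal sketch, assuming a zip-format sdist like the one produced by `--formats=zip` above; the helper name `members_with_suffix` is illustrative and not part of setuptools or setuptools_scm:

```python
import zipfile


def members_with_suffix(archive_path, suffixes):
    """Return the sorted archive member names ending in any of the suffixes."""
    with zipfile.ZipFile(archive_path) as archive:
        return sorted(
            name for name in archive.namelist()
            if name.endswith(tuple(suffixes))
        )
```

Calling `members_with_suffix(path_to_sdist, (".pyx", ".h", ".c"))` on the generated zip would list every such file that made it into the sdist; an empty list would mean the excludes took effect.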

@RonnyPfannschmidt

Contributor

commented Nov 27, 2017

gah, does it also add them to a wheel? or does it just pollute the sdist?

@gaborbernat

Author

commented Nov 27, 2017

just the sdist 👍

@gaborbernat changed the title from "using setuptools_scm forces all SCM files to be packaged" to "using setuptools_scm forces all SCM files to be packaged into sdist" on Nov 27, 2017

@axnsan12

commented Dec 12, 2017

Is there any update on this? This behavior is not documented anywhere and is very undesirable for me. My MANIFEST.in already lists everything that should be in the package, and I would really love to use setuptools_scm just for the version detection.

@axnsan12

commented Dec 12, 2017

For what it's worth, I hacked around this by doing:

try:
    import setuptools_scm.integration
    setuptools_scm.integration.find_files = lambda _: []
except ImportError:
    pass

in my setup.py

@jaraco

Member

commented Dec 12, 2017

If it weren't for the bootstrapping challenges, I'd suggest that setuptools_scm should be split into two packages such that a project could selectively include the behavior they desire (and only that behavior). Unfortunately, the ecosystem (setuptools) assumes that if you've checked files into your source code repo that you intend to copy them into your sdist.

I think the issue here shouldn't be addressed by setuptools_scm, but should be addressed by setuptools (as the issue would apply to any plug-in supplying file finder capability).

@axnsan12

commented Dec 12, 2017

Would it not be possible to create a dummy package, say setuptools_scm_no_finder, which, if installed, would cause find_files to be a no-op? i.e., in integration.py

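try:
    import setuptools_scm_no_finder  # noqa: F401 (its presence alone is the opt-out signal)

    def find_files(path):
        # Opt-out package is installed: report no files to setuptools.
        return []
except ImportError:
    def find_files(path):
        # current implementation
        ...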

Then you could list that package under extras_require of this package, and one would do, say, pip install setuptools_scm[version_only] (or setup_requires=['setuptools_scm[version_only]']) to activate only the versioning component.

Edit: after further thought, scratch that. I just realised that doing this would inadvertently affect setup for all packages; oh well...

@axnsan12

commented Dec 12, 2017

Another idea: choose a special filename which, if encountered, would cause find_files to return an empty list? something like .no_scm_sdist?

Or try to read configuration settings from some standard files like tox.ini and setup.cfg?
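The marker-file idea could be sketched roughly as follows. This is only an illustration: `find_files_unless_opted_out` and the `real_finder` parameter are made-up names standing in for the plugin's existing finder, and `.no_scm_sdist` is the filename floated above, not an existing convention.

```python
import os


def find_files_unless_opted_out(path, real_finder, marker=".no_scm_sdist"):
    """Call real_finder(path) unless the marker file exists in that directory.

    `real_finder` is a placeholder for whatever finder the plugin already
    provides; it receives the same path argument unchanged.
    """
    root = path or "."  # an empty path is treated as the current directory
    if os.path.exists(os.path.join(root, marker)):
        return []
    return list(real_finder(path))
```

Because the check is per project directory, it would avoid the problem of the dummy-package idea, which would affect every package built in the same environment.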

@gaborbernat

Author

commented Dec 12, 2017

I looked into this in detail, being annoyed with it and I think it's fine the way it works now.

An sdist is your source file distribution, and the only undesirable files are CI project files (for that, setuptools could have an exclude list evaluated after or against find_files). Remember that MANIFEST.in or package_data is still relevant, as that governs what exactly then gets installed into site-packages from the sdist. If you build wheels, e.g. for PyPI, you'll see that source files that are not part of MANIFEST.in, package_data or the packages are not packaged.

TL;DR: files added only by this find_files will not be installed on the user's machine. They are only the starting material for the build. Wheel distributions, for example, do not contain them, which is proof of this.
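The exclude list mentioned above does not exist in setuptools as far as this thread shows; a minimal sketch of what "evaluated against find_files" could mean, with the function name and the example patterns being assumptions for illustration:

```python
import fnmatch


def filter_found_files(found, exclude_patterns):
    """Drop paths matching any shell-style pattern from a finder's output."""
    return [
        path for path in found
        if not any(
            fnmatch.fnmatchcase(path, pattern) for pattern in exclude_patterns
        )
    ]
```

For example, patterns such as `.travis.yml` or `ci/*` would remove CI files while leaving everything else the finder reported untouched. Note that `*` in fnmatch patterns also matches path separators.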

@RonnyPfannschmidt

Contributor

commented Dec 13, 2017

I'm happy to accept a PR that adds a take_out_file_finder option for setup.py, which in turn would apply the monkeypatch.

When setup is called, the file finding isn't directly triggered, but we can at that point apply the evil monkeypatch in a reasonable manner.

@jaraco opinions?
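Such an option might look roughly like the sketch below. It is hypothetical: no `take_out_file_finder` keyword exists, and the handler only mirrors the `(dist, attr, value)` call shape that setuptools uses for handlers registered under the `distutils.setup_keywords` entry point group. The real target would be `setuptools_scm.integration`; a stand-in object is used here so the sketch runs without setuptools_scm installed.

```python
import types


def disable_file_finder(integration_module):
    """Replace the module's find_files with a finder that reports no files."""
    integration_module.find_files = lambda path="": []


def take_out_file_finder(dist, attr, value, integration_module=None):
    """Hypothetical handler for a `take_out_file_finder=True` setup() keyword.

    `integration_module` is an extra parameter added for this sketch only;
    a real handler would import setuptools_scm.integration itself.
    """
    if value and integration_module is not None:
        disable_file_finder(integration_module)


# Stand-in for setuptools_scm.integration, for demonstration only.
fake_integration = types.SimpleNamespace(find_files=lambda path="": ["a.pyx"])
take_out_file_finder(
    dist=None,
    attr="take_out_file_finder",
    value=True,
    integration_module=fake_integration,
)
```

After the call, `fake_integration.find_files` reports no files, which is the same effect as the manual monkeypatch posted earlier in the thread.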

@jaraco

Member

commented Dec 13, 2017

Gaborbernat has already indicated that the existing behavior is acceptable... so the remaining opinion is that of axnsan12:

> My manifest.in already lists everything that should be in the package and I would really love to use setuptools_scm just for the version detection.

My recommendation is to eliminate the MANIFEST.in and rely on the file finder. I've yet to see a compelling use case where SCM-based file finders aren't a suitable approach, if not exactly what a project needs. What is the reason that doesn't work for you, @axnsan12?

My opinion is that if it's necessary to support disabling file finders, that should be either implemented in Setuptools or it should be a hack added by the individual projects. Adding it at setuptools_scm feels like the wrong place as it leaves other file finder plugins without that support.

@RonnyPfannschmidt

Contributor

commented Dec 13, 2017

@jaraco thanks for elaborating, I will close this one as wontfix then

@webknjaz

commented Apr 10, 2019

@jaraco I have a use case where we have blobs in the repo (historically, don't ask) and I'd like to strip them out of the sdist. It seems to be impossible to do via exclude_package_data or find_packages(exclude=).

So I'm forced to resort to MANIFEST.in in this case. It would be nice, though, if this was supported natively.

P.S. Another similar integration claims that they support exclude_package_data... https://pypi.org/project/setuptools-git/#usage

webknjaz added a commit to webknjaz/molecule that referenced this issue Apr 10, 2019

📦 Exclude blobs from sdist
Using ``MANIFEST.in`` because of setuptools limitation.
Ref:
pypa/setuptools_scm#190 (comment)

webknjaz added a commit to ansible/molecule that referenced this issue Apr 10, 2019

📦 Exclude blobs from sdist
Using ``MANIFEST.in`` because of setuptools limitation.
Ref:
pypa/setuptools_scm#190 (comment)

@ulope

commented May 14, 2019

@jaraco, @gaborbernat Could you elaborate on your thinking why this is desirable behaviour?

In the current project (where by the way it took me close to an hour to figure out that setuptools_scm was the culprit) we have a lot of stuff under source control that is not at all intended to be shipped in a distribution package (e.g. debugging tools, shell scripts, documentation, etc.).

Also, this leads to disregard of the developer intention expressed with find_packages(), which I find highly surprising.

/edit: Also, the argument that "it doesn't matter since only things listed in MANIFEST.in will get installed" doesn't hold water IMO, since the size of the sdist package is also a concern. For the project I'm talking about above, the size of the archive increases by almost 5x (~500 kB vs. ~2.5 MB) between including only the required package files vs. all VCS files.

ulope added a commit to ulope/raiden that referenced this issue May 14, 2019

Fix sdist package
Setuptools_scm forcibly includes all files under version control into the sdist package, ignoring `MANIFEST.in` and `find_packages`.
This fixes this by replacing the `find_files` function with a dummy one (`see setup.py::EggInfo.__init__()`).

See pypa/setuptools_scm#190

@jaraco

Member

commented May 14, 2019

> why is this desirable behavior?

@ulope - In my experience, the source repo tends to be the source of truth (no pun intended) for what is the source of that project. I think it's reasonable to expect when downloading a source distribution to get the debugging tools, shell scripts, documentation, etc. In my opinion the sdist is meant to be more than just a copy of the Python functionality, but is meant to be a distributable copy of the source code. I would expect someone to be able to download the sdist, extract it, and develop on the project much like they would if cloning the repo. These sdists are used by downstream packagers (Debian, RedHat) to run tests and other validation on them.

> Also that this leads to disregard of the developer intention expressed with find_packages() which I find highly surprising.

I would expect find_packages() to be relevant mainly at the point of building/installing a package, after something using MANIFEST.in or setuptools_scm has discovered the sources.

I can see that your expectation was violated here, but I think your expectation varies from the dominant expectation (that all source files should be distributed with the source).

I see in your patch that you simply disabled file finding. It seems there are at least a couple other users of setuptools_scm who desire this behavior. Perhaps you would consider contributing a patch to setuptools to allow disabling of file finding, along the lines of what axnsan12 has suggested above (but to apply to any file finder and not just setuptools_scm).

ulope added a commit to raiden-network/raiden that referenced this issue May 15, 2019

Fix sdist package
Setuptools_scm forcibly includes all files under version control into the sdist package, ignoring `MANIFEST.in` and `find_packages`.
This fixes this by replacing the `find_files` function with a dummy one (`see setup.py::EggInfo.__init__()`).

See pypa/setuptools_scm#190

@webknjaz

commented May 15, 2019

> These sdists are used by downstream packagers (Debian, RedHat) to run tests and other validation on them.

Not really, I know a few folks doing RPM packaging for Fedora. They tend to use the original Git repos for that because there's no standard convention on what maintainers include in sdists. Also, sdists are not verifiable.

@ulope

commented May 21, 2019

@jaraco Thanks for the expanded explanation and sorry for the somewhat abrupt tone in my previous message.

I'll try to find some time to work on the setuptools option you suggested.

nlsdfnbch added a commit to nlsdfnbch/raiden that referenced this issue May 21, 2019

Fix sdist package
Setuptools_scm forcibly includes all files under version control into the sdist package, ignoring `MANIFEST.in` and `find_packages`.
This fixes this by replacing the `find_files` function with a dummy one (`see setup.py::EggInfo.__init__()`).

See pypa/setuptools_scm#190