using setuptools_scm forces all SCM files to be packaged into sdist #190

Closed
gaborbernat opened this issue Nov 27, 2017 · 38 comments

@gaborbernat

any way we can disable this?

@RonnyPfannschmidt
Contributor

Right now, no. It's suggested to use a MANIFEST.in to add excludes; documentation is needed.

@gaborbernat
Author

@RonnyPfannschmidt I've tried, but it fails. In what form should that be added?

@RonnyPfannschmidt
Contributor

@gaborbernat
Author

This also means that we ignore find_packages, package_data, etc. entirely, as far as I can see 🤔

@RonnyPfannschmidt
Contributor

@gaborbernat not sure what you mean by that

@gaborbernat
Author

Once setuptools_scm is enabled, we package everything, because all SCM-tracked files are added to the defaults (via https://github.com/pypa/setuptools/blob/5ecd7575c9c09d4ec2d8f993c5fb405388c3f3c1/setuptools/command/egg_info.py#L563). So this basically means that whatever was specified in find_packages or package_data is ignored, no?
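
For context, setuptools discovers SCM file finders through the `setuptools.file_finders` entry point group, which is how a plugin such as setuptools_scm ends up contributing files to the sdist. A minimal sketch for checking which finders are registered in the current environment follows; the helper name `registered_file_finders` is made up for illustration and is not part of setuptools or setuptools_scm.

```python
from importlib.metadata import entry_points


def registered_file_finders():
    """Return sorted 'name = module:attr' strings for installed file finder plugins.

    `registered_file_finders` is a hypothetical helper, not a setuptools API.
    """
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: entry points can be filtered by group.
        group = eps.select(group="setuptools.file_finders")
    else:
        # Python 3.8/3.9: entry_points() returns a plain dict keyed by group.
        group = eps.get("setuptools.file_finders", [])
    return sorted(f"{ep.name} = {ep.value}" for ep in group)
```

If setuptools_scm is installed, printing the result of `registered_file_finders()` should show an entry pointing into `setuptools_scm`; an empty list would suggest that no file finder plugin is active.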

@gaborbernat
Author

Running

/usr/bin/python /home/bernat/git/borg/setup.py sdist --formats=zip --dist-dir /home/bernat/git/borg/.tox/dist | grep pyx

still packages the pyx files, even though https://github.com/borgbackup/borg/blob/master/setup.py#L817 asks for no pyx, h, or c files:

warning: src/borg/crypto/low_level.pyx:742:22: local variable 'olen' referenced before assignment
warning: src/borg/crypto/low_level.pyx:745:22: local variable 'olen' referenced before assignment
warning: src/borg/crypto/low_level.pyx:767:22: local variable 'olen' referenced before assignment
warning: src/borg/crypto/low_level.pyx:773:22: local variable 'olen' referenced before assignment
copying src/borg/chunker.pyx -> borgbackup-1.2.0.dev296+g1bf8d43e/src/borg
copying src/borg/compress.pyx -> borgbackup-1.2.0.dev296+g1bf8d43e/src/borg
copying src/borg/hashindex.pyx -> borgbackup-1.2.0.dev296+g1bf8d43e/src/borg
copying src/borg/item.pyx -> borgbackup-1.2.0.dev296+g1bf8d43e/src/borg
copying src/borg/algorithms/checksums.pyx -> borgbackup-1.2.0.dev296+g1bf8d43e/src/borg/algorithms
copying src/borg/crypto/low_level.pyx -> borgbackup-1.2.0.dev296+g1bf8d43e/src/borg/crypto
copying src/borg/platform/darwin.pyx -> borgbackup-1.2.0.dev296+g1bf8d43e/src/borg/platform
copying src/borg/platform/freebsd.pyx -> borgbackup-1.2.0.dev296+g1bf8d43e/src/borg/platform
copying src/borg/platform/linux.pyx -> borgbackup-1.2.0.dev296+g1bf8d43e/src/borg/platform
copying src/borg/platform/posix.pyx -> borgbackup-1.2.0.dev296+g1bf8d43e/src/borg/platform
adding 'borgbackup-1.2.0.dev296+g1bf8d43e/src/borg/chunker.pyx'
adding 'borgbackup-1.2.0.dev296+g1bf8d43e/src/borg/item.pyx'
adding 'borgbackup-1.2.0.dev296+g1bf8d43e/src/borg/hashindex.pyx'
adding 'borgbackup-1.2.0.dev296+g1bf8d43e/src/borg/compress.pyx'
adding 'borgbackup-1.2.0.dev296+g1bf8d43e/src/borg/platform/freebsd.pyx'
adding 'borgbackup-1.2.0.dev296+g1bf8d43e/src/borg/platform/darwin.pyx'
adding 'borgbackup-1.2.0.dev296+g1bf8d43e/src/borg/platform/posix.pyx'
adding 'borgbackup-1.2.0.dev296+g1bf8d43e/src/borg/platform/linux.pyx'
adding 'borgbackup-1.2.0.dev296+g1bf8d43e/src/borg/algorithms/checksums.pyx'
adding 'borgbackup-1.2.0.dev296+g1bf8d43e/src/borg/crypto/low_level.pyx'

See how we just packaged the pyx files there.

@RonnyPfannschmidt
Contributor

gah, does it also add them to a wheel? or does it just pollute the sdist?

@gaborbernat
Author

just the sdist 👍

gaborbernat changed the title from "using setuptools_scm forces all SCM files to be packaged" to "using setuptools_scm forces all SCM files to be packaged into sdist" on Nov 27, 2017
@axnsan12

Is there any update on this? This functionality is really not stated anywhere and is very undesirable for me. My manifest.in already lists everything that should be in the package and I would really love to use setuptools_scm just for the version detection.

@axnsan12

axnsan12 commented Dec 12, 2017

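For what it's worth, I hacked around this by doing:

try:
    import setuptools_scm.integration
    # Replace the file finder with a no-op so no SCM-tracked files are added to the sdist.
    setuptools_scm.integration.find_files = lambda _: []
except ImportError:
    # setuptools_scm is not installed, so there is nothing to patch.
    pass

in my setup.py.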

@jaraco
Member

jaraco commented Dec 12, 2017

If it weren't for the bootstrapping challenges, I'd suggest that setuptools_scm should be split into two packages such that a project could selectively include the behavior they desire (and only that behavior). Unfortunately, the ecosystem (setuptools) assumes that if you've checked files into your source code repo that you intend to copy them into your sdist.

I think the issue here shouldn't be addressed by setuptools_scm, but should be addressed by setuptools (as the issue would apply to any plug-in supplying file finder capability).

@axnsan12

axnsan12 commented Dec 12, 2017

Would it not be possible to create a dummy package, say setuptools_scm_no_finder, which, if installed, would cause find_files to be a no-op? i.e., in integration.py

try:
    import setuptools_scm_no_finder  # hypothetical marker package

    def find_files(path):
        # Marker package is installed: disable the file finder.
        return []
except ImportError:
    def find_files(path):
        # current implementation
        ...

Then you could list that module as an extras_require of this package, and one would do, say, pip install setuptools_scm[version_only] (or setup_requires=['setuptools_scm[version_only]']) to activate only the versioning component.

Edit: after further thought, scratch that. I just realised that doing this would inadvertently affect setup for all packages; oh well...

@axnsan12

Another idea: choose a special filename which, if encountered, would cause find_files to return an empty list? something like .no_scm_sdist?

Or try to read configuration settings from some standard files like tox.ini and setup.cfg?
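
The marker-file idea could be sketched roughly as below. This is only an illustration of the proposal, not anything setuptools_scm implements: `find_files_with_opt_out` and the `finder` argument are hypothetical names, and `.no_scm_sdist` is the marker name floated above. The configuration-file variant is not covered.

```python
import os

MARKER = ".no_scm_sdist"  # hypothetical marker file name from the suggestion above


def find_files_with_opt_out(path, finder):
    """Call finder(path) unless the opt-out marker exists in that directory.

    `finder` stands in for whatever file finder would normally run;
    this wrapper is an assumption, not an existing setuptools_scm API.
    """
    if os.path.exists(os.path.join(path or ".", MARKER)):
        # Project opted out: contribute no files to the sdist.
        return []
    return list(finder(path))
```

With a wrapper like this, a project would opt out by committing an empty `.no_scm_sdist` file at its root, and projects without the marker would see no change.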

@gaborbernat
Author

gaborbernat commented Dec 12, 2017

I looked into this in detail, being annoyed with it, and I think it's fine the way it works now.

The sdist is your source distribution, and the only undesirable files are CI project files (for those, setuptools could have an exclude list evaluated after or against find_files). Remember that the manifest or package_data is still relevant, as that governs what exactly gets installed into site-packages from the sdist. If you build wheels, e.g. for PyPI, you'll see that sources which are not part of the manifest, package_data, or the packages are not packaged.

TL;DR: files added only by this find_files will not be installed on the user's machine. They are only the starting material for the build. Wheels, for example, do not contain them, which is proof of this.
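
One way to check this claim on a given project is to compare the contents of a built sdist with those of the matching wheel. The sketch below uses only the standard library; the function names and the `src_prefix` parameter (for projects with a `src/` layout, like the borg example above) are assumptions made for illustration.

```python
import tarfile
import zipfile


def sdist_members(path):
    """Return file paths inside an sdist, minus the leading 'name-version/' directory."""
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as zf:
            raw = [n for n in zf.namelist() if not n.endswith("/")]
    else:
        with tarfile.open(path) as tf:  # compression is detected automatically
            raw = [m.name for m in tf.getmembers() if m.isfile()]
    names = set()
    for name in raw:
        _, _, rest = name.partition("/")
        if rest:
            names.add(rest)
    return names


def wheel_members(path):
    """Return file paths inside a wheel (a wheel is a zip archive)."""
    with zipfile.ZipFile(path) as zf:
        return {n for n in zf.namelist() if not n.endswith("/")}


def only_in_sdist(sdist_path, wheel_path, src_prefix=""):
    """List sdist files that have no counterpart in the wheel."""
    wheel = wheel_members(wheel_path)
    missing = set()
    for name in sdist_members(sdist_path):
        # Strip the source-layout prefix (e.g. 'src/') before comparing.
        if src_prefix and name.startswith(src_prefix):
            installed = name[len(src_prefix):]
        else:
            installed = name
        if installed not in wheel:
            missing.add(name)
    return sorted(missing)
```

Files such as `setup.py` or `PKG-INFO` will always show up in the result, since they exist only in the sdist; the interesting entries are SCM-tracked files that were never meant to ship.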

@RonnyPfannschmidt
Contributor

I'm happy to accept a PR that adds a take_out_file_finder option for setup.py, which in turn would apply the monkeypatch.

When setup is called, the file finding isn't directly triggered, but at that point we can apply the evil monkeypatch in a reasonable manner.

@jaraco opinions?

@jaraco
Member

jaraco commented Dec 13, 2017

Gaborbernat has already indicated that the existing behavior is acceptable... so the remaining opinion is that of axnsan12:

> My manifest.in already lists everything that should be in the package and I would really love to use setuptools_scm just for the version detection.

My recommendation is to eliminate the manifest.in and rely on the file finder. I've yet to see a compelling use case where SCM-based file finders aren't a suitable approach, if not exactly what a project needs. What is the reason that doesn't work for you, @axnsan12?

My opinion is that if it's necessary to support disabling file finders, that should be either implemented in Setuptools or it should be a hack added by the individual projects. Adding it at setuptools_scm feels like the wrong place as it leaves other file finder plugins without that support.

@RonnyPfannschmidt
Contributor

@jaraco thanks for elaborating, I will close this one as wontfix then.

@webknjaz
Member

webknjaz commented Apr 10, 2019

@jaraco I have a use case where we have blobs in the repo (historically, don't ask) and I'd like to strip them out of the sdist. That seems to be impossible to do via exclude_package_data or find_packages(exclude=).

So I'm forced to resort to MANIFEST.in in this case. It would be nice, though, if this was supported natively.

P.S. Another similar integration claims that they support exclude_package_data... https://pypi.org/project/setuptools-git/#usage

webknjaz added a commit to webknjaz/molecule that referenced this issue Apr 10, 2019
Using ``MANIFEST.in`` because of setuptools limitation.
Ref:
pypa/setuptools-scm#190 (comment)
webknjaz added a commit to ansible/molecule that referenced this issue Apr 10, 2019
Using ``MANIFEST.in`` because of setuptools limitation.
Ref:
pypa/setuptools-scm#190 (comment)
@ulope

ulope commented May 14, 2019

@jaraco, @gaborbernat Could you elaborate on your thinking why this is desirable behaviour?

In the current project (where by the way it took me close to an hour to figure out that setuptools_scm was the culprit) we have a lot of stuff under source control that is not at all intended to be shipped in a distribution package (e.g. debugging tools, shell scripts, documentation, etc.).

It also disregards the developer's intention expressed with find_packages(), which I find highly surprising.

/edit: Also, the argument that "it doesn't matter since only things listed in MANIFEST.in will get installed" doesn't hold water IMO, since the size of the sdist package is also a concern. For the project I'm talking about above, the size of the archive increases by almost 5x (~500 kB vs. ~2.5 MB) between including only the required package files and including all VCS files.
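
To see where that extra size comes from, the members of a tar-based sdist can be summed per top-level entry. This is a minimal sketch using only the standard library; `size_by_top_level` is a name invented for this example, and zip-format sdists are not handled.

```python
import tarfile
from collections import Counter


def size_by_top_level(sdist_path):
    """Sum uncompressed file sizes per top-level entry inside the sdist root.

    Returns (entry, bytes) pairs, largest first.
    """
    totals = Counter()
    with tarfile.open(sdist_path) as tf:
        for member in tf.getmembers():
            if not member.isfile():
                continue
            parts = member.name.split("/")
            # parts[0] is the 'name-version' root directory of the sdist.
            key = parts[1] if len(parts) > 1 else parts[0]
            totals[key] += member.size
    return totals.most_common()
```

Running it on the archives built with and without the file finder makes it easy to spot which directories (docs, tools, scripts) account for the difference.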

ulope added a commit to ulope/raiden that referenced this issue May 14, 2019
Setuptools_scm forcibly includes all files under version control into the sdist package, ignoring `MANIFEST.in` and `find_packages`.
This fixes this by replacing the `find_files` function with a dummy one (`see setup.py::EggInfo.__init__()`).

See pypa/setuptools-scm#190
@salotz-sitx

Regardless of all the buck passing about how it's "not our fault", you have to realize how disruptive a behavior this is for this package to implement. It's essentially a packaging virus that unwitting consumers have to deal with through preemptive hacks in our build scripts!

It's great that you have a workflow that you like (including everything in source control), but A) not everyone can do this even if they wanted to, and B) I never actually opted in to using this project. I don't care that a project I am consuming uses it, but that shouldn't force me to account for it.

Please deal with this mess, because until setuptools gets fixed it is this project's problem.

@RonnyPfannschmidt
Contributor

I'm happy to merge a reasonable contribution; I don't have the bandwidth to do this myself.

DimitriPapadopoulos added a commit to DimitriPapadopoulos/codespell that referenced this issue Nov 17, 2022
Source distributions such as sdist must include test directories.
Note that all files matching the pattern `test/test*.py` (or perhaps
`tests/test*.py` with an `s`?) are included implicitly:
	https://packaging.python.org/guides/using-manifest-in/#how-files-are-included-in-an-sdist

Binary distributions such as wheel must exclude test directories.
This seems to be done implicitly for top-level `tests/` directories
only. I do not know how to explicitly exclude files from binary
distributions. This setuptools issue might be of interest:
	pypa/setuptools-scm#190

I change `codespell_lib/tests/` to a top-level `tests/` directory as
a workaround. Not that I like it, but I currently lack an alternative.

Fixes (partially) codespell-project#2592.
pwithnall added a commit to endlessm/kolibri-explore-plugin that referenced this issue Jun 23, 2023
Using `setuptools_scm` allows us to reliably ensure that all files in
git are in the sdist, which avoids problems caused by occasionally
forgetting to update `MANIFEST.in` when adding a new directory.

`MANIFEST.in` still needs to be kept (in a reduced form) because there
are a few files needed by webpack and by kolibri-installer-android which
are only created at dist time, and exist outside setuptools.

References:
 - https://github.com/pypa/setuptools_scm#readme
 - https://pip.pypa.io/en/stable/reference/build-system/pyproject-toml/
 - pypa/setuptools-scm#190
 - https://packaging.python.org/en/latest/guides/using-manifest-in/

Signed-off-by: Philip Withnall <pwithnall@endlessos.org>

#647
pwithnall added a commit to endlessm/kolibri-explore-plugin that referenced this issue Jun 27, 2023
pwithnall added a commit to endlessm/kolibri-explore-plugin that referenced this issue Jun 29, 2023
dbnicholson pushed a commit to endlessm/kolibri-explore-plugin that referenced this issue Jul 7, 2023
@kevalmorabia97

Any updates on how to fix this?

@jaraco
Member

jaraco commented Sep 5, 2023

> Any updates on how to fix this?

In the Setuptools project, investigate and understand how the file finders functionality works. Develop an understanding of the nuances and differences between files discovered for source distributions and files used for installation. Then come up with a design for Setuptools that allows users that are using a file finder plugin to exclude found files from their sdist.
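
As a rough illustration of what such a design might look like (not an existing Setuptools feature), the sketch below wraps an arbitrary file finder so that paths matching a list of glob patterns are dropped. The names `exclude_from_finder`, `finder`, and `patterns` are hypothetical, and `fnmatch` semantics (where `*` also matches `/`) are an assumption about how patterns would be interpreted.

```python
from fnmatch import fnmatch


def exclude_from_finder(finder, patterns):
    """Wrap a file finder so that paths matching any glob pattern are dropped.

    Hypothetical design sketch; Setuptools does not provide this hook.
    """
    def filtered(path):
        return [
            name
            for name in finder(path)
            # Normalise separators so patterns can always be written with '/'.
            if not any(fnmatch(name.replace("\\", "/"), pat) for pat in patterns)
        ]
    return filtered
```

For example, wrapping a finder with `patterns=["*.pyx", "tools/*"]` would keep `src/pkg/__init__.py` while dropping `src/pkg/fast.pyx` and `tools/debug.sh`.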

yuvipanda added a commit to yuvipanda/jupyterhub-fancy-profiles that referenced this issue Nov 14, 2023
- Stop including the built JS file in the package, instead it
  is built during sdist time (and hence included in the
  wheel). Stolen / inspired from JupyterHub itself.
- setuptools_scm includes *all* checked-in files by default in
  the sdist (pypa/setuptools-scm#190).
  This cleans it out to trim down the size of our sdist
- I think our previous wheel files didn't actually have the
  python package correctly. It is set up correctly now.
- Generated js files are put under a dist/, so we can have
  non-generated static files under static/ in the future. dist/
  is added to .gitignore
aryarm added a commit to gymrek-lab/TRTools that referenced this issue Nov 24, 2023
The pyproject.toml file is now the single location for the version to be declared. It will be read by __init__.py automatically.
The old system used setuptools-scm but apparently that has the unintended side-effect of adding all files tracked by version control to the sdist! yikes
Refer to pypa/setuptools-scm#190 for more details