Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added data_files specified to filelist during default collection #750

wants to merge 1 commit into from


Copy link

@Purg Purg commented Aug 19, 2016

Added missing data_files content inclusion when building a source distribution. Raised in issue #521.

Copy link

jaraco commented Aug 20, 2016

Thanks for the contrib.

I'm a little reluctant to add this for a few reasons.

  • It's copy/paste of code in distutils.
  • It's inline with other behavior, so it adds complexity and further complicates the distinction between the distutils and setuptools implementations.
  • There are no tests capturing this requirement.
  • This code seems to be explicitly excluded, probably because data files was deprecated in setuptools due to its incompatibility with encapsulated packages.
  • This patch possibly runs in conflict with some pending pull requests to optimize the file list generation.
  • It's encouraging use of, which demands that packagers dual-maintain the manifest of files (in the SCM repository and in another file in that repository).

That said, let's work toward a better solution.

I'd feel a lot better if distutils were to extract this functionality into a (perhaps private) method, and then Setuptools could call this private method (if it exists) or call a copy of it in the codebase for forward compatibility. As this requires committing to the Python codebase, perhaps that's too high a bar.

At the very least, I'd like for the code to clearly indicate why the setuptools sdist overrides the distutils sdist, and what it's intended to do... and to have a test to capture this new expectation.

What do you think?

Copy link

Purg commented Aug 22, 2016

Disclaimer: I'm not entirely synced on setuptools' end goals.

Context: I've been trying to build a file for installation/distribution setup for a project I work on (SMQTK). I've been trying to follow instructions related to the Python Packaging User Guide documentation, with this issue specifically relating to data files part.

Longer description of the "issue" that I've had (as well as @bb-migration at least): When pip installing my package (via the file on my dev branch), all the data_files specified content gets installed correctly, but if I do a python sdist, they are not included. distutils documents that they should be. setuptools is not distutils of course, but I was under the assumption that setuptools should be a drop-in, non-breaking, replacement.

Responses to your @jaraco's bullets:

[1] As a base yes, but updated to use six.stringtypes. As a point of reference, other parts of that function and subclass are also basically copy-paste of setuptools.commands.sdist module.

[2] I'm not entirely sure what you mean by this. I guess this is a sibling of your bullet [4].

[3] True, I need to add some.

[4] I suppose this is all moot then if the use of data_files was deprecated in setuptools. This is however pretty confusing as performing a pip install . on a target module correctly installs content specified in the data_files parameter and pip uses setuptools when installed I think (I'm not actually sure what's going on there)? Anyway, I would expect that if a pip install does it, the python sdist should make a source distro that does the exact same thing. Maybe I'm assuming the wrong thing(s) here, like pip using setuptools when both installed.

[5] Possibly, I don't know what every pull request is...

[6] I think this patch is technically the opposite, really. By programatically filling the data_files list, no needs to exist (except for special cases). NOT supporting the data_files argument requires the creation and maintenance of a file for non-package_data files (e.g any etc, share, etc. content).

As for why setuptools strictly overrides distutils's sdist.add_defaults, I don't know. It looks like it rearranged how pure-module files are added to the filelist (from build_py.get_source_files and build_py.data_files). The code above and below, with the exception of the self.distribution.data_files, is just about exactly the same as distutiles's sdist.add_defaults. I guess seeing that most of the function made me think that maybe the data_files was just mistakenly left out when it was originally created.

I'm going to give the setuptools docs on ReadTheDocs a good read through again.

Copy link

jaraco commented Oct 14, 2016

In this series of commits, I implement what I describe - I've refactored the functionality in distutils to provide hooks into aspects of the functionality, allowing the setuptools subclass to override behavior surgically. Now the codebase is in a healthy place, and when support for Python 3.6 is dropped, that compat module can be removed as well. More importantly, if the project chooses to restore the data files discovery, it should be a simple matter of removing the suppression of the _add_defaults_data_files method.

Would you consider submitting a new PR with a test demonstrating the expectation?

@jaraco jaraco closed this Oct 14, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
None yet
None yet

Successfully merging this pull request may close these issues.

None yet

2 participants