Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix bug that prevents packages.find.exclude to take effect when include_package_data=True #3261

Closed
wants to merge 4 commits into from

Conversation

abravalheri
Copy link
Contributor

@abravalheri abravalheri commented Apr 11, 2022

Summary of changes

When discovering files in build_py, avoid including files and directories that correspond to a sub-package.

This way excluded sub-packages are automatically ignored.

Closes #3260.

Pull Request Checklist

prev = d
d, df = os.path.split(d)
f = os.path.join(df, f)

if d in src_dirs:
if path.endswith('.py') and f == oldf:
Copy link
Contributor Author

@abravalheri abravalheri Apr 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Side note

This f == oldf condition is a bit odd.
According to the git history (f01c17e), the original intention of this code was to:

Allow .py files found by the include_package_data option to be automatically included.

To me that sounds a little bit as a misuse. My opinion is that modules should not be affected by include_package_data...

Unfortunately, I cannot find the original repository (I guess it was bitbucket) to check if it was associated with a PR or feature request to understand the context.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of the issues and PRs were migrated from Bitbucket, but that commit comes from when Setuptools was hosted on Sourceforge and Subversion, where issues and merges weren't tracked as easily, so the commit history is the best we have.

part.isidentifier()
for part in potential_module.relative_to(directory).with_suffix("").parts
)
for potential_module in Path(directory).glob("**/*.py")
Copy link
Contributor Author

@abravalheri abravalheri Apr 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here I thought about checking against importlib.machinery.all_suffixes, but I decided to go only with .py because I don't know what to do with .so files...

(Maybe the .so is not generated by setuptools, but instead compiled out-of-band and included by the user on purpose?)

@abravalheri abravalheri marked this pull request as ready for review April 11, 2022 19:51
@abravalheri abravalheri requested a review from jaraco April 11, 2022 19:51
@abravalheri
Copy link
Contributor Author

Hi @jaraco, would it be possible for you to have a look in this PR?

There is some very old implementation involved in the faulty behaviour so I am not exactly sure if the provided patch is the best way to go (but is the only way I can imagine 😅)

abravalheri added a commit to abravalheri/setuptools that referenced this pull request Apr 11, 2022
Fix bug that prevents `packages.find.exclude` to take effect when `include_package_data=True`
@jaraco
Copy link
Member

jaraco commented Apr 15, 2022

I'll plan to look today.

@jaraco
Copy link
Member

jaraco commented Apr 15, 2022

Today is not going as planned. Maybe tomorrow.

@abravalheri
Copy link
Contributor Author

No problems Jason, whenever you have the time. It is not something urgent, I was just not confident enough with the change to merge it directly. Thank you very much.

@jaraco
Copy link
Member

jaraco commented Apr 16, 2022

Thanks for this proposal. Your patch made it easy to see the intention here.

In the referenced issue, I wrote a response about the problem generally, where I expressed my reservations.

In spite of those reservations, I'd be open to merging this change and rolling it out to see what impact it has. Even if not optimal, perhaps this proposed behavior is preferable to the prior behavior.

Either way, I'd like us at some point to explore a redesign of the mechanisms for determining the 'build' manifest.

Copy link
Contributor

@akx akx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tiny typo :)

setuptools/tests/test_build_py.py Outdated Show resolved Hide resolved
@abravalheri
Copy link
Contributor Author

Thank you very much @jaraco.

There is an alternative implementation: we can consider all nested folders as nested packages, and avoid traversing them in the first place. In terms of implementation it would mean removing this loop:

while d and d != prev and d not in src_dirs:
prev = d
d, df = os.path.split(d)
f = os.path.join(df, f)

I think this behaviour would is more intuitive and reflects better what I expect from include_package_data. But I think it is also "more disruptive".

Either way, I'd like us at some point to explore a redesign of the mechanisms for determining the 'build' manifest.

I am more than open to work on that. Do you have something in mind?

Co-authored-by: Aarni Koskela <akx@iki.fi>
@jaraco
Copy link
Member

jaraco commented Apr 20, 2022

I don’t have anything specific in mind at this time. I’ll defer to your judgment. I’m okay with disruptive as long as we manage the disruption.

@abravalheri
Copy link
Contributor Author

abravalheri commented Apr 25, 2022

Thank you very much @jaraco.

I changed the PEP 660 implementation proposal to not depend on this PR.

I think the right thing going forward is to treat any folder that is importable to Python as a package and therefore subject to the expected effects. packages.find.exclude (at least until we come up with a different mechanism as discussed in #3260). However in order to manage the disruption we might want to issue a deprecation warning for users relying on this behavior.

I plan to implement the following strategy in the following days:

  1. Create another PR that will issue a warning when a project relies on include_package_data to include any folder or Python script that is importable by Python
  2. Change this PR accordingly
  3. Postpone this PR for a while so people can react to the deprecation warning, or until we create a better mechanism (currently being discussed in [BUG] options.packages.find.exclude not taking effect when include_package_data = True #3260).

@abravalheri
Copy link
Contributor Author

abravalheri commented May 6, 2022

  1. Create another PR that will issue a warning when a project relies on include_package_data to include any folder or Python script that is importable by Python

I have created #3308 to start with a deprecation warning. After some time (during which the users have the chance to do the necessary changes), we can revisit the topic and fix the bug.

@jaraco what is the release policy for setuptools in terms of deprecations? Is there a target in terms of recommended time interval (or number of major versions) between deprecating a behaviour/mechanism and its removal from setuptools?

@abravalheri abravalheri closed this May 6, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] options.packages.find.exclude not taking effect when include_package_data = True
3 participants