Overhaul of venv_metadata_inspector.py, now using modern python libraries #588

Conversation
    "packaging",
    "importlib-metadata",
    "setuptools",
Why do we need packaging and importlib-metadata as shared libs?
This is the major change to venv_metadata_inspector.py. Instead of using setuptools' pkg_resources to inspect each venv, it now uses packaging and importlib[.-]metadata.

From packaging.requirements it uses Requirement, and from packaging.utils it uses canonicalize_name(). From importlib[.-]metadata it uses distribution(), distributions(), and the type Distribution.

I couldn't figure out a way to get along without both of those dependencies if we transition away from pkg_resources. If anyone can think of a way, I'd be happy to pursue it! Adding more libraries to the shared libs does increase the chance that we will conflict with requirements in the venvs.
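As a rough illustration of how these pieces fit together (a sketch, not pipx's actual code; the requirement string is made up):

```python
from packaging.requirements import Requirement
from packaging.utils import canonicalize_name

# Parse a dependency specifier as it might appear in package metadata
req = Requirement('Importlib-Metadata>=1.0; python_version < "3.8"')

# canonicalize_name() normalizes spelling so names compare reliably
name = canonicalize_name(req.name)
print(name)  # importlib-metadata

# Environment markers can now be evaluated instead of ignored
needed = req.marker is None or req.marker.evaluate()
```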
I think we can avoid importlib-metadata in 3.8+ environments by adding a marker here (plus try-except imports in venv_metadata_inspector)?
I was guarding against the user possibly having a mix of 3.8+ and <3.8 venvs, in which case the shared libs would need importlib-metadata even if pipx and possibly other venvs were running on 3.8+.
Although now that I think about it: if the shared libs are installed under Python 3.8+ but used with a venv that is <3.8, does that imply that the <3.8 venv gets to use importlib.metadata?

That doesn't seem to work, unfortunately. I just tried it.
And there is already a try-except sequence in venv_metadata_inspector.py to first attempt to use importlib.metadata and then fall back on importlib-metadata.
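That guard is the usual backport pattern, something like:

```python
try:
    # Python 3.8+ ships importlib.metadata in the standard library
    import importlib.metadata as importlib_metadata
except ImportError:
    # On older interpreters, fall back to the importlib-metadata backport
    import importlib_metadata  # type: ignore

# Either way, the rest of the module can use the same API under one name
dists = importlib_metadata.distributions()
```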
@cs01, any thoughts? I realize it's around Christmas and you probably have better things to occupy yourself with. Just thought I'd try to at least give you an opportunity to chime in before I start releasing things. 🙂 Happy Holidays everyone, btw 🎄
except FileNotFoundError:
    pass

# not sure what is found here
Is this still the case? I wonder if we can remove this section.
I wondered too... but I was also really worried about corner cases, because the universe of strange python packages is vast. I wonder if this duplicates the above for loop, for path in dist.files. Maybe one of the other python packaging experts can give some guidance?
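For context, the dist.files scan could look something like this (a hypothetical helper, not the exact pipx code):

```python
import importlib.metadata
from pathlib import Path


def apps_in_bin(dist: importlib.metadata.Distribution, bin_dir: Path) -> list:
    """Collect names of files a distribution installed into the venv's bin/."""
    apps = []
    for path in dist.files or []:  # dist.files can be None if RECORD is missing
        # locate_file() resolves the RECORD entry relative to site-packages
        resolved = Path(dist.locate_file(path)).resolve()
        if resolved.parent == bin_dir.resolve():
            apps.append(resolved.name)
    return apps
```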
@uranusjr, @gaborbernat, do either of you know if installed-files.txt is only used for Python 2.*? It's hard to find documentation on it, but everything I can find seems to refer to Python 2.*. If installed-files.txt is only used for Python 2.*, then we can definitely remove the code that searches it.
I don’t think it has anything to do with Python 2/3. AFAIK installed-files.txt is a setuptools/distutils artifact from the old days before .dist-info was specified. Newer setuptools versions should never write this.
I found references to ancient pip generating this file (there is a mention in the changelog for pip 0.3: https://pip.pypa.io/en/stable/news/#id556).
Regardless of the library of origin, it seems very old. Since we are updating our shared libraries, this code that reads installed-files.txt seems like it can be removed.

Let me know if there are any objections; otherwise I will remove it.
I didn't read the code closely; can someone explain what it actually does? While new(-ish) packaging tools would never generate the file, it may still be present on systems that were set up with old Python packaging tools and never updated since.
It just goes through every line in installed-files.txt and checks whether any of them points into the venv's bin/ subdirectory. That's basically exactly the same thing the for loop above it does by searching the iterable Distribution.files from importlib[.-]metadata.

If this is part of ancient pip, I'm wondering if the file was created at install time on the client machine, rather than shipped as part of the package distribution?
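The legacy fallback amounts to something like this (a hypothetical sketch, assuming the paths in the file are relative to the egg-info directory containing it):

```python
from pathlib import Path


def apps_from_installed_files(installed_files_txt: Path, bin_dir: Path) -> list:
    """Scan a legacy installed-files.txt for entries landing in the venv's bin/."""
    apps = []
    for line in installed_files_txt.read_text().splitlines():
        # Entries are paths relative to the egg-info directory
        candidate = (installed_files_txt.parent / line.strip()).resolve()
        if candidate.parent == bin_dir.resolve():
            apps.append(candidate.name)
    return apps
```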
One of the few mentions I can find on the internet, from a setuptools Issue discussing a possible solution (June 2015):
pypa/setuptools#371 (comment)
Just have setuptools add installed-files.txt to the egg-info directory. pip has been doing this for a long time before .dist-info was on the radar.
Wow, this is great, an early Christmas present :). This will handle all the annoying corner cases, and be a nice foundation for the future of pipx.

The packages installed into the shared libs were originally introduced in https://github.com/pipxproject/pipx/pull/168/files. As you know, the shared libs are essentially "merged" at runtime with pipx's package-specific venvs via the pth file. This saves time by avoiding installing a bunch of the same things over and over for each venv. It forms a common environment in which the app venvs can run pip and the metadata inspector. pip is definitely required to install packages. pip often needs to build things from source; the source build system is often setuptools, so that is required in case the package is an sdist. To build into a wheel, it needs the wheel package.

This LGTM!
self.required_packages = [
    "pip",
    "wheel",
    "packaging",
    "importlib-metadata",
    "setuptools",
I just realized there are no version constraints on these package specifiers. I have been getting bitten by some semver changes on my other projects, so it has been on my mind. However, thinking about these a little, it's likely we always want to be on the latest version of all of them, since they all work in tandem and we'll want the latest metadata changes available.
Adding more libraries to the shared libs does increase the chance that we will conflict with requirements in the venvs.
I wonder what would happen here. The shared lib packages are added via the pth file, and aren't part of the requirements of the package being installed. I assume pip (and its new dependency resolver) only applies to the requirements/package being installed (and its dependencies). The shared lib pip invocation is separate from app package installation and manipulation, so maybe there won't be any issues here?
I wonder what would happen here. The shared lib packages are added via the pth file, and aren't part of the requirements of the package being installed. I assume pip (and its new dependency resolver) only applies to the requirements/package being installed (and its dependencies). The shared lib pip invocation is separate from app package installation and manipulation, so maybe there won't be any issues here?
It depends on how and when the shared libs are injected. pip determines whether a package is installed by inspecting directories on sys.path, so if the shared libs are injected when pip install is run, pip will consider those packages as “installed”. If you’re going to add version constraints to those libraries, they could end up conflicting with applications users want to install.

I think maybe the best way to implement venv_metadata_inspector.py is to run it as a zipapp instead. pipx can create a zipapp venv_metadata_inspector.pyz that contains the script and all its dependencies in a cache directory, and run python venv_metadata_inspector.pyz instead. The same technique can actually be used for pip install, to avoid potential edge cases where an application depends on a certain pip or setuptools version not compatible with pipx (not likely to happen, though).
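For reference, the standard library's zipapp module can build such an archive. A toy sketch (the staging directory and script contents here are made up, and a real build would also vendor the script's dependencies into the staging directory):

```python
import subprocess
import sys
import tempfile
import zipapp
from pathlib import Path

# Stage a directory containing the script as __main__.py
staging = Path(tempfile.mkdtemp()) / "inspector_src"
staging.mkdir()
(staging / "__main__.py").write_text("print('inspecting venv...')")

# Bundle the staging directory into a single runnable .pyz archive
pyz = staging.parent / "venv_metadata_inspector.pyz"
zipapp.create_archive(staging, target=pyz)

# pipx could then run the archive with any interpreter it chooses
result = subprocess.run([sys.executable, str(pyz)], capture_output=True, text=True)
print(result.stdout.strip())  # inspecting venv...
```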
The zipapp idea is very interesting. Wouldn't pipx still need to be running inside the target venv anyway for importlib[.-]metadata.Distribution to get information? And wouldn't that end up with the same version conflicts?

I don't see any way to point importlib[.-]metadata at a venv other than the current one, in which case I'm worried even a zipapp would encounter the same version conflicts. Is there a way around these problems?
Oooh, once I actually wrote out "any way to point importlib[.-]metadata at a venv other than the current one" I realized that is the key, and I found this pointer, which may show the way to do that:
https://docs.python.org/3/library/importlib.metadata.html#extending-the-search-algorithm

I'm wondering if we could hook into that to facilitate running the code in one venv and inspecting another?
If we could do THAT, then we could make venv inspection just another part of pipx and not a separate runnable file. That would really be cool.
Basically I'm only changing sys.path in order to change what importlib[.-]metadata looks at when we get info about the distributions. Otherwise, the code is still part of the pipx package and not in a separate venv.
Ah OK, so you do all the inspection outside of the target virtual environment? That should work.

I believe you can even avoid sys.path manipulation altogether by passing a path context, e.g. importlib_metadata.distributions(path=path_to_venv_site_packages) (not sure if this would work; @jaraco is the expert on this).
I believe you can even avoid sys.path manipulation altogether by passing a path context, e.g. importlib_metadata.distributions(path=path_to_venv_site_packages) (not sure if this would work, @jaraco is the expert on this).

That would be even better. Let me see what I can do.
The path parameter is meant to be a sequence like sys.path, so something like importlib.metadata.distributions(path=[path_to_venv_site_packages]) may be what you're after.
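A quick illustration of that suggestion (the venv path shown is a placeholder):

```python
import importlib.metadata

# path is a sys.path-like sequence; point it at the target venv's site-packages
site_packages = "/home/user/.local/pipx/venvs/example/lib/python3.9/site-packages"
for dist in importlib.metadata.distributions(path=[site_packages]):
    print(dist.metadata["Name"], dist.version)
```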
Thanks, @jaraco. I think I've got that working in another branch.
Closing in favor of using #593 instead.
Summary of changes

Closes #587. Closes #534. Closes #528.

This revamps venv_metadata_inspector.py to use modern python libraries like packaging and importlib.metadata/importlib_metadata instead of pkg_resources for venv metadata inspection.

This fixes some things we didn't know we had problems with (see the Issues above). One thing I noticed about the old venv_metadata_inspector.py that isn't captured in any Issue is that it currently ignores environment markers completely, and just assumes every entry in an install_requires is equally valid. The new code correctly evaluates both markers and the presence or absence of package extras (which may be present in dependencies).

I started with #344 and made changes. The biggest changes are to venv_metadata_inspector.py and shared_libs.py.

I have two questions:

1. Should importlib_metadata be in pipx's install_requires if pipx doesn't actually use it in pipx code, but rather installs it in the shared venv? RESOLVED: removed it from install_requires.
2. The shared venv installs pip, wheel, packaging, importlib-metadata, and setuptools. Are all of these necessary? RESOLVED: keeping all of them.

After this PR merges I hope to release pipx 0.16.0.0!
Test plan
The code was extensively tested with our new "slow" tests and 62 packages:
https://github.com/itsayellow/pipx/actions/runs/429239351
There were no errors about pipx's venv metadata inspection missing apps, or missing apps of dependencies (this is specifically checked in the slow tests). The only problems were packages that would not install at all on the CI systems, due to missing wheels and/or missing header files on those systems.