ensurepip fails with importlib.metadata backend #11183

Closed
1 task done
stefanor opened this issue Jun 11, 2022 · 14 comments · Fixed by #11217
Labels
type: bug A confirmed bug or unintended behavior

Comments

@stefanor
Contributor

Description

In Debian, we use the pip from our pip package with ensurepip. We've noticed that with Python 3.11 and pip 22.1.1, ensurepip no longer does anything.

The relevant change is that importlib.metadata is the default backend with Python 3.11. If we force it back to the pkg_resources backend, it works.

Expected behavior

ensurepip is expected to install pip. It no longer does anything.

pip version

22.1.2+ dc7abca

Python version

3.11+ f9d0240db809fbb4443dc8f96a18e4c49af3fb7f

OS

Linux

How to Reproduce

  1. Install Python 3.11.
  2. Build / download a current (>= 22.1) pip wheel. Install it into ensurepip/_bundled/ in your Python installation.
  3. Update _PIP_VERSION in ensurepip/__init__.py to refer to the new version.
  4. python3.11 -m venv testve
  5. ls testve/lib/python3.11/site-packages/

This should not be empty, but will be.

For extra debugging:

  1. testve/bin/python -Im ensurepip --upgrade --default-pip -v

Output

Using pip 22.2.dev0 from /tmp/tmp3dhtsvpz/pip-22.2.dev0-py3-none-any.whl/pip (python 3.11)
Looking in links: /tmp/tmp3dhtsvpz
Requirement already satisfied: setuptools in ./tmp3dhtsvpz/setuptools-58.1.0-py3-none-any.whl (58.1.0)
Requirement already satisfied: pip in ./tmp3dhtsvpz/pip-22.2.dev0-py3-none-any.whl (22.2.dev0)

@stefanor added the S: needs triage and type: bug labels Jun 11, 2022
@uranusjr self-assigned this Jun 12, 2022
@uranusjr removed the S: needs triage label Jun 12, 2022
@uranusjr
Member

Huh, pkg_resources explicitly ignores .dist-info in a wheel, but importlib.metadata does not:

    if importer.archive.endswith('.whl'):
        # wheels are not supported with this finder
        # they don't have PKG-INFO metadata, and won't ever contain eggs
        return

As you can tell from the comment, the logic is pretty outdated and is arguably a bug, but pip relies on it and it “works”. I’ll work on a fix to do this properly.
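
The importlib.metadata half of this can be reproduced without pip. The following is only an illustrative sketch: the `demo` project, its file names, and the helper `make_fake_wheel` are made up for the example. It builds a throwaway wheel-like zip and asks importlib.metadata to scan it the way it would scan a sys.path entry.

```python
import importlib.metadata
import tempfile
import zipfile
from pathlib import Path


def make_fake_wheel(directory: Path) -> Path:
    """Write a minimal wheel-like zip for a hypothetical 'demo' project."""
    wheel = directory / "demo-1.0-py3-none-any.whl"
    with zipfile.ZipFile(wheel, "w") as zf:
        zf.writestr("demo/__init__.py", "")
        zf.writestr(
            "demo-1.0.dist-info/METADATA",
            "Metadata-Version: 2.1\nName: demo\nVersion: 1.0\n",
        )
    return wheel


with tempfile.TemporaryDirectory() as tmp:
    wheel = make_fake_wheel(Path(tmp))
    # Treat the wheel file itself as a search-path entry, which is what
    # happens when a .whl is placed on sys.path.
    found = [
        (dist.metadata["Name"], dist.version)
        for dist in importlib.metadata.distributions(path=[str(wheel)])
    ]

print(found)
```

The `demo` distribution is reported even though nothing was installed, which matches the "Requirement already satisfied" lines in the output above.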

@hroncok
Contributor

hroncok commented Jun 30, 2022

Should ensurepip use --force-reinstall? But that would probably break the current way of fast repeated calls to ensurepip.

@uranusjr
Member

ensurepip probably should use --force-reinstall, but pip also should ignore the dist-info found in a wheel.

@hroncok
Contributor

hroncok commented Jun 30, 2022

ensurepip probably should use --force-reinstall

I took a look at the code. It assumes that repeated calls don't reinstall. So I guess this would need to check whether pip and setuptools are installed first and, if at least one of them is not, pass --force-reinstall. Ideally, this decision would be made separately for pip and setuptools. Not sure if it's worth it, considering we assume pip should ignore the dist-info found in a wheel.
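
A rough sketch of the coarse variant of that check is below. It is not ensurepip's code: the function names and the `installed` parameter are hypothetical, and only the `--upgrade` and `--force-reinstall` flags are real pip options. Such a check would also have to run before the bundled wheels are put on sys.path, otherwise it would hit the same false positive described in this issue.

```python
import importlib.metadata


def is_installed(name: str) -> bool:
    """True if importlib.metadata can find a distribution called name."""
    try:
        importlib.metadata.version(name)
    except importlib.metadata.PackageNotFoundError:
        return False
    return True


def extra_install_flags(packages, upgrade=False, installed=is_installed):
    """Flags to append to 'pip install' for the bundled packages."""
    flags = []
    if upgrade:
        flags.append("--upgrade")
    # Force a reinstall only when something is missing, so that repeated
    # calls with everything already present stay cheap.
    if not all(installed(name) for name in packages):
        flags.append("--force-reinstall")
    return flags


# The predicate is injectable so the decision can be exercised in isolation.
print(extra_install_flags(["pip", "setuptools"], installed=lambda name: name == "pip"))
```

Deciding per package, as suggested above, would mean running one install command per package with its own flag list.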

@pradyunsg
Member

pradyunsg commented Jun 30, 2022

I'd lean toward not "fixing" this via ignoring the dist-info, and I think we should instead update ensurepip to work the way get-pip.py does.

Is there any reason to not do it this way?

@hroncok
Contributor

hroncok commented Jun 30, 2022

get-pip always uses --upgrade and --force-reinstall. ensurepip does not reinstall on every occasion and only passes --upgrade when the user invoked it as such.

@pradyunsg
Member

I didn't mean in terms of the flags passed, but rather in terms of how we're putting a copy of pip on sys.path.

I don't think we should be trying to use pip while it's packaged in a wheel -- we know that's fragile in multiple ways and something that we don't support users doing either.

@uranusjr
Member

There’s good justification to ignore .dist-info in a wheel though (not for .whl in general, but an actual wheel)—since a “distribution” in a wheel is not actually installed and might not work, it should not be considered “installed”. I think we should do it regardless.
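
One way to picture "ignore .dist-info in an actual wheel" is to drop wheel-looking entries from the search path before asking importlib.metadata for distributions. The sketch below is an assumption-laden illustration, not pip's implementation: `looks_like_wheel` and `installed_names` are hypothetical names, and the wheel test is a simple heuristic (a zip file whose name ends in `.whl`).

```python
import importlib.metadata
import os
import tempfile
import zipfile

METADATA = "Metadata-Version: 2.1\nName: {name}\nVersion: 1.0\n"


def looks_like_wheel(location: str) -> bool:
    """Heuristic: a path entry that is a zip *file* named '*.whl'."""
    return (
        location.endswith(".whl")
        and os.path.isfile(location)
        and zipfile.is_zipfile(location)
    )


def installed_names(search_path):
    """Names of distributions on search_path, skipping wheel entries."""
    entries = [p for p in search_path if not looks_like_wheel(p)]
    dists = importlib.metadata.distributions(path=entries)
    return sorted({n for d in dists if (n := d.metadata.get("Name"))})


with tempfile.TemporaryDirectory() as tmp:
    # A directory standing in for site-packages, with one installed project.
    site = os.path.join(tmp, "site")
    info = os.path.join(site, "alpha-1.0.dist-info")
    os.makedirs(info)
    with open(os.path.join(info, "METADATA"), "w", encoding="utf-8") as f:
        f.write(METADATA.format(name="alpha"))

    # A wheel for a second project, on the search path but not installed.
    wheel = os.path.join(tmp, "beta-1.0-py3-none-any.whl")
    with zipfile.ZipFile(wheel, "w") as zf:
        zf.writestr("beta-1.0.dist-info/METADATA", METADATA.format(name="beta"))

    names = installed_names([site, wheel])

print(names)
```

Only `alpha` is reported. A directory that merely happens to be named `something.whl` is still scanned, which is the "not for .whl in general" distinction made above.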

@pradyunsg
Member

Indeed, let's do both then. :)

@sbidoul
Member

sbidoul commented Jul 21, 2022

@uranusjr @pradyunsg coming here by reading the 22.2 release notes.

There’s good justification to ignore .dist-info in a wheel though (not for .whl in general, but an actual wheel)—since a “distribution” in a wheel is not actually installed and might not work, it should not be considered “installed”. I think we should do it regardless.

Could it be a problem that pip and importlib.metadata have a different view of what an installed distribution is?
Or is it an importlib.metadata bug?

@uranusjr
Member

uranusjr commented Jul 21, 2022

I think it's simply a difference in how things are viewed. importlib.metadata focuses on reading metadata (makes sense given the name), and metadata in a wheel is still valid metadata. It's not metadata of an installed distribution, a distinction importlib.metadata does not make, so it makes sense that pip needs to perform extra logic around it.

@pfmoore
Member

pfmoore commented Jul 21, 2022

I think it's something that could do with clarification somewhere, but it's not a problem as such. My view is that pip is working with installed packages, whereas importlib is working with accessible packages.

That's a subtle but important distinction, which we've never formalised so far. Is adding a directory to sys.path considered to be "installing" the packages in that directory? Is adding an import hook? Core python (and in particular importlib.metadata) just works with what's accessible to the import machinery[1] - and barely acknowledges the idea that there's a "privileged" set of locations that are considered to be "installed" (sysconfig is the key place, and site adds the idea of "site packages" where .pth files are recognised).

The important aspect of PEP 376 that never really made it into core Python was the idea that there is some sort of database of "installed" packages, and "installed" means something different than "accessible". Maybe that's because core Python genuinely doesn't need to care, and "installed" is entirely a packaging concept?

Footnotes

  1. And in practice, does extra work to ensure that the principle of "all ways of making a package importable are equal" is maintained.

@sbidoul
Member

sbidoul commented Jul 21, 2022

Interesting. So I went looking in the python docs and the importlib.metadata page actually starts with a definition of what an installed package is.

By “installed package” we generally mean a third-party package installed into Python’s site-packages directory via tools such as pip. Specifically, it means a package with either a discoverable dist-info or egg-info directory, and metadata defined by PEP 566 or its older specifications. By default, package metadata can live on the file system or in zip archives on sys.path. Through an extension mechanism, the metadata can live almost anywhere.

It's still vague in terms of what discoverable means though. Yet that could be something we'll need to chew on...

@pfmoore
Member

pfmoore commented Jul 21, 2022

I hadn't spotted that (TBH, I tend to find the importlib.metadata docs rather difficult to follow). I agree that "discoverable" is vague, but in practice I believe they add (optional) import hook methods to support this sort of "discoverability" so that it equates as near as possible to "anything importable". It's been a long time since I had a proper look at the details of the import machinery, though; I really need to get up to date again.

This starts to tie back into the idea of #4575 and #6052 around cleaning up the idea that we install into an "installation scheme". I think that it would be really useful for pip to formalise exactly what set of directories (the scheme) it's managing on any given run, and tie questions like this back to that formalisation. So we'd frame the current problem as "importlib.metadata finds any discoverable metadata, but pip needs to limit itself to metadata in the currently selected scheme(s)".
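
The "discoverable" versus "in the selected scheme" framing can be sketched with the standard library alone. This is an illustration under assumptions, not a proposal for pip's design: `scheme_dirs`, `names_under`, and `add_dist_info` are hypothetical helpers, and two temporary directories stand in for a managed scheme and for some other location that is merely reachable.

```python
import importlib.metadata
import os
import sysconfig
import tempfile


def scheme_dirs(scheme=None):
    """Return the purelib/platlib directories of a sysconfig scheme."""
    paths = sysconfig.get_paths(scheme) if scheme else sysconfig.get_paths()
    return sorted({paths["purelib"], paths["platlib"]})


def names_under(directories):
    """Names of distributions whose metadata lives under the directories."""
    dists = importlib.metadata.distributions(path=list(directories))
    return sorted({n for d in dists if (n := d.metadata.get("Name"))})


def add_dist_info(directory, name):
    """Create a bare '<name>-1.0.dist-info/METADATA' (demo helper)."""
    info = os.path.join(directory, f"{name}-1.0.dist-info")
    os.makedirs(info)
    with open(os.path.join(info, "METADATA"), "w", encoding="utf-8") as f:
        f.write(f"Metadata-Version: 2.1\nName: {name}\nVersion: 1.0\n")


with tempfile.TemporaryDirectory() as managed, \
        tempfile.TemporaryDirectory() as extra:
    add_dist_info(managed, "alpha")  # stands in for the managed scheme
    add_dist_info(extra, "beta")     # merely reachable, e.g. via sys.path
    discoverable = names_under([managed, extra])
    in_scheme = names_under([managed])

print(discoverable, in_scheme)
```

In a real interpreter the same comparison would be `names_under(sys.path)` against `names_under(scheme_dirs())`; any name that appears only in the first list is accessible without being installed into the default scheme.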

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 21, 2022