Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stop relying on __file__ #11648

Open
1 task done
thejcannon opened this issue Dec 8, 2022 · 8 comments
Open
1 task done

Stop relying on __file__ #11648

thejcannon opened this issue Dec 8, 2022 · 8 comments
Labels
S: awaiting response Waiting for a response/more information state: awaiting PR Feature discussed, PR is needed type: feature request Request for a new feature

Comments

@thejcannon
Copy link
Contributor

Description

As indygreg/PyOxidizer#69 explains, __file__ is optional, and in environments running pip where the interpreter doesn't define it: 💥

An easy way to see this in action is pyoxy run-python -- -m pip --version
(releases are here: https://gregoryszorc.com/docs/pyoxy/main/)

Expected behavior

No response

pip version

(main)

Python version

PyOxy 0.2.0

OS

Ubuntu 20

How to Reproduce

  1. Use an interpreter that doesn't define __file__. PyOxy is one of them: https://gregoryszorc.com/docs/pyoxy/main/
  2. Run pip with that interpreter

Output

Traceback (most recent call last):
  File "runpy", line 194, in _run_module_as_main
  File "runpy", line 87, in _run_code
  File "pip.__main__", line 29, in <module>
  File "pip._internal.cli.main", line 9, in <module>
  File "pip._internal.cli.autocompletion", line 10, in <module>
  File "pip._internal.cli.main_parser", line 8, in <module>
  File "pip._internal.cli.cmdoptions", line 22, in <module>
  File "pip._vendor", line 23, in <module>
NameError: name '__file__' is not defined

Code of Conduct

@thejcannon thejcannon added S: needs triage Issues/PRs that need to be triaged type: bug A confirmed bug or unintended behavior labels Dec 8, 2022
@pfmoore
Copy link
Member

pfmoore commented Dec 8, 2022

As a broad principle, this is entirely reasonable, and it would be great if we could do this. However, it's not particularly easy to achieve that goal.

  1. We would need to work out ways of achieving the same results as we currently do without using __file__. This may or may not be easy - someone needs to review all of our uses of __file__ and come up with a proposal for each.
  2. We don't have any direct control over the dependencies we vendor. It would be a further limitation on our ability to use 3rd party modules if we added a restriction that they must not depend on __file__, and conversely it would add an extra maintenance burden if we had to maintain our own patches to work around any such use. There definitely are uses of __file__ in our current vendored code. We'd need a plan to handle this.

Being able to compile pip using something like PyOxidizer would be a nice feature. A standalone exe would be a potentially superior alternative to the recently added zipapp distribution. So if someone such as yourself was interested in working on this issue (and by implication, any other issues with compiling pip with PyOxidizer) that would be great. It's not high on the maintainers' priority list, though, so it's unlikely that this will happen without someone from the community leading the effort.

@notatallshaw
Copy link
Member

notatallshaw commented Dec 8, 2022

I just installed pyoxy and created a test Python script and it had a __file__ attrbute:

$ ./pyoxy run-python
Python 3.9.13 (main, May 28 2022, 19:24:00)
[Clang 14.0.3 ] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import test
>>> test.__file__
'/home/damian/pyoxytest/pyoxy-0.2.0-x86_64-unknown-linux-gnu-cpython3.9/test.py'

It seems to me that somehow you've installed Pip incorrectly? Does pyoxy support using Pip?

And what other option do you expect Pip to use? I'm not a Pip develop but it seems to me that Pip is deeply coupled to there being a filesystem and it needs to know the file path of code in lots of places?

I did a quick search for __file__ and it seems there's both internal uses and vendored uses: https://gist.github.com/notatallshaw/7363cbc94a979dce1e9bc67085e243d8

CPython itself only expect __file__ to be missing on built-in modules: https://github.com/python/cpython/blob/v3.11.1/Lib/inspect.py#L896

@pradyunsg pradyunsg added state: awaiting PR Feature discussed, PR is needed type: feature request Request for a new feature and removed type: bug A confirmed bug or unintended behavior S: needs triage Issues/PRs that need to be triaged labels Dec 9, 2022
@pradyunsg
Copy link
Member

I'm ambivalent on this and, at least, Paul is in favour of this. I've labelled this to say that we're open to accepting PRs toward this. I do have a couple of concerns / caveats around this:

  1. It's contingent on all our vendored dependencies being on-board. I am not comfortable with the idea of pushing this as a constraint on pip's dependencies as "because pip wants to do it", so whomever talks to them would need to advocate for removing reliance __file__ on its own merits.
  2. We're not going to add additional patches to our vendored dependencies, to adapt for this.

@pfmoore
Copy link
Member

pfmoore commented Dec 9, 2022

To be more explicit, I'm only in favour if someone can solve all the issues in doing this. And I'm not at all convinced that those issues are actually solveable...

In particular, I agree regarding dependencies. And I'd go further still - we need a solution that means that we don't have to go through this whole process again before adding any new dependencies. So sorting out our current dependencies is only part of the issue. Also, I agree that asking dependencies to stop using __file__ "for pip's sake" is unacceptable.

TBH, I'm not convinced by the arguments in the linked PyOxidizer issue. The docs say that __file__ may not be present "for certain types of module". Pure Python files are a "type of module" that does have __file__, and I think it's legitimate to use for that reason. Of course, if the Python code isn't hosted on a filesystem, expecting to navigate the filesystem via __file__ won't work, and the value of __file__ may only be of use for informational purposes. But we have libraries like importlib.resources to handle those needs in an import mechanism independent way. I'd expect PyOxidizer to support such mechanisms if they expect people to stop using __file__. And I would support educating people to use importlib.resources rather than accessing the filesystem directly, just as a general "good practice". So if PyOxidizer works with mechanisms like that, the problem will solve itself in due course.

(Someone still needs to go through existing uses of __file__, in pip and elsewhere, and identify how to use importlib.resources instead - or if it's not possible, identify the missing functionality and get it added to importlib.resources. This isn't a "magic wand" that will fix everyone's issues, it's just a way to work on them within the context of existing approaches to the problem).

@thejcannon
Copy link
Contributor Author

As a comment of sheer curiosity, I wonder how the popularity of zipapps play against the popularity of using __file__. It almost seems from the ubiquity of __file__ in any library that wants to load resources, that the community has implictly voted on ignoring supporting them altogether. Libraries relying on them, like pex, have given up and resorted to unpacking the files onto the filesystem (which also supports C extension modules as well).

(Interestingly the zippapp "caveats" don't mention __file__)

@pfmoore
Copy link
Member

pfmoore commented Dec 9, 2022

🤷 It's the same issue. But it's worth noting that pip is available as a zipapp, and our remaining uses of __file__ don't preclude that. Maybe that means there's something PyOxidizer could do to provide an "equivalent to zipapps" level of support for __file__?

@pradyunsg
Copy link
Member

Based on https://pyoxidizer.readthedocs.io/en/stable/oxidized_importer_behavior_and_compliance.html#no-file and https://pyoxidizer.readthedocs.io/en/stable/oxidized_importer_resource_files.html#support-for-file which seems to take a more measured description of the problem and aren't a blanket "don't do __file__" but more pragmatically make that a responsibility of whomever packages up the relevant piece.

That said, I don't want to let this issue linger forever, so if we don't end up getting any sort of contributions on this in the next six months, I'll close this out. As far as I can tell, there's not much that we can do unilaterally on this and we know that pip is directly usable as a zipapp.

@pradyunsg pradyunsg added the S: awaiting response Waiting for a response/more information label Dec 16, 2022
@ichard26
Copy link
Member

That said, I don't want to let this issue linger forever, so if we don't end up getting any sort of contributions on this in the next six months, I'll close this out. As far as I can tell, there's not much that we can do unilaterally on this and we know that pip is directly usable as a zipapp.

poke @pradyunsg - you may have forgotten about this :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S: awaiting response Waiting for a response/more information state: awaiting PR Feature discussed, PR is needed type: feature request Request for a new feature
Projects
None yet
Development

No branches or pull requests

5 participants