-
-
Notifications
You must be signed in to change notification settings - Fork 31.7k
Treat import a.b.c as m
as m = sys.modules['a.b.c']
#74210
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
https://mail.python.org/pipermail/python-ideas/2017-April/045405.html Hi there. I asked a question <http://stackoverflow.com/questions/41845671/import-as-in-python-3\> on Stackoverflow:
I was pointed out <http://stackoverflow.com/a/24968941/248296\> that this is a somewhat weird behavior of Python:
Why would Using The enhancement is to treat |
The background here is the change in http://bugs.python.org/issue17636 that allows IMPORT_FROM to fall back to sys.modules when written as "from a.b import c as m", while the plain LOAD_ATTR generated for "import a.b.c as m" fails. One potential resolution to the discrepancy that occurred to me in the python-ideas discussion was to add a new IMPORT_ATTR opcode specifically for use in the latter case. The main downside I can see is the usual one with adding any new opcode for a niche use case like this one: it risks slowing the eval loop down in general for a marginal improvement in language consistency. However, I don't think we should reject the possibility out of hand on that basis. |
Hm, the stackoverflow point to another stackoverflow (http://stackoverflow.com/questions/24807434/imports-in-init-py-and-import-as-statement/24968941#24968941) which is due to an import cycle. Is there also an import cycle here? Because I cannot seem repro this with just empty modules (and I'm too lazy to find out if brain is a real package). |
# mytest/mod2.py: import sys
import mytest.mod1
assert 'mytest.mod1' in sys.modules
import mytest.mod1 as mod # this fails, though the prev. lines work perfectly |
So to save others the bother to check what's in the zip file, the contents of mytest/mod1.py is as follows: import mytest.mod2 as mod
def func():
print('mod1.func called')
mod.func() There's no __init__.py (so mytest is a namespace package, PEP-420) but adding an empty __init__.py makes no difference. The problem occurs with both Python 2 and 3. The root cause is the import cycle. I played around with dis and found the following (I suspect others have already found this but the thread was hard to follow for me initially): >>> dis.dis('import a.b')
1 0 LOAD_CONST 0 (0)
2 LOAD_CONST 1 (None)
4 IMPORT_NAME 0 (a.b)
6 STORE_NAME 1 (a)
8 LOAD_CONST 1 (None)
10 RETURN_VALUE
>>> compared to >>> dis.dis('import a.b as c')
1 0 LOAD_CONST 0 (0)
2 LOAD_CONST 1 (None)
4 IMPORT_NAME 0 (a.b)
6 LOAD_ATTR 1 (b) <-- error here
8 STORE_NAME 2 (c)
10 LOAD_CONST 1 (None)
12 RETURN_VALUE
>>> What this shows is that the implementation of "import a.b" and "import a.b as c" are different. The former calls __import__('a.b', ...) which returns the module 'a' and stores that in the variable 'a'. In the OP's case, because of the import cycle, while sys.modules['a.b'] exists, module 'a' does not yet have the attribute 'b'. That's the reason that in the latter example, the LOAD_ATTR opcode fails. I'm inclined not to attempt to fix this. The reason that we don't pull 'a.b' out of sys.modules at this point is that if later in the execution of a/b.py we get an exception, the import will be cancelled, and sys.modules['a.b'] will be *deleted*, and then the variable 'c' would have a reference to a half-loaded module. The semantics of imports in the case of cycles are somewhat complex but clearly defined and there are only a few rules to consider, and from these rules it is possible to reason out whether any particular case is valid or not. I would prefer to keep it this way rather than add more special cases. There's a good reason why, *in general* (regardless of import cycles), "import a.b as c" is implemented as a getattr operation on a, not as an index operation on sys.modules (it is possible for module a to override its attribute b without updating sys.modules) and I'd rather keep those semantics than give them up for this particular edge case. Cyclic imports are hard. If they don't work for you, avoid them. |
I think this is solvable -- track and delete the references too. It's your decision in the end, but I think it's not good to make the language inconsistent because of technical limitations -- the interface should be the primary source for decisions. |
This sounds like a really strange application. Even if someone overrides the attribute, when you do in other place The docs https://docs.python.org/3/reference/simple_stmts.html#import say https://docs.python.org/3/reference/import.html#searching
So if you treat |
Right, this problem only arises with import cycles, and that's why we resisted making eager submodule resolution work *at all* for so long (bpo-992389 was filed way back in 2004). We only conceded the point (with bpo-17636 being implemented for 3.5) specifically to address a barrier to adoption for explicit relative imports, as it turned out that "from . import bar" could fail in cases where "import foo.bar" previously worked. The best explanation I could find for that rationale in the related python-dev thread is PJE's post here: https://mail.python.org/pipermail/python-dev/2013-April/125121.html What Victor's python-ideas thread pointed out is that there are actually *3* flavours of import where this particular circular reference problem can come up: # Has worked as long as Python has had packages,
# as long as you only lazily resolve foo.bar in
# function and method implementations
import foo.bar
# Has worked since 3.5 due to the IMPORT_FROM
# change that falls back to a sys.modules lookup
from foo import bar
# Still gives AttributeError since it
# eagerly resolves the attribute lookup
import foo.bar as bar While I think the architectural case for allowing this kind of circular dependency between different top level namespaces is much weaker than that for allowing it within packages, I do think there's a reasonable consistency argument to be made in favour of ensuring that I don't think it's a big problem in practice (so I wouldn't spend any time on implementing it myself), but the notion of an IMPORT_ATTR opcode for the "import x.y.z as m" case that parallels IMPORT_FROM seems architecturally clean to me in a way that the proposed resolutions to bpo-992389 weren't. |
OK I won't close it but I'm withdrawing from the issue. If there are devs who want to implement this or mentor some contributor into implementing it feel free, but it sounds like Nick doesn't think it's important enough to fix (i.e. not worth his time) and I feel the same. |
It is easy to implement this (just replace LOAD_ATTR with IMPORT_FROM in the compiler). The hardest part is writing tests. |
Huh, interesting - I'd missed that the only part of the "from a.b import c" that IMPORT_FROM implements is the LOAD_ATTR variant that falls back to sys.modules. The prior adjustment to get IMPORT_NAME to return "a.b" instead of "a" happens inside that opcode based on the fact that a non-empty from_list was passed in. So indeed, there's no new opcode needed - as Serhiy points out, the compiler just needs to emit IMPORT_FROM instead of LOAD_ATTR for this case. |
Maybe I am wrong but don't we get another weird behavior? import sys.path # this is error now
ModuleNotFoundError: No module named 'sys.path'; 'sys' is not a package So we probably expect same behavior here: import sys.path as foo But if we just replace LOAD_ATTR with IMPORT_FROM in the compiler then wouldn't it works like: from sys import path as foo ? |
No, because the compiler generates different arguments for the IMPORT_NAME opcode. |
Expanding on Serhiy's answer: the ModuleNotFoundError for "import sys.path" comes from IMPORT_NAME, not from any of the subsequent attribute lookups. That difference is visible in the bytecode: >>> dis("import sys.path as path")
1 0 LOAD_CONST 0 (0)
3 LOAD_CONST 1 (None)
6 IMPORT_NAME 0 (sys.path)
9 LOAD_ATTR 1 (path)
12 STORE_NAME 1 (path)
15 LOAD_CONST 1 (None)
18 RETURN_VALUE For "import sys.path as path", the given module name is "sys.path", and the "from list" entry on the stack is None. This fails before it even reaches the >>> dis("from sys import path")
1 0 LOAD_CONST 0 (0)
3 LOAD_CONST 1 (('path',))
6 IMPORT_NAME 0 (sys)
9 IMPORT_FROM 1 (path)
12 STORE_NAME 1 (path)
15 POP_TOP
16 LOAD_CONST 2 (None)
19 RETURN_VALUE For "from sys import path", the given module name is "sys", and the "from list" entry on the stack is a tuple containing the string "path". This works, since "sys" is importable *and* it has a "path" attribute. Hence why Serhiy's suggestion is such an elegant fix, since the only cases where the LOAD_ATTR vs IMPORT_FROM distinction matters will be those where:
Getting the tests and any related documentation updates right will actually be harder than the code change. |
How can I facilitate/contribute to get the changes suggested by Serhiy into Python? |
Could anyone please make a review? |
I've added your PR to my review queue, Serhiy, so I will get to it, I just can't make any promises as to when (hopefully this week). |
Thank you Brett. |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: