Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

typing module adds objects to sys.modules that don't look like modules #79987

Closed
ericsnowcurrently opened this issue Jan 22, 2019 · 10 comments
Closed
Labels
3.8 only security fixes stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@ericsnowcurrently
Copy link
Member

BPO 35806
Nosy @gvanrossum, @warsaw, @brettcannon, @ronaldoussoren, @ncoghlan, @ericsnowcurrently

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

Show more details

GitHub fields:

assignee = None
closed_at = <Date 2019-01-22.20:44:57.410>
created_at = <Date 2019-01-22.17:16:42.492>
labels = ['3.8', 'type-bug', 'library']
title = "typing module adds objects to sys.modules that don't look like modules"
updated_at = <Date 2019-01-28.20:00:47.376>
user = 'https://github.com/ericsnowcurrently'

bugs.python.org fields:

activity = <Date 2019-01-28.20:00:47.376>
actor = 'gvanrossum'
assignee = 'none'
closed = True
closed_date = <Date 2019-01-22.20:44:57.410>
closer = 'gvanrossum'
components = ['Library (Lib)']
creation = <Date 2019-01-22.17:16:42.492>
creator = 'eric.snow'
dependencies = []
files = []
hgrepos = []
issue_num = 35806
keywords = []
message_count = 10.0
messages = ['334222', '334224', '334231', '334232', '334238', '334429', '334436', '334437', '334447', '334489']
nosy_count = 6.0
nosy_names = ['gvanrossum', 'barry', 'brett.cannon', 'ronaldoussoren', 'ncoghlan', 'eric.snow']
pr_nums = []
priority = 'low'
resolution = 'wont fix'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue35806'
versions = ['Python 3.8']

@ericsnowcurrently
Copy link
Member Author

tl;dr Should all objects in sys.modules look like module objects?

In bpo-35791 Ronald described a problem with a "module" added to sys.modules that does not have all the attributes a module should have. He also mentioned a similar problem with typing.io [1]:

BTW. Typing.io is a namespace added to sys.modules by
the typing module that also does not have __spec__, and
causes similar problems. I have an simple workaround for
that on my side.

I've verified the missing module attributes (using 3.8):

    >>> old = sorted(sys.modules)
    >>> import typing
    >>> new = sorted(sys.modules)
    >>> assert sorted(set(old) - set(new)) == []
    >>> sorted(set(new) - set(old))
    ['_collections', '_functools', '_heapq', '_locale',
     '_operator', '_sre', 'collections', 'collections.abc',
     'contextlib', 'copyreg', 'enum', 'functools', 'heapq',
     'itertools', 'keyword', 'operator', 're', 'reprlib',
     'sre_compile', 'sre_constants', 'sre_parse', 'types',
     'typing', 'typing.io', 'typing.re']
    >>> [name for name in vars(sys.modules['typing.io']) if name.startswith('__')]
    ['__module__', '__doc__', '__all__', '__dict__', '__weakref__']
    >>> [name for name in vars(sys.modules['typing.re']) if name.startswith('__')]
    ['__module__', '__doc__', '__all__', '__dict__', '__weakref__']

Per the language reference [2], modules should have the following attributes:

__name__
__loader__
__package__
__spec__

Modules imported from files also should have __file__ and __cached__. (For the sake of completeness, packages also should have a __path__ attribute.)

As seen above, typing.io and typing.re don't have any of the import-related attributes.

So, should those two "modules" have all those attributes added? I'm in favor of saying that every sys.modules entry must have all the appropriate import-related attributes (but doesn't have to be an actual module object). Otherwise tools (e.g. importlib.reload(), Ronald's) making that (arguably valid) assumption break. The right place for the change in the language reference is probably the "module cache" section. [3] The actual entry for sys.modules [4] is probably fine as-is.

[1] https://bugs.python.org/issue35791#msg334212
[2] https://docs.python.org/3/reference/import.html#module-spec
[3] https://docs.python.org/3/reference/import.html#the-module-cache
[4] https://docs.python.org/3/library/sys.html#sys.modules

@ericsnowcurrently ericsnowcurrently added 3.8 only security fixes stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Jan 22, 2019
@gvanrossum
Copy link
Member

It's been a somewhat well-known idiom for modules to replace themselves in sys.modules with an object that implements some special behaviors (e.g. dynamic loading). This "just works" and AFAIK people have been doing this for ages. Typically such classes just implement enough machinery so that "import foo; print(foo.bar)" works -- everything else (file, __doc__ etc.) is optional.

So I think tools should be robust when they inspect sys.modules and not crash or whine loudly when they find something that's missing a few advanced attributes.

That said, if there's something *useful* we could put in the special attributes for e.g. typing.io, I'm not against it. But I don't think this is a bug.

@ronaldoussoren
Copy link
Contributor

The reason I filed bpo-35791 is that I'm rewriting modulegraph[1] using importlib and run into some problems due to importlib.util.find_spec() failing for names that are already imported.

Working around that in general probably will require reimplementing bits of importlib in my own code and that's something I'd prefer to avoid. The alternative appears to be messing with sys.modules when I run into this problem, which might cause other problem.

BTW. The lack of __spec__ on typing.io and typing.re is not a problem for me, I can use the machinery I already have to insert edges for C extensions to do the right thing for these modules as well.

[1] https://modulegraph.readthedocs.io/en/latest/

@ronaldoussoren
Copy link
Contributor

In response to my previous message: "Messing" with sys.modules appears to be good enough work around for me, at least in the limited testing I've done so far. The new version of modulegraph hasn't seen any testing beyond a largish set of unit tests though.

@gvanrossum
Copy link
Member

OK, I don't see a bug here. The docs for sys.modules are pretty vague -- I don't believe they imply that all values in sys.modules must have every attribute of a Module object.

@ncoghlan
Copy link
Contributor

Closing this without any changes contradicts the answer we gave Ronald on bpo-35791 that it's expected behaviour for importlib.find_spec() to throw an exception for already loaded modules without a __spec__ attribute.

So if this stays closed, then we should reopen bpo-35791, and treat it as a feature request to either:

  1. add a "ignore_module_cache" option to bypass sys.modules; or
  2. revert to searching for the original spec in cases where the sys.modules entry has no __spec__ attribute (which has the virtue of just working for cases of the "replace yourself in sys.modules" idiom)

That said, the typing pseudo submodules can populate their spec with useful information by copying most of their attributes from typing.__spec__, but setting their spec.loader attribute to one that throws an ImportError with a message saying to import typing instead of attempting to reload the submodule directly.

@gvanrossum
Copy link
Member

Option 2 sounds best. I am also not against adding __spec__ but I think we
should support the idiom regardless, and I don’t consider this a bug in the
typing module — at best there’s a slight improvement.

On Sun, Jan 27, 2019 at 6:50 AM Nick Coghlan <report@bugs.python.org> wrote:

Nick Coghlan ncoghlan@gmail.com added the comment:

Closing this without any changes contradicts the answer we gave Ronald on
bpo-35791 that it's expected behaviour for importlib.find_spec() to throw an
exception for already loaded modules without a spec attribute.

So if this stays closed, then we should reopen bpo-35791, and treat it as a
feature request to either:

  1. add a "ignore_module_cache" option to bypass sys.modules; or
  2. revert to searching for the original spec in cases where the
    sys.modules entry has no spec attribute (which has the virtue of just
    working for cases of the "replace yourself in sys.modules" idiom)

That said, the typing pseudo submodules can populate their spec with
useful information by copying most of their attributes from
typing.__spec__, but setting their spec.loader attribute to one that
throws an ImportError with a message saying to import typing instead of
attempting to reload the submodule directly.

----------


Python tracker <report@bugs.python.org>
<https://bugs.python.org/issue35806\>


--
--Guido (mobile)

@ronaldoussoren
Copy link
Contributor

Note that I will have to special case namespaces like typing.re anyway in my code, being a namespace that does not correspond to a "real" module whose code I can analyse.

bpo-35791 was mostly about 3th-party code like apipkg that are "real" modules, but for some reason effectively erase there __spec__ attribute. In those cases find_spec could return valid value.

As I mentioned in the other issue I have a simple workaround for this, and that's something I'll have to keep around for a while anyway.

I'm not convinced at this point that anything needs to be changed, as long as the documentation is clear on the requirement that modules should have a __spec__ attribute (which makes it easier to convince 3th-party authors to update their code).

Special cases like typing.re will always be special, and adding __spec__ to them won't change a lot there (the only vaguely useful attribute in the ModuleSpec for typing.re would be "name" and "parent") and IMHO would be needless code churn.

@ncoghlan
Copy link
Contributor

OK, I've filed bpo-35839 to propose falling back to a regular search when the __spec__ attribute is missing, while treating "obj.__spec__ is None" as a true negative cache entry.

@gvanrossum
Copy link
Member

Thanks!

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.8 only security fixes stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

No branches or pull requests

4 participants