-
-
Notifications
You must be signed in to change notification settings - Fork 275
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve is_namespace() check for modules where __spec__
is None
#1802
Conversation
46be0f7
to
61ee4f0
Compare
Thanks for that through investigation! I don't have the time for a full review right now, but this would definitely need a test in order not to regress in the future. It also needs a small entry in the changelog of the next patch release. |
Pull Request Test Coverage Report for Build 3099164269
💛 - Coveralls |
See https://stackoverflow.com/a/42962529. Let's take the following contents as an example: ```python import celery.result ``` From pylint-dev#1777, astroid started to use `processed_components` for namespace check. In the above case, the `modname` is `celery.result`, it first checks for `celery` and then `celery.result`. Before that PR, it'd always check for `celery.result`. `celery` is recreating module to make it lazily load. See https://github.com/celery/celery/blob/34533ab44d2a6492004bc3df44dc04ad5c6611e7/celery/__init__.py#L150. This module does not have `__spec__` set. Reading through Python's docs, it seems that `__spec__` can be set to None, so it seems like it's not a thing that we can depend upon for namespace checks. See https://docs.python.org/3/reference/import.html#spec__. --- The `celery.result` gets imported for me when pylint-pytest plugin tries to load fixtures, but this could happen anytime if any plugin imports packages. In that case, `importlib.util._find_spec_from_path("celery")` will raise ValueError since it's already in `sys.modules` and does not have a spec. Fixes pylint-dev/pylint#7488.
61ee4f0
to
0d9e05c
Compare
Thanks, I have added a changelog and a test. I was wrong before, the |
We probably should support this but we should probably open an issue against |
I was/am a bit hesitant to create a PR due to unclear semantics (especially since it's doing a lot of questionable stuff with modules). My understanding is |
The PEP is really unclear:
Similarly the docs state, https://docs.python.org/3/reference/import.html#spec__:
The I did a deep dive into The first commit is linked to python/typeshed#321, which is part of python/typeshed#189. There is a crucial reply in that thread by Brett:
My (perhaps somewhat biased) conclusion is that |
Great points. Thank you for the great research, and all the help. :) I have opened celery/celery#7773. |
__spec__
is None
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR and investigation @skshetry, I appreciate having another pair of eyes on a very difficult bit of astroid
's code.
I agree we need to make astroid
more robust against unexpected inputs. Your change looks sensible (and for this branch you edited, all we are trying to do is make sure that we still pass the test cases for old-style namespace packages such as test_identify_old_namespace_package_protocol
).
return ( | ||
sys.modules[processed_components[0]].__spec__ is None | ||
mod.__spec__ is None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we even need to check mod.__spec__ is None
, but I don't fully understand all different styles of namespace packages. The test does pass even if we remove this.
Steps
Description
See https://stackoverflow.com/a/42962529.
Let's take the following contents as an example:
From #1777, astroid started to use
processed_components
for namespace check. In the above case, themodname
iscelery.result
, it first checks forcelery
and thencelery.result
. Before that PR, it'd always check forcelery.result
.But if you have imported it as
celery.result
,sys.modules["celery"].__spec__
is going to beNone
, and hence the function will return True, but below where we try to load getsubmodule_path
/__path__
forcelery.result
will fail as it is not a package. Also celery is not a namespace package.celery
is recreating module to make it lazily load, which does not have__spec__
set. Reading through Python's docs, it seems that__spec__
can be set to None, so it seems like it's not a thing that we can depend upon for namespace checks.The
celery.result
gets imported for me whenpylint-pytest
plugin tries to load fixtures, but this could happen anytime if any plugin imports packages. In that case,importlib.util._find_spec_from_path("celery")
will raise ValueError since it's already insys.modules
and does not have a spec.Type of Changes
Related Issue
Fixes pylint-dev/pylint#7488.