Stubtest: Improve heuristics for determining whether global-namespace names are imported #14270

Merged
hauntsaninja merged 3 commits into python:master from the stubtest-reexports branch on Dec 22, 2022

@AlexWaygood AlexWaygood commented Dec 9, 2022

The problem

Stubtest currently produces both false positives and false negatives when verifying constants in the global namespace of a module.

False positive

The import of string.ascii_letters is an implementation detail, but stubtest nonetheless complains that the name is missing from the stub:

# STUB: empty file

# RUNTIME
from string import ascii_letters

False negative

IMPORTANT_REGEX is defined in the module, but stubtest erroneously concludes (on Python 3.10+) that the constant has been imported from another module, because IMPORTANT_REGEX.__module__ is "re":

# STUB: empty file

# RUNTIME
import re
IMPORTANT_REGEX = re.compile("foo")
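
To make the misfire concrete, here is a minimal sketch (not taken from the PR) that inspects the `__module__` attribute of a compiled pattern. `getattr` with a default is used defensively: the description above only claims the attribute is "re" on Python 3.10+, and the behaviour on older versions is left as an assumption here.

```python
import re

IMPORTANT_REGEX = re.compile("foo")

# On Python 3.10+ this is expected to be "re", even though the constant
# is bound in the current module. The None default guards against
# interpreters where pattern instances do not expose the attribute.
pattern_module = getattr(IMPORTANT_REGEX, "__module__", None)
print(pattern_module)

# The module in which the name IMPORTANT_REGEX is actually bound
defining_module = __name__
print(pattern_module == defining_module)
```

A heuristic that compares `__module__` against the name of the module being checked would therefore treat this constant as foreign.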

Solution

This PR fixes the false positive by using inspect.getsourcelines() to dynamically retrieve the module source code. It then uses symtable to analyse that source code to gather a list of names which are known to be imported.
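
As a rough illustration of the symtable step, the sketch below analyses a source string directly instead of fetching it through inspect. The `SOURCE` string and the `imported_names` helper are hypothetical names chosen for this example; only `symtable.symtable()`, `get_symbols()`, `get_name()` and `is_imported()` are real standard-library calls.

```python
from __future__ import annotations

import symtable

# Hypothetical module source, combining the two runtime examples above
SOURCE = """\
from string import ascii_letters
import re
IMPORTANT_REGEX = re.compile("foo")
"""


def imported_names(source: str, module_name: str = "example") -> frozenset[str]:
    """Return the top-level names that symtable reports as bound by an import."""
    table = symtable.symtable(source, module_name, "exec")
    return frozenset(
        sym.get_name() for sym in table.get_symbols() if sym.is_imported()
    )


names = imported_names(SOURCE)
print(sorted(names))
```

The analysis is purely static: the source is parsed but never executed, so a name is classified by how it is bound in the source text rather than by what object it refers to at runtime.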

The PR fixes the false negative by only using the __module__ heuristic on objects which are callable. The vast majority of callable objects will be types or functions. For these objects, the __module__ attribute will give a good indication of whether the object originates from another module or not; for other objects, it's less useful.
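
A minimal sketch of the callable-only restriction, assuming a hypothetical helper `module_attr_says_foreign` and a made-up module name; this is not the code in mypy/stubtest.py, only an illustration of the idea described above.

```python
import re
from os.path import join
from string import ascii_letters


def module_attr_says_foreign(obj: object, current_module: str) -> bool:
    """Hypothetical helper: True if a callable's __module__ names another module.

    Non-callable objects always yield False here, because their __module__
    (if they have one) typically reflects their type, not where the name
    was bound.
    """
    if not callable(obj):
        return False
    obj_module = getattr(obj, "__module__", None)
    return isinstance(obj_module, str) and obj_module != current_module


CURRENT = "example_module"  # assumed name of the module being checked
results = {
    "join": module_attr_says_foreign(join, CURRENT),
    "ascii_letters": module_attr_says_foreign(ascii_letters, CURRENT),
    "IMPORTANT_REGEX": module_attr_says_foreign(re.compile("foo"), CURRENT),
}
print(results)
```

With this restriction, the compiled pattern is no longer classified by its `__module__`, so deciding whether it was imported falls to other evidence, such as the names collected via symtable.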

Impact on typeshed

This PR has the following impact on typeshed (all diffs are relative to stubtest as it exists on mypy master, when run on Python 3.10.7; the stdlib diff is based on running stubtest on a Windows machine):

Stdlib:

+error: lib2to3.pygram.python_grammar_no_print_and_exec_statement is not present in stub
+Stub: in file ..\typeshed\stdlib\lib2to3\pygram.pyi
+MISSING
+Runtime:
+<lib2to3.pgen2.grammar.Grammar object at 0x000001279D810FA0>
+ 
+error: venv.logger is not present in stub
+Stub: in file ..\typeshed\stdlib\venv\__init__.pyi
+MISSING
+Runtime:
+<Logger venv (WARNING)>
+ 
+note: unused allowlist entry asyncore\.E[A-Z]+
+note: unused allowlist entry asyncore.errorcode
+note: unused allowlist entry email._header_value_parser.hexdigits
+note: unused allowlist entry logging.handlers.ST_[A-Z]+
+note: unused allowlist entry xml.dom.expatbuilder.EMPTY_NAMESPACE
+note: unused allowlist entry xml.dom.expatbuilder.EMPTY_PREFIX
+note: unused allowlist entry xml.dom.expatbuilder.XMLNS_NAMESPACE
+note: unused allowlist entry xml.dom.minidom.EMPTY_NAMESPACE
+note: unused allowlist entry xml.dom.minidom.EMPTY_PREFIX
+note: unused allowlist entry xml.dom.minidom.XMLNS_NAMESPACE
+note: unused allowlist entry distutils.command.bdist_rpm.DEBUG
+note: unused allowlist entry distutils.command.build_ext.USER_BASE
+note: unused allowlist entry distutils.command.build_scripts.ST_MODE
+note: unused allowlist entry distutils.command.install_scripts.ST_MODE
+note: unused allowlist entry distutils.dist.DEBUG
+note: unused allowlist entry distutils.spawn.DEBUG
+note: unused allowlist entry distutils.core.DEBUG
+note: unused allowlist entry distutils.archive_util.getgrnam
+note: unused allowlist entry distutils.archive_util.getpwnam
+note: unused allowlist entry xmlrpc.server.fcntl
-Found 6 errors (checked 482 modules)
+Found 28 errors (checked 482 modules)

cffi:

+note: unused allowlist entry cffi.cparser.COMMON_TYPES
+note: unused allowlist entry cffi.verifier.__version_verifier_modules__
-Found 1 error (checked 17 modules)
+Found 3 errors (checked 17 modules)

chevron:

+note: unused allowlist entry chevron.main.version
+note: unused allowlist entry chevron.renderer.linesep
+Found 2 errors (checked 5 modules)

colorama:

+note: unused allowlist entry colorama.ansitowin32.BEL
+note: unused allowlist entry colorama.ansitowin32.windll
+Found 2 errors (checked 13 modules)

croniter:

+error: croniter.croniter.hash_expression_re is not present in stub
+Stub: in file /home/runner/work/typeshed/typeshed/stubs/croniter/croniter/croniter.pyi
+MISSING
+Runtime:
+re.compile('^(?P<hash_type>h|r)(\\((?P<range_begin>\\d+)-(?P<range_end>\\d+)\\))?(\\/(?P<divisor>...
+ 
+error: croniter.croniter.only_int_re is not present in stub
+Stub: in file /home/runner/work/typeshed/typeshed/stubs/croniter/croniter/croniter.pyi
+MISSING
+Runtime:
+re.compile('^\\d+$')
+ 
+error: croniter.croniter.special_weekday_re is not present in stub
+Stub: in file /home/runner/work/typeshed/typeshed/stubs/croniter/croniter/croniter.pyi
+MISSING
+Runtime:
+re.compile('^(\\w+)#(\\d+)|l(\\d+)$')
+ 
+error: croniter.croniter.star_or_int_re is not present in stub
+Stub: in file /home/runner/work/typeshed/typeshed/stubs/croniter/croniter/croniter.pyi
+MISSING
+Runtime:
+re.compile('^(\\d+|\\*)$')
+ 
+error: croniter.croniter.step_search_re is not present in stub
+Stub: in file /home/runner/work/typeshed/typeshed/stubs/croniter/croniter/croniter.pyi
+MISSING
+Runtime:
+re.compile('^([^-]+)-([^-/]+)(/(\\d+))?$')
+ 
+Found 5 errors (checked 2 modules)

crontab:

+note: unused allowlist entry crontabs.X_OK
+Found 1 error (checked 3 modules)

dateparser:

+note: unused allowlist entry dateparser.conf.date_order_chart
+note: unused allowlist entry dateparser.conf.language_order
+note: unused allowlist entry dateparser.languages.loader.language_locale_dict
+note: unused allowlist entry dateparser.languages.loader.language_order
+note: unused allowlist entry dateparser.languages.locale.ALWAYS_KEEP_TOKENS
+note: unused allowlist entry dateparser.custom_language_detection.language_mapping.language_map
+note: unused allowlist entry dateparser.custom_language_detection.fasttext.dateparser_model_home
+note: unused allowlist entry dateparser.timezone_parser.timezone_info_list
+Found 8 errors (checked 238 modules)

decorator:

+error: decorator.POS is not present in stub
+Stub: in file /home/runner/work/typeshed/typeshed/stubs/decorator/decorator.pyi
+MISSING
+Runtime:
+<_ParameterKind.POSITIONAL_OR_KEYWORD: 1>
+ 
+Found 1 error (checked 1 module)

dockerfile-parse:

+error: dockerfile_parse.parser.logger is not present in stub
+Stub: in file /home/runner/work/typeshed/typeshed/stubs/dockerfile-parse/dockerfile_parse/parser.pyi
+MISSING
+Runtime:
+<Logger dockerfile_parse.parser (WARNING)>
+ 
+note: unused allowlist entry dockerfile_parse.constants.version_info
+note: unused allowlist entry dockerfile_parse.parser.string_types
+note: unused allowlist entry dockerfile_parse.parser.DOCKERFILE_FILENAME
+note: unused allowlist entry dockerfile_parse.parser.COMMENT_INSTRUCTION
+note: unused allowlist entry dockerfile_parse.util.PY2
+Found 6 errors (checked 4 modules)

fpdf:

+error: fpdf.image_parsing.RESAMPLE is not present in stub
+Stub: in file /home/runner/work/typeshed/typeshed/stubs/fpdf2/fpdf/image_parsing.pyi
+MISSING
+Runtime:
+<Resampling.LANCZOS: 1>
+ 
+note: unused allowlist entry fpdf.syntax.BOM_UTF16_BE
+note: unused allowlist entry fpdf.linearization.signer
+note: unused allowlist entry fpdf.output.signer
+Found 4 errors (checked 25 modules)

parsimonious:

+note: unused allowlist entry parsimonious.nodes.version_info
+Found 1 error (checked 13 modules)

pika:

+note: unused allowlist entry pika.data.PY2
+note: unused allowlist entry pika.data.basestring
+note: unused allowlist entry pika.spec.str_or_bytes
+note: unused allowlist entry pika.validators.basestring
-Found 2 errors (checked 30 modules)
+Found 6 errors (checked 30 modules)

playsound:

+error: playsound.logger is not present in stub
+Stub: in file /home/runner/work/typeshed/typeshed/stubs/playsound/playsound.pyi
+MISSING
+Runtime:
+<Logger playsound (WARNING)>
+ 
+Found 1 error (checked 1 module)

pyinstaller:

+error: PyInstaller.__main__.logger is not present in stub
+Stub: in file /home/runner/work/typeshed/typeshed/stubs/pyinstaller/PyInstaller/__main__.pyi
+MISSING
+Runtime:
+<Logger PyInstaller.__main__ (INFO)>
+ 
+error: PyInstaller.utils.hooks.logger is not present in stub
+Stub: in file /home/runner/work/typeshed/typeshed/stubs/pyinstaller/PyInstaller/utils/hooks/__init__.pyi
+MISSING
+Runtime:
+<Logger PyInstaller.utils.hooks (INFO)>
+ 
+Found 2 errors (checked 151 modules)

pyscreeze:

+error: pyscreeze.whichProc is not present in stub
+Stub: in file /home/runner/work/typeshed/typeshed/stubs/PyScreeze/pyscreeze/__init__.pyi
+MISSING
+Runtime:
+<Popen: returncode: 1 args: ['which', 'scrot']>
+ 
+Found 1 error (checked 1 module)

requests:

+note: unused allowlist entry requests.adapters.basestring
+note: unused allowlist entry requests.auth.basestring
+note: unused allowlist entry requests.utils.basestring
+note: unused allowlist entry requests.utils.integer_types
+note: unused allowlist entry requests.models.basestring
+note: unused allowlist entry requests.help.requests_version
+note: unused allowlist entry requests.sessions.DEFAULT_PORTS
+note: unused allowlist entry requests.help.chardet
+note: unused allowlist entry requests.help.cryptography
+note: unused allowlist entry requests.help.pyopenssl
+note: unused allowlist entry requests.help.OpenSSL
+note: unused allowlist entry requests.charset_normalizer_version
+note: unused allowlist entry requests.chardet_version
+note: unused allowlist entry requests.utils.HEADER_VALIDATORS
-Found 2 errors (checked 18 modules)
+Found 16 errors (checked 18 modules)

retry:

+error: retry.api.logging_logger is not present in stub
+Stub: in file /home/runner/work/typeshed/typeshed/stubs/retry/retry/api.pyi
+MISSING
+Runtime:
+<Logger retry.api (WARNING)>
+ 
+Found 1 error (checked 5 modules)

toml:

+note: unused allowlist entry toml.decoder.linesep
+Found 1 error (checked 5 modules)

vobject:

+error: vobject.base.formatter is not present in stub
+Stub: in file /home/runner/work/typeshed/typeshed/stubs/vobject/vobject/base.pyi
+MISSING
+Runtime:
+<logging.Formatter object at 0x7f8bb3d737f0>
+ 
+error: vobject.base.handler is not present in stub
+Stub: in file /home/runner/work/typeshed/typeshed/stubs/vobject/vobject/base.pyi
+MISSING
+Runtime:
+<StreamHandler /dev/null (NOTSET)>
+ 
+error: vobject.base.logger is not present in stub
+Stub: in file /home/runner/work/typeshed/typeshed/stubs/vobject/vobject/base.pyi
+MISSING
+Runtime:
+<Logger vobject.base (ERROR)>
+ 
+note: unused allowlist entry vobject.change_tz.PyICU
+note: unused allowlist entry vobject.hcalendar.CRLF
+Found 5 errors (checked 9 modules)

Xlib:

+note: unused allowlist entry Xlib.ext.randr.W
+note: unused allowlist entry Xlib.ext.xinput.integer_types
+note: unused allowlist entry Xlib.protocol.display.PY3
+note: unused allowlist entry Xlib.protocol.rq.PY3
+Found 4 errors (checked 66 modules)

zxcvbn:

+note: unused allowlist entry zxcvbn.scoring.ADJACENCY_GRAPHS
+note: unused allowlist entry zxcvbn.matching.FREQUENCY_LISTS 
+Found 2 errors (checked 8 modules)


AlexWaygood commented Dec 12, 2022

I went for code that felt "safe" and well-commented in this PR, since the approach felt a little out-there. But the code inside the _get_imported_symbol_names function could actually be made more concise and inlined inside the verify_mypyfile function, if that's preferable, and it probably wouldn't be significantly less safe:

-imported_symbols = _get_imported_symbol_names(runtime)
+try:
+    source = inspect.getsource(runtime)
+    module_symtable = symtable.symtable(source, runtime.__name__, "exec")
+    imported_symbols: set[str] | None = {
+        sym.get_name()
+        for sym in module_symtable.get_symbols()
+        if sym.is_imported()
+    }
+except (OSError, TypeError, SyntaxError):
+    imported_symbols = None
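
For context on the `except` clause, here is a standalone sketch of the same try/except shape, wrapped in a hypothetical function `imported_names_or_none`. It shows one of the failure modes: built-in modules such as `sys` have no Python source, so `inspect.getsource()` raises `TypeError` and the fallback value is used. The choice of `sys` and `json` as inputs is purely illustrative.

```python
from __future__ import annotations

import inspect
import json
import symtable
import sys
import types


def imported_names_or_none(module: types.ModuleType) -> frozenset[str] | None:
    """Return the names imported by `module`, or None if its source is unavailable."""
    try:
        source = inspect.getsource(module)
        table = symtable.symtable(source, module.__name__, "exec")
    except (OSError, TypeError, SyntaxError):
        # No retrievable source (or unparsable source): signal "unknown"
        return None
    return frozenset(s.get_name() for s in table.get_symbols() if s.is_imported())


# sys is a built-in module, so its source cannot be retrieved
builtin_result = imported_names_or_none(sys)
print(builtin_result)

# json is normally a pure-Python package whose source is available
json_result = imported_names_or_none(json)
print(json_result is not None)
```

Returning `None` rather than an empty set keeps "no imports found" distinct from "could not analyse", which lets the caller fall back to other heuristics in the second case.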


@hauntsaninja hauntsaninja left a comment


Lol, this is interesting, cool idea! A new way for stubtest to blur the lines between dynamic and static.

mypy/stubtest.py (outdated)
    except SyntaxError:
        return None

    return frozenset(sym.get_name() for sym in module_symtable.get_symbols() if sym.is_imported())

It looks like if you have an import and an assignment, is_imported is still True. I think this is probably desirable, given that we usually type objects as coming from their C accelerator modules even if there is a pure Python equivalent. So no action item here, just typing out my thoughts :-)
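
A small sketch of that observation, using a hypothetical accelerator module name `_speedups`. The source is only parsed by symtable, never executed, so the module does not need to exist.

```python
import symtable

# Hypothetical source: a pure-Python fallback that may be replaced by an
# import from an (imaginary) accelerator module
SOURCE = """\
helper = None
try:
    from _speedups import helper
except ImportError:
    pass
"""

table = symtable.symtable(SOURCE, "example", "exec")
sym = table.lookup("helper")

# The name is bound both by an assignment and by an import
print(sym.is_imported(), sym.is_assigned())
```

So a name that is assigned and also imported is still reported as imported, which matches the behaviour described in the comment above.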

@hauntsaninja hauntsaninja merged commit 31b0413 into python:master Dec 22, 2022
@AlexWaygood AlexWaygood deleted the stubtest-reexports branch December 22, 2022 23:27
AlexWaygood commented

Thanks! :D
