Use *.pyi stub files as source for the inspect.signature() #121945

Closed
skirpichev opened this issue Jul 18, 2024 · 7 comments
Labels
topic-typing type-feature A feature request or enhancement

Comments

@skirpichev
Contributor

skirpichev commented Jul 18, 2024

Feature or enhancement

Proposal:

In the referenced discuss.python.org thread, it was proposed to use stub files to obtain inspect.Signature objects for extension modules.

Currently, there is no public interface to provide such data for introspection. There is only the private, undocumented __text_signature__ attribute, which is parsed by inspect.signature(). This interface is also incomplete, e.g. it lacks support for return type annotations.
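
For illustration, this is roughly what the current mechanism looks like on CPython for a builtin such as len. It is only a sketch: the exact __text_signature__ string is a CPython implementation detail, may differ between versions, and may be absent on other implementations.

```python
import inspect

# CPython-specific: callables implemented in C may carry a private
# __text_signature__ string, which inspect.signature() parses.
text_sig = getattr(len, "__text_signature__", None)
print(text_sig)  # on CPython, something like '($module, obj, /)'

sig = inspect.signature(len)
param_names = list(sig.parameters)
print(param_names)

# The text signature has no way to express a return annotation,
# so the resulting Signature reports it as "empty".
has_return_annotation = sig.return_annotation is not inspect.Signature.empty
print(has_return_annotation)
```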

Let's use stub files as a fallback for extension modules: if a callable has no __text_signature__ attribute, the inspect module should look for stub files for the given module, import the relevant stub object (with the same import resolution ordering as for type checkers) and take its signature. Later, Argument Clinic (AC) can be changed to generate stub files (or take them as input).
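
To make the lookup step more concrete, here is a minimal sketch of locating a stub for a top-level module. The helper name find_stub and its search_path parameter are hypothetical, and the ordering is only loosely inspired by PEP 561 (a stub-only "<name>-stubs" package, then an inline package stub, then a single-file stub, checked per directory). It does not implement the full resolution order used by type checkers and ignores typeshed entirely.

```python
import sys
from pathlib import Path
from typing import Iterable, Optional


def find_stub(module_name: str,
              search_path: Optional[Iterable[str]] = None) -> Optional[Path]:
    """Hypothetical helper: return a stub file for a top-level module, or None."""
    entries = sys.path if search_path is None else search_path
    for entry in entries:
        base = Path(entry or ".")
        candidates = [
            base / f"{module_name}-stubs" / "__init__.pyi",  # stub-only package
            base / module_name / "__init__.pyi",             # inline package stub
            base / f"{module_name}.pyi",                     # single-file stub
        ]
        for candidate in candidates:
            if candidate.is_file():
                return candidate
    return None
```

Calling find_stub("foo") searches sys.path, while passing an explicit list of directories makes the lookup easy to test in isolation.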

PS: Maybe the same approach could work to support functions with multiple signatures, if typing.get_overloads() falls back to stub files as well.

If this does make sense, I'll provide a draft implementation.

Has this already been discussed elsewhere?

I have already discussed this feature proposal on Discourse

Links to previous discussion of this feature:

https://discuss.python.org/t/type-signatures-for-extension-modules-pep-draft/43914/22

@skirpichev added the type-feature label Jul 18, 2024
@skirpichev
Contributor Author

CC @zooba

@sobolevn
Member

This feature requires a lot of little details to get right, since .pyi files are currently never executed by Python. And they are a bit special in that regard (for example from __future__ import annotations). Adding topic-typing to be sure that typing maintainers also see this.

@AlexWaygood
Member

AlexWaygood commented Jul 18, 2024

Although pyi files are always guaranteed to parse as valid Python syntax, they often can't be imported at runtime since as @sobolevn says they support forward references in many contexts that runtime Python source files don't. The only way you'd be able to reliably retrieve information from them is by statically analysing their ASTs (which is what type checkers do).

It's possible to figure out which AST node in a stub file corresponds to a given runtime symbol. https://github.com/JelleZijlstra/stubdefaulter and https://github.com/python/mypy/blob/master/mypy/stubtest.py are both examples of projects that compare AST nodes in parsed stub files to corresponding runtime objects. It's a pretty nontrivial task to do it correctly, however, and I'm not sure if it's something the standard library's inspect.signature function should be doing.
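
As a rough idea of what the static approach involves, here is a minimal sketch that builds an inspect.Signature from a function definition in stub source text without executing it. The stub content and the helper names are made up for illustration. Annotations are kept as strings, only top-level functions are handled, and overloads, classes, and name resolution (the hard parts that the projects above deal with) are ignored.

```python
import ast
import inspect
from inspect import Parameter

# Hypothetical stub content, used only for this illustration.
STUB_SOURCE = """
def scale(x: float, /, factor: float = 2.0, *args: float, clamp: bool = ..., **kw: object) -> float: ...
"""


def _annotation(node):
    # Keep annotations as source strings; nothing from the stub is evaluated.
    return Parameter.empty if node is None else ast.unparse(node)


def _default(node):
    if node is None:
        return Parameter.empty
    try:
        return ast.literal_eval(node)  # literals only; "..." becomes Ellipsis
    except (ValueError, TypeError):
        return ...  # placeholder for non-literal defaults


def signature_from_stub(source, name):
    """Build a Signature for the top-level function `name` in stub source."""
    for node in ast.parse(source).body:
        if isinstance(node, ast.FunctionDef) and node.name == name:
            break
    else:
        raise LookupError(name)

    a = node.args
    positional = a.posonlyargs + a.args
    # Defaults apply to the last len(a.defaults) positional parameters.
    defaults = [None] * (len(positional) - len(a.defaults)) + a.defaults
    params = []
    for i, (arg, default) in enumerate(zip(positional, defaults)):
        kind = (Parameter.POSITIONAL_ONLY if i < len(a.posonlyargs)
                else Parameter.POSITIONAL_OR_KEYWORD)
        params.append(Parameter(arg.arg, kind, default=_default(default),
                                annotation=_annotation(arg.annotation)))
    if a.vararg:
        params.append(Parameter(a.vararg.arg, Parameter.VAR_POSITIONAL,
                                annotation=_annotation(a.vararg.annotation)))
    for arg, default in zip(a.kwonlyargs, a.kw_defaults):
        params.append(Parameter(arg.arg, Parameter.KEYWORD_ONLY,
                                default=_default(default),
                                annotation=_annotation(arg.annotation)))
    if a.kwarg:
        params.append(Parameter(a.kwarg.arg, Parameter.VAR_KEYWORD,
                                annotation=_annotation(a.kwarg.annotation)))
    return inspect.Signature(params, return_annotation=_annotation(node.returns))


sig = signature_from_stub(STUB_SOURCE, "scale")
print(list(sig.parameters))
```

Keeping annotations as strings sidesteps forward references and stub-only names, at the cost of returning unevaluated annotations to the caller.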

@skirpichev
Contributor Author

This feature requires a lot of little details to get right

Probably so.

And they are a bit special in that regard (for example from __future__ import annotations).

they often can't be imported at runtime since as @sobolevn says they support forward references in many contexts that runtime Python source files don't.

Yeah, I would expect that importing at runtime will be less trivial, but it's possible, isn't it? Or did I miss something, and there are cases where the above __future__ import is not enough?

If that's true, this is a no-go for the idea (for me).

The only way you'd be able to reliably retrieve information from them is by statically analysing their ASTs

This is something I would like to avoid.

I was thinking that a possible objection is that stub files would be used for different, only partially overlapping goals. For example, signatures of stdlib/builtins currently don't include typing info, and this is unlikely to change, so stubs shipped with CPython would be simpler. But it's NOT OK if pip install types-foo breaks inspect's behaviour for the extension foo...

Working with the AST could be much simpler if a decent part of the data is ignored. E.g. the current inspect helper that parses __text_signature__ does not support annotations or "complex" default values:

cpython/Lib/inspect.py

Lines 2193 to 2337 in c8d2630

def _signature_fromstr(cls, obj, s, skip_bound_arg=True):
    """Private helper to parse content of '__text_signature__'
    and return a Signature based on it.
    """
    Parameter = cls._parameter_cls

    clean_signature, self_parameter = _signature_strip_non_python_syntax(s)

    program = "def foo" + clean_signature + ": pass"

    try:
        module = ast.parse(program)
    except SyntaxError:
        module = None

    if not isinstance(module, ast.Module):
        raise ValueError("{!r} builtin has invalid signature".format(obj))

    f = module.body[0]

    parameters = []
    empty = Parameter.empty

    module = None
    module_dict = {}
    module_name = getattr(obj, '__module__', None)
    if not module_name:
        objclass = getattr(obj, '__objclass__', None)
        module_name = getattr(objclass, '__module__', None)
    if module_name:
        module = sys.modules.get(module_name, None)
        if module:
            module_dict = module.__dict__
    sys_module_dict = sys.modules.copy()

    def parse_name(node):
        assert isinstance(node, ast.arg)
        if node.annotation is not None:
            raise ValueError("Annotations are not currently supported")
        return node.arg

    def wrap_value(s):
        try:
            value = eval(s, module_dict)
        except NameError:
            try:
                value = eval(s, sys_module_dict)
            except NameError:
                raise ValueError

        if isinstance(value, (str, int, float, bytes, bool, type(None))):
            return ast.Constant(value)
        raise ValueError

    class RewriteSymbolics(ast.NodeTransformer):
        def visit_Attribute(self, node):
            a = []
            n = node
            while isinstance(n, ast.Attribute):
                a.append(n.attr)
                n = n.value
            if not isinstance(n, ast.Name):
                raise ValueError
            a.append(n.id)
            value = ".".join(reversed(a))
            return wrap_value(value)

        def visit_Name(self, node):
            if not isinstance(node.ctx, ast.Load):
                raise ValueError()
            return wrap_value(node.id)

        def visit_BinOp(self, node):
            # Support constant folding of a couple simple binary operations
            # commonly used to define default values in text signatures
            left = self.visit(node.left)
            right = self.visit(node.right)
            if not isinstance(left, ast.Constant) or not isinstance(right, ast.Constant):
                raise ValueError
            if isinstance(node.op, ast.Add):
                return ast.Constant(left.value + right.value)
            elif isinstance(node.op, ast.Sub):
                return ast.Constant(left.value - right.value)
            elif isinstance(node.op, ast.BitOr):
                return ast.Constant(left.value | right.value)
            raise ValueError

    def p(name_node, default_node, default=empty):
        name = parse_name(name_node)
        if default_node and default_node is not _empty:
            try:
                default_node = RewriteSymbolics().visit(default_node)
                default = ast.literal_eval(default_node)
            except ValueError:
                raise ValueError("{!r} builtin has invalid signature".format(obj)) from None
        parameters.append(Parameter(name, kind, default=default, annotation=empty))

    # non-keyword-only parameters
    total_non_kw_args = len(f.args.posonlyargs) + len(f.args.args)
    required_non_kw_args = total_non_kw_args - len(f.args.defaults)
    defaults = itertools.chain(itertools.repeat(None, required_non_kw_args), f.args.defaults)

    kind = Parameter.POSITIONAL_ONLY
    for (name, default) in zip(f.args.posonlyargs, defaults):
        p(name, default)

    kind = Parameter.POSITIONAL_OR_KEYWORD
    for (name, default) in zip(f.args.args, defaults):
        p(name, default)

    # *args
    if f.args.vararg:
        kind = Parameter.VAR_POSITIONAL
        p(f.args.vararg, empty)

    # keyword-only arguments
    kind = Parameter.KEYWORD_ONLY
    for name, default in zip(f.args.kwonlyargs, f.args.kw_defaults):
        p(name, default)

    # **kwargs
    if f.args.kwarg:
        kind = Parameter.VAR_KEYWORD
        p(f.args.kwarg, empty)

    if self_parameter is not None:
        # Possibly strip the bound argument:
        #    - We *always* strip first bound argument if
        #      it is a module.
        #    - We don't strip first bound argument if
        #      skip_bound_arg is False.
        assert parameters
        _self = getattr(obj, '__self__', None)
        self_isbound = _self is not None
        self_ismodule = ismodule(_self)
        if self_isbound and (self_ismodule or skip_bound_arg):
            parameters.pop(0)
        else:
            # for builtins, self parameter is always positional-only!
            p = parameters[0].replace(kind=Parameter.POSITIONAL_ONLY)
            parameters[0] = p

    return cls(parameters, return_annotation=cls.empty)

Most extensions probably aren't too complicated in this regard. For example, introspection support for most functions/classes in gmpy2 could be added with #101872.

@sobolevn
Member

sobolevn commented Jul 19, 2024

Yeah, I would expect that importing at runtime will be less trivial, but it's possible, isn't it? Or did I miss something, and there are cases where the above __future__ import is not enough?

There might be such cases, for example where generic types are used that are not actually generic at runtime. It may be rare, but it can happen.

# some.pyi

Alias = SomeFakeGeneric[int]

There might be cases where stub-only modules or types are used:

  • Marked as @typing.type_check_only
  • Imports of protected aliases, as in from django.utils.functional import _StrOrPromise in django-stubs
  • _typeshed
  • etc

@JelleZijlstra
Member

Yes, it is not possible to run stubs with a standard Python interpreter. I just tried with a few files in typeshed, and got the following problems:

  • Importing from _typeshed
  • A recursive type alias TraceFunction: TypeAlias = Callable[[FrameType, str, Any], TraceFunction | None] (throws NameError)
  • A class returning an instance of itself as a forward reference (this one would be fixed by lazy evaluation of annotations)
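
The first two failure modes can be reproduced with a small experiment. The stub text below is a simplified variant of the recursive alias above (written as a plain assignment, without TypeAlias), and the import of Incomplete from _typeshed is just one example of a stub-only name. The snippet assumes an environment where no runtime module called _typeshed is installed.

```python
import ast

# Simplified stub content; the alias refers to itself before it is defined.
STUB_SOURCE = '''\
from __future__ import annotations
from types import FrameType
from typing import Any, Callable

TraceFunction = Callable[[FrameType, str, Any], TraceFunction | None]
'''

# Static parsing works: stubs are syntactically valid Python.
tree = ast.parse(STUB_SOURCE)
parsed_ok = isinstance(tree, ast.Module)

# Executing the same text fails, even with the __future__ import, because
# the right-hand side of the assignment is evaluated eagerly.
try:
    exec(compile(STUB_SOURCE, "trace.pyi", "exec"), {})
    exec_error = None
except NameError as exc:
    exec_error = exc

# _typeshed exists only for type checkers; there is no runtime module.
try:
    exec("from _typeshed import Incomplete", {})
    import_error = None
except ImportError as exc:
    import_error = exc

print(parsed_ok, type(exec_error).__name__, type(import_error).__name__)
```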

This proposal doesn't say much about where the stub files should come from: should CPython ship them? If so, how do they interact with type checkers and with the stubs from typeshed?

I maintain a project https://github.com/JelleZijlstra/typeshed_client that contains functionality for finding stub files, parsing them, and locating names. This could be used to create a "superpowered inspect.signature" that returns a signature as defined in the stub files. But it would be a lot of complexity to add to the standard library.

@skirpichev
Contributor Author

@JelleZijlstra, thanks! For me, this kills the proposal :( I'll keep the issue open for a while; maybe Steve could defend this idea.

@skirpichev closed this as not planned Aug 14, 2024