
Proposal: signature copying for kwargs. #270

Closed
dmoisset opened this issue Aug 26, 2016 · 70 comments

@dmoisset (Contributor)

There's a quite common pattern in Python code:

def function(foo, *args, **kwargs):
    # do something with foo
    other_function(*args, **kwargs)
    # possibly do something else

def other_function(color: str=..., temperature: float=..., style: Stylesheet=...,
                   timeout: Optional[int]=..., database_adaptor: Adaptor=...,
                   strict: bool=..., output: IO[str]=..., allow_frogs: bool=...,
                   mode: SomeEnum=...):
    # do something with a lot of options

(a common subcase of this is when other_function is actually super().function). This presents two problems for a static analyzer:

  • the call from function to other_function cannot be type-checked properly because of the *args, **kwargs in the call arguments.
  • there is no sensible way to annotate function, so calls to it are unchecked.

The problem also affects readability of the code (which for me is one of the main issues that annotations try to address). James Powell from NumFOCUS even gave a PyData talk about the difficulties it brings: https://www.youtube.com/watch?v=MQMbnhSthZQ

Even if, in theory, the args/kwargs packing feature of Python can carry more or less arbitrary data, IMO this use case is common enough to warrant some special treatment. I was thinking of a way to flag this usage, for example:

@delegate_args(other_function)
def function(foo, *args, **kwargs):
    other_function(*args, **kwargs)

This could hint an analyzer so:

  • On calls to function, the "extra" arguments are checked to match the signature of other_function
  • The call to other_function is considered valid given that it uses the same arguments (I know that the code above could have modified the content of kwargs, but it's still more checking than what we have now).

For me, even without static analyzer, the readability benefits of seeing

@delegate_args(matplotlib.pyplot.plot)
def plot_valuation(ticker_symbol: str, start: date, end: date, *args, **kwargs): ...

and knowing that plot_valuation accepts any valid arguments from matplotlib's plot function, is worth it.

@gvanrossum (Member)

Gotta make this quick: yes, we experience this a lot in our own (Dropbox) code bases too. And I agree this is useful for readability even without static checking. I think there are some details to be worked out -- what I often see is that the wrapping function adds some fixed args to the function it delegates to, and those should not be accepted:

def raw(name='me', age=42): ...

def cooked(first_name, last_name, **extra):
    raw(name=first_name + ' ' + last_name, **extra)

cooked(age=100)  # OK
cooked(name='Guido')  # Error

Also of course the wrapper may also have some of its own arguments that are unrelated to the wrapped function.

@JukkaL (Contributor) commented Aug 26, 2016

I've seen this pattern quite frequently as well, and it would be nice to support it somehow. Refactoring existing code to use explicitly spelled-out individual arguments instead of *args and **kwargs can be a lot of work and can make the code harder to maintain, especially if there are many args to propagate.

Expanding on the decorator idea, maybe it should be possible to declare that a function accepts all positional and keyword args another function accepts, except for a set of args. Example:

def raw(name='me', age=42): ...

@delegate_args(raw, except=['name'])
def cooked(first_name, last_name, **extra):
    raw(name=first_name + ' ' + last_name, **extra)

cooked(age=100)  # OK
cooked(name='Guido')  # Error

Implementing this in a type checker looks a little tricky, especially when using both *args and **kwargs. Special-casing the whole thing looks possible, but sometimes programmers mutate the **kwargs dictionary and type checking that would likely require some sort of support for "dict-as-struct".
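A minimal sketch of the mutation pattern in question, with hypothetical names; the TypedDict shown is what "dict-as-struct" support later became in the typing world:

```python
from typing import TypedDict  # Python 3.8+

# Hypothetical options accepted by the delegated-to function.
class Options(TypedDict, total=False):
    color: str
    timeout: int

def raw(color: str = 'black', timeout: int = 30) -> str:
    return f'{color}/{timeout}'

def cooked(**kwargs):
    # The wrapper mutates kwargs before delegating; checking this call
    # statically requires tracking the dict's keys like a struct.
    kwargs.setdefault('color', 'red')
    return raw(**kwargs)

print(cooked(timeout=5))
```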

@dmoisset (Contributor, Author)

OK, to support this with a bit of data, I ran a small experiment: I took a random sample of stdlib files until I had about 100 function definitions with open keyword args, then another random sample of 100 defs from Django, and manually checked what they do with the arguments. My findings:

  1. 17% of the defs with **k are for functions that never use k
  2. 18% of the defs with **k actually use k as a dictionary (calling get(), pop(), iterating on it, etc., but also storing it into a variable or attribute, or passing k as a parameter but without **)
  3. 40% of the defs with **k pass it as is to another function, without adding more positional or keyword arguments on the call (This percentage also includes the functions that pass *varargs as is too).
  4. 24% of the defs with **k pass it (with ** again) to another function, but the call includes some extra arguments (positional or keyword).
  5. 1% of the defs with **k use it for self.__dict__.update(k).
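Minimal sketches of the five categories (function and names invented for illustration):

```python
# Hypothetical delegated-to function used by cases 3 and 4.
def target(x=0, y=0):
    return x + y

def case1_unused(**kwargs):            # 1: **k is never used
    return 42

def case2_as_dict(**kwargs):           # 2: **k is used as a real dictionary
    mode = kwargs.pop('mode', 'fast')
    return mode, sorted(kwargs)

def case3_pass_through(**kwargs):      # 3: **k is forwarded unchanged
    return target(**kwargs)

def case4_extra_args(**kwargs):        # 4: forwarded plus extra arguments
    return target(x=1, **kwargs)

class Case5:                           # 5: self.__dict__.update(k)
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)
```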

Case 1 is trivial and works well already with Any or object. Case 2 is the hard one (when the dictionary is heterogeneous, which appears to be the most frequent scenario, although I didn't collect that data). Case 5 could be a verification target for a type checker but seems quite unusual. My proposal would cover case 3, and with @JukkaL addition, a part of case 4.

So "hard" cases would be reduced to about 20% of the def statements with **kwargs according to this semi-scientific experiment :) .

From what I saw of case 4, I actually think a lot of those could be covered automatically. The example that @JukkaL posted is not very common. The common cases (at least in my sample) are the following two:

def raw(name=..., age=...): ...

@delegate_args(raw)
def cooked(name, **extra):
    raw(name=name.upper(), **extra)
    # here the typechecker could deduce that `**extra` never contains a `name`, because it's a named argument of "cooked"

def raw2(a, b, c, name=..., age=...): ...

@delegate_args(raw2)
def cooked2(a, b, c, **extra):
    raw2(a+1, b, c, **extra)
    # here the typechecker could deduce that `**extra` never contains `a`, `b`, or `c`, because they are named arguments of `cooked2`

So the except=... proposal is helpful, but only in a handful of cases (I prefer the automatic deduction, which also avoids the clash with the except keyword). There's also this case (which I included in group 2) that could benefit from a similar proposal:

@delegate_args(raw, include=['foo'])
def cooked(**extra):
    foo = extra.pop('foo', SOME_DEFAULT)
    do_something_with(foo)
    raw(**extra)

I would have declared foo as an explicit argument, and perhaps that's what we should recommend, but I'm a bit surprised at how relatively common this pattern is.

@JukkaL (Contributor) commented Aug 31, 2016

@dmoisset Thanks for the careful analysis! Having data like this makes decisions much easier. So the conclusion seems to be that this would be useful even without support for dict-as-struct. It might be useful for us to run a similar analysis against Dropbox code to get another data point.

@dfee commented Jun 13, 2018

I came across this issue, and wanted to present my solution to the problem:

import forge

def raw(name='he', age=42):
    return f'{name} is {age}'

@forge.compose(
    forge.copy(raw, exclude=('name',)),
    forge.insert((forge.arg('first_name'), forge.arg('last_name')), index=0),
)
def cooked(first_name, last_name, **extras):
    extras['name'] = first_name + ' ' + last_name
    return forge.callwith(raw, extras)

(and some tests that validate that it works as expected)

import inspect
assert repr(inspect.signature(cooked)) == '<Signature (first_name, last_name, age=42)>'
assert raw(age=100)  == 'he is 100'
assert raw('Guido', age=42) == 'Guido is 42'
assert cooked('Guido', 'VR', 42) == 'Guido VR is 42'

try:
    cooked(name='Guido')  # Error
except TypeError as exc:
    assert exc.args[0] == "cooked() missing a required argument: 'first_name'"

forge is a package I recently released that allows for signature revision (add, remove, modify parameters as well as convert or validate argument values, etc.).

I'm curious as to whether a mypy plugin could be written that would allow for static analysis.

@alexmojaki

This is quite the mouthful:

@forge.compose(
    forge.copy(raw, exclude=('name',)),
    forge.insert((forge.arg('first_name'), forge.arg('last_name')), index=0),
)
def cooked(first_name, last_name, **extras):

How about if forge.copy just didn't interfere with arguments of the function it's decorating that don't have * or **? In other words, the forge.insert could be automatic. If that somehow turns out to be a problem in some cases, it could be turned off with a keyword argument.

I don't understand the need for this:

    extras['name'] = first_name + ' ' + last_name
    return forge.callwith(raw, extras)

Why not raw(name=first_name + ' ' + last_name, **extras)?

@dfee commented Jun 13, 2018

So forge can do more than solve the problem discussed here. You bring up an interesting point, though: is this a common-enough problem that a canned decorator should be provided? I'll share a snippet in a little bit.

The reason for callwith is that reconstituting arguments (especially the ordering of positional and keyword parameters) is a non-trivial problem:

import forge

def func(a, b, c, d=4, e=5, f=6, *args):
    return (a, b, c, d, e, f, args)

@forge.sign(
    forge.arg('a', default=1),
    forge.arg('b', default=2),
    forge.arg('c', default=3),
    *forge.args,
)
def func2(*args, **kwargs):
    return forge.callwith(func, kwargs, args)

assert forge.repr_callable(func2) == 'func2(a=1, b=2, c=3, *args)'
assert func2(10, 20, 30, 'a', 'b', 'c') == (10, 20, 30, 4, 5, 6, ('a', 'b', 'c'))

The alternative to that is manual interpolation of argument values, which is just as big a problem:

import forge

def func(a, b, c, d=4, e=5, f=6, *args):
    return (a, b, c, d, e, f, args)

@forge.sign(
    forge.arg('a', default=1),
    forge.arg('b', default=2),
    forge.arg('c', default=3),
    *forge.args,
)
def func2(*args, **kwargs):
    return func(
        kwargs['a'],
        kwargs['b'],
        kwargs['c'],
        4,
        5,
        6,
        *args,
    )

assert forge.repr_callable(func2) == 'func2(a=1, b=2, c=3, *args)'
assert func2(10, 20, 30, 'a', 'b', 'c') == (10, 20, 30, 4, 5, 6, ('a', 'b', 'c'))

I explain that further in the docs.

@alexmojaki

OK, that's cool, but:

  1. Anyone who writes
def func(a, b, c, d=4, e=5, f=6, *args):

instead of

def func(a, b, c, *args, d=4, e=5, f=6):

deserves to be slapped, and

  2. If I understand correctly, the cooked/raw example can work fine without callwith?

@dfee commented Jun 13, 2018

Both are correct :)

@dfee commented Jun 14, 2018

Alright, I got a minute to draft out an extend revision:

import functools
import inspect

import forge


class extend(forge.Revision):
    """
    Extends a function's signature...
    """
    def __init__(self, callable, *, include=None, exclude=None):
        # pylint: disable=W0622, redefined-builtin
        self.callable = callable
        self.include = include
        self.exclude = exclude

    def revise(self, previous):
        extensions = forge.fsignature(self.callable)
        if self.include:
            extensions = list(forge.findparam(extensions, self.include))
        elif self.exclude:
            extensions = [
                param for param in extensions
                if param not in forge.findparam(extensions, self.exclude)
            ]

        params = [
            param for param in previous
            if param.kind is not forge.FParameter.VAR_KEYWORD
        ] + list(extensions)

        return forge.FSignature(params)

Usage is straightforward:

def raw(name='he', age=42):
    return f'{name} is {age}'

@extend(raw, exclude=('name',))
def cooked(first_name, last_name, **extras):
    return raw(name=f'{first_name} {last_name}', **extras)

And the tests still pass:

assert repr(inspect.signature(cooked)) == '<Signature (first_name, last_name, age=42)>'
assert raw(age=100)  == 'he is 100'
assert raw('Guido', age=42) == 'Guido is 42'
assert cooked('Guido', 'VR', 42) == 'Guido VR is 42'

try:
    cooked(name='Guido')  # Error
except TypeError as exc:
    assert exc.args[0] == "cooked() missing a required argument: 'first_name'"

The caveats to this approach are that the user must remain wary of the ordering of parameter kinds and of parameters with default values (or use it within a @forge.compose(forge.extend(...), forge.sort()) construct).

@JukkaL (Contributor) commented Sep 6, 2018

python/mypy#5559 has an implementation of the basic proposal with exclude= by @alexmojaki as a mypy extension. Before moving on, I'd like to discuss the design a bit more.

In particular, there's the question of what syntax to use. All examples above use a decorator, but @ilevkivskyi also suggested (in python/mypy#5559) using a TypedDict (another mypy extension) or a protocol-based callable as an annotation. Here's an example using a callable annotation from the above PR:

class DoStuff(Protocol):
    def __call__(self, a: int, b: int = ..., c: int = ...) -> None: ...

f: DoStuff
def f(*args, **kwargs):
    ...

g: DoStuff
def g(*args, **kwargs):
    f(*args, **kwargs)

This is not optimal if the original function (such as f above) doesn't use *args and/or **kwargs for all arguments, since then we'd need to duplicate the argument names and types, even though that's what the proposal is trying to avoid. So there would arguably only be a benefit if there are two or more functions that delegate to f.

Here's another idea (which will likely be harder to implement):

def f(x: int, y: str = '') -> None:
    ...

g: TypeOf[f]
def g(*args, **kwargs):
    ...

We'd introduce a TypeOf[x] type operator that is equivalent to the type of the expression within square brackets. It could potentially also be used for other things, such as callbacks, though these use cases may be pretty marginal:

def default(x: int, y: str = '') -> None:
    ...

def do_stuff(cb: TypeOf[default] = default) -> int:
    ...

This would only cover use cases where the signatures of the two functions are identical.

Even if we continue with the decorator proposal, we don't have agreement on what to call it. The ideas above aren't quite self-explanatory enough, in my opinion. The feature is not very widely useful, so I feel we should try to make the name very clear and explicit. Here are a bunch of random ideas (using a real example from subprocess):

@copy_signature(Popen)
def call(*popenargs, timeout: Optional[float] = None, **kwargs): ...

@inherit_signature(Popen)
def call(*popenargs, timeout: Optional[float] = None, **kwargs): ...

@with_signature(Popen)
def call(*popenargs, timeout: Optional[float] = None, **kwargs): ...

@use_signature(Popen)
def call(*popenargs, timeout: Optional[float] = None, **kwargs): ...

@use_signature_from(Popen)
def call(*popenargs, timeout: Optional[float] = None, **kwargs): ...

@apply_signature(Popen)
def call(*popenargs, timeout: Optional[float] = None, **kwargs): ...

Here my rationale is that the decorator actually does no delegation -- the assumption is that the function delegates to another function, but that's not actually enforced by the feature, I assume. So the effect is to take the signature of another function and apply it to the decorated function.

@ilevkivskyi (Member)

Here are a few more random ideas/comments:

  • Instead of TypeOf[f], which may give the impression that it applies to any other object, I would propose allowing a single-argument Callable like this:

    def f(x: int, y: str = '') -> None:
        ...
    g: Callable[f]
    def g(*args, **kwargs):
        ...
    def do_stuff(cb: Callable[f] = f) -> int:
        ...
  • ...protocol-based callable as an annotation...
    So there would arguably only be a benefit if there are two or more functions that delegate to f.

    Yes, the same also applies to using TypedDict to annotate **kwargs. Allowing TypedDict in annotations has its own use case ("Allow using TypedDict for more precise typing of **kwds", mypy#4441), so it is probably a bit orthogonal here.

  • If we go with the decorator approach, maybe we can again use the (now stalled) @declared_type? (I am not sure I like this, however.)

My preference would be probably to go with g: Callable[f].

@dfee commented Sep 7, 2018

That's great for copying a signature, but it does nothing for the just-as-common, if not more common, case where you want to mutate the parameters: #270 (comment)

@alexmojaki

A decorator allows performing actions at runtime such as setting the __signature__ attribute. This cannot be done with an annotation. To me this is a complete dealbreaker for using an annotation. I could elaborate some reasons for this, but do I need to? Does anyone feel that runtime signatures aren't such an important feature for this proposal?

@gvanrossum (Member)

Yes, please explain your use cases and spare us further rhetoric (words like "complete dealbreaker" and "but do I need to?").

@alexmojaki

Sorry, it was a genuine question, I didn't want to waste my time or anyone else's preaching to the choir.

Setting __signature__ on the decorated function means that inspect.signature and many of the nice features that follow (.parameters, .bind, .bind_partial, etc.) become automatically available. I don't know exactly what arguments led to the proposal and acceptance of these features in PEP 362 (Function Signature Object), but it seems to me that all or at least most of those same arguments should apply here. If a user has any interest in inspecting a 'normal' function such as raw, they would be equally interested in inspecting a delegating function such as cooked, and would be disappointed if they didn't get the same quality of information back.

This goes well beyond specific obscure use cases where a programmer wants to do some clever introspection for their own application. Here are some ways in which __signature__ can affect the experience of an average Python programmer trying to understand or use a function, depending on which tools they use.

In PyCharm (apparently the most popular editor for Python), a keyboard shortcut shows the parameters of a function for which the user is currently writing a call. Personally, I use this feature often, and it always annoys me when I get back a meaningless *args, **kwargs:

[screenshot: PyCharm parameter popup showing only *args, **kwargs]

In the console, setting __signature__ fixes this problem:

[screenshot: PyCharm console showing the full parameter list after setting __signature__]

In Jupyter notebooks, the most common editor for scientific developers after PyCharm (same survey above), the signature is used for parameter autocompletion:

[screenshot: Jupyter autocompleting the delegated parameter names]

And of course, there's the builtin help:

Help on function cooked:

cooked(name='me', age=42)
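The runtime mechanism being argued for can be sketched as a hypothetical delegate decorator that simply sets __signature__ (decorator name invented for illustration):

```python
import inspect

def raw(name='me', age=42):
    return f'{name} is {age}'

# Hypothetical `delegate` decorator sketching only the runtime side of the
# proposal: copy the target's signature so inspect.signature(), help(), and
# console tooling report real parameters instead of *args, **kwargs.
def delegate(target):
    def decorator(func):
        func.__signature__ = inspect.signature(target)
        return func
    return decorator

@delegate(raw)
def cooked(*args, **kwargs):
    return raw(*args, **kwargs)

print(inspect.signature(cooked))  # (name='me', age=42)
```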

@chadrik (Contributor) commented Sep 17, 2018

In PyCharm (apparently the most popular editor for Python), a keyboard shortcut shows the parameters of a function for which the user is currently writing a call. Personally, I use this feature often, and it always annoys me when I get back a meaningless *args, **kwargs:

Note that PyCharm's editor uses static analysis (like mypy), so setting __signature__ at runtime will have no effect on its completion, though it very well might in its interpreter/console (I can't say, as I don't use this feature). For the editor at least, PyCharm will need to add explicit support for whatever convention is established as a result of this discussion.

@alexmojaki

Yes, I did specifically mention the console. And seeing how much static analysis PyCharm has already implemented, I think it's most likely that they will implement this feature too. Personally I want this feature in the standard library precisely so that I can make use of it in PyCharm, so if they don't implement it, I might even do it myself. In any case none of this affects the decorator vs annotation question.

@alexmojaki

There's already a clear precedent for a @delegate decorator: @wraps from functools. wraps is typically used as follows:

@wraps(f)
def wrapper(*args, **kwargs):
    # some other stuff
    return f(*args, **kwargs)

where f has a specific, 'concrete' signature. This is exactly how the typical use case of a delegate decorator would look, the only difference being the word wraps. Of course, the context is different, namely that the code above for wraps would be inside another function which is a decorator, but the similarity is still there. Making that connection may make it easier for people familiar with wraps to understand delegate and to remember how to use it.

Based on this similarity, I think it makes sense for the two decorators to behave similarly in other ways. @wraps sets __wrapped__, @delegate can set __delegate__ or some similar attribute pointing to the base function. inspect.signature follows chains of __wrapped__ by default, but this can be switched off. The same can be done for @delegate, and it wouldn't actually set __signature__ as I've previously suggested.

typing.get_type_hints can also use information provided by delegate to produce meaningful type hints based on the base function, just as it does with wraps. Whether delegate actually sets __annotations__ or get_type_hints just uses __delegate__ can be discussed.

__delegate__ may have other uses too. One idea that comes to mind now is that automatically generated HTML documentation could include a hyperlink to the base function.

Beyond that, attaching information at runtime may have uses that none of us think of, maybe even uses that none of us could think of because the relevant Python features don't exist yet. Using an annotation is a decision that would be messy to reverse and effectively rules out these possibilities.

In terms of readability, another problem I have with using an annotation is that annotations are usually (if not always) an absolute declaration of the type of the annotated object, without taking the annotated object into account. If I see:

cooked: TypeOf[raw]
def cooked(foo, *args, **kwargs):
    raw(*args, **kwargs)

the impression given by cooked: TypeOf[raw] is that cooked has the exact type of raw, i.e. the same signature, when in fact cooked has a slightly different signature because of the foo parameter. Certainly the name TypeOf is partly responsible for this, but it's also just the nature of annotations.

@gvanrossum (Member)

I just want to note that you seem to be focused primarily on runtime behavior, while others on this thread are focused primarily on static checkers (which operate without running or importing the code they are checking).

@alexmojaki

I'm fine with static analysis being the priority. I even wrote the PR for that. We can defer actually making decisions about or implementing the runtime stuff until much later. But I think it should be possible to implement those features eventually, and it essentially won't be if we use an annotation. All of this is just an argument in favour of using a decorator instead of an annotation, in contrast to @ilevkivskyi's stated preference for an annotation.

@gvanrossum (Member)

I think there's still a misunderstanding though. Annotations are also introspectable at runtime (typically through some __annotations__ attribute somewhere).

@alexmojaki

The suggestion is to use a variable annotation, which is the only possibility I see. Given a function, finding a corresponding variable annotation is messy at best, and AFAICT impossible if the function is locally defined. And since Python versions before 3.6 have to mimic variable annotations using comments, it's definitely impossible in all cases to attach runtime information for those versions. Of course this feature won't be directly available in those versions, but it could easily be available in a backport.

@JukkaL (Contributor) commented Sep 24, 2018

I prefer using a decorator over a variable annotation, mainly because it allows some differences in the signatures. Based on analysis by @dmoisset above, it's pretty common that some arguments are different, or *args are not passed, etc. It's much easier to cover these cases with the decorator-based syntax. Maybe further analysis would be helpful here, but I'm pretty much convinced already that this is important. The annotation-based syntax only makes sense to me if the signatures are identical.

Here are some potential signature differences that the decorator-based approach can support:

  • The caller takes an extra argument that the callee shouldn't accept.
  • The caller does not accept some argument accepted by the callee (it may be provided explicitly in the call).
  • The caller only accepts **kwargs -- no *args -- but the callee also accepts some positional arguments.

I think that both approaches can support runtime introspection (in 3.6 and later), so it's probably not an important factor.

The similarity to @wraps noticed by @alexmojaki suggests some additional possible names for the decorator:

  • @inherits_signature(f)
  • @delegates_signature(f)
  • @reuses_signature(f)
  • @extends_signature(f)

More random ideas:

  • @fallback_signature(f)
  • @signature_fallback(f)

@rmorshea commented Dec 4, 2019

@alexmojaki with respect to checking function bodies it would, in principle, be possible to determine what *args and **kwargs are with more specificity:

def parent(x: int, y: int) -> int: ...

def child(z: int, **kwargs: SameAs[parent]) -> int:
    reveal_type(kwargs)  # Revealed type is TypedDict(x=int, y=int)

But again, my use cases for this are primarily concerned with having the correct signature for the wrapping function so I'd have to agree that this isn't a priority. Having more detailed type checking information in the wrapping function's body would be more of a nicety and could be added later since most of the time, kwargs is simply passed from child to parent without modification.

@rmorshea commented Dec 9, 2019

@JukkaL and @alexmojaki, even though @ilevkivskyi seems to have excluded himself from the conversation, I think the idea of indicating signature copying via a type hint could be aesthetically acceptable. Such a type hint might be named something like:

  • SameAs
  • Same
  • CopyFrom
  • Copy
  • Inherit
  • InheritFrom

And would probably look a bit like this in practice:

class Parent:
    def method(self, x: int, y: int) -> int: ...

class Child(Parent):
    def method(self, z: int, *args: SameAs[Parent]) -> SameAs: ...
    # or
    def method(self, z: int, **kwargs: SameAs[Parent]) -> int: ...
    # or
    def method(self, z: int, *args: SameAs[Parent], **kwargs: SameAs[Parent]) -> int: ...

Implementation Concern

The ability to copy *args, **kwargs, and return separately might be more complicated to implement:

def parent(x: int, *, y: int = 0) -> int: ...

def child(z: int, *args: SameAs[parent]) -> int: ...  # SameAs should only copy `x` arg

Edge Case Benefits?

This may have some advantages when wrapping functions in slightly more complex ways. For example, this could allow you to delegate arguments to two different functions, or delegate arguments from one or more sources:

def f(*args: int) -> int: ...
def g(**kwargs: int) -> int: ...

def wrapper(*args: SameAs[f], **kwargs: SameAs[g]) -> int:
    return f(*args) + g(**kwargs)

T =  # not sure what to put here

def f(*args: int, **kwargs: int) -> int: ...

def chain_outer(*args: SameAs[f]) -> T:
    def chain_inner(**kwargs: SameAs[f]) -> SameAs[f]:
        return f(*args, **kwargs)
    return chain_inner

@rmorshea commented Jan 9, 2020

@msullivan PEP-612 appears related to this issue.

@lucmos commented Jan 22, 2020

Sorry to intrude on the conversation. I found this thread while researching how to solve a minor issue; I think it is common, so I'd like to share my use case.

Use case

I want to type the function returned by functools.partial.

More precisely, I'd like to be able to get some form of parameter hinting from PyCharm (from what I understood, that is likely to be implemented if this proposal is accepted).

Sample code

I have a register decorator that I use to register function or class names, so that I can retrieve them by name. I'd like to organize the register into categories.

One possible implementation of a decorator to register objects of the "model category":

register_model = functools.partial(_register_decorator_factory, category="model_class")

However, the resulting decorator is not easy to use, because the signature is lost.

i.e. This is the signature of _register_decorator_factory:

def _register_decorator_factory(
    wrapped=None,
    *,
    names: Optional[Union[List[str], str]] = None,
    category: str,
) -> Any:

I would expect the register_model signature to be automatically inferred as:

def register_model(
    wrapped=None,
    *,
    names: Optional[Union[List[str], str]] = None,
) -> Any:

tl;dr

Consider the use case of automatically inferring the signature of functools.partial's output.
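For the runtime half of this, inspect.signature already understands functools.partial objects; a sketch, with the factory simplified for illustration:

```python
import functools
import inspect

# Simplified stand-in for the factory described above.
def _register_decorator_factory(wrapped=None, *, names=None, category=None):
    return wrapped, names, category

register_model = functools.partial(_register_decorator_factory,
                                   category="model_class")

# At runtime, inspect.signature folds the bound arguments of a partial
# into the reported signature, so signature-reading tools see real
# parameter names; what is missing is the same inference in static
# checkers and IDE editors.
print(inspect.signature(register_model))
```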

@rmorshea commented Jan 23, 2020

@lucmos I don't think your use case has been brought up in this thread before, so thanks for sharing! While it seems similar, to my eyes it looks like you need a general solution for currying, so PEP-612 (which will probably close this issue) won't help you as it stands right now.

Doing a bit of digging though, it doesn't seem like anyone's brought up currying before so perhaps this warrants a new issue. TypeScript appears to have some solution to currying using recursive types, so that seems like a promising angle from which to approach the problem.

@gvanrossum (Member)

Now that PEP 612 is accepted, can we close this issue?
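For reference, the decorator-style delegation that PEP 612 enables can be sketched with ParamSpec; the helper name here is invented, and it is a runtime no-op whose annotations carry the signature for checkers:

```python
from typing import Callable, TypeVar

try:
    from typing import ParamSpec  # Python 3.10+
except ImportError:
    from typing_extensions import ParamSpec

P = ParamSpec("P")
R = TypeVar("R")

def raw(name: str = 'me', age: int = 42) -> str:
    return f'{name} is {age}'

# Hypothetical helper: does nothing at runtime, but tells a PEP 612-aware
# checker that the decorated function accepts exactly `target`'s parameters.
def delegates_to(target: Callable[P, R]) -> Callable[[Callable[..., R]], Callable[P, R]]:
    def decorator(func: Callable[..., R]) -> Callable[P, R]:
        return func
    return decorator

@delegates_to(raw)
def cooked(*args, **kwargs):
    return raw(*args, **kwargs)

print(cooked(age=100))  # me is 100
```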

@untitaker commented Aug 15, 2020

👍 Even if it doesn't cover all use cases (no idea whether that's the case; I've just started reading the spec), this thread has become sufficiently long that remaining use cases should probably get new threads.

Is there a ticket that tracks the implementation in mypy I can subscribe to?

@gvanrossum (Member)

python/mypy#9250

@matthewgdv commented Feb 16, 2021

Does PEP 612 offer a solution to the common use case of copying the signature of the superclass method when overriding?

I've read through it, and while the use case of signature-preserving decorators now seems well supported, I can't see any way to indicate 'same as parent' or even 'same as x' where x is another callable. Could anyone tell me if I've missed something glaringly obvious? I found some parts of the PEP a little hard to digest without first seeing more examples of its use in the wild.

Honestly, I really like the use of attribute-access syntax to separate a ParamSpec out into P.args and P.kwargs.

If there is not yet a way to meet this need with PEP 612 I would like to start a new thread suggesting the following (in the spirit of PEP 612):

def raw(some: str, /, signature: int, goes: bool, *, here: dict) -> str:
    ...


def cooked(*args: raw.args, **kwargs: raw.kwargs) -> raw.ret:
    ret = raw(*args, **kwargs)
    ...  # do something with ret before returning it

Allowing func.args, func.kwargs, and (possibly, and subject to a name-change) func.ret as special type hints that are valid for any function.

And for the specific use-case of method overriding:

from typing import Super


class Parent:
    def some_method(self, some: str, /, signature: int, goes: bool, *, here: dict) -> str:
        ...


class Child(Parent):
    def some_method(self, *args: Super.args, **kwargs: Super.kwargs) -> Super.ret:
        ret = super().some_method(*args, **kwargs)
        ...  # do something with ret before returning it

If typing.Super were a special object that signified to type-checkers and IDEs that the parent's signature for that method should be used.

So yeah. Are these use-cases covered? If not, has there been a successor thread to this where I could post this as a suggestion? And if not, would it be okay if I started one?

Cheers

@rmorshea
Copy link

@matthewgdv if all you want to do is copy the signature, this comment describes a way to do that right now: #270 (comment)

That doesn't really help you, though, if you want to actually add arguments (e.g. (new_arg, *args, **kwargs)). In that case, I don't see an obvious way to use ParamSpec to accomplish that.

Perhaps in a future revision the following could be made possible?

P1 = ParamSpec("P1")
P2 = ParamSpec("P2")

def concatenate_functions(f1: P1, f2: P2) -> Concatenate[P1, P2]:
    ...

def f(x: int) -> int: ...
def g(y: int) -> int: ...

h: Callable[[int, int], int] = concatenate_functions(f, g)

There might be a separate issue to track this though - a link here would be great if anyone knows of such an issue.

@USSX-Hares
Copy link

The Problem

As mentioned above, the most common use of **kwargs is passing it on to another function, with or without additional arguments. However, the new PEP 612 ParamSpec does not cover that case, as it provides no syntax for merging keyword parameters from functions defined in different places.

Example:

class A:
    def __init__(self, *, param_1: int, param_2: str, param_3: bool = False):
        pass

class B:
    a: A
    
    def create_a(self, *, log: bool = False, **kwargs):
        self.a = A(**kwargs)
        if log:
            print("Log!")

Here, B.create_a()'s real signature is create_a(self, *, log: bool = False, param_1: int, param_2: str, param_3: bool = False).
I managed to write a simple decorator that merges signatures and overrides the __signature__ attribute, but since this happens exclusively at runtime, neither PyCharm nor mypy can handle it correctly.
Since this is one of the most common usages, and since mypy additionally eats default values and still (Jun 2022) does not support dataclasses, I consider this request not resolved.

If I am wrong somewhere, please correct me.

P.S.

If anybody is interested, I am attaching my implementation of merge_signature.
Usage:

from .merge_sig import merge_signature
class B:
    a: A
    
    @merge_signature([ A.__init__ ])
    def create_a(self, *, log: bool = False, **kwargs):
        ...

file:/home/peter/projects/test/merge-sig.tar.gz

I tried different tools for generating stub files, and none of them succeeded in generating the correct signature (or even just respecting dataclasses).
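
For anyone who cannot open the attachment, a runtime-only decorator in this spirit can be sketched with inspect (my own illustration rather than the attached code; as noted above, static checkers will not see the merged signature):

```python
import inspect

def merge_signature(sources):
    """Replace the wrapped function's **kwargs with the keyword-only
    parameters of every callable in `sources` (runtime-only effect)."""
    def decorator(func):
        sig = inspect.signature(func)
        params = [p for p in sig.parameters.values()
                  if p.kind is not inspect.Parameter.VAR_KEYWORD]
        seen = {p.name for p in params}
        for source in sources:
            for p in inspect.signature(source).parameters.values():
                if p.kind is inspect.Parameter.KEYWORD_ONLY and p.name not in seen:
                    params.append(p)
                    seen.add(p.name)
        func.__signature__ = sig.replace(parameters=params)
        return func
    return decorator

class A:
    def __init__(self, *, param_1: int, param_2: str, param_3: bool = False):
        pass

class B:
    @merge_signature([A.__init__])
    def create_a(self, *, log: bool = False, **kwargs):
        self.a = A(**kwargs)

print(inspect.signature(B.create_a))
# (self, *, log: bool = False, param_1: int, param_2: str, param_3: bool = False)
```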

@ahuang11
Copy link

ahuang11 commented Oct 28, 2022

To your last point, I think there's more than just signature copying. As mentioned in a previous comment, it would also be useful if one could add fixed arguments to test that are not part of f:

def f(x: bool, *extra: int) -> str: ...

@copy_signature(f)
def test(fixed_argument: str, *args, **kwargs): ....

# desired
reveal_type(test)  # Revealed type is 'def (fixed_argument: str, x: bool, *extra: int) -> str'
# actual
reveal_type(test)  # Revealed type is 'def (x: bool, *extra: int) -> str'

I apologize in advance if this has already been solved. I wanted to share what I found with Concatenate and ParamSpec, since it took me a long time to come up with a solution that copies the arguments from internal_f and then adds an a_bool argument:

import functools
from typing import TypeVar, Callable, Any
from typing_extensions import Concatenate, ParamSpec

I = ParamSpec("I")  # internal function
P = ParamSpec("P")  # public function
R = TypeVar("R")  # The return type of the internal function

def copy_inputs(internal_f: Callable[I, R]) -> Callable[[Callable[P, R]], Callable[Concatenate[bool, I], R]]:
    def wrap(public_f: Callable[P, R]) -> Callable[Concatenate[bool, I], R]:
        @functools.wraps(public_f)
        def run(a_bool: bool, /, *args: I.args, **kwargs: I.kwargs) -> R:
            print(f"a_bool is {a_bool}")
            return internal_f(*args, **kwargs)
        return run
    return wrap

def internal_f(a_str: str, a_float: float) -> str:
    return f"a_str is {a_str}, a_float is {a_float}"

@copy_inputs(internal_f)
def public_f(a_bool: bool, *args, **kwargs) -> str:
    ...

public_f(True, "a_string", 42.8)

I am not entirely sure I have everything type annotated correctly, but it does result in this on VSCode:
(screenshot: VS Code showing the combined signature inferred for public_f)

Also, I couldn't figure out how to get the argument named (i.e. rather than bool, it should show as a_bool: bool).

@ringohoffman
Copy link

ringohoffman commented Nov 30, 2022

@ahuang11 It looks like keyword parameter support by typing.Concatenate was postponed: https://peps.python.org/pep-0612/#concatenating-keyword-parameters... you can still accomplish this through a callback Protocol though: https://mypy.readthedocs.io/en/stable/protocols.html#callback-protocols
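
To spell the workaround out, a callback Protocol can name a leading parameter in a way Concatenate cannot (a sketch with invented names):

```python
from typing import Protocol

class PrefixedCall(Protocol):
    # Unlike Concatenate[bool, ...], the Protocol keeps the parameter
    # name `a_bool`, so callers may also pass it by keyword.
    def __call__(self, a_bool: bool, a_str: str, a_float: float) -> str: ...

def implementation(a_bool: bool, a_str: str, a_float: float) -> str:
    return f"{a_bool}, {a_str}, {a_float}"

func: PrefixedCall = implementation  # accepted by type checkers
print(func(True, a_str="x", a_float=1.5))  # True, x, 1.5
```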

Does anyone know if this work is underway or not? I am definitely interested in making use of such a feature.

@Igetin
Copy link

Igetin commented Dec 9, 2022

copying signature for external callers already works.

from typing import Any, Callable, Generic, TypeVar

F = TypeVar('F', bound=Callable[..., Any])

class copy_signature(Generic[F]):
    def __init__(self, target: F) -> None: ...
    def __call__(self, wrapped: Callable[..., Any]) -> F: ...

def f(x: bool, *extra: int) -> str: ...

@copy_signature(f)
def test(*args, **kwargs):
    return f(*args, **kwargs)

reveal_type(test)  # Revealed type is 'def (x: bool, *extra: int) -> str'

The revealed type works as expected, but Mypy still reports the outer **kwargs as missing type annotation.

@matthewgdv
Copy link

matthewgdv commented Dec 9, 2022

@Igetin that's pretty cool, but I think there are two problems with your example.

It's easy to forget when commenting on typing-related threads, but most users probably aren't familiar enough with advanced typing constructs like TypeVars and Generics, or even with how to write their own decorators, to come up with a solution like that. So unless they get lucky while searching Stack Overflow, this very common use-case will go unmet, resulting in badly-typed or untyped code even when the user would have wanted to type it.

The second problem is that even if everyone were to copy your copy_signature recipe, it's a bit 'too cute'. What I mean by that is that it uses metaprogramming that some of the most common type-inference engines in the wild can't understand. For example, if I copy your code as-is into PyCharm and start invoking test, the argument-signature assistance popup shows *args, **kwargs rather than x: bool, *extra: int, the way it does for f.

I think a pretty good compromise solution would be to include either this exact recipe or an equivalent one in the typing module. This turns it into an official feature that nearly all code insight engines will then make sure to support, even if they have to special-case it.

EDIT: I also just realized that using a decorator like that there's no way to indicate that a method should keep the same signature as its closest parent in the MRO without explicitly referencing the parent. I think this is possibly the most common use-case for signature copying.

@ringohoffman
Copy link

@Igetin that's pretty cool, but I think there are two problems with your example.

@matthewgdv that example isn't even functional, so there are definitely at least 3 problems with it 😉

I would like to propose the following implementations for a copy_signature-type decorator:

import functools
from typing import Any, Callable, Concatenate, ParamSpec, TypeVar

P = ParamSpec("P")
T = TypeVar("T")


def copy_function_signature(source: Callable[P, T]) -> Callable[[Callable], Callable[P, T]]:
    def wrapper(target: Callable) -> Callable[P, T]:
        @functools.wraps(source)
        def wrapped(*args: P.args, **kwargs: P.kwargs) -> T:
            return target(*args, **kwargs)

        return wrapped

    return wrapper


def copy_method_signature(source: Callable[Concatenate[Any, P], T]) -> Callable[[Callable], Callable[Concatenate[Any, P], T]]:
    def wrapper(target: Callable) -> Callable[Concatenate[Any, P], T]:
        @functools.wraps(source)
        def wrapped(self, *args: P.args, **kwargs: P.kwargs) -> T:
            return target(self, *args, **kwargs)

        return wrapped

    return wrapper


def f(x: bool, *extra: int) -> str:
    return str(...)


@copy_function_signature(f)
def test(*args, **kwargs):
    return f(*args, **kwargs)


class A:
    def foo(self, x: int, y: int, z: int) -> float:
        return float()


class B:
    @copy_method_signature(A.foo)
    def bar(self, *args, **kwargs):
        print(*args)


print(test(True, 1, 2, 3))  # Ellipsis
B().bar(1, 2, 3)  # 1, 2, 3

reveal_type(test)  # Type of "test" is "(x: bool, *extra: int) -> str"
reveal_type(B.bar)  # Type of "B.bar" is "(Any, x: int, y: int, z: int) -> float"

Obviously we could make use of typing-extensions for Python versions <3.10. Curious to hear everyone's thoughts on this implementation and if there is truly a need for such functionality in typing.

@Igetin
Copy link

Igetin commented Dec 9, 2022

@matthewgdv It wasn’t my example in the first place, it’s quoted from ilevkivskyi earlier in the thread (actually from over three years ago 😄). I just replied about it since it was recommended multiple times in the thread and also on Stack Overflow, but it didn’t seem to work as expected. I’m not very well-versed in the more advanced typing aspects of Python myself.

@Igetin
Copy link

Igetin commented Dec 12, 2022

I would like to propose the following implementations for a copy_signature-type decorator:

@ringohoffman This caused a bunch of warnings when running Mypy in strict mode (via the --strict flag). I have attempted to make an updated version that fixes those warnings:

import functools
from collections.abc import Callable
from typing import Any, Concatenate, ParamSpec, TypeVar, reveal_type

P = ParamSpec("P")
T = TypeVar("T")


def copy_callable_signature(
    source: Callable[P, T]
) -> Callable[[Callable[..., T]], Callable[P, T]]:
    def wrapper(target: Callable[..., T]) -> Callable[P, T]:
        @functools.wraps(source)
        def wrapped(*args: P.args, **kwargs: P.kwargs) -> T:
            return target(*args, **kwargs)

        return wrapped

    return wrapper


def copy_method_signature(
    source: Callable[Concatenate[Any, P], T]
) -> Callable[[Callable[..., T]], Callable[Concatenate[Any, P], T]]:
    def wrapper(target: Callable[..., T]) -> Callable[Concatenate[Any, P], T]:
        @functools.wraps(source)
        def wrapped(self: Any, /, *args: P.args, **kwargs: P.kwargs) -> T:
            return target(self, *args, **kwargs)

        return wrapped

    return wrapper


def f(x: bool, *extra: int) -> str:
    return str(...)


@copy_callable_signature(f)
def test(*args, **kwargs):  # type: ignore[no-untyped-def] # copied signature
    return f(*args, **kwargs)


class A:
    def foo(self, x: int, y: int, z: int) -> float:
        return float()


class B:
    @copy_method_signature(A.foo)
    def bar(self, *args, **kwargs):  # type: ignore[no-untyped-def] # copied signature
        print(*args)


class Person:
    def __init__(self, given_name: str, surname: str):
        self.full_name = given_name + " " + surname

    def __repr__(self) -> str:
        return f"<{self.__class__.__name__}: {self.full_name}>"


@copy_callable_signature(Person)
def wrapper(*args, **kwargs):  # type: ignore[no-untyped-def] # copied signature
    return Person(*args, **kwargs)


print(test(True, 1, 2, 3))  # Ellipsis
B().bar(1, 2, 3)  # 1, 2, 3
print(wrapper("John", "Doe"))  # <Person: John Doe>

reveal_type(test)  # Type of "test" is "(x: bool, *extra: int) -> str"
reveal_type(B.bar)  # Type of "B.bar" is "(Any, x: int, y: int, z: int) -> float"
reveal_type(wrapper)  # Type of "wrapper" is "(given_name: str, surname: str) -> Person"

It runs without errors, and mypy --strict passes successfully. I also added another example copying the call signature of a class.

Interestingly, no-untyped-def errors will be raised for the decorated functions, so I have ignored them in the code. I wonder if this would actually count as a bug in Mypy. Since the revealed type shows that the function arguments and return type are fully typed, there’s no reason to raise an error about the function being untyped, right? 🤔

@Moortiii
Copy link

I've been experimenting with the copy_callable_signature from @Igetin's reply above in combination with Pydantic and the new typing.Self from Python 3.11. In the example below I am trying to create a classmethod that allows us to create a model with synthetic data without instantiating the class. However, the signature copying only works when all arguments are provided to .synthesize. Is there an obvious way to work around this limitation?

import functools
from collections.abc import Callable
from typing import ParamSpec, Self, TypeVar, reveal_type
from pydantic import BaseModel

P = ParamSpec("P")
T = TypeVar("T")


def copy_callable_signature(
    source: Callable[P, T]
) -> Callable[[Callable[..., T]], Callable[P, T]]:
    def wrapper(target: Callable[..., T]) -> Callable[P, T]:
        @functools.wraps(source)
        def wrapped(*args: P.args, **kwargs: P.kwargs) -> T:
            return target(*args, **kwargs)

        return wrapped

    return wrapper


class Message(BaseModel):
    author: str
    content: str

    @classmethod
    @copy_callable_signature(Self)
    def synthesize(cls, **kwargs):
        kwargs["author"] = kwargs.get("author", "Default author")
        kwargs["content"] = kwargs.get("content", "Default content")
        return cls(**kwargs)


message_1 = Message.synthesize(author="Moortiii") # Note that 'content' is missing
reveal_type(message_1) # Type of "message_1" is "Any"
reveal_type(message_1.synthesize) # Type of "message_1.synthesize" is "Any"

message_2 = Message.synthesize(author="Moortiii", content="Example message")
reveal_type(message_2) # Type of "message_2" is "Message"
reveal_type(message_2.synthesize) # Type of "message_2.synthesize" is "(*, author: str, content: str) -> Message"

print(message_1) # author='Moortiii' content='Default content'
print(message_2) # author='Moortiii' content='Example message'

@rmorshea
Copy link

@Moortiii, it looks like you're not just trying to copy the signature. Instead what you need is to be able to modify the signature since some parameters in your new synthesize callable are optional even though they'd be required in the original. Unfortunately, with the typing tools we have today, this is not possible.

Also, on an unrelated note, kwargs["author"] = kwargs.get("author", "Default author") can be shortened to kwargs.setdefault("author", "Default author").

@Moortiii
Copy link

Moortiii commented May 10, 2023

Ah yes, you're right. In the example above the type hint for synthesize would need to be (*, author: Optional[str] = None, content: Optional[str] = None) -> Message in order to accurately represent the function. It's a shame to hear that this isn't currently possible. Are you aware of any ongoing efforts that may introduce this in the future? I've been unable to find any PEPs directly related to modifying the types of function arguments, rather than just the number of arguments.

Thinking out loud, I guess that even if we had a way to modify the arguments such that they all became Optional, this on its own would be confusing without a default value being specified in the type hint (which in our case we modify on kwargs directly, making it even more confusing). I imagine it would be difficult to provide an interface where users could specify an individual default value for each argument for usecases with different requirements than my own. Regardless, I feel like we're quickly stepping into the territory of the Sentinel PEP with regards to the default value, since None isn't always viable as a sentinel value either.

Indeed, kwargs.setdefault is a much nicer approach than what I had. Thanks for the heads up!
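
A low-tech alternative, at the cost of repeating the field names, is to write the modified signature out by hand instead of copying it (a plain-Python sketch rather than a Pydantic model):

```python
from typing import Optional

class Message:
    def __init__(self, author: str, content: str) -> None:
        self.author = author
        self.content = content

    @classmethod
    def synthesize(cls, author: Optional[str] = None,
                   content: Optional[str] = None) -> "Message":
        # All fields are optional here, even though __init__ requires them.
        return cls(author if author is not None else "Default author",
                   content if content is not None else "Default content")

print(Message.synthesize(author="Moortiii").content)  # Default content
```

This loses the DRY-ness that signature copying is after, but it keeps every type checker and IDE fully informed.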

@rmorshea
Copy link

Hypothetically, depending on how keyword argument concatenation worked, that could be an avenue towards allowing for this. But I am not aware of any work being done in this regard.
