Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarify what @overload means #253

Closed
gvanrossum opened this issue Jul 26, 2016 · 30 comments
Closed

Clarify what @overload means #253

gvanrossum opened this issue Jul 26, 2016 · 30 comments
Labels
topic: feature Discussions about new features for Python's type annotations

Comments

@gvanrossum
Copy link
Member

In mypy (at least) overlapping @overload variants are not an error as long as the return types match. Examples where this has been reported or discussed include:

There are probably more, but the point should be clear. We should either decide that overloads are tried in order of occurrence until a match is found, or that mypy's current philosophy is correct. Once we have decided we should update PEP 484 to clarify the desired behavior and explain the motivation. (And fix mypy, if needed.)

I'm not sure that it's always possible to have one unique match; e.g. in python/mypy#1322 the problem is that we're calling f(*some_list) and this is deemed a match for both variants. Mypy then cannot deduce a precise return type and falls back to Any.

I could also imagine that unions or type variables could make it hard to unambiguously pick a variant.

I'd also like to point out that the algorithm chosen by a type checker doesn't have to be the same as that chosen at run time -- at run time the overload variants are irrelevant and a single implementation does all the work using e.g. *args and isinstance() (or maybe using functools.singledispatch). But mypy's approach to overlapping is clearly causing a lot of confusion and it would be good to discuss this and write up in the PEP what a type checker should do.

@JukkaL, anything else we should consider here? Do you have a preference?

@matthiaskramm, is this relevant to pytype?

@vlasovskikh, any feedback from the PyCharm team?

@vlasovskikh
Copy link
Member

We use the first match for these cases in PyCharm and we don't warn about overlapping signatures. If no matches are found, we infer a "weak" union of all the return types (we don't check for isinstance assertions on this union later).

@matthiaskramm
Copy link
Contributor

pytype tries @overload methods in order of occurrence and picks the first one that matches.

We make heavy use of that in (our version of) __builtin__.pyi, haven't seen it much elsewhere.

@matthiaskramm
Copy link
Contributor

Btw. I'm not sure I understand the current mypy approach.

In the mypy's / typeshed's 2.7/__builtin__.pyi, we have:

@overload
def divmod(a: int, b: int) -> Tuple[int, int]: ...
@overload
def divmod(a: float, b: float) -> Tuple[float, float]: ...

Given that the second function is effectively divmod(a: Union[int, float], b: Union[int, float]) (see python/typeshed#270), aren't those signatures overlapping, with different return types, and mypy should report an error?

@gvanrossum
Copy link
Member Author

gvanrossum commented Jul 26, 2016 via email

@JukkaL
Copy link
Contributor

JukkaL commented Jul 26, 2016

I wrote about how mypy deals with overloads (python/mypy#1941 (comment)). Copied here for convenience:

Mypy lets you vary the return type even if multiple variants match if your signatures are ordered from narrower to more general. For example, this should be fine:

@overload
def f(x: int) -> str: ...
@overload
def f(x: object) -> object: ...

f(1)  # this is str
f('x')  # this is object
x = 1  # type: object
f(x)  # this is object, which is okay since str is a subclass of object

This would be a problem, though:

@overload
def f(x: int) -> object: ...
@overload
def f(x: object) -> int: ...  # bad

def g(x: object) -> int:
    return f(x) + 3   # the runtime type of x may be int and return value could be anything

g(1)
f(1) + 3   # an error

@JukkaL
Copy link
Contributor

JukkaL commented Jul 26, 2016

Also, if multiple, non-overlapping overload variants can match due to Any types in arguments, mypy will infer Any as the return type. Mypy doesn't have weak unions.

@matthiaskramm
Copy link
Contributor

I like the narrow-to-general (topological) ordering approach. That seems to be the primary use case for having overlapping signatures.

(If you have multiple functions matching and an Any is to blame, pytype falls back to return type Any, same as mypy.)

@JukkaL
Copy link
Contributor

JukkaL commented Jul 26, 2016

Another thing to consider is the exact rules for deciding whether signatures are overlapping. Let's start with single-argument functions like this:

@overload
def f(x: A) -> int: ...
@overload
def f(x: B) -> str: ...

Here are some rules that mypy follows:

  • If A is a subclass of B or vice versa they are overlapping. Otherwise if they are classes they aren't overlapping.
  • Type parameters of generic types don't affect the overlapping check. List[int] is overlapping with List[str] (rationale: empty list).
  • Type variables overlap like their upper bounds.
  • Union types A and B overlap if any item from A is overlapping with some item from B.
  • Any overlaps with everything.

These things are less clear:

  • Does Tuple[int] overlap with Tuple[int, str]?
  • What does Callable[...] overlap with? I think that mypy currently thinks that it can overlap with anything, since a subclass could define __call__, but this is unclear.

@matthiaskramm
Copy link
Contributor

  • Type parameters of generic types don't affect the overlapping check. List[int] is overlapping with List[str] (rationale: empty list).

pytype models empty lists explicitly as List[nothing], so for us, List[int] and List[str] can be treated as non-empty and not overlapping.

@JukkaL
Copy link
Contributor

JukkaL commented Jul 27, 2016

How does pytype treat code like this:

def f1(n: int) -> List[int]:  # should this return Union[List[int], List[nothing]]?
    return list(range(n))

def f2(n: int) -> List[str]:
    return list(str(x) for x in range(n))

@overload
def g(a: List[str]) -> str: ...
@overload
def g(a: List[int]) -> int: ...

n = int(open('n.txt').read())  # assume that this is 0
g(f1(n))  # int or str?
g(f2(n))  # int or str?

It seems to me that it's generally impossible to infer when a list could be empty if using PEP 484 annotations.

@matthiaskramm
Copy link
Contributor

We treat nothing as ⊥. So List[int] ∪ List[⊥]List[int ∪ ⊥] = List[int].

Whenever the type parameter of a container "touches" (is joined with) another type, the type parameter will have that type. Only very few special operations (like dict.clear) can set the type parameter back to nothing.

So in the above example, g(f1(n)) will be int, and g(f2(n)) will be str. (And without return type annotations, f1() will be inferred to have return type List[int], and f2() will be inferred to have return type List[str])

I think the real point of contention is

g([])

Should this match the first signature, or generate an error? pytype currently opts for the latter, even if the type parameter for List were covariant. (So that ⊥⊂int, hence List[⊥]List[int], causing the first signature to "match")

@dmoisset
Copy link
Contributor

A possible solution that would be more flexible than the current one but still "symmetrical" (i.e. independent on the order of the overloads and detecting conflicts) would be allow any kind of type definitions, but check the overlap when checking the call. That can also detect the problem of a possible subclass of A and B passed to an overloaded f(x: A) -> int ; f(x: B) -> str which is the current problem with callable.

With this approach, when checking a call, you find all the return types for every matching overload definition, and the return type is the narrower one if that exists, and an error otherwise.

@gvanrossum
Copy link
Member Author

The latest suggestion here is that we should improve mypy's implementation of @overload until we like it and then update the PEP. There's some discussion in python/mypy#4159 (comment).

@czhang03
Copy link

So are we settled on going through definition until find a match?

@chadrik
Copy link
Contributor

chadrik commented Nov 28, 2017

So are we settled on going through definition until find a match?

As an end-user, I wholeheartedly endorse this approach. It is intuitive to think of overloads as a declarative form of the isinstance checks used within a function body, so it's natural that they should reflect the order found there.

For example, the following seems intuitive to me, but it results in an overlap error:

from typing import overload, Any

class Foo:
    pass

class Bar:
    pass

@overload
def func(a: Foo, b: Bar) -> Bar: ...
@overload
def func(a: Foo, b: Any) -> Foo: ...
@overload
def func(a: Any, b: Any) -> None: ...

def func(a, b):
    if isinstance(a, Foo) and isinstance(b, Bar):
        return b
    elif isinstance(a, Foo):
        return a
    else:
        return None

func(Foo(), 1)

Certainly there are complex considerations which are over my head, but I think that above all the new design should aim to be intuitive -- allowing a developer to approach overloads as a transcription of isinstance/issubclass checks would achieve that goal. If you want users to use overloads without giving up in frustration, the rules that govern them should be digestible within a few short bullet points.

@sproshev
Copy link

@Michael0x2a Thank you for the clarification! I agree.

Michael0x2a added a commit to Michael0x2a/mypy that referenced this issue May 19, 2018
This pull request:

1.  Modifies how mypy handles overloads to match the [proposal][0]
    I made in the typing repo.

2.  Starts removing type erasure from overload checks.

3.  Adds support for basic union math.

4.  Makes overloads respect keyword-only args

This pull request does NOT implement the following features:

1.  Special-casing descriptors

2.  Various improvements and refactorings for operator methods
    (e.g. `__add__`, `__radd__`, etc...)

3.  Detecting partially overlapping arguments. Note: I think initially
    we were thinking only unions can be partially overlapping but on
    reflection, I think tuples and typevars could also be partially
    overlapping. For example:

        @overload
        def f(x: Tuple[int, ...]) -> str: ...
        @overload
        def f(x: Tuple[int, int]) -> int: ...

        T = TypeVar('T', A, B)
        S = TypeVar('S', B, C)

        @overload
        def g(x: T) -> int: ...
        @overload
        def g(x: S) -> str: ...

4.  Detecting "overlapping argument counts". For example, we should
    flag the following as an error since the first alternative could
    potentially overlap with the second.

        @overload
        def f(*args: int) -> int: ...
        @overload
        def f(x: int, y: int, z: int) -> str: ...

5.  The "is-more-precise" relation. It's working in most normal cases,
    but it does contain a few bugs, mostly relating to typevars.
    For example, this is currently *not* flagged as an error, even
    though it should be:

        class Wrapper(Generic[T]):
            @overload
            def f(self, x: int) -> int: ...
            @overload
            def f(self, x: T) -> str: ...

    (This PR does the right thing if 'T' isn't bound to a containing
    class though:)

        class Wrapper:
            @overload
            def f(self, x: int, y: int) -> int: ...
            @overload
            def f(self, x: T, y: T) -> str: ...

    Currently, what I'm doing is using the existing `is_more_precise`
    method, which calls `is_proper_subtype`. To fix this, I think I'll
    either need to (a) just rewrite that method to do what I want with
    TypeVars or (b) find a way of "unbinding" methods from their class
    and force the two methods to unify their typevars before running the
    `is_proper_subtype` check.

The plan is to address these 5 TODOs in future pull requests. Items 1
and 2 are basically orthogonal to the overloads overhaul; items 3, 4,
and 5 basically boil down to finding ways to teach mypy to detect if one
thing is *potentially* compatible with another.

For example, mypy contains code to tell if one type is *definitely* a
subtype of another; fixing items 3 and 5 involve writing code to check
if a type is *potentially* a subtype of another.

  [0]: python/typing#253
Michael0x2a added a commit to Michael0x2a/mypy that referenced this issue May 19, 2018
This pull request:

1.  Modifies how mypy handles overloads to match the [proposal][0]
    I made in the typing repo.

2.  Starts removing type erasure from overload checks.

3.  Adds support for basic union math.

4.  Makes overloads respect keyword-only args

This pull request does NOT implement the following features:

1.  Special-casing descriptors

2.  Various improvements and refactorings for operator methods
    (e.g. `__add__`, `__radd__`, etc...)

3.  Detecting partially overlapping arguments. Note: I think initially
    we were thinking only unions can be partially overlapping but on
    reflection, I think tuples and typevars could also be partially
    overlapping. For example:

        @overload
        def f(x: Tuple[int, ...]) -> str: ...
        @overload
        def f(x: Tuple[int, int]) -> int: ...

        T = TypeVar('T', A, B)
        S = TypeVar('S', B, C)

        @overload
        def g(x: T) -> int: ...
        @overload
        def g(x: S) -> str: ...

4.  Detecting "overlapping argument counts". For example, we should
    flag the following as an error since the first alternative could
    potentially overlap with the second.

        @overload
        def f(*args: int) -> int: ...
        @overload
        def f(x: int, y: int, z: int) -> str: ...

5.  The "is-more-precise" relation. It's working in most normal cases,
    but it does contain a few bugs, mostly relating to typevars.
    For example, this is currently *not* flagged as an error, even
    though it should be:

        class Wrapper(Generic[T]):
            @overload
            def f(self, x: int) -> int: ...
            @overload
            def f(self, x: T) -> str: ...

    (This PR does the right thing if 'T' isn't bound to a containing
    class though:)

        class Wrapper:
            @overload
            def f(self, x: int, y: int) -> int: ...
            @overload
            def f(self, x: T, y: T) -> str: ...

    Currently, what I'm doing is using the existing `is_more_precise`
    method, which calls `is_proper_subtype`. To fix this, I think I'll
    either need to (a) just rewrite that method to do what I want with
    TypeVars or (b) find a way of "unbinding" methods from their class
    and force the two methods to unify their typevars before running the
    `is_proper_subtype` check.

The plan is to address these 5 TODOs in future pull requests. Items 1
and 2 are basically orthogonal to the overloads overhaul; items 3, 4,
and 5 basically boil down to finding ways to teach mypy to detect if one
thing is *potentially* compatible with another.

For example, mypy contains code to tell if one type is *definitely* a
subtype of another; fixing items 3 and 5 involve writing code to check
if a type is *potentially* a subtype of another.

  [0]: python/typing#253
gwk pushed a commit to gwk/typeshed that referenced this issue May 29, 2018
This commit reorders any overloads where the first overload was
"shadowing" the second, preventing it from ever being matched by type
checkers that work by selecting the first matching overload alternative.

For example, the first overload alternative below is strictly broader
then the second, preventing it from ever being selected:

    class Parent: pass
    class Child(Parent): pass

    @overload
    def foo(x: *int) -> Parent: ...
    @overload
    def foo(x: int, y: int) -> Child: ...

The correct thing to do is to either delete the second overload or
rearrange them to look like this:

    @overload
    def foo(x: int, y: int) -> Child: ...
    @overload
    def foo(x: *int) -> Parent: ...

Rationale: I'm currently [working on a proposal][0] that would amend
PEP 484 to (a) mandate type checkers check overloads in order and
(b) prohibit overloads where an earlier alternative completely shadows
a later one.

  [0]: python/typing#253 (comment)

This would prohibit overloads that look like the example below, where
the first alternative completely shadows the second.

I figured it would be a good idea to make these changes ahead of time:
if my proposal is accepted, it'd make the transition smoother. If not,
this is hopefully a relatively harmless change.

Note: I think some of these overloads could be simplified (e.g.
`reversed(...)`), but I mostly stuck with rearranging them in case I was
wrong. The only overload I actually changed was `hmac.compare_digest` --
I believe the Python 2 version actually accepts unicode.
facebook-github-bot pushed a commit to facebook/pyre-check that referenced this issue Oct 25, 2018
Summary:
The typechecker needs to ignore the existence of a signature if that signature is overloaded with typing.overload decorated defines. In other words, when selecting signatures, the overloaded define (containing the actual implementation of the function) should never be selected.

typing.overload designates a definition that is purely for the typechecker's benefit, and is ignored at runtime because it is immediately overwritten with a definition that contains the actual implementation. Calling a function decorated with typing.overload throws at runtime:
```
>>> import typing
>>> typing.overload
... def foo(x):
...     return 1
...
>>> foo(1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/fbcode/gcc-5-glibc-2.23/lib/python3.6/typing.py", line 1591, in _overload_dummy
    "You should not call an overloaded function. "
NotImplementedError: You should not call an overloaded function. A series of The controller you requested could not be found. functions outside a stub module should always be followed by an implementation that is not The controller you requested could not be found..
```

Therefore, the purpose of typing.overload defines is simply to clarify the type signature when the return annotation should vary depending on the parameter annotations (but maybe not in a way specifiable with type vars):

```
typing.overload
def foo(x: int) -> str: ...

typing.overload
def foo(x:str) -> int:...

def foo(x: Union[int, str]) -> Union[int, str]
```

Currently, when pyre sees the above set of signatures and then tries to resolve a call to `foo(x)` where `x` is an `int`, we look at all overloads equally and can select the `Union[int, str] -> Union[int, str]` signature and end up assuming the return type of `foo(x)` is `Union[int, str]` rather than the correct and more specific `str`.

If `typing.overload` definitions exist, we should be selecting only from those, and not from the implementation signature.

**NodeAPI Bug (T35334236)**
```
    overload
    staticmethod
    def get(roots):
        # type: (TRoot) -> NodeGetCity[NodeCity]
        pass

    overload  # noqa
    staticmethod
    def get(roots):
        # type: (TRoots) -> NodeGetCity[List[NodeCity]]
        pass

    staticmethod  # noqa
    def get(roots):
        # type: (Union[TRoot, TRoots]) -> NodeGetCity
        """
        Get the node object/s corresponding to the roots.
        param roots: Node|fbid, or iterable<Node|fbid>.
        return A Query object that will return a Node if a Node or NodeId
                was passed as roots otherwise will return a list of Nodes.
        """
        return NodeGetCity(roots, nodeapi_ext.NodeErrorControllerType.ENFORCE)
```

Pyre selects the third method, which has a return type of NodeGetCity with an unknown type parameter (which inherits from awaitable), so awaiting on this returns an unknown type. Pyre *should* realize that the overloaded methods are there to refine the type signature of get depending on the input type. Merging with a pre-existing task to handle overloads this way.

**More context:**
Pep:
https://docs.python.org/3/library/typing.html#typing.overload
Discussion:
python/typing#253
python/mypy#3160
python/typing#14

**Next steps - not urgent (T35517446):**
1. When typechecking a define, if it is decorated with an typing.overload and there does not exist a matching definition *not* decorated with typing.overload. (overload functions throw when called at runtime)
2. When typechecking a define, if there exist other same name defines decorated with typing.overload, throw if overloading functions do not comprehensively cover the type signature of this overloaded function

Reviewed By: dkgi

Differential Revision: D10494018

fbshipit-source-id: 48973008d94cc539198ec13552b8108412413f2a
yedpodtrzitko pushed a commit to yedpodtrzitko/typeshed that referenced this issue Jan 23, 2019
This commit reorders any overloads where the first overload was
"shadowing" the second, preventing it from ever being matched by type
checkers that work by selecting the first matching overload alternative.

For example, the first overload alternative below is strictly broader
then the second, preventing it from ever being selected:

    class Parent: pass
    class Child(Parent): pass

    @overload
    def foo(x: *int) -> Parent: ...
    @overload
    def foo(x: int, y: int) -> Child: ...

The correct thing to do is to either delete the second overload or
rearrange them to look like this:

    @overload
    def foo(x: int, y: int) -> Child: ...
    @overload
    def foo(x: *int) -> Parent: ...

Rationale: I'm currently [working on a proposal][0] that would amend
PEP 484 to (a) mandate type checkers check overloads in order and
(b) prohibit overloads where an earlier alternative completely shadows
a later one.

  [0]: python/typing#253 (comment)

This would prohibit overloads that look like the example below, where
the first alternative completely shadows the second.

I figured it would be a good idea to make these changes ahead of time:
if my proposal is accepted, it'd make the transition smoother. If not,
this is hopefully a relatively harmless change.

Note: I think some of these overloads could be simplified (e.g.
`reversed(...)`), but I mostly stuck with rearranging them in case I was
wrong. The only overload I actually changed was `hmac.compare_digest` --
I believe the Python 2 version actually accepts unicode.
@saulshanabrook
Copy link

I also have a WIP branch of mypy that implements these changes that I'll link to some time shortly once I get it working.

@Michael0x2a Sorry if I missed this, but what are the next steps on your work? Do you have a branch with changes in it? I am hitting some overload issues and would like to see if your work could solve them.

@ilevkivskyi
Copy link
Member

All the proposed changes landed in mypy several months ago, the next step is to write a mini-PEP for this.

facebook-github-bot pushed a commit to facebook/pyre-check that referenced this issue Apr 3, 2019
Summary:
This isn't actually specified yet, but it looks like the behavior of picking the first overload that matches is going to be specified soon (python/typing#253).
For some reason we thought we needed to reverse the list to make this work, but evidently it wasn't reversed, so we were reversing it.

Reviewed By: sinancepel

Differential Revision: D14717465

fbshipit-source-id: 646dd01ff50b709a3ed55a9905fe39611d935286
@srittau
Copy link
Collaborator

srittau commented Nov 4, 2021

I am closing this for now. From skimming the comments, I believe there needs to be some clarification in documentation on how overloads are handled, but the problem is mostly resolved. Because this issue became overloaded and hard to follow, if there any actionable items still left to do, please open new issues in the appropriate issue trackers.

@srittau srittau closed this as completed Nov 4, 2021
@srittau srittau added topic: feature Discussions about new features for Python's type annotations and removed question labels Nov 4, 2021
@jace
Copy link

jace commented Sep 7, 2022

From @Michael0x2a's explanation:

  1. If a type checker detects it's impossible for an overload alternative to ever be called (e.g. if a call always matches an earlier alternative), it should report an error.
    Basically, overloads need to be ordered from narrow to broad.
  2. Suppose that a call could potentially match two overload alternatives. In that case, the return type of the alternative that appears earlier must be a subtype of the return type of the alternative that appears later.

I was hoping this meant the following would work:

import typing as t
import typing_extensions as te
from collections import abc

@overload
def is_collection(item: str) -> te.Literal[False]:
    ...

@overload
def is_collection(item: bytes) -> te.Literal[False]:
    ...

@overload
def is_collection(item: t.Collection) -> te.Literal[True]:
    ...

def is_collection(item: t.Any) -> bool:
    """Return True if the item is a collection class but not a string."""
    return not isinstance(item, (str, bytes, dict)) and isinstance(item, abc.Collection)

However, mypy thinks definitions 1 and 2 overlap with 3. If I remove the overloads, downstream type narrowing fails:

def is_present(collection: t.Set, item: t.Union[str, t.Collection]) -> bool:
    if is_collection(item):
        # item is still a union here and mypy will flag str as incompatible with `&`
        return bool(collection & item)
    return item in collection

@hauntsaninja
Copy link
Collaborator

hauntsaninja commented Sep 7, 2022

The overlapping overload check is basically a lint, so type ignore it if you know what you're doing. mypy will still do the right thing most of the time, but obviously it'll get tricked by stuff like x: Collection = "asdf"; is_collection(x)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic: feature Discussions about new features for Python's type annotations
Projects
None yet
Development

No branches or pull requests