Moving an overloaded generic function into a class can sometimes introduce "incompatible return type" errors #5510

Closed
Michael0x2a opened this issue Aug 20, 2018 · 0 comments · Fixed by #17392


Consider the following function:

```python
from typing import *

T = TypeVar('T')

@overload
def f1(x: int) -> int: ...
@overload
def f1(x: T) -> T: ...
def f1(*args, **kwargs): pass
```

This program, after #5476 is merged, will type-check with no errors.

But if we move that function into a class like so, it'll produce an "Overloaded function signatures 1 and 2 overlap with incompatible return types" error:

```python
class Wrapper(Generic[T]):
    @overload
    def f2(self, x: int) -> int: ...
    @overload
    def f2(self, x: T) -> T: ...
    def f2(self, *args, **kwargs): pass
```

It seems this happens because mypy essentially infers f2 to be structured like the following:

```python
class Dummy(Generic[T]): pass

@overload
def f3(d: Dummy[T], x: int) -> int: ...
@overload
def f3(d: Dummy[T], x: T) -> T: ...
def f3(*args, **kwargs): pass
```

Basically, it's surprising that moving the function into a class and tweaking the generics in what seems like a harmless way introduces a new error. This is probably a bug?

In any case, it's unclear what the right thing to do here is -- I'm creating this issue so we can keep track of it and link to it in the overload tests.

Also see #5280 (comment) for a little more context.

ilevkivskyi added a commit that referenced this issue Jun 19, 2024
Fixes #5510

OK, so I noticed over the last couple of years that every other time I change something about type variables, a few unsafe overload overlap errors either appear or disappear. At some point I almost stopped looking at them. The problem is that unsafe overload overlap detection for generic callables is currently ad hoc. However, as I started working on it, I discovered a bunch of foundational problems (and a few smaller issues), so I decided to rework the unsafe overload overlap detection.
Here is a detailed summary:

* Currently, return type compatibility is decided using a regular subtype check. Although this is technically correct, in most cases there is nothing wrong if the first overload returns `list[Subtype]` and the second returns `list[Supertype]`. The whole unsafe overload story is about runtime values, not static types, so we should use `is_subset()` instead of `is_subtype()`, which is (IIUC) easy to implement: we simply need to consider all invariant types covariant (see the first sketch after this list).
* The current implementation only checks for overlap between parameters, i.e. whether there are some calls that are valid for both overloads. But we also need to check that those common calls will not always be caught by the first overload. I assume this was not checked because, naively, we already check elsewhere that the first overload doesn't completely shadow the second one. But this is not the same: the first overload may not be more general overall, yet when narrowed to the common calls it may be more general. An example of such a false positive (this is an oversimplified version of what is often used in situations with many optional positional arguments):
  ```python
  @overload
  def foo(x: object) -> object: ...
  @overload
  def foo(x: int = ...) -> int: ...
  ```
* Currently, overlap for generic callables is decided using some weird two-way unification procedure, where we actually keep going (with non-unified variables and/or `<never>`) if the right-to-left unification fails. TBH I never understood this. What we need is to find some set of type variable values that makes the two overloads unsafely overlapping. Constraint inference may be used as a (good) source of such guesses, but it is not decisive in any way. So instead I simply try all combinations of upper bounds and values. The main benefit of this approach is that it is guaranteed to be free of false positives: if the algorithm finds an overlap, it is definitely an overlap. There are, however, false negatives, but we can incrementally tighten those in the future.
* I am making `Any` overlap nothing when considering overloads. Currently it overlaps everything (i.e. it is no different from `object`), but this violates the rule that replacing a precise type with `Any` should not generate an error. IOW, I essentially treat `Any` as "too dynamic or not imported" (see the second sketch after this list).
* I extend the `None` special-casing to be more uniform. Now it essentially only overlaps with explicitly optional types. This is important for descriptor-like signatures.
* Finally, I did a cleanup in `is_overlapping_types()`; most notably, flags were not passed down to various (recursive) helpers, and `ParamSpec`/`Parameters` were treated a bit arbitrarily.
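
A minimal sketch of the first point above (the overloads and names are illustrative, not taken from the PR): under a plain subtype check the return types below are "incompatible" because `list` is invariant, whereas under a subset-style check (invariant type arguments treated covariantly) every runtime value described by `list[int]` is also described by `list[object]`, so no error is needed.

```python
from typing import overload

@overload
def f(x: bool) -> list[int]: ...     # first overload returns list[Subtype]
@overload
def f(x: int) -> list[object]: ...   # second overload returns list[Supertype]
def f(x):                            # bool arguments match both overloads
    ...
```

And a second sketch, assuming this reading of the `Any` bullet is right: the `h` pair below used to be flagged only because `Any` overlapped `int`, even though replacing the precise `str` parameter of `g` with `Any` should not introduce a new error.

```python
from typing import Any, overload

@overload
def g(x: int) -> int: ...
@overload
def g(x: str) -> str: ...   # int and str don't overlap, so no error
def g(x): ...

@overload
def h(x: int) -> int: ...
@overload
def h(x: Any) -> str: ...   # previously flagged: Any overlapped int, and int vs str
def h(x): ...               # are incompatible returns; now Any overlaps nothing
```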

Pros/cons of the outcome:
* Pro: simple (even if not 100% accurate) mental model
* Pro: all major classes of false positives eliminated
* Pro: a couple of minor false negatives fixed
* Con: two new false negatives added, more details below

So here are the two new false negatives, and the motivation for why I think they are OK. The first example is:
```python
T = TypeVar("T")

@overload
def foo(x: str) -> int: ...
@overload
def foo(x: T) -> T: ...
def foo(x):
    if isinstance(x, str):
        return 0
    return x
```
This is obviously unsafe (consider `T = float`), but it is not flagged after this PR. I think this is ~fine for two reasons:
* There is no good alternative for the user; the error is not very actionable. Using types like `(str | T) -> int | T` is a bad idea because unions with type variables are not only imprecise, but also highly problematic for inference.
* The false negative mostly affects unbounded type variables; if a "suspicious" bound is used (like `bound=float` in this example), the error will still be reported (see the sketch below).
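
Presumably the "suspicious bound" variant referred to above looks like the following; with the explicit bound, the overlap is still reported, per the description.

```python
from typing import TypeVar, overload

T = TypeVar("T", bound=float)  # the "suspicious" bound mentioned above

@overload
def foo(x: str) -> int: ...
@overload
def foo(x: T) -> T: ...
def foo(x):
    if isinstance(x, str):
        return 0
    return x
```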

The second example is signatures like:
```python
@overload
def foo(x: str, y: str) -> str: ...
@overload
def foo(*args: str) -> int: ...

@overload
def bar(*, x: str, y: str) -> str: ...
@overload
def bar(**kwds: str) -> int: ...
```
These are also unsafe, because one can fool mypy with `x: tuple[str, ...] = ("x", "y"); foo(*x)` and `x: dict[str, str] = {"x": "x", "y": "y"}; bar(**x)` (spelled out below). I think this is OK because while such unsafe calls are quite rare, this kind of catch-all fallback as the last overload is relatively common.
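
For concreteness, here are the fooling calls from the paragraph above written out as a sketch (they assume the `foo`/`bar` definitions from the previous snippet); the static types do not pin down the tuple length or the dict keys, so the checker and the runtime can disagree about which overload actually applies.

```python
x: tuple[str, ...] = ("x", "y")
foo(*x)   # the tuple type does not fix the number of arguments passed

y: dict[str, str] = {"x": "x", "y": "y"}
bar(**y)  # likewise, the dict type does not fix which keywords are passed
```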

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Shantanu <12621235+hauntsaninja@users.noreply.github.com>