Analysis time binding of restricted Generics #11565

Open
harahu opened this issue Nov 16, 2021 · 2 comments

harahu commented Nov 16, 2021

Feature

A generic type defined over a restricted type variable only makes sense when its type argument is one of the types the variable is restricted to.

It would then make sense to assume that when the user does not bind the generic type (that is, they do not provide a type argument), the implied type is the Union of the possible bound types.
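To make the proposed interpretation concrete, here is a small runtime illustration. It is only a sketch and not something mypy does: the restriction of a type variable is visible at runtime as `TypeVar.__constraints__`, and the implied type would be the union of the generic bound to each constraint. The `implied_union` helper is hypothetical, and static type checkers do not evaluate dynamically built unions like this one.

```python
from typing import AnyStr, Generic, Union, get_args


class Message(Generic[AnyStr]):
    def __init__(self, header: AnyStr, content: AnyStr) -> None:
        self.header = header
        self.content = content


def implied_union(generic_cls, type_var):
    """Hypothetical helper: union of generic_cls bound to each constraint."""
    bound = tuple(generic_cls[t] for t in type_var.__constraints__)
    # Subscripting Union with a tuple is the same as listing its members.
    return Union[bound]


# Runtime-only view of what an unparameterized Message would mean
# under the proposal.
implied = implied_union(Message, AnyStr)
print(get_args(implied))
```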

To Reproduce

from typing import Any
from typing import AnyStr
from typing import Generic


class Message(Generic[AnyStr]):
    header: AnyStr
    content: AnyStr

    def __init__(self, header: AnyStr, content: AnyStr) -> None:
        self.header = header
        self.content = content

# This feels wrong to me, and I'd rather see this as Union[Message[str], Message[bytes]]
m1: Message = Message(header="foo", content="bar")
reveal_type(m1)  # Message[Any]...

# This is perhaps ok, since the user explicitly passed the `Any` parameter. But if this behavior is kept, 
# one shouldn't be forced to provide a type arg to restricted generics in strict mode, since that 
# would imply having to enumerate the restriction to get the same behavior as above.
m2: Message[Any] = Message(header="foo", content="bar")
reveal_type(m2)  # Message[Any]...

Pitch

Say I have a function that accepts any valid Message as an arg:

def read(msg: Message) -> None:
    msg.content.method_in_neither_str_nor_bytes()

If I annotate it like I just did, I get no hand-holding from the type checker. msg.content is understood as Any, so anything goes. We know, however, that it has to be one of the types AnyStr is restricted to, str or bytes. If I want type checker assistance, I now have to manually list the possible Message types, like so:

def read(msg: Union[Message[str], Message[bytes]]) -> None:
    msg.content.method_in_neither_str_nor_bytes()

This is fine when the restriction consists of two types, but it should be easy to imagine examples with a more verbose restriction.
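As a rough sketch of what the explicit Union buys, the version below keeps `msg.content` typed as `Union[str, bytes]` instead of `Any`, so a checker can follow the `isinstance` narrowing. The body of `read` and its `str` return type are assumptions made for the example, not part of the original report.

```python
from typing import AnyStr, Generic, Union


class Message(Generic[AnyStr]):
    def __init__(self, header: AnyStr, content: AnyStr) -> None:
        self.header = header
        self.content = content


def read(msg: Union[Message[str], Message[bytes]]) -> str:
    # With the Union spelled out, content is Union[str, bytes] rather than Any.
    content = msg.content
    if isinstance(content, bytes):
        # Narrowed to bytes here; UTF-8 is an assumption for this sketch.
        return content.decode("utf-8")
    # Narrowed to str here.
    return content


print(read(Message(header="foo", content="bar")))
print(read(Message(header=b"foo", content=b"bar")))
```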

@harahu harahu added the feature label Nov 16, 2021
@pranavrajpal
Contributor

The problem is that Message with no explicit type parameters means Message[Any], so you're telling mypy that the type of those variables is Message[Any]. Removing the type annotation for m1 or m2 causes mypy to infer Message[str] correctly.

For your read example, I think you can get the behavior you're looking for by explicitly using the type variable again as a parameter:

def read(msg: Message[AnyStr]) -> None:
    msg.content.method_in_neither_str_nor_bytes()

which correctly shows errors saying that neither str nor bytes have that method.

--disallow-any-generics should help find missing type parameters that are silently becoming Any.
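A minimal runnable sketch of the same idea, using a method that does exist on both types: reusing `AnyStr` in the signature keeps the parameter and the return type tied together. The function name `shout` is hypothetical.

```python
from typing import AnyStr, Generic


class Message(Generic[AnyStr]):
    def __init__(self, header: AnyStr, content: AnyStr) -> None:
        self.header = header
        self.content = content


def shout(msg: Message[AnyStr]) -> AnyStr:
    # upper() exists on both str and bytes, so the body is valid
    # for each type in the restriction.
    return msg.content.upper()


print(shout(Message(header="foo", content="bar")))
print(shout(Message(header=b"foo", content=b"bar")))
```

A Message[str] argument yields a str result and a Message[bytes] argument yields bytes, without either side degrading to Any.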

@harahu
Author

harahu commented Nov 17, 2021

The problem is that Message with no explicit type parameters means Message[Any], so you're telling mypy that the type of those variables is Message[Any]. Removing the type annotation for m1 or m2 causes mypy to infer Message[str] correctly.

I am aware of this, so I realise I should give a better example. I should have typed it out properly, but my thinking behind the typing I used was to imply that the correct type is hard to infer statically. Say we are rather doing this, or something similar:

import random

m3: Message = Message(header="foo", content="bar") if random.random() < 0.5 else Message(header=b"foo", content=b"bar")
reveal_type(m3)  # Message[Any]...

If the stochasticity is wrapped in a function call, I realise I can annotate the function with return type Message[AnyStr], but I am sure this won't always be the case.

Here is yet another common case, and perhaps a better example, that shows up in the module scope, where I don't have the luxury of binding the type with the help of the TypeVar. Say I want to make a type alias that references Message:

# Verbatim
MessageDict: TypeAlias = dict[Hashable, Message]

What I really am trying to express, with less verbosity, is this:

# Desired interpretation
MessageDict: TypeAlias = dict[Hashable, Union[Message[str], Message[bytes]]]

but what I end up with is, as you point out:

# Actual interpretation
MessageDict: TypeAlias = dict[Hashable, Message[Any]]

Given what you propose for my function example, it might be reasonable to look for a solution involving AnyStr, but that is hard in this case:

MessageDict: TypeAlias = dict[Hashable, Message[AnyStr]]

This just results in a new Generic, that I have to bind down the line. And what's worse, it is impossible to bind it to the type I originally wanted:

def dict_consumer1(md: MessageDict[AnyStr]) -> None:
    reveal_type(md)  # Union[Dict[Hashable, Message[str]], Dict[Hashable, Message[bytes]]]

Hence, this is a case where I really have no choice but to spell out the Union of possible Message types in my TypeAlias. This is needlessly verbose, and violates the DRY principle.
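One way to limit the repetition today, sketched under the assumption that a second alias is acceptable: spell the Union out once under its own name and build the dict alias from it. The names `AnyMessage` and `count_bytes_messages` are made up for this example, and the aliases are written as plain assignments so the snippet also runs on Python versions without `TypeAlias`.

```python
from typing import AnyStr, Dict, Generic, Hashable, Union


class Message(Generic[AnyStr]):
    def __init__(self, header: AnyStr, content: AnyStr) -> None:
        self.header = header
        self.content = content


# The Union is written exactly once and reused wherever a Message is expected.
AnyMessage = Union[Message[str], Message[bytes]]
MessageDict = Dict[Hashable, AnyMessage]


def count_bytes_messages(md: MessageDict) -> int:
    # Each value is Message[str] or Message[bytes], never Message[Any].
    return sum(isinstance(m.content, bytes) for m in md.values())


inbox: MessageDict = {
    "greeting": Message(header="foo", content="bar"),
    1: Message(header=b"foo", content=b"bar"),
}
print(count_bytes_messages(inbox))
```

This still requires enumerating the restriction by hand once, so it softens rather than removes the DRY concern raised above.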

For your read example, I think you can get the behavior you're looking for by explicitly using the type variable again as a parameter:

def read(msg: Message[AnyStr]) -> None:
    msg.content.method_in_neither_str_nor_bytes()

which correctly shows errors saying that neither str nor bytes have that method.

You are right that this isn't too bad a solution in this case, but it is still more verbose than it needs to be. I'd prefer not having to import AnyStr and extend my signatures every place I use Message. Having to explicitly remember that Message is restricted to AnyStr, as opposed to the type checker doing it for me, is also in violation of DRY.

--disallow-any-generics should help find missing type parameters that are silently becoming Any.

I'm already using mypy with the strictest possible options enabled, including this one. I propose this feature because, in certain situations like the TypeAlias above, I am almost tempted to resign myself to the Any type, even though I could technically spell out a highly verbose type. Given how poisonous Any is to type checking, that should be a warning sign.

It is precisely because of how poisonous Any is that I propose this. I think type checkers should be proactive in avoiding inferring Any where possible, and in the case of a restricted Generic it seems to be possible. I would benefit by being allowed to write more compact code without losing type information (I have Generics restricted to significantly more than two types). And people who haven't enabled --disallow-any-generics would benefit from safer code, since the type checker would be better able to detect misuse of unbound Generics where possible.

@harahu harahu changed the title Handling of restricted Generics Analysis time binding of restricted Generics Nov 17, 2021