Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce a Not type #801

Open
antonagestam opened this issue Apr 22, 2021 · 20 comments
Open

Introduce a Not type #801

antonagestam opened this issue Apr 22, 2021 · 20 comments
Labels
topic: feature Discussions about new features for Python's type annotations

Comments

@antonagestam
Copy link

antonagestam commented Apr 22, 2021

This is a continuation of this discussion on the Intersection issue.

I have a different use case for the Not[...] type, but it is related to the original discussion. My usecase is to be able to tell the type checker that two types can never have an intersection. The specific case is for the phantom-types library where there are two types NonEmpty and Empty.

Currently there is no way to tell mypy that a variable can't both be NonEmpty and Empty, so this code passes without error:

from __future__ import annotations
from phantom.sized import NonEmpty, Empty

i: tuple[int, ...] = ()

assert isinstance(i, Empty)
assert isinstance(i, NonEmpty)

reveal_type(i)

The output shows that mypy interprets i as an intersection between tuple, Empty and NonEmpty:

intx.py:9: note: Revealed type is 'intx.<subclass of "tuple", "Empty", and "NonEmpty">'

Of course this code would error at runtime, but it would be preferable if this could be expressible so that type checkers can give an error here, similarly to how mypy gives errors for unreachable code.

A suggestion for how this could be expressed is to allow subclassing a Not type, so simplified the definitions of the two mentioned types would like this:

class NonEmpty:
    ...

class Empty(Not[NonEmpty]):
    ...

There are also other types in the library that would benefit from this like TZAware and TZNaive.

Edit: To clarify, I think that this type would be valuable even without an Intersection type which is why I opened a separate issue. This is because (as showed in the example) there already implicit intersections in mypy. I don't know how the other type checkers treats this.

This was referenced Oct 3, 2021
@KotlinIsland
Copy link

In Kotlin the root type Any is split from null, and the union of those is denoted Any?

In TypeScript the root type unknown includes null and undefined, but those types are distinct from the other basal types object, number etc. {} can be used to refer to any type except null(and undefined)

Adding a Not type seems like a massive deal so I think a NotNone type should be added, which is just a shortcut for Not[None].

@mccolljr
Copy link

mccolljr commented Jan 3, 2022

I think I also have a use case for a Not type:

I have a class, something like this:

class Field(Generic[T]):
    kind: FieldKind[T] # has many subclasses, like StringKind, ObjectKind[T], etc
    def __init__(kind_spec: Union[FieldKind[T], Type[FieldKind[T]], Callable[..., T]]):
        ...  

The intent is that T can be some primitive type, or some type that has schema annotations.
The goal is to do something like this:

def field_kind_from(
    spec: Union[FieldKind[T], Type[FieldKind[T]], Callable[..., T]]
) -> FieldKind[T]:
    if isinstance(spec, FieldKind):
        return spec
    if isinstance(spec, FieldKindMeta):
        return spec()
    if isinstance(spec, SchemaMeta):
        return ObjectKind[T](spec)
    raise TypeError(f"{type(spec)} cannot be used as a FieldKind")

...so that I can write all 3 of these and have the type inference work:

field_1 = Field(StringKind()) # Field[str], kind is StringKind()
field_2 = Field(StringKind)   # Field[str], kind is StringKind() 
field_3 = Field(SomeClass)   # Field[SomeClass], kind is ObjectKind[SomeClass]()

However, because Type[FieldKind[T]] can also be construed as Callable[..., T] where T=FieldKind[?], type annotations are required when using the version for field_2 (the most common case)

Right now, T is just TypeVar('T'), but I think this would work how I want if I could specify TypeVar('T', bound=Not[FieldKind]). In other words, tell the type checker that T can be any type that isn't FieldKind or one of its subclasses.

@harahu
Copy link

harahu commented Jun 16, 2022

I have yet another use case for Not. Making the typing for Mapping.get more accurate.
If we had Not, we could express it like so:

class Mapping(Collection[_KT], Generic[_KT, _VT_co]):
    @overload
    def get(self, __key: Not[_KT]) -> None: ...
    @overload
    def get(self, __key: object) -> _VT_co | None: ...
    @overload
    def get(self, __key: Not[_KT], default: _T) -> _T: ...
    @overload
    def get(self, __key: object, default: _T) -> _VT_co | _T: ...

Meaning: any key type that for certain cannot contain _KT will result in the default return type. Everything else can result in a union.

This means you don't have to narrow your key type before using Mapping.get with a too broad type, like you must do today.

@ekalosak
Copy link

ekalosak commented Aug 4, 2022

My usecase is much less theory-driven - I just want to be able to safely chain functions using an @maybe decorator:

def maybe(f: Callable) -> Callable:
  def g(...):
    return None if ... is None else f(...)
  return g

^ this has some nasty callback protocol for its type (https://mypy.readthedocs.io/en/stable/protocols.html#callback-protocols) but a NotNone type would make this much more readable.

@eyaler
Copy link

eyaler commented Nov 16, 2022

I think I may have another real use case for Not (combined with Intersection)

As described in python/mypy#13893, I want to use boolean literals to forbid a specific combination of function parameter types and boolean flag values. However, the required bool-type fallback for the boolean Literals, would catch and allow the forbidden combination, rendering the whole overload specialization useless.

This is what doesn't work (full mypy example here)

@overload
def f(data: bytes, data_must_be_bytes: Literal[True]) -> int: ...

# actually redundant due to the later fallback:
#@overload
#def f(data: AnyStr, data_must_be_bytes: Literal[False]) -> int: ...

# necessary fallback for non-literal bool - but allows the forbidden combination: data: str, data_must_be_bytes: Literal[True]
@overload
def f(data: AnyStr, data_must_be_bytes: bool) -> int: ...

If I had Not and Intersection, I could replace the latter fallback overload with:

@overload
def f(data: AnyStr, data_must_be_bytes: Intersection[bool, Not[Literal[True]]]) -> int: ...

@JelleZijlstra
Copy link
Member

@eyaler TypingError (#1043) would also cover your use case.

@randolf-scholz
Copy link

randolf-scholz commented Jan 9, 2023

Another application, from my comment in #256: Combining a NOT-type with an AND-type (#213) allows us to correctly specify a type that is an Iterable of strings, which is not supposed to be a string itself: Iterable[str] & ~str.

@spacether
Copy link

spacether commented Jun 9, 2023

Having a not type would allow using literals to define keys for a dictionary where those keys are required. Then having Not[the key literals] would allow one to define an additionalProperty key and corresonding value type which applies to all other keys. That would be useful in json schema contexts.

@CarliJoy
Copy link

On 22-23 July 2023 at the PyCon Europe in Prague I would like to discuss this Issue (together with others) and maybe formulate a PEP.
Maybe someone here is interested and would like to join the sprint.

@jessestricker
Copy link

jessestricker commented Jul 13, 2023

We already have Union and we may get Intersection (#213).
Going with the set theory naming scheme, should this type perhaps be named Complement instead of Not?

I argue that while Complement is more consistent, Not will be more intuitive thus personally prefer Not.

@erictraut
Copy link
Collaborator

Not will not compose with other features of the Python type system (overload matching, TypeVar constraint solving, etc.) so I don't see this effort going anywhere. There's a good reason why no other popular programming languages include such a feature in their type systems.

I brought up the topic with the TypeScript team a while ago, and they said that they've never seriously considered adding such a concept to TypeScript because of all the problems it would create.

I recommend focusing your attention on intersections and abandoning the notion of adding a Not to Python. With intersections, there are still many issues to work out to make them compose with other Python typing features, but I think it's a tractable problem.

@AlexWaygood
Copy link
Member

AlexWaygood commented Jul 13, 2023

I'm not as negative as Eric about this idea overall, but FWIW I agree that there's little use in pursuing it, really, until we have an intersection type. Its use cases are much more limited without an intersection type, and intersection types have a much broader array of use cases than a Not type. Intersections should definitely come first, in my opinion.

As Eric says, there will be enough challenges with intersection types that it makes sense to take the two separately.

@superbobry
Copy link

@erictraut can you elaborate on why Not will not compose with other features? Are you implying that there is no sensible way to incorporate Not into subtyping?

Not type arise naturally in control flow, for example in

def f(x: Any):
  if not isinstance(x, str):
    x  # type: Any&Not[str]

so defining what Not means in terms of subtyping will be useful for aligning the different type checkers.

@erictraut
Copy link
Collaborator

I'm saying that negated types will not compose with existing typing features. The feature would require many caveats and special cases limiting where it can be used.

Also, depending on how generalized it is, it would be an enormous amount of work to support in type checkers. I'm aware that several people have supplied use cases to justify the introduction of type negation, but none of them (IMO) come close to being compelling enough to justify adding such a complex and unusual addition to the type system.

Off the top of my head, here are some examples of where it won't compose, will introduce type holes, or will require special handling.

# Type variables and constraint solving
def func0(x: Not[T]) -> T: ...
func1(0) # What is the type of this expression?

def func1(x: Not[T], y: T) -> T: ...
func2(0, 0) # What is the type of this expression? Should it be an error?

# Constrained type variables
X = TypeVar("X", str, Not[str]) # Is this allowed? What does it mean?

# Instantiable types
type[Not[int]] # What does this mean?
Not[type[int]] # Is it the same as this?

def func3(val: Not[Foo]):
    type(val) # Is this valid? What is the type of this expression?

# Multiple inheritance
def func3(val: Foo):
    x: Not[Bar] = val # Should this be allowed? What if val is an instance of a class that derives from both Foo and Bar?

# Structural subtyping
def func4(val: SupportsAbs):
    x: Not[SupportsIndex] = val # Should this be allowed? What if val happens to conform to both protocols?

# Various other special cases
Not[object] # Is this equivalent to Never?
Not[Any] # What does this mean? Is it equivalent to Never?
Not[Never] # Is this the equivalent of object? Any? Some new top type that is not expressible today?
Not[Self] # Is this permitted?

# Recursive type aliases
X: TypeAlias = list[X] | Not[X] # This would need to be rejected with special-case logic because it introduces a circularity

I'm sure I could come up with dozens more issues and questions if I spent a bit more time. This list is just the tip of the iceberg. And I'm focusing here only on the static type checking issues. The runtime issues would also need to be considered (e.g. including Not[X] in isinstance or checking for Not[X] as a base class in a class statement).

Many issues will go undiscovered unless/until someone takes the time to implement a reference implementation in one of the major type checkers.

As with any new type feature, I recommend that prior to writing any draft PEP you familiarize yourself with other typed languages and if/how they implement the feature. For example, I did this with PEP 695 (see appendix A). I'm not aware of any other typed languages that implement this capability, and I suspect there's a good reason for that. You should ask yourself why Python is special in that it needs such a feature when other typed languages — even those that have much more advanced typing features — do not implement this.

As I mentioned above, I think theres a reasonably strong justification for intersection types, and there is precedent for this feature in other typed languages. I recommend that you focus your attention on intersections and abandon the notion of negated types, at least for now. Intersections have their own long list of challenges that need to be worked out, and they will also require an enormous amount of work to add to type checkers if they are fully generalized. If history is any indication, it will take multiple years after an intersection PEP is drafted before the major type checkers support such a feature. And it will take years more to shake out all of the bugs for such a feature. Don't underestimate the work involved.

@gandhis1
Copy link

You should ask yourself why Python is special in that it needs such a feature when other typed languages — even those that have much more advanced typing features — do not implement this.

Do these other languages have as many unusual object relationships on core types? The previously mentioned str issue being a big one, datetime inheriting from date being another, and int/float?

Regarding the examples, sure I get that many of them will require special casing. But how many of those examples have legitimate ambiguity? I could point out at least a few of those - the constrained TypeVar for one - where I think there's an obvious interpretation (usually an error).

For the other examples of mixing-and-matching of constructs in arbitrary ways, I'd also ask whether existing typing constructs required the same special casing upon their introduction. I don't have a good example here, but I'd pose the question whether currently you can in fact stick anything into a type[] or whether some of those possibilities are already regarded as errors by type checkers.

@superbobry
Copy link

Thanks, Eric! These are all great points.

I can suggest natural answers to most of your questions, but I agree Not is tricky and will take a lot of work to get right.

I also think we need to spec how existing features work before attempting Not.

@randolf-scholz
Copy link

randolf-scholz commented Jul 21, 2023

Another application for NOT-type I just encountered: It could be useful to "abuse" type-checkers as typo-checkers:

str & ~LiteralString would allow verifying that a literal string argument belongs to a set of valid string arguments, while still allowing dynamical string arguments

Over at pandas-stubs it is debated whether Series.astype(dtype: str) should be allowed, as the most common use-case does not involve dynamically strings. The advantage is that this guards against typos: s.astype("flaot32") would be flagged by the type-checker.

However, sometimes we do want dynamical string, for example when applying schema save in config-files. With Not, this would be solved by overloading:

@overload
def astype(dtype: Literal["float32", "float64", "int32", "int64", ...]) -> Series: ...
@overload
def astype(dtype: str & ~LiteralString) -> Series: ...

Currently, one has to give up either typo-checking or dynamical strings.

@NeilGirdhar
Copy link

NeilGirdhar commented Aug 14, 2023

Linking:

These may be useful if someone wants to write a PEP one day.

@mikeshardmind
Copy link

mikeshardmind commented Aug 16, 2023

I'm saying that negated types will not compose with existing typing features. The feature would require many caveats and special cases limiting where it can be used.

Also, depending on how generalized it is, it would be an enormous amount of work to support in type checkers. I'm aware that several people have supplied use cases to justify the introduction of type negation, but none of them (IMO) come close to being compelling enough to justify adding such a complex and unusual addition to the type system.

Off the top of my head, here are some examples of where it won't compose, will introduce type holes, or will require special handling.

# Type variables and constraint solving
def func0(x: Not[T]) -> T: ...
func1(0) # What is the type of this expression?

def func1(x: Not[T], y: T) -> T: ...
func2(0, 0) # What is the type of this expression? Should it be an error?

# Constrained type variables
X = TypeVar("X", str, Not[str]) # Is this allowed? What does it mean?

# Instantiable types
type[Not[int]] # What does this mean?
Not[type[int]] # Is it the same as this?

def func3(val: Not[Foo]):
    type(val) # Is this valid? What is the type of this expression?

# Multiple inheritance
def func3(val: Foo):
    x: Not[Bar] = val # Should this be allowed? What if val is an instance of a class that derives from both Foo and Bar?

# Structural subtyping
def func4(val: SupportsAbs):
    x: Not[SupportsIndex] = val # Should this be allowed? What if val happens to conform to both protocols?

# Various other special cases
Not[object] # Is this equivalent to Never?
Not[Any] # What does this mean? Is it equivalent to Never?
Not[Never] # Is this the equivalent of object? Any? Some new top type that is not expressible today?
Not[Self] # Is this permitted?

# Recursive type aliases
X: TypeAlias = list[X] | Not[X] # This would need to be rejected with special-case logic because it introduces a circularity

I'm sure I could come up with dozens more issues and questions if I spent a bit more time. This list is just the tip of the iceberg. And I'm focusing here only on the static type checking issues. The runtime issues would also need to be considered (e.g. including Not[X] in isinstance or checking for Not[X] as a base class in a class statement).

Many issues will go undiscovered unless/until someone takes the time to implement a reference implementation in one of the major type checkers.

As with any new type feature, I recommend that prior to writing any draft PEP you familiarize yourself with other typed languages and if/how they implement the feature. For example, I did this with PEP 695 (see appendix A). I'm not aware of any other typed languages that implement this capability, and I suspect there's a good reason for that. You should ask yourself why Python is special in that it needs such a feature when other typed languages — even those that have much more advanced typing features — do not implement this.

As I mentioned above, I think theres a reasonably strong justification for intersection types, and there is precedent for this feature in other typed languages. I recommend that you focus your attention on intersections and abandon the notion of negated types, at least for now. Intersections have their own long list of challenges that need to be worked out, and they will also require an enormous amount of work to add to type checkers if they are fully generalized. If history is any indication, it will take multiple years after an intersection PEP is drafted before the major type checkers support such a feature. And it will take years more to shake out all of the bugs for such a feature. Don't underestimate the work involved.

Most of this would be abject nonsense to apply Not to.

The below shows what's valid on static types and on gradual types, and it's a well-understood and studied part of type systems, even with gradual typing The below is from section4.1; Note that negation is not valid on gradual types.

image

Most of these are just examples that show that Not needs to be restricted in its use, not that it can't be added or wouldn't be useful.

There are very clear reasons why python would benefit from this currently, the most frequent, currently unsolvable correctly "gotcha" being Sequence[str] without intending to allow str

the only way to solve this one currently is an overload with str -> Never and a runtime check for str to raise on, which incurs runtime costs instead of just Sequence[str] & ~str (given both intersections and type negation), or to not check it at runtime and lie a bit. Either of the current methods also create issues of LSP violations if someone wanted to subclass a method to also handle single-string input.

A clear set of things that Not would be correct to apply to:

  1. On a static type.
  2. On a structural type (protocol)*
  3. In combination with unions and intersections.
  4. In combination with overloads.
  5. As part of a typevar bound or constraint
  6. On typing.Self, which has a singular associated static type in all cases where it is meaningful.

A clear set of things where Not would be nonsense if applied to

  1. A TypeVar where the type is not constrained to static types via constraints or a bound.
  2. Any gradual type

The open question would be

  1. On a TypeVar in the input type to a function (see below)**

* (Edited a prior caveat to be significantly more clear) There are problems with this that can be shown with partial type erasure.

class Sequence(Protocol[T]):
    ...

class SequenceMutationMethods(Protocol):
    ...

type MutableSequence[T] = Sequence[T] & SequenceMutationMethods
type NeverMutableSequence[T] = Sequence[T] & ~SequenceMutationMethods
x: list = []
y: Sequence = x
z: NeverMutableSequence = y
x.append(1)

These problems go away if total consistent use is checked by type checkers.


** (Not[T]) -> T is nonsensical in defining the input in terms of the result, rather than the other way around, but equivalent to something else: (T) -> Not[T] has a meaning, but maybe should still be disallowed for practical reasons of this not providing meaningful information, with this restriction being considered flexible to future change if someone could show a use case for it.

Edit: Stricter TypeGuards PEP essentially provides a use case for this, but in just the case of type guards.


Not is useful (arguably required if we add intersections) in python because python has a combination of structural and nominal subtyping, as well as just value-based handling in the type system (Literals...), and currently requires all 3 to express some types properly, while having undesirable overlap between them in some cases.

There's existing research and proofs that show where it is and isn't sound to use in a gradually typed type system, so it's not so much a matter of not being able to determine what is and isn't valid or meaningful, but whether or not the effort of adding it provides enough value to justify the addition to the type system.

@Asaurus1
Copy link

There a large number of complicated examples in this thread, so here's a usecase that might be overly simplified.

A function that returns the same type as the object put put in, unless you explicitly pass in None in which case it can either return a str value or a None value (based on the result of some side-effect that you as the user have no control over)

@overload
def foo(a: None) -> str | None:
   ...

@overload
def foo(a: T) -> T:
   ...

def foo(a):
    ...

Currently gets flagged as an "Overloaded function signature overlap with incompatible return types" because T cannot be specified to NOT be NoneType.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic: feature Discussions about new features for Python's type annotations
Projects
None yet
Development

No branches or pull requests