Alternative notations for covariance and contravariance #211

Closed
ilevkivskyi opened this issue May 4, 2016 · 19 comments

@ilevkivskyi
Member

ilevkivskyi commented May 4, 2016

NOTE: The idea described here was discussed in the context of #2, but was somehow forgotten.
Here I would like to revive the discussion of this particular idea.

I was writing a text explaining the variance of generic types and found something quite inconvenient and confusing for me in the current notation for variance. The key point is that covariance, contravariance, and invariance are properties of generics, not of type variables. As currently agreed, only declaration-site variance is supported. Yet something like this is currently not prohibited by the PEP:

def fun(lst: List[T_contra]) -> T_contra:
    ...

It is quite unclear what that would mean. There is another problem: when one sees a type variable that lacks a self-explanatory name, it is not clear whether it is covariant, invariant, or contravariant, and one needs to scroll up to its definition.
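For reference, here is a minimal sketch of the notation PEP 484 currently uses, where the variance flag sits on the type variable rather than on the generic class. The names Box and get are illustrative assumptions and do not come from this thread.

```python
from typing import Generic, TypeVar

# Variance is declared on the type variable itself, possibly far from its use.
T_co = TypeVar("T_co", covariant=True)


class Box(Generic[T_co]):
    """A read-only container. Covariance really describes Box, not T_co."""

    def __init__(self, item: T_co) -> None:
        self._item = item

    def get(self) -> T_co:
        # T_co appears only in return position, which is what makes
        # treating Box as covariant sound.
        return self._item


box = Box(42)
```

A reader who meets T_co inside a method body has to find the TypeVar call to learn its variance, unless the name follows a convention such as the _co suffix.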

I would like to propose a slight change to the notation: remove the covariant and contravariant keyword arguments from the TypeVar constructor and instead implement __pos__ and __neg__ methods for unary plus and minus. Pluses and minuses should be allowed only in the definition/declaration of a generic type, so that one could write

class MyClass(Generic[+T, U, -S], Mapping[T, int], Set[S]):

    def my_method(self, lst: List[T]) -> S:
        ...

and this would mean that:

  1. MyClass[t, u, s] is a subtype of Mapping[t, int] and a subtype of Set[s], i.e. could be used in place of function parameters with such declared types.
  2. MyClass[t1, u, s] is a subtype of MyClass[t2, u, s] if t1 is a subtype of t2 (covariance).
  3. MyClass[t, u, s2] is a subtype of MyClass[t, u, s1] if s1 is a subtype of s2 (contravariance).
  4. MyClass is invariant in second type argument.
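A minimal sketch of how unary plus and minus could carry variance markers at the declaration site. Var and Variance are hypothetical stand-ins invented for illustration; they are not part of typing.py, and nothing here is an actual implementation of the proposal.

```python
from enum import Enum


class Variance(Enum):
    INVARIANT = "invariant"
    COVARIANT = "covariant"
    CONTRAVARIANT = "contravariant"


class Var:
    """Hypothetical stand-in for typing.TypeVar (not the real class)."""

    def __init__(self, name: str, variance: Variance = Variance.INVARIANT) -> None:
        self.name = name
        self.variance = variance

    def __pos__(self) -> "Var":
        # +T marks the parameter as covariant without mutating T itself.
        return Var(self.name, Variance.COVARIANT)

    def __neg__(self) -> "Var":
        # -T marks the parameter as contravariant without mutating T itself.
        return Var(self.name, Variance.CONTRAVARIANT)

    def __repr__(self) -> str:
        prefix = {Variance.COVARIANT: "+", Variance.CONTRAVARIANT: "-"}
        return prefix.get(self.variance, "") + self.name


T, U, S = Var("T"), Var("U"), Var("S")

# What a Generic[+T, U, -S] subscript would receive under this sketch.
params = (+T, U, -S)
```

Because the operators return new objects, the plain T stays invariant and the variance information exists only where the generic class is declared.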

Also, as discussed in #115, a shorter notation (omitting Generic) could be used in simple cases:

class MyClass(Mapping[+T, int]):

    def my_method(self, lst: List[T]) -> int:
        ...

At the same time, pluses and minuses in type annotations of functions and variables should be prohibited by the PEP:

def fun(lst: List[-T]) -> -T: # Error
    ...

I think that the new notation would be especially clear in situations with complex indexed generic types. One could easily be confused by

it = [] # type: Iterable[Tuple[T_contra, T_co]]

How could it be contravariant if Iterable is defined as covariant? Such confusion would be eliminated with the new notation. Also, it would be simpler to understand for people familiar with other languages like C#, Scala or OCaml.

EDITED: there was a typo in point 1, and a mistake in the short example

@gvanrossum
Member

I'm sorry, this instantly gave me a headache. I don't even understand what's wrong with def fun(lst: List[T_contra]) -> T_contra:. :-(

MyClass[t, u, s] is a subtype of Mapping[u, int]

Surely you meant "a subtype of Mapping[t, int]" ?

class MyClass(Mapping[+T, int], Generic[U], Set[-S]):

IIRC in #115 we said that if Generic[...] is listed as a base class, only the variables mentioned there are valid, and their order is determined by that form (not by the default rule of picking them out of all bases in textual order).

it would be simpler to understand for people familiar with other languages like C#, Scala or OCaml.

I'm not sure that's one of our goals. I think it would be more important if it could be made understandable for people like myself.

@ilevkivskyi
Member Author

I'm sorry, this instantly gave me a headache. I don't even understand what's wrong with def fun(lst: List[T_contra]) -> T_contra:. :-(

This is exactly my point. There is nothing formally wrong with such a definition, but such examples constantly confuse me, since the type variable here carries variance information that is redundant in this context. Such redundant information never appears with the +/- notation.

Surely you meant "a subtype of Mapping[t, int]" ?

Yes, that is a typo.

IIRC in #115 we said that if Generic[...] is listed as a base class, only the variables mentioned there are valid, and their order is determined by that form (not by the default rule of picking them out of all bases in textual order).

You are right. I am sorry, this is a bad example. I just wanted to illustrate that both longer (with Generic), and shorter forms of generic type declaration are OK with such notations.

I'm not sure that's one of our goals. I think it would be more important if it could be made understandable for people like myself.

I agree. But I think I am more like you than like those other people :-) and +/- in a generic type declaration is much more understandable to me than covariant=True in a TypeVar definition.

What do you think? Is the +/- notation more understandable for you than covariant=True or not?

@gvanrossum
Member

TBH any mention of variance gives me a headache, which doesn't sound like it's the case for you. :-)

So I'm not sure what to think of the +T/-T notation. I'm not even sure I could remember which of these means co and which one is contra.

Regarding the function with unwanted variance information, I don't see why a type checker couldn't reject that just as easily as it would reject use of +T/-T. Regardless, it's not something one would write unless one is still learning stuff, and typing random programs and responding to the error messages is a really poor way to learn about something as abstract as variance. (I personally figured out the reason just now by starting to type a program, and before I had even finished it I understood the reason why variance is silly here.)

@JukkaL
Contributor

JukkaL commented May 4, 2016

Note that the implementation of variance in mypy is not complete, which mostly means that mypy won't reject many things that it should complain about.

@ilevkivskyi
Member Author

@JukkaL I understand your point. I just wanted to raise the question at an early stage, when it is easy to change things. BTW, what do you think about indicating variance only in generic type declaration instead of indicating it in the type variable definition?

Regarding the function with unwanted variance information, I don't see why a type checker couldn't reject that just as easily as it would reject use of +T/-T.

@gvanrossum, agreed.

Still, there is the question of what to do in situations where a type variable is covariant or contravariant but has a non-self-descriptive name. I think the +/- notation forces you to indicate the variance information exactly where it is necessary, and that could improve readability significantly.

Of course one could adopt some convention for naming type variables, but it is a bit ugly to have T_co and T_contra all over the code.

@ilevkivskyi
Member Author

@gvanrossum By the way, there is a simple mnemonic rule to remember which of +/- means co and which one is contra. Consider the function f(x) = -x. It has a simple property:
if x1 < x2 then f(x2) < f(x1).
The same happens for contravariant generics:
if t1 is a subtype of t2, then Contra[t2] is a subtype of Contra[t1].
Therefore minus is contravariant.
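The mnemonic can be checked with a small sketch. The classes Animal and Cat and the helper contra_is_subtype are hypothetical names used only for illustration, and issubclass stands in for the subtype relation, which is a simplification.

```python
def negate(x: int) -> int:
    """f(x) = -x reverses the order of its arguments."""
    return -x


class Animal:
    pass


class Cat(Animal):
    pass


def contra_is_subtype(left_arg: type, right_arg: type) -> bool:
    """Decide whether Contra[left_arg] is a subtype of Contra[right_arg].

    For a contravariant generic the argument order flips, so the answer is
    yes exactly when right_arg is a subtype of left_arg.
    """
    return issubclass(right_arg, left_arg)


# Numbers: 1 < 2, but negate(2) < negate(1).
order_reversed = negate(2) < negate(1)

# Types: Cat is a subtype of Animal, so Contra[Animal] is a subtype of Contra[Cat].
flipped = contra_is_subtype(Animal, Cat)
not_flipped = contra_is_subtype(Cat, Animal)
```

Here order_reversed and flipped are both true while not_flipped is false, mirroring the sign flip in f(x) = -x.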

@ilevkivskyi
Member Author

@rwbarton Could you please share your opinion on the idea proposed in this issue?

@rwbarton

I agree that it would make more sense to specify the variance of generic class type variables at the class definition site. As for the concrete syntax +T, -U, V, it's cute that it matches the syntax of some other languages; but I'm guessing that most Python programmers will never have seen it before, and it seems essentially ungoogleable to me.

I thought of the more explicit syntax

class MyClass(Generic[T, U, S], Covariant[T], Contravariant[S], Mapping[T, int], Set[S]):
    ...

But a subclass of MyClass need not be covariant in T (even when the type variables line up), so perhaps this is a little bit weird?

@gvanrossum
Member

Maybe this instead?

class MyClass(Generic[Covariant[T], U, Contravariant[S]], T, S, Mapping[T, int], Set[S]):
    ...

I still think that a naming convention like T_co and T_contra is good enough...

@ilevkivskyi
Member Author

The advantage of the + and - notation is that it is very short. I could imagine that many classes would be covariant, so repeating Covariant everywhere could be quite verbose.

As yet another alternative, I would propose:

class MyClass(Generic[Co[T], U, Contra[S]], Mapping[T, int], Set[S]):
    ...

This is only slightly more verbose than a naming convention, and at the same time it is clearer to specify the variance at the class definition site.

@rwbarton @gvanrossum what do you think?

@gvanrossum
Member

I could see Generic[Co[T], U, Contra[S]], but I have to deal with a few unpleasant practicalities:

  • Getting a new version of typing.py into the stdlib; this has to wait until Python 3.5.3 is released (not expected before December)
  • Implementing the new notation in mypy while keeping backwards compatibility with the old one
  • Getting other users of PEP 484 (at least pytype and PyCharm) to also support this

And I have many other more pressing things to do.

@ilevkivskyi
Member Author

If we have an agreement on this, then I could implement Co and Contra in typing.py and make additions to PEP 484.

Or do you think we need to first ask other guys (from pytype and PyCharm) whether they have any objections?

@gvanrossum
Member

You have it backwards. If we are going to change this, Co/Contra are
acceptable to me. But I still don't want to change. Doing the typing.py
changes is easy; it's the mypy changes that I don't want to commit to.

@ilevkivskyi
Member Author

@gvanrossum OK, I see. Actually, I feel similarly about this issue. I could live with the current situation, but I am more worried about people who are not familiar with variance.

After some thinking I have a radically different alternative proposal: keep only the current syntax, but

  • Explicitly add a naming convention recommending _co and _contra to PEP 484 and the documentation on docs.python.org (probably we should add PEP 8 naming conventions for type variables in general, such as CapWords and a preference for short names?)
  • Add a note to PEP 484 and documentation emphasizing that variance is not a property of a type variable, it is a property of a generic class.
  • Prohibit using type variables defined with covariant=True or contravariant=True in generic functions (of course such variables could be used in methods of generic classes as usual).
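The third point could be checked mechanically. Below is a rough runtime sketch, not how mypy or any other checker implements it: the helpers variant_typevars and misused_typevars are hypothetical names, and the walk only follows arguments exposed by typing.get_args, so it misses nested forms such as the parameter list of Callable.

```python
from typing import List, TypeVar, get_args, get_type_hints

T = TypeVar("T")
T_contra = TypeVar("T_contra", contravariant=True)


def variant_typevars(annotation):
    """Yield TypeVars declared covariant or contravariant inside an annotation."""
    if isinstance(annotation, TypeVar):
        if annotation.__covariant__ or annotation.__contravariant__:
            yield annotation
    # get_args returns () for anything that is not a subscripted generic.
    for arg in get_args(annotation):
        yield from variant_typevars(arg)


def misused_typevars(func):
    """Return the sorted names of variance-marked TypeVars in a function signature."""
    hints = get_type_hints(func)
    return sorted({tv.__name__ for hint in hints.values() for tv in variant_typevars(hint)})


def bad(lst: List[T_contra]) -> T_contra: ...


def good(lst: List[T]) -> T: ...


bad_report = misused_typevars(bad)
good_report = misused_typevars(good)
```

Methods of generic classes would need to be exempted, since the proposal explicitly allows such type variables there.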

@gvanrossum
Member

That sounds like a plan I can get behind. Do you have the oomph to write up
a PR for the PEP?

@ilevkivskyi
Member Author

OK, if @rwbarton also has no objections, I will submit a PR to python/peps tomorrow and later also a patch for the documentation at docs.python.org.

@rwbarton

rwbarton commented Aug 2, 2016

Yes, this sounds like a good plan to me too.

(I actually think it would still be a minor improvement to mypy to internally make the change suggested in the original post, of moving variance from an attribute of a type variable to an attribute of a class definition. This would just be a refactoring though, and is low priority.)

@ilevkivskyi
Member Author

I have submitted two PRs: here #257 and to python/peps.

@elazarg

elazarg commented Sep 15, 2016

I was referred here from python-ideas. Just wanted to comment on Guido's statement

So I'm not sure what to think of the +T/-T notation. I'm not even sure I could remember which of these means co and which one is contra.

I remember very well that as a 3rd-year student, it took me (and many others) too long to memorize what "covariance" and "contravariance" mean. Maybe that's more obvious to people from Europe/America, but it is non-obvious terminology. Yes, it is standard, but +T/-T is now standard in Scala, and much less verbose; generic type names are already too verbose.
