Allow subclassing without supertyping #241

Open
dmoisset opened this Issue Jul 8, 2016 · 50 comments

7 participants

dmoisset commented Jul 8, 2016

This issue comes from python/mypy#1237. I'll try to summarize the discussion here.

Someone reported an issue with an artificial example (python/mypy#1237 (comment)):

class Foo:
    def factory(self) -> str:
        return 'Hello'

class Bar(Foo):
    def factory(self) -> int:
        return 10

and then with the following stub files (python/mypy#1237 (comment)):

class QPixmap(QPaintDevice):
    def swap(self, other: 'QPixmap') -> None: ...

class QBitmap(QPixmap):
    def swap(self, other: 'QBitmap') -> None: ...

mypy currently reports both of these as erroneous: Argument 1 of "swap" incompatible with supertype "QPixmap"

These were initially argued to be correct errors because they violate the Liskov Substitution Principle (the first case via a change of return type, the second via a covariant argument change). The problem in this scenario is that the actual classes are not meant to be substituted: the first example is actually a wrapper for a non-virtual C++ function, so the classes are not substitutable, and the classes in the second example aren't supposed to be mixed together at all, just to reuse part of the code. (See python/mypy#1237 (comment).)

There was also a suggestion to allow covariant arguments explicitly (in the same way that Eiffel allows them, adding a runtime check) via a decorator:

class QBitmap(QPixmap):
    @covariant_args
    def swap(self, other: 'QBitmap') -> None: ...

My proposal instead was to add what some dialects of Eiffel call "non-conforming inheritance" ("conforms" is the Eiffel term for what PEP 483 calls "is-consistent-with"). It is essentially a mechanism for using subclassing just as a way to get the implementation from the superclass, without creating a subtyping relation between the two classes. See python/mypy#1237 (comment) for a detailed explanation.

The proposal is to have Implementation in typing so you can write:

class QBitmap(Implementation[QPixmap]):
    def swap(self, other: 'QBitmap') -> None: ...

which just defines a QBitmap class with all the methods copied from QPixmap, but without making one a subtype of the other. At runtime we can just make Implementation[QPixmap] == QPixmap.
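
A minimal runtime sketch of that last point (hypothetical: Implementation is not part of typing today). Subscripting simply returns the wrapped class, so inheritance is unchanged at runtime and only a static checker would drop the subtyping relation:

```python
# Hypothetical sketch of the proposed Implementation marker. At runtime
# Implementation[X] evaluates to X itself, so class creation and the MRO
# behave exactly as with plain inheritance.
class _ImplementationMeta(type):
    def __getitem__(cls, base: type) -> type:
        return base  # Implementation[QPixmap] is QPixmap

class Implementation(metaclass=_ImplementationMeta):
    pass

class QPixmap:
    def swap(self, other: 'QPixmap') -> None: ...

class QBitmap(Implementation[QPixmap]):  # at runtime: class QBitmap(QPixmap)
    def swap(self, other: 'QBitmap') -> None: ...

print(Implementation[QPixmap] is QPixmap)  # True
print(issubclass(QBitmap, QPixmap))        # True: runtime relation is unchanged
```

Only a type checker that understands the marker would treat QBitmap as unrelated to QPixmap for subtyping purposes.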

dmoisset commented Jul 8, 2016

As a follow-up to the discussion, @gvanrossum suggested using a class decorator. The problem with this approach is that the class level is not the right granularity for this decision; each individual inheritance relation is. In a multiple-inheritance context, you could want to inherit the interface of SuperClassA and the implementation of SuperClassB. I think that might be quite common with mixins (actually, I think most mixins could be used with this form of inheritance).

Looking deeper into what Eiffel does (this is a bit of a side comment, not part of the main proposal): there it was also possible to define a superclass so that only implementation inheritance from it was allowed, which could work well with mixins (and be implemented as a class decorator). However, you still need to be able to say "I just want the implementation of this superclass" when inheriting, because the writer of the superclass may not have foreseen that you wanted that.

dmoisset commented Jul 8, 2016

Also note that this idea makes self types more viable. Things like TotalOrderingMixin, where you have

    def __lt__(self: T, other: T) -> bool: ...

can be defined more accurately (something like this is currently in the stdlib), and with no errors (note that the < operator / the __lt__ method is actually covariant in its argument in Python 3). With the current mypy implementation, TotalOrderingMixin is something that would produce an error in every case if self types were added as is.
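
As a sketch of the kind of mixin in question (the class and method names here are illustrative; the closest stdlib relative is functools.total_ordering), a self-typed __lt__ lets the derived comparison methods stay accurately typed:

```python
from typing import TypeVar

T = TypeVar('T', bound='TotalOrderingMixin')

class TotalOrderingMixin:
    # Subclasses supply __lt__; the mixin derives the other comparisons
    # from it, assuming a total order.
    def __lt__(self: T, other: T) -> bool:
        raise NotImplementedError

    def __le__(self: T, other: T) -> bool:
        return self < other or not (other < self)

    def __gt__(self: T, other: T) -> bool:
        return other < self

class Money(TotalOrderingMixin):
    def __init__(self, cents: int) -> None:
        self.cents = cents

    # Narrowing `other` from T to Money is exactly the covariant argument
    # change that a strict LSP check rejects for ordinary inheritance.
    def __lt__(self, other: 'Money') -> bool:
        return self.cents < other.cents

print(Money(100) < Money(200))  # True
print(Money(200) > Money(100))  # True
```

Under implementation-only inheritance the Money override would be accepted, since Money is not claimed to be substitutable for an arbitrary TotalOrderingMixin.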

ilevkivskyi commented Jul 10, 2016

I think this is a very nice feature. In general, it is good to separate classes (code reuse) from types (debugging/documentation). If this is added, I would also mention the feature in PEP 483 (the theory PEP) as an example of differentiating classes and types.

Between a decorator and a special class I would choose the second, since, as @dmoisset mentioned, it is more flexible and suitable for mixin use cases. It is also very much in the spirit of gradual typing: one can mark particular bases as Implementation while still typechecking with respect to the other bases.

gvanrossum commented Jul 10, 2016

ilevkivskyi commented Jul 14, 2016

I think it is better to have a minimalistic solution rather than to try to make a complex proposal that covers all use cases. More features for the most popular use cases could be added later.

Here is a detailed proposal with examples:

  1. At runtime, Implementation[MyClass] is MyClass, so the division of responsibilities between subclass and superclass is as usual, and all the normal mechanisms work at runtime:

class Base:
    def __init__(self, number: int) -> None:
        self.number = number

class AnotherBase:
    ...

class Derived(Implementation[Base], AnotherBase):
    def __init__(self) -> None:
        super().__init__(42)  # This works since Implementation[Base] is just Base

  2. Use cases for Implementation are situations where a user wants to reuse some code in another class without creating excessive "type contracts". Of course one can use Any or # type: ignore, but Implementation will allow more precise type checking.

  3. The derived class is not considered a subtype of its implementation base classes. An explicit cast might be used, under the user's responsibility. Examples:

from typing import cast

def func(x: Base) -> None:
    ...
def another_func(x: AnotherBase) -> None:
    ...

func(Base(42))              # this is OK
another_func(Derived())     # this is also OK
func(Derived())             # Marked as an error by a typechecker
func(cast(Base, Derived())) # Passes typecheck

  4. Methods of an implementation base class may be redefined in an incompatible way:

class Base:
    def method(self, x: int) -> None:
        ...

class AnotherBase:
    def another_method(self, y: int) -> None:
        ...

class Derived(Implementation[Base], AnotherBase):
    def another_method(self, y: str) -> None: ... # This is prohibited by a typechecker
    def method(self, x: str) -> None: ...         # Although, this is OK

  5. As @JukkaL proposed, a typechecker checks that the redefined methods do not "spoil" the already checked code by re-type-checking the bodies of all base class methods as if they were written in the subclass (after translating type variables suitably if the base class is generic).

I think the re-check should be recursive, i.e. include bases of bases, etc. Of course, this could slow down type checking, but not the runtime.

  6. A typechecker uses the normal MRO to find the signature of a method. For example, in this code:

class Base:
    def method(self, x: int) -> None: ...

class AnotherBase:
    def another_method(self, y: int) -> None: ...

class Derived(Implementation[Base], AnotherBase):
    def method(self, x: str) -> None: ...

a typechecker infers that Derived().method has signature (str) -> None, and Derived().another_method has signature (int) -> None.

  7. Clearly, Implementation can be used only with a proper class. Implementation[Union[t1, t2]], Implementation[Callable[[t1, t2, ...], tr]], etc. raise TypeError.

If we agree on this, then I will submit PRs for typing.py, PEP484, and PEP483.

JukkaL commented Jul 14, 2016

This has the problem that a type checker doesn't see inside the implementations of stubbed classes, and you could break some of the base class methods without a type checker having any way of detecting it.

Maybe implementation inheritance would only be allowed from within your own code somehow. For example, behavior would be undefined if you inherit the implementation of a stdlib or 3rd party library class, unless you include the implementation of the library module in the set of checked files. object would be fine as a base class, of course, but you probably shouldn't override any methods defined in object in an incompatible manner, since this can break a lot of things. This would only be enforced by static checkers, not at runtime.

Not sure if the latter would be too restrictive. To generalize a bit, it would be okay if the implementation base class has library classes in its MRO, but methods defined in those classes can't be overridden arbitrarily.

ilevkivskyi commented Jul 14, 2016

@JukkaL This is a good point. We need to prohibit incompatibly overriding methods everywhere except for classes accessible in the set of checked files. I don't think that is too restrictive. Alternatively, we could allow it with a disclaimer that complete typechecking is not guaranteed in that case.

gvanrossum commented Jul 14, 2016

ilevkivskyi commented Jul 14, 2016

@gvanrossum Indeed, there is not much that can be done. But I don't think this is bad. As with Any or # type: ignore, people who are going to use this feature know what they are doing. More importantly, Implementation provides more precise typechecking, more "granularity", and probably a bit more safety than Any.

JukkaL commented Jul 14, 2016

The QPixmap example shouldn't be a problem, since it seems to be for a library stub file. We don't verify that stubs are correct anyway, so a stub would be able to use implementation inheritance without restrictions but there are no guarantees that things are safe -- we expect the writer of the stub to reason about whether implementation inheritance is a reasonable thing to use.

For user code, I think that we can provide a reasonable level of safety if we enforce the restrictions that I mentioned.

JukkaL commented Jul 14, 2016

I have a related proposal. What about having a magic class decorator (with no runtime or minimal effect) or a base class that declares a class as a "pure mixin"? A pure mixin could only be used for implementation inheritance, and it wouldn't be type checked by itself at all -- it would only be type checked as part of implementation inheritance. I think that this would allow type checkers to check some mixin use cases that are tricky right now with minimal code changes. The alternative would be to use ABCs to declare the things that the mixin expects to be defined elsewhere, but they are harder to use and arguably add boilerplate.

Example:

@puremixin
class MyMixin:
    def do_stuff(self) -> int:
        self.one_thing()  # ok, even though not defined in this class
        return self.second_thing(1)  # also ok

class Thing(Implementation[MyMixin]):
    def one_thing(self) -> None: pass  # ok
    def second_thing(self, x: int) -> str:    # error: return type not compatible with do_stuff
        return 'x'
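
At runtime such a decorator could be a no-op. A sketch under that assumption (puremixin is only a proposed name, and plain inheritance stands in here for Implementation[...], which has no runtime form today):

```python
from typing import TypeVar

C = TypeVar('C', bound=type)

def puremixin(cls: C) -> C:
    # No-op at runtime. A static checker would skip checking the mixin's
    # body in isolation and re-check it at each point of inheritance,
    # where one_thing/second_thing actually exist.
    return cls

@puremixin
class MyMixin:
    def do_stuff(self) -> int:
        # These attributes are expected to be supplied by the class
        # that mixes MyMixin in, not by the mixin itself.
        self.one_thing()
        return self.second_thing(1)

class Thing(MyMixin):  # stand-in for Implementation[MyMixin]
    def one_thing(self) -> None:
        pass
    def second_thing(self, x: int) -> int:
        return x + 41

print(Thing().do_stuff())  # 42
```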
gvanrossum commented Jul 14, 2016

But is that the way mixins are typically used? It would be good to look carefully at some examples of mixins (e.g. one of our internal code bases uses them frequently).

ilevkivskyi commented Jul 14, 2016

@JukkaL How about adding the following disclaimer?

"""
Note that implementation inheritance could be used with stubbed extension modules, but there are no guarantees of safety in this case (type checkers could not verify correctness of extension modules). The writer of the stub is expected to reason about whether implementation inheritance is appropriate.

In user code, implementation inheritance is only allowed with classes defined in the set of checked files. Built-in classes could be used as implementation base classes, but their methods could not be redefined in an incompatible way.
"""

I like your proposal of a @puremixin decorator (though maybe we could choose another name). It looks independent of the initial proposal and could easily be added. I suggest @gvanrossum decide whether we need this.

JukkaL commented Jul 14, 2016

@gvanrossum I've seen mixins like that (that use things that aren't defined anywhere in the mixin class). I'm not sure how common they are.

@ilevkivskyi Looks good. I'd make some small changes (feel free to edit at will):

Note that implementation inheritance could be used with stubbed extension modules, but there are no guarantees of safety in this case (type checkers could not verify correctness of extension modules). The writer of the stub is expected to reason about whether implementation inheritance is appropriate.

->

Note that implementation inheritance could be used with stubbed extension modules, but there are no guarantees of safety in this case (type checkers could not verify correctness of extension modules). As a special case, if a stub uses implementation inheritance within the stub, the writer of the stub is expected to reason about whether implementation inheritance is appropriate. Stubs don't contain implementations, so a type checker can't verify the correctness of implementation inheritance in stubs.

Built-in classes could be used ...

->

Built-in and library classes could be used ...

JukkaL commented Jul 14, 2016

Also, I'm not at all attached to the puremixin name.

gvanrossum commented Jul 14, 2016

JukkaL commented Jul 14, 2016

Created #246 for pure mixins.

gvanrossum commented Jul 14, 2016

@gvanrossum I've seen mixins like that (that use things that aren't defined anywhere in the mixin class). I'm not sure how common they are.

We're probably talking about the same mixin classes. :-) However, I'm not convinced that those would deserve being marked as Implementation[] -- the class that inherits from the mixin typically also inherits from some other class and the mixin should not conflict with any methods defined there (though it may override them). (OTOH maybe I didn't read the examples all the way through.)

JukkaL commented Jul 14, 2016

Let's move the discussion of mixins to #246?

dmoisset commented Jul 14, 2016

I was writing a proposal, and it may be a bit messy to post it now because I duplicated some things already said but wrote some others differently. Let me summarize the different or subtle points that I think should be discussed:

  • Using a generic-like syntax like Implementation[Foo] looks too similar to generics, while what we have here is somewhat different. I've been thinking of a functional syntax, i.e.:

class Base: ...
class Derived(Foo, implementation_only(Base), Bar): ...

so it doesn't make sense to pass it a TypeVar or anything that has no supporting class (like unions or protocols). I think the different syntax helps.

  • Other names I considered for implementation_only() are reuse(), include(), and insert() (the latter ones make sense if you think of the mechanism it implements). "implementation" sounds a bit vague (even if it was my first proposal).
  • The type checker should probably verify that implementation_only is used only in the base class list of a class definition.
  • The typechecker should ignore the Base methods if they are overridden in Derived or in another parent (Foo, but not Bar, in the example above).
  • Inherited bodies should be checked if available; if only stubs are present, no checking is done (this is consistent with the rest of mypy, which ignores bodies when stubs are present).
  • super() needs some special care. The simplest approach is that a super() object does not expose methods whose signature comes from an implementation_only parent; otherwise you could get mistyped calls from the MRO chain. If you're doing implementation-only inheritance, following the MRO makes no sense anyway. We should allow some way to get at the parent methods, maybe through an explicit class reference (allowing some leeway on the self parameter when a method on Derived calls Base.method(self, ...)).
  • A class that has all of its parents as implementation_only is interpreted by the typechecker as a direct descendant of object.

As a side note, using functional syntax would make it trivial to use @implementation_only as a class decorator too, which could implement the mixin mechanism proposed by @JukkaL . However, for that example I'd prefer the mixin to actually have the signatures of its "abstract methods"; if I want to implement a mixin, one of the things I want from a static type system is hints about what these methods should take. For me, the decorator would only make subclasses of the mixin behave like the rest of this proposal.

dmoisset commented Jul 14, 2016

Btw, for further reference there's a paper on the idea at http://smarteiffel.loria.fr/papers/insert2005.pdf

dmoisset commented Jul 14, 2016

There are some things that can be added to this mechanism (I left them out of the core proposal above to keep it simple).

  • Allow classes to be always implementation_only (like @JukkaL said):
@implementation_only
class SomeMixin:
    def method(self, x: int) -> str: ...
    def other_method(self, x: int) -> str: ...

class C(SomeMixin):
    def method(self) -> None: ... # OK to change the signature
  • Allow to selectively include or exclude parts of the parent class:
class C(
        implementation_only(
            SomeMixin, include=('method', 'attribute')
        )
       ): ...

class D(
        implementation_only(
            SomeMixin, exclude=('other_method',)
        )
       ): ...
  • Allow renames of methods
class E(
        implementation_only(
            SomeMixin, renames={'method': '_d_method'}
        )
       ): ...

The exclusion/renaming mechanisms are present in Eiffel, and renaming allows issues related to super() to be resolved unambiguously by giving explicit, different names to the overridden methods, although the mechanism has some complexities that might be problematic in Python.

I don't think the first proposal should include any of these, but I wanted to mention them to clarify that the functional syntax gives more flexibility for future growth of the idea.
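At runtime, a renames-style implementation_only() could be approximated by building an intermediate class. A hypothetical sketch (the helper below is illustrative only, not part of the proposal; note that a full rename that also hides the old name is harder, since inherited attributes cannot simply be deleted):

```python
def implementation_only(cls, renames=None):
    """Hypothetical runtime helper for the renames extension: build an
    intermediate class whose renamed attributes forward to the originals."""
    renames = renames or {}
    namespace = {new: getattr(cls, old) for old, new in renames.items()}
    return type(cls.__name__, (cls,), namespace)

class SomeMixin:
    def method(self, x: int) -> int:
        return x * 2

class E(implementation_only(SomeMixin, renames={'method': '_e_method'})):
    pass

print(E()._e_method(3))  # -> 6 (the original name is still inherited as well)
```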

ilevkivskyi commented Jul 14, 2016

@dmoisset Indeed, some things you propose have already been said. This is rather a good sign.
Some comments:

  • I don't think that we need functional syntax; square brackets are not used only for generics, they are used almost everywhere in the type-checking context.
  • I like the name Implementation. It reminds me that I am inheriting only the "implementation" of the class, not the "interface" (i.e. type contracts).
  • I am not sure that the restriction about super() is really necessary.
dmoisset commented Jul 14, 2016

The possible problem with super happens in the following snippet:

def implementation_only(C): return C

class Base:
    def f(self, a: int) -> None:
        print("Base")
        print(a)

class Left(Base):
    def f(self, a: int) -> None:
        print("Left.f")
        super().f(a)

class Right(implementation_only(Base)):
    def f(self, a: str) -> None:
        print("Right.f")
        super().f(len(a))

class C(Left, implementation_only(Right)):
    def f(self, a: int) -> None:
        print("C.f")
        super().f(a)

C().f(42)

This code fails with a TypeError when run with Python, but the rules defined above don't detect a problem:

  1. Base and Left are correctly typed.
  2. Right redefines the signature of f (which is OK, given that it's using implementation inheritance).
  3. Right.f calls super().f with an int argument (which looks OK, given that the inherited f accepts an int).
  4. C redefines the signature of Right.f (which is OK, given that it is using implementation inheritance). It keeps the signature of Left.f, so no problem there.
  5. C.f calls super().f with an int. The typechecker is only looking at the Left.f signature, so this is considered valid.

But even if everything looks OK by itself, the program has a type error, so the rules are unsound. That means some of the rules above should actually reject this scenario. Rule 1 seems fine (valid under the current semantics), and rule 2 is our proposal, so let's assume it is valid (if we reach a contradiction, the idea itself is unsound). So 3, 4, or 5 must be wrong, and we should reject (at least) one of them under some conditions (which should include this scenario). My proposal was "3 is wrong: you can't call super() from an implementation_only redefinition of a method". This makes the call that produces the runtime TypeError invalid.

Another option is to make this scenario impossible by forbidding the redefinition in some multiple-inheritance cases (4 would be invalid), but that's probably harder to explain to users.
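The failure can be made concrete by inspecting the linearization: on an instance of C, super() inside Left.f resolves to Right rather than Base, so Left's int flows into Right.f, which expects a str. A runnable check of the MRO, reusing the identity implementation_only stub from the snippet above:

```python
def implementation_only(C):  # identity stub, as in the snippet above
    return C

class Base:
    def f(self, a: int) -> None: ...

class Left(Base):
    def f(self, a: int) -> None:
        super().f(a)

class Right(implementation_only(Base)):
    def f(self, a: str) -> None:
        super().f(len(a))

class C(Left, implementation_only(Right)):
    def f(self, a: int) -> None:
        super().f(a)

# C3 linearization places Right between Left and Base:
print([k.__name__ for k in C.__mro__])
# -> ['C', 'Left', 'Right', 'Base', 'object']
# so on a C instance, super().f(a) inside Left.f dispatches to Right.f,
# which expects a str -- hence the runtime TypeError from C().f(42).
```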

ilevkivskyi commented Jul 15, 2016

Thank you for elaborating, I understand your point. However, it looks like a more sophisticated typechecker could catch the TypeError: by re-type-checking the bodies of methods in C's parent classes, a typechecker could detect that the super().f(a) call in Left has an incompatible signature, since super() for Left is now Right, and f in Right requires a str.

Prohibiting 3 is an absolutely valid option (although it feels a bit like a false positive to me), and prohibiting 4 is also valid. What I propose is to leave this choice (sophisticated recursive re-checking vs. prohibiting 3 vs. prohibiting 4) up to the implementer of a particular typechecker.

The restrictions mentioned by Jukka are necessary because it is extremely difficult (probably impossible in principle) to check types inside extension modules, so we:
a) "forgive" typecheckers all uncaught TypeErrors related to this;
b) propose some restrictions for user code that nevertheless allow safety to be maintained.
At the same time, typecheckers should be able to catch the error you show whichever way they choose (i.e. if a typechecker is not able to, that could be considered a bug in that checker).

gvanrossum commented Jul 15, 2016

JukkaL commented Jul 15, 2016

The point about incremental checking is a good one. General performance could also be an important concern if people started using this more widely. My expectation is that this would mostly be used in fairly simple cases, but of course any feature given to users tends to be used in unforeseen ways, and once the genie is out of the bottle it's difficult to put it back. There could also be other interactions with particular language features that we haven't thought of yet.

Due to all the concerns and tricky interactions that we've uncovered, I'd like to see a prototype implementation of this before deciding whether to add this to the PEP -- I have a feeling that we don't yet understand this well enough.

ilevkivskyi commented Jul 27, 2016

After thinking more it looks like there are three important "use cases" for this feature:

  • The original request originates from typing extension modules;
  • Another use case is inheriting from library classes, where a user has some already working code, that redefines methods in a "theoretically" incompatible way;
  • A use case proposed by @JukkaL: pure mixin as a class that is not typechecked at all when defined, but only when inheriting from it.

What I see as important now is that the first two use cases actually do not require re-checking base classes (which is even impossible for built-in types, extension modules and library stubs); only the third one requires it. Let us focus on the first two cases here and discuss re-typechecking and related issues in #246.

A reworked proposal that covers two first cases is: Implementation is just a way to say "I know what I do" that is "softer" than Any. More details:

  1. At runtime Implementation[MyClass] is MyClass. All normal mechanisms work at runtime:
class Base:
    def __init__(self, number: int) -> None:
        self.number = number

class AnotherBase:
    ...

class Derived(Implementation[Base], AnotherBase):
    def __init__(self) -> None:
        super().__init__(42) # This works since Implementation[Base] is just Base
  2. The derived class is not considered a subtype of implementation base classes. An explicit cast might be used under the user's responsibility. Examples:
from typing import cast

def func(x: Base) -> None:
    ...
def another_func(x: AnotherBase) -> None:
    ...

func(Base(42))              # this is OK
another_func(Derived())     # this is also OK
func(Derived())             # Marked as error by a typechecker
func(cast(Base, Derived())) # Passes typecheck
  3. Methods of an implementation base class could be redefined in an incompatible way:
class Base:
    def method(self, x: int) -> None:
        ...

class AnotherBase:
    def another_method(self, y: int) -> None:
        ...

class Derived(Implementation[Base], AnotherBase):
    def another_method(self, y: str) -> None: ... # This is prohibited by a typechecker
    def method(self, x: str) -> None: ...         # Although, this is OK

No re-typecheck of base classes is performed; it is the user's responsibility to ensure that the new signature is safe.

  4. A typechecker uses the normal MRO to find the signatures of methods and uses them to typecheck their use in the derived class and elsewhere as usual. For example:
class Base:
    def method(self, x: int) -> None: ...

class AnotherBase:
    def another_method(self, y: int) -> None: ...

class Derived(Implementation[Base], AnotherBase):
    def method(self, x: str) -> None: ...

Derived().method('abc')         # OK
Derived().another_method('abc') # Fails, incorrect type of argument
Derived().third_method('abc')   # Fails, no such method
  5. Implementation could be used only with a proper class. Implementation[Union[t1, t2]], Implementation[Callable[[t1, t2, ...], tr]], etc. raise TypeError.

  6. People who use this feature probably want some speed, so the implementation of Implementation should be fast. It probably should not be a class itself, just an object whose class defines __getitem__ to return its argument.

@gvanrossum @dmoisset what do you think about this?
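The last point above (an object whose class defines __getitem__ returning its argument) can be prototyped in a few lines. A sketch, not an official typing construct:

```python
class _Implementation:
    """Sketch of the proposed marker: subscripting hands back the class
    itself, so the runtime class hierarchy is completely unchanged."""
    def __getitem__(self, cls):
        if not isinstance(cls, type):
            raise TypeError("Implementation[...] requires a proper class")
        return cls

Implementation = _Implementation()

class Base:
    def __init__(self, number: int) -> None:
        self.number = number

class Derived(Implementation[Base]):  # at runtime this is just class Derived(Base)
    def __init__(self) -> None:
        super().__init__(42)

print(Derived().number)  # -> 42
```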

gvanrossum commented Jul 28, 2016

It still sounds like too big a hammer. In python/mypy#1237 (comment) @philthompson10 wrote:

Is there a case for treating stub files differently, maybe by providing something like @overload that tells mypy that Bar.factory() is not an override of Foo.factory()?

From this it seems pretty clear that he doesn't want to completely sever the ties between Foo and Bar, only between the two unrelated factory() methods. (Though perhaps this is a job for # type: overload?)

dmoisset commented Jul 29, 2016

@gvanrossum I think it was more or less established that the initial example in #1237 (the Foo.factory return type definition) was unsound, and that case can already be solved with a #type: ignore at the redefinition point, without affecting the relationship between classes.

The other scenario presented later (the one about QBitmap) may be more interesting (I'm assuming those classes can be used differently).

To move away from examples that are not my own, I also found a similar one at https://github.com/django/django/blob/stable/1.10.x/django/utils/datastructures.py#L48 ; there you have a MultiValueDict class that inherits from dict because the implementation of dict is good for it, but it has a slightly different API (actually, it has the API of a Dict[K, V] extended, but the implementation of a Dict[K, List[V]]). That's fine because it's not intended to replace a dict; it's just a new data structure to be used separately.

In this situation, I think a proposal like the last one from @ilevkivskyi would work. There is some potential for unsoundness (essentially every reference to self in the inherited class can be subject to a broken call), but the alternatives here are using Any or # type: ignore, which can also lead to runtime errors. The point is to be able to explicitly say "this inherits from dict, but it's not something you can use in place of a dict", have that checked, and be allowed to break the dict contract in a class that is not really a dict.
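For reference, the Django class's behaviour can be reduced to a stripped-down sketch (not Django's actual code) showing why it is not substitutable for a plain dict: lookups return a single value while the storage holds lists:

```python
class SimpleMultiValueDict(dict):
    """Stripped-down sketch of the MultiValueDict idea: storage is
    dict[K, list[V]], but item access behaves like dict[K, V] by
    returning one value per key."""
    def __getitem__(self, key):
        return super().__getitem__(key)[-1]  # last value, not the list

    def getlist(self, key):
        return super().__getitem__(key)

d = SimpleMultiValueDict({'name': ['Adrian', 'Simon']})
print(d['name'])          # -> Simon
print(d.getlist('name'))  # -> ['Adrian', 'Simon']
```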

ilevkivskyi commented Jul 30, 2016

@gvanrossum There are two questionable things in my proposal:

  • per-class vs. per-method specification of implementation inheritance
  • whether or not the derived class is assumed to be a subtype of its base

While I am not 100% sure that it should be per class, I still think that is less verbose, since in such cases people will want to override more than one method, as in the django example by @dmoisset .

Concerning the second question, I think it is better not to treat the derived class as a subtype of the base. First, the subtype assumption could lead to many uncaught errors. Second, if someone needs to allow the subclass as well, a Union could easily be used to type such situations:

class Base: ...
class Derived(Implementation[Base]): ...

def fun_1(x: Base): # this function uses methods that are overridden in Derived
    ...

def fun_2(x: Derived): # this one too
    ...

def fun_3(x: Union[Base, Derived]): # incompatibly overridden methods are not used here
    ...

If we want to keep the subtype relationship, then we need to re-typecheck the bases, which is quite complex and time-consuming, as we already agreed. I think this should be left to #246, where it matters. @JukkaL what do you think?

By contrast, here I propose a simple scheme that covers extension modules, built-ins, library stubs, etc. Here I agree with @dmoisset :

The point here is to be able to explicitly say "this inherits from dict, but it's not something you can use in place of a dict", and have that checked

gvanrossum commented Aug 1, 2016

Hm... The QBitmap example only has one method, so it's still not very enlightening. The MultiValueDict example seems more helpful, except that it seems to override all the methods -- so why not make the underlying dict an instance attribute? That might even be quicker than all those super() calls.

I also still have some questions regarding the specification of Implementation[Base]. Suppose:

class Base:
    def foo(self, a: int) -> int: ...
class Derived(Implementation[Base]):
    def bar(self) -> int:
        return self.foo(12)

Is the call to self.foo(12) inside bar() valid? What about Derived().foo(12)? Python itself certainly has no problem with those calls. But type-checking them might be tricky if we say that Derived is not a subtype of Base -- by what other mechanism would a checker consider those calls valid?

A similar case would be if Derived did override foo(), and its foo() called super().foo(). Should that work?

I suppose the answer would be that those calls are valid, and that the checker should still consider Base a member of the MRO of Derived.

But that answer still leaves me unhappy for two reasons. First, the checking of Derived itself feels too lax: anything that happens to override a method of Base gets a free pass, whether intended or not. I would rather see a proposal that requires conflicting method overrides to be explicitly flagged (with something more specific than # type: ignore).

Second, despite Daniel's assurance that it's just a convenient implementation device, I'm pretty sure that the users of MultiValueDict would be unhappy if a MultiValueDict couldn't be passed into code expecting a dict (or perhaps a Mapping or MutableMapping), since it does in fact implement those interfaces (AFAICT). The name suggests so, too. Maybe that could be solved by making it inherit from MutableMapping as well, e.g.

K = TypeVar('K')
V = TypeVar('V')
class MultiValueDict(Implementation[Dict[K, List[V]]], MutableMapping[K, V]):
    ...

Does all of this work as expected?
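The composition alternative mentioned above (keeping the dict as an instance attribute instead of a base class) sidesteps the subtyping question entirely. A hypothetical sketch (class and method names are illustrative, not from Django):

```python
from typing import Dict, Generic, List, TypeVar

K = TypeVar('K')
V = TypeVar('V')

class ComposedMultiValueDict(Generic[K, V]):
    """Composition instead of inheritance: the dict is an attribute,
    so no subtype relation with dict exists and no LSP conflict arises."""
    def __init__(self) -> None:
        self._data: Dict[K, List[V]] = {}

    def append(self, key: K, value: V) -> None:
        self._data.setdefault(key, []).append(value)

    def __getitem__(self, key: K) -> V:
        return self._data[key][-1]  # dict[K, V]-flavoured access

    def getlist(self, key: K) -> List[V]:
        return self._data[key]

m: ComposedMultiValueDict[str, int] = ComposedMultiValueDict()
m.append('x', 1)
m.append('x', 2)
print(m['x'])  # -> 2
```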

ilevkivskyi commented Aug 1, 2016

@gvanrossum

anything that happens to override a method of Base gets a free pass, whether intended or not. I would rather see a proposal that requires that conflicting method overrides are explicitly flagged

OK, I would propose a dummy decorator here, maybe named @override, that allows incompatibly overriding a method of an Implementation base:

class BaseObj:
    def foo(self) -> str: ...
    def bar(self) -> object: ...

class BaseInt:
    def bar(self) -> int: ...

class Derived(Implementation[BaseInt], BaseObj):
    @override
    def foo(self) -> int: ...  # error, @override only allowed with Implementation bases
    @override
    def bar(self, x: int) -> object: ...
    # also an error, the new signature must be compatible with non-Implementation bases
    @override
    def bar(self) -> str: ...  # this one is OK

I'm pretty sure that the users of MultiValueDict would be unhappy if a MultiValueDict couldn't be passed into code expecting a dict (or perhaps a Mapping or MutableMapping), since it does in fact implement those interfaces

I think it is better not to modify anything at runtime (your proposal might change MRO in some cases). I propose to list supertypes that one wants to keep as a keyword argument (that will accept a single class or a tuple of classes):

K = TypeVar('K')
V = TypeVar('V')
class MultiValueDict(Implementation(Dict[K, List[V]], keep=MutableMapping)):
    ...

I don't think that we need to repeat the type variables in supertype, they can be inferred automatically.

If we want to use the keyword, then we are forced to use (...) instead of [...] for Implementation. I don't know whether it is good or bad.

Alternatively, we can use syntax like this:

K = TypeVar('K')
V = TypeVar('V')
class MultiValueDict(Implementation[dict, MutableMapping[K, List[V]]]):
    ...

and say that Implementation takes runtime info from the first argument, and static type info from the optional second argument (that could be a class, or a list of classes).
@gvanrossum what do you think?


gvanrossum (Member) commented Aug 1, 2016

dmoisset commented Aug 1, 2016

@gvanrossum Regarding "Is the call to self.foo(12) inside bar() valid?": for me it's obviously yes, given that the purpose is to actually get the methods in the subclass (if you don't get the methods and don't get the subtypes, the classes might as well be unrelated).

Also your statement «I'm pretty sure that the users of MultiValueDict would be unhappy if a MultiValueDict couldn't be passed into code expecting a dict» looks perfectly valid, but to make it 100% accurate it should be expressed with the generic params: «I'm pretty sure that the users of MultiValueDict[str, Foo] would be unhappy if a MultiValueDict[str, Foo] couldn't be passed into code expecting a Dict[str, List[Foo]]», and that's where the problem becomes apparent (because that is something this implementation doesn't allow, and it would be nice if the checker detected it).

This is a somewhat contrived case because both interface and implementation are inherited, but from two different instantiations. However, I don't think this is a rare case; on a quick inspection I found many instances of this situation around dict in the Python stdlib (http.cookies.BaseCookie, http.cookies.Morsel, and logging.config.ConvertingDict are dicts of one type by implementation, but their interfaces show them as dicts of another); another situation is collections.Counter, which redefines update() with an incompatible signature. There are also some cases around builtin types (xml.dom.minicompat redefines tuple.__add__ incompatibly).

On custom classes I didn't find many examples in public code that I know well; an additional one from the stdlib is http.client.HTTPMessage which inherits email.message.EmailMessage to reuse header parsing functionality (this is perhaps the most obvious and simple case of "inheritance of implementation but not subtyping").

(just read newest message on the issue while I was writing this comment) I understand that this might not be a priority, I might update this issue with other examples I find in the future in case it helps a future discussion.

For me the original (simpler) version was good enough. It is not a free pass for calls in Derived (they are perfectly checked). It has some weakness on inherited-but-not-overridden references to self (including uses of super), but the added weakness is better than a # type: ignore: it prevents errors while respecting developer intention (using a value as a subtype when not intended/valid as such). I wouldn't add the complexities of the @override decorator, options to keep supertypes, etc., since they don't prevent any problems that don't already exist with the current approach (i.e. # type: ignore).


gvanrossum (Member) commented Aug 1, 2016

OK, I think we actually agree on a kernel of behavior for Implementation[Base], and maybe we can agree that further complexities are just a distraction. But honestly I would rather hear from people who find they need this for their own code that they are annotating -- I know that if I had a class like MultiValueDict I would just rewrite it so that it doesn't inherit from dict at all.


PetterS commented Aug 11, 2016

I need it for the following scenario. Consider an optimization solver written in Python:

optimization_problem = Problem()
x = optimization_problem.add_variable()
y = optimization_problem.add_variable()
optimization_problem.add_constraint(x + y == 2)

This requires that the type of x defines a custom __eq__ returning something other than bool. This causes an error in mypy:

error: Return type of "__eq__" incompatible with supertype "object"

I think this use case is legitimate. There is a linear programming library doing this.
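To make the scenario concrete, here is a minimal runnable sketch of the pattern (all names hypothetical, not taken from any particular library): `__eq__` builds a constraint object instead of returning bool.

```python
class Constraint:
    def __init__(self, lhs, rhs):
        self.lhs, self.rhs = lhs, rhs

class Expr:
    def __init__(self, terms):
        self.terms = terms  # e.g. {'x': 1, 'y': 1}

    def __add__(self, other):
        terms = dict(self.terms)
        for name, coef in other.terms.items():
            terms[name] = terms.get(name, 0) + coef
        return Expr(terms)

    def __eq__(self, rhs):  # incompatible with object.__eq__, which returns bool
        return Constraint(self, rhs)

x = Expr({'x': 1})
y = Expr({'y': 1})
c = (x + y == 2)  # c is a Constraint, not a bool
```

Note that defining `__eq__` this way also makes the class unhashable unless `__hash__` is defined explicitly, which is a separate cost of the pattern.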


gvanrossum (Member) commented Aug 11, 2016

I see the point, but what benefit do you get from type-checking such code? Wouldn't it be rowing upstream because everything is overloaded in a funny way?

In the meantime, if you put a # type: ignore on your definitions of __eq__, does the rest work?


PetterS commented Aug 11, 2016

I guess the benefit would be that code like

optimization_problem.add_constraint("abc")

etc. would be marked as an error. I haven’t actually implemented this completely, so I don’t know whether it would be rowing upstream. I am still very new to this (installed 3.5 + mypy today).

Yeah, # type: ignore and @no_type_check both silences the error. I assume the type annotations are still used elsewhere.


gvanrossum (Member) commented Aug 12, 2016

ilevkivskyi (Collaborator) commented Aug 19, 2016

@gvanrossum

But honestly I would rather hear from people who find they need this for their own code that they are annotating

The situation described by @PetterS is quite widespread in the scientific community. I have done such things several times. If one adds # type: ignore to

class Expr:
    def __eq__(self, other: 'Expr') -> 'Expr':
        ...

then this method will be typed as Callable[[Any], Any]. That will lead to two kinds of errors:

1. If Expr is used where object is expected:

def eq_one(x: object) -> bool:
    return x == 1
eq_one(Expr())  # error here

2. If something other than Expr is passed to its __eq__ method:

Expr() == 'boom!'  # error here

Both kinds of errors are not caught by mypy now, but will be caught with Implementation:

class Expr(Implementation[object]):
    def __eq__(self, other: 'Expr') -> 'Expr':
        ...
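The first kind of error does bite at runtime. A runnable sketch (with a stripped-down, hypothetical Expr) of how code written against plain object semantics silently receives a non-bool:

```python
class Expr:
    def __eq__(self, other):
        # Builds a new expression instead of comparing
        return Expr()

def eq_one(x: object) -> bool:
    return x == 1  # annotated as returning bool, but...

result = eq_one(Expr())
print(type(result).__name__)  # Expr, not bool -- the bug mypy misses here
```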

Of course I could not judge how widespread are such situations in general. Maybe we could ask others?
@vlasovskikh Have you heard about such situations from users of PyCharm?


pkch commented Apr 3, 2017

I wanted to add class constructors to the list of issues to clear up in this discussion. At the moment, they receive a confusing treatment:

# example 1
class A:
    def f(self, x: Sequence) -> None: ...

class B(A):
    # error: Argument 1 of "f" incompatible with supertype "A"
    def f(self, x: List) -> None: ...

but

# example 2
class A:
    def __init__(self, x: Sequence) -> None: ...

class B(A):
    # passes type check
    def __init__(self, x: List) -> None: ...

OTOH, class creation passes type check:

# example 3
class A: 
    def __init__(self, a: int) -> None:
        self.a = a
def f(a: Type[A]) -> A:
    return a(0)
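The combination of (2) and (3) is unsound at runtime: a subclass with an incompatible `__init__` passes where `Type[A]` is expected, but calling it fails. A runnable sketch (the classes are hypothetical extensions of example 3):

```python
from typing import Type

class A:
    def __init__(self, a: int) -> None:
        self.a = a

class B(A):
    # Incompatible __init__; unchecked, as in example 2
    def __init__(self, a: int, b: int) -> None:
        super().__init__(a)
        self.b = b

def f(cls: Type[A]) -> A:
    return cls(0)

print(f(A).a)  # works: 0
try:
    f(B)  # B is accepted as Type[A], yet the call raises
except TypeError as err:
    print('TypeError:', err)
```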

The only difference between constructors and other methods is that they affect Liskov substitutability of different objects: constructors affect the classes, while other methods affect the instances of those classes.

My preference would be to do what this thread is proposing - let the programmer easily choose whether the Liskov constraint (= subtyping) should apply, regardless of subclassing; so __init__ could be either constrained or not. If it's constrained, then example (2) should fail the type check but (3) should pass; if it's not constrained, then (2) should pass but (3) should fail.

BTW, another argument in favor of separating subtyping from subclassing is that substitutability only makes sense if the semantics are compatible -- it's not enough that the methods have matching types; for example:

  • it's rarely ok to pass iterators in place of iterables (despite the subclass relationship) because they might be exhausted (leading to incorrect behavior)
  • it's dangerous to pass an x: defaultdict instead of an x: dict because there might be a try: dostuff(x[k]) except KeyError: ... in the code
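The defaultdict hazard is easy to demonstrate at runtime (a small sketch; `describe` is a hypothetical helper that relies on KeyError to detect missing keys):

```python
from collections import defaultdict

def describe(d: dict, k: str) -> str:
    try:
        return 'found: %r' % (d[k],)
    except KeyError:
        return 'missing'

plain = {'a': 1}
dd = defaultdict(int, plain)  # a dict subclass, so it type-checks as dict

print(describe(plain, 'b'))  # missing
print(describe(dd, 'b'))     # found: 0 -- KeyError is never raised
```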

pkch commented Apr 7, 2017

Let's say Y inherits from X (through regular class Y(X) inheritance, not through ABCMeta.register).

Currently:

  • Automatically, Y is a subtype of X and type safety is ensured (i.e., subclass method compatibility is checked).
  • Automatically, Type[Y] is a subtype of Type[X], but type safety is not ensured (i.e., subclass __init__ compatibility is not checked).

The proposal so far is:

  • Programmer can choose whether or not Y should be a subtype of X. If he chooses to create this subtyping relationship, then type safety is ensured (same as at present).

I think for consistency, the second part of the proposal should be:

  • Programmer can choose whether or not Type[Y] should be a subtype of Type[X]. If he chooses to create this subtyping relationship, then type safety is ensured (i.e., subclass __init__ compatibility is checked, unlike at present).

gvanrossum (Member) commented Apr 7, 2017

Can you please use words instead of symbols? I can never remember which way <: or :> goes. (It's fine to edit the comment in place if GitHub lets you.)


pkch commented Apr 7, 2017

Done. To summarize, I think X.__init__ is relevant for type checking of class objects (type Type[X]) and irrelevant for type checking of instance objects (type X). My rationale is that:

  • X.__init__ signature defines the details of the Callable type that is used to represent class X, and that Callable type is nearly equivalent to Type[X]
  • while technically a programmer can call __init__ on an instance: X().__init__(), there are no good use cases for that

gvanrossum (Member) commented Apr 8, 2017

Ah, that summary is very helpful. I agree that one shouldn't call __init__ manually (and in fact Python always reserves the right to have "undefined" behavior when you define or use dunder names other than documented).

But of course Type[X] has a bunch of behaviors that Callable[..., X] does not (e.g. it has class methods).


mitar (Contributor) commented Sep 26, 2017

I have tried to read through this whole thread and I am maybe missing something, but I think I have a slight variation/use case to share to add to this discussion.

If you are familiar with sklearn, you know that it provides many different machine learning classes. It is great because many classes share the same interface, so it is easy for programmers to use them. The issue is that while the method names are the same, the arguments they receive vary greatly. If I wanted to define types for them in a meaningful way, there would be a problem. The base class might be something like:

class Estimator(Generic[Inputs, Outputs]):
    @abstractmethod
    def fit(self, *, inputs: Inputs, outputs: Outputs) -> None:
        pass

    @abstractmethod
    def predict(self, *, inputs: Inputs) -> Outputs:
        pass

class SupervisedEstimator(Estimator[Inputs, Outputs]):
    pass

class UnsupervisedEstimator(Estimator[Inputs, Outputs]):
    @abstractmethod
    def fit(self, *, inputs: Inputs) -> None:
        pass

class GeneratorEstimator(Estimator[List[None], Outputs]):
    @abstractmethod
    def fit(self, *, outputs: Outputs) -> None:
        pass

class LupiSupervisedEstimator(SupervisedEstimator[Inputs, Outputs]):
    @abstractmethod
    def fit(self, *, inputs: Inputs, privileged_inputs: Inputs, outputs: Outputs) -> None:
        pass

So, somehow the idea is that Estimator defines which methods are available and their general arguments, but subclasses can then define their own combination of arguments for their methods. And subclasses of those then have to uphold the Liskov Substitution Principle. But the relation between Estimator and its direct subclasses should not matter, because it should be clear that if you are expecting a base Estimator, you may know which methods you can use and what their semantics are, but you have to inspect the required arguments at runtime for the instance you have.

It seems this is not possible to express with current Python typing so that mypy would not complain?


JukkaL (Contributor) commented Sep 26, 2017

@mitar Can you give a more complete example which illustrates the errors generated by mypy?


gvanrossum (Member) commented Sep 26, 2017

It seems this is not possible to express with current Python typing so that mypy would not complain?

I came up with an example that seems to work:

from typing import *
T = TypeVar('T')

class Base:
    fit: Callable

class Foo(Base):
    def fit(self, arg1: int) -> Optional[str]:
        pass

class Bar(Foo):
    def fit(self, arg1: float) -> str:
        pass

Maybe this pattern helps? This seems to be about the best you can do given that you don't want the base class to specify the method signatures.
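Filling in bodies, the pattern runs as expected (a sketch based on the example above; the string bodies are placeholders): the base class types `fit` only as a bare `Callable`, so subclasses may give it unrelated signatures without triggering an override error.

```python
from typing import Callable, Optional

class Base:
    fit: Callable  # bare Callable: subclasses choose their own signature

class Foo(Base):
    def fit(self, arg1: int) -> Optional[str]:
        return str(arg1)

class Bar(Foo):
    # Widens the argument (int -> float) and narrows the return
    # (Optional[str] -> str), both allowed under normal override rules.
    def fit(self, arg1: float) -> str:
        return '%.1f' % arg1

print(Foo().fit(3))    # 3
print(Bar().fit(2.5))  # 2.5
```

The trade-off is that calls through a variable typed as `Base` get no argument checking at all, since `fit` there is an untyped Callable.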


mitar (Contributor) commented Oct 12, 2017

Interesting. I never thought of just specifying Callable. This is very useful.

My ideal typing semantics would be: this is a method signature for fit with arguments inputs and outputs. If your subclass's fit uses these arguments, they have to match the base fit's types, but they can also be omitted (by not listing them in the subclass's fit arguments).

Sadly, Optional is not useful here, because it is not that a subclass is saying you can pass an argument or not, but that you either always have to pass it or can never pass it. (So different subclasses would choose one or the other.)

Then consumers of those objects could check and say: does this method accepts inputs, if yes, then I know what is the type and semantics of this argument. And same for outputs.

