Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

__eq__'s type hint for primitives accept object while in many cases it returns NotImplemented #8217

Closed
mmohaveri opened this issue Jun 30, 2022 · 13 comments

Comments

@mmohaveri
Copy link

As can be seen in builtins.pyi, all primitives are defining their __eq__ and __ne__ functions in the following manner:

    def __eq__(self, __x: object) -> bool: ...
    def __ne__(self, __x: object) -> bool: ...

This will cause incorrect type errors in cases where the second operand has overloaded __eq__ or __ne__ in a way that it does not return bool. Consider the following example:

T = TypeVar("T", float, bool)

class Matrix(Generic[T]):
  def __init__(self, data: List[List[T]]):
    self.data = data

  def __eq__(self, other: "Matrix[T] | float | bool") -> Matrix[bool]:
    if not isinstance(other, Matrix):
      return Matrix([[elem == other for elem in row] for row in self.data])
    return Matrix((elem == other_elem for elem, other_elem in zip(row1, row2)] for row1, row2 in zip(self.data, other.data)]

  def __ne__(self, other: "Matrix[T] | float | bool") -> Matrix[bool]: ...

Here we're trying to create a class that mimics numpy's NDArray in a very limited way.

Now if I try to check the equality of a Matrix with a float, the return type will be inferred as Matrix, since LHS of the operation has a matching __eq__ method which returns matrix (and the returned type in runtime is Matrix as well)

But if I try to check the equality of a float with a Matrix, the returned type will be inferred as bool because LHS of the operation has a compatible __eq__ (def __eq__(self, __x: object) -> bool, object and Matrix ar compatible) while in the runtime because float's eq will return a NotImplemented, interpreter will try to run the __ne__ function of the RHS which returns a Matrix.

This will create situations where the inferred type of a simple equality check operation does not match the runtime's type, all because
primitives stubs has been defined in a way that is not representative of their runtime behavior.

@srittau
Copy link
Collaborator

srittau commented Jun 30, 2022

This is indeed problematic and the natural solution would be an overloads returning NotImplemented. In the case of float (where FloatLike would have to replaced with something appropriate):

@overload
def __eq__(self, other: FloatLike) -> bool: ...
@overload
def __eq__(self, other: object) -> NotImplemented: ...

That said, we haven't returned NotImplemented in typeshed yet and I'm not sure that all type checkers even support this. It would be worth a try in an exploratory PR. Or maybe we should discuss it on typing-sig first.

@AlexWaygood
Copy link
Member

AlexWaygood commented Jun 30, 2022

Or maybe we should discuss it on typing-sig first.

I would be opposed to making changes like this in typeshed without having a discussion on typing-sig first.

This would be a large change to make. My instinct is to say that we shouldn't be making allowances for the ways in which numpy/pandas (ab)use __eq__ and __ne__ methods, since the way these methods are written in numpy/pandas usually violates LSP. It's probably (well, almost certainly) too late to change that at this point, but I still don't feel like we should really be bending over backwards to accommodate it.

@srittau
Copy link
Collaborator

srittau commented Jun 30, 2022

This is not just a numpy problem:

class Floaty:
    def __init__(self, f: float) -> None:
        self._f = f

    def __eq__(self, other: object):
        if not isinstance(other, float):
            return NotImplemented
        return other == self._f
    

print(Floaty(3.0) == 3.0)  # True
print(3.0 == Floaty(3.0))  # True

This is due to how __eq__ etc. work: Docs. If the left side comparison returns NotImplemented, the right side comparison is used instead.

@AlexWaygood
Copy link
Member

This is due to how __eq__ etc. work: Docs. If the left side comparison returns NotImplemented, the right side comparison is used instead.

I know, but in general, this doesn't really matter that much, since in general, the only relevant thing to consider is whether the result of __eq__ is truthy or falsey, and in general, bool() can be called on literally any object in Python. As the docs that you linked to state:

By convention, False and True are returned for a successful comparison.

I think that it is useful to continue to enforce this convention in the stubs; in general, doing otherwise is surprising behaviour that I would consider to be a code smell. It is only with numpy/pandas where the precise return type of __eq__ or __ne__ matters, since it is only numpy/pandas that insist on violating this convention and (often) return objects from these methods that raise exceptions if you try to call bool() on them.

@mmohaveri
Copy link
Author

@srittau

we haven't returned NotImplemented in typeshed yet and I'm not sure that all type checkers even support this. It would be worth a try in an exploratory PR

We dont' even need the def __eq__(self, other: object) -> NotImplemented: ... one, I know that mypy and pyright allow returning NotImplemented when the return type is something else. That's because that's an expected behavior for a function that has been passed a wrong input type.

@AlexWaygood

My instinct is to say that we shouldn't be making allowances for the ways in which numpy/pandas (ab)use eq and ne methods, since the way these methods are written in numpy/pandas usually violates LSP

I understand your point of view regarding LSP, but the problem you're describing does make writing scientific codes with python much much easier and human understandable, and as a programming language with a large scientist users, I think in these cases it's really ok, and even IMO encouraged to break the LSP.

since in general, the only relevant thing to consider is whether the result of eq is truthy or falsey

Again, that's because you're ignoring the scientific use of the language. Imagine you're creating a library that simplifies SIMD parallelization by defining an IF function that takes a list of bool and two data list and interlace the two list based on the boolean list. In this case it makes sense for the __eq__ operator to return something other than a boolean value.

Finally, My main problem is that the aforementioned type definitions are WRONG. A type definition SHOULD describe the correct ways a function can be used, and a scenario in which the function returns NotImplemented is by definition the wrong usage, and should not be included in the type hint.

@AlexWaygood
Copy link
Member

Again, that's because you're ignoring the scientific use of the language.

I understand that scientific users of the language represent a major portion of the Python community, and I don't want to disregard that :)

I remain of the opinion that this would be a very significant change to make, which would affect the Python-typing community at large, and that we should therefore not make any changes in this area without first discussing it on the typing-sig mailing list, so that we can get a broader range of feedback.

@mmohaveri
Copy link
Author

I remain of the opinion that this would be a very significant change to make, which would affect the Python-typing community at large, and that we should therefore not make any changes in this area without first discussing it on the typing-sig mailing list, so that we can get a broader range of feedback.

Although I don't think it affects python-typing community at large (as it does not break any code that's working correctly and only allows some other correct uses of the operation to be included in the types), but I understand your point of view. I just don't know how to discuss this issue on the typing-sig mailing list. If that's something that I can do I'll be happy to write my opinion about it there and ask for feedback on it, just let me know where and how I can join and do that.

@AlexWaygood
Copy link
Member

https://mail.python.org/archives/list/typing-sig@python.org/

@headtr1ck
Copy link

I have created https://discuss.python.org/t/make-type-hints-for-eq-of-primitives-less-strict/34240
Feel free to continue discussing over there :)

@Avasam
Copy link
Sponsor Collaborator

Avasam commented Oct 1, 2023

That said, we haven't returned NotImplemented in typeshed yet and I'm not sure that all type checkers even support this.

FWIW, I've been using -> NoReturn | <expected_subclass_return_type>: ...

@AlexWaygood
Copy link
Member

If you think about Python's type hints in terms of set theory, NoReturn is equivalent to the empty set -- the bottom type, a type with no members -- so a union with NoReturn is essentially always redundant. NoReturn | int will be internally simplified by the type checker to just int, for example.

@Avasam
Copy link
Sponsor Collaborator

Avasam commented Oct 1, 2023

If you think about Python's type hints in terms of set theory, NoReturn is equivalent to the empty set -- the bottom type, a type with no members -- so a union with NoReturn is essentially always redundant. NoReturn | int will be internally simplified by the type checker to just int, for example.

I can see that after testing it quickly. Pylance language server follows that principle and it makes sense. Not sure why I was under the impression it worked differently. nvm my previous comment then.

I only found one place where I added a NoReturn union. I'll go fix that up. And I'll follow that typing-sig discussion as it seems -> NotImplemented | type: ... is what I expected

@srittau
Copy link
Collaborator

srittau commented Nov 21, 2023

I think all relevant points have been made. __eq__'s standard semantics are to return bool. If a library overloads the standard semantics, using # type: ignore seems reasonable. It's not necessary to add NotImplemented to the return types as this is special-cased by type checkers.

(Please reopen, if another maintainer disagrees.)

@srittau srittau closed this as not planned Won't fix, can't repro, duplicate, stale Nov 21, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants