STYLE: fix pylint unneeded-not warnings #49382

Moisan · 2022-10-29T01:58:42Z

Related to #48855

Tests added and passed if fixing a bug or adding a new feature
All code checks passed.
Added type annotations to new arguments/methods/functions.
Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

I added # pylint: disable=unneeded-not for tests where the intention was clearly to test the not operator.

mzeitlin11

Thanks @Moisan, nice catch disabling those purposeful not usages instead of removing!

MarcoGorelli · 2022-10-29T10:28:39Z

pandas/tests/arrays/interval/test_interval.py

-    assert not p1 == p3
+    assert not p1 == p3  # pylint: disable=unneeded-not
    assert hash(p1) == hash(p2)
-    assert not hash(p1) == hash(p3)
+    assert hash(p1) != hash(p3)


just for my understanding, why does the first one need the check turning off, but the second one can be changed to LSH != RHS?

I may have been a bit too quick on that one. The first one seems to be a bug either on Pandas or Pyarrow side:

>>> p1 = ArrowIntervalType(pa.int64(), "left") >>> p3 = ArrowIntervalType(pa.int64(), "right") >>> p1 == p3 False >>> p1 != p3 False >>> p1.equals(p3) False

I looked trough the ArrowIntervalType implementation and the pyarrow.ExtensionType source, and I cannot pinpoint why p1 != p3 would return False.

🤔 @jorisvandenbossche is this a bug? Does ArrowIntervalType need something like

def __ne__(self, other): return not self.__eq__(other)

?

cc @WillAyd in case you have comments on this

Yes, that seems a bug in ArrowIntervalType, and probably needs such a __ne__ (for example the base ExtensionDtype also has such a method)

Yea very interesting. My guess is that this stems from pyarrow somewhere

>>> getattr(pandas.core.arrays.arrow.extension_types.ArrowIntervalType, "__ne__") <slot wrapper '__ne__' of 'pyarrow.lib.ExtensionType' objects>

If you traverse the mro its interesting to see this change through the inheritance hierarchy:

for cls in pandas.core.arrays.arrow.extension_types.ArrowIntervalType.__mro__: print(cls, getattr(cls, '__ne__')) <class 'pandas.core.arrays.arrow.extension_types.ArrowIntervalType'> <slot wrapper '__ne__' of 'pyarrow.lib.ExtensionType' objects> <class 'pyarrow.lib.ExtensionType'> <slot wrapper '__ne__' of 'pyarrow.lib.ExtensionType' objects> <class 'pyarrow.lib.BaseExtensionType'> <slot wrapper '__ne__' of 'pyarrow.lib.DataType' objects> <class 'pyarrow.lib.DataType'> <slot wrapper '__ne__' of 'pyarrow.lib.DataType' objects> <class 'pyarrow.lib._Weakrefable'> <slot wrapper '__ne__' of 'object' objects> <class 'object'> <slot wrapper '__ne__' of 'object' objects>

Interestingly enough I didn't see anything in Arrow's build system that would set the language_level for the Cython extension to 3. We do this in setup.py, I think they would ideally do it in their CMake file.

So if its a matter of 2/3 compat with the generated source file could be a nice upstream patch:

https://github.com/apache/arrow/blob/8e3a1e1b7c17213eb8b79fba49c496de6199fd50/cpp/cmake_modules/UseCython.cmake#L138

@lithomas1

That said for the scope of this PR I think should just ignore

Interestingly enough I didn't see anything in Arrow's build system that would set the language_level for the Cython extension to 3.

I don't think that matters; if you run Python 3, it has the effect that the default __ne__ will use __eq__ automatically.

But what is happening here is that pyarrow.lib.ExtensionType is a cython class that implements __eq__, and cython defaults to implement a richcompare slot that also adds __ne__ based on the __eq__ of that class. So just like when subclassing a builtin, if you override __eq__ you also need to override __ne__ when subclassing a cython class with a __eq__.
This also happens in pandas, for example we have PeriodDtypeBase in cython with __eq__, and that initially caused the same issue (#37265), and that was fixed by adding __ne__ in the python subclass

Now, we could maybe fix this on the PyArrow side as well by adding an explicit __ne__ that is defined to be return not self == other, which might ensure that subclasses would only need to override __eq__, because the parent __ne__ might then work correctly (something to test).
(but regardless of that we should fix this on the pandas side)

Thanks for your investigations

If this can be resolved as simply as adding

def __ne__(self, other): return not self.__eq__(other)

(with type annotations), then I'd suggest we do it directly in this PR

The primary motivation for using linters and all it to help catch issues, and I'm glad pylint has helped find this one - I'd suggest it just be fixed in this PR

MarcoGorelli · 2022-11-03T16:55:33Z

pandas/tests/arrays/interval/test_interval.py

-    assert not p1 == p3
+    assert not p1 == p3  # pylint: disable=unneeded-not
    assert hash(p1) == hash(p2)
-    assert not hash(p1) == hash(p3)
+    assert hash(p1) != hash(p3)


cc @WillAyd in case you have comments on this

MarcoGorelli · 2022-11-06T12:09:43Z

pandas/tests/arrays/interval/test_interval.py

-    assert not p1 == p3
+    assert not p1 == p3  # pylint: disable=unneeded-not
    assert hash(p1) == hash(p2)
-    assert not hash(p1) == hash(p3)
+    assert hash(p1) != hash(p3)


Thanks for your investigations

If this can be resolved as simply as adding

def __ne__(self, other): return not self.__eq__(other)

(with type annotations), then I'd suggest we do it directly in this PR

The primary motivation for using linters and all it to help catch issues, and I'm glad pylint has helped find this one - I'd suggest it just be fixed in this PR

jorisvandenbossche · 2022-11-07T07:24:58Z

pandas/core/arrays/arrow/extension_types.py

@@ -35,6 +35,9 @@ def __eq__(self, other):
        else:
            return NotImplemented

+    def __ne__(self, other) -> bool:
+        return not self.__eq__(other)


Suggested change

return not self.__eq__(other)

return not self == other

Based on the links I read about this, they recommended using this instead of __eq__ (https://stackoverflow.com/questions/4352244/should-ne-be-implemented-as-the-negation-of-eq)

jorisvandenbossche · 2022-11-07T07:25:18Z

pandas/core/arrays/arrow/extension_types.py

@@ -91,6 +94,9 @@ def __eq__(self, other):
        else:
            return NotImplemented

+    def __ne__(self, other) -> bool:
+        return not self.__eq__(other)


Suggested change

return not self.__eq__(other)

return not self == other

MarcoGorelli

Looks good to me pending approval from Will and Joris, thanks @Moisan !

mroeschke · 2022-11-17T00:52:05Z

Awesome, thanks again @Moisan

* STYLE: fix pylint unneeded-not warnings * fixup! STYLE: fix pylint unneeded-not warnings * Add __ne__ method to ArrowPeriodType and ArrowIntervalType * fixup! Add __ne__ method to ArrowPeriodType and ArrowIntervalType

mzeitlin11 added the Code Style Code style, linting, code_checks label Oct 29, 2022

mzeitlin11 approved these changes Oct 29, 2022

View reviewed changes

mzeitlin11 added this to the 2.0 milestone Oct 29, 2022

MarcoGorelli reviewed Oct 29, 2022

View reviewed changes

MarcoGorelli requested changes Nov 3, 2022

View reviewed changes

MarcoGorelli requested changes Nov 6, 2022

View reviewed changes

Moisan added 3 commits November 6, 2022 16:11

STYLE: fix pylint unneeded-not warnings

6f49510

fixup! STYLE: fix pylint unneeded-not warnings

48a19e8

Add __ne__ method to ArrowPeriodType and ArrowIntervalType

89185f5

Moisan force-pushed the pylint-unneeded-not branch from 549c23d to 89185f5 Compare November 6, 2022 21:11

jorisvandenbossche reviewed Nov 7, 2022

View reviewed changes

MarcoGorelli mentioned this pull request Nov 7, 2022

DEPR: enforce a bunch #49552

Merged

5 tasks

fixup! Add __ne__ method to ArrowPeriodType and ArrowIntervalType

b1d188d

MarcoGorelli approved these changes Nov 7, 2022

View reviewed changes

mroeschke approved these changes Nov 17, 2022

View reviewed changes

mroeschke merged commit 694c0cd into pandas-dev:main Nov 17, 2022

Moisan deleted the pylint-unneeded-not branch November 17, 2022 20:36

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

STYLE: fix pylint unneeded-not warnings #49382

STYLE: fix pylint unneeded-not warnings #49382

Moisan commented Oct 29, 2022

mzeitlin11 left a comment

MarcoGorelli Oct 29, 2022

Moisan Oct 29, 2022 •

edited

MarcoGorelli Oct 30, 2022

MarcoGorelli Nov 3, 2022

jorisvandenbossche Nov 3, 2022

WillAyd Nov 3, 2022

WillAyd Nov 3, 2022

WillAyd Nov 4, 2022

jorisvandenbossche Nov 4, 2022

MarcoGorelli Nov 6, 2022

MarcoGorelli Nov 3, 2022

MarcoGorelli Nov 6, 2022

jorisvandenbossche Nov 7, 2022

jorisvandenbossche Nov 7, 2022

MarcoGorelli left a comment

mroeschke commented Nov 17, 2022

STYLE: fix pylint unneeded-not warnings #49382

STYLE: fix pylint unneeded-not warnings #49382

Conversation

Moisan commented Oct 29, 2022

mzeitlin11 left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Moisan Oct 29, 2022 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

MarcoGorelli left a comment

Choose a reason for hiding this comment

mroeschke commented Nov 17, 2022

Moisan Oct 29, 2022 •

edited