Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] 0.29.x: cdef class misses definition of __radd__, even if __radd__ is *explicitly* invoked #4750

Closed
navytux opened this issue Apr 19, 2022 · 11 comments · Fixed by #4753 or #4754
Closed

Comments

@navytux
Copy link
Contributor

navytux commented Apr 19, 2022

Hello up there. I'm writing a custom string class. I already know that behaviour of __add__ is different in between Cython 3 and Cython < 3. In particular I'm aware that Cython < 3 does not install user-provided __radd__ into class's arithmetic methods.

https://cython.readthedocs.io/en/latest/src/userguide/migrating_to_cy30.html#arithmetic-special-methods

So for my class to support both Cython 3 and Cython < 3, in my __add__ I wanted to add explicit check for "reverted" __add__ call and manually invoke my __radd__ in that case. But, as the test program shows, that manual __radd__ invocation completely misses provided __radd__ code definition, and, seemingly just wrongly calls __radd__ from underlying base class.

Please find details of the bug below:

To Reproduce
Code to reproduce the behaviour:

# cython: language_level=3

cdef class MyStr(unicode):

    def __repr__(self):
        return "u(" + unicode.__repr__(self) + ")"

    def __add__(a, b):
        print('Add', type(a), type(b))

        # Support for Cython 2 as Cython < 3 does not support __radd__:
        # https://cython.readthedocs.io/en/latest/src/userguide/migrating_to_cy30.html#arithmetic-special-methods
        if type(a) is not MyStr:
            assert type(b) is MyStr, type(b)
            return b.__radd__(a)

        return MyStr(unicode.__add__(a, b))


    def __radd__(b, a):
        # a.__add__(b) returned NotImplementedError, e.g. for unicode.__add__(MyStr)
        print('RAdd', type(b), type(a))
        return MyStr(unicode(a).__add__(b))




a = MyStr(u'a')
b = u'b'

print('a+b:', repr(a + b))
print('b+a:', repr(b + a))

Expected behavior

With Cython 3 it does work ok as expected:

$ cythonize -i radd.pyx && python -c 'import radd'
Compiling /home/kirr/tmp/trashme/radd.pyx because it changed.
[1/1] Cythonizing /home/kirr/tmp/trashme/radd.pyx
...

Add <class 'radd.MyStr'> <class 'str'>
a+b: u('ab')
RAdd <class 'radd.MyStr'> <class 'str'>
b+a: u('ba')
$ cython --version
Cython version 3.0.0a10

However with Cython 0.29.28-21-g60ca7eb74 (today's 0.29.x) it fails with RecursionError because provided __radd__ is never called:

$ cython --version
Cython version 0.29.28

$ cythonize -i radd.pyx && python -c 'import radd'
Compiling /home/kirr/tmp/trashme/radd.pyx because it changed.
[1/1] Cythonizing /home/kirr/tmp/trashme/radd.pyx
...

Add <class 'radd.MyStr'> <class 'str'>
a+b: u('ab')
Add <class 'str'> <class 'radd.MyStr'>
Add <class 'str'> <class 'radd.MyStr'>
Add <class 'str'> <class 'radd.MyStr'>
Add <class 'str'> <class 'radd.MyStr'>
Add <class 'str'> <class 'radd.MyStr'>
Add <class 'str'> <class 'radd.MyStr'>
Add <class 'str'> <class 'radd.MyStr'>
Add <class 'str'> <class 'radd.MyStr'>
Add <class 'str'> <class 'radd.MyStr'>
Add <class 'str'> <class 'radd.MyStr'>
Add <class 'str'> <class 'radd.MyStr'>
...
Add <class 'str'> <class 'radd.MyStr'>
Add <class 'str'> <class 'radd.MyStr'>
Add <class 'str'> <class 'radd.MyStr'>
Add <class 'str'> <class 'radd.MyStr'>
Add <class 'str'> <class 'radd.MyStr'>
Add <class 'str'> <class 'radd.MyStr'>
Add <class 'str'> <class 'radd.MyStr'>
Add <class 'str'> <class 'radd.MyStr'>
Add <class 'str'> <class 'radd.MyStr'>
Add <class 'str'> <class 'radd.MyStr'>
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "radd.pyx", line 32, in init radd
    print('b+a:', repr(b + a))
  File "radd.pyx", line 15, in radd.MyStr.__add__
    return b.__radd__(a)
  File "radd.pyx", line 15, in radd.MyStr.__add__
    return b.__radd__(a)
  File "radd.pyx", line 15, in radd.MyStr.__add__
    return b.__radd__(a)
  [Previous line repeated 986 more times]
  File "radd.pyx", line 9, in radd.MyStr.__add__
    print('Add', type(a), type(b))
RecursionError: maximum recursion depth exceeded while calling a Python object

Environment (please complete the following information):

  • OS: Linux
  • Python version: 3.9.2
  • Cython version: 0.29.28-21-g60ca7eb74

Thanks beforehand,
Kirill

@scoder
Copy link
Contributor

scoder commented Apr 19, 2022

I think your __radd__ implementation does not match the way you call it in __add__. It converts a to unicode where it should convert b, according to the way you call it. I think it would be better to do the conversion already in __add__ to avoid this pitfall.

@da-woods
Copy link
Contributor

If you add cython: binding=True this works in 0.29.x. It's a bit odd - I can't immediately see what's going on.

The flow through the default Python object.__add__ isn't particularly simple though - I recall it does some odd things selecting operators of derived classes in preference. So it may not be Cython's fault. But I'm pretty puzzled at this stage

@da-woods
Copy link
Contributor

I think (but I'm not 100% sure) that the Python type initialization creates an __radd__ slot as soon as you have an __add__ function and this __radd__ slot overrides the function that Cython has created. See https://github.com/python/cpython/blob/e2d78baed385c349d756e96d8f0def0268fa9c4f/Objects/typeobject.c#L8539 and related code. With binding=True Cython then overwrites that (because it adds the bound methods later).

This doesn't seem like something that Cython can easily do much about except possible create a warning that your defined __radd__ function will likely be overridden without binding.

I'm a little worried that Cython 3.0's new binop code may combine badly with binding=False. This needs investigating at some point.

@navytux
Copy link
Contributor Author

navytux commented Apr 20, 2022

@scoder, @da-woods, thanks for feedback.

Maybe I'm missing something, but with the following workaround, that I'm using:

--- a/radd.pyx.kirr
+++ b/radd.pyx
@@ -12,12 +12,15 @@ cdef class MyStr(unicode):
         # https://cython.readthedocs.io/en/latest/src/userguide/migrating_to_cy30.html#arithmetic-special-methods
         if type(a) is not MyStr:
             assert type(b) is MyStr, type(b)
-            return b.__radd__(a)
+            return b.____radd__(a)
 
         return MyStr(unicode.__add__(a, b))
 
 
     def __radd__(b, a):
+        return b.____radd(a)
+
+    def ____radd__(b, a):
         # a.__add__(b) returned NotImplementedError, e.g. for unicode.__add__(MyStr)
         print('RAdd', type(b), type(a))
         return MyStr(unicode(a).__add__(b))

it seemingly works ok and from debugging print it can be seen that __radd__ is called with b=MyStr and a=something else and it all produces correct result:

$ cythonize -i radd.pyx && python -c 'import radd'
Compiling /home/kirr/src/tools/py/cython/t/radd.pyx because it changed.
...

Add <class 'radd.MyStr'> <class 'str'>
a+b: u('ab')
Add <class 'str'> <class 'radd.MyStr'>         <-- NOTE
RAdd <class 'radd.MyStr'> <class 'str'>        <-- NOTE
b+a: u('ba')                                   <-- NOTE

In this particular example that something else is already unicode, but if that would be a non-unicode object, but with __str__, that unicode(a) would convert that to good unicode string that will handle further .__add__(b) well.

Once-again, I might be missing something, but isn't it ok?


If you add cython: binding=True this works in 0.29.x.
...
I'm a little worried that Cython 3.0's new binop code may combine badly with binding=False

@da-woods, I'm just interested in something that would work simultaneously for both Cython 3 and Cython < 3. And also a clear warning if __radd__ is not being taken into effect. It took me some time to dig down and debug that Cython was mishandling even explicit call to MyStr.__radd__ in my original code.

Kirill

@scoder
Copy link
Contributor

scoder commented Apr 20, 2022

is called with b=MyStr and a=something else

That's what I meant. "something else" is actually already a str here, so you have to convert b from MyStr to str (or unicode), not a. That's where the infinite recursion came from.

@navytux
Copy link
Contributor Author

navytux commented Apr 20, 2022

The design of my __radd__ is, like I tried to explain above, to support addition of MyStr to non-unicode objects, for example:

--- a/radd.pyx.kirr2
+++ b/radd.pyx
@@ -33,3 +33,10 @@ b = u'b'
 
 print('a+b:', repr(a + b))
 print('b+a:', repr(b + a))
+
+print()
+class MyWorld:
+    def __str__(self):
+        return u"мир"
+
+print('MyWorld()+a:', repr(MyWorld()+a))

gives

$ cythonize -i radd.pyx && python -c 'import radd'
Compiling /home/kirr/src/tools/py/cython/t/radd.pyx because it changed.
...
# output as before

Add <class 'radd.MyWorld'> <class 'radd.MyStr'>    <-- NOTE
RAdd <class 'radd.MyStr'> <class 'radd.MyWorld'>  <-- NOTE
MyWorld()+a: u('мирa')                   <-- NOTE

If I remove that unicode cast on a as follows:

--- a/radd.pyx.kirr3
+++ b/radd.pyx
@@ -23,7 +23,7 @@ cdef class MyStr(unicode):
     def ____radd__(b, a):
         # a.__add__(b) returned NotImplementedError, e.g. for unicode.__add__(MyStr)
         print('RAdd', type(b), type(a))
-        return MyStr(unicode(a).__add__(b))
+        return MyStr(a.__add__(b))

MyWorld + MyStr starts crashing:

$ cythonize -i radd.pyx && python -c 'import radd'
Compiling /home/kirr/src/tools/py/cython/t/radd.pyx because it changed.
...

Add <class 'radd.MyWorld'> <class 'radd.MyStr'>
RAdd <class 'radd.MyStr'> <class 'radd.MyWorld'>
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "radd.pyx", line 42, in init radd
    print('MyWorld()+a:', repr(MyWorld()+a))
  File "radd.pyx", line 15, in radd.MyStr.__add__
    return b.____radd__(a)
  File "radd.pyx", line 26, in radd.MyStr.____radd__
    return MyStr(a.__add__(b))
AttributeError: 'MyWorld' object has no attribute '__add__'

That's where the infinite recursion came from.

I thought I showed in #4750 (comment) the output of run with __radd__ problem worked-around via ____radd__ - there for X+MyStr both MyStr.__add__ and MyStr.____radd__ are called without RecursionError.

I, once again, might be missing something, but for me the only problem here is that Cython mishandles explicit invocation of MyStr.__radd__ in the original example.

Kirill

@navytux
Copy link
Contributor Author

navytux commented Apr 20, 2022

( both #4750 (comment) and #4750 (comment) were trials with Cython 0.29.28-21-g60ca7eb74 )

@da-woods
Copy link
Contributor

I, once again, might be missing something, but for me the only problem here is that Cython mishandles explicit invocation of MyStr.__radd__ in the original example.

It's more that CPython installs its own __radd__ function that hides the Cython function...

I'm just interested in something that would work simultaneously for both Cython 3 and Cython < 3

It should work with binding=True (in both cases).

Alternatively a function with a different name would work (e.g. _radd_implementation) and just have __radd__ call _radd_implementation.


I think there's a genuine issue here, although it's because of CPython behaviour that is probably hard to change. I'll try to add a warning to the reverse functions in 0.29.x so no-one else will get caught by it.

@navytux
Copy link
Contributor Author

navytux commented Apr 20, 2022

Thanks, @da-woods.

@da-woods
Copy link
Contributor

I fixed this differently to how I said - it should be possible to look up the methods so a warning isn't needed.

@navytux
Copy link
Contributor Author

navytux commented Apr 22, 2022

Thanks, @da-woods. That's good to see.

@scoder scoder added this to the 0.29.29 milestone May 3, 2022
@scoder scoder closed this as completed in fe98838 May 11, 2022
scoder pushed a commit that referenced this issue May 13, 2022
This means that reverse operators (e.g. `__radd__`) won't be
hidden by the automatic wrapper that PyType_Ready() produces if
the forward method exists. Although they won't work as
in Python, they will be possible to look up and call explicitly.
This should make it easier to write code that's compatible with
Cython 0.29.x and Cython 3 (where reverse operators will be
full supported).

Adjusted to work on Cython 3.0.

Based on #4753
Closes #4750
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants