[Python] Scalar arithmetic dunders raise TypeError instead of returning NotImplemented #49826

@kazeno

Description

Describe the bug, including details regarding any error messages, version, and platform.

Version: pyarrow 24.0.0 (regression vs. 23.0.1)
Platform: Linux (Python 3.12), but not platform-specific

In pyarrow 24.0.0, pyarrow.lib.Scalar gained arithmetic dunder methods (__add__, __sub__, __mul__, __truediv__, __pow__, __neg__, and bitwise ops) in python/pyarrow/scalar.pxi:

def __add__(self, object other):
    return _pc().call_function('add_checked', [self, other])

These implementations unconditionally dispatch to pyarrow.compute.call_function, which raises TypeError via _pack_compute_args when other is not a recognized pyarrow / list / tuple / ndarray type:

TypeError: Got unexpected argument type <class 'MyCustomColumn'> for compute function

A raised TypeError does not trigger Python's reflected-operator fallback; only a returned NotImplemented does (see the Python data model, §3.3.8). As a result, any custom class that previously relied on its own __radd__ / __rmul__ / __rsub__ / __rtruediv__ to handle pyarrow.Scalar + my_obj is now broken. There is no workaround on the user's side: pyarrow.lib.Scalar is an immutable extension type and cannot be monkey-patched, and virtual subclass registration is not honored by CPython's binary-op dispatch (which uses PyType_IsSubtype at the C level).
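
The difference between raising and returning NotImplemented can be seen without pyarrow at all. The following is a minimal pure-Python sketch; RaisingScalar, PoliteScalar and try_add are hypothetical stand-ins invented for illustration and are not pyarrow classes or functions.

```python
class RaisingScalar:
    """Hypothetical stand-in: __add__ raises on operands it does not know."""

    def __add__(self, other):
        if not isinstance(other, (int, float)):
            raise TypeError(f"Got unexpected argument type {type(other)}")
        return "RaisingScalar.__add__ called"


class PoliteScalar:
    """Hypothetical stand-in: __add__ returns NotImplemented instead."""

    def __add__(self, other):
        if not isinstance(other, (int, float)):
            # Signals "I cannot handle this"; Python then tries other.__radd__
            return NotImplemented
        return "PoliteScalar.__add__ called"


class MyCol:
    def __radd__(self, other):
        return "MyCol.__radd__ called"


def try_add(left, right):
    """Evaluate left + right and report a TypeError as a string."""
    try:
        return left + right
    except TypeError as exc:
        return f"TypeError: {exc}"


# The exception propagates before MyCol.__radd__ is ever considered.
raising_result = try_add(RaisingScalar(), MyCol())
# NotImplemented lets Python fall back to the reflected method.
polite_result = try_add(PoliteScalar(), MyCol())

print(raising_result)
print(polite_result)
```

In this sketch only the second addition reaches MyCol.__radd__, which mirrors the pyarrow <= 23 versus pyarrow 24.0.0 behavior described above.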

Reproducer

import pyarrow

class MyCol:
    def __radd__(self, other):
        return "MyCol.__radd__ called"

s = pyarrow.scalar(5)
c = MyCol()

# Works on pyarrow <= 23 (Scalar had no __add__, so Python dispatches to MyCol.__radd__)
# Fails on pyarrow >= 24 with:
#   TypeError: Got unexpected argument type <class '__main__.MyCol'> for compute function
print(s + c)

Expected: "MyCol.__radd__ called" (or at least a NotImplemented return from Scalar.__add__ so Python can fall back).
Actual: TypeError from _pack_compute_args.
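
One possible shape for a fix, sketched here in pure Python under stated assumptions, is to have the dunder catch the TypeError raised for unrecognized operands and return NotImplemented. SketchScalar and fake_call_function are hypothetical stand-ins for pyarrow.lib.Scalar and pyarrow.compute.call_function; this is not the actual pyarrow code, and the maintainers may prefer a different approach (for example an explicit type check before dispatching).

```python
class SketchScalar:
    """Hypothetical stand-in for pyarrow.lib.Scalar."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        try:
            return fake_call_function("add_checked", [self, other])
        except TypeError:
            # Assumption: a TypeError here means "unsupported operand type",
            # so let Python try the other operand's reflected method.
            return NotImplemented


def fake_call_function(name, args):
    """Hypothetical stand-in for the compute dispatch; handles only add_checked."""
    if name != "add_checked":
        raise ValueError(f"unknown function {name!r}")
    values = []
    for arg in args:
        if isinstance(arg, SketchScalar):
            values.append(arg.value)
        elif isinstance(arg, (int, float)):
            values.append(arg)
        else:
            raise TypeError(
                f"Got unexpected argument type {type(arg)} for compute function"
            )
    return SketchScalar(values[0] + values[1])


class MyCol:
    def __radd__(self, other):
        return "MyCol.__radd__ called"


fallback_result = SketchScalar(5) + MyCol()  # dispatches to MyCol.__radd__
numeric_result = SketchScalar(5) + 2         # still handled by the scalar itself

print(fallback_result)
print(numeric_result.value)
```

A broad except TypeError could also hide unrelated type errors raised deeper in the call, so checking the operand type up front may be the safer variant. When neither side can handle the operation, Python itself raises the usual "unsupported operand type(s)" TypeError.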

Why this matters

Libraries that wrap pyarrow arrays with a richer Python class (like our Data Curator library with its DataColumn class, but also other downstream projects) have historically been able to make pyarrow.Scalar + custom_column work by implementing __radd__ on their class (and likewise for the other reflected operators). This pattern is now broken by an upgrade to 24.0.0, with no deprecation warning, no opt-out, and no Python-level workaround.

Component(s)

Python
