[C++] Sliced sparse union element access can return incorrect values #50105

@fenfeng9

Description

Describe the bug, including details regarding any error messages, version, and platform.

Slicing a sparse union array produces incorrect logical values.
Both to_pylist() and scalar access on the slice return wrong results: a non-null element is reported as null, or a value from the wrong position is returned.

Reproduce

import pyarrow as pa


def main():
    arr = pa.UnionArray.from_sparse(
        pa.array([0, 0, 0], type=pa.int8()),
        [pa.array([None, 20, 30])],
    )
    sliced = arr.slice(2)

    print(f"pyarrow: {pa.__version__}")
    print()

    # Direct access on the original array is still correct.
    expected_arr_2 = 30
    actual_arr_2 = arr[2].as_py()
    print(f"arr[2].as_py() expected: {expected_arr_2}")
    print(f"arr[2].as_py() actual:   {actual_arr_2}")
    print()

    # The sliced array should preserve the same logical value in to_pylist().
    expected_sliced_pylist = [30]
    actual_sliced_pylist = sliced.to_pylist()
    print(f"sliced.to_pylist() expected: {expected_sliced_pylist}")
    print(f"sliced.to_pylist() actual:   {actual_sliced_pylist}")
    print()

    # Scalar access on the sliced array should also return the same value.
    expected_sliced_0 = 30
    actual_sliced_0 = sliced[0].as_py()
    print(f"sliced[0].as_py() expected: {expected_sliced_0}")
    print(f"sliced[0].as_py() actual:   {actual_sliced_0}")


if __name__ == "__main__":
    main()

Result

pyarrow: 24.0.0

arr[2].as_py() expected: 30
arr[2].as_py() actual:   30

sliced.to_pylist() expected: [30]
sliced.to_pylist() actual:   [None]

sliced[0].as_py() expected: 30
sliced[0].as_py() actual:   None
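
For reference, here is a minimal pure-Python sketch of the semantics I would expect from a sliced sparse union. It does not use pyarrow: `SparseUnionModel` and its methods are hypothetical names made up for illustration, and the sketch assumes that type ids equal child indices and that a slice only records an offset and a length over shared buffers. The observed `None` happens to equal the child value at position 0, which would be consistent with the slice offset not being applied to the child lookup, but that is a guess and not a confirmed root cause.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class SparseUnionModel:
    """Hypothetical stand-in for a sparse union; not a pyarrow class."""

    type_ids: List[int]            # one type id per physical element
    children: List[List[Any]]      # every child is as long as type_ids
    offset: int = 0                # start of the logical view
    length: Optional[int] = None   # number of logical elements in the view

    def __post_init__(self):
        if self.length is None:
            self.length = len(self.type_ids) - self.offset

    def slice(self, start: int) -> "SparseUnionModel":
        # A slice shares the buffers and only moves the offset.
        return SparseUnionModel(
            self.type_ids, self.children, self.offset + start, self.length - start
        )

    def get(self, i: int) -> Any:
        # Expected lookup: the offset applies to the type ids AND to the child.
        pos = self.offset + i
        return self.children[self.type_ids[pos]][pos]

    def get_ignoring_child_offset(self, i: int) -> Any:
        # Faulty lookup for comparison: the child is indexed without the offset.
        return self.children[self.type_ids[self.offset + i]][i]

    def to_pylist(self) -> List[Any]:
        return [self.get(i) for i in range(self.length)]


arr = SparseUnionModel([0, 0, 0], [[None, 20, 30]])
sliced = arr.slice(2)
print(sliced.to_pylist())                    # [30]
print(sliced.get_ignoring_child_offset(0))   # None
```

In this model the expected lookup returns `[30]` for the slice, while the lookup that skips the offset on the child reproduces the `None` seen above.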

Component(s)

C++
