Closed
Description
Describe the bug, including details regarding any error messages, version, and platform.
According to the doc for Array.offset:
A relative position into another array’s data.
The purpose is to enable zero-copy slicing. This value defaults to zero but must be applied on all operations with the physical storage buffers.
Note in particular the phrase "must be applied on all operations with the physical storage buffers."
I'm wondering if it should be applied to ListArray.values.
Here's an example:
import pyarrow as pa
values = [[1], [1, 2], [1, 2, 3]]
array = pa.array(values)
assert array.to_pylist() == values
assert array.values.to_pylist() == [1, 1, 2, 1, 2, 3]
sliced = array[1:]  # named `sliced` to avoid shadowing the built-in `slice`
assert sliced.to_pylist() == [[1, 2], [1, 2, 3]]
assert sliced.values == array.values  # Passes, but looks wrong: the slice's values should skip the first value
The workaround is to calculate the values offset myself, by looking at ListArray.offsets at position ListArray.offset, but it's not straightforward.
Alternatively, if ListArray.values isn't going to respect ListArray.offset, that should be documented here.
Tested on pyarrow==13.0.0
Component(s)
Python