[C++] list_parent_indices only computes for first chunk #29317
Comments
Joris Van den Bossche / @jorisvandenbossche: I can't directly reproduce it with this:
In [1]: import pyarrow as pa
In [2]: import pyarrow.compute as pc
In [3]: arr = pa.array([[1, 2], [3, 4, 5]])
In [4]: pc.list_parent_indices(arr)
Out[4]:
<pyarrow.lib.Int32Array object at 0x7f9eebd67be0>
[
0,
0,
1,
1,
1
]
In [5]: chunked_arr = pa.chunked_array([arr, arr])
In [6]: pc.list_parent_indices(chunked_arr)
Out[6]:
<pyarrow.lib.ChunkedArray object at 0x7f9f37c926d0>
[
[
0,
0,
1,
1,
1
],
[
0,
0,
1,
1,
1
]
]
In [7]: pa.__version__
Out[7]: '5.0.0'
where calling the compute function on the chunked array also gives a chunked array as result.
Antoine Pitrou / @pitrou:
>>> arr = pa.array([[1, 2], [3, 4, 5]])
>>> pc.list_parent_indices(pa.chunked_array([arr, arr]))
<pyarrow.lib.ChunkedArray object at 0x7faa18d1ba10>
[
[
0,
0,
1,
1,
1
],
[
2,
2,
3,
3,
3
]
]
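The two outputs shown in this thread differ only in whether row numbering restarts at each chunk boundary. The following is a minimal pure-Python sketch of both numbering schemes, not Arrow's implementation: chunks are modelled as plain lists of lists, the function names are hypothetical, and null lists are not handled.

```python
from typing import List, Sequence

Chunk = Sequence[Sequence[int]]


def parent_indices_per_chunk(chunks: Sequence[Chunk]) -> List[List[int]]:
    """Number parent rows from 0 again in every chunk."""
    return [
        [row for row, values in enumerate(chunk) for _ in values]
        for chunk in chunks
    ]


def parent_indices_global(chunks: Sequence[Chunk]) -> List[List[int]]:
    """Number parent rows relative to the start of the whole column."""
    result = []
    offset = 0  # number of rows seen in earlier chunks
    for chunk in chunks:
        result.append(
            [offset + row for row, values in enumerate(chunk) for _ in values]
        )
        offset += len(chunk)
    return result


chunk = [[1, 2], [3, 4, 5]]
per_chunk = parent_indices_per_chunk([chunk, chunk])
global_indices = parent_indices_global([chunk, chunk])
print(per_chunk)       # [[0, 0, 1, 1, 1], [0, 0, 1, 1, 1]]
print(global_indices)  # [[0, 0, 1, 1, 1], [2, 2, 3, 3, 3]]
```

Only the global numbering is safe to pass to a take on the full table, because the indices in later chunks must point at rows beyond the first chunk.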
Tor Eivind McKenzie-Syvertsen:
Pyarrow version: 5.0.0.
Python version: 3.7.9
I came across this issue because of very unexpected behaviour from the "explode" function obtained here:
https://issues.apache.org/jira/browse/ARROW-12099
indices = pc.list_parent_indices(table[col_name])
If table[col_name] in this example contains several chunks, the indices look perfectly fine for the first chunk but are erratic and unexpected for the second chunk.
No warning or other information is given either.
A workaround that solved the problem for me is:
The behaviour then changes dramatically.
I'm assuming this isn't expected and should be fixed?
Reporter: Tor Eivind McKenzie-Syvertsen
Assignee: Antoine Pitrou / @pitrou
Note: This issue was originally created as ARROW-13681. Please see the migration documentation for further details.