[Python] ListArray.flatten() should take care of slicing offsets #16993

Closed
asfimport opened this issue Dec 10, 2019 · 5 comments

@asfimport
Collaborator

Currently, ListArray.flatten() simply returns the child array. If a ListArray is a slice of another ListArray, the two share the same child array, so flatten() on the slice also returns values that lie outside the slice. I think the expected behavior is for flatten() to return an Array that is the concatenation of all the sub-lists in the ListArray, which means the slicing offset should be taken into account.

For example:

import pyarrow as pa

a = pa.array([[1], [2], [3]])
assert a.flatten().equals(pa.array([1, 2, 3]))

# expected:
assert a.slice(1).flatten().equals(pa.array([2, 3]))
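
To make the requested behavior concrete, here is a minimal pure-Python sketch of the list layout. It deliberately does not use pyarrow: `ToyListArray`, `flatten_ignoring_slice`, and the `offset`/`length` fields are hypothetical names that only mimic the idea of a slice sharing its parent's child values, not the actual implementation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ToyListArray:
    """Toy stand-in for a list array: shared buffers plus a slice window."""
    offsets: List[int]            # one more entry than the parent has lists
    values: List[int]             # child values, shared by every slice
    offset: int = 0               # index of the first list visible in this slice
    length: Optional[int] = None  # number of visible lists (None = to the end)

    def __post_init__(self):
        if self.length is None:
            self.length = len(self.offsets) - 1 - self.offset

    def slice(self, start: int) -> "ToyListArray":
        # The buffers are shared with the parent; only the window moves.
        return ToyListArray(self.offsets, self.values,
                            self.offset + start, self.length - start)

    def flatten_ignoring_slice(self) -> List[int]:
        # Behavior described in the report: hand back the child array as-is.
        return self.values

    def flatten(self) -> List[int]:
        # Requested behavior: concatenation of the visible sub-lists only.
        start = self.offsets[self.offset]
        stop = self.offsets[self.offset + self.length]
        return self.values[start:stop]


a = ToyListArray(offsets=[0, 1, 2, 3], values=[1, 2, 3])
sliced = a.slice(1)
print(sliced.flatten_ignoring_slice())  # [1, 2, 3]
print(sliced.flatten())                 # [2, 3]
```

In this toy model the fix amounts to reading the first and last visible offsets and cutting the child values between them.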

Reporter: Zhuo Peng / @brills
Assignee: Zhuo Peng / @brills

PRs and other links:

Note: This issue was originally created as ARROW-7362. Please see the migration documentation for further details.

@asfimport
Collaborator Author

Joris Van den Bossche / @jorisvandenbossche:
There was some discussion about this in ARROW-7031: #5759, where the conclusion was not to slice the values.

Personally, I think it would be nice to have easy Python access to the sliced values as well, but I also find it somewhat confusing for .flatten() and .values to differ.

@asfimport
Collaborator Author

Wes McKinney / @wesm:
I can't remember what the argument was before (maybe I was the one making it, sorry), but I think it would be OK for flatten() to return the sliced values, while .values does need to return the unsliced values. That works as long as an appropriate caveat is added to the docstring saying that the offsets should not be used (for random access purposes) with the result of flatten().
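
The caveat about random access can be illustrated with a small sketch. It uses plain Python lists rather than pyarrow, and `get_list` is a hypothetical helper; the only assumption is that a slice keeps offsets that still index into the unsliced child values.

```python
# Parent lists: [10, 20], [30], [40, 50, 60]
values = [10, 20, 30, 40, 50, 60]   # unsliced child values
offsets = [0, 2, 3, 6]

# Slice off the first list. The offsets are sliced but not rebased,
# so they still refer to positions in the unsliced values.
sliced_offsets = offsets[1:]         # [2, 3, 6]

# Sliced values, i.e. what flatten() would return for the slice.
flattened = values[sliced_offsets[0]:sliced_offsets[-1]]


def get_list(child, offs, i):
    """Random access to the i-th list through a child array and offsets."""
    return child[offs[i]:offs[i + 1]]


print(get_list(values, sliced_offsets, 0))     # [30]  correct
print(get_list(flattened, sliced_offsets, 0))  # [50]  wrong list
print(get_list(flattened, sliced_offsets, 1))  # [60]  wrong and truncated
```

Pairing the offsets with the unsliced values gives the right lists, while pairing them with the flattened result silently returns the wrong elements.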

@asfimport
Collaborator Author

Joris Van den Bossche / @jorisvandenbossche:
Yes, the main thing is that the offsets need to match one of values / flatten(). I currently implemented offsets so that they are sliced themselves but still point into the unsliced values.

@asfimport
Collaborator Author

Joris Van den Bossche / @jorisvandenbossche:
Another option could be to adjust the offsets so they point into the sliced values. But that would no longer be zero-copy access to the offsets, which probably makes it a bad idea.
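
A rough sketch of that trade-off, using the standard-library `array` and `memoryview` types as a stand-in for Arrow buffers (an assumption for illustration only; the variable names are made up): slicing the offsets can be a view over the parent's buffer, whereas rebasing them means computing and storing new numbers.

```python
from array import array

# Parent lists: [10, 20], [30], [40, 50, 60]
values = [10, 20, 30, 40, 50, 60]
offsets_buf = array("i", [0, 2, 3, 6])

# Slicing off the first list as a view: no offsets are copied.
view = memoryview(offsets_buf)[1:]
print(view.obj is offsets_buf)   # True, still backed by the parent's buffer
print(view.tolist())             # [2, 3, 6]

# Sliced values, i.e. what flatten() would return for the slice.
first = view[0]
flattened = values[first:view[-1]]

# Rebasing so the offsets index into the sliced values allocates a new buffer.
rebased = array("i", [o - first for o in view])
print(rebased.tolist())          # [0, 1, 4]

# The rebased offsets now match the flattened values.
print(flattened[rebased[0]:rebased[1]])  # [30]
print(flattened[rebased[1]:rebased[2]])  # [40, 50, 60]
```

The subtraction itself is cheap, but it cannot be expressed as a view, so every access to the offsets of a sliced array would pay for a fresh allocation.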

@asfimport
Collaborator Author

Antoine Pitrou / @pitrou:
Issue resolved by pull request 6006
#6006

@asfimport asfimport added this to the 0.16.0 milestone Jan 11, 2023