ARROW-15837: [C++][Python] Clarify documentation for ListArray::offsets() #12557
If there is a way to reconstruct a ListArray using the offsets and values, maybe it would be worth mentioning it as well? EDIT: from the discussion on JIRA, it doesn't seem possible yet. The doc looks good to me then, thanks!
There seem to be conda-related issues, I'm going to force-push.
I don't know that this wording really hurts anything, and the example is always helpful, but I don't think it made the issue any clearer for me.
A list array has three pieces: values, offsets, and validity. It isn't clear why the offsets would be expected to contain the validity. I would think it just as likely that someone assumes the values contain the validity.
Once we fix ARROW-15839 by adding a mask I think the original issue would be clearer.
While "a list array has three pieces: values, offsets, and validity" is of course correct, I think many people will think about (or explain) a list array as consisting of two pieces: values and offsets. Those are also the two "child" arrays for which we have properties on ListArray to access them, and they are the two arrays from which you can create a new ListArray with from_arrays. So I don't think the confusion from ARROW-15837 is that uncommon, and the clarification here seems helpful IMO. Since the values array doesn't have a 1:1 relationship with the list elements (and can have nulls itself, independent of nulls at the list level), I would find it less expected for someone to think that the values contain the list validity. ARROW-15839 will indeed help, but then there is maybe also the question of whether we want to make it easier to get the "mask" / validity bitmap of an existing ListArray (although that's not specific to a ListArray).
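To make the "three pieces" point concrete, here is a minimal pure-Python sketch of the layout being discussed. It is a toy model only: `ToyListArray` and its fields are hypothetical names for illustration, not pyarrow's API, and validity is modelled as a plain list of booleans rather than a bitmap.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ToyListArray:
    """Hypothetical stand-in for a list array: values + offsets + validity."""

    values: List[Optional[int]]  # flattened child values; may hold nulls of their own
    offsets: List[int]           # one more entry than there are list slots
    validity: List[bool]         # one flag per list slot (list-level nulls)

    def to_pylist(self):
        out = []
        for i, valid in enumerate(self.validity):
            if not valid:
                # A null list slot: only the validity piece records this.
                out.append(None)
            else:
                out.append(self.values[self.offsets[i]:self.offsets[i + 1]])
        return out


# Three list slots: [1, null], null, [3]
arr = ToyListArray(
    values=[1, None, 3],
    offsets=[0, 2, 2, 3],
    validity=[True, False, True],
)
decoded = arr.to_pylist()

# Rebuilding from values and offsets alone (treating every slot as valid)
# turns the null list into an empty list.
without_validity = ToyListArray(arr.values, arr.offsets, [True] * 3).to_pylist()

print(decoded)           # [[1, None], None, [3]]
print(without_validity)  # [[1, None], [], [3]]
```

In this model the offsets for a null slot and for an empty list look identical (`2, 2`), which is why the offsets on their own cannot be expected to carry the list-level validity. The null inside `values` is a separate, element-level null.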
Benchmark runs are scheduled for baseline = a6e51a0 and contender = c70426f. c70426f is a master commit associated with this PR. Results will be available as each benchmark for each run completes.