fix: unflatten should accept non-packed counts
#2097
numpy.array_equal(left.starts, right.starts)
and numpy.array_equal(left.stops, right.stops)
and visitor(left.content, right.content)
I'd thought that `offsets` was the shared interface (and it would not work well for `ListArray`), but of course that's nonsense; it's `starts`/`stops`. This change to the test helper fixes it.
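To illustrate why `starts`/`stops` works as the common denominator, here is a minimal sketch in plain NumPy. It assumes an offsets-based list can always be viewed as `starts = offsets[:-1]` and `stops = offsets[1:]`; the helper names are hypothetical and are not the test helper from this PR.

```python
import numpy as np


def starts_stops_from_offsets(offsets):
    # An offsets-based list exposes starts/stops as two views of one buffer.
    offsets = np.asarray(offsets)
    return offsets[:-1], offsets[1:]


def same_list_structure(left, right):
    # Compare two (starts, stops) pairs, mirroring the comparison in the diff.
    left_starts, left_stops = left
    right_starts, right_stops = right
    return bool(
        np.array_equal(left_starts, right_starts)
        and np.array_equal(left_stops, right_stops)
    )


# One list described by offsets, one described by independent starts/stops.
offset_based = starts_stops_from_offsets([0, 2, 2, 5])
list_based = (np.array([0, 2, 2]), np.array([2, 2, 5]))
result = same_list_structure(offset_based, list_based)
```

A comparison written against `offsets` alone would have no equivalent for the second form, which only carries `starts` and `stops`.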
layout = ak.operations.to_layout(
array, allow_record=False, allow_other=False
).to_packed()
Ultimately, we need to know that at any depth, the `counts` computed from our externally visible structure align with the current content. Therefore, we need most of the guarantees of `to_packed()`. This could be done in the recursion if we wanted better performance for the non-`axis=-1` cases, because we don't need the leaf to be contiguous, or indeed care about any nodes after the unflatten depth.
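A small sketch of the alignment problem, using plain NumPy buffers as a stand-in for a non-packed list layout (the variable names are illustrative, and the packing step is a simplified model, not the `to_packed()` implementation):

```python
import numpy as np

# A non-packed list: the content holds elements no list refers to (the 9s).
content = np.array([9, 1, 2, 9, 9, 3])
starts = np.array([1, 5])
stops = np.array([3, 6])

# Counts as seen from the outside: two lists of lengths 2 and 1.
counts = stops - starts

# Packing gathers only the reachable elements and rebuilds zero-based offsets,
# so that the visible counts line up with the content again.
packed_content = np.concatenate([content[a:b] for a, b in zip(starts, stops)])
packed_offsets = np.concatenate([[0], np.cumsum(counts)])
```

Before packing, the visible counts sum to 3 while the content has 6 elements; after packing, the last offset equals the content length.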
counts = ak.operations.to_layout(
counts, allow_record=False, allow_other=False
).to_packed()
`counts` should be a trivial layout, so this is reasonable to pack (we don't need contiguity, but we do need a non-indexed layout).
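As a rough picture of what "non-indexed" buys us, here is a sketch that models an indexed `counts` as a stored buffer plus an index (both names are hypothetical). The stored buffer alone does not give the counts a caller sees; the index has to be applied first.

```python
import numpy as np

# Indexed representation: the visible values are stored[index].
stored = np.array([3, 0, 2])
index = np.array([2, 2, 0])

# Resolving the index yields the counts in their externally visible order.
visible_counts = stored[index]
```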
counts = ak.operations.to_layout(
counts, allow_record=False, allow_other=False
).to_packed()
if counts.is_option and (counts.content.is_numpy or counts.content.is_unknown):
@jpivarski's fix
counts = backend.nplike.to_rectilinear(
ak.operations.fill_none(counts, 0, axis=-1, highlevel=False)
)
This replaces the use of `to_numpy`, meaning we won't incur a needless device transfer under CUDA (IIRC).
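For intuition about the fill step, here is a sketch that models option-type counts as a value buffer plus a "missing" mask in plain NumPy. This is an assumption made for illustration; it is not how `fill_none` or `to_rectilinear` are implemented.

```python
import numpy as np

# Option-type counts [2, None, 3]: the 7 is a placeholder under the missing slot.
values = np.array([2, 7, 3])
is_missing = np.array([False, True, False])

# Filling missing entries with 0 keeps one count per entry.
filled = np.where(is_missing, 0, values)
```

The result has the same length as the input, and it stays in whatever array library produced the buffers, which is the point of avoiding a NumPy conversion.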
current_offsets = backend.index_nplike.empty(len(counts) + 1, np.int64)
current_offsets[0] = 0
backend.index_nplike.cumsum(counts, out=current_offsets[1:])
Use `nonlocal` for readability (`nonlocal` appears later).
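A minimal sketch of the pattern, using NumPy directly in place of `backend.index_nplike` (an assumption made so the snippet is self-contained; `build_offsets` and `compute` are hypothetical names):

```python
import numpy as np


def build_offsets(counts):
    current_offsets = None  # rebound by the inner function below

    def compute():
        # nonlocal makes it explicit that we rebind the enclosing variable.
        nonlocal current_offsets
        current_offsets = np.empty(len(counts) + 1, dtype=np.int64)
        current_offsets[0] = 0
        # Write the running total straight into the tail of the buffer.
        np.cumsum(counts, out=current_offsets[1:])

    compute()
    return current_offsets


offsets = build_offsets(np.array([2, 0, 3], dtype=np.int64))

# The offsets slice a flat sequence into lists of the requested lengths.
flat = [1, 2, 3, 4, 5]
nested = [flat[a:b] for a, b in zip(offsets[:-1], offsets[1:])]
```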
return layout

out = transform(layout, depth=1, axis=axis)
out = ak._do.recursively_apply(apply, layout)
Fix recursion. In v1, we had a `transform_child_layouts` that basically does what `continuation` does now (apart from the fact that it accepted a new transform function). Here, we restore the recursive behaviour.
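The continuation idea can be sketched on nested Python lists. This is a toy stand-in with a hypothetical signature, not Awkward's `recursively_apply`: the action either returns a replacement for the current node or returns `None`, in which case the default continuation descends into the children.

```python
def recursively_apply(node, action, depth=1):
    # Toy model (hypothetical): nested lists play the role of layouts.
    def continuation():
        # Default behaviour: rebuild this node from its transformed children.
        if isinstance(node, list):
            return [recursively_apply(child, action, depth + 1) for child in node]
        return node

    result = action(node, depth=depth, continuation=continuation)
    return continuation() if result is None else result


def reverse_at_depth_two(node, depth, continuation):
    # Act only at the target depth; otherwise let the recursion continue.
    if depth == 2 and isinstance(node, list):
        return node[::-1]
    return None


out = recursively_apply([[1, 2, 3], [4, 5]], reverse_at_depth_two)
```

Here the outer list is left to the continuation, and each inner list is replaced at depth 2, so the transformation reaches the intended level without the action managing the descent itself.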
This is interesting: using `to_rectilinear` instead of `project`. Since `counts` needs to be rectilinear (and not necessarily NumPy), that makes sense.
Nice work!
`ak.unflatten` error message doesn't indicate axis out of bounds #2059
np.ma with our own option handling