-
Notifications
You must be signed in to change notification settings - Fork 81
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: support axis != -1
for record reduction
#2514
Conversation
Codecov Report
Additional details and impacted files
|
def _reduce_max_masked(array, mask): | ||
assert mask | ||
j = ak.from_regular( | ||
ak.argmax(array["1"], axis=1, keepdims=True, mask_identity=True) | ||
) | ||
return ak.flatten(array[j], axis=1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm being "lazy" by avoiding implementing an identity. The identity can easily be obtained using Form.length_zero_array()
and fill_none
, but it's not necessary here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a very nicely documented kernel.
I see that it only affects the new RecordArray
overloaded reduction, and you're the expert in that—I admit that I haven't followed the precise indexing details, but there are a lot of comments to be future maintainer-friendly.
It's tested with an example of maximizing pairs, which is something that would require overloads. The test is at axis=0
for what looks like a 1 * var * var * (int64, float64)
. It could be tested at axis=1
as well, with might catch more surprises.
axis=0
/axis=-3
all non-localaxis=1
/axis=-2
partly non-local, partly localaxis=2
/axis=-1
all local
Since the motivation was
The logic in
RecordArray._reduce_next
made an implicit assumption thatparents
is contiguous and monotonic increasing. This is not always the case foraxis != -1
(see #2510 (comment)).
the axis=-1
case may be unnecessary, but axis=1
could turn up something.
awkward-cpp/src/cpu-kernels/awkward_RecordArray_reduce_nonlocal_outoffsets_64.cpp
Outdated
Show resolved
Hide resolved
…l_outoffsets_64.cpp
The logic in
RecordArray._reduce_next
made an implicit assumption thatparents
is contiguous and monotonic increasing. This is not always the case foraxis != -1
(see #2510 (comment)).This PR fixes #2512 with the introduction of a new kernel, similar to
awkward_RecordArray_reduce_nonlocal_outoffsets_64
, to compute the offsets (given by potentially non-monotonic, locally contiguous parents) that correctly partitions the current array into sublists. This kernel must also compute thecarry
operation that orders the sublists given byparents
.As such, it requires several loops:
parents
parents
that groupsparents
into sublists, and compute the associatedcarry
that maps this reduction result to the position given byparents
offsets
, and insert the appropriatecarry
index.We could "save performance" by only appending a single empty sublist in the case of "missing parents", but I don't see a strong incentive to do this (it's probably not worth any noticeable performance).
A verbose example of this kernel is as follows:
With the current
_reduce_next
implementation, we can't tell whether the reduction occurs at this axis. As such,awkward_RecordArray_reduce_nonlocal_outoffsets_64
needs to anticipate non-monotonicparents
. If we add this information, it would be possible to add a second kernel that makesaxis=-1
reductions slightly faster (nocarry
, less array jumping). I think we hold off, and wait for a performance complaint :)