fix: support axis != -1 for record reduction #2514

Merged
agoose77 merged 8 commits into main from agoose77/fix-kernel-parents-ordering
Jun 10, 2023

Collaborator

@agoose77 agoose77 commented Jun 8, 2023

The logic in RecordArray._reduce_next made an implicit assumption that parents is contiguous and monotonically increasing. This is not always the case for axis != -1 (see #2510 (comment)).
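
To make the assumption concrete, here is a minimal sketch of the property the old logic relied on. The helper name `is_monotonic_non_decreasing` is hypothetical and not part of Awkward Array. The first list is an illustrative assumption; the second is the `parents` array from the worked example in this description.

```python
def is_monotonic_non_decreasing(parents):
    """Return True if every parent index is >= the one before it."""
    return all(a <= b for a, b in zip(parents, parents[1:]))


# Illustrative parents with sorted runs, as the old logic expected.
local_parents = [0, 0, 1, 1, 1, 2]

# Parents whose runs are contiguous but appear out of order.
nonlocal_parents = [0, 0, 0, 4, 4, 1, 1, 1, 3, 3, 5]

print(is_monotonic_non_decreasing(local_parents))     # True
print(is_monotonic_non_decreasing(nonlocal_parents))  # False
```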

This PR fixes #2512 with the introduction of a new kernel, similar to awkward_RecordArray_reduce_nonlocal_outoffsets_64, to compute the offsets (given by potentially non-monotonic, locally contiguous parents) that correctly partition the current array into sublists. This kernel must also compute the carry operation that orders the sublists given by parents.

As such, it requires several loops:

  1. Initialise the carry array with a unique value to indicate the empty sublists that aren't indexed by parents.
  2. Compute the offsets into parents that group parents into sublists, and compute the associated carry that maps each reduction result to the position given by parents.
  3. Append the necessary number of empty lists to the end of offsets, and insert the appropriate carry index.

We could "save performance" by only appending a single empty sublist in the case of "missing parents", but I don't see a strong incentive to do this (the gain would probably not be noticeable).

A verbose example of this kernel is as follows:

import numpy as np


parents = np.array([0, 0, 0, 4, 4, 1, 1, 1, 3, 3, 5], dtype=np.int64)


outcarry = np.empty(6, dtype=np.int64)
outoffsets = np.empty(outcarry.size + 1, dtype=np.int64)


def awkward_RecordArray_reduce_nonlocal_outoffsets_64(
    outoffsets, outcarry, parents, lenparents, outlength
):
    # Zero initialise offsets
    outoffsets[0] = 0

    # Initialise carry to unique value, indicating "missing"
    for i in range(outlength):
        outcarry[i] = -1

    #   index     0  1  2  3  4  5  6
    # outoffsets  0  -  -  -  -  -  -
    #  outcarry  -1 -1 -1 -1 -1 -1

    # Fill offsets with lengths of sublists (in order of appearance, *NOT* parents)
    i = 0
    k_sublist = 0
    for j in range(1, lenparents):
        if parents[i] != parents[j]:
            outoffsets[k_sublist + 1] = j
            outcarry[parents[i]] = k_sublist
            i = j
            k_sublist += 1

    #   index     0  1  2  3  4  5  6
    # outoffsets  0  3  5  8  10 -  -
    #  outcarry   0  2 -1  3  1 -1

    if lenparents > 0:
        # Close the final sublist, which runs to the end of parents
        outoffsets[k_sublist + 1] = lenparents
        outcarry[parents[i]] = k_sublist
        k_sublist += 1

    #   index     0  1  2  3  4  5  6
    # outoffsets  0  3  5  8  10 11 -
    #  outcarry   0  2 -1  3  1  4

    # Append empty lists for missing parents
    for i in range(k_sublist, outlength):
        outoffsets[i + 1] = lenparents

    #   index     0  1  2  3  4  5  6
    # outoffsets  0  3  5  8  10 11 11
    #  outcarry   0  2 -1  3  1  4

    # Replace unique value with index of appended empty list
    for i in range(outlength):
        if outcarry[i] == -1:
            outcarry[i] = k_sublist
            k_sublist += 1

    #   index     0  1  2  3  4  5  6
    # outoffsets  0  3  5  8  10 11 11
    #  outcarry   0  2  5  3  1  4


awkward_RecordArray_reduce_nonlocal_outoffsets_64(
    outoffsets, outcarry, parents, parents.size, outcarry.size
)
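
As a cross-check on the walkthrough above, the same grouping can be sketched in pure Python with `itertools.groupby` and then applied to some data. This is not the kernel itself: `run_offsets_and_carry` and `values` are hypothetical names introduced for illustration, and the sketch assumes each parent occupies at most one contiguous run.

```python
from itertools import groupby


def run_offsets_and_carry(parents, outlength):
    """Group consecutive equal parents into sublists; return (offsets, carry)."""
    offsets = [0]
    carry = [-1] * outlength  # -1 marks a parent that has not been seen
    # One sublist per run of equal parents, in order of appearance
    for k, (parent, run) in enumerate(groupby(parents)):
        offsets.append(offsets[-1] + len(list(run)))
        carry[parent] = k
    # One trailing empty sublist per parent that never appears
    for parent in range(outlength):
        if carry[parent] == -1:
            carry[parent] = len(offsets) - 1
            offsets.append(offsets[-1])
    return offsets, carry


parents = [0, 0, 0, 4, 4, 1, 1, 1, 3, 3, 5]
values = list(range(len(parents)))  # stand-in data: each value is its own position

offsets, carry = run_offsets_and_carry(parents, outlength=6)
# Partition by offsets, then reorder the sublists with the carry
sublists = [values[offsets[k]:offsets[k + 1]] for k in range(len(offsets) - 1)]
ordered = [sublists[k] for k in carry]

print(offsets)  # [0, 3, 5, 8, 10, 11, 11]
print(carry)    # [0, 2, 5, 3, 1, 4]
print(ordered)  # [[0, 1, 2], [5, 6, 7], [], [8, 9], [3, 4], [10]]
```

After the carry is applied, entry p of `ordered` holds exactly the positions whose parent is p, and the absent parent 2 maps to an empty list.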

With the current _reduce_next implementation, we can't tell whether the reduction occurs at this axis. As such, awkward_RecordArray_reduce_nonlocal_outoffsets_64 needs to anticipate non-monotonic parents. If we added this information, it would be possible to add a second kernel that makes axis=-1 reductions slightly faster (no carry, less array jumping). I think we should hold off and wait for a performance complaint :)
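
For illustration only, a carry-free variant for monotonic parents might look like the sketch below. This is an assumption about what such a second kernel could do, not code from this PR; the name `monotonic_parents_to_offsets` is hypothetical. Because parents are assumed to be non-decreasing, empty sublists can be written in place rather than appended at the end, so no carry is needed.

```python
def monotonic_parents_to_offsets(parents, outlength):
    """Offsets for non-decreasing parents; empty sublists are filled in place."""
    offsets = [0] * (outlength + 1)
    k = 0  # index of the sublist currently being filled
    for j, parent in enumerate(parents):
        # Close every sublist that ends before this parent starts
        while k < parent:
            offsets[k + 1] = j
            k += 1
    # Close the current sublist and any trailing empty ones
    while k < outlength:
        offsets[k + 1] = len(parents)
        k += 1
    return offsets


print(monotonic_parents_to_offsets([0, 0, 2, 2, 2, 3], 5))  # [0, 2, 2, 5, 6, 6]
```

Here parents 1 and 4 never appear, so their sublists come out empty at positions 1 and 4 without any reordering.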

@agoose77 agoose77 marked this pull request as draft June 8, 2023 13:39
@agoose77 agoose77 temporarily deployed to docs-preview June 8, 2023 13:52 — with GitHub Actions Inactive
@codecov

codecov bot commented Jun 9, 2023

Codecov Report

Merging #2514 (c7d0b8d) into main (645c817) will increase coverage by 0.00%.
The diff coverage is 100.00%.

Additional details and impacted files
Impacted Files Coverage Δ
src/awkward/contents/listoffsetarray.py 80.90% <ø> (ø)
src/awkward/contents/recordarray.py 85.15% <100.00%> (+0.05%) ⬆️

@agoose77 agoose77 marked this pull request as ready for review June 9, 2023 09:08
@agoose77 agoose77 temporarily deployed to docs-preview June 9, 2023 09:08 — with GitHub Actions Inactive
@agoose77 agoose77 requested review from jpivarski and ianna June 9, 2023 09:08
Comment on lines 9 to 14
def _reduce_max_masked(array, mask):
    assert mask
    j = ak.from_regular(
        ak.argmax(array["1"], axis=1, keepdims=True, mask_identity=True)
    )
    return ak.flatten(array[j], axis=1)
Collaborator Author

I'm being "lazy" by avoiding implementing an identity. The identity can easily be obtained using Form.length_zero_array() and fill_none, but it's not necessary here.

Member

@jpivarski jpivarski left a comment


This is a very nicely documented kernel.

I see that it only affects the new RecordArray overloaded reduction, and you're the expert in that—I admit that I haven't followed the precise indexing details, but there are a lot of comments to be future maintainer-friendly.

It's tested with an example of maximizing pairs, which is something that would require overloads. The test is at axis=0 for what looks like a 1 * var * var * (int64, float64). It could be tested at axis=1 as well, which might catch more surprises.

  • axis=0/axis=-3 all non-local
  • axis=1/axis=-2 partly non-local, partly local
  • axis=2/axis=-1 all local

Since the motivation was

The logic in RecordArray._reduce_next made an implicit assumption that parents is contiguous and monotonically increasing. This is not always the case for axis != -1 (see #2510 (comment)).

the axis=-1 case may be unnecessary, but axis=1 could turn up something.

@agoose77 agoose77 temporarily deployed to docs-preview June 10, 2023 00:38 — with GitHub Actions Inactive
@agoose77 agoose77 enabled auto-merge (squash) June 10, 2023 16:31
@agoose77 agoose77 temporarily deployed to docs-preview June 10, 2023 16:39 — with GitHub Actions Inactive
@agoose77 agoose77 merged commit cad6c72 into main Jun 10, 2023
35 checks passed
@agoose77 agoose77 deleted the agoose77/fix-kernel-parents-ordering branch June 10, 2023 16:45
Successfully merging this pull request may close these issues.

Record reducers need re-ordering