Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: replace axis_wrap_if_negative with maybe_posaxis, simpler and more correct #1986

Merged
merged 26 commits into from
Dec 9, 2022

Conversation

jpivarski
Copy link
Member

@jpivarski jpivarski commented Dec 9, 2022

I can't believe it: axis_wrap_if_negative has always been wrong. This function dates back to the early days of Awkward 1.x. We've been using it for so long, in so many things, that I would have thought that it's been through every possible test.

To convert a negative axis into its non-negative equivalent, you have to know the absolute depth of the array. With depth == 1 for a 1-dimensional array and increasing for each nested list, the non-negative axis is axis + depth where axis is negative. For instance, if axis == -1 (deepest) and depth == 2 (array of lists of primitives), axis + depth == 1: the equivalent non-negative axis == 1.

If there's branching (a union or a record), then there isn't a single depth; we can't evaluate axis + depth. However, as we recurse down, we'll eventually recurse through the branching structure. Inside of a record of {x: var * float64, y: var * var * int32}, for instance, we can evaluate one depth inside the x branch and a different depth inside the y branch. We want to do that so that axis=-1 means "deepest for all branches," i.e. the float64 for x and the int32 for y.

That's why axis_wrap_if_negative passes axis through unchanged if (a) it's already non-negative or (b) the layout it's looking at is branched. The idea is that we'll repeatedly call axis_wrap_if_negative while the value we get back is negative. As soon as it's non-negative, we keep that value.

However, if we're recursing, the layout that we have at some level is not the whole layout. If we only have the x branch of a record, for instance, we'll see its var * float64 as "layout", and if x's record were embedded within other lists, we wouldn't see them. The depth we compute form that "layout" is therefore relative to the depth of last branching (the record with x and y, in this example).

Since axis_wrap_if_negative takes this partial "layout" and the negative axis as arguments, it can never determine the absolute depth, and therefore cannot correctly convert the axis into a non-negative axis... in general.

There's only one case that works anyway: if the branching happens to be at the top level, the relative depth is equal to the absolute depth and we get the right answer. I've looked around, and I've only ever seen tests of negative axis with non-branching arrays (no records or unions) or the branching is a RecordArray at top level (this special case). There are tests in which different fields have different depths, like

# * {x: var * float64, y: var * var * int32}

but never within another list, like

# * var * {x: var * float64, y: var * var * int32}

The examples in #1914 were apparently the first general test of negative to non-negative axis conversion.

So I started this PR by introducing a new function (which will eventually live in ak._util, but right now it's in ak._do to be next to axis_wrap_if_negative for juxtaposition), which does the conversion correctly. It needs the same arguments as axis_wrap_if_negative and also the current depth, to correct for the fact that the layout it's seeing is not the whole array.

I've implemented ak.is_none with it. Now I'm going to look for other functions that were using axis_wrap_if_negative incorrectly. I don't think we'll fix all of these bugs before the 2.0.0 release tomorrow, but we'll get the pattern established.

They're bugs and we can change them without a deprecation cycle.


📚 The documentation for this PR will be available at https://awkward-array.readthedocs.io/en/jpivarski-improved-axis-to-posaxis/ once Read the Docs has finished building 🔨

@codecov
Copy link

codecov bot commented Dec 9, 2022

Codecov Report

Merging #1986 (d73d78a) into main (4e4bb54) will decrease coverage by 0.09%.
The diff coverage is 89.32%.

Additional details and impacted files
Impacted Files Coverage Δ
src/awkward/contents/bitmaskedarray.py 69.28% <ø> (+0.13%) ⬆️
src/awkward/contents/unmaskedarray.py 72.76% <36.36%> (+2.07%) ⬆️
src/awkward/contents/emptyarray.py 72.00% <62.50%> (-0.38%) ⬇️
src/awkward/contents/indexedarray.py 77.58% <83.33%> (-0.20%) ⬇️
src/awkward/contents/recordarray.py 84.32% <83.33%> (-0.44%) ⬇️
src/awkward/_do.py 83.33% <85.71%> (-2.15%) ⬇️
src/awkward/operations/ak_fill_none.py 92.68% <85.71%> (-2.44%) ⬇️
src/awkward/operations/ak_singletons.py 92.30% <85.71%> (-0.29%) ⬇️
src/awkward/operations/ak_num.py 90.90% <86.66%> (-9.10%) ⬇️
src/awkward/_util.py 81.30% <90.90%> (+0.21%) ⬆️
... and 23 more

@jpivarski jpivarski added the pr-next-release Required for the next release label Dec 9, 2022
tests/test_1137-num.py Show resolved Hide resolved
@agoose77 agoose77 enabled auto-merge (squash) December 9, 2022 15:29
@agoose77 agoose77 merged commit d4fa292 into main Dec 9, 2022
@agoose77 agoose77 deleted the jpivarski/improved-axis-to-posaxis branch December 9, 2022 15:45
@jpivarski jpivarski removed the pr-next-release Required for the next release label Feb 15, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ak.is_none is off-by-one with negative axis and doesn't handle an edge case of positive axis
2 participants