Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement axis_wrap_if_negative #314

Merged
merged 15 commits into from
Jul 20, 2020
Merged

Conversation

ianna
Copy link
Collaborator

@ianna ianna commented Jun 30, 2020

Start working on issue #163

@ianna ianna marked this pull request as draft June 30, 2020 16:06
@jpivarski jpivarski linked an issue Jun 30, 2020 that may be closed by this pull request
@jpivarski
Copy link
Member

Thanks for taking this up! I just finished updating your #261 and will be merging it now.

@ianna ianna changed the title [WIP] implement axis_wrap_if_negative implement axis_wrap_if_negative Jul 9, 2020
@ianna ianna changed the title implement axis_wrap_if_negative Implement axis_wrap_if_negative Jul 9, 2020
@ianna ianna requested a review from jpivarski July 9, 2020 11:31
@ianna ianna marked this pull request as ready for review July 9, 2020 11:31
Copy link
Member

@jpivarski jpivarski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like the output of axis_wrap_if_negative is off by one. The tests don't test it because argmax is a reducer; reducers and sorting don't use axis_wrap_if_negative. Unfortunately, all of the methods that use axis_wrap_if_negative don't have NumPy equivalents (to motivate how it ought to work), but we still have a pretty good guide on how it ought to work. (Try negative axis in np.size, for instance.)

Below, this three-dimensional array should allow axis to be 0, 1, or 2, and these should be equivalent to -3, -2, -1 (in that order). It's always the case that -1 is the deepest allowed axis.

>>> a = awkward1.Array(numpy.arange(3*5*2).reshape(3, 5, 2))
>>> awkward1.type(a)
3 * 5 * 2 * int64
>>> awkward1.num(a, axis=0)
3
>>> awkward1.num(a, axis=1)
<Array [5, 5, 5] type='3 * int64'>
>>> awkward1.num(a, axis=2)
<Array [[2, 2, 2, 2, 2], ... [2, 2, 2, 2, 2]] type='3 * 5 * int64'>

>>> awkward1.num(a, axis=3)     # this should fail because axis=3 is out of range
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/pivarski/irishep/awkward-1.0/awkward1/operations/structure.py", line 199, in num
    out = layout.num(axis=axis)
ValueError: 'axis' out of range for 'num'

>>> awkward1.num(a, axis=-1)    # therefore, -1 should be the deepest, equivalent to axis=2
<Array [5, 5, 5] type='3 * int64'>
>>> awkward1.num(a, axis=-2)   # axis=-2 should be equivalent to axis=1
3
>>> awkward1.num(a, axis=-3)   # axis=-3 should be equivalent to axis=0
3

The same should be true of ListType nodes (the above are RegularType). This makes the same off-by-one mistake, which leads to a different behavior in the axis=-3 case, but that's probably unimportant.

>>> a = awkward1.Array(numpy.arange(3*5*2).reshape(3, 5, 2).tolist())
>>> awkward1.type(a)
3 * var * var * int64
>>> awkward1.num(a, axis=0)
3
>>> awkward1.num(a, axis=1)
<Array [5, 5, 5] type='3 * int64'>
>>> awkward1.num(a, axis=2)
<Array [[2, 2, 2, 2, 2], ... [2, 2, 2, 2, 2]] type='3 * var * int64'>
>>> awkward1.num(a, axis=3)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/pivarski/irishep/awkward-1.0/awkward1/operations/structure.py", line 199, in num
    out = layout.num(axis=axis)
ValueError: 'axis' out of range for 'num'
>>> awkward1.num(a, axis=-1)
<Array [5, 5, 5] type='3 * int64'>
>>> awkward1.num(a, axis=-2)
3
>>> awkward1.num(a, axis=-3)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/pivarski/irishep/awkward-1.0/awkward1/highlevel.py", line 1145, in __repr__
    limit_value, layout, self._behavior
  File "/home/pivarski/irishep/awkward-1.0/awkward1/_util.py", line 1257, in minimally_touching_string
    lft = next(leftgen)
  File "/home/pivarski/irishep/awkward-1.0/awkward1/_util.py", line 1244, in forever
    for token in iterable:
  File "/home/pivarski/irishep/awkward-1.0/awkward1/_util.py", line 1150, in forward
    for token in forward(x[i], sp):
  File "/home/pivarski/irishep/awkward-1.0/awkward1/_util.py", line 1150, in forward
    for token in forward(x[i], sp):
ValueError: in ListOffsetArray64 attempting to get 0, offsets[i] != offsets[i + 1] and offsets[i + 1] > len(content)

Since these have been passed through to Python, we can see the off-by-one directlly:

>>> a.layout.axis_wrap_if_negative(0)
0
>>> a.layout.axis_wrap_if_negative(1)
1
>>> a.layout.axis_wrap_if_negative(2)
2
>>> a.layout.axis_wrap_if_negative(-1)   # should be 2
1
>>> a.layout.axis_wrap_if_negative(-2)   # should be 1
0
>>> a.layout.axis_wrap_if_negative(-3)   # should be 0
-1

In NumPy, negative axis is a syntactic convenience, allowing you to not know or not care how many dimensions an array has but get the deepest axis or the second-deepest axis anyway. (Also, there's more of a symmetry between the first axis and the last axis, so it's just helping people count from the other end if they want to.)

But in Awkward, negative axis allows you to do things you wouldn't otherwise be able to do. Consider this array:

>>> a = awkward1.Array([
...     {"x": [1], "y": [[], [1]]},
...     {"x": [1, 2], "y": [[], [1], [1, 2]]},
...     {"x": [1, 2, 3], "y": [[], [1], [1, 2], [1, 2, 3]]}])

Starting from the shallow end (positive axis), we can address axis values up to 1 because this is how deep "x" goes ("y" goes up to 2).

>>> awkward1.num(a, axis=0).tolist()
{'x': 3, 'y': 3}
>>> awkward1.num(a, axis=1).tolist()
[{'x': 1, 'y': 2}, {'x': 2, 'y': 3}, {'x': 3, 'y': 4}]
>>> awkward1.num(a, axis=2).tolist()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/pivarski/irishep/awkward-1.0/awkward1/operations/structure.py", line 199, in num
    out = layout.num(axis=axis)
ValueError: 'axis' out of range for 'num'

But suppose we want the num of the deepest of both "x" and "y"? We should be able to express that with negative axis:

>>> awkward1.num(a, axis=-1).tolist()
[4]+  Done                    emacs src/libawkward/array/ListOffsetArray.cpp
Segmentation fault (core dumped)

What I'd expect from this is

[{'x': 1, 'y': [0, 1]}, {'x': 2, 'y': [0, 1, 2]}, {'x': 3, 'y': [0, 1, 2, 3]}]

To do that, the axis_wrap_if_negative method must refrain from converting negative axis to positive axis if there's a branch, such as the branch between "x" (depth 2) and "y" (depth 3). Only once the recursion has descended past the branch can it switch the negative axis into a positive axis.

That also means that cases like this have limited valid negative axis values. For instance, axis=-2 in the above should be an error because it happens above the branch point, so it doesn't have an unambiguous meaning.

@jpivarski
Copy link
Member

I'm going to merge with master for you, then give you back the branch.

@jpivarski
Copy link
Member

It should be up to date now. (I'm not making any more changes.)

Copy link
Collaborator Author

@ianna ianna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jpivarski - I left the axis_out_of_range test in intentionally. Should it throw a ValueError?

tests/test_0163_negative-axis-wrap.py Outdated Show resolved Hide resolved
@ianna
Copy link
Collaborator Author

ianna commented Jul 20, 2020

@jpivarski - The tests pass as expected. I've also added more tests for the negative axis. Please, review when you have time. Thanks!

@ianna ianna requested a review from jpivarski July 20, 2020 16:25
Copy link
Member

@jpivarski jpivarski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great! I just checked it out and tried it; everything is perfect, including error messages.

If you have no additional changes to make, I'll merge it.

@jpivarski jpivarski merged commit 93337b9 into master Jul 20, 2020
@jpivarski jpivarski deleted the ianna/axis_wrap_if_negative branch July 20, 2020 17:56
@ianna
Copy link
Collaborator Author

ianna commented Jul 21, 2020

Thanks, @jpivarski !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Remember to 'implement axis_wrap_if_negative'
2 participants