-
Notifications
You must be signed in to change notification settings - Fork 85
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement axis_wrap_if_negative #314
Conversation
Thanks for taking this up! I just finished updating your #261 and will be merging it now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like the output of axis_wrap_if_negative
is off by one. The tests don't test it because argmax
is a reducer; reducers and sorting don't use axis_wrap_if_negative
. Unfortunately, all of the methods that use axis_wrap_if_negative
don't have NumPy equivalents (to motivate how it ought to work), but we still have a pretty good guide on how it ought to work. (Try negative axis
in np.size, for instance.)
Below, this three-dimensional array should allow axis
to be 0
, 1
, or 2
, and these should be equivalent to -3
, -2
, -1
(in that order). It's always the case that -1
is the deepest allowed axis
.
>>> a = awkward1.Array(numpy.arange(3*5*2).reshape(3, 5, 2))
>>> awkward1.type(a)
3 * 5 * 2 * int64
>>> awkward1.num(a, axis=0)
3
>>> awkward1.num(a, axis=1)
<Array [5, 5, 5] type='3 * int64'>
>>> awkward1.num(a, axis=2)
<Array [[2, 2, 2, 2, 2], ... [2, 2, 2, 2, 2]] type='3 * 5 * int64'>
>>> awkward1.num(a, axis=3) # this should fail because axis=3 is out of range
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/pivarski/irishep/awkward-1.0/awkward1/operations/structure.py", line 199, in num
out = layout.num(axis=axis)
ValueError: 'axis' out of range for 'num'
>>> awkward1.num(a, axis=-1) # therefore, -1 should be the deepest, equivalent to axis=2
<Array [5, 5, 5] type='3 * int64'>
>>> awkward1.num(a, axis=-2) # axis=-2 should be equivalent to axis=1
3
>>> awkward1.num(a, axis=-3) # axis=-3 should be equivalent to axis=0
3
The same should be true of ListType nodes (the above are RegularType). This makes the same off-by-one mistake, which leads to a different behavior in the axis=-3
case, but that's probably unimportant.
>>> a = awkward1.Array(numpy.arange(3*5*2).reshape(3, 5, 2).tolist())
>>> awkward1.type(a)
3 * var * var * int64
>>> awkward1.num(a, axis=0)
3
>>> awkward1.num(a, axis=1)
<Array [5, 5, 5] type='3 * int64'>
>>> awkward1.num(a, axis=2)
<Array [[2, 2, 2, 2, 2], ... [2, 2, 2, 2, 2]] type='3 * var * int64'>
>>> awkward1.num(a, axis=3)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/pivarski/irishep/awkward-1.0/awkward1/operations/structure.py", line 199, in num
out = layout.num(axis=axis)
ValueError: 'axis' out of range for 'num'
>>> awkward1.num(a, axis=-1)
<Array [5, 5, 5] type='3 * int64'>
>>> awkward1.num(a, axis=-2)
3
>>> awkward1.num(a, axis=-3)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/pivarski/irishep/awkward-1.0/awkward1/highlevel.py", line 1145, in __repr__
limit_value, layout, self._behavior
File "/home/pivarski/irishep/awkward-1.0/awkward1/_util.py", line 1257, in minimally_touching_string
lft = next(leftgen)
File "/home/pivarski/irishep/awkward-1.0/awkward1/_util.py", line 1244, in forever
for token in iterable:
File "/home/pivarski/irishep/awkward-1.0/awkward1/_util.py", line 1150, in forward
for token in forward(x[i], sp):
File "/home/pivarski/irishep/awkward-1.0/awkward1/_util.py", line 1150, in forward
for token in forward(x[i], sp):
ValueError: in ListOffsetArray64 attempting to get 0, offsets[i] != offsets[i + 1] and offsets[i + 1] > len(content)
Since these have been passed through to Python, we can see the off-by-one directlly:
>>> a.layout.axis_wrap_if_negative(0)
0
>>> a.layout.axis_wrap_if_negative(1)
1
>>> a.layout.axis_wrap_if_negative(2)
2
>>> a.layout.axis_wrap_if_negative(-1) # should be 2
1
>>> a.layout.axis_wrap_if_negative(-2) # should be 1
0
>>> a.layout.axis_wrap_if_negative(-3) # should be 0
-1
In NumPy, negative axis is a syntactic convenience, allowing you to not know or not care how many dimensions an array has but get the deepest axis or the second-deepest axis anyway. (Also, there's more of a symmetry between the first axis and the last axis, so it's just helping people count from the other end if they want to.)
But in Awkward, negative axis allows you to do things you wouldn't otherwise be able to do. Consider this array:
>>> a = awkward1.Array([
... {"x": [1], "y": [[], [1]]},
... {"x": [1, 2], "y": [[], [1], [1, 2]]},
... {"x": [1, 2, 3], "y": [[], [1], [1, 2], [1, 2, 3]]}])
Starting from the shallow end (positive axis
), we can address axis
values up to 1
because this is how deep "x"
goes ("y"
goes up to 2
).
>>> awkward1.num(a, axis=0).tolist()
{'x': 3, 'y': 3}
>>> awkward1.num(a, axis=1).tolist()
[{'x': 1, 'y': 2}, {'x': 2, 'y': 3}, {'x': 3, 'y': 4}]
>>> awkward1.num(a, axis=2).tolist()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/pivarski/irishep/awkward-1.0/awkward1/operations/structure.py", line 199, in num
out = layout.num(axis=axis)
ValueError: 'axis' out of range for 'num'
But suppose we want the num
of the deepest of both "x"
and "y"
? We should be able to express that with negative axis
:
>>> awkward1.num(a, axis=-1).tolist()
[4]+ Done emacs src/libawkward/array/ListOffsetArray.cpp
Segmentation fault (core dumped)
What I'd expect from this is
[{'x': 1, 'y': [0, 1]}, {'x': 2, 'y': [0, 1, 2]}, {'x': 3, 'y': [0, 1, 2, 3]}]
To do that, the axis_wrap_if_negative
method must refrain from converting negative axis
to positive axis
if there's a branch, such as the branch between "x"
(depth 2) and "y"
(depth 3). Only once the recursion has descended past the branch can it switch the negative axis
into a positive axis
.
That also means that cases like this have limited valid negative axis
values. For instance, axis=-2
in the above should be an error because it happens above the branch point, so it doesn't have an unambiguous meaning.
I'm going to merge with master for you, then give you back the branch. |
It should be up to date now. (I'm not making any more changes.) |
…to ianna/axis_wrap_if_negative
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jpivarski - I left the axis_out_of_range
test in intentionally. Should it throw a ValueError
?
…::next to return an n-th successor of iterator
…to ianna/axis_wrap_if_negative
@jpivarski - The tests pass as expected. I've also added more tests for the negative axis. Please, review when you have time. Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great! I just checked it out and tried it; everything is perfect, including error messages.
If you have no additional changes to make, I'll merge it.
Thanks, @jpivarski ! |
Start working on issue #163