Fix: Lengths of empty regular slices #1568
for more information, see https://pre-commit.ci
…tps://github.com/scikit-hep/awkward into ioanaif/fix-lengths-of-empty-regular-slices-1557
This doesn't look like a general solution: if there are two sliced dimensions, should there be two RegularArrays? What if the sliced dimension is not first? Why is `start is None or stop is None` the condition for determining whether there should be a RegularArray? For positive `step`, `start is None` is equivalent to `start == 0` and `stop is None` is equivalent to `stop == len(self)`.
I totally believe that a RegularArray wrapper is needed here, but not under these conditions. Maybe always? Maybe if there's at least one slice somewhere in the tuple? Maybe the number of RegularArrays depends on the number of slices, and their sizes depend on which dimensions they're at?
The way to find out is to see what NumPy does: fortunately this test doesn't depend on dimensions being irregular. You can make a NumPy array with a lot of dimensions and try slices on it with various combinations of `:` and `[0, 1, 2]`, and `[]` to see what happens to the output `shape`.
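The experiment suggested above could be sketched roughly as follows. The names `FULL`, `EMPTY`, `PICK`, `shape_after`, and `cases` are made up for this illustration, and the array shape `(2, 3, 4)` is an arbitrary choice. The sketch only uses NumPy, which serves as the reference behavior here.

```python
import numpy as np

FULL = slice(None)                       # what a bare `:` means
EMPTY = np.array([], dtype=np.intp)      # what `[]` means as an index
PICK = np.array([0, 1], dtype=np.intp)   # a non-empty list index

def shape_after(shape, index):
    """Return the shape NumPy produces when a zero-filled array is indexed."""
    return np.zeros(shape)[index].shape

shape = (2, 3, 4)
cases = {
    "[:, :, []]": (FULL, FULL, EMPTY),
    "[:, [], :]": (FULL, EMPTY, FULL),
    "[[], :, :]": (EMPTY, FULL, FULL),
    "[0:2, [], :]": (slice(0, 2), EMPTY, FULL),
    "[:, [0, 1], :]": (FULL, PICK, FULL),
}
results = {label: shape_after(shape, index) for label, index in cases.items()}

for label, result in results.items():
    print(label, "->", result)
```

With a single list index in the tuple, the list's length replaces the length of the axis it indexes, wherever that axis sits, and the slices around it keep their own lengths.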
Good point. I assumed that with an empty slice, the sliced dimension would go first.
This was the pattern I found from the tests, the
I noticed this behaviour is only consistent when `slice.start`/`slice.stop` is `None`. Here are some of the NumPy tests I did:
Okay, so this isn't a general solution to the problem.
Although this test case works (in your unit tests):
>>> ak._v2.to_regular(ak._v2.Array([[1, 2, 3], [4, 5, 6]]), axis=1)[:, []]
<Array [[], []] type='2 * 0 * 2 * 3 * int64'>
There's nothing special about the slice's `start` and `stop` being `None`. For instance, the first dimension has length 2, so `:` is completely equivalent to `0:2`, and yet:
>>> ak._v2.to_regular(ak._v2.Array([[1, 2, 3], [4, 5, 6]]), axis=1)[0:2, []]
<Array [] type='0 * 3 * int64'>
That's easily fixed by removing the if-statement as suggested below.
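As a reference point, the equivalence of `:` and `0:2` can be checked against NumPy with a small sketch like the one below. The variable names are illustrative only, and the array is the same 2 × 3 example used above.

```python
import numpy as np

array = np.array([[1, 2, 3], [4, 5, 6]])

# slice.indices() resolves None bounds against a concrete length,
# so a bare `:` on a length-2 axis becomes start=0, stop=2, step=1.
resolved = slice(None).indices(len(array))

implicit = array[:, []]      # bounds left as None
explicit = array[0:2, []]    # the same bounds spelled out

print(resolved, implicit.shape, explicit.shape)
```

Both spellings give NumPy the same result, so a correct implementation should not branch on whether the bounds were written out.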
But here's a deeper issue (pun intended): I can make the example 3-dimensional, rather than 2, by putting the whole thing into a length-1 list, replacing `axis=1` with `axis=2`, and putting another `:` in the slice:
>>> ak._v2.to_regular(ak._v2.Array([[[1, 2, 3], [4, 5, 6]]]), axis=2)[:, :, []]
<Array [[]] type='1 * var * 3 * int64'>
But it's not returning
>>> np.array([[[1, 2, 3], [4, 5, 6]]])[:, :, []].tolist()
[[[], []]]
as it should.
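One way to state the expected result for this case is the rough sketch below. `expected_shape` is a hypothetical helper written for this illustration, not part of Awkward Array or NumPy. It assumes the index tuple covers every axis and contains slices plus at most one list index.

```python
import numpy as np

def expected_shape(shape, index):
    """Hypothetical helper: predicted output shape for slices plus one list index."""
    out = []
    for length, item in zip(shape, index):
        if isinstance(item, slice):
            # A slice keeps as many entries as the range it resolves to.
            out.append(len(range(*item.indices(length))))
        else:
            # A list index replaces the axis length with its own length.
            out.append(len(item))
    return tuple(out)

nested = np.array([[[1, 2, 3], [4, 5, 6]]])
index = (slice(None), slice(None), [])   # same as [:, :, []]

predicted = expected_shape(nested.shape, index)
actual = nested[index].shape

print(predicted, actual, nested[index].tolist())
```

Under these assumptions, the prediction agrees with NumPy for the 3-dimensional example: the two leading axes keep lengths 1 and 2, and the last axis has length 0.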
It has something to do with the nested RegularArrays. If I make the Awkward Array out of RegularArrays, it doesn't work, but if I make it out of a (multidimensional) NumpyArray, it does (although that may just be because we pass it off to NumPy, which is correct by definition).
>>> ak._v2.from_numpy(np.array([[[1, 2, 3], [4, 5, 6]]]), regulararray=True)[:, :, []]
<Array [[]] type='1 * 0 * 1 * 2 * 3 * int64'>
>>> ak._v2.from_numpy(np.array([[[1, 2, 3], [4, 5, 6]]]), regulararray=False)[:, :, []]
<Array [[[], []]] type='1 * 2 * 0 * int64'>
I thought maybe I would figure it out, but no—I poked around with it, but I don't understand why the original solution worked for the test example. Do you see what's going on here?
Co-authored-by: Jim Pivarski <jpivarski@users.noreply.github.com>
#1557