Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jagged index array raises FIXME #1022

Closed
agoose77 opened this issue Jul 20, 2021 · 14 comments · Fixed by #1028
Closed

Jagged index array raises FIXME #1022

agoose77 opened this issue Jul 20, 2021 · 14 comments · Fixed by #1028
Labels
bug The problem described is something that must be fixed

Comments

@agoose77
Copy link
Collaborator

Version of Awkward Array

1.4.0

Description and code to reproduce

The following code

array = ak.layout.ListOffsetArray64(
    ak.layout.Index64(np.r_[0, 2]),
    ak.layout.NumpyArray(np.random.uniform(size=(2, 4)))
)
t = ak.argmax(array, axis=-1, keepdims=True)
y = array[t]

raises this traceback

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
/tmp/ipykernel_46426/3125939061.py in <module>
      1 t = ak.argmax(y, axis=-1, keepdims=True)
----> 2 q = y[t]

RuntimeError: FIXME ListArrayOf<T>::SliceVarNewAxis. 2021-02-10 Was this left over from development? If so, it's not getting tested. If anyone out there encounters this error, please report it so that we can properly validate this code path and include it in the tests. https://github.com/scikit-hep/awkward-1.0/issues/new?assignees=&labels=bug+%28unverified%29&template=bug-report.md&title=
@agoose77 agoose77 added bug (unverified) The problem described would be a bug, but needs to be triaged bug The problem described is something that must be fixed and removed bug (unverified) The problem described would be a bug, but needs to be triaged labels Jul 20, 2021
@agoose77
Copy link
Collaborator Author

@jpivarski
Copy link
Member

The history of that exception was that I was reviewing some of @nsmith-'s corrections to jagged slicing, and I came upon this function that didn't look to me like it could ever be reached. (I couldn't construct an example that would trigger it.) I thought it might be dead code, but instead of removing it, I put this FIXME error message in, so that an example that triggers it would be reported and we can use that example to understand how that code path is reached.

If it were Python, we could just put an exception at that point and look at the stack trace. Since it's C++, we'll have to do a bit more work, but it's still possible.

Thanks!

@agoose77
Copy link
Collaborator Author

I will look into debugging this with Valgrind / gdb; I've not touched any of the compilation code yet, so it will be interesting to see whether CLion can handle the CMake project.

@jpivarski
Copy link
Member

This code is slated for conversion into Python (which is why I have Python stack traces in mind), but if this is a common enough case—that I somehow wasn't able to make up—then waiting for v2 may be too long.

If it depends crucially on the "list-of-IndexedArray-of-X" and not, for instance, on "list-of-IndexedOptionArray-of-X", then that point in the code could simply project() the IndexedArray.

@agoose77
Copy link
Collaborator Author

I can also trigger this with a more conventional layout:

y = ak.Array([
    [
        [1,2,3,4],
        [5,6,7,8],
    ]
])
y = ak.to_regular(y,axis=-1)

Could you help me to understand what this particular overload is supposed to do? By name, SliceVarNewAxis seems like np.newaxis, but there is also SliceNewAxis, so there is clearly more going on here.

@jpivarski
Copy link
Member

SliceVarNewAxis had to be added later, as an alternate form of SliceNewAxis. They're very similar, made to distinguish between np.newaxis applied to a regular axis vs a variable axis? I'll have to look it up.

@jpivarski
Copy link
Member

SliceNewAxis and SliceVarNewAxis shouldn't be relevant because none of these slices involve any np.newaxis.

But there's definitely something murky here.

>>> y = ak.to_regular(ak.Array([[[1, 2, 3, 4], [5, 6, 7, 8]]]), axis=-1)
>>> y
<Array [[[1, 2, 3, 4], [5, 6, 7, 8]]] type='1 * var * 4 * int64'>
>>> y.layout
<ListOffsetArray64>
    <offsets><Index64 i="[0 2]" offset="0" length="2" at="0x56448d7a6940"/></offsets>
    <content><RegularArray size="4">
        <content><NumpyArray format="l" shape="8" data="1 2 3 4 5 6 7 8" at="0x56448d7aa960"/></content>
    </RegularArray></content>
</ListOffsetArray64>

and

>>> t = ak.argmax(y, axis=-1, keepdims=True)
>>> t
<Array [[[3], [3]]] type='1 * var * 1 * ?int64'>
>>> t.layout
<ListOffsetArray64>
    <offsets><Index64 i="[0 2]" offset="0" length="2" at="0x56448d7a9b30"/></offsets>
    <content><RegularArray size="1">
        <content><ByteMaskedArray valid_when="false">
            <mask><Index8 i="[0 0]" offset="0" length="2" at="0x56448d7a9bb0"/></mask>
            <content><NumpyArray format="l" shape="2" data="3 3" at="0x56448d7a9ab0"/></content>
        </ByteMaskedArray></content>
    </RegularArray></content>
</ListOffsetArray64>

First of all, this is the simplest and most natural example that triggers the bug:

>>> y[t]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/jpivarski/irishep/awkward-1.0/awkward/highlevel.py", line 996, in __getitem__
    tmp = ak._util.wrap(self.layout[where], self._behavior)
RuntimeError: FIXME ListArrayOf<T>::SliceVarNewAxis. 2021-02-10 Was this left over from development? If so, it's not getting tested. If anyone out there encounters this error, please report it so that we can properly validate this code path and include it in the tests. https://github.com/scikit-hep/awkward-1.0/issues/new?assignees=&labels=bug+%28unverified%29&template=bug-report.md&title=

Is it the fact that t has an option-type? No.

>>> t2 = ak.fill_none(t, 999, axis=-1)
>>> t2
<Array [[[3], [3]]] type='1 * var * 1 * int64'>
>>> t2.layout
<ListOffsetArray64>
    <offsets><Index64 i="[0 2]" offset="0" length="2" at="0x56448d7a9b30"/></offsets>
    <content><RegularArray size="1">
        <content><NumpyArray format="l" shape="2" data="3 3" at="0x56448d7ad660"/></content>
    </RegularArray></content>
</ListOffsetArray64>
>>> y[t2]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/jpivarski/irishep/awkward-1.0/awkward/highlevel.py", line 996, in __getitem__
    tmp = ak._util.wrap(self.layout[where], self._behavior)
RuntimeError: FIXME ListArrayOf<T>::SliceVarNewAxis. 2021-02-10 Was this left over from development? If so, it's not getting tested. If anyone out there encounters this error, please report it so that we can properly validate this code path and include it in the tests. https://github.com/scikit-hep/awkward-1.0/issues/new?assignees=&labels=bug+%28unverified%29&template=bug-report.md&title=

Does it depend on this depth of nesting? Yes.

>>> y[0]
<Array [[1, 2, 3, 4], [5, 6, 7, 8]] type='2 * 4 * int64'>
>>> t[0]
<Array [[3], [3]] type='2 * 1 * ?int64'>
>>> y[0][t[0]]
<Array [[4, 4, 4, 4], [8, 8, 8, 8]] type='2 * var * ?int64'>

But—murkiness—it's not okay if the slicer (t) does not have potentially missing values.

>>> y[0][t2[0]]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/jpivarski/irishep/awkward-1.0/awkward/highlevel.py", line 996, in __getitem__
    tmp = ak._util.wrap(self.layout[where], self._behavior)
ValueError: in RegularArray attempting to get 3, index out of range

(https://github.com/scikit-hep/awkward-1.0/blob/1.4.0/src/cpu-kernels/awkward_RegularArray_getitem_next_array_regularize.cpp#L19)

Oh, but I guess that's because this has become a strictly regular array problem, and it's deferring to NumPy rules.

>>> y0 = ak.to_numpy(y[0])
>>> t20 = ak.to_numpy(t2[0])
>>> y0
array([[1, 2, 3, 4],
       [5, 6, 7, 8]])
>>> t20
array([[3],
       [3]])
>>> y0[t20]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: index 3 is out of bounds for axis 0 with size 2

That's the design—the potentially missing values makes the slicer be interpreted as a nested slice—but a bit counter-intuitive because it looks like a small change from the case in which the values are not formally nullable. But NumPy does not ever do nested slices; each dimension has to be a separate array in a tuple, and any multidimensionalness in the slicer is taken to reshape the output. Both of those considerations presuppose rectilinear data.

So this "murkiness," at least, is understood. It's a poor fit to advanced NumPy slicing features and the idea of variable-length data. But the FIXME error message should not be encountered. y[t2] may be the simplest case that demonstrates it, and the fact that y[0][t[0]] doesn't demonstrate it seems like a major clue.

@agoose77
Copy link
Collaborator Author

Yes, it seems that this code path is indeed coming from ListOffsetArrayOf<T>, given this call stack:

awkward::ListArrayOf<long>::getitem_next_jagged ListArray.cpp:2016
awkward::Content::getitem_next_jagged Content.cpp:1480
awkward::ListOffsetArrayOf<long>::getitem_next_jagged ListOffsetArray.cpp:811
awkward::RegularArray::getitem_next RegularArray.cpp:1554
awkward::Content::getitem_next Content.cpp:1449
awkward::Content::getitem Content.cpp:1398
getitem<awkward::ListOffsetArrayOf<long> > content.cpp:819
pybind11::detail::argument_loader<awkward::ListOffsetArrayOf<long> const&, pybind11::object const&>::call_impl<pybind11::object, pybind11::object (*&)(awkward::ListOffsetArrayOf<long> const&, pybind11::object const&), 0ul, 1ul, pybind11::detail::void_type>(pybind11::object (*&)(awkward::ListOffsetArrayOf<long> const&, pybind11::object const&), std::integer_sequence<unsigned long, 0ul, 1ul>, pybind11::detail::void_type&&) && cast.h:2042
pybind11::detail::argument_loader<awkward::ListOffsetArrayOf<long> const&, pybind11::object const&>::call<pybind11::object, pybind11::detail::void_type, pybind11::object (*&)(awkward::ListOffsetArrayOf<long> const&, pybind11::object const&)>(pybind11::object (*&)(awkward::ListOffsetArrayOf<long> const&, pybind11::object const&)) && cast.h:2014
pybind11::cpp_function::initialize<pybind11::object (*&)(awkward::ListOffsetArrayOf<long> const&, pybind11::object const&), pybind11::object, awkward::ListOffsetArrayOf<long> const&, pybind11::object const&, pybind11::name, pybind11::is_method, pybind11::sibling>(pybind11::object (*&)(awkward::ListOffsetArrayOf<long> const&, pybind11::object const&), pybind11::object (*)(awkward::ListOffsetArrayOf<long> const&, pybind11::object const&), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&)::{lambda(pybind11::detail::function_call&)#3}::operator()(pybind11::detail::function_call&) const pybind11.h:192
pybind11::cpp_function::initialize<pybind11::object (*&)(awkward::ListOffsetArrayOf<long> const&, pybind11::object const&), pybind11::object, awkward::ListOffsetArrayOf<long> const&, pybind11::object const&, pybind11::name, pybind11::is_method, pybind11::sibling>(pybind11::object (*&)(awkward::ListOffsetArrayOf<long> const&, pybind11::object const&), pybind11::object (*)(awkward::ListOffsetArrayOf<long> const&, pybind11::object const&), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&)::{lambda(pybind11::detail::function_call&)#3}::_FUN(pybind11::detail::function_call&) pybind11.h:170
pybind11::cpp_function::dispatcher pybind11.h:767
cfunction_call 0x0000564ccb04b145
_PyObject_MakeTpCall 0x0000564ccb031bb6
_PyObject_VectorcallTstate abstract.h:116
_PyObject_VectorcallTstate abstract.h:103
method_vectorcall 0x0000564ccb08bfda
_PyObject_VectorcallTstate abstract.h:118
vectorcall_unbound 0x0000564ccb098359
vectorcall_method 0x0000564ccb098359
slot_mp_subscript 0x0000564ccb098359
PyObject_GetItem 0x0000564ccb073987
_PyEval_EvalFrameDefault 0x0000564ccb0ca7e6
_PyEval_EvalFrame pycore_ceval.h:40
function_code_fastcall 0x0000564ccb09803b
_PyFunction_Vectorcall 0x0000564ccb09803b
_PyObject_VectorcallTstate abstract.h:118
vectorcall_unbound 0x0000564ccb09803b
vectorcall_method 0x0000564ccb09803b
slot_mp_subscript 0x0000564ccb09803b
PyObject_GetItem 0x0000564ccb073987
_PyEval_EvalFrameDefault 0x0000564ccb0ca7e6
_PyEval_EvalFrame pycore_ceval.h:40
_PyEval_EvalCode 0x0000564ccb024560
_PyEval_EvalCodeWithName 0x0000564ccb109227
PyEval_EvalCodeEx 0x0000564ccb109269
PyEval_EvalCode 0x0000564ccb10928b
run_eval_code_obj 0x0000564ccb13bdc9
run_mod 0x0000564ccb176894
pyrun_file 0x0000564ccafffe4a
pyrun_simple_file 0x0000564ccb17ca02
PyRun_SimpleFileExFlags 0x0000564ccb17ca02
pymain_run_file 0x0000564ccb17d0d5
pymain_run_python 0x0000564ccb17d0d5
Py_RunMain 0x0000564ccb17d0d5
Py_BytesMain 0x0000564ccb17d229
__libc_start_main 0x00007ff2ed7ed565
_start 0x0000564ccb0f63a1

@agoose77
Copy link
Collaborator Author

agoose77 commented Jul 20, 2021

The line that introduces the SliceVarNewAxis is https://github.com/scikit-hep/awkward-1.0/blob/eb61e07f52e06708d6c6d4238014218fd5288eb7/src/libawkward/array/RegularArray.cpp#L971

This is called from ListOffsetArray::asslice for my toy layout y in
https://github.com/scikit-hep/awkward-1.0/blob/eb61e07f52e06708d6c6d4238014218fd5288eb7/src/libawkward/array/ListOffsetArray.cpp#L1189

As to the interpretation of what this means ... I'm not quite up to speed there yet.

@agoose77
Copy link
Collaborator Author

agoose77 commented Jul 20, 2021

So, as per Gitter, I determined that SliceVarNewAxis is an undocumented (unsurprising given how much has been added recently!) feature that supports broadcasting 1-size dimensions in jagged advanced indices.

I.e.

>>> array = ak.Array(
...     [[[1.1, 2.2, 3.3], [3, 4], [4.4, 5.5], [6.6, 7.7], [8, 6], [7.7, 8.8, 9.9]]]
... )
>>> array
<Array [... 7.7], [8, 6], [7.7, 8.8, 9.9]]] type='1 * var * var * float64'>
>>> ix = ak.Array([[0, -1]])
>>> ix
<Array [[0, -1]] type='1 * var * int64'>
>>> array[ix]
<Array [... 1.1, 2.2, 3.3], [7.7, 8.8, 9.9]]] type='1 * var * var * float64'>
>>> array[ix[np.newaxis, ...]]
<Array [... [6.6, 7.7], [8, 6], [7.7, 9.9]]] type='1 * var * var * float64'>

Clearly, array[ix] is taking the first and last subarrays, whereas array[ix[np.newaxis, ...]] is picking the first and last elements across all subarrays.

This is being raised for my test case because t is derived from a RegularArray, and so its size=1 dimension (from keepdims=True) is also regular. This is why the bug is not triggered when the keepdims=True dimension is irregular in y.

Given my current understanding, I think we actually want to remove this exception altogether - isn't this method required?

@jpivarski
Copy link
Member

If removing the exception altogether returns the right result, then that's the best way to resolve this issue. The FIXME was added because I wasn't sure whether the whole function could ever be reached. If it can be reached and it does the right thing, then the function should stay and the exception should be removed. If it can be reached and it does the wrong thing, then it will unfortunately require more scrutiny.

@agoose77
Copy link
Collaborator Author

agoose77 commented Jul 21, 2021

In disabling this exception, the result fails with

ValueError: cannot fit jagged slice with length 1 into ListArray64 of size 2

(https://github.com/scikit-hep/awkward-1.0/blob/1.4.0/src/libawkward/array/ListArray.cpp#L1827)

I think the problem is in the translation from the SliceVarNewAxis to SliceJagged in https://github.com/scikit-hep/awkward-1.0/blob/5bb81f0a103277495b391723d61e86c6bac9c74b/src/libawkward/array/ListArray.cpp#L2036-L2050

Given these two examples:

def test_list_jagged_t():
    y = ak.Array(
        ak.layout.ListOffsetArray64(
            ak.layout.Index64(np.r_[0, 2]),
            ak.layout.ListArray64(
                ak.layout.Index64(np.r_[0, 4]),
                ak.layout.Index64(np.r_[4, 8]),
                ak.layout.NumpyArray(
                    np.array([0, 1, 2, -1, 0, 1, 2, 3])
                )
            )
        )
    )
    t = ak.argmax(y, axis=-1, keepdims=True, mask_identity=False)

    print(y[t])


def test_list_regular_t():
    y = ak.Array(
        ak.layout.ListOffsetArray64(
            ak.layout.Index64(np.r_[0, 2]),
            ak.layout.ListArray64(
                ak.layout.Index64(np.r_[0, 4]),
                ak.layout.Index64(np.r_[4, 8]),
                ak.layout.NumpyArray(
                    np.array([0, 1, 2, -1, 0, 1, 2, 3])
                )
            )
        )
    )
    t = ak.to_regular(ak.argmax(y, axis=-1, keepdims=True, mask_identity=False), axis=-1)

    print(y[t])

the where argument in Content::getitem corresponding to t in test_list_jagged_t is

[jagged([0, 2], jagged([0, 1, 2], array([2, 3])))]

conversely, for test_list_regular_t, where is initially

[jagged([0, 2], newaxis(array([2, 3])))]

ListArrayOf<T>::getitem_next_jagged then expands the SliceVarNewAxis into a SliceJagged, producing

jagged([0, 2], array([2, 2]))

which would be equivalent to an initial where of

[jagged([0, 2], jagged([0, 2], array([2, 2])))]

Given that I am not fully up to speed with the complexity here, I have made some guesses that might be wrong. I have observed that Content::getitem wraps the content in a single list with RegularArray. I don't know why this is, but it means that at each level of getitem_next_jagged, the current slice corresponds to child of the layout, i.e.

[jagged([0, 2], jagged([0, 1, 2], array([2, 3])))]
          ___________/               \_______________
        /                                             \
ListArray[1]::getitem_next_jagged  ListArray[2]::getitem_next_jagged

For this inner ListArray, the offsets are wrong for the regular case. The logic that does this is quite simple:
https://github.com/scikit-hep/awkward-1.0/blob/5bb81f0a103277495b391723d61e86c6bac9c74b/src/libawkward/array/ListArray.cpp#L2048

We are just taking the offsets off this parent array, whereas we need to use the offsets of the child.

At this point, I think my lack of knowledge about the design at play means I am unable to make any informed decision as to how to proceed!

This feature is technically not documented anywhere, so it's partially hidden behind a compile-time feature flag (namely, a big loud exception)! As such, you probably could leave this until v2.

In the mean-time, I'll ensure that any jagged lookups don't have regular 1-dims.

@jpivarski
Copy link
Member

@agoose77 I've been investigating this issue. Normally, the mixed variable and regular slicer that you made would be declared invalid: all-variable dimensions triggers Awkward advanced indexing and all-regular triggers NumPy advanced indexing; a mixture would be confusing, so it's not allowed.

However, it sneaks through because length-1 regular arrays in a slice are interpreted as SliceVarNewAxis: https://github.com/scikit-hep/awkward-1.0/pull/694/files#diff-a63810de74c2520ec41382cece2d156993c47ba9eb69772ce6b10a8262536e22

The use of the word "newaxis" is a little misleading; this is not representing a np.newaxis object in a slice, but a length-1 regular axis that was probably made by a np.newaxis. These tests demonstrate its use: https://github.com/scikit-hep/awkward-1.0/pull/694/files#diff-822ebabcc1ec64f9f91037a24a786bc869db7a9ebd334cf14189f5d8c0149988 (the np.newaxis modifies the slicer, which is then used to slice the array. SliceVarNewAxis objects are created when slicing array, not when slicing slicer.

This feature was added in a rush to prepare a tutorial that never happened (https://github.com/jpivarski-talks/2021-02-09-propose-scipy2021-tutorial/blob/main/prep/million-song.ipynb). It was the most minimal way I could see to add features that were necessary to do the analysis in that tutorial. The idea was that this is replicating a NumPy feature—boradcasting length-1 dimensions in slices—in Awkward advanced indexing. But Awkward advanced indexing fits a nested structure in the slicer to the nested structure that you're slicing, whereas NumPy advanced indexing slices each dimension by a different array, and all of those arrays in the tuple are broadcasted. NumPy advanced indexing is truly broadcasting because there are multiple arrays in the slicer; Awkward advanced indexing has only one array in the slicer, so it's not really broadcasting.

Treating a length-1 dimension differently from any other length makes this rule hard to predict. The idea was that you'd get the length-1 dimension from a np.newaxis, but in your case, you got it from a reducer with keepdims=True. I'm thinking this was a bad rule to have introduced: it has unforeseen consequences.

In the test suite, the rule is only triggered in the tests that were added to check it. The rule was never advertised (after all, that tutorial was never presented), and it is unlikely to have made its way into any analyses other than yours, since it's rather easy to trigger the FIXME (which is much older, and may yet be unreachable without the new SliceVarNewAxis rule). Slices that are enabled by the rule can still be performed without it—for instance, this test using the rule:

    array = ak.Array(
        [
            [[0, 1, 2, 3, 4], [5, 6, 7, 8, 9], [10, 11, 12, 13, 14]],
            [[15, 16, 17, 18, 19], [20, 21, 22, 23, 24], [25, 26, 27, 28, 29]],
        ]
    )
    slicer = ak.Array([[3, 4], [0, 1, 2, 3]])
    assert array[slicer[:, np.newaxis]].tolist() == [
        [[3, 4], [8, 9], [13, 14]],
        [[15, 16, 17, 18], [20, 21, 22, 23], [25, 26, 27, 28]],
    ]

can be replaced by the following, without the new rule:

    assert array[[[[3, 4], [3, 4], [3, 4]], [[0, 1, 2, 3], [0, 1, 2, 3], [0, 1, 2, 3]]]].tolist() == [
        [[3, 4], [8, 9], [13, 14]],
        [[15, 16, 17, 18], [20, 21, 22, 23], [25, 26, 27, 28]],
    ]

Your use-case is also possible, but only if the slicer is all-variable (in keeping with the rule to avoid confusion between Awkward advanced indexing and NumPy advanced indexing):

>>> y = ak.Array([[[1, 2, 3, 4], [5, 6, 7, 8]]])
>>> t = ak.argmax(y, axis=-1, keepdims=True)
>>> y[t]
<Array [[[4], [8]]] type='1 * var * var * ?int64'>

(y is allowed to have regular dimensions, but t can't mix regular with irregular. You could keep y irregular or make t regular.)

So I think I'm going to remove it, which reverts only PR: #694.

@jpivarski
Copy link
Member

I will be removing it: #1027

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug The problem described is something that must be fixed
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants