indexing with list not supported #1029

brentp · 2022-05-17T09:02:55Z

Minimal, reproducible code sample, a copy-pastable example if possible

import numpy as np
import zarr

rows = [3, 4]
a = np.arange(20).reshape((10, 2))

print(a[rows]) # OK

z = zarr.array(a)
print(z[rows])  # raises IndexError

Problem description

Based on this, I expected this to work and to return the same rows as NumPy, but it raises an IndexError.

Version and installation information

Please provide the following:

Value of zarr.__version__: 2.11.3

joshmoore · 2022-05-17T09:15:53Z

Hi @brentp. Which cells are you expecting to receive? e.g.

In [1]: import zarr, numpy as np

In [2]: z = zarr.array(np.arange(20).reshape(10,2))

In [3]: z[:]
Out[3]:
array([[ 0,  1],
       [ 2,  3],
       [ 4,  5],
       [ 6,  7],
       [ 8,  9],
       [10, 11],
       [12, 13],
       [14, 15],
       [16, 17],
       [18, 19]])

In [4]: z[[0, 1], [1, 0]]
Out[4]: array([1, 2])

brentp · 2022-05-17T09:41:52Z

Hi, I expect to get the same result as NumPy, but instead it raises an IndexError. I have updated the example to be more complete with this code:

import numpy as np
import zarr

rows = [3, 4]
a = np.arange(20).reshape((10, 2))

print(a[rows])

z = zarr.array(a)
print(z[rows])

joshmoore · 2022-05-17T09:49:06Z

Ah, I see. I remember there being limitations but I don't remember this one. @jni?

jakirkham · 2022-05-17T18:00:33Z

Zarr also supports oindex and vindex, which may be another way to get at the functionality needed here.

joshmoore · 2022-05-17T20:18:52Z

z.oindex[rows] certainly returns the same values, but vindex is being used by z[rows].

jakirkham · 2022-05-17T20:21:08Z

There's some overlap between these, which can be a confusing aspect of advanced indexing. That is why vindex & oindex were added before advanced indexing was.
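To make the overlap concrete, here is a small NumPy-only sketch of the two selection styles that vindex and oindex correspond to. It uses `np.ix_` to emulate orthogonal (outer) selection; that is an illustration in plain NumPy, not a statement about how Zarr implements `oindex`.

```python
import numpy as np

a = np.arange(20).reshape((10, 2))

# Vectorized (pointwise) selection: the index lists are paired element by
# element, so this picks a[0, 1] and a[1, 0].
pointwise = a[[0, 1], [1, 0]]

# Orthogonal (outer) selection: every listed row is combined with every
# listed column, so the result is 2-D.
outer = a[np.ix_([0, 1], [1, 0])]

# With a single list of rows the two styles agree, which is the overlap.
rows = [3, 4]
single = a[rows]
outer_rows = a[np.ix_(rows, [0, 1])]

print(pointwise)                           # [1 2]
print(outer)                               # [[1 0] [3 2]] as a 2x2 array
print(np.array_equal(single, outer_rows))  # True
```

The styles only diverge once more than one axis is indexed with a list, which is why the single-list case from the original report can be served by either.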

jni · 2022-05-18T05:06:02Z

Yes, basically as I remember it @shoyer raised the point that mixing integer indexing with slices (which this use case implicitly uses) can be tricky to get right, so we left that out of the fancy indexing implementation in #725 for a braver soul to tackle. 😜 The discussion starts here:

#725 (comment)

As I see it, the next "easy" step is to allow leading integer indexing, which won't reorder axes in either NumPy or zarr.Array.vindex (I think!). What do you think about this @joshmoore @jakirkham @shoyer?

joshmoore · 2022-05-31T14:25:02Z

Sorry, I failed to note that no one else has responded. I'm all for incremental steps, but maybe there's a question of how to document the rest of the steps needed for the likes of @brentp.

joshmoore · 2022-07-11T12:52:04Z

Having apparently killed the conversation, I'm inclined to say, "Go for it, @jni! 👏🏽"

jni · 2022-07-13T09:53:47Z

I'm currently on leave, but a PR would be pretty easy; hopefully @brentp or someone else can handle it. This is the bit of code that needs modifying:

zarr-python/zarr/core.py

Lines 784 to 789 in 9ce2f32

    
           fields, pure_selection = pop_fields(selection) 
        
           if is_pure_fancy_indexing(pure_selection, self.ndim): 
        
               result = self.vindex[selection] 
        
           else: 
        
               result = self.get_basic_selection(pure_selection, fields=fields) 
        
           return result

Before the else:, add an elif clause:

elif is_just_a_list_of_integers(pure_selection):
    return self.oindex[pure_selection]

For an appropriate implementation of "is just a list of integers." 😂

You probably want to also do this on __setitem__, here.
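One possible reading of "is just a list of integers" is sketched below in plain Python. The function name comes from the pseudocode above and is hypothetical; it is not an existing Zarr helper. Rejecting booleans and empty lists is an assumption made here to stay conservative, not something the thread specifies.

```python
from numbers import Integral


def is_just_a_list_of_integers(selection):
    """Return True if `selection` is a non-empty list containing only integers.

    Hypothetical helper for illustration; not part of the Zarr API.
    """
    return (
        isinstance(selection, list)
        and len(selection) > 0
        # bool is a subclass of int, but a list of booleans is a mask
        # selection, so it is excluded explicitly (assumption).
        and all(
            isinstance(item, Integral) and not isinstance(item, bool)
            for item in selection
        )
    )


print(is_just_a_list_of_integers([3, 4]))            # True
print(is_just_a_list_of_integers([True, False]))     # False
print(is_just_a_list_of_integers(([0, 1], [1, 0])))  # False (a tuple of lists)
```

A real implementation would probably also need to accept one-dimensional integer arrays, which this sketch leaves out.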

AndreasAlbertQC · 2023-01-24T09:55:40Z

I'd be interested in implementing a solution to this issue. Beyond what was discussed above, I'm also interested in multi-dimensional indexing with slices. Something like:

# Example from above
rows = [3, 4]
a = np.arange(20).reshape((10, 2))
print(a[rows])

# What I'd like to have in addition
a[rows, :]
a[rows, 1:3]

columns = [1, 5, 4]
a[:, columns]
a[3:4, columns]
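For reference, this is what NumPy returns for those forms. The sketch assumes a wider array of shape (10, 6) rather than the (10, 2) array used earlier, so that the column indices 1, 5 and 4 are in bounds; that shape is chosen here only for illustration.

```python
import numpy as np

# Assumed shape (10, 6) so that column index 5 is valid.
a = np.arange(60).reshape((10, 6))
rows = [3, 4]
columns = [1, 5, 4]

shapes = {
    "a[rows]": a[rows].shape,
    "a[rows, :]": a[rows, :].shape,
    "a[rows, 1:3]": a[rows, 1:3].shape,
    "a[:, columns]": a[:, columns].shape,
    "a[3:4, columns]": a[3:4, columns].shape,
}

for expression, shape in shapes.items():
    print(expression, shape)

# The list keeps its order, so the columns come back as 1, 5, 4.
print(a[3:4, columns])  # [[19 23 22]]
```

In each case the list replaces one axis and the slices keep their own axes, so no axis is reordered.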

The solution proposed by @jni would have to be expanded a little bit. I think this should work:

def is_pure_orthogonal_indexing(selection, ndim):
    # Case 1: selection is a single iterable of integers.
    if is_integer_list(selection) or is_integer_array(selection, ndim=1):
        return True

    # Case 2: selection contains either zero or one integer iterables.
    # All other selection elements are slices.
    return (
        isinstance(selection, tuple)
        and len(selection) == ndim
        and sum(is_integer_list(elem) or is_integer_array(elem) for elem in selection) <= 1
        and all(
            is_integer_list(elem) or is_integer_array(elem) or isinstance(elem, slice)
            for elem in selection
        )
    )

And then the elif proposed by @jni would be elif is_pure_orthogonal_indexing(pure_selection, self.ndim).

If there are no objections to this approach, I'll be happy to make a PR.

shoyer · 2023-01-24T18:04:03Z

NumPy's combined advanced and basic indexing is a little complex, but not terrible: https://numpy.org/doc/stable/user/basics.indexing.html#combining-advanced-and-basic-indexing

As long as you follow those rules exactly (and test for compatibility with NumPy's rules), we can definitely extend the cases Zarr supports.
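A compatibility check along those lines could look like the sketch below. `orthogonal_select` is a hypothetical stand-in written in plain NumPy, not Zarr's `oindex`; it applies each index along its own axis in turn. With at most one integer list and otherwise slices, it should agree with NumPy's own result.

```python
import numpy as np


def orthogonal_select(array, selection):
    """Apply each list or slice along its own axis, one axis at a time.

    Hypothetical reference implementation for testing only. Lists and
    slices both keep their axis, so axis numbers stay valid throughout.
    """
    out = array
    for axis, index in enumerate(selection):
        key = (slice(None),) * axis + (index,)
        out = out[key]
    return out


x = np.arange(5 * 6 * 7).reshape((5, 6, 7))

# Selections with at most one integer list; the rest are slices.
selections = [
    ([3, 4], slice(None), slice(1, 3)),
    (slice(None), [1, 5, 4], slice(None)),
    (slice(3, 4), slice(None), [0, 6]),
]
matches = [np.array_equal(orthogonal_select(x, sel), x[sel]) for sel in selections]
print(matches)  # [True, True, True]

# Where NumPy's rules get subtle: advanced indices separated by a slice
# move the broadcast dimension to the front of the result.
print(x[0, :, [0, 1]].shape)  # (2, 6)
print(x[:, 0, [0, 1]].shape)  # (5, 2)
```

The last two lines show the axis-reordering case that the linked NumPy documentation describes, which is the part a more general implementation would need to test carefully.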

MSanKeys963 · 2023-03-27T19:03:50Z

Safe to close this in favour of #1333?

jni · 2023-03-28T07:26:34Z

Yes, this was fixed by #1333. Thanks for the reminder, @MSanKeys963!

MSanKeys963 mentioned this issue Dec 20, 2022

follow-up issues in zarr-python zarr-developers/community#57

Open

AndreasAlbertQC mentioned this issue Jan 25, 2023

More extensive orthogonal indexing in get/setitem #1333

Merged

5 tasks

jni closed this as completed Mar 28, 2023
