keys, eachindex and linearindices consistency #25901

nalimilan · 2018-02-05T21:53:50Z

The naming of our multiple index-related functions appears to be inconsistent to me despite recent improvements. In particular, the most generic function returning the natural type of indices for collections is now keys, but the most frequent type of collection in Julia is certainly arrays, for which we use the term "index"/"indices" everywhere else. Then we have eachindex which returns the most efficient type of indices (linear or cartesian depending on array types), and linearindices which always returns linear indices.

linearindices(x) can be deprecated in favor of LinearIndices(x), which already works, and which is parallel to CartesianIndices(x), which also works.
We could find a more explicit name for eachindex, like fastindices.
Since indices is now deprecated, maybe we could use that term instead of keys in 1.0 (not in 0.7). This was indeed proposed in the PR which renamed indices to axes. As it's been noted elsewhere, it sounds more acceptable to say that a dict has indices than to say that an array has keys. It's also more consistent with getindex/setindex. That way we could avoid saying "indices or keys" in the docs (e.g. for findall) and just say "indices" everywhere. haskey would have to be changed to hasindex (which could BTW be unified with checkbounds).

Cc: @andyferris @timholy

The text was updated successfully, but these errors were encountered:

ararslan · 2018-02-05T22:07:07Z

💯 for "indices" terminology everywhere. IMO "keys" for arrays is pretty nonsensical to a layperson without explanation. While I honestly I don't see the need to unify terminology in array and dictionary interfaces, I think "index" and "indices" are much better and more descriptive names in general. (Certainly "index" for a dictionary is better than "key" for an array...)

stevengj · 2018-02-07T19:45:01Z

Couldn't we rename both eachindex and keys to indices? I don't really like fastindices; I don't think the function name should make performance promises here.

mbauman · 2018-02-07T19:53:34Z

I agree, but the stumbling block is that we want to use keys (or whatever its replacement will be) generically to provide return values for APIs like find, but we want eachindex (or its replacement) to allow for a more optimal index type. Ref #22907

JeffBezanson · 2018-02-08T21:17:15Z

LinearIndices seems to have inconsistent definitions:

julia> LinearIndices([3,4])[1]
1

julia> LinearIndices(3:4)[1]
-1

There's an ambiguity as to whether you're passing indices or an array. That should be resolved anyway, but will definitely need to be resolved to replace linearindices.

nalimilan · 2018-02-08T21:32:00Z

One solution would be to require passing a tuple of ranges for indices, i.e. LinearIndices((3:4,)). The other is to require doing LinearIndices(size([3, 4])). Note that the method taking an AbstractArray is actually not documented.

JeffBezanson · 2018-02-15T21:13:27Z

Triage likes the idea of making indices and keys aliases in 1.0.

Jutho · 2018-02-22T14:16:06Z

Not directly a solution to the proposal, but a related idea that might be helpful here: How about not allowing linear indexing in arrays with a plain Int. If you want to use linear indexing, the linear index should be wrapped in a simple struct LinearIndex.

A general array would then only have the following methods

getindex(::AbstractArray{T,N}, I::Vararg{Int,N}}) where {T,N}
getindex(::AbstractArray{T,N}, i::LinearIndex) where {T,N}

linearindices would then probably no longer be needed, even in a one-dimensional offset array you could index with 1:length(A) provided that you wrap them in LinearIndex. Or it would still exist as a convenience method with the definition LinearIndex(i) for i = 1:length(A). If required, some basic algebra could be implemented on LinearIndex should that they behave as integers under addition, subtraction and multiplication with scalars.

Another advantage would be that new types of indices would be allowed. You could directly index into the parent of a SubArray with ParentIndex(i), or into a specific memory location for a strided array (see my comment in #26150) with MemoryIndex(i), ...

Jutho · 2018-02-23T08:22:23Z

And almost too trivial to mention, the similarity/analogy with CartesianIndex. Whereas the elements of the iterator returned by CartesianIndices are of type CartesianIndex, the elements of the iterator returned by LinearIndices would be of type LinearIndex.

Jutho · 2018-02-27T22:59:40Z

I would be very interested if you would care to elaborate on the downvote, @stevengj ? I don't quite see the downside of this. Most of the times you will get your linear index from an iterator such as linearindices(A), so why would you care whether it is really an Int. Do you often need to index with hard coded integers directly?

stevengj · 2018-02-28T17:59:05Z

Anything that requires wrapping another type around the indices makes things harder to work with, especially for newcomers. For casual exploration and prototyping, one usually doesn't work too hard at making code extremely generic to handle unusual cases (non 1-based arrays), and typing a[m:n] directly is convenient. The argument for a special index type is even stronger for strings, and yet we still opted not to have a special StringIndex type in #9297.

It's important to recognize that there is a downside to every added layer of abstraction we impose. I'm not convinced that we actually gain much by a linear-index abstraction.

Jutho · 2018-02-28T23:54:13Z

Ok, while that makes sense to some extent, I am wondering in how many cases casual exploration and prototyping leads to linear indexing into a higher-dimensional array. Anyway, not something I care strongly about, just thought it was a useful proposal, for consistency and extensibility.

andyferris · 2018-03-01T06:05:05Z

My thoughts on what makes linear indices special: as opposed to CartesianIndex{n}, they seem to fill 3 functions:

A faster way of accessing a multidimensional array - it is a "token" that points us to the given element (especially for dense arrays).
They seem to be the primary way of indexing a vector (i.e. keys never returns a collection of CartesianIndex{1}s)
At a casual glance, they appear to be a way of "enumerating" a multidimensional array - they tend to agree with iteration order and are occasionally used that way (@Jutho this might be an example of a casual, hacky usage). For Vector, which is not only a linear algebra vector but also our dynamic, ordered "list" type, just like a C# List or a C++ std::vector, this is in fact one of its key (no pun intended) indexing properties.

Note that 2. and 3. can't both hold for offset vectors (higher dimensional offset arrays are fine).

Regarding 1. I feel we are eventually going to need a more formalized token system for AbstractDict in order to have very high-performance dictionary operations (important for lots of reasons, including data (science/analytics) stuff) and because a "token lookup" operation should never, ever be confused for a "key lookup" I predict that eventually we'll have functions like gettoken and settoken!. If/once that occurs, I wonder what we'll do with the "token lookup" system for arrays (i.e. linear indices)? Would we move that over to gettoken and settoken!? In general, the best tokens for lookup and iteration might not even be Int - they should be an implementation detail of the array - so maybe these tokens are distinct to linear indices?

Regarding 3. I have wondered whether (a) an interface for getting/setting the nth element of an iterable should be separately provided (distinct from indexing, rather a generalization of first and last), and (b) how / if the system of tokens relate to the iteration information emitted & consumed by start, next and done?

Sorry, I really have more questions than answers here. I would be glad to be rid of the keys terminology, at the very least.

vtjnash · 2018-03-01T06:22:53Z

For token discussion, cross-ref #24454 and linked issues therein (esp #12157)

nalimilan · 2018-03-07T22:14:13Z

@timholy What are your thoughts about deprecating linearindices in favor of LinearIndices (after adding efficient methods for iteration)?

nalimilan · 2018-03-29T13:19:49Z

Actually linearindices and LinearIndices currently behave differently for OffsetVectors (but not for multidimensional OffsetArrays):

julia> x = TestHelpers.OAs.OffsetArray(1:3, (3,))
Main.TestHelpers.OAs.OffsetArray{Int64,1,UnitRange{Int64}} with indices 4:6:
 1
 2
 3

julia> linearindices(x)
4:6

julia> LinearIndices(axes(x))
LinearIndices{1,Tuple{UnitRange{Int64}}} with indices 4:6:
 1
 2
 3

julia> y = TestHelpers.OAs.OffsetArray([1 2; 3 4], (3, 2))
Main.TestHelpers.OAs.OffsetArray{Int64,2,Array{Int64,2}} with indices 4:5×3:4:
 1  2
 3  4

julia> linearindices(y)
Base.OneTo(4)

julia> LinearIndices(axes(y))
LinearIndices{2,Tuple{UnitRange{Int64},UnitRange{Int64}}} with indices 4:5×3:4:
 1  3
 2  4

It's kind of confusing to have two functions whose names differ only in case and which behave so differently. The behavior of linearindices is needed to be able to index into vectors with non-standard indices (because we cannot distinguish linear indexing from "single-index cartesian indexing"). But maybe the behavior of LinearIndices for vectors could be changed? What matters is that conversion between linear and cartesian indices works, and a clever choice of values and custom indices could do the trick.

nalimilan added domain:arrays [a, r, r, a, y, s] status:triage This should be discussed on a triage call labels Feb 5, 2018

JeffBezanson removed the status:triage This should be discussed on a triage call label Feb 15, 2018

vtjnash closed this as completed Mar 1, 2018

vtjnash reopened this Mar 1, 2018

nalimilan mentioned this issue Apr 2, 2018

RFC: Make LinearIndices consistent with linearindices for vectors #26682

Merged

nalimilan mentioned this issue Apr 10, 2018

Deprecate linearindices in favor of LinearIndices #26775

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

keys, eachindex and linearindices consistency #25901

keys, eachindex and linearindices consistency #25901

nalimilan commented Feb 5, 2018

ararslan commented Feb 5, 2018

stevengj commented Feb 7, 2018

mbauman commented Feb 7, 2018 •

edited

Loading

JeffBezanson commented Feb 8, 2018

nalimilan commented Feb 8, 2018

JeffBezanson commented Feb 15, 2018

Jutho commented Feb 22, 2018

Jutho commented Feb 23, 2018

Jutho commented Feb 27, 2018

stevengj commented Feb 28, 2018 •

edited

Loading

Jutho commented Feb 28, 2018

andyferris commented Mar 1, 2018 •

edited

Loading

vtjnash commented Mar 1, 2018 •

edited

Loading

nalimilan commented Mar 7, 2018

nalimilan commented Mar 29, 2018

keys, eachindex and linearindices consistency #25901

keys, eachindex and linearindices consistency #25901

Comments

nalimilan commented Feb 5, 2018

ararslan commented Feb 5, 2018

stevengj commented Feb 7, 2018

mbauman commented Feb 7, 2018 • edited Loading

JeffBezanson commented Feb 8, 2018

nalimilan commented Feb 8, 2018

JeffBezanson commented Feb 15, 2018

Jutho commented Feb 22, 2018

Jutho commented Feb 23, 2018

Jutho commented Feb 27, 2018

stevengj commented Feb 28, 2018 • edited Loading

Jutho commented Feb 28, 2018

andyferris commented Mar 1, 2018 • edited Loading

vtjnash commented Mar 1, 2018 • edited Loading

nalimilan commented Mar 7, 2018

nalimilan commented Mar 29, 2018

mbauman commented Feb 7, 2018 •

edited

Loading

stevengj commented Feb 28, 2018 •

edited

Loading

andyferris commented Mar 1, 2018 •

edited

Loading

vtjnash commented Mar 1, 2018 •

edited

Loading