RFC: Give AbstractArrays smart and performant indexing behaviors for free #10525

mbauman · 2015-03-15T19:30:00Z

~~This is still a work in progress, but I'd like to get some feedback on the architecture and design here before applying this same sort of scheme to setindex!.~~

The basic idea is that the only getindex method defined in base for abstract arrays is getindex(::AbstractArray, I...). And the only methods that an AbstractArray subtype must define are size and just one getindex method:

getindex(::T, ::Int) # if linearindexing(T) == LinearFast()
getindex(::T, ::Int, ::Int, #=...ndims(A) indices...=#) # if linearindexing(T) == LinearSlow()

Unfortunately, it is currently impossible to express the latter method for arbitrary dimensionalities, but in practice it's not a big issue: most LinearSlow() arrays have a fixed dimension.
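To make the "size plus one getindex" contract concrete, here is a minimal sketch in current Julia syntax. It assumes the later spelling of the trait (`IndexStyle`, `IndexLinear`, `IndexCartesian`) rather than the `linearindexing`/`LinearFast`/`LinearSlow` names used in this thread, and the types `Squares` and `MulTable` are hypothetical examples, not part of this PR.

```julia
# Hypothetical linear-style type: element i is i^2.
struct Squares <: AbstractVector{Int}
    n::Int
end
Base.size(s::Squares) = (s.n,)
Base.IndexStyle(::Type{Squares}) = IndexLinear()
# The single canonical method: one Int index.
Base.getindex(s::Squares, i::Int) = (@boundscheck checkbounds(s, i); i * i)

# Hypothetical cartesian-style type: element (i, j) is i*j.
# No IndexStyle definition, so the default (IndexCartesian) applies.
struct MulTable <: AbstractMatrix{Int}
    m::Int
    n::Int
end
Base.size(t::MulTable) = (t.m, t.n)
# The single canonical method: ndims(A) Int indices.
Base.getindex(t::MulTable, i::Int, j::Int) = (@boundscheck checkbounds(t, i, j); i * j)

s = Squares(5)
t = MulTable(3, 4)

s[[1, 3]]   # vector indexing comes from the generic fallbacks
s[2:4]      # range indexing too
t[5]        # linear index into a cartesian-style array
t[:, 2]     # slicing
```

Everything after the two `getindex` definitions is handled by the generic `AbstractArray` methods; neither type defines vector, range, or colon indexing itself.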

This is achieved through dispatch on an internal _getindex method, which recomputes the indices such that it can call the canonical getindex method that the user must define. If the user has not defined their canonical method, it will fall back to an error method in _getindex. I use a similar scheme for unsafe_getindex, with the exception that we can fall back to the safe version if the subtype hasn't defined the canonical unsafe method. This enables fast vector indexing by checking the bounds of the index vectors once instead of on each element. And once @inbounds is extensible, AbstractArrays will be able to support it by default.
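The redirection described above can be sketched roughly as follows. This is a toy in current Julia syntax, not the PR's implementation: `toy_getindex`, `_toy_getindex`, and `Grid` are hypothetical names, and `IndexStyle`, `LinearIndices`, and `CartesianIndices` are the present-day Base tools assumed here in place of the thread's `linearindexing` and `sub2ind`/`ind2sub`.

```julia
# Entry point: look up the array's indexing trait, then dispatch on it.
toy_getindex(A::AbstractArray, I::Int...) = _toy_getindex(IndexStyle(A), A, I)

# Linear-style arrays: fold whatever subscripts were given into one
# linear index, then call the canonical one-Int getindex.
function _toy_getindex(::IndexLinear, A, I::Tuple{Vararg{Int}})
    i = LinearIndices(A)[I...]
    return getindex(A, i)
end

# Cartesian-style arrays: expand whatever was given into a full set of
# ndims(A) subscripts, then call the canonical N-Int getindex.
function _toy_getindex(::IndexCartesian, A, I::Tuple{Vararg{Int}})
    ci = CartesianIndices(A)[I...]
    return getindex(A, Tuple(ci)...)
end

# Hypothetical cartesian-style type; bounds checks omitted for brevity.
struct Grid <: AbstractMatrix{Int}
    m::Int
    n::Int
end
Base.size(g::Grid) = (g.m, g.n)
Base.getindex(g::Grid, i::Int, j::Int) = 10i + j

A = collect(reshape(1:12, 3, 4))  # a plain Matrix, which is linear-style
g = Grid(3, 4)

toy_getindex(A, 2, 2)  # two subscripts folded to linear index 5
toy_getindex(g, 5)     # linear index 5 expanded to subscripts (2, 2)
```

Each array type only ever sees indices in its preferred form; the conversion happens once, in the trait-dispatched helper.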

The difficulty with all this redirection is that an extra function call can wreck indexing performance, and it can be hard to avoid. I've had particular difficulty getting good performance with CartesianIndexes, and I still lag in performance there by 20x for big arrays. I think call site inline annotations would be a magic bullet, but there may be other tricks we can use, too. I've not looked into this very carefully yet, though. (Fixed with a more sensible inlining strategy)

TL;DR:

In my cursory performance tests hacked onto Tim's indexing perf suite from his reshape work (more tests are needed), I'm close to matching or outperforming master with Array with only these definitions:

julia> methods(getindex, (Array, Any...))
3-element Array{Any,1}:
 getindex(A::Array{T,N},i::Int) at array.jl:304
 getindex{T<:Real}(A::Array{T,N},I::Range{T<:Real}) at array.jl:347 # Needed for bootstrap
 getindex(A::AbstractArray{T,N},I...) at abstractarray.jl:492

(Of course, in places where we're not quite able to close the gap we can always reinstate the specialized methods. This is just a very useful stress-test of both functionality and performance.)

cc: @timholy

mbauman · 2015-03-15T20:24:42Z

(~~I was expecting the tests to fail, but I'm amazed at the variety of ways they did so.~~ At a minimum, this needs ~~#9607~~ (subsumed), ~~#10337~~, and ~~#10505~~).

Jutho · 2015-03-16T06:24:49Z

base/multidimensional.jl

+    # Both index_shape and checksize compute sum(I); manually hoist it out
+    N = sum(I)
+    dest = similar(src, (N,))
+    size(dest) == (N,) || throw(DimensionMismatch())


why is this size check necessary? Haven't you just created dest of that size?

mbauman · 2015-03-16

Not really - similar has. It could give us the wrong answer, and the check here should be very cheap compared to the allocations and many assignments (although I may need to hide it in a function; I haven't profiled yet). In general, my approach has been to not trust the output of similar. Master effectively does the same thing for arrays (although similar and the check are split across method boundaries): multidimensional.jl:219
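A small sketch of the pattern under discussion: count the selected elements once, allocate with `similar`, and verify that the result really has the requested size before filling it. The function name `toy_logical_getindex` is hypothetical and the body is a simplification in current Julia syntax, not the code from the diff.

```julia
function toy_logical_getindex(src::AbstractArray, mask::AbstractArray{Bool})
    size(mask) == size(src) || throw(DimensionMismatch("mask must match src"))
    N = count(mask)              # computed once and reused below
    dest = similar(src, (N,))
    # `similar` is user-extensible, so do not trust it blindly.
    size(dest) == (N,) || throw(DimensionMismatch("similar returned an unexpected size"))
    k = 0
    for (x, keep) in zip(src, mask)
        if keep
            k += 1
            dest[k] = x
        end
    end
    return dest
end

toy_logical_getindex([10, 20, 30, 40], [true, false, true, false])
```

The size check costs one tuple comparison per call, which is small next to the allocation and the element-by-element copy.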

timholy · 2015-03-16T12:44:23Z

I'm super-excited about this, and will try to review soon. I just have pretty limited time slots right now.

johnmyleswhite · 2015-03-16T14:37:40Z

This would be so awesome if it works out.

timholy · 2015-03-17T11:42:37Z

base/abstractarray.jl

+sub2ind(dims::(Integer,Integer), i1::Integer) = i1
+sub2ind(dims::(Integer,Integer), i1::Integer, i2::Integer) = i1+dims[1]*(i2-1)
+sub2ind(dims::(Integer,Integer), i1::Integer, i2::Integer, I::Integer...) = i1+dims[1]*(i2-1+sum(I)-length(I))
+sub2ind(dims::(Integer,Integer,Integer...), i1::Integer, i2::Integer, I::Integer...) =


Do you need an @inline here to ensure good performance for higher dimensions?

mbauman · 2015-03-17

Yes, for N>5. See #10337 (comment).
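For readers unfamiliar with the arithmetic behind these methods, here is a minimal sketch of the column-major subscript-to-linear-index computation, written recursively over the dimension tuple with `@inline` on each step. `toy_sub2ind` and `_toy_sub2ind` are hypothetical names; the actual methods in the diff are structured differently, and current Julia exposes this conversion through `LinearIndices` instead of `sub2ind`.

```julia
# Public entry: exactly one subscript per dimension.
toy_sub2ind(dims::NTuple{N,Int}, I::Vararg{Int,N}) where {N} =
    _toy_sub2ind(dims, 1, 1, I...)

# Base case: no dimensions left, the accumulated index is the answer.
@inline _toy_sub2ind(::Tuple{}, stride, ind) = ind

# Recursive case: consume one dimension and one subscript per step.
@inline _toy_sub2ind(dims::Tuple, stride, ind, i, I...) =
    _toy_sub2ind(Base.tail(dims), stride * dims[1], ind + (i - 1) * stride, I...)

toy_sub2ind((3, 4), 2, 3)        # 2 + 3*(3-1)
toy_sub2ind((2, 3, 4), 2, 3, 4)  # last element of a 2x3x4 array
```

Because every step is a tiny function, the recursion only performs well if the compiler inlines the whole chain, which is the concern raised in the review comment.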

timholy · 2015-03-17T12:30:05Z

This is awesome. In broad terms, I agree this is how indexing should work. You also have a lot of clever insights into how to implement this---it was fun to read your code.

You're basically building this on top of #8227? Perhaps we should just merge that and deal with the consequences later.

> Unfortunately, it is currently impossible to express the latter method for arbitrary dimensionalities, but in practice it's not a big issue: most LinearSlow() arrays have a fixed dimension.

I'm not sure I agree with this. ReshapedArrays.jl, Interpolations.jl, and your own AxisArrays.jl (if you pass in a LinearSlow() parent) are good counter-examples. I fear we need to introduce getindexN.

Finally, scalar indexing with non-reals is a genuine possibility, see the remarks re DualNumbers.jl in #9586.

mbauman · 2015-03-17T15:19:40Z

Thanks!

> You're basically building this on top of #8227?

Yes, in spirit. I just needed some way to express getindex without a bounds check; BitArrays and a few other data structures already use this idiom, too. I really want user-extensible @inbounds, too, but after working on this I'm not sure having two independent methods is the best solution. It adds a lot of indirection here, and it'd be nice if it could work for things like sub2ind.

I think your alternative proposal in #7799 (comment) might be more attractive, but that just punts the complexity up the chain to a system I don't know and only a few could implement. The method table would keep track of the methods compiled with and without bounds checks. When compiling without bounds checks, the compiler would simply elide everything within @boundscheck. I don't think @inbounds should propagate, so every performant getindex method would be written getindex(...) = (@boundscheck …; @inbounds …).
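The `getindex(...) = (@boundscheck …; @inbounds …)` shape can be sketched as follows. Current Julia does provide a `@boundscheck` macro whose block is skipped when the method is inlined into an `@inbounds` caller; whether that matches the design being debated here in every detail is not something this thread settles. `Wrapped` and `toy_sum` are hypothetical names.

```julia
# Hypothetical wrapper around a Vector.
struct Wrapped{T} <: AbstractVector{T}
    data::Vector{T}
end
Base.size(w::Wrapped) = size(w.data)
Base.IndexStyle(::Type{<:Wrapped}) = IndexLinear()

@inline function Base.getindex(w::Wrapped, i::Int)
    @boundscheck checkbounds(w, i)  # may be elided under a caller's @inbounds
    @inbounds v = w.data[i]         # the check above already covers this access
    return v
end

function toy_sum(w::AbstractVector)
    s = zero(eltype(w))
    @inbounds for i in eachindex(w)  # indices come from eachindex, so they are valid
        s += w[i]
    end
    return s
end

toy_sum(Wrapped([1, 2, 3]))
```

The `@inbounds` inside `getindex` applies only to the inner access, so a plain call such as `Wrapped([1])[2]` still throws a `BoundsError`.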

mbauman · 2015-03-17T16:14:09Z

> The method table would keep track of the methods compiled with and without bounds checks.

Hunh, that sounds a lot like multiple dispatch. getindex(::Type{BoundsCheckOn}, x, I...)? That would actually solve the issues in #8227 with rewriting function names.
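A toy version of the dispatch idea, to show what such definitions might look like. `BoundsCheckOn` is the name floated above; `BoundsCheckOff` and `toy_get` are hypothetical additions for this sketch, which is written in current Julia syntax and is not a proposal from the thread.

```julia
struct BoundsCheckOn end
struct BoundsCheckOff end

# Checked variant: validate the index, then defer to the unchecked one.
function toy_get(::Type{BoundsCheckOn}, x::Vector, i::Int)
    checkbounds(x, i)
    return toy_get(BoundsCheckOff, x, i)
end

# Unchecked variant: the caller promises the index is valid.
function toy_get(::Type{BoundsCheckOff}, x::Vector, i::Int)
    @inbounds v = x[i]
    return v
end

# Default entry point: checks stay on unless the caller opts out.
toy_get(x::Vector, i::Int) = toy_get(BoundsCheckOn, x, i)

toy_get([1, 2, 3], 2)                  # checked
toy_get(BoundsCheckOff, [1, 2, 3], 3)  # unchecked, caller's responsibility
```

With this layout both variants share one function name, so nothing needs to be rewritten to a differently named unsafe function; the remaining question is how `x[i]` syntax would select the first argument.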

simonster · 2015-03-17T16:33:58Z

Using multiple dispatch has some elegance, but it has problems of its own.

The first problem is that it makes defining getindex uglier for cases where performance doesn't matter: if getindex(x, I...) is now getindex(::Type{BoundsCheckOn}, x, I...), then it seems that all existing definitions need to be changed. We could have a deprecation that gets thrown when an old getindex method is called, similar to @vtjnash's deprecation for convert with pointer types, but arguably users shouldn't need to think about bounds checks unless they need to turn them off for performance purposes.

The second problem is that, at least as far as I can tell, there's still a lowering issue. It's not possible to create an @inbounds macro that changes the first argument to getindex without modifying the frontend (or reproducing the lowering code in Julia), since ref is not lowered to getindex in the AST that the macro gets. Using multiple dispatch would solve @JeffBezanson's scoping concerns, but it seems we'd still need to add a new Expr or an argument to ref to specify whether bounds checks are on or off.

timholy · 2015-03-22T12:23:03Z

Following up from #10507 (comment). @JeffBezanson, your input here would be helpful: how can we restrict dispatch to calling a particular method only for a specific number of varargs? I see a couple of options:

1. Introduce getindexN as an indexing function that only gets called when length(indexes) == ndims(A)
2. WIP: ReshapedArrays #10507 (comment)
3. Introduce getindex{T,N}(A::AbstractArray{T,N}, I::NTuple{N}) (forces a tuple creation unless inlined)
4. A new notation, getindex{T,N}(A::AbstractArray{T,N}, indexes...N).

To me the last seems most attractive, followed by the first or third.
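For comparison, later versions of Julia can express "exactly N trailing arguments" with `Vararg{T,N}`. That is a different spelling from the `indexes...N` notation proposed above, and the sketch below only illustrates the dispatch behavior; `toy_kind` is a hypothetical function name.

```julia
# Selected only when the number of Int indices equals ndims(A).
toy_kind(A::AbstractArray{T,N}, I::Vararg{Int,N}) where {T,N} = :full

# Fallback for any other number of Int indices.
toy_kind(A::AbstractArray, I::Int...) = :other

M = zeros(2, 2)
toy_kind(M, 1, 1)     # as many indices as dimensions
toy_kind(M, 1)        # linear indexing, fewer indices
toy_kind(M, 1, 1, 1)  # trailing extra index
```

No tuple argument is needed at the call site, which is the drawback noted for the NTuple{N} option.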

timholy · 2015-03-23T20:08:58Z

One more option worth considering: rather than a tuple, use a CartesianIndex{N}. The main catch: in order to support things like InterpolationArray, the element type would have to be allowed to be any Number. This is a little awkward, given that the main mission for CartesianIndex is as a glorified counter.

timholy · 2015-03-25T01:56:16Z

Barring further suggestions, I'm going to (time permitting) start playing around with adding a new field to a jl_methlist_t: an index into the tvars field for the parameter specifying the length of the varargs list.
With some parser changes, that should allow us to support the syntax

getindex{T,N}(A::AbstractArray{T,N}, indexes...N)

mbauman · 2015-03-25T02:00:10Z

That would be awesome!

Minor oversight from #10525, this restores the previous behavior where indexing a SubArray by, e.g., [1 2; 3 4], returns an array of the same size with the given linear indices.

fix #11187 (pass struct and tuple objects by stack pointer)
fix #11450 (ccall emission was frobbing the stack)
likely may fix #11026 and may fix #11003 (ref #10525) invalid stack-read on 32-bit

this additionally changes the julia specSig calling convention to pass non-primitive types by pointer instead of by-value

this additionally fixes a bug in gen_cfunction that could be exposed by turning off specSig

this additionally moves the alloca calls in ccall (and other places) to the entry BasicBlock in the function, ensuring that llvm detects them as static allocations and moves them into the function prologue

this additionally fixes some undefined behavior from changing a variable's size through a alloca-cast instead of zext/sext/trunc

this additionally prepares for turning back on allocating tuples as vectors, since the gc now guarantees 16-byte alignment

future work this makes possible:
- create a function to replace the jlallocobj_func+init_bits_value call pair (to reduce codegen pressure)
- allow moving pointers sometimes rather than always copying immutable data
- teach the GC how it can re-use an existing pointer as a box



Back in #10525, I decided to deprecate `setindex!(A, x, I::Union{Real, Int, AbstractArray}...)` for symmetry since `getindex` only allows vector indices when there's more than one index. But looking forward, I would really like to work towards APL semantics in 0.5 wherein the sum of the dimensionality of the indices is the dimensionality of the output array. For example, indexing `A[[1 2; 3 4], 1]` would output a 2-dimensional `2x2` array: `[A[1,1] A[2, 1]; A[3,1] A[4,1]]`. In which case, we'd add support back in for `setindex!` with array indices for symmetry. This seems like needless churn - let's just leave things be until 0.5.
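The APL rule described above (output dimensionality equals the sum of the index dimensionalities) can be checked directly in current Julia, where it is the behavior of `getindex`. The array `A` below is an arbitrary example chosen so the values are easy to follow.

```julia
A = reshape(collect(1:20), 5, 4)   # 5x4 matrix, first column is 1:5

B = A[[1 2; 3 4], 1]               # 2-d index plus a scalar gives a 2-d result
C = A[[1 2; 3 4], [1, 2]]          # 2-d index plus a 1-d index gives a 3-d result
D = A[1:2, [1, 3]]                 # 1-d plus 1-d gives a 2-d result

size(B)
size(C)
size(D)
```

Here `B` equals `[A[1,1] A[2,1]; A[3,1] A[4,1]]`, matching the example in the paragraph above.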


Jutho mentioned this pull request Mar 15, 2015

improved ind2sub/sub2ind #10337

Merged

mbauman mentioned this pull request Mar 16, 2015

Should we expect floating point indexing to be implemented? #10154

Closed

mbauman force-pushed the mb/abstractsmarts branch from 6fc9483 to a7ea8e3 Compare March 16, 2015 02:15

Jutho reviewed Mar 16, 2015

timholy reviewed Mar 17, 2015

simonster mentioned this pull request Mar 17, 2015

RFC: Safer, extensible @inbounds #8227

Closed

timholy mentioned this pull request Mar 22, 2015

WIP: ReshapedArrays #10507

Closed

timholy mentioned this pull request Mar 23, 2015

Taking vector transposes seriously #4774

Closed

mbauman force-pushed the mb/abstractsmarts branch from 17c649b to f837e7b Compare March 23, 2015 18:56

mbauman force-pushed the mb/abstractsmarts branch from f837e7b to 84143ea Compare March 23, 2015 21:40

mbauman mentioned this pull request Mar 24, 2015

logical indexing with a vector into a matrix fails #10618

Closed

This was referenced Mar 25, 2015

Profiling user code #10628

Closed

Developer guide JuliaMath/Interpolations.jl#30

Closed

RFC: controlling dispatch with varargs of defined length #10691

Closed

mbauman force-pushed the mb/abstractsmarts branch 2 times, most recently from d6802ad to dea9350 Compare April 3, 2015 02:09

mbauman mentioned this pull request Jun 20, 2015

For Loop Slowdown #11787

Closed

davidanthoff mentioned this pull request Jun 23, 2015

Make things work with julia 0.4 queryverse/ExcelReaders.jl#4

Closed

This was referenced Jun 26, 2015

WIP: A traits-based user-extensible @inbounds #11867

Closed

extensible bounds checking removal #7799

Closed

tomasaschan added a commit to JuliaMath/Interpolations.jl that referenced this pull request Jul 22, 2015

Remove un-needed ambiguity resolving methods

2594b6b

Thanks to JuliaLang/julia#10525 we no longer need these :)

tomasaschan mentioned this pull request Jul 22, 2015

Simplify indexing JuliaMath/Interpolations.jl#48

Merged

mbauman mentioned this pull request Jul 23, 2015

Allow multidimensional array indexing with any eltype #12273

Merged

mbauman mentioned this pull request Jul 24, 2015

Un-deprecate setindex with multidimensional indices #12290

Merged

timholy mentioned this pull request Jul 24, 2015

Enable operator-sensitive extension of element-type promotion #12292

Merged

tomasaschan mentioned this pull request Jul 24, 2015

Vector-valued evaluation JuliaMath/Interpolations.jl#24

Open

timholy added a commit that referenced this pull request Aug 11, 2015

SharedArrays: concretely type all fields, use LinearFast()

d849f60

Also use #10525 for indexing operations

timholy mentioned this pull request Aug 11, 2015

Create SharedArray from disk file #12560

Merged

mbauman mentioned this pull request Sep 15, 2015

Arraypocalypse Now and Then #13157

Closed

27 tasks

timholy mentioned this pull request Oct 3, 2015

Quick guide to images on jupyter JuliaImages/Images.jl#379

Closed

tkelman mentioned this pull request Oct 20, 2015

"Triangular matrix must be square" error for square triangular matrices #13174

Closed

mbauman mentioned this pull request Jan 8, 2016

Ensure checksize inlines #14609

Merged

mbauman mentioned this pull request Apr 28, 2016

don't short-circuit chained comparisons #16088

Closed

timholy mentioned this pull request Jul 9, 2016

flatten the call chain of checkbounds a bit #17340

Merged
