Move Cartesian to base #5387

Merged
merged 8 commits on Jan 14, 2014
Conversation

@timholy commented Jan 13, 2014

This introduces a new-and-improved version of Cartesian, and rewrites certain functionality from base to use it.

Main points:

  • A new macro @ngenerate that takes away most of the "wrapper pain" from using Cartesian
  • Only a handful of functions have been migrated---I've focused on the more difficult cases, namely getindex, setindex!, reducedim, and broadcast.
  • Despite adding cartesian.jl (a 400+ line file), this leads to a net reduction of the codebase by approximately 400 lines; almost 800 lines were deleted by reimplementing just those 4 core functions.
  • In the case of broadcast, the new version is more general in that it supports AbstractArrays and not just StridedArrays.
  • Many functions that have previously been written with linear indexing can be easily re-written. da8f1dd is included as an example of how to do this for a single function, fill!.
  • Performance for SubArrays is notably better for some of the functions that have been migrated; performance is approximately maintained in the other cases.

TODO:

  • From the documentation I didn't quite follow how broadcast_getindex and broadcast_setindex! are supposed to work (what are the inds arrays?); these functions have been deleted in this PR, but can be restored if needed.
  • Initially I intended to migrate the BitArray functions, so that gen_cartesian_map could be eliminated. I chickened out in the face of their complexity, but if this PR makes it in we could do that at a later time.

All times are after suitable warmup.

On master:

julia> X = zeros(100,100,30,30);

julia> S = sub(X, :, :, :, :);

julia> @time fill!(S, 5);
elapsed time: 0.555467706 seconds (64 bytes allocated)

This PR:

julia> @time fill!(S, 5);
elapsed time: 0.07676407 seconds (48 bytes allocated)

Approximately 7-fold faster.

Other times are shown below. Test code is in this gist, as are the results of running arrayperf in test/.

reducedim (left column: region; middle column: master; right column: this PR):

Arrays:
region       master   this PR
1            0.1435   0.1423
2            0.1754   0.1608
3            0.1538   0.1649
4            0.3379   0.3591
(1,2)        0.1339   0.1445
(1,3)        0.1302   0.1361
(1,4)        0.1305   0.1418
(2,3)        0.1574   0.1562
(2,4)        0.1608   0.1572
(3,4)        0.1511   0.1595
(1,2,3)      0.1244   0.1418
(1,2,4)      0.1244   0.142
(1,3,4)      0.129    0.1366
(2,3,4)      0.1572   0.1553
(1,2,3,4)    0.1242   0.1423

SubArrays:
region       master   this PR
1            0.4896   0.3994
2            1.0059   0.4507
3            1.1755   0.4613
4            0.9263   0.5376
(1,2)        0.4883   0.3808
(1,3)        0.7812   0.3811
(1,4)        0.5038   0.3813
(2,3)        1.2748   0.4469
(2,4)        1.0367   0.4485
(3,4)        1.81     0.4545
(1,2,3)      0.4832   0.3807
(1,2,4)      0.4891   0.3809
(1,3,4)      0.6274   0.3814
(2,3,4)      2.8611   0.4469
(1,2,3,4)    0.4828   0.3807
broadcast (first two columns: argument sizes; then master and this PR):

size 1       size 2       master   this PR
(10,)        (1,10)       0.5324   0.6126
(10,10)      (10,10)      0.3802   0.4507
(1000,)      (1,1000)     0.0778   0.0641
(10,1,10)    (1,10,10)    0.1142   0.1112
(10,10,1)    (1,10,10)    0.0948   0.0936
(100,1,100)  (1,100,100)  0.0659   0.0566

arrayperf results are in the gist.

CC @lindahua, @toivoh.

@timholy commented Jan 13, 2014

Also CC @carlobaldassi

@toivoh commented Jan 13, 2014

broadcast_getindex and broadcast_setindex! are an implementation of the pointwise indexing operations that I suggested in #2591. One reason that I put them in was to demonstrate the feasibility of that concept. But I'm not sure whether there is interest to keep them.
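Conceptually, the gather half behaves like this sketch (an illustration of the idea from #2591, not the actual Base implementation; my_broadcast_getindex is a hypothetical name):

# Sketch of the pointwise-indexing idea (hypothetical helper, not Base's code):
# the index arrays are broadcast against each other, and each output element
# gathers A at the corresponding index tuple.
my_broadcast_getindex(A, I...) = broadcast((idx...) -> A[idx...], I...)

A = [10 20; 30 40]
my_broadcast_getindex(A, [1 2], [1, 2])
# broadcasts the 1x2 and 2-element index arrays to a 2x2 result:
# [A[1,1] A[2,1]; A[1,2] A[2,2]] == [10 30; 20 40]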

@timholy commented Jan 13, 2014

I should have added: in my view this material is quite reasonable for 0.3, but we shouldn't export any of Cartesian. (No exports have been added in this PR.) The advantages of merging sooner rather than later are that:

  • While Cartesian is a bit "ugly", I don't think anyone could call master's current reducedim.jl, broadcast.jl, or gen_array_index_map easy bedtime reading. In each case, the new core algorithms are approximately a dozen lines of code and (once you get used to reading Cartesian code) pretty transparent.
  • The new algorithms are much closer in spirit to how we should write things when a prettier substitute for Cartesian gets implemented. Consequently, most of the porting work is already done.
  • The fact that we can be more systematic---often replacing 10 different implementations for different dimensionalities or argument types with a single function---means it's easier to maintain.

@JeffBezanson commented

Wow. A net -400 lines to get speedups puts this change in a distinguished category indeed.

I think we're all on the same page that needing to switch to an alternate syntactic universe of @ signs to do N-d stuff is not ideal, but a patch such as this can hardly be ignored. Our goal should be to keep these changes more-or-less in place, but to be able to replace the macros with better syntax and compiler optimizations. For example, it is very clear that i is a tuple, and that N should follow this tuple around via type inference.

@StefanKarpinski commented

Wow, this is super impressive. I think the last major change that was anywhere near this big of a net code deletion was the sort refactor. While I agree that we definitely want to move past doing all of this with macros, I think this is a good intermediate point and the macros tell us what kinds of things language primitives need to be able to do well. I'm all for merging it once various TODOs are taken care of.

@timholy commented Jan 14, 2014

> broadcast_getindex and broadcast_setindex! are an implementation of the pointwise indexing operations that I suggested in #2591. One reason that I put them in was to demonstrate the feasibility of that concept. But I'm not sure whether there is interest to keep them.

Totally cool. I put them back. I also:

  • Generalized broadcast_setindex! so the assigned value can either be an array or a scalar
  • Allowed assigned values to be any shape consistent with Base.setindex_shape_check
  • Added some bounds-checks and tests for them, as recent additions of @inbounds had introduced some memory-unsafe operations.

If you disagree with any of these, feel free to make changes.

timholy added a commit that referenced this pull request Jan 14, 2014
@timholy merged commit 626b45b into master on Jan 14, 2014
@toivoh commented Jan 14, 2014

I'm glad that you were able to make sense of #2591 :) and thanks for putting them back. This looks really impressive!

@timholy commented Jan 14, 2014

Woot! Now let's wait for the bug reports to start flowing in...

Next issue: with the exception of the fill! example, I deliberately confined this patch to cases where we were already using codegen to write the algorithms---using these macros simply introduced a more flexible and compact way to do that. Now that this is done, should we do more stuff like da8f1dd, cherry-picking items from abstractarray.jl to reimplement using Cartesian? The advantage is that we finally have a way to easily write algorithms that perform quite well on SubArrays (although even cartesian SubArray indexing still might need a bit of optimization); the disadvantage is that codegen would spread into places where we didn't formerly use it.

@kmsquire commented

One of the things I love about Julia is its readability, so while I really appreciate the speed improvements and the reduction in code, and at a high level I understand what you're doing, da8f1dd is somewhat opaque to me.

@carlobaldassi commented

Mmh, yes, I vaguely remember a time when I used to know how those BitArray functions worked ;)

More seriously, nice job! I'll look into this issue of getting rid of gen_cartesian_map as soon as I have the time.

@timholy commented Jan 14, 2014

@kmsquire, @ngenerate is not present in the Cartesian package and therefore not documented, but with a little practice it's (hopefully) pretty simple. The syntax is @ngenerate itersymbol <function definition>; this causes methods to be created for any integer value of itersymbol. It works exactly like what's shown in the first couple of lines of the Cartesian.jl documentation.
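For instance, here is a sketch of a Cartesian-style fill! (close in spirit to da8f1dd, though my_fill! and the exact form here are illustrative, not the verbatim commit):

using Base.Cartesian

# Sketch (not the verbatim da8f1dd code): one definition covers every
# dimensionality N. @nloops builds the N nested loops over A's dimensions,
# and @nref builds the N-dimensional indexing expression A[i_1, ..., i_N].
@ngenerate N function my_fill!{T}(A::AbstractArray{T,N}, x)
    @nloops N i A begin
        @inbounds (@nref N A i) = x
    end
    return A
end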

For any function that has historically been implemented using linear indexing and gets migrated to Cartesian, all one really needs to know is one screenful's worth of documentation. From that perspective, it's perhaps not so bad. But I share your concern, and for that reason I am genuinely of two minds about whether more examples like fill! should be implemented. I'd even be totally fine with reverting that particular change; I included it precisely to provoke this kind of discussion. I think the less-questionable usage of Cartesian is for cases where our code was already unreadable because we relied on code generation; in such cases, I would assert that if you spend a little bit of time getting used to Cartesian, you'll find the new versions many times more transparent.

@timholy commented Jan 14, 2014

@carlobaldassi, no hurry. I may be able to help, too, but like you for now I should move on to other things. Thanks for your interest, though!

@timholy commented Jan 14, 2014

@kmsquire and others, to see the macros in operation just do something like this:

using Base.Cartesian
macroexpand(quote
<blob>
end)

where you just copy-paste the whole function definition (including the @ngenerate). You'll see it generates explicit versions for 1 and 2 dimensions (in this case with 1 and 2 loops, respectively), and uses a dictionary cache for anything higher. For starters, don't try to understand how the dictionary cache works, just focus on the explicit cases.
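For the fill!-style sketch earlier in this thread, the generated 2-d method looks roughly like this (illustrative, not the verbatim macroexpand output):

# Roughly the N = 2 instance generated from the sketch above: @nloops
# produces one loop per dimension (i_2 outermost), and @nref becomes
# plain indexing.
function my_fill!{T}(A::AbstractArray{T,2}, x)
    for i_2 = 1:size(A, 2)
        for i_1 = 1:size(A, 1)
            @inbounds A[i_1, i_2] = x
        end
    end
    return A
end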

@kmsquire commented

Thanks, Tim. I really do find this work quite impressive!

@JeffBezanson commented

I'm not going to revert this, but it was merged slightly hastily. We have lost a lot of type information in the changed functions, due to calling a function taken from a dictionary. In many cases this can be fixed using the pattern:

A = alloc_result()
something_magical!(A)
return A
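Concretely, a hypothetical sketch of that pattern (KERNEL_CACHE and my_sum_columns are illustrative names, not actual base code):

# Hypothetical illustration: even when the kernel comes out of a Dict
# (so inference cannot see its return type), the wrapper's return type
# is pinned down by the array it allocates itself.
const KERNEL_CACHE = Dict{Int,Function}()    # hypothetical kernel cache

function my_sum_columns(A::Matrix{Float64})
    R = zeros(Float64, 1, size(A, 2))        # A = alloc_result()
    kernel! = KERNEL_CACHE[ndims(A)]         # type-unstable lookup
    kernel!(R, A)                            # something_magical!(A)
    return R                                 # return A: inferred as Matrix{Float64}
end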

@timholy commented Jan 15, 2014

Sorry, I interpreted earlier comments as a go-ahead.

I was about to say that they already work that way:
https://github.com/JuliaLang/julia/blob/master/base/reducedim.jl#L49-L51
https://github.com/JuliaLang/julia/blob/master/base/broadcast.jl#L106
https://github.com/JuliaLang/julia/blob/master/base/multidimensional.jl#L39
https://github.com/JuliaLang/julia/blob/master/base/multidimensional.jl#L42
https://github.com/JuliaLang/julia/blob/master/base/multidimensional.jl#L122
but now I see what you mean.

Should be fixed by c6d359b (https://github.com/JuliaLang/julia/commit/c6d359b27b4ca41dec782a8bab38d2cec9981855). Let me know about other problems.

@kmsquire commented Jan 15, 2014

Perhaps we should have type checks in the tests for those functions?

@kmsquire commented

Okay, I see that you are two steps ahead of me, Tim. :-)

sum(A::AbstractArray{Bool}, region) = sum!(reduction_init(A,region,0), A)
eval(ngenerate(:N, :(sum!{T,N}(R::AbstractArray, A::AbstractArray{T,N})), N->gen_reduction_body(N, +)))
prod{T}(A::AbstractArray{T}, region) = prod!(reduction_init(A,region,one(T)), A)
eval(ngenerate(:N, :(prod!{T,N}(R::AbstractArray, A::AbstractArray{T,N})), N->gen_reduction_body(N, *)))
A contributor commented on the diff above:

Can we bring back something like Dahua's @code_reducedim to make these easier to read?

@timholy replied:

That's kind of what @ngenerate is, but I didn't want to use it here because I was re-using a body-generator. But I can create a macro for that, too---good idea.

@lindahua commented

@timholy I think you might want to add documentation, so people can learn to use these?

@timholy commented Jan 17, 2014

I've not added anything to Julia's documentation because I think these should remain unexported---we all hope they'll go away ASAP.

Most of it is documented pretty thoroughly in Cartesian.jl (only small changes were needed for the transition into base), but perhaps I should add some description of @ngenerate (even though the package itself doesn't include it)?

timholy referenced this pull request Feb 3, 2014
* Use setindex_shape_check and DimensionMismatch instead
  of generic errors; this (and a couple of extra minor fixes)
  makes BitArrays behave like Arrays (again)
* Move portions of the indexing code to multidimensional.jl
* Restrict signatures so as to avoid to_index and convert, then
  create very generic methods which do the conversion and the
  dispatch: this greatly simplifies the logic and removes most
  of the need for disambiguation. Should also improve code
  generation.