
Faster and more general reductions for sparse matrices #10536

Merged
2 commits merged into master from sjk/sparse-reductions on Mar 19, 2015

Conversation

simonster (Member)

This hooks sparse matrix reductions into the mapreduce/mapreducedim framework, so that reductions built on those two functions (sum, prod, maximum, minimum, sumabs, sumabs2, maxabs, minabs, and others) are fast for sparse matrices. I also implemented a method of centralize_sumabs2!, the function that computes the reduction underlying var and std; this required a tiny tweak to statistics.jl.
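The idea behind the fast path can be sketched as follows (an illustrative sketch, not the actual implementation): reduce over the stored nonzero values only, then fold in f(0) once per implicit zero. Here `nzvals` stands in for nonzeros(A) and `ntotal` for length(A); the function name is made up for illustration.

```julia
# Illustrative sketch, not the actual implementation: reduce over the
# stored values, then account for the implicit zeros by folding in f(0).
# `nzvals` plays the role of nonzeros(A), `ntotal` the role of length(A).
function sparse_mapreduce_sketch(f, op, nzvals, ntotal)
    v = mapreduce(f, op, nzvals)     # visit stored entries only
    fz = f(0)                        # evaluate f at zero exactly once
    for _ in 1:(ntotal - length(nzvals))
        v = op(v, fz)                # for op = + this collapses to adding fz * nzeros
    end
    return v
end

sparse_mapreduce_sketch(abs2, +, [2, 3], 5)        # == 13 (4 + 9, plus three zeros)
sparse_mapreduce_sketch(identity, min, [2, 3], 5)  # == 0 (the implicit zeros win)
```

For operators like + the fold over zeros can be collapsed to a single multiply, which is why the column- and row-wise reductions below drop from O(m·n) to roughly O(nnz) work.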

Before:

julia> A = sprand(10000, 10000, 0.01);

julia> @time sum(A);
elapsed time: 0.000725368 seconds (128 bytes allocated)

julia> @time sum(A, 1);
elapsed time: 0.050776048 seconds (31 MB allocated, 31.91% gc time in 2 pauses with 1 full sweep)

julia> @time sum(A, 2);
elapsed time: 0.150387959 seconds (76 MB allocated, 16.90% gc time in 3 pauses with 0 full sweep)

julia> @time minimum(A);
elapsed time: 0.000895639 seconds (128 bytes allocated)

julia> @time minimum(A, 1);
elapsed time: 0.025368923 seconds (31 MB allocated, 16.27% gc time in 2 pauses with 0 full sweep)

julia> @time minimum(A, 2);
elapsed time: 0.099724495 seconds (76 MB allocated, 5.97% gc time in 3 pauses with 0 full sweep)

julia> @time sumabs2(A);
elapsed time: 2.84146989 seconds (128 bytes allocated)

julia> @time sumabs2(A, 1);
elapsed time: 1.347525703 seconds (592 kB allocated)

julia> @time sumabs2(A, 2);
elapsed time: 9.477093327 seconds (514 kB allocated)

julia> @time var(A);
elapsed time: 2.861990251 seconds (224 bytes allocated)

julia> @time var(A, 1);
elapsed time: 1.378826771 seconds (31 MB allocated, 0.29% gc time in 2 pauses with 0 full sweep)

julia> @time var(A, 2);
elapsed time: 9.576421764 seconds (76 MB allocated, 0.06% gc time in 3 pauses with 0 full sweep)

After:

julia> A = sprand(10000, 10000, 0.01);

julia> @time sum(A);
elapsed time: 0.00059218 seconds (96 bytes allocated)

julia> @time sum(A, 1);
elapsed time: 0.000527368 seconds (78 kB allocated)

julia> @time sum(A, 2);
elapsed time: 0.001874181 seconds (78 kB allocated)

julia> @time minimum(A);
elapsed time: 0.000835505 seconds (96 bytes allocated)

julia> @time minimum(A, 1);
elapsed time: 0.00154097 seconds (78 kB allocated)

julia> @time minimum(A, 2);
elapsed time: 0.003695277 seconds (156 kB allocated)

julia> @time sumabs2(A);
elapsed time: 0.000476106 seconds (96 bytes allocated)

julia> @time sumabs2(A, 1);
elapsed time: 0.000620605 seconds (78 kB allocated)

julia> @time sumabs2(A, 2);
elapsed time: 0.001989696 seconds (78 kB allocated)

julia> @time var(A);
elapsed time: 0.000910167 seconds (192 bytes allocated)

julia> @time var(A, 1);
elapsed time: 0.001041256 seconds (157 kB allocated)

julia> @time var(A, 2);
elapsed time: 0.006364037 seconds (235 kB allocated)

All times are after warmup.

@IainNZ (Member) commented Mar 16, 2015

Hah, that's a 1520x speedup for var(A, 2), delightful!

@andreasnoack (Member)

👍

@tkelman (Contributor) commented Mar 17, 2015

Doesn't this assume the function being mapped is pure? While that's the case for the higher-level wrappers around mapreduce, it's not necessarily true for general mapreduce.

@simonster (Member, Author)

That's true, or at least it assumes that f(0) is pure beyond allocation etc.

We could have a mapreduce_pure and use that for all of the wrappers, and then have a separate mapreduce for SparseMatrixCSC that calls f(0) as many times as there are zeros. Alternatively, we could specify that mapreduce expects a pure function (at least for SparseMatrixCSC) and suggest using mapfoldl/mapfoldr for impure functions.

How common is it that you want to pass an impure function to mapreduce? Anything that cares about the order in which it's called is out of the question since the associativity is undefined, but I suppose you could still pass a function that injects randomness or alters closed variables. OTOH, I don't think I've ever seen code that does this.
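The observable difference with an impure f can be demonstrated with a hypothetical call-counting function (the names and the dense vector standing in for a sparse matrix are illustrative only):

```julia
# Hypothetical impure f that counts its own invocations.
calls = Ref(0)
f(x) = (calls[] += 1; abs2(x))

# Dense-style reduction: f runs once per element, zeros included.
dense_result = mapreduce(f, +, [0, 0, 2, 3])
dense_calls = calls[]                     # 4 calls

# Sparse-style reduction: visit only the nonzeros and evaluate f(0) once.
calls[] = 0
sparse_result = mapreduce(f, +, [2, 3]) + f(0) * 2
sparse_calls = calls[]                    # 3 calls: two nonzeros plus one f(0)

# Same result either way, but the side effect (the call count) differs.
(dense_result, sparse_result)             # (13, 13)
```

The values agree, so for pure f the optimization is invisible; only a function with side effects can tell the two evaluation strategies apart.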

@tkelman (Contributor) commented Mar 17, 2015

I don't expect impure functions to be common with mapreduce at all, but the concern came up last time map for sparse matrices was asked about. For map I suggested a differently-named function to perform the analogous operation only on the structurally nonzero values, but in the case of mapreduce perhaps it's not necessary since there are already implicit restrictions due to associativity.

@simonster (Member, Author)

I noted the possible reuse of return values of f in the docs for mapreduce. The constraint is not quite as strong as requiring functional purity of f since e.g. it's fine if every invocation of f returns a new array even though f would not be a pure function. If anyone has a better way to word it, I'm open to suggestions.

@nalimilan (Member)

Looks clear enough to me.

simonster added a commit that referenced this pull request Mar 19, 2015
Faster and more general reductions for sparse matrices
@simonster simonster merged commit cc95bca into master Mar 19, 2015
@simonster simonster deleted the sjk/sparse-reductions branch March 19, 2015 03:28
@ViralBShah (Member)

👍

@ViralBShah (Member)

We should also document mapreduce for sparse.

@simonster (Member, Author)

Does it need separate documentation? The interface is the same as mapreduce for dense. I guess I could add this to NEWS.md, though.

@ViralBShah (Member)

Adding to NEWS sounds like a good idea, and the performance improvements also deserve a mention there.
