
Quadratic complexity of sparse matrices multiplication #184

Closed · espdev opened this issue Mar 4, 2020 · 12 comments

espdev commented Mar 4, 2020

Hello!

I'm trying to use the sprs package in my experimental code, and I have run into very slow multiplication of large sparse matrices.

My code uses large diagonal matrices in CSR format. The matrices may contain anywhere from hundreds or thousands to millions of NNZ elements. Multiplying such matrices in sprs has almost quadratic complexity, O(nnz^2), so we cannot currently use sprs in production code.

I wrote some tests in Rust+sprs and Python+scipy. SciPy uses the SMMP algorithm (Sparse Matrix Multiplication Package) and multiplies such matrices in tens to hundreds of milliseconds.

The code of my benchmarks can be found on gist: https://gist.github.com/espdev/d278962828ce7360d653b8679d7bb511

The plot for mat * mat.T:

[plot: sprs_scipy_matmul]

Could you consider using the SMMP algorithm in sprs? It has complexity O(n_row*K^2 + max(n_row, n_col)), where K is the maximum nnz in a row of A and a column of B.
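For reference, here is roughly the shape of the sprs side of the benchmark (a minimal sketch with illustrative sizes; `CsMat::eye` stands in for building the actual diagonal matrices from the gist):

```rust
use std::time::Instant;

use sprs::CsMat;

fn main() {
    for &n in &[10_000usize, 100_000, 1_000_000] {
        // An identity matrix stands in for a large diagonal CSR matrix
        // here: nnz == n, which is the regime described above.
        let a: CsMat<f64> = CsMat::eye(n);
        let start = Instant::now();
        let _c = &a * &a; // the sparse-sparse product being timed
        println!("n = {:>9}: {:?}", n, start.elapsed());
    }
}
```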

rth (Contributor) commented Mar 4, 2020

Thanks for investigating this! I have always wondered how sprs would compare to scipy.sparse performance-wise (see scipy/scipy#10201 (comment)). It would also be interesting to have this comparison on random sparse arrays, to evaluate the impact of shape and density.

vbarrielle (Collaborator) commented:

That's a problem indeed, thanks for reporting it. I'll try and find some time to investigate.

espdev (Author) commented Mar 4, 2020

I have compared sprs and scipy on random sparse matrices of various shapes and densities, and the results are interesting. It seems I incorrectly characterized how sprs's complexity depends on nnz: the mat*mat complexity in sprs also depends directly on the matrix dimensions. Performance drops dramatically as the matrix sizes grow, independently of nnz. In scipy the dependency is quite different.

Here are some plots from my benchmarks.

scipy:
[plot: scipy_sparse_matmul]

sprs:
[plot: sprs_sparse_matmul]

scipy vs sprs:
[plots: sprs_scipy_sparse_matmul_0, sprs_scipy_sparse_matmul_1]

The benchmark code can be found in this gist: https://gist.github.com/espdev/f16f139f8007d3b37d1a57835fee6a19

vbarrielle (Collaborator) commented:

Interesting. Here's the main loop of the matrix multiplication in sprs: https://github.com/vbarrielle/sprs/blob/master/src/sparse/prod.rs#L207-L226

It accumulates each row of the result matrix into a dense vector before converting it to sparse. That explains the dependency on the dimension (actually, the number of columns). On the other hand, this approach becomes more interesting as the matrix gets denser.

I've started a branch to implement SMMP. We'll know more once I'm done implementing it. But your graphs suggest it might be worth keeping the original product implementation around, perhaps with a modification to sparsify the structure that accumulates the result row.
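To make that concrete, here is a self-contained sketch of the dense-accumulator strategy (illustrative code with made-up names, not the actual sprs implementation). The O(n_cols) sparsification scan at the end of each result row is where the dependency on the dimension comes from:

```rust
/// Sketch of a CSR * CSR product using a dense accumulator per result row.
/// `a` is m x k and `b` is k x n, both given as raw CSR arrays.
fn csr_mul_dense_acc(
    a_indptr: &[usize], a_indices: &[usize], a_data: &[f64],
    b_indptr: &[usize], b_indices: &[usize], b_data: &[f64],
    n_cols: usize,
) -> (Vec<usize>, Vec<usize>, Vec<f64>) {
    let n_rows = a_indptr.len() - 1;
    let mut scratch = vec![0.0f64; n_cols]; // dense accumulator, length n
    let mut indptr = vec![0usize];
    let mut indices = Vec::new();
    let mut data = Vec::new();
    for i in 0..n_rows {
        // Accumulate row i of the result: cost proportional to the nnz
        // of the rows of `b` that row i of `a` touches.
        for ai in a_indptr[i]..a_indptr[i + 1] {
            let (k, av) = (a_indices[ai], a_data[ai]);
            for bi in b_indptr[k]..b_indptr[k + 1] {
                scratch[b_indices[bi]] += av * b_data[bi];
            }
        }
        // Sparsification: an O(n_cols) scan per row, regardless of nnz.
        for (col, v) in scratch.iter_mut().enumerate() {
            if *v != 0.0 {
                indices.push(col);
                data.push(*v);
                *v = 0.0; // reset the accumulator for the next row
            }
        }
        indptr.push(indices.len());
    }
    (indptr, indices, data)
}
```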

vbarrielle (Collaborator) commented Mar 6, 2020

Progress report: I've implemented the SMMP symbolic and numeric phases in the smmp branch. I won't have time to check the performance this weekend, but I should be able to look at it next week.
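For context, here is a sketch of the symbolic phase of SMMP (illustrative code, not the contents of the smmp branch): the columns of each result row are threaded through a `next` array acting as a linked list, so the pattern is found in roughly O(K^2) work per row without scanning all n_cols entries.

```rust
const UNSEEN: usize = usize::MAX; // column not yet in the current row's list
const END: usize = usize::MAX - 1; // end-of-list marker

/// Sketch of SMMP's symbolic phase: computes the indptr of C = A * B.
fn smmp_symbolic(
    a_indptr: &[usize], a_indices: &[usize],
    b_indptr: &[usize], b_indices: &[usize],
    n_cols: usize,
) -> Vec<usize> {
    let n_rows = a_indptr.len() - 1;
    let mut next = vec![UNSEEN; n_cols]; // linked list threaded over columns
    let mut indptr = vec![0usize];
    for i in 0..n_rows {
        let mut head = END;
        let mut count = 0;
        for &k in &a_indices[a_indptr[i]..a_indptr[i + 1]] {
            for &j in &b_indices[b_indptr[k]..b_indptr[k + 1]] {
                if next[j] == UNSEEN {
                    next[j] = head; // prepend column j to the row's list
                    head = j;
                    count += 1;
                }
            }
        }
        indptr.push(indptr[i] + count);
        // Unlink the list so `next` is clean for the next row.
        while head != END {
            let nxt = next[head];
            next[head] = UNSEEN;
            head = nxt;
        }
    }
    indptr
}
```

The numeric phase walks the same linked list while accumulating values. Note that the list yields columns in insertion order, not sorted order, which is where the sorting concern discussed below comes from.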

I also have a request for you, @espdev: would you be OK with contributing your benchmarks to the examples folder of sprs? I'd like to have them around to monitor performance, and I want to be sure you're fine with contributing them under sprs' license.

espdev (Author) commented Mar 6, 2020

@vbarrielle

Wow, that's great, thank you!
I will clean up and refactor my benchmarks soon so they can be contributed to the sprs examples.

vbarrielle (Collaborator) commented Mar 9, 2020

Here are the performance results for smmp on your first benchmark:

[plot: sprs_vs_scipy]

(EDIT: these results come with a caveat: I generated them on my unplugged laptop.)

Looks good. Now I have to take a look at your second benchmark. It will probably show a complexity difference between scipy's implementation of SMMP and mine: I have to sort the indices of the matrices, since sorted indices are a guarantee of the CsMat type. That makes the complexity O((n_row*K^2 + max(n_row, n_col)) * K*log(K)). It should not matter for most workloads, though.

vbarrielle (Collaborator) commented:

Here are the results for the random matrices:

[plot: sprs_vs_scipy_random_mats]

We can see that sprs is lagging behind now. I've profiled it, and most of the time is spent sorting the indices. If I disable index sorting, sprs is more performant:

[plot: sprs_unsorted_vs_scipy_random_mats]

I'm going to investigate alternative index-sorting methods. I could try using an array of booleans when the nnz count gets too large.
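Roughly, that idea could look like this (a hypothetical sketch, not code from the branch): mark the collected column indices in a reusable boolean bitmap, then recover sorted order with a linear scan over the touched range, trading the O(K log K) sort for an O(span) scan.

```rust
/// Emit the columns of one result row in sorted order without sorting.
/// `seen` is a scratch bitmap of length n_cols, all false between calls.
fn sorted_cols_via_scan(cols: &[usize], seen: &mut [bool], out: &mut Vec<usize>) {
    if cols.is_empty() {
        return;
    }
    let (mut lo, mut hi) = (usize::MAX, 0);
    for &c in cols {
        seen[c] = true;
        lo = lo.min(c);
        hi = hi.max(c);
    }
    // Linear scan over the touched range: O(hi - lo + 1).
    for c in lo..=hi {
        if seen[c] {
            out.push(c);
            seen[c] = false; // reset during the scan, no extra pass needed
        }
    }
}
```

The bitmap is reset during the scan itself, so the scratch array stays clean for the next row without an extra O(n_cols) pass.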

espdev (Author) commented Mar 10, 2020

@vbarrielle

I noticed that the sizes of the matrices differ in the legend text on the graphs:

sprs (15000, 2500)
scipy (15000, 25000)

It looks like it is just a typo.

vbarrielle (Collaborator) commented:

Indeed, I meant to type (15000, 25000) for both.

vbarrielle (Collaborator) commented Mar 11, 2020

Small update: I've figured out that I can dynamically switch between using the "linked list" indices from SMMP followed by a sort, and using a linear scan over the encountered nnz indices, according to the expected amount of work.

This makes performance much better when there is a large number of non-zeros:

[plot: sprs_smart_vs_scipy_random_mats]

(EDIT: the legend is wrong once again; I forgot to remove the "(unsorted)". It is sorted.)

My test for switching between the two methods is a bit naive right now, but it seems to work well enough. In short, for each row I compare the number of elements I would need to scan (the difference between the maximum and minimum encountered nnz columns) against the amount of work sorting would take (k * log(k), with k the number of nnz in the row); see the sketch below. I should probably do some timings to determine a more precise bound. There are also a few optimizations that could be done for the linear scan, as it could take advantage of SIMD.
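For illustration, such a per-row test could look something like this (my sketch of the heuristic just described, not the actual code):

```rust
/// Decide whether to emit a row's columns via a linear scan over the
/// encountered range, or by sorting the k collected indices.
/// `k >= 1` is assumed; the log2 is an integer approximation.
fn use_linear_scan(min_col: usize, max_col: usize, k: usize) -> bool {
    let scan_work = max_col - min_col + 1;
    let log2_k = (usize::BITS - k.leading_zeros()) as usize;
    let sort_work = k * log2_k;
    scan_work <= sort_work
}
```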

vbarrielle (Collaborator) commented:

Quick mention that sparse product performance has been further improved in #201.
