Quadratic complexity of sparse matrices multiplication #184
Thanks for investigating this! I have always wondered how sprs would compare to scipy.sparse performance-wise: scipy/scipy#10201 (comment). It would be interesting to also run this comparison on random sparse arrays to evaluate the impact of shape and density.
That's a problem indeed, thanks for reporting it. I'll try and find some time to investigate.
I have compared sprs and scipy on random sparse matrices with various shapes and densities. The results are interesting: it seems I incorrectly characterized sprs' complexity as depending only on nnz. sprs mat*mat complexity also depends directly on the matrix dimensions, and performance degrades dramatically as the sizes increase, independently of nnz. In scipy the dependency is quite different. Here are some plots from my benchmarks. The benchmark code can be found at this gist: https://gist.github.com/espdev/f16f139f8007d3b37d1a57835fee6a19
Interesting. Here's the main loop of the matrix multiplication in sprs: https://github.com/vbarrielle/sprs/blob/master/src/sparse/prod.rs#L207-L226 It accumulates each row of the result matrix into a dense vector before converting it back to sparse. That explains the dependency on the dimension (specifically, on the number of columns). On the other hand, this approach becomes more competitive when the matrix is denser. I've started a branch to implement SMMP; we'll have more information when I'm done implementing it. But your graphs suggest it might be worth keeping the original product implementation around, perhaps modified to sparsify the structure that accumulates the result row.
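To illustrate the point, here is a minimal pure-Python sketch (not the actual sprs code; names are mine) of the dense-accumulator strategy that loop uses. Each result row is accumulated into a dense vector of length n_cols, and compacting that vector back to sparse costs O(n_cols) per row, which is why the cost depends on the column dimension even for very sparse inputs:

```python
def csr_matmul_dense_acc(a_indptr, a_indices, a_data,
                         b_indptr, b_indices, b_data, n_cols):
    """Multiply two CSR matrices A*B; returns CSR arrays of the product."""
    c_indptr, c_indices, c_data = [0], [], []
    for row in range(len(a_indptr) - 1):
        acc = [0.0] * n_cols  # dense accumulator: O(n_cols) per row
        for ai in range(a_indptr[row], a_indptr[row + 1]):
            k, av = a_indices[ai], a_data[ai]
            for bi in range(b_indptr[k], b_indptr[k + 1]):
                acc[b_indices[bi]] += av * b_data[bi]
        # compacting scans all n_cols slots, hence the dependency on
        # the column dimension even when nnz is small
        for j, v in enumerate(acc):
            if v != 0.0:
                c_indices.append(j)
                c_data.append(v)
        c_indptr.append(len(c_indices))
    return c_indptr, c_indices, c_data
```

For diagonal-heavy matrices like the reporter's, the inner accumulation loops touch few entries, but the compaction scan still visits every column, dominating the runtime.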
Progress report: I'm done implementing the SMMP symbolic and numeric passes in the smmp branch. I won't have time to check the performance this weekend, but I should look at it next week. I also have a request for you, @espdev: would you be OK with contributing your benchmarks to the examples folder of sprs? I'd like to have them around to be able to monitor performance, and I want to be sure you're fine with contributing them under sprs' license.
Wow, that's great. Thank you!
Here are the performance results for smmp on your first benchmark (EDIT: these results have a caveat, I generated them on my unplugged laptop). Looks good; now I have to take a look at your second benchmark. This will probably show a complexity difference between scipy's implementation of SMMP and mine, as I have to sort the indices of the matrices because that's a guarantee of the
I noticed the sizes of the matrices differ in the legend text on the graphs; it looks like it is just a typo.
Indeed, I meant to type
Small update: I've figured out I can dynamically switch between using the "linked list" indices from SMMP followed by a sort, and using a linear scan over the range of encountered nnz indices, according to the expected amount of work. This makes the performance much better when there is a good number of non-zeros. (EDIT: the legend is wrong once again, I forgot to remove the

My test to switch between the two methods is a bit naive right now, but it seems to work well enough. In short, for each row, I compare the number of elements I'd need to scan (the difference between the max and min encountered nnz columns) against the work sorting would take (k * log(k), with k the number of nnz in the row). It's possible I should make some timings to determine a more precise bound. There are also a few optimizations that could be done for the linear scan, as it could take advantage of SIMD.
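The per-row test described above could be sketched like this (a hypothetical illustration, not sprs' actual code; the function name and the exact cost model are assumptions):

```python
import math

def prefer_sort(nnz_cols):
    """Decide, for one result row, whether sorting the k collected column
    indices (k * log2(k) work) is expected to beat a linear scan over the
    span [min_col, max_col] of encountered columns."""
    k = len(nnz_cols)
    if k <= 1:
        return False  # nothing to sort; a trivial scan suffices
    scan_cost = max(nnz_cols) - min(nnz_cols) + 1  # elements a scan visits
    sort_cost = k * math.log2(k)
    return sort_cost < scan_cost
```

With two non-zeros a million columns apart, sorting wins easily; with a handful of adjacent non-zeros, the linear scan is cheaper.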
Quick mention that sparse product performance has been further improved in #201.
Hello!

I'm trying to use the sprs package in my experimental code, and I have encountered very slow multiplication of large sparse matrices.

My code uses large diagonal matrices in CSR format. The matrices may contain from hundreds or thousands to millions of nnz elements. Multiplication of such matrices in sprs has almost quadratic complexity, `O(nnz^2)`, so we cannot currently use sprs in production code. I wrote some tests in Rust+sprs and Python+scipy. SciPy uses the SMMP algorithm (Sparse Matrix Multiplication Package) and performs multiplication of such matrices in tens to hundreds of milliseconds.

The code of my benchmarks can be found on this gist: https://gist.github.com/espdev/d278962828ce7360d653b8679d7bb511

The plot for `mat * mat.T`:

Could you consider using the SMMP algorithm in sprs? This algorithm has complexity `O(n_row*K^2 + max(n_row, n_col))`, where K is the maximum nnz in a row of A and in a column of B.