Change getindex algorithms on sparse matrices #40519
Conversation
```julia
if rowI > rowA
    ptrA += 1
elseif rowI < rowA
    ptrI += 1
```
Couldn't one do a binary search here? Aren't the row indices sorted?
Yes, that is possible. It comes at the cost of a log(k) factor, where k is the size of I. I was hesitating between the binary search and the hashmap here, as I understand getindex_I_sorted_linear to be the fallback for the case where the column of A has relatively many nonzero entries and I is fairly large. So the idea is to go through both I and A[:,j] simultaneously.
In the end, I decided to make an 'algorithmic proposal', as the issue I see with the current implementation lies in the scaling for big matrices. It seemed to me that, given an efficient hashmap, this would be the algorithm of choice in this situation. However, as no one really computes asymptotically, your suggestion might be the better one for all practical purposes.
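For illustration, the two-pointer walk through I and A[:,j] described above can be sketched as follows. This is a hypothetical, simplified standalone version (the function and variable names are mine, not the PR's), operating directly on the sorted row indices of one column:

```julia
# Hypothetical sketch of the simultaneous walk through the sorted row
# indices of column j of A and the sorted index vector I: advance the
# pointer holding the smaller row, record a match when they coincide.
function merge_matches(rowvalsAj::Vector{Int}, I::Vector{Int})
    matches = Tuple{Int,Int}[]   # (position in A[:,j], position in I)
    ptrA, ptrI = 1, 1
    while ptrA <= length(rowvalsAj) && ptrI <= length(I)
        rowA, rowI = rowvalsAj[ptrA], I[ptrI]
        if rowI > rowA
            ptrA += 1            # A's row is smaller: advance in A
        elseif rowI < rowA
            ptrI += 1            # I's entry is smaller: advance in I
        else
            push!(matches, (ptrA, ptrI))
            ptrI += 1            # only advance ptrI, so duplicates in I still match
        end
    end
    return matches
end

merge_matches([2, 5, 9], [1, 5, 9, 10])  # matches at rows 5 and 9
```

The total work is O(nnz(A[:,j]) + length(I)) per column, which is why it pays off exactly when both are large.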
A hash map has a "hidden cost" of log(k) (or worse!) as well, so it is not a priori the better choice, IMHO; one should benchmark.
Of accessing it, or of building it? The operation I am talking about here happens every time an item is found that should go into the new matrix, so using a binary search would in theory lead to a higher cost, assuming that access to the hashmap is O(1).
Accessing it. A hash map is O(1) only because memory is finite (but then, essentially everything is O(1)). So there is no real theoretical advantage; it should be benchmarked, in my opinion.
I see... I believe the right thing to do for this algorithm might be to go with a binary search in I, as @stevengj suggested. This still avoids the cache that is linear in n and should be considerably faster than the hashmap implementation I currently proposed. However, for small matrices it will be slower than the O(1) access of the linear cache; nothing beats that for small matrices.
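A minimal sketch of that binary-search alternative, with hypothetical names of my own choosing: for each stored entry of column j, look up its row in the sorted index vector I with `searchsortedfirst` (O(log k) per lookup), rather than keeping an O(n) cache or a hash map.

```julia
# Hypothetical sketch of the binary-search variant: for each stored row
# of column j of A, binary-search the sorted index vector I instead of
# maintaining a length-n cache.
function bsearch_matches(rowvalsAj::Vector{Int}, I::Vector{Int})
    matches = Tuple{Int,Int}[]   # (position in A[:,j], position in I)
    for (ptrA, rowA) in enumerate(rowvalsAj)
        ptrI = searchsortedfirst(I, rowA)   # O(log k) lookup in sorted I
        if ptrI <= length(I) && I[ptrI] == rowA
            push!(matches, (ptrA, ptrI))
        end
    end
    return matches
end
```

Total work is O(nnz(A[:,j]) * log(length(I))) per column, so it wins over the linear merge when I is large relative to the number of stored entries in the column.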
(I will implement it when I have some time)
Co-authored-by: Steven G. Johnson <stevenj@mit.edu>
@nanosoldier
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @christopher-dG
There are some perf regressions, but the sizes of the sparse matrices used in the benchmarks are quite small. Might be worth looking into them nevertheless?
Yes, I have observed the same thing in my experiments. For small matrices it's hard to beat a dense cache, but it quickly becomes inefficient as the matrix size grows. Unfortunately I am very tied up with my thesis right now, so I haven't had time yet to improve the pull request. The experiments can be found here: https://github.com/bonevbs/nla-notebooks
Yeah, the benchmarks look like they are better over 1k and worse under 100. Perhaps we should merge if those are the more expected sizes, and then later introduce a branch that uses the old code for small sizes?
The best thing would be to update this PR with a heuristic that picks the old algorithm for smaller sizes and the new one where it makes sense. This is always the case with sparse matrix algorithms.
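Such a heuristic could be as simple as a size-based dispatch. A hypothetical sketch (the threshold and names are placeholders I chose, to be tuned by benchmarking, not anything from the PR):

```julia
# Hypothetical size-based dispatch: use the old dense-cache algorithm
# for small matrices and the new cache-free algorithm for large ones.
# SMALL_THRESHOLD is a placeholder; the benchmarks above suggest the
# crossover lies somewhere between ~100 and ~1000 rows.
const SMALL_THRESHOLD = 1_000

getindex_strategy(nrows::Integer) =
    nrows <= SMALL_THRESHOLD ? :dense_cache : :cache_free

getindex_strategy(100)     # small matrix: old algorithm
getindex_strategy(10_000)  # large matrix: new algorithm
```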
I agree with @ViralBShah, and I apologise for the inactivity. I am in the process of writing my thesis, so this was a very bad moment to open this issue...
No worries! Good luck with the thesis. We can keep this open until you or someone else finds the time to see it through.
We have moved the SparseArrays stdlib to an external repository. Please open this PR against that repository: https://github.com/JuliaLang/SparseArrays.jl Thank you!
Two of the current algorithms for accessing `SparseMatrixCSC` (`getindex_I_sorted_linear` and `getindex_I_sorted_bsearch_I`) are suboptimal, as they use a cache linear in the size of the overall matrix. I made changes to both algorithms to avoid this.
`getindex_I_sorted_bsearch_I` seems fine now, although there might be performance improvements possible. `getindex_I_sorted_linear` should ideally use an efficient hashmap implementation to be performant. As of now, using `Dict` is somewhat slower on small matrices; one has to go to big problems in order to see performance benefits. It would be great if people could look at this. I have raised the issue here: https://discourse.julialang.org/t/getindex-a-i-j-wrong-complexity-on-sparsematrixcsc/59277
getindex_I_sorted_bsearch_Iseems fine now, although there might be performance improvements possible.getindex_I_sorted_linearshould ideally use an efficient implementation of a hashmap to be performant. As of now using Dict is somewhat slower on small matrices. One has to go to big problems in order to see performance benefits.It could be great if people could look at this. I have raised the issue here: https://discourse.julialang.org/t/getindex-a-i-j-wrong-complexity-on-sparsematrixcsc/59277