Improve slicing of lazily transposed sparse matrices (WIP) #28654
Conversation
This PR uses the code for getting slices in sparsevector.jl to get slices of sparse matrices with lazy adjoint and transpose wrappers. Before, the fallback code in abstractarray.jl created the slices inefficiently.
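A minimal illustration of the behavior this PR speeds up (the matrix size and indices here are only for the example):

```julia
using SparseArrays, LinearAlgebra

A = sprand(10^3, 10^3, 0.01)

# A column of the lazy transpose is a row of the parent matrix. Before
# this PR the generic fallback in abstractarray.jl assembled such slices
# element by element; the PR routes them through the specialized
# sparsevector.jl indexing code instead.
v = transpose(A)[:, 7]
@assert v == A[7, :]

# The adjoint wrapper behaves analogously, with conjugation.
w = A'[2, :]
@assert w == conj.(A[:, 2])
```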
This tests slicing (indexing) of sparse matrices wrapped with Transpose or Adjoint.
Testing with code coverage showed that before this PR there were no tests of indexing sparse matrices wrapped with `Transpose` or `Adjoint`. This commit adds tests for all of the new indexing methods.
…rixCSC

* Only do a recursive transpose if the element type of the SparseMatrixCSC is not a Number. Some tests on 1000x1000 and 100x100 matrices showed at most approximately a 10% decrease in time for slicing transposed, numerical SparseMatrixCSC.
* Rename _colptr_range to _row_index_range and improve comments.
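The `Number` special case in the commit above can be sketched with dispatch; `_maybe_transpose` is a hypothetical name for illustration, not an identifier from the PR:

```julia
using LinearAlgebra

# Hypothetical helper sketching the commit's idea: scalar element types
# need no recursive transpose, while matrix-valued elements do.
_maybe_transpose(x::Number) = x          # a Number is its own transpose
_maybe_transpose(x) = transpose(x)       # e.g. elements that are matrices

@assert _maybe_transpose(3.0) === 3.0
@assert _maybe_transpose([1 2; 3 4]) == [1 3; 2 4]
```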
Force-pushed from 8438f75 to e5d2acd.
Commit e5d2acd decreases the benchmark time consistently by about 10% for some test matrices. I think the compiler has enough information to do the same optimization without this commit. But, that optimization might not come soon.
Nice! I'm looking forward to this being merged. Is it still WIP?
It is finished. It just needs to be reviewed. I changed existing indexing routines, but I checked that the changes are optimized away and perform exactly as before. Still, someone should trigger nanosoldier.
@nanosoldier
Where can I find the output from nanosoldier? ... or maybe it failed to run.
I think it was on strike when I tried last time but should be back on duty now, so let's try again @nanosoldier
Hm. It looks like I've been misinformed. It still isn't running.
Sorry about your wasted time mucking with this. I doubt it will show a performance hit, but the modification really should be tested properly. I prefer to wait a bit rather than run them locally.
@nanosoldier
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan
Looks like there may be a regression. Reviewing my own benchmarks in the gist, I also see one:
These should be unchanged, but in fact there is an additional allocation. I thought I took care of this before the PR. It can definitely be fixed by copying code, but I want to avoid that if possible. I will look into it...
Let's try to get this in.
Bump.
I have four PRs to Julia in the works. I'll prioritize this one. But at the moment, my time budget is pretty tight. Thanks for the reminder.
Just checking to see if it is worth revisiting now.
We have moved the SparseArrays stdlib to an external repository. Please open this PR against that repository: https://github.com/JuliaLang/SparseArrays.jl Thank you!
Huh. I completely forgot about this. I'll look into it.
This PR uses the code for getting slices in sparsevector.jl to get slices of sparse matrices
that have lazy adjoint and transpose wrappers. Before, the fallback code in abstractarray.jl created the
slices inefficiently.
Here is an example with a sparse matrix created with `sprand(10^4, 10^4, 0.01)`, which shows an increase in speed of 10x to 1000x.
The test code.
Benchmark before the PR.
Benchmark after the PR
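The benchmarks linked above can be reproduced roughly as follows; this is only a sketch of the kind of timing run, assuming the BenchmarkTools package (the specific slice indices are made up for illustration):

```julia
using SparseArrays, BenchmarkTools

S = sprand(10^4, 10^4, 0.01)
St = transpose(S)

# Slices of the lazy wrapper: before the PR these went through the
# generic AbstractArray fallback, after it through the sparse routines.
@btime $St[:, 500];
@btime $St[500, :];
@btime $St[100:200, 300:400];
```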
The changes should have no effect on the efficiency of slicing unwrapped (i.e. `SparseMatrixCSC`) matrices.

The new slicing calls `adjoint` and `transpose` recursively. I don't know of any cases where that is useful, although there may be. Consider, for example, a sparse matrix whose elements are dense matrices. If you want to interpret this as a block matrix, then `transpose(S)[1, :]` doesn't give you the correct result. (Neither does `S[1, :]`.) Maybe restricting the recursion to structures that are intended to be used for linear algebra (e.g. `BlockArray`), rather than any ad-hoc array, makes sense. The reason for this restriction would be to reduce complexity. I'm sure this has been discussed elsewhere, but I have not yet read about it.

There could be further optimizations. EDIT: The following has been done and committed to this branch. For instance, check if the elements of the parent matrix are numbers (via dispatch), in which case it is not necessary to call `transpose`.

EDIT: Tests have been added to this branch. This PR needs tests; there are none. I would not be surprised if some or all slicing of wrapped arrays is not covered by existing tests.
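The block-matrix caveat discussed above can be illustrated with a dense matrix of matrices (a sketch for illustration, not code from the PR):

```julia
using LinearAlgebra

B = [1 2; 3 4]
# A 2x2 matrix whose elements are themselves 2x2 matrices.
M = reshape([B, 3B, 2B, 4B], 2, 2)   # column-major: M[1,1]=B, M[2,1]=3B, ...

# The lazy transpose is recursive: it also transposes each element.
@assert transpose(M)[1, 2] == transpose(3B)

# So a slice of transpose(M) holds transposed blocks, not the row of
# blocks [B, 3B] expected from a purely structural (block) transpose.
@assert transpose(M)[1, :] == [transpose(B), transpose(3B)]
```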