
Improve slicing of lazily transposed sparse matrices: WIP #28654

Closed
wants to merge 3 commits

Conversation

@jlapeyre (Contributor) commented Aug 14, 2018

This PR uses the slice-extraction code in sparsevector.jl to take slices of sparse matrices
wrapped in lazy Adjoint and Transpose. Previously, the fallback code in abstractarray.jl
constructed these slices inefficiently.
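A rough sketch of the idea (the helper name and exact construction are illustrative, not the PR's actual code): row i of transpose(A) is column i of the parent A, and CSC storage exposes the stored entries of a column directly, so the slice can be built without scanning the whole matrix.

```julia
using SparseArrays, LinearAlgebra

# Hypothetical helper: row i of transpose(A) is column i of the parent A.
# CSC stores each column contiguously, so its entries can be read off
# from colptr/rowval/nzval without touching the rest of the matrix.
function transpose_row_slice(At::Transpose{<:Any,<:SparseMatrixCSC}, i::Integer)
    A = parent(At)
    rng = A.colptr[i]:(A.colptr[i + 1] - 1)   # stored entries of column i
    # transpose each stored value, mirroring the recursive lazy wrapper
    SparseVector(size(A, 1), A.rowval[rng], transpose.(A.nzval[rng]))
end
```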

Here is an example with a sparse matrix created by sprand(10^4, 10^4, 0.01), which shows a speedup of 10x to 1000x.
The test code.
Benchmark before the PR.
Benchmark after the PR.
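For reference, a sketch of the benchmark setup (assuming the BenchmarkTools package is installed; the matrix size follows the gist):

```julia
using SparseArrays, LinearAlgebra, BenchmarkTools

a = sprand(10^4, 10^4, 0.01)
t = transpose(a)

@btime $t[1, :];   # row slice of the lazy transpose
@btime $t[:, 1];   # column slice of the lazy transpose
```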

The changes should have no effect on the efficiency of slicing unwrapped matrices, i.e. plain SparseMatrixCSC.

The new slicing calls adjoint and transpose recursively. I don't know of any case where that is useful, although there may be some. Consider, for example, a sparse matrix whose elements are dense matrices. If you want to interpret this as a block matrix, then transpose(S)[1, :] doesn't give you the correct result. (Neither does S[1, :].) It might make sense to restrict the recursion to structures intended for linear algebra (e.g. BlockArray) rather than to any ad-hoc array; the reason for the restriction would be to reduce complexity. I'm sure this has been discussed elsewhere, but I have not yet read about it.
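To illustrate the recursion (a made-up example, not from the PR's tests):

```julia
using SparseArrays, LinearAlgebra

# A 2x2 sparse matrix whose stored elements are themselves 2x2 dense matrices.
A = [1.0 2.0; 3.0 4.0]
B = [5.0 6.0; 7.0 8.0]
S = sparse([1, 2], [1, 2], [A, B])

# The lazy wrapper applies transpose to the stored element as well,
# which may or may not be what a block-matrix interpretation wants.
transpose(S)[1, 1] == transpose(A)   # true: the block itself is transposed
```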

There could be further optimizations. EDIT: The following has been done and committed to this branch.
For instance, check via dispatch whether the elements of the parent matrix are numbers, in which case it is not necessary to call transpose.
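A sketch of that dispatch (the helper name is hypothetical, not the PR's actual code):

```julia
using LinearAlgebra

# For Number elements, transpose is the identity, so the per-element
# call can be skipped entirely via dispatch on the eltype.
slice_nzvals(vals::AbstractVector{<:Number}) = copy(vals)
slice_nzvals(vals::AbstractVector) = transpose.(vals)
```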

EDIT: Tests have been added to this branch. (Originally: this PR needs tests; there are none. I would not be surprised if some or all slicing of wrapped arrays is not covered by existing tests.)

This PR uses the code for getting slices in
sparsevector.jl to get slices of sparse matrices
with lazy adjoint and transpose wrappers. Before,
the fallback code in abstractarray.jl created the
slices inefficiently.
@jlapeyre (Contributor, Author)

This tests slicing (indexing) of sparse
matrices wrapped with Transpose or Adjoint.
@jlapeyre (Contributor, Author)

Testing with code coverage showed that before this PR there were no tests of indexing sparse matrices wrapped with Transpose or Adjoint.

This commit adds tests for all of the new indexing methods.
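A sketch of the kind of check such tests can make (not the PR's actual test code): compare slices of the wrapped matrix against the eagerly materialized transpose.

```julia
using SparseArrays, LinearAlgebra, Test

A = sprand(10, 10, 0.3)
for wrap in (transpose, adjoint)
    W = wrap(A)
    R = copy(W)                       # eagerly materialized reference
    @test W[3, :] == R[3, :]          # row slice
    @test W[:, 4] == R[:, 4]          # column slice
    @test W[2:5, 6:9] == R[2:5, 6:9]  # range slice
end
```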

…rixCSC

* Only do a recursive transpose if the data type of SparseMatrixCSC
  is not a Number. Some tests on 1000x1000 and 100x100 matrices showed
  at most approximately 10% decrease in time for slicing transposed,
  numerical SparseMatrixCSC.

* rename _colptr_range to _row_index_range and improve comments.
@jlapeyre (Contributor, Author)

Commit e5d2acd consistently decreases the benchmark time by about 10% for some test matrices. I think the compiler has enough information to do the same optimization without this commit, but that optimization might not come soon.

@chriscoey

Nice! I'm looking forward to this being merged. Is it still WIP?

@jlapeyre (Contributor, Author) commented Aug 18, 2018

It is finished; it just needs to be reviewed. I changed existing indexing routines, but I checked that the changes are optimized away and perform exactly as before. Still, someone should trigger nanosoldier.

@andreasnoack (Member)

@nanosoldier runbenchmarks("sparse", vs = ":master")

@jlapeyre (Contributor, Author)

Where can I find the output from nanosoldier? Or maybe it failed to run.

@andreasnoack (Member)

I think it was on strike when I tried last time, but it should be back on duty now, so let's try again:

@nanosoldier runbenchmarks("sparse", vs = ":master")

@mbauman added the sparse (Sparse arrays) and performance (Must go faster) labels on Aug 29, 2018
@andreasnoack (Member)

Hm. It looks like I've been misinformed. It still isn't running.

@jlapeyre (Contributor, Author)

Sorry about your wasted time mucking with this. I doubt it will show a performance hit, but the modification really should be tested properly. I prefer to wait a bit rather than run the benchmarks locally.

@andreasnoack (Member)

@nanosoldier runbenchmarks("sparse", vs = ":master")

@nanosoldier (Collaborator)

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan

@jlapeyre (Contributor, Author) commented Sep 8, 2018

Looks like there may be a regression. And reviewing my own benchmarks in the gist, I see a regression:

# before PR
@btime a[1, :];
 25.862 μs (15 allocations: 4.56 KiB)

# after PR
@btime a[1, :];
  31.257 μs (16 allocations: 4.59 KiB)

These should be unchanged, but in fact there is an additional allocation. I thought I took care of this before the PR. It can definitely be fixed by copying code, but I want to avoid that if possible. I will look into it...
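One way to chase the extra allocation (assuming BenchmarkTools; warm the call first so compilation isn't counted):

```julia
using SparseArrays, BenchmarkTools

a = sprand(10^4, 10^4, 0.01)
a[1, :]               # warm up: exclude compilation from the measurement
@allocated a[1, :]    # bytes allocated by a single call
@btime $a[1, :];      # prints the allocation count alongside the timing
```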

@ViralBShah (Member)

Let's try to get this in.

@ViralBShah (Member)

Bump.

@jlapeyre (Contributor, Author) commented Feb 3, 2019

I have four PRs to Julia in the works. I'll prioritize this one. But, at the moment, my time budget is pretty tight. Thanks for the reminder.

@ViralBShah (Member)

Just checking to see if it is worth revisiting now.

@DilumAluthge (Member)

We have moved the SparseArrays stdlib to an external repository.

Please open this PR against that repository: https://github.com/JuliaLang/SparseArrays.jl

Thank you!

@jlapeyre (Contributor, Author)

Huh. I completely forgot about this. I'll look into it.
