ENH: Optimise _cs_matrix._set_many when new entries are all zero #11603
Reference issue
Closes: gh-11600
What does this implement/fix?
This adds a special case for doing a scatter-set (`arrayXarray`) on a compressed matrix where there are many zeros and, in particular, where the only new values outside the existing sparsity structure are zero.

Changing the sparsity structure of a csr_matrix or csc_matrix is expensive, so it is worth checking whether that is actually required before doing anything. This significantly accelerates operations like `m.setdiag(0)` and `m[i, j] = 0`, because they never have to change the sparsity structure.

For the latter case, the `Getset.track_fancy_setitem` benchmark is extended to also measure setting some indices to zero, in addition to setting them to random values. On csr and csc matrices, with a different sparsity structure, setting to zero is 3.7-67× faster than setting to random values.

The example in #11600 (`setdiag(0)` on a 10000x10000 random CSR matrix) goes from 147ms to ~5ms, essentially the same as the "manual" approach described there.
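
To illustrate the idea, here is a minimal pure-Python sketch of the fast path on raw CSR arrays. It is not the code in this PR: `find_offset` and `set_many_sketch` are hypothetical helpers, and the sketch assumes column indices are sorted within each row. Stored entries are overwritten in place, and if every targeted position that is *not* stored would receive zero, nothing else needs to happen.

```python
from bisect import bisect_left


def find_offset(indptr, indices, i, j):
    """Position of entry (i, j) in the CSR data array, or -1 if not stored.

    Hypothetical helper; assumes sorted column indices within each row.
    """
    start, end = indptr[i], indptr[i + 1]
    pos = bisect_left(indices, j, start, end)
    if pos < end and indices[pos] == j:
        return pos
    return -1


def set_many_sketch(indptr, indices, data, rows, cols, values):
    """Set (rows[k], cols[k]) = values[k] without touching the structure.

    Returns True if the assignment is complete, or False if some non-zero
    value targets a position outside the existing sparsity structure (the
    caller would then have to take the expensive insertion path).
    """
    offsets = [find_offset(indptr, indices, i, j) for i, j in zip(rows, cols)]

    # Overwrite entries that are already stored; this never changes structure.
    for off, v in zip(offsets, values):
        if off != -1:
            data[off] = v

    # Fast path: every missing entry would be zero, which is already implicit.
    missing = [k for k, off in enumerate(offsets) if off == -1]
    return all(values[k] == 0 for k in missing)


# CSR form of [[1, 0, 2], [0, 0, 3], [4, 5, 0]]
indptr = [0, 2, 3, 5]
indices = [0, 2, 2, 0, 1]
data = [1, 2, 3, 4, 5]

# Equivalent of setdiag(0): only (0, 0) is stored, the rest are already zero.
done = set_many_sketch(indptr, indices, data, [0, 1, 2], [0, 1, 2], [0, 0, 0])
```

In this example only `data` is modified; `indptr` and `indices` stay exactly as they were, which is what makes the zero case cheap. The sketch leaves explicit zeros in `data` rather than pruning them.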