perf: single-pass _sparse_nanmean (#1894)#4141
Open
Lawson-Darrow wants to merge 2 commits into
Open
Conversation
_sparse_nanmean copied the matrix twice and did a sparse set-index + eliminate_zeros. Replace with single-pass numba kernels over the compressed buffers (one for within-slot reduction, one for the scatter across slots), handling both axes and CSR/CSC. Implicit zeros still count as observed; only stored NaNs are excluded, matching np.nanmean on the dense array. Benchmarked on a 20k x 3k 5%-dense matrix: ~88x faster for axis=1, ~9.5x for axis=0. Adds CSC regression tests.
Author
|
The failing pre and low-vers checks here are unrelated to this change. The 8 failures in the pre (3.14) job are all in tests/test_read_10x.py and tests/test_aggregated.py, caused by new scipy pre-release DeprecationWarnings (the spmatrix-to-sparray migration and the block_diag interface change) being promoted to errors by the warnings-as-errors config. Every score_genes / _sparse_nanmean test passes in that same job. The low-vers (3.12) job did not actually fail a test, it was cancelled by fail-fast when the pre job failed. Happy to rebase once the pre env is fixed on main. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #1894.
_sparse_nanmeancopied the matrix twice and did a sparsedata[isnan] = 0set-index pluseliminate_zeros. This replaces it with single-pass numba kernels over the compressed buffers: one reduces within each compressed slot (per row for CSR), the other scatters across slots for the other axis. Both axes and CSR/CSC are handled. Semantics are unchanged: implicit zeros count as observed values, only stored NaNs are excluded, matchingnp.nanmeanon the dense array.Benchmark (20k x 3k, 5% dense, with NaNs): axis=1 ~88x faster, axis=0 ~9.5x faster.
Existing
test_sparse_nanmean(both axes) still passes; added CSC regression tests. Used numba per the issue's suggestion, happy to adjust if you'd prefer.