Skip to content

perf: single-pass _sparse_nanmean (#1894)#4141

Open
Lawson-Darrow wants to merge 2 commits into
scverse:mainfrom
Lawson-Darrow:perf/sparse-nanmean-single-pass-1894
Open

perf: single-pass _sparse_nanmean (#1894)#4141
Lawson-Darrow wants to merge 2 commits into
scverse:mainfrom
Lawson-Darrow:perf/sparse-nanmean-single-pass-1894

Conversation

@Lawson-Darrow
Copy link
Copy Markdown

Closes #1894.

_sparse_nanmean copied the matrix twice and did a sparse data[isnan] = 0 set-index plus eliminate_zeros. This replaces it with single-pass numba kernels over the compressed buffers: one reduces within each compressed slot (per row for CSR), the other scatters across slots for the other axis. Both axes and CSR/CSC are handled. Semantics are unchanged: implicit zeros count as observed values, only stored NaNs are excluded, matching np.nanmean on the dense array.

Benchmark (20k x 3k, 5% dense, with NaNs): axis=1 ~88x faster, axis=0 ~9.5x faster.

Existing test_sparse_nanmean (both axes) still passes; added CSC regression tests. Used numba per the issue's suggestion, happy to adjust if you'd prefer.

_sparse_nanmean copied the matrix twice and did a sparse set-index +
eliminate_zeros. Replace with single-pass numba kernels over the compressed
buffers (one for within-slot reduction, one for the scatter across slots),
handling both axes and CSR/CSC. Implicit zeros still count as observed; only
stored NaNs are excluded, matching np.nanmean on the dense array.

Benchmarked on a 20k x 3k 5%-dense matrix: ~88x faster for axis=1, ~9.5x for
axis=0. Adds CSC regression tests.
@Lawson-Darrow
Copy link
Copy Markdown
Author

The failing pre and low-vers checks here are unrelated to this change. The 8 failures in the pre (3.14) job are all in tests/test_read_10x.py and tests/test_aggregated.py, caused by new scipy pre-release DeprecationWarnings (the spmatrix-to-sparray migration and the block_diag interface change) being promoted to errors by the warnings-as-errors config. Every score_genes / _sparse_nanmean test passes in that same job. The low-vers (3.12) job did not actually fail a test, it was cancelled by fail-fast when the pre job failed. Happy to rebase once the pre env is fixed on main.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

_sparse_nanmean is inefficient

1 participant