perf: numba-based aggregations for sparse data (#4062)
Conversation
Codecov Report ❌ Patch coverage is

Additional details and impacted files

@@           Coverage Diff           @@
##             main    #4062     +/- ##
==========================================
+ Coverage   78.61%   78.63%   +0.02%
==========================================
  Files         117      118       +1
  Lines       12713    12729      +16
==========================================
+ Hits         9994    10010      +16
  Misses       2719     2719
Flags with carried forward coverage won't be shown.
Benchmark changes

Comparison: https://github.com/scverse/scanpy/compare/87dc1eca044af39ab45a854a3297dbc3b72f6f0c..83c06680a6fae53dc63e478b88aa1cb02017f054
More details: https://github.com/scverse/scanpy/pull/4062/checks?check_run_id=71658816499
flying-sheep left a comment
awesome! do these kernels come from other work (annbatch?) or did you make them just now?
    return (
        utils.asarray(self.indicator_matrix @ self.data)
        / np.bincount(self.groupby.codes)[:, None]
    )
do you think it would make sense to avoid re-executing sum by having basically _sum_mean and _sum_mean_var and using that in aggregate_array?
Or is sum so fast that re-executing it is fine?
I think for now, this is probably fine. But it's a good point, no doubt! This PR was just focused on the current implementation. The perf difference between having 2 sum calls vs 1 in mean_var wasn't that huge, so it probably is "very fast", as you say.
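The fused approach discussed above (computing the sum once and deriving mean and variance from it, instead of re-executing sum) can be sketched as follows. This is a dense-array illustration with made-up names, not the PR's actual helpers:

```python
import numpy as np

def sum_mean_var(x, axis=0):
    # Single pass for sum and sum-of-squares; mean and variance
    # are then derived arithmetically instead of re-running sum.
    s = x.sum(axis=axis)
    sq = np.square(x).sum(axis=axis)
    n = x.shape[axis]
    mean = s / n
    var = sq / n - mean**2  # population variance
    return s, mean, var
```

A `_sum_mean` variant would simply drop the `sq` accumulation, so both could share the sum result as suggested.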
Just made them on the spot! Saw a problem (un-parallelized aggregation used in two-pass Seurat HVG #4013, causing a performance regression) and a solution (parallel kernels in numba).
…for sparse data) (#4064)
Now that I know #4013 won't be hurt if we use the acceleration in this PR in-memory, I'm redoing #4041 against `main` with a standalone benchmark.