perf: "two-pass" seurat hvg via scanpy.get.aggregate#4013

Draft
ilan-gold wants to merge 10 commits into main from ig/two_pass_hvg_v3

@ilan-gold
Contributor

An idea that popped into my head for disk-bound datasets, though it likely helps normal ones too. In theory, this should greatly improve on-disk access and speed up disk-bound data by reducing the amount of I/O in the worst-case, unordered scenario. I would guess in-memory datasets are left untouched, or maybe improved thanks to better memory access and a more efficient mean/var computation.
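To illustrate the I/O argument, here is a minimal sketch, not the code in this PR. It assumes the expensive case is computing per-group (e.g. per-batch) gene means and variances when the group labels are interleaved across rows, so that subsetting by group would force scattered reads. The helper `grouped_mean_var` is hypothetical and written in plain Python; it does not call `scanpy.get.aggregate`, it only mimics a grouped mean/var reduction done in one sequential sweep over the rows. It uses population variance and the naive sum-of-squares formula, which is fine for a demonstration but not numerically robust.

```python
from collections import defaultdict


def grouped_mean_var(labelled_rows, n_genes):
    """Per-group, per-gene mean and population variance in one sequential sweep.

    `labelled_rows` yields (group_label, row) pairs in storage order, so rows
    are read exactly once even when the labels are interleaved.
    """
    counts = defaultdict(int)
    sums = defaultdict(lambda: [0.0] * n_genes)
    sq_sums = defaultdict(lambda: [0.0] * n_genes)

    for label, row in labelled_rows:
        counts[label] += 1
        s, q = sums[label], sq_sums[label]
        for j, x in enumerate(row):
            s[j] += x        # running sum per gene
            q[j] += x * x    # running sum of squares per gene

    stats = {}
    for label, n in counts.items():
        means = [s / n for s in sums[label]]
        # Var[x] = E[x^2] - E[x]^2 (population variance, ddof=0)
        variances = [q / n - m * m for q, m in zip(sq_sums[label], means)]
        stats[label] = (means, variances)
    return stats


# Rows of two groups are interleaved, as in an unordered on-disk layout.
rows = [("a", [1, 2]), ("b", [10, 0]), ("a", [3, 2]), ("b", [30, 0])]
stats = grouped_mean_var(rows, n_genes=2)
print(stats["a"])
print(stats["b"])
```

The point of the sketch is only that every row is touched once, in storage order, regardless of how the groups are laid out.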

  • Closes #
  • Tests included or not required because:

@ilan-gold ilan-gold added this to the 1.12.1 milestone Mar 26, 2026
@ilan-gold ilan-gold changed the title from perf: "two-pass" seurat hvg3 via scanpy.get.aggregate to perf: "two-pass" seurat hvg via scanpy.get.aggregate Mar 26, 2026
@codecov

codecov bot commented Mar 26, 2026

Codecov Report

❌ Patch coverage is 83.87097% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 78.64%. Comparing base (cb3e6c2) to head (cc0d67e).
✅ All tests successful. No failed tests found.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/scanpy/preprocessing/_highly_variable_genes.py | 83.87% | 5 Missing ⚠️ |

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #4013   +/-   ##
=======================================
  Coverage   78.63%   78.64%           
=======================================
  Files         118      118           
  Lines       12729    12753   +24     
=======================================
+ Hits        10010    10030   +20     
- Misses       2719     2723    +4     
| Flag | Coverage Δ |
|---|---|
| hatch-test.low-vers | 77.95% <83.87%> (+0.01%) ⬆️ |
| hatch-test.pre | 78.53% <83.87%> (+<0.01%) ⬆️ |

Flags with carried forward coverage won't be shown.

| Files with missing lines | Coverage Δ |
|---|---|
| src/scanpy/preprocessing/_highly_variable_genes.py | 93.83% <83.87%> (-1.24%) ⬇️ |

... and 1 file with indirect coverage changes

@scverse-benchmark

scverse-benchmark bot commented Mar 26, 2026

@flying-sheep flying-sheep modified the milestones: 1.12.1, 1.12.2 Apr 10, 2026