feat: remove ligrec parallelize (#1125)
Conversation
Codecov Report ❌ Patch coverage is
Additional details and impacted files:

```
@@            Coverage Diff             @@
##             main    #1125      +/-   ##
==========================================
- Coverage   74.05%   73.90%    -0.16%
==========================================
  Files          39       39
  Lines        6495     6510      +15
  Branches     1122     1122
==========================================
+ Hits         4810     4811       +1
- Misses       1230     1249      +19
+ Partials      455      450       -5
```
for more information, see https://pre-commit.ci
```python
@njit(nogil=True, cache=True)
```
Why not `parallel=True` + `prange`? Because this is being run in a thread pool? Why not just make every individual step parallel?
https://numba.pydata.org/numba-doc/dev/user/parallel.html?highlight=njit#explicit-parallel-loops
Would this require rewriting into a reduction of some sort to prevent overlapping writes?
I see in the benchmarks that the speedup with more jobs does not really scale linearly, which is not what I would expect.
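The "reduction to prevent overlapping writes" idea can be illustrated without numba: give each worker a private accumulator and sum the partials afterwards. This is a minimal NumPy/threading sketch; `count_parallel` and the histogram task are hypothetical stand-ins, not squidpy code.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def count_parallel(data, n_bins, n_jobs=4):
    """Each worker fills a private buffer; the buffers are summed at the
    end, so no two threads ever write to the same array (the reduction)."""
    chunks = np.array_split(data, n_jobs)

    def worker(chunk):
        local = np.zeros(n_bins, dtype=np.int64)  # private to this thread
        for v in chunk:
            local[v] += 1
        return local

    with ThreadPoolExecutor(n_jobs) as ex:
        partials = list(ex.map(worker, chunks))
    return np.sum(partials, axis=0)  # reduction over per-thread results
```

With `prange`, numba can generate the same per-thread-buffer-plus-reduction pattern automatically for supported operations; the sketch just makes the structure explicit.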
> I see in the benchmarks that the speedup with more jobs does not really scale linearly, which is not what I would expect.

That's a good point worth investigating.
```python
def _worker(t: int) -> NDArrayA:
    local_counts = np.zeros((n_inter, n_cpairs), dtype=np.int64)
    rs = np.random.RandomState(None if seed is None else t + seed)
    perm = clustering.copy()
    for _ in range(chunk_sizes[t]):
        rs.shuffle(perm)
        _score_permutation(
            data_arr,
            perm,
            inv_counts,
            mean_obs,
            interactions,
            interaction_clusters,
            valid,
            local_counts,
        )
        pbar.update(1)
    return local_counts
```
Why can't this also be numba-ified with an outer loop of some sort? Why do we still need a thread pool? I thought "one giant kernel" was the goal.
Is shuffling not parallelizable? Certainly there are ways around this, like argsort + random indices or something. Other than that, I don't really see why the `range(chunk_sizes[t])` couldn't be parallelized. Is it the validity of `local_counts`? It seems like there should be ways around this.
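The "argsort + random indices" idea the comment mentions can be sketched in a few lines: a uniform shuffle is equivalent in distribution to sorting by i.i.d. random keys, and since each permutation only reads the input array, many such shuffles could run concurrently inside one kernel. `shuffle_via_argsort` is a hypothetical illustration, not squidpy code.

```python
import numpy as np

def shuffle_via_argsort(arr, rng):
    """Same distribution as rng.shuffle(arr): sort by i.i.d. random keys.
    The input is only read, never written, so calls are trivially parallel."""
    keys = rng.random(arr.shape[0])  # ties have probability ~0
    return arr[np.argsort(keys)]
```

The trade-off is an O(n log n) sort per permutation versus O(n) for Fisher-Yates, which is usually negligible when the array being shuffled is small.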
To keep the progress bar responsive and to get the same shuffling results as the old version.
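The "same shuffling results" point hinges on the per-worker seeding visible in `_worker`: worker `t` derives its stream from `seed + t`, so the sequence of permutations is fixed regardless of how threads interleave. A minimal NumPy-only sketch of that scheme (`chunk_permutations` is a hypothetical helper, not squidpy code):

```python
import numpy as np

def chunk_permutations(clustering, n_perms_in_chunk, t, seed):
    """Mirror the PR's seeding: worker t gets RandomState(seed + t), so its
    chain of shuffles is deterministic independent of thread scheduling."""
    rs = np.random.RandomState(None if seed is None else t + seed)
    perm = clustering.copy()
    out = []
    for _ in range(n_perms_in_chunk):
        rs.shuffle(perm)  # shuffles chain: each starts from the previous
        out.append(perm.copy())
    return out
```

Because each worker owns its own `RandomState`, re-running with the same `seed` and chunk layout reproduces the old version's permutations bit-for-bit.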
Could you explain a bit more?
- Why is the "same results" thing a hard blocker? `clustering` seems small, so copy + shuffle should be cheap as a pre-processing step, i.e., do all the "shuffle" stuff ahead of time / outside numba.
- Would you expect a giant kernel to be faster? My gut is "yes" given Severin's experience / our experience with `co_occurrence`, but I'm all ears.
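The "do all the shuffle stuff ahead of time" suggestion could look like this: generate every permutation up front in plain NumPy, producing one `(n_perms, n)` matrix that a single big kernel then consumes row by row. A hedged sketch, assuming `clustering` is a small 1-D label array; `pregenerate_shuffles` is a hypothetical helper, not squidpy code.

```python
import numpy as np

def pregenerate_shuffles(clustering, n_perms, seed=0):
    """Do all shuffling up front (cheap since `clustering` is small),
    yielding an (n_perms, n) matrix a single numba kernel could iterate
    over with prange, with no RNG calls inside the kernel."""
    rs = np.random.RandomState(seed)
    perms = np.empty((n_perms, clustering.shape[0]), dtype=clustering.dtype)
    perm = clustering.copy()
    for i in range(n_perms):
        rs.shuffle(perm)
        perms[i] = perm
    return perms
```

Memory is the main cost: `n_perms * n` labels, which stays small when `clustering` holds per-cell cluster ids.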
benchmark code
main results
pr results
Results compared:
Both faster and cleaner code. This removes `parallelize`.
Update: the reason main is faster when `n_jobs=1` is that main also sets `numba_parallel=True`, so it is still numba-parallel even though it is a single process.