(feat): `igraph` leiden implementation now included as an option in `sc.tl.leiden` #2815

ilan-gold · 2024-01-16T18:07:07Z

Closes Leiden now included in python-igraph #1053
Tests included or not required because:

Release notes not necessary because:

ilan-gold · 2024-01-16T18:10:29Z

TODOs:

1. Figure out why some tests pass when they shouldn't (this is why I pushed the branch; I was curious about CI). UPDATE: the `tol` passed to `matplotlib.testing.compare.compare_images` is too high for a sparse-ish plot like `rank_genes_groups`. This is somewhat worrying and will need to be amended. Other than that, the changed plotting outputs make sense, so this should be resolved.
2. Check the scanpy tutorials to see what needs to change there as well, if anything (if changes are needed, the two PRs should be merged in tandem). The following use leiden in some capacity:
   a. https://scanpy-tutorials.readthedocs.io/en/latest/pbmc3k.html
   b. https://scanpy-tutorials.readthedocs.io/en/latest/plotting/core.html
   c. https://scanpy-tutorials.readthedocs.io/en/latest/spatial/basic-analysis.html
   d. https://scanpy-tutorials.readthedocs.io/en/latest/spatial/integration-scanorama.html
3. Do a large-dataset test: check NMI to measure how well the new default agrees with the old one, check speed to confirm that what we're doing makes sense (although this seems to have been covered in Leiden now included in python-igraph #1053), and check scalability.
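For the NMI check in the last TODO item, here is a minimal pure-Python sketch of normalized mutual information between two cluster labelings. It is not code from this PR: the function names are hypothetical, and it assumes arithmetic-mean normalization of the two entropies. In practice a library routine such as `sklearn.metrics.normalized_mutual_info_score` would likely be used instead.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy (natural log) of a sequence of cluster labels."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def normalized_mutual_info(labels_a, labels_b):
    """NMI between two labelings of the same cells.

    Assumption: normalize by the arithmetic mean of the two entropies.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have the same length")
    n = len(labels_a)
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))
    # Mutual information summed over the non-empty cells of the contingency table.
    mi = sum(
        (c / n) * log(c * n / (counts_a[a] * counts_b[b]))
        for (a, b), c in joint.items()
    )
    denom = (entropy(labels_a) + entropy(labels_b)) / 2
    # Both labelings put every cell in one cluster: treat them as identical.
    return 1.0 if denom == 0 else mi / denom


# Hypothetical labels: the same partition with cluster names swapped.
old_labels = ["0", "0", "1", "1"]
new_labels = ["1", "1", "0", "0"]
score = normalized_mutual_info(old_labels, new_labels)
```

Because NMI ignores how clusters are named, it suits comparing two backends whose category labels may be permuted.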

ilan-gold · 2024-01-16T21:20:37Z

Failing violin_2 does not fail locally, so that's bizarre.

codecov · 2024-01-16T22:18:08Z

Codecov Report

Attention: 4 lines in your changes are missing coverage. Please review.

Comparison is base (1ac74a7) 74.57% compared to head (2549f61) 74.63%.

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #2815      +/-   ##
==========================================
+ Coverage   74.57%   74.63%   +0.05%     
==========================================
  Files         115      115              
  Lines       12716    12756      +40     
==========================================
+ Hits         9483     9520      +37     
- Misses       3233     3236       +3

| Files | Coverage Δ | |
|---|---|---|
| scanpy/_utils/__init__.py | `68.20% <90.90%> (+1.00%)` | ⬆️ |
| scanpy/tools/_leiden.py | `88.40% <94.44%> (+3.29%)` | ⬆️ |

scanpy/tests/notebooks/test_pbmc3k.py

ilan-gold · 2024-01-16T23:05:26Z

scanpy/tools/_leiden.py

+        clustering_args["weights"] = (
+            "weight" if use_igraph else np.array(g.es["weight"]).astype(np.float64)
+        )


I think both of these should be recorded in `uns` as outputs: the weights setting in the `params` part of the dict, and `use_igraph` at the top level.

adata.uns['leiden'] # { 'params': { ... 'use_weights': True }, 'use_igraph': False }

scanpy/tests/test_clustering.py

ivirshup

There are a lot of branches here like `if flavor == "leidenalg"`. I think it could be cleaner to have just one higher-level branch for the two cases.

I think we can change the default for `directed` to `None`, then dynamically set the value based on which flavor is passed.

We could do the same with `n_iterations`, though I'm a little on the fence about this.
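The suggestion of a `None` default that is resolved per flavor could look roughly like the sketch below. This is an illustration and not the code merged in this PR: `resolve_directed` is a hypothetical helper, and it assumes that the `leidenalg` flavor defaults to a directed graph while the `igraph` flavor is only used with undirected graphs.

```python
from __future__ import annotations


def resolve_directed(flavor: str, directed: bool | None = None) -> bool:
    """Pick the effective `directed` value for a given flavor (hypothetical helper)."""
    if flavor not in {"leidenalg", "igraph"}:
        raise ValueError(f"unknown flavor: {flavor!r}")
    if directed is None:
        # Assumption: leidenalg defaults to directed, igraph to undirected.
        return flavor == "leidenalg"
    if flavor == "igraph" and directed:
        # Assumption: the igraph flavor does not accept a directed graph.
        raise ValueError("directed=True is not supported with flavor='igraph'")
    return directed
```

With this shape, a caller who passes nothing gets a sensible value for either flavor, and an explicit `directed=True` with `flavor="igraph"` fails early instead of being silently ignored.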

A number of plots are being changed that don't seem to be very different. Mostly things pre-clustering, like pca and qc metrics. I'm going to check if these are actually different enough to trigger test failures.

scanpy/tools/_leiden.py

ilan-gold · 2024-02-19T13:28:05Z

@ivirshup You are right about some of the pre-processing plots. I should have created a separate branch for those. I simply think the outputs have changed and we never caught it, but I would be curious to see what you get. It could be my M1.

ilan-gold · 2024-02-19T14:19:49Z

@ivirshup I simplified the conditionals a bit and there are only two sets now: one to check for the various {Value/Import}Errors and another to build the clustering_kwargs. I think this is cleaner and faster since no code will run that doesn't have to. I didn't really see a way to do it with only one set of conditionals without code duplication. There's some code that's common to both flavors, but it shouldn't run when one of the {Value/Import}Errors is raised.
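The two-stage structure described here (validate first, then build the keyword arguments) might be sketched as follows. This is a simplified illustration under stated assumptions, not the merged implementation: `build_clustering_kwargs`, its parameters, and the default values are hypothetical, the import checks are omitted, and the keyword names `resolution` and `resolution_parameter` are assumed to match what the two backends accept.

```python
from __future__ import annotations

from typing import Any


def build_clustering_kwargs(
    flavor: str,
    *,
    directed: bool,
    partition_type: Any = None,
    edge_weights: list[float] | None = None,
    resolution: float = 1.0,
    n_iterations: int = 2,
) -> dict[str, Any]:
    """Hypothetical helper: validate inputs, then build backend kwargs."""
    # Set 1: validation only. Nothing below runs if an error is raised here.
    # (The real function would also check that the backend can be imported.)
    if flavor not in {"igraph", "leidenalg"}:
        raise ValueError(f"unknown flavor: {flavor!r}")
    if flavor == "igraph":
        if directed:
            raise ValueError("flavor='igraph' requires an undirected graph")
        if partition_type is not None:
            raise ValueError("partition_type only applies to flavor='leidenalg'")

    # Code common to both flavors.
    kwargs = {"n_iterations": n_iterations}

    # Set 2: flavor-specific keyword arguments.
    if flavor == "igraph":
        kwargs["resolution"] = resolution
        if edge_weights is not None:
            # igraph can look weights up by edge-attribute name.
            kwargs["weights"] = "weight"
    else:
        kwargs["resolution_parameter"] = resolution
        if edge_weights is not None:
            # leidenalg is given the numeric weights directly.
            kwargs["weights"] = [float(w) for w in edge_weights]
    return kwargs
```

Keeping validation in its own block means the shared setup never runs for an invalid combination, at the cost of branching on the flavor twice.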

ivirshup · 2024-02-19T14:48:17Z

About the preprocessing plots:

scanpy/tests/notebooks/_images_pbmc3k/filter_genes_dispersion/expected.png

Text and some points are shifted slightly. I'm not totally sure whether any points are actually in a different place

scanpy/tests/notebooks/_images_pbmc3k/highest_expr_genes/expected.png
scanpy/tests/notebooks/_images_pbmc3k/pca/expected.png
scanpy/tests/notebooks/_images_pbmc3k/pca_variance_ratio/expected.png
scanpy/tests/notebooks/_images_pbmc3k/scatter_2/expected.png

Axis text shifted slightly
Can probably be reverted if the tests still pass

scanpy/tests/notebooks/_images_pbmc3k/scatter_1/expected.png

y axis moved

ilan-gold · 2024-02-19T14:59:47Z

@ivirshup I will revert those, and hopefully tests pass

ivirshup

Looks good! But I don't think that you addressed these:

I think we can change the default for directed to be None then dynamically set the value based on which flavor is passed.

We could do the same with n_iterations, though I'm a little on the fence about this.

I think I would definitely change directed, and add a note in the docs that setting n_iterations=2 will be faster and is the default for the underlying packages.

ilan-gold · 2024-02-19T15:47:43Z

Ah, you're right. Will change.

ivirshup

Great, looks good!

ilan-gold added 7 commits January 12, 2024 17:43

(feat): igraph as option for leiden

eba6a9a

(feat): add test for similarity

519cad3

(feat): migrate defaults to igraph

25b6705

(chore): add test for directed + igraph

00f5904

(chore): change expected images

7f46900

(fix): weights condition bug

e306ac3

Merge branch 'master' into igraph_leiden

642235d

ilan-gold added 2 commits January 16, 2024 17:02

(fix): change rank_genes_groups tolerance and update test images

5439d9d

Merge branch 'igraph_leiden' of github.com:ilan-gold/scanpy into igra…

2449148

…ph_leiden

(feat): new violin plot based on redone cluster assignments

2fe2b9a

ilan-gold commented Jan 16, 2024

scanpy/tests/notebooks/test_pbmc3k.py (outdated, resolved)

ilan-gold added 7 commits January 16, 2024 17:32

(chore): check parameters matching

a14b13e

(fix): handle import properly

8f3b169

(fix): handle partition_type with use_igraph

b89eaa0

(chore): remove unnecessary test args

f67225d

(chore): add test for old defaults

202787c

(chore): pre-commit?

d738092

(chore): pre-commit hooks run

2d8ab25

ilan-gold commented Jan 16, 2024

scanpy/tests/notebooks/test_pbmc3k.py (outdated, resolved)

ilan-gold commented Jan 16, 2024

ilan-gold added 2 commits January 16, 2024 18:09

(chore): make violin plot expected correct

ece40bf

(fix): change tol again for violin plots

b24d1c4

ilan-gold mentioned this pull request Jan 17, 2024

(feat): Update notebooks for new leiden defaults scverse/scanpy-tutorials#77

Merged

ilan-gold self-assigned this Jan 19, 2024

ilan-gold added Enhancement ✨ Area - Plotting 🌺 breaking change ‼️ labels Jan 19, 2024

ilan-gold added 8 commits February 16, 2024 14:16

(fix): correct category swapping

cc31a2e

(fix): need to reorder categories as well

935d34f

Merge branch 'master' into igraph_leiden

662f918

(fix): clean up simple tests

5df37d2

Merge branch 'igraph_leiden' of github.com:ilan-gold/scanpy into igra…

51f0a02

…ph_leiden

(fix): remove unnecessary cluster swap.

9f6b535

(fix): just use random state that gives same number of categories

102d128

(fix): use np.random instead of random module

84dd615

ilan-gold changed the title ~~(feat): igraph leiden implementation is now the default~~ (feat): igraph leiden implementation now included as an option in sc.tl.leiden Feb 16, 2024

flying-sheep reviewed Feb 16, 2024

scanpy/tests/test_clustering.py (outdated, resolved)

ilan-gold added 2 commits February 19, 2024 08:59

(chore): remove unnecessary comment in test about state

d6b1dff

Merge branch 'master' into igraph_leiden

f2db271

ivirshup reviewed Feb 19, 2024

scanpy/tools/_leiden.py (3 comments; outdated, resolved)

ilan-gold added 2 commits February 19, 2024 15:15

(refactor): simplify conditions

5e09532

(refactor): elif -> else when flavor already checked

579f005

(fix): move leiden import for test

1cefa19

(fix): revert unnecessary image changes

6247d76

ilan-gold requested a review from ivirshup February 19, 2024 15:38

ivirshup reviewed Feb 19, 2024

(chore): address comments

2549f61

ivirshup approved these changes Feb 19, 2024

ivirshup enabled auto-merge (squash) February 19, 2024 16:37

ivirshup mentioned this pull request Feb 19, 2024

Change default leiden clustering backend to igraph, and reduce default number of iterations #2865

Open

ivirshup merged commit 6ee18b9 into scverse:master Feb 19, 2024
13 checks passed

ilan-gold deleted the igraph_leiden branch February 19, 2024 18:54
