ECG Community Detection implementation #502

Merged
Krastanov merged 10 commits into JuliaGraphs:master from ryandewolfe33:ECG
Apr 27, 2026

Conversation

@ryandewolfe33
Contributor

Another community detection algorithm, see #231

Ensemble Clustering for Graphs (ECG) uses many runs of Louvain (see #488) to improve performance and stability.
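To make the ensemble idea concrete, here is a minimal Python sketch of the reweighting step. It is not the code in this PR and not the Graphs.jl API. It assumes the partitions have already been produced by repeated Louvain runs, and it assumes the usual ECG weighting: a floor weight plus the fraction of partitions in which both endpoints share a community. The names `ecg_edge_weights`, `min_weight`, and `core` are hypothetical.

```python
def ecg_edge_weights(edges, partitions, min_weight=0.05, core=None):
    """Return {edge: weight} from an ensemble of partitions.

    edges      -- iterable of (u, v) pairs
    partitions -- list of dicts mapping vertex -> community id
    min_weight -- floor applied to every edge (hypothetical default)
    core       -- optional set of vertices; an edge with an endpoint
                  outside it keeps only the floor weight
    """
    if not partitions:
        raise ValueError("need at least one partition")
    weights = {}
    for u, v in edges:
        if core is not None and (u not in core or v not in core):
            weights[(u, v)] = min_weight
            continue
        # Count the partitions that place both endpoints together.
        agree = sum(1 for p in partitions if p[u] == p[v])
        fraction = agree / len(partitions)
        weights[(u, v)] = min_weight + (1.0 - min_weight) * fraction
    return weights


# Two triangles joined by the bridge (2, 3); three hand-written partitions
# stand in for the Louvain runs.
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
partitions = [
    {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1},
    {0: 0, 1: 0, 2: 0, 3: 0, 4: 1, 5: 1},
    {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1},
]
weights = ecg_edge_weights(edges, partitions)
for edge, w in weights.items():
    print(edge, round(w, 3))
```

Edges on which the runs consistently agree end up near weight 1, while the bridge is down-weighted. A final clustering pass on the reweighted graph can then rely on the stable edges.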

@github-actions
Contributor

github-actions Bot commented Mar 8, 2026

Benchmark Results (Julia v1)

Time benchmarks
| Benchmark | master | d9cd585... | master / d9cd585... |
|---|---|---|---|
| centrality/digraphs/betweenness_centrality | 18.8 ± 0.53 ms | 19.1 ± 0.4 ms | 0.984 ± 0.035 |
| centrality/digraphs/closeness_centrality | 12.7 ± 1.2 ms | 13.6 ± 0.55 ms | 0.935 ± 0.099 |
| centrality/digraphs/degree_centrality | 2.31 ± 0.41 μs | 2.24 ± 0.23 μs | 1.03 ± 0.21 |
| centrality/digraphs/katz_centrality | 0.913 ± 0.056 ms | 0.923 ± 0.057 ms | 0.989 ± 0.087 |
| centrality/digraphs/pagerank | 0.0381 ± 0.00064 ms | 0.038 ± 0.00078 ms | 1 ± 0.027 |
| centrality/graphs/betweenness_centrality | 30.2 ± 2.2 ms | 30.7 ± 2.2 ms | 0.984 ± 0.1 |
| centrality/graphs/closeness_centrality | 22.1 ± 0.67 ms | 22.2 ± 0.64 ms | 0.995 ± 0.042 |
| centrality/graphs/degree_centrality | 1.77 ± 0.22 μs | 1.74 ± 0.23 μs | 1.02 ± 0.18 |
| centrality/graphs/katz_centrality | 1.08 ± 0.063 ms | 1.09 ± 0.063 ms | 0.987 ± 0.081 |
| connectivity/digraphs/strongly_connected_components | 0.0462 ± 0.0025 ms | 0.0459 ± 0.0025 ms | 1.01 ± 0.077 |
| connectivity/graphs/connected_components | 26.4 ± 0.9 μs | 26.4 ± 0.98 μs | 1 ± 0.051 |
| core/edges/digraphs | 8 ± 0.01 μs | 8.01 ± 0.001 μs | 0.999 ± 0.0013 |
| core/edges/graphs | 14.5 ± 0.03 μs | 14.5 ± 0.03 μs | 1 ± 0.0029 |
| core/has_edge/digraphs | 6.54 ± 1.7 μs | 6.1 ± 0.55 μs | 1.07 ± 0.3 |
| core/has_edge/graphs | 9.58 ± 0.9 μs | 6.46 ± 0.56 μs | 1.48 ± 0.19 |
| core/nv/digraphs | 0.361 ± 0.02 μs | 0.351 ± 0.011 μs | 1.03 ± 0.065 |
| core/nv/graphs | 0.381 ± 0.02 μs | 0.37 ± 0.01 μs | 1.03 ± 0.061 |
| distance/weighted_diameter/barabasi_albert_naive | 0.0339 ± 0.0016 s | 0.0338 ± 0.001 s | 1 ± 0.057 |
| distance/weighted_diameter/barabasi_albert_optimized | 3.02 ± 0.053 ms | 4.36 ± 0.1 ms | 0.693 ± 0.02 |
| distance/weighted_diameter/erdos_renyi_naive | 0.0337 ± 0.0012 s | 0.0341 ± 0.0018 s | 0.988 ± 0.062 |
| distance/weighted_diameter/erdos_renyi_optimized | 12.9 ± 1.5 ms | 16.4 ± 1.6 ms | 0.788 ± 0.12 |
| edges/fille | 6.61 ± 0.85 μs | 6.55 ± 1 μs | 1.01 ± 0.2 |
| edges/fillp | 5.2 ± 2.3 μs | 6.35 ± 1.3 μs | 0.819 ± 0.41 |
| edges/tsume | 2.17 ± 0.021 μs | 2.16 ± 0.02 μs | 1 ± 0.013 |
| edges/tsump | 2.16 ± 0.019 μs | 2.17 ± 0.02 μs | 0.995 ± 0.013 |
| insertions/SG(n,e) Generation | 28.5 ± 4.2 ms | 28.6 ± 4.3 ms | 0.996 ± 0.21 |
| parallel/egonet/twohop | 0.386 ± 0.022 s | 0.449 ± 0.0073 s | 0.859 ± 0.051 |
| parallel/egonet/vertexfunction | 3.86 ± 1.2 ms | 4.78 ± 0.25 ms | 0.806 ± 0.26 |
| serial/egonet/twohop | 0.38 ± 0.0048 s | 0.435 ± 0.013 s | 0.872 ± 0.028 |
| serial/egonet/vertexfunction | 3.88 ± 0.41 ms | 4.56 ± 0.24 ms | 0.852 ± 0.1 |
| traversals/digraphs/bfs_tree | 0.0563 ± 0.029 ms | 0.0563 ± 0.028 ms | 1 ± 0.71 |
| traversals/digraphs/dfs_tree | 0.0697 ± 0.03 ms | 0.07 ± 0.028 ms | 0.996 ± 0.58 |
| traversals/graphs/bfs_tree | 0.0613 ± 0.0062 ms | 0.0614 ± 0.0063 ms | 0.997 ± 0.14 |
| traversals/graphs/dfs_tree | 0.0726 ± 0.012 ms | 0.0737 ± 0.012 ms | 0.986 ± 0.23 |
| time_to_load | 0.567 ± 0.0049 s | 0.569 ± 0.043 s | 0.996 ± 0.076 |
Memory benchmarks
| Benchmark | master | d9cd585... | master / d9cd585... |
|---|---|---|---|
| centrality/digraphs/betweenness_centrality | 0.29 M allocs: 24 MB | 0.29 M allocs: 24 MB | 1 |
| centrality/digraphs/closeness_centrality | 18.6 k allocs: 14.5 MB | 18.6 k allocs: 14.5 MB | 1 |
| centrality/digraphs/degree_centrality | 8 allocs: 5.01 kB | 8 allocs: 5.01 kB | 1 |
| centrality/digraphs/katz_centrality | 2.63 k allocs: 2.83 MB | 2.63 k allocs: 2.83 MB | 1 |
| centrality/digraphs/pagerank | 21 allocs: 14.9 kB | 21 allocs: 14.9 kB | 1 |
| centrality/graphs/betweenness_centrality | 0.545 M allocs: 0.0313 GB | 0.545 M allocs: 0.0313 GB | 1 |
| centrality/graphs/closeness_centrality | 19.3 k allocs: 14 MB | 19.3 k allocs: 14 MB | 1 |
| centrality/graphs/degree_centrality | 10 allocs: 5.43 kB | 10 allocs: 5.43 kB | 1 |
| centrality/graphs/katz_centrality | 2.96 k allocs: 3.1 MB | 2.96 k allocs: 3.1 MB | 1 |
| connectivity/digraphs/strongly_connected_components | 1.05 k allocs: 0.075 MB | 1.05 k allocs: 0.075 MB | 1 |
| connectivity/graphs/connected_components | 0.061 k allocs: 22.5 kB | 0.061 k allocs: 22.5 kB | 1 |
| core/edges/digraphs | 3 allocs: 0.0938 kB | 3 allocs: 0.0938 kB | 1 |
| core/edges/graphs | 3 allocs: 0.0938 kB | 3 allocs: 0.0938 kB | 1 |
| core/has_edge/digraphs | 20 allocs: 12.6 kB | 20 allocs: 12.6 kB | 1 |
| core/has_edge/graphs | 28 allocs: 13.8 kB | 28 allocs: 13.8 kB | 1 |
| core/nv/digraphs | 3 allocs: 0.0938 kB | 3 allocs: 0.0938 kB | 1 |
| core/nv/graphs | 3 allocs: 0.0938 kB | 3 allocs: 0.0938 kB | 1 |
| distance/weighted_diameter/barabasi_albert_naive | 13.5 k allocs: 13.8 MB | 13.5 k allocs: 13.8 MB | 1 |
| distance/weighted_diameter/barabasi_albert_optimized | 1.86 k allocs: 1.31 MB | 2.4 k allocs: 1.87 MB | 0.704 |
| distance/weighted_diameter/erdos_renyi_naive | 13.5 k allocs: 13.8 MB | 13.5 k allocs: 13.8 MB | 1 |
| distance/weighted_diameter/erdos_renyi_optimized | 5.96 k allocs: 5.51 MB | 7.3 k allocs: 6.89 MB | 0.799 |
| edges/fille | 3 allocs: 0.153 MB | 3 allocs: 0.153 MB | 1 |
| edges/fillp | 3 allocs: 0.153 MB | 3 allocs: 0.153 MB | 1 |
| edges/tsume | 0 allocs: 0 B | 0 allocs: 0 B | |
| edges/tsump | 0 allocs: 0 B | 0 allocs: 0 B | |
| insertions/SG(n,e) Generation | 0.0465 M allocs: 10.9 MB | 0.0465 M allocs: 10.9 MB | 1 |
| parallel/egonet/twohop | 10 allocs: 0.0768 MB | 10 allocs: 0.0768 MB | 1 |
| parallel/egonet/vertexfunction | 10 allocs: 0.0768 MB | 10 allocs: 0.0768 MB | 1 |
| serial/egonet/twohop | 3 allocs: 0.0764 MB | 3 allocs: 0.0764 MB | 1 |
| serial/egonet/vertexfunction | 3 allocs: 0.0764 MB | 3 allocs: 0.0764 MB | 1 |
| traversals/digraphs/bfs_tree | 2.34 k allocs: 0.113 MB | 2.34 k allocs: 0.113 MB | 1 |
| traversals/digraphs/dfs_tree | 2.44 k allocs: 0.118 MB | 2.44 k allocs: 0.118 MB | 1 |
| traversals/graphs/bfs_tree | 2.52 k allocs: 0.121 MB | 2.52 k allocs: 0.121 MB | 1 |
| traversals/graphs/dfs_tree | 2.63 k allocs: 0.127 MB | 2.63 k allocs: 0.127 MB | 1 |
| time_to_load | 0.145 k allocs: 11 kB | 0.145 k allocs: 11 kB | 1 |

@codecov

codecov Bot commented Mar 8, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 97.46%. Comparing base (bafd56e) to head (d9cd585).

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #502      +/-   ##
==========================================
+ Coverage   97.31%   97.46%   +0.15%     
==========================================
  Files         126      127       +1     
  Lines        7739     7766      +27     
==========================================
+ Hits         7531     7569      +38     
+ Misses        208      197      -11     

Member

@LoveLow-Global LoveLow-Global left a comment

Thanks for working on this! ECG is a great addition to the community detection features.

I've left a few comments. The two main things to check are that the ensemble weighting applies only to the 2-core of the graph, and a small bug with undirected self-loops.
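For readers unfamiliar with the term, the 2-core is the largest subgraph in which every vertex has degree at least 2. It can be found by repeatedly removing vertices of degree below 2. The following Python sketch shows that pruning and is not the code under review. The function name `k_core_vertices` is hypothetical, and ignoring self-loops when counting degree is an assumption of this sketch, not a statement about how the PR handles them.

```python
from collections import defaultdict


def k_core_vertices(edges, k=2):
    """Return the set of vertices in the k-core of an undirected graph."""
    adj = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue  # assumption: self-loops do not count toward degree
        adj[u].add(v)
        adj[v].add(u)

    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    stack = [v for v, d in degree.items() if d < k]
    removed = set()
    while stack:
        v = stack.pop()
        if v in removed:
            continue
        removed.add(v)
        # Removing v lowers the degree of each surviving neighbour.
        for w in adj[v]:
            if w not in removed:
                degree[w] -= 1
                if degree[w] < k:
                    stack.append(w)
    return set(adj) - removed


# Triangle 0-1-2 with a tail 2-3-4 and a self-loop on vertex 4.
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 4)]
core = k_core_vertices(edges, k=2)
print(sorted(core))
```

Only the triangle survives the pruning here. Under the ECG scheme, edges touching the tail vertices would keep just the minimum weight.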

As this is my first review since becoming a reviewer here, I may not have given the best review possible. I apologize in advance; please let me know if there is anything I can improve on my side.

Comment thread src/community/ecg.jl Outdated
Comment thread test/community/ecg.jl
Comment thread src/community/ecg.jl
@ryandewolfe33
Contributor Author

Thanks for the helpful review! It's only my ~3rd pull request so I don't feel qualified to say much about your process, but it seems good to me.

Comment thread src/community/ecg.jl Outdated
@LoveLow-Global
Member

And thank you for the kind comments, I really appreciate it! I will try to improve more in the future.

@LoveLow-Global
Member

Thank you so much for the great contribution. I believe it is ready to be merged!

@LoveLow-Global
Member

@Krastanov Hello, could you check the workflow before merging and confirm that the PR is ready to be merged? It is my first time reviewing a PR and I want to make sure. Thank you!

@Krastanov
Member

Thanks both, for the contribution and for the review!

I want to briefly discuss two questions of API design before we merge it (it otherwise seems ready to merge):

  • What are the pros and cons of ensemble_clustering vs ecg as names? ensemble_clustering seems a bit more legible to me, especially to someone outside of the field.
  • Is the *_weights function meant to be public? Do other clustering algorithms implemented here have corresponding *_weights functions or something equivalent?

And a question for future work: Given that we now have a few different types of clustering algorithms, does it make sense to start discussing how these can be organized together, e.g. by some general Clustering return type and some clustering(g, ::AlgorithmType) API?

@ryandewolfe33
Contributor Author

Hi Krastanov, thanks for the comments.

  • My interpretation is that ensemble_clustering is more of a meta-clustering paradigm, whereas ecg is the name of this specific ensemble method (with louvain the specific reweighting). I could see a future with a generic ensemble method that uses a different or many internal algorithms.

  • I'm not sure; I'd be open to making it private. I do know that the returned weights can be quite useful (in layouts, for example; see this notebook for some ideas).


I think discussion around a standardized type makes sense, but I'm not sufficiently comfortable with Julia to lead that. I will say that everything right now uses arrays with entries corresponding to community id as the return type, which I think is fine for partitions (every vertex gets exactly one community). Most popular algorithms use partitions (louvain, leiden, infomap, label propagation, some spectral things) so I think it would be fine to standardize around a partition as a result.

There are generalizations with outliers (a vertex can have no communities), or overlaps (a vertex can have multiple communities), or fuzzy membership (e.g. 50% in each of two communities) where the return type would have to be different, but at that point it could make sense as its own package.
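To illustrate why these cases need different return types, here is a small Python sketch of the four shapes described above. None of it reflects an existing Graphs.jl type. The variable names and the helpers `communities_from_partition` and `is_partition` are hypothetical, and vertices are 0-indexed here purely for convenience.

```python
from collections import defaultdict

# Partition: exactly one community id per vertex, indexed by vertex.
partition = [0, 0, 1, 1, 1]

# Outliers: a vertex may belong to no community at all.
with_outliers = [0, 0, None, 1, 1]

# Overlaps: a vertex may belong to several communities.
overlapping = [{0}, {0}, {0, 1}, {1}, {1}]

# Fuzzy membership: each vertex carries a weight per community.
fuzzy = [{0: 1.0}, {0: 1.0}, {0: 0.5, 1: 0.5}, {1: 1.0}, {1: 1.0}]


def communities_from_partition(labels):
    """Group vertices by community id: {community: [vertices]}."""
    groups = defaultdict(list)
    for vertex, community in enumerate(labels):
        groups[community].append(vertex)
    return dict(groups)


def is_partition(memberships):
    """True when every vertex belongs to exactly one community."""
    return all(len(m) == 1 for m in memberships)


print(communities_from_partition(partition))
print(is_partition(overlapping))
```

A flat array of ids covers only the first case. The other three need an optional value, a set, or a weighted map per vertex, which is the reason a single standardized return type is easiest to agree on for partitions.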

I don't have plans for adding any more community detection algorithms right now, so maybe we can move this to an issue where it's more visible and better for a longer discussion.

@Krastanov Krastanov merged commit bb8456e into JuliaGraphs:master Apr 27, 2026
16 checks passed
@Krastanov
Member

Thanks, Ryan! Thanks Jihyung for the review!
