ECG Community Detection implementation#502
Conversation
Benchmark Results (Julia v1)Time benchmarks
Memory benchmarks
|
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #502 +/- ##
==========================================
+ Coverage 97.31% 97.46% +0.15%
==========================================
Files 126 127 +1
Lines 7739 7766 +27
==========================================
+ Hits 7531 7569 +38
+ Misses 208 197 -11 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
LoveLow-Global
left a comment
There was a problem hiding this comment.
Thanks for working on this! Adding ECG is a great feature for the community detection.
I've left a few comments. The two main things to check on are ensuring the ensemble weighting only applies to the 2-core of the graph and a small bug with undirected self-loops.
As this is my first review after being a reviewer here, I may have not given the best review possible. I apologize in advance for this issue, please let me know if there is anything to be improved from my side.
|
Thanks for the helpful review! It's only my ~3rd pull request so I don't feel qualified to say much about your process, but it seems good to me. |
|
And thank you the kind comments, I really appreciate it! I will try to improve more in the future. |
|
Thank you so much for the great contribution. I believe it is ready to be merged! |
|
@Krastanov Hello, I wonder if you can check on the workflow before merging, as well as if it is ready to be merged, as it is my first time reviewing a PR and I want to make it sure. Thank you! |
|
Thanks both, for the contribution and for the review! I want to briefly discuss to questions of API design before we merge it (it otherwise seems ready for merge):
And a question for future work: Given that we now have a few different types of clustering algorithms, does it make sense to start discussing how these can be organized together, e.g. by some general |
|
Hi Krastanov, thanks for the comments.
I think discussion around a standardized type makes sense, but I'm not sufficiently comfortable with Julia to lead that. I will say that everything right now uses arrays with entries corresponding to community id as the return type, which I think is fine for partitions (every vertex gets exactly one community). Most popular algorithms use partitions (louvain, leiden, infomap, label propagation, some spectral things) so I think it would be fine to standardize around a partition as a result. There are generalizations with outliers (a vertex can have no communities), or overlaps (a vertex can have multiple communities), or fuzzy membership (e.g. 50% in each of two communities) where the return type would have to be different, but at that point it could make sense as it's own package. I don't have plans for adding any more community detection algorithms right now, so maybe we can move this to an issue where it's more visible and better for a longer discussion. |
|
Thanks, Ryan! Thanks Jihyung for the review! |
Another community detection algorithm, see #231
Ensemble Clustering for Graphs (ECG) uses many runs of Louvain (see #488 ) to improve performance and stability.