SDP Clustering •
- Community detection using fast low-cardinality semidefinite programming *
This repository contains the source code to reproduce the experiments in the NeurIPS'20 paper Community detection using fast low-cardinality semidefinite programming by Po-Wei Wang and J. Zico Kolter.
What the package provides
It detect communities (that is, clustering with unknown number of clusters) via maximizing a metric called modularity. Further, it provides sparse embeddings for nodes in a graph.
How it works
We relax the (combinatorial) modularity maximization problem to a smooth semidefinite program (SDP) by converting the Kronecker delta into a dot-product. By further controlling the cardinality (sparsity) in the dot-product space, we develop a efficient optimization algorithm that scales linearly with the number of data entries. See the paper for more details.
pip install sdp-clustering
git clone --recursive https://github.com/locuslab/sdp_clustering cd sdp_clustering && python setup.py install
conda install -c numpy scipy
After installation, the package provides a command-line utility locale_alg accepting matrix-market format. For example, to detect communities in Zachary Karate Club and output the result in labels.txt, run
locale_alg data/zachary.mtx --out labels.txt
To obtain the low-cardinality embedding (without rounding) with cardinality ≤2, run
locale_alg data/zachary.mtx --out emb.txt --embedding --k=2
All experiments can be replicated by the default parameters (k=8), except that the Amazon data requires k=16.
See bin/locale_alg for the example usage. Mainly, the package provides 3 functions
locale_embedding: obtain embeddings from the continuous optimization algorithm leiden_locale: obtain comminity assignments by the hierarchical Leiden-Locale algorithm init_random_seed: set random seed
For more details, see sdp_clustering/models.py. For even more details, see the Cpp implementation in the src/ folder.