GitHub - puzzlef/slpa-communities: Single-threaded CPU-based Speaker-listener Label Propagation Algorithm (SLPA) for community detection.

Single-threaded CPU-based Speaker-listener Label Propagation Algorithm (SLPA) for community detection.

I was trying out the Speaker-listener Label Propagation Algorithm (SLPA) which i had implemented yesterday (results out today). This is one of the algorithms given in literature review of LabelRank. Here each node has a fixed memory where it remembers all the popular labels that it listened. The size of this memory is fixed to the total number of iterations to be performed + 1. The algorithm is as follows.

Each vertex is initialized such that it remembers itself as popular.
Each neighbor speaks one of the random labels in its memory.
The vertex (listener) adds the most popular label to its memory.
Repeat from 2 until a fixed number of iterations is performed (labels - 1).
I allow early convergence if at least n% of vertices remember their previous label.
For each vertex, i pick the most popular label in its memory as its community.

Unlike RAK (also called LPA) and COPRA, this is a randomized historical-label based technique. A random number generator (RNG) is used by each neighbor of a vertex to pick a random label to speak from its memory. I originally used C++'s default random engine (which is probably mersenne twister) which seems slow. So i updated it to use a xorshift32 random engine. The listener vertex listens to all its neigbors, and based on edge weight, picks the most popular label. In strict mode, the listener pick the first most popular label as to record in its memory, and in non-strict mode, the listener pick randomly one of the most popular labels to record. It records this in the next index in its memory. The authors of this paper suggest to perform a fixed number of iterations (which is one less than the space we have reserved for storing the labels in the memory of each vertex). However, if the latest popular label (for a vertex) is same as the previous one in its memory, i consider it as a sign of convergence. So, when n% of vertices show this sign of convergence (tolerance), i perform an early exit. By default, a tolerance of 0.05 is used. As we want to find disjoint communities, i use the most popular label in the memory of each vertex as its final community.

The experiment is run with labels (memory space of each vertex) ranging from 4 to 64.

It appears that increasing the number of labels increases the time required for completion (as expected), and also increases the final modularity. However, on graphs soc-Slashdot* modularity seems to decrease with increasing labels.

It is also observed that the strict variant is slightly faster than the non-strict one, but achieves similar modularity. On road networks it seems strict approach obtains lower modularity, and on other types of graphs, strict approach achieves higher modularity.

However, its seems that this method of community detection does not provide a good return on investment (it takes longer but does not achieve great modularity). I do not perform any post-processing of communities as of now (such as splitting disconnected communities).

All outputs are saved in a gist and a small part of the output is listed here. Some charts are also included below, generated from sheets. The input data used for this experiment is available from the SuiteSparse Matrix Collection. This experiment was done with guidance from Prof. Kishore Kothapalli and Prof. Dip Sankar Banerjee.

$ g++ -std=c++17 -O3 main.cxx
$ ./a.out ~/data/web-Stanford.mtx
$ ./a.out ~/data/web-BerkStan.mtx
$ ...

# Loading graph /home/subhajit/data/web-Stanford.mtx ...
# order: 281903 size: 2312497 [directed] {}
# order: 281903 size: 3985272 [directed] {} (symmetricize)
# [-0.000497 modularity] noop
# [00337.151 ms; 0004 iters.; 0.474440932 modularity] slpaSeqStatic       {labels=04, tolerance=5e-02}
# [00304.351 ms; 0004 iters.; 0.479095578 modularity] slpaSeqStaticStrict {labels=04, tolerance=5e-02}
# [00901.554 ms; 0008 iters.; 0.752845287 modularity] slpaSeqStatic       {labels=08, tolerance=5e-02}
# [00866.540 ms; 0008 iters.; 0.679729104 modularity] slpaSeqStaticStrict {labels=08, tolerance=5e-02}
# [02229.548 ms; 0016 iters.; 0.826291323 modularity] slpaSeqStatic       {labels=16, tolerance=5e-02}
# [02193.593 ms; 0016 iters.; 0.789462328 modularity] slpaSeqStaticStrict {labels=16, tolerance=5e-02}
# [03706.551 ms; 0032 iters.; 0.847930729 modularity] slpaSeqStatic       {labels=32, tolerance=5e-02}
# [03655.065 ms; 0032 iters.; 0.817599833 modularity] slpaSeqStaticStrict {labels=32, tolerance=5e-02}
# [07928.522 ms; 0064 iters.; 0.857062638 modularity] slpaSeqStatic       {labels=64, tolerance=5e-02}
# [07843.098 ms; 0064 iters.; 0.826945186 modularity] slpaSeqStaticStrict {labels=64, tolerance=5e-02}
#
# ...

Name		Name	Last commit message	Last commit date
Latest commit History 17 Commits
src		src
.gitignore		.gitignore
CITATION.cff		CITATION.cff
LICENSE		LICENSE
README.md		README.md
main.cxx		main.cxx
main.py		main.py
main.sh		main.sh
process.js		process.js

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

src

src

.gitignore

.gitignore

CITATION.cff

CITATION.cff

LICENSE

LICENSE

README.md

README.md

main.cxx

main.cxx

main.py

main.py

main.sh

main.sh

process.js

process.js

Repository files navigation

References

About

Languages

License

puzzlef/slpa-communities

Folders and files

Latest commit

History

Repository files navigation

References

About

Resources

License

Stars

Watchers

Forks

Languages