GitHub - premgopalan/svinet: Efficient discovery of overlapping communities in massive networks PK Gopalan, DM Blei Proceedings of the National Academy of Sciences 110 (36), 14534-14539

SVINET implements sampling based algorithms that derive from stochastic variational inference under the (assortative) mixed-membership stochastic blockmodel. For details see the following references:

Prem K. Gopalan, David M. Blei. Efficient discovery of overlapping communities in massive networks. To appear in the Proceedings of the National Academy of Sciences, 2013 (published ahead of print August 15, 2013, doi:10.1073/pnas.1221839110).

Article: http://www.pnas.org/content/early/2013/08/14/1221839110.full.pdf

SI: http://www.pnas.org/content/early/2013/08/14/1221839110/suppl/DCSupplemental

Installation

Required libraries: gsl, gslblas, pthread

On Linux/Unix run

./configure make; make install

SVINET has not been tested on Mac. Please run on Linux.

The binary 'svinet' will be installed in /usr/local/bin unless a different prefix is provided to configure. (See INSTALL.)

Tutorial

Prepare your network data as a tab-separated file (e.g., network.txt)
Run the following command to find the overlapping communities:

svinet -file network.txt -n 10000 -k 75 -link-sampling
Run the following command to visualize the communities:

cd output-dir; svinet -file ../network.txt -n 10000 -k 75 -gml

In step 2, "-n" specifies the number of nodes, "-k" specifies the number of communities and "-link-sampling" specifies the sampling method.

Step 2 writes out the communities in communities.txt, the model fit in gamma.txt and lambda.txt and the mixed-memberships in groups.txt.

Step 3 writes out a GML file (network.gml) that can be loaded into a tool, such as Gephi, to visualize the communities. Note that each node is colored by its most likely community in the visualization.

Some advanced tips

Estimating the number of communities

Run the following command setting the number of communities equal to the number of nodes:

svinet -file network.txt -n 10000 -k 10000 -findk

Estimate the number of communities using the following:

wc -l n10000-k10000-mmsb-findk-uniform/communities.txt

Specify this count as the number of communities in step 2 in the tutorial.
Comparing communities to the ground truth

If you have a text file with ground truth community labels, you can specify it in step 2 above, to compute a normalized mutual information score between the true communities and the inferred communities. Run the command in step 2 as follows:

svinet -file network.txt -n 10000 -k 75 -link-sampling -nmi community.txt

Each line of the ground truth community file is a tab-separated list of integers including a node and its list of communities.

node-id list of communities the node is a member of

Example: 65 17 22 43 54

The above line says node with id 65 is a member of communities 17, 22, 43, 54. The community ids are arbitrary. See detailed_readme.txt.
Comparing communities to results from other methods

We recommend running svinet with two settings of the link threshold as follows:

svinet -file network.txt -n 10000 -k 75 -link-sampling -link-thresh 0.5

svinet -file network.txt -n 10000 -k 75 -link-sampling -link-thresh 0.9

For further details, see detailed_readme.txt or please email the authors.
Set an alternate prior on community strengths

The "fromdata" option sets a prior based on the network data on the community strengths and may result in better communities. By default, the prior on community strenths is a uniform distribution.

svinet -file network.txt -n 10000 -k 75 -link-sampling -eta-type fromdata
Other sampling methods

See detailed_readme.txt.

Name		Name	Last commit message	Last commit date
Latest commit History 39 Commits
example		example
patches		patches
src		src
AUTHORS		AUTHORS
COPYING		COPYING
ChangeLog		ChangeLog
INSTALL		INSTALL
Makefile.am		Makefile.am
Makefile.in		Makefile.in
NEWS		NEWS
README		README
README.md		README.md
aclocal.m4		aclocal.m4
config.guess		config.guess
config.h.in		config.h.in
config.sub		config.sub
configure		configure
configure.ac		configure.ac
depcomp		depcomp
detailed_readme.txt		detailed_readme.txt
install-sh		install-sh
missing		missing
mkinstalldirs		mkinstalldirs
validation-test-file-options-patch.txt		validation-test-file-options-patch.txt

License

premgopalan/svinet

Folders and files

Latest commit

History

Repository files navigation

Installation

Tutorial

Some advanced tips

About

Resources

License

Stars

Watchers

Forks

Languages