nets-nodegroups - Network analysis of node group structures
nets-nodegroups is a network analysis tool that implements the node group extraction framework and introduces the group type parameter Tau for researching and exploring node group structures of various networks. The framework is capable of identifying and sequentially extracting significant node group structures, such as:
- hubs & spokes
- and many similar structures
Details of the algorithm and framework are published in:
- L. Šubelj, N. Blagus, and M. Bajec, "Group extraction for real-world networks: The case of communities, modules, and hubs and spokes," in Proc. of NetSci '13, 2013, p. 152.
- L. Šubelj, S. Žitnik, N. Blagus, and M. Bajec, “Node mixing and group structure of complex software networks,” Advs. Complex Syst., vol. 17, 2014.
The adopted network node group extraction framework extracts groups from a simple undirected graph sequentially. An optimization method (currently random-restart hill climbing) is used to maximize the group criterion W(S,T) and extract group S with the corresponding linking pattern T. After extraction edges between S and T are removed and the whole process repeated on the largest weakly-connected component until the group criterion W is larger than expected on a Erdös-Rényi random graph.
Open source project:
- home: http://gw.tnode.com/nets-nodegroups/
- github: http://github.com/gw0/nets-nodegroups/
- technology: C++, SNAP library
First compile the tool to get the executable
nodegroups (see below). To use it you need your file with graph edges and specify parameters you want to tweak:
-o: prefix for all file names (simplified usage) (default:
-i: input file with graph edges (undirected edge per line) (default:
-l: input file (optional) with node labels (node ID, node label) (default:
-og: output file with ST-group assignments (default:
-os: output file with only ST-group extraction summary (default:
-n: number of restarts of the optimization algorithm (default: 2000)
-sm: maximal number of steps in each optimization run (default: 100000)
-sw: stop optimization if no W improvement in steps (default: 1000)
-ss: initial random-sample size of S ant T (0=random) (default: 1)
-fn: finish after extracting so many groups (turn off random graphs) (default: 0)
-fw: finish if W smaller than top percentile on random graphs (default: 1)
-rg: random graphs (Erdos-Renyi) to construct for estimating W (default: 500)
-rn: random graph restarts of the optimization algorithm (default: 10)
-rf: random graph re-estimation of W if relative difference smaller (default: inf)
Example to read from
graph.edgelist and output to
$ ./nodegroups -o:graph
Same example but extract only first 12 groups (ignoring estimated W on random graphs):
$ ./nodegroups -o:graph -fn:12
graph.edgelist contains undirected graph edges (
0 1 0 2 1 2 2 3 3 4 3 5 ...
graph.groups contains extracted node groups S and linking patterns T (
# NId GroupS GroupT NLabel 0 0 0 foo 1 0 0 bar 2 0 0 foobar 3 1 -1 - 2 -1 1 - ...
graph.groupsreport contains a summary of extracted node groups (
# Graphs: 12 Nodes: 115 Edges: 613 N M N_S M_S N_T M_T N_ST M_ST L_ST L_STc W Tau Mod_S Mod_T Type 115 613 9 36 9 36 9 36 72 25 823.0000 1.0000 0.1352 0.1352 COM 115 582 9 36 9 36 9 36 72 30 818.0000 1.0000 0.1164 0.1164 COM 115 550 10 40 10 40 10 40 80 30 810.0000 1.0000 0.1191 0.1191 COM ...
Description of columns:
N: number of nodes left in graph
M: number of edges left in graph
N_S: number of nodes in subgraph on group S
M_S: number of edges in subgraph on group S
N_T: number of nodes in subgraph on linking pattern T
M_T: number of edges in subgraph on linking pattern T
N_ST: number of nodes in subgraph on intersection of S and T
M_ST: number of edges in subgraph on intersection of S and T
L_ST: number of edges L(S,T) between groups S and T
L_STc: number of edges L(S,Tc) between groups S and complement of T
W: group critetion W(S,T)
Tau: group type parameter Tau(S,T)
Mod_S: modularity measure on group S
Mod_T: modularity measure on linking pattern T
Type: human name for group type parameter Tau:
COM: community (S = T)
MOD: module (S intersection with T = 0)
HSD: hub&spokes module (module and |T| = 1)
MIX: mixture (otherwise)
CPX: core/periphery mixture (S subset of T or T subset of S)
Exit status codes:
-1: error, file not found or crash
0: success, groups extracted
1: no groups extracted
You will need to compile the source code to get a working tool. This should work on any platform using any standard C++ compiler (eg. GCC, Visual Studio), but it has only been extensively tested in the following environment.
First you need to download source code of net-nodegroups:
$ git clone http://github.com/gw0/nets-nodegroups.git $ cd ./nets-nodegroups
Download and compile SNAP library from console using GCC:
$ apt-get install build-essential g++ $ wget http://snap.stanford.edu/releases/Snap-2.1.zip $ unzip Snap-2.1.zip $ mv Snap-2.1 snap $ cd ./snap $ make all $ cd ..
Compile net-nodegroups from console using GCC:
$ cd ./src $ make all
Executable file is located in
This library is licensed under the GNU Affero General Public License 3.0+ (AGPL-3.0+). Note that it is mandatory to make all modifications and complete source code of this library publicly available to any user.