Join GitHub today
GitHub is home to over 28 million developers working together to host and review code, manage projects, and build software together.Sign up
Fetching latest commit…
Cannot retrieve the latest commit at this time.
|Type||Name||Latest commit message||Commit time|
|Failed to load latest commit information.|
Anti-Community Detection ######################## Installation ============ As a first step, please make sure that all required dependencies are installed. The following commands can be used to install missing packages on Ubuntu or Debian operating systems: $ sudo apt-get install build-essential $ sudo apt-get install libgsl-dev Afterwards, just extract the source code to a directory of your choice and run: $ make If everything goes well, the program terminates with exitcode 0. If not, the error message should hopefully contain a hint what is going wrong. All compiled programs will be located within the src/ and third-party/ directory. Self-Test ========= After the compilation finished, it makes sense to run the included self-tests to check that everything works fine. The tests are all located in the src/ folder, so we first have to switch to the right directory: $ cd src To run the self-tests of the library itself, execute: $ make test To run tests for the Python bindings, execute: $ ./pygraph.py Graph File Format ================= Most programs contained in this package require graph data with the following input format: --- snip --- NumberOfNodes StartNode EndNode [Weight] \ StartNode EndNode [Weight] } Edges StartNode EndNode [Weight] / [...] --- snip --- The first line defines the number of nodes, all following lines describe the edges (first node index, second node index, optional weight). Lines starting with "#" are comments and have to be ignored. An example for a trivial graph can be seen in src/data/example-trivial.graph. Programs ======== The current version of the package contains the following programs. Each program also has a separate help, which can be shown by running "PROGRAM --help", e.g., "src/laprop --help". $ src/gml Convert graph to GML file format. $ src/laprop Apply label propagation algorithm to the graph specified by the given file. $ src/maxmodularity Agglomerative hierarchical clustering algorithm for community detection / modularity maximization. $ src/minmodularity Agglomerative hierarchical clustering algorithm for anti- community detection / modularity minimization. $ src/antimodularity Agglomerative hierarchical clustering algorithm for anti- community detection / anti-modularity maximization. $ src/components Return connected components of an undirected graph. $ src/metrics Compute modularity and anti-modularity for the graph specified by the given file. When a ground-truth label file is given, also compute cluster quality metrics. $ src/subgraph Extract subgraph specified by the given files. Datasets ======== The datasets directory contains a collection of various small graphs. The graphs already have been converted to the correct format. Each directory contains a file called "result.graph". After changing the implementation of an algorithm, it is sufficient to execute: $ cd datasets $ make clean $ make to run all evaluations again. This might take a few minutes. In each directory, a "metrics.txt" file will be created, which contains common measures (such as the number of communities, Modularity, Antimodularity, etc.) and can be used to compare different algorithms. Usage Example ============= Community Detection ------------------- As an example, the following commandline applies a Modularity minimization algorithm to the Adjectives and Nouns dataset. $ src/minmodularity datasets/adjectives-noun/result.graph By default, the output (a vector of vertex labels) is written directly to the terminal. Nevertheless, if preferred, it can also be redirected to a file: $ src/minmodularity datasets/adjectives-noun/result.graph > output.txt Note that some programs might have additional parameters, which are passed through to the (anti-)community detection method. Please check "PROGRAM --help" to get a list of available options. Evaluation ---------- To compute the modularity and anti-modularity measure, as well as some other statistical values for a community detection result, just run: $ ./src/metrics --clusters=datasets/adjectives-noun/minmodularity.txt \ $ --groundtruth=datasets/adjectives-noun/ground-truth.txt \ $ datasets/adjectives-noun/result.graph The output will contain one measure per line. To write the result to a file, the operator ">" can be used, as in the example above. Plotting -------- To create plots suitable for publication, we recommend the tool Gephi available from https://gephi.org/. As a prepartion, use the "gml" program to create a file containing all attributes required for plotting, for example: $ ./src/gml --base=datasets/adjectives-noun/ground-truth.txt \ $ --algo=datasets/adjectives-noun/minmodularity.txt \ $ datasets/adjectives-noun/result.graph > import.gml Note that you can add an arbitrary number of attributes by adding additional --NAME=FILENAME arguments to the commandline. When you have selected all attributes you need, import the output file (here "import.gml") into Gephi. To color vertices based on an attribute, select the menu "Partition" in the side bar "Appearance" on the left. Choose one of the attributes (e.g., "algo"), and then click on "Apply" to confirm your changes. Arranging the vertices based on attributes requires a bit more effort: Create a new filter "Library -> Attributes -> Partition -> base" using the side bar "Filters" on the right. Then select one partition at a time, click on the "Filter" button in the bottom right corner, and apply a Layout method of your choice (e.g., "Fruchterman Reingold"). When this is done, use the "Drag" cursor to move the cluster to a location of your choice, and proceed with the next partition. Note that you can change the selection radius by clicking on "Configure" in the top part of the "Graph" view. (If there is an easier way to achieve a similar result, please let me know!) References ========== - S. Lackner, A. Spitz, M. Weidemüller, and M. Gertz. 2018. Efficient Anti-community Detection in Complex Networks. Proceedings of the 30th International Conference on Scientific and Statistical Database Management (SSDBM '18), July 9-11, 2018, Bozen-Bolzano, Italy. ACM, NY, USA. https://doi.org/10.1145/3221269.3221289 - L. Chen, Q. Yu, and B. Chen. 2014. Anti-modularity and anti-community detecting in complex networks. Inf. Sci. 275 (2014), 293–313. https://doi.org/10.1016/j.ins.2014.02.040 - M. E. J. Newman. 2004. Fast algorithm for detecting community structure in networks. Phys. Rev. E 69, 6 (2004), 5. https://doi.org/10.1103/PhysRevE.69.066133 - M. E. J. Newman and M. Girvan. 2004. Finding and evaluating community structure in networks. Phys. Rev. E 69, 2 (2004), 15. https://doi.org/10.1103/PhysRevE.69.026113 - M. E. J. Newman and G. Reinert. 2016. Estimating the number of communities in a network. Phys. Rev. Lett. 117 (2016), 5. Issue 7. https://doi.org/10.1103/PhysRevLett.117.078301