Skip to content

METIS, Balanced Label Propagation, Bayesian Sharding, and its optimization techniques

Notifications You must be signed in to change notification settings

SteveDengZishi/Graph-Partitioning-Research

Repository files navigation

                  Paper Name: Optimizing Iterative Algorithms for Social Network Sharding
                                Published Conference: IEEE Big Data 2021

           Student: Zishi Deng    Supervisor: Torsten Suel  Institution: New York University

Research Abstract There has recently been significant interest in applications that require computations on massive graph structures, including scenarios where the graph is too large to be processed on a single machine. In this case, the graph needs to be partitioned into subgraphs that can be assigned to individual machines, in a process called graph or social network sharding. Given the sizes of the graphs involved, it is necessary or at least highly desirable that the partitioning itself can also be performed in a distributed manner, instead of running a sequential partitioning algorithm on a single node.

We study such distributed algorithms for graph sharding, where the goal is to create subgraphs of roughly equal size that minimize the number of edges crossing subgraph boundaries. In particular, we focus on two well-known approaches that can be efficiently implemented in MapReduce and related distributed computing paradigms: the Balanced Label Propagation algorithm of Ugander and Backstrom, and the method of Duong et al. based on the Bayesian Stochastic Block Modeling approach of Hofman and Wiggins. Our contributions are as follows: (1) We perform the first direct experimental comparison of the two approaches, which were independently proposed and published. (2) We propose and evaluate several enhancements of Balanced Label Propagation that result in improved graph shardings. (3) We propose and evaluate hybrid methods that perform label propagation both on individual nodes, as suggested by Ugander and Backstrom, and on stochastic blocks inferred using the approach of Duong et al.

Repository Description

This git repository consists of five folders

Balanced_Label_Propagation_Deprecated
Community_Detection
Disruptive_BLP_AutoConverge
References
**Weighted_Graph_Optimized_BLP**

NOTE THAT: Weight_Graph_Optimized_BLP folder contains all up-to-date implementations for initializations, refinement and disruption techniques, including BLP, SBM, Probability based disruption, cluster based movement, KL Swaps, you should use these up-to-date implementations

Balanced_Label_Propagation consists of two parts, it was the first implementation but it is DEPRECATED!
1.Randomly initialized Balanced Label Propagation
2.METIS + BLP (Balanced Label Propagation run on top of METIS)

Disruptive_BLP_AutoConverge is the major improvements from Balanced_Label_Propagation
1.Improved mapping algorithm allow it to run faster on larger graph and partitions
2.It will run disruptive rounds when it is close to convergence (When improvement in locality between two rounds < 0.5%)
3.It will automatically stop the BLP process and return the highest locality achieved when no improvement can be achieved (Convergence arises and it is lower than the previous convergence point)

Community_Detection contains SBM methods for clustering and greedy map to blocks

Weighted_Graph_Optimized_BLP contains all up-to-date algorithms used in generating results in the paper as well as helper script to run multiple comparison with plotting. 

References contains all the relevant papers referred during this research 

Use Instruction

Please go into the respective sub-folders to read the instructions from there

About

METIS, Balanced Label Propagation, Bayesian Sharding, and its optimization techniques

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages