Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Algorithms for in-partition connected components #2

Closed
vasia opened this issue Nov 12, 2015 · 0 comments
Closed

Algorithms for in-partition connected components #2

vasia opened this issue Nov 12, 2015 · 0 comments

Comments

@vasia
Copy link
Owner

vasia commented Nov 12, 2015

For the in-partition connected components algorithm, you can experiment with the following approaches:

  1. Store the whole partition as a graph structure (adjacency list / matrix) and find connected components using BFS/DFS iteratively for each vertex. This is the approach used in the Giraph++ paper.

  2. Use a semi-streaming-like graph algorithm. In this model, your partition operator will iterate over "augmented" edges, where the edge source vertex has an associated value (the component Id). You can create this type of records with a join of your workset (vertices) and edges. The algorithm works as follows:

  • For each partition, you maintain a disjoint set data structure for the partitions' components. For each edge, check if its endpoints already belong to an existing component. If not, create a component with id the smallest of the vertex Ids. If they belong to different components, merge them and update the component Id to the smallest of the 2 components Ids. If only one of the endpoints belongs to a component, add the other one to the same component and update the component Id if necessary.

Consider the following edges input in a partition: (3, 5), (3, 4), (1, 3), (2, 4), (5, 1), (6, 7), (5, 6). The algorithm proceeds as follows (assume each vertex has its own Id as initial component Id):

edge (3, 5) => create a new component and add the vertices

Component id Vertices
3 3, 5

edge (3, 4) => add 4 to the component

Component id Vertices
3 3, 5, 4

edge (1, 3) => add 1 to the component and update the component Id to min(1, 3)=1

Component id Vertices
1 3, 5, 4, 1

edge (2, 4) => add 2 to the component

Component id Vertices
1 3, 5, 4, 1, 2

edge (5, 1) => both vertices already in the component

Component id Vertices
1 3, 5, 4, 1, 2

edge (6, 7) => create new component with Id=6 and add both vertices

Component id Vertices
1 3, 5, 4, 1, 2
6 6, 7

edge (5, 6) => 5 belongs to comp=1, 6 belongs to comp=6 => merge the components and update the component Id to min(1, 6)=1.

Component id Vertices
1 3, 5, 4, 1, 2, 6, 7
@vasia vasia closed this as completed Dec 18, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant