You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
For the in-partition connected components algorithm, you can experiment with the following approaches:
Store the whole partition as a graph structure (adjacency list / matrix) and find connected components using BFS/DFS iteratively for each vertex. This is the approach used in the Giraph++ paper.
Use a semi-streaming-like graph algorithm. In this model, your partition operator will iterate over "augmented" edges, where the edge source vertex has an associated value (the component Id). You can create this type of records with a join of your workset (vertices) and edges. The algorithm works as follows:
For each partition, you maintain a disjoint set data structure for the partitions' components. For each edge, check if its endpoints already belong to an existing component. If not, create a component with id the smallest of the vertex Ids. If they belong to different components, merge them and update the component Id to the smallest of the 2 components Ids. If only one of the endpoints belongs to a component, add the other one to the same component and update the component Id if necessary.
Consider the following edges input in a partition: (3, 5), (3, 4), (1, 3), (2, 4), (5, 1), (6, 7), (5, 6). The algorithm proceeds as follows (assume each vertex has its own Id as initial component Id):
edge (3, 5) => create a new component and add the vertices
Component id
Vertices
3
3, 5
edge (3, 4) => add 4 to the component
Component id
Vertices
3
3, 5, 4
edge (1, 3) => add 1 to the component and update the component Id to min(1, 3)=1
Component id
Vertices
1
3, 5, 4, 1
edge (2, 4) => add 2 to the component
Component id
Vertices
1
3, 5, 4, 1, 2
edge (5, 1) => both vertices already in the component
Component id
Vertices
1
3, 5, 4, 1, 2
edge (6, 7) => create new component with Id=6 and add both vertices
Component id
Vertices
1
3, 5, 4, 1, 2
6
6, 7
edge (5, 6) => 5 belongs to comp=1, 6 belongs to comp=6 => merge the components and update the component Id to min(1, 6)=1.
Component id
Vertices
1
3, 5, 4, 1, 2, 6, 7
The text was updated successfully, but these errors were encountered:
For the in-partition connected components algorithm, you can experiment with the following approaches:
Store the whole partition as a graph structure (adjacency list / matrix) and find connected components using BFS/DFS iteratively for each vertex. This is the approach used in the Giraph++ paper.
Use a semi-streaming-like graph algorithm. In this model, your partition operator will iterate over "augmented" edges, where the edge source vertex has an associated value (the component Id). You can create this type of records with a join of your workset (vertices) and edges. The algorithm works as follows:
Consider the following edges input in a partition: (3, 5), (3, 4), (1, 3), (2, 4), (5, 1), (6, 7), (5, 6). The algorithm proceeds as follows (assume each vertex has its own Id as initial component Id):
edge (3, 5) => create a new component and add the vertices
edge (3, 4) => add 4 to the component
edge (1, 3) => add 1 to the component and update the component Id to min(1, 3)=1
edge (2, 4) => add 2 to the component
edge (5, 1) => both vertices already in the component
edge (6, 7) => create new component with Id=6 and add both vertices
edge (5, 6) => 5 belongs to comp=1, 6 belongs to comp=6 => merge the components and update the component Id to min(1, 6)=1.
The text was updated successfully, but these errors were encountered: