# Fragment Decomposition

Abhinav Madahar <abhinav.madahar@rutgers.edu>, James Abello Monedero <abelloj@cs.rutgers.edu>

<br />

We want to find the fragment decomposition of a large graph.

In [3]:
import networkx as nx
import matplotlib.pyplot as plt
from queue import Queue
from unionfind import UnionFind
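
The `UnionFind` class imported above is used later only through `add`, `union`, `find`, and `components`. If that module is not available, a small class with the same four methods could stand in for it. The sketch below is an assumption about that interface, not the imported library's implementation, and `SimpleUnionFind` is a hypothetical name.

```python
class SimpleUnionFind:
    """Hypothetical stand-in exposing add / find / union / components."""

    def __init__(self):
        self.parent = {}  # element -> parent element (roots point to themselves)
        self.size = {}    # root -> number of elements in its set

    def add(self, x):
        # Register x as a singleton set, unless it is already known.
        if x not in self.parent:
            self.parent[x] = x
            self.size[x] = 1

    def find(self, x):
        # Walk up to the root, then point every node on the path at it.
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[x] != root:
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, a, b):
        # Merge the sets containing a and b, attaching the smaller to the larger.
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]

    def components(self):
        # Group all elements by their root.
        groups = {}
        for x in self.parent:
            groups.setdefault(self.find(x), set()).add(x)
        return list(groups.values())


uf = SimpleUnionFind()
for vertex in [10, 69, 35]:
    uf.add(vertex)
uf.union(10, 69)
components = sorted(sorted(comp) for comp in uf.components())
print(components)  # [[10, 69], [35]]
```

Union by size together with path compression keeps each operation close to constant time, which matters when a single wave contains many vertices.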

In [2]:
!pip3 install networkx matplotlib


First, we need to run a parallel BFS (pBFS) starting from a path in the graph and get the connected components of each wave.
Let's implement pBFS as a generator that yields, at every iteration, the list of connected components of the current wave.

In [4]:
def parallel_bfs(G: nx.Graph, path: list):
    """Yield the waves of a BFS started from every vertex of `path` at once.

    Each wave is yielded as a list of connected components (lists of vertices).
    """
    # Starting from the path, we repeatedly take the neighbours of the current wave,
    # moving outward like the ripples that form when a stone falls in water.
    # The first wave is the given path; every later wave is the set of unvisited
    # neighbours of the previous wave. We repeat until the whole graph is covered.
    # This is a generator, so we never hold all the waves in memory at once.
    visited = set(path)
    wave = set(path)
    yield [list(path)]
    while wave:
        # the unvisited neighbours of the current wave form the next wave
        wave = {neighbour for node in wave for neighbour in G[node]} - visited
        if not wave:
            break
        visited |= wave
        # group the wave's vertices into connected components of the induced subgraph
        uf = UnionFind()
        for node in wave:
            uf.add(node)
        for src, dest in G.subgraph(wave).edges:
            uf.union(src, dest)
        yield [list(comp) for comp in uf.components()]

Let's get the waves.

In [5]:
G = nx.gnm_random_graph(100, 200)
# restrict G to a single connected component; the first one yielded contains node 0
G = G.subgraph(next(nx.connected_components(G)))
waves = list(parallel_bfs(G, [0]))
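
To see what the waves look like on a graph small enough to check by hand, here is a dependency-free sketch of the same idea. It uses a plain adjacency dictionary instead of networkx and a BFS inside each wave instead of union-find. The names `wave_components` and `waves_from_path`, and the toy edge list, are illustrative assumptions and not part of the notebook.

```python
from collections import deque


def wave_components(adj, wave):
    """Connected components of the subgraph induced by `wave`, as sorted lists."""
    remaining = set(wave)
    components = []
    while remaining:
        start = min(remaining)  # deterministic order: smallest vertex first
        remaining.remove(start)
        comp, queue = [start], deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in adj[node]:
                if neighbour in remaining:  # only follow edges inside the wave
                    remaining.remove(neighbour)
                    comp.append(neighbour)
                    queue.append(neighbour)
        components.append(sorted(comp))
    return components


def waves_from_path(adj, path):
    """List of waves; each wave is a list of connected components."""
    visited = set(path)
    wave = set(path)
    waves = [[sorted(path)]]  # the path itself is the first wave
    while True:
        wave = {nb for node in wave for nb in adj[node]} - visited
        if not wave:
            break
        visited |= wave
        waves.append(wave_components(adj, wave))
    return waves


# Toy graph: a triangle 0-1-2 with two branches that meet again at vertex 5.
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 4), (3, 5), (4, 5)]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

toy_waves = waves_from_path(adj, [0])
print(toy_waves)  # [[[0]], [[1, 2]], [[3], [4]], [[5]]]
```

Vertices 1 and 2 share an edge, so they form one component of the second wave, while 3 and 4 are not adjacent and stay separate in the third.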

In [100]:
waves

[[[0]],
 [[35], [10, 69], [14], [52], [88]],
 [[6], [17, 13], [15], [19], [21], [22], [57, 27], [39], [45], [54], [67],
  [89, 71], [74], [79], [90], [91], [92], [94], [96], [97]],
 [[3, 7, 8, 11, 12, 16, 18, 23, 24, 26, 28, 31, 32, 34, 40, 42, 43, 44, 47,
   48, 49, 50, 51, 53, 55, 56, 58, 59, 68, 72, 73, 75, 76, 77, 81, 83, 84,
   85, 93, 95, 98, 99],
  [5], [65], [86], [37], [38], [60]],
 [[1], [2], [33], [4], [36], [66, 30], [41, 70, 87], [9, 82], [46], [80, 20],
  [61], [62], [25, 29], [63]]]
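
A nested list like this is hard to read at a glance, so a per-wave summary can help. The sketch below reports the number of components and vertices in each wave and checks that no vertex appears twice. `summarize_waves` is a hypothetical helper, and the small `toy_waves` list is made-up input used only to keep the example self-contained.

```python
def summarize_waves(waves):
    """Return (wave index, number of components, number of vertices) per wave."""
    return [
        (i, len(wave), sum(len(comp) for comp in wave))
        for i, wave in enumerate(waves)
    ]


# Made-up waves in the same nested format: wave -> components -> vertices.
toy_waves = [[[0]], [[1, 2]], [[3], [4]], [[5]]]

summary = summarize_waves(toy_waves)
for index, n_components, n_vertices in summary:
    print(f"wave {index}: {n_components} component(s), {n_vertices} vertex/vertices")

# Every vertex should belong to exactly one component of exactly one wave.
all_vertices = [v for wave in toy_waves for comp in wave for v in comp]
is_partition = len(all_vertices) == len(set(all_vertices))
```

The same helper can be called on the `waves` list computed above.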

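Now, let's find the meta-graph structure: each connected component of a wave is a meta-vertex, and two components in consecutive waves are connected when at least one edge of G joins them.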

In [74]:
connections = []
for upper, lower in zip(waves, waves[1:]):
    for src_cc in upper:
        for dest_cc in lower:
            # record each pair of components once if any edge of G joins them
            if any(G.has_edge(src, dest) for src in src_cc for dest in dest_cc):
                connections.append((src_cc, dest_cc))
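
The loops above test every pair of vertices across consecutive waves. One possible alternative, sketched here under assumptions, is to label each vertex with the component that owns it and then scan every edge once. The function `meta_edges` and the toy inputs are hypothetical; components are identified by `(wave index, component index)` pairs rather than by their vertex lists.

```python
def meta_edges(waves, edges):
    """Meta-graph edges between components of consecutive waves.

    Each component is identified by (wave index, component index).
    """
    # Map every vertex to the component that contains it.
    owner = {}
    for w, wave in enumerate(waves):
        for c, comp in enumerate(wave):
            for vertex in comp:
                owner[vertex] = (w, c)

    result = set()
    for u, v in edges:
        if u not in owner or v not in owner:
            continue  # edge touches a vertex outside the explored waves
        a, b = owner[u], owner[v]
        if abs(a[0] - b[0]) == 1:  # keep only edges between consecutive waves
            result.add((min(a, b), max(a, b)))
    return sorted(result)


# Toy input: waves of a 6-vertex graph explored from vertex 0.
toy_waves = [[[0]], [[1, 2]], [[3], [4]], [[5]]]
toy_edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 4), (3, 5), (4, 5)]

toy_meta = meta_edges(toy_waves, toy_edges)
for upper, lower in toy_meta:
    print(upper, "->", lower)
```

Using a set means a pair of components is recorded once even when several edges join them, and the running time is proportional to the number of vertices plus the number of edges.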