# Supply chain partitioning example


In this notebook, we use cuGraph to prototype the partitioning of the JDA supply chain example graph.

* Created:   11/3/2019
* Last Edit: 12/9/2019

RAPIDS Versions: 0.11.0

Test Hardware
* GV100 32G, CUDA 10.1

Using docker container: rapidsai/rapidsai-nightly:cuda10.1-runtime-centos7

## cuGraph Notice 
The current version of cuGraph has some limitations:

* Vertex IDs need to be 32-bit integers.
* Vertex IDs are expected to be contiguous integers starting from 0.

cuGraph provides the renumber function to mitigate this problem. Input vertex IDs for the renumber function can be either 32-bit or 64-bit integers, can be non-contiguous, and can start from an arbitrary number. The renumber function maps the provided input vertex IDs to 32-bit contiguous integers starting from 0. cuGraph still requires the renumbered vertex IDs to be representable in 32-bit integers. These limitations are being addressed and will be fixed soon. 
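To make the idea of renumbering concrete, here is a minimal CPU-only sketch in plain Python. It is not cuGraph's implementation: the helper name `renumber_edges` and the sample IDs are hypothetical, and the sorted numbering order is an assumption that may differ from the order cuGraph assigns.

```python
def renumber_edges(src, dst):
    """Map arbitrary vertex IDs to contiguous integers 0..n-1.

    Returns (new_src, new_dst, mapping), where mapping[new_id] is the
    original ID. IDs are numbered in sorted order here (an assumption);
    cuGraph's own numbering order may differ.
    """
    # Collect every distinct ID that appears on either end of an edge
    mapping = sorted(set(src) | set(dst))
    # Invert the mapping: original ID -> new contiguous ID
    to_new = {orig: new for new, orig in enumerate(mapping)}
    new_src = [to_new[v] for v in src]
    new_dst = [to_new[v] for v in dst]
    return new_src, new_dst, mapping


# Hypothetical non-contiguous IDs, one of which does not fit in 32 bits
src = [10_000_000_000, 42, 42]
dst = [42, 7, 10_000_000_000]
new_src, new_dst, mapping = renumber_edges(src, dst)

print(new_src, new_dst)
# Looking up the mapping recovers the original IDs
print([mapping[v] for v in new_src])
```

The returned `mapping` plays the same role as the `mapping` series used later in this notebook: indexing it with a renumbered ID gives back the original ID.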

### Test Data
We will be using the example dataset Arijit provided.

### Prep

In [1]:
# Import needed libraries
import cugraph
import cudf
import numpy as np

In [2]:
import pandas as pd
import networkx as nx
import matplotlib.pyplot as plt

### Read data using cuDF and pandas

In [8]:
# Test file    
datafile='../data/Arijit210.prefix.edgelist.csv'

In [10]:
# Read the edge list with cuDF (GPU) ...
gdf = cudf.read_csv(datafile, delimiter=",", names=['src', 'dst'], dtype=['int32', 'int32'], skiprows=1)
# ... and with pandas (CPU), for the NetworkX comparison
df = pd.read_csv(datafile, delimiter=",", names=['src', 'dst'], skiprows=1)

In [11]:
gdf.head().to_pandas()

|   | src | dst    |
|---|-----|--------|
| 0 | 0   | 312262 |
| 1 | 0   | 526142 |
| 2 | 1   | 526146 |
| 3 | 1   | 312266 |
| 4 | 2   | 312269 |


In [12]:
df.head()

|   | src | dst    |
|---|-----|--------|
| 0 | 0   | 312262 |
| 1 | 0   | 526142 |
| 2 | 1   | 526146 |
| 3 | 1   | 312266 |
| 4 | 2   | 312269 |


### Create the directed graph using NetworkX

In [13]:
cpuG = nx.from_pandas_edgelist(df, source='src', target='dst', create_using=nx.DiGraph)
#nx.draw(cpuG, with_labels=True,pos=nx.circular_layout(cpuG), node_color='r', edge_color='b')
#plt.show()

In [14]:
print("cpu Graph")
print("\tNumber of Vertices: " + str(cpuG.number_of_nodes()))
print("\tNumber of Edges:    " + str(cpuG.number_of_edges()))

cpu Graph
	Number of Vertices: 552975
	Number of Edges:    792665


In [15]:
nx.is_strongly_connected(cpuG)

False

In [16]:
nx.is_weakly_connected(cpuG)

False
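The two checks above answer different questions: strong connectivity requires a directed path between every pair of vertices, while weak connectivity only requires a path once edge directions are ignored. The following plain-Python sketch illustrates the distinction on tiny hypothetical edge lists. The helper names `reachable` and `connectivity` are made up for this example, and the code is not how NetworkX implements these checks.

```python
from collections import defaultdict, deque


def reachable(adj, start):
    """Return the set of vertices reachable from start using BFS."""
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen


def connectivity(edges):
    """Return (is_weakly_connected, is_strongly_connected) for a non-empty directed edge list."""
    fwd, rev, und = defaultdict(set), defaultdict(set), defaultdict(set)
    nodes = set()
    for u, v in edges:
        fwd[u].add(v)      # original direction
        rev[v].add(u)      # reversed direction
        und[u].add(v)      # direction ignored
        und[v].add(u)
        nodes.update((u, v))
    start = next(iter(nodes))
    weak = reachable(und, start) == nodes
    # Strongly connected: start reaches everything, and everything reaches start
    strong = reachable(fwd, start) == nodes and reachable(rev, start) == nodes
    return weak, strong


chain = connectivity([(0, 1), (1, 2)])            # a directed path
cycle = connectivity([(0, 1), (1, 2), (2, 0)])    # a directed cycle
split = connectivity([(0, 1), (2, 3)])            # two separate pieces
print(chain, cycle, split)
```

The supply chain graph behaves like the `split` case: it is neither strongly nor weakly connected, so it decomposes into several weakly connected components.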

In [17]:
connectedskus = sorted(nx.weakly_connected_components(cpuG), key=len, reverse=True)

In [18]:
print("\tNumber weakly connected components: " + str(len(connectedskus)))

	Number weakly connected components: 20464
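As a rough illustration of what a weakly connected component labeling computes, the sketch below uses a union-find structure in plain Python. This is a swapped-in technique chosen for brevity, not the algorithm NetworkX or cuGraph uses, and the helper name `weakly_connected_labels` and the sample edges are hypothetical.

```python
def weakly_connected_labels(edges):
    """Label each vertex with the smallest vertex ID in its weakly connected component."""
    parent = {}

    def find(x):
        # Register unseen vertices as their own root
        parent.setdefault(x, x)
        # Walk to the root, shortening the path along the way (path halving)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            # Edge direction is ignored; the smaller root absorbs the larger one
            parent[max(ru, rv)] = min(ru, rv)

    return {v: find(v) for v in list(parent)}


# Hypothetical directed edges forming two components: {0, 1, 2} and {3, 4}
labels = weakly_connected_labels([(0, 1), (2, 1), (3, 4)])
num_components = len(set(labels.values()))
print(labels, num_components)
```

Because the direction of each edge is ignored when merging, the same labeling is obtained from the undirected version of the graph.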


In [23]:
# Generate the WCCs of cpuG, returning a generator of sets of nodes, one for each weakly connected component of cpuG
#[len(c) for c in sorted(nx.weakly_connected_components(cpuG), key=len, reverse=True)]

In [24]:
#nodeslist = []
#for consku in connectedskus:
#  nodeslist = []
#  for val in consku:
#          nodeslist.append(val)
#  print(nodeslist)

### Create the directed graph using cugraph

In [20]:
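# Renumbering is not strictly required for this dataset, since its vertex IDs start from 0 and are contiguous.
# We call renumber anyway and keep `mapping` so that results can be translated back to the original vertex IDs.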
gdf['renumbered_src'], gdf['renumbered_dst'], mapping = cugraph.renumber(gdf['src'], gdf['dst'])

In [21]:
# Note: cuGraph WCC currently supports only undirected graphs, so we use Graph() instead of DiGraph()
#gpuG = cugraph.DiGraph()
gpuG = cugraph.Graph()
gpuG.from_cudf_edgelist(gdf, source='renumbered_src', destination='renumbered_dst')

In [22]:
print("Main Graph")
print("\tNumber of Vertices: " + str(gpuG.number_of_vertices()))
print("\tNumber of Edges:    " + str(gpuG.number_of_edges()))

Main Graph
	Number of Vertices: 552975
	Number of Edges:    1585330
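The GPU graph reports 1585330 edges, exactly twice the 792665 edges reported by NetworkX. A likely explanation, stated here as an assumption rather than confirmed cuGraph behavior, is that the undirected `Graph()` stores every edge in both directions. The plain-Python sketch below shows that effect on a hypothetical edge list; the helper name `symmetrize` is made up for this example.

```python
def symmetrize(edges):
    """Return the edge list with every edge present in both directions, without duplicates."""
    both = set()
    for u, v in edges:
        both.add((u, v))
        both.add((v, u))
    return sorted(both)


# Hypothetical directed edges with no self-loops and no reciprocal pairs
directed = [(0, 1), (1, 2), (0, 2)]
undirected = symmetrize(directed)
print(len(directed), len(undirected))

# The same factor of two matches the counts reported in this notebook
print(2 * 792665)
```

The count doubles exactly only when the input has no self-loops and no pair of edges that already point in opposite directions.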


In [25]:
# Compute the WCCs of gpuG; the result is a cuDF DataFrame df in which
# df['labels'][i] is the component label of the vertex whose ID is df['vertices'][i]
wcc = cugraph.weakly_connected_components(gpuG)

In [26]:
wcc['org_vertices'] = mapping[wcc['vertices']]

In [27]:
wcc.head()

|   | labels | vertices | org_vertices |
|---|--------|----------|--------------|
| 0 | 1      | 0        | 0            |
| 1 | 2      | 1        | 8191         |
| 2 | 3      | 2        | 16382        |
| 3 | 4      | 3        | 24573        |
| 4 | 5      | 4        | 32764        |


In [28]:
wcc['labels'].unique()

0             1
1             2
2             3
3             4
4             5
          ...  
20459    539596
20460    539643
20461    542559
20462    544366
20463    544783
Name: labels, Length: 20464, dtype: int32

In [31]:
label_gby = wcc.groupby('labels')
maxNodesCountPerComponent = label_gby['org_vertices'].count().max()  
print("Total number of components found : ", wcc['labels'].unique().count())
print("Max # of nodes in any of the component : ", maxNodesCountPerComponent)

Total number of components found :  20464
Max # of nodes in any of the component :  146874
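The group-by above counts how many vertices carry each component label and takes the maximum. The same computation can be sketched on the CPU with `collections.Counter`; the label list below is hypothetical and stands in for the `labels` column.

```python
from collections import Counter

# Hypothetical component label for each vertex (stand-in for wcc['labels'])
labels = [1, 1, 2, 1, 3, 2]

# Number of vertices per component label
sizes = Counter(labels)

num_components = len(sizes)
largest_label, largest_size = sizes.most_common(1)[0]

print("Total number of components found:", num_components)
print("Largest component:", largest_label, "with", largest_size, "vertices")
```

Keeping the label of the largest component, not only its size, is convenient when that component is the one to partition further.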


In [59]:
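def print_components(_df, id, maxColumnLength):
    # Print and return the renumbered vertex IDs that belong to component label `id`.
    # maxColumnLength is unused; it is kept so that existing calls keep working.
    _f = _df.query('labels == @id')
    print(len(_f))

    # Copy the column to the host once instead of indexing the GPU series element by element
    part = _f['vertices'].to_pandas().tolist()
    print(part)

    return part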

In [60]:
tempdf = cudf.DataFrame()

# Iterate over the first five component labels on the host
for j in wcc['labels'].unique().head().to_pandas():
    print("Vertex Ids that belong to component label ", j, ": ")

    # Store the vertex list of each component as a string column, reusing the maximum computed earlier
    tempdf['WCC'+str(j)] = str(print_components(wcc, j, maxNodesCountPerComponent))
    #print_components(wcc, j, maxNodesCountPerComponent)



Vertex Ids that belong to component label  1 : 
1058
[0, 1435, 1837, 2264, 2516, 3008, 3935, 4431, 4624, 6059, 6461, 6888, 7031, 7140, 7632, 8491, 9055, 9248, 10683, 10882, 11085, 11512, 11655, 11764, 12256, 12283, 13047, 13872, 15307, 15506, 16136, 16279, 16839, 16880, 17603, 20108, 20130, 20760, 20903, 21395, 21504, 22159, 22335, 23656, 24664, 24754, 25527, 25636, 25951, 26128, 26715, 26959, 28212, 29220, 29378, 29993, 30151, 30507, 30552, 30752, 31271, 32768, 33776, 34002, 34617, 34775, 35063, 35176, 35376, 35827, 37324, 38332, 38626, 38663, 39241, 39399, 39619, 39800, 40000, 40383, 41880, 42888, 43219, 43865, 44023, 44175, 44424, 44624, 44939, 46436, 47444, 47775, 48489, 48647, 48731, 49048, 49248, 49495, 50992, 52000, 52331, 52498, 53113, 53271, 53287, 53416, 53672, 53872, 54051, 55548, 56556, 56887, 57122, 57737, 57843, 57895, 57972, 58296, 58496, 58607, 60104, 61112, 61443, 61746, 62361, 62399, 62519, 62528, 62920, 63120, 63163, 64565, 64660, 65668, 65999, 66370, 66955, 66985, 6