# Skip notebook test
-----

#### NOTE:  This notebook will take hours to run.
-----



# Comparing NetworkX vs cuGraph using synthetic data on various algorithms


This notebook compares the execution times of many of the cuGraph and NetworkX algorithms when run against identical synthetic data at multiple scales.

This notebook uses the R-MAT data generator, which can create graphs at various scales.  By default the notebook runs on a selected set of sizes, but users are free to change or add to that list.

Notebook Credits

    
| Author        |    Date    |  Update             | cuGraph Version |  Test Hardware         |
| --------------|------------|---------------------|-----------------|------------------------|
| Don Acosta    | 1/12/2023  | Created             | 23.02 nightly   | RTX A6000, CUDA 11.7   |
| Brad Rees     | 1/27/2023  | Modified            | 23.02 nightly   | RTX A6000, CUDA 11.7   |


### Timing 

When looking at the overall workflow, NetworkX and cuGraph do things differently.  For example, NetworkX spends a lot of time creating the graph data structure, whereas cuGraph creates its data structure lazily, when an algorithm is first called.  To further complicate the comparison, NetworkX does not always return the answer directly: in some cases it returns a generator that must then be consumed to produce the data.

This benchmark produces two performance metrics:
 - (1)	Just the algorithm run time 
 - (2)	The algorithm plus graph creation time

Since GPU memory is a precious resource, the benchmark avoids keeping temporary data lying around: once a graph is created, the raw edge data is dropped.
 
__What is not timed__: generating the data with R-MAT

__What is timed__: (1) creating a Graph, (2) running the algorithm, and (3) running any generators
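
The timing rule above (consuming a generator counts as part of the run) can be sketched with a small helper. This is only an illustrative sketch: `timed_call` and `lazy_squares` are hypothetical names and are not part of this notebook, NetworkX, or cuGraph.

```python
from collections.abc import Iterator
from time import perf_counter


def timed_call(fn, *args, **kwargs):
    """Run fn and return (result, elapsed_seconds).

    If fn returns a lazy iterator (for example a generator), it is
    materialized into a list inside the timed region, so the measured
    time covers the real work and not just creating the iterator.
    """
    start = perf_counter()
    result = fn(*args, **kwargs)
    if isinstance(result, Iterator):
        result = list(result)
    elapsed = perf_counter() - start
    return result, elapsed


def lazy_squares(n):
    # Hypothetical stand-in for an algorithm that returns a generator.
    return (i * i for i in range(n))


values, seconds = timed_call(lazy_squares, 5)
print(values, f"{seconds:.6f}s")
```

The functions later in this notebook inline the same pattern with `perf_counter` rather than sharing a helper.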


### Algorithms

|        Algorithm        |  Type         | Undirected Graph | Directed Graph |   Notes
| ------------------------|---------------|------ | ------- |-------------
| Katz                    | Centrality    |   X   |         | 
| Betweenness Centrality  | Centrality    |   X   |         | Estimated, k = 100
| Louvain                 | Community     |   X   |         | Uses python-louvain for comparison
| Triangle Counting       | Community     |   X   |         |
| Core Number             | Core          |   X   |         |
| PageRank                | Link Analysis |       |    X    |
| Jaccard                 | Similarity    |   X   |         |
| BFS                     | Traversal     |   X   |         | No depth limit
| SSSP                    | Traversal     |   X   |         | 


### Test Data
Data is generated using a Recursive MATrix (R-MAT) graph generation algorithm. 
The generator specifics are documented [here](https://docs.rapids.ai/api/cugraph/stable/api_docs/generator.html)



### Notes
* Running Betweenness Centrality on the full graph is prohibitive using NetworkX.  Anything over k=100 can push the runtime to days.


## Import Modules

In [None]:
# system and other
import gc
import os
from time import perf_counter
import numpy as np
import math

# rapids
import cugraph
import cudf

# NetworkX libraries
import networkx as nx

# RMAT data generator
from cugraph.generators import rmat
from cugraph.structure import NumberMap

In [None]:
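import subprocess
import sys

try:
    import community
except ModuleNotFoundError:
    # install python-louvain into the environment running this kernel
    subprocess.check_call([sys.executable, "-m", "pip", "install", "python-louvain"])
    import community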

### Determine the scale of the test data
RMAT generates a graph where the number of vertices is a power of 2 and the number of edges is an edge factor times the number of vertices.

Since RMAT tends to generate about 50% isolated vertices, those vertices are dropped from the graph data.  Hence the number of vertices is closer to (2 ** scale) / 2.


| Scale | Vertices (est) | Edges  |
| ------|----------------|--------|
| 10 | 512 | 16,384 | 
| 11 | 1,024 | 32,768| 
| 12 | 2,048 | 65,536| 
| 13 | 4,096 | 131,072| 
| 14 | 8,192 | 262,144| 
| 15 | 16,384 | 524,288 | 
| 16 | 32,768 | 1,048,576 | 
| 17 | 65,536 | 2,097,152 | 
| 18 | 131,072 | 4,194,304 | 
| 19 | 262,144 | 8,388,608 | 
| 20 | 524,288 | 16,777,216 | 
| 21 | 1,048,576 | 33,554,432 | 
| 22 | 2,097,152 | 67,108,864 | 
| 23 | 4,194,304 | 134,217,728 | 
| 24 | 8,388,608 | 268,435,456 | 
| 25 | 16,777,216 | 536,870,912 | 
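
The values in the table follow directly from the scale. As a small sketch, assuming the default edge factor of 16 used later in this notebook and the rough 50% estimate for isolated vertices (`rmat_sizes` is a hypothetical helper name):

```python
def rmat_sizes(scale, edgefactor=16):
    """Return (estimated_vertices, edges) for an R-MAT graph.

    edges is the number of edges requested from the generator;
    estimated_vertices assumes roughly half of the 2**scale
    vertices are isolated and get dropped.
    """
    edges = (2 ** scale) * edgefactor
    est_vertices = (2 ** scale) // 2
    return est_vertices, edges


for scale in (10, 16, 20):
    v, e = rmat_sizes(scale)
    print(f"scale {scale}: ~{v:,} vertices, {e:,} edges")
```

The vertex count is only an estimate; the actual number depends on the generated data.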


In [None]:
# Test Data Sizes
# Here you can create an array of test data sizes.   Then set the "data" variable to the array you want
# the dictionary format is 'name' : scale


# These scales are used by R-MAT to determine the number of vertices/edges in the synthetic data graph.
data_full = {
    'data_scale_10'   :  10,
    'data_scale_12'   :  12,
    'data_scale_14'  :   14,
    'data_scale_16'  :   16,
    'data_scale_18'  :   18,
    'data_scale_20'  :   20,
}

# for quick testing
data_quick = {
   'data_scale_9' : 9,
   'data_scale_10' : 10,
   'data_scale_11' : 11,
}


# Which dataset is to be used
data = data_full


### Generate data
The data is generated once for each size.

In [None]:
# Data generator 
#  The result is an edgelist of the size determined by the scale and edge factor
def generate_data(scale, edgefactor=16):
    _gdf = rmat(
        scale,
        (2 ** scale) * edgefactor,
        0.57,
        0.19,
        0.19,
        42,
        clip_and_flip=False,
        scramble_vertex_ids=True,
        create_using=None,  # return edgelist instead of Graph instance
        mg=False # determines whether generated data will be used on one or multiple GPUs
        )

    clean_coo = NumberMap.renumber(_gdf, src_col_names="src", dst_col_names="dst")[0]
    clean_coo.rename(columns={"renumbered_src": "src", "renumbered_dst": "dst"}, inplace=True)
    print(f'Generated a dataframe of {len(clean_coo)} edges')
    return clean_coo
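
The renumbering step in `generate_data` is what removes the isolated vertices: only vertex ids that appear in the edge list receive a new, contiguous id. The idea can be sketched in pure Python. This is not cuGraph's implementation, `renumber_edges` is a hypothetical name, and the order in which cuGraph assigns new ids may differ.

```python
def renumber_edges(edges):
    """Map the vertex ids that appear in `edges` to 0..n-1.

    Ids are assigned in order of first appearance. Ids that never
    appear in an edge (isolated vertices) get no new id, so they
    drop out of the renumbered graph.
    """
    mapping = {}
    renumbered = []
    for src, dst in edges:
        new_src = mapping.setdefault(src, len(mapping))
        new_dst = mapping.setdefault(dst, len(mapping))
        renumbered.append((new_src, new_dst))
    return renumbered, mapping


# Vertex ids 0..9 are possible, but only 2, 5 and 9 are used.
edges, mapping = renumber_edges([(5, 9), (9, 2), (2, 5)])
print(edges)
print(len(mapping), "vertices remain")
```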

## Create Graph functions
There are two types of graphs created:
* Directed Graphs - calls to create_nx_graph or create_cu_graph with directed=True.
* Undirected Graphs - calls to create_nx_graph or create_cu_graph with directed=False (the default) <- fully symmetrized

In [None]:
# NetworkX
def create_nx_graph(_df, directed=False):
    t1 = perf_counter()
    if directed:
        g_type = nx.DiGraph
    else:
        g_type = nx.Graph
        
    _gnx = nx.from_pandas_edgelist(_df,
                            source='src',
                            target='dst',
                            edge_attr=None,
                            create_using=g_type)
    t2 = perf_counter() - t1

    return _gnx, t2


# cuGraph
def create_cu_graph(_df,transpose=False, directed=False):
    t1 = perf_counter()
    _g = cugraph.Graph(directed=directed)
    _g.from_cudf_edgelist(_df,
                          source='src',
                          destination='dst',
                          edge_attr=None,
                          renumber=False,
                          store_transposed=transpose)
    t2 = perf_counter() - t1

    return _g, t2

## Algorithm Execution

### Katz

In [None]:
def nx_katz(_G, alpha):
    t1 = perf_counter()
    _ = nx.katz_centrality(_G, alpha)
    t2 = perf_counter() - t1
    return t2

def cu_katz(_G, alpha):
    t1 = perf_counter()
    _ = cugraph.katz_centrality(_G, alpha)
    t2 = perf_counter() - t1
    return t2


### Betweenness Centrality

In [None]:
def nx_bc(_G, _k):
    t1 = perf_counter()
    _ = nx.betweenness_centrality(_G, k=_k)
    t2 = perf_counter() - t1
    return t2

def cu_bc(_G, _k):
    t1 = perf_counter()
    _ = cugraph.betweenness_centrality(_G, k=_k)
    t2 = perf_counter() - t1
    return t2


### Louvain

In [None]:
def nx_louvain(_G):
    t1 = perf_counter()
    parts = community.best_partition(_G)
    
    # Calculating modularity scores for comparison
    _ = community.modularity(parts, _G)
    
    t2 = perf_counter() - t1
    return t2

def cu_louvain(_G):
    t1 = perf_counter()
    _,_ = cugraph.louvain(_G)
    t2 = perf_counter() - t1
    return t2


### Triangle Counting

In [None]:
def nx_tc(_G):
    t1 = perf_counter()
    nx_count = nx.triangles(_G)

    # To get the number of triangles, we would need to loop through the array and add up each count
    count = 0
    for key, value in nx_count.items():
        count = count + value
    
    t2 = perf_counter() - t1
    return t2

def cu_tc(_G):
    t1 = perf_counter()
    _ = cugraph.triangle_count(_G)
    t2 = perf_counter() - t1
    return t2


### Core Number

In [None]:
def nx_core_num(_G):
    t1 = perf_counter()
    _G.remove_edges_from(nx.selfloop_edges(_G))
    nx_count = nx.core_number(_G)
    
    count = 0
    for key, value in nx_count.items():
        count = count + value
    
    t2 = perf_counter() - t1
    return t2

def cu_core_num(_G):
    t1 = perf_counter()
    _ = cugraph.core_number(_G)
    t2 = perf_counter() - t1
    return t2


### PageRank

In [None]:
def nx_pagerank(_G):
    t1 = perf_counter()
    _ = nx.pagerank(_G)
    t2 = perf_counter() - t1
    return t2 

def cu_pagerank(_G):
    t1 = perf_counter()
    _ = cugraph.pagerank(_G)
    t2 = perf_counter() - t1
    return t2


### Jaccard

In [None]:
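def nx_jaccard(_G):
    t1 = perf_counter()
    # NOTE: nx.jaccard_coefficient returns a lazy iterator, so this times only
    # its creation. Consuming it would score every non-adjacent vertex pair.
    nj = nx.jaccard_coefficient(_G)
    t2 = perf_counter() - t1
    return t2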

def cu_jaccard(_G):
    t1 = perf_counter()
    _ = cugraph.jaccard_coefficient(_G)
    t2 = perf_counter() - t1
    return t2


### BFS

In [None]:
def nx_bfs(_G):
    seed = 0
    t1 = perf_counter()
    nb = nx.bfs_edges(_G, seed)
    nb_list = list(nb) # gen -> list
    t2 = perf_counter() - t1
    return t2

def cu_bfs(_G):
    seed = 0
    t1 = perf_counter()
    _ = cugraph.bfs(_G, seed)
    t2 = perf_counter() - t1
    return t2


### SSSP

In [None]:
def nx_sssp(_G):
    seed = 0
    t1 = perf_counter()
    _ = nx.shortest_path(_G, seed)
    t2 = perf_counter() - t1
    return t2

def cu_sssp(_G):
    seed = 0
    t1 = perf_counter()
    _ = cugraph.sssp(_G, seed)
    t2 = perf_counter() - t1
    return t2


---

# Benchmark

In [None]:
# number of datasets
num_datasets = len(data)

In [None]:
# arrays to capture performance gains
names = []
algos = []
graph_create_cu = []
graph_create_nx = []

# Two-dimensional data [dataset, perf]
time_algo_nx = []          # NetworkX
time_algo_cu = []          # cuGraph
perf = []
perf_algo = []

algos.append("   ")

i = 0
for k,v in data.items():
    # init all the 2-d arrays
    time_algo_nx.append([])
    time_algo_cu.append([])
    perf.append([])
    perf_algo.append([])

    # Save the dataset name
    names.append(k)

    # generate data
    print("------------------------------")
    print(f'Creating Graph of Scale = {v}')

    gdf = generate_data(v)
    pdf = gdf.to_pandas()
    print(f"\tdata in gdf {len(gdf)} and data in pandas {len(pdf)}")

    # create the graphs
    g_cu, tcu = create_cu_graph(gdf)
    g_nx, tnx = create_nx_graph(pdf)
    graph_create_cu.append(tcu)
    graph_create_nx.append(tnx)
    del gdf, pdf

    # prep
    deg = g_cu.degree()
    deg_max = deg['degree'].max()

    alpha = 1 / deg_max
    num_nodes = g_cu.number_of_vertices()

    del deg
    gc.collect()

    #----- Algorithm order is same as defined at top ----

    #-- Katz 
    print("\tKatz  ", end = '')
    if i == 0: 
        algos.append("Katz")

    print("n.", end='')
    tx = nx_katz(g_nx, alpha)
    print("c.", end='')
    tc = cu_katz(g_cu, alpha)
    print("")

    time_algo_nx[i].append(tx)
    time_algo_cu[i].append(tc)
    perf_algo[i].append ( (tx/tc) )
    perf[i].append( (tx + tnx) /  (tc + tcu) )

    #-- BC
    print("\tBC k=100  ", end='')
    if i == 0:
        algos.append("BC Estimate fixed")

    # number of BC sample sources (named bc_k to avoid shadowing the loop variable k)
    bc_k = 100
    if bc_k > num_nodes:
        bc_k = int(num_nodes)
    print("n.", end='')
    tx = nx_bc(g_nx, bc_k)
    print("c.", end='')
    tc = cu_bc(g_cu, bc_k)
    print(" ")

    time_algo_nx[i].append(tx)
    time_algo_cu[i].append(tc)
    perf_algo[i].append ( (tx/tc) )
    perf[i].append( (tx + tnx) /  (tc + tcu) )

    #-- Louvain
    print("\tLouvain  ", end='')
    if i == 0:
        algos.append("Louvain")

    print("n.", end='')
    tx = nx_louvain(g_nx)
    print("c.", end='')
    tc = cu_louvain(g_cu)
    print(" ")

    time_algo_nx[i].append(tx)
    time_algo_cu[i].append(tc)
    perf_algo[i].append ( (tx/tc) )
    perf[i].append( (tx + tnx) /  (tc + tcu) )

    #-- TC
    print("\tTC  ", end='')
    if i == 0:
        algos.append("TC")

    print("n.", end='')
    tx = nx_tc(g_nx)
    print("c.", end='')
    tc = cu_tc(g_cu)
    print(" ")

    time_algo_nx[i].append(tx)
    time_algo_cu[i].append(tc)
    perf_algo[i].append ( (tx/tc) )
    perf[i].append( (tx + tnx) /  (tc + tcu) )

    #-- Core Number
    print("\tCore Number  ", end='')
    if i == 0:
        algos.append("Core Number")

    print("n.", end='')
    tx = nx_core_num(g_nx)
    print("c.", end='')
    tc = cu_core_num(g_cu)
    print(" ")

    time_algo_nx[i].append(tx)
    time_algo_cu[i].append(tc)
    perf_algo[i].append ( (tx/tc) )
    perf[i].append( (tx + tnx) /  (tc + tcu) )

    #-- PageRank
    print("\tPageRank  ", end='')
    if i == 0:
        algos.append("PageRank")

    print("n.", end='')
    tx = nx_pagerank(g_nx)
    print("c.", end='')
    tc = cu_pagerank(g_cu)
    print(" ")

    time_algo_nx[i].append(tx)
    time_algo_cu[i].append(tc)
    perf_algo[i].append ( (tx/tc) )
    perf[i].append( (tx + tnx) /  (tc + tcu) )

    #-- Jaccard
    print("\tJaccard  ", end='')
    if i == 0:
        algos.append("Jaccard")

    print("n.", end='')
    tx = nx_jaccard(g_nx)
    print("c.", end='')
    tc = cu_jaccard(g_cu)
    print(" ")

    time_algo_nx[i].append(tx)
    time_algo_cu[i].append(tc)
    perf_algo[i].append ( (tx/tc) )
    perf[i].append( (tx + tnx) /  (tc + tcu) )

    #-- BFS
    print("\tBFS  ", end='')
    if i == 0:
        algos.append("BFS")

    print("n.", end='')
    tx = nx_bfs(g_nx)
    print("c.", end='')
    tc = cu_bfs(g_cu)
    print(" ")

    time_algo_nx[i].append(tx)
    time_algo_cu[i].append(tc)
    perf_algo[i].append ( (tx/tc) )
    perf[i].append( (tx + tnx) /  (tc + tcu) )

    #-- SSSP
    print("\tSSSP  ", end='')
    if i == 0:
        algos.append("SSSP")

    print("n.", end='')
    tx = nx_sssp(g_nx)
    print("c.", end='')
    tc = cu_sssp(g_cu)
    print(" ")

    time_algo_nx[i].append(tx)
    time_algo_cu[i].append(tc)
    perf_algo[i].append ( (tx/tc) )
    perf[i].append( (tx + tnx) /  (tc + tcu) )

    # increment count
    i = i + 1
    
    del g_cu, g_nx
    gc.collect()
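
The two ratios collected in the loop above correspond to the two metrics described in the Timing section. As a minimal sketch (the function name `speedups` and the sample times are placeholders for illustration, not measured results):

```python
def speedups(t_nx_algo, t_cu_algo, t_nx_graph, t_cu_graph):
    """Return (algorithm_only, algorithm_plus_graph_creation) speedups.

    Both values are NetworkX time divided by cuGraph time, so a
    value above 1 means cuGraph was faster.
    """
    algo_only = t_nx_algo / t_cu_algo
    with_graph = (t_nx_algo + t_nx_graph) / (t_cu_algo + t_cu_graph)
    return algo_only, with_graph


# Placeholder timings in seconds, chosen only to show the arithmetic.
algo_only, with_graph = speedups(8.0, 0.5, 2.0, 0.5)
print(algo_only, with_graph)
```

Because graph creation time is added to both the numerator and the denominator, the second ratio can be noticeably lower than the first when graph creation is a large share of the cuGraph time.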


In [None]:
#Print results
print(algos)

for i in range(num_datasets):
    print(f"{names[i]}")
    print(f"{perf[i]}")
    print(f"{perf_algo[i]}")

In [None]:
#Print results
print("\n------------------------------")
print("\tAlgorithm Run times  (NX then cuGraph)\n")

print(algos)
for i in range(num_datasets):
    print(f"{names[i]}")
    print(f"{time_algo_nx[i]}")
    print(f"{time_algo_cu[i]}")

___
Copyright (c) 2020-2023, NVIDIA CORPORATION.

Licensed under the Apache License, Version 2.0 (the "License");  you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
___