# Benchmarking NetworkX compatibility
This notebook benchmarks the use of a NetworkX Graph object as input to cuGraph algorithms.

The intention of the feature is to make it possible to drop cuGraph into existing NetworkX code in spots where performance is not optimal.


### Betweenness Centrality
Both NetworkX and cuGraph allow for estimating the betweenness centrality score by using a subset of vertices rather than all the vertices. While that does produce a less accurate answer, it dramatically improves performance when the sample is small. For the sampled test, the algorithms use only 10% of the vertices to compute the estimate. The run recorded below uses all nodes (`k=None`).
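
To make the sampling idea concrete, here is a minimal pure-Python sketch of Brandes-style betweenness on an unweighted, undirected graph, with an optional `k` that restricts the traversal to a random subset of source vertices. The function name `approx_betweenness`, the adjacency-dict input, and the `n / k` rescaling are assumptions made for illustration. This is not the implementation used by NetworkX or cuGraph, and it returns unnormalized scores.

```python
import random
from collections import deque


def approx_betweenness(adj, k=None, seed=None):
    """Unnormalized betweenness for an unweighted, undirected graph.

    adj  : dict mapping each node to an iterable of its neighbours
    k    : number of source vertices to sample (None means use all of them)
    seed : seed for the sampling, for reproducibility
    """
    nodes = list(adj)
    rng = random.Random(seed)
    sources = nodes if k is None else rng.sample(nodes, k)
    bc = dict.fromkeys(nodes, 0.0)

    for s in sources:
        # Forward phase: BFS from s, counting shortest paths (sigma).
        order = []
        preds = {v: [] for v in nodes}
        sigma = dict.fromkeys(nodes, 0)
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)

        # Backward phase: accumulate dependencies from the farthest nodes in.
        delta = dict.fromkeys(nodes, 0.0)
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]

    # Assumed rescaling: extrapolate from the sampled sources to all n sources,
    # then halve because each undirected pair is seen from both endpoints.
    scale = len(nodes) / len(sources)
    return {v: score * scale / 2.0 for v, score in bc.items()}
```

With `k=None` every vertex is used as a source and the result is exact. A 10% sample would correspond to something like `k = max(1, len(adj) // 10)`, which cuts the number of traversals by roughly a factor of ten at the cost of accuracy.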


__Notebook Credits__

* Original Authors: Bradley Rees
* Last Edit: 09/27/2020

RAPIDS Versions: 0.16

Test Hardware
```
    GV100 32G, CUDA 10.0
    Intel(R) Core(TM) CPU i7-7800X @ 3.50GHz
    32GB system memory
```

In [1]:
import networkx as nx
import cugraph as cudanx
import time
import operator

In [2]:
# starting number of Nodes
N = 100

In [3]:
# average degree
M = 16

In [4]:
def run_nx(G, k=None):
    t1 = time.time()
    bc = nx.betweenness_centrality(G, k)
    t2 = time.time() - t1
    return t2, bc

In [5]:
def run_cu(G, k=None):
    t1 = time.time()
    bc = cudanx.betweenness_centrality(G, k)
    t2 = time.time() - t1
    return t2, bc
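
`run_nx` and `run_cu` differ only in the function they call, so the timing logic could be shared. The sketch below is one possible way to do that. The helper name `timed` and its `repeats` parameter are hypothetical and not part of the original notebook. It uses `time.perf_counter()`, which is intended for measuring short intervals.

```python
import time


def timed(fn, *args, repeats=1, **kwargs):
    """Call fn(*args, **kwargs) `repeats` times.

    Returns (best_seconds, last_result), where best_seconds is the
    smallest wall-clock duration observed across the repeats.
    """
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        best = min(best, time.perf_counter() - start)
    return best, result
```

Under these assumptions, `timed(nx.betweenness_centrality, G, k)` would stand in for `run_nx(G, k)`, and `timed(cudanx.betweenness_centrality, G, k)` for `run_cu(G, k)`. Taking the best of several repeats is a design choice that reduces the influence of one-off startup costs on small graphs.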

In [7]:
print("use all nodes")
print(f"Node \tEdges  \tSpeedup  \t\tcreate time  \t\tnx time  \t\tcu time ")

for x in range(10):
    # double the number of nodes on each iteration, starting from N
    n = N * 2**x

    t1 = time.time()    
    # create a random graph
    G = nx.barabasi_albert_graph(n, M)
    g_time = time.time() - t1
    
    num_edges = G.number_of_edges()
    num_nodes = G.number_of_nodes()
    
    time_nx, bc = run_nx(G)
    time_cu, bcc = run_cu(G)

    speedup = time_nx / time_cu
    print(f"{num_nodes}\t{num_edges}\t{speedup}\t{g_time}\t{time_nx}\t{time_cu}")
    

use all nodes
Node 	Edges  	Speedup  		create time  		nx time  		cu time   		max NX 	max Cu
100	1344	0.44623723212497496	0.12005257606506348	0.05099630355834961	0.11428070068359375	17	17
200	2944	1.1391240166050252	0.0049457550048828125	0.20274639129638672	0.17798447608947754	18	18
400	6144	2.640587078495305	0.010426521301269531	0.836834192276001	0.3169121742248535	19	19
800	12544	5.2610751870538826	0.021261930465698242	3.8402199745178223	0.7299306392669678	17	17
1600	25344	12.991305370727268	0.04396557807922363	17.509873628616333	1.3478147983551025	19	19
3200	50944	26.49811966652707	0.08774495124816895	78.09810280799866	2.9473073482513428	16	16
6400	102144	48.61890556849022	0.17733025550842285	338.7272083759308	6.966985464096069	17	17
12800	204544	89.80946825026868	0.4636502265930176	1539.7254755496979	17.14435577392578	18	18
25600	409344	180.42019041320523	0.8656513690948486	7804.056516170502	43.25489568710327	18	18
51200	818944	328.0516295018564	1.861088514328003	47763.087628126144	
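
The tab-separated output above is hard to scan because the columns do not line up. As a sketch of one alternative, the hypothetical helper `format_row` below computes the speedup as `time_nx / time_cu`, as the loop does, and prints fixed-width columns. The column widths and rounding are arbitrary choices, and the sample values are taken from the 1600-node row above.

```python
def format_row(num_nodes, num_edges, time_nx, time_cu):
    """Return one aligned report line; times are in seconds."""
    speedup = time_nx / time_cu
    return (f"{num_nodes:>8}{num_edges:>10}{speedup:>10.2f}"
            f"{time_nx:>12.4f}{time_cu:>12.4f}")


# Header uses the same widths as the rows so the columns line up.
header = f"{'Nodes':>8}{'Edges':>10}{'Speedup':>10}{'nx time':>12}{'cu time':>12}"
print(header)
print(format_row(1600, 25344, 17.509873628616333, 1.3478147983551025))
```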

In [None]:
print("use all nodes")
print(f"Node \tEdges  \tSpeedup \t\tnx time  \t\tcu time ")

pr_speedup = []

for x in range(10):
    # double the number of nodes on each iteration, starting from N
    n = N * 2**x

    # create a random graph
    G = nx.barabasi_albert_graph(n, M)
    num_edges = G.number_of_edges()
    num_nodes = G.number_of_nodes()
    
    t1 = time.time()    
    nx_pr = nx.pagerank(G)
    time_nx = time.time() - t1
    
    t1 = time.time()    
    # cuGraph was imported under the alias `cudanx`
    cp_pr = cudanx.pagerank(G)
    time_cu = time.time() - t1

    speedup = time_nx / time_cu
    print(f"{num_nodes}\t{num_edges}\t{speedup}\t{time_nx}\t{time_cu}")
    

___
Copyright (c) 2020, NVIDIA CORPORATION.

Licensed under the Apache License, Version 2.0 (the "License");  you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
___