# Benchmarking Performance of NetworkX with the RAPIDS GPU-based nx_cugraph Backend vs. on CPU
# Skip notebook test
This notebook benchmarks the use of nx_cugraph as a backend for NetworkX algorithms.

It runs Betweenness Centrality, Breadth First Search, and Louvain Community Detection, and collects the run times with and without the nx_cugraph backend and graph caching enabled. nx_cugraph is a registered NetworkX backend, so using it requires no code changes.

In the notebook, the backend and graph caching settings are set via the [NetworkX config package](https://networkx.org/documentation/stable/reference/backends.html#networkx.utils.configs.NetworkXConfig). They can also be set at the command line through environment variables.

### See this example from GTC Spring 2024



Here is a minimal sample script that demonstrates using nx-cugraph with no code changes.
bc_demo.ipy:

```
import pandas as pd
import networkx as nx

url = "https://data.rapids.ai/cugraph/datasets/cit-Patents.csv"
df = pd.read_csv(url, sep=" ", names=["src", "dst"], dtype="int32")
G = nx.from_pandas_edgelist(df, source="src", target="dst")

%time result = nx.betweenness_centrality(G, k=10)
```
Running it first without and then with the nx-cugraph backend looks like this:
```
user@machine:/# ipython bc_demo.ipy
CPU times: user 7min 38s, sys: 5.6 s, total: 7min 44s
Wall time: 7min 44s

user@machine:/# NETWORKX_BACKEND_PRIORITY=cugraph ipython bc_demo.ipy
CPU times: user 18.4 s, sys: 1.44 s, total: 19.9 s
Wall time: 20 s
```
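
The only difference between the two runs above is the `NETWORKX_BACKEND_PRIORITY` environment variable. As a minimal sketch, the same switch could be prepared from Python by building an environment mapping for a child process. The helper name `backend_env` is hypothetical and is not part of NetworkX or nx-cugraph.

```python
import os


def backend_env(backend=None, base=None):
    """Return a copy of the environment with the NetworkX backend priority set.

    `backend_env` is a hypothetical helper for illustration only.
    Passing backend=None removes the variable so NetworkX uses its default (CPU) code.
    """
    env = dict(os.environ if base is None else base)
    if backend is None:
        env.pop("NETWORKX_BACKEND_PRIORITY", None)
    else:
        env["NETWORKX_BACKEND_PRIORITY"] = backend
    return env


# Small explicit base mappings keep the example independent of the real environment.
gpu_env = backend_env("cugraph", base={})
cpu_env = backend_env(None, base={"NETWORKX_BACKEND_PRIORITY": "cugraph"})
print(gpu_env)
print(cpu_env)
```

With `base` left as `None`, the returned mapping could be passed as the `env=` argument of `subprocess.run` when launching `ipython bc_demo.ipy`.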



In [None]:
import pandas as pd
import networkx as nx
import time
import os

This installs the NetworkX cuGraph backend (nx-cugraph) if it is not already present.

In [None]:
try:
    import nx_cugraph
except ModuleNotFoundError:
    # -y skips conda's confirmation prompt, which cannot be answered from a notebook cell
    os.system('conda install -y -c rapidsai -c conda-forge -c nvidia nx-cugraph')
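
After an install attempt, it can be useful to confirm that the backend module is visible to the running interpreter before benchmarking. The following is a minimal sketch using only the standard library; `is_installed` is a hypothetical helper name, not part of NetworkX or nx-cugraph.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


print("nx_cugraph available:", is_installed("nx_cugraph"))
```

If this prints `False` right after an install, restarting the kernel may be needed before the package can be found.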

This is boilerplate NetworkX code that runs:
* Betweenness Centrality
* Breadth First Search
* Louvain Community Detection

and collects the times. It is completely unaware of cuGraph or GPU-based tools.
[NetworkX configurations](https://networkx.org/documentation/stable/reference/utils.html#backends) determine how the algorithms are run.

In [None]:
def run_algos(G):
    """Run the three algorithms on G and return the total elapsed time in seconds."""
    # Betweenness centrality, estimated from k=1 sampled source node
    time_begin = time.time()
    result = nx.betweenness_centrality(G, k=1)
    time_bc = time.time() - time_begin
    print("Done bc, time=" + str(time_bc))

    # Breadth-first search tree rooted at node 1
    time_begin = time.time()
    result = nx.bfs_tree(G, source=1)
    time_bfs = time.time() - time_begin
    print("Done bfs, time=" + str(time_bfs))

    # Louvain community detection
    time_begin = time.time()
    result = nx.community.louvain_communities(G, threshold=1e-04)
    time_louvain = time.time() - time_begin
    print("Done louvain, time=" + str(time_louvain))

    return time_bc + time_bfs + time_louvain
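
The start/stop timing pattern in `run_algos` repeats once per algorithm. As a sketch of one way it could be factored out, the hypothetical helper `timed` below wraps any callable. It is not part of the notebook, and it uses `time.perf_counter` rather than `time.time` as an assumption about what suits interval timing.

```python
import time


def timed(label, func, *args, **kwargs):
    """Call func(*args, **kwargs), print the elapsed wall time, and return (result, seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    print(f"Done {label}, time={elapsed}")
    return result, elapsed


# Stand-in workload so the sketch runs without a graph; in the notebook this
# could be, for example, timed("bc", nx.betweenness_centrality, G, k=1).
total, seconds = timed("sum", sum, range(1_000_000))
```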

This downloads a patent citation dataset containing 3,774,768 nodes and 16,518,948 edges and loads it into a NetworkX graph. If a local copy already exists at `./data/cit-Patents.csv`, it is used instead.

In [None]:
filepath = "./data/cit-Patents.csv"

if os.path.exists(filepath):
    print("File found")
    url = filepath
else:
    url = "https://data.rapids.ai/cugraph/datasets/cit-Patents.csv"
df = pd.read_csv(url, sep=" ", names=["src", "dst"], dtype="int32")
G = nx.from_pandas_edgelist(df, source="src", target="dst")

The NetworkX backend can be set with an environment variable, or in code using the [NetworkX config package](https://networkx.org/documentation/stable/reference/backends.html#networkx.utils.configs.NetworkXConfig), which is new in NetworkX 3.3.


In [None]:
nx.config["backend_priority"] = ["cugraph"]
nx.config["cache_converted_graphs"] = True

Run the algorithms on GPU.

In [None]:
run_algos(G)

The same algorithms are run below on CPU, without nx-cugraph. This takes about 40 minutes, compared to the roughly 90 seconds above. That is more than a 25x speedup from loading the graph once and running the three algorithms on GPU!

In [None]:
# turn off graph caching and the nx-cugraph backend and rerun
nx.config["cache_converted_graphs"] = False
nx.config["backend_priority"] = []
run_algos(G)
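
Because `run_algos` returns the summed time of the three algorithms, the speedup can be computed from the return values of the GPU and CPU runs. The following is a minimal sketch that uses the approximate figures quoted above (about 40 minutes and roughly 90 seconds) rather than measured results; `speedup` is a hypothetical helper.

```python
def speedup(cpu_seconds, gpu_seconds):
    """Return how many times faster the GPU run was than the CPU run."""
    if gpu_seconds <= 0:
        raise ValueError("gpu_seconds must be positive")
    return cpu_seconds / gpu_seconds


# Approximate totals quoted in this notebook, not fresh measurements.
approx = speedup(40 * 60, 90)
print(f"{approx:.1f}x")
```

In the notebook, the two arguments would be the values returned by the CPU and GPU calls to `run_algos(G)`.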

___
Copyright (c) 2024, NVIDIA CORPORATION.

Licensed under the Apache License, Version 2.0 (the "License");  you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
___