# `nx-cugraph`: a NetworkX backend that provides GPU acceleration with RAPIDS cuGraph

This notebook demonstrates the `nx-cugraph` NetworkX backend using the NetworkX `betweenness_centrality` algorithm.

## Background
NetworkX version 3.0 introduced a dispatching mechanism that allows users to configure NetworkX to dispatch various algorithms to third-party backends. Backends can provide different implementations of graph algorithms, allowing users to take advantage of capabilities not available in NetworkX. `nx-cugraph` is a NetworkX backend provided by the [RAPIDS](https://rapids.ai) cuGraph project that adds GPU acceleration to greatly improve performance.

## System Requirements
Using `nx-cugraph` with this notebook requires the following: 
- NVIDIA GPU, Pascal architecture or later
- CUDA 11.2, 11.4, 11.5, 11.8, or 12.0
- Python versions 3.10, 3.11, or 3.12
- NetworkX >= version 3.2
  - _NetworkX 3.0 supports dispatching and is compatible with `nx-cugraph`, but this notebook will demonstrate features added in 3.2_
  - At the time of this writing, NetworkX 3.2 is only available from source and can be installed by following the [development version install instructions](https://github.com/networkx/networkx/blob/main/INSTALL.rst#install-the-development-version).
- Pandas

More details about system requirements can be found in the [RAPIDS System Requirements documentation](https://docs.rapids.ai/install#system-req).
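
As a quick sanity check before continuing, the following sketch reports whether the interpreter meets the minimum Python version listed above and whether the required packages can be located. The helper name `check_environment` and its report format are illustrative assumptions, and it does not verify package versions, the GPU architecture, or the CUDA version.

```python
import sys
from importlib import util


def check_environment(min_python=(3, 10),
                      modules=("networkx", "pandas", "nx_cugraph")):
    """Return a dict describing which requirements appear to be met.

    Hypothetical helper for illustration only: it checks importability,
    not package versions, GPU architecture, or CUDA.
    """
    report = {"python_ok": sys.version_info[:2] >= min_python}
    for name in modules:
        # find_spec() locates a top-level module without importing it.
        report[name] = util.find_spec(name) is not None
    return report


for requirement, ok in check_environment().items():
    print(f"{requirement}: {'ok' if ok else 'not satisfied'}")
```

Any entry reported as not satisfied should be resolved using the installation steps in the next section.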

## Installation

Assuming NetworkX >= 3.2 has been installed using the [development version install instructions](https://github.com/networkx/networkx/blob/main/INSTALL.rst#install-the-development-version), `nx-cugraph` can be installed using either `conda` or `pip`.  

#### conda
```
conda install -c rapidsai-nightly -c conda-forge -c nvidia nx-cugraph
```
#### pip
```
python -m pip install nx-cugraph-cu11 --extra-index-url https://pypi.nvidia.com
```
#### _Notes:_
 * Nightly wheel builds will not be available until the 23.12 release, so the pip install command above uses the index URL for the stable release.
 * Additional information relevant to installing any RAPIDS package can be found [here](https://rapids.ai/#quick-start).
 * If you installed any of the packages described here after starting this notebook, you may need to restart the kernel to make them visible to this notebook.

## Notebook Helper Functions

A few helper functions are defined here to keep the rest of this notebook easy to read.

In [1]:
import sys
def reimport_networkx():
    """
    Re-imports networkx for demonstrating different backend configuration
    options applied at import-time. This is only needed for demonstration
    purposes since other mechanisms are available for runtime configuration.
    """
    # Using importlib.reload(networkx) has several caveats (described here:
    # https://docs.python.org/3/library/importlib.html#importlib.reload)
    # which result in backend configuration not being re-applied correctly.
    # Instead, manually remove all modules and re-import
    nx_mods = [m for m in sys.modules.keys()
               if (m.startswith("networkx") or m.startswith("nx_cugraph"))]
    for m in nx_mods:
        sys.modules.pop(m)
    import networkx
    return networkx


from pathlib import Path
import requests
import gzip
import pandas as pd
def create_cit_patents_graph(verbose=True):
    """
    Downloads the cit-Patents dataset (if not previously downloaded), reads
    it, and creates a nx.DiGraph from it and returns it.
    cit-Patents is described here:
    https://snap.stanford.edu/data/cit-Patents.html
    """
    url = "https://snap.stanford.edu/data/cit-Patents.txt.gz"
    gz_file_name = Path(url.split("/")[-1])
    csv_file_name = Path(gz_file_name.stem)
    if csv_file_name.exists():
        if verbose: print(f"{csv_file_name} already exists, not downloading.")
    else:
        if verbose: print(f"downloading {url}...", end="", flush=True)
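        req = requests.get(url)
        # Fail early on an HTTP error instead of saving an error page to disk
        req.raise_for_status()
        with open(gz_file_name, "wb") as gz_out:
            gz_out.write(req.content)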
        if verbose: print("done")
        if verbose: print(f"unzipping {gz_file_name}...", end="", flush=True)
        with gzip.open(gz_file_name, "rb") as gz_in:
            with open(csv_file_name, "wb") as txt_out:
                txt_out.write(gz_in.read())
        if verbose: print("done")

    if verbose: print("reading csv to dataframe...", end="", flush=True)
    pandas_edgelist = pd.read_csv(
        csv_file_name.name,
        skiprows=4,
        delimiter="\t",
        names=["src", "dst"],
        dtype={"src":"int32", "dst":"int32"},
    )
    if verbose: print("done")
    if verbose: print("creating NX graph from dataframe...", end="", flush=True)
    G = nx.from_pandas_edgelist(
        pandas_edgelist, source="src", target="dst", create_using=nx.DiGraph
    )
    if verbose: print("done")
    return G
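
The technique used by `reimport_networkx()` can be illustrated without NetworkX. The sketch below applies the same `sys.modules` purge to the standard-library `fractions` module (chosen arbitrarily), using a hypothetical helper named `purge_and_reimport`. It also shows why objects created before a re-import need to be recreated afterward: the re-executed module defines new classes.

```python
import importlib
import sys


def purge_and_reimport(package):
    """Drop `package` and its submodules from sys.modules, then import it
    again so that its module-level code runs from scratch."""
    stale = [name for name in sys.modules
             if name == package or name.startswith(package + ".")]
    for name in stale:
        sys.modules.pop(name)
    return importlib.import_module(package)


import fractions

old_half = fractions.Fraction(1, 2)   # instance of the class from the first import
fractions = purge_and_reimport("fractions")
new_half = fractions.Fraction(1, 2)   # instance of the class from the fresh import

# The two classes share a name but are distinct objects.
print(type(old_half) is type(new_half))          # False
print(isinstance(old_half, fractions.Fraction))  # False
```

This is the reason the notebook re-creates its graphs after each call to `reimport_networkx()`.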

## Running `betweenness_centrality`
Let's start by running `betweenness_centrality` on the Karate Club graph using the default NetworkX implementation.

### Zachary's Karate Club

Zachary's Karate Club is a small dataset consisting of 34 nodes and 78 edges which represent the friendships between members of a karate club. This dataset is small enough to make comparing results between NetworkX and `nx-cugraph` easy.

In [2]:
import networkx as nx
karate_club_graph = nx.karate_club_graph()

Having NetworkX compute the `betweenness_centrality` values for each node on this graph is quick and easy.

In [3]:
%%timeit global karate_nx_bc_results
karate_nx_bc_results = nx.betweenness_centrality(karate_club_graph)

2.51 ms ± 1.02 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


### Automatic GPU acceleration
When `nx-cugraph` is installed, NetworkX will detect it on import and make it available as a backend for APIs supported by that backend.  However, NetworkX does not assume the user always wants to use a particular backend, and instead looks at various configuration mechanisms in place for users to specify how NetworkX should use installed backends. Since NetworkX was not configured to use a backend for the above `betweenness_centrality` call, it used the default implementation provided by NetworkX.

The first configuration mechanism to be demonstrated below is the `NETWORKX_AUTOMATIC_BACKENDS` environment variable.  This environment variable directs NetworkX to use the backend specified everywhere it's supported and does not require the user to modify any of their existing NetworkX code.

To use it, a user sets `NETWORKX_AUTOMATIC_BACKENDS` in their shell to the backend they'd like to use.  If a user has more than one backend installed, the environment variable can also accept a comma-separated list of backends, ordered by priority in which NetworkX should use them, where the first backend that supports a particular API call will be used.  For example:
```
bash> export NETWORKX_AUTOMATIC_BACKENDS=cugraph
bash> python my_nx_app.py  # uses nx-cugraph wherever possible, then falls back to default implementation where it's not.
```
or in the case of multiple backends installed
```
bash> export NETWORKX_AUTOMATIC_BACKENDS=cugraph,graphblas
bash> python my_nx_app.py  # uses nx-cugraph if possible, then nx-graphblas if possible, then default implementation.
```
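
The priority-ordered fallback described above can be pictured with the following sketch. It is a simplified model, not NetworkX's actual dispatcher: the `SUPPORTED` table is made up for illustration and is not an inventory of what either backend implements, and `parse_backend_priority` and `choose_backend` are hypothetical names.

```python
# Illustrative only: which algorithms each backend is assumed to implement.
SUPPORTED = {
    "cugraph": {"betweenness_centrality", "louvain_communities"},
    "graphblas": {"betweenness_centrality", "pagerank"},
}


def parse_backend_priority(environ):
    """Split the comma-separated variable into an ordered list of names."""
    raw = environ.get("NETWORKX_AUTOMATIC_BACKENDS", "")
    return [name.strip() for name in raw.split(",") if name.strip()]


def choose_backend(algorithm, priority, supported=SUPPORTED):
    """Return the first backend in priority order that supports `algorithm`,
    or "networkx" to indicate the default implementation."""
    for backend in priority:
        if algorithm in supported.get(backend, set()):
            return backend
    return "networkx"


priority = parse_backend_priority(
    {"NETWORKX_AUTOMATIC_BACKENDS": "cugraph,graphblas"}
)
for algorithm in ("betweenness_centrality", "pagerank", "shortest_path"):
    print(algorithm, "->", choose_backend(algorithm, priority))
```

In this model the first algorithm goes to the first listed backend, the second to the next backend that supports it, and the third falls back to the default implementation.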

NetworkX looks at the environment variable and the installed backends at import time, and will not re-examine the environment after that.  Because `networkx` was already imported in this notebook, the `reimport_networkx()` utility will be called after the `os.environ` dictionary is updated to simulate an environment variable being set in the shell.

**Please note, this is only needed for demonstration purposes to compare runs both with and without fully-automatic backend use enabled.**

In [4]:
import os
os.environ["NETWORKX_AUTOMATIC_BACKENDS"] = "cugraph"
nx = reimport_networkx()
# reimporting nx requires reinstantiating Graphs since python considers
# types from the prior nx import != types from the reimported nx
karate_club_graph = nx.karate_club_graph()

Once the environment is updated, re-running the same `betweenness_centrality` call on the same graph requires no code changes.

In [5]:
%%timeit global karate_cg_bc_results
karate_cg_bc_results = nx.betweenness_centrality(karate_club_graph)

43.9 ms ± 222 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)


Note that the same computation actually took *longer* using `nx-cugraph`. This is not surprising given how small the graph is: copying data to and from the GPU adds a small, roughly fixed overhead, which dominates the run time on very small graphs.  With a larger graph, this overhead becomes negligible.

### Results Comparison

Let's examine the results of each run to see how they compare.
The `betweenness_centrality` results are a dictionary mapping vertex IDs to `betweenness_centrality` scores.  The score itself is usually not as important as the relative rank of each vertex ID (e.g. vertex A is ranked higher than vertex B in both sets of results).

In [6]:
# The lists contain tuples of (vertex ID, betweenness_centrality score),
# sorted based on the score.
nx_sorted = sorted(karate_nx_bc_results.items(), key=lambda t:t[1], reverse=True)
cg_sorted = sorted(karate_cg_bc_results.items(), key=lambda t:t[1], reverse=True)

for i in range(len(nx_sorted)):
    print("NX: (%d, %.6f), CG: (%d, %.6f)" % (nx_sorted[i] + cg_sorted[i]))

NX: (0, 0.437635), CG: (0, 0.437635)
NX: (33, 0.304075), CG: (33, 0.304075)
NX: (32, 0.145247), CG: (32, 0.145247)
NX: (2, 0.143657), CG: (2, 0.143657)
NX: (31, 0.138276), CG: (31, 0.138276)
NX: (8, 0.055927), CG: (8, 0.055927)
NX: (1, 0.053937), CG: (1, 0.053937)
NX: (13, 0.045863), CG: (13, 0.045863)
NX: (19, 0.032475), CG: (19, 0.032475)
NX: (5, 0.029987), CG: (5, 0.029987)
NX: (6, 0.029987), CG: (6, 0.029987)
NX: (27, 0.022333), CG: (27, 0.022333)
NX: (23, 0.017614), CG: (23, 0.017614)
NX: (30, 0.014412), CG: (30, 0.014412)
NX: (3, 0.011909), CG: (3, 0.011909)
NX: (25, 0.003840), CG: (25, 0.003840)
NX: (29, 0.002922), CG: (29, 0.002922)
NX: (24, 0.002210), CG: (24, 0.002210)
NX: (28, 0.001795), CG: (28, 0.001795)
NX: (9, 0.000848), CG: (9, 0.000848)
NX: (4, 0.000631), CG: (4, 0.000631)
NX: (10, 0.000631), CG: (10, 0.000631)
NX: (7, 0.000000), CG: (7, 0.000000)
NX: (11, 0.000000), CG: (11, 0.000000)
NX: (12, 0.000000), CG: (12, 0.000000)
NX: (14, 0.000000), CG: (14, 0.000000)
NX: (1

Here we can see that the results match exactly as expected.  
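
Comparing printed lists by eye does not scale past a few dozen nodes. A minimal sketch of a programmatic check follows; the helper name `compare_centrality` and the tolerance values are assumptions, and the sample scores are copied (rounded) from the first rows of the output above.

```python
import math


def compare_centrality(results_a, results_b, rel_tol=1e-6, abs_tol=1e-9):
    """Return (scores_match, ranking_match) for two {node: score} dicts."""
    if results_a.keys() != results_b.keys():
        raise ValueError("results cover different sets of nodes")
    scores_match = all(
        math.isclose(results_a[n], results_b[n],
                     rel_tol=rel_tol, abs_tol=abs_tol)
        for n in results_a
    )

    # Rank by descending score; break ties by node ID so that equal
    # scores are ordered the same way in both rankings.
    def ranking(results):
        return sorted(results, key=lambda n: (-results[n], n))

    return scores_match, ranking(results_a) == ranking(results_b)


nx_scores = {0: 0.437635, 33: 0.304075, 32: 0.145247, 2: 0.143657}
cg_scores = {0: 0.437635, 33: 0.304075, 32: 0.145247, 2: 0.143657}
print(compare_centrality(nx_scores, cg_scores))  # (True, True)
```

In this notebook, the same helper could be called as `compare_centrality(karate_nx_bc_results, karate_cg_bc_results)`. Nodes whose scores are nearly tied may swap places due to floating-point differences, so a ranking mismatch alongside matching scores is not necessarily an error.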

For larger graphs, results are harder to compare because `betweenness_centrality` is typically run as an approximation: the argument `k` limits the computation to the shortest paths that start at `k` randomly sampled source vertices, since using every vertex as a source would be prohibitively expensive for large graphs.  Two runs can therefore sample different sources and produce slightly different scores.  For small graphs, `k` need not be specified, which allows `betweenness_centrality` to use all paths for all vertices and makes for an easier comparison.
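
To make the role of `k` concrete, here is a small pure-Python sketch of sampled betweenness centrality for unweighted graphs, following the general shape of Brandes' algorithm. It is a teaching aid under stated assumptions, not the NetworkX or cuGraph implementation: the function name `sampled_betweenness`, the adjacency-dict input, and the `n / k` rescaling of sampled results are all choices made for this example.

```python
import random
from collections import deque


def sampled_betweenness(adj, k=None, seed=None):
    """Normalized betweenness for an unweighted graph {node: neighbors}.

    If k is given, only k randomly sampled source nodes are used and the
    result is an estimate; otherwise every node is used as a source.
    """
    nodes = list(adj)
    n = len(nodes)
    if n < 3:
        return dict.fromkeys(nodes, 0.0)
    sources = nodes if k is None else random.Random(seed).sample(nodes, k)
    bc = dict.fromkeys(nodes, 0.0)
    for s in sources:
        sigma = dict.fromkeys(nodes, 0)   # number of shortest s->v paths
        sigma[s] = 1
        dist = {s: 0}
        preds = {v: [] for v in nodes}    # predecessors on shortest paths
        order = []                        # nodes in non-decreasing distance
        queue = deque([s])
        while queue:                      # breadth-first search from s
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(nodes, 0.0)
        for w in reversed(order):         # accumulate dependencies
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Normalize, and scale up sampled runs by n / k (assumed convention).
    scale = (n / len(sources)) / ((n - 1) * (n - 2))
    return {v: score * scale for v, score in bc.items()}


# A star graph: every shortest path between two leaves passes through node 0.
star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
print("all sources:", sampled_betweenness(star))
print("k=2 sample: ", sampled_betweenness(star, k=2, seed=0))
```

With all sources, the center of the star scores approximately 1.0 and the leaves 0.0. With `k=2`, the estimate depends on which sources happen to be sampled, which is why results on large graphs are compared by rank rather than by exact score.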

### `betweenness_centrality` on larger graphs - The U.S. Patent Citation Network<sup>1</sup>

The U.S. Patent Citation Network dataset is much larger with over 3.7M nodes and over 16.5M edges and demonstrates how `nx-cugraph` enables NetworkX to run `betweenness_centrality` on graphs this large (and larger) in seconds instead of minutes.

#### NetworkX default implementation

In [7]:
import os
# Unset NETWORKX_AUTOMATIC_BACKENDS so the default NetworkX implementation is used
os.environ.pop("NETWORKX_AUTOMATIC_BACKENDS", None)
nx = reimport_networkx()
# Create the cit-Patents graph - this will also download the dataset if not previously downloaded
cit_patents_graph = create_cit_patents_graph()

downloading https://snap.stanford.edu/data/cit-Patents.txt.gz...done
unzipping cit-Patents.txt.gz...done
reading csv to dataframe...done
creating NX graph from dataframe...done


In [8]:
# Since this is a large graph, a k value must be set so the computation returns in a reasonable time
k = 40

Because this run will take time, `%%timeit` is restricted to a single pass.

*NOTE: this run may take approximately 1 minute*

In [9]:
%%timeit -r 1
results = nx.betweenness_centrality(cit_patents_graph, k=k)

1min 4s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)


Note that `%%timeit` disables garbage collection by default, which may not be something a user is able to do in a real application. To see a more realistic run time, garbage collection can be re-enabled by passing `gc.enable()` as the `%%timeit` setup statement.

In [10]:
# import and run the garbage collector upfront prior to using it in the benchmark
import gc
gc.collect()

0

*NOTE: this run may take approximately 7 minutes!*

In [11]:
%%timeit -r 1 gc.enable()
nx.betweenness_centrality(cit_patents_graph, k=k)

6min 50s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)
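
The same garbage-collection behavior applies to the standard-library `timeit` module outside of a notebook. The sketch below times an allocation-heavy function both ways; `allocate_many` is a stand-in workload invented for this example, and the measured difference will vary by machine and may be small.

```python
import timeit


def allocate_many(n=50_000):
    """Allocate many small container objects, the kind of work that
    triggers the cyclic garbage collector."""
    return [[i] for i in range(n)]


# Default: timeit turns garbage collection off while timing.
gc_off_s = timeit.timeit(allocate_many, number=10)

# Re-enable it in the setup statement, mirroring `%%timeit ... gc.enable()`.
gc_on_s = timeit.timeit(
    allocate_many, setup="import gc; gc.enable()", number=10
)

print(f"gc disabled: {gc_off_s:.3f} s, gc enabled: {gc_on_s:.3f} s")
```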


#### `nx-cugraph`

Running on a GPU using `nx-cugraph` can result in a tremendous speedup, especially when graphs reach sizes larger than a few thousand nodes or `k` values become larger to increase accuracy.

Rather than setting the `NETWORKX_AUTOMATIC_BACKENDS` environment variable and re-importing again, this example will demonstrate the `backend=` keyword argument to explicitly direct the NetworkX dispatcher to use the `cugraph` backend.

In [12]:
%%timeit -r 1 gc.enable()
nx.betweenness_centrality(cit_patents_graph, k=k, backend="cugraph")

10.1 s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)


In [13]:
k = 150

In [14]:
%%timeit -r 1 gc.enable()
nx.betweenness_centrality(cit_patents_graph, k=k, backend="cugraph")

11.6 s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)


For the same graph and the same `k` value, the `"cugraph"` backend returns results in seconds instead of minutes.  Increasing the `k` value has very little relative impact on runtime due to the high parallel processing ability of the GPU, allowing the user to get improved accuracy for virtually no additional cost.
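
The arithmetic behind these statements, using only the timings reported in this notebook (the variable names are illustrative, and timings on other hardware will differ):

```python
# Timings reported in this notebook, in seconds.
cpu_k40_gc_on = 6 * 60 + 50   # default NetworkX, k=40, gc enabled
gpu_k40 = 10.1                # backend="cugraph", k=40
gpu_k150 = 11.6               # backend="cugraph", k=150

speedup = cpu_k40_gc_on / gpu_k40    # GPU vs. CPU at the same k
k_growth = 150 / 40                  # how much more sampling k=150 does
runtime_growth = gpu_k150 / gpu_k40  # how much longer it took on the GPU

print(f"speedup at k=40:  {speedup:.1f}x")
print(f"k increased by:   {k_growth:.2f}x")
print(f"runtime grew by:  {runtime_growth:.2f}x")
```

On this workstation, that works out to roughly a 40x speedup at `k=40`, and a 3.75x increase in `k` costing only about 15% more run time on the GPU.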

### Type-based dispatching

NetworkX also supports automatically dispatching to backends associated with specific graph types.  This requires the user to write code for a specific backend, and therefore requires the backend to be installed, but has the advantage of ensuring a particular behavior without the potential for runtime conversions.

To use type-based dispatching with `nx-cugraph`, the user must import the backend directly in their code to access the utilities provided to create a Graph instance specifically for the `nx-cugraph` backend.

In [15]:
import nx_cugraph as nxcg

The `from_networkx()` API will copy the data from the NetworkX graph instance to the GPU and return a new `nx-cugraph` graph instance.  By passing an explicit `nx-cugraph` graph, the NetworkX dispatcher will automatically call the `"cugraph"` backend (and only the `"cugraph"` backend) without requiring future conversions to copy data to the GPU.

In [16]:
%%timeit -r 2 global nxcg_cit_patents_graph
nxcg_cit_patents_graph = nxcg.from_networkx(cit_patents_graph)

7.92 s ± 2.85 ms per loop (mean ± std. dev. of 2 runs, 1 loop each)


In [17]:
%%timeit -r 1 gc.enable()
nx.betweenness_centrality(nxcg_cit_patents_graph, k=k)

3.14 s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)
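
The timings above suggest how the one-time conversion cost is amortized over repeated calls. The sketch below uses the numbers reported in this notebook for `k = 150` and assumes, for illustration, that each per-call timing stays constant and that the `backend="cugraph"` call pays the conversion cost every time; `total_seconds` is a hypothetical helper.

```python
def total_seconds(calls, per_call_s, one_time_s=0.0):
    """Total run time for `calls` invocations plus an optional fixed cost."""
    return one_time_s + calls * per_call_s


convert_once_s = 7.92          # nxcg.from_networkx(), paid once
call_on_gpu_graph_s = 3.14     # betweenness_centrality on the nx-cugraph graph
call_with_conversion_s = 11.6  # backend="cugraph" on the NetworkX graph

for calls in (1, 5, 20):
    every_call = total_seconds(calls, call_with_conversion_s)
    once = total_seconds(calls, call_on_gpu_graph_s, one_time_s=convert_once_s)
    print(f"{calls:>2} calls: convert every call {every_call:7.2f} s, "
          f"convert once {once:7.2f} s")
```

Under these assumptions, converting once is already slightly faster for a single call, and the gap widens with every additional call on the same graph.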


## Conclusion

This notebook demonstrated `nx-cugraph`'s support for `betweenness_centrality`.  At the time of this writing, `nx-cugraph` also provides support for `edge_betweenness_centrality` and `louvain_communities`.  Other algorithms are scheduled to be supported based on their availability in the cuGraph [pylibcugraph](https://github.com/rapidsai/cugraph/tree/branch-23.10/python/pylibcugraph/pylibcugraph) package and demand by the NetworkX community.

#### Benchmark Results
The results included in this notebook were generated on a workstation with the following hardware:

<table align="left">
    <tr><td>CPU:</td><td>Intel(R) Xeon(R) Gold 6128 CPU @ 3.40GHz, 45GB</td></tr>
    <tr><td>GPU:</td><td>Quadro RTX 8000, 50GB</td></tr>
</table>

<sup>1</sup> Information on the U.S. Patent Citation Network dataset used in this notebook is as follows:
<table align="left">
    <tr><td>Authors:</td><td>Jure Leskovec and Andrej Krevl</td></tr>
    <tr><td>Title:</td><td>SNAP Datasets, Stanford Large Network Dataset Collection</td></tr>
    <tr><td>URL:</td><td>http://snap.stanford.edu/data</td></tr>
    <tr><td>Date:</td><td>June 2014</td></tr>
</table>
