<a href="https://colab.research.google.com/github/PavlosPo/social-networks/blob/pavlos-playground/nx-cugraph.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Environment Sanity Check #

Click the _Runtime_ dropdown at the top of the page, then _Change Runtime Type_ and confirm the instance type is _GPU_.

Run `!nvidia-smi` to see which GPU you have been assigned; uncomment the cell below to do so.  Currently, RAPIDS runs on all available Colab GPU instances.

In [None]:
# !nvidia-smi

Thu Nov 30 17:46:20 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.105.17   Driver Version: 525.105.17   CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   67C    P8    10W /  70W |      0MiB / 15360MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces
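
The same sanity check can be sketched in plain Python, which is handy when `nvidia-smi` prints a table but later imports still fail. The snippet below is a minimal sketch, not part of the RAPIDS install script: `gpu_runtime_report` is a hypothetical helper name, and the two library names are simply the ones that appear in the tracebacks further down in this notebook.

```python
import ctypes
import shutil


def gpu_runtime_report(libs=("libcuda.so.1", "libnvidia-ml.so.1")):
    """Report whether nvidia-smi and the NVIDIA driver libraries are visible.

    Hypothetical helper: it only checks that each piece can be located or
    loaded, not that the GPU itself is usable.
    """
    report = {"nvidia-smi": shutil.which("nvidia-smi") is not None}
    for lib in libs:
        try:
            ctypes.CDLL(lib)  # raises OSError when the library cannot be loaded
            report[lib] = True
        except OSError:
            report[lib] = False
    return report


report = gpu_runtime_report()
for name, found in report.items():
    print(f"{name}: {'found' if found else 'MISSING'}")
```

If any entry is reported as missing, the runtime is probably not a GPU runtime, and the install and import cells below are likely to fail.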

# Setup:
This setup script:

1. Checks that the GPU is RAPIDS-compatible
1. Installs the **current stable version** of RAPIDSAI's core libraries using pip, which are:
   1. cuDF
   1. cuML
   1. cuGraph
   1. cuSpatial
   1. cuxfilter
   1. cuCIM
   1. XGBoost

**This will complete in about 5-6 minutes**

If you need the **nightly** releases of RAPIDSAI, use the [RAPIDS Conda Colab Template notebook](https://colab.research.google.com/drive/1TAAi_szMfWqRfHVfjGSqnGVLr_ztzUM9) and select the nightly parameter option when running the RAPIDS installation cell.


In [2]:
# This gets the RAPIDS-Colab install files and checks your GPU.  Run this and the next cell only.
# Please read the output of this cell.  If your Colab instance is not RAPIDS-compatible, it will warn you and give you remediation steps.
!git clone https://github.com/rapidsai/rapidsai-csp-utils.git
!python rapidsai-csp-utils/colab/pip-install.py
!pip install networkx ogb


Cloning into 'rapidsai-csp-utils'...
remote: Enumerating objects: 476, done.
remote: Counting objects: 100% (207/207), done.
remote: Compressing objects: 100% (116/116), done.
remote: Total 476 (delta 141), reused 124 (delta 91), pack-reused 269
Receiving objects: 100% (476/476), 131.59 KiB | 3.13 MiB/s, done.
Resolving deltas: 100% (243/243), done.
Collecting pynvml
  Downloading pynvml-11.5.0-py3-none-any.whl (53 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 53.1/53.1 kB 1.0 MB/s eta 0:00:00
Installing collected packages: pynvml
Successfully installed pynvml-11.5.0
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/pynvml/nvml.py", line 1798, in _LoadNvmlLibrary
    nvmlLib = CDLL("libnvidia-ml.so.1")
  File "/usr/lib/python3.10/ctypes/__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libnvidia-ml.so.1: cannot open shared object file: No such file or directory

During handling of the above excepti

# RAPIDS is now installed on Colab.  
You can copy your code into the cells below, or run them as they are to validate your RAPIDS installation and version.  
# Enjoy!

In [3]:
import cudf
cudf.__version__


stdout:



stderr:

Traceback (most recent call last):
  File "<string>", line 4, in <module>
  File "/usr/local/lib/python3.10/dist-packages/numba/cuda/cudadrv/driver.py", line 295, in __getattr__
    raise CudaSupportError("Error at driver init: \n%s:" %
numba.cuda.cudadrv.error.CudaSupportError: Error at driver init: 

CUDA driver library cannot be found.
If you are sure that a CUDA driver is installed,
try setting environment variable NUMBA_CUDA_DRIVER
with the file path of the CUDA driver shared library.
:


Not patching Numba


ImportError: 
================================================================
Failed to import CuPy.

If you installed CuPy via wheels (cupy-cudaXXX or cupy-rocm-X-X), make sure that the package matches with the version of CUDA or ROCm installed.

On Linux, you may need to set LD_LIBRARY_PATH environment variable depending on how you installed CUDA/ROCm.
On Windows, try setting CUDA_PATH environment variable.

Check the Installation Guide for details:
  https://docs.cupy.dev/en/latest/install.html

Original error:
  ImportError: libcuda.so.1: cannot open shared object file: No such file or directory
================================================================


In [None]:
import cuml
cuml.__version__

In [None]:
import cugraph
cugraph.__version__

In [None]:
import cuspatial
cuspatial.__version__

In [None]:
import cuxfilter
cuxfilter.__version__
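
The per-library cells above stop at the first failing import. As a minimal sketch (the helper name `rapids_versions` is hypothetical and not part of RAPIDS), the same check can be done in one pass that records a failure instead of raising, so a missing driver or package shows up as a single line per library:

```python
import importlib


def rapids_versions(names=("cudf", "cuml", "cugraph", "cuspatial", "cuxfilter")):
    """Try to import each module and collect its __version__ string."""
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except Exception as exc:  # ImportError, or a driver error raised at import time
            versions[name] = f"unavailable ({type(exc).__name__})"
        else:
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


for name, version in rapids_versions().items():
    print(f"{name}: {version}")
```

Catching the broad `Exception` is deliberate here: as the output above shows, a GPU library can fail at import time with errors other than `ImportError`.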

# Next Steps #

For an overview of how you can access and work with your own datasets in Colab, check out [this guide](https://towardsdatascience.com/3-ways-to-load-csv-files-into-colab-7c14fcbdcb92).

For more RAPIDS examples, check out our RAPIDS notebooks repos:
1. https://github.com/rapidsai/notebooks
2. https://github.com/rapidsai/notebooks-contrib

# Betweenness Centrality

In [None]:
# %load_ext cudf.pandas
# pandas API is now GPU accelerated

import pandas as pd
import networkx as nx
import cugraph as cg
from ogb.linkproppred import LinkPropPredDataset
dataset = LinkPropPredDataset(name='ogbl-ppa')

split_edge = dataset.get_edge_split()
train_edge, valid_edge, test_edge = split_edge["train"], split_edge["valid"], split_edge["test"]
graph = dataset[0]  # graph: library-agnostic graph object

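G = nx.Graph()

# edge_index has shape (2, num_edges): row 0 holds sources, row 1 holds targets.
# Add every edge; .tolist() converts NumPy integers to plain Python ints.
edge_index = graph['edge_index']
G.add_edges_from(zip(edge_index[0].tolist(), edge_index[1].tolist()))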
# Calculate betweenness centrality
bc_scores = cg.betweenness_centrality(G)

# Save in pd.DataFrame
results_df = pd.DataFrame(bc_scores.items(), columns=['Node', 'Betweenness Centrality'])

# Save it in csv
results_df.to_csv('betweenness_centrality.csv', index=False)
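
To make the score itself concrete, here is a minimal pure-Python sketch of betweenness centrality on a toy graph, using Brandes-style accumulation over breadth-first searches. It is an illustration only, not cuGraph's implementation: the function name and the adjacency-dict input are assumptions made for this example, and it returns unnormalized counts for an undirected, unweighted graph.

```python
from collections import deque


def betweenness_centrality(adj):
    """Unnormalized betweenness for an undirected, unweighted graph.

    adj maps each node to a list of its neighbours.
    """
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # BFS from s: count shortest paths (sigma) and record predecessors.
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        preds = {v: [] for v in adj}
        sigma[s], dist[s] = 1, 0
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:  # first visit
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating pair dependencies.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted once from each endpoint.
    return {v: score / 2.0 for v, score in bc.items()}


# Path graph 0 - 1 - 2 - 3: the two inner nodes each sit on two shortest paths.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
scores = betweenness_centrality(path)
print(scores)  # {0: 0.0, 1: 2.0, 2: 2.0, 3: 0.0}
```

Library implementations typically normalize these counts by default, so their values for the same graph will be scaled versions of the ones printed here.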


# Closeness Centrality

In [1]:
import cugraph as cg
import cudf
import pandas as pd
from ogb.linkproppred import LinkPropPredDataset

print('Loading dataset...')
dataset = LinkPropPredDataset(name='ogbl-ppa')

split_edge = dataset.get_edge_split()
train_edge, valid_edge, test_edge = split_edge["train"], split_edge["valid"], split_edge["test"]
graph = dataset[0]  # graph: library-agnostic graph object

print('Creating graph...')
# Extract edge data
edges_df = pd.DataFrame({'source': graph['edge_index'][0], 'target': graph['edge_index'][1]})
edges_cudf = cudf.from_pandas(edges_df)

# Initialize CuGraph graph directly from edge list
G = cg.Graph()
G.from_cudf_edgelist(edges_cudf, source='source', destination='target')

print('Calculating closeness centrality...')
# Calculate closeness centrality
cc_scores = cg.closeness_centrality(G)

# Convert closeness centrality scores to dictionary
results = cc_scores.to_pandas().set_index('vertex')['closeness_centrality'].to_dict()

# Save in pd.DataFrame
print('Saving results...')
results_df = pd.DataFrame(results.items(), columns=['Node', 'Closeness Centrality'])

# Save results in csv
results_df.to_csv('closeness_centrality.csv', index=False)

print('Done!')


ModuleNotFoundError: No module named 'cugraph'
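
For reference while the GPU stack is unavailable, the quantity being computed can be sketched in pure Python on a toy graph. This is a minimal illustration under stated assumptions, not the cuGraph routine: the function name and adjacency-dict input are made up for this example, the graph is assumed to be connected and unweighted, and closeness is taken as the number of reachable nodes divided by the sum of shortest-path distances to them.

```python
from collections import deque


def closeness_centrality(adj):
    """Closeness of every node in a connected, unweighted graph.

    adj maps each node to a list of its neighbours.
    """
    scores = {}
    for s in adj:
        # BFS from s gives the shortest-path distance to every reachable node.
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total = sum(dist.values())
        reachable = len(dist) - 1
        scores[s] = reachable / total if total > 0 else 0.0
    return scores


# Path graph 0 - 1 - 2: the middle node is closest to everything else.
path = {0: [1], 1: [0, 2], 2: [1]}
scores = closeness_centrality(path)
for node, score in scores.items():
    print(node, round(score, 3))
```

One BFS per node is fine for a toy graph but far too slow for `ogbl-ppa`, which is why the cell above reaches for a GPU library. Conventions for disconnected graphs differ between libraries, so values there may not match this sketch.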