<a href="https://colab.research.google.com/github/dcolinmorgan/grph/blob/main/clean_gfql_cpu_gpu_benchmark.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# GFQL CPU, GPU Benchmark

This notebook examines GFQL property graph query performance on 1-8 hop queries using CPU + GPU modes on various real-world 100K - 100M edge graphs. The data comes from a variety of popular social networks. The single-threaded CPU mode benefits from GFQL's novel dataframe engine, and the GPU mode further adds single-GPU acceleration. Both the `chain()` and `hop()` methods are examined.

The benchmark does not examine bigger-than-memory and distributed scenarios. The provided results here are from running on a free Google Colab T4 runtime, with a 2.2GHz Intel CPU (12 GB CPU RAM) and T4 Nvidia GPU (16 GB GPU RAM).

## Data
From [SNAP](https://snap.stanford.edu/data/)

| Network | Nodes     | Edges        |
|-------------|-----------|--------------|
| **Facebook**| 4,039     | 88,234       |
| **Twitter** | 81,306    | 2,420,766    |
| **GPlus**   | 107,614   | 30,494,866   |
| **Orkut**   | 3,072,441 | 117,185,082  |

## Results

Definitions:

* GTEPS: Giga (billion) edges traversed per second

* T edges / \$: Estimated trillion edges traversed for 1\$ USD based on observed GTEPS and a 3yr AWS reservation (as of 12/2023)
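As a worked example of the two definitions above, the sketch below converts a GTEPS figure into trillions of edges per dollar. The hourly price is an assumption: roughly \$0.21/hour is what the g4dn.xl column implies when back-computed from the table rows, and it is not a quoted AWS price. The function names are illustrative and not part of GFQL.

```python
def gteps(edges_traversed, seconds):
    """Giga (billion) edges traversed per second."""
    return edges_traversed / seconds / 1e9


def trillion_edges_per_dollar(gteps_value, usd_per_hour):
    """Trillions of edges traversed for 1 USD at a given hourly instance price."""
    edges_per_hour = gteps_value * 1e9 * 3600  # 3600 seconds per hour
    return edges_per_hour / usd_per_hour / 1e12


# Assumption: hourly price back-derived from the results table, not an official quote.
GPU_USD_PER_HOUR = 0.21

# 12.15 GPU GTEPS (the Orkut chain() row) works out to about 208.3 T edges / $.
print(round(trillion_edges_per_dollar(12.15, GPU_USD_PER_HOUR), 1))
```

The same function applied with an hourly price near \$0.036 approximately reproduces the t3.l column.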

Tasks:

1. `chain()` - includes complex pre/post processing

  **Task**: `g.chain([n({'id': some_id}), e_forward(hops=some_n)])`


| **Dataset** | Max GPU Speedup      | CPU GTEPS   | GPU GTEPS   | T CPU edges / \$ (t3.l) | T GPU edges / \$ (g4dn.xl) |
|-------------|--------------|-------------|-------------|----------------------------|--------------------------------|
| **Facebook**| 1.1X  | 0.66 | 0.61 | 65.7                | 10.4                    |
| **Twitter** | 17.4X   | 0.17 | 2.81 | 16.7                | 48.1                    |
| **GPlus**   | 43.8X  | 0.09 | 2.87 | 8.5                | 49.2                    |
| **Orkut**   | N/A            | N/A         | 12.15 | N/A                        | 208.3                    |
| **AVG** | 20.7X | 0.30 | 4.61 | 30.3 | 79.0 |
| **MAX** | 43.8X | 0.66 | 12.15 | 65.7 | 208.3 |


2. `hop()` - core property search primitive similar to BFS

  **Task**: `g.hop(nodes=[some_id], direction='forward', hops=some_n)`


| **Dataset** | Max GPU Speedup | CPU GTEPS | GPU GTEPS | T CPU edges / \$ (t3.l) | T GPU edges / \$ (g4dn.xl) |
|-------------|-------------|-----------|-----------|--------------------|--------------------------------|
| **Facebook**| 3X          | 0.47      | 1.47     | 47.0        | 25.2                    |
| **Twitter** | 42X         | 0.50      | 10.51      | 50.2        | 180.2                    |
| **GPlus**   | 21X         | 0.26      | 4.11       | 26.2        | 70.4                    |
| **Orkut**   | N/A         | N/A       | 41.50     | N/A                | 711.4                    |
| **AVG** | 22X | 0.41 | 14.4 | 41.1 | 246.8 |
| **MAX** | 42X | 0.50 | 41.50 | 50.2 | 711.4 |
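To make the `hop()` task concrete, here is a minimal pure-Python sketch of a forward k-hop expansion over an edge list. It only illustrates the BFS-like semantics; it is not GFQL's dataframe implementation, and `forward_hops` is a hypothetical name.

```python
from collections import defaultdict


def forward_hops(edges, start_nodes, hops):
    """Follow edges in their stored direction for up to `hops` steps.

    Returns (reached_nodes, traversed_edges) as sets.
    """
    out_edges = defaultdict(list)
    for s, d in edges:
        out_edges[s].append(d)

    reached = set(start_nodes)
    frontier = set(start_nodes)
    traversed = set()
    for _ in range(hops):
        next_frontier = set()
        for s in frontier:
            for d in out_edges[s]:
                traversed.add((s, d))
                if d not in reached:
                    next_frontier.add(d)
        reached |= next_frontier
        frontier = next_frontier
        if not frontier:  # nothing new to expand
            break
    return reached, traversed


edges = [(0, 1), (0, 2), (1, 3), (3, 4)]
nodes, used = forward_hops(edges, [0], hops=2)
print(sorted(nodes))  # [0, 1, 2, 3]
print(sorted(used))   # [(0, 1), (0, 2), (1, 3)]
```

Node 4 is three hops from node 0, so a 2-hop query does not reach it.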


## Optional: GPU setup - Google Colab

In [None]:
# Report GPU used when GPU benchmarking
! nvidia-smi

Mon Feb 19 04:14:57 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05             Driver Version: 535.104.05   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|   0  Tesla T4                       Off | 00000000:00:04.0 Off |                    0 |
| N/A   39C    P8               9W /  70W |      0MiB / 15360MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                    

In [None]:
# if in google colab
# !git clone https://github.com/rapidsai/rapidsai-csp-utils.git
# !python rapidsai-csp-utils/colab/pip-install.py
!pip install --extra-index-url=https://pypi.nvidia.com cuml-cu12 cudf-cu12 #==23.12.00 #cugraph-cu11 pylibraft_cu11 raft_dask_cu11 dask_cudf_cu11 pylibcugraph_cu11 pylibraft_cu11


In [None]:
import cudf
cudf.__version__

'24.02.01'

# 1. Install & configure

In [None]:
#! pip install graphistry[igraph]

!pip install -q igraph
!pip install -q graphistry



## Imports

In [None]:
import numpy as np  # np.round is used when printing timings
import pandas as pd

import graphistry, time

from graphistry import (

    # graph operators
    n, e_undirected, e_forward, e_reverse,

    # attribute predicates
    is_in, ge, startswith, contains, match as match_re
)
graphistry.__version__

'0.33.0'

In [None]:
import cudf

In [None]:
#work around google colab shell encoding bugs

import locale
locale.getpreferredencoding = lambda: "UTF-8"

# 2. Perf benchmarks

## Facebook: 88K edges

In [None]:
df = pd.read_csv('https://raw.githubusercontent.com/graphistry/pygraphistry/master/demos/data/facebook_combined.txt', sep=' ', names=['s', 'd'])
print(df.shape)
df.head(5)

(88234, 2)


Unnamed: 0,s,d
0,0,1
1,0,2
2,0,3
3,0,4
4,0,5


In [None]:
fg = graphistry.edges(df, 's', 'd').materialize_nodes()
print(fg._nodes.shape, fg._edges.shape)
fg._nodes.head(5)

(4039, 1) (88234, 2)


Unnamed: 0,id
0,0
1,1
2,2
3,3
4,4


With 2- and 5-hop `chain` comparisons, we see a slight to negligible speedup from converting the graph's dataframes to `cudf`:

In [None]:
for n_hop in [2,5]:
    start0 = time.time()
    for i in range(100):
        fg2 = fg.chain([n({'id': 0}), e_forward(hops=n_hop)])  # using n notation
    mid0 = time.time()
    for i in range(100):
        fg2 = fg.chain([e_forward(source_node_match={'id': 0}, hops=n_hop)])  # using source_node_match in e_forward
    end0 = time.time()
    T0 = mid0-start0
    T1 = end0-mid0
    fg_gdf = fg.nodes(lambda g: cudf.DataFrame(g._nodes)).edges(lambda g: cudf.DataFrame(g._edges))
    start1 = time.time()
    for i in range(100):
        fg2 = fg_gdf.chain([n({'id': 0}), e_forward(hops=n_hop)])
    mid1 = time.time()
    for i in range(100):
        fg2 = fg_gdf.chain([e_forward(source_node_match={'id': 0}, hops=n_hop)])
    end1 = time.time()
    # print(fg._nodes.shape, fg._edges.shape)
    # print(fg2._nodes.shape, fg2._edges.shape)
    del fg_gdf
    del fg2
    T2 = mid1-start1
    T3 = end1-mid1
    print('\nhops:',n_hop,'\nCPU n_notation time:',np.round(T0,4),'\nGPU n_notation time:',np.round(T2,4),'\nn_notation speedup:', np.round(T0/T2,4),
          '\nCPU source_node_match time:',np.round(T1,4),'\nGPU source_node_match time:',np.round(T3,4),'\nsource_node_match speedup:', np.round(T1/T3,4))

hops: 2 
CPU n_notation time: 13.6357 
GPU n_notation time: 12.2177 
n_notation speedup: 1.1161 
CPU source_node_match time: 21.2028 
GPU source_node_match time: 14.3844 
source_node_match speedup: 1.474
hops: 5 
CPU n_notation time: 36.8941 
GPU n_notation time: 21.3562 
n_notation speedup: 1.7276 
CPU source_node_match time: 17.8739 
GPU source_node_match time: 14.8514 
source_node_match speedup: 1.2035
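The timing loops in this notebook all follow the same pattern: run a query a fixed number of times, take the total wall-clock time, and divide the CPU total by the GPU total. A minimal helper for that pattern might look like the sketch below. `time_repeated` and `speedup` are hypothetical names, and the stand-in workload exists only so the block runs without a graph loaded.

```python
import time


def time_repeated(fn, repeats=10):
    """Return total wall-clock seconds for `repeats` calls of fn()."""
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    return time.perf_counter() - start


def speedup(cpu_seconds, gpu_seconds):
    """Ratio of CPU time to GPU time; values above 1 favor the GPU."""
    return cpu_seconds / gpu_seconds


# Stand-in workload so the sketch is self-contained.
elapsed = time_repeated(lambda: sum(range(10_000)), repeats=5)
print(f"5 runs took {elapsed:.4f}s")

# Using the 2-hop n_notation totals recorded above:
print(round(speedup(13.6357, 12.2177), 4))  # 1.1161
```

In the notebook, the callable would wrap the query itself, for example `time_repeated(lambda: fg.chain([n({'id': 0}), e_forward(hops=2)]), repeats=100)`.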


With simple 2- and 5-hop `hop` comparisons, we see a roughly 2x speedup from converting the graph's dataframes to `cudf`:

In [None]:
for n_hop in [2,5]:
    start_nodes = pd.DataFrame({fg._node: [0]})
    start0 = time.time()
    for i in range(100):
        fg2 = fg.hop(
            nodes=start_nodes,
            direction='forward',
            hops=n_hop)
    end0 = time.time()
    T0 = end0-start0
    start_nodes = cudf.DataFrame({fg._node: [0]})
    fg_gdf = fg.nodes(cudf.from_pandas(fg._nodes)).edges(cudf.from_pandas(fg._edges))
    start1 = time.time()
    for i in range(100):
        fg2 = fg_gdf.hop(
            nodes=start_nodes,
            direction='forward',
            hops=n_hop)
    end1 = time.time()
    # print(fg._nodes.shape, fg._edges.shape)
    # print(fg2._nodes.shape, fg2._edges.shape)
    del fg_gdf
    del fg2
    T1 = end1-start1
    print('\nCPU',n_hop,'hop time:',np.round(T0,4),'\nGPU',n_hop,'hop time:',np.round(T1,4),'\nspeedup:', np.round(T0/T1,4))


CPU 2 hop time: 5.7415 
GPU 2 hop time: 2.7301 
2 hop speedup: 2.103

CPU 5 hop time: 14.3391 
GPU 5 hop time: 6.9998 
5 hop speedup: 2.0485


## Twitter

- edges: 2420766
- nodes: 81306

In [None]:
! wget 'https://snap.stanford.edu/data/twitter_combined.txt.gz'
#! curl -L 'https://snap.stanford.edu/data/twitter_combined.txt.gz' -o twitter_combined.txt.gz

--2024-02-20 09:48:59--  https://snap.stanford.edu/data/twitter_combined.txt.gz
Resolving snap.stanford.edu (snap.stanford.edu)... 171.64.75.80
Connecting to snap.stanford.edu (snap.stanford.edu)|171.64.75.80|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 10621918 (10M) [application/x-gzip]
Saving to: ‘twitter_combined.txt.gz’


2024-02-20 09:49:00 (9.10 MB/s) - ‘twitter_combined.txt.gz’ saved [10621918/10621918]



In [None]:
! gunzip twitter_combined.txt.gz

In [None]:
! head -n 5 twitter_combined.txt

214328887 34428380
17116707 28465635
380580781 18996905
221036078 153460275
107830991 17868918


In [None]:
te_df = pd.read_csv('twitter_combined.txt', sep=' ', names=['s', 'd'])
te_df.shape

(2420766, 2)

In [None]:
import graphistry

In [None]:
g = graphistry.edges(te_df, 's', 'd').materialize_nodes()
g._nodes.shape

(81306, 1)

On the Twitter data, simple `chain` operations over several hop counts show **10-20x** speed increases:

In [None]:
for n_hop in [1,2,8]:
    start0 = time.time()
    for i in range(10):
        g2 = g.chain([n({'id': 17116707}), e_forward(hops=n_hop)])
    end0 = time.time()
    T0 = end0-start0
    g_gdf = g.nodes(lambda g: cudf.DataFrame(g._nodes)).edges(lambda g: cudf.DataFrame(g._edges))
    start1 = time.time()
    for i in range(10):
        out = g_gdf.chain([n({'id': 17116707}), e_forward(hops=n_hop)])._nodes
    end1 = time.time()
    # print(fg._nodes.shape, fg._edges.shape)
    # print(fg2._nodes.shape, fg2._edges.shape)
    del g_gdf
    del out
    T1 = end1-start1
    print('\nCPU',n_hop,'hop chain time:',np.round(T0,4),'\nGPU',n_hop,'hop chain time:',np.round(T1,4),'\nspeedup:', np.round(T0/T1,4))


CPU 1 hop chain time: 20.1676 
GPU 1 hop chain time: 1.0259 
 1 hop chain speedup: 19.6579

CPU 2 hop chain time: 21.7168 
GPU 2 hop chain time: 2.2507 
 2 hop chain speedup: 9.6488

CPU 8 hop chain time: 157.5035 
GPU 8 hop chain time: 7.8694 
 8 hop chain speedup: 20.0147


and similarly for these `hop` operations -- **10-40x** speed increases

In [None]:
for n_hop in [1,2,8]:
    start_nodes = pd.DataFrame({g._node: [17116707]})
    start0 = time.time()
    for i in range(10):
      g2 = g.hop(
          nodes=start_nodes,
          direction='forward',
          hops=n_hop)
    end0 = time.time()
    T0 = end0-start0
    start_nodes = cudf.DataFrame({g._node: [17116707]})
    g_gdf = g.nodes(cudf.from_pandas(g._nodes)).edges(cudf.from_pandas(g._edges))
    start1 = time.time()
    for i in range(10):
        g2 = g_gdf.hop(
            nodes=start_nodes,
            direction='forward',
            hops=n_hop)  # use the same hop count as the CPU run
    end1 = time.time()
    # print(fg._nodes.shape, fg._edges.shape)
    # print(fg2._nodes.shape, fg2._edges.shape)
    del start_nodes
    del g_gdf
    del g2
    T1 = end1-start1
    print('\nCPU',n_hop,'hop chain time:',np.round(T0,4),'\nGPU',n_hop,'hop chain time:',np.round(T1,4),'\nspeedup:', np.round(T0/T1,4))


CPU 1 hop chain time: 12.3446 
GPU 1 hop chain time: 1.204 
speedup: 10.2526

CPU 2 hop chain time: 13.2377 
GPU 2 hop chain time: 1.1608 
speedup: 11.4037

CPU 8 hop chain time: 52.2491 
GPU 8 hop chain time: 1.2148 
speedup: 43.012


## GPlus

- edges: 30494866
- nodes: 107614

In [None]:
! wget https://snap.stanford.edu/data/gplus_combined.txt.gz

--2024-02-20 09:59:24--  https://snap.stanford.edu/data/gplus_combined.txt.gz
Resolving snap.stanford.edu (snap.stanford.edu)... 171.64.75.80
Connecting to snap.stanford.edu (snap.stanford.edu)|171.64.75.80|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 398930514 (380M) [application/x-gzip]
Saving to: ‘gplus_combined.txt.gz’


2024-02-20 09:59:34 (38.5 MB/s) - ‘gplus_combined.txt.gz’ saved [398930514/398930514]



In [None]:
! gunzip gplus_combined.txt.gz

In [None]:
%%time
ge_df = pd.read_csv('gplus_combined.txt', sep=' ', names=['s', 'd'])
print(ge_df.shape)
ge_df.head(5)

(30494866, 2)
CPU times: user 16.8 s, sys: 1.41 s, total: 18.2 s
Wall time: 18.4 s


Unnamed: 0,s,d
0,116374117927631468606,101765416973555767821
1,112188647432305746617,107727150903234299458
2,116719211656774388392,100432456209427807893
3,117421021456205115327,101096322838605097368
4,116407635616074189669,113556266482860931616


In [None]:
%%time
gg = graphistry.edges(ge_df, 's', 'd').materialize_nodes()
gg = graphistry.edges(ge_df, 's', 'd').nodes(gg._nodes, 'id')
print(gg._edges.shape, gg._nodes.shape)
gg._nodes.head(5)

(30494866, 2) (107614, 1)
CPU times: user 4.41 s, sys: 1.29 s, total: 5.7 s
Wall time: 5.69 s


Unnamed: 0,id
0,116374117927631468606
1,112188647432305746617
2,116719211656774388392
3,117421021456205115327
4,116407635616074189669


In [None]:
%%time
gg.chain([ n({'id': '116374117927631468606'})])._nodes

CPU times: user 471 ms, sys: 307 ms, total: 779 ms
Wall time: 776 ms


Unnamed: 0,id
0,116374117927631468606


on the GPlus data, simpler `chain` operations over several different hops -- **100-200x** speed increases

In [None]:
for n_hop in [1,2,3,4,5]:
    start0 = time.time()
    out = gg.chain([ n({'id': '116374117927631468606'}), e_forward(hops=n_hop)])._nodes
    end0 = time.time()
    T0 = end0-start0
    gg_gdf = gg.nodes(lambda g: cudf.DataFrame(g._nodes)).edges(lambda g: cudf.DataFrame(g._edges))
    start1 = time.time()
    out = gg_gdf.chain([ n({'id': '116374117927631468606'}), e_forward(hops=n_hop)])
    end1 = time.time()
    # print(fg._nodes.shape, fg._edges.shape)
    # print(fg2._nodes.shape, fg2._edges.shape)
    del gg_gdf
    del out
    T1 = end1-start1
    print('\nCPU',n_hop,'hop chain time:',np.round(T0,4),'\nGPU',n_hop,'hop chain time:',np.round(T1,4),'\nspeedup:', np.round(T0/T1,4))


CPU 1 hop chain time: 70.7013 
GPU 1 hop chain time: 0.2911 
speedup: 242.9049

CPU 2 hop chain time: 84.2395 
GPU 2 hop chain time: 0.6138 
speedup: 137.252


KeyboardInterrupt: 

and similarly for these hop operations -- **100x** speed increases

In [None]:
for n_hop in [1,2,3,4,5]:
    start_nodes = pd.DataFrame({gg._node: ['116374117927631468606']})
    start0 = time.time()
    for i in range(1):
      g2 = gg.hop(
          nodes=start_nodes,
          direction='forward',
          hops=n_hop)
    end0 = time.time()
    T0 = end0-start0
    start_nodes = cudf.DataFrame({gg._node: ['116374117927631468606']})
    gg_gdf = gg.nodes(cudf.from_pandas(gg._nodes)).edges(cudf.from_pandas(gg._edges))
    start1 = time.time()
    for i in range(1):
      g2 = gg_gdf.hop(
          nodes=start_nodes,
          direction='forward',
          hops=n_hop)
    end1 = time.time()
    # print(fg._nodes.shape, fg._edges.shape)
    # print(fg2._nodes.shape, fg2._edges.shape)
    del start_nodes
    del gg_gdf
    del g2
    T1 = end1-start1
    print('\nCPU',n_hop,'hop chain time:',np.round(T0,4),'\nGPU',n_hop,'hop chain time:',np.round(T1,4),'\nspeedup:', np.round(T0/T1,4))


CPU 1 hop chain time: 38.0714 
GPU 1 hop chain time: 0.2615 
speedup: 145.5678

CPU 2 hop chain time: 52.949 
GPU 2 hop chain time: 0.4553 
speedup: 116.2876


## Orkut
- 117M edges
- 3M nodes

In [None]:
! wget https://snap.stanford.edu/data/bigdata/communities/com-orkut.ungraph.txt.gz

--2024-02-19 06:02:00--  https://snap.stanford.edu/data/bigdata/communities/com-orkut.ungraph.txt.gz
Resolving snap.stanford.edu (snap.stanford.edu)... 171.64.75.80
Connecting to snap.stanford.edu (snap.stanford.edu)|171.64.75.80|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 447251958 (427M) [application/x-gzip]
Saving to: ‘com-orkut.ungraph.txt.gz’


2024-02-19 06:02:11 (37.4 MB/s) - ‘com-orkut.ungraph.txt.gz’ saved [447251958/447251958]



In [None]:
! gunzip com-orkut.ungraph.txt.gz

In [None]:
! head -n 7 com-orkut.ungraph.txt

# Undirected graph: ../../data/output/orkut.txt
# Orkut
# Nodes: 3072441 Edges: 117185083
# FromNodeId	ToNodeId
1	2
1	3
1	4


In [None]:
import time

import numpy as np
import pandas as pd

import graphistry

from graphistry import (

    # graph operators
    n, e_undirected, e_forward, e_reverse,

    # attribute predicates
    is_in, ge, startswith, contains, match as match_re
)

import cudf

#work around google colab shell encoding bugs
import locale
locale.getpreferredencoding = lambda: "UTF-8"

cudf.__version__, graphistry.__version__

('24.02.01', '0.33.0')

In [None]:
! nvidia-smi

Mon Feb 19 06:02:29 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05             Driver Version: 535.104.05   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|   0  Tesla T4                       Off | 00000000:00:04.0 Off |                    0 |
| N/A   64C    P0              29W /  70W |    111MiB / 15360MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                    

In [None]:
%%time
co_df = cudf.read_csv('com-orkut.ungraph.txt', sep='\t', names=['s', 'd'], skiprows=4).to_pandas()  # skip only the 4 '#' header lines
print(co_df.shape)
print(co_df.head(5))
print(co_df.dtypes)
#del co_df

(117185082, 2)
   s  d
0  1  3
1  1  4
2  1  5
3  1  6
4  1  7
s    int64
d    int64
dtype: object
CPU times: user 2.34 s, sys: 1.29 s, total: 3.63 s
Wall time: 3.77 s
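The shape and dtypes printed above are enough for a rough estimate of how much memory the raw edge list needs, which matters on a runtime with 12 GB of CPU RAM and a 16 GB GPU. The sketch below counts only the two `int64` columns; it ignores the dataframe index, the node table, and any intermediate results a query allocates. `edge_list_bytes` is a hypothetical helper name.

```python
def edge_list_bytes(num_edges, num_columns=2, bytes_per_value=8):
    """Raw size of an edge list stored as fixed-width columns (int64 = 8 bytes)."""
    return num_edges * num_columns * bytes_per_value


orkut_bytes = edge_list_bytes(117_185_082)
print(f"{orkut_bytes / 1e9:.2f} GB")  # 1.87 GB
```

So the Orkut edge list alone fits comfortably in GPU memory, and the headroom that remains is what traversal intermediates have to work within.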


Build the graph on the GPU, where materializing nodes is fast, then move the nodes and edges back to the CPU as pandas dataframes:

In [None]:
%%time
co_g = graphistry.edges(cudf.DataFrame(co_df), 's', 'd').materialize_nodes(engine='cudf')
co_g = co_g.nodes(lambda g: g._nodes.to_pandas()).edges(lambda g: g._edges.to_pandas())
print(co_g._nodes.shape, co_g._edges.shape)
co_g._nodes.head(5)

(3072441, 1) (117185082, 2)
CPU times: user 2.06 s, sys: 7.93 s, total: 10 s
Wall time: 11.2 s


Unnamed: 0,id
0,1
1,2
2,3
3,4
4,5


on the Orkut data, simpler chain operations over several different hops -- **10-50x** speed increases

In [None]:
for n_hop in [1,2,3,4,5,6]:
    # CPU (pandas) run
    start0 = time.time()
    for i in range(10):
        out = co_g.chain([ n({'id': 1}), e_forward(hops=n_hop)])._nodes
    end0 = time.time()
    T0 = end0-start0
    # GPU (cudf) run
    co_gdf = co_g.nodes(lambda g: cudf.DataFrame(g._nodes)).edges(lambda g: cudf.DataFrame(g._edges))
    start1 = time.time()
    for i in range(10):
        out = co_gdf.chain([ n({'id': 1}), e_forward(hops=n_hop)])
    end1 = time.time()
    del co_gdf
    del out
    T1 = end1-start1
    print('\nCPU',n_hop,'hop chain time:',np.round(T0,4),'\nGPU',n_hop,'hop chain time:',np.round(T1,4),'\nspeedup:', np.round(T0/T1,4))

and similarly for these hop operations -- 10-40x speed increases

In [None]:
for n_hop in [1,2,3,4,5]:
    start_nodes = pd.DataFrame({'id': [1]})
    start0 = time.time()
    for i in range(1):
      g2 = co_g.hop(
          nodes=start_nodes,
          direction='forward',
          hops=n_hop)
    end0 = time.time()
    T0 = end0-start0
    start_nodes = cudf.DataFrame({'id': [1]})
    co_gdf = co_g.nodes(lambda g: cudf.DataFrame(g._nodes)).edges(lambda g: cudf.DataFrame(g._edges))
    start1 = time.time()
    for i in range(1):
      g2 = co_gdf.hop(
          nodes=start_nodes,
          direction='forward',
          hops=n_hop)
    end1 = time.time()
    # print(fg._nodes.shape, fg._edges.shape)
    # print(fg2._nodes.shape, fg2._edges.shape)
    del start_nodes
    del co_gdf
    del g2
    T1 = end1-start1
    print('\nCPU',n_hop,'hop chain time:',np.round(T0,4),'\nGPU',n_hop,'hop chain time:',np.round(T1,4),'\nspeedup:', np.round(T0/T1,4))

In [None]:
!lscpu


In [None]:
!free -h


               total        used        free      shared  buff/cache   available
Mem:            12Gi       5.8Gi       1.6Gi       1.0Gi       5.2Gi       5.5Gi
Swap:             0B          0B          0B
