# HyperBall-based approximated Harmonic and Closeness centrality with GRAPE
Graphs are an important tool in many areas of computer science, such as social networks, transportation networks, and recommendation systems. One of the key challenges when dealing with large graphs is computing various graph metrics, such as harmonic centrality and closeness centrality, in a reasonable amount of time.

HyperBall is an algorithm for computing approximate graph metrics, such as centrality, on large graphs. The GRAPE library contains many features, including a Rust work-stealing implementation of HyperBall that provides a simple and efficient way to compute centrality metrics on large graphs.

In this tutorial, we will use GRAPE to compute approximate harmonic and closeness centrality metrics on a decently large graph. We will begin by providing an overview of the GRAPE library and its usage. Then, we will describe the HyperBall algorithm and how it can be used to compute centrality metrics. Finally, we will walk through an example of using GRAPE to compute centrality metrics on a large graph and discuss best practices for performance optimization. By the end of this tutorial, you will have a good understanding of how to use GRAPE to compute centrality metrics on large graphs efficiently.


## What is GRAPE
[🍇🍇 GRAPE 🍇🍇](https://github.com/AnacletoLAB/grape) is a graph processing and embedding library that enables users to easily manipulate and analyze graphs. With [GRAPE](https://github.com/AnacletoLAB/grape), users can efficiently load and preprocess graphs, generate random walks, and apply various node and edge embedding models. Additionally, [GRAPE](https://github.com/AnacletoLAB/grape) provides a fair and reproducible evaluation pipeline for comparing different graph embedding and graph-based prediction methods.

![features in GRAPE](https://github.com/AnacletoLAB/grape/raw/main/images/sequence_diagram.png?raw=true)

## Closeness centrality
Closeness centrality is a measure of the degree to which a node is close to all other nodes in a network. The intuition behind this measure is that a node that is located close to other nodes is more likely to be important in terms of transmitting information or influence through the network.

Formally, the closeness centrality of a node $v$ in a connected graph $G$ is defined as the reciprocal of the sum of the shortest path distances between $v$ and all other nodes in the graph:
$$C_c(v) = \frac{1}{\sum_{u \in V} d(u, v)}$$

where $d(u,v)$ is the shortest path distance between nodes $u$ and $v$, and $V$ is the set of all nodes in the graph.

Note that the closeness centrality of a node is high if it is close to many other nodes in the network. Conversely, a node with low closeness centrality is located far away from other nodes in the network.
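To make the definition concrete, here is a minimal sketch that computes the exact closeness centrality of a toy undirected graph with a breadth-first search. It is not GRAPE's implementation: the function names and the adjacency-dictionary representation are assumptions made for this illustration, as is the convention of returning 0 for a node that reaches no other node.

```python
from collections import deque


def bfs_distances(adjacency, source):
    """Return a dict mapping every node reachable from source to its hop distance."""
    distances = {source: 0}
    frontier = deque([source])
    while frontier:
        node = frontier.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in distances:
                distances[neighbour] = distances[node] + 1
                frontier.append(neighbour)
    return distances


def closeness_centrality(adjacency, node):
    """Reciprocal of the sum of the distances between node and the nodes it reaches."""
    total = sum(bfs_distances(adjacency, node).values())
    # Assumed convention for this sketch: a node that reaches nothing gets 0.
    return 1.0 / total if total > 0 else 0.0


# Toy undirected path graph: 0 - 1 - 2 - 3
path_graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
closeness = {node: closeness_centrality(path_graph, node) for node in path_graph}
```

In this path graph the two inner nodes have a sum of distances of $1 + 1 + 2 = 4$, while the two endpoints have $1 + 2 + 3 = 6$, so the inner nodes come out as more central.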

[We have covered how to compute Closeness centrality with GRAPE in this past tutorial](https://github.com/AnacletoLAB/grape/blob/main/tutorials/Efficient%20Weighted%20and%20Unweighted%20Closeness%20Centrality%20with%20GRAPE.ipynb)

## Harmonic centrality
Harmonic centrality is another measure of centrality in a network, obtained by summing the reciprocals of the distances between a node and all other nodes in the network. Formally, the harmonic centrality $C_h(v)$ of a node $v$ is defined as:

$$C_h(v) = \sum_{u \in V,\, u \neq v} \frac{1}{d(u, v)}$$

where $d(u,v)$ is the shortest path distance between nodes $u$ and $v$. The harmonic centrality measures the extent to which a node can reach other nodes in the network and be reached by them. Nodes with higher harmonic centrality are those that are closer to many other nodes in the network.

Like closeness centrality, harmonic centrality is often normalized to lie in the range $[0,1]$, where a value of 1 represents a node that is maximally central in the network. However, it is less sensitive to outliers and is generally better suited to networks that are not strongly connected: an unreachable node has infinite distance and simply contributes $0$ to the harmonic sum, whereas it makes the sum of distances in the closeness formula diverge.
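The behaviour on disconnected graphs can be seen in a small sketch. As before, this is an illustrative breadth-first implementation with assumed function names, not the code GRAPE runs: nodes that are never reached by the search are simply absent from the sum, which is the same as adding $1/\infty = 0$.

```python
from collections import deque


def bfs_distances(adjacency, source):
    """Return a dict mapping every node reachable from source to its hop distance."""
    distances = {source: 0}
    frontier = deque([source])
    while frontier:
        node = frontier.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in distances:
                distances[neighbour] = distances[node] + 1
                frontier.append(neighbour)
    return distances


def harmonic_centrality(adjacency, node):
    """Sum of the reciprocal distances to every other reachable node."""
    distances = bfs_distances(adjacency, node)
    # Unreachable nodes never appear in `distances`, so they contribute 0.
    return sum(1.0 / d for other, d in distances.items() if other != node)


# Toy graph with two components: 0 - 1 - 2 and 3 - 4
graph = {0: [1], 1: [0, 2], 2: [1], 3: [4], 4: [3]}
harmonic = {node: harmonic_centrality(graph, node) for node in graph}
```

Every node receives a finite, comparable score even though the graph is disconnected, which is why harmonic centrality is a natural fit for graphs with several components.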

We will cover harmonic centrality extensively in a future tutorial.

## What is HyperBall
[HyperBall](https://vigna.di.unimi.it/ftp/papers/HyperBall.pdf) is an algorithm for, among other things, approximating the closeness and harmonic centrality of nodes in large graphs. The algorithm is based on a probabilistic data structure called [HyperLogLog](https://github.com/LucaCappelletti94/hyperloglog-rs) and works by estimating the size of the "balls" of nodes that surround each node, that is, the sets of nodes within an increasing number of hops from it.

The basic idea behind HyperBall is to use HyperLogLog counters to estimate the number of distinct nodes within distance $k$ of each node. Specifically, the algorithm maintains a set of HyperLogLog counters, one for each node in the graph.

HyperBall initializes each HyperLogLog counter by inserting the node it belongs to (the first node into the first counter, the second node into the second counter, and so on). After initialization, the algorithm iteratively replaces the counter of each node with the union of that counter and the counters of its neighbours. The union of HyperLogLog counters is computed register by register: each register of the result is the maximum of the corresponding registers across the counters involved. This process continues until the counters converge, i.e. until an iteration leaves every counter unchanged.
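The register-wise maximum is easier to picture with a toy counter. The class below is a deliberately simplified HyperLogLog written for this tutorial, assuming a 64-bit BLAKE2 hash and the standard bias constant for 128 or more registers; it is not the `hyperloglog-rs` implementation used by GRAPE, and the class and method names are hypothetical.

```python
import hashlib
import math


class ToyHyperLogLog:
    """Simplified HyperLogLog counter with 2**precision registers."""

    def __init__(self, precision=8):
        self.precision = precision
        self.registers = [0] * (1 << precision)

    def insert(self, item):
        # Hash the item to a 64-bit integer.
        digest = hashlib.blake2b(str(item).encode(), digest_size=8).digest()
        value = int.from_bytes(digest, "big")
        # The top `precision` bits select the register.
        remaining_bits = 64 - self.precision
        index = value >> remaining_bits
        # The rank is the position of the first 1 bit in the remaining bits.
        remainder = value & ((1 << remaining_bits) - 1)
        rank = remaining_bits - remainder.bit_length() + 1
        self.registers[index] = max(self.registers[index], rank)

    def union(self, other):
        # The union of two counters is the register-wise maximum.
        merged = ToyHyperLogLog(self.precision)
        merged.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]
        return merged

    def estimate(self):
        m = len(self.registers)
        alpha = 0.7213 / (1 + 1.079 / m)  # bias constant, valid for m >= 128
        raw = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * m and zeros > 0:
            # Small-range correction (linear counting).
            return m * math.log(m / zeros)
        return raw


# Two overlapping sets of node ids, merged without ever storing the ids.
a, b = ToyHyperLogLog(), ToyHyperLogLog()
for node in range(0, 600):
    a.insert(node)
for node in range(400, 1000):
    b.insert(node)
merged = a.union(b)
```

The merged counter has exactly the registers it would have had if all 1000 distinct ids had been inserted into a single counter, which is what lets HyperBall grow the balls by merging the counters of neighbouring nodes.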

The difference between the estimates given by a node's counter at steps $t$ and $t+1$ is the difference in size between two consecutive "balls", which gives us an estimate of the number of nodes at distance exactly $t+1$. From there, computing the harmonic and closeness centralities is straightforward: each such node adds $t+1$ to the sum of distances and $1/(t+1)$ to the harmonic sum.
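The way the deltas turn into centralities can be sketched by swapping the HyperLogLog counters for exact Python sets. This substitution is an assumption made purely for readability: it gives exact values and uses far more memory than HyperBall does, but the iteration has the same shape. The function name is hypothetical.

```python
def ball_growth_centralities(adjacency):
    """Grow every ball one hop per iteration and accumulate centralities from the deltas."""
    # Ball of radius 0: each node only knows about itself.
    balls = {node: {node} for node in adjacency}
    sum_of_distances = {node: 0 for node in adjacency}
    harmonic = {node: 0.0 for node in adjacency}
    radius = 0
    changed = True
    while changed:
        radius += 1
        changed = False
        new_balls = {}
        for node, neighbours in adjacency.items():
            # Union with the neighbours' balls (a register-wise max in HyperBall).
            merged = set(balls[node])
            for neighbour in neighbours:
                merged |= balls[neighbour]
            # The delta is the number of nodes at distance exactly `radius`.
            delta = len(merged) - len(balls[node])
            if delta > 0:
                changed = True
                sum_of_distances[node] += radius * delta
                harmonic[node] += delta / radius
            new_balls[node] = merged
        balls = new_balls
    closeness = {
        node: (1.0 / total if total > 0 else 0.0)
        for node, total in sum_of_distances.items()
    }
    return closeness, harmonic


# Toy undirected path graph: 0 - 1 - 2 - 3
path_graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
closeness, harmonic = ball_growth_centralities(path_graph)
```

With real HyperLogLog counters, `len(merged)` becomes a cardinality estimate, so the deltas and the resulting centralities are approximations whose quality depends on the number of registers.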

### Work stealing
Work stealing is a popular scheduling strategy used in parallel computing. The goal of work stealing is to balance the workload among multiple threads to improve the overall performance of the system. In a work-stealing algorithm, each thread has its own local queue of tasks to perform. When a thread completes its local tasks, it can steal tasks from the queue of another thread that has more work than it can handle.

In the case of the HyperBall algorithm, work stealing is used to balance the computation of high-degree nodes across multiple threads. Since high-degree nodes can have a significant impact on the overall computation time, it is important to distribute the work of computing these nodes evenly across all threads. Work stealing allows threads to dynamically adjust their workload based on the current state of the system, ensuring that no single thread is overloaded while other threads remain idle.

In this particular implementation of HyperBall, each thread is assigned a subset of nodes to process. Once a thread has completed its own work, it checks the queues of the other threads that have fallen behind because they are dealing with particularly high-degree nodes, and takes over some of their pending work. This balances the workload across all threads.
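The scheduling idea can be sketched in a few lines of Python. This is only an illustration of the pattern, with hypothetical names: the actual implementation is written in Rust, and Python threads do not run CPU-bound work in parallel, so the sketch shows who picks up which task rather than any speed-up.

```python
import threading
from collections import deque

_NOTHING = object()  # sentinel meaning "no task found"


def run_with_work_stealing(tasks_per_worker, process):
    """Run process(task) on every task, letting idle workers steal from busy ones."""
    queues = [deque(tasks) for tasks in tasks_per_worker]
    results = [[] for _ in queues]

    def worker(worker_id):
        while True:
            task = _NOTHING
            try:
                # Take the newest task from our own queue.
                task = queues[worker_id].pop()
            except IndexError:
                # Our queue is empty: try to steal the oldest task of another worker.
                for victim_id, victim in enumerate(queues):
                    if victim_id == worker_id:
                        continue
                    try:
                        task = victim.popleft()
                        break
                    except IndexError:
                        continue
            if task is _NOTHING:
                # No queue has work left and no new tasks are ever added.
                return
            results[worker_id].append(process(task))

    threads = [
        threading.Thread(target=worker, args=(worker_id,))
        for worker_id in range(len(queues))
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


# Worker 0 starts with most of the nodes; workers 1 and 2 can steal from it.
results = run_with_work_stealing([[0, 1, 2, 3, 4, 5, 6, 7], [8], [9]], lambda n: n * n)
```

Owners and thieves take tasks from opposite ends of a queue, which is the usual way to keep them from contending for the same task. `deque.pop` and `deque.popleft` are atomic in CPython, so every task is processed exactly once.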

[Find the Rust implementation of HyperBall here](https://github.com/AnacletoLAB/ensmallen/blob/fa8225614008508ac9b806d884dfc58b79d19d6f/graph/src/hyperball.rs#L63)

## Some practical experiments

In [1]:
%%time
from grape.datasets.kgobo import HP
from grape.datasets.string import HomoSapiens
from grape.datasets.kghub import KGCOVID19

2023-05-14 18:37:48.092708: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2023-05-14 18:37:48.092727: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.


CPU times: user 3.88 s, sys: 2.94 s, total: 6.82 s
Wall time: 5.29 s


We start by trying it out on a small graph. Here the speed-up is modest, and reaching a low error requires rather high-precision HyperLogLog counters.

In [2]:
%%time
hp = HP()

CPU times: user 157 ms, sys: 25.2 ms, total: 182 ms
Wall time: 42 ms


In [3]:
%%time
(hp.get_number_of_nodes(), hp.get_diameter(), hp.get_number_of_directed_edges())

CPU times: user 148 ms, sys: 48.6 ms, total: 197 ms
Wall time: 9.23 ms


(31668, 29.0, 135633)

In [4]:
%%time
hp.get_closeness_centrality()

CPU times: user 13.7 s, sys: 14.4 ms, total: 13.7 s
Wall time: 588 ms


array([5.6500367e-06, 5.6499730e-06, 7.5825933e-06, ..., 1.3628805e-05,
       1.4220705e-05, 1.8399602e-05], dtype=float32)

In [5]:
%%time
hp.get_approximated_closeness_centrality(precision=9, bits=6)

CPU times: user 5.44 s, sys: 25.6 ms, total: 5.46 s
Wall time: 241 ms


array([5.49620381e-06, 5.49614288e-06, 7.35795084e-06, ...,
       1.41679775e-05, 1.47832434e-05, 1.89927359e-05], dtype=float32)

In [6]:
%%time
hp.get_harmonic_centrality()

CPU times: user 17.7 s, sys: 37.2 ms, total: 17.8 s
Wall time: 768 ms


array([1618.525 , 1618.2749, 2242.5837, ..., 2752.841 , 2937.7966,
       3908.6458], dtype=float32)

In [7]:
%%time
hp.get_approximated_harmonic_centrality(precision=9, bits=6)

CPU times: user 6.18 s, sys: 23.2 ms, total: 6.2 s
Wall time: 289 ms


array([1662.215 , 1661.9622, 2299.9229, ..., 2597.7651, 2773.5237,
       3692.4763], dtype=float32)

Let's bump it up a notch and move to STRING Homo Sapiens:

In [8]:
%%time
string = HomoSapiens()

CPU times: user 14.7 s, sys: 638 ms, total: 15.3 s
Wall time: 8.98 s


In [9]:
%%time
(string.get_number_of_nodes(), string.get_diameter(), string.get_number_of_directed_edges())

CPU times: user 35.8 s, sys: 1.12 s, total: 36.9 s
Wall time: 1.85 s


(19566, 5.0, 11938498)

In [10]:
%%time
string.get_closeness_centrality()

CPU times: user 10min 36s, sys: 1 s, total: 10min 37s
Wall time: 27.6 s


array([2.6706548e-05, 2.5682513e-05, 2.6850683e-05, ..., 0.0000000e+00,
       0.0000000e+00, 1.9996800e-05], dtype=float32)

In [11]:
%%time
string.get_approximated_closeness_centrality(precision=8, bits=6)

CPU times: user 52.8 s, sys: 53.7 ms, total: 52.9 s
Wall time: 2.28 s


array([2.5858175e-05, 2.4865947e-05, 2.5768313e-05, ..., 0.0000000e+00,
       0.0000000e+00, 1.9493376e-05], dtype=float32)

In [12]:
%%time
string.get_harmonic_centrality()

CPU times: user 8min 12s, sys: 1.32 s, total: 8min 13s
Wall time: 21.6 s


array([10392.652,  9823.812, 10512.813, ...,     0.   ,     0.   ,
        7848.315], dtype=float32)

In [13]:
%%time
string.get_approximated_harmonic_centrality(precision=8, bits=6)

CPU times: user 33.2 s, sys: 29.5 ms, total: 33.2 s
Wall time: 1.42 s


array([10728.257 , 10162.341 , 10726.847 , ...,     0.    ,     0.    ,
        8176.8022], dtype=float32)

We can spend a bit of time to see how the error and time requirements change with the precision.

Runs at higher precision are exponentially more expensive, as each HyperLogLog counter has $2^{\text{precision}}$ registers. While the HyperLogLog implementation we are using supports precisions from $4$ to $16$ inclusive, we stop at $11$ for the sake of time. Feel free to benchmark it extensively on your own time.
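A quick back-of-the-envelope calculation shows what each extra unit of precision costs. The helper below is a hypothetical function written for this tutorial, not part of GRAPE. It assumes registers are packed tightly at `bits` bits each, which is a lower bound on what an implementation actually allocates, and it uses the textbook HyperLogLog relative standard error of $1.04 / \sqrt{m}$ for $m$ registers, which applies to the cardinality estimates rather than directly to the centralities.

```python
import math


def hyperloglog_footprint(number_of_nodes, precision, bits):
    """Estimate the memory and expected error of one HyperLogLog counter per node."""
    registers = 2 ** precision
    # Assumption: registers are tightly packed, `bits` bits each.
    bytes_per_counter = registers * bits / 8
    total_megabytes = number_of_nodes * bytes_per_counter / 1e6
    # Textbook relative standard error of a HyperLogLog cardinality estimate.
    relative_standard_error = 1.04 / math.sqrt(registers)
    return registers, bytes_per_counter, total_megabytes, relative_standard_error


# One counter per node for a graph with 574232 nodes, at precision 8 and 6 bits.
registers, bytes_per_counter, total_megabytes, error = hyperloglog_footprint(
    number_of_nodes=574232, precision=8, bits=6
)
```

Under these assumptions, precision 8 with 6-bit registers means 256 registers and 192 bytes per counter, roughly 110 MB for 574232 nodes, with an expected relative error of about 6.5% on each cardinality estimate. Every additional unit of precision doubles the memory and the work per union, while shrinking the error only by a factor of $\sqrt{2}$.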

In [14]:
%%time
import numpy as np
import pandas as pd
from time import time
from tqdm.auto import trange

actual_values = string.get_closeness_centrality()

closeness_performance = []

for precision in trange(4, 11 + 1, leave=False):
    for bits in (5, 6):
        start = time()
        approximated = string.get_approximated_closeness_centrality(
            precision=precision,
            bits=bits
        )
        end = time()
        # Despite the name, this is the sum (not the mean) of the squared errors.
        mse = np.sum((approximated - actual_values)**2)
        closeness_performance.append(dict(
            mse=mse,
            precision=precision,
            bits=bits,
            time=end - start
        ))
closeness_performance = pd.DataFrame(closeness_performance)
closeness_performance

CPU times: user 28min 56s, sys: 2.61 s, total: 28min 59s
Wall time: 1min 15s


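|   | mse | precision | bits | time (s) |
|---|-----|-----------|------|----------|
| 0 | 3.156656e-06 | 4 | 5 | 0.062338 |
| 1 | 3.150493e-06 | 4 | 6 | 0.074883 |
| 2 | 2.285012e-07 | 5 | 5 | 0.209248 |
| 3 | 2.234365e-07 | 5 | 6 | 0.219988 |
| 4 | 2.469377e-07 | 6 | 5 | 0.388623 |
| 5 | 2.515463e-07 | 6 | 6 | 0.38813 |
| 6 | 1.346935e-07 | 7 | 5 | 0.892079 |
| 7 | 1.330898e-07 | 7 | 6 | 1.442079 |
| 8 | 1.282743e-08 | 8 | 5 | 1.410394 |
| 9 | 1.302366e-08 | 8 | 6 | 1.433877 |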


In [15]:
%%time
import numpy as np
import pandas as pd
from time import time
from tqdm.auto import trange

actual_values = string.get_harmonic_centrality()

harmonic_performance = []

for precision in trange(4, 11 + 1, leave=False):
    for bits in (5, 6):
        start = time()
        approximated = string.get_approximated_harmonic_centrality(
            precision=precision,
            bits=bits
        )
        end = time()
        # Despite the name, this is the sum (not the mean) of the squared errors.
        mse = np.sum((approximated - actual_values)**2)
        harmonic_performance.append(dict(
            mse=mse,
            precision=precision,
            bits=bits,
            time=end - start
        ))
harmonic_performance = pd.DataFrame(harmonic_performance)
harmonic_performance

CPU times: user 26min 49s, sys: 2.87 s, total: 26min 52s
Wall time: 1min 9s


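|   | mse | precision | bits | time (s) |
|---|-----|-----------|------|----------|
| 0 | 161947600000.0 | 4 | 5 | 0.064384 |
| 1 | 161913200000.0 | 4 | 6 | 0.119695 |
| 2 | 15139560000.0 | 5 | 5 | 0.212849 |
| 3 | 15222820000.0 | 5 | 6 | 0.22085 |
| 4 | 58298970000.0 | 6 | 5 | 0.387177 |
| 5 | 56905820000.0 | 6 | 6 | 0.38066 |
| 6 | 28193330000.0 | 7 | 5 | 0.734679 |
| 7 | 27778330000.0 | 7 | 6 | 0.719611 |
| 8 | 2655626000.0 | 8 | 5 | 1.419431 |
| 9 | 2265788000.0 | 8 | 6 | 1.435327 |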


Let's move up another notch. From KGCOVID19 onwards, the graph is too large for the exact computation to finish within seconds (it takes hours on this computer), so we can only use the approximation. And that is what HyperBall is for!

In [16]:
%%time
kgcovid = KGCOVID19()

CPU times: user 23 s, sys: 273 ms, total: 23.3 s
Wall time: 1.79 s


In [17]:
%%time
(kgcovid.get_number_of_nodes(), kgcovid.get_diameter(), kgcovid.get_number_of_directed_edges())

CPU times: user 1.31 s, sys: 43.3 ms, total: 1.35 s
Wall time: 64.8 ms


(574232, 38.0, 36501154)

In [18]:
%%time
kgcovid.get_approximated_closeness_centrality(precision=8, bits=6)

CPU times: user 6min 25s, sys: 609 ms, total: 6min 26s
Wall time: 16.7 s


array([1.8629692e-07, 2.0928746e-07, 1.6905928e-07, ..., 2.8484943e-04,
       2.8118393e-07, 9.8631144e-02], dtype=float32)

In [19]:
%%time
kgcovid.get_approximated_harmonic_centrality(precision=8, bits=6)

CPU times: user 5min 42s, sys: 654 ms, total: 5min 43s
Wall time: 14.8 s


array([7.07870938e+04, 8.37640078e+04, 6.51354531e+04, ...,
       2.68150444e+01, 1.01187445e+05, 2.10325456e+00], dtype=float32)

## Concluding notes
In this tutorial, we have discussed how HyperBall, a state-of-the-art algorithm for approximating closeness and harmonic centrality in large graphs, can be easily used with GRAPE. We have introduced HyperBall and HyperLogLog counters, and explained how they work together to compute centrality measures efficiently. We have also described the work-stealing strategy implemented in the GRAPE version of HyperBall, which allows for better load balancing among threads. Overall, HyperBall provides a powerful tool for analyzing large graphs and can be easily integrated into existing workflows.

[You can learn more about HyperLogLog here](https://github.com/LucaCappelletti94/hyperloglog-rs).

Don't forget to ⭐ GRAPE!