# Calculating node centralities

[Run notebook in Google Colab](https://colab.research.google.com/github/pathpy/pathpy/blob/master/doc/tutorial/centralities.ipynb)  
[Download notebook](https://github.com/pathpy/pathpy/raw/master/doc/tutorial/centralities.ipynb)

In the following we implement degree- and path-based centrality measures and apply them to identify important nodes in empirical networks.

In [None]:
pip install git+https://github.com/pathpy/pathpy.git

In [None]:
from collections import defaultdict, Counter

import pathpy as pp
import numpy as np

We will test our implementation on an undirected and a directed example network.

In [None]:
n_undirected = pp.Network(directed=False)
n_undirected.add_edge('a', 'b')
n_undirected.add_edge('b', 'c')
n_undirected.add_edge('c', 'a')
n_undirected.add_edge('d', 'e')
n_undirected.add_edge('e', 'f')
n_undirected.add_edge('f', 'g')
n_undirected.add_edge('g', 'd')
n_undirected.add_edge('d', 'f')
n_undirected.add_edge('b', 'd')
n_undirected.plot()

In [None]:
n_directed = pp.Network(directed=True)
n_directed.add_edge('a', 'b')
n_directed.add_edge('b', 'c')
n_directed.add_edge('c', 'a')
n_directed.add_edge('d', 'e')
n_directed.add_edge('e', 'f')
n_directed.add_edge('f', 'g')
n_directed.add_edge('g', 'd')
n_directed.add_edge('d', 'f')
n_directed.add_edge('b', 'd')
n_directed.plot()

## Degree Centrality

A simple, local notion of node importance in networks can be defined based on the degrees of nodes. In `pathpy` we can compute the (in- or out-)degrees of nodes as follows:

In [None]:
n_undirected.degrees()

In [None]:
n_directed.indegrees()

In [None]:
n_directed.outdegrees()
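To make explicit what these degree functions count, here is a minimal plain-Python sketch that recomputes the degrees from the edge list of the example networks. It does not use `pathpy`; the helper names `degrees` and `in_out_degrees` are hypothetical and chosen for illustration only.

```python
from collections import Counter

# Edge list shared by the two example networks defined above.
EDGES = [('a', 'b'), ('b', 'c'), ('c', 'a'), ('d', 'e'), ('e', 'f'),
         ('f', 'g'), ('g', 'd'), ('d', 'f'), ('b', 'd')]


def degrees(edges):
    """Degree of each node when the edges are read as undirected."""
    deg = Counter()
    for v, w in edges:
        deg[v] += 1
        deg[w] += 1
    return dict(deg)


def in_out_degrees(edges):
    """In- and out-degree of each node when the edges are read as directed.

    Nodes without incoming (or outgoing) edges are simply absent from the
    respective dictionary; this does not occur in the example network.
    """
    indeg, outdeg = Counter(), Counter()
    for v, w in edges:
        outdeg[v] += 1
        indeg[w] += 1
    return dict(indeg), dict(outdeg)


undirected_degrees = degrees(EDGES)
indegrees, outdegrees = in_out_degrees(EDGES)
print(undirected_degrees)
print(indegrees)
print(outdegrees)
```

In the undirected network, node `d` has the largest degree (four neighbours), while in the directed network the in- and out-degrees of a node can differ, as they do for `b` and `f`.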

To provide a unified API for all centrality measures, `pathpy` additionally includes a `degree_centrality` function in the module `pp.algorithms.centralities`. Using the `mode` parameter, we can switch between degree, in-degree, and out-degree.

In [None]:
pp.algorithms.centralities.degree_centrality(n_undirected)

In [None]:
pp.algorithms.centralities.degree_centrality(n_directed, mode='indegree')

In [None]:
pp.algorithms.centralities.degree_centrality(n_directed, mode='outdegree')

A common task in network analysis is ranking nodes by centrality. Since Python dictionaries are not sorted by value, this requires a different data structure. To simplify this frequent task, `pathpy` provides a `rank_centralities` function that takes a dictionary of centrality values as its parameter and returns a list of tuples of node uids and centrality values, arranged in descending order of centrality:

In [None]:
ranking = pp.algorithms.centralities.rank_centralities(pp.algorithms.centralities.degree_centrality(n_undirected))
print(ranking)

print('The most important node is', ranking[0][0])
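The ranking step itself can be sketched with the built-in `sorted` function. The helper `rank_by_value` below is a hypothetical stand-in written for illustration, not the implementation used by `pathpy`; the input dictionary contains the degrees of the undirected example network.

```python
def rank_by_value(centralities):
    """Return (node, value) tuples sorted by value in descending order.

    sorted() is stable, so nodes with equal values keep the order in
    which they appear in the input dictionary.
    """
    return sorted(centralities.items(), key=lambda item: item[1], reverse=True)


# Degrees of the undirected example network.
degree = {'a': 2, 'b': 3, 'c': 2, 'd': 4, 'e': 2, 'f': 3, 'g': 2}

ranking = rank_by_value(degree)
print(ranking)
print('The most important node is', ranking[0][0])
```

Because ties are common for degree-based measures, the position of tied nodes in such a ranking carries no meaning beyond the input order.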

## Centrality measures in `pathpy`

To obtain centrality measures that account for the topology of links (and not only the number of links incident to a node), we can use the path-based measures in the `centralities` module of `pathpy.algorithms`, such as closeness centrality:

In [None]:
pp.algorithms.centralities.closeness_centrality(n_undirected)

In [None]:
pp.algorithms.centralities.closeness_centrality(n_undirected, normalized=True)
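Closeness centrality is commonly defined as the inverse of the sum of shortest-path distances from a node to all other nodes. The following plain-Python sketch computes it with a breadth-first search, assuming an unweighted and connected network. The normalisation used here (multiplying by the number of other nodes) is one common convention and an assumption on our part; the values returned by `pathpy` may be scaled differently. All helper names are hypothetical.

```python
from collections import deque

EDGES = [('a', 'b'), ('b', 'c'), ('c', 'a'), ('d', 'e'), ('e', 'f'),
         ('f', 'g'), ('g', 'd'), ('d', 'f'), ('b', 'd')]


def adjacency(edges, directed=False):
    """Build a dictionary that maps each node to its set of successors."""
    adj = {}
    for v, w in edges:
        adj.setdefault(v, set()).add(w)
        adj.setdefault(w, set())
        if not directed:
            adj[w].add(v)
    return adj


def bfs_distances(adj, source):
    """Shortest-path distances from source to all reachable nodes."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def closeness(adj, normalized=False):
    """Inverse of the summed distances; assumes a connected network."""
    result = {}
    for v in adj:
        total = sum(bfs_distances(adj, v).values())
        value = 1 / total if total > 0 else 0.0
        result[v] = value * (len(adj) - 1) if normalized else value
    return result


adj = adjacency(EDGES)
closeness_values = closeness(adj)
closeness_normalized = closeness(adj, normalized=True)
print(max(closeness_values, key=closeness_values.get))
```

In the undirected example network, node `d` reaches its four neighbours in one step and the remaining two nodes in two steps, so its summed distance is the smallest and its closeness the largest.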

Alternatively, the same measures are available as methods of the `Network` class, which allows us to calculate them directly on an instance:

In [None]:
pp.algorithms.centralities.rank_centralities(n_undirected.betweenness_centrality())
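Betweenness centrality counts, for each node, the fraction of shortest paths between other node pairs that pass through it. A minimal sketch for unweighted networks, following Brandes' accumulation scheme, is given below. It is an illustration under our own conventions (each unordered pair is counted once in undirected networks, no normalisation), so the values may differ from those returned by `pathpy` by a constant factor. The function names are hypothetical.

```python
from collections import deque

EDGES = [('a', 'b'), ('b', 'c'), ('c', 'a'), ('d', 'e'), ('e', 'f'),
         ('f', 'g'), ('g', 'd'), ('d', 'f'), ('b', 'd')]


def adjacency(edges, directed=False):
    """Build a dictionary that maps each node to its set of successors."""
    adj = {}
    for v, w in edges:
        adj.setdefault(v, set()).add(w)
        adj.setdefault(w, set())
        if not directed:
            adj[w].add(v)
    return adj


def betweenness(adj, directed=False):
    """Unnormalised betweenness centrality for an unweighted network."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s that counts shortest paths (sigma)
        # and records the predecessors of each node on those paths.
        order = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies, starting with the nodes farthest from s.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    if not directed:
        # Each unordered pair was visited from both of its endpoints.
        bc = {v: value / 2 for v, value in bc.items()}
    return bc


bc_undirected = betweenness(adjacency(EDGES))
print(sorted(bc_undirected.items(), key=lambda item: item[1], reverse=True))
```

In the undirected example network, the nodes `b` and `d` connect the triangle `a`, `b`, `c` to the rest of the network, so every shortest path between the two parts passes through them and they receive by far the largest values.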

# todo

Clarify the `highschool` database and add a visualisation.

# Centralities in Empirical Networks

We conclude this unit with an exploration of node centralities in empirical networks. We first use `pathpy`'s SQLite integration to read the table `gentoo` from the database file (which can be downloaded from Moodle) as a **directed** network. We further read the table `highschool` in the database as an undirected network. We then apply the functions in the `pathpy.algorithms.centralities` module to rank nodes according to the following centrality measures:

1) in- and out-degree (for directed network), degree (for undirected network)  
2) closeness centrality   
3) betweenness centrality
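As a rough illustration of the first step, the sketch below reads an edge list from an SQLite table using Python's standard `sqlite3` module and ranks nodes by out-degree. It deliberately avoids the `pathpy` loader. The column names `source` and `target`, the helper `read_edges`, and the toy in-memory table `demo` are assumptions made for this example; the actual schema of the course database may differ.

```python
import sqlite3
from collections import Counter


def read_edges(connection, table, source_col='source', target_col='target'):
    """Read (source, target) pairs from a table (column names are assumed)."""
    query = f'SELECT "{source_col}", "{target_col}" FROM "{table}"'
    return [(str(v), str(w)) for v, w in connection.execute(query)]


# Toy in-memory database standing in for the course database file.
connection = sqlite3.connect(':memory:')
connection.execute('CREATE TABLE demo (source TEXT, target TEXT)')
connection.executemany('INSERT INTO demo VALUES (?, ?)',
                       [('a', 'b'), ('b', 'c'), ('b', 'd')])

edges = read_edges(connection, 'demo')

# Rank nodes by out-degree, reading the edges as directed.
outdegree = Counter(v for v, _ in edges)
out_ranking = sorted(outdegree.items(), key=lambda item: item[1], reverse=True)
print(out_ranking)

connection.close()
```

For the real data, one would open the downloaded file with `sqlite3.connect` and pass the table name `gentoo` or `highschool`, adjusting the column names to the actual schema.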