# DyNetworkX Tutorial – Plotting Dynamic Betweenness Centrality

The objective of this tutorial is to showcase a typical use case of DyNetworkX.  
We will be using the Enron dataset; a preview of the data file is shown below.

In [1]:
import os
import urllib.request  # importing the submodule explicitly makes urllib.request available

file_path = "execs.email.linesnum"

if not os.path.exists(file_path):
    print("Downloading Enron dataset from http://www.cis.jhu.edu/~parky/Enron/execs.email.linesnum ...")
    urllib.request.urlretrieve("http://www.cis.jhu.edu/~parky/Enron/execs.email.linesnum", file_path)
    print("Download complete.")
    
with open(file_path, "r") as file:
    for x in range(5):
        print(next(file))

315522000 24 153

315522000 24 153

315522000 29 29

315522000 29 29

315522000 29 29



### Loading Data into DyNetworkX

Load data from a text file using the function `dnx.ImpulseGraph.load_from_txt`.
Make sure to specify the necessary arguments, such as `delimiter`, `timestamptype`, and `order`.

Printing a few edges of the new ImpulseGraph makes it possible to verify that the data set was imported correctly. (Note: edge order is not guaranteed.)

In [2]:
import dynetworkx as dnx

impulseG = dnx.ImpulseGraph.load_from_txt("execs.email.linesnum", delimiter=" ", timestamptype=int, order=('t', 'v', 'u'))

print(impulseG.edges()[:5])

[(169, 169, 315522000), (9, 92, 315522000), (145, 145, 315522000), (178, 178, 315522000), (169, 169, 315522000)]
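To make the role of `delimiter`, `timestamptype`, and `order` more concrete, here is a minimal sketch of how one line of the file could be turned into an edge. It is not DyNetworkX's implementation: `parse_line` is a hypothetical helper, and both the column mapping and the conversion of node labels to `int` are assumptions made for illustration.

```python
def parse_line(line, delimiter=" ", timestamptype=int, order=("u", "v", "t")):
    """Split one text line and return a (u, v, t) tuple.

    `order` names the columns as they appear in the file (assumed meaning).
    """
    fields = dict(zip(order, line.strip().split(delimiter)))
    # Node labels are cast to int here purely for illustration.
    return int(fields["u"]), int(fields["v"]), timestamptype(fields["t"])


# First line of the preview above, read with the same column order as the tutorial.
edge = parse_line("315522000 24 153", order=("t", "v", "u"))
print(edge)
```

With `order=('t', 'v', 'u')`, the first column is read as the timestamp, the second as `v`, and the third as `u`.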


### Converting between different graph types

Traditionally, when working with dynamic networks, it is common to flatten the temporal dimension by binning data into smaller static graphs called snapshots. This behavior is replicated by the DyNetworkX class `SnapshotGraph`.  
The argument `length_of_snapshots` specifies the desired length of each snapshot, here 1 year (converted to seconds to match the data set).

In [9]:
snapshotG = impulseG.to_snapshot_graph(length_of_snapshots=31536000)
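The value `31536000` is one non-leap year in seconds. The sketch below checks that arithmetic and illustrates the general idea of binning timestamped edges into fixed-length windows. It is a conceptual illustration only: `bin_edges` is a hypothetical helper, starting the first window at the earliest timestamp is an assumption, and the third edge is made up to show a second window.

```python
from collections import defaultdict

SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # ignores leap years


def bin_edges(edges, length):
    """Group (u, v, t) edges into consecutive windows of `length` seconds."""
    start = min(t for _, _, t in edges)  # assumed alignment of the first window
    bins = defaultdict(list)
    for u, v, t in edges:
        bins[(t - start) // length].append((u, v))
    return dict(bins)


edges = [
    (24, 153, 315522000),
    (29, 29, 315522000),
    (9, 92, 315522000 + SECONDS_PER_YEAR),  # hypothetical edge one year later
]
snapshots = bin_edges(edges, SECONDS_PER_YEAR)
print(SECONDS_PER_YEAR)
print(snapshots)
```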

### Calculating Dynamic Betweenness Centrality

`compute_network_statistic` returns a list in which each item corresponds to one snapshot in the SnapshotGraph. The specified method is applied to each snapshot in the graph, and any additional arguments are passed through to it.
The result for the first snapshot is shown below.

In [4]:
from networkx.algorithms.centrality import betweenness_centrality

centrality_list = snapshotG.compute_network_statistic(betweenness_centrality, normalized=True)
print(centrality_list[0])

{169: 0.004645760743321719, 9: 0.0, 92: 0.003484320557491289, 145: 0.2862950058072009, 178: 0.004065040650406504, 27: 0.0, 163: 0.10627177700348432, 146: 0.03484320557491289, 155: 0.0, 165: 0.0, 60: 0.010452961672473868, 99: 0.0, 65: 0.0, 63: 0.13182346109175377, 128: 0.009872241579558653, 66: 0.0011614401858304297, 157: 0.0, 140: 0.13821138211382114, 147: 0.011033681765389082, 29: 0.0, 82: 0.15737514518002324, 114: 0.0, 124: 0.030197444831591175, 107: 0.0, 33: 0.0, 167: 0.11033681765389082, 173: 0.0, 58: 0.03484320557491289, 103: 0.05807200929152149, 105: 0.0, 123: 0.0, 57: 0.0, 94: 0.0, 56: 0.030197444831591175, 51: 0.01916376306620209, 153: 0.0, 24: 0.0, 78: 0.0, 111: 0.0, 38: 0.0, 162: 0.0, 95: 0.0, 137: 0.0}
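The "apply a method to every snapshot" pattern described above can be sketched in plain Python. This is not the DyNetworkX source: `compute_statistic` is a hypothetical stand-in, the snapshots are toy adjacency dictionaries, and `degree_fraction` is a deliberately simple statistic used in place of betweenness centrality.

```python
def compute_statistic(snapshots, method, **kwargs):
    """Apply `method` to every snapshot, forwarding extra keyword arguments."""
    return [method(snapshot, **kwargs) for snapshot in snapshots]


def degree_fraction(adjacency, normalized=False):
    """Toy statistic: node degree, optionally divided by (n - 1)."""
    n = len(adjacency)
    scale = 1 / (n - 1) if normalized and n > 1 else 1
    return {node: len(neighbors) * scale for node, neighbors in adjacency.items()}


# Two hypothetical snapshots: a path 1-2-3, then a triangle.
snapshots = [
    {1: {2}, 2: {1, 3}, 3: {2}},
    {1: {2, 3}, 2: {1, 3}, 3: {1, 2}},
]
result = compute_statistic(snapshots, degree_fraction, normalized=True)
print(result[0])
```

As in the tutorial, the result is a list with one dictionary per snapshot, keyed by node.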


### Formatting Data

From this point forward, we are done using DyNetworkX. To finish the objective, we reshape the betweenness centrality data into a format that can be easily plotted.

Columns in the dataframe represent snapshots. Indexes in the dataframe represent nodes. Values in the dataframe represent betweenness centrality. The final dataframe is restricted to snapshots 18 onward and then filtered to keep only nodes whose centrality is nonzero in every remaining snapshot, which reduces clutter in the final plot.

In [41]:
import pandas as pd

df = pd.DataFrame()

for i in range(len(centrality_list)):
    df[i] = 0.0
    
    for node in centrality_list[i]:
        if node not in df.index:
            df.loc[node] = 0.0
        df.at[node, i] = centrality_list[i][node]

df = df[[*range(18,len(centrality_list))]]
print(df.loc[169])
print(centrality_list[18])
# Keep only nodes with nonzero centrality in every remaining snapshot.
df = df[(df != 0).all(axis=1)]
print(df.head())
print(df.shape)

18    0.000000
19    0.379736
20    0.057515
21    0.004021
22    0.010995
Name: 169, dtype: float64
{169: 0.0, 114: 1.0, 123: 0.0, 155: 0.0, 110: 0.0, 38: 0.0, 112: 0.0, 65: 0.0, 145: 0.0, 11: 0.0, 160: 0.0, 22: 0.0}
Empty DataFrame
Columns: [18, 19, 20, 21, 22]
Index: []
(0, 5)
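The empty result above follows from the row filter on `(df != 0)`, which keeps a node only if its centrality is nonzero in every remaining snapshot. Node 169, for example, is dropped because its value in snapshot 18 is 0. The plain-Python sketch below mirrors that logic and contrasts it with a looser "nonzero in at least one snapshot" rule. The row for node 169 uses the rounded values printed above; nodes 1001 and 1002 are hypothetical.

```python
# Rows are nodes, columns are snapshots 18..22.
table = {
    169: [0.0, 0.379736, 0.057515, 0.004021, 0.010995],  # rounded values printed above
    1001: [0.0, 0.0, 0.0, 0.0, 0.0],                     # hypothetical node
    1002: [0.2, 0.1, 0.3, 0.1, 0.4],                     # hypothetical node
}

# Strict rule, as in the tutorial: nonzero in every snapshot.
nonzero_everywhere = [n for n, row in table.items() if all(v != 0 for v in row)]

# Looser alternative: nonzero in at least one snapshot.
nonzero_somewhere = [n for n, row in table.items() if any(v != 0 for v in row)]

print(nonzero_everywhere)
print(nonzero_somewhere)
```

Under the strict rule a single zero removes a node, which is why no rows survive in the output shown above.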


### Plotting Dynamic Betweenness Centrality

Finally, plot the betweenness centrality over time using Plotly.

In [None]:
import plotly.express as px

fig = px.line(df.T)
fig.update_layout(xaxis_title="Snapshot",
                  xaxis_nticks=len(snapshotG),
                  yaxis_title="Betweenness Centrality",
                  legend_title_text='Node',
                  template="plotly_white")
fig.show()