# Temporal Graphs and Path Data

## Prerequisites

First, we need a Python environment with PyTorch, PyTorch Geometric, and PathpyG installed. Depending on where you run this notebook, this may already be (partially) done. For example, Google Colab ships with PyTorch, so only the remaining dependencies need to be installed. The DevContainer in our GitHub repository, on the other hand, already includes all necessary dependencies.

In the following, we install the packages for use in Google Colab via Jupyter magic commands. For other environments, comment the commands in or out as necessary. For more details on installing `pathpyG`, especially with GPU support, see our [documentation](https://www.pathpy.net/dev/getting_started/). Note that `%%capture` suppresses the cell's output so that installation details do not clutter this tutorial. To see the output, comment out `%%capture`.

In [None]:
%%capture
# !pip install torch
!pip install torch_geometric
!pip install git+https://github.com/pathpy/pathpyG.git

## Motivation and Learning Objectives

In this tutorial, we introduce the representation of temporal graph data in the `TemporalGraph` class and show how such data can be used to calculate time-respecting paths.

In [1]:
import torch
from torch_geometric.data import TemporalData
import pathpyG as pp

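# Use the GPU if one is available; otherwise fall back to the CPU.
pp.config['torch']['device'] = 'cuda' if torch.cuda.is_available() else 'cpu'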

We can create a temporal graph object from a list of time-stamped edges. Since `TemporalGraph` is a subclass of the `Graph` class, the internal structures are very similar:

In [2]:
tedges = [('a', 'b', 1), ('a', 'b', 2), ('b', 'a', 3), ('b', 'c', 3), ('d', 'c', 4), ('a', 'b', 4), ('c', 'b', 4),
          ('c', 'd', 5), ('b', 'a', 5), ('c', 'b', 6)]
t = pp.TemporalGraph.from_edge_list(tedges)
print(t.mapping)
print(t.N)
print(t.M)

a -> 0
b -> 1
c -> 2
d -> 3

4
10


By default, all temporal graphs are directed. We can create an undirected version of a temporal graph as follows:

In [5]:
x = t.to_undirected()
print(x.mapping)
print(x.N)
print(x.M)

a -> 0
b -> 1
c -> 2
d -> 3

4
20
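The event count doubles from 10 to 20, which suggests that each time-stamped edge is mirrored in the opposite direction. The following plain-Python sketch illustrates that idea without pathpyG; that `to_undirected` works this way is an assumption inferred from the output, not a statement about its implementation.

```python
# Time-stamped edges from the example above.
tedges = [('a', 'b', 1), ('a', 'b', 2), ('b', 'a', 3), ('b', 'c', 3),
          ('d', 'c', 4), ('a', 'b', 4), ('c', 'b', 4), ('c', 'd', 5),
          ('b', 'a', 5), ('c', 'b', 6)]

# Assumption: the undirected version keeps every event and adds its reverse.
undirected = tedges + [(w, v, ts) for v, w, ts in tedges]

print(len(tedges), len(undirected))
```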


We can also create a temporal graph from an instance of `TemporalData`:

In [13]:
td = TemporalData(
    src = torch.Tensor([0,1,2,0]),
    dst = torch.Tensor([1,2,3,1]), 
    t = torch.Tensor([0,1,2,3]))
print(td)
t2 = pp.TemporalGraph(td)
print(t2)

TemporalData(src=[4], dst=[4], t=[4])
Temporal Graph with 4 nodes, 3 unique edges and 4 events in [0.0, 3.0]

Graph attributes
	src		<class 'torch.Tensor'> -> torch.Size([4])
	dst		<class 'torch.Tensor'> -> torch.Size([4])
	t		<class 'torch.Tensor'> -> torch.Size([4])





We can restrict a temporal graph to a time window:

In [14]:
t1 = t.get_window(0,4)
print(t1)
print(t1.N)
print(t1.M)

Temporal Graph with 3 nodes, 3 unique edges and 4 events in [1.0, 3.0]

Graph attributes
	src		<class 'torch.Tensor'> -> torch.Size([4])
	dst		<class 'torch.Tensor'> -> torch.Size([4])
	t		<class 'torch.Tensor'> -> torch.Size([4])

3
4
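The output indicates that the window is half-open: events at time 4 are excluded, so `get_window(0, 4)` keeps the events with $0 \leq t < 4$. A minimal plain-Python sketch of this filtering follows; the helper `window_events` is a hypothetical name, and the half-open convention is an assumption read off the output above.

```python
# Time-stamped edges from the example above.
tedges = [('a', 'b', 1), ('a', 'b', 2), ('b', 'a', 3), ('b', 'c', 3),
          ('d', 'c', 4), ('a', 'b', 4), ('c', 'b', 4), ('c', 'd', 5),
          ('b', 'a', 5), ('c', 'b', 6)]

def window_events(events, start, end):
    """Keep events with start <= t < end (assumed half-open interval)."""
    return [(v, w, ts) for v, w, ts in events if start <= ts < end]

window = window_events(tedges, 0, 4)
nodes = {n for v, w, _ in window for n in (v, w)}
unique_edges = {(v, w) for v, w, _ in window}

# Number of nodes, unique edges, and events inside the window.
print(len(nodes), len(unique_edges), len(window))  # 3 3 4
```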


We can convert a temporal graph into a weighted time-aggregated static graph:

In [17]:
g = t.to_static_graph(weighted=True)
print(g)

Undirected graph with 4 nodes and 6 (directed) edges

Edge attributes
	edge_weight		<class 'torch.Tensor'> -> torch.Size([6])

Graph attributes
	num_nodes		<class 'int'>



We can also aggregate the temporal graph in a given time window:

In [19]:
g = t.to_static_graph(time_window=(1, 3), weighted=True)
print(g)

Directed graph with 2 nodes and 1 edges

Edge attributes
	edge_weight		<class 'torch.Tensor'> -> torch.Size([1])

Graph attributes
	num_nodes		<class 'int'>
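Both aggregation calls above can be mimicked in plain Python by counting how often each directed edge occurs. This sketch assumes that the edge weight is the number of events per edge and that the time window is half-open; `aggregate` is a hypothetical helper, not part of pathpyG.

```python
from collections import Counter

# Time-stamped edges from the example above.
tedges = [('a', 'b', 1), ('a', 'b', 2), ('b', 'a', 3), ('b', 'c', 3),
          ('d', 'c', 4), ('a', 'b', 4), ('c', 'b', 4), ('c', 'd', 5),
          ('b', 'a', 5), ('c', 'b', 6)]

def aggregate(events, time_window=None):
    """Count events per directed edge, optionally restricted to [start, end)."""
    if time_window is not None:
        start, end = time_window
        events = [e for e in events if start <= e[2] < end]
    return Counter((v, w) for v, w, _ in events)

full = aggregate(tedges)
windowed = aggregate(tedges, time_window=(1, 3))

print(len(full), dict(windowed))  # 6 {('a', 'b'): 2}
```

In the full aggregation, every edge also appears in the opposite direction, which is consistent with the static graph being reported as undirected.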



Finally, we can perform a rolling window analysis:

In [21]:
r = pp.algorithms.RollingTimeWindow(t, 3, 1, return_window=True)
for g, w in r:
    print('Time window ', w)
    print(g)
    print(g.data.edge_index)
    print('---')

Time window  (1.0, 4.0)
Directed graph with 3 nodes and 3 edges

Edge attributes
	edge_weight		<class 'torch.Tensor'> -> torch.Size([3])

Graph attributes
	num_nodes		<class 'int'>

EdgeIndex([[0, 1, 1],
           [1, 2, 0]], device='cuda:0', sparse_size=(3, 3), nnz=3,
          sort_order=row)
---
Time window  (2.0, 5.0)
Directed graph with 4 nodes and 5 edges

Edge attributes
	edge_weight		<class 'torch.Tensor'> -> torch.Size([5])

Graph attributes
	num_nodes		<class 'int'>

EdgeIndex([[0, 1, 1, 2, 3],
           [1, 2, 0, 1, 2]], device='cuda:0', sparse_size=(4, 4), nnz=5,
          sort_order=row)
---
Time window  (3.0, 6.0)
Undirected graph with 4 nodes and 6 (directed) edges

Edge attributes
	edge_weight		<class 'torch.Tensor'> -> torch.Size([6])

Graph attributes
	num_nodes		<class 'int'>

EdgeIndex([[0, 1, 1, 2, 2, 3],
           [1, 2, 0, 3, 1, 2]], device='cuda:0', sparse_size=(4, 4), nnz=6,
          sort_order=row)
---
Time window  (4.0, 7.0)
Directed graph with 4 nodes an
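The rolling window analysis can be sketched in plain Python as a generator that slides a half-open window of size 3 over the events in steps of 1. The helper `rolling_windows` and its stopping rule (stop once the window start exceeds the last timestamp) are assumptions made for illustration; they are not taken from the `RollingTimeWindow` implementation.

```python
from collections import Counter

# Time-stamped edges from the example above.
tedges = [('a', 'b', 1), ('a', 'b', 2), ('b', 'a', 3), ('b', 'c', 3),
          ('d', 'c', 4), ('a', 'b', 4), ('c', 'b', 4), ('c', 'd', 5),
          ('b', 'a', 5), ('c', 'b', 6)]

def rolling_windows(events, window_size, step_size):
    """Yield ((start, end), edge_weights) for windows [start, end)."""
    t_min = min(ts for _, _, ts in events)
    t_max = max(ts for _, _, ts in events)
    start = t_min
    while start <= t_max:  # assumed stopping rule
        end = start + window_size
        weights = Counter((v, w) for v, w, ts in events if start <= ts < end)
        yield (start, end), weights
        start += step_size

results = list(rolling_windows(tedges, window_size=3, step_size=1))
for window, weights in results:
    print('Time window', window, '->', len(weights), 'edges')
```

For the first three windows, this yields 3, 5, and 6 edges, matching the output above.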

## Temporal Graphs

Let's start with a simple temporal graph with four nodes `a`,`b`,`c`,`d` and seven timestamped edges `(b,c;2)`,`(a,b;1)`,`(c,d;3)`,`(d,a;4)`,`(b,d;2)`, `(d,a;6)`,`(a,b;7)`. 

The following code generates this temporal graph from the given edge list.

In [29]:
g = pp.TemporalGraph.from_edge_list([['b', 'c', 2],['a', 'b', 1], ['c', 'd', 3], ['d', 'a', 4], ['b', 'd', 2], ['d', 'a', 6], ['a', 'b', 7]])
print(g)

Temporal Graph with 4 nodes, 5 unique edges and 7 events in [1.0, 7.0]

Graph attributes
	src		<class 'torch.Tensor'> -> torch.Size([7])
	dst		<class 'torch.Tensor'> -> torch.Size([7])
	t		<class 'torch.Tensor'> -> torch.Size([7])



We can visualize a temporal graph by using the pathpyG plot function.

In [30]:
pp.plot(g, edge_color='lightgray')

<pathpyG.visualisations.network_plots.TemporalNetworkPlot at 0x7f32c6f8f730>

Consistent with `pyG`, the sources, destinations, and timestamps are stored in a `pyG` `TemporalData` object, which we can access as follows.



In [31]:
g.data

TemporalData(src=[7], dst=[7], t=[7])

In [32]:
print(g.data.t)

tensor([1., 2., 2., 3., 4., 6., 7.], device='cuda:0')


With the generator functions `edges` and `temporal_edges` we can iterate through the (temporal) edges of this graph.

In [33]:
for v, w in g.edges:
    print(v, w)

a b
b c
b d
c d
d a
d a
a b


In [34]:
for v, w, t in g.temporal_edges:
    print(v, w, t)

a b 1.0
b c 2.0
b d 2.0
c d 3.0
d a 4.0
d a 6.0
a b 7.0


## Extracting Time-Respecting Paths

We are often interested in the time-respecting paths of a temporal graph.

A time-respecting path is a sequence of nodes $v_0,\ldots,v_l$ in which consecutive edges occur in the correct temporal order and the time difference between two consecutive edges is at most $\delta \in \mathbb{N}$.

To calculate time-respecting paths in a temporal graph, we can construct a time-unfolded directed acyclic graph (DAG), where each node is a time-stamped edge $(u,v;t)$ and two nodes representing time-stamped edges $(u,v;t_1)$ and $(v,w;t_2)$ are connected by a (second-order) edge iff $0 < t_2-t_1 \leq \delta$.
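The connection rule can be sketched in plain Python for the toy graph from the previous section. This is a brute-force illustration with quadratic cost, not pathpyG's implementation, and `event_dag_edges` is a hypothetical helper name.

```python
# Toy temporal graph from the previous section.
events = [('b', 'c', 2), ('a', 'b', 1), ('c', 'd', 3), ('d', 'a', 4),
          ('b', 'd', 2), ('d', 'a', 6), ('a', 'b', 7)]

def event_dag_edges(events, delta):
    """Connect (u, v; t1) to (v, w; t2) iff 0 < t2 - t1 <= delta."""
    return [(e1, e2)
            for e1 in events
            for e2 in events
            if e1[1] == e2[0] and 0 < e2[2] - e1[2] <= delta]

dag_edges = event_dag_edges(events, delta=1)
for e1, e2 in sorted(dag_edges, key=lambda pair: (pair[0][2], pair[1][2])):
    print(e1, '->', e2)
print(len(dag_edges))  # 5
```

Because every edge of this graph points strictly forward in time, the result is acyclic by construction.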

For the toy example above, we can construct the time-unfolded DAG as follows:

In [36]:
dag = pp.algorithms.temporal_graph_to_event_dag(g, delta=1, create_mapping=True)
pp.plot(dag, node_label = [str(dag.mapping.to_id(i)) for i in range(dag.N)])


<pathpyG.visualisations.network_plots.StaticNetworkPlot at 0x7f32b00f5540>

For $\delta=1$, this DAG with two connected components tells us that there are the following three time-respecting paths:

a -> b -> c -> d -> a  
a -> b -> d  
d -> a -> b


We can use the function `pp.algorithms.time_respecting_paths` to calculate all (longest) time-respecting paths:

In [39]:
pp.algorithms.time_respecting_paths(g, delta=1)

Constructed temporal event DAG with 7 nodes and 5 edges
Processing root 1/2


defaultdict(<function pathpyG.algorithms.temporal.time_respecting_paths.<locals>.<lambda>()>,
            {2: [['a', 'b', 'd'], ['d', 'a', 'b']],
             4: [['a', 'b', 'c', 'd', 'a']]})
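The result above can be reproduced with a short plain-Python sketch that walks the time-unfolded DAG from every root (an event without predecessor) to every leaf and groups the resulting node sequences by path length. This illustrates the idea only; it is not the algorithm used inside `time_respecting_paths`, and all names are hypothetical.

```python
from collections import defaultdict

# Toy temporal graph from above.
events = [('b', 'c', 2), ('a', 'b', 1), ('c', 'd', 3), ('d', 'a', 4),
          ('b', 'd', 2), ('d', 'a', 6), ('a', 'b', 7)]
delta = 1

# Successors of each event in the time-unfolded DAG.
successors = {e: [f for f in events if e[1] == f[0] and 0 < f[2] - e[2] <= delta]
              for e in events}
has_predecessor = {f for succ in successors.values() for f in succ}
roots = [e for e in events if e not in has_predecessor]

def walk(event, prefix):
    """Depth-first search yielding the node sequence of each root-to-leaf path."""
    path = prefix + [event[1]]
    if not successors[event]:
        yield path
    for nxt in successors[event]:
        yield from walk(nxt, path)

paths_by_length = defaultdict(list)
for root in roots:
    for path in walk(root, [root[0]]):
        paths_by_length[len(path) - 1].append(path)

print(len(roots), dict(paths_by_length))
```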

The following function computes all shortest time-respecting paths between all pairs of nodes:

In [47]:
shortest_paths, distances = pp.algorithms.temporal_shortest_paths(g, delta=1)
print(shortest_paths['a'])
print(shortest_paths['b'])
print(shortest_paths['c'])
print(shortest_paths['d'])

Constructed temporal event DAG with 7 nodes and 5 edges
Processing root 1/2
defaultdict(<class 'set'>, {'d': {('a', 'b', 'd')}, 'a': {('a', 'b', 'c', 'd', 'a')}})
defaultdict(<class 'set'>, {})
defaultdict(<class 'set'>, {})
defaultdict(<class 'set'>, {'b': {('d', 'a', 'b')}})


## Higher-Order De Bruijn Graph Models for Time-Respecting Paths

In [57]:
m = pp.MultiOrderModel.from_temporal_graph(g, delta=1, max_order=4)
print(m.layers[1])
print(m.layers[2])
print(m.layers[3])
print(m.layers[4])

Directed graph with 4 nodes and 5 edges

Node attributes
	node_sequence		<class 'torch.Tensor'> -> torch.Size([4, 1])

Edge attributes
	edge_weight		<class 'torch.Tensor'> -> torch.Size([5])

Graph attributes
	num_nodes		<class 'int'>

Directed graph with 5 nodes and 5 edges

Node attributes
	node_sequence		<class 'torch.Tensor'> -> torch.Size([5, 2])

Edge attributes
	edge_weight		<class 'torch.Tensor'> -> torch.Size([5])

Graph attributes
	num_nodes		<class 'int'>

Directed graph with 5 nodes and 2 edges

Node attributes
	node_sequence		<class 'torch.Tensor'> -> torch.Size([5, 3])

Edge attributes
	edge_weight		<class 'torch.Tensor'> -> torch.Size([2])

Graph attributes
	num_nodes		<class 'int'>

Directed graph with 2 nodes and 1 edges

Node attributes
	node_sequence		<class 'torch.Tensor'> -> torch.Size([2, 4])

Edge attributes
	edge_weight		<class 'torch.Tensor'> -> torch.Size([1])

Graph attributes
	num_nodes		<class 'int'>
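The edge counts printed above can be reproduced for the toy graph with a plain-Python sketch. It assumes that the edges of the layer of order $k$ correspond to the distinct node sequences traversed by $k$ consecutive events in the time-unfolded DAG, and that the nodes of that layer are the edges of the layer of order $k-1$. This interpretation is inferred from the output rather than from the library's documentation, and the helper names are hypothetical.

```python
# Toy temporal graph from above.
events = [('b', 'c', 2), ('a', 'b', 1), ('c', 'd', 3), ('d', 'a', 4),
          ('b', 'd', 2), ('d', 'a', 6), ('a', 'b', 7)]
delta = 1

# Successors of each event in the time-unfolded DAG.
successors = {e: [f for f in events if e[1] == f[0] and 0 < f[2] - e[2] <= delta]
              for e in events}

def event_walks(k):
    """All sequences of k consecutive events in the time-unfolded DAG."""
    walks = [[e] for e in events]
    for _ in range(k - 1):
        walks = [walk + [nxt] for walk in walks for nxt in successors[walk[-1]]]
    return walks

def layer_edges(k):
    """Distinct node sequences with k + 1 nodes (assumed edges of layer k)."""
    return {(walk[0][0],) + tuple(e[1] for e in walk) for walk in event_walks(k)}

edge_counts = [len(layer_edges(k)) for k in range(1, 5)]
print(edge_counts)  # [5, 5, 2, 1]
```

Under this reading, the node count of each layer equals the edge count of the layer below it, which matches the 5, 5, and 2 nodes reported for layers 2 to 4.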



In [59]:
pp.plot(m.layers[1], node_label=[v for v in m.layers[1].nodes])

<pathpyG.visualisations.network_plots.StaticNetworkPlot at 0x7f3299c34be0>

In [60]:
pp.plot(m.layers[2], node_label=[v for v in m.layers[2].nodes])

<pathpyG.visualisations.network_plots.StaticNetworkPlot at 0x7f3299c34ca0>

In [61]:
pp.plot(m.layers[3], node_label=[v for v in m.layers[3].nodes])

<pathpyG.visualisations.network_plots.StaticNetworkPlot at 0x7f3299c33610>

In [62]:
pp.plot(m.layers[4], node_label=[v for v in m.layers[4].nodes])

<pathpyG.visualisations.network_plots.StaticNetworkPlot at 0x7f3299c33730>

## Analysis of empirical temporal graphs

We can read temporal graphs from CSV files in which each line contains the source, target, and timestamp of an edge:

In [63]:
t = pp.TemporalGraph.from_csv('../data/ants_1_1.tedges')
print(t)

Temporal Graph with 89 nodes, 947 unique edges and 1911 events in [0.0, 1438.0]

Graph attributes
	src		<class 'torch.Tensor'> -> torch.Size([1911])
	dst		<class 'torch.Tensor'> -> torch.Size([1911])
	t		<class 'torch.Tensor'> -> torch.Size([1911])
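To make the summary line above concrete, the following sketch parses a small in-memory CSV with Python's standard `csv` module and derives the same statistics. The comma delimiter, the absence of a header row, and the three sample rows are assumptions for illustration; they are not taken from the actual data file.

```python
import csv
import io

# Hypothetical sample in an assumed "source,target,time" layout.
sample = io.StringIO("a,b,1\nb,c,2\na,b,4\n")

# Parse each row into a (source, target, timestamp) tuple.
events = [(src, dst, float(ts)) for src, dst, ts in csv.reader(sample)]

nodes = {n for src, dst, _ in events for n in (src, dst)}
unique_edges = {(src, dst) for src, dst, _ in events}
t_min = min(ts for _, _, ts in events)
t_max = max(ts for _, _, ts in events)

print(f"{len(nodes)} nodes, {len(unique_edges)} unique edges "
      f"and {len(events)} events in [{t_min}, {t_max}]")
```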





In [65]:
paths = pp.algorithms.time_respecting_paths(t, delta=5)

Constructed temporal event DAG with 1910 nodes and 562 edges
Processing root 1/1416
Processing root 11/1416
Processing root 21/1416
Processing root 31/1416
Processing root 41/1416
Processing root 51/1416
Processing root 61/1416
Processing root 71/1416
Processing root 81/1416
Processing root 91/1416
Processing root 101/1416
Processing root 111/1416
Processing root 121/1416
Processing root 131/1416
Processing root 141/1416
Processing root 151/1416
Processing root 161/1416
Processing root 171/1416
Processing root 181/1416
Processing root 191/1416
Processing root 201/1416
Processing root 211/1416
Processing root 221/1416
Processing root 231/1416
Processing root 241/1416
Processing root 251/1416
Processing root 261/1416
Processing root 271/1416
Processing root 281/1416
Processing root 291/1416
Processing root 301/1416
Processing root 311/1416
Processing root 321/1416
Processing root 331/1416
Processing root 341/1416
Processing root 351/1416
Processing root 361/1416
Processing root 371/1416


In [66]:
m = pp.MultiOrderModel.from_temporal_graph(t, delta=30, max_order=4)
print(m.layers[1])
print(m.layers[2])
print(m.layers[3])
print(m.layers[4])

Directed graph with 89 nodes and 947 edges

Node attributes
	node_sequence		<class 'torch.Tensor'> -> torch.Size([89, 1])

Edge attributes
	edge_weight		<class 'torch.Tensor'> -> torch.Size([947])

Graph attributes
	num_nodes		<class 'int'>

Directed graph with 947 nodes and 1780 edges

Node attributes
	node_sequence		<class 'torch.Tensor'> -> torch.Size([947, 2])

Edge attributes
	edge_weight		<class 'torch.Tensor'> -> torch.Size([1780])

Graph attributes
	num_nodes		<class 'int'>

Directed graph with 1780 nodes and 2410 edges

Node attributes
	node_sequence		<class 'torch.Tensor'> -> torch.Size([1780, 3])

Edge attributes
	edge_weight		<class 'torch.Tensor'> -> torch.Size([2410])

Graph attributes
	num_nodes		<class 'int'>

Directed graph with 2410 nodes and 3292 edges

Node attributes
	node_sequence		<class 'torch.Tensor'> -> torch.Size([2410, 4])

Edge attributes
	edge_weight		<class 'torch.Tensor'> -> torch.Size([3292])

Graph attributes
	num_nodes		<class 'int'>



In [67]:
t = pp.TemporalGraph.from_csv('../data/manufacturing_email.tedges')
print(t)

Temporal Graph with 167 nodes, 5784 unique edges and 82927 events in [1262454016.0, 1285884544.0]

Graph attributes
	src		<class 'torch.Tensor'> -> torch.Size([82927])
	dst		<class 'torch.Tensor'> -> torch.Size([82927])
	t		<class 'torch.Tensor'> -> torch.Size([82927])





In [68]:
paths = pp.algorithms.time_respecting_paths(t, delta=240)

Constructed temporal event DAG with 82887 nodes and 9779 edges
Processing root 1/74729
Processing root 11/74729
Processing root 21/74729
Processing root 31/74729
Processing root 41/74729
Processing root 51/74729
Processing root 61/74729
Processing root 71/74729
Processing root 81/74729
Processing root 91/74729
Processing root 101/74729
Processing root 111/74729
Processing root 121/74729
Processing root 131/74729
Processing root 141/74729
Processing root 151/74729
Processing root 161/74729
Processing root 171/74729
Processing root 181/74729
Processing root 191/74729
Processing root 201/74729
Processing root 211/74729
Processing root 221/74729
Processing root 231/74729
Processing root 241/74729
Processing root 251/74729
Processing root 261/74729
Processing root 271/74729
Processing root 281/74729
Processing root 291/74729
Processing root 301/74729
Processing root 311/74729
Processing root 321/74729
Processing root 331/74729
Processing root 341/74729
Processing root 351/74729
Processing r

In [70]:
m = pp.MultiOrderModel.from_temporal_graph(t, delta=240, max_order=4)
print(m.layers[1])
print(m.layers[2])
print(m.layers[3])
print(m.layers[4])

Directed graph with 167 nodes and 5784 edges

Node attributes
	node_sequence		<class 'torch.Tensor'> -> torch.Size([167, 1])

Edge attributes
	edge_weight		<class 'torch.Tensor'> -> torch.Size([5784])

Graph attributes
	num_nodes		<class 'int'>

Directed graph with 5784 nodes and 3542 edges

Node attributes
	node_sequence		<class 'torch.Tensor'> -> torch.Size([5784, 2])

Edge attributes
	edge_weight		<class 'torch.Tensor'> -> torch.Size([3542])

Graph attributes
	num_nodes		<class 'int'>

Directed graph with 3542 nodes and 812 edges

Node attributes
	node_sequence		<class 'torch.Tensor'> -> torch.Size([3542, 3])

Edge attributes
	edge_weight		<class 'torch.Tensor'> -> torch.Size([812])

Graph attributes
	num_nodes		<class 'int'>

Directed graph with 812 nodes and 156 edges

Node attributes
	node_sequence		<class 'torch.Tensor'> -> torch.Size([812, 4])

Edge attributes
	edge_weight		<class 'torch.Tensor'> -> torch.Size([156])

Graph attributes
	num_nodes		<class 'int'>

