<div class="alert alert-danger">
This tutorial uses the alpha version of pathpy 3, so the code is likely to change. To complete the tutorial successfully, use pathpy version v3.0.0a1.
</div>

# Higher-order Network Analysis and Visualisation with `pathpy`

**Jürgen Hackl**  
Assistant Professor  
Department of Civil Engineering and Industrial Design  
School of Engineering  
University of Liverpool, UK   

**January 23 2020**

1. [Introduction to `pathpy`](#part-1)
2. [(Spatial)-Temporal Network Analysis and Visualisation in `pathpy`](#part-2)
3. [Higher-order Models of Paths](#part-3)
4. [Multi-order Model Selection](#part-4)

# 1 Introduction to `pathpy` <a class="anchor" id="part-1"></a>

In the introductory lecture, we have seen that higher-order modelling, visualisation, and analysis techniques are useful to analyse **temporal network data** that provide us with **statistics of causal paths**. But how can we apply higher-order network analytics to such data in practice?

In this tutorial, we introduce [``pathpy``](http://www.pathpy.net), an open-source `python` package that provides higher-order data analytics and representation learning techniques. It contains data structures, algorithms, data import/export methods, and visualisation techniques for various types of time series data on complex networks.

`pathpy` is pure `python` code with no platform-specific dependencies. It only depends on `numpy`, `scipy` and `pandas`, which come with `Anaconda`, so it should be easy to install.

In principle, installing the latest 3.0.0a1 version of `pathpy` should be as easy as running:

```
pip install pathpy3==3.0.0a1
```

on the terminal. In any case, you can find more detailed setup instructions on the [tutorial website](https://pathpy.github.io/pathpy-tutorials/setup).

<div class="alert alert-warning">
Please ensure that ipywidgets is installed and enabled. To do so, run:<br><br>
    pip install ipywidgets<br>
    jupyter nbextension enable --py widgetsnbextension
</div>

<span style="color:red">**TODO:** Import the package `pathpy` and rename it to `pp`.</span>

In [1]:
import pathpy as pp

## Default network functionalities

Before we start exploring paths in more detail, let us have a look at some default network functionalities pathpy comes with. A major change in the new version 3 is its focus on object-oriented programming, i.e. we now have objects for **Nodes**, **Edges**, **Paths**, **Networks**, and so on.

<span style="color:red">**TODO:** Create a `Node` object by calling the constructor with a unique identifier `uid` (i.e. the name of the node).</span>

In [2]:
a = pp.Node('a')

The node object allows us to store node-specific data that we can later use for our analysis. It acts like a dictionary with keys and values.

<span style="color:red">**TODO:** Add a 'color' attribute to the node.</span>

In [3]:
a['color'] = 'red'

Similar to a `Node` we can create an `Edge` object. In order to generate such an object, we need a second `Node`. We can create the node as before or together with the `Edge`.

<div class="alert alert-info">
By default, pathpy allows multiple edges between any pair of nodes, since every edge has its own uid. If no uid is assigned, pathpy automatically generates one, based on the node properties.
</div>

<span style="color:red">**TODO:** Create an `Edge` between node `a` and a new node called `"c"`.</span>

In [4]:
e = pp.Edge(a,"c")

Here we created a new `Node` `c` and an `Edge`. Since we did not specify the uid of the edge, pathpy assigned one for us based on the two node uids. To see the edge uid, let us print the object.

<span style="color:red">**TODO:** Print the `Edge` object which gives you the assigned uid.</span>

In [5]:
e

Edge a-c

<div class="alert alert-info">
By default, the edge uid uses '-' to separate the node uids. This can be customized in your config file.
</div>

Finally, let us create a `Network` and add our nodes and edges. 

<span style="color:red">**TODO:** Create a new `Network` object and add `Edge` e.</span>

In [6]:
net = pp.Network()
net.add_edge(e)

<span style="color:red">**TODO:** Show a summary of the Network.</span>

In [7]:
net.summary()

[01-16 09:11:53: INFO ] Name:
[01-16 09:11:53: INFO ] Type:			DirectedNetwork
[01-16 09:11:53: INFO ] Directed:		True
[01-16 09:11:53: INFO ] Number of unique nodes:	2
[01-16 09:11:53: INFO ] Number of unique edges:	1
[01-16 09:11:53: INFO ] Number of unique paths:	0
[01-16 09:11:53: INFO ] Number of total paths:	0


The summary reveals that our network is directed, has two nodes and one edge.

<div class="alert alert-info">
By default, edges in pathpy are directed. If no type is specified, pathpy chooses the Network type by itself.
</div>

An essential concept of pathpy is that paths are also valid and vital building blocks of networks; therefore, the summary also reports the number of unique and total paths.

<div class="alert alert-info">
Pathpy also allows edges to be considered as paths of length 1. This can be enabled in the config file.
</div>

Before we now take a closer look at paths and their analysis, let's have a last look at our network.

<span style="color:red">**TODO:** Plot the network.</span>

In [8]:
net.plot()

<pathpy.visualizations.plot.Plot at 0x7faea33ded60>

## Paths

A core functionality of `pathpy` is to read, calculate, store, manipulate, and model path statistics extracted from different kinds of temporal data on complex networks. For this `pathpy` provides the class `Path`, which can store collections of paths with varying lengths.

<span style="color:red">**TODO:** Create a path from node $a \rightarrow c \rightarrow d$.</span>

In [9]:
p = pp.Path('a-c-d')

Similar to edges, we can create paths by using a `string` notation. Here pathpy automatically creates three nodes (`a`, `c`, and `d`) as well as two edges `a-c` and `c-d`. With the function `summary`, we can get more detail about our path.

<span style="color:red">**TODO:** Show a summary of the path instance `p`.</span>

In [10]:
p.summary()

[01-16 09:11:53: INFO ] Name:			a-c|c-d
[01-16 09:11:53: INFO ] Type:			Path
[01-16 09:11:53: INFO ] Directed:		True
[01-16 09:11:53: INFO ] Number of unique nodes:	3
[01-16 09:11:53: INFO ] Number of unique edges:	2
[01-16 09:11:53: INFO ] Path length (# edges):	2


<div class="alert alert-info">
If no dedicated path uid is defined, pathpy creates the uid based on the observed edge uids, separated by '|'. This also can be customized in your config file.   
</div>
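The naming scheme described above can be sketched in plain Python. This is only an illustration of the convention, not pathpy's own implementation; the helper names `edge_uids` and `path_uid` are hypothetical, and the separators are the defaults mentioned in the notes above.

```python
# Illustrative only: mimics the uid convention described in the text,
# it does not call pathpy. Both helper names are hypothetical.

def edge_uids(path, node_sep="-"):
    """Split a node string such as 'a-c-d' into consecutive edge uids."""
    nodes = path.split(node_sep)
    return [f"{v}{node_sep}{w}" for v, w in zip(nodes, nodes[1:])]


def path_uid(path, node_sep="-", edge_sep="|"):
    """Join the edge uids of a path with the path separator."""
    return edge_sep.join(edge_uids(path, node_sep))


print(edge_uids("a-c-d"))  # ['a-c', 'c-d']
print(path_uid("a-c-d"))   # a-c|c-d
```

The second output matches the `Name` field reported by `p.summary()`.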

Let us add ten observations of our path `p` to our network for further analysis.

<span style="color:red">**TODO:** Add path `p` with a frequency of 10 to the `net` object and print a new summary of the network.</span>

In [11]:
net.add_path(p,frequency=10)
net.summary()

[01-16 09:11:53: INFO ] Name:
[01-16 09:11:53: INFO ] Type:			DirectedNetwork
[01-16 09:11:53: INFO ] Directed:		True
[01-16 09:11:53: INFO ] Number of unique nodes:	3
[01-16 09:11:53: INFO ] Number of unique edges:	2
[01-16 09:11:53: INFO ] Number of unique paths:	1
[01-16 09:11:53: INFO ] Number of total paths:	10


Now our network consists of three unique nodes, two unique edges and one unique path. This path is observed ten times in our network.

To analyze **path data**, `pathpy` provides us with various  **statistics of causal paths**. One of them is a tool for **sub-path statistics**.

Let's generate a summary for the sub-paths in our example and discuss the findings.

<span style="color:red">**TODO:** Generate a summary of the sub-path statistics using the `subpaths` attribute of the network.</span>

In [12]:
net.subpaths.summary()

[01-16 09:11:53: INFO ] Sub path statistics
[01-16 09:11:53: INFO ] - General --------------------------------------------
[01-16 09:11:53: INFO ] Number of unique nodes:            3.000
[01-16 09:11:53: INFO ] Number of unique edges:            2.000
[01-16 09:11:53: INFO ] Number of unique paths:            1.000
[01-16 09:11:53: INFO ] - Path statistics ------------------------------------
[01-16 09:11:53: INFO ] Mean path length:                  2.000
[01-16 09:11:53: INFO ] Standard derivation:               0.000
[01-16 09:11:53: INFO ] Min. path length:                  2.000
[01-16 09:11:53: INFO ] 25% quantile:                      2.000
[01-16 09:11:53: INFO ] 50% quantile:                      2.000
[01-16 09:11:53: INFO ] 75% quantile:                      2.000
[01-16 09:11:53: INFO ] Max. path length:                  2.000
[01-16 09:11:53: INFO ] - Sub path statistics --------------------------------
[01-16 09:11:53: INFO ]  path  |          frequencies          |    u



We get summary statistics of the `net` instance. Our toy example contains ten observed paths between three nodes. These paths imply a graph topology with two edges `a-c` and `c-d`. Both the maximum and the average path length are two (the path length counts the number of edge traversals of a path).

To understand the last table, we must look into the inner workings of `pathpy`. For the fitting of higher-order graphical models as well as for the representation learning algorithm, `pathpy` uses all path statistics available. Specifically to fit, say, a second-order model to a set of paths that all have length 10 or longer, we calculate which paths of length two are contained as sub-paths within these observations of longer paths. For this reason, `pathpy` automatically computes the statistics of actual path observations as well as the statistics of **sub-paths** contained in these observed paths.

In our case, we have ten observations of a single path `a-c-d` of length two, thus the last line in the output above. Each of these paths additionally contains two sub-paths `a-c` and `c-d` of length one, hence the number 20 in the sub-path count for length = 1. Finally, each of the paths contains three "paths" of length zero, which are just observations of a single node (i.e. there is no transition across an edge), thus the sub-path count of 30 for length = 0. This amounts to a total of 50 possible sub-paths and ten observations of an actual (longest) path.
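The counting argument above can be checked with a few lines of plain Python. This is a minimal sketch that does not use pathpy; the helper `subpath_counts` is a hypothetical name introduced for illustration. It relies on the fact that a path with $n$ edges contains $n - k + 1$ contiguous sub-paths of length $k$.

```python
from collections import Counter


def subpath_counts(path, frequency, node_sep="-"):
    """Count contiguous sub-paths per length for a path observed `frequency` times."""
    n_edges = len(path.split(node_sep)) - 1
    counts = Counter()
    for k in range(n_edges + 1):
        # a path with n_edges edges contains n_edges - k + 1 sub-paths of length k
        counts[k] = frequency * (n_edges - k + 1)
    return counts


counts = subpath_counts("a-c-d", frequency=10)
print(counts[2], counts[1], counts[0])  # 10 20 30
print(counts[0] + counts[1])            # 50 shorter sub-paths
```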

Let's add a second path and see how the sub-path statistics change.

<span style="color:red">**TODO:** Add path `"b-c-e"` ten times to the network and generate new sub-path statistics.</span>

In [13]:
net.add_path('b-c-e',frequency=10)
net.subpaths.summary()

[01-16 09:11:53: INFO ] Sub path statistics
[01-16 09:11:53: INFO ] - General --------------------------------------------
[01-16 09:11:53: INFO ] Number of unique nodes:            5.000
[01-16 09:11:53: INFO ] Number of unique edges:            4.000
[01-16 09:11:53: INFO ] Number of unique paths:            2.000
[01-16 09:11:53: INFO ] - Path statistics ------------------------------------
[01-16 09:11:53: INFO ] Mean path length:                  2.000
[01-16 09:11:53: INFO ] Standard derivation:               0.000
[01-16 09:11:53: INFO ] Min. path length:                  2.000
[01-16 09:11:53: INFO ] 25% quantile:                      2.000
[01-16 09:11:53: INFO ] 50% quantile:                      2.000
[01-16 09:11:53: INFO ] 75% quantile:                      2.000
[01-16 09:11:53: INFO ] Max. path length:                  2.000
[01-16 09:11:53: INFO ] - Sub path statistics --------------------------------
[01-16 09:11:53: INFO ]  path  |          frequencies          |    u

We obtain a new **sub-path statistic** with 20 observed paths between five nodes `a`, `b`, `c`, `d`, and `e` across four edges `a-c`, `c-d`, `b-c` and `c-e`.

Let's have a look at our network before we continue with temporal networks.

<span style="color:red">**TODO:** Plot the network. (Bonus: Add the number of node observations.)</span>

In [14]:
net.plot(node_text=net.nodes.counter())

<pathpy.visualizations.plot.Plot at 0x7faea3385520>

# 2 (Spatial)-Temporal Network Analysis and Visualisation in `pathpy`<a class="anchor" id="part-2"></a>

We have considered the `Path` class, which is useful if you have direct access to path statistics in your time series data. This includes clickstreams of users in information networks, origin-destination statistics in transportation networks, flight ticket sequences, or other **collections of short, ordered sequences**.

In this section, we expand this view towards temporal networks, i.e. high-resolution time-series network data, where edges carry fine-grained time stamps. Considering technical, social, and biological systems that can be modelled as dynamic networks, such data cover a broad class of complex systems that can be studied with higher-order network models.

## Temporal Networks

`pathpy` natively provides special support for the analysis of temporal networks. It is suitable for data that captures time-stamped nodes and edges occurring at discrete time $t$. Let us start by creating an empty network.

<span style="color:red">**TODO:** Create a new instance `tmp` of the `Network` class and print a summary of the instance.</span>

In [15]:
tmp = pp.Network()
tmp.summary()

[01-16 09:11:53: INFO ] Name:
[01-16 09:11:53: INFO ] Type:			DirectedNetwork
[01-16 09:11:53: INFO ] Directed:		True
[01-16 09:11:53: INFO ] Number of unique nodes:	0
[01-16 09:11:53: INFO ] Number of unique edges:	0
[01-16 09:11:53: INFO ] Number of unique paths:	0
[01-16 09:11:53: INFO ] Number of total paths:	0


As expected, we get a "normal" directed Network with no nodes, edges, or paths. 

In the next step, let us add some time-stamped edges to this network. `pathpy` automatically identifies temporal attributes of your objects. The supported temporal attributes are:

|key|description|
|---|-----------|
|start|time-stamp when the object starts to be active|
|end|time-stamp when the object becomes inactive|
|time|time-stamp when an action or event happened|
|duration|duration of an action or event|

Time-stamps can be of type `int`, `str` or `pandas.Timestamp`. An `int` value represents a **discrete time unit**, which by default is one second; this can be changed in the config file. Finally, `pathpy` converts all time-stamps into `pandas.Timestamp`, which makes searching and data handling much more comfortable.
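To see what interpreting an `int` as seconds means, here is a minimal sketch using only the standard library. It is not pathpy's conversion code: `datetime` stands in for `pandas.Timestamp`, the helper `to_timestamp` is hypothetical, and the origin 1970-01-01 (the Unix epoch) is an assumption.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)  # assumed origin (Unix epoch)


def to_timestamp(t, unit_seconds=1):
    """Map a discrete time step to a calendar time-stamp."""
    return EPOCH + timedelta(seconds=t * unit_seconds)


for t in (0, 5, 10):
    print(t, "->", to_timestamp(t))
# 0 -> 1970-01-01 00:00:00
# 5 -> 1970-01-01 00:00:05
# 10 -> 1970-01-01 00:00:10
```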

<span style="color:red">**TODO:** Add to the network a temporal edge between nodes `"a"` and `"c"` that exists for 10 seconds. Plot the network.</span>

In [16]:
tmp.add_edge('a-c',start=0,end=10)
tmp.plot()

<pathpy.visualizations.plot.Plot at 0x7faea3385b80>

We get a similar plot as before, however with an additional menu bar and time slider. When we click play, nothing happens since we only consider one temporal edge. To see some changes we have to add another edge.

<span style="color:red">**TODO:** Add a second edge between nodes `"c"` and `"d"` which starts at 5 seconds and ends after 10 (i.e. at time-stamp 15). Plot the network.</span>

In [17]:
tmp.add_edge('c-d',start=5,duration=10)
tmp.plot()

<pathpy.visualizations.plot.Plot at 0x7faea3397df0>

Now, when we click play, we see that the connections change over time. Of course, speed, size, colours, and other properties of the visualization can be adjusted (which will be shown later).

Two important parameters which will influence the force-directed layout algorithm - that is used to position nodes in the network - can be modified directly from the menu bar. In a temporal network, the question is which time-stamped edges should be taken into account for the force-calculation at any given time-stamp. If we only consider currently active edges, the layout will change too fast to recognize interesting patterns. If we consider all edges at every time step, node positions will be static despite the dynamics of edges. In real settings, we want a compromise between those extremes, i.e. we specify a time window around the current time-stamp within which edges are taken into account in the force-directed layout calculation. 

With the button **Aggregation** we can aggregate time-steps. When we slide the aggregation window to the end, we get the aggregated network.

With the button **Lookout** we can create a window around the current time-stamp within which edges are taken into account in the force-directed layout calculation.
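The idea of a time window can be sketched in plain Python. This is an illustration of the concept only, not the code behind the plot; the function `edges_in_window` and its `lookout` parameter are hypothetical names. An edge is taken into account if its active interval overlaps the window around the current time-stamp.

```python
# Temporal edges of the toy example as (uid, start, end)
edges = [("a-c", 0, 10), ("c-d", 5, 15)]


def edges_in_window(edges, t, lookout):
    """Return the uids of edges active somewhere in [t - lookout, t + lookout]."""
    lo, hi = t - lookout, t + lookout
    return [uid for uid, start, end in edges if start <= hi and end >= lo]


print(edges_in_window(edges, t=2, lookout=0))   # ['a-c']
print(edges_in_window(edges, t=2, lookout=3))   # ['a-c', 'c-d']
print(edges_in_window(edges, t=12, lookout=0))  # ['c-d']
```

A window of zero reproduces the "currently active edges only" extreme, while a window that covers the whole observation period corresponds to the static layout.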

Besides activating and deactivating edges, `pathpy` also allows us to add, manipulate and visualize temporal properties of nodes and edges.

<span style="color:red">**TODO:** Add a temporal color to node `"a"` and update this color over time. Plot the network.</span>

In [18]:
tmp.nodes['a'].update(color='red',start=0,end=15)
tmp.nodes['a'].update(color='orange',time=5)
tmp.nodes['a'].update(color='green',time=10)
tmp.plot()

<pathpy.visualizations.plot.Plot at 0x7faea2b29220>

Visualizations are a great tool to communicate and show the complexity of our systems. To analyze them, however, we need the data in a more compact format. `pathpy` allows us to convert the (temporal) data into a `pandas.DataFrame`, which is a standard data frame format for most ML toolboxes.

<span style="color:red">**TODO:** Convert the temporal edges of `tmp` into a pandas table and rescale the observation frequencies to a 5 second interval (`"5s"`).</span>

In [19]:
tmp.edges.to_temporal_frame(frequency='5s')

| |time|uid|status|active|directed|v|w|
|---|----|---|------|------|--------|---|---|
|0|1970-01-01 00:00:00|a-c|start|True|True|a|c|
|1|1970-01-01 00:00:05|a-c|active|True|True|a|c|
|2|1970-01-01 00:00:10|a-c|end|False|True|a|c|
|3|1970-01-01 00:00:05|c-d|start|True|True|c|d|
|4|1970-01-01 00:00:10|c-d|active|True|True|c|d|
|5|1970-01-01 00:00:15|c-d|end|False|True|c|d|


## Spatial-Temporal Networks

Many real-world processes happen not only over time but also in space. `pathpy` supports spatial analysis similar to the temporal analysis shown before, i.e. it identifies the spatial attributes of your objects. The supported spatial attributes are:

|key|description|format|
|:-:|:----------|:----:|
|euclidean|Coordinate in a Euclidean space|`tuple`|
|x|x-coordinate in a Euclidean space|`float` |
|y|y-coordinate in a Euclidean space|`float` |
|coordinate|GPS coordinate|`tuple`|
|lat|latitude|`float`|
|lon|longitude|`float`|


<div class="alert alert-warning">
While pathpy v3.0.0a1 supports all coordinates, the visualization is currently limited to the Euclidean space.
</div>

<span style="color:red">**TODO:** Add more nodes and edges to the network. Assign coordinates and movement. Finally, plot the spatial-temporal network.</span>


In [20]:
# Load numpy for the trigonometric functions
import numpy as np

# Add additional edges
tmp.add_edges_from(['b-c','c-e'])

# Assign coordinates to all nodes
coord = {'a':(-1,-1),'b':(1,1),'c':(0,0),'d':(1,-1),'e':(-1,1)}
for n in tmp.nodes.values():
    n.update(euclidean=coord[n.uid])

# Let node c move on a circle with radius r and angular velocity w
r = 0.7
w = 1
tmp.nodes['c'].update(start=0,end=15)
for t in range(1,14):
    x = r * np.cos(w*t)
    y = r * np.sin(w*t)
    tmp.nodes['c'].update(time=t,euclidean=(x,y))

# Plot the network
tmp.plot()

<pathpy.visualizations.plot.Plot at 0x7faea2ab3280>

<div class="alert alert-info">
Spatial-temporal networks are currently under development in pathpy. More cool stuff will be released soon!
</div>

# 3 Higher-order Models of Paths <a class="anchor" id="part-3"></a>

So far, we have focused on network models, but the real **purpose of `pathpy` is to fit and analyse higher-order models for paths in complex networks**. For this, we can use the class `HigherOrderNetwork`.

From a higher-order network analytic point of view, standard graphs or networks are **first-order probabilistic generative models** for paths in complex networks. As we have seen in 1.2, they can be viewed as **maximum entropy models that consider first-order dyad statistics** (i.e. edge frequencies), while ignoring higher-order dependencies in the real-world path, sequence, or time-series data.

In the following works, we have studied measures for higher-order correlations in such data and we generalised network models to higher-order models with arbitrary order:

- R Pfitzner, I Scholtes, A Garas, CJ Tessone, F Schweitzer: **Betweenness Preference: Quantifying Correlations in the Topological Dynamics of Temporal Networks**, In Physical Review Letters, May 2013, [arXiv 1208.0588](http://arxiv.org/abs/1208.0588)

- I Scholtes, N Wider, R Pfitzner, A Garas, CJ Tessone, F Schweitzer: **Causality-driven slow-down and speed-up of diffusion in non-Markovian temporal networks**, In Nature Communications, September 2014, [arXiv 1307.4030](http://arxiv.org/abs/1307.4030)

- I Scholtes, N Wider, A Garas: **Higher-Order Aggregate Networks in the Analysis of Temporal Networks: Path structures and centralities**, In The European Physical Journal B, March 2016, [arXiv 1508.06467](http://arxiv.org/abs/1508.06467) 

- I Scholtes: **When is a Network a Network? Multi-Order Graphical Model Selection in Pathways and Temporal Networks**, In KDD'17, February 2017, [arXiv 1702.05499](https://arxiv.org/abs/1702.05499)

A broader overview of the research on higher-order models for complex systems is available in the following article:

- R Lambiotte, M Rosvall, I Scholtes: **From Networks to Optimal Higher-Order Models of Complex Systems**, Nature Physics, March 2019,  [arXiv 1806.05977](https://arxiv.org/abs/1806.05977)

The data analysis and modelling framework outlined in these works builds on a generalisation of standard, first-order networks to $k$-dimensional De Bruijn graph models for paths in complex networks.

The class `HigherOrderNetwork` allows us to generate such higher-order network models of paths. In the documentation, we find that the constructor takes a parameter `network`, i.e. the network with the observed paths we want to model. With the setting `order` we specify the order of the higher-order model that we want to fit. To understand this better, let us do this for our toy example.

<span style="color:red">**TODO:** Recreate the toy example from unit 1 and generate a **first-order** model instance `hon_1` and print a summary of the resulting instance.</span>

In [21]:
net = pp.Network()
net.add_paths_from(['a-c-d','b-c-e'],frequency=10)
net.summary()

[01-16 09:11:54: INFO ] Name:
[01-16 09:11:54: INFO ] Type:			DirectedNetwork
[01-16 09:11:54: INFO ] Directed:		True
[01-16 09:11:54: INFO ] Number of unique nodes:	5
[01-16 09:11:54: INFO ] Number of unique edges:	4
[01-16 09:11:54: INFO ] Number of unique paths:	2
[01-16 09:11:54: INFO ] Number of total paths:	20


In [22]:
hon_1 = pp.HigherOrderNetwork(net,order=1)
hon_1.summary()

[01-16 09:11:54: INFO ] Name:
[01-16 09:11:54: INFO ] Type:			HigherOrderNetwork
[01-16 09:11:54: INFO ] Directed:		True
[01-16 09:11:54: INFO ] Number of unique nodes:	5
[01-16 09:11:54: INFO ] Number of unique edges:	4
[01-16 09:11:54: INFO ] Number of unique paths:	2
[01-16 09:11:54: INFO ] Number of total paths:	20


This generates a first-order model of our paths, with five nodes `a`, `b`, `c`, `d` and `e`, and four links `a-c`, `b-c`, `c-d` and `c-e`. It is identical to the `Network` instance that we have previously created. Indeed, each `HigherOrderNetwork` instance is derived from the class `Network`, which means we can store edge and node attributes and visualise it by precisely the same methods.

<span style="color:red">**TODO:** Plot the `HigherOrderNetwork` instance `hon_1` and print the counter of observed edges/paths (weights).</span>

In [23]:
hon_1.plot()

<pathpy.visualizations.plot.Plot at 0x7faea3385250>

In [24]:
hon_1.edges.counter()

Counter({'a=c': 10, 'c=d': 10, 'b=c': 10, 'c=e': 10})

This output confirms that a `HigherOrderNetwork` with `order=1` is identical to our `Network` model. We can see this network as a **first-order model** for paths where **edges are paths of length one**. That is, in a model with `order=1`, edge weights capture the statistics of paths of `length=1`.

<div class="alert alert-info">
To distinguish between "normal" edges and edges in a higher-order network, pathpy uses '=' to indicate higher-order edges. This can be changed in the config file.
</div>

We can generalise this idea to **k-th-order models** for paths, where **nodes are paths of length $k-1$** while **edge weights capture the statistics of paths of length $k$**. We can generate such a $k$-th order model by performing a line graph transformation on a model with order $k-1$. That is, edges in the model of order $k-1$ become nodes in the model with order $k$. We then draw edges between higher-order nodes whenever there is a possible path of length $k$ in the underlying network. The result is a $k$-dimensional De Bruijn graph model for paths. Let us try this in our example.
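The line graph transformation for $k = 2$ can be sketched in plain Python. This is only an illustration of the construction, not pathpy's implementation; the function `line_graph` is a hypothetical helper, and the uid format with `-` and `=` follows the defaults mentioned in this tutorial. Note that the sketch enumerates all *possible* paths of length two; which of them are actually observed is a separate question.

```python
# First-order edges of the toy example as (source, target) pairs
first_order_edges = [("a", "c"), ("c", "d"), ("b", "c"), ("c", "e")]


def line_graph(edges):
    """Turn first-order edges into second-order nodes and possible second-order edges."""
    # every first-order edge becomes a second-order node
    nodes = [f"{v}-{w}" for v, w in edges]
    # two second-order nodes are linked if the first edge ends where the second starts
    links = [
        f"{v}-{w}={x}-{y}"
        for v, w in edges
        for x, y in edges
        if w == x
    ]
    return nodes, links


nodes_2, edges_2 = line_graph(first_order_edges)
print(nodes_2)  # ['a-c', 'c-d', 'b-c', 'c-e']
print(edges_2)  # ['a-c=c-d', 'a-c=c-e', 'b-c=c-d', 'b-c=c-e']
```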

<span style="color:red">**TODO:** Create a second-order model `hon_2` for `net`. Visualise the model and print the weights of all edges.</span>

In [25]:
hon_2 = pp.HigherOrderNetwork(net,order=2)
hon_2.summary()

[01-16 09:11:54: INFO ] Name:
[01-16 09:11:54: INFO ] Type:			HigherOrderNetwork
[01-16 09:11:54: INFO ] Directed:		True
[01-16 09:11:54: INFO ] Number of unique nodes:	4
[01-16 09:11:54: INFO ] Number of unique edges:	2
[01-16 09:11:54: INFO ] Number of unique paths:	2
[01-16 09:11:54: INFO ] Number of total paths:	20


In [26]:
hon_2.plot()

<pathpy.visualizations.plot.Plot at 0x7faea33feaf0>

In [27]:
hon_2.edges.counter()

Counter({'a-c=c-d': 10, 'b-c=c-e': 10})

Each of the four **edges** in the first-order model is now represented by a **node** in the second-order model. We further have two directed edges `a-c=c-d` and `b-c=c-e` that represent the two paths of length two that occur in our data.

This is important because it captures to what extent the paths that we observe in our data deviate from what we would expect based on the (first-order) network topology of the system. Considering such a first-order model, all four paths `a-c-d`, `a-c-e`,`b-c-d`, and `b-c-e` of length two are possible. If edges were statistically independent, we would expect those four paths to occur with the same frequency.

Another way to express this independence assumption is to consider Markov chain models for the sequences of nodes traversed by a path. In this view, independently occurring edges translate to a memoryless first-order Markov process for the node sequence. In our example, we expect paths `a-c-d` and `a-c-e` to occur with the same probability, i.e. the next nodes `d` or `e` on a path through `c` are independent of the previous node `a`, their probabilities only depending on the relative frequency of edges `c-d` vs `c-e`. In our toy example, we have a total of 20 observed paths of length two, so we expect each of the four paths to occur five times on average.
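The arithmetic behind this expectation can be written out in plain Python. This is a sketch of the first-order Markov reasoning in the paragraph above, not necessarily the exact normalisation pathpy uses; the function `expected_two_paths` is a hypothetical helper. The expected count of a path `u-v-w` is the count of edge `u-v` multiplied by the relative frequency of edge `v-w` among all edges leaving `v`.

```python
from collections import Counter

# Edge frequencies of the toy example (each edge is traversed ten times)
edge_counts = Counter({("a", "c"): 10, ("c", "d"): 10, ("b", "c"): 10, ("c", "e"): 10})


def expected_two_paths(edge_counts):
    """Expected frequencies of paths of length two under a memoryless model."""
    # total number of traversals leaving each node
    out_total = Counter()
    for (v, _), n in edge_counts.items():
        out_total[v] += n
    expected = {}
    for (u, v), n_in in edge_counts.items():
        for (x, w), n_out in edge_counts.items():
            if x == v:
                # count(u-v) * P(next node is w | current node is v)
                expected[f"{u}-{v}-{w}"] = n_in * n_out / out_total[v]
    return expected


print(expected_two_paths(edge_counts))
# {'a-c-d': 5.0, 'a-c-e': 5.0, 'b-c-d': 5.0, 'b-c-e': 5.0}
```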

`pathpy` can actually generate this **null-model** for paths within the space of possible second-order models. This allows us to compare how the observed path statistics deviate from a (Markovian) expectation.

<span style="color:red">**TODO:** Use the `NullModel` class to generate a second-order null model `hon_2_null` for `net`. Visualise the model and output all edge weights.</span>

In [28]:
hon_2_null = pp.NullModel(net).generate(order=2)

In [29]:
hon_2_null.edges.counter()

Counter({'a-c=c-d': 5.0, 'a-c=c-e': 5.0, 'b-c=c-d': 5.0, 'b-c=c-e': 5.0})

We can easily find out which of the paths of length two occur more or less often than expected under the null model. We can just subtract the adjacency matrices of the two instances `hon_2` and `hon_2_null`.

<span style="color:red">**TODO:** For all edges in `hon_2_null`, calculate how much the observed frequency in `hon_2` deviates from the random expectation.</span>

<span style="color:green">**Hint:** Use the index function to map node names to matrix indices.</span>

In [30]:
A_2 = hon_2.adjacency_matrix()
n_2 = list(hon_2.nodes)

A_2_null = hon_2_null.adjacency_matrix()
n_2_null = list(hon_2_null.nodes)

for e in hon_2_null.edges.values():
    v1 = n_2.index(e.v.uid)
    w1 = n_2.index(e.w.uid)
    v2 = n_2_null.index(e.v.uid)
    w2 = n_2_null.index(e.w.uid)
    print(e.uid,A_2[v1,w1]-A_2_null[v2,w2])
    

a-c=c-d 5.0
a-c=c-e -5.0
b-c=c-d -5.0
b-c=c-e 5.0


This analysis confirms that the paths `b-c-e` and `a-c-d` each occur five more times than we would expect at random, while the other two paths occur five fewer times than expected (i.e. not at all). This deviation from our expectation changes the causal topology of the system, i.e. who can influence whom. In a network model we implicitly assume that paths are transitive: since `a` is connected to `c` and `c` is connected to `e`, we assume that there is a path by which `a` can influence `e` via node `c`. The second-order model of our toy example reveals that this transitivity assumption is misleading. It highlights higher-order dependencies in our data, as a result of which `a` cannot influence `e` and `b` cannot influence `d`.
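The difference between the two notions of "who can influence whom" can be made concrete with a small plain-Python sketch. It does not use pathpy; the helpers `first_order_reach` and `path_reach` are hypothetical names introduced for this illustration, and `path_reach` assumes that no node occurs twice on a path.

```python
edges = [("a", "c"), ("c", "d"), ("b", "c"), ("c", "e")]
paths = ["a-c-d", "b-c-e"]


def first_order_reach(edges, source):
    """Nodes reachable from `source` if every sequence of edges is a valid path."""
    reached, frontier = set(), [source]
    while frontier:
        v = frontier.pop()
        for x, w in edges:
            if x == v and w not in reached:
                reached.add(w)
                frontier.append(w)
    return reached


def path_reach(paths, source):
    """Nodes that follow `source` on at least one observed path."""
    reached = set()
    for p in paths:
        nodes = p.split("-")
        if source in nodes:
            reached.update(nodes[nodes.index(source) + 1:])
    return reached


print(sorted(first_order_reach(edges, "a")))  # ['c', 'd', 'e']
print(sorted(path_reach(paths, "a")))         # ['c', 'd']
print(sorted(path_reach(paths, "b")))         # ['c', 'e']
```

Under the transitivity assumption `a` appears to reach `e`, while the observed paths never connect the two.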

# 4 Multi-order Model Selection <a class="anchor" id="part-4"></a>

So far, we studied higher-order network models for path data with a fixed, given order $k$. We have seen that such higher-order models can yield better predictions compared to standard network models. However, a critical question arises: how can we decide **at which order we should model a given system**? This points to a more general problem as we can also imagine systems for which path statistics do not deviate significantly from the transitive, Markovian assumption made by a first-order model. So we **need methods to decide when higher-order models are actually required**.

Moreover, a higher-order model with order $k$ can only capture higher-order dependencies at a single fixed correlation length $k$. But we may encounter data that exhibit multiple correlation lengths at once. How can we **combine models with multiple higher orders into a multi-order model**?

In this unit, we take a statistical inference and machine learning perspective to answer these questions. To show how the method works, we again start with a maximally simple toy example:

<span style="color:red">**TODO:** Create a new instance `observations` of class `Network` and add two paths `"a-c-d"` and `"b-c-e"`, each occurring twice. Print the subpath statistic.</span>

In [31]:
observations = pp.Network()
observations.add_paths_from(['a-c-d','b-c-e'],frequency=2)
observations.subpaths.summary()

[01-16 09:11:54: INFO ] Sub path statistics
[01-16 09:11:54: INFO ] - General --------------------------------------------
[01-16 09:11:54: INFO ] Number of unique nodes:            5.000
[01-16 09:11:54: INFO ] Number of unique edges:            4.000
[01-16 09:11:54: INFO ] Number of unique paths:            2.000
[01-16 09:11:54: INFO ] - Path statistics ------------------------------------
[01-16 09:11:54: INFO ] Mean path length:                  2.000
[01-16 09:11:54: INFO ] Standard derivation:               0.000
[01-16 09:11:54: INFO ] Min. path length:                  2.000
[01-16 09:11:54: INFO ] 25% quantile:                      2.000
[01-16 09:11:54: INFO ] 50% quantile:                      2.000
[01-16 09:11:54: INFO ] 75% quantile:                      2.000
[01-16 09:11:54: INFO ] Max. path length:                  2.000
[01-16 09:11:54: INFO ] - Sub path statistics --------------------------------
[01-16 09:11:54: INFO ]  path  |          frequencies          |    u



As mentioned before, in this example, we only observe two of the four paths of length two that would be possible in the null model. Hence, this is an example of path statistics that exhibit correlations that warrant a second-order model. 

But how can we decide this in a statistically sound way? We can take a statistical inference perspective on the problem. More specifically, we will consider our higher-order networks as probabilistic generative models for paths in a given network topology. For this, let us use the weighted first-order network model to construct a transition matrix of a Markov chain model for paths in a network. We simply use the relative frequencies of edges to proportionally scale the probabilities of edge transitions in the model.
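The scaling step described above can be sketched in plain Python, independent of `pathpy`. This is only a minimal illustration: the edge weights are those of the toy example (each path observed twice), and the helper name `transition_probabilities` is hypothetical.

```python
from collections import defaultdict

# Weighted first-order edges of the toy example: each of the two paths
# 'a-c-d' and 'b-c-e' is observed twice, so every edge is traversed twice.
edge_weights = {('a', 'c'): 2, ('b', 'c'): 2, ('c', 'd'): 2, ('c', 'e'): 2}

def transition_probabilities(weights):
    """Scale edge weights so that the outgoing probabilities of each node sum to one."""
    out_weight = defaultdict(float)
    for (v, _), weight in weights.items():
        out_weight[v] += weight
    return {(v, w): weight / out_weight[v] for (v, w), weight in weights.items()}

T = transition_probabilities(edge_weights)
for edge, prob in sorted(T.items()):
    print(edge, prob)
```

The edges leaving `a` and `b` receive probability 1.0, while the two edges leaving `c` share the probability mass equally.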

<span style="color:red">**TODO:** Generate a first-order model of `observations`. Plot the model and print the transition matrix generated by the method `HigherOrderNetwork.transition_matrix`.</span>

In [32]:
hon_1 = pp.HigherOrderNetwork(observations,order=1)
print(hon_1.transition_matrix())

  (0, 1)	1.0
  (1, 4)	0.5
  (1, 2)	0.5
  (3, 1)	1.0


In [33]:
hon_1.plot()

<pathpy.visualizations.plot.Plot at 0x7faea2ab41f0>

This transition matrix can be viewed as a first-order Markov chain model for paths in the underlying network topology. This probabilistic view allows us to calculate the likelihood of the first-order model, given the paths that we have observed. With `pathpy`, we can directly calculate the likelihood of a higher-order model.

<span style="color:red">**TODO:** Use the `HigherOrderNetwork.likelihood` method to calculate the likelihood of the first-order model given the `observations`. Set the parameter `log` to False.</span>

In [34]:
hon_1.likelihood(observations,log=False)

0.0625

This result is particularly easy to understand for our toy example. Each path of length two corresponds to two transitions in the transition matrix of our Markov chain model. For each of the four paths of length two in `observations`, the first transition is deterministic because nodes $a$ and $b$ only point to node $c$. However, based on the network topology, for the second step, we have a choice between nodes $d$ and $e$. Considering that we see as many transitions through edge $(c,d)$ as we see through edge $(c,e)$, in a first-order model we have no reason to prefer one over the other, so each is assigned probability $0.5$.

Hence, for each of the four observed paths, we obtain a likelihood of $1 \cdot 0.5 = 0.5$, which yields a total likelihood for four (independent) observations of $0.5^{4} = 0.0625$.
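The same arithmetic can be written out as a short plain-Python sketch. It does not use `pathpy`; the transition probabilities are copied from the matrix printed above, and the function name `first_order_likelihood` is hypothetical.

```python
# Transition probabilities of the first-order model (see the matrix printed above).
T1 = {('a', 'c'): 1.0, ('b', 'c'): 1.0, ('c', 'd'): 0.5, ('c', 'e'): 0.5}

# Observed paths with their frequencies.
observed = {('a', 'c', 'd'): 2, ('b', 'c', 'e'): 2}

def first_order_likelihood(paths, T):
    """Multiply the transition probabilities of all edge traversals on all observed paths."""
    likelihood = 1.0
    for nodes, frequency in paths.items():
        path_likelihood = 1.0
        for v, w in zip(nodes, nodes[1:]):
            path_likelihood *= T[(v, w)]
        likelihood *= path_likelihood ** frequency
    return likelihood

print(first_order_likelihood(observed, T1))  # 0.5 ** 4 = 0.0625
```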

Let us compare this to the likelihood of a second-order model for our observations.

<span style="color:red">**TODO:** Generate a second-order model for `observations` and print the transition matrix. Use the `HigherOrderNetwork.likelihood` method to calculate the likelihood of a second-order model, given the `observations`.</span>

In [35]:
hon_2 = pp.HigherOrderNetwork(observations, order=2)
print(hon_2.transition_matrix())
hon_2.likelihood(observations, log=False)

  (0, 1)	1.0
  (2, 3)	1.0


1.0

Here, the likelihood assumes its maximal value of $1.0$, simply because all transitions in the second-order model are deterministic, i.e. we simply multiply $1 \cdot 1$ four times. 
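To see where the deterministic transitions come from, the following sketch estimates second-order transition probabilities directly from the observed paths. It is a simplified stand-in for what a second-order model does, not `pathpy`'s implementation; all variable names are illustrative.

```python
from collections import Counter

observed = {('a', 'c', 'd'): 2, ('b', 'c', 'e'): 2}

# Count transitions between second-order nodes, i.e. from edge (u, v) to edge (v, w).
counts = Counter()
for nodes, frequency in observed.items():
    for u, v, w in zip(nodes, nodes[1:], nodes[2:]):
        counts[((u, v), (v, w))] += frequency

# Normalise per source edge to obtain second-order transition probabilities.
out_total = Counter()
for (source, _), c in counts.items():
    out_total[source] += c
T2 = {key: c / out_total[key[0]] for key, c in counts.items()}

# Likelihood: multiply the second-order transition probabilities of all observations.
likelihood = 1.0
for nodes, frequency in observed.items():
    for u, v, w in zip(nodes, nodes[1:], nodes[2:]):
        likelihood *= T2[((u, v), (v, w))] ** frequency

print(T2)
print(likelihood)
```

Because edge `(a, c)` is only ever followed by `(c, d)`, and `(b, c)` only by `(c, e)`, both transitions have probability 1.0 and so does the likelihood.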

Let us now have a look at the *second-order null model*, which is actually a first-order model represented in the second-order space. So we should expect the same likelihood as the first-order model.

<span style="color:red">**TODO:** Generate a second-order null model for our `observations` and print the transition matrix. Use the `HigherOrderNetwork.likelihood` method to calculate the likelihood of this model, given the `observations`.</span>

In [36]:
null = pp.NullModel(observations)
hon_2_null = null.generate(order=2)
print(hon_2_null.transition_matrix())
hon_2_null.likelihood(observations, log=False)

  (0, 2)	0.5
  (0, 1)	0.5
  (3, 2)	0.5
  (3, 1)	0.5


0.0625

## Model selection for higher-order network models

Clearly, the second-order null model should have the same likelihood as the first-order model. This also shows a way to test hypotheses about the presence of higher-order correlations in paths. We can use a likelihood ratio test to compare the likelihood of the null hypothesis (i.e. a second-order representation of the first-order model) with the likelihood of an alternative hypothesis (the *fitted* second-order model).

But what do we learn from the fact that the likelihood of a model increases as we increase the order of the model? By itself, not much. Higher-order models are more complex than first-order models, i.e. while fitting their transition matrix, we actually fit more parameters to the data. We can thus expect that such a more complex model better explains our (path) data. 

We should remind ourselves about Occam's razor, which states that we should favour models that make fewer assumptions. That is, in the comparison of the model likelihoods, we should account for the additional complexity (or degrees of freedom) of a higher-order model over the null hypothesis.

A nice feature of our framework is that the null model and the alternative model are actually **nested**, i.e. the null model is one particular point in the parameter space of the (more general) higher-order model. Thanks to this property, we can apply [Wilks' theorem](https://en.wikipedia.org/wiki/Likelihood-ratio_test#Distribution:_Wilks’_theorem) to derive an analytical expression for the $p$-value of the null hypothesis that second-order correlations are absent (i.e. that a first-order model is sufficient to explain the observed paths), compared to the alternative hypothesis that a second-order model is needed. You can find the full mathematical details of this hypothesis testing approach in the following KDD'17 paper:

I Scholtes: [When is a Network a Network? Multi-Order Graphical Model Selection in Pathways and Temporal Networks](http://dl.acm.org/citation.cfm?id=3098145), In KDD'17 - Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Halifax, Nova Scotia, Canada, August 13-17, 2017

Let us apply this to test the hypothesis that there are **significant** second-order dependencies in our toy example. This test consists of three basic steps: 

1. We calculate the difference $d$ between the number of parameters (or degrees of freedom) of a second-order and a first-order model.
2. We calculate the test statistic $x = -2 \cdot (\log L_1 - \log L_2)$ for the likelihood ratio test, where $L_1$ and $L_2$ are the likelihoods returned by `hon_1.likelihood` and `hon_2.likelihood`.
3. We calculate the p-value as $1 - \mathrm{cdf}(x, d)$, where $\mathrm{cdf}$ is the cumulative distribution function of a chi-square distribution with $d$ degrees of freedom.
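The three steps can be sketched with the standard library alone, using the two likelihoods computed above (0.0625 and 1.0). Note the assumptions: we take $d = 1$ for the toy example, and we use the fact that for one degree of freedom the chi-square survival function has the closed form $\mathrm{erfc}(\sqrt{x/2})$. For a general $d$, `scipy.stats.chi2` is the appropriate tool.

```python
from math import erfc, log, sqrt

likelihood_null = 0.0625  # first-order model (null hypothesis)
likelihood_alt = 1.0      # fitted second-order model (alternative hypothesis)

# Step 1: difference in degrees of freedom (assumed to be 1 for the toy example).
d = 1

# Step 2: test statistic of the likelihood ratio test.
x = -2 * (log(likelihood_null) - log(likelihood_alt))

# Step 3: p-value 1 - cdf(x, d). For d = 1 the chi-square survival function
# equals erfc(sqrt(x / 2)), so no external library is needed here.
p = erfc(sqrt(x / 2))

print(x, p)
```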

<span style="color:red">**TODO:** Perform the likelihood ratio test for the null hypothesis that the observed paths can be explained by a first-order model. Use the function `NullModel.degrees_of_freedom` to calculate the degrees of freedom of a k-th order model. Use `chi2.cdf` from `scipy.stats` to calculate the p-value.</span>

In [37]:
from scipy.stats import chi2
hon_1_null = null.generate(order=1)
d = hon_2_null.degrees_of_freedom() - hon_1_null.degrees_of_freedom()
x = - 2 * (hon_1.likelihood(observations, log=True) - hon_2.likelihood(observations, log=True))
p = 1 - chi2.cdf(x, d)

print('The p-value of the null hypothesis (first-order model) is {0}'.format(p))

The p-value of the null hypothesis (first-order model) is 0.018531677751199016


The $p$-value of the null hypothesis that we can explain the four observed paths based on the weighted network topology alone is approximately 0.019, which is only borderline significant. This is intuitive, as we have only observed four paths, which is hardly enough to robustly reject a first-order network model. Let us see what happens if we observe those same paths more often.

<span style="color:red">**TODO:** Add the observed paths another 18 times (i.e. so that we have 20 $\times$ 2 observations in total). Repeat the likelihood ratio test and output the p-value.</span>

In [38]:
observations.add_paths_from(['a-c-d','b-c-e'],frequency=18)

In [39]:
x = - 2 * (hon_1.likelihood(observations, log=True) - hon_2.likelihood(observations, log=True))
p = 1 - chi2.cdf(x, d)

print('The p-value of the null hypothesis (first-order model) is {0}'.format(p))

The p-value of the null hypothesis (first-order model) is 1.6123768986631148e-12


So, once we observe each of the two paths many more times, we reject the null hypothesis: it is very unlikely that two of the four possible paths never occur in that many observations. If we were to further increase the number of observations of the two paths, the p-value would decrease even more.
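How quickly the p-value shrinks can be illustrated with a small sketch. It assumes, as in the toy example, that every observed path has likelihood 0.5 under the first-order model and 1.0 under the second-order model, and that the difference in degrees of freedom is one. The function name `p_value` is hypothetical.

```python
from math import erfc, log, sqrt

def p_value(n_paths):
    """p-value of the likelihood ratio test for n_paths observed paths (d = 1 assumed)."""
    # Log-likelihoods: n_paths * log(0.5) for the null, n_paths * log(1.0) = 0 for the alternative.
    x = -2 * (n_paths * log(0.5) - 0.0)
    # Chi-square survival function for one degree of freedom.
    return erfc(sqrt(x / 2))

for n in (4, 8, 16, 32):
    print(n, p_value(n))
```

Since the test statistic grows linearly with the number of observations, the p-value falls off roughly exponentially.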

Unfortunately, the toy example above is too simple in multiple ways: First, it only contains paths of exactly length two, thus justifying a second-order model. But real data are more complex, as we have observations of paths of multiple lengths simultaneously. Such data are likely to exhibit multiple correlation lengths at the same time.

Even more importantly, in real data, the model selection will, unfortunately, not work as described above. In fact, I have cheated because we cannot, in general, directly compare the likelihoods of models with different orders. The following example highlights this problem:

<span style="color:red">**TODO:** Create an empty `Network` instance and add the following path:</span>

`('a','b','c','d','e','c','b','a','c','d','e','c','e','d','c','a')`

<span style="color:red">Generate a first-order model, as well as a second- and fifth-order **null** model for the data. Compare the likelihoods between the three models.</span>

In [40]:
path = pp.Network()
path.add_path('a','b','c','d','e','c','b','a','c','d','e','c','e','d','c','a')

hon_1 = pp.HigherOrderNetwork(path,order=1)
null = pp.NullModel(path)
hon_2_null = null.generate(order=2)
hon_5_null = null.generate(order=5)

print(hon_1.likelihood(path, log=False))
print(hon_2_null.likelihood(path, log=False))
print(hon_5_null.likelihood(path, log=False))

1.755829903978052e-06


3.511659807956104e-06


2.633744855967078e-05


This is strange! Shouldn't the likelihoods of these three models be identical? They are not, and this is a major issue when we have data that consist of large numbers of short paths: in terms of the number of transitions that enter the likelihood calculation, a model of order $k$ discards the first $k$ nodes on each path. For instance, a second-order model can account for all edge traversals on a path except the first one. This means that, in the general case, we actually compare likelihoods computed for different sample spaces, which is not valid.
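The effect can be reproduced by hand. The sketch below estimates first-order transition probabilities from the path and then evaluates the likelihood while skipping the first $k-1$ edge traversals, which is our reading of what a null model of order $k$ does. It is a plain-Python illustration and not `pathpy` code; the function name `null_likelihood` is hypothetical.

```python
from collections import Counter

path = ('a', 'b', 'c', 'd', 'e', 'c', 'b', 'a', 'c', 'd', 'e', 'c', 'e', 'd', 'c', 'a')
transitions = list(zip(path, path[1:]))

# First-order transition probabilities estimated from the path itself.
edge_counts = Counter(transitions)
out_counts = Counter(v for v, _ in transitions)
T = {edge: c / out_counts[edge[0]] for edge, c in edge_counts.items()}

def null_likelihood(order):
    """Likelihood under first-order probabilities, using only the transitions
    that a model of the given order evaluates (the first order - 1 are skipped)."""
    likelihood = 1.0
    for edge in transitions[order - 1:]:
        likelihood *= T[edge]
    return likelihood

for k in (1, 2, 5):
    # order, number of transitions entering the likelihood, likelihood
    print(k, len(transitions) - (k - 1), null_likelihood(k))
```

Up to floating-point rounding, this yields the three values printed above: the models use identical transition probabilities, but evaluate 15, 14 and 11 transitions respectively.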

## Multi-order representation learning

To fix the issues above, we need a probabilistic generative model that can deal with extensive collections of (short) paths in a network. The key idea is to combine multiple higher-order network models into a single multi-layered, multi-order model. To calculate the likelihood of such a model, we can use all layers, thus avoiding the problem that we discard prefixes of paths. For each path, we start the calculation at a layer of order zero, which considers the relative probabilities of nodes. We then use this model layer to calculate the probability to observe the first node on a path. For the next transition to step two, we then use a first-order model. The next transition is calculated in the second-order model and so on until we have reached the maximum order of our multi-order model. At this point, we can transitively calculate the likelihood based on the remaining transitions of the path.
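The layered calculation can be sketched as follows for the toy example. This is a simplified illustration under stated assumptions, not `pathpy`'s implementation: we assume that the zeroth-order layer uses the relative frequency with which nodes are visited, and the helper names `fit_layers` and `multi_order_log_likelihood` are hypothetical.

```python
from collections import Counter
from math import exp, log

observed = {('a', 'c', 'd'): 2, ('b', 'c', 'e'): 2}

def fit_layers(paths, max_order):
    """Return one dict per order k, mapping (history of k nodes, next node) to a probability."""
    layers = []
    for k in range(max_order + 1):
        counts = Counter()
        for nodes, frequency in paths.items():
            for i in range(k, len(nodes)):
                counts[(nodes[i - k:i], nodes[i])] += frequency
        totals = Counter()
        for (history, _), c in counts.items():
            totals[history] += c
        layers.append({key: c / totals[key[0]] for key, c in counts.items()})
    return layers

def multi_order_log_likelihood(paths, layers):
    """Use layer 0 for the first node, layer 1 for the second, and so on,
    staying at the highest available layer for all remaining transitions."""
    max_order = len(layers) - 1
    total = 0.0
    for nodes, frequency in paths.items():
        for i, node in enumerate(nodes):
            k = min(i, max_order)
            total += frequency * log(layers[k][(nodes[i - k:i], node)])
    return total

layers = fit_layers(observed, max_order=2)
ll_1 = multi_order_log_likelihood(observed, layers[:2])  # layers 0 and 1 only
ll_2 = multi_order_log_likelihood(observed, layers)      # layers 0, 1 and 2
print(exp(ll_1), exp(ll_2))
```

Both likelihoods now account for every node on every path, so they are computed on the same sample space and can be compared.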

The method is described in full detail in the following paper:

I Scholtes: [When is a Network a Network? Multi-Order Graphical Model Selection in Pathways and Temporal Networks](http://dl.acm.org/citation.cfm?id=3098145), In KDD'17 - Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Halifax, Nova Scotia, Canada, August 13-17, 2017

But let us go to practice. `pathpy` can directly generate and analyse multi-order network models. Let us try this in our example.

<span style="color:red">**TODO:** Create an instance of class `MultiOrderModel` and fit it to the `observations` from above. Finally, print the summary.</span>

In [41]:
mom = pp.MultiOrderModel(observations)
mom.summary()

[01-16 09:11:55: INFO ] Multi-order model
[01-16 09:11:55: INFO ] - General --------------------------------------------
[01-16 09:11:55: INFO ] layer  |            network            |      DoF
[01-16 09:11:55: INFO ] order  |   nodes     edges     paths   | paths  ngrams


<div class="alert alert-info">
Since the computation time of multi-order models can be quite high, in pathpy the generation of the layers has to be done separately.
</div>

In [42]:
mom.generate(max_order=2)
mom.summary()

[01-16 09:11:55: INFO ] Multi-order model
[01-16 09:11:55: INFO ] - General --------------------------------------------
[01-16 09:11:55: INFO ] layer  |            network            |      DoF
[01-16 09:11:55: INFO ] order  |   nodes     edges     paths   | paths  ngrams
[01-16 09:11:55: INFO ]      0 |         6         5         5 |      4      4
[01-16 09:11:55: INFO ]      1 |         5         4         2 |      1     20
[01-16 09:11:55: INFO ]      2 |         4         2         2 |      0    100




We can now use the `likelihood` function of the class `MultiOrderModel` to repeat our likelihood ratio test. Rather than generating multiple `MultiOrderModel` instances for different hypotheses, we can directly calculate likelihoods based on different model layers within the same `MultiOrderModel` instance.

However, rather than performing the likelihood ratio test ourselves, we can simply call the method `MultiOrderModel.estimate`. It will return the maximum order among all of its layers for which the likelihood ratio test rejects the null hypothesis.

<span style="color:red">**TODO:** Use the `MultiOrderModel.estimate` method to learn the optimal order in the `MultiOrderModel` from above.</span>

In [43]:
mom.estimate()

2


We now test whether this approach to **learn the optimal representation of path data** actually works. For this, let us generate path statistics that are in line with what we expect based on a first-order network model, and check whether the order estimation gives the right result.

<span style="color:red">**TODO:** Create a `Network` instance `random` with path statistics that conform to a first-order model, fit a `MultiOrderModel` to it, and return the estimated optimal maximum order of the model.</span>

In [44]:
random = pp.Network()
random.add_paths_from(['a-c-d','a-c-e','b-c-e','b-c-d'],frequency=5)

mom_2 = pp.MultiOrderModel(random)
mom_2.generate(max_order=2)
mom_2.estimate()

1
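This outcome can be checked by hand. The sketch below compares, for the last transition of each path, the probability assigned by a first-order and by a second-order model fitted to the same data. It is a plain-Python illustration with hypothetical names, and it assumes a difference of one degree of freedom, although with a test statistic of zero the p-value is one for any positive number of degrees of freedom.

```python
from collections import Counter
from math import erfc, log, sqrt

random_paths = {('a', 'c', 'd'): 5, ('a', 'c', 'e'): 5,
                ('b', 'c', 'e'): 5, ('b', 'c', 'd'): 5}

first = Counter()   # counts of first-order transitions v -> w
second = Counter()  # counts of second-order transitions (u, v) -> w
for (u, v, w), frequency in random_paths.items():
    first[(u, v)] += frequency
    first[(v, w)] += frequency
    second[((u, v), w)] += frequency

def conditional(counts):
    """Normalise counts per source to obtain transition probabilities."""
    totals = Counter()
    for (source, _), c in counts.items():
        totals[source] += c
    return {key: c / totals[key[0]] for key, c in counts.items()}

P1, P2 = conditional(first), conditional(second)

# Log-likelihood of the last transition of every path under both models.
ll_1 = sum(f * log(P1[(v, w)]) for (u, v, w), f in random_paths.items())
ll_2 = sum(f * log(P2[((u, v), w)]) for (u, v, w), f in random_paths.items())

x = max(0.0, -2 * (ll_1 - ll_2))  # likelihood ratio test statistic
p = erfc(sqrt(x / 2))             # chi-square survival function for d = 1
print(x, p)
```

Knowing the previous edge does not change the probabilities of the next step here, so the second-order model gains nothing over the first-order model and the null hypothesis cannot be rejected.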
