# Random Walks in Static Networks

[Run notebook in Google Colab](https://colab.research.google.com/github/pathpy/pathpy/blob/master/doc/tutorial/random_walks.ipynb)  
[Download notebook](https://github.com/pathpy/pathpy/raw/master/doc/tutorial/random_walks.ipynb)

Using the general interface provided by the abstract class `pathpy.processes.BaseProcess`, it is easy to simulate random walks. In the following, we demonstrate this in both a toy example and an empirical network. Let us first import pathpy and create a simple toy network.

In [None]:
pip install git+https://github.com/pathpy/pathpy.git

In [None]:
import pathpy as pp
from pprint import pprint

Our toy example is a directed network with four nodes and five weighted edges:

In [None]:
n = pp.Network(directed=True)
n.add_edge('a', 'b', weight=1, uid='a-b')
n.add_edge('b', 'c', weight=1, uid='b-c')
n.add_edge('c', 'a', weight=2, uid='c-a')
n.add_edge('c', 'd', weight=1, uid='c-d')
n.add_edge('d', 'a', weight=1, uid='d-a')
n.plot()

To simulate a random walk in this network, we first create a `RandomWalk` instance. The constructor generates a (sparse) transition matrix and uses it to initialize the random walk sampling process for a specific network, which is why we need to pass the network instance as the first parameter. Specifying the optional `weight` argument leads to a biased random walk, where the transition probabilities are proportional to the respective numerical edge property:

In [None]:
rw = pp.processes.RandomWalk(n, weight='weight')

We can inspect the sparse transition matrix of the random walk process via `rw.transition_matrix`:

In [None]:
print(rw.transition_matrix)

The function `transition_matrix_pd` returns the matrix as a (nicely formatted and properly labelled) pandas DataFrame, which makes it easy to read:

In [None]:
print(rw.transition_matrix_pd())
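As a cross-check that does not depend on pathpy, the weighted transition probabilities of the toy network can be recomputed by hand. The following is a minimal sketch in plain Python, assuming that each transition probability is the edge weight divided by the total weight of all outgoing edges of the source node. The helper `transition_probabilities` is a hypothetical name introduced here for illustration and is not part of pathpy.

```python
from collections import defaultdict

# Weighted edges of the toy network: (source, target, weight)
edges = [
    ('a', 'b', 1),
    ('b', 'c', 1),
    ('c', 'a', 2),
    ('c', 'd', 1),
    ('d', 'a', 1),
]


def transition_probabilities(edges):
    """Row-normalise edge weights (hypothetical helper, not a pathpy function).

    Assumes P[v][w] = weight(v, w) / total weight of edges leaving v.
    """
    out_weight = defaultdict(float)
    for source, _, weight in edges:
        out_weight[source] += weight

    probs = defaultdict(dict)
    for source, target, weight in edges:
        # Accumulate so that parallel edges, if any, add up
        probs[source][target] = (
            probs[source].get(target, 0.0) + weight / out_weight[source]
        )
    return dict(probs)


P = transition_probabilities(edges)
for source in sorted(P):
    print(source, P[source])
```

Under this assumption, node `c` moves to `a` with probability 2/3 and to `d` with probability 1/3, while all other nodes have a single outgoing edge that is followed with probability 1.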

To compute the stationary visitation probabilities of the random walk, we can call:

In [None]:
rw.stationary_state()
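To make the notion of a stationary state concrete, the sketch below approximates it for the toy network by power iteration, i.e. by repeatedly propagating a probability vector through the transition probabilities until it stops changing. This is one standard technique and not necessarily what pathpy does internally; the function `stationary_distribution` and its parameters `tol` and `max_iter` are assumptions made for this illustration.

```python
# Transition probabilities of the weighted toy network (each row sums to 1)
P = {
    'a': {'b': 1.0},
    'b': {'c': 1.0},
    'c': {'a': 2 / 3, 'd': 1 / 3},
    'd': {'a': 1.0},
}


def stationary_distribution(P, tol=1e-12, max_iter=10_000):
    """Approximate the stationary distribution by power iteration.

    Hypothetical helper, not a pathpy function. Convergence requires an
    irreducible and aperiodic chain, which holds for the toy network
    because it is strongly connected and has cycles of length 3 and 4.
    """
    nodes = sorted(P)
    pi = {v: 1.0 / len(nodes) for v in nodes}  # start from uniform
    for _ in range(max_iter):
        new_pi = {v: 0.0 for v in nodes}
        for v, row in P.items():
            for w, p in row.items():
                new_pi[w] += pi[v] * p  # probability mass flowing v -> w
        if sum(abs(new_pi[v] - pi[v]) for v in nodes) < tol:
            return new_pi
        pi = new_pi
    return pi


pi = stationary_distribution(P)
for node in sorted(pi):
    print(node, round(pi[node], 4))
```

Solving the balance equations by hand gives probabilities 0.3 for `a`, `b` and `c`, and 0.1 for `d`, which the iteration should approach.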

Following the general design of the class `BaseProcess`, we can use the iterator function `simulation_run` to iterate through the steps of a random walk with a given length. If we specify `seed`, the walk will start from the given node uid.

In each step, the iterator yields the current time as well as a tuple containing those nodes whose state has changed. In the random walk process, these are the currently visited node (the first entry in the tuple) and the previous node, which is no longer visited (the second entry in the tuple).

In each step, we can use properties and methods to access the current state of the random walk, e.g. we can output the current node visitation frequencies as well as the total variation distance to the stationary visitation probabilities. Note that the first iteration yields the status after the first transition, i.e. in the following example, the random walk is initialized in node `a` at time 0.

In [None]:
for time, updated_nodes in rw.simulation_run(steps=10, seed='a'):
    print('time = {0}, current node = {1}'.format(time, updated_nodes[0]))
    pprint(rw.visitation_frequencies)
    print(rw.total_variation_distance)
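The quantities printed in this loop can be reproduced in plain Python. The sketch below simulates a walk on the toy network, tracks visitation frequencies, and computes the total variation distance as half the sum of absolute differences between two distributions. Two points are assumptions of this sketch rather than confirmed pathpy behaviour: that the start node counts as one visit at time 0, and that pathpy uses this same normalisation of the distance. The names `walk` and `total_variation_distance` are hypothetical helpers.

```python
import random
from collections import Counter

# Transition probabilities and stationary distribution of the toy network
P = {
    'a': {'b': 1.0},
    'b': {'c': 1.0},
    'c': {'a': 2 / 3, 'd': 1 / 3},
    'd': {'a': 1.0},
}
stationary = {'a': 0.3, 'b': 0.3, 'c': 0.3, 'd': 0.1}


def total_variation_distance(p, q):
    """Half the L1 distance between two distributions given as dicts."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def walk(P, start, steps, rng):
    """Yield (time, node) after each transition; the walker is at `start` at time 0."""
    current = start
    for time in range(1, steps + 1):
        targets = list(P[current])
        weights = [P[current][t] for t in targets]
        current = rng.choices(targets, weights=weights, k=1)[0]
        yield time, current


rng = random.Random(42)       # fixed seed for reproducibility
visits = Counter({'a': 1})    # assumption: the start node counts as visited
for time, node in walk(P, 'a', 10, rng):
    visits[node] += 1
    total = sum(visits.values())
    frequencies = {v: visits[v] / total for v in P}
    tvd = total_variation_distance(frequencies, stationary)
    print('time = {0}, current node = {1}, TVD = {2:.3f}'.format(time, node, tvd))
```

With longer walks the distance should shrink towards zero, since the visitation frequencies of an irreducible, aperiodic walk converge to the stationary probabilities.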

We can use the function `run_experiment` to generate data on multiple runs of a random walker starting in different nodes. This method returns a data frame that contains all node state changes that occurred during the simulations. To generate two runs of a random walk with 10 steps each, starting from node `a` and node `b`, we thus call:

In [None]:
data = rw.run_experiment(steps=10, runs=['a', 'b'])
print(data)

While this data frame can be easily exported or visualised, we often want to retrieve `Path` objects that capture the trajectories of random walks. This makes it possible, for instance, to fit higher-order models based on observed random walks. We can use the `get_path` function to retrieve a single path from the data recorded during an experiment. Here, we have to specify the `run_id` of the path to be extracted, a zero-based counter that is automatically generated during the experiment:

In [None]:
p = rw.get_path(data, 1)
print(' -> '.join([v.uid for v in p.nodes]))

If we omit the `run_id`, the first run (with ID 0) will be returned as a path:

In [None]:
p = rw.get_path(data)
print(' -> '.join([v.uid for v in p.nodes]))

We can use the `get_paths` function of the `RandomWalk` class to generate a `PathCollection` containing all paths captured by a list of run_ids (or all runs in a data frame if we omit the `run_ids` argument):

In [None]:
pc = rw.get_paths(data, run_ids=[0, 1])
for p in pc:
    print(' -> '.join([v.uid for v in p.nodes]))
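Conceptually, extracting paths amounts to grouping the recorded state changes by run and ordering the visited nodes by time. The sketch below illustrates this idea on hand-written records. The record layout (keys `run_id`, `time`, `node`, `state`) is an assumption made for this example and may differ from the columns of the data frame that `run_experiment` actually returns; `paths_from_records` is a hypothetical helper, not a pathpy function.

```python
from collections import defaultdict

# Hypothetical long-format records: one row per node state change.
# state=True means the walker arrives at the node, False that it leaves.
records = [
    {'run_id': 0, 'time': 0, 'node': 'a', 'state': True},
    {'run_id': 0, 'time': 1, 'node': 'a', 'state': False},
    {'run_id': 0, 'time': 1, 'node': 'b', 'state': True},
    {'run_id': 0, 'time': 2, 'node': 'b', 'state': False},
    {'run_id': 0, 'time': 2, 'node': 'c', 'state': True},
    {'run_id': 1, 'time': 0, 'node': 'b', 'state': True},
    {'run_id': 1, 'time': 1, 'node': 'b', 'state': False},
    {'run_id': 1, 'time': 1, 'node': 'c', 'state': True},
    {'run_id': 1, 'time': 2, 'node': 'c', 'state': False},
    {'run_id': 1, 'time': 2, 'node': 'd', 'state': True},
]


def paths_from_records(records):
    """Group arrival events by run and order them by time."""
    arrivals = defaultdict(list)
    for row in records:
        if row['state']:  # keep only the nodes the walker moves to
            arrivals[row['run_id']].append((row['time'], row['node']))
    return {
        run_id: [node for _, node in sorted(steps)]
        for run_id, steps in sorted(arrivals.items())
    }


paths = paths_from_records(records)
for run_id, nodes in paths.items():
    print(run_id, ' -> '.join(nodes))
```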

We can use the data frame returned by `run_experiment` to generate an interactive visualisation of the random walk process. We just have to pass the data frame to the `plot` function of the random walk instance. The result is a temporal network visualization, where we can use a slider bar to move forward and backward through time.

The `plot` function accepts a `run_id` argument that defines which of the recorded random walks in `data` shall be visualised. If we omit this parameter, the first random walk will be visualised.

In [None]:
rw.plot(data)

Apart from specifying a list of start nodes, we can also give a number of runs that shall be simulated. In this case, the seed nodes of the individual simulation runs are chosen uniformly at random.

In [None]:
pc = rw.get_paths(rw.run_experiment(steps=10, runs=20))
for p in pc:
    print(tuple(v.uid for v in p.nodes))

If `runs` is a list of start nodes, a single random walk is generated for each of them. To generate exactly one random walk starting in each node of the network, we can simply pass the node uids of the network:

In [None]:
pc = rw.get_paths(rw.run_experiment(steps=10, runs=n.nodes.uids))
for p in pc:
    print(tuple(v.uid for v in p.nodes))

We close this tutorial with a random walk simulation in an empirical network.

In [None]:
n = pp.io.graphtool.read_netzschleuder_network('game_thrones')
rw = pp.processes.RandomWalk(n, weight='weight')
rw.plot(rw.run_experiment(steps=100))