# Random Walks in Static Networks

[Run notebook in Google Colab](https://colab.research.google.com/github/pathpy/pathpy/blob/master/doc/tutorial/random_walks.ipynb)  
[Download notebook](https://github.com/pathpy/pathpy/raw/master/doc/tutorial/random_walks.ipynb)

In [None]:
pip install git+https://github.com/pathpy/pathpy.git

In [1]:
import pathpy as pp
from pprint import pprint

We first generate a directed network with weighted edges:

In [2]:
n = pp.Network(directed=True)
n.add_edge('a', 'b', weight=1, uid='a-b')
n.add_edge('b', 'c', weight=1, uid='b-c')
n.add_edge('c', 'a', weight=2, uid='c-a')
n.add_edge('c', 'd', weight=1, uid='c-d')
n.add_edge('d', 'a', weight=1, uid='d-a')
n.plot()

To initialize a random walk process, we can generate a `RandomWalk` instance on the network. Specifying the `weight` attribute will lead to a biased random walk, where the transition probabilities are weighted by the respective numerical property:

In [3]:
rw = pp.processes.RandomWalk(n, weight='weight')

We can inspect the transition matrix of the random walk process as follows. Calling `transition_matrix_pd` returns the matrix as a (nicely formatted) pandas DataFrame, while `transition_matrix` returns the internal sparse matrix representation. Each row holds the transition probabilities out of one node, so every row sums to one.

In [4]:
print(rw.transition_matrix_pd())

          a    b    c         d
a  0.000000  1.0  0.0  0.000000
b  0.000000  0.0  1.0  0.000000
c  0.666667  0.0  0.0  0.333333
d  1.000000  0.0  0.0  0.000000
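
As a cross-check of the matrix above, the following minimal sketch recomputes the transition probabilities in plain Python, assuming that a biased random walk simply normalizes the outgoing edge weights of each node. The helper `transition_probabilities` is a hypothetical name used for illustration and is not part of pathpy.

```python
# Weighted edges of the example network: (source, target, weight).
edges = [
    ('a', 'b', 1),
    ('b', 'c', 1),
    ('c', 'a', 2),
    ('c', 'd', 1),
    ('d', 'a', 1),
]
nodes = ['a', 'b', 'c', 'd']


def transition_probabilities(edges, nodes):
    """Hypothetical helper: normalize outgoing weights per source node."""
    out_weight = {v: 0.0 for v in nodes}
    for source, _, weight in edges:
        out_weight[source] += weight
    probs = {v: {w: 0.0 for w in nodes} for v in nodes}
    for source, target, weight in edges:
        probs[source][target] += weight / out_weight[source]
    return probs


T = transition_probabilities(edges, nodes)
for v in nodes:
    print(v, [round(T[v][w], 6) for w in nodes])
```

Under this assumption, node `c` moves to `a` with probability 2/3 and to `d` with probability 1/3, which agrees with the row for `c` printed by pathpy.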


We can use the method `simulation_run` of the random walk instance to simulate a random walk with a given number of steps. The `seed` argument specifies the node from which the walk starts. In each step, the method yields the current time together with the updated nodes, the first of which is the node the walker has moved to.

In [5]:
for time, updated_nodes in rw.simulation_run(steps=10, seed='a'):
    print('time = {0}, node = {1}'.format(time, updated_nodes[0]))

time = 1, node = b
time = 2, node = c
time = 3, node = a
time = 4, node = b
time = 5, node = c
time = 6, node = a
time = 7, node = b
time = 8, node = c
time = 9, node = a
time = 10, node = b
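
To illustrate what a single simulation run does conceptually, here is a small standalone sketch that draws each next node according to the transition probabilities of the example network. It does not use pathpy; `simulate_walk` is a hypothetical helper, and the random number generator seed below is unrelated to the `seed` argument of `simulation_run`, which denotes the start node.

```python
import random

# Transition probabilities of the example network (each row sums to 1).
T = {
    'a': {'b': 1.0},
    'b': {'c': 1.0},
    'c': {'a': 2 / 3, 'd': 1 / 3},
    'd': {'a': 1.0},
}


def simulate_walk(T, start, steps, rng):
    """Hypothetical helper: yield (time, node) for each random walk step."""
    current = start
    for time in range(1, steps + 1):
        targets = list(T[current])
        weights = [T[current][w] for w in targets]
        # Draw the next node with probability proportional to its weight.
        current = rng.choices(targets, weights=weights, k=1)[0]
        yield time, current


rng = random.Random(42)  # fixed RNG seed so the sketch is reproducible
walk = list(simulate_walk(T, 'a', 10, rng))
for time, node in walk:
    print('time = {0}, node = {1}'.format(time, node))
```

Because `a` and `b` each have a single outgoing edge, every walk starting in `a` visits `b` and then `c`; randomness only enters at node `c`.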


In [6]:
data = rw.run_experiment(steps=10, runs=['a', 'b'])
print(data)

    run_id seed  time node  state
0        0    a     0    a   True
1        0    a     0    d  False
2        0    a     0    c  False
3        0    a     0    b  False
4        0    a     1    b   True
5        0    a     1    a  False
6        0    a     2    c   True
7        0    a     2    b  False
8        0    a     3    a   True
9        0    a     3    c  False
10       0    a     4    b   True
11       0    a     4    a  False
12       0    a     5    c   True
13       0    a     5    b  False
14       0    a     6    a   True
15       0    a     6    c  False
16       0    a     7    b   True
17       0    a     7    a  False
18       0    a     8    c   True
19       0    a     8    b  False
20       0    a     9    a   True
21       0    a     9    c  False
22       0    a    10    b   True
23       0    a    10    a  False
24       1    b     0    a  False
25       1    b     0    d  False
26       1    b     0    c  False
27       1    b     0    b   True
28       1    

In [7]:
p = rw.get_path(data, 0)
print([v for v in p.nodes])

[Node a, Node b, Node c, Node a, Node b, Node c, Node a, Node b, Node c, Node a, Node b]
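
Judging from the table above, each row with `state` equal to `True` appears to mark the node the walker occupies at that time, while `False` rows mark nodes it does not occupy. Under that assumption, the path of a run can be recovered by keeping the `True` rows and ordering them by time. The sketch below does this on a few rows copied from the table; `path_from_records` is a hypothetical helper, not pathpy's implementation of `get_path`.

```python
# Rows copied from the table above: (run_id, time, node, state).
records = [
    (0, 0, 'a', True), (0, 0, 'd', False), (0, 0, 'c', False), (0, 0, 'b', False),
    (0, 1, 'b', True), (0, 1, 'a', False),
    (0, 2, 'c', True), (0, 2, 'b', False),
    (0, 3, 'a', True), (0, 3, 'c', False),
]


def path_from_records(records, run_id):
    """Hypothetical helper: nodes occupied in one run, ordered by time."""
    active = [(time, node) for rid, time, node, state in records
              if rid == run_id and state]
    return [node for _, node in sorted(active)]


print(path_from_records(records, 0))
```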


In [8]:
pc = rw.get_paths(data, [0,1])
print(pc)

{<pathpy.core.path.Path object at 0x000001997ED036A0>, <pathpy.core.path.Path object at 0x000001993D9F4CF8>}


In [9]:
rw.plot(data)

If we pass an integer to `runs` instead of a list of start nodes, the given number of runs is simulated, each starting from a randomly chosen node.

In [10]:
data = rw.run_experiment(steps=200, runs=2)
rw.plot(data)

We can convert the result of an experiment into a collection of `Path` objects using `get_paths`. Here, `steps` sets the length of each walk. Since `runs` is an integer, the given number of walks is generated, each starting from a random node.

In [11]:
pc = rw.get_paths(rw.run_experiment(steps=10, runs=10))
for p in pc:
    print( tuple([v.uid for v in p.nodes]))

('c', 'd', 'a', 'b', 'c', 'a', 'b', 'c', 'd', 'a', 'b')
('c', 'a', 'b', 'c', 'd', 'a', 'b', 'c', 'a', 'b', 'c')
('b', 'c', 'a', 'b', 'c', 'd', 'a', 'b', 'c', 'a', 'b')
('c', 'd', 'a', 'b', 'c', 'a', 'b', 'c', 'd', 'a', 'b')
('d', 'a', 'b', 'c', 'a', 'b', 'c', 'd', 'a', 'b', 'c')
('d', 'a', 'b', 'c', 'a', 'b', 'c', 'd', 'a', 'b', 'c')
('b', 'c', 'a', 'b', 'c', 'd', 'a', 'b', 'c', 'a', 'b')
('a', 'b', 'c', 'a', 'b', 'c', 'a', 'b', 'c', 'a', 'b')
('a', 'b', 'c', 'a', 'b', 'c', 'd', 'a', 'b', 'c', 'd')
('b', 'c', 'a', 'b', 'c', 'a', 'b', 'c', 'a', 'b', 'c')


If `runs` is a list of node uids, a single random walk is generated for each start node. To generate a single random walk starting in each node of the network, we can write:

In [21]:
pc = rw.get_paths(rw.run_experiment(steps=10, runs=n.nodes.uids))
for p in pc:
    print(tuple([ v.uid for v in p.nodes ]))

('a', 'b', 'c', 'a', 'b', 'c', 'd', 'a', 'b', 'c', 'd')
('d', 'a', 'b', 'c', 'a', 'b', 'c', 'a', 'b', 'c', 'a')
('c', 'a', 'b', 'c', 'a', 'b', 'c', 'a', 'b', 'c', 'a')
('b', 'c', 'a', 'b', 'c', 'a', 'b', 'c', 'a', 'b', 'c')


We can also use an iterator interface to perform random walk steps one at a time. In each step, we can inspect the current state of the random walk at time t, including the visitation frequencies so far and their total variation distance.

In [20]:
for time, _ in rw.simulation_run(10, seed='a'):
    print('Current node = {0}'.format(rw.current_node))
    print('Current time = {0}'.format(rw.time))
    print(rw.visitation_frequencies)    
    print(rw.total_variation_distance)

Current node = b
Current time = 1
[0.5 0.5 0.  0. ]
0.39999999999999997
Current node = c
Current time = 2
[0.33333333 0.33333333 0.33333333 0.        ]
0.10000000000000017
Current node = a
Current time = 3
[0.5  0.25 0.25 0.  ]
0.19999999999999996
Current node = b
Current time = 4
[0.4 0.4 0.2 0. ]
0.19999999999999998
Current node = c
Current time = 5
[0.33333333 0.33333333 0.33333333 0.        ]
0.10000000000000017
Current node = a
Current time = 6
[0.42857143 0.28571429 0.28571429 0.        ]
0.12857142857142853
Current node = b
Current time = 7
[0.375 0.375 0.25  0.   ]
0.14999999999999997
Current node = c
Current time = 8
[0.33333333 0.33333333 0.33333333 0.        ]
0.10000000000000017
Current node = a
Current time = 9
[0.4 0.3 0.3 0. ]
0.1000000000000002
Current node = b
Current time = 10
[0.36363636 0.36363636 0.27272727 0.        ]
0.12727272727272726
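
The printed distances are consistent with the usual definition of total variation distance, half the sum of absolute differences, taken between the visitation frequencies and the stationary distribution [0.3, 0.3, 0.3, 0.1] of this network. The following sketch reproduces the first three values under that assumption; both helper functions are hypothetical stand-ins written in plain Python, not pathpy code.

```python
from collections import Counter

nodes = ['a', 'b', 'c', 'd']
stationary = [0.3, 0.3, 0.3, 0.1]  # stationary distribution of the example


def visitation_frequencies(walk, nodes):
    """Relative number of visits per node, including the start node."""
    counts = Counter(walk)
    return [counts[v] / len(walk) for v in nodes]


def total_variation_distance(p, q):
    """Half the L1 distance between two probability vectors."""
    return 0.5 * sum(abs(x - y) for x, y in zip(p, q))


walk = ['a']
distances = []
for node in ['b', 'c', 'a']:  # first three steps of the run shown above
    walk.append(node)
    freq = visitation_frequencies(walk, nodes)
    distances.append(total_variation_distance(freq, stationary))
    print(freq, distances[-1])
```

Up to floating-point rounding, the three distances are 0.4, 0.1 and 0.2, matching the values printed for times 1 to 3.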


The visitation probabilities after t steps for a given start node can be computed using the `visitation_probabilities` function.

In [14]:
rw.visitation_probabilities(0, start_node=n.nodes['a'])

matrix([[1., 0., 0., 0.]])

In [15]:
rw.visitation_probabilities(1, start_node=n.nodes['a'])

matrix([[0., 1., 0., 0.]])

In [16]:
rw.visitation_probabilities(100, start_node=n.nodes['a'])

matrix([[0.30000005, 0.30000046, 0.29999959, 0.0999999 ]])
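
These values can be reproduced by hand: starting from a vector that puts all probability on the start node, multiply it t times by the transition matrix. The sketch below does this with plain Python lists. The function name mirrors the pathpy method only for readability, and it takes a node index rather than a `Node` object, which is an assumption made for this illustration.

```python
nodes = ['a', 'b', 'c', 'd']
# Row-stochastic transition matrix in node order a, b, c, d.
T = [
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [2 / 3, 0.0, 0.0, 1 / 3],
    [1.0, 0.0, 0.0, 0.0],
]


def visitation_probabilities(T, t, start_index):
    """Illustrative re-implementation: distribution after t steps."""
    n = len(T)
    p = [0.0] * n
    p[start_index] = 1.0
    for _ in range(t):
        # One step: p_next[j] = sum_i p[i] * T[i][j]
        p = [sum(p[i] * T[i][j] for i in range(n)) for j in range(n)]
    return p


for t in (0, 1, 100):
    print(t, visitation_probabilities(T, t, nodes.index('a')))
```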

The stationary visitation probabilities for $t \rightarrow \infty$ can be computed as follows:

In [19]:
rw.stationary_state()

array([0.3, 0.3, 0.3, 0.1])
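
One way to see where these numbers come from is power iteration: apply the transition matrix repeatedly to an arbitrary starting distribution until it stops changing. This is a sketch of that general technique and not necessarily how pathpy computes the result; the function below is a hypothetical plain-Python stand-in, and its tolerance and iteration limit are arbitrary choices.

```python
# Row-stochastic transition matrix in node order a, b, c, d.
T = [
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [2 / 3, 0.0, 0.0, 1 / 3],
    [1.0, 0.0, 0.0, 0.0],
]


def stationary_state(T, tol=1e-12, max_iter=10000):
    """Hypothetical helper: power iteration from the uniform distribution."""
    n = len(T)
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        nxt = [sum(pi[i] * T[i][j] for i in range(n)) for j in range(n)]
        # Stop once a further step no longer changes the distribution.
        if max(abs(x - y) for x, y in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi


pi = stationary_state(T)
print([round(x, 6) for x in pi])
```

The iteration converges here because the network is strongly connected and contains cycles of lengths 3 and 4, so the walk is aperiodic.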