# Epidemic Spreading

[Run notebook in Google Colab](https://colab.research.google.com/github/pathpy/pathpy/blob/master/doc/tutorial/epidemic_spreading.ipynb)  
[Download notebook](https://github.com/pathpy/pathpy/raw/master/doc/tutorial/epidemic_spreading.ipynb)

The class `BaseProcess` enables users to implement, simulate, and visualise custom discrete-time dynamical processes. Some simple processes, such as the Susceptible-Infected-Recovered (SIR) model for epidemic spreading, are implemented in `pathpy`, mainly to illustrate how you can implement your own, more realistic model.

In this notebook, we demonstrate how we can simulate a simple spreading process in `pathpy`.

In [None]:
pip install git+https://github.com/pathpy/pathpy.git

In [1]:
import pathpy as pp
import seaborn as sn
import pandas as pd
import numpy as np
from pprint import pprint

from matplotlib import pyplot as plt



We first need a network on which to run the dynamical process. We will use the network of Game of Thrones characters.

In [None]:
n = pp.io.graphtool.read_netzschleuder_network('game_thrones')
print(n)

Inspecting the node attributes above, we find that we can use the node-level attribute `name` to plot the network:

In [None]:
n.plot(node_label={v.uid: v['name'] for v in n.nodes}, label_color='black')

To simulate the SIR epidemic spreading process, we must first initialize the process. We can do this by creating a new instance of the class `pp.processes.EpidemicSIR`. The constructor takes three parameters. The first parameter is the network on which we want to simulate the process. The second parameter is the recovery time, i.e. the number of time steps a node remains infected before it recovers (and becomes immune). The third parameter is the per-time-step probability with which a susceptible node that is connected to an infected node becomes infected.

In [None]:
sir = pp.processes.EpidemicSIR(n, recovery_time=20, infection_prob=0.1)
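
To make the roles of the two parameters concrete, here is a minimal pure-Python sketch of one update step of a discrete-time SIR process. It is not `pathpy`'s implementation: the function `sir_step`, the adjacency dictionary, and the choice of one independent transmission trial per infected neighbour are assumptions made for illustration only.

```python
import random

SUSCEPTIBLE, INFECTED, RECOVERED = 0, 1, 2


def sir_step(adjacency, state, infected_since, t, recovery_time, infection_prob, rng):
    """Advance a toy SIR process by one step and return the set of changed nodes."""
    changed = set()
    new_state = dict(state)  # update synchronously: read old states, write new ones
    for v, s in state.items():
        if s == INFECTED:
            # A node recovers once it has been infected for recovery_time steps.
            if t - infected_since[v] >= recovery_time:
                new_state[v] = RECOVERED
                changed.add(v)
        elif s == SUSCEPTIBLE:
            # Assumption: each infected neighbour gets one independent trial.
            for w in adjacency[v]:
                if state[w] == INFECTED and rng.random() < infection_prob:
                    new_state[v] = INFECTED
                    infected_since[v] = t
                    changed.add(v)
                    break
    state.update(new_state)
    return changed


# Hypothetical path network a - b - c - d with node 'a' infected at time 0.
adjacency = {'a': ['b'], 'b': ['a', 'c'], 'c': ['b', 'd'], 'd': ['c']}
state = {v: SUSCEPTIBLE for v in adjacency}
state['a'] = INFECTED
infected_since = {'a': 0}
rng = random.Random(42)

history = []
for t in range(1, 6):
    changed = sir_step(adjacency, state, infected_since, t,
                       recovery_time=2, infection_prob=1.0, rng=rng)
    history.append((t, sorted(changed)))
    print(t, sorted(changed))
```

With `infection_prob=1.0` the spread is deterministic, so the infection moves one hop per step along the path while each node recovers two steps after it was infected. Lower probabilities make the outcome depend on the random number generator.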

Now that our experiment is set up, we can run it. This can be done in two ways:

The first method is to use the `simulation_run` iterator, which allows us to iterate through the steps of the process. After each discrete time step of the simulation, this iterator will yield a tuple consisting of the current time and a set of nodes whose state has been changed in the current step. In the SIR model, this state change can either be due to a susceptible node becoming infected, or an infected node changing to recovered. We can use the method `sir.node_state` to check the current state of each node in the network. 

In the SIR model, the three states corresponding to the three compartments `susceptible`, `infected`, `recovered` are encoded by the integer values `0`, `1`, and `2` respectively. Hence, to print a list of newly infected nodes in each step, we can write:

In [None]:
for time, changed_nodes in sir.simulation_run(steps=20):
    print('time = {0}: {1}'.format(time, [v for v in changed_nodes if sir.node_state(v)==1]))

In the example above, the seed node that is initially infected (at time 0) is chosen uniformly at random. If we want to start the simulation with a specific seed node, we can pass the uid of the initially infected node via the `seed` parameter:

In [None]:
for time, changed_nodes in sir.simulation_run(steps=20, seed='0'):
    print('time = {0}: {1}'.format(time, [v for v in changed_nodes if sir.node_state(v)==1]))

A common task in the simulation of stochastic processes is the execution of multiple runs with random initializations. Rather than requiring us to collect the results of the individual simulation runs ourselves via the `simulation_run` iterator, `pathpy` provides a `run_experiment` function that makes this task simple. We only need to specify the number of steps for which we wish to simulate the process, as well as the number of times the experiment shall be executed (runs).

For each run a new random seed node will be chosen automatically.

In [None]:
data = sir.run_experiment(steps=100, runs=10)

This method returns a `pandas.DataFrame` which collects the full evolution of the process. Each row in the data frame stores a single state change of a node in a given run and time step along with the updated state. This data frame allows us to reconstruct the full dynamics, and execute downstream analyses and visualisations. Only updates to states are stored, i.e. unchanged states of nodes are omitted.

In [None]:
data.head()
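
Because only state changes are stored, the full state of the network at any time has to be rebuilt by replaying the updates in order. The following sketch shows one way to do this with plain Python. The helper `compartment_counts` and the example records are hypothetical; the keys `run_id`, `time`, `node`, and `state` are assumed to match the columns of the data frame, so that records of this shape could be obtained with `data.to_dict('records')`.

```python
from collections import Counter


def compartment_counts(records, run_id, steps):
    """Replay a state-change log and count the nodes per state at each time step."""
    # Group the updates of the requested run by time step.
    updates = {}
    for r in records:
        if r['run_id'] == run_id:
            updates.setdefault(r['time'], []).append((r['node'], r['state']))

    current = {}  # last known state of every node seen so far
    counts = []
    for t in range(steps + 1):
        for node, s in updates.get(t, []):
            current[node] = s
        # Nodes without an update at time t keep their previous state.
        counts.append(Counter(current.values()))
    return counts


# Hypothetical change log: states 0, 1, 2 = susceptible, infected, recovered.
records = [
    {'run_id': 0, 'time': 0, 'node': 'a', 'state': 1},
    {'run_id': 0, 'time': 0, 'node': 'b', 'state': 0},
    {'run_id': 0, 'time': 0, 'node': 'c', 'state': 0},
    {'run_id': 0, 'time': 1, 'node': 'b', 'state': 1},
    {'run_id': 0, 'time': 2, 'node': 'a', 'state': 2},
    {'run_id': 1, 'time': 0, 'node': 'c', 'state': 1},
]

counts = compartment_counts(records, run_id=0, steps=2)
for t, c in enumerate(counts):
    print(t, dict(c))
```

The sketch assumes that the log contains an initial entry for every node, so that nodes which never change state are still counted.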

As an example, let us plot the evolution of the average total number of infections over time, as well as the average number of new infections in each time step, across the 10 runs of the process. To plot this, we can write:

In [None]:
dynamics = [] 
total_infections = 0
for t in range(100):
    new_infections = len(data.loc[(data['time']==t) & (data['state']==1)])/10  # average over the 10 runs
    total_infections += new_infections
    dynamics.append({ 
        'time': t,
        'new_infections': new_infections, 
        'total_infections': total_infections}
    )

dynamics = pd.DataFrame.from_dict(dynamics)
sn.lineplot(data=dynamics, x='time', y='total_infections', label='Total infections')
sn.lineplot(data=dynamics, x='time', y='new_infections', label='New infections')
plt.legend()

Finally, there is a simple method to generate an interactive visualisation of the dynamical process in the network. We can simply call the `plot` function of the process instance, passing the data frame that we collected in our experiment. Since this data frame can contain data on more than one simulation run, we can specify the id of the run that we wish to visualize. If we omit this parameter, the first run will be shown.

In [None]:
sir.plot(data, run_id=0)

In [None]:
sir.plot(data, run_id=1)

To simplify matters, the following line executes and visualizes a single run of the process with a random seed node.

In [None]:
sir.plot(sir.run_experiment(steps=100))

Finally, we can use the sequence of state changes recorded during a process run to extract a directed acyclic graph of possible causal influences between nodes. In this directed acyclic graph, each change of the state of a node w at time t is represented by a time-unfolded node 'w-t', i.e. the directed acyclic graph can be thought of as a time-unfolded static representation of the process evolution. In addition, an edge (v-t', w-t) indicates that, prior to node w changing its state, a node v connected to w changed its state at time t'<t. We can limit the construction of the directed acyclic graph to specific state changes. In the following example, we use a directed acyclic graph to capture possible transmission routes of the epidemic process in a small example network.

In [None]:
n = pp.Network(directed=False)
n.add_edge('a', 'b')
n.add_edge('a', 'c')
n.add_edge('c', 'd')
n.add_edge('c', 'e')
n.add_edge('d', 'e')
n.add_edge('c', 'b')
n.plot()

In [None]:
sir = pp.processes.EpidemicSIR(n, recovery_time=10, infection_prob=0.5)
data = sir.run_experiment(steps = 50, runs=['a'])
print(data)

In [None]:
dag = sir.to_directed_acylic_graph(data, states=[1])
print(dag)
dag.plot()

In [None]:
pc = pp.algorithms.path_extraction.extract_path_collection(dag)
for p in pc:
    print([v['node_label'] for v in p.nodes])
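
The construction described above can also be sketched in plain Python, read directly from the definition: one time-unfolded node per state change, and an edge whenever a neighbour changed its state earlier. The function `time_unfolded_dag` and the change log are hypothetical, and `pathpy`'s own method may apply additional restrictions that this sketch does not model.

```python
def time_unfolded_dag(adjacency, changes, states=None):
    """Build the edges of a time-unfolded DAG from (time, node, state) changes."""
    if states is not None:
        # Keep only the state changes of interest, e.g. infections.
        changes = [c for c in changes if c[2] in states]
    edges = set()
    for t, w, _ in changes:
        for t_prev, v, _ in changes:
            # v must be a neighbour of w and must have changed strictly earlier.
            if t_prev < t and v in adjacency[w]:
                edges.add((f'{v}-{t_prev}', f'{w}-{t}'))
    return edges


# The small example network from above, as an adjacency dictionary.
adjacency = {
    'a': {'b', 'c'},
    'b': {'a', 'c'},
    'c': {'a', 'b', 'd', 'e'},
    'd': {'c', 'e'},
    'e': {'c', 'd'},
}

# Hypothetical change log: infections (state 1) and one recovery (state 2).
changes = [(0, 'a', 1), (1, 'c', 1), (2, 'b', 1), (2, 'd', 1), (3, 'e', 1), (10, 'a', 2)]

edges = time_unfolded_dag(adjacency, changes, states=[1])
for edge in sorted(edges):
    print(edge)
```

Since every edge points from an earlier to a later time, the resulting graph cannot contain a cycle. Each path through it is one possible transmission route.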

## Epidemic Spreading in Temporal Networks

In [2]:
t = pp.TemporalNetwork()
t.add_edge('a', 'b', timestamp=1)
t.add_edge('b', 'c', timestamp=2)
t.add_edge('c', 'd', timestamp=3)
t.plot()

In [3]:
sir = pp.processes.EpidemicSIR(t, recovery_time=1, infection_prob=1)
data = sir.run_experiment(steps=10, runs=['a'])

found active neighbor
found active neighbor
found active neighbor


In [4]:
data

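|   | run_id | seed | time | node | state |
|---|--------|------|------|------|-------|
| 0 | 0 | a | 1 | d | 0 |
| 1 | 0 | a | 1 | a | 1 |
| 2 | 0 | a | 1 | c | 0 |
| 3 | 0 | a | 1 | b | 0 |
| 4 | 0 | a | 2 | b | 1 |
| 5 | 0 | a | 3 | c | 1 |
| 6 | 0 | a | 4 | a | 2 |
| 7 | 0 | a | 4 | d | 1 |
| 8 | 0 | a | 4 | b | 2 |
| 9 | 0 | a | 5 | c | 2 |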


In [5]:
sir.plot(data)
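
In a temporal network, an infection can only travel along contacts that occur in the right chronological order. The sketch below illustrates this for the deterministic case (infection probability 1, recovery ignored). The function `earliest_infection_times` is hypothetical, contacts are treated as undirected, and the infection time is taken to be the contact timestamp, which need not match the timing convention used by `pathpy`.

```python
def earliest_infection_times(contacts, seed, start=0):
    """Return {node: infection time} for a deterministic spread over timed contacts."""
    infected_at = {seed: start}
    # Contacts must be processed in chronological order.
    for v, w, t in sorted(contacts, key=lambda c: c[2]):
        for src, dst in ((v, w), (w, v)):  # assumption: contacts are undirected
            # src can only transmit if it was infected strictly before this contact.
            if src in infected_at and infected_at[src] < t and dst not in infected_at:
                infected_at[dst] = t
    return infected_at


# The same three contacts, once in causal order and once in reversed order.
forward = [('a', 'b', 1), ('b', 'c', 2), ('c', 'd', 3)]
backward = [('c', 'd', 1), ('b', 'c', 2), ('a', 'b', 3)]

print(earliest_infection_times(forward, 'a'))
print(earliest_infection_times(backward, 'a'))
```

In the forward ordering every node is reached from the seed `a`. In the reversed ordering only `b` is reached, because the contacts b-c and c-d occur before `b` is infected.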

In [12]:
t = pp.io.graphtool.read_netzschleuder_network('sp_hospital')
print(t)
t.plot()

Uid:			0x1e2b80f8240
Type:			TemporalNetwork
Directed:		False
Multi-Edges:		False
Number of unique nodes:	75
Number of unique edges:	1139
Number of temp nodes:	75
Number of temp edges:	32424
Observation period:	140 - 347641.0
Observation length:	347501.0

Network attributes
------------------
name:	sp_hospital
description:	This dataset contains the temporal network of contacts between patients, patients and health-care workers (HCWs) and among HCWs in a hospital ward in Lyon, France, from Monday, December 6, 2010 at 1:00 pm to Friday, December 10, 2010 at 2:00 pm. The study included 46 HCWs and 29 patients.[^icon]

The file contains a tab-separated list representing the active contacts during 20-second intervals of the data collection. Each line has the form “t i j Si Sj“, where i and j are the anonymous IDs of the persons in contact, Si and Sj are their statuses (NUR=paramedical staff, i.e. nurses and nurses’ aides; PAT=Patient; MED=Medical doctor; ADM=administrative staff), and the i

In [19]:
sir = pp.processes.EpidemicSIR(t, recovery_time=10000, infection_prob=1)
data = sir.run_experiment(steps=10000, runs=['14'])

found active neighbor
found active neighbor
found active neighbor
found active neighbor
found active neighbor


In [16]:
sir.plot(data)