# Important note!

Before you turn this problem in, make sure everything runs as expected. First, **restart the kernel** (in the menubar, select Kernel$\rightarrow$Restart) and then **run all cells** (in the menubar, select Cell$\rightarrow$Run All).

Make sure you fill in any place that says `YOUR CODE HERE` or "YOUR ANSWER HERE", as well as your GT login and the GT logins of any of your collaborators below. (The GT logins are worth 1 point per notebook, so don't miss the opportunity to get a free point!)

In [None]:
YOUR_ID = "" # Please enter your GT login, e.g., "rvuduc3" or "gtg911x"
COLLABORATORS = [] # list of strings of your collaborators' IDs

In [None]:
import re

RE_CHECK_ID = re.compile (r'''[a-zA-Z]+\d+|[gG][tT][gG]\d+[a-zA-Z]''')
assert RE_CHECK_ID.match (YOUR_ID) is not None

collab_check = [RE_CHECK_ID.match (i) is not None for i in COLLABORATORS]
assert all (collab_check)

del collab_check
del RE_CHECK_ID
del re

**Jupyter / IPython version check.** The following code cell verifies that you are using the correct version of Jupyter/IPython.

In [None]:
import IPython
assert IPython.version_info[0] >= 3, "Your version of IPython is too old, please update it."

# A cellular automaton for the S-I-R model of infection

In this notebook, you will use a cellular automaton to implement a model of the spread of infection, which we will refer to as the susceptible-infectious-recovered CA (SIR-CA) model.

The slides that accompany this notebook will be available on both Piazza and T-Square under "Resources."

## Setup

Some code setup: run these cells, declare victory, and move on.

In [None]:
import numpy as np
import scipy as sp
import scipy.sparse

def count (G):
    """
    Counts the number of locations in a NumPy array, `G`,
    that are nonzero, i.e., the locations `np.where (G)` returns.
    """
    return len (np.where (G)[0])

def find (G):
    """
    Returns the set of locations of a 2-D NumPy array, `G`,
    that are nonzero, i.e., the locations `np.where (G)` returns.
    """
    assert type (G) is np.ndarray
    return {(i, j) for i, j in zip (*np.where (G))}

In [None]:
import matplotlib.pyplot as plt # Core plotting support
%matplotlib inline

## The phenomenon to be modeled and simulated

Suppose we wish to model the spread of an illness in a population distributed geographically. This illness is non-fatal, meaning a person who has it does not die from it; an ill person eventually recovers. The illness is also contagious, spreading by contact. After a roughly fixed and predictable period of time, an ill person recovers and develops an immunity to the illness, meaning he or she will never suffer from the same illness again.

## Conceptual model

As a first cut, let's try using a cellular automaton (CA) as the conceptual model. We will refer to the specific model we develop as the SIR-CA model, as noted above.

Let the world be a square $n \times n$ grid $G = G(t) \equiv \left(g_{ij}(t)\right)$ of cells that evolve over time, which is discrete and measured in, say, (integer) days.

Every cell of $G$ is a position that is either empty or occupied by a person, who exists in one of three possible states:

1. **Susceptible (S)**: This person has never gotten the illness before. If he or she comes in close contact with a sick person, he or she is at risk of catching the illness.
2. **Infected (I)**: This person has the illness and is contagious.
3. **Recovered (R)**: This person had the illness but has developed the immunity. He or she cannot become sick again.

Let's associate these states with the following integers:

In [None]:
# Possible states:
EMPTY = -1
SUSCEPTIBLE = 0
INFECTED = 1
RECOVERED = 2

**Exercise 1** (2 points). On the "zeroth day" ($t = 0$), the world is full of susceptible people and one of them (near the center) gets sick. This state is our initial condition.

Complete the following function so that it creates a world satisfying this initial condition. That is, given a dimension, `n`, it should return an `(n+2)`-by-`(n+2)` NumPy array of integer values that are empty on the boundary and susceptible everywhere in the `n`-by-`n` interior except at the center, where there should be one "index case" (i.e., one infected person).

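> Define the "center" along a dimension of length $n$ to be position $\left\lceil\frac{n+2}{2}\right\rceil$, counting from 1. The "+2" accounts for the boundary cells. For odd $n$, this is the zero-based index `int ((n+2) / 2)`, which is what the test cell checks.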
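
To make the indexing concrete, here is a small sketch. It assumes the formula above counts positions starting from 1, whereas NumPy indexes from 0; the helper name `center_index` is made up for this illustration and is not part of the notebook.

```python
import math

def center_index(n):
    """Zero-based index of the 1-based position ceil((n+2)/2) (assumed convention)."""
    one_based = math.ceil((n + 2) / 2)
    return one_based - 1

for n in (3, 5, 9):
    # For odd n, this agrees with int((n+2)/2), the expression the test cell uses.
    print(n, center_index(n), int((n + 2) / 2))
```

For even `n` the two expressions differ by one, so this sketch sticks to odd sizes such as the `N = 9` used below.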

In [None]:
def create_world (n):
    # YOUR CODE HERE
    raise NotImplementedError()
    
def show_peeps (G, vmin=EMPTY, vmax=RECOVERED, values="states"):
    # Set color range
    assert values in ["states", "bool"]
    if values == "states":
        vticks = range (vmin, vmax+1)
        vlabels = ['Empty', 'Susceptible', 'Infected', 'Recovered']
    else:
        vticks = [0, 1]
        vlabels = ['False (0)', 'True (1)']
    
    m, n = G.shape[0]-2, G.shape[1]-2
    plt.pcolor (G, vmin=vmin, vmax=vmax, edgecolor='black')
    cb = plt.colorbar ()
    cb.set_ticks (vticks)
    cb.set_ticklabels (vlabels)
    plt.axis ('square')
    plt.axis ([0, m+2, 0, n+2])

# Create the initial world at time t=0
N = 9
peeps_0 = create_world (N)
show_peeps (peeps_0)

In [None]:
assert peeps_0.shape == (N+2, N+2)
assert np.sum (peeps_0) == ((4*N+4)*EMPTY + INFECTED)
assert len (np.where (peeps_0 == INFECTED)[0]) == 1

i_mid = int ((N+2) / 2)
assert peeps_0[i_mid, i_mid] == INFECTED

print ("\n(Passed.)")

Let's define some functions to help identify susceptible, infected, and recovered people in this world.

In [None]:
def susceptible (G):
    """
    Given a grid, G, returns a grid S whose (i, j) entry
    equals 1 if G[i, j] is susceptible or 0 otherwise.
    """
    return (G == SUSCEPTIBLE).astype (int)

print ("There are", count (susceptible (peeps_0)), "susceptible patient(s) initially")

**Exercise 2** (1 point). Complete the following function, which should find infected individuals in a given world.

In [None]:
def infected (G):
    """
    Given a grid G, returns a grid I whose (i, j) entry equals 1 if
    G[i, j] is infected or 0 otherwise.
    """
    # YOUR CODE HERE
    raise NotImplementedError()

In [None]:
assert (infected (peeps_0) >= 0).all ()
assert (infected (peeps_0) <= 1).all ()
print ("There are", count (infected (peeps_0)), "infected patient(s) initially")
assert count (infected (peeps_0)) == 1

**Exercise 3** (1 point). Complete the following function, which should find recovered individuals in a given world.

In [None]:
def recovered (G):
    """
    Given a grid G, returns a grid R whose (i, j) entry equals 1 if
    G[i, j] has recovered or 0 otherwise.
    """
    # YOUR CODE HERE
    raise NotImplementedError()

In [None]:
assert ((recovered (peeps_0) == 0) | (recovered (peeps_0) == 1)).all ()
print ("There are", count (recovered (peeps_0)), "patient(s) in recovery.")
assert count (recovered (peeps_0)) == 0

**Time evolution.** Next, let's define the state evolution rules that determine how the sickness spreads on each subsequent day, $t \geq 1$:

* **R1**) A person is sick for only one day. That is, if he or she is sick on day $t$, then on day $t+1$ he or she will have recovered.
* **R2**) The illness spreads from infected persons to their north, south, east, and west neighbors, but it does so nondeterministically. More formally, let's call a susceptible person at $(i, j)$ _exposed_ if _any_ of his or her north, south, east, or west neighbors is infected. The _conditional_ probability that any exposed person becomes infected is $\tau$, which is uniform and independent for all positions. Thus, this rule says that each exposed person becomes infected randomly with probability $\tau$.
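
As a rough illustration of the "with probability $\tau$" part of rule R2, the sketch below simulates many independent trials by comparing uniform random draws against a threshold. The names `rng`, `draws`, and `becomes_infected` are invented for this example, and the value of `tau` is only illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)   # seeded generator, so the sketch is repeatable
tau = 0.5                        # illustrative infection probability

# One uniform draw in [0, 1) per trial; a draw below tau counts as "infected".
draws = rng.uniform(size=100_000)
becomes_infected = draws < tau

# The observed fraction should be close to tau.
print(becomes_infected.mean())
```

Each trial is independent of the others, which is the sense in which $\tau$ is "uniform and independent for all positions."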

**Exercise 4** (2 points). To help determine who might catch the disease in a given time step, let's write a function that determines who is exposed. That is, given a grid $G$, this function returns a new grid $E$ such that $e_{ij}$ is `1` if $g_{ij}$ is susceptible and at least one neighbor of $g_{ij}$ is sick, and `0` otherwise.
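
To pin down the definition before you implement it, here is a deliberately naive sketch on a toy 5-by-5 world. It is one possible reading of the definition rather than the intended solution: `exposed_loop` and `toy` are hypothetical names, the state codes are written as literals (`-1` empty, `0` susceptible, `1` infected), and a version built on array slices would likely be faster.

```python
import numpy as np

# Toy 5x5 world: empty boundary, susceptible interior, one infected cell.
toy = -1 * np.ones((5, 5), dtype=int)
toy[1:-1, 1:-1] = 0
toy[2, 2] = 1

def exposed_loop(G):
    """Hypothetical reference version: explicit loops over interior cells."""
    E = np.zeros(G.shape, dtype=int)
    for i in range(1, G.shape[0] - 1):
        for j in range(1, G.shape[1] - 1):
            neighbors = (G[i-1, j], G[i+1, j], G[i, j-1], G[i, j+1])
            # Exposed = susceptible AND at least one infected neighbor.
            if G[i, j] == 0 and any(v == 1 for v in neighbors):
                E[i, j] = 1
    return E

print(exposed_loop(toy))
```

Note that the infected cell itself is not marked as exposed, because it is not susceptible.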

In [None]:
def exposed (G):
    """
    Returns a grid whose (i, j) entry is 1 if G[i, j] is susceptible
    and has at least 1 infected neighbor, or 0 otherwise.
    """
    E = np.zeros (G.shape, dtype=int) # exposed people
    # YOUR CODE HERE
    raise NotImplementedError()
    return E

# Visualizes your results:
print ("There are", count (exposed (peeps_0)), "exposed patient(s).")
show_peeps (exposed (peeps_0), values="bool")

In [None]:
peeps_0 = create_world (N)
E_locs = find (exposed (peeps_0))
assert len (E_locs) == 4

i_mid = int ((N+2) / 2)
assert (i_mid-1, i_mid) in E_locs
assert (i_mid, i_mid+1) in E_locs
assert (i_mid+1, i_mid) in E_locs
assert (i_mid, i_mid-1) in E_locs
print ("\n(Passed.)")

**Exercise 5** (2 points). Complete the following function, `spreads()`. It takes as input a grid `G[:,:]` of people and the conditional probability `tau` of becoming infected given any sick neighbors. It should determine to which grid cells the infection spreads.

In particular, it should return a binary (0-1) grid `G_s[:,:]` of the same size as `G[:,:]`. If `G[i,j]` is exposed, then `G_s[i,j]` is `1` with probability `tau` and `0` otherwise; if `G[i,j]` is not exposed, then `G_s[i,j]` is `0`.

In [None]:
COND_PROB_ILL = 0.5 # Probability of getting sick, given any sick neighbors

def spreads (G, tau=COND_PROB_ILL):
    # grid of uniformly random values
    random_draw = np.random.uniform (size=G.shape)
    # YOUR CODE HERE
    raise NotImplementedError()
    
np.random.seed (1602034230) # Fixed seed, for debugging
G_s = spreads (peeps_0)
print ("The infection is spreading to", np.sum (G_s), "patient(s)")
show_peeps (G_s, values="bool")

In [None]:
assert ((G_s >= 0) & (G_s <= 1)).all ()
assert np.sum (G_s) <= 4

G_s_locs = np.where (G_s == 1)
assert (((G_s_locs[0] == i_mid) & (np.abs (G_s_locs[1] - i_mid) == 1))
        | ((G_s_locs[1] == i_mid) & (np.abs (G_s_locs[0] - i_mid) == 1))).all ()

print ("\n(Passed.)")

**Exercise 6** (2 points). Write a function to simulate one time step, given a grid `G[:,:]` and conditional probability `tau` of infection when exposed.

In [None]:
def step (G, tau=COND_PROB_ILL):
    """
    Simulates one time step and returns a grid
    of the resulting states.
    """
    # YOUR CODE HERE
    raise NotImplementedError()
    
np.random.seed (1602034230) # Fixed seed, for debugging
peeps_1 = step (peeps_0)

fig = plt.figure (figsize=(12, 4))
plt.subplot (1, 2, 1)
show_peeps (peeps_0)
plt.title ('Before')

plt.subplot (1, 2, 2)
show_peeps (peeps_1)
plt.title ('After')

In [None]:
E_0_locs = find (exposed (peeps_0))
print (E_0_locs)

I_1_locs = find (infected (peeps_1))
print (I_1_locs)

assert I_1_locs <= E_0_locs # newly infected cells must have been exposed
print ("\n(Passed.)")

## Putting it all together

The preceding code lays the building blocks for the complete simulation, which the following function implements.

In [None]:
def summarize (G_t, verbose=True):
    n_S = count (susceptible (G_t))
    n_I = count (infected (G_t))
    n_R = count (recovered (G_t))
    if verbose:
        print ("# susceptible:", n_S)
        print ("# infected:", n_I)
        print ("# recovered:", n_R)
    return n_S, n_I, n_R
    
def sim (G_0, max_steps, tau=COND_PROB_ILL):
    """
    Starting from a given initial state, `G_0`, this
    function simulates up to `max_steps` time steps of
    the S-I-R cellular automaton.
    
    It returns a tuple `(t, G_t)` containing the final
    time step `t <= max_steps` and simulation state
    `G_t`.
    """
    t, G_t = 0, G_0.copy ()
    (_, num_infected, _) = summarize (G_t, verbose=False)
    while (num_infected > 0) and (t < max_steps):
        t = t + 1
        G_t = step (G_t, tau)
        (_, num_infected, _) = summarize (G_t, verbose=False)
    return (t, G_t)

In [None]:
from ipywidgets import interact

def isim (m, n, max_steps=0, tau=COND_PROB_ILL, seed=0):
    np.random.seed (seed)
    G_0 = EMPTY * np.ones ((m+2, n+2), dtype=int)
    G_0[1:-1, 1:-1] = SUSCEPTIBLE
    i_mid = int ((m+2) / 2)
    j_mid = int ((n+2) / 2)
    G_0[i_mid, j_mid] = INFECTED
    (_, G_t) = sim (G_0, max_steps, tau)
    show_peeps (G_t)
    
interact (isim
          , m=(1, 50, 1)
          , n=(1, 50, 1)
          , max_steps=(0, 100, 1)
          , tau=(0.0, 1.0, 0.125)
          , seed=(0, 50, 1)
         )

Since the simulation is nondeterministic, we may need to run it many times to get a sense of the "average" behavior of this system.

Suppose we run the simulation many times. For the $k$-th simulation, let $(S^{(k)}_t, I^{(k)}_t, R^{(k)}_t)$ denote the number of susceptible, infected, and recovered individuals at time $t$. Let's define the average behavior by the average at each time step taken over all simulations. That is, the averages $(\bar{S_t}, \bar{I_t}, \bar{R_t})$ are

$$
\begin{eqnarray}
  \bar{S_t} & = & \frac{1}{M} \sum_{k=0}^{M-1} S^{(k)}_t \\
  \bar{I_t} & = & \frac{1}{M} \sum_{k=0}^{M-1} I^{(k)}_t \\
  \bar{R_t} & = & \frac{1}{M} \sum_{k=0}^{M-1} R^{(k)}_t,
\end{eqnarray}
$$

where $M$ is the number of simulations.
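
As a sanity check on the notation, here is a sketch with made-up counts. It takes the average at each time step to be the sum over simulations divided by $M$, which is the same quantity `np.mean (..., axis=1)` computes in the visualization cell later on. The array `I_toy` and its values are invented for illustration.

```python
import numpy as np

# Toy counts: rows are time steps t = 0..2, columns are M = 4 simulations.
I_toy = np.array([[1, 1, 1, 1],
                  [2, 0, 3, 1],
                  [0, 0, 4, 0]])

M = I_toy.shape[1]
I_avg_toy = I_toy.sum(axis=1) / M   # sum over simulations, divided by M

# np.mean along axis=1 computes the same per-time-step average.
assert np.allclose(I_avg_toy, np.mean(I_toy, axis=1))
print(I_avg_toy)
```

Storing time along rows and simulations along columns matches the $(T+1) \times M$ layout requested in the exercise below.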

**Exercise 7** (4 points). Conduct $M=1000$ simulations with the following parameters:

* The world is 25$\times$25 (excluding the boundary).
* Initial conditions: Each cell is infected independently with probability $\alpha = 0.1$ (i.e., 10%).
* The conditional probability of an individual becoming infected, given that he/she is exposed, is $\tau=0.5$.
* The maximum number of time steps per simulation should be capped at 30. (That is, the initial condition is taken to be $t=0$; beyond that, $t \leq T=30$.)

During each simulation, record $(S^{(k)}_t, I^{(k)}_t, R^{(k)}_t)$. Store these in three NumPy arrays, `S`, `I`, and `R`, each of size $(T+1) \times M$. Lastly, record the time step at which the infection ends in each simulation in an array `T_stop[:]` of length $M$.

In [None]:
# === Simulation parameters ===

N = 25 # World is N x N
ALPHA = 0.1 # Initial probability of being infected
TAU = 0.5 # Conditional probability of infection spreading
MAX_STEPS = 30 # I.e., $T$
NUM_SIMS = 1000 # I.e., $M$

# === Holds simulation results ===
S = np.zeros ((MAX_STEPS+1, NUM_SIMS))
I = np.zeros ((MAX_STEPS+1, NUM_SIMS))
R = np.zeros ((MAX_STEPS+1, NUM_SIMS))
T_stop = np.zeros (NUM_SIMS)

# === Define initial condition ===
G_0 = EMPTY * np.ones ((N+2, N+2), dtype=int)
G_0[1:-1, 1:-1] = np.random.choice ([SUSCEPTIBLE, INFECTED],
                                    size=(N, N),
                                    p=[1.0-ALPHA, ALPHA])

In [None]:
# === Simulation: Your code goes below ===

# The initial condition is provided in `G_0`.
# You should record your results in `S`, `I`, `R`
# as described in the instructions.

# YOUR CODE HERE
raise NotImplementedError()

In [None]:
# This code cell helps you visualize your results.

# Computes the averages, $(\bar{S_t}, \bar{I_t}, \bar{R_t})$.
S_avg = np.mean (S, axis=1)
I_avg = np.mean (I, axis=1)
R_avg = np.mean (R, axis=1)
t_stop_avg = np.mean (T_stop)

S_std = np.std (S, axis=1)
I_std = np.std (I, axis=1)
R_std = np.std (R, axis=1)
t_stop_std = np.std (T_stop)

T = np.arange (MAX_STEPS+1)
fig = plt.figure (figsize=(12, 6))

SCALE = 1e2 / (N**2)
plt.errorbar (T, S_avg*SCALE, yerr=S_std*SCALE, fmt='ys--')
plt.errorbar (T, I_avg*SCALE, yerr=I_std*SCALE, fmt='r*--')
plt.errorbar (T, R_avg*SCALE, yerr=R_std*SCALE, fmt='bo--')
plt.plot ([t_stop_avg, t_stop_avg], [0., 100.], 'k-')
plt.plot ([t_stop_avg-t_stop_std, t_stop_avg-t_stop_std], [0., 100.], 'k--')
plt.plot ([t_stop_avg+t_stop_std, t_stop_avg+t_stop_std], [0., 100.], 'k--')
plt.axis ([0, MAX_STEPS, 0.0, 100.0])
plt.legend (['S', 'I', 'R'], loc=0)
plt.title ("Spread of infection (% of population)")

# Sanity check
assert (np.abs ((S_avg + I_avg + R_avg)/(N**2) - 1.0) <= 1e-15).all ()