# Solving differential equations numerically
___

Differential equations are a standard mathematical model used to understand dynamical systems. In general, these models, like most models, aren't necessarily analytically tractable. This notebook addresses how we can study differential equations by integrating them numerically and plotting the results. In particular, we study the SIR model.

# Import modules

In [1]:
import matplotlib.pyplot as plt
import numpy as np
import scipy.integrate as sp

plt.style.use('custom.mplstyle')
%config InlineBackend.figure_format = 'retina'

# The Euler method
___

A simple, first-order algorithm for solving ordinary differential equations given some initial value is the [Euler method](https://en.wikipedia.org/wiki/Euler_method).

Perhaps the simplest example of a differential equation is that for exponential growth.

$$
\frac{dN}{dt} = rN(t)
$$

where $r$ is some constant growth rate and $N(t)$ represents the size of a population at time $t$. We can solve this analytically by separating the variables and integrating on paper, which gives $N(t) = N(0) e^{rt}$. Numerically, we can solve for $N(t)$ using a first-order Taylor expansion. Using the first-order ordinary differential equation above, we have

$$
\begin{align*}
N(t + \Delta t) &= N(t) + \frac{dN}{dt}(t) \Delta t + O(\Delta t^2)\\
&\approx N(t) + r N(t) \Delta t
\end{align*}
$$

Suppose $N(0) = 2$ and $r = 1$. Letting $\Delta t = 0.1$, $N$ changes by

$$
\Delta N(0) = rN(0) \Delta t = (1)(2)(0.1) = 0.2
$$

where $\Delta N(0)$ means the change in $N$ over the time step starting at $t=0$, so that $N(0.1) \approx 2.2$. We can continue evolving $N(t)$ indefinitely. Though we've shown we can perform Euler's method by hand, if we're to evolve $N(t)$ for long times or use a small $\Delta t$ to get a better approximation, it's better to use a computer.
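
The hand calculation above can be repeated for a few more steps in plain Python. This is a minimal sketch using the same values as the text ($N(0) = 2$, $r = 1$, $\Delta t = 0.1$); the variable names are illustrative only.

```python
# Repeat the hand-computed Euler step for dN/dt = r N a few times.
r = 1.0     # growth rate
dt = 0.1    # time step
n = 2.0     # N(0)

for step in range(3):
    dn = r * n * dt          # change in N over one step: Delta N = r N Delta t
    print(f"t = {step * dt:.1f}: N = {n:.4f}, Delta N = {dn:.4f}")
    n = n + dn               # Euler update: N(t + dt) = N(t) + Delta N
```

Each pass reuses the most recent value of $N$, which is all that "evolving $N(t)$" means in practice.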

Notably, 

$$
N(t + \Delta t) = N(t) + \frac{dN(t)}{dt} \Delta t + O(\Delta t^2)
$$

This means that the error committed in a single step scales with $\Delta t^2$. Reaching a fixed time $T$ takes $T/\Delta t$ steps, so the accumulated error in $N(T)$ scales with $\Delta t$. $\Delta t$ is something we're free to choose, so it's best to choose a $\Delta t$ that is small. In particular, it must be small enough that at most one event occurs within an interval between $t$ and $t + \Delta t$. Take the growth of bacterial cells as an example. If we are integrating exponential growth for this process, we shouldn't take $\Delta t$ to be larger than the time it takes for a cell to divide. This is similar in spirit to the [Courant-Friedrichs-Lewy condition](https://en.wikipedia.org/wiki/Courant%E2%80%93Friedrichs%E2%80%93Lewy_condition), which is stated formally for explicit time-stepping schemes applied to partial differential equations. On the other hand, choosing $\Delta t$ too small (e.g. $10^{-16}$) can result in unnecessarily long computation times and, because of floating-point round-off, numerical inaccuracies. All in all, Euler's method is simple and powerful, but we must be mindful of our choice of $\Delta t$.
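
The $O(\Delta t^2)$ term is the error of one step. Over a fixed interval $T$ roughly $T/\Delta t$ steps are taken, so the accumulated error is expected to shrink only linearly with $\Delta t$. The sketch below checks this for exponential growth; the helper `euler_final_value` and the parameter values are assumptions chosen for illustration, not part of the exercises.

```python
import math

def euler_final_value(n0, r, T, dt):
    """Integrate dN/dt = r N with Euler steps and return the value at time T."""
    n = n0
    for _ in range(round(T / dt)):
        n += r * n * dt
    return n

n0, r, T = 2.0, 1.0, 1.0            # illustrative values
exact = n0 * math.exp(r * T)        # analytic solution N(T) = N(0) exp(r T)

errors = {}
for dt in (0.1, 0.05, 0.025):
    errors[dt] = abs(euler_final_value(n0, r, T, dt) - exact)
    print(f"dt = {dt:<6} error at T = {errors[dt]:.5f}")

# Halving dt roughly halves the error at fixed T (first-order accuracy overall),
# even though the error of a single step shrinks like dt**2.
```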

# Exponential growth
___

Let's write a function that outputs $\vec{t} = \{0, \Delta t, 2 \Delta t, \ldots \}$ and $\vec{N} = \{N(0), N(\Delta t), N(2\Delta t), \ldots\}$ and takes as input the initial population size $N(0)$, the growth rate $r$, the total amount of time $T$ over which we want to evolve $N(t)$, and $\Delta t$ the size of our time step.

In [2]:
def euler_exponential(n0, r, T, dt):
    pass
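
One possible way to fill in the stub is sketched below. It assumes NumPy arrays are an acceptable return type and that $T$ is (close to) a whole number of steps; the example call uses illustrative parameter values.

```python
import numpy as np

def euler_exponential(n0, r, T, dt):
    """Euler integration of dN/dt = r N from t = 0 to (approximately) t = T."""
    n_steps = int(round(T / dt))            # number of Euler updates
    t = dt * np.arange(n_steps + 1)         # 0, dt, 2 dt, ..., n_steps * dt
    N = np.empty(n_steps + 1)
    N[0] = n0
    for k in range(n_steps):
        N[k + 1] = N[k] + r * N[k] * dt     # N(t + dt) ≈ N(t) + r N(t) dt
    return t, N

# Example call with illustrative values
t, N = euler_exponential(n0=10, r=0.1, T=100, dt=0.01)
```

Preallocating `N` avoids repeatedly growing a Python list, which matters once $T / \Delta t$ becomes large.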

Let $N(0) = 10$, $r = 0.1$ min$^{-1}$, and $T = 100$ min. Try out the Euler method using $\Delta t \in \{10^{-2}, 10^{-1}, 1, 10\}$ min.

Using a linear scale on one axes object and a semi-log scale on another, plot the results for each $\Delta t$ as a scatter plot, with a different color for each $\Delta t$. Set the size of the markers in the scatter plot to 5: `s=5`. Plot the analytic solution as a solid black line. Display a legend that labels each plotted object.

In [3]:
pass
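
A minimal sketch of the plot described above. To keep the block self-contained it uses a stand-in `euler_exponential` built on the closed form of repeated Euler updates, $N_k = N(0)(1 + r\Delta t)^k$; that stand-in, the figure size, and the axis labels are assumptions.

```python
import matplotlib.pyplot as plt
import numpy as np

def euler_exponential(n0, r, T, dt):
    """Stand-in Euler integrator for dN/dt = r N (hypothetical exercise solution)."""
    k = np.arange(int(round(T / dt)) + 1)
    return dt * k, n0 * (1 + r * dt) ** k   # closed form of repeated Euler updates

n0, r, T = 10, 0.1, 100
fig, (ax_lin, ax_log) = plt.subplots(1, 2, figsize=(10, 4))

for dt in (1e-2, 1e-1, 1, 10):
    t, N = euler_exponential(n0, r, T, dt)
    for ax in (ax_lin, ax_log):
        # Each scatter call takes the next color from the axes' default color cycle.
        ax.scatter(t, N, s=5, label=rf"$\Delta t = {dt}$")

t_exact = np.linspace(0, T, 500)
for ax in (ax_lin, ax_log):
    ax.plot(t_exact, n0 * np.exp(r * t_exact), color="k", label="analytic")
    ax.set_xlabel("time (min)")
    ax.set_ylabel("$N(t)$")
    ax.legend()

ax_log.set_yscale("log")    # semi-log: logarithmic y axis, linear x axis
fig.tight_layout()
```

On the semi-log axes the analytic solution is a straight line with slope set by $r$, so a wrong effective growth rate shows up as a line with a different slope.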

Now let $N(0) = 10$, $r = 0.01$ min$^{-1}$, and $T = 100$ min. Again, try out the Euler method using $\Delta t \in \{10^{-2}, 10^{-1}, 1, 10\}$ min.

Using a linear scale on one axes object and a semi-log scale on another, plot the results for each $\Delta t$ as a scatter plot, with a different color for each $\Delta t$. Set the size of the markers in the scatter plot to 5: `s=5`. Plot the analytic solution as a solid black line. Display a legend that labels each plotted object.

In [4]:
pass

By choosing a smaller $r$, $N(t)$ grows more slowly. We see that a larger $\Delta t$ can still be accurate for this choice of $r$. This illustrates the step-size condition described above. For the previous parameters with $\Delta t = 10$, $r \Delta t = (0.1)(10) = 1$ (1 event occurs per interval), whereas here $r \Delta t = (0.01)(10) = 0.1$ (0.1 events occur per interval). How good is Euler's method for different $\Delta t$ when $r=0.01$ and $T=1000$?

We see that choosing a smaller $\Delta t$ gives a more accurate solution overall. Additionally, the characteristic time $r^{-1}$ sets how large $\Delta t$ can be while remaining accurate. If $\Delta t > r^{-1}$, our approximation misses growth events and errors in our numerical approximation accumulate.
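
One way to see why the product $r \Delta t$ controls the accuracy: for exponential growth the Euler recursion can be solved in closed form and compared with the exact solution. This is a supplementary worked step, with $N_k$ denoting the Euler estimate after $k$ steps and $t_k = k \Delta t$ (notation introduced here).

$$
\begin{align*}
N_k &= (1 + r \Delta t)^k N(0) \\
N(t_k) &= e^{r t_k} N(0) = \left(e^{r \Delta t}\right)^k N(0) \\
\frac{N_k}{N(t_k)} &= \left(\frac{1 + r \Delta t}{e^{r \Delta t}}\right)^k \approx \exp\left(-\tfrac{1}{2} r^2 \Delta t \, t_k\right) \quad \text{for } r \Delta t \ll 1
\end{align*}
$$

The last step uses $\ln(1 + x) \approx x - x^2/2$. Euler's method therefore underestimates exponential growth, and the relative error grows with both $r \Delta t$ and the elapsed time in units of $r^{-1}$. For $r = 0.01$, $\Delta t = 10$, and $t_k = 100$ the estimate is roughly 5% below the exact value.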

# Susceptible-Infected-Recovered (SIR) models
___

[SIR models](https://en.wikipedia.org/wiki/Compartmental_models_in_epidemiology#The_SIR_model_without_vital_dynamics) describe the spread of a pathogen in a population. The SIR model breaks down the overall population into three subsets: the susceptible population, which consists of folks who haven't been infected yet and can be; the infected population, which can recover from the infection or infect the susceptible population; and the recovered population, which by assumption can't be reinfected. The system of differential equations for this model is:

$$
\begin{align*}
\frac{dS(t)}{dt} &= -\beta S(t) I(t) \\
\frac{dI(t)}{dt} &= \beta S(t) I(t) - \nu I(t) \\
\frac{dR(t)}{dt} &= \nu I(t)
\end{align*}
$$

where $\beta > 0$ is the rate at which susceptible people become infected from an encounter with folks who are infected, and $\nu > 0$ is the rate at which infectious people recover. (What are the units of $\beta$ and $\nu$?) Here, we are letting $S$, $I$, and $R$ be the **fractions of the population** which are susceptible, infectious, or recovered, respectively: $S(t) + I(t) + R(t) = 1$. Additionally, by modeling fractions of the overall population, we don't need to worry about the size of the total population, making our lives a bit easier.

The ratio of $\beta$ and $\nu$ constitutes what is called $R_0$, the basic reproduction number:

$$
R_0 = \frac{\beta}{\nu}
$$

$R_0$ describes how many people an infectious person infects on average. (What are the units of $R_0$?)

Note that $dS(t)/dt$ and $dI(t)/dt$ are coupled in a nonlinear way due to the presence of $\beta S(t) I(t)$. Because of this relationship and $S(t) + I(t) + R(t) = 1$, we consider $dR(t)/dt$ to be superfluous. Systems of differential equations coupled nonlinearly rarely have analytical solutions. We can play the game of looking at how the populations evolve in limiting cases which make the solution tractable, but these solutions are accurate only in those regimes. Instead, we'll solve the equations irrespective of any regime using numerical integration.
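
The constraint $S(t) + I(t) + R(t) = 1$ is consistent with the equations of motion: summing the three right-hand sides given earlier shows that the total does not change in time, so it stays equal to 1 if it starts at 1.

$$
\frac{d}{dt}\left[S(t) + I(t) + R(t)\right] = -\beta S(t) I(t) + \left[\beta S(t) I(t) - \nu I(t)\right] + \nu I(t) = 0
$$

This is why $R(t)$ can be recovered from $S(t)$ and $I(t)$ at any time without integrating its equation.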

# Euler method and the SIR model
___

Because $R(t) = 1 - S(t) - I(t)$, we need only solve for the evolution of $S(t)$ and $I(t)$. As before,

$$
\begin{align*}
S(t + \Delta t) &= S(t) + \frac{dS(t)}{dt} \Delta t + O(\Delta t^2) \\
I(t + \Delta t) &= I(t) + \frac{dI(t)}{dt} \Delta t + O(\Delta t^2)
\end{align*}
$$

Applying the approximation and plugging in the values for the derivatives, we have

$$
\begin{align*}
S(t + \Delta t) &\approx S(t) - \beta S(t) I(t) \Delta t \\
I(t + \Delta t) &\approx I(t) + \beta S(t) I(t) \Delta t - \nu I(t) \Delta t
\end{align*}
$$


Write a function which takes in $S(0)$, $I(0)$, $R_0$, $\nu$, $T$ (the total time over which we want to track the evolution of the populations), and $\Delta t$ and outputs $\vec{t} = \{0, \Delta t, 2 \Delta t, \ldots \}$, $\vec{S} = \{S(0), S(\Delta t), S(2 \Delta t), \ldots\}$, and $\vec{I} = \{I(0), I(\Delta t), I(2 \Delta t), \ldots\}$.

In [5]:
def euler_sir(s0, i0, r0, nu, T, dt):
    pass
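
One possible way to fill in the stub is sketched below. It assumes NumPy arrays as the return type and converts $R_0$ to $\beta$ through $R_0 = \beta / \nu$; the example call uses the parameter values given in this section.

```python
import numpy as np

def euler_sir(s0, i0, r0, nu, T, dt):
    """Euler integration of the SIR model; r0 is the basic reproduction number R_0."""
    beta = r0 * nu                          # from R_0 = beta / nu
    n_steps = int(round(T / dt))
    t = dt * np.arange(n_steps + 1)
    S = np.empty(n_steps + 1)
    I = np.empty(n_steps + 1)
    S[0], I[0] = s0, i0
    for k in range(n_steps):
        new_infections = beta * S[k] * I[k] * dt      # leaves S, enters I
        S[k + 1] = S[k] - new_infections
        I[k + 1] = I[k] + new_infections - nu * I[k] * dt
    return t, S, I

i0 = 1e-6
t, S, I = euler_sir(s0=1 - i0, i0=i0, r0=5, nu=1 / 7, T=100, dt=0.01)
R = 1 - S - I                               # recovered fraction from S + I + R = 1
```

Note that both updates use the values at step `k`; updating `S` first and then reusing the new `S` in the `I` update would be a different scheme from the one derived above.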

Run your function, obtaining trajectories for $S$ and $I$ using the following parameters:
- $I(0) = 10^{-6}$
- $S(0) = 1 - I(0)$
- $\nu = \frac{1}{7}$ (recovery after a week)
- $R_0 = 5$ (average number of infections per individual)
- $T = 100$ days
- $\Delta t = 0.01$

In [6]:
pass

Plot $S(t)$, $I(t)$, and $R(t)$ vs. time. Display a legend with each line's label.

In [7]:
pass

# Better numerical integration and the SIR model
___

We stated above that Euler's method employs a first-order Taylor series approximation to solve differential equations. The SIR model is seemingly simple, but it is much more complicated than the differential equation for exponential growth. Let's now use `scipy`, which has functions to integrate differential equations more accurately. Specifically, let's use [scipy.integrate.odeint](https://docs.scipy.org/doc/scipy/reference/generated/scipy.integrate.odeint.html). (We don't concern ourselves with the implementation here. For those who are curious, this function uses [multistep methods](https://en.wikipedia.org/wiki/Linear_multistep_method#Families_of_multistep_methods) with higher-order formulas and adaptive step sizes, whereas Euler's method is a first-order, single-step method with a fixed step. So this `scipy` function should be more accurate.)

The documentation includes an illuminating example, which we follow. We need to define a function which outputs an array-like object containing the derivatives.

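Write a function which computes $dS(t)/dt$ and $dI(t)/dt$ taking as input $[S, I]$, the current values of $S$ and $I$; the current time (the argument named `T` in the stub below, which `odeint` passes in as a single number and which the SIR equations don't use explicitly); and $\beta$ and $\nu$. This function should return $[dS/dt, dI/dt]$.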

In [8]:
def sir_ode(current_vals, T, beta, nu):
    pass

Run the scipy ode integrator using your function, obtaining trajectories for $S$ and $I$ using the following parameters:
- $I(0) = 10^{-6}$
- $S(0) = 1 - I(0)$
- $\nu = \frac{1}{7}$ (recovery after a week)
- $R_0 = 5$ (average number of infections per individual)
- $T$ = `np.linspace(0, 100, 101)` (specifies that $S(t)$ and $I(t)$ are evaluated at the integer days $t \in \{0, 1, \ldots, 100\}$)

In [9]:
pass
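
A minimal sketch of the derivative function and the `odeint` call, using the parameters listed above. The second argument is written as lowercase `t` here to stress that `odeint` passes the current time to the function; the names `t_eval` and `solution` are illustrative.

```python
import numpy as np
from scipy.integrate import odeint

def sir_ode(current_vals, t, beta, nu):
    """Right-hand side of the SIR equations; t is unused but required by odeint."""
    S, I = current_vals
    dS_dt = -beta * S * I
    dI_dt = beta * S * I - nu * I
    return [dS_dt, dI_dt]

i0 = 1e-6
nu = 1 / 7
beta = 5 * nu                        # R_0 = beta / nu = 5
t_eval = np.linspace(0, 100, 101)    # integer days 0, 1, ..., 100

# args supplies the extra parameters that follow (y, t) in sir_ode's signature.
solution = odeint(sir_ode, [1 - i0, i0], t_eval, args=(beta, nu))
S, I = solution.T                    # columns of solution are S(t) and I(t)
R = 1 - S - I
```

The returned array has one row per requested time point and one column per state variable, so transposing it unpacks directly into $S$ and $I$.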

Plot the solutions from your implementation of the Euler method as a scatter, and plot the solutions from `scipy` as solid lines. When plotting the results from the Euler method, plot every 100th point since we had $\Delta t = 0.01$. Set the size of the markers of the scatter to be 5: `s=5`. Label all plotted objects. Because we now have six plotted objects, the legend is large, so we move it outside the axes: `plt.legend(loc='center right', bbox_to_anchor=(1.3, 0.5))`.

In [10]:
pass

We see that, at least for this choice of parameters, Euler's method produces an approximation which matches the results given by the `scipy` integrator (at least visually). Going forward, use the function from `scipy` to understand the trajectories of $S(t)$ and $I(t)$ since it's much quicker and more accurate in principle.
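
To go beyond a visual comparison, the two solutions can be compared numerically on the shared daily grid. The sketch below is self-contained and uses a hypothetical helper `sir_rhs` for both integrators; it reports the largest absolute difference in $S$ and $I$ rather than asserting a particular value.

```python
import numpy as np
from scipy.integrate import odeint

def sir_rhs(y, t, beta, nu):
    """SIR derivatives [dS/dt, dI/dt]; t is unused but required by odeint."""
    S, I = y
    return [-beta * S * I, beta * S * I - nu * I]

nu, i0, dt = 1 / 7, 1e-6, 0.01
beta = 5 * nu                               # R_0 = 5
days = np.linspace(0, 100, 101)

# Euler integration, keeping only the values that fall on whole days.
steps_per_day = round(1 / dt)
y = np.array([1 - i0, i0])
euler = [y.copy()]
for k in range(100 * steps_per_day):
    y = y + dt * np.array(sir_rhs(y, 0.0, beta, nu))
    if (k + 1) % steps_per_day == 0:
        euler.append(y.copy())
euler = np.array(euler)

reference = odeint(sir_rhs, [1 - i0, i0], days, args=(beta, nu))

max_abs_diff = np.abs(euler - reference).max(axis=0)
print(f"max |S_euler - S_odeint| = {max_abs_diff[0]:.2e}")
print(f"max |I_euler - I_odeint| = {max_abs_diff[1]:.2e}")
```

Rerunning this with a larger or smaller `dt` shows how quickly the agreement degrades or improves for these parameters.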