# Introduction and overview

In this course, we will build on the techniques learnt in numerical analysis in CMIII and make some of the notions there more rigorous. 

In addition to thinking about discretising in time (ODEs) we will also look at discretising differential equations in space (PDEs), and the interplay between them.

We'll start by recapitulating some of the terminology and results from CMIII, in particular stability and consistency of numerical methods for ODEs.

Some of this material is taken from [Jed Brown's](https://jedbrown.org) numerical PDEs course https://github.com/cucs-numpde/numpde, licensed under BSD 2-clause.

## Discretising an ODE

### Method of lines

When we eventually encounter spatial derivatives along with time derivatives, we will first discretise in space, obtaining a system of ODEs, and then discretise in time. This is known as the _method of lines_. For now, we will just assume that we have already discretised in space to obtain the ODE
$$
 \dot u = f(t, u).
$$
This is a _first order_ ODE, since there is only one time derivative.

#### Second (and higher-order) ODEs

An ODE with higher time derivatives, for example the second order problem
$$
\ddot u = f(t, u, \dot u)
$$
can always be converted into a system of first order problems by introducing auxiliary variables
$$
\begin{align}
u_0 &= u\\
u_1 &= \dot u
\end{align}
$$
so that we obtain
$$
\begin{bmatrix} \dot u_0 \\ \dot u_1 \end{bmatrix} = \begin{bmatrix} u_1 \\ f(t, u_0, u_1) \end{bmatrix}.
$$

We'll therefore focus on first-order systems.
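
As a minimal sketch of this reduction, the wrapper below turns a second-order right hand side into a first-order one. The helper name `as_first_order` and the harmonic oscillator $\ddot u = -u$ are illustrative choices made here, not part of the course code.

```python
import numpy

def as_first_order(f2):
    """Wrap a second-order right hand side f2(t, u, udot) as a first-order system.

    The state vector is y = [u_0, u_1] = [u, udot].
    """
    def rhs(t, y):
        u0, u1 = y
        # d/dt [u_0, u_1] = [u_1, f2(t, u_0, u_1)]
        return numpy.array([u1, f2(t, u0, u1)])
    return rhs

# Illustrative example: harmonic oscillator, \ddot u = -u
oscillator = as_first_order(lambda t, u, udot: -u)
print(oscillator(0.0, numpy.array([0.75, 0.0])))
```

The returned `rhs` has the same `(t, u)` calling convention as any other first-order right hand side, so it can be handed to a first-order time integrator unchanged.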

#### Uniqueness

We will call $u^*(t)$ a _solution_ to $\dot u = f(t, u)$, if, for all $t$

$$
\dot u^*(t) = f(t, u^*(t)).
$$

This solution usually contains a free parameter, so for uniqueness we also need to specify an _initial value_ $u(0) = u_0$. Such problems are therefore termed _initial value problems_ or IVPs.

### Discretising the $\dot u$ term

In CMIII, you mostly focussed on explicit (or forward) Euler to discretise the time derivative. This is a one-sided difference method, where we write:

$$
\dot u(t) \approx \frac{u(t+\Delta t) - u(t)}{\Delta t} = f(t, u(t)),
$$
and so

$$
u(t+\Delta t) = u(t) + \Delta t f(t, u(t)).
$$

Let's look at how this behaves for a sample problem

$$
\dot u = -k(u - \cos{t})
$$

For an initial value $u(0) = u_0$, this problem has exact solution

$$
u(t) = \frac{k}{1+k^2}(\sin{t} + k \cos{t}) + \left(u_0 - \frac{k^2}{1+k^2}\right) \exp{(-kt)}.
$$

Note that most real problems will not have such an easily computed solution. (Hint: to derive this one, use an [integrating factor](https://en.wikipedia.org/wiki/Integrating_factor) and integrate by parts twice to compute the resulting integral.)

Let's go ahead and integrate this equation numerically:

In [None]:
%matplotlib notebook
import numpy
from matplotlib import pyplot
pyplot.style.use("ggplot")

class f(object):
    def __init__(self, k):
        self.k = k
    
    def __call__(self, t, u):
        return -self.k * (u - numpy.cos(t))
    
    def __str__(self):
        return "f(k={})".format(self.k)

    def exact(self, t, u_0):
        k = self.k
        k2p1 = k/(1 + k**2)
        return (u_0 - k*k2p1)*numpy.exp(-k*t) + k2p1 * (numpy.sin(t) + k*numpy.cos(t))
    
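def ode_euler(f, u_0, T, dt=0.1):
    # Work on a floating-point copy so the in-place update below is safe
    u = numpy.array(u_0, dtype=float)
    t = 0
    thist = [t]
    uhist = [u.copy()]
    while t < T:
        # Shorten the final step so that we finish exactly at t = T
        dt = min(dt, T - t)
        u += dt * f(t, u)
        t += dt
        thist.append(t)
        uhist.append(u.copy())
    return numpy.asarray(thist), numpy.asarray(uhist)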

u_0 = numpy.array(0.2)

pyplot.figure()

for k in [2, 10]:
    rhs = f(k)
    thist, uhist = ode_euler(rhs, u_0, dt=.1, T=12)
    pyplot.plot(thist, uhist, "o", linestyle="solid", label=str(rhs)+' Forward Euler')
    pyplot.plot(thist, rhs.exact(thist, u_0), label=str(rhs)+' exact')
pyplot.legend(bbox_to_anchor=(0.5, 1), loc='center', ncol=2);

This looks pretty good in the eyeball norm.

#### Questions

1. What happens when you increase $k$ further?
2. What happens if you increase $\Delta t$?

### Implicit Euler

Rather than evaluating all the right hand side terms in our time discretisation _explicitly_, using already known values of $u$ at the beginning of the timestep, we can also evaluate them _implicitly_ using values at the end of the timestep. In that case, we end up with the discrete problem

$$
u(t + \Delta t) - \Delta t f(t + \Delta t, u(t + \Delta t)) = u(t).
$$

This is called _implicit_ (or backward) Euler. In the general case where $f$ is nonlinear, we must solve a nonlinear equation at this point. Let's skip over that for now and only consider the linear problem where we can write

$$
f(t, u) = Au
$$
for some matrix $A$.
Rearranging, we obtain
$$
(I - \Delta t A)u(t + \Delta t) = u(t)
$$
and so a single step now requires us to invert the matrix $(I - \Delta t A)$. This may be _significantly_ more expensive than just evaluating the right hand side (a matrix-vector product). So what does this buy us?

#### Exact solution of linear problems

The solution to the problem
$$
\dot u = A u
$$
can be written using the [matrix exponential](https://en.wikipedia.org/wiki/Matrix_exponential) as
$$
u(t) = \exp{(A t)}u(0)
$$
where the exponential is formally defined using the Taylor series
$$
\exp{A} = \sum_{n=0}^{\infty} \frac{A^{n}}{n!},
$$
and there are many (both good and bad) [ways to compute it](https://doi.org/10.1137/S00361445024180).

If we can efficiently evaluate the matrix exponential, then we can directly compute the solution to our linear ODE at any time $t$.
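
To make the series definition concrete, here is a small sketch that sums the first few terms of the Taylor series and compares the result with `scipy.linalg.expm`. The helper `expm_taylor`, the number of terms and the $2 \times 2$ test matrix are choices made for this illustration. Truncating the series like this is only reasonable when $\|A\|$ is small, and it is not how robust library routines work.

```python
import numpy
from scipy.linalg import expm

def expm_taylor(A, terms=20):
    """Approximate exp(A) by the first `terms` terms of its Taylor series."""
    result = numpy.zeros_like(A, dtype=float)
    term = numpy.eye(len(A))           # A^0 / 0!
    for n in range(terms):
        result += term
        term = term @ A / (n + 1)      # A^(n+1) / (n+1)!
    return result

# Illustrative test matrix and time
A = numpy.array([[0.0, 1.0],
                 [-1.0, 0.0]])
t = 0.5
approx = expm_taylor(A * t)
reference = expm(A * t)
print("largest difference:", numpy.max(numpy.abs(approx - reference)))
```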

Let's now look at the behaviour of explicit and implicit Euler on a test problem with purely oscillatory solutions. Choosing

$$
A = \begin{bmatrix} 0 & 1\\ -1 & 0 \end{bmatrix}
$$
and
$$
u_0 = \begin{bmatrix} 0.75 \\ 0 \end{bmatrix}
$$
we expect
$$
u(t) = \begin{bmatrix} 0.75 \cos t\\-0.75 \sin t\end{bmatrix}.
$$

Rather than laboriously working this out each time, we can compute an "exact" solution by explicitly computing the matrix exponential. [scipy](https://scipy.org) provides a builtin function to do this, `scipy.linalg.expm`.

##### Question

If scipy didn't provide this function, can you think of a fast way of computing the matrix exponential of a diagonalisable matrix

$$
A = X \Lambda X^{-1}?
$$

Hint: substitute this decomposition into the power series and note that $X^{-1} X = \mathbb{1}$.

In [None]:
from scipy.linalg import expm

class linear(object):
    def __init__(self, A):
        self.A = A.copy()
    
    def __call__(self, t, u):
        return self.A @ u
    
    def exact(self, t, u_0):
        t = numpy.array(t, ndmin=1)
        return [numpy.real_if_close(expm(self.A*s) @ u_0) for s in t]

test = linear(numpy.array([[0, 1],
                           [-1, 0]]))
u_0 = numpy.array([.75, 0])
thist, uhist = ode_euler(test, u_0, dt=.1, T=15)
pyplot.figure()
pyplot.plot(thist, uhist, '.', label='Euler')
pyplot.plot(thist, test.exact(thist, u_0), label='exact')
pyplot.legend(loc='upper right')
pyplot.title('Forward Euler');

Now let's look at backward Euler.

In [None]:
def ode_beuler(A, u_0, T, dt=0.1):
    u = numpy.array(u_0)
    t = 0
    thist = [t]
    uhist = [u_0]
    while t < T:
        dt = min(dt, T - t)
        # u <- (I - dt A)^{-1} u
        u = numpy.linalg.solve(numpy.eye(len(A)) - dt*A, u)
        t += dt
        thist.append(t)
        uhist.append(u.copy())
    return numpy.asarray(thist), numpy.asarray(uhist)

In [None]:
test = linear(numpy.array([[0, 1],
                           [-1, 0]]))
u_0 = numpy.array([.75, 0])
thist, uhist = ode_beuler(test.A, u_0, dt=.1, T=15)
pyplot.figure()
pyplot.plot(thist, uhist, '.', label='Backward Euler')
pyplot.plot(thist, test.exact(thist, u_0), label='exact')
pyplot.legend(loc='upper right')
pyplot.title('Backward Euler');

### The (implicit) midpoint method

Already for this simple problem, neither forward nor backward Euler produces particularly accurate results unless we choose tiny timesteps. These are not the only choices we can make. We could instead evaluate the right hand side $f$ at the midpoint of the timestep

$$
u(t + \Delta t) = u(t) + \Delta t f\left(t + \Delta t/2, \frac{u(t) + u(t + \Delta t)}{2}\right).
$$

For linear problems, $f(t, u) = Au$, this reduces to

$$
\left(I - \frac{\Delta t}{2} A\right)u(t + \Delta t) = \left(I + \frac{\Delta t}{2} A\right)u(t),
$$
again requiring a solve.

In [None]:
def ode_midpoint(A, u_0, T, dt=0.1):
    u = numpy.array(u_0)
    t = 0
    thist = [t]
    uhist = [u_0]
    I = numpy.eye(len(A))
    while t < T:
        dt = min(dt, T - t)
        # u <- (I - dt/2 A)^{-1} (I + dt/2 A) u
        u = numpy.linalg.solve(I - dt/2*A, (I + dt/2*A) @ u)
        t += dt
        thist.append(t)
        uhist.append(u.copy())
    return numpy.asarray(thist), numpy.asarray(uhist)

In [None]:
test = linear(numpy.array([[0, 1],
                           [-1, 0]]))
u_0 = numpy.array([.75, 0])
thist, uhist = ode_midpoint(test.A, u_0, dt=.1, T=15)
pyplot.figure()
pyplot.plot(thist, uhist, '.', label='Midpoint')
pyplot.plot(thist, test.exact(thist, u_0), label='exact')
pyplot.legend(loc='upper right')
pyplot.title('Implicit Midpoint');


This looks much more accurate!

## Linear stability

To assess the stability of time integration schemes, we consider their behaviour when applied to the linear test equation (called the _Dahlquist test equation_)

$$
\dot u = \lambda u
$$
where $\lambda \in \mathbb{C}$ is some complex number. When taking a step of length $\Delta t$, this equation has exact solution
$$
u(\Delta t) = u_0 \exp{(\lambda \Delta t)} = u_0 \exp{(\operatorname{Re} \lambda \Delta t)} \left(\cos(\operatorname{Im} \lambda \Delta t) + i\sin(\operatorname{Im}\lambda \Delta t)\right).
$$

That is, there is an oscillatory component (governed by the imaginary part of $\lambda$) bounded by an exponential envelope (governed by the real part of $\lambda$).

Now consider applying a particular time discretisation scheme to the same test equation.

### Stability regions

#### Explicit Euler

Applying a single step of explicit Euler to the test equation gives

$$
u(\Delta t) = R(\Delta t \lambda) u_0,
$$
where
$$
R(z) = 1 + z.
$$

Repeated application of the timestepping scheme results in

$$
u_m := u(m\Delta t) = R(z)^m u_0.
$$

Writing $z = \Delta t \lambda$, $u_m$ therefore remains bounded as $m \to \infty$ only if $z$ lies in the set

$$
S = \{z \in \mathbb{C} \colon |R(z)| \le 1\}.
$$

#### Implicit Euler

We can perform the same calculation here to obtain

$$
R(z) = \frac{1}{1 - z}
$$

#### Implicit midpoint

This time we obtain
$$
R(z) = \frac{1 + z/2}{1 - z/2}.
$$
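
As a worked sketch of these two calculations, apply each scheme to $\dot u = \lambda u$ and write $z = \Delta t \lambda$ and $u_1 = u(\Delta t)$ (shorthand introduced here). Collecting the $u_1$ terms on the left gives

$$
\begin{align}
\text{implicit Euler:}\quad & u_1 = u_0 + z u_1 &&\Rightarrow\quad u_1 = \frac{1}{1 - z} u_0,\\
\text{implicit midpoint:}\quad & u_1 = u_0 + z \frac{u_0 + u_1}{2} &&\Rightarrow\quad u_1 = \frac{1 + z/2}{1 - z/2} u_0.
\end{align}
$$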

#### Stability plots

**Definition**: $R(z)$ is called the *stability function*, and the set $S$ is the *stability domain*.

We can graphically display the stability region by plotting $|R(z)|$ in the complex plane.

In [None]:
def plot_stability(x, y, R, label):
    pyplot.figure()
    C = pyplot.contourf(x, y, numpy.abs(R), numpy.linspace(0, 1, 10), cmap=pyplot.cm.coolwarm)
    
    pyplot.colorbar(C, ticks=numpy.linspace(0, 1, 10))
    pyplot.contour(x, y, numpy.abs(R), numpy.linspace(0, 1, 4), colors='k')
    pyplot.title(label)

In [None]:
x = numpy.linspace(-3, 3)
x, y = numpy.meshgrid(x, x)
z = x + 1j*y

Rs = [("Forward Euler", 1 + z),
      ("Backward Euler", 1/(1 - z)),
      ("Implicit Midpoint", (1 + z/2)/(1 - z/2))]

for label, Rz in Rs:
    plot_stability(x, y, Rz, label)

### Observations

While the physical equation is stable and remains bounded whenever $\operatorname{Re} \lambda \le 0$, the same is not true for all the methods. Explicit Euler is only stable in a disc of radius 1 centred at $-1$. Implicit Euler is stable _even_ in regions where the original equation is not, and tends to damp oscillations too aggressively. The stability region of implicit midpoint is exactly the left half plane.
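
These observations can be tied back to the oscillatory test problem. The matrix $\begin{bmatrix} 0 & 1\\ -1 & 0 \end{bmatrix}$ has eigenvalues $\pm i$, so with $\Delta t = 0.1$ we have $z = \pm 0.1 i$ on the imaginary axis. The sketch below (variable names chosen here for illustration) evaluates how much each method scales the amplitude after 150 steps, where the exact solution has factor 1.

```python
# Stability functions of the three methods discussed above
stability_functions = {
    "Forward Euler": lambda z: 1 + z,
    "Backward Euler": lambda z: 1 / (1 - z),
    "Implicit midpoint": lambda z: (1 + z/2) / (1 - z/2),
}

z = 0.1j      # dt * lambda with dt = 0.1 and lambda = i
steps = 150   # corresponds to a final time of T = 15

# |R(z)|^steps is the factor by which the amplitude changes over the run
growth = {name: abs(R(z))**steps for name, R in stability_functions.items()}
for name, factor in growth.items():
    print("{:18s} amplitude factor after {} steps: {:.3f}".format(name, steps, factor))
```

A factor above 1 means the numerical solution spirals outwards, a factor below 1 means it is damped, and a factor of 1 means the amplitude is preserved.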

### The $\theta$ method

All of these schemes are particular examples of the (implicit) $\theta$ method

$$
u(t + \Delta t) = u(t) + \Delta t f(t + \theta \Delta t, \theta u(t + \Delta t) + (1 - \theta)u(t)).
$$

With $\theta = 0$ we recover explicit Euler, $\theta = 1$ produces implicit Euler, and $\theta = 1/2$ produces implicit midpoint.
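
For a linear problem with forcing, $f(t, u) = Au + g(t)$ (the symbol $g$ is introduced here for the forcing term), a step of the $\theta$ method can be rearranged into a single linear solve. As a worked sketch, substituting into the definition and collecting the $u(t + \Delta t)$ terms gives

$$
\begin{align}
u(t + \Delta t) &= u(t) + \Delta t \left[A\left(\theta u(t + \Delta t) + (1 - \theta) u(t)\right) + g(t + \theta \Delta t)\right]\\
\Rightarrow\quad (I - \theta \Delta t A)\, u(t + \Delta t) &= (I + (1 - \theta) \Delta t A)\, u(t) + \Delta t\, g(t + \theta \Delta t).
\end{align}
$$

For the sample problem $\dot u = -k(u - \cos t)$ this corresponds to $A = -k$ and $g(t) = k \cos t$.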

In [None]:
def ode_theta_linear(A, u0, rhsfunc, T=1, dt=0.1, theta=.5):
    u = u0.copy()
    t = 0
    hist = [(t,u0)]
    try:
        I = numpy.eye(len(A))
    except TypeError:
        I = numpy.eye(1)
    while t < T:
        dt = min(dt, T - t)
        rhs = (I + (1-theta)*dt*A) @ u + dt*rhsfunc(t+theta*dt)
        u = numpy.linalg.solve(I - theta*dt*A, rhs)
        t += dt
        hist.append((t, u.copy()))
    return hist

In [None]:
test = f(k=5000)

In [None]:
theta = 0.5
u0 = numpy.array([.2])
hist = ode_theta_linear(-test.k, u0,
                        lambda t: test.k*numpy.cos(t),
                        dt=.1, T=6, theta=theta)

In [None]:
pyplot.figure()
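# Unpack the (t, u) pairs by hand rather than relying on NumPy to convert
# a list of mixed scalar/array pairs, which recent versions may reject
times = numpy.array([t for t, _ in hist])
values = numpy.array([numpy.ravel(u)[0] for _, u in hist])
pyplot.plot(times, values, 'o', linestyle="solid", label=r"$\theta = {}$".format(theta))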
tt = numpy.linspace(0, 6, 200)
pyplot.plot(tt, test.exact(tt, u0), label="Exact")
pyplot.legend();

### Observations and questions

With $\theta = 1/2$ the method converges, but oscillates about the exact solution (eventually matching it if we run for long enough).

What happens for $\theta < 1/2$? What about $\theta > 1/2$?


### Definitions

#### A-stability

A method is _A-stable_ if the stability domain

$$
S = \{z \in \mathbb{C} \colon |R(z)| \le 1\}
$$

contains the _entire_ left half plane

$$
\operatorname{Re} z \le 0.
$$

This means that we can take arbitrarily large timesteps ($\Delta t \to \infty$) without the method becoming unstable (diverging) for any problem that is physically stable.

#### L-stability

A-stability might not be enough for accurate solutions to a problem, especially if it is oscillatory (as we see above). A stronger condition is therefore sometimes used.

A method is _L-stable_ if, in addition to being A-stable, we also have

$$
\lim_{z \to \infty} R(z) = 0.
$$
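
To see what this limit means in practice, the sketch below evaluates the stability functions of implicit Euler and implicit midpoint for increasingly negative real $z$, as arises for a rapidly decaying mode with a large timestep. The function names are chosen here for illustration.

```python
def R_backward_euler(z):
    return 1 / (1 - z)

def R_midpoint(z):
    return (1 + z/2) / (1 - z/2)

# z = dt * lambda for increasingly stiff decay (lambda real and very negative)
for z in [-1e1, -1e3, -1e5]:
    print("z = {:9.0e}  backward Euler R = {: .2e}  midpoint R = {: .6f}".format(
        z, R_backward_euler(z), R_midpoint(z)))
```

Both values stay inside the unit disc, but only the implicit Euler value tends to zero. The midpoint value approaches $-1$, so a stiff mode is barely damped and flips sign every step, which is consistent with the oscillation seen for $k = 5000$.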

#### Questions

1. For what values of $\theta$ is the $\theta$ method A-stable?
2. For what values is it L-stable?

## Accuracy of the methods

The formal accuracy of the method can be measured in terms of how fast the numerical solution approaches the exact solution in some norm.

We say a method is of order $p$ if, for all sufficiently small $\Delta t$, the numerical solution $u$ and the exact solution $u^*$ satisfy

$$
\| u - u^* \| \le C (\Delta t)^{p}
$$

for some constant $C$ independent of $\Delta t$.
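
The order can be estimated numerically by halving the timestep and watching how the error shrinks. The sketch below does this for the scalar test equation $\dot u = \lambda u$ with $u(0) = 1$, where $n$ steps of a one-step method simply give $R(\Delta t \lambda)^n$. The choices $\lambda = -1$, $T = 1$, the step counts and the helper `final_error` are assumptions made for this illustration.

```python
import numpy

def final_error(R, lam, T, n):
    """Error at time T after n equal steps of a method with stability function R."""
    dt = T / n
    u = R(dt * lam)**n                 # numerical solution with u(0) = 1
    return abs(u - numpy.exp(lam * T))

methods = {
    "Forward Euler": lambda z: 1 + z,
    "Backward Euler": lambda z: 1 / (1 - z),
    "Implicit midpoint": lambda z: (1 + z/2) / (1 - z/2),
}

lam, T = -1.0, 1.0
orders = {}
for name, R in methods.items():
    errors = [final_error(R, lam, T, n) for n in (10, 20, 40, 80)]
    # Halving dt divides the error by about 2**p for a method of order p
    orders[name] = numpy.log2(errors[-2] / errors[-1])
    print("{:18s} observed order ~ {:.2f}".format(name, orders[name]))
```

The observed orders should come out close to 1 for both Euler methods and close to 2 for implicit midpoint.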