# MATH 405/607 

# Numerical Methods for Differential Equations

[[Instructor: Christoph Ortner]](http://www.math.ubc.ca/~ortner/)  [[CANVAS]](https://canvas.ubc.ca/courses/55324)


## IVPs Part 2: Stability

* Stiff problems
* Dissipative behaviour 
* Implicit Euler and Crank-Nicolson
* Outlook: Implicit Runge-Kutta Methods

In [None]:
include("../math405.jl")

function euler(f, u0, h, T)
    t = 0.0:h:T 
    U = zeros(length(u0), length(t)); U[:, 1] = u0 
    for n = 2:length(t) 
        U[:,n] = U[:,n-1] + h * f(t[n-1], U[:,n-1])
    end 
    return U, t
end

function improved_euler(f, u0, h, T)
    t = 0.0:h:T 
    U = zeros(length(u0), length(t)); U[:, 1] = u0 
    for n = 2:length(t) 
        F = f(t[n-1], U[:,n-1])
        U[:,n] = U[:,n-1] + 0.5 * h * (F + f(t[n], U[:,n-1] + h * F))
    end 
    return U, t
end

### Literature

* D. Mayers and E. Suli, An Introduction to Numerical Analysis, CUP; §12.4, 12.7, 12.11, 12.12
* N. Trefethen, https://people.maths.ox.ac.uk/trefethen/1all.pdf, §1.8 
* I also quite liked [these lecture notes](https://webspace.science.uu.nl/~frank011/Classes/numwisk/ch10.pdf)


A completely artificial but illuminating example [[Tref, Ex.1.8.1]](https://people.maths.ox.ac.uk/trefethen/1all.pdf):
$$
    \dot{u}(t) = - 100(u(t) - \cos(t)) - \sin(t), \qquad u(0) = 1.
$$

The (unique) solution is $u(t) = \cos(t)$, which is smooth (all derivatives are bounded by 1), so we might expect the truncation errors of the Euler and Improved Euler methods to satisfy $|T_n| \lesssim h$ and $|T_n| \lesssim h^2$, respectively. So we can probably afford to take quite moderate steps...

In [None]:
f = (t, u) -> [ - 100 * (u[1] - cos(t)) - sin(t) ]
h = 0.03; u0 = [1.0]; Tf = 1.0
Ue, te = euler(f, u0, h, Tf)
Ui, ti = improved_euler(f, u0, h, Tf)
plot(cos, 0, Tf, lw=3, label = "exact", size = (500, 300), yaxis = ([-3.0, 3.0],))
plot!(te, Ue[1,:], lw=3, label = "Euler")
plt_003 = plot!(ti, Ui[1,:], lw=3, label = "Improved Euler")

In [None]:
# change the time-step from 0.03 to 0.02 !!!!
h = 0.02;
Ue, te = euler(f, u0, h, Tf)
Ui, ti = improved_euler(f, u0, h, Tf)
plot(cos, 0, Tf, lw=3, label = "exact", size = (500, 300), yaxis = ([-3.0, 3.0],))
plot!(te, Ue[1,:], lw=3, label = "Euler")
plt_002 = plot!(ti, Ui[1,:], lw=3, label = "Improved Euler")
plot(plt_003, plt_002, size = (600, 300))

What is going on? Is this not a contradiction to our error analysis? (QUESTION: what have I missed?)

To get an idea of what is going on we look not just at one trajectory but *all* trajectories of the ODE: We observe that trajectories "nearby" the solution $u(t) = \cos(t)$ are technically smooth but vary rapidly $(\dot{u} = O(100))$, and small perturbations from the solution (such as a numerical error!) will move our numerical solution onto a different trajectory.
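
For this particular ODE the neighbouring trajectories can be written down explicitly. One can check by substitution that the solution with initial value $u(0) = u_0$ is
$$
    u(t) = \cos(t) + (u_0 - 1) e^{-100 t},
    \qquad \text{since} \qquad
    \dot{u}(t) = -\sin(t) - 100 (u_0 - 1) e^{-100 t} = -100 (u(t) - \cos(t)) - \sin(t).
$$
So any offset from $\cos(t)$ decays on the time-scale $1/100$, whereas $\cos(t)$ itself varies on a time-scale of order 1.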

In [None]:
h = 0.003  # to be on the safe side...
plt = plot(cos, 0, Tf, c=2, yaxis = ([0.5, 1.5],), xaxis = ([-0.01, 0.15],), lw=2, label = "different u0", size = (500, 300))
for u0 = ( [1.1], [1.5], [10.0], [100.0], [2000.0], 
            [0.9], [0.5], [-10.0], [-100.0], [-2000.0])
    Ui, ti = improved_euler(f, u0, h, Tf)
    plot!(ti, Ui[1,:], lw=2, c=2, label = "")
end
plot!(cos, 0, Tf, lw=3, c=1, label = "u0 = 1")
plt

The general take-away message is that in analysing numerical schemes it is often crucial to understand how the underlying problem changes under small perturbations. The case we are considering here is that of a *stiff* problem:

**Loose Definition:** A problem is called *stiff* if the exact solution is smooth / varies slowly, but there are nearby solutions that vary rapidly. In other words the family of solutions varies on different time-scales.

The **very loud hint** in the example above is the factor 100: 
$$
    \dot{u} = - 100 ( u - \cos(t) ) - \sin(t)
$$

For those interested in dynamical systems: a common occurrence is when there is a smooth attractor with nearby trajectories converging rapidly towards the attractor. This is precisely the example shown above. But there are other cases.

I highly recommend reading Nick Trefethen's discussion of stiff problems in [Sec. 1.8 of his lecture notes](https://people.maths.ox.ac.uk/trefethen/1all.pdf).

## Stability Regions

The simplest setting we can think of is an autonomous equation 
$$
    \dot{u} = f(u)
$$
with equilibrium $f(u_*) = 0$ and $f : \mathbb{R} \to \mathbb{R}$ (i.e. a scalar ODE). Then $u(t) = u_*$ is obviously a solution. 

We assume moreover this is a stable equilibrium i.e. all nearby trajectories converge to $u_*$. This can be characterised by $f'(u_*) < 0$. (The case $f'(u_*) = 0$ is a bit trickier and we will ignore this!) (I will record a lecture to review all this material!)

Formally, we can linearise the ODE around $u_*$. Let $v(t) = u(t) - u_*$; then we obtain 
$$ 
    \dot{v}(t) = f'(u_*) v(t) + O(v^2)
$$
We now drop the $O(v^2)$ term, rename $u = v$ and obtain our model problem 
$$ 
    \dot{u}(t) = \lambda u(t)
$$
where $\lambda < 0$.

In fact it is customary to consider $\lambda \in \mathbb{C}, {\rm Re} \lambda < 0$;
$$
    \dot{u}(t) = \lambda u(t)
$$
Then $u(t) = e^{\lambda t} u(0) \to 0$ with rate $e^{{\rm Re}\lambda t}$.

**The challenge:** under which conditions does the numerical solution also converge to zero? 

Aside from giving us an intuition about how the method will behave for stiff problems this is also interesting in the context of qualitative long-term behaviour of solutions.

**Example:** Euler Method 

$$
U_{n+1} = U_n + h \lambda U_n = (1+h\lambda) U_n.
$$ 

Note how $h\lambda$ appears together; this is *generic* and we will therefore write $z = h \lambda$ which gives 

$$
U_{n+1} = (1+z) U_n.
$$

We have that $|U_n| \to 0$ if and only if $|1+z| < 1$.
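
A quick numerical check of this criterion. This is only a minimal sketch: the helper name `euler_model` and the two sample values of $z$ are illustrative choices, not part of the course code.

```julia
# |U_n| after n Euler steps on the model problem, starting from U_0 = 1:
# U_n = (1 + z)^n.  `euler_model` is a hypothetical helper for this illustration.
euler_model(z, nsteps) = abs((1 + z)^nsteps)

z_stable   = -1.5 + 0.0im   # |1 + z| = 0.5 < 1
z_unstable = -3.0 + 0.0im   # |1 + z| = 2.0 > 1

@show euler_model(z_stable, 20)     # decays towards 0
@show euler_model(z_unstable, 20)   # grows geometrically
```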

In [None]:
Plots.pyplot() # (the following function doesn't work with the GR backend...)
stab_euler = MATH405.levelset([-3, 1], [-2,2], (x,y)->((x+1)^2 + y^2), 1.0, title = "Stability Euler Method")

Same analysis for the Improved Euler Method 
$$
    U_{n+1} = U_n + \frac{h}{2} \Big( \lambda U_n + \lambda (U_n + h \lambda U_n) \Big) = 
            \Big(1 + z + \frac{z^2}{2} \Big) U_n
$$

In [None]:
mult = z -> 1 + z + z^2/2
stab_ie = MATH405.levelset( [-3,1], [-2,2], (x, y) -> abs( mult(x + im * y) ), 1.0, c=2, title = "Stability Improved Euler")
plot( stab_euler, stab_ie, size = (600, 300) )

Analogue of our toy problem: 
$$
\dot{u} = - 100 u 
$$ 
Here $z = -100 h$, so the condition $|1+z| < 1$ becomes $|1 - 100 h| < 1$. Squaring gives
$$
    100^2 h^2 - 200 h + 1 < 1 
$$
and hence
$$
     h < 2 /100 = 0.02
$$

### Remark on Systems

If $f : \mathbb{R}^N \to \mathbb{R}^N$ with $f(u_*) = 0$ and we linearise around $u_*$ then we get a linear ODE system 
$$ 
    \dot{v}(t) = A v(t),
$$
where $A = \partial f(u_*)$. 

* The stability can then be understood in terms of the spectrum $\sigma(A)$ : if all eigenvalues have negative real part then $v(t) \to 0$ exponentially fast. If $A$ is diagonalisable, then we can transform the system to $\dot{w}_i(t) = \lambda_i w_i(t)$, $i = 1, \dots, N$ and each of these equations is one of the $\dot{u}(t) = \lambda u(t)$ equations we analyzed above. (This is why we assumed $\lambda \in \mathbb{C}$.) Specifically, we now require that *all* scaled eigenvalues $h\lambda_i$ belong to the stability region of the numerical method! I will go into this in more detail in a recorded lecture.
* The stiffness of $\dot{u} = f(u)$ near $u_*$ can also be understood similarly; if $\sigma(A)$ contains eigenvalues of significantly different magnitude then this indicates different modes of the system evolving at different speeds, which is a strong indicator of stiffness. See also [[Tref, Sec. 1.8]](https://people.maths.ox.ac.uk/trefethen/1all.pdf) for a very nice discussion of this connection.
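
To illustrate the requirement that all $h \lambda_i$ lie in the stability region, here is a small sketch for the Euler method. The matrix `A` is a made-up example with one slow and one fast mode, and `euler_stable` is a hypothetical helper.

```julia
using LinearAlgebra

# Made-up 2×2 example: one slowly and one rapidly decaying mode.
A = [ -1.0     0.5 ;
       0.0  -100.0 ]
λs = eigvals(A)            # A is triangular, so the eigenvalues are -100 and -1

# Euler is stable for \dot{v} = A v only if every z = h * λ_i satisfies |1 + z| < 1.
euler_stable(h) = all(abs(1 + h * λ) < 1 for λ in λs)

@show euler_stable(0.03)   # the fast mode gives |1 - 3| = 2
@show euler_stable(0.015)  # both modes give |1 + z| < 1
```

The admissible step size is dictated by the fastest mode, even though it contributes almost nothing to the solution after a short transient.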

### Stability Regions for RK Methods 


* Every (explicit) Runge-Kutta method applied to $\dot{u} = \lambda u$ takes the form 

$$ 
    U_{n+1} = p(\lambda h) U_n
$$

where $p$ is a polynomial. 

* The *stability region* is therefore given by the sublevel set 

$$ 
    \{ z : |p(z)| < 1 \}.
$$

* Because $|p(z)| \to \infty$ as $|z| \to \infty$ (for any non-constant polynomial $p$), it follows that all explicit RK methods have bounded stability regions. 
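
The polynomial $p$ can also be read off numerically: one step with $h = 1$, $\lambda = z$ and $U_n = 1$ returns $p(z)$. The sketch below does this for the classical four-stage RK4 method and compares with the degree-4 Taylor polynomial of $e^z$; `rk4_step` and `p_numeric` are illustrative names, not part of `math405.jl`.

```julia
# One step of the classical RK4 method for \dot{u} = f(t, u) (scalar, possibly complex).
function rk4_step(f, t, u, h)
    k1 = h * f(t, u)
    k2 = h * f(t + h/2, u + k1/2)
    k3 = h * f(t + h/2, u + k2/2)
    k4 = h * f(t + h, u + k3)
    return u + (k1 + 2k2 + 2k3 + k4) / 6
end

# With h = 1 and λ = z, one step from U_n = 1 returns p(z).
p_numeric(z) = rk4_step((t, u) -> z * u, 0.0, one(z), 1.0)
p_rk4(z)     = 1 + z + z^2/2 + z^3/6 + z^4/24

z = -1.3 + 0.7im
@show p_numeric(z) ≈ p_rk4(z)
```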

In [None]:
# E.g. for the RK4 method (exercise!)
mult_e   = z -> 1+z
mult_ie  = z -> 1 + z + z^2/2
mult_rk4 = z -> 1 + z + z^2/2 + z^3/6 + z^4/24 
plt = MATH405.levelset([-3.5,1.5], [-3.5, 3.5], [ (x, y) -> abs(mult(x + im * y)) 
                                                  for mult in [mult_rk4, mult_ie, mult_e] ],
                       1.0, xy=true, fill=true, label = ("RK4", "Improved Euler", "Euler"), 
                       legend = :outertopright, size = (380, 300))

Challenge: construct RK methods with 
* maximal stability regions?
* maximise range of stability along the negative real axis? 

This is a fun game; cf RKC, ROCK, DUMKA methods; see e.g. 
* [1] https://link.springer.com/article/10.1007%2Fs002110100292
* [2] https://epubs.siam.org/doi/abs/10.1137/S1064827500379549

This gives you methods with very interesting stability regions: 

<img src="figs/stabrock2.png" width="500">

### A-Stability

Recall the model problem 
$$
\dot{u} = \lambda u
$$

* But wouldn't it be nice to have numerical integrators for which the entire left half-plane $\{ z : {\rm Re} z < 0\}$ is in the stability region?
* We know from the previous slides that this cannot be an (explicit) Runge-Kutta method. What else can we try?


Remember that in the derivation of the IE method we had an intermediate result, the Crank-Nicolson scheme: 

$$ 
  U_{n+1} = U_n + \frac{h}{2} \big( f(t_n, U_n) + f(t_{n+1}, U_{n+1}) \big).
$$

Because each time-step requires the solution of a (in general nonlinear) system, we call this an *implicit* integrator. The simplest example of its kind is the *Implicit Euler Method:*

$$
    U_{n+1} = U_n + h f(t_{n+1}, U_{n+1})
$$

Let us apply the implicit Euler method to our $\dot{u} = \lambda u$ model problem ... 

$$ \begin{aligned} 
    & U_{n+1} = U_n + h \lambda U_{n+1} = U_n + z U_{n+1} \\ 
    \Leftrightarrow \qquad & (1-z) U_{n+1} = U_n  \\ 
    \Leftrightarrow \qquad & U_{n+1} = (1 - z)^{-1} U_n
\end{aligned}$$
I.e. the region of stability is $\{ z : |1-z|^{-1} < 1 \}$, the exterior of the closed disk $\{ z : |1-z| \leq 1\}$. 
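
To see what this means in practice, the sketch below compares the amplification factors of the (explicit) Euler and implicit Euler methods at a few sample points in the left half-plane. The sample values of $z$ and the function names are arbitrary illustrative choices.

```julia
# Amplification factors on the model problem, z = h * λ.
amp_explicit(z) = abs(1 + z)         # Euler
amp_implicit(z) = abs(1 / (1 - z))   # implicit Euler

for z in (-0.5, -3.0, -100.0, -1.0 + 50.0im)
    println("z = $z:  Euler ", amp_explicit(z),
            ",  implicit Euler ", amp_implicit(z))
end
```

For every sample with ${\rm Re}\, z < 0$ the implicit factor stays below 1, however large $|z|$ is, while the explicit factor exceeds 1 as soon as $z$ leaves the disk $|1+z| < 1$.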

In [None]:
r_impe = z -> 1/(abs(1-z)+1e-15)
MATH405.levelset([-2.5,2.5], [-2.5,2.5], (x, y) -> abs( r_impe(x + im * y) ), 1.0)

**Definition:** We call a numerical integrator *A-stable* if its stability region includes the entire left half-plane $\{ z : {\rm Re} z < 0 \}$. 

**Crank-Nicolson method:** exercise 

## Outlook: Implicit Runge-Kutta Methods

Recall the *explicit* RK methods: 
$$\begin{aligned}
    k_1 &= h f(t_n, U_n) \\ 
    k_2 &= h f(t_n + c_2 h, U_n + a_{21} k_1) \\ 
    k_3 &= h f(t_n + c_3 h, U_n + a_{31} k_1 + a_{32} k_2) \\ 
    &\vdots \\ 
    k_{s} &= h f(t_n + c_{s} h, U_n + a_{s,1} k_1 + \dots + a_{s,s-1} k_{s-1}) \\ 
    U_{n+1} &= U_n + b_1 k_1 + b_2 k_2 + \dots + b_s k_s
\end{aligned}$$

described by *strictly lower-triangular* Butcher tables 
$$
\begin{array}{c|ccccc}
    0      &       & & & &  \\
    c_2    & a_{21}   &&& &   \\
    c_3    & a_{31} & a_{32} & & &   \\
    \vdots &   \vdots     &   & \ddots & &  \\
    c_{s}    & a_{s,1} & a_{s,2} & \cdots & a_{s,s-1} &   \\ \hline
           &  b_1 & b_2 & \cdots & b_{s-1} & b_s 
\end{array}
$$

By contrast, *implicit* RK formulas take the form 
$$\begin{aligned}
    k_j &= h f(t_n + c_j h, U_n + a_{j1} k_1 + \dots + a_{js} k_s), \qquad j = 1, \dots, s, \\ 
    U_{n+1} &= U_n + b_1 k_1 + b_2 k_2 + \dots + b_s k_s
\end{aligned}$$

described by *full* Butcher tables 
$$
\begin{array}{c|ccccc}
    c_1    & a_{11}  & a_{12} & \cdots & a_{1,s}   \\
    c_2    & a_{21}  & a_{22} & \cdots & a_{2,s} \\
    \vdots & \vdots  &        &        & \vdots  \\
    c_{s}  & a_{s,1} & a_{s,2}& \cdots & a_{s,s} \\ \hline
           &  b_1 & b_2 & \cdots & b_s 
\end{array}
$$

The theory is much cleaner ... 

![trefirk](figs/trefethen_irk_thm.png)

so why aren't we using IRK methods all the time?

**Main Caveat:** One step of an IRK method requires the solution of a nonlinear system for $k_1, \dots, k_s$.
    
But many compromises and tricks have been devised to alleviate this additional cost, and we will encounter some of them later on in the course. 

For example, what if the ODE is of the form 
$$ 
    \dot{u} = Au + g(u) 
$$
where the term $Au$ is stiff but $g(u)$ is not, then we could discretise by 
$$ 
    U_{n+1} = U_n + h \big( A U_{n+1}  + g(U_n) \big)
$$

Schemes of this kind are called *semi-implicit*.
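
Because $U_{n+1}$ enters only through the linear term, each step reduces to the linear system $(I - hA) U_{n+1} = U_n + h g(U_n)$. A minimal sketch, written in the style of the `euler` function above: the name `semi_implicit_euler` and the test problem are made up for illustration.

```julia
using LinearAlgebra

# Semi-implicit Euler for \dot{u} = A u + g(u): implicit in A u, explicit in g(u).
# Each step solves the linear system (I - h A) U_{n+1} = U_n + h g(U_n).
function semi_implicit_euler(A, g, u0, h, T)
    t = 0.0:h:T
    U = zeros(length(u0), length(t)); U[:, 1] = u0
    M = lu(I - h * A)          # factorise once, reuse in every step
    for n = 2:length(t)
        U[:, n] = M \ (U[:, n-1] + h * g(U[:, n-1]))
    end
    return U, t
end

# Made-up test problem: stiff linear part, mild nonlinearity.
A = [-100.0 0.0; 0.0 -1.0]
g = u -> [sin(u[2]), 0.0]
U, t = semi_implicit_euler(A, g, [1.0, 1.0], 0.1, 5.0)
```

Here $h = 0.1$ is well outside the Euler stability limit for the eigenvalue $-100$, yet the iteration remains bounded because the stiff part is treated implicitly.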

Or maybe we could devise RK methods with lower-triangular tables, 
$$
\begin{array}{c|ccccc}
    c_1    & a_{11}  &  &  &    \\
    c_2    & a_{21}  & a_{22} &  &  \\
    \vdots & \vdots  &        &  \ddots      &   \\
    c_{s}  & a_{s,1} & a_{s,2}& \cdots & a_{s,s} \\ \hline
           &  b_1 & b_2 & \cdots & b_s 
\end{array}
$$
Then we would solve a sequence of smaller nonlinear problems instead of one big problem.

These are called *Diagonally Implicit Runge-Kutta (DIRK) formulae*.
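
Written out in the notation of the implicit RK formulas above, the stage equations for a lower-triangular table read
$$\begin{aligned}
    k_1 &= h f(t_n + c_1 h, U_n + a_{11} k_1) \\ 
    k_2 &= h f(t_n + c_2 h, U_n + a_{21} k_1 + a_{22} k_2) \\ 
    &\vdots \\ 
    k_{s} &= h f(t_n + c_{s} h, U_n + a_{s,1} k_1 + \dots + a_{s,s-1} k_{s-1} + a_{s,s} k_s) \\ 
    U_{n+1} &= U_n + b_1 k_1 + b_2 k_2 + \dots + b_s k_s
\end{aligned}$$
Stage $j$ involves only $k_1, \dots, k_j$, so once $k_1, \dots, k_{j-1}$ are known it is a single equation for the unknown $k_j$, of the same size as the ODE system itself.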

## Experiments

Implementing an implicit scheme requires some extra work; we will do this in the next workshop. Instead we will first revisit 
our opening example and then explore the Julia `OrdinaryDiffEq.jl` suite.

$$
    \dot{u}(t) = - 100(u(t) - \cos(t)) - \sin(t), \qquad u(0) = 1.
$$
What makes our life easy here is that $f$ is scalar and linear in $u$. 

In [None]:
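function cn_example(u0, h, Tf)
    f = (t, u) -> - 100 * (u - cos(t)) - sin(t) 
    df = - 100    # ∂f/∂u, constant because f is linear in u
    t = 0.0:h:Tf 
    U = zeros(1, length(t)); U[1, 1] = u0[1]
    for n = 2:length(t) 
        f0 = f(t[n-1], U[1,n-1])
        f1 = f(t[n], U[1,n-1])
        # f linear in u  =>  f(t[n], U[n]) = f1 + df * dU, where dU = U[n] - U[n-1];
        # the Crank-Nicolson step dU = h/2 * (f0 + f1 + df * dU) can be solved for dU
        dU = (h/2 * f0 + h/2 * f1) / (1 - h/2 * df)
        U[1, n] = U[1,n-1] + dU
    end 
    return U, t
end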

In [None]:
h = 0.03; u0 = [1.0]; Tf = 1.0
Ui, ti = improved_euler(f, u0, h, Tf)
Ue, te = euler(f, u0, h, Tf)
U_cn, t_cn = cn_example(u0, h, Tf)
plot(cos, 0, Tf, lw=3, label = "exact", size = (500, 300), yaxis = ([-3.0, 3.0],))
plot!(te, Ue[1,:], lw=3, label = "Euler")
plot!(ti, Ui[1,:], lw=3, label = "Improved Euler")
plot!(t_cn, U_cn[1,:], lw=3, label = "Crank-Nicolson")

In [None]:
# perfect long-time behaviour even with large time-steps!
h = 0.1; u0 = [1.0]; Tf = 10.0
U_cn, t_cn = cn_example(u0, h, Tf)
plot(cos, 0, Tf, lw=3, label = "exact", size = (500, 300), yaxis = ([-3.0, 3.0],))
plot!(t_cn, U_cn[1,:], lw=3, label = "Crank-Nicolson")

Let's try some more advanced solvers from [DifferentialEquations.jl](https://diffeq.sciml.ai/dev/solvers/ode_solve/). Amongst many others, they recommend `Tsit5()` for non-stiff problems and `Rodas5()` for stiff problems. 

In [None]:
using OrdinaryDiffEq
f = (u, p, t) -> - 100 * (u - cos(t)) - sin(t) 
prob = ODEProblem(f, 1.0, (0.0, 10.0))

println("TSIT5 Integrator (non-stiff)")
@time sol_tsit5 = solve(prob, Tsit5(), abstol=1e-3)
@show length(sol_tsit5.t);

println("RODAS5 Integrator (stiff)")
@time sol_rodas5 = solve(prob, Rodas5(), abstol=1e-3)
@show length(sol_rodas5.t)

### Pushing the limits...  

The [ROBER test problem](https://www.unige.ch/~hairer/testset/testset.html); H.H. Robertson in 1966 [Rob66]; see also [this nice summary](https://www.radford.edu/~thompson/vodef90web/problems/demosnodislin/Single/DemoRobertson/demorobertson.pdf)

$$\begin{aligned}
    \dot{u}_1 &= - 0.04 u_1 + 10^4 u_2 u_3 \\ 
    \dot{u}_2 &= 0.04 u_1 - 3\times 10^7 u_2^2 - 10^4 u_2 u_3 \\ 
    \dot{u}_3 &= 3 \times 10^7 u_2^2
\end{aligned}$$

In [None]:
# this implementation rescales u[2] -> sqrt(3e7) u[2]; only to simplify plotting...
rober = let k₁ = 0.04, k₂ = 3e7, k₃ = 1e4
    sk2 = sqrt(k₂)
    (u, p, t) -> SA[      -k₁*u[1] + (k₃/sk2)*u[2]*u[3], 
                     sk2*( k₁*u[1] - (k₃/sk2)*u[2]*u[3] - u[2]^2 ), 
                                                          u[2]^2 ]
end 

prob = ODEProblem(rober, [1.0, 0.0, 0.0], (0.0, 1e5))
@time rober_tsit5 = solve(prob, Tsit5(); maxiters=10_000); @show length(rober_tsit5.t)
@time rober_rodas5 = solve(prob, Rodas5());  @show length(rober_rodas5.t)
plot(rober_rodas5, tspan=(1e-6,1e5), xscale = :log10, lw=2, yaxis = [-0.2, 1.2], size=(500, 250))

In [None]:
plot( plot(rober_tsit5, tspan=(1e-6, rober_tsit5.t[end]), xscale = :log10, lw=1, yaxis = [-0.2, 1.2]),
    plot(rober_tsit5, tspan=(1.0, rober_tsit5.t[end]), xscale = :log10, lw=1, yaxis = [0.1115, 0.1125],  xaxis = [5.27, 5.31]), 
      size=(600, 300))

### Summary

* Stability regions give restrictions on time-steps, dependent on local **variations** in the driving force $f$.
* For a more general and comprehensive theory we need to look at perturbation theory for ODEs and diagonalisation of the linearised equations, but this leads essentially to the same characterisations of stability that we have obtained in this lecture.
* Modern ODE solvers (integrators) are finely tuned and sophisticated codes that can solve IVPs very efficiently; there are multiple classes of integrators, here we distinguished primarily explicit and implicit solvers which are more or less useful for non-stiff / stiff systems.
* You should not normally write your own ODE solvers but leverage existing software! But understanding the theory behind the different methods will help you choose the right integrators.
* A key missing ingredient in our theory is adaptivity - see the workshops!

#### Why does all this not contradict our theory from last week? 
Best to think that the "naive" theory from last week is only useful when you have moderate Lipschitz constants. In the stiff case, a much finer analysis is required to understand why ODE solvers are accurate. Today we explained this only *qualitatively* but a complete and rigorous analysis is a bit more subtle and goes beyond  the scope of this course.