<table>
 <tr align=left><td><img align=left src="https://i.creativecommons.org/l/by/4.0/88x31.png">
 <td>Text provided under a Creative Commons Attribution license, CC-BY. All code is made available under the FSF-approved MIT license. (c) Kyle T. Mandli</td>
</table>

In [3]:
%matplotlib inline
import numpy
import matplotlib.pyplot as plt

# Boundary Value Problems:  Discretization

## Model Problems

The simplest boundary value problem (BVP) we will run into is the one-dimensional version of Poisson's equation
$$
    u''(x) = f(x).
$$

Usually we solve this equation on a finite interval with either Dirichlet or Neumann boundary conditions.  Because there are two derivatives in the equation we need two boundary conditions to solve the PDE (really an ODE in this case) uniquely.  To start let us consider the following basic problem
$$\begin{aligned}
    u''(x) = f(x) ~~~ \Omega = [a, b] \\
    u(a) = \alpha ~~~ u(b) = \beta.
\end{aligned}$$

BVPs of this sort are often the result of looking at the steady-state form of a time dependent PDE.  For instance, if we were considering the steady-state solution to the heat equation
$$
    u_t(x,t) = \kappa u_{xx}(x,t) + \Psi(x,t) ~~~~ \Omega = [0, T] \times [a, b] \\
    u(x, 0) = u^0(x) ~~~ u(a, t) = \alpha(t) ~~~ u(b, t) = \beta(t)
$$
we would solve the equation where $u_t = 0$ and arrive at
$$
    u''(x) = - \Psi / \kappa,
$$
a version of Poisson's equation above.

In higher spatial dimensions the second derivative turns into a Laplacian.  Notation varies for this but all these are equivalent statements:
$$\begin{aligned}
    \nabla^2 u(\vec{x}) &= f(\vec{x}) \\
    \Delta u(\vec{x}) &= f(\vec{x}) \\
    \sum^N_{i=1} u_{x_i x_i} &= f(\vec{x}).
\end{aligned}$$

## One-Dimensional Discretization

As a first approach to solving the one-dimensional Poisson's equation let's break up the domain into `m` interior points, often called a *mesh* or *grid*.  Our goal is to approximate the unknown function $u(x)$ at the mesh points $x_i$.  First we can relate the number of mesh points `m` to the distance between them with
$$
    \Delta x = \frac{b - a}{m + 1},
$$
which reduces to $1 / (m + 1)$ on the unit interval.
The mesh points $x_i$ can be written as
$$
    x_i = a + i \Delta x.
$$

We can let $\Delta x$ vary and many of the formulas above have only minor modifications but we will leave that for homework.  We will also adopt the notation
$$
    U_i \approx u(x_i)
$$
so that the $U_i$ are the approximate solution at the grid points, and we retain the lower-case $u$ to denote the true solution.

To simplify our discussion let's consider the ODE
$$
    u''(x) = f(x) ~~~ \Omega = [0, 1] \\
    u(0) = \alpha ~~~ u(1) = \beta.
$$

Applying the 2nd order, centered difference approximation for the second derivative we have the equation
$$
    D^2 U_i = \frac{1}{\Delta x^2} (U_{i+1} - 2 U_i + U_{i-1})
$$
so that we end up with the approximate algebraic expression at every grid point of
$$
    \frac{1}{\Delta x^2} (U_{i+1} - 2 U_i + U_{i-1}) = f(x_i)  ~~~ i = 1, 2, 3, \ldots, m.
$$

Note at this point that these algebraic equations are coupled as each $U_i$ depends on its neighbors.  This means we can write these as a system of coupled equations
$$
    A U = F.
$$

#### Write the system of equations
$$
    \frac{1}{\Delta x^2} (U_{i+1} - 2 U_i + U_{i-1}) = f(x_i)  ~~~ i = 1, 2, 3, \ldots, m.
$$

Note the boundary conditions!

$$
    \frac{1}{\Delta x^2} \begin{bmatrix}
    -2 &  1 &    &    &    \\
     1 & -2 &  1 &    &    \\
       &  1 & -2 &  1 &    \\
       &    &  1 & -2 &  1 \\
       &    &    &  1 & -2 \\
    \end{bmatrix} \begin{bmatrix}
        U_1 \\ U_2 \\ U_3 \\ U_4 \\ U_5
    \end{bmatrix} = 
    \begin{bmatrix}
        f(x_1) - \frac{\alpha}{\Delta x^2} \\ f(x_2) \\ f(x_3) \\ f(x_4) \\ f(x_5) - \frac{\beta}{\Delta x^2} \\
    \end{bmatrix}.
$$
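
To see where the boundary terms on the right-hand side come from, it may help to write out the first and last equations explicitly.  This is a sketch for the $m = 5$ system shown above, assuming the indexing $U_0 = \alpha$ and $U_{m+1} = \beta$ implied by $x_i = a + i \Delta x$:
$$\begin{aligned}
    i = 1: &~~~ \frac{1}{\Delta x^2} (U_2 - 2 U_1 + \alpha) = f(x_1) ~~\Rightarrow~~ \frac{1}{\Delta x^2} (U_2 - 2 U_1) = f(x_1) - \frac{\alpha}{\Delta x^2} \\
    i = m: &~~~ \frac{1}{\Delta x^2} (\beta - 2 U_m + U_{m-1}) = f(x_m) ~~\Rightarrow~~ \frac{1}{\Delta x^2} (-2 U_m + U_{m-1}) = f(x_m) - \frac{\beta}{\Delta x^2}.
\end{aligned}$$
The known boundary values move to the right-hand side, which is why only the first and last entries of $F$ are modified.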

#### Example

We want to solve the BVP
$$
    u_{xx} = e^x, ~~~~ x \in [0, 1] ~~~~ \text{with} ~~~~ u(0) = 0.0, \text{ and } u(1) = 3
$$
via the construction of a linear system of equations.

$$\begin{aligned}
    u_{xx} &= e^x \\
    u_x &= A + e^x \\
    u &= Ax + B + e^x\\
    u(0) &= B + 1 = 0 \Rightarrow B = -1 \\
    u(1) &= A - 1 + e^{1} = 3 \Rightarrow A = 4 - e\\ 
    ~\\
    u(x) &= (4 - e) x - 1 + e^x
\end{aligned}$$

In [4]:
# Problem setup
a = 0.0
b = 1.0
u_a = 0.0
u_b = 3.0
f = lambda x: numpy.exp(x)
u_true = lambda x: (4.0 - numpy.exp(1.0)) * x - 1.0 + numpy.exp(x)

# Discretization
m = 10
x_bc = numpy.linspace(a, b, m + 2)
x = x_bc[1:-1]
delta_x = (b - a) / (m + 1)

# Construct matrix A
A = numpy.zeros((m, m))
diagonal = numpy.ones(m) / delta_x**2
A += numpy.diag(diagonal * -2.0, 0)
A += numpy.diag(diagonal[:-1], 1)
A += numpy.diag(diagonal[:-1], -1)

# Construct RHS (named F to match A U = F and to avoid overwriting the endpoint b)
F = f(x)
F[0] -= u_a / delta_x**2
F[-1] -= u_b / delta_x**2

# Solve system
U = numpy.empty(m + 2)
U[0] = u_a
U[-1] = u_b
U[1:-1] = numpy.linalg.solve(A, F)

# Plot result
fig = plt.figure()
axes = fig.add_subplot(1, 1, 1)
axes.plot(x_bc, U, 'o', label="Computed")
axes.plot(x_bc, u_true(x_bc), 'k', label="True")
axes.set_title("Solution to $u_{xx} = e^x$")
axes.set_xlabel("x")
axes.set_ylabel("u(x)")
axes.legend()
plt.show()

## Error Analysis

A natural question to ask given our approximation $U_i$ is how close this is to the true solution $u(x)$ at the grid points $x_i$.  To address this we will define the error $E$ as
$$
    E = U - \hat{U}
$$
where $U$ is the vector of the approximate solution and $\hat{U}$ is the vector composed of the $u(x_i)$.  

This still leaves $E$ as a vector, so we often ask how the norm of $E$ behaves for a given $\Delta x$.  For the $\infty$-norm we would have
$$
    ||E||_\infty = \max_{1 \leq i \leq m} |E_i| = \max_{1 \leq i \leq m} |U_i - u(x_i)|
$$

If we can show that $||E||_\infty$ goes to zero as $\Delta x \rightarrow 0$ we can then claim that the error $E_i$ in the approximate solution $U_i$ goes to zero at every grid point.  If we would like to use other norms we often define slightly modified versions of the norms that also contain the grid width $\Delta x$ where
$$\begin{aligned}
    ||E||_1 &= \Delta x \sum^m_{i=1} |E_i| \\
    ||E||_2 &= \left( \Delta x \sum^m_{i=1} |E_i|^2 \right )^{1/2}
\end{aligned}$$
These are referred to as *grid function norms*.
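
One way to see why the factor of $\Delta x$ is included is to compare a grid function norm with the plain vector norm as the mesh is refined.  The sketch below uses a hypothetical helper `grid_norm` (not part of the original notes) and the constant grid function $E_i = 1$ on $[0, 1]$, whose size should not depend on how many points we sample it at.

```python
import numpy

def grid_norm(E, delta_x, order):
    """Grid function norm of E; order may be 1, 2, or numpy.inf (hypothetical helper)."""
    if order == numpy.inf:
        return numpy.max(numpy.abs(E))
    return (delta_x * numpy.sum(numpy.abs(E)**order))**(1.0 / order)

for m in (10, 100, 1000):
    delta_x = 1.0 / (m + 1)
    E = numpy.ones(m)
    # The plain 2-norm grows like sqrt(m); the grid function norm stays near 1
    print(m, numpy.linalg.norm(E, 2), grid_norm(E, delta_x, 2))
```

The plain 2-norm grows with `m` even though the function being sampled has not changed, while the grid function norm approximates the corresponding integral norm.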

The $E$ defined above is known as the *global error*.  Our goal now turns to using the local truncation error and some idea of stability to imply the global error goes to zero.

### Local Truncation Error

The *local truncation error* (LTE) can be defined by replacing the approximate solution $U_i$ by the true solution $u(x_i)$.  Since the algebraic equations are an approximation to the original BVP, we do not expect that the true solution will exactly satisfy these equations.  The resulting difference is the LTE.

For our one-dimensional finite difference approximation from above we have
$$
    \frac{1}{\Delta x^2} (U_{i+1} - 2 U_i + U_{i-1}) = f(x_i).
$$

Replacing $U_i$ with $u(x_i)$ in this equation leads to
$$
    \tau_i = \frac{1}{\Delta x^2} (u(x_{i+1}) - 2 u(x_i) + u(x_{i-1})) - f(x_i).
$$

In this form the LTE is not as useful but if we assume $u(x)$ is smooth we can replace the $u(x_{i \pm 1})$ with their Taylor series counterparts, similar to what we did for finite differences.  The relevant Taylor series are
$$
    u(x_{i \pm 1}) = u(x_i) \pm u'(x_i) \Delta x + \frac{1}{2} u''(x_i) \Delta x^2 \pm \frac{1}{6} u'''(x_i) \Delta x^3 + \frac{1}{24} u^{(4)}(x_i) \Delta x^4 + \mathcal{O}(\Delta x^5)
$$

This leads to an expression for $\tau_i$ of
$$\begin{aligned}
    \tau_i &= \frac{1}{\Delta x^2} \left [u''(x_i) \Delta x^2 + \frac{1}{12} u^{(4)}(x_i) \Delta x^4 + \mathcal{O}(\Delta x^6) \right ] - f(x_i) \\
    &= u''(x_i) + \frac{1}{12} u^{(4)}(x_i) \Delta x^2 + \mathcal{O}(\Delta x^4) - f(x_i) \\
    &= \frac{1}{12} u^{(4)}(x_i) \Delta x^2 + \mathcal{O}(\Delta x^4)
\end{aligned}$$
where we note that the true solution would satisfy $u''(x) = f(x)$.

As long as $u^{(4)}(x_i)$ remains finite (i.e. $u$ is sufficiently smooth) we know that $\tau_i \rightarrow 0$ as $\Delta x \rightarrow 0$.
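
The leading-order term can be checked numerically.  The following is a minimal sketch, assuming the true solution $u(x) = (4 - e) x - 1 + e^x$ from the example above (so that $u'' = u^{(4)} = e^x$); the helper name `lte` is introduced here only for illustration.

```python
import numpy

# True solution of the example above; its 2nd and 4th derivatives are both exp(x)
u = lambda x: (4.0 - numpy.exp(1.0)) * x - 1.0 + numpy.exp(x)
f = lambda x: numpy.exp(x)
u_4 = lambda x: numpy.exp(x)

def lte(x_i, delta_x):
    """LTE of the centered difference at x_i: D^2 u(x_i) - f(x_i)."""
    return (u(x_i + delta_x) - 2.0 * u(x_i) + u(x_i - delta_x)) / delta_x**2 - f(x_i)

x_i = 0.5
for delta_x in (0.1, 0.05, 0.025):
    leading_term = u_4(x_i) * delta_x**2 / 12.0
    # The two printed values should agree more closely as delta_x shrinks
    print(delta_x, lte(x_i, delta_x), leading_term)
```

Halving $\Delta x$ should reduce the LTE by roughly a factor of four, consistent with $\tau_i = \mathcal{O}(\Delta x^2)$.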

We can also write the vector of LTEs as
$$
    \tau = A \hat{U} - F
$$
which implies
$$
    A\hat{U} = F + \tau.
$$

### Global Error

What we really want to bound is the global error $E$.  To relate the global error and LTE we can substitute $E = U - \hat{U}$ into our expression for the LTE to find
$$
    A E = -\tau.
$$
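
The substitution can be spelled out in a few lines, using only $A U = F$ and the relation $A \hat{U} = F + \tau$ from the previous section:
$$\begin{aligned}
    A E &= A (U - \hat{U}) \\
    &= A U - A \hat{U} \\
    &= F - (F + \tau) \\
    &= -\tau.
\end{aligned}$$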

This means that the global error is the solution to the system of equations we defined for the approximation except with $-\tau$ as the forcing function rather than $F$!  This also implies that the global error $E$ can be thought of as an approximation to the solution of a BVP similar to the one we started with, where
$$
    e''(x) = -\tau(x) ~~~ \Omega = [0, 1] \\
    e(0) = 0 ~~~ e(1) = 0.
$$

We can solve this ODE directly by integrating twice and applying the boundary conditions to find, to leading order,
$$\begin{aligned}
    e(x) &\approx -\frac{1}{12} \Delta x^2 u''(x) + \frac{1}{12} \Delta x^2 (u''(0) + x (u''(1) - u''(0))) \\
    &= \mathcal{O}(\Delta x^2) \\
    &\rightarrow 0 ~~~ \text{as} ~~~ \Delta x \rightarrow 0.
\end{aligned}$$
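
As a sanity check, the leading-order estimate $e(x)$ can be compared with the actual global error.  This is a sketch under the assumptions of the earlier example ($u'' = e^x$ on $[0, 1]$ with $u(0) = 0$ and $u(1) = 3$); the name `e_est` is introduced here for the estimate.

```python
import numpy

m = 50
delta_x = 1.0 / (m + 1)
x = numpy.linspace(0.0, 1.0, m + 2)[1:-1]

# Same tridiagonal system as in the example above
A = (numpy.diag(-2.0 * numpy.ones(m))
     + numpy.diag(numpy.ones(m - 1), 1)
     + numpy.diag(numpy.ones(m - 1), -1)) / delta_x**2
F = numpy.exp(x)
F[0] -= 0.0 / delta_x**2
F[-1] -= 3.0 / delta_x**2
U = numpy.linalg.solve(A, F)

u_true = lambda x: (4.0 - numpy.exp(1.0)) * x - 1.0 + numpy.exp(x)
E = U - u_true(x)

# Leading-order estimate of the error, using u''(x) = exp(x)
u_xx = numpy.exp
e_est = delta_x**2 / 12.0 * (-u_xx(x) + u_xx(0.0) + x * (u_xx(1.0) - u_xx(0.0)))

print(numpy.max(numpy.abs(E)), numpy.max(numpy.abs(E - e_est)))
```

The second printed value should be much smaller than the first, since the estimate is expected to capture the $\mathcal{O}(\Delta x^2)$ part of the error.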

### Stability

We showed that the continuous analog of $E$, $e(x)$, does in fact go to zero as $\Delta x \rightarrow 0$ but what about $E$?  Instead of showing something based on $e(x)$ let's look back at the original system of equations for the global error
$$
    A^{\Delta x} E^{\Delta x} = - \tau^{\Delta x}
$$
where we now denote a particular realization of the system by the corresponding grid spacing $\Delta x$.  

If we could invert $A^{\Delta x}$ we could compute $E^{\Delta x}$ directly.  Assuming that we can and taking an appropriate norm we find
$$\begin{aligned}
    E^{\Delta x} &= -(A^{\Delta x})^{-1} \tau^{\Delta x} \\
    ||E^{\Delta x}|| &= ||(A^{\Delta x})^{-1} \tau^{\Delta x}|| \\
    & \leq ||(A^{\Delta x})^{-1} ||~|| \tau^{\Delta x}||
\end{aligned}$$

We know that $\tau^{\Delta x} \rightarrow 0$ as $\Delta x \rightarrow 0$ already for our example so if we can bound the norm of the matrix $(A^{\Delta x})^{-1}$ by some constant $C$ for sufficiently small $\Delta x$ we can then write a bound on the global error of
$$
    ||E^{\Delta x}|| \leq C ||\tau^{\Delta x}||
$$
demonstrating that $E^{\Delta x} \rightarrow 0 $ at least as fast as $\tau^{\Delta x} \rightarrow 0$.

We can generalize this observation to all linear BVPs by supposing that we have a finite difference approximation to a linear BVP of the form
$$
    A^{\Delta x} U^{\Delta x} = F^{\Delta x},
$$
where $\Delta x$ is the grid spacing.  

We say the approximation is *stable* if $(A^{\Delta x})^{-1}$ exists $\forall \Delta x < \Delta x_0$ and there is a constant $C$ such that
$$
    ||(A^{\Delta x})^{-1}|| \leq C ~~~~ \forall \Delta x < \Delta x_0.
$$

### Consistency

A related and important idea for the discretization of any PDE is that it be consistent with the equation we are approximating.  If
$$
    ||\tau^{\Delta x}|| \rightarrow 0 ~~\text{as}~~ \Delta x \rightarrow 0
$$
then we say an approximation is *consistent* with the differential equation.

### Convergence

We now have all the pieces to say something about the global error $E$.  A method is said to be *convergent* if
$$
    ||E^{\Delta x}|| \rightarrow 0 ~~~ \text{as} ~~~ \Delta x \rightarrow 0.
$$

If an approximation is both consistent ($||\tau^{\Delta x}|| \rightarrow 0 ~~\text{as}~~ \Delta x \rightarrow 0$) and stable ($||(A^{\Delta x})^{-1}|| \leq C$, which gives $||E^{\Delta x}|| \leq C ||\tau^{\Delta x}||$) then convergence is implied.

We have only derived this in the case of linear BVPs but in fact these criteria for convergence are often found to be true for any finite difference approximation (and beyond for that matter).  This statement of convergence can also often be strengthened to say
$$
    \mathcal{O}(\Delta x^p) ~\text{LTE}~ + ~\text{stability} ~ \Rightarrow \mathcal{O}(\Delta x^p) ~\text{global error}.
$$
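
The expected order can be estimated empirically by solving on a sequence of grids and fitting the slope of $\log ||E||_\infty$ against $\log \Delta x$.  The sketch below reuses the example problem $u'' = e^x$ on $[0, 1]$; the helper `max_error` and the particular grid sizes are choices made here for illustration.

```python
import numpy

u_true = lambda x: (4.0 - numpy.exp(1.0)) * x - 1.0 + numpy.exp(x)

def max_error(m, u_a=0.0, u_b=3.0):
    """Solve u'' = exp(x) on [0, 1] with m interior points; return (delta_x, ||E||_inf)."""
    delta_x = 1.0 / (m + 1)
    x = numpy.linspace(0.0, 1.0, m + 2)[1:-1]
    A = (numpy.diag(-2.0 * numpy.ones(m))
         + numpy.diag(numpy.ones(m - 1), 1)
         + numpy.diag(numpy.ones(m - 1), -1)) / delta_x**2
    F = numpy.exp(x)
    F[0] -= u_a / delta_x**2
    F[-1] -= u_b / delta_x**2
    U = numpy.linalg.solve(A, F)
    return delta_x, numpy.max(numpy.abs(U - u_true(x)))

# Rows of the array are (delta_x, error) pairs; transpose to unpack the columns
delta_x, error = numpy.array([max_error(m) for m in (10, 20, 40, 80)]).T

# Slope of the log-log fit estimates the order p in error ~ C * delta_x**p
order = numpy.polyfit(numpy.log(delta_x), numpy.log(error), 1)[0]
print("Estimated order of convergence:", order)
```

For this second order LTE and a stable method, the fitted slope should come out close to 2.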

It turns out the most difficult part of this process is usually the statement regarding stability.  In the next section we will see for our simple example how we can prove stability in the 2-norm.

### Stability in the 2-Norm

Recalling our definition of stability, we need to show for our previously defined $A$ that
$$
    (A^{\Delta x})^{-1}
$$
exists and
$$
    ||(A^{\Delta x})^{-1}|| \leq C ~~~ \forall \Delta x < \Delta x_0
$$
for some $C$.  

We can show that $A$ is in fact invertible but can we bound the norm of the inverse?  Recall that the 2-norm of a symmetric matrix is equal to its spectral radius, i.e.
$$
    ||A||_2 = \rho(A) = \max_{1\leq p \leq m} |\lambda_p|.
$$

The inverse of $A$ is also symmetric, and the eigenvalues of $A^{-1}$ are the inverses of the eigenvalues of $A$, implying that
$$
    ||A^{-1}||_2 = \rho(A^{-1}) = \max_{1\leq p \leq m} \left| \frac{1}{\lambda_p} \right| = \frac{1}{\min_{1\leq p \leq m} \left| \lambda_p \right|}.
$$

If all of the $\lambda_p$ of $A$ are nonzero for sufficiently small $\Delta x$, and the eigenvalue of smallest magnitude stays bounded away from zero as $\Delta x \rightarrow 0$, we have shown the stability of the method.

The eigenvalues of the matrix $A$ from above can be written as
$$
    \lambda_p = \frac{2}{\Delta x^2} (\cos(p \pi \Delta x) - 1)
$$
with the corresponding eigenvectors $v^p$ 
$$
    v^p_j = \sin(p \pi j \Delta x)
$$
as the $j$th component.

#### Check that these are in fact the eigenpairs of the matrix $A$
$$
    \lambda_p = \frac{2}{\Delta x^2} (\cos(p \pi \Delta x) - 1)
$$

$$
    v^p_j = \sin(p \pi j \Delta x)
$$

$$\begin{aligned}
    (A v^p)_j &= \frac{1}{\Delta x^2} (v^p_{j-1} - 2 v^p_j + v^p_{j+1} ) \\
    &= \frac{1}{\Delta x^2} (\sin(p \pi (j-1) \Delta x) - 2 \sin(p \pi j \Delta x) + \sin(p \pi (j+1) \Delta x) ) \\
    &= \frac{1}{\Delta x^2} (\sin(p \pi j \Delta x) \cos(p \pi \Delta x) - 2 \sin(p \pi j \Delta x) + \sin(p \pi j \Delta x) \cos(p \pi \Delta x) ) \\
    &= \frac{2}{\Delta x^2} (\cos(p \pi \Delta x) - 1) \sin(p \pi j \Delta x) \\
    &= \lambda_p v^p_j.
\end{aligned}$$
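
The same check can be done numerically.  This is a small sketch, assuming the unit interval so that $\Delta x = 1 / (m + 1)$; the value $m = 8$ is arbitrary.

```python
import numpy

m = 8
delta_x = 1.0 / (m + 1)
A = (numpy.diag(-2.0 * numpy.ones(m))
     + numpy.diag(numpy.ones(m - 1), 1)
     + numpy.diag(numpy.ones(m - 1), -1)) / delta_x**2

j = numpy.arange(1, m + 1)
p = numpy.arange(1, m + 1)
eigenvalues = 2.0 / delta_x**2 * (numpy.cos(p * numpy.pi * delta_x) - 1.0)

# Largest entry of A v^p - lambda_p v^p over all p
max_residual = 0.0
for (p_index, lam) in zip(p, eigenvalues):
    v = numpy.sin(p_index * numpy.pi * j * delta_x)
    max_residual = max(max_residual, numpy.max(numpy.abs(A @ v - lam * v)))
print("Largest residual:", max_residual)

# Compare against the eigenvalues computed by numpy (returned in ascending order)
computed = numpy.linalg.eigvalsh(A)
print(numpy.allclose(numpy.sort(eigenvalues), computed))
```

The residual should be at the level of round-off, and the formula should reproduce the eigenvalues that `numpy.linalg.eigvalsh` computes for the symmetric matrix $A$.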

#### Compute the smallest eigenvalue
$$
    \lambda_p = \frac{2}{\Delta x^2} (\cos(p \pi \Delta x) - 1)
$$
Use a Taylor series to get an idea of how this behaves with respect to $\Delta x$

From these expressions we know that the eigenvalue of smallest magnitude is
$$\begin{aligned}
    \lambda_1 &= \frac{2}{\Delta x^2} (\cos(\pi \Delta x) - 1) \\
    &= \frac{2}{\Delta x^2} \left (-\frac{1}{2} \pi^2 \Delta x^2 + \frac{1}{24} \pi^4 \Delta x^4 + \mathcal{O}(\Delta x^6) \right ) \\
    &= -\pi^2 + \mathcal{O}(\Delta x^2).
\end{aligned}$$

Note that this also gives us an error bound, as this eigenvalue leads to the largest eigenvalue (in magnitude) of the inverse matrix.  We can therefore say
$$
    ||E^{\Delta x}||_2 \leq ||(A^{\Delta x})^{-1}||_2 ||\tau^{\Delta x}||_2 \approx \frac{1}{\pi^2} ||\tau^{\Delta x}||_2.
$$
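
The bound on the inverse can also be observed directly.  The sketch below forms $(A^{\Delta x})^{-1}$ explicitly for a few grid sizes, which is fine for illustration at these sizes but not something one would do for large systems; the helper `inverse_norm` is introduced here and is not part of the original notes.

```python
import numpy

def inverse_norm(m):
    """Matrix 2-norm of the inverse of A on [0, 1] with m interior points."""
    delta_x = 1.0 / (m + 1)
    A = (numpy.diag(-2.0 * numpy.ones(m))
         + numpy.diag(numpy.ones(m - 1), 1)
         + numpy.diag(numpy.ones(m - 1), -1)) / delta_x**2
    # For a 2D array, ord=2 gives the largest singular value
    return numpy.linalg.norm(numpy.linalg.inv(A), 2)

for m in (10, 20, 40, 80):
    print(m, inverse_norm(m), 1.0 / numpy.pi**2)
```

The computed norms should stay bounded and approach $1 / \pi^2 \approx 0.101$ as the grid is refined, consistent with the definition of stability above.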

### Stability in the Max-Norm