# Introduction to Partial Differential Equations
---

## Chapter 2: Elliptic PDEs, Poisson’s Equation, and a Two-Point Boundary Value Problem 
---

## Want to use Colab? [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://githubtocolab.com/CU-Denver-MathStats-OER/Intro-PDEs-Theory-and-Computations/blob/main/Chp2/Chp2Sec3.ipynb)

---

## Prepping the environment for interactive plots in Colab
---

In [None]:
if 'google.colab' in str(get_ipython()):
    print('Running on CoLab - installing missing packages')
    !pip install ipympl
    from IPython.display import clear_output
    clear_output()
    exit()
else:
    print('Not running on CoLab - assuming environment has necessary packages')

In [None]:
%matplotlib widget
if 'google.colab' in str(get_ipython()):
    from google.colab import output
    output.enable_custom_widget_manager()

## Creative Commons License Information
<a rel="license" href="http://creativecommons.org/licenses/by-nc/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-nc/4.0/80x15.png" /></a><br /><span xmlns:dct="http://purl.org/dc/terms/" property="dct:title">Introduction to Partial Differential Equations: Theory and Computations</span> by <a xmlns:cc="http://creativecommons.org/ns#" href="https://github.com/CU-Denver-MathStats-OER/Intro-PDEs-Theory-and-Computations" property="cc:attributionName" rel="cc:attributionURL">Troy Butler</a> is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-nc/4.0/">Creative Commons Attribution-NonCommercial 4.0 International License</a>.<br />Based on a work at <a xmlns:dct="http://purl.org/dc/terms/" href="https://github.com/CU-Denver-MathStats-OER/Intro-PDEs-Theory-and-Computations" rel="dct:source">https://github.com/CU-Denver-MathStats-OER/Intro-PDEs-Theory-and-Computations</a>.

## Section 2.3: Properties of Continuous and Discrete Solutions
---

For simplicity, we consider the domain $(0,1)$ here, in contrast to [Section 2.1](Chp2Sec1.ipynb), which focused on the more general domain $(a,b)$.

We have several goals in this notebook.

- Establish a unifying framework in which to define/analyze the continuous and discrete 2-point BVPs and their corresponding solutions. We utilize an operator notation to accomplish this in [Section 2.3.1](#Section2.3.1).

- Analyze the shared properties of the differential and difference operators used in the definitions of the continuous and discrete problems. We define the properties we are interested in generally for an arbitrary operator acting on an inner product space and establish why they hold for both of the operators considered in this notebook in [Section 2.3.2](#Section2.3.2).

- Analyze the shared properties of the continuous and discrete solutions. Important properties of the continuous solution are summarized in various theorems in [Section 2.1](Chp2Sec1.ipynb), and we discuss their discrete counterparts in [Section 2.3.3](#Section2.3.3) and [Section 2.3.4](#Section2.3.4).

- Finally, we prove that the discrete solution *converges* to the continuous solution in [Section 2.3.5](#Section2.3.5).


<mark>***This may require a few careful readings as there is a significant amount of theory discussed in this notebook. The material is unavoidably dense in places. It will likely take at least two classes to get through this all carefully.***</mark>

In [None]:
# Just getting this out of the way
import numpy as np
import matplotlib.pyplot as plt

from ipywidgets import interact, interactive, fixed, interact_manual
import ipywidgets as widgets

---
### <a id='Section2.3.1'>Section 2.3.1: Operator Notation - A Useful Formalism</a>
---

To accomplish our goal, we find it convenient to rewrite the continuous and discrete problems with a new operator-centric notation. This also helps set the right mindset for how we approach modern PDE theory to study more general elliptic, parabolic, and hyperbolic PDEs.

---
#### The continuous problem and its solution
---

We first rewrite the 2-point BVP

$$
    -u''(x) = f(x), \ x\in(0,1), \ u(0)=u(1)=0,
$$

as

$$
    (Lu)(x) = f(x), \ x\in(0,1), 
$$

where $L:\mathcal{C}^2_0((0,1))\to \mathcal{C}((0,1))$ is the differential operator defined by $L:=-\frac{d^2}{dx^2}$. 

- Here, $\mathcal{C}^2_0((0,1))$ denotes the space of functions in $\mathcal{C}^2((0,1))\cap \mathcal{C}([0,1])$ whose values at $x=0$ and $x=1$ are zero. 

  - In other words, $\mathcal{C}^2_0((0,1))$ denotes the twice continuously differentiable functions on the *open interval* $(0,1)$ that are also continuous on the *closed* interval $[0,1]$ with values of $0$ at the endpoints. 

  - $\mathcal{C}^2_0((0,1))$ is a vector subspace of the vector space $\mathcal{C}^2((0,1))$.
  
  
- It is worth emphasizing that just because a function has a value of zero at some point does not mean that its derivatives must be zero. Consider $u(x)=\sin(\pi x)$, which is zero at $x=0$ and $1$, but $u'(0)=\pi$ and $u'(1)=-\pi$.

**Remarks:**

- For $u\in\mathcal{C}^2_0((0,1))$, we have that $Lu\in\mathcal{C}((0,1))$. If we start with a $u\in\mathcal{C}^2_0((0,1))$ and set $f=Lu\in\mathcal{C}((0,1))$, then we have *manufactured* a solution to the 2-point BVP that has this particular $f$ as the data.

- <mark>The conceptual importance of the operator $L$ as a *mapping* from $u\in\mathcal{C}^2_0((0,1))$ to the *data* $f\in\mathcal{C}((0,1))$ cannot be overstated.</mark> It allows us to do things such as formally define the *solution operator* "$L^{-1}:\mathcal{C}((0,1))\to\mathcal{C}^2_0((0,1))$" that maps the *data* to a *solution* of the BVP.

Why are there quotes around "$L^{-1}:\mathcal{C}((0,1))\to\mathcal{C}^2_0((0,1))$" in the above remark? We consider a linear algebra analogy which is also connected with what we discussed at the end of [Section 2.2](https://github.com/CU-Denver-MathStats-OER/Intro-PDEs-Theory-and-Computations/blob/main/Chp2/Chp2Sec2.ipynb) as well as in [Section 1.5](https://github.com/CU-Denver-MathStats-OER/Intro-PDEs-Theory-and-Computations/blob/main/Chp1/Chp1Sec5.ipynb) to set the proper mindset for how to interpret this notation.
  
We begin with the following problem: 

> Solve $Ax=b$ where $A\in\mathbb{R}^{m\times n}$, $x\in\mathbb{R}^n$, and $b\in\mathbb{R}^m$.

In this problem, $A:\mathbb{R}^n\to\mathbb{R}^m$. 

*Question:* Does $A^{-1}:\mathbb{R}^m\to\mathbb{R}^n$ always exist?

- We can always evaluate $Ax$ for *any* given $x\in\mathbb{R}^n$, but does this mean that for *any* given $b\in\mathbb{R}^m$ that there exists $x\in\mathbb{R}^n$ such that $Ax=b$?

  The answer is *of course **not***. Even if $m=n$, there are counterexamples. For example, consider the $2\times 2$ matrix $A$ where each row is given by $(1, 0)$. Then $Ax=(x_1, x_1)^\top$, so $Ax=b$ only has solutions if $b=(b_1, b_1)^\top$ for some real number $b_1$. The vector $b=(0, 1)^\top$ produces a problem with *no* solution because it is not in the column space of $A$.

  Remember, *not every problem has a solution.*

- But, suppose we have a $b\in\mathbb{R}^m$ such that there does exist an $x\in\mathbb{R}^n$ with $Ax=b$, then formally we may write $x=A^{-1}b$ even though $A^{-1}$ may not exist (e.g., perhaps $m\neq n$ or if $m=n$ the rows of $A$ may be linearly dependent). 

  Moreover, even if $A^{-1}$ exists, the inverse of an invertible matrix is almost never constructed in practice! 
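The counterexample and the remark about never forming $A^{-1}$ can be checked numerically. The sketch below is only an illustration: the helper name `has_solution` and the invertible matrix `B` are made up for this example, and solvability is tested by comparing the rank of $A$ with the rank of the augmented matrix $[A\,|\,b]$.

```python
import numpy as np

def has_solution(A, b):
    """Return True when b lies in the column space of A (rank test)."""
    augmented = np.column_stack([A, b])
    return np.linalg.matrix_rank(A) == np.linalg.matrix_rank(augmented)

# The singular 2x2 matrix from the text: each row is (1, 0).
A = np.array([[1.0, 0.0],
              [1.0, 0.0]])
b = np.array([0.0, 1.0])
print(has_solution(A, b))  # b is not in the column space of A

# A hypothetical invertible matrix B: "B^{-1} c" is computed by elimination,
# without ever building the inverse matrix.
B = np.array([[2.0, -1.0],
              [-1.0, 2.0]])
c = np.array([1.0, 0.0])
x_sol = np.linalg.solve(B, c)
print(x_sol, np.allclose(B @ x_sol, c))
```

Here `np.linalg.solve` plays the role of the *process* that the symbol $B^{-1}$ stands for.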

So, what are we really formally representing by writing $x=A^{-1}b$?

- The symbol $A^{-1}$ often represents the *process* by which we determine the $x\in\mathbb{R}^n$ (perhaps that process is Gaussian elimination). 

Similarly, when we write $u=L^{-1}f$, we intend the same nuanced interpretation: $L^{-1}$ stands for the *process* by which $u$ is determined from $f$.

Of course, we have already established in the [Theorem of Existence,  Uniqueness, and Smoothness in Section 2.1](https://github.com/CU-Denver-MathStats-OER/Intro-PDEs-Theory-and-Computations/blob/main/Chp2/Chp2Sec1.ipynb) that for all $f\in\mathcal{C}((0,1))$ there exists a unique $u\in\mathcal{C}^2_0((0,1))$ such that $Lu=f$. But, if we were to change the boundary conditions (so consider an input space to the differential operator that was different than $\mathcal{C}^2_0((0,1))$), this may not be the case.

Moreover, just because the theorem implies the actual existence of an operator $L^{-1}:\mathcal{C}((0,1))\to\mathcal{C}^2_0((0,1))$, it does *not* state what is fundamentally *meant* by writing $u=L^{-1}f$ because it is not stating *what* this operator $L^{-1}$ represents. 

What is it? In this case, we can think of it as the process by which we construct $u$ using the Green's function. In other words,

$$
    L^{-1}f = \int_0^1 G(x,y)f(y)\, dy.
$$
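As a hedged sketch of this interpretation, the code below evaluates the integral numerically. It assumes the standard Green's function for $-u''=f$ on $(0,1)$ with homogeneous Dirichlet conditions, $G(x,y)=y(1-x)$ for $y\leq x$ and $G(x,y)=x(1-y)$ for $y\geq x$, which should be checked against the formula given in Section 2.1. The names `green` and `apply_L_inverse` are illustrative only.

```python
import numpy as np
from scipy.integrate import quad

def green(x, y):
    """Assumed Green's function for -u'' = f with u(0) = u(1) = 0."""
    return y * (1.0 - x) if y <= x else x * (1.0 - y)

def apply_L_inverse(f, x):
    """Evaluate (L^{-1} f)(x) = int_0^1 G(x, y) f(y) dy at a single point x."""
    # Split the integral at y = x because G(x, .) has a kink there.
    left, _ = quad(lambda y: green(x, y) * f(y), 0.0, x)
    right, _ = quad(lambda y: green(x, y) * f(y), x, 1.0)
    return left + right

# Manufactured solution: u(x) = sin(pi x) gives the data f = Lu = pi^2 sin(pi x).
f = lambda y: np.pi**2 * np.sin(np.pi * y)
for x0 in (0.0, 0.3, 0.5, 1.0):
    print(x0, apply_L_inverse(f, x0), np.sin(np.pi * x0))
```

The two printed columns should agree up to quadrature error, which is one way to see $L^{-1}$ as a concrete process rather than just a symbol.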

---
#### The discrete problem and its solution
---

First, recall from the [previous notebook](https://github.com/CU-Denver-MathStats-OER/Intro-PDEs-Theory-and-Computations/blob/main/Chp2/Chp2Sec2.ipynb) that we use $n+2$ points to discretize $[0,1]$ into $n+1$ subintervals of equal length $h=1/(n+1)$ and we solve the matrix-vector problem


$$
\large Av=b, \ \text{ where } \ A = \begin{pmatrix}
                    2 & -1 & 0 & \cdots & 0 \\
                    -1 & 2 & -1 & \ddots & \vdots \\
                    0 & \ddots & \ddots & \ddots & 0 \\
                    \vdots & \ddots & -1 & 2 & -1 \\
                    0 & \cdots & 0 & -1 & 2
                \end{pmatrix},
            \
            \
             b = h^2\begin{pmatrix}
                        f(x_1) \\
                        f(x_2) \\
                        \vdots \\
                        f(x_n)
                    \end{pmatrix}, 
$$

and $v\in\mathbb{R}^n$ represents $v=(v_1, v_2, \ldots, v_n)^\top\approx (u(x_1), u(x_2), \ldots, u(x_n))^\top$.

However, the above matrix-vector problem does not tell the whole story. The discrete problem includes boundary conditions. 

The *actual* vector $v$ that solves the discrete problem is in $\mathbb{R}^{n+2}$ where $v_0=v_{n+1}=0$ and we solve for $v_1,v_2,\ldots, v_n$ using the matrix-vector problem above because it is precisely this $(n+2)$-dimensional $v$ that satisfies the discrete problem that *includes* boundary conditions. 

This is an unfortunate abuse of notation. When we write $v$, how are we supposed to know if it is referring to the $n$-dimensional solution to the matrix-vector problem or the $(n+2)$-dimensional solution to the discrete problem? 

Well, the context helps. We solve the matrix-vector problem $Av=b$ to get the *unknown* values of $v$, and when we refer to $v$ by itself, we refer to the solution of the discrete problem which has dimension equal to the number of points used to discretize $[0,1]$.

Note that if we change boundary conditions (e.g., making one or more boundary values unknown by the use of Neumann or Robin boundary conditions), then both $A$ and $b$ change to reflect this: $A$ belongs to either $\mathbb{R}^{(n+1)\times (n+1)}$ or $\mathbb{R}^{(n+2)\times (n+2)}$ and $b$ belongs to either $\mathbb{R}^{n+1}$ or $\mathbb{R}^{n+2}$, so that we can solve for these additional unknown values in $v\in\mathbb{R}^{n+2}$.

<mark>To help connect the discrete problem and its solution to the continuous problem and its solution, we need to take a step back to recall how $Av=b$ was even constructed and use a similar operator-based notation. However, we must first define the ***spaces*** that the operator will map between and their relationship to the continuous function spaces involved in the continuous problem.</mark>

---
#### A Discrete Function Space
---

<mark>First, define $D_h$ as the collection of **discrete functions** that map the $n+2$ grid points $x_j=jh$, $0\leq j\leq n+1$ into $\mathbb{R}$.</mark>

- In other words, $w\in D_h$ if $w(x_j)\in\mathbb{R}$ for each $0\leq j\leq n+1$. We will sometimes use $w_j$ as a shorthand notation for $w(x_j)$.

- <mark>We are going to use $D_h$ to set up the solution spaces for the discrete problem.</mark>

- *This is like a discrete version of $\mathcal{C}^k([0,1])$ where we only consider functions whose values are defined at all the grid points in $D_h$ as opposed to being defined at all points in $[0,1]$ for $\mathcal{C}^k([0,1])$.*  

  Note that $\mathcal{C}([0,1])\subset D_h$ since if $w\in\mathcal{C}([0,1])$ then $w(x)\in\mathbb{R}$ for all $x\in[0,1]$.
  
  However, $\mathcal{C}((0,1))$ is not a subset of $D_h$ because it contains functions like $1/x$ that are not defined as any real number at $x=0$.

- We use the $\sup$-norm metric to measure the distance between functions in $D_h$, i.e., we use the metric $d_{h,\infty}:D_h\times D_h\to[0,\infty)$ defined by
  
  $$
      d_{h,\infty}(v,w) := \| v - w \|_{h,\infty} = \sup_{0\leq j\leq n+1} | v_j - w_j |, \ \forall v, w\in D_h.
  $$
  
  Note that we used the $v_j, w_j$ shorthand notation above in place of the function notation $v(x_j), w(x_j)$. Even though we use this shorthand notation, it is important to keep the perspective that $v$ and $w$ are in fact *functions* (discrete though they may be) defined on the grid points of $[0,1]$.
  
  When there is no chance for confusion (e.g., if we are only discussing the discrete functions), we may drop the $h$ in the subscript of the metric and norm.
  
- *Note that the metric space defined by $(D_h, d_{h,\infty})$ is [isometrically isomorphic](https://en.wikipedia.org/wiki/Isometry) to $(\mathbb{R}^{n+2}, d_\infty)$. In other words, we can identify $D_h$ as $\mathbb{R}^{n+2}$ equipped with the sup/infinity-norm induced metric.* 

  This means that even though we should always keep in mind that $D_h$ is a space of *functions*, we actually do not do anything mathematically incorrect if we happen to treat each function as the specific $(n+2)$-dimensional vector of real numbers defined by the function values at the grid points.
  
- *Of course, the function spaces we are dealing with are just vector spaces much like how the "space of all polynomials up to degree $m$" defines an $(m+1)$-dimensional vector space. The lesson here: linear algebra is a useful subject to study in depth.*

  We therefore have that $D_h$ is "obviously" an $(n+2)$-dimensional vector space whereas $\mathcal{C}^k([0,1])$ is an infinite-dimensional vector space.
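The identification of $D_h$ with $\mathbb{R}^{n+2}$ can be made concrete with a short sketch. The function name `d_h_inf` and the two sample grid functions are assumptions made for illustration; each discrete function is stored as the array of its values at the grid points.

```python
import numpy as np

def d_h_inf(v, w):
    """Sup-norm distance between two grid functions given as arrays of values."""
    return np.max(np.abs(np.asarray(v) - np.asarray(w)))

n = 9
x = np.linspace(0.0, 1.0, n + 2)   # grid points x_j = j*h for j = 0, ..., n+1
v = np.sin(np.pi * x)              # restriction of sin(pi x) to the grid
w = 4.0 * x * (1.0 - x)            # restriction of a parabola to the grid

print(d_h_inf(v, w))
# The same number is obtained from the infinity norm on R^{n+2}.
print(np.linalg.norm(v - w, ord=np.inf))
```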
 

 <mark>Second, define $D_{h,0}\subset D_h$ as the *subset* of $D_h$ given by all $w\in D_h$ such that $w_0=w_{n+1}=0$.</mark>

- *It is a good exercise for students to show that this is in fact an $n$-dimensional vector subspace of $D_h$.*

- $D_{h,0}$ will serve as the solution space to the discrete problem in an analogous way that $\mathcal{C}^2_0((0,1))$ served as the solution space to the continuous problem.

<mark>Third, let $D_h^n$ denote the space of discrete functions similar to the above except that the superscript $n$ implies that we only consider the mapping of the $n$ interior grid points $x_j=jh$ for $1\leq j\leq n$ into $\mathbb{R}$. </mark>

- The use of the $n$ superscript should be interpreted similarly as the change from closed intervals $\mathcal{C}([0,1])$ to open intervals in $\mathcal{C}((0,1))$.

  - Note that if $w\in \mathcal{C}([0,1])$, then $w\in\mathcal{C}((0,1))$. 
  
    However, the converse is not necessarily true. For example, consider $w\in\mathcal{C}((0,1))$ defined by either $w(x)=1/x$ or $w(x)=\sin(1/x)$. In either case, $\lim_{x\downarrow 0} w(x)$ does not exist implying there is no way to define $w(0)$ that makes $w$ continuous at $x=0$. 
    
  - Similarly, if $w\in D_h$, then $w\in D_h^n$.
  
    The converse may not be true. However, it is no longer due to a lack of limits but rather the lack of a function value for a given $w\in D_h^n$. For instance, $w(x)=1/x$ and $w(x)=\sin(1/x)$ are simply *not defined* at $x=0$, so these $w$ are in $D_h^n$ but not in $D_h$.
    
    Of course, if we were to define $w$ in a piecewise manner, then we can get around certain issues like the above. For example, if $w(x)=1/x$ for $x>0$ but $w(0)=\alpha$ (where $\alpha\in\mathbb{R}$ is fixed), then $w\in D_h$. 
    
Why don't we similarly define $D_{h,0}^n$? Well, $D_{h,0}^n$ is equivalent to $D_{h,0}$, so there is really no point in doing so. It is just a matter of semantics. Similarly, $\mathcal{C}_0((0,1))$ and $\mathcal{C}_0([0,1])$ mean the same thing, but we never write the latter.

<mark>Fourth, define the *difference operator* $L_h: D_{h,0} \to D_{h}^n$ given by</mark>


$$
    (L_hw)(x_j) = - \frac{w(x_{j+1}) - 2w(x_j) + w(x_{j-1})}{h^2}, \ \text{ for } 1\leq j \leq n. 
$$

- Note that $L_hw \in D_h^n$, i.e., $L_h$ is an operator that takes a discrete function $w\in D_{h,0}$ and creates a new discrete function $L_hw\in D_h^n$. 

- We are not making any statement/claim as to whether or not $L_hw\in D_h$. 

- We can apply $L_h$ to any $w\in\mathcal{C}_0((0,1))$. In particular, if $w\in \mathcal{C}^2_0((0,1))$, then we identify the function $L_hw\in D_h^n$ as the function that approximates $Lw=-w''$ at the interior grid points $x_j=jh$ for $1\leq j\leq n$, i.e., $(L_hw)(x_j)\approx -w''(x_j)$ for each $1\leq j \leq n$.

<mark>Fifth, define the **discrete problem in operator form** as finding $v\in D_{h,0}$ such that</mark>

$$
    (L_hv)(x_j) = f(x_j), \ 1\leq j\leq n.
$$

We now have the discrete problem in a form analogous to the form given in the continuous problem: $(Lu)(x)=f(x)$ for all $x\in (0,1)$. 

- A key observation is that $L:\mathcal{C}^2_0((0,1))\to\mathcal{C}((0,1))$ is replaced by $L_h:D_{h,0}\to D_h^n$.

- Note that we are writing the discrete problem as a system of $n$ equations in this form, and the matrix-vector problem $Av=b$ where we define $v_0=v_{n+1}=0$ is an equivalent way to represent this problem as a single equation (a single equation involving a matrix and a vector).
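The equivalence between the operator form and the matrix-vector form can be checked directly. In the sketch below, `apply_Lh` is a hypothetical helper that applies the difference formula to the $n+2$ grid values, and the test function $\sin(\pi x)$ is an arbitrary choice.

```python
import numpy as np

def apply_Lh(w, h):
    """Apply L_h to a grid function w (n+2 values); returns the n interior values."""
    return -(w[2:] - 2.0 * w[1:-1] + w[:-2]) / h**2

n = 49
x = np.linspace(0.0, 1.0, n + 2)
h = x[1] - x[0]

# A grid function in D_{h,0}: sin(pi x) on the grid with endpoint values set to 0.
w = np.sin(np.pi * x)
w[0] = w[-1] = 0.0

# Tridiagonal matrix with 2 on the diagonal and -1 on the off-diagonals.
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

# (L_h w)(x_j) agrees with (A w_interior)_j / h^2 because w_0 = w_{n+1} = 0.
print(np.allclose(apply_Lh(w, h), A @ w[1:-1] / h**2))

# For this smooth w, L_h w is close to Lw = pi^2 sin(pi x) at the interior grid points.
print(np.max(np.abs(apply_Lh(w, h) - np.pi**2 * np.sin(np.pi * x[1:-1]))))
```

Multiplying $(L_hv)(x_j)=f(x_j)$ by $h^2$ is what moves the factor $h^2$ into the right-hand side $b$ of $Av=b$.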

<mark>Finally, make sense of what $v=L_h^{-1}f$ means for $f\in D_h^n$.</mark>

As discussed at the end of the previous notebook, $v=A^{-1}b$ because $A$ is a symmetric positive definite matrix, so it is invertible. 

Thus, we can determine $v_1, \ldots, v_n$ via the coordinates of the vector $A^{-1}b$ (and we may interpret $A^{-1}$ as the process of applying Gaussian elimination), and we can interpret $v = L_h^{-1}f$ as simply meaning:

$$
    L_h^{-1}f(x_j) = \begin{cases}
                        (A^{-1}b)_j, & \text{ if } 1\leq j\leq n, \\
                        0, & \text{ if } j\in\{0,n+1\}.
                     \end{cases}
$$

In [None]:
# We make sense of $L_h^{-1}f$ below for solving the discrete problem by
# revisiting code from the previous notebook

def make_A(n):
    A = np.zeros((n,n))
    np.fill_diagonal(A,2)
    A += np.diag(-np.ones(n-1),k=1)
    A += np.diag(-np.ones(n-1),k=-1)
    return A

In [None]:
def solve_Av_b(n, f):  # This is saying "Construct $L_h^{-1}f$"
    
    A = make_A(n)  # Construct $A$
    
    x = np.linspace(0, 1, n+2)  # Create the n+2 grid points
    h = x[1]-x[0]  # Determine h=1/(n+1), which is also just the difference between grid points
    b = h**2*f(x[1:-1])  # Construct $b$
    
    # Note that v is a n+2 dimensional vector that is storing what we mean by $L_h^{-1}f$
    v = np.zeros(n+2)  
    
    # Below, we compute "$A^{-1}b$" for $1\leq j\leq n$. The 0's at j=0 and j=n+1 are untouched 
    # We use quotes around $A^{-1}b$ because we use the solve method in numpy's linalg subpackage,
    # which is utilizing routines for solving matrix-vector problems without constructing the inverse
    # of the matrix A.
    v[1:-1] = np.linalg.solve(A, b)  
    
    return v, x, h

---
### <a id='Section2.3.2'>Section 2.3.2: Properties of $L$ and $L_h$</a>
---

First, we define inner products  $\langle \cdot, \cdot \rangle$ and  $\langle \cdot, \cdot \rangle_h$ on $\mathcal{C}([0,1])$ and $D_h$, respectively, as

$$
    \langle u, v \rangle := \int_0^1 u(x)v(x)\, dx, \ \forall \ u, v\in\mathcal{C}([0,1]), 
$$

and, by applying the [trapezoidal rule](https://en.wikipedia.org/wiki/Trapezoidal_rule) on each of the $n+1$ subintervals of length $h$ used to discretize $[0,1]$ to approximate this integral, we define

$$
    \langle u, v \rangle_h := h\left[\frac{u_0v_0+u_{n+1}v_{n+1}}{2} + \sum_{j=1}^n u_jv_j\right].
$$

**Remarks:**

- We call $\langle \cdot, \cdot \rangle$ the **continuous inner product** and $\langle \cdot, \cdot \rangle_h$ the **discrete inner product.**

- Inner products impart a *geometric structure* on a space (as well as a norm, metric, topology, etc.). 

  - Since the discrete inner product is developed as an approximation of the continuous inner product, this suggests that the structures/analysis on either space should be approximately mirrored by the other space.
  
  - We in fact witness this "mirrored" analysis throughout the rest of this chapter where steps in analyzing properties of $L_h$ and solutions to the discrete problem in $D_{h,0}$ "mirror" the steps in analyzing properties of $L$ and solutions to the continuous problem in $\mathcal{C}^2_0((0,1))$.

In [None]:
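# scipy.integrate.quadrature was deprecated and is no longer available in newer
# SciPy releases, so we use the general-purpose quad routine instead.
from scipy.integrate import quad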

def inner(u, v):
    z = quad(lambda x: u(x)*v(x), 0, 1)[0]
    return z

In [None]:
def inner_h(u, v, h):
    z = h * (u[0]*v[0] + u[-1]*v[-1])/2.0 + h*np.dot(u[1:-1],v[1:-1])
    return z

In [None]:
# Let's compare these inner products on some continuous functions on [0,1].
# They should give similar answers if enough grid points are used.
# What is enough? Well, it depends on the functions.

u = lambda x : np.sin(x)
v = lambda x : np.exp(x)

In [None]:
print('-'*70)
print('The continuous inner product of u and v is\n')
print(inner(u, v))
print()

n = 19
x = np.linspace(0, 1, n+2)
h = x[1]-x[0]
print('-'*70)
print('The discrete inner product of u and v using ' + str(n+2) + ' grid points is\n')
print(inner_h(u(x), v(x), h))

---
#### How good of an approximation is the discrete inner product to the continuous inner product?
---

Here, we explore the quality of the approximation of the continuous inner product by the discrete inner product. 

First, we prove that if $u,v\in C_0^2((0,1))$ then 
$$
\large    \left| \left< u, v \right> -  \left< u, v \right>_h \right| \leq \frac{h^2}{12} \left|\left| (uv)'' \right|\right|_\infty
$$

In the proof below, we use the well established [error bound](https://en.wikipedia.org/wiki/Trapezoidal_rule#Error_analysis) associated with the [Trapezoidal rule](https://en.wikipedia.org/wiki/Trapezoidal_rule).

***Proof:***

Let $u,v\in C_0^2((0,1))$. 
Then, $uv\in C_0^2((0,1))$.

Let $f=uv$. 
Then, $\int_0^1 f(x)\, dx = \left<u,v\right>$ and $\left<u,v\right>_h$ is identified as the trapezoidal rule applied to this integral.
The result follows immediately from the established error bound for the trapezoidal rule. $\Box$
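A quick numerical check of this bound is sketched below. The functions $u(x)=\sin(\pi x)$ and $v(x)=x(1-x)$ are an arbitrary choice of elements of $C_0^2((0,1))$, the exact value $4/\pi^3$ of their inner product was computed by hand, and $\|(uv)''\|_\infty$ is only estimated by sampling on a fine grid. The helper name `discrete_inner` is hypothetical.

```python
import numpy as np

def discrete_inner(u_vals, v_vals, h):
    """Trapezoidal-rule inner product <u, v>_h computed from grid values."""
    ends = 0.5 * (u_vals[0] * v_vals[0] + u_vals[-1] * v_vals[-1])
    return h * (ends + np.dot(u_vals[1:-1], v_vals[1:-1]))

u = lambda x: np.sin(np.pi * x)
v = lambda x: x * (1.0 - x)
exact = 4.0 / np.pi**3   # int_0^1 x (1 - x) sin(pi x) dx

def second_derivative_uv(x):
    """(uv)'' = u''v + 2u'v' + uv'' for the u and v chosen above."""
    return (-np.pi**2 * np.sin(np.pi * x) * x * (1.0 - x)
            + 2.0 * np.pi * np.cos(np.pi * x) * (1.0 - 2.0 * x)
            - 2.0 * np.sin(np.pi * x))

# Estimate the sup-norm of (uv)'' by sampling on a fine grid.
sup_norm = np.max(np.abs(second_derivative_uv(np.linspace(0.0, 1.0, 10001))))

for n in (9, 19, 39):
    x = np.linspace(0.0, 1.0, n + 2)
    h = x[1] - x[0]
    error = abs(exact - discrete_inner(u(x), v(x), h))
    print(n, error, h**2 / 12.0 * sup_norm)   # observed error vs. the bound
```

In each printed row the observed error should not exceed the bound in the last column.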

***Some extra material***

If you've never seen the proof of the error bound for the trapezoidal rule, it follows from applying the [integration by parts](https://en.wikipedia.org/wiki/Integration_by_parts) formula with some creative choices of integration constants.
We sketch out the process as a two-step process below that you may find more useful than the Wiki reference linked to above.
This also helps set the stage for some of the techniques used to prove other results in this notebook.

**A useful Lemma:** For $f\in C^2((0,1))$, let $x_i\in[0,1)$ and $h>0$ be sufficiently small so that $x_i+h\in[0,1]$. Then

$$
    \left|\int_{x_i}^{x_i+h} f(x)\, dx - \frac{h}{2}(f(x_i)+f(x_i+h)) \right| \leq \frac{h^3}{12}\left|\left|  f''\right|\right|_\infty.
$$

**Sketch of Proof:** A simple change of variables (to simplify the limits of integration) followed by integrating by parts twice with clever choices of integration constants leads to

$$
    \int_{x_i}^{x_i+h} f(x)\, dx = \int_0^h f(t+x_i)\, dt = \underbrace{\frac{h}{2}\left[f(x_i) + f(x_i+h)\right]}_{\text{Trap. Rule}} + \underbrace{\int_{0}^{h} \left(\frac{(t-h/2)^2}{2}-\frac{h^2}{8} \right)f''(t+x_i)\, dt}_{\text{Error in Trap. Rule on $(x_i, x_i+h)$}}.
$$

So it follows that the error can be bounded by

$$
    \left| \int_{0}^{h} \left(\frac{(t-h/2)^2}{2}-\frac{h^2}{8} \right)f''(t+x_i)\, dt \right| \leq \left|\left|f''\right|\right|_\infty \int_0^h \left| \frac{(t-h/2)^2}{2}-\frac{h^2}{8} \right| \, dt. 
$$

The integrand is the absolute value of a parabola $\displaystyle \frac{(t-h/2)^2}{2}-\frac{h^2}{8}$ which opens upward and is zero whenever $t-h/2 = \pm h/2$ (i.e., at $t=0$ and at $t=h$), so for $t\in(0,h)$, we have that

$$
    \left|\frac{(t-h/2)^2}{2}-\frac{h^2}{8}\right| = \frac{h^2}{8} - \frac{(t-h/2)^2}{2}. 
$$

It follows from a direct integration of this term that the error in the trapezoidal rule on $(x_i,x_i+h)$ is bounded by

$$
    \left| \int_{0}^{h} \left(\frac{(t-h/2)^2}{2}-\frac{h^2}{8} \right)f''(t+x_i)\, dt \right| \leq \frac{h^3}{12}\left|\left|f''\right|\right|_\infty.
$$
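For completeness, the direct integration can be written out as follows. The substitution $s=t-h/2$ is just one convenient way to organize the computation:

$$
    \int_0^h \left(\frac{h^2}{8} - \frac{(t-h/2)^2}{2}\right) dt = \frac{h^3}{8} - \frac{1}{2}\int_{-h/2}^{h/2} s^2\, ds = \frac{h^3}{8} - \frac{1}{2}\cdot\frac{2}{3}\left(\frac{h}{2}\right)^3 = \frac{h^3}{8} - \frac{h^3}{24} = \frac{h^3}{12}.
$$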

Since we apply the trapezoidal rule on $n+1$ subintervals, we add up the error bound $n+1$ times and use the fact that $h=1/(n+1)$, i.e., $(n+1)h=1$, to get the result used in the proof above. $\Box$

---
#### Symmetry of operators
---

<mark>An operator $\mathcal{L}$ on a real vector space equipped with an inner product (with inner product denoted by $\langle \cdot, \cdot \rangle$) is symmetric if $\langle \mathcal{L}u, v\rangle = \langle u, \mathcal{L}v\rangle$ for all $u$ and $v$ in the space.</mark>

Using [integration by parts](https://en.wikipedia.org/wiki/Integration_by_parts) twice, we can prove (details are left to the students or will be covered in class) that $L$ is **symmetric** meaning that

$$
    \langle Lu, v \rangle = \langle u, Lv\rangle, \ \forall \ u, v\in\mathcal{C}_0^2((0,1)).
$$

Similarly, using [summation by parts](https://en.wikipedia.org/wiki/Summation_by_parts) twice (and by defining $u_{-1}=v_{-1}=0$ to simplify notation), we can prove (details are left to the students or will be covered in class) that $L_h$ is **symmetric** meaning that 

$$
    \langle L_hu, v \rangle_h = \langle u, L_hv\rangle_h, \ \forall \ u, v\in D_{h,0}.
$$
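The discrete symmetry can be sanity-checked on randomly generated elements of $D_{h,0}$. This is a sketch, not a proof: the helper names `apply_Lh` and `inner_h_zero_bc` are hypothetical, and the inner product is evaluated using only interior terms because the second argument vanishes at both endpoints.

```python
import numpy as np

def apply_Lh(w, h):
    """Apply L_h to a grid function w (n+2 values); returns the n interior values."""
    return -(w[2:] - 2.0 * w[1:-1] + w[:-2]) / h**2

def inner_h_zero_bc(a_interior, w, h):
    """<a, w>_h when w_0 = w_{n+1} = 0, so only interior terms contribute."""
    return h * np.dot(a_interior, w[1:-1])

rng = np.random.default_rng(0)
n = 19
h = 1.0 / (n + 1)

# Random grid functions in D_{h,0}: arbitrary interior values, zero endpoints.
u = np.zeros(n + 2)
u[1:-1] = rng.standard_normal(n)
v = np.zeros(n + 2)
v[1:-1] = rng.standard_normal(n)

lhs = inner_h_zero_bc(apply_Lh(u, h), v, h)   # <L_h u, v>_h
rhs = inner_h_zero_bc(apply_Lh(v, h), u, h)   # <u, L_h v>_h
print(lhs, rhs, np.isclose(lhs, rhs))
```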

We summarize these results in the following Lemma for ease of reference.

<br>

---
#### Lemma 2.3.1: Symmetry of Operators

The operators $L$ and $L_h$ are symmetric in the sense given above.

---

<br>


<mark>**Note that "in the sense given above" makes *explicit* mention of the spaces of functions for which this symmetry holds. If either function (or both) fail to be in these spaces, then symmetry does not necessarily hold. We explore this below.**</mark>

In [None]:
import sympy as sp
from sympy.utilities.lambdify import lambdify

In [None]:
def L(u):
    return -u.diff(x,2)

In [None]:
x = sp.symbols('x')

# The following u, v FAIL to be in the correct spaces for symmetry to necessarily hold.
u = sp.sin(x)  
v = sp.exp(x)

Lu = L(u)
Lv = L(v)

u_eval = lambdify(x, u)
v_eval = lambdify(x, v)

Lu_eval = lambdify(x, Lu)
Lv_eval = lambdify(x, Lv)

In [None]:
print('-'*70)
print('The continuous inner product of Lu and v is\n')
print(inner(Lu_eval, v_eval))
print()
print('The continuous inner product of u and Lv is\n')
print(inner(u_eval, Lv_eval))
print()

n = 19
x = np.linspace(0, 1, n+2)
h = x[1]-x[0]
print('-'*70)
print('The discrete inner product of Lu and v using ' + str(n+2) + ' grid points is\n')
print(inner_h(Lu_eval(x), v_eval(x), h))
print()
print('The discrete inner product of u and Lv using ' + str(n+2) + ' grid points is\n')
print(inner_h(u_eval(x), Lv_eval(x), h))

In [None]:
x = sp.symbols('x')

# The following u, v are designed to be in the correct spaces for symmetry to hold.
# By "designed", we mean that we subtract the straight line connecting the end-point
# values in order to ensure that u(0)=v(0)=0 and u(1)=v(1)=0.
u = sp.sin(x) - (sp.sin(0)*(1-x)+sp.sin(1)*x) 
v = sp.exp(x) - (sp.exp(0)*(1-x)+sp.exp(1)*x)

Lu = L(u)
Lv = L(v)

u_eval = lambdify(x, u)
v_eval = lambdify(x, v)

Lu_eval = lambdify(x, Lu)
Lv_eval = lambdify(x, Lv)

In [None]:
print('-'*70)
print('The continuous inner product of Lu and v is\n')
print(inner(Lu_eval, v_eval))
print()
print('The continuous inner product of u and Lv is\n')
print(inner(u_eval, Lv_eval))
print()

n = 19
x = np.linspace(0, 1, n+2)
h = x[1]-x[0]
print('-'*70)
print('The discrete inner product of Lu and v using ' + str(n+2) + ' grid points is\n')
print(inner_h(Lu_eval(x), v_eval(x), h))
print()
print('The discrete inner product of u and Lv using ' + str(n+2) + ' grid points is\n')
print(inner_h(u_eval(x), Lv_eval(x), h))

---
#### Positive definiteness of operators
---

<mark>For an operator $\mathcal{L}$ on a vector space equipped with an inner product (with inner product denoted by $\langle \cdot, \cdot \rangle$) to be positive definite, it means that $\langle \mathcal{L}u, u\rangle\geq 0$ for all $u$ in the space with equality only happening if $u$ is equivalent to the zero vector.</mark>

By applying integration by parts once, we have that for any $u\in\mathcal{C}_0^2((0,1))$

$$
    \langle Lu, u \rangle = \int_0^1 (u'(x))^2\, dx.
$$
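Written out, the integration by parts step reads as follows (a sketch that assumes $u'$ remains bounded up to the endpoints so that the boundary term makes sense):

$$
    \langle Lu, u \rangle = -\int_0^1 u''(x)u(x)\, dx = -\Big[u'(x)u(x)\Big]_{0}^{1} + \int_0^1 u'(x)u'(x)\, dx = \int_0^1 (u'(x))^2\, dx,
$$

where the boundary term vanishes because $u(0)=u(1)=0$.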

Since the above integral involves a nonnegative function (because we are squaring $u'(x)$), the integral is nonnegative. This implies $\langle Lu, u \rangle\geq 0$. 

Moreover, for continuous nonnegative functions, the integral is zero if and only if the integrand is identically zero, i.e., $(u'(x))^2\equiv 0$ on $[0,1]$, which is equivalent to saying that $u'(x)\equiv 0$ on $[0,1]$. 

If a derivative is zero on an interval, then the function is constant there. Thus, if $u'(x)\equiv 0$ on $[0,1]$, then $u(x)=c$ for some $c\in\mathbb{R}$ on $[0,1]$. But, since we started with $u\in\mathcal{C}_0^2((0,1))$, we know that $u(0)=u(1)=0$, which means that $c=0$. 

Therefore, $\langle Lu, u \rangle\geq 0$ with equality only if $u(x)\equiv 0$.

Similarly, for any $v\in D_{h,0}$, we can apply summation by parts once (again using the convenient definition of $v_{-1}=0$ to make this step more obvious) and group terms to get

$$
    \langle L_h v, v\rangle_h = h^{-1}\sum_{j=0}^n (v_{j+1}-v_j)^2.
$$

Since the above summation involves nonnegative terms (again, they are squared), we have that the sum is greater than or equal to zero. 

Moreover, if the sum is zero, then this can only happen if $v_{j+1}=v_j$ for all $0\leq j\leq n$. Since $v\in D_{h,0}$ implies $v_0=0$, this further implies $v_j=0$ for $0\leq j\leq n+1$. In other words, $v$ is equivalent to the discrete function that is identically zero.
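The summation-by-parts identity can be checked numerically on a random element of $D_{h,0}$. This is only an illustrative sketch; the random vector and the seed are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 19
h = 1.0 / (n + 1)

# A random grid function in D_{h,0}: arbitrary interior values, zero endpoints.
v = np.zeros(n + 2)
v[1:-1] = rng.standard_normal(n)

Lh_v = -(v[2:] - 2.0 * v[1:-1] + v[:-2]) / h**2   # (L_h v)(x_j) for 1 <= j <= n
lhs = h * np.dot(Lh_v, v[1:-1])                   # <L_h v, v>_h (endpoint terms vanish)
rhs = np.sum(np.diff(v)**2) / h                   # h^{-1} sum_{j=0}^{n} (v_{j+1} - v_j)^2
print(lhs, rhs)
```

Both printed numbers should agree up to rounding, and they are strictly positive whenever $v$ is not identically zero.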


We summarize these results in the following Lemma for ease of reference.

<br>

---
#### Lemma 2.3.2: Positive Definiteness of Operators

The operators $L$ and $L_h$ are positive definite in the sense given above.

---

<br>


In [None]:
print(inner(Lu_eval, u_eval))
print(inner(Lv_eval, v_eval))

In [None]:
n = 19
x = np.linspace(0, 1, n+2)
h = x[1]-x[0]

print(inner_h(Lu_eval(x), u_eval(x), h))
print(inner_h(Lv_eval(x), v_eval(x), h))

---
### <a id='Section2.3.3'>Section 2.3.3: Existence and Uniqueness of Continuous and Discrete Solutions</a>
---

<mark>Consider a *linear* positive definite operator $\mathcal{L}:W\subset V\to V$ where $V$ is a vector space equipped with an inner product (with inner product denoted by $\langle \cdot, \cdot \rangle$) and $W$ is some vector subspace of $V$. If the problem defined by $\mathcal{L}w=v$ has a solution, then the solution is unique.</mark>

To see that the above statement is true, suppose $w_1$ and $w_2$ are both solutions to the same problem $\mathcal{L}w=v$. Define $e=w_1-w_2$. Then, by linearity of $\mathcal{L}$, we have that

$$
    \mathcal{L}e = \mathcal{L}(w_1-w_2) = \mathcal{L}w_1-\mathcal{L}w_2 = v-v=0\in V.
$$

Since the inner product with the zero vector always produces zero, this implies that 

$$
    \langle \mathcal{L}e, e\rangle = \langle 0, e \rangle = 0.
$$

Since $\mathcal{L}$ is positive definite, this implies that $e$ is the zero vector in $V$, i.e., $w_1=w_2$.

Since we have previously argued that solutions exist to the continuous and discrete problems (i.e., $u=L^{-1}f$ and $v=L_h^{-1}f$ both exist), and both $L$ and $L_h$ are positive definite, we have that the continuous and discrete solutions are unique. We summarize this as

<br>

---
#### Lemma 2.3.3: Existence and Uniqueness of Solutions

The functions $u=L^{-1}f\in\mathcal{C}_0^2((0,1))$ and $v=L_h^{-1}f\in D_{h,0}$ defined above in Section 2.3.1 are the unique solutions to the continuous and discrete problems, respectively.

---

<br>
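
For the discrete problem, uniqueness can also be seen numerically, since a symmetric matrix with strictly positive eigenvalues is invertible. The sketch below assumes that the matrix $A$ used in this notebook is the tridiagonal matrix with $2$ on the diagonal and $-1$ on the off-diagonals (so that $L_h$ acts as $\frac{1}{h^2}A$ on interior grid values), and it compares the computed eigenvalues with the known closed form $2-2\cos(k\pi h)$ for $1\leq k\leq n$.

```python
import numpy as np

n = 19
h = 1.0 / (n + 1)

# Assumed form of A: tridiagonal with 2 on the diagonal and -1 off the diagonal
A = 2.0*np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

eigs = np.linalg.eigvalsh(A)                # eigenvalues of the symmetric matrix, ascending
k = np.arange(1, n+1)
eigs_exact = 2.0 - 2.0*np.cos(k*np.pi*h)    # closed form, increasing in k

print(eigs.min() > 0)       # all eigenvalues are positive, so A is invertible
print(eigs.min() / h**2)    # smallest eigenvalue of L_h = A/h^2
```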


---
### <a id='Section2.3.4'>Section 2.3.4: Maximum Principle and Monotonicity for Continuous and Discrete Solutions</a>
---

Recall that on both $\mathcal{C}^2_0((0,1))$ and $D_{h,0}$ we use the metric induced by the sup-norm (i.e., the infinity-norm), not the norm or metric induced by the inner products discussed above.

Recall from [Section 2.1](https://github.com/CU-Denver-MathStats-OER/Intro-PDEs-Theory-and-Computations/blob/main/Chp2/Chp2Sec1.ipynb) the following

<br>

---
**Theorem 2.1.3: Maximum principle**

Assume that $f\in\mathcal{C}((a,b))$ and let $u$ be the unique solution of the BVP given in Theorem 2.1.1, then

$$
    \| u \|_\infty \leq \frac{(b-a)^2}{8}\|f\|_\infty.
$$

---

<br>

In the simplified case of this notebook where $a=0$ and $b=1$, the result in this theorem simplifies to

$$
    \| u \|_\infty \leq \frac{1}{8}\|f\|_\infty
$$
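
As a quick sanity check of this bound, the sketch below evaluates two exact solution pairs on a fine grid (so the sup-norms are only grid estimates). The first pair, $f\equiv 1$ with $u=\frac{1}{2}x(1-x)$, attains the bound, and the second is the manufactured pair $u=x(1-x)e^x$, $f=(3x+x^2)e^x$.

```python
import numpy as np

x = np.linspace(0, 1, 1001)

# Case 1: f = 1 has exact solution u = x(1-x)/2, so the bound is attained
u1 = 0.5 * x * (1 - x)
f1 = np.ones_like(x)

# Case 2: manufactured pair with -u'' = f
u2 = x * (1 - x) * np.exp(x)
f2 = (3*x + x**2) * np.exp(x)

for u, f in [(u1, f1), (u2, f2)]:
    # left: grid estimate of ||u||_inf, right: grid estimate of ||f||_inf / 8
    print(np.max(np.abs(u)), np.max(np.abs(f)) / 8)
```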

We similarly have (notice the mirroring in the numbering of this theorem which occurred entirely by accident)

<br>

---
**Theorem 2.3.1: Maximum principle**

Assume that $f\in\mathcal{C}((0,1))$ and let $v\in D_{h,0}$ be the unique solution of the discrete version of the BVP considered in this notebook, then 

$$
    \| v \|_{h,\infty} \leq \frac{1}{8}\|f\|_{h,\infty}.
$$

---

<br>

**Remark:**

- Note that the linear interpolant of any function in $D_{h}$ defines a function in $\mathcal{C}((0,1))$, so the above result can be modified to hold for any $f\in D_{h}$ as well.

**Outline of proof for Theorem 2.3.1:**

1. For each $1\leq k\leq n$, verify that $G^k(\cdot) := G(\cdot, x_k) \in D_{h,0}$ (where $G$ is the Green's function associated with the continuous problem) solves the discrete problem with a forcing function $\frac{1}{h}e^k$ chosen so that 

$$
    e^k(x_j) := \begin{cases}
                              1, & k=j, \\
                              0, & \text{else}.
                          \end{cases}
$$

2. Use this to verify that a solution $w\in D_{h,0}$ for an arbitrary $f\in D_{h,0}$ can be written as
<br>
$$
    w(x_j) := h \sum_{k=1}^n f(x_k) G^k(x_j), 
$$

   by observing first that $f(\cdot) = \sum_{k=1}^n f(x_k)e^k(\cdot)$.


3. Recognize that if $f$ is defined as a constant function, i.e., $f(x)\equiv \alpha$, then 

   $$
       w(x_j) = \alpha h\sum_{k=1}^n G^k(x_j), 
   $$

   and show that this is equivalently written as
   
   $$
       w(x_j) = \alpha \frac{1}{2} x_j(1-x_j)
   $$

4. Use the fact that $\frac{1}{2}x(1-x)$ is bounded by $\frac{1}{8}$ for all $x\in[0,1]$ and that $\|f\|_{h,\infty}$ is "just some $\alpha\in[0,\infty)$" to arrive at the desired conclusion.
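
Step 3 of the outline can be checked numerically before attempting the algebra. The sketch below assumes the Green's function $G(x,y)=y(1-x)$ for $y\leq x$ and $G(x,y)=x(1-y)$ otherwise; the vectorized helper `green` is a hypothetical name used only here.

```python
import numpy as np

def green(x, y):
    """Green's function for -u'' = f on (0,1) with u(0)=u(1)=0 (vectorized)."""
    return np.where(y <= x, y*(1 - x), x*(1 - y))

n = 7
x = np.linspace(0, 1, n+2)
h = x[1] - x[0]

# G_mat[j, k] = G(x_j, x_k) = G^k(x_j)
G_mat = green(x[:, None], x[None, :])

# Step 3 with alpha = 1: w(x_j) = h * sum_{k=1}^n G^k(x_j)
w = h * G_mat[:, 1:-1].sum(axis=1)

print(np.max(np.abs(w - 0.5*x*(1 - x))))  # difference from x_j(1-x_j)/2
print(w.max())                            # largest value of w on the grid
```

The agreement is exact up to rounding because the centered difference reproduces second derivatives of quadratic polynomials without error.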

**Motivation for the first step in the proof:**

How would one ever conceive of this first step? There are a few ideas that can nudge us in this direction.

First, the Green's function was key to proving Theorem 2.1.3.

Second, it is always a good idea to "plug" an exact solution to a continuous problem into the discrete problem to build intuition for discretization errors. 

The Green's function, $G(x,y)$, to the continuous problem is essentially the solution to the continuous problem when $f(x)=\delta(x-y)$. It is perhaps not too surprising then that for a fixed $y=x_k$, $G(\cdot, x_k)$ defines a function in $D_{h,0}$ that solves the discrete problem associated with a forcing function that is non-zero only at the grid point $x_k$. 

In fact, we refer to the $G^k(\cdot)=G(\cdot, x_k)$ as the **discrete Green's functions.**

<mark>Filling in all the details of the proof is left for either the students or in-class presentation.</mark>

Utilizing both Step 2 in the proof of Theorem 2.3.1 and the fact that $G(x,y)\geq 0$ for all $x,y\in[0,1]$ implies $G^k(x_j)\geq 0$ for all grid points $x_j$ allows us to immediately prove the following monotonicity result that mirrors Theorem 2.1.2 from  [Section 2.1](https://github.com/CU-Denver-MathStats-OER/Intro-PDEs-Theory-and-Computations/blob/main/Chp2/Chp2Sec1.ipynb) that we write as Theorem 2.3.2 below.

<br>

---
**Theorem 2.3.2: Monotonicity**

Assume that $f\in\mathcal{C}((0, 1))$ is a nonnegative function, and let $v\in D_{h,0}$ be the unique solution of the discrete version of the BVP considered in this notebook, then $v(x_j)\geq 0$ for $1\leq j\leq n$.

---

<br>
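
A minimal numerical illustration of the monotonicity result is sketched below. The forcing function is a hypothetical example chosen only because it is nonnegative, and $A$ is assumed to be the tridiagonal matrix with $2$ on the diagonal and $-1$ on the off-diagonals.

```python
import numpy as np

n = 25
x = np.linspace(0, 1, n+2)
h = x[1] - x[0]

# Assumed form of A: tridiagonal with 2 on the diagonal and -1 off the diagonal
A = 2.0*np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

f = np.sin(3*np.pi*x)**2      # a nonnegative forcing function (hypothetical example)

v = np.zeros(n+2)
v[1:-1] = np.linalg.solve(A, h**2 * f[1:-1])  # discrete solution at interior points

print(v[1:-1].min())  # smallest interior value; positive, consistent with Theorem 2.3.2
```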

**A slightly deeper dive into the discrete Green's function**

- One take home message is that the discrete analog of the Dirac delta function is given by $\frac{1}{h} e^k$ where $e^k\in D_{h,0}$ is defined by $e^k(x_k)=1$ and $e^k(x_j)=0$ if $j\neq k$.

- Observe that we required that $1\leq k\leq n$, i.e., we evaluate at an interior point since $e^k\in D_{h,0}$ automatically implies that $e^k(x_0)=0=e^k(x_{n+1})$.

- It is just worth emphasizing again that the discrete Green's function is given by $G^k(x_j) = G(x_j,x_k)$ where $G$ is the Green's function for the continuous problem and we have that $L_h G^k = \frac{1}{h}e^k$. 

If we were to construct the discrete Green's function, we would store it as an $\mathbb{R}^{(n+2)\times (n+2)}$ matrix as we show below.

In [None]:
def G(x,y): 
    if 0 <= y <= x:
        z = y*(1-x)
    else:
        z = x*(1-y)
    return z

def make_G(n, x):    
    G_k = np.zeros((n+2,n+2))
    for j in range(0,n+2):
        for k in range(0,n+2):
            G_k[j,k] = G(x[j],x[k])
    return G_k

In [None]:
n = 5

x = np.linspace(0,1,n+2)
h = x[1]-x[0]  # So 1/h = n+1

G_k = make_G(n,x)
        
A = make_A(n)

In [None]:
# Choose an interior index between 1 and n (for k=0 or k=n+1, G^k is identically zero)
k = 2

test = np.zeros(n+2)
test[1:-1] = 1/h**2 * np.dot(A,G_k[1:-1,k])  # This applies L_h to G^k and should produce (1/h)e^k

%matplotlib widget
plt.figure(0) 
plt.plot(x,test)  

With the inner product notation, we have for the continuous problem $u(x) = \left< G(x,y),f(y) \right>$ (where the integral is with respect to $y$ not $x$), and now for the discrete problem we also have that $v(x_j) = \left< G^k(x_j), f \right>_h$ (where the summation is with respect to $k$ not $j$).

This implies another way to construct solutions, which we explore numerically.

In [None]:
# Another way to construct solutions

# First, the old way of constructing solutions
b_old = h**2 * (3*x+x**2)*np.exp(x)  # So, f is (3x+x^2)e^x

v_old = np.zeros(n+2)
v_old[1:-1] = np.linalg.solve(A, b_old[1:-1])  # Numerical soln. using Gaussian elimination

# Now, the new one
b_new = (3*x+x**2)*np.exp(x)

v_new = np.zeros(n+2)
for j in range(1,n+1):
    v_new[j] += inner_h(G_k[j,:], b_new, h)

u = x*(1-x)*np.exp(x)  # Exact soln.

# Let's compare the approaches
%matplotlib widget
plt.figure(1)
plt.plot(x, v_new, 'g--', label='Num. Soln. $v_{new}$')
plt.plot(x, v_old, 'b.', markersize=15, label='Num. Soln. $v_{old}$')
plt.legend(loc='upper left', shadow=True)

In [None]:
# Checking that the maximum principle holds

ns = range(1,30)

for n in ns:
    x = np.linspace(0,1,n+2)
    h = x[1]-x[0]  # So 1/h = n+1

    G_k = make_G(n, x)
    
    temp = np.zeros(n+2)
    for i in range(0,n+2):
        temp[i] = inner_h(G_k[:,i], np.ones(n+2), h)  # Choosing f\equiv 1, so the max value should be $\leq$ 1/8
        
    print(np.max(temp))

---
#### A word of caution about discrete Green's functions.
---

If we know $G$ for the continuous problem, then it appears as if determining the discrete Green's function is rather trivial and we can use it to easily determine solutions through simple computations of inner products and sums of these inner products.

Formally, we think of $G$ as the inverse of the differential operator $L$ applied to $\delta(x-y)$, and we can think of the discrete Green's function in a similar way as the inverse of $L_h$ applied to $\frac{1}{h}e^k$. Since $L_h$ acts on the interior grid values as $\frac{1}{h^2}A$, this simply means that the discrete Green's function may be constructed by determining $L_h^{-1}=h^2A^{-1}$ and then multiplying this by the standard basis vectors (scaled by $1/h$) of $\mathbb{R}^n$. 

However, this is a ***stupid*** computation because it involves inverting a matrix, which is generally a computational expense we try to avoid whenever possible. 

Just like with regular Green's functions, which are really difficult to determine in general cases, the mere existence of one is usually enough to infer useful properties of the solution similar to how the existence of the inverse of a matrix is also useful even if we never construct it.
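
To make the relationship between the inverse and the discrete Green's function concrete, the sketch below forms the inverse explicitly on a tiny grid, purely for illustration. It assumes $A$ is the tridiagonal matrix with $2$ on the diagonal and $-1$ on the off-diagonals, so that $L_h^{-1}\left(\frac{1}{h}e^k\right) = hA^{-1}e^k$ on interior grid values; `green` is a hypothetical vectorized helper.

```python
import numpy as np

def green(x, y):
    """Green's function for -u'' = f on (0,1) with u(0)=u(1)=0 (vectorized)."""
    return np.where(y <= x, y*(1 - x), x*(1 - y))

n = 5
x = np.linspace(0, 1, n+2)
h = x[1] - x[0]
xi = x[1:-1]                                            # interior grid points

A = 2.0*np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)    # assumed form of A

G_int = green(xi[:, None], xi[None, :])   # G_int[j-1, k-1] = G(x_j, x_k)
G_from_inverse = h * np.linalg.inv(A)     # explicit inverse, for illustration only

print(np.max(np.abs(G_from_inverse - G_int)))

# In practice, solve the linear system rather than forming the inverse
f = (3*xi + xi**2) * np.exp(xi)
v_solve = np.linalg.solve(A, h**2 * f)
v_green = h * G_int @ f
print(np.max(np.abs(v_solve - v_green)))
```

Both printed differences should be at the level of rounding error, which shows that the two constructions agree even though only the linear solve would be used for larger $n$.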

---
### <a id='Section2.3.5'>Section 2.3.5: Convergence of Discrete Solutions to Continuous Solutions</a>
---

Here, the convergence is considered as $h=\frac{1}{n+1}\to 0$, which is equivalent to stating convergence as $n\to\infty$. We will ultimately show that if $u\in \mathcal{C}^2_0((0,1))$ is the continuous solution and $v\in D_{h,0}$ is the discrete solution for the same $f\in\mathcal{C}^2([0,1])$, then $\| u - v \|_{h,\infty} = \mathcal{O}(h^2)$.

We have numerically observed the convergence in [Section 2.2](https://github.com/CU-Denver-MathStats-OER/Intro-PDEs-Theory-and-Computations/blob/main/Chp2/Chp2Sec2.ipynb), but examples of numerical convergence are by no means a proof.

To prove convergence, we first require definitions of truncation error and consistency, which are the key ingredients to proving convergence for solutions obtained via finite difference schemes.

---
#### Definition 2.3.1: Truncation Error and Consistency

Let $f\in\mathcal{C}([0,1])$ and $u\in \mathcal{C}^2_0((0,1))$ be the solution of the continuous BVP. Then, $\tau_h\in D_h^n$ defined by

$$
    \tau_h(x_j) := (L_hu)(x_j) - f(x_j), \ 1\leq j\leq n
$$

is called the **truncation error**. We say that the finite difference scheme encoded in the operator $L_h$ is **consistent** if

$$
    \lim_{h\downarrow 0} \| \tau_h \|_{h,\infty} = 0.
$$

---
<br>

**Remarks:**

- Note that $\tau_h$ is a discrete function defined by how well the *exact, continuous* solution satisfies the *discrete* problem.

  There is no reason to believe that the continuous solution should exactly satisfy the discrete problem. However, we definitely have a problem if it does not *almost* solve this problem (meaning that $L_hu \approx L_hv$ where $v$ is the *exact, discrete* solution). 

- We often refer to the truncation error as a vector in $\mathbb{R}^n$ where the $j$th component of the vector is defined by the truncation error function evaluated at $x_j$. Given the isometries that exist between $D_h^n$ and $\mathbb{R}^n$ with the $\infty$-norm, this does not cause any problems.

- Clearly, for finite difference schemes constructed by manipulating Taylor series/polynomials, truncation errors are related to the remainder term describing the error in a Taylor polynomial approximation. We therefore generally expect the truncation error to converge to zero for sufficiently smooth data. This is in fact key to the next two results.

---
#### Lemma 2.3.4: Truncation Error

Suppose $f\in \mathcal{C}^2([0,1])$, then 

$$
    \| \tau_h \|_{h,\infty} \leq \frac{\| f'' \|_\infty}{12} h^2.
$$

---
<br>

Before we prove this lemma, note that $f\in\mathcal{C}^2([0,1])$ is assumed smoother than we have considered before. This is critical to the proof as we see below. Another note is that we are bounding a norm for functions in $D_{h,0}$ with a norm for functions in $\mathcal{C}([0,1])$.

<br>

---
***Proof of Lemma 2.3.4:***

Let $1\leq j\leq n$. 

By the definition of $\tau_h$, we have that

$$
\begin{align}
    \left| \tau_h(x_j) \right| &= \left| (L_hu)(x_j) - f(x_j) \right| \\ \\
                               &= \left| \frac{u(x_{j-1}) - 2u(x_j) + u(x_{j+1})}{h^2} + f(x_j) \right|
\end{align}
$$

By Theorem 2.1.1 (Existence, Uniqueness, and Smoothness), $f\in\mathcal{C}^2([0,1])$ implies $u\in \mathcal{C}^4_0((0,1))$. Thus, applying Taylor's theorem as was done for the analysis of the centered finite difference scheme in [Section 2.2](https://github.com/CU-Denver-MathStats-OER/Intro-PDEs-Theory-and-Computations/blob/main/Chp2/Chp2Sec2.ipynb), we have that

$$
    \frac{u(x_{j-1}) - 2u(x_j) + u(x_{j+1})}{h^2} = u''(x_j) + \frac{h^2}{24}\left[u^{(4)}(\xi_1) + u^{(4)}(\xi_2) \right]
$$

for some $\xi_1\in[x_{j-1}, x_j]$ and $\xi_2\in[x_j, x_{j+1}]$. Moreover, since $u$ is the exact solution to the continuous problem, we have that $-u''(x_j) = f(x_j)$, which implies $u''(x_j)=-f(x_j)$. This implies that

$$
\begin{align}
    \left| \tau_h(x_j) \right| &=  \left| -f(x_j) + \frac{h^2}{24}\left[u^{(4)}(\xi_1) + u^{(4)}(\xi_2) \right] + f(x_j) \right| \\ \\
                               &= \frac{h^2}{24} \left|u^{(4)}(\xi_1) + u^{(4)}(\xi_2)\right| \\ \\
                               &\leq \frac{\|u^{(4)}\|_\infty}{12} h^2.
\end{align}
$$

Since $-u''=f$, we have that $u^{(4)}=-f''$, which we substitute into the inequality above to give

$$
    \left| \tau_h(x_j) \right| \leq  \frac{\|f'' \|_\infty}{12} h^2.
$$

Since the term on the right is independent of $x_j$, applying the supremum over all $1\leq j\leq n$ to both sides gives the result. $\Box$

---
<br>
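
The truncation error bound can be observed numerically with a manufactured solution. The sketch below uses $u=x(1-x)e^x$ and $f=(3x+x^2)e^x$, for which $f''=(8+7x+x^2)e^x$ (computed by hand); the function names are hypothetical, and $\|f''\|_\infty$ is estimated on a fine grid.

```python
import numpy as np

def u_exact(x):
    return x * (1 - x) * np.exp(x)

def f(x):
    return (3*x + x**2) * np.exp(x)        # f = -u''

def f_pp(x):
    return (8 + 7*x + x**2) * np.exp(x)    # f'' computed by hand

M = np.max(np.abs(f_pp(np.linspace(0, 1, 2001))))   # grid estimate of ||f''||_inf

taus, hs = [], []
for n in [19, 39, 79, 159]:
    x = np.linspace(0, 1, n+2)
    h = x[1] - x[0]
    u = u_exact(x)
    Lh_u = -(u[2:] - 2*u[1:-1] + u[:-2]) / h**2      # (L_h u)(x_j), 1 <= j <= n
    taus.append(np.max(np.abs(Lh_u - f(x[1:-1]))))   # ||tau_h||_{h,inf}
    hs.append(h)
    print(n, taus[-1], M * h**2 / 12)                # observed value and the bound

ratios = [taus[i] / taus[i+1] for i in range(len(taus)-1)]
print(ratios)  # halving h should reduce the truncation error by roughly a factor of 4
```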

Note that the above is really just a different way of phrasing something we already knew from [Section 2.2](https://github.com/CU-Denver-MathStats-OER/Intro-PDEs-Theory-and-Computations/blob/main/Chp2/Chp2Sec2.ipynb) where we showed that 

$$
    |E_h(x)| \leq \frac{M_g h^2}{12}, 
$$

where $E_h$ is the remainder/error in approximating the second derivative of $g\in\mathcal{C}^4((0,1))$ at some point $x\in(0,1)$. The difference here is mostly notation and the fact that we restrict the error analysis to solutions of the BVP at the grid points $x_j=jh$ for $1\leq j\leq n$.

We are now in position to state and prove convergence.

<br>

---
#### Theorem 2.3.3: Convergence

Suppose $f\in\mathcal{C}^2([0,1])$ and that $u\in\mathcal{C}^2_0((0,1))$ and $v\in D_{h,0}$ are the continuous and discrete solutions of the corresponding continuous and discrete problems, respectively, then 

$$
    \| u - v \|_{h,\infty} \leq \frac{\| f'' \|_\infty}{96}h^2.
$$

---

<br>

**Spoiler alert:** The $96$ is due to multiplying the $12$ appearing in Lemma 2.3.4 and the $8$ appearing in Theorem 2.3.1.

<br>

---
***Proof of Theorem 2.3.3:***

Let $e\in D_{h,0}$ be defined as $e(x_j):= u(x_j)-v(x_j)$ for $1\leq j\leq n$. 

Since $v$ is the *exact, discrete* solution, $L_hv(x_j) = f(x_j)$ for $1\leq j\leq n$. 

Let $1\leq j\leq n$. The linearity of $L_h$ then implies that

$$
    L_he(x_j) = L_hu(x_j)-L_hv(x_j) = L_hu(x_j)-f(x_j) = \tau_h(x_j).
$$

Thus, $L_he = \tau_h$, which means that $e$ is the *exact, discrete* solution to the BVP associated with the linear interpolant of $\tau_h$. This implies that Theorem 2.3.1 (The Maximum Principle) applies to give

$$
    \| e\|_{h,\infty}\leq \frac{1}{8}\| \tau_h \|_{h,\infty}.
$$

Applying Lemma 2.3.4 gives the result. $\Box$

---

<br>
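
The convergence bound can be compared against observed errors. The following sketch reuses the manufactured pair $u=x(1-x)e^x$, $f=(3x+x^2)e^x$, for which $\|f''\|_\infty = 16e$ on $[0,1]$ since $f''=(8+7x+x^2)e^x$ is increasing. The helper `solve_discrete` is a hypothetical name, and $A$ is assumed to be the tridiagonal matrix with $2$ on the diagonal and $-1$ on the off-diagonals.

```python
import numpy as np

def solve_discrete(n, f):
    """Solve (1/h^2) A v = f at the interior grid points (A assumed tridiag(-1, 2, -1))."""
    x = np.linspace(0, 1, n+2)
    h = x[1] - x[0]
    A = 2.0*np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    v = np.zeros(n+2)
    v[1:-1] = np.linalg.solve(A, h**2 * f(x[1:-1]))
    return v, x, h

u_exact = lambda x: x * (1 - x) * np.exp(x)
f = lambda x: (3*x + x**2) * np.exp(x)
M = 16 * np.e   # ||f''||_inf on [0,1] for this f

errs, hs = [], []
for n in [19, 39, 79, 159]:
    v, x, h = solve_discrete(n, f)
    errs.append(np.max(np.abs(u_exact(x) - v)))   # ||u - v||_{h,inf}
    hs.append(h)
    print(n, errs[-1], M * h**2 / 96)             # observed error and the a priori bound

rates = np.log2(np.array(errs[:-1]) / np.array(errs[1:]))
print(rates)  # observed orders of convergence as h is halved
```

The observed errors should sit below the bound, and the observed orders should be close to $2$.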

**Remarks:**

- The above analysis is typical in that it utilizes an ***a priori error bound*** on $v\in D_{h,0}$ that depends on a constant (in this case $\frac{\| f'' \|_\infty}{96}$) that is *independent* of the mesh/grid multiplied by a function of the grid size (the $h^2$). The constant only depends upon the known data $f$. 

- The convergence is then "clear" once this bound is established because $\lim_{h\to 0} Ch^2 = C\lim_{h\to 0} h^2 = 0$ follows from standard convergence results in elementary analysis.

- We say **a priori** here because the bound is known before the solutions. **A posteriori** error bounds (or even better, error estimates) are possible to construct with some more advanced techniques. We introduce some ideas related to that in [Section 2.4](https://github.com/CU-Denver-MathStats-OER/Intro-PDEs-Theory-and-Computations/blob/main/Chp2/Chp2Sec4.ipynb), which is perhaps best considered as an appendix to this notebook.

---
#### Student Activity
---

Consider the differential equation

$$
    -u''(x)+u(x) = f(x), \ x\in(0,1), \ u(0)=u(1)=0.
$$

1. Define $L$ and use the centered finite difference approximation for $-u''$ to define $L_h$.

2. Define and compute the truncation error $\tau_h$.

3. Show that the scheme is consistent provided that the solution $u$ is sufficiently smooth.

4. Provide some code/numerical examples/plots with manufactured solutions to illustrate these results.

**Students are encouraged to try this on their own. However, solutions (at least partial solutions) are provided below in hidden cells that students can unhide if they get stuck while attempting this.**

**Student Activity Solution to Part 1.**

It is straightforward to define differential operators as standalone operators using derivative notation, so $L:\mathcal{C}_0^2((0,1))\to \mathcal{C}((0,1))$ is defined by

$$L:=-\frac{d^2}{dx^2}+I.$$

It is not as clean to define $L_h$ as a standalone operator. It is easier to define it in terms of its action $L_h: D_{h,0} \to D_{h}^n$, so for $w\in D_{h,0}$, we define

$$
    (L_hw)(x_j) := - \frac{w(x_{j+1}) - 2w(x_j) + w(x_{j-1})}{h^2} + w(x_j), \ \text{ for } 1\leq j \leq n. 
$$

If you wanted to define $L_h$ as a standalone operator, then use a notation like $\delta_j:D_h\to \mathbb{R}$ to denote the operator that maps a function in $D_h$ to its evaluation at $x_j$, i.e., $\delta_j w = w(x_j)$, and then you can write

$$
    L_h := -\frac{1}{h^2}\left(\delta_{j+1} - 2\delta_j + \delta_{j-1}\right) + \delta_j , \ \text{ for } 1\leq j\leq n.
$$

**Student Activity Solution to Part 2.**

The truncation error $\tau_h\in D_h^n$ is defined by

$$
    \tau_h(x_j) := (L_hu)(x_j) - f(x_j), \ 1\leq j\leq n,
$$

i.e., it is the residual of the exact continuous solution plugged into the discrete problem. In this problem, this means

$$
    \tau_h(x_j) = - \frac{u(x_{j+1}) - 2u(x_j) + u(x_{j-1})}{h^2} + u(x_j) - f(x_j), \ 1\leq j\leq n.
$$

**Student Activity Solution to Part 3.**

$$
\begin{align}
    \left| \tau_h(x_j) \right| &= \left| (L_hu)(x_j) - f(x_j) \right| \\ \\
                               &= \left| \frac{u(x_{j-1}) - 2u(x_j) + u(x_{j+1})}{h^2} -u(x_j) + f(x_j) \right|
\end{align}
$$

Assuming $u$ is smooth enough (meaning it has at least four derivatives here), 

$$
    \frac{u(x_{j-1}) - 2u(x_j) + u(x_{j+1})}{h^2} = u''(x_j) + \frac{h^2}{24}\left[u^{(4)}(\xi_1) + u^{(4)}(\xi_2) \right]
$$

for some $\xi_1\in[x_{j-1}, x_j]$ and $\xi_2\in[x_j, x_{j+1}]$. Moreover, since $u$ is the exact solution to the continuous problem, we have that $-u''(x_j) + u(x_j) = f(x_j)$, which implies $u''(x_j)-u(x_j)=-f(x_j)$. This implies that

$$
\begin{align}
    \left| \tau_h(x_j) \right| &=  \left| -f(x_j) + \frac{h^2}{24}\left[u^{(4)}(\xi_1) + u^{(4)}(\xi_2) \right] + f(x_j) \right| \\ \\
                               &= \frac{h^2}{24} \left|u^{(4)}(\xi_1) + u^{(4)}(\xi_2)\right| \\ \\
                               &\leq \frac{\|u^{(4)}\|_\infty}{12} h^2.
\end{align}
$$

Since $-u'' + u=f$, we have that $u^{(4)}=-f''+u''=-f''-f+u$, which we substitute into the above inequality and apply the triangle inequality two times to get

$$
    \left| \tau_h(x_j) \right| \leq  \frac{\|f'' \|_\infty+\|f \|_\infty+\|u \|_\infty}{12} h^2.
$$

This is kind of annoying because we would prefer to write the truncation error in terms of just the data $f$, so we make use of the fact that there exists a Green's function for this problem (we saw this in Section 2.1) that is also bounded, so 

$$
    u(x) = \int_0^1 G(x,y)f(y)\, dy,
$$

where $G(x,y)$ is the function seen in Section 2.1 and since $C=\sup_{x\in[0,1]}\int_0^1 |G(x,y)|\, dy<\infty$, this means that

$$
    \|u\|_\infty \leq C\|f\|_\infty,
$$

which means that

$$
    \left| \tau_h(x_j) \right| \leq  \frac{\|f'' \|_\infty+(C+1)\|f \|_\infty}{12} h^2\to 0 \text{ as } h\to 0.
$$
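
The consistency argument can be illustrated with a manufactured solution. The sketch below makes the hypothetical choice $u=\sin(\pi x)$, so that $f=(\pi^2+1)\sin(\pi x)$ and $\|u^{(4)}\|_\infty=\pi^4$, and compares the observed truncation error with the intermediate bound $\frac{\|u^{(4)}\|_\infty}{12}h^2$ derived above.

```python
import numpy as np

u_exact = lambda x: np.sin(np.pi * x)                 # manufactured solution (hypothetical choice)
f = lambda x: (np.pi**2 + 1) * np.sin(np.pi * x)      # f = -u'' + u

taus, hs = [], []
for n in [9, 19, 39, 79]:
    x = np.linspace(0, 1, n+2)
    h = x[1] - x[0]
    u = u_exact(x)
    # (L_h u)(x_j) = -(u_{j+1} - 2u_j + u_{j-1})/h^2 + u_j for 1 <= j <= n
    Lh_u = -(u[2:] - 2*u[1:-1] + u[:-2]) / h**2 + u[1:-1]
    taus.append(np.max(np.abs(Lh_u - f(x[1:-1]))))    # ||tau_h||_{h,inf}
    hs.append(h)
    print(n, taus[-1], np.pi**4 * h**2 / 12)          # observed value and the bound

ratios = [taus[i] / taus[i+1] for i in range(len(taus)-1)]
print(ratios)  # halving h should reduce the truncation error by roughly a factor of 4
```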

**Student Activity Solution to Part 4.**

In [None]:
# Activity 2.3.1 Part 4

def solve_Lh_b(n, f):  # This is saying "Construct $L_h^{-1}f$" for this activity.
    
    A = make_A(n)  # Construct $A$
    
    x = np.linspace(0, 1, n+2)  # Create the n+2 grid points
    h = x[1]-x[0]  # Determine h=1/(n+1), which is also just the difference between grid points
    b = h**2*f(x[1:-1])  # Construct $b$
    
    # Note that v is a n+2 dimensional vector that is storing what we mean by $L_h^{-1}f$
    v = np.zeros(n+2)  
    
    # Multiplying (1/h^2) A v + v = f by h^2 gives (A + h^2 I) v = h^2 f
    v[1:-1] = np.linalg.solve(A + h**2*np.eye(n), b)  
    
    return v, x, h
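
One possible way to exercise such a solver with a manufactured solution is sketched below in self-contained form. The helper `solve_activity` is a hypothetical stand-in that assumes $A$ is the tridiagonal matrix with $2$ on the diagonal and $-1$ on the off-diagonals, so the discrete system $\frac{1}{h^2}Av+v=f$ becomes $(A+h^2I)v=h^2f$. The manufactured solution $u=\sin(\pi x)$ with $f=(\pi^2+1)\sin(\pi x)$ is also just an illustrative choice.

```python
import numpy as np

def solve_activity(n, f):
    """Sketch of a solver for -u'' + u = f with u(0)=u(1)=0 (hypothetical helper)."""
    x = np.linspace(0, 1, n+2)
    h = x[1] - x[0]
    A = 2.0*np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # assumed form of A
    v = np.zeros(n+2)
    # (1/h^2) A v + v = f  is equivalent to  (A + h^2 I) v = h^2 f
    v[1:-1] = np.linalg.solve(A + h**2*np.eye(n), h**2 * f(x[1:-1]))
    return v, x, h

u_exact = lambda x: np.sin(np.pi * x)
f = lambda x: (np.pi**2 + 1) * np.sin(np.pi * x)

errs = []
for n in [9, 19, 39, 79]:
    v, x, h = solve_activity(n, f)
    errs.append(np.max(np.abs(u_exact(x) - v)))   # ||u - v||_{h,inf}
    print(n, h, errs[-1])

rates = np.log2(np.array(errs[:-1]) / np.array(errs[1:]))
print(rates)  # observed orders of convergence as h is halved
```

Observed orders close to $2$ are consistent with the $\mathcal{O}(h^2)$ truncation error derived in Part 3.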

---
#### Student Activity 
---

By Theorem 2.3.3, we have that for $e=u-v$ for the original BVP considered in this notebook that 

$$
    \|e\|_{h, \infty} \leq \frac{\| f'' \|_\infty}{96}h^2
$$

Suppose we want the error in the numerical approximation $v\approx u$ to be less than $10^{-k}$ for some $k\in\mathbb{N}$ at every grid point, then we can "bound the bound" and substitute $h=\frac{1}{n+1}$ to determine the number of interior grid points $n$ that will *guarantee* the desired error tolerance is achieved.

- Define functions $f$ based on manufactured solutions for which the suggested approach above, with an error tolerance of $10^{-5}$, leads to $n>10^3$ and $n>10^4$.

- Demonstrate that the resulting $n$ leads to the desired error tolerance in code cells below.

*Hint: Smooth functions that have large second derivatives are often the result of the repeated application of the chain rule.*

*Looking ahead: Suppose we wanted to control the error only at a few points, say $x=0.25$ and $x=0.66$. Can you create a finite difference scheme based on a non-uniform mesh to achieve the desired error tolerance at only these points with significantly fewer grid points? Is this necessary? We will consider adjoint-based a posteriori error estimation in the next notebook, which provides an alternative approach.*
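
To illustrate the mechanics of "bounding the bound" without completing the activity, the sketch below applies the approach to the manufactured pair $u=x(1-x)e^x$, $f=(3x+x^2)e^x$, for which $\|f''\|_\infty=16e$ on $[0,1]$. This particular $f$ leads to a much smaller $n$ than the activity asks for. The matrix $A$ is assumed to be the tridiagonal matrix with $2$ on the diagonal and $-1$ on the off-diagonals.

```python
import numpy as np

u_exact = lambda x: x * (1 - x) * np.exp(x)
f = lambda x: (3*x + x**2) * np.exp(x)
M = 16 * np.e        # ||f''||_inf on [0,1], since f'' = (8 + 7x + x^2) e^x is increasing
tol = 1e-5

# Require M h^2 / 96 <= tol with h = 1/(n+1), i.e., n + 1 >= sqrt(M / (96 tol))
n = int(np.ceil(np.sqrt(M / (96 * tol)) - 1))

x = np.linspace(0, 1, n+2)
h = x[1] - x[0]
A = 2.0*np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # assumed form of A
v = np.zeros(n+2)
v[1:-1] = np.linalg.solve(A, h**2 * f(x[1:-1]))

bound = M * h**2 / 96                     # a priori bound from Theorem 2.3.3
err = np.max(np.abs(u_exact(x) - v))      # observed ||u - v||_{h,inf}
print(n, bound, err)
```

The observed error is typically smaller than the bound, so the computed $n$ is sufficient but not necessarily the smallest value that achieves the tolerance.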

---
## Navigation:

- [Previous](https://github.com/CU-Denver-MathStats-OER/Intro-PDEs-Theory-and-Computations/blob/main/Chp2/Chp2Sec2.ipynb)

- [Next](https://github.com/CU-Denver-MathStats-OER/Intro-PDEs-Theory-and-Computations/blob/main/Chp2/Chp2Sec4.ipynb)
---