# 7) FD Solutions

Last time:

- Introduction to Numerical Analysis 
- Accuracy, Consistency, Stability, Convergence
- Lax equivalence theorem

Today:
1. Measuring errors  
2. Stability  
3. Consistency  
4. Second order derivatives  
5. Matrix representations and properties  
6. Second order derivative with Dirichlet boundary conditions  
7. Discrete Green’s functions  
8. Interpolation by Vandermonde matrices  

## 1.  Measuring errors

Recap: 
* Last time we looked at the error of the _forward difference_, the _backward difference_, and the _centered difference_



In [None]:
using Plots
using LaTeXStrings
default(linewidth=3)
default(legendfontsize=12)

In [None]:
# These are inline functions: take arrays x and u, spit out an array of differences u' over array x.
diff1l(x, u) = x[2:end],   (u[2:end] - u[1:end-1]) ./ (x[2:end] - x[1:end-1])
diff1r(x, u) = x[1:end-1], (u[2:end] - u[1:end-1]) ./ (x[2:end] - x[1:end-1])
diff1c(x, u) = x[2:end-1], (u[3:end] - u[1:end-2]) ./ (x[3:end] - x[1:end-2])
difflist = [diff1l, diff1r, diff1c]

n = 40 
h = 6 / (n - 1) # grid spacing on [-3, 3]
x = LinRange(-3, 3, n)
u = sin.(x)
fig = plot(cos, xlims=(-3, 3), label = L"f'(x)=\cos(x)")
for d in difflist
    xx, yy = d(x, u)
    plot!(fig, xx, yy, marker=:circle, label=d)
end

In [None]:
fig

It's time for a worked exercise.
* What does this code do?
* What property of the difference formulas (methods) are we measuring? (hint: error vs N is ?)

In [None]:
using LinearAlgebra

grids = 2 .^ (2:10) # Hint: dot operator vectorizes, so ".^" is just vectorized "^" over all entries of a collection/tuple/array
hs = 1 ./ grids 

function refinement_error(f, fprime, d) 
    error = []
    for n in grids
        x = LinRange(-3, 3, n)
        xx, yy = d(x, f.(x))
        push!(error, norm(yy - fprime.(xx), 2)/sqrt(n)) # uses a normalized 2-norm, as Root Mean Square (RMS) 
    end
    error
end

In [None]:
# We are measuring the CONVERGENCE of these methods

# How many gridpoints we use: 2^2, 2^3, ..., 2^10
grids = 2 .^ (2:10) 
# Proxy for the grid spacing (the true spacing on [-3, 3] is 6/(n-1)): 2^(-2), 2^(-3), ... 2^(-10)
hs = 1 ./ grids 

# We define a function to measure the convergence of these methods
# The function takes 3 arguments: `f` and `fprime` are callables (functions)
# and `d` is a difference formula (a function such as `diff1l`) that returns (points, derivative values)
function refinement_error(f, fprime, d) 
    error = []
    # Loop over the grids as they get more refined
    for n in grids
        x = LinRange(-3, 3, n) # domain
        # Compute numerical derivative for this grid
        xx, yy = d(x, f.(x))
        # Compute error and add it to an array
        push!(error, norm(yy - fprime.(xx), 2)/sqrt(n)) # push! = add to array
    end
    # Return the array of errors
    error
end

In [None]:
fig = plot(xscale=:log10, yscale=:log10)
for d in difflist
    error = refinement_error(sin, cos, d)
    plot!(fig, hs, error, marker=:circle, label=d)
end
plot!(fig, hs, [hs hs .^ 2], label=["h" "\$h^2\$"])



:::{note}
- Different [norms](https://en.wikipedia.org/wiki/Norm_(mathematics)) can be defined. 
- The Euclidean norm  is also called the quadratic norm, $L^2$ norm,
$ℓ^2$ norm, $2$-norm, or square norm. This is a special case ($p=2$) of the $L^p$ norm for [$L^p$-spaces](https://en.wikipedia.org/wiki/Lp_space).
:::

- What happens if we use a 1-norm, 2-norm, or Inf-norm? 

In [None]:
function refinement_error(f, fprime, d) 
    error = []
    for n in grids
        x = LinRange(-3, 3, n)
        xx, yy = d(x, f.(x))
        push!(error, norm(yy - fprime.(xx), 1)) # uses the 1-norm (not normalized by n)
    end
    error
end

In [None]:
fig = plot(xscale=:log10, yscale=:log10)
for d in difflist
    error = refinement_error(sin, cos, d)
    plot!(fig, hs, error, marker=:circle, label=d)
end
plot!(fig, hs, [hs hs .^ 2], label=["h" "\$h^2\$"])

- The Root Mean Square (RMS)-like error used in the previous example measures the average size of the error _per grid point_.
   * It is like asking: "On average, how wrong is my numerical derivative at a typical grid point?" 
   * This question is useful, regardless of how coarse/fine the grid is

In general, the quantity 

$$
\|f\|_h = \left[ h \sum_{m=-\infty}^{\infty} \left| f_{m} \right|^2
\right]^{1/2}
$$

is the $L^2$ norm of the grid function $f$, and is a measure of the size (energy) of the solution. 
- The multiplication by $h \equiv \Delta x$ is needed
so that the norm is not sensitive to grid refinements (the number of
points increases as $h\rightarrow 0$).
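
As a small numerical check of the scaling argument above, the sketch below samples $\sin(x)$ on $[-3, 3]$ and compares the plain 2-norm with the $h$-weighted grid norm. The helper names `grid_norm` and `compare_norms` are hypothetical and not part of the lecture code. The sum here is finite, and the grid is assumed to be uniform.

```julia
using LinearAlgebra

# Hypothetical helper: h-weighted discrete 2-norm of a grid function
grid_norm(f, h) = sqrt(h * sum(abs2, f))

# Returns (plain 2-norm, h-weighted norm) of sin sampled at n points on [-3, 3]
function compare_norms(n)
    x = LinRange(-3, 3, n)
    h = x[2] - x[1]
    f = sin.(x)
    norm(f), grid_norm(f, h)
end

for n in (10, 100, 1000)
    plain, weighted = compare_norms(n)
    println("n = $n: plain = $plain, weighted = $weighted")
end
```

The plain norm should keep growing roughly like $\sqrt{n}$, while the weighted norm should settle near the continuous value $\sqrt{3 - \sin(6)/2}$.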


## 2. Stability


:::{admonition} **Activity**
- Read [**What is Numerical Stability?**](https://nhigham.com/2020/08/04/what-is-numerical-stability/) and discuss in small groups
- Share insights in class discussion
:::

<img src="https://fncbook.com/build/backwarderror-55621e558c526e24b8fc1d61b00b65a3.svg" width="90%" />

Source: [FNC: backward error](https://fncbook.com/stability/#backward-error)

### (Forward) Stability
**"nearly the right answer to nearly the right question"**
$$ \frac{\lvert \tilde f(x) - f(\tilde x) \rvert}{| f(\tilde x) |} \in O(\epsilon_{\text{machine}}) $$
for some $\tilde x$ that is close to $x$. Recall that $\tilde f$ is the computed, approximate solution and $f$ is the analytical one.

### Backward Stability
**"exactly the right answer to nearly the right question"**
$$ \tilde f(x) = f(\tilde x) $$
for some $\tilde x$ that is close to $x$

Note:
* Every backward stable algorithm is ($\implies$) stable.
* Not every stable algorithm is backward stable.
* The difference is in the _focus_: forward analysis is concerned with what the method reaches, while backward analysis looks at the problem being solved (which is why we can speak of unstable methods and ill-conditioned problems). 
* In a backward stable algorithm the errors introduced during the algorithm have the same effect as a small perturbation in the data. 
* If the backward error is the same size as any uncertainty in the data then the algorithm produces as good a result as we can expect.


* An algorithm is stable (in the forward sense above) if its output has a small relative error compared with the exact answer for some slightly perturbed _input_
* = "You get a nearly correct answer to a nearly correct question"

Question:
* Are there "rough" functions for which these formulas estimate $u'(x_i) = 0$?


In [None]:
x = LinRange(-1, 1, 9)
f_rough(x) = cos(.1 + 4π*x)
fp_rough(x) = -4π*sin(.1 + 4π*x)

plot(x, f_rough, marker=:circle) # Sparse sampling of the function
plot!(f_rough, label = L"f(x) = \cos(0.1 + 4 \pi x)")

In [None]:
fig = plot(fp_rough, xlims=(-1, 1), label = L"f'(x) = -4 \pi \sin(0.1 + 4 \pi x)")
for d in difflist
    xx, yy = d(x, f_rough.(x))
    plot!(fig, xx, yy, label=d, marker=:circle)
end
fig

* If we have a solution $u(x)$, then $u(x) + f_{\text{rough}}(x)$ is indistinguishable to our FD method.
* Therefore, given a small input relative error, we can get a large output relative error.

Informally, a stable discretization rules this behavior out:
* There do not exist "bad" functions that also satisfy the equation.
* The solution does not "blow up" for time-dependent problems.
* The definition here is intentionally vague, and there are more subtle requirements for problems like incompressible flow.
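
The observation that the sampled rough mode is invisible to the centered difference can be checked numerically. The sketch below is a minimal check, assuming the same 9-point grid on $[-1, 1]$; the helper name `centered` is hypothetical and simply repeats the formula used in `diff1c`.

```julia
# Centered difference on a grid (same formula as `diff1c` in the lecture)
centered(x, u) = (u[3:end] - u[1:end-2]) ./ (x[3:end] - x[1:end-2])

x = LinRange(-1, 1, 9)                  # spacing h = 1/4
f_rough(x) = cos(0.1 + 4π * x)          # samples alternate in sign on this grid
fp_rough(x) = -4π * sin(0.1 + 4π * x)   # exact derivative

du = centered(x, f_rough.(x))
println("max |centered difference| = ", maximum(abs, du))
println("max |exact derivative|    = ", maximum(abs, fp_rough.(x[2:end-1])))
```

The centered estimate should vanish up to rounding even though the exact derivative at the interior grid points does not.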

## 3. Consistency

* When we apply the discretized differential operator to the exact solution, we get a small residual.
* The residual converges to zero under grid refinement.
* Hopefully fast as $h \to 0$

Recall, one part of **Lax equivalence theorem**: Consistency + Stability $\implies$ Convergence
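
To make the notion of a residual concrete, the following sketch applies the centered difference to the exact function $u = \sin$ at a single point and watches the residual shrink as $h$ is halved. The function name `residual` and the evaluation point `x0 = 1.0` are illustrative assumptions.

```julia
# Residual of the centered difference applied to the exact solution u = sin at x0
residual(h; x0 = 1.0) = abs((sin(x0 + h) - sin(x0 - h)) / (2h) - cos(x0))

hs_demo = [0.1, 0.05, 0.025]
res = residual.(hs_demo)
# Halving h should cut the residual by about 4 for a second-order formula
ratios = res[1:end-1] ./ res[2:end]
println(res)
println(ratios)
```

Ratios close to 4 are consistent with a residual that scales like $h^2$.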

## 4. Second order derivatives

We can compute a second derivative by applying first derivatives twice, but which ones?

In [None]:
function diff2a(x, u)
    xx, yy = diff1c(x, u)
    diff1c(xx, yy)
end


### Exercise 7.1

In [None]:

function diff2b(x, u)
    ## Compute the second order derivative as a forward (right) difference of a backward (left) difference 

    xx, yy = diff1l(x, u)
    diff1r(xx, yy) 
end

In [None]:

diff2list = [diff2a, diff2b]
n = 20
x = LinRange(-3, 3, n)
u = - cos.(x); # f(x)

In [None]:
fig = plot(cos, xlims=(-3, 3), label = L"f''(x)=\cos(x)")
for d2 in diff2list
    xx, yy = d2(x, u)
    plot!(fig, xx, yy, marker=:circle, label=d2)
end
fig

### How fast do these approximations converge?

In [None]:
grids = 2 .^ (3:10)
hs = 1 ./ grids
function refinement_error2(f, f_xx, d2)
    error = []
    for n in grids
        x = LinRange(-3, 3, n)
        xx, yy = d2(x, f.(x))
        push!(error, norm(yy - f_xx.(xx), Inf)) # which norm?
    end
    error
end

In [None]:
fig = plot(xlabel="h", xscale=:log10, ylabel="Error", yscale=:log10)
for d2 in diff2list
    error = refinement_error2(x -> -cos(x), cos, d2)
    plot!(fig, hs, error, marker=:circle, label=d2)
end
plot!(fig, hs, hs .^ 2, label="\$h^2\$") 

* Both methods are second order accurate.
* The `diff2b` method is more accurate than `diff2a` (by a factor of 4)
* The `diff2a` method can't compute derivatives at points adjacent to the boundary.
* We don't know yet whether either is stable
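
The order of accuracy and the factor of 4 can be estimated directly from errors at successive spacings. The sketch below evaluates both stencils at a single point rather than on a whole grid; the names `d2_wide`, `d2_compact`, and `order` are hypothetical, with `d2_wide` standing in for the stencil that results from applying the centered difference twice on a uniform grid.

```julia
# Second-derivative stencils on a uniform grid, evaluated at a single point x0:
# d2_wide is the result of applying the centered difference twice (spacing 2h),
# d2_compact is the usual three-point stencil (spacing h).
d2_wide(f, x0, h)    = (f(x0 + 2 * h) - 2 * f(x0) + f(x0 - 2 * h)) / (4 * h^2)
d2_compact(f, x0, h) = (f(x0 + h) - 2 * f(x0) + f(x0 - h)) / h^2

f(x) = -cos(x)       # so that f''(x) = cos(x)
x0 = 0.5
hs_demo = [0.1, 0.05, 0.025]
err_wide    = [abs(d2_wide(f, x0, h) - cos(x0)) for h in hs_demo]
err_compact = [abs(d2_compact(f, x0, h) - cos(x0)) for h in hs_demo]

# Observed order p from e(h) ≈ C h^p: p ≈ log2(e(h) / e(h/2))
order(e) = log2.(e[1:end-1] ./ e[2:end])
println("order (wide):    ", order(err_wide))
println("order (compact): ", order(err_compact))
println("error ratio wide/compact: ", err_wide ./ err_compact)
```

Observed orders near 2 and error ratios near 4 would match the bullets above, since the wide stencil behaves like the compact one with spacing $2h$.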

## 5. Matrix representations and properties  

* All our `diff*` functions thus far have been linear in `u`, therefore they can be represented as **differentiation matrices**.
$$\frac{u_{i+1} - u_i}{x_{i+1} - x_i} = \begin{bmatrix} -1/h & 1/h \end{bmatrix} \begin{bmatrix} u_i \\ u_{i+1} \end{bmatrix}$$

* More generally: 
$$\begin{bmatrix} u'(x_1) \\ u'(x_2) \\ \vdots \\ u'(x_n) \end{bmatrix} = \begin{bmatrix} D_{11} & D_{12} & \ldots & D_{1n} \\ D_{21} & D_{22} & \ldots & D_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ D_{n1} & D_{n2} & \ldots & D_{nn} \end{bmatrix} \begin{bmatrix} u(x_1) \\ u(x_2) \\ \vdots \\ u(x_n) \end{bmatrix}\quad \text{or} \quad \mathbf{u'} = \mathbf{D} \mathbf{u}$$

In [None]:
# Differentiation matrix for a first order derivative
function diff1_mat(x)
    n = length(x)
    D = zeros(n, n)
    h = x[2] - x[1]
    D[1, 1:2] = [-1/h  1/h] # Use a first-order forward difference at the left boundary
    for i in 2:n-1
        D[i, i-1:i+1] = [-1/2h  0  1/2h] # In the interior points, use a second-order centered difference
    end
    D[n, n-1:n] = [-1/h  1/h] # Use a first-order backward difference at the right boundary
    D
end
x = LinRange(-1, 1, 5)
diff1_mat(x)

In [None]:
n = 12
x = LinRange(-3, 3, n)
plot(x, diff1_mat(x) * sin.(x), marker=:circle, label = L"D * \sin(x)")
plot!(cos, label = L"f'(x)=\cos(x)")

### How accurate is this derivative matrix?

In [None]:
fig = plot(xscale=:log10, yscale=:log10, legend=:topleft)
ref_error = refinement_error(sin, cos, (x, u) -> (x, diff1_mat(x) * u))
plot!(fig, hs, ref_error, marker=:circle, label = "refinement error")
plot!(fig, hs, hs, label="\$h\$")
plot!(fig, hs, hs .^ 2, label="\$h^2\$")

### Can we study it as a matrix?

In [None]:
function my_spy(A)
    cmax = norm(vec(A), Inf)
    s = max(1, ceil(120 / size(A, 1)))
    spy(A, marker=(:square, s), c=:diverging_rainbow_bgymr_45_85_c67_n256, clims=(-cmax, cmax))
end

D = diff1_mat(x)
my_spy(D)

In [None]:
svdvals(D) # Singular values given by the SVD decomposition

In [None]:
cond(D) # condition number of the matrix

In [None]:
open("../../../img/svd_viz.svg") do f
   display("image/svg+xml", read(f, String))
end

## 6. Second order derivative with Dirichlet boundary conditions

Recall the Poisson equation from the [Intro to PDEs lecture note](https://sdsu-math693b.github.io/spring26/lectures/module1-3_intro_to_pdes.html). 

\begin{gather} -\frac{d^2 u}{dx^2} = f(x) \quad x \in \Omega = (-1,1) \\
u(-1) = a \quad \frac{du}{dx}(1) = b .
\end{gather}


* Turn this into a linear system by replacing 
    * $x \to [x_1, x_2, \ldots, x_n] = \mathbf{x}$,
    * $u(x) \to [u_1, u_2, \ldots, u_n] = \mathbf{u}$,
    * $f(x) \to [f_1, f_2, \ldots, f_n] = \mathbf{f}$,
    * $-\frac{d^2}{dx^2} \to -\mathbf{D}^2$.
    
    $$ -\mathbf{D}^2\mathbf{u} = \mathbf{f}$$
    
* How to encode left boundary condition? Discuss.

The left endpoint in our example boundary value problem has a Dirichlet boundary condition,
$$u(-1) = a . $$
With finite difference methods, we have an explicit degree of freedom $u_1 = u(x_1 = -1)$ at that endpoint.
When building a matrix system for the BVP, we can implement this boundary condition by modifying the first row of the matrix,
$$ \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ \\ & & A_{2:n,:} & & \\ \\ \end{bmatrix} \begin{bmatrix} u_1 \\ \\ u_{2:n} \\ \\ \end{bmatrix} = \begin{bmatrix} a \\ \\ f_{2:n} \\ \\ \end{bmatrix} . $$

* This matrix is no longer symmetric, even if the interior part $A_{2:n-1,:}$ $( = -D^2)$ is.
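
The loss of symmetry can be confirmed with `issymmetric` from `LinearAlgebra`. The sketch below is a minimal check under assumptions that go slightly beyond the text: a uniform grid on $[-1, 1]$, the $[-1, 2, -1]/h^2$ interior stencil, and identity rows at both ends (a Dirichlet condition on each side).

```julia
using LinearAlgebra

# Assumed setup: uniform grid, [-1, 2, -1]/h^2 interior stencil, identity boundary rows
n = 6
h = 2 / (n - 1)
A = zeros(n, n)
A[1, 1] = 1
A[n, n] = 1
for i in 2:n-1
    A[i, i-1:i+1] = [-1, 2, -1] / h^2
end

println(issymmetric(A))                 # full matrix, boundary rows included
println(issymmetric(A[2:n-1, 2:n-1]))   # interior block only
```

The asymmetry comes from entries such as `A[2, 1]`, which is nonzero while `A[1, 2]` is zero.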

In [None]:
function laplacian_dirichlet(x)
    n = length(x)
    D = zeros(n, n)
    h = x[2] - x[1]
    D[1, 1] = 1
    for i in 2:n-1
        D[i, i-1:i+1] = (1/h^2) * [-1, 2, -1]
    end
    D[n, n] = 1
    D
end

###  Laplacian operator as a matrix

In [None]:
L = laplacian_dirichlet(x)
my_spy(L)

In [None]:
svdvals(L)

In [None]:
cond(L)

### Solutions

* For now, let's say the right boundary condition is also Dirichlet, $u(1) = b$.

$$ \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ \\ & & A_{2:n-1,:} & & \\ \\ \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} u_1 \\ \\ u_{2:n-1} \\ \\ \\ u_n \end{bmatrix} = \begin{bmatrix} a \\ \\ f_{2:n-1} \\ \\ \\ b \end{bmatrix} . $$

In [None]:

L = laplacian_dirichlet(x)
f = one.(x)
f[1] = 0 # left BC
f[end] = 0; # right BC
plot(x, f, label = L"f(x)")

In [None]:
u = L \ f # This is syntax for the direct solution of the system Lu = f (similar to MATLAB syntax)
plot(x, u, label = L"u = L^{-1} f ")

## 7. Discrete "Green's functions"

* Green's function: impulse response
* If we write the LHS of our equation as a linear differential operator $L$ acting on $u$,

$$ Lu(x) = f(x),$$

then the Green's function $G(x, s)$ ($s$ is for "source") is defined as

$$ LG(x, s) = \delta(x - s) $$

* Motivation: once we have $G(x, s)$, we can "build up" solutions from it for different $f$. Why?

Multiplying the definition of the Green's function by $f(s)$ and integrating over $s$, we get

$$ \int LG(x,s)\,f(s)\,ds=\int \delta (x-s)\,f(s)\,ds=f(x)\, .$$

Due to $L$ only acting on $x$ and being linear, we can bring $L$ out:

$$ L\left(\int G(x,s)\,f(s)\,ds\right)=f(x)\, ,$$
meaning

$$ u(x)=\int G(x,s)\,f(s)\,ds. $$

* Position of the source, $s$, can only be one of the $x_i$ in the linear system
* In matrix form:

$$ LG = I $$

* So $G = L^{-1}$, and the columns of $L^{-1}$ are the "Green's functions" for varying $s$.

In [None]:
plot(x, inv(L)[:, 2], label = L"L^{-1}_{:,2}") # second column of L^{-1}
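
The discrete analogue of $u(x)=\int G(x,s)\,f(s)\,ds$ is a weighted sum of the columns of $L^{-1}$. The sketch below checks this superposition against a direct solve. It rebuilds the matrix under the hypothetical name `laplacian_dirichlet_sketch` so that it runs on its own, and assumes a uniform grid on $[-1, 1]$ with homogeneous Dirichlet values.

```julia
using LinearAlgebra

# Assumed setup, mirroring the lecture's `laplacian_dirichlet` on a uniform grid
function laplacian_dirichlet_sketch(x)
    n = length(x)
    h = x[2] - x[1]
    A = zeros(n, n)
    A[1, 1] = 1
    A[n, n] = 1
    for i in 2:n-1
        A[i, i-1:i+1] = [-1, 2, -1] / h^2
    end
    A
end

x = LinRange(-1, 1, 11)
A = laplacian_dirichlet_sketch(x)
G = inv(A)                       # column j is the response to a unit source at x[j]

f = ones(length(x))
f[1] = 0                         # left boundary value
f[end] = 0                       # right boundary value

u_direct = A \ f
# Superposition: weight each discrete Green's function by the source strength f[j]
u_green = sum(f[j] * G[:, j] for j in eachindex(f))

println(maximum(abs, u_direct - u_green))
```

For this particular right-hand side the continuous solution is $u(x) = (1 - x^2)/2$, which the three-point stencil should reproduce up to rounding because it differentiates quadratics exactly.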

### Discrete eigenfunctions

In [None]:
x = LinRange(-1, 1, 10)
L = laplacian_dirichlet(x)
Lambda, V = eigen(L) # returns an eigen factorization (eigenvalues, matrix of eigenvectors)
plot(Lambda, marker=:circle, label = L"\lambda")

In [None]:
plot(x, V[:, 1:4]) # first 4 eigenvectors

### Outlook on our method

#### Pros
* Consistent
* Stable
* Second order accurate (we hope)

#### Cons
* Only second order accurate (at best)
* Worse than second order on non-uniform grids
* Worse than second order at Neumann boundaries
* Boundary conditions break symmetry

## 8. Interpolation by Vandermonde matrices

We can compute a polynomial

$$ p(x) = c_0 + c_1 x + c_2 x^2 + \dotsb $$

that assumes function values $p(x_i) = u_i$ by solving a linear system with the Vandermonde matrix.

Constraints:

$$ p(x_1) = c_0 + c_1 x_1 + c_2 x_1^2 + \dotsb = u(x_1) $$

$$ p(x_2) = c_0 + c_1 x_2 + c_2 x_2^2 + \dotsb = u(x_2) $$

$$ \vdots $$

Write as 

$$ \underbrace{\begin{bmatrix} 1 & x_1 & x_1^2 & \dotsb \\
    1 & x_2 & x_2^2 & \dotsb \\
    1 & x_3 & x_3^2 & \dotsb \\
    \vdots & & & \ddots \end{bmatrix}}_V \begin{bmatrix} c_0 \\ c_1 \\ c_2 \\ \vdots \end{bmatrix} = \begin{bmatrix} u_1 \\ u_2 \\ u_3 \\ \vdots \end{bmatrix} .$$

In [None]:
function vander(x, k=nothing)
    if k === nothing
        k = length(x)
    end
    V = ones(length(x), k)
    for j = 2:k
        V[:, j] = V[:, j-1] .* x
    end
    V
end

In [None]:
vander(LinRange(-1, 1, 5))

In [None]:
cond(vander(LinRange(-1, 1, 5))) # condition number 

* The condition number of the Vandermonde matrix grows quickly with the number of points. Once it approaches $1/\varepsilon_M \approx 10^{16}$ in double precision, essentially no digits of the computed coefficients can be trusted.
* Why?
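
One way to explore this question empirically is to tabulate the condition number as the number of equispaced points grows. The sketch below redefines the Vandermonde construction under the hypothetical name `vander_sketch` so that it is self-contained; the range of sizes is an arbitrary choice.

```julia
using LinearAlgebra

# Monomial-basis Vandermonde matrix, same construction as `vander` in the lecture
function vander_sketch(x)
    V = ones(length(x), length(x))
    for j in 2:length(x)
        V[:, j] = V[:, j-1] .* x
    end
    V
end

ns = 5:5:30
conds = [cond(vander_sketch(LinRange(-1, 1, n))) for n in ns]
for (n, c) in zip(ns, conds)
    println("n = $n: cond(V) = $c")
end
```

The growth reflects the fact that high powers of $x$ look increasingly alike on $[-1, 1]$, so the columns of $V$ become nearly linearly dependent.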

###  Fitting a polynomial


In [None]:
k = 4
x = LinRange(-2, 2, k)
u = sin.(x)
V = vander(x)
c = V \ u
scatter(x, u, label=L"u_i = \sin(x_i)", legend=:topleft)
plot!(x -> (vander(x, k) * c)[1,1], label="\$p(x)\$")
plot!(sin, label=L"\sin(x)")

### Differentiating

We're given the coefficients $c = V^{-1} u$ of the polynomial
$$
p(x) = c_0 + c_1 x + c_2 x^2 + \dotsb.
$$

What is

\begin{align} p(0) &= c_0 \\
p'(0) &= c_1 \\ 
p''(0) &= c_2 \cdot 2\\
p^{(k)}(0) &= c_k \cdot k! .
\end{align}

In [None]:
# First-derivative stencil: weights that map values at the `source` points to p'(target)
function fdstencil1(source, target)
    x = source .- target
    V = vander(x)
    inv(V)[2, :]' # as a row vector
end
plot([z -> fdstencil1(x, z) * u, cos], xlims=(-3,3))
scatter!(x, 0*x, label="grid points")
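
As a sanity check on the stencil construction, the sketch below applies the same idea to three uniformly spaced points and compares the weights with the familiar centered difference. The names `vander_sketch` and `stencil1` are hypothetical stand-ins for `vander` and `fdstencil1`, redefined here so the block runs on its own.

```julia
using LinearAlgebra

# Same monomial Vandermonde construction as the lecture's `vander`
function vander_sketch(x)
    V = ones(length(x), length(x))
    for j in 2:length(x)
        V[:, j] = V[:, j-1] .* x
    end
    V
end

# Row 2 of inv(V) maps samples u to c_1 = p'(0), with the target shifted to the origin
stencil1(source, target) = inv(vander_sketch(source .- target))[2, :]'

h = 0.1
w = stencil1([-h, 0.0, h], 0.0)
println(w)                       # compare with [-1/(2h), 0, 1/(2h)]
println(w * sin.([-h, 0.0, h]))  # approximates cos(0) = 1
```

Shifting the points by `target` is what lets the second row of $V^{-1}$ be read off as the derivative at an arbitrary location rather than only at the origin.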