---
Some useful $\LaTeX$ commands are defined in this cell:
$$
\newcommand{\abs}[1]{\left\lvert#1\right\rvert}
\newcommand{\norm}[1]{\left\lVert#1\right\rVert}
\newcommand{\set}[1]{\left\{#1\right\}}
\newcommand{\paren}[1]{\left(#1\right)}
\newcommand{\brack}[1]{\left[#1\right]}
\newcommand{\ip}[2]{\left\langle#1,#2\right\rangle}
\DeclareMathOperator{\span}{span}
\abs{x}, \norm{x}, \set{x}, \paren{x}, \brack{x}, \ip{x}{y}, \span
$$

---

---
# 12.2 Orthogonal basis functions
---

It would be great if we could find a basis $\phi_0,\ldots,\phi_n$ for which solving $Bc = b$, where 

$$
B := 
\begin{bmatrix}
\ip{\phi_0}{\phi_0} & \cdots & \ip{\phi_0}{\phi_n} \\
\vdots & \ddots & \vdots\\
\ip{\phi_n}{\phi_0} & \cdots & \ip{\phi_n}{\phi_n} \\
\end{bmatrix},
\qquad
b := 
\begin{bmatrix}
\ip{f}{\phi_0}\\
\vdots\\
\ip{f}{\phi_n}\\
\end{bmatrix},
$$

is trivial.

The best case is when $B = I$, but having $B$ **diagonal** would also be great.

## Orthogonal basis

We want a basis that satisfies:

$$\ip{\phi_i}{\phi_j} = 0, \quad i \neq j.$$

That is, we want $\phi_0,\ldots,\phi_n$ to be **pairwise orthogonal**.

An **orthogonal basis** is a basis that is pairwise orthogonal.

If $\set{\phi_0,\ldots,\phi_n}$ is an orthogonal basis and

$$\ip{\phi_i}{\phi_i} = 1, \quad i = 0,\ldots,n,$$

we say that $\set{\phi_0,\ldots,\phi_n}$ is an **orthonormal basis**.

Note that

$$B_{ii} = \ip{\phi_i}{\phi_i} = \norm{\phi_i}_2^2.$$

Thus, to solve $Bc = b$ for an orthogonal basis $\set{\phi_0,\ldots,\phi_n}$ we simply set

$$ c_i = \frac{b_i}{\norm{\phi_i}_2^2}, \quad i=0,\ldots,n.$$

If the basis is orthonormal, then $B = I$, so we have $c = b$.
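As a small numerical illustration, the diagonal system can be solved by elementwise division. This is only a sketch: the squared norms and the right-hand side below are made-up values, not taken from any particular basis or function $f$.

```julia
using LinearAlgebra

# Assumed example data (not from the text): squared norms ‖ϕ_i‖² and right-hand side b
sqnorms = [2.0, 2/3, 2/5]
b = [1.0, -0.5, 0.25]

B = Diagonal(sqnorms)      # the Gram matrix of an orthogonal basis is diagonal
c = b ./ sqnorms           # c_i = b_i / ‖ϕ_i‖²

@assert c ≈ B \ b          # agrees with a general linear solve
```

No factorization of $B$ is needed; the cost is one division per coefficient.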

## Gram-Schmidt orthogonalization

Given a basis $\set{\psi_0,\ldots,\psi_n}$, we can create an **orthogonal basis** for 

$$\span\set{\psi_0,\ldots,\psi_n}$$

using the [Gram-Schmidt process](http://en.wikipedia.org/wiki/Gram%E2%80%93Schmidt_process).

Suppose we have just two functions $\set{\psi_0,\psi_1}$. First we set 

$$\fbox{$\phi_0 := \psi_0$}$$

Now let $p$ be the orthogonal projection of $\psi_1$ onto $\span\set{\phi_0}$.

That is, $p = c_0 \phi_0$, where $c_0$ is found by solving $Bc = b$.

Since $B = \begin{bmatrix} \ip{\phi_0}{\phi_0} \end{bmatrix}$ and $b = \begin{bmatrix} \ip{\psi_1}{\phi_0} \end{bmatrix}$, we have $c_0 = \ip{\psi_1}{\phi_0} \big/ \ip{\phi_0}{\phi_0}$, so

$$p = \frac{\ip{\psi_1}{\phi_0}}{\ip{\phi_0}{\phi_0}} \phi_0.$$

Then we let $\phi_1 = \psi_1 - p$, which is the residual of the projection of $\psi_1$ onto $\span\set{\phi_0}$.

That is,

$$\phi_1 := \psi_1 - \frac{\ip{\psi_1}{\phi_0}}{\ip{\phi_0}{\phi_0}} \phi_0.$$

Recall that the residual of an orthogonal projection is orthogonal to every basis vector.

Thus, $\ip{\phi_1}{\phi_0} = 0$, so we have an orthogonal basis. 

Additionally, it can be shown that

$$\span\set{\phi_0,\phi_1} = \span\set{\psi_0,\psi_1}.$$
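
As a quick check of these formulas, assume (for this illustration only) the inner product $\ip{f}{g} = \int_0^1 f(x)g(x)\,dx$ and the functions $\psi_0 = 1$, $\psi_1 = x$. Then $\phi_0 = 1$ and

$$\phi_1 = x - \frac{\ip{x}{1}}{\ip{1}{1}} \cdot 1 = x - \frac{1/2}{1} = x - \frac12, \qquad \ip{\phi_1}{\phi_0} = \int_0^1 \paren{x - \frac12}\,dx = 0.$$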

### Induction step

Suppose after $k$ steps we have computed an orthogonal basis $\set{\phi_0,\ldots,\phi_k}$ that satisfies

$$\span\set{\phi_0,\ldots,\phi_k} = \span\set{\psi_0,\ldots,\psi_k}.$$

As before, we let $p = \sum_{j=0}^k c_j \phi_j$ be the projection of $\psi_{k+1}$ onto $\span\set{\phi_0,\ldots,\phi_k}$.

To do this, we need to solve $Bc = b$:

$$
\begin{bmatrix}
\ip{\phi_0}{\phi_0}\\
&\ip{\phi_1}{\phi_1}\\
&&\ddots\\
&&&\ip{\phi_k}{\phi_k}\\
\end{bmatrix}
\begin{bmatrix}
c_0\\
c_1\\
\vdots\\
c_k\\
\end{bmatrix}  
= 
\begin{bmatrix}
\ip{\psi_{k+1}}{\phi_0}\\
\ip{\psi_{k+1}}{\phi_1}\\
\vdots\\
\ip{\psi_{k+1}}{\phi_k}\\
\end{bmatrix}
$$

Therefore, we have

$$p = \sum_{j=0}^k c_j\phi_j = \sum_{j=0}^k \frac{\ip{\psi_{k+1}}{\phi_j}}{\ip{\phi_j}{\phi_j}} \phi_j.$$

Letting $\phi_{k+1}$ be the residual of the projection of $\psi_{k+1}$ onto $\span\set{\phi_0,\ldots,\phi_k}$ (i.e., $\phi_{k+1} = \psi_{k+1} - p$), we obtain

$$\fbox{${\displaystyle \phi_{k+1} := \psi_{k+1} - \sum_{j=0}^k \frac{\ip{\psi_{k+1}}{\phi_j}}{\ip{\phi_j}{\phi_j}} \phi_j}$}$$

Since the residual of an orthogonal projection is orthogonal to every basis vector, we have that $\set{\phi_0,\ldots,\phi_{k+1}}$ is orthogonal.

Additionally, we can show that

$$\span\set{\phi_0,\ldots,\phi_{k+1}} = \span\set{\psi_0,\ldots,\psi_{k+1}}.$$
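
A brief sketch of the argument: since $p \in \span\set{\phi_0,\ldots,\phi_k} = \span\set{\psi_0,\ldots,\psi_k}$, we have

$$\phi_{k+1} = \psi_{k+1} - p \in \span\set{\psi_0,\ldots,\psi_{k+1}} \quad\text{and}\quad \psi_{k+1} = \phi_{k+1} + p \in \span\set{\phi_0,\ldots,\phi_{k+1}}.$$

Moreover, $\phi_{k+1} \neq 0$, since otherwise $\psi_{k+1} = p$ would be a linear combination of $\psi_0,\ldots,\psi_k$, contradicting the assumption that $\set{\psi_0,\ldots,\psi_n}$ is a basis.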

### Summary

Given a basis $\set{\psi_0,\ldots,\psi_n}$, we can create an orthogonal basis $\set{\phi_0,\ldots,\phi_n}$ that satisfies

$$\span\set{\phi_0,\ldots,\phi_n} = \span\set{\psi_0,\ldots,\psi_n}$$

using the **Gram-Schmidt process**:

$$\fbox{
${\displaystyle
\phi_{k} := \psi_{k} - \sum_{j=0}^{k-1} \frac{\ip{\psi_{k}}{\phi_j}}{\ip{\phi_j}{\phi_j}} \phi_j, \quad k=0,\ldots,n.
}$
}
$$


### An orthonormal basis

To obtain an orthonormal basis, we just need to **normalize** each vector by dividing it by its norm:

$$\phi_k \gets \frac{1}{\norm{\phi_k}_2}\phi_k, \quad k=0,\ldots,n.$$
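
The process can be sketched in code for ordinary vectors in $\mathbb{R}^m$. The helper name `gram_schmidt`, its keyword arguments, and the test vectors are illustrative assumptions, not part of any package; the inner product defaults to the Euclidean `dot`.

```julia
using LinearAlgebra

# Hypothetical helper (name and signature are illustrative):
# classical Gram-Schmidt on a list of vectors `ψ` with inner product `ip`.
function gram_schmidt(ψ; ip=dot, orthonormal=false)
    ϕ = [float.(v) for v in ψ]          # start from copies of the ψ_k
    for k in eachindex(ϕ)
        for j in 1:k-1
            # subtract the projection of ψ_k onto the already-computed ϕ_j
            ϕ[k] -= (ip(ψ[k], ϕ[j]) / ip(ϕ[j], ϕ[j])) * ϕ[j]
        end
    end
    if orthonormal
        ϕ = [v / sqrt(ip(v, v)) for v in ϕ]   # divide each vector by its norm
    end
    return ϕ
end

# Assumed example input: three linearly independent vectors in R^3
ψ = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
ϕ = gram_schmidt(ψ)
G = [dot(u, v) for u in ϕ, v in ϕ]      # Gram matrix; off-diagonal entries ≈ 0
```

Passing a different `ip` (for example, a function that integrates a product of polynomials) would give the function-space version used in the rest of this section.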

## Legendre polynomial basis

The **[Legendre polynomials](http://en.wikipedia.org/wiki/Legendre_polynomials)** $\phi_0,\ldots,\phi_n$, named after [Adrien-Marie Legendre](http://en.wikipedia.org/wiki/Adrien-Marie_Legendre), form an orthogonal basis for the space of polynomials of degree at most $n$.

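These polynomials are defined on the interval $[-1,1]$ and are obtained, up to a scaling of each polynomial, by performing the **Gram-Schmidt process** on the monomial basis $\set{1,x,x^2,\ldots,x^n}$ with respect to the inner product $\ip{f}{g} = \int_{-1}^1 f(x)g(x)\,dx$.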

In [None]:
using SymPy, LinearAlgebra
using Plots, LaTeXStrings

In [None]:
n = 5

# Define ψ = [1, x, ..., x^n]
x = symbols("x")
ψ = [x^i for i=0:n]

In [None]:
# Perform Gram-Schmidt to obtain ϕ
ϕ = zeros(Sym, length(ψ))
for k=0:n
    ϕ[k+1] = ψ[k+1]
    for j=0:k-1
        ψϕ = integrate(ψ[k+1]*ϕ[j+1], (x, -1, 1))
        ϕϕ = integrate(ϕ[j+1]*ϕ[j+1], (x, -1, 1))
        ϕ[k+1] -= (ψϕ/ϕϕ)*ϕ[j+1]
    end
end
ϕ
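
For reference, carrying out these integrals by hand for the first few $k$ gives the following (a worked check, not output copied from the cell above):

$$\phi_0 = 1, \qquad \phi_1 = x, \qquad \phi_2 = x^2 - \frac{\ip{x^2}{1}}{\ip{1}{1}} = x^2 - \frac{2/3}{2} = x^2 - \frac13, \qquad \phi_3 = x^3 - \frac{\ip{x^3}{x}}{\ip{x}{x}}\,x = x^3 - \frac{2/5}{2/3}\,x = x^3 - \frac35 x.$$

Each $\phi_k$ produced this way has leading coefficient $1$. These are constant multiples of the Legendre polynomials in their usual normalization, which takes the value $1$ at $x = 1$; for example, $\frac32 \phi_2 = \frac12\paren{3x^2 - 1}$.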

In [None]:
# Plot all functions in ϕ
xx = range(-1, 1, length=1000)

plt = plot(legend=:bottomright, aspect_ratio=:equal, size=(600,600))
for k=0:n
    yy = [subs(ϕ[k+1], x, xi) for xi in xx]
    plot!(xx, yy, label=latexstring("\\phi_$k"))
end
hline!([0], c=:black, label=:none)

In [None]:
# Gram matrix B[i,j] = ⟨ϕ_i, ϕ_j⟩; it should be diagonal since ϕ is orthogonal
B = [integrate(ϕ[i]*ϕ[j], (x, -1, 1)) for i=1:n+1, j=1:n+1]

## A three-term recurrence relation for Legendre polynomials

The Legendre polynomials (normalized so that $\phi_k(1) = 1$, for all $k$) can be described using the following **three-term recurrence relation**:

$$
\begin{align}
\phi_0(x) &= 1,\\
\phi_1(x) &= x,\\
\phi_{k+1}(x) &= \frac{2k+1}{k+1} x \phi_k(x) - \frac{k}{k+1} \phi_{k-1}(x), \quad k = 1,\ldots,n-1.\\
\end{align}
$$

This remarkable fact means that we only need the previous two polynomials to determine the next polynomial.
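
For example, applying the recurrence with $k = 1$ and then $k = 2$ gives

$$
\begin{align}
\phi_2(x) &= \frac{3}{2}\, x \cdot x - \frac{1}{2} \cdot 1 = \frac12\paren{3x^2 - 1},\\
\phi_3(x) &= \frac{5}{3}\, x \cdot \frac12\paren{3x^2 - 1} - \frac{2}{3}\, x = \frac12\paren{5x^3 - 3x}.
\end{align}
$$

Both satisfy $\phi_k(1) = 1$, as required by the normalization.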

---

In [None]:
n = 5

x = symbols("x")

ϕ = zeros(Sym, n+1)
ϕ[1] = 1
ϕ[2] = x
for k=1:n-1
    ϕ[k+2] = expand(((2k+1)*x*ϕ[k+1] - k*ϕ[k])/(k+1))
end
ϕ

In [None]:
# Gram matrix of the polynomials from the recurrence; it should again be diagonal
B = [integrate(ϕ[i]*ϕ[j], (x, -1, 1)) for i=1:n+1, j=1:n+1]

Moreover,

$$\ip{\phi_k}{\phi_k} = \norm{\phi_k}_2^2 = \frac{2}{2k + 1}, \quad k=0,\ldots,n,$$

which matches the diagonal entries of the matrix $B$ computed above.

In [None]:
# Plot all functions in ϕ
xx = range(-1, 1, length=1000)

plt = plot(legend=:bottomright, aspect_ratio=:equal, size=(600,600))
for k=0:n
    yy = [subs(ϕ[k+1], x, xi) for xi in xx]
    plot!(xx, yy, label=latexstring("\\phi_$k"))
end
hline!([0], c=:black, label=:none)

In [None]:
n = 5

xx = range(-1, 1, length=1000)

ϕ0, ϕ1 = ones(length(xx)), xx
plot(legend=:bottomright, aspect_ratio=:equal, xlims=(-1,1), size=(600,600))
plot!(xx, ϕ0, label=L"\phi_0")
plot!(xx, ϕ1, label=L"\phi_1")

# Using the three-term recurrence relation to compute Legendre polynomials
ϕolder, ϕold = ϕ0, ϕ1
for k=1:n-1
    ϕ = ((2k+1)*xx.*ϕold - k*ϕolder)/(k+1)
    ϕolder, ϕold = ϕold, ϕ
    plot!(xx, ϕ, label=latexstring("\\phi_$(k+1)"))
end
hline!([0], c=:black, label=:none)
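
The same recurrence can be packaged as a scalar function that keeps only the previous two values. This is a minimal sketch; `legendre` is a hypothetical helper name, not a function from the packages loaded above.

```julia
# Hypothetical helper: evaluate the degree-n Legendre polynomial
# (normalized so that ϕ_n(1) = 1) at a point x via the three-term recurrence.
function legendre(n::Integer, x::Real)
    n == 0 && return one(float(x))
    ϕolder, ϕold = one(float(x)), float(x)      # ϕ_0(x) = 1, ϕ_1(x) = x
    for k in 1:n-1
        # ϕ_{k+1} = ((2k+1) x ϕ_k - k ϕ_{k-1}) / (k+1)
        ϕolder, ϕold = ϕold, ((2k + 1) * x * ϕold - k * ϕolder) / (k + 1)
    end
    return ϕold
end

legendre(2, 0.5)   # (3*0.25 - 1)/2 = -0.125
```

Evaluating this way costs $O(n)$ operations per point and never forms the polynomial coefficients.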

---

## Best approximation on $[a,b]$

We map between $x \in [-1,1]$ and $t \in [a,b]$ using the affine change of variables

$$t = \frac12 \brack{(b-a)x + (a+b)} \quad \text{and} \quad x = \frac{2t - a -b}{b-a}.$$

Define 

$$\hat\phi_i(t) = \phi_i\paren{\frac{2t - a -b}{b-a}}.$$

Then it can be shown that

$$\ip{\hat\phi_i}{\hat\phi_j} = 
\begin{cases}
0, & i \neq j,\\\\
\displaystyle\frac{b-a}{2i+1}, & i=j.
\end{cases}
$$
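
A sketch of why this holds, assuming the inner product on $[a,b]$ is $\ip{f}{g} = \int_a^b f(t)g(t)\,dt$: substituting $t = \frac12 \brack{(b-a)x + (a+b)}$, so that $dt = \frac{b-a}{2}\,dx$, gives

$$\ip{\hat\phi_i}{\hat\phi_j} = \int_a^b \phi_i\paren{\frac{2t - a - b}{b-a}} \phi_j\paren{\frac{2t - a - b}{b-a}}\,dt = \frac{b-a}{2} \int_{-1}^1 \phi_i(x)\phi_j(x)\,dx.$$

The integral on the right is $0$ for $i \neq j$ and $\frac{2}{2i+1}$ for $i = j$, which yields $\frac{b-a}{2}\cdot\frac{2}{2i+1} = \frac{b-a}{2i+1}$.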

---

## Example

Let's use the **Legendre basis** to find an **exact representation** of the polynomial of degree at most **four** that best fits $f(t) = \cos(t)$ over the interval $[0, 2\pi]$.
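
In the notation above, with $[a,b] = [0,2\pi]$ and assuming the inner product $\ip{f}{g} = \int_a^b f(t)g(t)\,dt$, the polynomial we are looking for can be written as

$$p(t) = \sum_{i=0}^{4} c_i \hat\phi_i(t), \qquad c_i = \frac{\ip{f}{\hat\phi_i}}{\ip{\hat\phi_i}{\hat\phi_i}} = \frac{2i+1}{2\pi} \int_0^{2\pi} \cos(t)\,\hat\phi_i(t)\,dt.$$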

In [None]:
a, b, n = 0, 2PI, 4

t = symbols("t")
x = (2t - a - b)/(b - a)

In [None]:
ϕ = zeros(Sym, n+1)
ϕ[1] = 1
ϕ[2] = x
for k=1:n-1
    ϕ[k+2] = expand(((2k+1)*x*ϕ[k+1] - k*ϕ[k])/(k+1))
end

ϕ

In [None]:
# b_i = ⟨f, ϕ̂_i⟩: integrate each basis function against f(t) = cos(t) over [a, b]
bsym = [integrate(ϕk*cos(t), (t, a, b)) for ϕk in ϕ]

In [None]:
dd = [(b - a)/(2i + 1) for i=0:n]  # B = Diag(dd)

In [None]:
csym = expand.(bsym./dd)

In [None]:
psym = expand(dot(csym,ϕ))

In [None]:
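# Get the monomial coefficients of psym: coef[i] is the coefficient of t^(i-1),
# computed as the (i-1)-th derivative at t = 0 divided by (i-1)! (this overwrites psym)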

coef = zeros(Sym, n+1)
for i = 1:length(csym)
    coef[i] = subs(psym, t, 0)
    psym = diff(psym, t)/i
end
coef

In [None]:
cf = float(coef)

In [None]:
tt = range(0, 2π, length=1000)

plot(legend=:bottomright, aspect_ratio=:equal, xlabel=L"t", ylabel=L"y", size=(600,400))
plot!(tt, cos.(tt), label=L"y = \cos(t)")

# Evaluate p using Horner's rule
p = zeros(length(tt))
for i=n+1:-1:1
    p = p.*tt .+ cf[i]
end

plot!(tt, p, label=L"y = p(t)")
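
Horner's rule, as used in the cell above, can also be written as a small standalone function. This is a sketch: `horner` is a hypothetical helper name, and the coefficients are assumed to be ordered from the constant term upward, as in `cf`.

```julia
# Hypothetical helper: evaluate cf[1] + cf[2]*t + ... + cf[end]*t^(length(cf)-1)
# using Horner's rule (one multiplication and one addition per coefficient).
function horner(cf, t)
    p = zero(promote_type(eltype(cf), typeof(t)))
    for i in length(cf):-1:1
        p = p * t + cf[i]
    end
    return p
end

horner([1, 2, 3], 2)   # 1 + 2*2 + 3*2^2 = 17
```

Julia's built-in `evalpoly(t, cf)` uses the same coefficient ordering and can be used instead.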

---