# Orthogonal Polynomials

Fourier series proved very powerful for approximating periodic functions.
If periodicity is lost, however, uniform convergence is lost. In this chapter
we introduce an alternative family of bases, _orthogonal polynomials (OPs)_, that are applicable in
the non-periodic setting. That is, we consider expansions of the form
$$
f(x) = \sum_{k=0}^∞ c_k p_k(x) ≈ \sum_{k=0}^{n-1} c_k^n p_k(x)
$$
where $p_k(x)$ are special families of polynomials, $c_k$ are expansion coefficients and
$c_k^n$ are approximate coefficients. 

Why not use monomials as in Taylor series? Hidden in the previous lecture was that we could effectively
compute Taylor coefficients by evaluating on the unit circle in the complex plane _only_ if the radius of convergence
was at least 1. Many functions are smooth on, say, $[-1,1]$ but have Taylor series that do not converge on the whole interval, e.g.:
$$
{1 \over 25x^2 + 1}
$$
which has poles at $x = ±{\rm i}/5$, so that its Taylor series at $x = 0$ only converges for $|x| < 1/5$.
While orthogonal polynomials span the same space as monomials, and therefore we can in theory write an
approximation in monomials, orthogonal polynomials are _much_ more stable.
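
The stability claim can be illustrated with a small numerical experiment. The sketch below is only one possible way to quantify it: it compares the condition numbers of the matrices that evaluate the monomial basis and the Chebyshev basis at the same points. The choice of Chebyshev points as sample points, the size $n = 20$, and the helper name `basis_condition_numbers` are assumptions made for illustration.

```python
import numpy as np
from numpy.polynomial import chebyshev


def basis_condition_numbers(n):
    """Condition numbers of the n x n basis-evaluation matrices on [-1, 1]."""
    # Sample points: roots of T_n (an assumed, convenient choice).
    x = np.cos(np.pi * (np.arange(n) + 0.5) / n)
    monomial = np.vander(x, n, increasing=True)  # columns 1, x, ..., x^(n-1)
    cheb = chebyshev.chebvander(x, n - 1)        # columns T_0, ..., T_(n-1)
    return np.linalg.cond(monomial), np.linalg.cond(cheb)


cond_monomial, cond_chebyshev = basis_condition_numbers(20)
print(cond_monomial, cond_chebyshev)
```

A large condition number for the monomial matrix means small perturbations in function values can produce large changes in the computed monomial coefficients, while the orthogonal basis stays well conditioned.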



In addition to numerics, OPs play a very important role in many mathematical areas
including functional analysis, integrable systems, singular integral equations,
complex analysis, and random matrix theory.

1. General properties of OPs: we define orthogonal polynomials, three-term recurrences and Jacobi operators
2. Classical OPs: we define Chebyshev, Legendre, Jacobi, Laguerre, and Hermite. 
3. Gaussian quadrature: we see that OPs can be used to construct effective numerical methods for singular integrals
4. Recurrence relationships and Sturm–Liouville equations: we see that classical OPs have many simple recurrences that
are of importance in computation, which also show they are eigenfunctions of simple differential operators.


## 1. General properties of orthogonal polynomials 

**Definition (graded polynomial basis)** 
A set of polynomials $\{p_0(x), p_1(x), … \}$ is _graded_ if $p_n$ is
precisely degree $n$: i.e.,
$$
p_n(x) = k_n x^n + k_n^{(n-1)} x^{n-1} + ⋯ + k_n^{(1)} x + k_n^{(0)}
$$
for $k_n ≠ 0$. 

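Note that if $p_n$ are graded then $\{p_0(x), …, p_n(x) \}$
are a basis of all polynomials of degree $n$ (that is, of degree at most $n$; we say _precisely_ degree $n$ when the leading coefficient is nonzero).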


**Definition (orthogonal polynomials)** 
Given an (integrable) _weight_ $w(x) > 0$ for $x ∈ (a,b)$,
which defines a continuous inner product
$$
⟨f,g⟩ = ∫_a^b  f(x) g(x) w(x) {\rm d} x
$$
a graded polynomial basis $\{p_0(x), p_1(x), … \}$
are _orthogonal polynomials (OPs)_ if
$$
⟨p_n,p_m⟩ = 0
$$
whenever $n ≠ m$.


Note in the above
$$
h_n := ⟨p_n,p_n⟩ = \|p_n\|^2 = ∫_a^b  p_n(x)^2 w(x) {\rm d} x > 0.
$$

**Definition (orthonormal polynomials)**
A set of orthogonal polynomials $\{q_0(x), q_1(x), … \}$
are orthonormal if $\|q_n\| = 1$.

**Definition (monic orthogonal polynomials)**
A set of orthogonal polynomials $\{p_0(x), p_1(x), … \}$
are monic if $k_n = 1$.


**Proposition (expansion)**
If $r(x)$ is a degree $n$ polynomial, $\{p_n\}$ are orthogonal
and $\{q_n \}$ are orthonormal then
$$
\begin{align*}
r(x) &= ∑_{k=0}^n {⟨p_k,r⟩ \over \|p_k\|^2} p_k(x) \\
     &    = ∑_{k=0}^n ⟨q_k,r⟩ q_k(x)
\end{align*}
$$

**Proof**
Because $\{p_0,…,p_n \}$ are a basis of polynomials we can
write
$$
r(x) = ∑_{k=0}^n r_k p_k(x)
$$
for constants $r_k ∈ ℝ$.
By linearity and orthogonality we have
$$
⟨p_m,r⟩ = ∑_{k=0}^n r_k ⟨p_m,p_k⟩= r_m ⟨p_m,p_m⟩,
$$
hence $r_m = ⟨p_m,r⟩ / \|p_m\|^2$. The orthonormal case follows since $\|q_m\| = 1$.

∎

**Corollary (zero inner product)**
If a degree $n$ polynomial $r$ satisfies
$$
0 = ⟨p_0,r⟩ = … = ⟨p_n,r⟩
$$
then $r = 0$.
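
The expansion proposition above can be checked numerically. The following is a minimal sketch, assuming the Legendre polynomials (weight $w(x) = 1$ on $[-1,1]$) as the orthogonal family and using exact polynomial integration for the inner product; the helper names `inner` and `expand` and the test polynomial `r` are chosen here purely for illustration.

```python
import numpy as np
from numpy.polynomial import Polynomial, Legendre


def inner(f, g):
    """<f, g> = int_{-1}^{1} f(x) g(x) dx for Polynomial inputs (w(x) = 1)."""
    antiderivative = (f * g).integ()
    return antiderivative(1.0) - antiderivative(-1.0)


def expand(r, n):
    """Coefficients <p_k, r> / ||p_k||^2, k = 0..n, in the Legendre basis."""
    basis = [Legendre.basis(k).convert(kind=Polynomial) for k in range(n + 1)]
    coeffs = [inner(p, r) / inner(p, p) for p in basis]
    return coeffs, basis


r = Polynomial([1.0, -2.0, 0.5, 3.0])  # r(x) = 1 - 2x + x^2/2 + 3x^3
coeffs, basis = expand(r, 3)

# Summing the expansion should reproduce r exactly (up to rounding).
reconstructed = sum(float(c) * p for c, p in zip(coeffs, basis))
print(np.allclose(reconstructed.coef, r.coef))
```

Because the inner products are computed by integrating polynomials exactly, the only error in the reconstruction comes from floating point rounding.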


OPs are uniquely defined (up to a constant) by the
property that they are orthogonal to all lower degree polynomials.

**Proposition (orthogonal to lower degree)** 
A polynomial $p$ of precisely degree $n$ satisfies
$$
⟨p,r⟩ = 0
$$
for all degree $m < n$ polynomials $r$ if and only if
$p = c p_n$ for some constant $c ≠ 0$. Therefore, for a given weight, an orthogonal polynomial is uniquely
defined by $k_n$. 

**Proof**
As $\{p_0,…,p_m\}$ are a basis of all polynomials of degree $m$,
we can write
$$
r(x) = ∑_{k=0}^m a_k p_k(x)
$$
Thus by linearity of inner products, and since $m < n$, we have
$$
⟨cp_n,∑_{k=0}^m a_k p_k⟩ = ∑_{k=0}^m ca_k ⟨p_n, p_k⟩ = 0.
$$

Now for
$$
p(x) = c k_n x^n + O(x^{n-1})
$$
consider $p(x) - c p_n(x)$ which is of degree $n-1$. It satisfies
for $k ≤ n-1$
$$
⟨p_k, p - c p_n⟩ = ⟨p_k, p⟩ - c ⟨p_k, p_n⟩ = 0.
$$
Thus, by the zero inner product corollary, it is zero, i.e., $p(x) = c p_n(x)$.

∎

A consequence of this is that orthonormal polynomials are always a
constant multiple of orthogonal polynomials.

The most _fundamental_ property of orthogonal polynomials is their three-term
recurrence.

**Theorem (3-term recurrence, 2nd form)**
If $\{p_n\}$ are OPs then there exist real constants
$a_n, b_n ≠ 0, c_{n-1} ≠ 0$
such that
$$
\begin{align*}
x p_0(x) &= a_0 p_0(x) + b_0 p_1(x)  \\
x p_n(x) &= c_{n-1} p_{n-1}(x) + a_n p_n(x) + b_n p_{n+1}(x) 
\end{align*}
$$
**Proof**
The $n=0$ case is immediate since $\{p_0,p_1\}$ are a basis of degree 1 polynomials.
The $n >0$ case follows from 
$$
⟨x p_n, p_k⟩ = ⟨ p_n, xp_k⟩ = 0
$$
for $k < n-1$ as $x p_k$ is of degree $k+1 < n$.
Hence, by the expansion proposition, only the coefficients of $p_{n-1}$, $p_n$ and $p_{n+1}$ in the expansion of $x p_n$ can be nonzero.

Note that
$$
b_n = {⟨p_{n+1}, x p_n⟩ \over \|p_{n+1} \|^2} ≠ 0
$$
since $x p_n = k_n x^{n+1} + O(x^n)$ is precisely degree
$n+1$. Further,
$$
c_{n-1} = {⟨p_{n-1}, x p_n⟩ \over \|p_{n-1}\|^2 } = 
{⟨p_n, x p_{n-1}⟩  \over \|p_{n-1}\|^2 } =  b_{n-1}{\|p_n\|^2  \over \|p_{n-1}\|^2 } ≠ 0.
$$

∎

**Corollary (orthonormal 3-term recurrence)** If
$\{q_n\}$ are orthonormal then $c_n = b_n$.

**Proof**
$$
b_n = ⟨x q_n, q_{n+1}⟩ = ⟨q_n, x q_{n+1}⟩ = c_n.
$$
∎

**Corollary (monic 3-term recurrence)** If
$\{p_n\}$ are monic then $b_n = 1$.



**Corollary (3-term recurrence, 1st form)**
If $\{p_n\}$ are OPs then there exist real constants $A_n ≠ 0$, $B_n$, and $C_n$
such that
$$
\begin{align*}
p_1(x) &= (A_0 x + B_0) p_0(x) \\
p_{n+1}(x) &= (A_n x + B_n) p_n(x) - C_n p_{n-1}(x)
\end{align*}
$$

**Proof**

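Follows from
$$
\begin{align*}
p_1(x) &= \left({x \over b_0}  - {a_0  \over b_0}\right) p_0(x) \\
p_{n+1}(x) &= \left({x \over b_n}  - {a_n  \over b_n}\right) p_n(x) - {c_{n-1} \over b_n} p_{n-1}(x)
\end{align*}
$$
that is, the constants are $1/b_n$, $-a_n/b_n$ and $c_{n-1}/b_n$ respectively.

∎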

The three-term recurrence can also be interpreted as a matrix known
as the Jacobi matrix:

**Corollary (Jacobi matrix)**
For
$$
P(x) := [p_0(x) | p_1(x) | ⋯]
$$
then we have
$$
x P(x) = P(x) \underbrace{\begin{bmatrix} a_0 & c_0 \\
                                                        b_0 & a_1 & c_1 \\
                                                        & ⋱ & ⋱ & ⋱
                                                        \end{bmatrix}}_X
$$
More generally, for any polynomial $a(x)$ we have
$$
a(x) P(x) = P(x) a(X).
$$

**Remark** If you are worried about multiplication of infinite matrices/vectors
note it is well-defined by the standard definition because it is banded.
It can also be defined in terms of functional analysis.

**Remark** Typically the Jacobi matrix is the transpose $J = X^⊤$. 
If the basis is orthonormal then $X$ is symmetric and they are the same.

**Example** What are the first 4 monic polynomials $p_n(x)$ orthogonal with respect to $w(x) = 1$ on $[0,1]$?
We can construct these using Gram–Schmidt, but exploiting the 3-term recurrence to reduce the computational cost.
We have $p_0(x) = q_0(x) = 1$, which we see is orthonormal:
$$
\|p_0\|^2 = ⟨p_0,p_0⟩ = ∫_0^1 {\rm d} x = 1.
$$
We know from the 3-term recurrence that
$$
x p_0(x) = a_0 p_0(x) +  p_1(x)
$$
where
$$
a_0 = ⟨x p_0,p_0⟩ = ∫_0^1 x {\rm d} x = 1/2.
$$
Thus $p_1(x) = x p_0(x) - a_0 p_0(x) = x - 1/2$.
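
The construction in this example can be carried through numerically. The sketch below is one possible implementation, assuming the monic form of the 3-term recurrence $x p_n = c_{n-1} p_{n-1} + a_n p_n + p_{n+1}$ with $a_n = ⟨x p_n, p_n⟩/\|p_n\|^2$ and $c_{n-1} = ⟨x p_n, p_{n-1}⟩/\|p_{n-1}\|^2$; the function names `inner` and `monic_ops` are illustrative only.

```python
from numpy.polynomial import Polynomial


def inner(f, g):
    """<f, g> = int_0^1 f(x) g(x) dx, computed exactly for polynomials."""
    antiderivative = (f * g).integ()
    return antiderivative(1.0) - antiderivative(0.0)


def monic_ops(count):
    """First `count` monic OPs for w(x) = 1 on [0, 1] via the 3-term recurrence."""
    x = Polynomial([0.0, 1.0])
    ps = [Polynomial([1.0])]
    for n in range(count - 1):
        p_n = ps[n]
        a_n = float(inner(x * p_n, p_n) / inner(p_n, p_n))
        p_next = x * p_n - a_n * p_n
        if n > 0:
            # Only one earlier polynomial is needed, unlike full Gram-Schmidt.
            p_prev = ps[n - 1]
            c_prev = float(inner(x * p_n, p_prev) / inner(p_prev, p_prev))
            p_next = p_next - c_prev * p_prev
        ps.append(p_next)
    return ps


ps = monic_ops(4)
for p in ps:
    print(p.coef)  # coefficients in increasing degree
```

Each step only needs two inner products, which is the saving over Gram–Schmidt against all previous polynomials.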




## 2. Classical orthogonal polynomials

Classical orthogonal polynomials are special cases with a number
of beautiful properties, for example
1. Their derivatives are also OPs
2. They are eigenfunctions of simple differential operators

As stated above orthogonal polynomials are uniquely defined by the weight
$w(x)$ and the constant $k_n$. We consider:

1. Chebyshev polynomials $T_n(x)$/$U_n(x)$: $w(x) = 1/\sqrt{1-x^2}$ or $\sqrt{1-x^2}$  on $[-1,1]$
2. Legendre polynomials $P_n(x)$: $w(x) = 1$ on $[-1,1]$.
3. Hermite polynomials $H_n(x)$: $w(x) = \exp(-x^2)$  on $(-∞,∞)$.

Other important families discussed are

1. Ultraspherical polynomials
2. Jacobi polynomials
3. Laguerre polynomials


## Chebyshev

**Definition (Chebyshev polynomials, 1st kind)** $T_n(x)$ are orthogonal with respect to $1/\sqrt{1-x^2}$
and satisfy:
$$
T_0(x) = 1, T_n(x) = 2^{n-1} x^n + O(x^{n-1})
$$
for $n ≥ 1$.


**Definition (Chebyshev polynomials, 2nd kind)** $U_n(x)$ are orthogonal with respect to $\sqrt{1-x^2}$
and satisfy:
$$
U_n(x) = 2^n x^n + O(x^{n-1})
$$


**Theorem**
$$
T_n(x) = \cos(n\, {\rm acos}\, x)
$$
In other words
$$
T_n(\cos θ) = \cos n θ.
$$


**Proof**

We need to show that $p_n(x) := \cos(n\, {\rm acos}\, x)$ are (1) graded polynomials,
(2) orthogonal w.r.t. $1/\sqrt{1-x^2}$ on $[-1,1]$, and 
(3) have the right constant. Property (2) follows from the change of variables $x = \cos θ$:
$$
\int_{-1}^1 {p_n(x) p_m(x) \over \sqrt{1-x^2}} {\rm d} x = 
\int_0^π {\cos(nθ) \cos(mθ) \over \sqrt{1-\cos^2 θ}} \sin θ {\rm d} θ =
\int_0^π \cos(nθ) \cos(mθ) {\rm d} θ = 0
$$
if $n ≠ m$. 

To see that they are graded we use the fact that
$$
x p_n(x) = \cos θ \cos n θ = {\cos(n-1)θ + \cos(n+1)θ \over 2} = {p_{n-1}(x) + p_{n+1}(x) \over 2}
$$
In other words $p_{n+1}(x) = 2x p_n(x) - p_{n-1}(x)$.
Since $p_0(x) = 1$, $p_1(x) = x$ and each subsequent step multiplies the leading term by $2x$, we have
$$
p_n(x) = 2^{n-1} x^n + O(x^{n-1})
$$
for $n ≥ 1$, which completes the proof.

∎

Buried in the proof is the 3-term recurrence:

**Corollary**
$$
x 𝐓(x) = 𝐓(x) \begin{bmatrix} 0 & 1/2 \\ 1 & 0 & 1/2 \\ & 1/2 & 0 & ⋱ \\ && ⋱ & ⋱ \end{bmatrix} 
$$
where $𝐓(x) := [T_0(x) | T_1(x) | ⋯]$.
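
The recurrence $T_{n+1}(x) = 2x T_n(x) - T_{n-1}(x)$ found in the proof gives a simple way to evaluate Chebyshev polynomials. The sketch below, with the illustrative helper name `chebyshev_t`, evaluates $T_0,…,T_n$ this way and compares the result with $\cos(n\, {\rm acos}\, x)$ on a grid of points in $[-1,1]$ (the grid and the degree 6 are arbitrary choices).

```python
import numpy as np


def chebyshev_t(n, x):
    """Evaluate T_0..T_n at x using T_{k+1} = 2x T_k - T_{k-1}."""
    x = np.asarray(x, dtype=float)
    values = [np.ones_like(x), x.copy()]  # T_0 = 1, T_1 = x
    for k in range(1, n):
        values.append(2 * x * values[k] - values[k - 1])
    return np.array(values[: n + 1])


x = np.linspace(-1, 1, 201)
T = chebyshev_t(6, x)

# Compare with the trigonometric formula T_k(cos(theta)) = cos(k theta).
theta = np.arccos(x)
expected = np.array([np.cos(k * theta) for k in range(7)])
print(np.max(np.abs(T - expected)))
```

The printed value is the largest discrepancy between the two evaluations, which should be at the level of rounding error.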

## Legendre

Legendre: $P_n(x)$ are orthogonal polynomials with respect to $w(x) = 1$ on $[-1,1]$, with
$k_n = $

## 3. Gaussian quadrature

Consider integration
$$
\int_a^b f(x) w(x) {\rm d}x.
$$
For periodic integration we approximated (using the Trapezium rule) an integral by a sum.
We can think of it as a weighted sum:
$$
{1 \over 2π} \int_0^{2π} f(θ) {\rm d} θ ≈  ∑_{j=0}^{n-1} w_j f(θ_j)
$$
where $w_j = 1/n$. Replacing an integral by a weighted sum is known as a _quadrature_ rule.
This quadrature rule had several important properties:
1. It was _exact_ for integrating trigonometric polynomials with $2n-1$ coefficients
$$
p(θ) = \sum_{k=1-n}^{n-1} p̂_k \exp({\rm i}k θ)
$$
as seen by the formula
$$
∑_{j=0}^{n-1} w_j p(θ_j) = p̂_0^n = ⋯ + p̂_{-n} + p̂_0 + p̂_n + ⋯ = p̂_0 = {1 \over 2π} \int_0^{2π} p(θ) {\rm d} θ
$$
2. It exactly recovered the coefficients ($p̂_k^n = p̂_k$) for expansions of trigonometric polynomials with $n$ coefficients:
$$
p(θ) = \sum_{k=-⌈(n-1)/2⌉}^{⌊(n-1)/2⌋} p̂_k \exp({\rm i}k θ)
$$
3. It converged fast for smooth, periodic functions $f$.
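
The exactness property (1) of the periodic rule can be observed directly. The following is a small sketch, assuming equispaced points $θ_j = 2πj/n$ (the points are not restated in this section, so this is an assumption) and using an arbitrarily chosen trigonometric polynomial; `periodic_trapezium` is an illustrative name.

```python
import numpy as np


def periodic_trapezium(f, n):
    """Approximate (1/2pi) int_0^{2pi} f using n equispaced points, weights 1/n."""
    theta = 2 * np.pi * np.arange(n) / n
    return np.sum(f(theta)) / n


n = 5

# Frequencies |k| <= n - 1 only: the rule returns the mean value 0.7 exactly.
def p(t):
    return 0.7 + 2 * np.cos(3 * t) - np.sin(4 * t)

exact_case = periodic_trapezium(p, n)

# Frequency n aliases onto frequency 0, so exactness is lost (true mean is 0).
def q(t):
    return np.cos(n * t)

aliased_case = periodic_trapezium(q, n)
print(exact_case, aliased_case)
```

The second case shows why the bound on the number of coefficients matters: $\cos nθ$ equals 1 at every sample point, so the rule cannot distinguish it from a constant.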

In this section we consider other quadrature rules
$$
\int_a^b f(x) w(x) {\rm d}x ≈ \sum_{j=1}^n w_j f(x_j)
$$
We want to choose $w_j$ and $x_j$ so that the following properties are satisfied:
1. It is _exact_ for integrating polynomials up to degree $2n-1$:
$$
p(x) = \sum_{k=0}^{2n-1} c_k q_k(x)
$$
2. It exactly recovers the coefficients for expansions:
$$
p(x) = \sum_{k=0}^{n-1} c_k q_k(x)
$$
3. It converges fast for smooth functions $f$.

We will focus on properties (1) and (2) as property (3) is more involved.

The key to property (1) is to use _roots (zeros) of $q_n(x)$_.

**Lemma** $q_n(x)$ has exactly $n$ distinct roots in $(a,b)$.

**Proof**

Suppose $x_1, …,x_j$ are the roots in $(a,b)$ where $q_n(x)$ changes sign, that is,
$$
q_n(x) = c_k (x-x_k) + O((x-x_k)^2)
$$
for $c_k ≠ 0$ and $k = 1,…,j$. Then
$$
q_n(x) (x-x_1) ⋯(x-x_j)
$$
does not change sign on $(a,b)$.
In other words:
$$
⟨q_n,(x-x_1) ⋯(x-x_j) ⟩ = \int_a^b q_n(x) (x-x_1) ⋯(x-x_j) w(x) {\rm d} x ≠ 0.
$$
As $q_n$ is orthogonal to all polynomials of degree less than $n$, this is only possible if $j = n$.

∎



**Lemma (zeros)** The zeros $x_1, …,x_n$ of $q_n(x)$ are the eigenvalues of the truncated Jacobi matrix
$$
X_n := \begin{bmatrix} a_0 & b_0 \\ 
                         b_0 & ⋱ & ⋱ \\ 
                         & ⋱ & a_{n-2} & b_{n-2} \\
                         && b_{n-2} & a_{n-1} \end{bmatrix} ∈ ℝ^{n × n}.
$$
More precisely,
$$
X_n Q_n = Q_n \begin{bmatrix} x_1 \\ & ⋱ \\ && x_n \end{bmatrix}
$$
for the matrix of eigenvectors
$$
Q_n = \begin{bmatrix}
q_0(x_1) & ⋯ & q_0(x_n) \\
⋮  & ⋯ & ⋮  \\
q_{n-1}(x_1) & ⋯ & q_{n-1}(x_n) 
\end{bmatrix} 
$$
whose columns are mutually orthogonal, so that $Q_n$ becomes an orthogonal matrix once each column is normalised.

**Proof**

We construct the eigenvector using the orthonormal three-term recurrence $x q_k = b_{k-1} q_{k-1} + a_k q_k + b_k q_{k+1}$ in each row (noting $b_{n-1} q_n(x_j) = 0$):
$$
X_n \begin{bmatrix} q_0(x_j) \\ ⋮ \\ q_{n-1}(x_j) \end{bmatrix} =
\begin{bmatrix} a_0 q_0(x_j) + b_0 q_1(x_j) \\
 b_0 q_0(x_j) + a_1 q_1(x_j) + b_1 q_2(x_j) \\
⋮ \\ 
b_{n-3} q_{n-3}(x_j) + a_{n-2} q_{n-2}(x_j) + b_{n-2} q_{n-1}(x_j) \\
b_{n-2} q_{n-2}(x_j) + a_{n-1} q_{n-1}(x_j) + b_{n-1} q_n(x_j)
\end{bmatrix} = x_j \begin{bmatrix} q_0(x_j) \\
 q_1(x_j) \\
⋮ \\ 
q_{n-1}(x_j)
\end{bmatrix}
$$

∎
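
The lemma can be tested numerically for a concrete family. The sketch below assumes the orthonormal Legendre polynomials on $[-1,1]$, for which the recurrence coefficients are $a_k = 0$ and $b_k = (k+1)/\sqrt{(2k+1)(2k+3)}$ (these values are not derived in this section and are quoted here as an assumption). It compares the eigenvalues of the truncated Jacobi matrix with the Gauss–Legendre nodes returned by NumPy; `legendre_jacobi` is an illustrative name.

```python
import numpy as np
from numpy.polynomial import legendre


def legendre_jacobi(n):
    """Truncated Jacobi matrix X_n for orthonormal Legendre polynomials."""
    k = np.arange(n - 1)
    b = (k + 1) / np.sqrt((2 * k + 1) * (2 * k + 3))  # off-diagonal b_k
    return np.diag(b, 1) + np.diag(b, -1)             # diagonal a_k = 0


n = 6
nodes = np.linalg.eigvalsh(legendre_jacobi(n))  # eigenvalues, ascending
reference, _ = legendre.leggauss(n)             # roots of the degree-n Legendre polynomial

# The eigenvalues should agree with the roots, and P_n should vanish there.
print(np.max(np.abs(nodes - np.sort(reference))))
residual = legendre.legval(nodes, [0.0] * n + [1.0])
print(np.max(np.abs(residual)))
```

Using a symmetric eigenvalue solver is natural here because the orthonormal recurrence makes $X_n$ symmetric and tridiagonal.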