In [1]:
using DrWatson;
@quickactivate "MATH361Lectures"
using LinearAlgebra;
import MATH361Lectures;

# Cholesky Factorization

In many practical applications (*e.g.* numerical approximation of linear PDEs) the matrix $A$ in the linear system $Ax=b$ has a special structure. In this lecture, we are interested in the case when $A$ is **symmetric positive definite** (SPD).

A matrix $A$ is symmetric positive definite if

1) $A = A^{T}$ (that is, $A$ is symmetric), and

2) $x^{T}Ax > 0$ for all nonzero vectors $x$.
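As a rough illustration of the definition, the following sketch checks both conditions numerically. The matrix `A` is an example chosen here, not one from the lecture. Sampling random vectors can only give a sanity check of condition 2, never a proof, so the sketch also calls `isposdef` from `LinearAlgebra`.

```julia
using LinearAlgebra

# Example matrix chosen for illustration (an assumption, not from the lecture).
A = [4.0 1.0 0.0;
     1.0 3.0 1.0;
     0.0 1.0 2.0]

# Condition 1: A equals its transpose.
is_sym = issymmetric(A)

# Condition 2, sanity check only: evaluate x' * A * x for a few random vectors.
xs = [randn(3) for _ in 1:5]
quad_forms = [dot(x, A * x) for x in xs]

# Library test for positive definiteness.
is_pd = isposdef(A)

println("symmetric: ", is_sym)
println("sampled quadratic forms all positive: ", all(quad_forms .> 0))
println("isposdef: ", is_pd)
```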

Symmetric positive definite matrices satisfy some interesting and useful properties. We state the following theorem but leave the proofs of each part as either an exercise or as part of a linear algebra course. 

If $A$ is SPD, then

i) all the diagonal entries of $A$ are positive, 

ii) all the eigenvalues of $A$ are positive,

iii) the determinant of $A$ is positive, and

iv) every submatrix $B$ of $A$ obtained by deleting any set of rows and the corresponding set of columns from $A$ is SPD. 
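The four properties can be spot-checked numerically on a small example. This is only a sketch: the matrix `A` and the choice of which row and column to delete are assumptions made for illustration, and a numerical check on one matrix is of course not a proof.

```julia
using LinearAlgebra

# Example SPD matrix chosen for illustration (not from the lecture).
A = [4.0 1.0 0.0;
     1.0 3.0 1.0;
     0.0 1.0 2.0]

# (i) all diagonal entries are positive
diag_positive = all(diag(A) .> 0)

# (ii) all eigenvalues are positive; Symmetric(A) guarantees real eigenvalues
eigs_positive = all(eigvals(Symmetric(A)) .> 0)

# (iii) the determinant is positive
det_positive = det(A) > 0

# (iv) delete row 2 and column 2; the remaining submatrix should be SPD
keep = [1, 3]
B = A[keep, keep]
sub_spd = issymmetric(B) && isposdef(B)

println((diag_positive, eigs_positive, det_positive, sub_spd))
```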

Our major result is the following: 

> If $A$ is SPD, then there is a unique lower triangular matrix $L$ with positive diagonal entries that satisfies $A=LL^{T}$.

The factorization $A=LL^{T}$ is called the **Cholesky factorization** of $A$.
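Julia's `LinearAlgebra` standard library provides a `cholesky` function, so the statement can be tried out on a small example before proving anything. The $2\times 2$ matrix below is an assumption chosen so that the factor has integer entries.

```julia
using LinearAlgebra

# Small SPD example chosen for illustration (not from the lecture).
A = [4.0 2.0;
     2.0 5.0]

F = cholesky(A)   # factorization object
L = F.L           # lower triangular factor

println(L)
println("L * L' ≈ A: ", L * L' ≈ A)
println("positive diagonal: ", all(diag(L) .> 0))
```

For this particular matrix the factor works out by hand to $L = \left[\begin{array}{cc} 2 & 0 \\ 1 & 2 \end{array}\right]$, which can be confirmed by multiplying out $LL^{T}$.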

We will give a proof of the existence of the Cholesky factorization. The proof proceeds by mathematical induction.

**Question:** What is a proof by induction? 

### Proof of Existence of Cholesky Factorization

We begin with the base case, that is, when $A$ is a $1\times 1$ SPD matrix so that $A=\alpha$ where $\alpha > 0$. In this case, take $L=\sqrt{\alpha}$ so $L^{T}=\sqrt{\alpha}$ and obviously $LL^{T} = A$. 

Now we proceed with the induction step. Our induction hypothesis is that for all $n \leq N-1$, if $A$ is an $n\times n$ SPD matrix then $A$ possesses a Cholesky factorization. We will show that if $A$ is an $N\times N$ SPD matrix, then there is a lower triangular matrix $L$ with positive diagonal entries that satisfies $A=LL^{T}$.

Observe that we may write $A$ as

$$A = \left[\begin{array}{@{}c|c@{}} A_{N-1} & b \\ \hline \\
b^{T} & a_{NN}\end{array}\right].$$

Now, we use the fact stated earlier that if $A$ is SPD, then every submatrix $B$ of $A$ obtained by deleting any set of rows and the corresponding set of columns from $A$ is SPD. In particular, deleting the last row and the last column tells us that $A_{N-1}$ is SPD and of size $(N-1)\times(N-1)$. Therefore, by the induction hypothesis, we have a Cholesky factorization $A_{N-1}=L_{N-1}L_{N-1}^{T}$.

Thus, we will look for a matrix $L$ of the form

$$L = \left[\begin{array}{@{}c|c@{}} L_{N-1} & {\bf 0} \\ \hline \\
c^{T} & \alpha\end{array}\right],$$

that satisfies $LL^{T} = A$, that is

$$\left[\begin{array}{@{}c|c@{}} L_{N-1} & {\bf 0} \\ \hline \\
c^{T} & \alpha\end{array}\right]\left[\begin{array}{@{}c|c@{}} L_{N-1}^{T} & c \\ \hline \\
{\bf 0} & \alpha\end{array}\right] = \left[\begin{array}{@{}c|c@{}} A_{N-1} & b \\ \hline \\
b^{T} & a_{NN}\end{array}\right].$$

Computing the block matrix multiplication corresponding to $LL^{T}$ gives

$$LL^{T} = \left[\begin{array}{@{}c|c@{}} L_{N-1}L_{N-1}^{T} & L_{N-1}c \\ \hline \\
c^{T}L_{N-1}^{T} & c^{T}c + \alpha^2\end{array}\right] = \left[\begin{array}{@{}c|c@{}} A_{N-1} & b \\ \hline \\
b^{T} &  a_{NN}\end{array}\right].$$

Thus, we have the existence of a Cholesky factorization for $A$ provided

i) $L_{N-1}c = b$ has a unique solution, and

ii) $c^{T}c + \alpha^{2} = a_{NN}$ has a positive solution $\alpha$. 

Now, since $L_{N-1}$ is a lower triangular matrix with positive diagonal entries (by the induction hypothesis), it is invertible, so $L_{N-1}c = b$ has a unique solution (which can be computed by forward substitution). Furthermore, $c^{T}c + \alpha^{2} = a_{NN}$ will have a positive solution $\alpha = \sqrt{a_{NN} - c^{T}c}$ provided $\alpha^{2} = a_{NN} - c^{T}c > 0$. We show now that this is the case.

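Let $\alpha$ be a (possibly complex) number satisfying $\alpha^{2} = a_{NN} - c^{T}c$, so that $A=LL^{T}$ with $L$ as just constructed. Since $A$ is SPD, $0 < \det(A) = \det(LL^{T}) = \det(L)\det(L^{T}) = \det(L)^{2}$. Now by the triangular structure of $L$, we have that $\det(L) = \det(L_{N-1})\alpha$, so $0 < \det(L_{N-1})^{2}\alpha^{2}$, and since $\det(L_{N-1})^2 > 0$ we must have $\alpha^{2} = a_{NN} - c^{T}c > 0$.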

This completes the proof of the existence of a Cholesky factorization for any SPD matrix $A$. 
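The induction argument is constructive, so it can be transcribed almost line by line into a recursive function. The following is a minimal sketch under stated assumptions: the name `cholesky_by_bordering` is hypothetical, the input is assumed to be a real SPD matrix (no checks are performed), and the recursion mirrors the proof rather than aiming for efficiency.

```julia
using LinearAlgebra

# Hypothetical helper that mirrors the induction proof; assumes A is real SPD.
function cholesky_by_bordering(A::AbstractMatrix{<:Real})
    N = size(A, 1)
    if N == 1
        # Base case: A = [alpha] with alpha > 0, so L = [sqrt(alpha)].
        return fill(sqrt(A[1, 1]), 1, 1)
    end
    # Induction hypothesis: factor the leading (N-1) x (N-1) block A_{N-1}.
    L_prev = cholesky_by_bordering(A[1:N-1, 1:N-1])
    b = A[1:N-1, N]
    # Solve L_{N-1} c = b; the LowerTriangular wrapper uses forward substitution.
    c = LowerTriangular(L_prev) \ b
    # alpha^2 = a_NN - c'c, which is positive when A is SPD.
    alpha = sqrt(A[N, N] - dot(c, c))
    # Assemble L = [L_{N-1} 0; c' alpha].
    L = zeros(N, N)
    L[1:N-1, 1:N-1] = L_prev
    L[N, 1:N-1] = c
    L[N, N] = alpha
    return L
end

# Example matrix chosen for illustration (not from the lecture).
A = [4.0 1.0 0.0;
     1.0 3.0 1.0;
     0.0 1.0 2.0]
L = cholesky_by_bordering(A)
println("L * L' ≈ A: ", L * L' ≈ A)
```

If the input is not positive definite, the quantity under the square root can be negative and `sqrt` will throw a `DomainError`, which matches the role that positivity of $\alpha^{2}$ plays in the proof.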

The question now is: how do we actually compute the Cholesky factorization of an SPD matrix $A$? That is, what is an algorithm we can implement on a computer? Our next goal is to derive such an algorithm.

Consider a factorization for a $3 \times 3$ SPD matrix that looks as follows:

$$A = \left[\begin{array}{ccc} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{array}\right] = \left[\begin{array}{ccc} l_{11} & 0 & 0 \\ l_{21} & l_{22} & 0 \\ l_{31} & l_{32} & l_{33} \end{array}\right] \left[\begin{array}{ccc} l_{11} & l_{21} & l_{31} \\ 0 & l_{22} & l_{32} \\ 0 & 0 & l_{33} \end{array}\right] = \left[\begin{array}{ccc} l_{11}^{2} & l_{11}l_{21} & l_{11}l_{31} \\ l_{11}l_{21} & l_{21}^{2}+l_{22}^{2} & l_{21}l_{31}+l_{22}l_{32} \\ l_{11}l_{31} & l_{21}l_{31} + l_{22}l_{32} & l_{31}^{2} + l_{32}^{2} + l_{33}^{2} \end{array}\right] $$

Comparing the far left matrix entries with the far right matrix entries leads to a system of equations: