# LU Factorization
---

GENERAL PROBLEM: solve the linear system of equations

\begin{align}
  a_{11}x_{1} + a_{12}x_{2} + \cdots + a_{1n}x_{n} &= b_{1} \\
  a_{21}x_{1} + a_{22}x_{2} + \cdots + a_{2n}x_{n} &= b_{2} \\
  &\,\,\,\vdots \\
  a_{n1}x_{1} + a_{n2}x_{2} + \cdots + a_{nn}x_{n} &= b_{n}
\end{align}

for the unknown variables $x_{1},\ldots,x_{n}$. Equivalently, solve the matrix equation

\begin{align}
  A\mathbf{x} = \mathbf{b}
\end{align}

for the unknown vector $\mathbf{x}$, where

\begin{align}
  A =
  \left[\begin{array}{cccc}
    a_{11} & a_{12} & \cdots & a_{1n} \\
    a_{21} & a_{22} & \cdots & a_{2n} \\
    \vdots & \vdots & \ddots & \vdots \\
    a_{n1} & a_{n2} & \cdots & a_{nn}  
  \end{array}\right]
  \quad,\quad
  \mathbf{x} =
  \left[\begin{array}{c}
    x_{1} \\
    x_{2} \\
    \vdots \\
    x_{n}
  \end{array}\right]
  \quad,\quad
  \mathbf{b} =
  \left[\begin{array}{c}
    b_{1} \\
    b_{2} \\
    \vdots \\
    b_{n}
  \end{array}\right]
\end{align}

IDEA: factorize $A$ into the composition of an upper-triangular matrix, $U$, with a lower-triangular matrix, $L$, so that the above matrix equation is replaced by two matrix equations

\begin{align}
  A = LU
  \quad\Rightarrow\quad
  A\mathbf{x} = LU\mathbf{x} = \mathbf{b}
  \quad\Rightarrow\quad
  \left\{\begin{array}{l}
    L\mathbf{y} = \mathbf{b} \\
    U\mathbf{x} = \mathbf{y}
  \end{array}\right.
\end{align}

Once this factorization is done, the solution is quickly obtained using forward substitution on the first equation, followed by backward substitution on the second equation.

PRE-REQUISITES:
- Gaussian elimination
- Partial pivoting
- Back substitution

REFERENCES:
- [1] DeVries and Hasbun, *A First Course in Computational Physics, 2nd edition*.
- [2] Burden and Faires, *Numerical Analysis, 7th edition*.
- [3] Press et al., *Numerical Recipes: The Art of Scientific Computing, 3rd edition*.

## 1. Overview

We are trying to solve a system of linear equations represented in matrix form as

\begin{align}
  A\mathbf{x} = \mathbf{b},
\end{align}

where 

\begin{align}
  A =
  \left[\begin{array}{cccc}
    a_{11} & a_{12} & \cdots & a_{1n} \\
    a_{21} & a_{22} & \cdots & a_{2n} \\
    \vdots & \vdots & \ddots & \vdots \\
    a_{n1} & a_{n2} & \cdots & a_{nn}  
  \end{array}\right]
  \quad,\quad
  \mathbf{x} =
  \left[\begin{array}{c}
    x_{1} \\
    x_{2} \\
    \vdots \\
    x_{n}
  \end{array}\right]
  \quad,\quad
  \mathbf{b} =
  \left[\begin{array}{c}
    b_{1} \\
    b_{2} \\
    \vdots \\
    b_{n}
  \end{array}\right]
\end{align}

To aid in finding this solution, we want to factorize $A$ into the product of a lower-triangular matrix, $L$, and an upper-triangular matrix, $U$, as follows

\begin{align}
  A = LU,
\end{align}

where 

\begin{align}
  L =
  \left[\begin{array}{ccccc}
    l_{11} & 0 & 0 & \cdots & 0 \\
    l_{21} & l_{22} & 0 & \cdots & 0 \\
    l_{31} & l_{32} & l_{33} & \ddots & \vdots \\
    \vdots & \vdots & \vdots & \ddots & 0 \\
    l_{n1} & l_{n2} & l_{n3} & \cdots & l_{nn}  
  \end{array}\right]
  \quad,\quad
  U =
  \left[\begin{array}{ccccc}
    u_{11} & u_{12} & u_{13} & \cdots & u_{1n} \\
    0 & u_{22} & u_{23} & \cdots & u_{2n} \\
    0 & 0 & u_{33} & \cdots & u_{3n} \\
    \vdots & \vdots & \ddots & \ddots & \vdots \\
    0 & 0 & \cdots & 0 & u_{nn}  
  \end{array}\right],
\end{align}

Once $L$ and $U$ have been found, we may write 

\begin{align}
  A = LU
  \quad\Rightarrow\quad
  A\mathbf{x} = LU\mathbf{x} = \mathbf{b}
  \quad\Rightarrow\quad
  \left\{\begin{array}{l}
    L\mathbf{y} = \mathbf{b} \\
    U\mathbf{x} = \mathbf{y}
  \end{array}\right.
\end{align}

The original matrix equation has now been replaced by two matrix equations

\begin{align}
  L\mathbf{y} = \mathbf{b}
  \quad\Rightarrow\quad
  \left[\begin{array}{ccccc}
    l_{11} & 0 & 0 & \cdots & 0 \\
    l_{21} & l_{22} & 0 & \cdots & 0 \\
    l_{31} & l_{32} & l_{33} & \ddots & \vdots \\
    \vdots & \vdots & \vdots & \ddots & 0 \\
    l_{n1} & l_{n2} & l_{n3} & \cdots & l_{nn}  
  \end{array}\right]
  \left[\begin{array}{c}
    y_{1} \\
    y_{2} \\
    y_{3} \\
    \vdots \\
    y_{n} \\
  \end{array}\right]
  =
  \left[\begin{array}{c}
    b_{1} \\
    b_{2} \\
    b_{3} \\
    \vdots \\
    b_{n} \\
  \end{array}\right]
\end{align}

and 

\begin{align}
  U\mathbf{x} = \mathbf{y}
  \quad\Rightarrow\quad
  \left[\begin{array}{ccccc}
    u_{11} & u_{12} & u_{13} & \cdots & u_{1n} \\
    0 & u_{22} & u_{23} & \cdots & u_{2n} \\
    0 & 0 & u_{33} & \cdots & u_{3n} \\
    \vdots & \vdots & \ddots & \ddots & \vdots \\
    0 & 0 & \cdots & 0 & u_{nn}  
  \end{array}\right]
  \left[\begin{array}{c}
    x_{1} \\
    x_{2} \\
    x_{3} \\
    \vdots \\
    x_{n} \\
  \end{array}\right]
  =
  \left[\begin{array}{c}
    y_{1} \\
    y_{2} \\
    y_{3} \\
    \vdots \\
    y_{n} \\
  \end{array}\right]
\end{align}

Forward substitution is applied to the first equation, $L\mathbf{y} = \mathbf{b}$, to solve for $\mathbf{y}$. Then back substitution is applied to the second equation, $U\mathbf{x}=\mathbf{y}$, to obtain the solution $\mathbf{x}$.
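The two substitution steps can be sketched in a few lines of plain Python. This is only an illustration: the function names `forward_substitution` and `back_substitution` and the small $2\times 2$ system are assumptions made here, not part of the code developed later in this notebook.

```python
def forward_substitution(L, b):
    """Solve L y = b for y, where L is lower-triangular with nonzero diagonal."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):                      # top row first
        s = sum(L[i][k] * y[k] for k in range(i))
        y[i] = (b[i] - s) / L[i][i]
    return y


def back_substitution(U, y):
    """Solve U x = y for x, where U is upper-triangular with nonzero diagonal."""
    n = len(y)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):          # bottom row first
        s = sum(U[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (y[i] - s) / U[i][i]
    return x


# illustrative system: A = L U = [[2, 1], [4, 5]], b = [3, 9]
L = [[1.0, 0.0], [2.0, 1.0]]
U = [[2.0, 1.0], [0.0, 3.0]]
b = [3.0, 9.0]

y = forward_substitution(L, b)   # intermediate vector
x = back_substitution(U, y)      # solution of A x = b
print('y =', y)
print('x =', x)
```

Each routine visits every non-zero element of its triangular matrix once, which is why the substitution steps are cheap compared with the factorization itself.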

(NOTE: one advantage of LU factorization is that it does not depend at all on the vector $\mathbf{b}$. Given a matrix $A$, the matrices $L$ and $U$ are found once and for all. A solution to the equation $A\mathbf{x}=\mathbf{b}$ can then be found for each new vector $\mathbf{b}$ without repeating the factorization. Given a new $\mathbf{b}$, one only has to redo the forward and back substitution steps, which are considerably less taxing than determining $L$ and $U$: each substitution costs $O(n^{2})$ operations, whereas the factorization costs $O(n^{3})$.) 

## 2. Part 1: LU factorization (using Crout's method)

When factorizing $A$, there are different ways of handling the diagonal terms in $L$ and $U$. Here we follow the method of Crout as presented in [3] and set the diagonal elements of $L$ to be 1 ($L$ is then sometimes called a *unit* lower-triangular matrix). (Naming conventions vary: some texts, including [2], call this choice Doolittle's method and reserve Crout's name for the variant with 1's on the diagonal of $U$.) The goal now is to determine the remaining non-zero elements of $L$ and $U$. The first few columns and rows of $LU=A$ are

\begin{align}
  \left[\begin{array}{ccccc}
    1      & 0      & 0      & 0 & \cdots\\
    l_{21} & 1      & 0      & 0 & \cdots \\
    l_{31} & l_{32} & 1      & 0 & \cdots \\
    l_{41} & l_{42} & l_{43} & 1 & \cdots \\
    \vdots & \vdots & \vdots & \vdots & \ddots
  \end{array}\right]
  \left[\begin{array}{ccccc}
    u_{11} & u_{12} & u_{13} & u_{14} & \cdots \\
    0      & u_{22} & u_{23} & u_{24} & \cdots \\
    0      & 0      & u_{33} & u_{34} & \cdots \\
    0      & 0      & 0      & u_{44} & \cdots \\
    \vdots & \vdots & \vdots & \vdots & \ddots
  \end{array}\right]
  = 
  \left[\begin{array}{ccccc}
    a_{11} & a_{12} & a_{13} & a_{14} & \cdots \\
    a_{21} & a_{22} & a_{23} & a_{24} & \cdots \\
    a_{31} & a_{32} & a_{33} & a_{34} & \cdots \\
    a_{41} & a_{42} & a_{43} & a_{44} & \cdots \\
    \vdots & \vdots & \vdots & \vdots & \ddots
  \end{array}\right]
\end{align}

\begin{align}
  = 
  \left[\begin{array}{llllc}
    u_{11} & u_{12} & u_{13} & u_{14} & \cdots \\
    l_{21}u_{11} & l_{21}u_{12} + u_{22} & l_{21}u_{13} + u_{23} & l_{21}u_{14} + u_{24} & \cdots \\
    l_{31}u_{11} & l_{31}u_{12} + l_{32}u_{22} & l_{31}u_{13} + l_{32}u_{23} + u_{33} & l_{31}u_{14} + l_{32}u_{24} + u_{34} & \cdots \\
    l_{41}u_{11} & l_{41}u_{12} + l_{42}u_{22} & l_{41}u_{13} + l_{42}u_{23} + l_{43}u_{33} & l_{41}u_{14} + l_{42}u_{24} + l_{43}u_{34} + u_{44} & \cdots \\
    \vdots & \vdots & \vdots & \vdots & \ddots  
  \end{array}\right]
\end{align}

where all $l_{ij}$'s and $u_{ij}$'s are yet to be determined, but the $a_{ij}$'s are all known. After staring at this for a while, a pattern emerges. All elements in the first *row* of $U$ (that is $u_{11}, u_{12},\ldots,u_{1n}$) can be determined immediately from the first row of $A=LU$ (equating the first row above with the first row of $A$)

\begin{align}
  u_{1j} = a_{1j}
  \quad,\quad j = 1,\ldots,n
\end{align}

Then because $u_{11}$ is now known, all elements in the first *column* of $L$ can be determined from the first column of $A=LU$

\begin{align}
  l_{i1} = \frac{a_{i1}}{u_{11}}
  \quad,\quad i = 2,\ldots,n
\end{align}

Next the elements in the second row of $U$ can be determined from the second row of $LU=A$

\begin{align}
  u_{2j} = a_{2j} - l_{21}u_{1j}
  \quad,\quad j = 2,\ldots,n
\end{align}

Then the second column of $L$ can be determined from the second column of $LU=A$

\begin{align}
  l_{i2} = \frac{1}{u_{22}}(a_{i2} - l_{i1}u_{12})
  \quad,\quad i = 3,\ldots,n
\end{align}

And so on, alternating between columns below the diagonal of $A=LU$ and rows to the right of the diagonal, solving for the column and row elements of $L$ and $U$, respectively. The general expression for calculating elements in the $i$th row of $U$ is

\begin{align}
  u_{ij} = a_{ij} - \sum_{k=1}^{i-1}l_{ik}u_{kj}
  \quad,\quad j = i,\ldots,n
\end{align}

The general expression for calculating elements in the $j$th column of $L$ is

\begin{align}
  l_{ij} = \frac{1}{u_{jj}}\left(a_{ij} - \sum_{k=1}^{j-1}l_{ik}u_{kj}\right)
  \quad,\quad i = j+1,\ldots,n
\end{align}
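As a minimal sketch of the row/column alternation described above (without pivoting), the general expressions can be coded directly. The function name `lu_no_pivot` and the $3\times 3$ test matrix are assumptions chosen for illustration; exact fractions are used so that the result can be compared with a hand calculation.

```python
from fractions import Fraction


def lu_no_pivot(A):
    """Factor A = L U with unit diagonal in L, assuming no zero pivot appears."""
    n = len(A)
    A = [[Fraction(v) for v in row] for row in A]
    L = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    U = [[Fraction(0)] * n for _ in range(n)]
    for m in range(n):
        # row m of U: u_mj = a_mj - sum_k l_mk u_kj
        for j in range(m, n):
            U[m][j] = A[m][j] - sum(L[m][k] * U[k][j] for k in range(m))
        if U[m][m] == 0:
            raise ZeroDivisionError('zero pivot: row interchanges would be needed')
        # column m of L: l_im = (a_im - sum_k l_ik u_km) / u_mm
        for i in range(m + 1, n):
            L[i][m] = (A[i][m] - sum(L[i][k] * U[k][m] for k in range(m))) / U[m][m]
    return L, U


A = [[2, 1, -1], [4, 5, -3], [-2, 5, -2]]
L, U = lu_no_pivot(A)
LU = [[sum(L[i][k] * U[k][j] for k in range(3)) for j in range(3)] for i in range(3)]
print('L =', L)
print('U =', U)
print('L.U == A:', LU == A)
```

Note that the division is by the diagonal element of $U$ in the column being processed, so a zero (or very small) diagonal element is exactly where this unpivoted version breaks down.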

**Pivoting**

Notice that calculating the $l_{ij}$'s involves dividing by the diagonal elements $u_{jj}$. That means in order to avoid potentially catastrophic roundoff errors, or division by zero, we need pivoting. Pivoting is slightly more subtle here than it is during Gaussian elimination. The above procedure assumes no interchanges of rows or columns, that is, no pivoting. In that case, $A=LU$ is strictly true. However, to allow for pivoting, the correct identification should really be $PA=LU$, where $P$ is a permutation matrix that performs all row interchanges (here we only consider partial pivoting by interchanging rows, not columns). If we knew ahead of time which row interchanges were needed for pivoting, we would have 

\begin{align}
  \tilde{A} = PA = P_{n-1}P_{n-2}\cdots P_{2}P_{1}A.
\end{align}

It is $\tilde{A}$, and not $A$, that is factorized into the product of $L$ and $U$. Of course, we don't know ahead of time which row interchanges are needed for pivoting, so we proceed in a provisional way. We calculate elements using the above expressions until we get to a diagonal element, $u_{jj}$. Then we calculate the candidate value that $u_{jj}$ would take for every row on and below the diagonal (i.e., all $u_{ij}$, where $i=j,\ldots,n$), choose the best pivot candidate among them, and interchange the necessary rows of $A$. Any elements of $L$ that have been determined up to that point are also subject to row interchanges, as are any scale factors associated with $A$. However, elements of $L$ and $U$ that have not yet been determined are unaffected by these row interchanges, since they haven't been pinned down yet anyway. This is where the subtlety lies. We are not actually permuting the rows of $L$ or $U$; we are simply re-interpreting their meaning. Remember, we are finding $L$ and $U$ that satisfy $PA=LU$, and not $A=LU$. So when we encounter a pivoting step, we go back and reconsider what $L$ and $U$ *would have been* if the permutation had been made from the outset. This is a bit confusing, so hopefully the example below will illustrate it clearly.

Alternating between rows and columns is not the only possibility. One may instead sweep through entire columns at a time, starting with the leftmost column and proceeding to the right; or one may sweep through entire rows, starting with the uppermost row and proceeding down. Either way, the data used to calculate each element have already been determined by the time they are needed. In the algorithm below, each stage first calculates the pivot candidates for the diagonal element and pivots, then sweeps down the column below the diagonal (elements of $L$), and then sweeps along the row to the right of the diagonal (elements of $U$). We will use scaled partial pivoting as we did for Gaussian elimination.

(PROGRAMMER'S NOTE: each $a_{ij}$ appears exactly once in the above sequence of calculations of either $l_{ij}$ or $u_{ij}$. Therefore once used, it is no longer needed. As a result, there is no need to save the array representing the elements of $A$. Storage requirements can be minimized by replacing the elements of $A$ by the elements of $L$ and $U$, as they are calculated. Also the diagonal elements of $L$ do not need to be stored, since we know that they are simply 1.)
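The storage trick in the note above can be sketched as follows, again without pivoting. The names `lu_in_place` and `unpack` and the test matrix are hypothetical choices for this illustration; the point is only that one array can hold both factors.

```python
def lu_in_place(a):
    """Overwrite the square list-of-lists a with U (on and above the diagonal)
    and the multipliers of L (below the diagonal). No pivoting is performed."""
    n = len(a)
    for m in range(n):
        for j in range(m, n):        # row m of U replaces a[m][m:]
            a[m][j] -= sum(a[m][k] * a[k][j] for k in range(m))
        for i in range(m + 1, n):    # column m of L replaces a[m+1:][m]
            a[i][m] = (a[i][m] - sum(a[i][k] * a[k][m] for k in range(m))) / a[m][m]
    return a


def unpack(a):
    """Split the packed array into separate L (unit diagonal) and U."""
    n = len(a)
    L = [[a[i][j] if j < i else float(i == j) for j in range(n)] for i in range(n)]
    U = [[a[i][j] if j >= i else 0.0 for j in range(n)] for i in range(n)]
    return L, U


packed = lu_in_place([[2.0, 1.0, -1.0], [4.0, 5.0, -3.0], [-2.0, 5.0, -2.0]])
L, U = unpack(packed)
print('packed =', packed)
print('L =', L)
print('U =', U)
```

The unit diagonal of $L$ is never stored; it is re-inserted only when the factors are unpacked.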

## 3. Simple example (part 1: LU factorization using Crout's method)

To illustrate the above procedure, consider the linear system of four equations

\begin{align}
  \begin{array}{lrcrcrcrcr}
    E_{1}: &  x_{1} &+&  x_{2} & &        &+& 3x_{4} &=&  4, \\
    E_{2}: & 2x_{1} &+&  x_{2} &-&  x_{3} &+&  x_{4} &=&  1, \\
    E_{3}: & 3x_{1} &-&  x_{2} &-&  x_{3} &+& 2x_{4} &=& -3, \\
    E_{4}: & -x_{1} &+& 2x_{2} &+& 3x_{3} &-&  x_{4} &=&  4,
  \end{array}
\end{align}

(NOTE: pivoting is not strictly needed for this example. Nevertheless, we will walk through the pivoting procedure for illustrative purposes.)

In matrix form we have

\begin{align}
  \left[\begin{array}{cccc}
    1      & 0      & 0      & 0 \\
    l_{21} & 1      & 0      & 0 \\
    l_{31} & l_{32} & 1      & 0 \\
    l_{41} & l_{42} & l_{43} & 1
  \end{array}\right]
  \left[\begin{array}{cccc}
    u_{11} & u_{12} & u_{13} & u_{14} \\
    0      & u_{22} & u_{23} & u_{24} \\
    0      & 0      & u_{33} & u_{34} \\
    0      & 0      & 0      & u_{44}
  \end{array}\right]
  = 
  \left[\begin{array}{rrrr}
     a^{(0)}_{11} & a^{(0)}_{12} & a^{(0)}_{13} & a^{(0)}_{14} \\
     a^{(0)}_{21} & a^{(0)}_{22} & a^{(0)}_{23} & a^{(0)}_{24} \\
     a^{(0)}_{31} & a^{(0)}_{32} & a^{(0)}_{33} & a^{(0)}_{34} \\
     a^{(0)}_{41} & a^{(0)}_{42} & a^{(0)}_{43} & a^{(0)}_{44}
  \end{array}\right],
\end{align}

where $a^{(\alpha)}_{ij}$ denotes the provisional matrix elements after $\alpha$ steps of pivoting. We start with 

\begin{align}
  A^{(0)} =
  \left[\begin{array}{rrrr}
     a^{(0)}_{11} & a^{(0)}_{12} & a^{(0)}_{13} & a^{(0)}_{14} \\
     a^{(0)}_{21} & a^{(0)}_{22} & a^{(0)}_{23} & a^{(0)}_{24} \\
     a^{(0)}_{31} & a^{(0)}_{32} & a^{(0)}_{33} & a^{(0)}_{34} \\
     a^{(0)}_{41} & a^{(0)}_{42} & a^{(0)}_{43} & a^{(0)}_{44}
  \end{array}\right]
  = 
  \left[\begin{array}{rrrr}
     1 &  1 &  0 &  3 \\
     2 &  1 & -1 &  1 \\
     3 & -1 & -1 &  2 \\
    -1 &  2 &  3 & -1 
  \end{array}\right].
\end{align}

with scale factors

\begin{align}
  \left.\begin{array}{l}
    s^{(0)}_{1} = \max_{1 \leq j \leq 4}|a^{(0)}_{1j}| =  3 \\
    s^{(0)}_{2} = \max_{1 \leq j \leq 4}|a^{(0)}_{2j}| =  2 \\
    s^{(0)}_{3} = \max_{1 \leq j \leq 4}|a^{(0)}_{3j}| =  3 \\
    s^{(0)}_{4} = \max_{1 \leq j \leq 4}|a^{(0)}_{4j}| =  3 
  \end{array}\right\}
  \quad\Rightarrow\quad
  s^{(0)} =
  \left[\begin{array}{l}
    3 \\
    2 \\
    3 \\
    3
  \end{array}\right].
\end{align}

The first element to calculate is $u_{11}$, which involves pivoting. Therefore calculate all pivot candidates in the first column, as if they had appeared in the pivot position (the $(1,1)$ position in this case)

\begin{align}
  & u^{(0)}_{11} = a^{(0)}_{11} =  1 \\
  & u^{(0)}_{21} = a^{(0)}_{21} =  2 \\
  & u^{(0)}_{31} = a^{(0)}_{31} =  3 \\
  & u^{(0)}_{41} = a^{(0)}_{41} = -1 .
\end{align}

The best pivot candidate is chosen by comparing their magnitudes relative to the scale factor associated with each row 

\begin{align}
  & \left|\frac{u^{(0)}_{11}}{s^{(0)}_{1}}\right| = \left|\frac{1}{3}\right| = 1/3 \\
  & \left|\frac{u^{(0)}_{21}}{s^{(0)}_{2}}\right| = \left|\frac{2}{2}\right| = 1 \\
  & \left|\frac{u^{(0)}_{31}}{s^{(0)}_{3}}\right| = \left|\frac{3}{3}\right| = 1 \\
  & \left|\frac{u^{(0)}_{41}}{s^{(0)}_{4}}\right| = \left|\frac{-1}{3}\right| = 1/3 .
\end{align}

Here there are two best pivot candidates (rows 2 and 3). By convention we select the first. So we interchange rows 1 and 2, yielding

\begin{align}
  \begin{array}{l}
    u^{(1)}_{11} = u^{(0)}_{21} = 2
  \end{array}
  \quad,\quad
  A^{(1)} = 
  \left[\begin{array}{rrrr}
     2 &  1 & -1 &  1 \\
     1 &  1 &  0 &  3 \\
     3 & -1 & -1 &  2 \\
    -1 &  2 &  3 & -1 
  \end{array}\right]
  \quad,\quad
  s^{(1)} =
  \left[\begin{array}{l}
    2 \\
    3 \\
    3 \\
    3
  \end{array}\right].
\end{align}
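The first pivot selection carried out above can be reproduced with a short sketch in plain Python. Indices here are 0-based, so "row 2" in the text is index 1; the variable names are illustrative assumptions only.

```python
A = [[1, 1, 0, 3], [2, 1, -1, 1], [3, -1, -1, 2], [-1, 2, 3, -1]]

# scale factor of each row: largest magnitude in that row
s = [max(abs(v) for v in row) for row in A]
scale_before = list(s)

# relative magnitude of each pivot candidate in the first column
ratios = [abs(row[0]) / si for row, si in zip(A, s)]

# max() returns the first maximal item, which implements the tie-break convention
best = max(range(len(A)), key=lambda i: ratios[i])

# interchange the pivot row with row 0, together with its scale factor
A[0], A[best] = A[best], A[0]
s[0], s[best] = s[best], s[0]

print('scale factors before pivoting:', scale_before)
print('ratios:', ratios)
print('pivot row index (0-based):', best)
print('first row after interchange:', A[0])
print('scale factors after interchange:', s)
```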

Now we can quickly solve for the elements in the first column of $L$. Note that the expression for these $l_{i1}$'s is the same as the above $u^{(0)}_{i1}$'s, with the additional division by $u^{(1)}_{11}$. That is 

\begin{align}
  & l^{(1)}_{21} = \frac{u^{(0)}_{11}}{u^{(1)}_{11}} = \frac{1}{2} \\
  & l^{(1)}_{31} = \frac{u^{(0)}_{31}}{u^{(1)}_{11}} = \frac{3}{2} \\
  & l^{(1)}_{41} = \frac{u^{(0)}_{41}}{u^{(1)}_{11}} = -\frac{1}{2} .
\end{align}

Notice that the ordering and labeling of these may be affected by future pivoting. The elements in the rest of the first row of $U$ can also be calculated

\begin{align}
  & u^{(1)}_{12} = a^{(1)}_{12} =  1 \\
  & u^{(1)}_{13} = a^{(1)}_{13} = -1 \\
  & u^{(1)}_{14} = a^{(1)}_{14} =  1 .
\end{align}

The first stage is complete, yielding

\begin{align}
  A^{(1)} = 
  \left[\begin{array}{rrrr}
     2 &  1 & -1 &  1 \\
     1 &  1 &  0 &  3 \\
     3 & -1 & -1 &  2 \\
    -1 &  2 &  3 & -1 
  \end{array}\right]
  \quad,\quad
  s^{(1)} = 
  \left[\begin{array}{l}
    2 \\
    3 \\
    3 \\
    3
  \end{array}\right]
  \quad,\quad
  L^{(1)} =
  \left[\begin{array}{cccc}
    1    & 0 & 0 & 0 \\
    1/2  & 1 & 0 & 0 \\
    3/2  & ? & 1 & 0 \\
    -1/2 & ? & ? & 1
  \end{array}\right]
  \quad,\quad
  U^{(1)} =
  \left[\begin{array}{cccc}
    2 & 1 & -1 & 1 \\
    0 & ? & ? & ? \\
    0 & 0 & ? & ? \\
    0 & 0 & 0 & ?
  \end{array}\right].
\end{align}

Next calculate $u_{22}$ (with pivoting). 

\begin{align}
  & u^{(1)}_{22} = a^{(1)}_{22} - l^{(1)}_{21}u^{(1)}_{12} 
  = 1 - (1/2)(1) = 1/2 
  \quad\Rightarrow\quad
  \left|\frac{u^{(1)}_{22}}{s^{(1)}_{2}}\right| = 1/6
  \\
  & u^{(1)}_{32} = a^{(1)}_{32} - l^{(1)}_{31}u^{(1)}_{12} 
  = -1 - (3/2)(1) = -5/2
  \quad\Rightarrow\quad
  \left|\frac{u^{(1)}_{32}}{s^{(1)}_{3}}\right| = 5/6
  \\
  & u^{(1)}_{42} = a^{(1)}_{42} - l^{(1)}_{41}u^{(1)}_{12} 
  = 2 - (-1/2)(1) = 5/2
  \quad\Rightarrow\quad
  \left|\frac{u^{(1)}_{42}}{s^{(1)}_{4}}\right| = 5/6 ,
\end{align}

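which yields a tie between $u^{(1)}_{32}$ and $u^{(1)}_{42}$; by the same convention as before we select the first, $u^{(1)}_{32}$, as the pivot element. After interchanging rows 2 and 3, we have 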

\begin{align}
  \begin{array}{l}
    u^{(2)}_{22} = u^{(1)}_{32} = -5/2
  \end{array}
  \quad,\quad
  A^{(2)} = 
  \left[\begin{array}{rrrr}
     2 &  1 & -1 &  1 \\
     3 & -1 & -1 &  2 \\
     1 &  1 &  0 &  3 \\
    -1 &  2 &  3 & -1 
  \end{array}\right]
  \quad,\quad
  s^{(2)} = 
  \left[\begin{array}{l}
    2 \\
    3 \\
    3 \\
    3
  \end{array}\right].
\end{align}

Elements in the second column of $L$ are

\begin{align}
  & l^{(2)}_{32} = \frac{u^{(1)}_{22}}{u^{(2)}_{22}} = -1/5 \\
  & l^{(2)}_{42} = \frac{u^{(1)}_{42}}{u^{(2)}_{22}} = -1 .
\end{align}

Elements in the second row of $U$ are

\begin{align}
  & u^{(2)}_{23} = a^{(2)}_{23} - l^{(2)}_{21}u^{(2)}_{13} 
  = -1 - (3/2)(-1) = 1/2
  \\
  & u^{(2)}_{24} = a^{(2)}_{24} - l^{(2)}_{21}u^{(2)}_{14} 
  = 2 - (3/2)(1) = 1/2 .
\end{align}

After the second stage

\begin{align}
  A^{(2)} = 
  \left[\begin{array}{rrrr}
     2 &  1 & -1 &  1 \\
     3 & -1 & -1 &  2 \\
     1 &  1 &  0 &  3 \\
    -1 &  2 &  3 & -1 
  \end{array}\right]
  \quad,\quad
  s^{(2)} = 
  \left[\begin{array}{l}
    2 \\
    3 \\
    3 \\
    3
  \end{array}\right]
  \quad,\quad
  L^{(2)} =
  \left[\begin{array}{cccc}
    1    & 0    & 0 & 0 \\
    3/2  & 1    & 0 & 0 \\
    1/2  & -1/5 & 1 & 0 \\
    -1/2 & -1   & ? & 1
  \end{array}\right]
  \quad,\quad
  U^{(2)} =
  \left[\begin{array}{cccc}
    2 & 1 & -1 & 1 \\
    0 & -5/2 & 1/2 & 1/2 \\
    0 & 0    & ?   & ? \\
    0 & 0    & 0   & ?
  \end{array}\right].
\end{align}

Next calculate $u_{33}$ (with pivoting).

\begin{align}
  & u^{(2)}_{33} = a^{(2)}_{33} - l^{(2)}_{31}u^{(2)}_{13} - l^{(2)}_{32}u^{(2)}_{23} 
  = 0 - (1/2)(-1) - (-1/5)(1/2) = 3/5
  \quad\Rightarrow\quad
  \left|\frac{u^{(2)}_{33}}{s^{(2)}_{3}}\right| = 1/5
  \\
  & u^{(2)}_{43} = a^{(2)}_{43} - l^{(2)}_{41}u^{(2)}_{13} - l^{(2)}_{42}u^{(2)}_{23} 
  = 3 - (-1/2)(-1) - (-1)(1/2) = 3
  \quad\Rightarrow\quad
  \left|\frac{u^{(2)}_{43}}{s^{(2)}_{4}}\right| = 1 ,
\end{align}

which yields $u^{(2)}_{43}$ as the pivot element. After interchanging rows 3 and 4, we have 

\begin{align}
  \begin{array}{l}
    u^{(3)}_{33} = u^{(2)}_{43} = 3
  \end{array}
  \quad,\quad
  A^{(3)} = 
  \left[\begin{array}{rrrr}
     2 &  1 & -1 &  1 \\
     3 & -1 & -1 &  2 \\
    -1 &  2 &  3 & -1 \\
     1 &  1 &  0 &  3
  \end{array}\right]
  \quad,\quad
  s^{(3)} = 
  \left[\begin{array}{l}
    2 \\
    3 \\
    3 \\
    3
  \end{array}\right].
\end{align}

Elements in the third column of $L$ are

\begin{align}
  & l^{(3)}_{43} = \frac{u^{(2)}_{33}}{u^{(3)}_{33}} =  1/5 .
\end{align}

Elements in the third row of $U$ are

\begin{align}
  u^{(3)}_{34} = a^{(3)}_{34} - l^{(3)}_{31}u^{(3)}_{14} - l^{(3)}_{32}u^{(3)}_{24} 
  = -1 - (-1/2)(1) - (-1)(1/2) = 0 .
\end{align}

After the third stage

\begin{align}
  A^{(3)} = 
  \left[\begin{array}{rrrr}
     2 &  1 & -1 &  1 \\
     3 & -1 & -1 &  2 \\
    -1 &  2 &  3 & -1 \\
     1 &  1 &  0 &  3
  \end{array}\right]
  \quad,\quad
  s^{(3)} = 
  \left[\begin{array}{l}
    2 \\
    3 \\
    3 \\
    3
  \end{array}\right]
  \quad,\quad
  L^{(3)} =
  \left[\begin{array}{cccc}
    1    & 0    & 0   & 0 \\
    3/2  & 1    & 0   & 0 \\
    -1/2 & -1   & 1   & 0 \\
    1/2  & -1/5 & 1/5 & 1
  \end{array}\right]
  \quad,\quad
  U^{(3)} =
  \left[\begin{array}{cccc}
    2 & 1 & -1 & 1 \\
    0 & -5/2 & 1/2 & 1/2 \\
    0 & 0    & 3   & 0 \\
    0 & 0    & 0   & ?
  \end{array}\right].
\end{align}

Finally, calculate the last element $u_{44}$.

\begin{align}
  u^{(3)}_{44} = a^{(3)}_{44} - l^{(3)}_{41}u^{(3)}_{14} - l^{(3)}_{42}u^{(3)}_{24} - l^{(3)}_{43}u^{(3)}_{34} 
  = 3 - (1/2)(1) - (-1/5)(1/2) - (1/5)(0)
  = 13/5.
\end{align}

This completes the factorization, yielding

\begin{align}
  L =
  \left[\begin{array}{cccc}
    1    & 0    & 0   & 0 \\
    3/2  & 1    & 0   & 0 \\
    -1/2 & -1   & 1   & 0 \\
    1/2  & -1/5 & 1/5 & 1
  \end{array}\right]
  \quad,\quad
  U =
  \left[\begin{array}{cccc}
    2 & 1 & -1 & 1 \\
    0 & -5/2 & 1/2 & 1/2 \\
    0 & 0    & 3   & 0 \\
    0 & 0    & 0   & 13/5
  \end{array}\right].
\end{align}

To check this, we first need to reconstruct the permutation matrix responsible for carrying out the above row interchanges. The row interchanges are induced by the permutation matrices

\begin{align}
  & P_{1} = P_{(1\leftrightarrow 2)} =
  \left[\begin{array}{cccc}
    0 & 1 & 0 & 0 \\
    1 & 0 & 0 & 0 \\
    0 & 0 & 1 & 0 \\
    0 & 0 & 0 & 1
  \end{array}\right]
  \\
  & P_{2} = P_{(2\leftrightarrow 3)} =
  \left[\begin{array}{cccc}
    1 & 0 & 0 & 0 \\
    0 & 0 & 1 & 0 \\
    0 & 1 & 0 & 0 \\
    0 & 0 & 0 & 1
  \end{array}\right]
  \\
  & P_{3} = P_{(3\leftrightarrow 4)} =
  \left[\begin{array}{cccc}
    1 & 0 & 0 & 0 \\
    0 & 1 & 0 & 0 \\
    0 & 0 & 0 & 1 \\
    0 & 0 & 1 & 0
  \end{array}\right].
\end{align}
 
One can easily verify that these matrices perform the indicated row interchange by examining their action on a known column vector. The permutation matrices left-multiply $A$ in succession, yielding 
 
\begin{align}
  P = P_{3}P_{2}P_{1} =
  \left[\begin{array}{cccc}
    1 & 0 & 0 & 0 \\
    0 & 1 & 0 & 0 \\
    0 & 0 & 0 & 1 \\
    0 & 0 & 1 & 0
  \end{array}\right]
  \left[\begin{array}{cccc}
    1 & 0 & 0 & 0 \\
    0 & 0 & 1 & 0 \\
    0 & 1 & 0 & 0 \\
    0 & 0 & 0 & 1
  \end{array}\right]
  \left[\begin{array}{cccc}
    0 & 1 & 0 & 0 \\
    1 & 0 & 0 & 0 \\
    0 & 0 & 1 & 0 \\
    0 & 0 & 0 & 1
  \end{array}\right]
  =
  \left[\begin{array}{cccc}
    0 & 1 & 0 & 0 \\
    0 & 0 & 1 & 0 \\
    0 & 0 & 0 & 1 \\
    1 & 0 & 0 & 0
  \end{array}\right].  
\end{align}

It is now easily verified that $PA=LU$.
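One way to double-check the hand calculation is to repeat it in exact rational arithmetic. The sketch below is an elimination-style formulation with scaled partial pivoting, not the code developed later in this notebook; for this example it should arrive at the same $P$, $L$, and $U$. The function name `lu_scaled_pivot` and the list `perm` (row $i$ of $PA$ is row `perm[i]` of $A$, 0-based) are assumptions made for illustration.

```python
from fractions import Fraction


def lu_scaled_pivot(A):
    """Return (perm, L, U) such that row i of L.U equals row perm[i] of A."""
    n = len(A)
    a = [[Fraction(v) for v in row] for row in A]   # working copy, overwritten by L and U
    s = [max(abs(v) for v in row) for row in a]     # scale factor of each row
    if any(si == 0 for si in s):
        raise ValueError('matrix is singular')
    perm = list(range(n))
    for m in range(n - 1):
        # first row (on or below the diagonal) with the largest scaled magnitude
        p = max(range(m, n), key=lambda i: abs(a[i][m]) / s[i])
        if a[p][m] == 0:
            raise ValueError('matrix is singular')
        if p != m:                                  # interchange rows, scales, bookkeeping
            a[m], a[p] = a[p], a[m]
            s[m], s[p] = s[p], s[m]
            perm[m], perm[p] = perm[p], perm[m]
        for i in range(m + 1, n):
            a[i][m] /= a[m][m]                      # multiplier l_im, stored in place
            for j in range(m + 1, n):
                a[i][j] -= a[i][m] * a[m][j]
    L = [[a[i][j] if j < i else Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    U = [[a[i][j] if j >= i else Fraction(0) for j in range(n)] for i in range(n)]
    return perm, L, U


A = [[1, 1, 0, 3], [2, 1, -1, 1], [3, -1, -1, 2], [-1, 2, 3, -1]]
perm, L, U = lu_scaled_pivot(A)
PA = [A[p] for p in perm]
LU = [[sum(L[i][k] * U[k][j] for k in range(4)) for j in range(4)] for i in range(4)]
print('perm =', perm)
print('last row of L =', L[3])
print('u44 =', U[3][3])
print('PA == LU:', PA == LU)   # exact comparison is meaningful because fractions do not round
```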

In [1]:
# record solutions for later comparisons
import numpy as np
Psoln = np.array([[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0]], dtype=float)
Lsoln = np.array([[1, 0, 0, 0], [3/2, 1, 0, 0], [-1/2, -1, 1, 0], [1/2, -1/5, 1/5, 1]], dtype=float)
Usoln = np.array([[2, 1, -1, 1], [0, -5/2, 1/2, 1/2], [0, 0, 3, 0], [0, 0, 0, 13/5]], dtype=float)
print('Psoln =\n', Psoln)
print('Lsoln =\n', Lsoln)
print('Usoln =\n', Usoln)

Psoln =
 [[ 0.  1.  0.  0.]
 [ 0.  0.  1.  0.]
 [ 0.  0.  0.  1.]
 [ 1.  0.  0.  0.]]
Lsoln =
 [[ 1.   0.   0.   0. ]
 [ 1.5  1.   0.   0. ]
 [-0.5 -1.   1.   0. ]
 [ 0.5 -0.2  0.2  1. ]]
Usoln =
 [[ 2.   1.  -1.   1. ]
 [ 0.  -2.5  0.5  0.5]
 [ 0.   0.   3.   0. ]
 [ 0.   0.   0.   2.6]]


In [2]:
# check solution
A = np.array([[1, 1, 0, 3], [2, 1, -1, 1], [3, -1, -1, 2], [-1, 2, 3, -1]], dtype=float)
print('A =\n', A)
print('Psoln.A =\n', np.dot(Psoln, A))
print('Lsoln.Usoln =\n', np.dot(Lsoln, Usoln))
print(np.dot(Psoln, A) == np.dot(Lsoln, Usoln))

A =
 [[ 1.  1.  0.  3.]
 [ 2.  1. -1.  1.]
 [ 3. -1. -1.  2.]
 [-1.  2.  3. -1.]]
Psoln.A =
 [[ 2.  1. -1.  1.]
 [ 3. -1. -1.  2.]
 [-1.  2.  3. -1.]
 [ 1.  1.  0.  3.]]
Lsoln.Usoln =
 [[  2.00000000e+00   1.00000000e+00  -1.00000000e+00   1.00000000e+00]
 [  3.00000000e+00  -1.00000000e+00  -1.00000000e+00   2.00000000e+00]
 [ -1.00000000e+00   2.00000000e+00   3.00000000e+00  -1.00000000e+00]
 [  1.00000000e+00   1.00000000e+00   1.11022302e-16   3.00000000e+00]]
[[ True  True  True  True]
 [ True  True  True  True]
 [ True  True  True  True]
 [ True  True False  True]]
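The single `False` in the output above does not indicate an error in the factorization. The corresponding entry of `Lsoln.Usoln` is about $10^{-16}$ rather than exactly zero because $1/5$ has no exact binary floating-point representation, so an element-by-element `==` test is too strict. A minimal sketch of a tolerance-based comparison for that one entry, using only the standard library (the tolerance `1e-12` is an arbitrary choice for illustration):

```python
import math

# entry (4,3) of L.U from the worked example, accumulated in floating point
row_L = [0.5, -0.2, 0.2, 1.0]   # fourth row of L
col_U = [-1.0, 0.5, 3.0, 0.0]   # third column of U
entry = sum(l * u for l, u in zip(row_L, col_U))

print('entry =', entry)                      # tiny, but not necessarily exactly 0.0
print('exactly zero?', entry == 0.0)
print('zero within tolerance?', math.isclose(entry, 0.0, abs_tol=1e-12))
```

For whole arrays, `np.allclose(np.dot(Psoln, A), np.dot(Lsoln, Usoln))` performs the analogous tolerance-based check in NumPy.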


## 4. Pseudocode: LU factorization

**INPUT**
- $n$, number of equations
- $A$, an $n\times n$ array

**Validate inputs**
- if $A$ does not have the correct dimensions, shape, or size, STOP

**Calculate scale factors**
- create, fill scale factor array, $s$

**Initialize row pointer**
- create row pointer array, $\mathsf{nrow}$
- initialize $\mathsf{nrow}_{i} = i$

**LU factorization loop**
- loop over column/row super-index, $\alpha$

  **calculate diagonal element, using scaled partial pivoting**
  - calculate pivot candidates $u_{i\alpha}$, for $i=\alpha,\ldots,n$ using
    $a_{i,\alpha} = a_{i,\alpha} - \sum_{k=1}^{\alpha-1}a_{ik}a_{k,\alpha}$
  - identify the row number $p$ with best pivot candidate
  - if $p\neq \alpha$, simulate row swap $(E_{p})\leftrightarrow (E_{\alpha})$ 
    by swapping row pointer elements $\mathsf{nrow}_{p}\leftrightarrow \mathsf{nrow}_{\alpha}$
  
  **calculate column elements of L below the diagonal**
  - for $i=\alpha+1,\ldots,n$ (loop over rows below the diagonal)
    - calculate $l_{i\alpha}$'s along $\alpha$th column using 
    $a^\text{(new)}_{i,\alpha} = a^\text{(old)}_{i,\alpha}/a_{\alpha\alpha}$

  **calculate row elements of U to the right of the diagonal**
  - for $j=\alpha+1,\ldots,n$ (loop over columns to the right of the diagonal)
    - calculate $u_{\alpha j}$'s along $\alpha$th row using 
    $a_{\alpha,j} = a_{\alpha,j} - \sum_{k=1}^{\alpha-1}a_{\alpha,k}a_{kj}$

**Construct output**
- create, fill permutation array $P$
- create, fill lower-triangular array $L$
- create, fill upper-triangular array $U$

**OUTPUT**
- arrays $P$, $L$, and $U$, or failure message
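The row-pointer idea in the pseudocode can be illustrated in isolation. In this sketch the stored rows of $A$ are never moved; only the entries of `nrow` are swapped, and the permutation matrix is built from `nrow` at the end. The three swaps are those of the worked example (0-based indices), and the helper name `swap_pointers` is an assumption made here.

```python
A = [[1, 1, 0, 3], [2, 1, -1, 1], [3, -1, -1, 2], [-1, 2, 3, -1]]
n = len(A)

# nrow[i] = index of the stored row that currently plays the role of row i
nrow = list(range(n))


def swap_pointers(nrow, i, k):
    """Simulate the row interchange (E_i) <-> (E_k) without touching A."""
    nrow[i], nrow[k] = nrow[k], nrow[i]


swap_pointers(nrow, 0, 1)   # pivoting at the first column
swap_pointers(nrow, 1, 2)   # pivoting at the second column
swap_pointers(nrow, 2, 3)   # pivoting at the third column

# permutation matrix: row i of P has its 1 in column nrow[i]
P = [[1 if nrow[i] == j else 0 for j in range(n)] for i in range(n)]

# rows of P.A, read through the pointer array
PA = [A[nrow[i]] for i in range(n)]

print('nrow =', nrow)
print('P =', P)
print('PA =', PA)
```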

## 5. CODE: LU factorization

In [3]:
%%writefile lu_factor.py

import numpy as np

# LU factorization
def lu_factor(n, A):

    # check that input matrix has the correct dimensions, shape, and size
    if A.ndim != 2:
        print('ERROR: input array must have ndim=2. Stopping.')
        return
    if A.shape[0] != A.shape[1]:
        print('ERROR: input array must have shape=(n,n). Stopping.')
        return
    if A.size != n**2:
        print('ERROR: input array must have size=nxn. Stopping.')
        return
    
    # ensure that elements of A are floats
    A = A.astype(float)

    # calculate the scale factor for each row as the magnitude of
    # the largest element in that row
    s = np.zeros(n) #initialize all scale factors to zero
    for i in range (0, n): #loop over rows
        s[i] = np.amax(np.abs(A[i,:])) #calculate scale factor (largest magnitude in row)
        print('scale factor for row',i,'is',s[i])                      #debugging
        if s[i] == 0:
            print('ERROR: matrix is singular. Stopping.')
            return
    
    # initialize row pointer to keep track of row ordering
    # (pointer indices are swapped during pivoting to simulate row swapping)
    nrow = np.arange(0, n)
    
    # LU factorization loop
    for alpha in range (0, n): #loop over row/column super index
        
        ### final loop ###
        if alpha == n-1:
            
            # final diagonal element
            print('Calculating final element of U:')                      #debugging

            # calculate the sum
            lusum = 0. #reset sum
            for k in range (0, alpha):
                lusum = lusum + A[nrow[alpha],k]*A[nrow[k],alpha]
        
            # calculate diagonal element
            A[nrow[alpha],alpha] = A[nrow[alpha],alpha] - lusum
            print('  u[',alpha,',',alpha,'] =',A[nrow[alpha],alpha]) #debugging
            
            #then quit
            break
        
        ### calculate diagonal element, with pivoting ###

        # initialize pivoting
        print('Pivoting at column',alpha,':')                         #debugging
        pmax = 0            #max relative magnitude
        prow = alpha        #virtual pivot row number
        nprow = nrow[alpha] #actual pivot row number in stored array

        # calculate pivot candidates in column alpha
        print('  calculating pivot candidates in column',alpha,'...') #debugging
        for i in range (alpha, n): #loop over rows
        
            # calculate the sum using previously found solutions
            lusum = 0. #reset sum
            for k in range (0, alpha):
                lusum = lusum + A[nrow[i],k]*A[nrow[k],alpha]
        
            # calculate the pivot candidate in the current row
            A[nrow[i],alpha] = A[nrow[i],alpha] - lusum
            print('  row',i,': u[',i,',',alpha,'] =',A[nrow[i],alpha]) #debugging

            # check if pivot candidate is the best so far...
            pij = np.abs(A[nrow[i],alpha]/s[nrow[i]]) #calculate relative magnitude
            if pij > pmax:
                # if relative magnitude of the current pivot candidate is greater than 
                # the relative magnitude of the previous best pivot candidate...
                pmax = pij      #current max relative magnitude, so far
                prow = i        #current virtual pivot row number, so far
                nprow = nrow[i] #current actual pivot row number in stored array, so far   
                
        # pivot
        print('  all pivot candidates have been calculated...')      #debugging
        print('  pmax for column',alpha,'is',pmax)                   #debugging
        print('  pivot row for column',alpha,'is',prow)              #debugging
        if pmax == 0:
            print('ERROR: matrix is singular. Stopping.')
            return
    
        # simulate row swap by swapping row pointers
        if prow != alpha:
            print('  row swap: row',prow,'<--> row',alpha)           #debugging
            ncopy = nrow[alpha]
            nrow[alpha] = nrow[prow]
            nrow[prow] = ncopy
            print('  new U element, after pivoting:')                #debugging
            print('  u[',alpha,',',alpha,'] =',A[nrow[alpha],alpha]) #debugging 
        else:
            # no row swap; still fall through to compute L and U for this stage
            print('  pivoting is not needed, skipping')              #debugging


        ### calculate column elements of L ###
        print('Calculating elements of L along column',alpha,':')    #debugging
        for i in range (alpha+1, n): #loop over rows
                
            # calculate the element of L in the current row
            A[nrow[i],alpha] = A[nrow[i],alpha]/A[nrow[alpha],alpha]
            print('  l[',i,',',alpha,'] =',A[nrow[i],alpha])         #debugging
            
            
        ### calculate row elements of U ###
        print('Calculating elements of U along row',alpha,':')       #debugging
        for j in range (alpha+1, n): #loop over columns
        
            # calculate the sum using previously found solutions
            lusum = 0. #reset sum
            for k in range (0, alpha):
                lusum = lusum + A[nrow[alpha],k]*A[nrow[k],j]
        
            # calculate the element of U in the current column
            A[nrow[alpha],j] = A[nrow[alpha],j] - lusum
            print('  u[',alpha,',',j,'] =',A[nrow[alpha],j])         #debugging

            
    # create output arrays
    P = np.zeros((n,n)) #initialize P as a zero array
    L = np.eye(n)       #initialize L as diag(1,...,1)
    U = np.zeros((n,n)) #initialize U as a zero array
    for i in range (0, n):
        P[i, nrow[i]] = 1             #fill permutation matrix 
        U[i, i:] = A[nrow[i], i:]     #fill upper-triangular elements of U
    for i in range (1, n):
        L[i, :i] = A[nrow[i], :i]     #fill lower-triangular elements of L

    return P, L, U

Overwriting lu_factor.py


In [4]:
%run lu_factor.py

In [5]:
# apply code to simple example above
n = 4
A = np.array([[1, 1, 0, 3], [2, 1, -1, 1], [3, -1, -1, 2], [-1, 2, 3, -1]])
P, L, U = lu_factor(n, A)
print('P =\n', P)
print('L =\n', L)
print('U =\n', U)

scale factor for row 0 is 3.0
scale factor for row 1 is 2.0
scale factor for row 2 is 3.0
scale factor for row 3 is 3.0
Pivoting at column 0 :
  calculating pivot candidates in column 0 ...
  row 0 : u[ 0 , 0 ] = 1.0
  row 1 : u[ 1 , 0 ] = 2.0
  row 2 : u[ 2 , 0 ] = 3.0
  row 3 : u[ 3 , 0 ] = -1.0
  all pivot candidates have been calculated...
  pmax for column 0 is 1.0
  pivot row for column 0 is 1
  row swap: row 1 <--> row 0
  new U element, after pivoting:
  u[ 0 , 0 ] = 2.0
Calculating elements of L along column 0 :
  l[ 1 , 0 ] = 0.5
  l[ 2 , 0 ] = 1.5
  l[ 3 , 0 ] = -0.5
Calculating elements of U along row 0 :
  u[ 0 , 1 ] = 1.0
  u[ 0 , 2 ] = -1.0
  u[ 0 , 3 ] = 1.0
Pivoting at column 1 :
  calculating pivot candidates in column 1 ...
  row 1 : u[ 1 , 1 ] = 0.5
  row 2 : u[ 2 , 1 ] = -2.5
  row 3 : u[ 3 , 1 ] = 2.5
  all pivot candidates have been calculated...
  pmax for column 1 is 0.833333333333
  pivot row for column 1 is 2
  row swap: row 2 <--> row 1
  new U element, afte

In [6]:
# check solution
print('A =\n', A)
print('P.A =\n', np.dot(P, A))
print('L.U =\n', np.dot(L, U))
print(np.dot(P, A) == np.dot(L, U))

A =
 [[ 1  1  0  3]
 [ 2  1 -1  1]
 [ 3 -1 -1  2]
 [-1  2  3 -1]]
P.A =
 [[ 2.  1. -1.  1.]
 [ 3. -1. -1.  2.]
 [-1.  2.  3. -1.]
 [ 1.  1.  0.  3.]]
L.U =
 [[ 2.  1. -1.  1.]
 [ 3. -1. -1.  2.]
 [-1.  2.  3. -1.]
 [ 1.  1.  0.  3.]]
[[ True  True  True  True]
 [ True  True  True  True]
 [ True  True  True  True]
 [ True  True  True  True]]


In [7]:
# compare to solution found above, by hand, for L
print('Lsoln =\n', Lsoln)
print('L =\n', L)
print(L == Lsoln)

Lsoln =
 [[ 1.   0.   0.   0. ]
 [ 1.5  1.   0.   0. ]
 [-0.5 -1.   1.   0. ]
 [ 0.5 -0.2  0.2  1. ]]
L =
 [[ 1.   0.   0.   0. ]
 [ 1.5  1.   0.   0. ]
 [-0.5 -1.   1.   0. ]
 [ 0.5 -0.2  0.2  1. ]]
[[ True  True  True  True]
 [ True  True  True  True]
 [ True  True  True  True]
 [ True  True False  True]]
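
The single `False` in the comparison above is most likely a floating-point effect rather than a wrong entry: both arrays print `0.2` at position `[3, 2]`, but `==` tests exact equality of the stored values. A minimal sketch of a tolerance-based comparison follows. The array `L_perturbed` is a hypothetical stand-in for a computed `L` whose entry differs from the hand value by one unit in the last place; it is not the actual output of `lu_factor`.

```python
import numpy as np

# Hand solution for L, as printed above
Lsoln = np.array([[ 1.0,  0.0, 0.0, 0.0],
                  [ 1.5,  1.0, 0.0, 0.0],
                  [-0.5, -1.0, 1.0, 0.0],
                  [ 0.5, -0.2, 0.2, 1.0]])

# Hypothetical computed result: entry [3, 2] is the next representable
# double above 0.2, which mimics a roundoff error of one unit in the last place
L_perturbed = Lsoln.copy()
L_perturbed[3, 2] = np.nextafter(0.2, 1.0)

exact_match = (L_perturbed == Lsoln).all()    # exact comparison fails
close_match = np.allclose(L_perturbed, Lsoln) # comparison within tolerance passes

print('exact match:', exact_match)
print('match within tolerance:', close_match)
```

For checks of this kind, `np.allclose` (or `np.isclose` for an element-by-element result) is usually the safer choice than `==`.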


In [8]:
# compare to solution found above, by hand, for U
print('Usoln =\n', Usoln)
print('U =\n', U)
print(U == Usoln)

Usoln =
 [[ 2.   1.  -1.   1. ]
 [ 0.  -2.5  0.5  0.5]
 [ 0.   0.   3.   0. ]
 [ 0.   0.   0.   2.6]]
U =
 [[ 2.   1.  -1.   1. ]
 [ 0.  -2.5  0.5  0.5]
 [ 0.   0.   3.   0. ]
 [ 0.   0.   0.   2.6]]
[[ True  True  True  True]
 [ True  True  True  True]
 [ True  True  True  True]
 [ True  True  True  True]]


In [9]:
# compare to solution found using scipy.linalg
import scipy.linalg as la
Ppy, Lpy , Upy = la.lu(A)
print(A)
print(Ppy)
print(Lpy)
print(Upy)
print(np.dot(Ppy.T, A)== np.dot(Lpy, Upy))

[[ 1  1  0  3]
 [ 2  1 -1  1]
 [ 3 -1 -1  2]
 [-1  2  3 -1]]
[[ 0.  0.  0.  1.]
 [ 0.  0.  1.  0.]
 [ 1.  0.  0.  0.]
 [ 0.  1.  0.  0.]]
[[ 1.          0.          0.          0.        ]
 [-0.33333333  1.          0.          0.        ]
 [ 0.66666667  1.          1.          0.        ]
 [ 0.33333333  0.8         0.6         1.        ]]
[[ 3.         -1.         -1.          2.        ]
 [ 0.          1.66666667  2.66666667 -0.33333333]
 [ 0.          0.         -3.          0.        ]
 [ 0.          0.          0.          2.6       ]]
[[ True  True  True  True]
 [ True  True  True  True]
 [ True False  True  True]
 [ True  True  True  True]]
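
Two remarks may help when reading this comparison. First, `scipy.linalg.lu` returns factors satisfying $A = PLU$, which is why the check above uses `Ppy.T`. Second, its factors differ from ours, presumably because SciPy pivots on the largest absolute value in each column (note `Upy[0, 0] = 3`), whereas the code above pivots on magnitudes relative to the row scale factors. Both factorizations are valid. The sketch below repeats the check with a tolerance, so that roundoff does not produce a spurious `False`.

```python
import numpy as np
import scipy.linalg as la

# Same matrix as in the simple example
A = np.array([[1, 1, 0, 3], [2, 1, -1, 1], [3, -1, -1, 2], [-1, 2, 3, -1]],
             dtype=float)

Ppy, Lpy, Upy = la.lu(A)

# scipy convention: A = P L U, equivalently P^T A = L U
check_plu = np.allclose(A, Ppy @ Lpy @ Upy)
check_lu = np.allclose(Ppy.T @ A, Lpy @ Upy)

print('A = P.L.U   :', check_plu)
print('P^T.A = L.U :', check_lu)
```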


## 6. Part 2: Forward substitution

The equation $A\mathbf{x}=\mathbf{b}$ is first rearranged by the permutation matrix $P$, which records the row swaps (pivoting) performed during LU factorization to avoid division by zero and to limit roundoff error. Letting $\tilde{A}=PA$ and $\tilde{\mathbf{b}}=P\mathbf{b}$, we have $\tilde{A}\mathbf{x}=LU\mathbf{x}=\tilde{\mathbf{b}}$. This matrix equation can now be written as the two matrix equations

\begin{align}
  L\mathbf{y}=\tilde{\mathbf{b}} \\
  U\mathbf{x} = \mathbf{y}
\end{align}

The first of these is solved for $\mathbf{y}$ using forward substitution. This process is outlined in this section. The second equation is then solved for $\mathbf{x}$ using back substitution, which will be described in the next section.

Writing out the equation $L\mathbf{y}=\tilde{\mathbf{b}}$ gives 

\begin{align}
  \left[\begin{array}{ccccc}
    1      & 0      & 0      & \cdots & 0 \\
    l_{21} & 1      & 0      & \cdots & 0 \\
    l_{31} & l_{32} & 1      & \ddots & \vdots \\
    \vdots & \vdots & \vdots & \ddots & 0 \\
    l_{n1} & l_{n2} & l_{n3} & \cdots & 1  
  \end{array}\right]
  \left[\begin{array}{c}
    y_{1} \\
    y_{2} \\
    y_{3} \\
    \vdots \\
    y_{n} \\
  \end{array}\right]
  =
  \left[\begin{array}{c}
    \tilde{b}_{1} \\
    \tilde{b}_{2} \\
    \tilde{b}_{3} \\
    \vdots \\
    \tilde{b}_{n} \\
  \end{array}\right]
  =
  \left[\begin{array}{ccccc}
    p_{11} & p_{12} & p_{13} & \cdots & p_{1n} \\
    p_{21} & p_{22} & p_{23} & \cdots & p_{2n} \\
    p_{31} & p_{32} & p_{33} & \cdots & p_{3n} \\
    \vdots & \vdots & \vdots & \ddots & \vdots \\
    p_{n1} & p_{n2} & p_{n3} & \cdots & p_{nn}  
  \end{array}\right]
  \left[\begin{array}{c}
    b_{1} \\
    b_{2} \\
    b_{3} \\
    \vdots \\
    b_{n} \\
  \end{array}\right]
\end{align}

which yields the system of equations

\begin{align}
  \begin{array}{ccccccc}
    y_{1}       &   &             &   &             & = & \tilde{b}_{1} \\
    l_{21}y_{1} & + & y_{2}       &   &             & = & \tilde{b}_{2} \\
    l_{31}y_{1} & + & l_{32}y_{2} & + & y_{3}       & = & \tilde{b}_{3} \\
    \vdots      &   & \vdots      &   & \vdots      &   & \vdots \\
    l_{n1}y_{1} & + & \cdots      & + & y_{n}       & = & \tilde{b}_{n}. \\    
  \end{array}
\end{align}

Now that $L$ and $P$ are fully determined (from part 1 above), this system of equations is solved using forward substitution. 

The first equation yields

\begin{align}
  y_{1} = \tilde{b}_{1}.
\end{align}

The second equation yields

\begin{align}
  y_{2} = \tilde{b}_{2} - l_{21}y_{1}.
\end{align}

And so on. The general expression for the $i$th solution is 

\begin{align}
  y_{i} = \tilde{b}_{i} - \sum_{j=1}^{i-1}l_{ij}y_{j}.
\end{align}

At the end of this process, the vector $\mathbf{y}$ is fully determined.
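
The permuted right-hand side $\tilde{\mathbf{b}}=P\mathbf{b}$ used throughout this section does not require a full matrix product. Since each row of $P$ contains a single 1, the product only reorders the entries of $\mathbf{b}$. A minimal sketch, using the $P$ and $\mathbf{b}$ of the simple example; the name `perm` is an assumption introduced here, and it plays the same role as the row-pointer array `nrow` in `lu_factor`.

```python
import numpy as np

# Permutation matrix and right-hand side of the simple example
P = np.array([[0, 1, 0, 0],
              [0, 0, 1, 0],
              [0, 0, 0, 1],
              [1, 0, 0, 0]])
b = np.array([4, 1, -3, 4])

# Column index of the single 1 in each row of P
# (hypothetical helper array, equivalent to the row pointers nrow)
perm = np.argmax(P, axis=1)

btilde_matrix = P @ b    # permutation as a matrix product
btilde_index = b[perm]   # same permutation as an index lookup

print('perm          =', perm)
print('P.b           =', btilde_matrix)
print('b[perm]       =', btilde_index)
```

The index lookup costs $n$ operations instead of the $n^2$ of the matrix product, which matters only for large systems.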

## 7. Simple example, revisited (part 2: forward substitution)

Recall the linear system above

\begin{align}
  \begin{array}{lrcrcrcrcr}
    E_{1}: &  x_{1} &+&  x_{2} & &        &+& 3x_{4} &=&  4, \\
    E_{2}: & 2x_{1} &+&  x_{2} &-&  x_{3} &+&  x_{4} &=&  1, \\
    E_{3}: & 3x_{1} &-&  x_{2} &-&  x_{3} &+& 2x_{4} &=& -3, \\
    E_{4}: & -x_{1} &+& 2x_{2} &+& 3x_{3} &-&  x_{4} &=&  4,
  \end{array}
\end{align}

Previously, we factorized $\tilde{A}=PA$ into a lower-triangular matrix, $L$, and an upper-triangular matrix, $U$. We found

\begin{align}
  P = 
  \left[\begin{array}{rrrr}
    0 & 1 & 0 & 0 \\
    0 & 0 & 1 & 0 \\
    0 & 0 & 0 & 1 \\
    1 & 0 & 0 & 0
  \end{array}\right]
  \quad,\quad
  L = 
  \left[\begin{array}{rrrr}
       1 &    0 &  0  & 0 \\
     3/2 &    1 &  0  & 0 \\
    -1/2 &   -1 &  1  & 0 \\
     1/2 & -1/5 & 1/5 & 1
  \end{array}\right]
  \quad,\quad
  U = 
  \left[\begin{array}{rrrr}
    2 &    1 &  -1 &    1 \\
    0 & -5/2 & 1/2 &  1/2 \\
    0 &    0 &   3 &    0 \\
    0 &    0 &   0 & 13/5
  \end{array}\right].
\end{align}

The equation $L\mathbf{y}=\tilde{\mathbf{b}}=P\mathbf{b}$ becomes

\begin{align} 
  \left[\begin{array}{rrrr}
       1 &    0 &  0  & 0 \\
     3/2 &    1 &  0  & 0 \\
    -1/2 &   -1 &  1  & 0 \\
     1/2 & -1/5 & 1/5 & 1
  \end{array}\right]
  \left[\begin{array}{c}
    y_{1} \\
    y_{2} \\
    y_{3} \\
    y_{4}
  \end{array}\right]
  = 
  \left[\begin{array}{rrrr}
    0 & 1 & 0 & 0 \\
    0 & 0 & 1 & 0 \\
    0 & 0 & 0 & 1 \\
    1 & 0 & 0 & 0
  \end{array}\right]
  \left[\begin{array}{r}
     4 \\
     1 \\
    -3 \\
     4
  \end{array}\right]
  = 
  \left[\begin{array}{r}
     1 \\
    -3 \\
     4 \\
     4
  \end{array}\right].
\end{align}

The vector $\mathbf{y}$ is now quickly determined using forward substitution.

\begin{align}
  & y_{1} = \tilde{b}_{1}
  = 1
  \\
  & y_{2} = \tilde{b}_{2} - l_{21}y_{1} 
  = -3 - (3/2)(1) = -9/2
  \\
  & y_{3} = \tilde{b}_{3} - l_{31}y_{1} - l_{32}y_{2}
  = 4 - (-1/2)(1) - (-1)(-9/2) = 0
  \\
  & y_{4} = \tilde{b}_{4} - l_{41}y_{1} - l_{42}y_{2} - l_{43}y_{3}
  = 4 - (1/2)(1) - (-1/5)(-9/2) - (1/5)(0) = 26/10
\end{align}

And so

\begin{align}
  \mathbf{y} = 
  \left[\begin{array}{c}
  1 \\
  -9/2 \\
  0 \\
  26/10
  \end{array}\right]
\end{align}

In [10]:
# record solution for later comparison
ysoln = np.array([1, -9/2, 0, 26/10])
print('ysoln =',ysoln)

ysoln = [ 1.  -4.5  0.   2.6]
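
Before moving on, the hand solution can be checked by substituting it back into $L\mathbf{y}=\tilde{\mathbf{b}}$. The sketch below assumes the entries of $L$ as printed for `Lsoln` in part 1 and the permuted vector $\tilde{\mathbf{b}}=(1,-3,4,4)$ found above. The variables are redefined locally so that the check runs on its own.

```python
import numpy as np

# L as printed for Lsoln in part 1, and the permuted right-hand side
L_check = np.array([[ 1.0,  0.0, 0.0, 0.0],
                    [ 1.5,  1.0, 0.0, 0.0],
                    [-0.5, -1.0, 1.0, 0.0],
                    [ 0.5, -0.2, 0.2, 1.0]])
btilde_check = np.array([1.0, -3.0, 4.0, 4.0])

# Hand solution from forward substitution
y_check = np.array([1, -9/2, 0, 26/10])

# Residual of L.y = btilde; it should vanish up to roundoff
residual = L_check @ y_check - btilde_check
print('residual =', residual)
print('solution verified:', np.allclose(residual, 0))
```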


## 8. Pseudocode: Forward substitution


**INPUT**
- $n$, number of equations
- $L$, an $n\times n$ lower-triangular array
- $\tilde{\mathbf{b}}$, a 1d vector of length $n$

**Validate inputs**
- if $L$ and $\tilde{\mathbf{b}}$ do not have the correct dimensions, shape, or size, stop.
- if $L$ is not lower-triangular, stop.

**Initialize output array**
- create array $\mathbf{y}$ of length $n$

**Forward substitution loop**
- loop over all rows, from $i=1$ to $i=n$
  - calculate $y_{i} = \tilde{b}_{i} - \sum_{j=1}^{i-1}l_{ij}y_{j}$

**OUTPUT**
- solution array $\mathbf{y}$, or failure message

## 9. CODE: Forward substitution

In [11]:
%%writefile forward_sub.py
import numpy as np

# forward substitution
def forward_sub(n, L, b):

    # check that input matrix has the correct dimensions, shape, and size
    if L.ndim != 2:
        print('ERROR: input array must have ndim=2. Stopping.')
        return
    if L.shape[0] != L.shape[1]:
        print('ERROR: input array must have shape=(n,n). Stopping.')
        return
    if L.size != n**2:
        print('ERROR: input array must have size=nxn. Stopping.')
        return

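    # check that input matrix is lower-triangular
    if not np.allclose(L, np.tril(L)):
        print('ERROR: input array must be lower-triangular. Stopping.')
        return
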
    
    # check that input vector has the correct dimensions, shape, and size
    if b.ndim != 1:
        print('ERROR: input vector must have ndim=1. Stopping.')
        return
    if b.shape[0] != n:
        print('ERROR: input vector must have shape=(n,). Stopping.')
        return
    if b.size != n:
        print('ERROR: input vector must have size=n. Stopping.')
        return

    # create output array
    y = np.zeros(n)
    
    # forward substitution loop
    for i in range (0, n): #loop over rows
        
        # calculate the sum using previously found solutions
        ysum = 0 #reset sum to zero
        for j in range (0, i): #loop over columns to the left of the diagonal
            ysum = ysum + L[i,j]*y[j]
        
        # calculate the unknown variable in the current row
        y[i] = (b[i] - ysum)/L[i,i]
        
    return y

Overwriting forward_sub.py


In [12]:
%run forward_sub.py

In [20]:
# apply code to simple example above
b = np.array([4, 1, -3, 4])
btilde = np.dot(P,b)
y = forward_sub(n, L, btilde)
print('y =',y)

y = [ 1.  -4.5  0.   2.6]


In [21]:
# compare to solution found above, by hand
print('ysoln =', ysoln)

ysoln = [ 1.  -4.5  0.   2.6]


In [22]:
# compare to solution found using scipy.linalg
import scipy.linalg as la
la.solve(L, btilde)

array([  1.00000000e+00,  -4.50000000e+00,  -8.88178420e-16,
         2.60000000e+00])
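
Note that `la.solve` treats $L$ as a general matrix, which probably explains the roundoff-level value `-8.88e-16` in place of the exact zero. A closer counterpart to forward substitution is `scipy.linalg.solve_triangular` with `lower=True`, which exploits the triangular structure. A minimal sketch, with the example's `L` and `btilde` redefined locally so that it runs on its own:

```python
import numpy as np
from scipy.linalg import solve_triangular

# L and the permuted right-hand side of the simple example
L_ex = np.array([[ 1.0,  0.0, 0.0, 0.0],
                 [ 1.5,  1.0, 0.0, 0.0],
                 [-0.5, -1.0, 1.0, 0.0],
                 [ 0.5, -0.2, 0.2, 1.0]])
btilde_ex = np.array([1.0, -3.0, 4.0, 4.0])

# lower=True tells the solver that only the lower triangle is used
y_tri = solve_triangular(L_ex, btilde_ex, lower=True)
print('y =', y_tri)
```

Since the diagonal of $L$ is all ones here, passing `unit_diagonal=True` as well would skip the divisions by $l_{ii}$.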

## 10. Part 3: Back substitution

The equation $U\mathbf{x}=\mathbf{y}$ is now solved using back substitution.  

\begin{align}
  \left[\begin{array}{ccccc}
    u_{11} & u_{12} & u_{13} & \cdots & u_{1n} \\
    0 & u_{22} & u_{23} & \cdots & u_{2n} \\
    0 & 0 & u_{33} & \cdots & u_{3n} \\
    \vdots & \vdots & \ddots & \ddots & \vdots \\
    0 & 0 & \cdots & 0 & u_{nn}  
  \end{array}\right]
  \left[\begin{array}{c}
    x_{1} \\
    x_{2} \\
    x_{3} \\
    \vdots \\
    x_{n} \\
  \end{array}\right]
  =
  \left[\begin{array}{c}
    y_{1} \\
    y_{2} \\
    y_{3} \\
    \vdots \\
    y_{n} \\
  \end{array}\right]
\end{align}

Written out row by row, starting from the last row, this matrix equation yields the system of equations

\begin{align}
  \begin{array}{lllllll}
    u_{nn}x_{n}        &   &                    &   &                & = & y_{n} \\
    u_{n-1,n-1}x_{n-1} & + & u_{n-1,n}x_{n}     &   &                & = & y_{n-1} \\
    u_{n-2,n-2}x_{n-2} & + & u_{n-2,n-1}x_{n-1} & + & u_{n-2,n}x_{n} & = & y_{n-2} \\
    \vdots             &   & \vdots             &   & \vdots         &   & \vdots \\
    u_{11}x_{1}        & + & \cdots             & + & u_{1n}x_{n}    & = & y_{1} \\
  \end{array}
\end{align}

Now that $U$ and $\mathbf{y}$ are both fully determined (from parts 1 and 2 above, respectively), this system of equations is solved using back substitution. 

The first equation (corresponding to the last row) yields

\begin{align}
  x_{n} = \frac{y_{n}}{u_{nn}}.
\end{align}

The second equation yields

\begin{align}
  x_{n-1} = \frac{y_{n-1} - u_{n-1,n}x_{n}}{u_{n-1,n-1}}.
\end{align}

And so on. The general expression for the $i$th solution is 

\begin{align}
  x_{i} = \frac{y_{i} - \sum_{j=i+1}^{n}u_{ij}x_{j}}{u_{ii}}.
\end{align}

At the end of this process, the vector $\mathbf{x}$ is fully determined, which is the solution of the original system of equations.
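
The general expression above translates almost directly into code. The following is a minimal sketch only: the function name `back_sub_sketch` is hypothetical, it omits input validation, and it is not the `back_sub` routine from the Gaussian elimination notebook that is used later in this notebook. The test data are the $U$ and $\mathbf{y}$ of the simple example.

```python
import numpy as np

def back_sub_sketch(U, y):
    """Solve U x = y for upper-triangular U (illustrative sketch, no validation)."""
    n = len(y)
    x = np.zeros(n)
    # loop over rows from the last one up to the first
    for i in range(n - 1, -1, -1):
        # sum over the already-known solutions x[i+1], ..., x[n-1]
        usum = np.dot(U[i, i+1:], x[i+1:])
        x[i] = (y[i] - usum) / U[i, i]
    return x

# U and y of the simple example
U_ex = np.array([[2.0,  1.0, -1.0, 1.0],
                 [0.0, -2.5,  0.5, 0.5],
                 [0.0,  0.0,  3.0, 0.0],
                 [0.0,  0.0,  0.0, 2.6]])
y_ex = np.array([1.0, -4.5, 0.0, 2.6])

x_ex = back_sub_sketch(U_ex, y_ex)
print('x =', x_ex)
```

For the last row the slice `U[i, i+1:]` is empty, so the sum is zero and the first step reduces to $x_{n}=y_{n}/u_{nn}$, as in the derivation.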

## 11. Simple example, revisited (part 3: back substitution)

Once again recall the linear system

\begin{align}
  \begin{array}{lrcrcrcrcr}
    E_{1}: &  x_{1} &+&  x_{2} & &        &+& 3x_{4} &=&  4, \\
    E_{2}: & 2x_{1} &+&  x_{2} &-&  x_{3} &+&  x_{4} &=&  1, \\
    E_{3}: & 3x_{1} &-&  x_{2} &-&  x_{3} &+& 2x_{4} &=& -3, \\
    E_{4}: & -x_{1} &+& 2x_{2} &+& 3x_{3} &-&  x_{4} &=&  4,
  \end{array}
\end{align}

In part 1, we factorized $\tilde{A}=PA$ into a lower-triangular matrix, $L$, and an upper-triangular matrix, $U$. We found

\begin{align}
  P = 
  \left[\begin{array}{rrrr}
    0 & 1 & 0 & 0 \\
    0 & 0 & 1 & 0 \\
    0 & 0 & 0 & 1 \\
    1 & 0 & 0 & 0
  \end{array}\right]
  \quad,\quad
  L = 
  \left[\begin{array}{rrrr}
       1 &    0 &  0  & 0 \\
     3/2 &    1 &  0  & 0 \\
    -1/2 &   -1 &  1  & 0 \\
     1/2 & -1/5 & 1/5 & 1
  \end{array}\right]
  \quad,\quad
  U = 
  \left[\begin{array}{rrrr}
    2 &    1 &  -1 &    1 \\
    0 & -5/2 & 1/2 &  1/2 \\
    0 &    0 &   3 &    0 \\
    0 &    0 &   0 & 13/5
  \end{array}\right].
\end{align}

In part 2, we solved $L\mathbf{y}=\tilde{\mathbf{b}}=P\mathbf{b}$ for $\mathbf{y}$, which yielded

\begin{align}
  \mathbf{y} = 
  \left[\begin{array}{c}
  1 \\
  -9/2 \\
  0 \\
  26/10
  \end{array}\right].
\end{align}

The equation $U\mathbf{x}=\mathbf{y}$ is then

\begin{align}
  \left[\begin{array}{rrrr}
    2 &    1 &  -1 &    1 \\
    0 & -5/2 & 1/2 &  1/2 \\
    0 &    0 &   3 &    0 \\
    0 &    0 &   0 & 13/5
  \end{array}\right]
  \left[\begin{array}{c}
    x_{1} \\
    x_{2} \\
    x_{3} \\
    x_{4} \\
  \end{array}\right]  
  =
  \left[\begin{array}{c}
  1 \\
  -9/2 \\
  0 \\
  26/10
  \end{array}\right].
\end{align}

The vector $\mathbf{x}$ is now quickly determined using back substitution.

\begin{align}
  & x_{4} = \frac{y_{4}}{u_{44}} 
  = \frac{26/10}{13/5} = 1
  \\
  & x_{3} = \frac{y_{3} - u_{34}x_{4}}{u_{33}} 
  = \frac{0 - (0)(1)}{3} = 0
  \\
  & x_{2} = \frac{y_{2} - u_{23}x_{3} - u_{24}x_{4}}{u_{22}} 
  = \frac{(-9/2) - (1/2)(0) - (1/2)(1)}{-5/2} = 2
  \\
  & x_{1} = \frac{y_{1} - u_{12}x_{2} - u_{13}x_{3} - u_{14}x_{4}}{u_{11}} 
  = \frac{1 - (1)(2) - (-1)(0) - (1)(1)}{2} = -1
\end{align}

And so

\begin{align}
  \mathbf{x} = 
  \left[\begin{array}{r}
  -1 \\
   2 \\
   0 \\
   1
  \end{array}\right]
\end{align}

In [25]:
# record solution for later comparison
xsoln = np.array([-1, 2, 0, 1])
print('xsoln =', xsoln)

xsoln = [-1  2  0  1]
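
Since $\mathbf{x}$ is supposed to solve the original system, the hand result can be checked directly against $A\mathbf{x}=\mathbf{b}$, without reference to $P$, $L$, or $U$. A short sketch, with the variables redefined locally so that it runs on its own:

```python
import numpy as np

# Original system of the simple example
A_ex = np.array([[1, 1, 0, 3], [2, 1, -1, 1], [3, -1, -1, 2], [-1, 2, 3, -1]])
b_ex = np.array([4, 1, -3, 4])

# Hand solution from back substitution
x_hand = np.array([-1, 2, 0, 1])

# All entries are integers, so an exact comparison is safe here
lhs = A_ex @ x_hand
print('A.x =', lhs)
print('b   =', b_ex)
print('solution verified:', np.array_equal(lhs, b_ex))
```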


## 12. Pseudocode: Back substitution

(See notebook on Gaussian elimination)

## 13. CODE: Back substitution

In [27]:
#%pycat back_sub.py

In [28]:
%run back_sub.py

In [29]:
# apply code to simple example above
x = back_sub(n, U, y)
print('x =', x)

x = [-1.  2.  0.  1.]


In [30]:
# compare to solution found above, by hand
print('xsoln =', xsoln)

xsoln = [-1  2  0  1]


In [31]:
# compare to solution found using scipy.linalg
import scipy.linalg as la
la.solve(U, y)

array([-1.,  2.,  0.,  1.])
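
For comparison, SciPy offers the same three-part workflow (factorize once, then substitute) through `scipy.linalg.lu_factor` and `scipy.linalg.lu_solve`. The sketch below applies them to the simple example. Note that SciPy's `lu_factor` takes only the matrix and returns a packed LU array plus pivot indices, unlike the `lu_factor(n, A)` defined in this notebook; calling it through the `la.` prefix avoids shadowing the notebook's function.

```python
import numpy as np
import scipy.linalg as la

# Original system of the simple example
A_full = np.array([[1, 1, 0, 3], [2, 1, -1, 1], [3, -1, -1, 2], [-1, 2, 3, -1]],
                  dtype=float)
b_full = np.array([4, 1, -3, 4], dtype=float)

# Part 1: factorize once (L and U are stored together in one array)
lu_packed, piv = la.lu_factor(A_full)

# Parts 2 and 3: forward and back substitution for a given right-hand side
x_full = la.lu_solve((lu_packed, piv), b_full)
print('x =', x_full)
```

The benefit of keeping the factorization separate shows up when the same $A$ must be solved for several right-hand sides: only the inexpensive substitution step is repeated.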