# LU factorization

## Basic form

Our goal is to factor $A=LU$, where $L$ is unit lower triangular and $U$ is upper triangular. As the book explains, expressing Gaussian elimination using linear algebra leads to an algorithm. We will derive it differently, using the outer product form of $LU$, similarly to how we found modified Gram-Schmidt. 

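Let $\ell_k$ denote column $k$ of $L$ and $u_k^*$ denote row $k$ of $U$, so that $LU=\sum_{k=1}^m \ell_k u_k^*$. Define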

$$A_j = \sum_{k=j}^m \ell_k u_k^*, \qquad j=1,\ldots,m.$$

Note that $A_1=A$. Let's step through a small example. 

In [1]:
A = randi(10,4,4)
A1 = A


A =

     9     7    10    10
    10     1    10     5
     2     3     2     9
    10     6    10     2


A1 =

     9     7    10    10
    10     1    10     5
     2     3     2     9
    10     6    10     2



Let's look at the first row of $A_1$. We can express this algebraically as 

$$e_1^* A_1 = \sum_{k=1}^m (e_1^* \ell_k) u_k^* = u_1^*,$$

thanks to the structure of $L$: because $L$ is unit lower triangular, $e_1^*\ell_k = L_{1k}$ is 1 for $k=1$ and 0 for $k>1$. From this identity we get the first row of $U$. Similarly, if we look at the first column of $A_1$, we find

$$A_1 e_1 = \sum_{k=1}^m \ell_k (u_k^*e_1) = U_{11}\ell_1,$$

because $U$ is upper triangular, so $u_k^*e_1 = U_{k1} = 0$ for $k>1$. Dividing by $U_{11}$ gives us the first column of $L$.

In [2]:
U = zeros(4,4);
L = zeros(4,4);

U(1,:) = A1(1,:)
L(:,1) = A1(:,1)/U(1,1)


U =

     9     7    10    10
     0     0     0     0
     0     0     0     0
     0     0     0     0


L =

    1.0000         0         0         0
    1.1111         0         0         0
    0.2222         0         0         0
    1.1111         0         0         0



Now we can construct $A_2=A_1-\ell_1u_1^*$. Note that 

$$e_1^* A_2 = e_1^*A_1 - u_1^* = 0,$$ 

and 

$$A_2e_1 = A_1e_1 - \ell_1 U_{11} = 0.$$

In other words, the first rank-one term $\ell_1 u_1^*$ captures and cancels out the first row and column of the original $A$. 

In [3]:
A2 = A1 - L(:,1)*U(1,:)


A2 =

         0         0         0         0
         0   -6.7778   -1.1111   -6.1111
         0    1.4444   -0.2222    6.7778
         0   -1.7778   -1.1111   -9.1111



Now we move on to take out the second column and row. Observe that 

$$ e_2^*A_2 = \sum_{k=2}^m (e_2^* \ell_k) u_k^* = u_2^*,$$

and 

$$A_2 e_2 = \sum_{k=2}^m \ell_k (u_k^*e_2) = U_{22}\ell_2.$$

In [4]:
U(2,:) = A2(2,:)
L(:,2) = A2(:,2)/U(2,2)


U =

    9.0000    7.0000   10.0000   10.0000
         0   -6.7778   -1.1111   -6.1111
         0         0         0         0
         0         0         0         0


L =

    1.0000         0         0         0
    1.1111    1.0000         0         0
    0.2222   -0.2131         0         0
    1.1111    0.2623         0         0



You may have guessed that this will capture the second row and column, so that we can zero those out too.

In [5]:
A3 = A2 - L(:,2)*U(2,:)


A3 =

         0         0         0         0
         0         0         0         0
         0         0   -0.4590    5.4754
         0   -0.0000   -0.8197   -7.5082



And so on.

In [6]:
U(3,:) = A3(3,:)
L(:,3) = A3(:,3)/U(3,3)
A4 = A3 - L(:,3)*U(3,:)

U(4,:) = A4(4,:)
L(:,4) = A4(:,4)/U(4,4)


U =

    9.0000    7.0000   10.0000   10.0000
         0   -6.7778   -1.1111   -6.1111
         0         0   -0.4590    5.4754
         0         0         0         0


L =

    1.0000         0         0         0
    1.1111    1.0000         0         0
    0.2222   -0.2131    1.0000         0
    1.1111    0.2623    1.7857         0


A4 =

         0         0         0         0
         0         0         0         0
         0         0         0         0
         0   -0.0000         0  -17.2857


U =

    9.0000    7.0000   10.0000   10.0000
         0   -6.7778   -1.1111   -6.1111
         0         0   -0.4590    5.4754
         0   -0.0000         0  -17.2857


L =

    1.0000         0         0         0
    1.1111    1.0000         0         0
    0.2222   -0.2131    1.0000         0
    1.1111    0.2623    1.7857    1.0000



This is the whole factorization:

In [7]:
A-L*U


ans =

   1.0e-14 *

         0         0         0         0
         0         0         0         0
         0         0         0         0
         0         0   -0.1776         0
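
Collected into a loop, the manual steps above look like the following sketch. It is not yet a usable algorithm, since it divides by `U(k,k)` without checking that it is nonzero. The names `Ak` and `residual` are illustrative choices, and the matrix is the example from above typed in by hand.

```matlab
% Sketch: outer-product LU without pivoting (assumes every U(k,k) is nonzero).
A = [9 7 10 10; 10 1 10 5; 2 3 2 9; 10 6 10 2];   % example matrix from above
m = size(A,1);
L = zeros(m);  U = zeros(m);
Ak = A;                              % plays the role of A_k
for k = 1:m
    U(k,:) = Ak(k,:);                % row k of A_k is u_k^*
    L(:,k) = Ak(:,k) / U(k,k);       % column k of A_k is U_kk * ell_k
    Ak = Ak - L(:,k)*U(k,:);         % A_{k+1} = A_k - ell_k u_k^*
end
residual = norm(A - L*U)             % should be at roundoff level
```

As in the hand computation, roundoff can leave tiny nonzero values below the diagonal of `U`.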



Before we can write this as an algorithm, though, we need to address an important possible failure point.

## Pivoting

Above, we iteratively divided by the entries $U_{11},U_{22},\ldots$ as we found them. What if one of these were zero? We can easily find an example where this happens almost immediately.

In [8]:
A(1,1) = 0


A =

     0     7    10    10
    10     1    10     5
     2     3     2     9
    10     6    10     2



In [9]:
U = zeros(4,4);
L = zeros(4,4);

A1 = A;
U(1,:) = A1(1,:)
L(:,1) = A1(:,1)/U(1,1)


U =

     0     7    10    10
     0     0     0     0
     0     0     0     0
     0     0     0     0


L =

   NaN     0     0     0
   Inf     0     0     0
   Inf     0     0     0
   Inf     0     0     0



You might be thinking that maybe $A$ is singular, so we're off the hook. But that is not the case, since none of its singular values is zero.

In [10]:
svd(A)


ans =

   25.4535
   11.2648
    5.2987
    3.1857



However, there's a part of standard Gaussian elimination we have not yet used: swapping rows of the matrix. In a linear system of equations, swapping two rows together with the corresponding entries of the right-hand side leaves the solution unchanged. By swapping rows to put a nonzero in the "pivot" location $(k,k)$ of $A_k$, the algorithm may continue. 

We'll look at this a little differently. Rather than trying to zero out the first row and first column with $\ell_1 u_1^*$, we will choose a different row, which we denote by $i_1$. We will also change the old structural requirements 

$$L_{11} = 1, \quad L_{12}=L_{13}=\cdots=L_{1m}=0,$$ 

to hold for row $i_1$ rather than row 1. So now 

$$e_{i_1}^* A_1 = \sum_{k=1}^m (e_{i_1}^* \ell_k) u_k^* = u_1^*,$$

which as before gives a way to extract $u_1^*$. But now $U_{11}$ is the $(i_1,1)$ element of $A$. If we can't find an $i_1$ such that this is nonzero, then the entire first column of $A$ is zero, and this *would* imply that $A$ is singular. Otherwise, we have $A_1 e_1=U_{11}\ell_1$ exactly as before, and we know that we can compute $\ell_1$. 

This is a lot less daunting than the formalism makes it sound. First, we use the element of largest absolute value in column 1 to select $i_1$ (more on this later). 

In [11]:
i = zeros(4,1);
[~,i(1)] = max(abs(A1(:,1)))


i =

     2
     0
     0
     0



So we are targeting row 2 and column 1 to zero out. 

In [12]:
U(1,:) = A1(i(1),:)
L(:,1) = A1(:,1)/U(1,1)


U =

    10     1    10     5
     0     0     0     0
     0     0     0     0
     0     0     0     0


L =

         0         0         0         0
    1.0000         0         0         0
    0.2000         0         0         0
    1.0000         0         0         0



In [13]:
A2 = A1 - L(:,1)*U(1,:)


A2 =

         0    7.0000   10.0000   10.0000
         0         0         0         0
         0    2.8000         0    8.0000
         0    5.0000         0   -3.0000



Now we select a new row $i_2$ with a nonzero pivot in column 2.

In [14]:
[~,i(2)] = max(abs(A2(:,2)))


i =

     2
     1
     0
     0



Now we want $e_{i_2}^*A_2=u_2^*$. This happens if we require

$$L_{i_2,2} = 1, \quad L_{i_2,3}=\cdots=L_{i_2,m}=0.$$ 

(Note that $L_{i_2,1}$ was previously determined.) 

In [15]:
U(2,:) = A2(i(2),:)
L(:,2) = A2(:,2)/U(2,2)


U =

    10     1    10     5
     0     7    10    10
     0     0     0     0
     0     0     0     0


L =

         0    1.0000         0         0
    1.0000         0         0         0
    0.2000    0.4000         0         0
    1.0000    0.7143         0         0



In [16]:
A3 = A2 - L(:,2)*U(2,:)


A3 =

         0         0         0         0
         0         0         0         0
         0         0   -4.0000    4.0000
         0         0   -7.1429  -10.1429



By now the pattern is clear. 

In [17]:
[~,i(3)] = max(abs(A3(:,3)))
U(3,:) = A3(i(3),:)
L(:,3) = A3(:,3)/U(3,3)
A4 = A3 - L(:,3)*U(3,:)


i =

     2
     1
     4
     0


U =

   10.0000    1.0000   10.0000    5.0000
         0    7.0000   10.0000   10.0000
         0         0   -7.1429  -10.1429
         0         0         0         0


L =

         0    1.0000         0         0
    1.0000         0         0         0
    0.2000    0.4000    0.5600         0
    1.0000    0.7143    1.0000         0


A4 =

         0         0         0         0
         0         0         0         0
         0         0    0.0000    9.6800
         0         0         0         0



In [18]:
[~,i(4)] = max(abs(A4(:,4)))
U(4,:) = A4(i(4),:)
L(:,4) = A4(:,4)/U(4,4)


i =

     2
     1
     4
     3


U =

   10.0000    1.0000   10.0000    5.0000
         0    7.0000   10.0000   10.0000
         0         0   -7.1429  -10.1429
         0         0    0.0000    9.6800


L =

         0    1.0000         0         0
    1.0000         0         0         0
    0.2000    0.4000    0.5600    1.0000
    1.0000    0.7143    1.0000         0



Indeed, we did again get a factorization of $A$.

In [19]:
norm(A-L*U)


ans =

   8.8818e-16



But what sort of factorization is it? 

Just as before, $U$ is upper triangular. But $L$ is not triangular. However, think about the structural conditions imposed during the algorithm:

$$L_{i_1,1} = 1, \quad L_{i_1,2}=\cdots=L_{i_1,m}=0,$$ 

$$L_{i_2,2} = 1, \quad L_{i_2,3}=\cdots=L_{i_2,m}=0,$$ 

down to $L_{i_m,m}=1$ in the last step. What this means is that if we take the rows of $L$ in the order $i_1,i_2,i_3,i_4$, then the result is again unit lower triangular! 

In [20]:
L(i,:)


ans =

    1.0000         0         0         0
         0    1.0000         0         0
    1.0000    0.7143    1.0000         0
    0.2000    0.4000    0.5600    1.0000



We can express this result using a permutation matrix as $PL$. Conventionally though, this truly triangular matrix is the one we call $L$, and the one produced directly by the algorithm is $P^{-1}L=P^TL$. Since $A=P^TLU$, this implies that $PA=LU$. This is the **row-pivoted LU factorization** (or partially pivoted factorization, or $P^TLU$ factorization). 
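
The pivoted steps can likewise be collected into a loop. The following is a minimal sketch that assumes $A$ is square and nonsingular; the names `Ak`, `P`, `Ltri`, and `residual` are illustrative choices, and the matrix is the modified example typed in by hand.

```matlab
% Sketch: outer-product LU with row pivoting (assumes A is nonsingular).
A = [0 7 10 10; 10 1 10 5; 2 3 2 9; 10 6 10 2];   % modified example matrix
m = size(A,1);
L = zeros(m);  U = zeros(m);  i = zeros(m,1);
Ak = A;                              % plays the role of A_k
for k = 1:m
    [~,i(k)] = max(abs(Ak(:,k)));    % pivot row: largest magnitude in column k
    U(k,:) = Ak(i(k),:);             % row i_k of A_k is u_k^*
    L(:,k) = Ak(:,k) / U(k,k);       % column k of A_k is U_kk * ell_k
    Ak = Ak - L(:,k)*U(k,:);         % cancels row i_k and column k
end
I = eye(m);
P = I(i,:);                          % permutation matrix: P*A equals A(i,:)
Ltri = L(i,:);                       % the conventional unit lower triangular factor
residual = norm(P*A - Ltri*U)        % should be at roundoff level
```

Rows that have already served as pivots are zero in `Ak`, so for a nonsingular matrix `max` never selects the same row twice.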

## Linear systems

The system $Ax=b$ is equivalent to $PAx=Pb$, or $L(Ux)=Pb$. We do a forward substitution using a permuted form of $b$, then a backward substitution using that result. (In practice we wouldn't move data around in memory, but just index the vector indirectly in the correct order.) 

In [21]:
xact = ones(4,1);  b = A*xact;

Pb = b(i); 
x = U\(L(i,:)\Pb)


x =

    1.0000
    1.0000
    1.0000
    1.0000
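
The indirect indexing mentioned above can be sketched as explicit substitution loops. This is only an illustration: it obtains the factors from the built-in `lu` with the `'vector'` option and scatters the rows back to imitate the row-permuted `L` from the walkthrough. The names `Ltri`, `p`, `z`, and `r` are assumptions made for this example.

```matlab
% Sketch: solve A*x = b from a row-permuted L without physically reordering it.
A = [0 7 10 10; 10 1 10 5; 2 3 2 9; 10 6 10 2];
m = size(A,1);
b = A*ones(m,1);                         % exact solution is all ones
[Ltri,U,p] = lu(A,'vector');             % built-in: A(p,:) = Ltri*U
L = zeros(m);  L(p,:) = Ltri;            % row-permuted L, as in the walkthrough

z = zeros(m,1);
for k = 1:m                              % forward substitution on L(p,:)*z = b(p)
    r = p(k);                            % row to read; no data is moved
    z(k) = b(r) - L(r,1:k-1)*z(1:k-1);   % the pivot entry L(r,k) is 1
end
x = zeros(m,1);
for k = m:-1:1                           % backward substitution on U*x = z
    x(k) = (z(k) - U(k,k+1:m)*x(k+1:m)) / U(k,k);
end
x
```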



The built-in `lu` returns the factorization in two different ways. With two outputs, we get the "psychologically lower triangular" matrix $P^TL$, plus $U$. 

In [22]:
[L,U] = lu(A)


L =

         0    1.0000         0         0
    1.0000         0         0         0
    0.2000    0.4000    0.5600    1.0000
    1.0000    0.7143    1.0000         0


U =

   10.0000    1.0000   10.0000    5.0000
         0    7.0000   10.0000   10.0000
         0         0   -7.1429  -10.1429
         0         0         0    9.6800



With a third output, we get the purely triangular $L$ and the permutation matrix $P$ (stored in `p` below). 

In [23]:
[L,U,p] = lu(A)


L =

    1.0000         0         0         0
         0    1.0000         0         0
    1.0000    0.7143    1.0000         0
    0.2000    0.4000    0.5600    1.0000


U =

   10.0000    1.0000   10.0000    5.0000
         0    7.0000   10.0000   10.0000
         0         0   -7.1429  -10.1429
         0         0         0    9.6800


p =

     0     1     0     0
     1     0     0     0
     0     0     0     1
     0     0     1     0
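
As a quick check of how these output forms relate, the following sketch compares them on the same matrix. The variable names are illustrative, and the `'vector'` option asks `lu` to return the permutation as an index vector rather than a matrix.

```matlab
% Sketch: relate the output forms of the built-in lu.
A = [0 7 10 10; 10 1 10 5; 2 3 2 9; 10 6 10 2];
[L2,U2] = lu(A);                 % L2 is the row-permuted factor P'*L
[L3,U3,P] = lu(A);               % L3 is unit lower triangular, P*A = L3*U3
[~,~,pv] = lu(A,'vector');       % permutation as a vector, A(pv,:) = L3*U3

err_matrix = norm(P*A - L3*U3)   % all three should be at roundoff level
err_twoout = norm(L2 - P'*L3)
err_vector = norm(A(pv,:) - L3*U3)
```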

