In [1]:
using DrWatson;
@quickactivate "MATH361Lectures"
using LinearAlgebra;
import MATH361Lectures;

# LU Factorization 

Consider a general square linear system $Ax=b$, where $A$ is an $n\times n$ matrix. In these notes we will finally show that it is often possible to factorize $A$ as $A=LU$ with $L$ lower triangular and $U$ upper triangular. As we previously pointed out, this so-called **LU factorization** of $A$ provides a method for solving $Ax=b$, since we can split $Ax=LUx = b$ into two subsystems as follows:

1. $Ly=b$ which has solution $y=L^{-1}b$ (in practice obtained via forward substitution), and
  
2. $Ux=y$ which has solution $x=U^{-1}y=U^{-1}L^{-1}b$ (in practice obtained via backward substitution).
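
As a minimal sketch of this two-stage solve, the snippet below takes the factors of a small $3\times 3$ matrix as given (they are entered by hand here purely for illustration) and relies on the `LowerTriangular` and `UpperTriangular` wrappers from Julia's `LinearAlgebra` standard library, so that each backslash is a triangular solve rather than a general one.

```julia
using LinearAlgebra

# Factors of a small example matrix, entered by hand for illustration.
L = [1.0 0.0 0.0; -3.0 1.0 0.0; 1.0 -0.5 1.0]   # unit lower triangular
U = [-1.0 2.0 1.0; 0.0 4.0 5.0; 0.0 0.0 2.5]    # upper triangular
A = L * U
b = [-1.0, 1.0, 2.0]

y = LowerTriangular(L) \ b   # stage 1: solve Ly = b
x = UpperTriangular(U) \ y   # stage 2: solve Ux = y

println(norm(A * x - b))     # residual of the original system Ax = b
```

Note that neither $L^{-1}$ nor $U^{-1}$ is ever formed explicitly; only two triangular systems are solved.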

Furthermore, we will see that $LU$ factorization arises from, and is in fact equivalent to, Gaussian elimination. It is suggested that you watch the lecture video on [LU factorization](https://www.youtube.com/watch?v=aFbjNVZNYYk&list=PLvUvOH0OYx3BcZivtXMIwP6hKoYv0YvGn&index=8) to supplement this lecture.

## The Algebra of Gaussian Elimination

By now you should be well aware of the fact that Gaussian elimination involves using the three elementary row operations to row reduce a matrix to [row echelon form](https://en.wikipedia.org/wiki/Row_echelon_form). Applying row reduction to an augmented matrix is one way to solve a system of linear equations. What might be less familiar is the fact that the elementary row operations may be carried out via matrix multiplication. 

Specifically, 

1. Multiplying a matrix $A$ on the left by a matrix $M$ (*i.e.*, forming $MA$) where $M$ is the identity matrix but with $M_{ii}=\alpha$ has the result of multiplying row $i$ of $A$ by $\alpha$.

2. Multiplying a matrix $A$ on the left by a matrix $P$ (*i.e.*, forming $PA$) where $P$ is obtained by interchanging rows $i$ and $j$ of the identity matrix has the result of swapping rows $i$ and $j$ of $A$. (We note that such a matrix $P$ is called a permutation matrix. A permutation matrix has exactly one nonzero entry, equal to 1, in each row and each column. The inverse of a permutation matrix is its transpose.)

3. Multiplying $A$ on the left by a matrix $I+\alpha e_{j}e_{i}^{T}$ (*i.e.*, forming $(I+\alpha e_{j}e_{i}^{T})A$), where $e_{k}$ denotes the $k$-th column of the identity matrix, has the result of adding $\alpha$ times row $i$ of $A$ to row $j$ of $A$. We will show that the inverse of $I+\alpha e_{j}e_{i}^{T}$ is $I-\alpha e_{j}e_{i}^{T}$. Observe that each of the matrices $I+\alpha e_{j}e_{i}^{T}$ will have each entry of the main diagonal equal to 1. 

We will derive the results stated in 3., while you will be asked to verify the results stated in 1. and 2. in the homework exercises. 

Let's illustrate statements 1. and 2. using Julia functions stored in the file [`MATH361Lectures.jl`](https://github.com/jmgraham30/MATH361Lectures/blob/master/src/MATH361Lectures.jl). 

In [2]:
A = [-1.0 2.0 -5.0;3.0 -2.0 2.0;-1.0 4.0 1.0]

3×3 Matrix{Float64}:
 -1.0   2.0  -5.0
  3.0  -2.0   2.0
 -1.0   4.0   1.0

Suppose that we want to multiply row 2 of $A$ by $-2$. Then we do as follows:

In [3]:
M = MATH361Lectures.rowmultmat(2,-2.0,3) # construct an appropriate M

3×3 Matrix{Float64}:
 1.0   0.0  0.0
 0.0  -2.0  0.0
 0.0   0.0  1.0

In [4]:
M*A # compute M times A

3×3 Matrix{Float64}:
 -1.0  2.0  -5.0
 -6.0  4.0  -4.0
 -1.0  4.0   1.0

Now suppose that we want to swap the first and third rows of $A$. 

In [5]:
P = MATH361Lectures.rowswapmat(3,1,3) # construct an appropriate P

3×3 Matrix{Float64}:
 0.0  0.0  1.0
 0.0  1.0  0.0
 1.0  0.0  0.0

In [6]:
P*A # compute P times A

3×3 Matrix{Float64}:
 -1.0   4.0   1.0
  3.0  -2.0   2.0
 -1.0   2.0  -5.0

The matrix $P$ is an example of a so-called **permutation matrix**. We will discuss such matrices in greater detail later. For now, note that an important characteristic of permutation matrices is that the transpose of a permutation matrix is its inverse. Let's illustrate this inverse property for our permutation matrix $P$. 

In [7]:
P*P'

3×3 Matrix{Float64}:
 1.0  0.0  0.0
 0.0  1.0  0.0
 0.0  0.0  1.0

In [8]:
P'*P

3×3 Matrix{Float64}:
 1.0  0.0  0.0
 0.0  1.0  0.0
 0.0  0.0  1.0
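
The source of `rowmultmat` and `rowswapmat` is not shown in these notes. As a rough sketch of how matrices like $M$ and $P$ could be built, the following uses two hypothetical helpers, `scalerow_sketch` and `swaprows_sketch`; the names and argument order are assumptions made for illustration and need not match the course package.

```julia
using LinearAlgebra

# Hypothetical stand-ins for the package helpers; names and signatures are assumptions.
function scalerow_sketch(i, α, n)
    M = Matrix{Float64}(I, n, n)  # start from the n by n identity
    M[i, i] = α                   # left multiplication then scales row i by α
    return M
end

function swaprows_sketch(i, j, n)
    P = Matrix{Float64}(I, n, n)
    P[[i, j], :] = P[[j, i], :]   # interchange rows i and j of the identity
    return P
end

A = [-1.0 2.0 -5.0; 3.0 -2.0 2.0; -1.0 4.0 1.0]
P = swaprows_sketch(3, 1, 3)
println(scalerow_sketch(2, -2.0, 3) * A)      # row 2 of A scaled by -2
println(P * A)                                # rows 1 and 3 of A swapped
println(P * P' == Matrix{Float64}(I, 3, 3))   # checks that the transpose inverts P
```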

Observe that in order to row reduce a matrix $A$ to an upper triangular matrix, it is only necessary to use operation 3. (as long as no zero pivot is encountered along the way), because the only thing we need to do is zero out all of the entries below the main diagonal. Thus, it is useful to have a Julia function that constructs the corresponding matrix. We now present one:

In [9]:
"""
    rowopmat(j,i,α,n)

Constructs an \$n \\times n\$ matrix that upon left multiplication replaces row 
j of a matrix \$A\$ with α times row i plus row j.

# Example
```julia-repl
julia> L12 = rowopmat(2,1,2.0,4)
```

"""
function rowopmat(j,i,α,n)
    In = Matrix{Float64}(I,n,n); # construct the n by n identity matrix
    M = In + α*In[:,j]*In[:,i]'
    return M
end

rowopmat

We note that the text between the pair of triple quotes preceding the function is called a docstring and it is the way that functions in Julia should be documented. Examine what happens if we call the help utility on the function `rowopmat`:

In [10]:
?rowopmat

search: rowopmat



```
rowopmat(j,i,α,n)
```

Constructs an $n \times n$ matrix that upon left multiplication replaces row  j of a matrix $A$ with α times row i plus row j.

# Example

```julia-repl
julia> L12 = rowopmat(2,1,2.0,4)
```


### Example

In [11]:
A = [-1.0 2.0 1.0;3.0 -2.0 2.0;-1.0 0.0 1.0]

3×3 Matrix{Float64}:
 -1.0   2.0  1.0
  3.0  -2.0  2.0
 -1.0   0.0  1.0

Suppose that we want to replace row 2 with 3 times row 1 plus row 2 in order to zero out the $(2,1)$ entry of $A$. Then we construct the matrix $I+3e_{2}e_{1}^{T}$ which is done using our Julia function as follows:

In [12]:
L12 = rowopmat(2,1,3.0,3) # note that each diagonal entry is equal to 1

3×3 Matrix{Float64}:
 1.0  0.0  0.0
 3.0  1.0  0.0
 0.0  0.0  1.0

In [13]:
L12*A # left multiply A by L12

3×3 Matrix{Float64}:
 -1.0  2.0  1.0
  0.0  4.0  5.0
 -1.0  0.0  1.0

Let's carry out the next step in row reduction:

In [14]:
L13 = rowopmat(3,1,-1.0,3)

3×3 Matrix{Float64}:
  1.0  0.0  0.0
  0.0  1.0  0.0
 -1.0  0.0  1.0

Observe what happens if we multiply consecutively by row operation matrices:

In [15]:
L13*L12*A

3×3 Matrix{Float64}:
 -1.0   2.0  1.0
  0.0   4.0  5.0
  0.0  -2.0  0.0

Let's keep going with our row reduction:

In [16]:
L23 = rowopmat(3,2,0.5,3)

3×3 Matrix{Float64}:
 1.0  0.0  0.0
 0.0  1.0  0.0
 0.0  0.5  1.0

In [17]:
L23*L13*L12*A

3×3 Matrix{Float64}:
 -1.0  2.0  1.0
  0.0  4.0  5.0
  0.0  0.0  2.5

We finally arrive at an upper triangular matrix $U$. What we have illustrated is that there exists a sequence of lower triangular matrices $L_{12}$, $L_{13}$, $L_{23}$ such that $L_{23}L_{13}L_{12}A = U$ where $U$ is upper triangular. This turns out to be a general fact. Furthermore, we know how to invert each of these lower triangular matrices, and their inverses are also lower triangular. So, if in our example we set $L=L_{12}^{-1}L_{13}^{-1}L_{23}^{-1}$ (note that the product of lower triangular matrices is a lower triangular matrix), then we have $LU = A$. This discussion illustrates the following important point:
> **Gaussian elimination finds a unit lower triangular matrix $L$ and an upper triangular matrix $U$ such that $A=LU$.**

Observe that a consequence of our development of the LU factorization of a matrix $A$ is that
$$
\text{det}(A) = \text{det}(LU) = \text{det}(L)\text{det}(U) = \prod_{i=1}^{n}u_{ii},
$$
where we have used the facts that the determinant of a triangular matrix is the product of the diagonal entries and that the diagonal entries of $L$ are all equal to 1. You will verify these facts in the homework.

Let's take a moment to derive by hand on the board the following facts:

1. Let $e_{k}$ denote the $k$-th column of the $n\times n$ identity matrix. Then, for $i\neq j$, we have that if $A$ is an $n\times n$ matrix, $e_{j}e_{i}^{T}A$ results in an $n\times n$ matrix where the $j$-th row is the $i$-th row of $A$ but every other entry is zero. From this, we can conclude that $\alpha e_{j}e_{i}^{T}A$ results in an $n\times n$ matrix where the $j$-th row is the $i$-th row of $A$ multiplied by $\alpha$ but every other entry is zero. 

2. Each of the matrices $I+\alpha e_{j}e_{i}^{T}$ is lower triangular whenever $j > i$ and will have each entry of the main diagonal equal to 1. 

3. Multiplying $A$ on the left by a matrix $I+\alpha e_{j}e_{i}^{T}$ (*i.e.*, forming $(I+\alpha e_{j}e_{i}^{T})A$), where $e_{k}$ denotes the $k$-th column of the identity matrix, has the result of adding $\alpha$ times row $i$ of $A$ to row $j$ of $A$.

4. The inverse of $I+\alpha e_{j}e_{i}^{T}$ is $I-\alpha e_{j}e_{i}^{T}$. It is then clear that when $j > i$ the inverse of $I+\alpha e_{j}e_{i}^{T}$ given by $I-\alpha e_{j}e_{i}^{T}$ is lower triangular. 
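
For reference, fact 4 can be checked in a few lines. The key observation is that $e_{i}^{T}e_{j} = 0$ whenever $i\neq j$, so the cross term in the product below vanishes:

$$
(I+\alpha e_{j}e_{i}^{T})(I-\alpha e_{j}e_{i}^{T}) = I - \alpha e_{j}e_{i}^{T} + \alpha e_{j}e_{i}^{T} - \alpha^{2} e_{j}(e_{i}^{T}e_{j})e_{i}^{T} = I.
$$

The same calculation with the two factors in the opposite order also gives $I$, so $I-\alpha e_{j}e_{i}^{T}$ is indeed the inverse.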

With the previous facts established, let's do a computer example. We will obtain the LU factorization of 
$$
A = \left[\begin{array}{ccc} -1.0 & 2.0 & 1.0 \\ 3.0 & -2.0 & 2.0\\ -1.0 & 0.0 & 1.0  \end{array}\right]
$$

In [18]:
L12inv = rowopmat(2,1,-3.0,3); # change the sign from 3 to -3
L13inv = rowopmat(3,1,1.0,3);  # change the sign from -1 to 1
L23inv = rowopmat(3,2,-0.5,3); # change the sign from 0.5 to -0.5
L = L12inv*L13inv*L23inv;
U = L23*L13*L12*A;

In [19]:
L

3×3 Matrix{Float64}:
  1.0   0.0  0.0
 -3.0   1.0  0.0
  1.0  -0.5  1.0

In [20]:
U

3×3 Matrix{Float64}:
 -1.0  2.0  1.0
  0.0  4.0  5.0
  0.0  0.0  2.5

In [21]:
L*U

3×3 Matrix{Float64}:
 -1.0   2.0  1.0
  3.0  -2.0  2.0
 -1.0   0.0  1.0

In [22]:
A - L*U

3×3 Matrix{Float64}:
 0.0  0.0  0.0
 0.0  0.0  0.0
 0.0  0.0  0.0

Let's demonstrate the determinant property of LU factorization that we pointed out previously:

In [23]:
println(det(A))
println(det(L)*det(U))
println(det(U))
println(U[1,1]*U[2,2]*U[3,3])

-9.999999999999998
-10.0
-10.0
-10.0


What we will do now is use the ideas we have established so far to show that an LU factorization exists for an $n\times n$ matrix whenever Gaussian elimination can be carried out without row interchanges (that is, whenever no zero pivot is encountered), and to present an algorithm for LU factorization that can be implemented on a computer. 

## The LU Algorithm

To reiterate:

> **Gaussian elimination finds a unit lower triangular matrix $L$ and an upper triangular matrix $U$ such that $A=LU$.**

However, we want to be able to find $L$ and $U$ directly without having to construct and multiply matrices. We show how to do this now.

The main idea is that, as we perform row operations on $A$, we are obtaining the entries for the upper triangular matrix $U$; and the multipliers used to zero out entries below the main diagonal form the entries below the main diagonal in the lower triangular matrix $L$. This allows us to derive an $LU$ algorithm which proceeds one column at a time.
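
To make this concrete with the $3\times 3$ matrix from the example above: clearing the first column uses the multipliers $A_{21}/A_{11} = 3/(-1) = -3$ and $A_{31}/A_{11} = (-1)/(-1) = 1$, and, once the first column has been cleared, the second column uses $(-2)/4 = -0.5$. These are exactly the entries below the main diagonal of the matrix $L$ computed earlier:

$$
L = \left[\begin{array}{ccc} 1 & 0 & 0 \\ -3 & 1 & 0 \\ 1 & -0.5 & 1 \end{array}\right]
$$

Each multiplier is the negative of the value of $\alpha$ passed to `rowopmat` in the corresponding step ($3$, $-1$, and $0.5$), since `rowopmat` adds $\alpha$ times a row whereas the multiplier is subtracted.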

The $LU$ algorithm:

Step 1: Set $n$ to the size of $A$, copy $A$, and initialize $L$ as an $n\times n$ identity matrix.

Step 2: Set an outer loop over columns and an inner loop over rows that do the following:

operate on column $j$ (for $j=1,\ldots,n-1$) and row $i$ (for $i=j+1,\ldots, n$) to

  i) place the multiplier $\frac{A_{ij}}{A_{jj}}$ in the $L_{ij}$ entry
  
  ii) subtract $L_{ij}$ times row $j$ of $A$ from row $i$ of $A$ and update $A$, that is
  
  $$A_{i,j:n} = A_{i,j:n} - L_{ij}A_{j,j:n}$$
  
Step 3: Extract the upper triangular part of the updated $A$ to get $U$.

Step 4: Return $L$ and $U$.

The following Julia function implements the LU factorization algorithm that we just derived. 

In [24]:
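"""
    lufact(A)

Constructs the LU factorization of a matrix \$A\$.

# Example
```julia-repl
julia> A = [-1.0 2.0 1.0;3.0 -2.0 2.0;-1.0 0.0 1.0]
julia> L,U = lufact(A)
```

"""
function lufact(A)
    n = size(A,1);                 # number of rows of the square matrix A
    Ac = copy(A);                  # work on a copy so that A is left unchanged
    L = Matrix{Float64}(I,n,n);    # start L as the n by n identity matrix
    for j=1:n-1                    # loop over columns
        for i=j+1:n                # loop over rows below the diagonal
            L[i,j] = Ac[i,j] / Ac[j,j];                # store the multiplier
            Ac[i,j:n] = Ac[i,j:n] - L[i,j]*Ac[j,j:n];  # eliminate entry (i,j)
        end
    end
    U = triu(Ac);                  # upper triangular part of the reduced matrix
    return L, U
end

lufact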

### Example

In [25]:
L,U = lufact(A);

In [26]:
L

3×3 Matrix{Float64}:
  1.0   0.0  0.0
 -3.0   1.0  0.0
  1.0  -0.5  1.0

In [27]:
U

3×3 Matrix{Float64}:
 -1.0  2.0  1.0
  0.0  4.0  5.0
  0.0  0.0  2.5

In [28]:
L*U

3×3 Matrix{Float64}:
 -1.0   2.0  1.0
  3.0  -2.0  2.0
 -1.0   0.0  1.0

In [29]:
A - L*U

3×3 Matrix{Float64}:
 0.0  0.0  0.0
 0.0  0.0  0.0
 0.0  0.0  0.0

### Using LU to Solve a Linear System

In [30]:
b = [-1.0,1.0,2.0]

3-element Vector{Float64}:
 -1.0
  1.0
  2.0

In [31]:
y = MATH361Lectures.forwardsub(L,b);

In [32]:
x = MATH361Lectures.backsub(U,y);

In [33]:
x

3-element Vector{Float64}:
 -1.2000000000000002
 -1.5
  0.8

In [34]:
A*x

3-element Vector{Float64}:
 -0.9999999999999998
  0.9999999999999996
  2.0

In [35]:
b - A*x

3-element Vector{Float64}:
 -2.220446049250313e-16
  4.440892098500626e-16
  0.0
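
The routines `forwardsub` and `backsub` come from the course package and their source is not reproduced here. A minimal sketch of what such routines might look like is given below; the names `forwardsub_sketch` and `backsub_sketch` are hypothetical, and the package versions may differ in detail. The factors are entered by hand so that the snippet runs on its own.

```julia
# Hypothetical sketches; the course package versions may differ in detail.
function forwardsub_sketch(L, b)
    n = length(b)
    y = zeros(n)
    for i in 1:n
        s = b[i]
        for k in 1:i-1
            s -= L[i, k] * y[k]   # remove contributions of already computed entries
        end
        y[i] = s / L[i, i]
    end
    return y
end

function backsub_sketch(U, y)
    n = length(y)
    x = zeros(n)
    for i in n:-1:1               # work upward from the last row
        s = y[i]
        for k in i+1:n
            s -= U[i, k] * x[k]
        end
        x[i] = s / U[i, i]
    end
    return x
end

L = [1.0 0.0 0.0; -3.0 1.0 0.0; 1.0 -0.5 1.0]
U = [-1.0 2.0 1.0; 0.0 4.0 5.0; 0.0 0.0 2.5]
b = [-1.0, 1.0, 2.0]
x = backsub_sketch(U, forwardsub_sketch(L, b))
println(x)   # compare with the solution computed above
```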

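We have shown that an $LU$ factorization exists whenever $A$ is a square matrix for which Gaussian elimination proceeds without encountering a zero pivot. Nonsingularity alone does not guarantee this, since row interchanges may be required. Another fact that we state without proof and will use later is that if $A$ is a nonsingular matrix that has such an $LU$ factorization (with $L$ unit lower triangular), then the factorization is unique. 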
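
The factorization loop divides by the diagonal entry of the working matrix at each step, so it relies on every pivot being nonzero. The sketch below illustrates this with a hypothetical variant, `lufact_checked` (the name and the error message are assumptions, not part of the course package), that stops when it meets a zero pivot. The $2\times 2$ matrix `B` is nonsingular, yet its $(1,1)$ entry is zero.

```julia
using LinearAlgebra

# Hypothetical variant of the factorization loop that stops on a zero pivot.
function lufact_checked(A)
    n = size(A, 1)
    Ac = float(copy(A))               # working copy with floating-point entries
    L = Matrix{Float64}(I, n, n)
    for j in 1:n-1
        Ac[j, j] == 0 && error("zero pivot in column $j; row interchanges are needed")
        for i in j+1:n
            L[i, j] = Ac[i, j] / Ac[j, j]
            Ac[i, j:n] -= L[i, j] * Ac[j, j:n]
        end
    end
    return L, triu(Ac)
end

B = [0.0 1.0; 1.0 0.0]                # det(B) = -1, but B[1,1] = 0
try
    lufact_checked(B)
catch err
    println(err.msg)                  # reports the zero pivot
end
```

Handling such matrices is the role of pivoting.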

# Assessing LU Factorization 

In the next lecture, we will consider the efficiency and stability of LU factorization. In preparation for this, it is recommended that you watch the lecture videos on [operation counts](https://www.youtube.com/watch?v=FGfDHLpfkZo&list=PLvUvOH0OYx3BcZivtXMIwP6hKoYv0YvGn&index=9) and [pivoting](https://www.youtube.com/watch?v=mmoliBMaaQs&list=PLvUvOH0OYx3BcZivtXMIwP6hKoYv0YvGn&index=10). 