#### Notation/Settings/Acronyms

Common settings

| symbol | setting |
|---|---|
| $n$ | a positive integer |
| $A$ | nonsingular $n$-by-$n$ matrix |
| $b$ | (column) vector of length $n$ |
| $x$ | (column) vector of length $n$ |

Acronyms

|Abbreviation| meaning|
|---|---|
| SPD | Symmetric positive definite |

Common convention

| expression | meaning |
|---|---|
| $a_{ij}$, $a_{i,j}$, $A_{ij}$, $A_{i,j}$ | $(i,j)$-component of a matrix $A$ ($i$-th row, $j$-th column) |



**Problem of interest**

Given $A$ and $b$, find $x$ such that

$$ Ax = b. $$




#### Methods

- Methods for general matrices
   1. Direct methods
      - plain Gaussian elimination
      - Gaussian elimination using $PA = LU$ decomposition.
        - Preliminary: $A=LU$ decomposition
   2. Iterative methods
      - Jacobi iteration
      - Gauss-Seidel iteration
- Methods for SPD matrices
   1. Direct methods
      - Cholesky factorization
   2. Iterative methods
      - Conjugate gradient method
- Framework for improvements
   1. Preconditioning


#### Gaussian elimination


##### Augmented matrix

Compact rearrangement of a system of linear equations in matrix form

$$
\begin{aligned}
x+2 y-z & =3 \\
2 x+y-2 z & =3 \\
-3 x+y+z & =-6 
\end{aligned}
\leftrightarrow
\left[\begin{array}{rrr:r}
1 & 2 & -1 & 3 \\
2 & 1 & -2 & 3 \\
-3 & 1 & 1 & -6
\end{array}\right]
$$

##### Elementary row operations

1. Swap one equation (or a row) for another (row): $R_i \leftrightarrow R_j$.
2. Add a multiple of one equation (or a row) to another (row): $R_i \gets R_i + c R_j$. Subtraction corresponds to a negative $c$.
3. Multiply an equation (or a row) by a nonzero constant: $R_i \gets c R_i$.


**Example**

Solve the following system of linear equations using the augmented matrix and elementary row operations: 

$$
\begin{aligned}
x+2 y-z & =3 \\
2 x+y-2 z & =3 \\
-3 x+y+z & =-6 .
\end{aligned}
$$

[Example of Gaussian eliminations 1](../images/ex_GaussianElimination1_lp1000.png)

[Example of Gaussian eliminations 2](../images/ex_GaussianElimination2_lp1000.png)

**Remark** (back substitution)

- While there can be many creative ways to find the solution, we will follow one single way: we are *developing a systematic method*.
- Let us call, in this class, the first step *elimination*.
- The second step (finding unknowns one by one) is called *back substitution* or *back solving*.
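The two steps named in the remark can be sketched in NumPy as follows. This is a minimal illustration, not a reference implementation: the function name `gaussian_elimination` is made up for this note, indices are 0-based, and the sketch assumes that no zero pivot is encountered, since it performs no row swaps.

```python
import numpy as np


def gaussian_elimination(A, b):
    """Solve Ax = b by naive elimination followed by back substitution.

    Assumes that no zero pivot is encountered (no row swaps are performed).
    """
    n = len(b)
    # Work on a floating-point copy of the augmented matrix [A | b].
    M = np.column_stack([A, b]).astype(float)

    # Elimination: zero out the entries below each pivot M[j, j].
    for j in range(n - 1):
        for i in range(j + 1, n):
            mult = M[i, j] / M[j, j]
            M[i, j:] -= mult * M[j, j:]  # R_i <- R_i - mult * R_j

    # Back substitution: find the unknowns from the last one to the first.
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (M[i, n] - M[i, i + 1:n] @ x[i + 1:]) / M[i, i]
    return x


A = np.array([[1, 2, -1], [2, 1, -2], [-3, 1, 1]])
b = np.array([3, 3, -6])
x = gaussian_elimination(A, b)
print(x)  # close to [3, 1, 2], i.e. x = 3, y = 1, z = 2
```

Substituting $x=3$, $y=1$, $z=2$ into the three equations of the example confirms the result.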

##### Complexity of Gaussian elimination

| Step | Complexity (precise) | Complexity (order) | 
|------|------------|-----|
| eliminations | $$ \frac 2 3 n^3 + \frac 1 2 n^2 - \frac 7 6 n $$ | $$=\mathcal{O}(n^3) $$ | 
| back substitutions | $$ n^2 $$ |  $$= \mathcal{O}(n^2) $$ |

[Derivation of complexity of eliminations 1](../images/der_ComplexityGaussianEliminations1_lp2000.png)

[Derivation of complexity of eliminations 2](../images/der_ComplexityGaussianEliminations2_lp2000.png)

[Derivation of complexity of back substitutions](../images/der_ComplexityBackSubstitutions_lp2000.png)
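The counts in the table can be checked by tallying operations loop by loop. The sketch below assumes a particular counting convention: every addition, subtraction, multiplication, and division counts as one operation, the right-hand side is updated along with the matrix, and the entries below the pivot are set to zero at no cost. The function names are hypothetical.

```python
def count_elimination_ops(n):
    """Count +, -, *, / used to eliminate an n-by-(n+1) augmented matrix."""
    ops = 0
    for j in range(n - 1):            # pivot column
        for i in range(j + 1, n):     # row to be updated
            ops += 1                  # one division for the multiplier
            ops += 2 * (n - j - 1)    # multiply and subtract per remaining column of A
            ops += 2                  # multiply and subtract for the right-hand side
    return ops


def count_back_substitution_ops(n):
    """Count +, -, *, / used in back substitution."""
    ops = 0
    for i in range(n - 1, -1, -1):
        ops += 2 * (n - i - 1)        # multiply and subtract per already-known unknown
        ops += 1                      # one division by the diagonal entry
    return ops


for n in (3, 10, 100):
    # (2/3) n^3 + (1/2) n^2 - (7/6) n, written with integer arithmetic
    exact = (4 * n**3 + 3 * n**2 - 7 * n) // 6
    print(n, count_elimination_ops(n), exact, count_back_substitution_ops(n), n**2)
```

Under this convention the tallies agree with the precise formulas for every $n$ tried; a different convention (for example, ignoring the right-hand side) would change the lower-order terms but not the leading $\frac 2 3 n^3$.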

___

Take-aways

- Properties of unit lower triangular matrices that underpin LU decomposition.

### A = LU decomposition

Intuition: Gaussian elimination can be encapsulated in matrix form, provided no zero pivot is encountered.

- We will see that $L^{-1}$ encodes the elimination while $U$ encodes the result of the elimination.

#### Method

The algorithm is borrowed from Kincaid and Cheney (2002), p. 155.

**Data**

- $A=(a_{ij})$: matrix
- $n$: size of matrix

**Computation**

- **for** $k=1$ to $n$ **do**
  - $\ell_{kk} \gets 1$
  - **for** $j=k$ to $n$ **do**
    - $u_{k j} \gets a_{k j}-\sum_{s=1}^{k-1} \ell_{k s} u_{s j}$
  - **for** $i=k+1$ to $n$ **do**
    - $\ell_{i k} \leftarrow\left(a_{i k}-\sum_{s=1}^{k-1} \ell_{i s} u_{s k}\right) / u_{k k}$

**Output**

- $L=(\ell_{ij})$
- $U=(u_{ij})$

**Remark**

- This algorithm works only if there is no zero pivot encountered.
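A possible NumPy transcription of the pseudocode above is sketched below. It is an illustration under stated assumptions rather than the textbook's code: the name `lu_doolittle` is invented here, indices are 0-based, and the explicit zero-pivot check is an addition that makes the failure case visible.

```python
import numpy as np


def lu_doolittle(A):
    """Return (L, U) with A = L U, L unit lower triangular, U upper triangular.

    Follows the loop structure of the pseudocode with 0-based indices.
    Raises ZeroDivisionError when a zero pivot is encountered.
    """
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    L = np.eye(n)            # sets every l_kk to 1
    U = np.zeros((n, n))
    for k in range(n):
        # Row k of U.
        for j in range(k, n):
            U[k, j] = A[k, j] - L[k, :k] @ U[:k, j]
        if U[k, k] == 0:
            raise ZeroDivisionError(f"zero pivot at step {k}")
        # Column k of L below the diagonal.
        for i in range(k + 1, n):
            L[i, k] = (A[i, k] - L[i, :k] @ U[:k, k]) / U[k, k]
    return L, U


A = np.array([[1, 2, -1], [2, 1, -2], [-3, 1, 1]])
L, U = lu_doolittle(A)
print(L)
print(U)
print(np.allclose(L @ U, A))
```

The matrix used here is the coefficient matrix of the Gaussian elimination example, so the resulting $U$ should match the eliminated system found there.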
    

#### Triangular matrices

**Definition**

1. An $n$-by-$n$ matrix $L$ is called *lower triangular* if $\ell_{ij}=0$ for $i < j$. In addition, if $\ell_{ij}=1$ for $i = j$, it is called *unit* lower triangular.

$$
\left[\begin{array}{ccccc}
\ell_{1,1} & & & & 0 \\
\ell_{2,1} & \ell_{2,2} & & & \\
\ell_{3,1} & \ell_{3,2} & \ddots & & \\
\vdots & \vdots & \ddots & \ddots & \\
\ell_{n, 1} & \ell_{n, 2} & \ldots & \ell_{n, n-1} & \ell_{n, n}
\end{array}\right]
$$

2. An $n$-by-$n$ matrix $U$ is called *upper triangular* if $u_{ij}=0$ for $i > j$. In addition, if $u_{ij}=1$ for $i = j$, it is called *unit* upper triangular.

$$
U=\left[\begin{array}{ccccc}
u_{1,1} & u_{1,2} & u_{1,3} & \ldots & u_{1, n} \\
& u_{2,2} & u_{2,3} & \ldots & u_{2, n} \\
& & \ddots & \ddots & \vdots \\
& & & \ddots & u_{n-1, n} \\
0 & & & & u_{n, n}
\end{array}\right]
$$

**Properties of triangular matrices**

**Fact** (From linear algebra)

- If an $n$-by-$n$ matrix $A$ is invertible, then the eigenvalues of $A^{-1}$ are precisely the reciprocals of the eigenvalues of $A$. 
- Determinant of a triangular matrix is the product of its diagonal entries.
- The eigenvalues of a lower triangular matrix are precisely its diagonal entries.

**Main properties**

Lower triangular shape is preserved under addition, scalar multiplication, matrix multiplication, and inversion. More specifically, 

1. If $L_1$ and $L_2$ are lower triangular matrices of size $n$-by-$n$, then $L_1 + L_2$ is also lower triangular. 
2. If $L_1$ is a lower triangular matrix of size $n$-by-$n$ and $\alpha$ is a scalar, then $\alpha L_1$ is also lower triangular. 
3. If $L_1$ and $L_2$ are lower triangular matrices of size $n$-by-$n$, then $L_1 L_2$ is also lower triangular. Furthermore, $[L_1 L_2]_{ii}=[L_1]_{ii}[L_2]_{ii}$ for $i=1,2,\cdots,n$.
   - If $L_1$ and $L_2$ are unit lower triangular matrices of size $n$-by-$n$, then $L_1 L_2$ is also unit lower triangular. 
4. If $L_1$ is a lower triangular matrix of size $n$-by-$n$ and it is invertible, then $L_1^{-1}$ is also lower triangular. Furthermore, $[L_1^{-1}]_{ii}=[L_1]_{ii}^{-1}$.
   - If $L_1$ is a unit lower triangular matrix of size $n$-by-$n$ and it is invertible, then $L_1^{-1}$ is also unit lower triangular. 

The same is true for upper triangular matrices.

[Proof of properties of triangular matrices 1](../images/pf_PropTriangularMatrices1_lp2000.png)

[Proof of properties of triangular matrices 2](../images/pf_PropTriangularMatrices2_lp2000.png)
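The main properties can also be checked numerically on random examples. The snippet below is only a sanity check, not a proof; the helper `is_lower_triangular`, the matrix size, and the random seed are arbitrary choices made for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4

# Random lower triangular matrices; adding the identity keeps the
# diagonal entries in [1, 2), so both matrices are invertible.
L1 = np.tril(rng.random((n, n))) + np.eye(n)
L2 = np.tril(rng.random((n, n))) + np.eye(n)


def is_lower_triangular(M):
    """True if all entries above the diagonal are zero up to rounding."""
    return np.allclose(M, np.tril(M))


P = L1 @ L2
L1_inv = np.linalg.inv(L1)

print(is_lower_triangular(L1 + L2))                         # property 1
print(is_lower_triangular(2.5 * L1))                        # property 2
print(is_lower_triangular(P))                               # property 3
print(np.allclose(np.diag(P), np.diag(L1) * np.diag(L2)))   # diagonal of a product
print(is_lower_triangular(L1_inv))                          # property 4
print(np.allclose(np.diag(L1_inv), 1 / np.diag(L1)))        # diagonal of an inverse
```

The comparison uses a tolerance because a numerically computed inverse may contain entries of rounding-error size above the diagonal.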


**Lemma 1 for LU** (matrix of row subtraction)

The elementary row operation $R_{i} \gets R_{i}+(-c)R_{j}$ can be represented by a matrix multiplication by $L_{ij}(-c)$ on the left, where

$$
[L_{ij}(-c)]_{k \ell} = \begin{cases}
1 & (k = \ell) \\
-c & (k = i, \ \ell = j) \\
0 & (\text{otherwise}),
\end{cases}
$$

or, 

$$
L_{i j}(-c)=\left[\begin{array}{ccccccc}
1 & & & & & & \\
& \ddots & & & & & \\
& & 1 & & & & \\
& & & \ddots & & & \\
& & -c & & 1 & & \\
& & & & & \ddots & \\
& & & & & & 1
\end{array}\right]
$$

To instructor: a handwritten note showing only the row operation is desired.
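Lemma 1 can be illustrated by building $L_{ij}(-c)$ explicitly and comparing the product with the row operation applied directly. The helper name `row_op_matrix` is hypothetical, and the code uses 0-based indices while the text uses 1-based ones.

```python
import numpy as np


def row_op_matrix(n, i, j, c):
    """Return the n-by-n matrix L_ij(-c): identity with -c at position (i, j).

    Indices are 0-based here, whereas the text uses 1-based indices.
    """
    E = np.eye(n)
    E[i, j] = -c
    return E


A = np.array([[1.0, 2.0, -1.0], [2.0, 1.0, -2.0], [-3.0, 1.0, 1.0]])

# R_2 <- R_2 - 2 R_1 in the text's 1-based notation (i = 1, j = 0 here).
E = row_op_matrix(3, 1, 0, 2.0)

# The same operation applied directly to a copy of A.
direct = A.copy()
direct[1] -= 2.0 * direct[0]

print(E @ A)
print(np.allclose(E @ A, direct))
```

Only the second row of the product differs from $A$; the other rows pass through unchanged because the corresponding rows of $L_{ij}(-c)$ are rows of the identity.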

**Lemma 2 for LU** (Product of row subtraction)

Let $L_{ij}(c_{ij})$ be defined as above. If $j$ is fixed, then, we have 

$$
\left[\prod_{i=j+1}^n L_{ij}(c_{ij})\right]_{k \ell} = \begin{cases}
1 & (k = \ell) \\
c_{kj} & (k > j, \ \ell = j) \\
0 & (\text{otherwise}),
\end{cases}
$$

or, 

$$
\prod_{i=j+1}^n L_{ij}(c_{ij})
=\left[\begin{array}{ccccccc}
1 & & & & & & \\
& \ddots & & & & & \\
& & 1 & & & & \\
& & c_{j+1,j} & \ddots & & & \\
& & c_{j+2,j} & & 1 & & \\
& & \vdots & & & \ddots & \\
& & c_{n,j} & & & & 1
\end{array}\right]
$$

For example, in the $4$-by-$4$ case with $j=1$, 

$$
\left(\begin{array}{cccc}
1 & 0 & 0 & 0 \\
c_{21} & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 \\
\end{array}\right)
\left(\begin{array}{cccc}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
c_{31} & 0 & 1 & 0 \\
0 & 0 & 0 & 1 \\
\end{array}\right)
\left(\begin{array}{cccc}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
c_{41} & 0 & 0 & 1 \\
\end{array}\right)
=
\left(\begin{array}{cccc}
1 & 0 & 0 & 0 \\
c_{21} & 1 & 0 & 0 \\
c_{31} & 0 & 1 & 0 \\
c_{41} & 0 & 0 & 1 \\
\end{array}\right)
$$

Proof: Explicit calculation.
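The explicit calculation can be mirrored numerically. The values stored in `c` below are arbitrary placeholders for the $c_{ij}$, chosen only for this check, and the column index is 0-based.

```python
import numpy as np

n, j = 4, 0                      # j = 0 is the first column in 0-based indexing
c = {1: 2.0, 2: -5.0, 3: 7.0}    # hypothetical values of c_ij for i > j

# Multiply the single-entry matrices L_ij(c_ij) in order of increasing i.
product = np.eye(n)
for i in range(j + 1, n):
    E = np.eye(n)
    E[i, j] = c[i]
    product = product @ E

# The lemma predicts the identity with column j filled by the c_ij.
expected = np.eye(n)
for i in range(j + 1, n):
    expected[i, j] = c[i]

print(product)
print(np.allclose(product, expected))
```

Because all factors fill the same column $j$, they commute, so the order of multiplication inside the product does not affect the result.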

**Lemma 3 for LU** (Inverse of row elimination)

Let $L$ be an $n$-by-$n$ lower triangular matrix whose diagonal elements are all 1 and which has nonzero elements below the diagonal in at most one column. Then $L^{-1}$ has the same form as $L$, except that the signs of the elements below the diagonal are flipped.

```python
"""This script verifies the inversion of a triangular matrix.

1. If only one column has nonzero elements below the diagonal,
    then the matrix inversion is mechanical.
2. If more columns have nonzero elements below the diagonal,
    then inversion is not that simple.
"""

import numpy as np

A = np.eye(3)
A[1, 0] = 2
A[2, 0] = -5
A[2, 1] = 3  # case 2; comment out this line to see case 1

# B is A with the signs of the entries below the diagonal flipped.
B = A.copy()
low_diag_ind = np.tril_indices_from(B, -1)
B[low_diag_ind] = - A[low_diag_ind]

print("A: \n", A)
print("\nB: \n", B)
print("\nA*B: \n", A@B)
print("\nA^-1:\n", np.linalg.inv(A))
```

Output (case 2; $AB$ is not the identity, so $B \neq A^{-1}$):

```
A: 
 [[ 1.  0.  0.]
 [ 2.  1.  0.]
 [-5.  3.  1.]]

B: 
 [[ 1.  0.  0.]
 [-2.  1.  0.]
 [ 5. -3.  1.]]

A*B: 
 [[ 1.  0.  0.]
 [ 0.  1.  0.]
 [-6.  0.  1.]]

A^-1:
 [[ 1.00000000e+00 -1.77635684e-16 -0.00000000e+00]
 [-2.00000000e+00  1.00000000e+00  0.00000000e+00]
 [ 1.10000000e+01 -3.00000000e+00  1.00000000e+00]]
```


**Lemma 4 for LU** (Product of elementary matrix)

The following pattern generalizes to any size $n$-by-$n$ as long as

1. each matrix is unit lower triangular,
2. each matrix has at most one column that is filled with nonzero entries below diagonal, and
3. the order is kept, namely, the matrix with a full column of smaller index is multiplied more to the left.

$$
\left(\begin{array}{cccc}
1 & 0 & 0 & 0 \\
c_{21} & 1 & 0 & 0 \\
c_{31} & 0 & 1 & 0 \\
c_{41} & 0 & 0 & 1 \\
\end{array}\right)
\left(\begin{array}{cccc}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & c_{32} & 1 & 0 \\
0 & c_{42} & 0 & 1 \\
\end{array}\right)
\left(\begin{array}{cccc}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & c_{43} & 1 \\
\end{array}\right)
=
\left(\begin{array}{cccc}
1 & 0 & 0 & 0 \\
c_{21} & 1 & 0 & 0 \\
c_{31} & c_{32} & 1 & 0 \\
c_{41} & c_{42} & c_{43} & 1 \\
\end{array}\right)
$$

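```python
"""This script verifies the product of elementary triangular matrices.

1. If the matrix with the earlier column filled is multiplied more
    to the left, the product is as easy as writing out the entries.
2. If not, this property is lost.
"""

import numpy as np

# A has nonzero entries below the diagonal only in column 0.
A = np.eye(3)
A[1, 0] = 2
A[2, 0] = -5
# A[2, 1] = 3 # uncomment this line to see case 2

# B has nonzero entries below the diagonal only in column 1.
B = np.eye(3)
B[2, 1] = 3


print("A: \n", A)
print("\nB: \n", B)
print("\nA*B: \n", A@B)  # earlier column on the left: entries are merged
print("\nB*A: \n", B@A)  # reversed order: the (2, 0) entry changes
```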


**Example** (Finding LU decomposition)

1. Represent the Gaussian elimination of the following system of linear equations using elementary lower triangular matrices, and
2. find the $A=LU$ decomposition, where $L$ is unit lower triangular and $U$ is upper triangular.

$$
\begin{aligned}
x+2 y-z & =3 \\
2 x+y-2 z & =3 \\
-3 x+y+z & =-6 .
\end{aligned}
$$

[Example of LU decomposition 1](../images/ex_LUdecomposition1_lp2000.png)

[Example of LU decomposition 2](../images/ex_LUdecomposition2_lp2000.png)
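The same example can be reproduced numerically with elementary matrices. The multipliers below were worked out for this sketch from the system itself, so they should be read as one consistent way to carry out the elimination; the helper `elementary` is a made-up name and uses 0-based indices.

```python
import numpy as np


def elementary(n, i, j, value):
    """Identity matrix with `value` at position (i, j), 0-based indices."""
    E = np.eye(n)
    E[i, j] = value
    return E


A = np.array([[1.0, 2.0, -1.0], [2.0, 1.0, -2.0], [-3.0, 1.0, 1.0]])

# Elimination steps, written in the text's 1-based row notation.
E1 = elementary(3, 1, 0, -2.0)        # R_2 <- R_2 - 2 R_1
E2 = elementary(3, 2, 0, 3.0)         # R_3 <- R_3 + 3 R_1
E3 = elementary(3, 2, 1, 7.0 / 3.0)   # R_3 <- R_3 + (7/3) R_2

U = E3 @ E2 @ E1 @ A

# Inverting each step flips the sign of its off-diagonal entry (Lemma 3),
# and multiplying the inverses in this order merges the entries (Lemma 4).
L = (elementary(3, 1, 0, 2.0)
     @ elementary(3, 2, 0, -3.0)
     @ elementary(3, 2, 1, -7.0 / 3.0))

print(U)
print(L)
print(np.allclose(L @ U, A))
```

Here $U = E_3 E_2 E_1 A$, hence $A = E_1^{-1} E_2^{-1} E_3^{-1} U = LU$, which is why the inverses appear in the opposite order.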

#### Solving linear system using LU decomposition

Once $A=LU$ is obtained, we solve $Ax=b$ via two steps.

1. Solve $Lc = b$ for $c$, then
2. solve $Ux = c$ for $x$.

Reason

From $Ax = LUx = b$, we have $Ux = L^{-1}b =: c$, hence $x=U^{-1}c$.

**Remark**

- Both steps can be computed efficiently because $L$ and $U$ are both triangular, so applying their inverses amounts to nothing more than a (forward or back) substitution.

**Example** (Solving a linear system given LU)

Given 

$$
A 
= \left[\begin{array}{rrr}
1 & 2 & -1 \\
2 & 1 & -2 \\
-3 & 1 & 1
\end{array}\right]
=\left[\begin{array}{rrr}
1 & 0 & 0 \\
2 & 1 & 0 \\
-3 & -\frac{7}{3} & 1
\end{array}\right]\left[\begin{array}{rrr}
1 & 2 & -1 \\
0 & -3 & 0 \\
0 & 0 & -2
\end{array}\right]
=L U
$$

and 

$$
b = [3, 3, -6]^T,
$$

solve $Ax=b$.

[Example of solving linear system given LU](../images/ex_SolvingLinSystemGivenLU_lp2000.png)
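The two-step solve can be sketched with explicit forward and back substitution, using the factors $L$ and $U$ given in the example. The function names are illustrative, and both routines assume nonzero diagonal entries.

```python
import numpy as np


def forward_substitution(L, b):
    """Solve L c = b for a lower triangular L with nonzero diagonal."""
    n = len(b)
    c = np.zeros(n)
    for i in range(n):
        c[i] = (b[i] - L[i, :i] @ c[:i]) / L[i, i]
    return c


def back_substitution(U, c):
    """Solve U x = c for an upper triangular U with nonzero diagonal."""
    n = len(c)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (c[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x


L = np.array([[1.0, 0.0, 0.0], [2.0, 1.0, 0.0], [-3.0, -7.0 / 3.0, 1.0]])
U = np.array([[1.0, 2.0, -1.0], [0.0, -3.0, 0.0], [0.0, 0.0, -2.0]])
b = np.array([3.0, 3.0, -6.0])

c = forward_substitution(L, b)   # step 1: L c = b
x = back_substitution(U, c)      # step 2: U x = c
print(c)  # close to [3, -3, -4]
print(x)  # close to [3, 1, 2]
```

The solution agrees with the one obtained by Gaussian elimination on the same system, as it should, since both routes perform the same arithmetic in a different arrangement.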

#### Complexity comparisons: Gaussian elimination and LU

Though Gaussian elimination and LU decomposition use the same idea, they are quite different in practice. 

| | Gaussian elimination | LU factorization |
|---|---|---|
| $b$ | included in the augmented matrix | not included |
| approximate complexity <br> for $Ax=b$ | $$ \frac 2 3 n^3 $$ | $$ \frac 2 3 n^3$$ |
| approximate complexity <br> for multiple problems: <br> $Ax_i=b_i$ <br> ($i=1,2,\cdots,k$)| $$ \frac 2 3 k n^3 $$ | $$ \frac 2 3 n^3 + 2 k n^2$$ |



**Remark**

- The above table summarizes approximate complexities rather than precise ones. The additional $2kn^2$ term for multiple problems using LU decomposition (compared to a single problem) arises because each problem requires two substitutions ($Lc=b$ and $Ux=c$), each costing about $n^2$ operations. Gaussian elimination, on the other hand, has to be redone from scratch for each problem, hence the complexity $(2/3) kn^3$.
- The multiple-problem scenario ($Ax_i=b_i$ for $i=1,2,\cdots,k$) is common in applications because $A$ often comes from discretizing the integral or differential operator that governs the application, while the $b_i$ come from different data. For example, according to Sauer (2017), in structural engineering, $b$ is called the *loading vector* and the solution $x$ gives the *stress*. We may want to see what the stress looks like for many different loading vectors.
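To get a feeling for the size of the gap, the approximate counts from the table can be evaluated for sample values. The values of $n$ and $k$ below are hypothetical, and the functions simply encode the leading-order formulas, so the numbers are rough estimates rather than measurements.

```python
def cost_gaussian_elimination(n, k):
    """Approximate operation count for k separate eliminations: (2/3) k n^3."""
    return 2 * k * n**3 / 3


def cost_lu(n, k):
    """Approximate count for one factorization plus 2k substitutions."""
    return 2 * n**3 / 3 + 2 * k * n**2


n, k = 1000, 100   # hypothetical matrix size and number of right-hand sides
ge = cost_gaussian_elimination(n, k)
lu = cost_lu(n, k)
print(f"Gaussian elimination: {ge:.3e}")
print(f"LU factorization:     {lu:.3e}")
print(f"ratio:                {ge / lu:.1f}")
```

For these sample values the repeated elimination costs roughly 77 times as many operations as factoring once and substituting $k$ times.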

### PA = LU decomposition

Motivation

Not all matrices have an $LU$ factorization.

**Example** ($LU$ may not exist)

The following matrix does not have an $LU$ factorization.

$$
A 
= \left[\begin{array}{rr}
0 & 1 \\
1 & 0 \\
\end{array}\right]
$$

[Example of impossibility of LU factorization](../images/ex_LUimpossible_lp2000.png)
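A short numerical look at this matrix shows where the $A=LU$ algorithm breaks down and why a row swap helps. This is only an illustration of the obstruction, with the permutation matrix `P` chosen by hand; it does not replace the argument in the linked note.

```python
import numpy as np

A = np.array([[0.0, 1.0], [1.0, 0.0]])

# First step of the A = LU algorithm: u_11 = a_11 is the first pivot.
u11 = A[0, 0]
print(u11 == 0)   # zero pivot, so l_21 = a_21 / u_11 cannot be formed

# Swapping the two rows first removes the obstruction: P A is the identity,
# which factors trivially with L = I and U = I.
P = np.array([[0.0, 1.0], [1.0, 0.0]])
PA = P @ A
L = np.eye(2)
U = PA.copy()
print(PA)
print(np.allclose(PA, L @ U))
```

Here $PA = LU$ holds even though $A$ itself has no $LU$ factorization, which is the idea behind the $PA = LU$ decomposition.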

___

##### Application of LU decomposition

- Solving systems of linear equations
- Determinant
- Inverting matrices

Reference: [Wikipedia](https://en.wikipedia.org/wiki/LU_decomposition#Applications)
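As a small illustration of the determinant application: since $\det A = \det L \cdot \det U$ and $\det L = 1$ for a unit lower triangular $L$, the determinant is the product of the diagonal entries of $U$. The sketch below reuses the factor $U$ from the worked example and compares against `np.linalg.det`.

```python
import numpy as np

A = np.array([[1.0, 2.0, -1.0], [2.0, 1.0, -2.0], [-3.0, 1.0, 1.0]])
U = np.array([[1.0, 2.0, -1.0], [0.0, -3.0, 0.0], [0.0, 0.0, -2.0]])

# det(A) = det(L) det(U); det(L) = 1 because L is unit lower triangular,
# and det(U) is the product of the diagonal entries of U.
det_from_lu = np.prod(np.diag(U))
print(det_from_lu)                                  # 1 * (-3) * (-2) = 6
print(np.isclose(det_from_lu, np.linalg.det(A)))
```

If a $PA = LU$ factorization with row swaps were used instead, the sign of the product would additionally depend on the permutation.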