# MATH 405/607 

# Numerical Methods for Differential Equations

[[Instructor: Christoph Ortner]](http://www.math.ubc.ca/~ortner/)   [[course page]](https://github.com/cortner/math405_2022)




## L04 - Linear Systems

* arrays
* vectors and matrices
* linear systems in matrix form
* backsubstitution
* Gaussian elimination 
* LU factorisation
* sparse matrices
* banded matrices 
* Eigendecomposition

In [1]:
include("math405.jl")

┌ Info: You are not running in the `math405` Jupyter Hub environment. 
│ I'm therefore activating the local environment.
│ Make sure you know what you are doing! If this is unintentional 
│ then get in touch with your instructor to get help.
└ @ Main /Users/ortner/gits/math405_2022/notes/math405.jl:7
  Activating project at `~/gits/math405_2022/notes`


* https://fncbook.github.io/fnc/linsys/overview.html
* https://github.com/ettersi/ComputationalMathematics/blob/master/03_lu_factorisation.pdf
* E. Süli and D. Mayers, An Introduction to Numerical Analysis, Ch. 2
* G. H. Golub and C. F. Van Loan. Matrix Computations. 1996
* L. N. Trefethen and D. Bau. Numerical Linear Algebra. 1997
* N. J. Higham. Accuracy and Stability of Numerical Algorithms. 2002 

## Background 

I will assume that you are familiar with symbolic linear algebra, in particular: 
* The vector spaces $\mathbb{R}^N, \mathbb{C}^N$
* If we don't want to specify which field we are considering we will write $\mathbb{F}$
* Vectors $x \in \mathbb{F}^N$, Matrices $A \in \mathbb{F}^{M \times N}$
* Standard matrix algebra, such as $A * B, A+\lambda B, A*x$, etc.
* Matrices are representations of linear operators
* Writing linear systems in matrix form 
* Gaussian elimination for solving linear systems

If you need to review this background, then please find some suitable lecture notes.

In `Julia` we can create and manipulate matrices and vectors as follows:
* https://docs.julialang.org/en/v1/manual/arrays/
* https://docs.julialang.org/en/v1/stdlib/LinearAlgebra/

In [2]:
x = rand(3)       # vector with random entries

3-element Vector{Float64}:
 0.635661141033263
 0.8267902654198881
 0.2462251722850558

In [3]:
A = rand(3, 3)   # 3 x 3 matrix with random entries

3×3 Matrix{Float64}:
 0.0489218  0.388194  0.0679635
 0.807611   0.571594  0.651348
 0.522048   0.783339  0.491211

In [4]:
y = A * x          # mat-vec multiplication

3-element Vector{Float64}:
 0.3687866769402452
 1.1463337524624697
 1.100451202301447

In [5]:
y + x              # addition 

3-element Vector{Float64}:
 1.0044478179735081
 1.9731240178823577
 1.3466763745865027

In [6]:
y .* x             # elementwise multiplication

3-element Vector{Float64}:
 0.2344233598617016
 0.9477775874582216
 0.2709587868779706

In [7]:
y * x              # Warning to Matlab users : y * x is not defined 

LoadError: MethodError: no method matching *(::Vector{Float64}, ::Vector{Float64})
Closest candidates are:
  *(::Any, ::Any, ::Any, ::Any...) at /Applications/Julia-1.7.app/Contents/Resources/julia/share/julia/base/operators.jl:655
  *(::StridedMatrix{T}, ::StridedVector{S}) where {T<:Union{Float32, Float64, ComplexF32, ComplexF64}, S<:Real} at /Applications/Julia-1.7.app/Contents/Resources/julia/share/julia/stdlib/v1.7/LinearAlgebra/src/matmul.jl:44
  *(::StridedVecOrMat, ::Adjoint{<:Any, <:LinearAlgebra.LQPackedQ}) at /Applications/Julia-1.7.app/Contents/Resources/julia/share/julia/stdlib/v1.7/LinearAlgebra/src/lq.jl:266
  ...

In [8]:
A * rand(3, 2)   # matrix-matrix multiplication

3×2 Matrix{Float64}:
 0.418753  0.0595034
 1.28384   0.136476
 1.24169   0.148485

In [9]:
# construct vectors and matrices "manually"
x = [ 1.0, 2.0, 3.0 ]

3-element Vector{Float64}:
 1.0
 2.0
 3.0

In [10]:
A = [ 1.0 2.0 
      3.0 4.0 ]

2×2 Matrix{Float64}:
 1.0  2.0
 3.0  4.0

In [11]:
B = [ 1.0 2.0; 3.0 4.0 ]

2×2 Matrix{Float64}:
 1.0  2.0
 3.0  4.0

In [12]:
C = [ 1 2; 3 4 ]

2×2 Matrix{Int64}:
 1  2
 3  4

In [13]:
x = 3
typeof(x)

Int64

Further Julia functions and types to work with arrays: 
* `Array` is the type, but also the constructor, e.g., `Array{Float64, 2}(undef, 10, 10)` allocates a 10 x 10 real matrix with undefined entries.
* `zeros`, `ones`, `randn`
* `A'` for adjoint; `transpose` for transpose
* `ComplexF64` for complex matrices, e.g., `rand(ComplexF64, (10,10))`
* `ldiv!, rdiv!, mul!` for efficient in-place operations
* broadcasting, e.g., `f.(A)` applies `f` to each element
* Warning: `exp(A)` is the matrix exponential, `exp.(A)` the elementwise exponential

In [14]:
A = rand(3,3)
@show exp(A)
@show exp.(A);

exp(A) = [1.5576366012312592 0.6444903104494882 1.160382040175431; 0.5181012724463682 1.7273042774704035 0.7237137913433056; 0.41771383066509954 0.9252238607046032 1.3200378315935215]
exp.(A) = [1.355915152985946 1.1839740819558326 2.377445532139627; 1.3475239871920657 1.4582869197026855 1.5137966370366145; 1.2479370384002952 1.9636162663139103 1.0021927091768212]


In [15]:
?exp

search: exp exp2 Expr expm1 exp10 export expint expintx expinti exponent



```
exp(x)
```

Compute the natural base exponential of `x`, in other words $ℯ^x$.

See also [`exp2`](@ref), [`exp10`](@ref) and [`cis`](@ref).

# Examples

```jldoctest
julia> exp(1.0)
2.718281828459045

julia> exp(im * pi) == cis(pi)
true
```

---

```
exp(A::AbstractMatrix)
```

Compute the matrix exponential of `A`, defined by

$$
e^A = \sum_{n=0}^{\infty} \frac{A^n}{n!}.
$$

For symmetric or Hermitian `A`, an eigendecomposition ([`eigen`](@ref)) is used, otherwise the scaling and squaring algorithm (see [^H05]) is chosen.

[^H05]: Nicholas J. Higham, "The squaring and scaling method for the matrix exponential revisited", SIAM Journal on Matrix Analysis and Applications, 26(4), 2005, 1179-1193. [doi:10.1137/090768539](https://doi.org/10.1137/090768539)

# Examples

```jldoctest
julia> A = Matrix(1.0I, 2, 2)
2×2 Matrix{Float64}:
 1.0  0.0
 0.0  1.0

julia> exp(A)
2×2 Matrix{Float64}:
 2.71828  0.0
 0.0      2.71828
```


Remark on transpose vs adjoint: $A^H = A^* = \bar{A}^T$ (i.e. transpose and complex conjugate)
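
The difference between the two only shows up for complex entries. A minimal sketch, using two small hypothetical matrices `A` and `B` that are not part of the lecture:

```julia
using LinearAlgebra

A = [1+2im  3; 4im  5-1im]        # hypothetical 2×2 complex matrix

@show A' == conj(transpose(A))    # adjoint = transpose + complex conjugate
@show A' == transpose(A)          # false, because A has non-real entries

B = [1.0 2.0; 3.0 4.0]
@show B' == transpose(B)          # for real matrices the two coincide
```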

Most Julia functions are well documented; you can view the help text using `?`.

In [16]:
?randn

search: randn randn! sprandn randstring lowrankdowndate lowrankdowndate!



```
randn([rng=GLOBAL_RNG], [T=Float64], [dims...])
```

Generate a normally-distributed random number of type `T` with mean 0 and standard deviation 1. Optionally generate an array of normally-distributed random numbers. The `Base` module currently provides an implementation for the types [`Float16`](@ref), [`Float32`](@ref), and [`Float64`](@ref) (the default), and their [`Complex`](@ref) counterparts. When the type argument is complex, the values are drawn from the circularly symmetric complex normal distribution of variance 1 (corresponding to real and imaginary part having independent normal distribution with mean zero and variance `1/2`).

# Examples

```jldoctest
julia> using Random

julia> rng = MersenneTwister(1234);

julia> randn(rng, ComplexF64)
0.6133070881429037 - 0.6376291670853887im

julia> randn(rng, ComplexF32, (2, 3))
2×3 Matrix{ComplexF32}:
 -0.349649-0.638457im  0.376756-0.192146im  -0.396334-0.0136413im
  0.611224+1.56403im   0.355204-0.365563im  0.0905552+1.31012im
```


### Goal of this lecture: 

Direct solution of linear systems: If $A = (a_{ij})_{i,j=1}^N \in \mathbb{F}^{N \times N}, b = (b_i)_{i=1}^N \in \mathbb{F}^N$, find $x = (x_i)_{i=1}^N \in \mathbb{F}^N$ s.t.

$$\begin{aligned}
   a_{11} x_1 + a_{12} x_2 + \cdots + a_{1N} x_N &= b_1 \\ 
   a_{21} x_1 + a_{22} x_2 + \cdots + a_{2N} x_N &= b_2  \\ 
         & \qquad  \vdots \\
   a_{N1} x_1 + a_{N2} x_2 + \cdots + a_{NN} x_N &= b_N  \\ 
\end{aligned}$$


$$
        \Leftrightarrow  \qquad\qquad \qquad 
    \begin{pmatrix}
        a_{11} & a_{12} &  \cdots & a_{1N}  \\ 
        a_{21} & a_{22} &  \cdots & a_{2N}  \\ 
          \vdots & \vdots &        & \vdots \\ 
        a_{N1} & a_{N2} &  \cdots & a_{NN}  \\ 
    \end{pmatrix}
    \cdot
    \begin{pmatrix}
        x_1 \\ x_2 \\ \vdots \\ x_N
    \end{pmatrix}
    = 
    \begin{pmatrix}
        b_1 \\ b_2 \\ \vdots \\ b_N
    \end{pmatrix}
$$

$$
   \Leftrightarrow \qquad \qquad  \qquad A x = b
$$

We've done this already in the intro lecture when we solved the 2-point BVP. In Julia this can be done simply via the `\` operator.

In [17]:
A, b = rand(10,10), rand(10)    # a random linear system 
x = A \ b     # assign solution to `x`
display(x')
@show norm(A * x - b);

1×10 adjoint(::Vector{Float64}) with eltype Float64:
 1072.55  1262.73  -1116.78  -1796.21  …  960.879  -11.4051  -1500.9  669.969

norm(A * x - b) = 1.3573885636900565e-12


Behind the `\` operator is the so-called LU factorisation, which is the main goal of this lecture.
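
The factorisation can also be requested explicitly with `lu` from the `LinearAlgebra` standard library and then reused for several right-hand sides. The following is only a sketch; the matrix and the right-hand sides are hypothetical.

```julia
using LinearAlgebra

A  = [ 4.0 3.0 2.0
       2.0 1.0 3.0
       3.0 4.0 1.0 ]          # hypothetical invertible matrix
b1 = [1.0, 2.0, 3.0]
b2 = [0.0, 1.0, 0.0]

F = lu(A)                     # row-permuted factorisation: A[F.p, :] = F.L * F.U
@show F.L * F.U ≈ A[F.p, :]

x1 = F \ b1                   # each solve reuses the stored factors
x2 = F \ b2
@show norm(A * x1 - b1)
@show norm(A * x2 - b2)
```

Factorising once and solving many times is the main practical reason to keep `F` around rather than calling `A \ b` repeatedly.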

### Triangular Systems & Backward substitution

#### Example

$$\begin{aligned}
    4 x_1 + x_2 - 2 x_3 &=  3 \\ 
          2 x_2 - x_3 &= 4 \\ 
                   3 x_3 &= 6      
\end{aligned}$$

$$\begin{aligned}
   x_3 &= 2 \\ 
   x_2 &= (4 + x_3)/2 = 3 \\ 
   x_1 &= (3 - x_2 + 2 x_3) / 4 = 1
\end{aligned}$$
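
As a quick sanity check of the hand computation, the same system can be passed to Julia's built-in triangular solver. This is only a sketch; wrapping the matrix in `UpperTriangular` tells `\` that backsubstitution suffices.

```julia
using LinearAlgebra

U = UpperTriangular([ 4.0  1.0 -2.0
                      0.0  2.0 -1.0
                      0.0  0.0  3.0 ])
b = [3.0, 4.0, 6.0]

x = U \ b      # triangular solve
@show x        # should agree with x_1 = 1, x_2 = 3, x_3 = 2 from above
```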

### General Case

$$\begin{aligned}
    a_{11} x_1 + \cdots + a_{1,N-1} x_{N-1} + a_{1N} x_N &= b_1 \\ 
    & \vdots \\ 
    a_{N-1,N-1} x_{N-1} + a_{N-1,N} x_N &= b_{N-1} \\ 
    a_{NN} x_N &= b_N 
\end{aligned}$$

corresponds to $A x = b$ where $A$ is *UPPER TRIANGULAR*: 

$$
  A = \begin{pmatrix}
      * & * &  \cdots & * &  * \\ 
        & * &  \cdots & * &  * \\ 
        &   &  \ddots   & & \vdots \\ 
        &   &         & *  &  *  \\ 
        &   &          &    & * 
  \end{pmatrix}
$$

```
FOR rows n = N, N-1, ..., 1 DO
   x[n] = ( b[n] - sum( A[n, m] * x[m] for m = n+1:N ) ) / A[n,n]
END
```

In [18]:
function backsubstitution(A::UpperTriangular, b::AbstractVector)
    N = length(b)
    x = zeros(N)
    # the last row contains a single unknown
    x[N] = b[N] / A[N,N]
    # work upwards: row n only involves x[n], ..., x[N]
    for n = N-1:-1:1 
        x[n] = (b[n] - sum(A[n,m]*x[m] for m=n+1:N)) / A[n,n]
    end
    return x
end 

backsubstitution (generic function with 1 method)

In [19]:
A = UpperTriangular(rand(10,10))
b = rand(10)
x = backsubstitution(A, b)
println("Residual: ||A*x-b|| = $(norm(A*x-b))")

Residual: ||A*x-b|| = 9.305364597889227e-16
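
Lower triangular systems can be solved in the same way, working from the first row downwards. The function `forwardsubstitution` below is a hypothetical counterpart to `backsubstitution`, written as a sketch and not taken from the lecture code.

```julia
using LinearAlgebra

# Hypothetical counterpart of `backsubstitution` for lower triangular systems.
function forwardsubstitution(A::LowerTriangular, b::AbstractVector)
    N = length(b)
    x = zeros(N)
    x[1] = b[1] / A[1,1]              # the first row contains a single unknown
    for n = 2:N                       # row n only involves x[1], ..., x[n]
        x[n] = (b[n] - sum(A[n,m]*x[m] for m = 1:n-1)) / A[n,n]
    end
    return x
end

A = LowerTriangular([ 2.0 0.0 0.0
                      1.0 3.0 0.0
                      4.0 5.0 6.0 ])
b = [2.0, 7.0, 32.0]
x = forwardsubstitution(A, b)
@show x
@show norm(A * x - b)
```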


### Performance Analysis

```
FOR rows n = N, N-1, ..., 1 DO
   x[n] = ( b[n] - sum( A[n, m] * x[m] for m = n+1:N ) ) / A[n,n]
END
```

$$
  \#{\rm FLOPS} \approx 1 + 2 + 3 + \dots + N \approx N^2
$$

We will say that "backsubstitution requires $O(N^2)$ operations". This is the same cost as a matrix-vector multiplication, but much cheaper than a matrix-matrix multiplication, which costs $O(N^3)$.
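
Counting a little more carefully (a sketch, assuming every multiplication, addition, subtraction and division counts as one operation): row $n$ needs $N - n$ multiplications, $N - n$ additions or subtractions and one division, so

$$
  \#{\rm FLOPS} = \sum_{n=1}^{N} \big( 2 (N - n) + 1 \big) = N (N-1) + N = N^2.
$$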

**KEY POINT:** If $A$ is upper triangular, then computing $A * x$ and $A^{-1} * x$ have essentially the same quasi-optimal cost: each entry of the matrix is touched once. In other words, if we have $A$ then we effectively already have $A^{-1}$.

### Remark on Julia performance

Julia is an incredibly "democratic" language. Similar to C, C++ or Fortran, 
but very different from Python or Matlab, well-written user code in Julia 
can be just as fast as core library code. Our code here is not particularly well written,
but it still lands in a reasonable performance ballpark.

In [20]:
using BenchmarkTools
N = 10; A = UpperTriangular(rand(N,N)); b = rand(N)
print("  Our toy code:" ); @btime backsubstitution($A, $b)
print("Julia built-in:" ); @btime ($A\$b)
;

  Our toy code:  123.194 ns (1 allocation: 144 bytes)
Julia built-in:  231.900 ns (1 allocation: 144 bytes)


### Gaussian Elimination

We now return to *full* systems, $A x = b$ where $A \in \mathbb{F}^{N \times N}, b \in \mathbb{F}^N$. Our goal is to reduce their solution to backsubstitution. This is achieved by performing Gaussian elimination (just like the pen+paper version) but remembering the operations:

#### Example

$$
\begin{pmatrix}
   3 &  -1 & 2 \\ 
   1 & 2 & 3 \\ 
   2 & -2 & -1 
\end{pmatrix}
\cdot 
\begin{pmatrix}
  x_1 \\ x_2 \\ x_3 
\end{pmatrix}
= 
\begin{pmatrix}
 12 \\ 11 \\ 2 
\end{pmatrix} 
$$

* row[2] $\leftarrow$ row[2] - $\frac{1}{3}$ row[1]
* row[3] $\leftarrow$ row[3] - $\frac{2}{3}$ row[1]

$$
\begin{pmatrix}
   3 &  -1 & 2 \\ 
   0 & 7/3 & 7/3 \\ 
   0 & -4/3 & -7/3 
\end{pmatrix}
\cdot 
\begin{pmatrix}
  x_1 \\ x_2 \\ x_3 
\end{pmatrix}
= 
\begin{pmatrix}
 12 \\ 7 \\ -6
\end{pmatrix} 
$$

* row[3] $\leftarrow$ row[3] + $\frac{4}{7}$ row[2]

$$
\begin{pmatrix}
   3 &  -1 & 2 \\ 
   0 & 7/3 & 7/3 \\ 
   0 & 0 & -1 
\end{pmatrix}
\cdot 
\begin{pmatrix}
  x_1 \\ x_2 \\ x_3 
\end{pmatrix}
= 
\begin{pmatrix}
 12 \\ 7 \\ -2
\end{pmatrix} 
$$
We have obtained an upper triangular system which we can now solve via backsubstitution!
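
To close the example, here is a sketch that solves the reduced triangular system and checks the result against the original equations. The variable names `U`, `c`, `A`, `b` are chosen here for illustration.

```julia
using LinearAlgebra

U = UpperTriangular([ 3.0 -1.0  2.0
                      0.0  7/3  7/3
                      0.0  0.0 -1.0 ])
c = [12.0, 7.0, -2.0]            # right-hand side after elimination
x = U \ c                        # backsubstitution

A = [ 3 -1 2; 1 2 3; 2 -2 -1 ]   # the original system
b = [12, 11, 2]
@show x
@show norm(A * x - b)
```

By hand: $x_3 = 2$, then $x_2 = (7 - \tfrac{7}{3} \cdot 2) / \tfrac{7}{3} = 1$, and finally $x_1 = (12 + x_2 - 2 x_3)/3 = 3$.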

The Gaussian elimination steps are *linear row operations*: each step applies the same linear map to every column of the matrix. They can therefore be represented as a matrix multiplication from the left:

In [21]:
A = [ 3 -1  2 
      1  2  3 
      2 -2 -1 ]

3×3 Matrix{Int64}:
 3  -1   2
 1   2   3
 2  -2  -1

In [22]:
L1i = [ 1     0 0
        -1/3  1 0 
        -2/3  0 1 ]

L1i * A

3×3 Matrix{Float64}:
 3.0  -1.0       2.0
 0.0   2.33333   2.33333
 0.0  -1.33333  -2.33333

In [24]:
L2i = [ 1  0  0 
        0  1  0 
        0 4/7 1]
U = L2i * L1i *  A 

3×3 Matrix{Float64}:
 3.0          -1.0       2.0
 0.0           2.33333   2.33333
 2.22045e-16   0.0      -1.0

We are now very close to an LU factorisation: 
$$
   U := L_2^{-1} L_1^{-1} A
$$
is upper triangular. 

Now observe that 
$$
\begin{pmatrix}
    1 &  &  \\ 
    -l_{21} & 1 &  \\ 
    -l_{31}  & 0 &  1 
\end{pmatrix}^{-1} 
= 
\begin{pmatrix}
    1 &  &  \\ 
    l_{21} & 1 &  \\ 
    l_{31}  & 0 &  1 
\end{pmatrix}
$$
and analogously for all $L_n^{-1}$-matrices. 

Take $L = L_1 L_2 = (L_2^{-1} L_1^{-1})^{-1}$ then we have 
$$ 
  L = 
\begin{pmatrix}
    1 &  &  \\ 
    l_{21} & 1 &  \\ 
    l_{31}  & 0 &  1 
\end{pmatrix}
\cdot 
\begin{pmatrix}
    1 &  &  \\ 
    0 & 1 &  \\ 
    0  & l_{32} &  1 
\end{pmatrix}
= 
\begin{pmatrix}
    1 &  &  \\ 
    l_{21} & 1 &  \\ 
    l_{31}  & l_{32} &  1 
\end{pmatrix}
$$

In [25]:
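Li =  L2i * L1i                      # combined elimination matrix, U = Li * A
# inverting an elimination matrix only flips the sign of its off-diagonal entries
L1 = - L1i + 2*I; L2 = -L2i + 2*I 
L = L1 * L2 

# check that we haven't made a mess! -> now, we have an LU factorisation
L * U - A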

3×3 Matrix{Float64}:
 0.0  0.0  0.0
 0.0  0.0  0.0
 0.0  0.0  0.0
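
The manual steps above can be collected into a loop. The function `lu_nopivot` below is a hypothetical helper, given as a sketch under the assumptions that $A$ is real and square and that no zero pivot is encountered; it is not how Julia's `lu` is implemented.

```julia
using LinearAlgebra

# Hypothetical helper: LU factorisation without pivoting.
# Assumes A is real and square and every pivot U[k,k] is non-zero.
function lu_nopivot(A::AbstractMatrix)
    N = size(A, 1)
    L = Matrix{Float64}(I, N, N)
    U = float.(A)                        # floating-point working copy of A
    for k = 1:N-1                        # eliminate column k below the diagonal
        for i = k+1:N
            L[i,k] = U[i,k] / U[k,k]     # multiplier l_ik
            # row[i] ← row[i] - l_ik * row[k], restricted to the columns right of k
            U[i,k+1:N] -= L[i,k] * U[k,k+1:N]
            U[i,k] = 0                   # this entry is eliminated exactly
        end
    end
    return L, U
end

A = [ 3 -1 2; 1 2 3; 2 -2 -1 ]
L, U = lu_nopivot(A)
@show L
@show norm(L * U - A)
```

The multipliers stored in `L` are the factors $1/3$, $2/3$ and $-4/7$ from the worked example. The three nested loops (the innermost one hidden in the row update) make the cost $O(N^3)$.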

## LU Factorisation 

**Theorem:** If $A \in \mathbb{F}^{N \times N}$ is invertible then there exist
* a permutation matrix $P$ 
* a lower triangular matrix $L$
* and an upper triangular matrix $U$
such that 
$$
    P A = LU
$$

*Proof:* see e.g. Süli & Mayers, §4.2. 

### Why the Permutation Matrix?

The first reason is to prevent division by zero. Consider
$$ 
   A = 
   \begin{pmatrix}
       0 & b \\ 
       a & c
   \end{pmatrix}
$$
Since the pivot $a_{11} = 0$, we cannot use the first row to eliminate $a_{21} = a$. Instead, swap the rows: 
$$ 
   P A = 
   \begin{pmatrix}
       0 & 1 \\ 
       1 & 0
   \end{pmatrix}
    \cdot 
   \begin{pmatrix}
       0 & b \\ 
       a & c
   \end{pmatrix}
    = 
   \begin{pmatrix}
       a & c \\ 
       0 & b
   \end{pmatrix}   
$$
This is called (partial) *pivoting*! The matrix $P$ is never stored, this is just a convenient way to write permutations in the language of linear algebra.  (See also *complete pivoting* which is almost never used!)

... but also for numerical stability. (cf. L05 on floating point arithmetic) E.g. if 
$$
   A = 
   \begin{pmatrix}
       \epsilon & b \\ 
       a & c
   \end{pmatrix}
$$
Then 
$$
    L = 
   \begin{pmatrix}
       1 & 0 \\ 
       a/\epsilon & 1
   \end{pmatrix},
    \qquad \qquad
    U = 
   \begin{pmatrix}
       \epsilon & b \\ 
        0 & c - ab/\epsilon
   \end{pmatrix}    
$$

We will say more precisely in L05 why $ab/\epsilon$ is problematic. For now, imagine you want $\epsilon \to 0$; something surely must go wrong in that limit? 

But with pivoting: (assuming $a \gg \epsilon$)
$$ 
    P = 
   \begin{pmatrix}
       0 & 1 \\ 
       1 & 0
    \end{pmatrix},
    \qquad \qquad 
    %
    L = 
   \begin{pmatrix}
       1 & 0 \\ 
       \epsilon/a & 1
    \end{pmatrix},    
    %
    \qquad \qquad 
    U = 
   \begin{pmatrix}
       a & c \\ 
       0 & b - \epsilon c/a
    \end{pmatrix}.
$$

Now we can take the limit $\epsilon \to 0$ and everything remains robust in that limit. 
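
The effect is easy to observe numerically. The sketch below uses the hypothetical values $a = b = c = 1$ and $\epsilon = 10^{-20}$, writes out the unpivoted elimination by hand, and compares it with the built-in solver.

```julia
using LinearAlgebra

ε = 1e-20
A = [ ε    1.0
      1.0  1.0 ]
b = [1.0, 2.0]                 # the exact solution is very close to [1, 1]

# LU without pivoting, written out by hand for the 2×2 case
l21 = A[2,1] / A[1,1]          # = 1/ε, huge
u22 = A[2,2] - l21 * A[1,2]    # 1 - 1/ε rounds to -1/ε: the entry 1 is lost
y2  = b[2] - l21 * b[1]        # forward substitution
x2  = y2 / u22                 # backsubstitution
x1  = (b[1] - A[1,2] * x2) / A[1,1]
x_nopivot = [x1, x2]

x_pivot = A \ b                # built-in solver, uses partial pivoting
@show x_nopivot
@show x_pivot
```

Without pivoting the first component comes out completely wrong, while the pivoted solve recovers a solution close to $[1, 1]$.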

### Sparse matrices

**Loose Definition:** A matrix $A \in \mathbb{F}^{N \times M}$ is called *sparse* if the number of non-zero entries is much smaller than $NM$. We will encounter sparse matrices when we cover PDEs, but see also [Sparse Arrays Documentation](https://docs.julialang.org/en/v1/stdlib/SparseArrays/)

For example, in our opening lecture we encountered the tri-diagonal matrices
$$
    A = 
    \begin{pmatrix} 
        a_{11} & a_{12} &   &  & & \\ 
        a_{21} & a_{22} & a_{23} & &  & \\ 
               & a_{32} & a_{33} & a_{34}  &   &  \\ 
               &        &  \ddots & \ddots & \ddots & 
    \end{pmatrix}
$$
In this case the $A = LU$ factorisation can be performed in $O(N)$ operations as long as no pivoting is required! (see exercises/workshops)

This is a special case of [*banded matrices*](https://en.wikipedia.org/wiki/Band_matrix); see also [BandedMatrices.jl](https://github.com/JuliaMatrices/BandedMatrices.jl).
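
For illustration, here is a sketch of an $O(N)$ tridiagonal solve without pivoting (often called the Thomas algorithm). The function `tridiag_solve` and its argument names are hypothetical; it assumes $N \geq 2$ and that no zero pivot occurs. The result is compared against the `Tridiagonal` type from `LinearAlgebra`.

```julia
using LinearAlgebra

# Hypothetical helper: tridiagonal solve without pivoting in O(N) operations.
# dl, d, du are the sub-, main and super-diagonal of the matrix.
function tridiag_solve(dl, d, du, b)
    N = length(d)
    c = zeros(N-1)                       # scaled super-diagonal
    y = zeros(N)                         # transformed right-hand side
    c[1] = du[1] / d[1]
    y[1] = b[1] / d[1]
    for n = 2:N
        piv = d[n] - dl[n-1] * c[n-1]    # pivot after eliminating the sub-diagonal entry
        if n < N
            c[n] = du[n] / piv
        end
        y[n] = (b[n] - dl[n-1] * y[n-1]) / piv
    end
    x = zeros(N)
    x[N] = y[N]
    for n = N-1:-1:1                     # backsubstitution with one off-diagonal entry per row
        x[n] = y[n] - c[n] * x[n+1]
    end
    return x
end

N = 100
dl, d, du = fill(-1.0, N-1), fill(2.0, N), fill(-1.0, N-1)
b = ones(N)
x = tridiag_solve(dl, d, du, b)
T = Tridiagonal(dl, d, du)               # structured matrix type from LinearAlgebra
@show norm(T * x - b)
@show norm(x - T \ b)
```

Each loop touches every row once and does a fixed amount of work per row, which is where the $O(N)$ cost comes from.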

### Conditioning 

We consider again a linear system 
$$ 
 A x = b 
$$
But let us now assume that the right-hand side $b$ has an error, i.e. the "real" system which we don't know is 
$$
 A \tilde{x} = \tilde{b}
$$
How close are $x$ and $\tilde{x}$?

First things first: What do we mean by "close"? There is no single correct definition, it is problem-dependent. But in the absence of a specific problem, in our general setting let us just assume that there is some norm $\|\cdot\|$ defined on $\mathbb{F}^N$. Then we can ask how large is the error $\| x - \tilde{x}\|$? 

$$\begin{aligned}
    A (x - \tilde x) &= b - \tilde b \\ 
    \| x -  \tilde x \| &= \| A^{-1} (b -  \tilde b) \| 
\end{aligned}$$

But this is too explicit; we want something more generic of the form 

$$
   \| x -  \tilde x \| \leq C \| b -  \tilde b \|  
$$

To obtain such a bound we use the matrix norm induced by the vector norm: 
$$ 
    \| A \| := \sup_{x \neq 0} \frac{ \| A x \|}{\|x\|} 
         = \sup_{\|x \| =  1} \|Ax\|
$$

Then we obtain 
$$
    \| x - \tilde x\| \leq \| A^{-1} \| \| b - \tilde b\|
$$ 

Now consider the relative error. Since $A \tilde x = \tilde b$, we have 
$$ 
    \| \tilde b \| \leq \|A \| \, \|\tilde x \| 
$$

$$
    \frac{\| x - \tilde x\|}{\|\tilde x \|} \leq \frac{\| A^{-1} \| \| b - \tilde b\|}{ \| \tilde x \| }
    \leq \| A \| \, \| A^{-1} \| \frac{\|b - \tilde b \|}{\| \tilde b\|}.
$$ 

**Definition:** $\kappa(A) := \|A\| \| A^{-1} \|$. Note that this definition is norm-dependent. If a norm is not explicitly specified then we normally understand $\|\cdot\| = \|\cdot\|_2$

**Proposition:** 
$$ 
        \frac{\| x - \tilde x\|}{\|\tilde x \|} \leq 
        \kappa(A) 
        \frac{\|b - \tilde b \|}{\| \tilde b\|}
$$

In Julia we can use the function `cond` to compute the condition number. 
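
A small sketch of the proposition in action, using a hypothetical nearly singular $2 \times 2$ matrix: a tiny change in $b$ produces a large change in $x$, but the change stays within the bound given by $\kappa(A)$.

```julia
using LinearAlgebra

A = [ 1.0  1.0
      1.0  1.0001 ]            # hypothetical nearly singular matrix
b      = [2.0, 2.0001]         # solution x = [1, 1]
b_pert = [2.0, 2.0002]         # slightly different right-hand side

x      = A \ b
x_pert = A \ b_pert
κ = cond(A)                    # 2-norm condition number by default

relerr_b = norm(b - b_pert) / norm(b_pert)
relerr_x = norm(x - x_pert) / norm(x_pert)
@show κ
@show relerr_b
@show relerr_x
@show relerr_x <= κ * relerr_b
```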

## Other Useful Factorisations

* LU Factorisation: $P A = L U$
* Cholesky factorisation: $A = L L^*$ for hermitian positive definite matrices  (Julia: `cholesky`)
* LDL factorisation: $A = L D L^*$ for hermitian matrices (Julia: `ldlt` for `SymTridiagonal` and sparse matrices; the closely related `bunchkaufman` for dense matrices)
* QR factorisation: $A = Q R$ with $Q$ orthogonal, $R$ upper triangular, for solving least-squares systems (Julia: `qr`)
* singular value decomposition: $A = U \Sigma V^*$  (Julia: `svd`) 
* eigen decomposition: $A = V \Lambda V^{-1}$  (Julia: `eigen`)

[... and many others ...](https://en.wikipedia.org/wiki/Matrix_decomposition)
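
A minimal sketch of how some of these factorisations are called in Julia, each followed by a reconstruction check. The matrix is a hypothetical symmetric positive definite example, chosen so that every factorisation in the sketch applies.

```julia
using LinearAlgebra

A = [ 4.0 1.0 0.0
      1.0 3.0 1.0
      0.0 1.0 2.0 ]            # hypothetical symmetric positive definite matrix

C = cholesky(A)                # A = L L*
@show C.L * C.L' ≈ A

F = qr(A)                      # A = Q R
@show Matrix(F.Q) * F.R ≈ A

S = svd(A)                     # A = U Σ V*
@show S.U * Diagonal(S.S) * S.Vt ≈ A

E = eigen(A)                   # A = V Λ V⁻¹
@show E.vectors * Diagonal(E.values) / E.vectors ≈ A
```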

### Quick Review of the Eigendecomposition

Let $A \in \mathbb{C}^{N \times N}$. We say that $(\lambda, v) \in \mathbb{C} \times \mathbb{C}^N$, $v \neq 0$, is an eigenpair (eigenvalue, eigenvector) of $A$ if 

$$
    A v = \lambda v
$$

The set of eigenvalues is called the spectrum of $A$ and denoted $\sigma(A)$.

* If $(\lambda_i, v_i)$ are eigenpairs and the $\lambda_i$ are distinct, then the $v_i$ are linearly independent.
* In particular if $A$ has $N$ distinct eigenvalues, $\lambda_1, \dots, \lambda_N$ then 
$$
    A = V \Lambda V^{-1}, \qquad \Lambda = {\rm diag}(\lambda_i), \quad V = [v_1 \dots v_N].
$$
This is called the eigendecomposition. 

In general, if there exists an invertible $V \in \mathbb{C}^{N \times N}$ and a diagonal $\Lambda$ such that $A = V \Lambda V^{-1}$ then we say that $A$ is diagonalisable.

In [26]:
A = rand(100, 100)
F = eigen(A)
# extract the first eigenpair
λ1 = F.values[1]
v1 = F.vectors[:,1]
norm(A * v1 - λ1 * v1)

1.2893684472100066e-14

In [27]:
# Or we can compute A * V - V * Λ
norm(A * F.vectors - F.vectors * Diagonal(F.values))

1.7397341002330037e-13

We will often encounter real symmetric matrices, which have a very nice property: 

**Proposition:** 
* If $A \in \mathbb{R}^{N \times N}$ is symmetric, or $A \in \mathbb{C}^{N \times N}$ is hermitian, then $A$ has real eigenvalues and an orthonormal basis of eigenvectors. 
* Equivalently, $A = Q \Lambda Q^*$ with $\Lambda = {\rm diag}(\lambda_i)$, $\lambda_i \in \mathbb{R}$ and $Q^* Q = Q Q^* = I$.
* Equivalently, there exists an orthonormal basis $\{q_n\}$ of $\mathbb{R}^N$ (respectively $\mathbb{C}^N$) and $\lambda_n \in \mathbb{R}$ such that 
 $$
     A = \sum_{n = 1}^N \lambda_n q_n q_n^*
 $$
 
 Proof: see separate notes
 
 Note: $Q^*$ = adjoint = transpose+conjugate

**Definition:** More generally we call a matrix $A \in \mathbb{C}^{N \times N}$ *normal* if $A^* A = A A^*$. 

This definition is particularly important for differential operators, many of which are normal. 

**Proposition:** If $A$ is normal then $A = Q \Lambda Q^*$ with $\Lambda$ complex diagonal and $Q^* Q = Q Q^* = I$; i.e., $A$ has a (complex) orthonormal eigenbasis (the columns of $Q$).

In [28]:
A = rand(10, 10)
A = A + A' 
F = eigen(A)
println("σ = ", round.(F.values, digits=2))
println("||QQ' - I|| = ", norm(F.vectors * F.vectors' - I))
println("||Q * Λ * Q' - A|| = ", norm(F.vectors * Diagonal(F.values) * F.vectors' - A))

σ = [-1.84, -0.87, -0.64, -0.46, 0.09, 0.36, 1.46, 1.54, 2.11, 10.61]
||QQ' - I|| = 7.879919857778448e-15
||Q * Λ * Q' - A|| = 1.2346458762832218e-14
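
The same check works for a normal matrix that is not symmetric. As a sketch, take the hypothetical example of a rotation by 90°: its eigenvalues are complex, but the eigenvectors still form an orthonormal basis.

```julia
using LinearAlgebra

A = [ 0.0 -1.0
      1.0  0.0 ]               # rotation by 90°: normal but not symmetric

@show A' * A == A * A'         # normality
F = eigen(A)
Q = F.vectors
@show F.values                 # purely imaginary eigenvalues
@show norm(Q' * Q - I)
@show norm(Q * Diagonal(F.values) * Q' - A)
```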


**Applications:**

Let $A \in \mathbb{C}^{N \times N}$ normal, $A = Q \Lambda Q^*$, then 
* $A^{-1} = Q \Lambda^{-1} Q^*$ 
* $\| A \| = \max_n |\lambda_n|$  (cf exercise)
* $\| A^{-1} \| = \max_n |\lambda_n^{-1}| = (\min_n |\lambda_n|)^{-1}$ 
* So for the condition number we get 
$$
    \kappa(A) = \frac{\max_n |\lambda_n|}{\min_n |\lambda_n|}
$$
(In practice a better method to compute $\kappa$ is the SVD! Indeed, for a normal matrix, $|\lambda_n|$ are the singular values!)

Recall that $\kappa = \|A\| \|A^{-1}\|$ and $\|A\| = \max_{\|x\|_2 = 1} \| A x \|_2$
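
As a sketch, the eigenvalue formula for $\kappa$ can be compared with the singular values and with `cond` on a small symmetric (hence normal) matrix. The matrix is a hypothetical example.

```julia
using LinearAlgebra

A = [ 2.0 -1.0  0.0
     -1.0  2.0 -1.0
      0.0 -1.0  2.0 ]          # symmetric, hence normal

λ = eigvals(A)
κ_eig = maximum(abs, λ) / minimum(abs, λ)
σ = svdvals(A)
κ_svd = maximum(σ) / minimum(σ)
@show κ_eig
@show κ_svd
@show cond(A)
```

For this matrix the eigenvalues are $2 - \sqrt{2}$, $2$ and $2 + \sqrt{2}$, so all three numbers should agree with $(2 + \sqrt{2})/(2 - \sqrt{2})$ up to rounding.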