# Numerical Methods 1
### [Gerard Gorman](http://www.imperial.ac.uk/people/g.gorman), [Matthew Piggott](http://www.imperial.ac.uk/people/m.d.piggott), [Christian Jacobs](http://www.christianjacobs.uk)

# Lecture ?: Numerical Linear Algebra II

## Learning objectives:

* More on direct methods: LU decomposition
* Direct vs iterative/indirect methods
* Ill-conditioned matrices
* Example iterative method

## Partial pivoting

At the end of last week we commented that a problem could occur where the $A_{kk}$ we divide through by in the Gaussian elimination and/or back substitution algorithms might be (near) zero.

Using Gaussian elimination as an example, let's again consider the algorithm mid-way through working on an arbitrary matrix system, i.e. assume that the first $k$ rows have already been transformed into upper-triangular form, while the equations/rows below are not yet in this form:

$$
\left[
  \begin{array}{rrrrrrr|r}
    A_{11} & A_{12} & A_{13} & \cdots & A_{1k}  & \cdots & A_{1n} & b_1 \\
    0      & A_{22} & A_{23} & \cdots & A_{2k} & \cdots & A_{2n} & b_2 \\
    0      & 0      & A_{33} & \cdots & A_{3k}  & \cdots & A_{3n} & b_3 \\
    \vdots & \vdots & \vdots & \ddots & \vdots  & \ddots & \vdots & \vdots \\
\hdashline    
    0      & 0      & 0      & \cdots & A_{kk}  & \cdots & A_{kn} & b_k \\    
    \vdots & \vdots & \vdots & \ddots & \vdots  & \ddots & \vdots & \vdots \\
    0      & 0      & 0      & \cdots & A_{nk}  & \cdots & A_{nn} & b_n \\
\end{array}
\right]
$$

Note we have drawn the horizontal dashed line one row higher, as we are not going to blindly assume that it is wise to take the current row $k$ as the pivot row, and $A_{kk}$ as the so-called pivot element.

*Partial pivoting* selects as the best pivot row the one whose $A_{ik}$ value (for $i\ge k$) is largest in magnitude (relative to the other entries in its own row $i$), and then swaps this row with the current row $k$.

To generalise our codes above we would simply need to search for this row and perform the row swap operation. We'll leave this as a homework exercise for those sufficiently keen!
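As a possible starting point for that exercise, the search and swap steps might be sketched as follows. This is only a minimal sketch: the function names `scaled_pivot_row` and `swap_rows` are hypothetical (they are not from last week's code), and we assume `A` and `b` are NumPy floating point arrays that may be modified in place.

```python
import numpy as np

def scaled_pivot_row(A, k):
    """Return the index i >= k whose |A[i, k]| is largest relative to
    the largest magnitude entry in (the remaining part of) row i."""
    rows = A[k:, k:]
    scale = np.max(np.abs(rows), axis=1)
    # Assumption: an all-zero row is given a ratio of zero rather than dividing by zero.
    ratio = np.zeros(len(scale))
    nonzero = scale > 0
    ratio[nonzero] = np.abs(rows[nonzero, 0]) / scale[nonzero]
    return k + int(np.argmax(ratio))

def swap_rows(A, b, k, p):
    """Swap rows k and p of the system (A and b) in place."""
    if p != k:
        A[[k, p], :] = A[[p, k], :]
        b[[k, p]] = b[[p, k]]

# Example: A[0, 0] is zero, so row 0 cannot be used as the pivot row.
A = np.array([[0., 2., 1.],
              [1., 1., 1.],
              [2., 1., 4.]])
b = np.array([3., 3., 7.])
p = scaled_pivot_row(A, 0)
swap_rows(A, b, 0, p)
print(p)
print(A)
```

The search and swap would be carried out at the start of each step $k$ of the elimination, before the rows below the pivot row are updated.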

## Direct vs iterative methods

Two types/families of methods exist to solve matrix systems.  These are termed *direct* methods and *iterative* (or *indirect*) methods.

Direct methods perform operations on the linear equations (the matrix system), e.g. the substitution of one equation into another, which we performed last week for the example $2\times 2$ system you considered in MM1. This (and the subsequent Gaussian elimination algorithm) transformed the equations making up the linear system into equivalent ones, with the aim of eliminating unknowns from some of the equations and hence allowing for easy solution through back (or forward) substitution.

In MM1 you learnt Cramer's rule, which gives an explicit formula for the inverse of a matrix, or for the solution of a linear matrix system.  It was pointed out that the computational cost (in terms of arithmetic operations required; also termed complexity) scaled like $(n+1)!$, whereas Gaussian elimination (which is basically the substitution method done above) scaled like $n^3$.  For large $n$ Gaussian elimination will clearly be more efficient - you considered the case where $n=100$ in MM1, for example. Here $n$ refers to the number of unknowns or equations, sometimes termed the *degrees of freedom* of the problem.
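To get a feel for the difference between these two scalings, we can evaluate both for $n=100$. This is only a rough illustration: it compares the scaling terms $(n+1)!$ and $n^3$ themselves, ignoring any constant factors in the true operation counts.

```python
import math

n = 100
cramer_ops = math.factorial(n + 1)  # scales like (n+1)!
gauss_ops = n**3                    # scales like n^3

# Python integers have arbitrary precision, so the factorial is exact;
# we report only its order of magnitude via its number of digits.
print(f"(n+1)! has {len(str(cramer_ops))} digits")
print(f"n^3 = {gauss_ops}")
```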

However, as pointed out above, $n$ could be in the billions for hard-core applications such as weather forecasting. In this case the $n^3$ operations required by a direct algorithm such as Gaussian elimination are also prohibitive.

In order to reduce this cost, ideally to a level that is (close to) linear in $n$, *iterative* algorithms were devised. 

These start with a guess at the solution ($\pmb{x}_0$) and calculate the residual vector ($A\pmb{x}_0 - \pmb{b}$) and its norm, a scalar measure of a vector's size (e.g. the vector *2-norm* is just the square root of the sum of the squares of the components). The residual norm will not be zero unless you were very lucky with your initial guess, so the method then *iteratively* seeks to improve on this solution to drive down the residual norm.  The iteration stops once the residual norm falls below some small (non-zero) tolerance, yielding an approximation to the solution rather than the exact solution (up to round-off) we would obtain with direct methods.  The stopping criterion therefore needs to be thought about carefully, e.g. depending on how accurate a solution $\pmb{x}$ we require.
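The general pattern described above (guess, compute the residual norm, improve, repeat) can be sketched in a few lines. The text has not yet named a particular iterative method, so as an assumption we use the Jacobi method here purely for illustration; the function name `jacobi`, the default tolerance and the iteration limit are all our own choices. Note also that Jacobi is only guaranteed to converge for certain matrices (e.g. strictly diagonally dominant ones, as in the example).

```python
import numpy as np

def jacobi(A, b, x0, tol=1e-10, max_iter=1000):
    """Jacobi iteration, stopping when the 2-norm of the residual
    A x - b drops below tol. Returns (x, number of iterations used)."""
    D = np.diag(A)          # diagonal entries of A (as a vector)
    R = A - np.diag(D)      # off-diagonal part of A
    x = np.array(x0, dtype=float)
    for iteration in range(max_iter):
        residual_norm = np.linalg.norm(A @ x - b)  # vector 2-norm
        if residual_norm < tol:
            return x, iteration
        x = (b - R @ x) / D  # improve the current guess
    raise RuntimeError("Jacobi iteration did not reach the tolerance")

# Example system with known solution x = (1, 2).
A = np.array([[4., 1.],
              [2., 5.]])
b = np.array([6., 12.])
x, iterations = jacobi(A, b, x0=np.zeros(2))
print(x, iterations)
```

Loosening `tol` reduces the number of iterations at the price of a less accurate solution, which is exactly the trade-off the stopping criterion controls.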

Last week we considered Gaussian elimination (and back substitution) as examples of direct solution methods. We'll look briefly at iterative methods this week.

## Ill-conditioning


## LU decomposition - theory

We will consider one more direct solution method: LU decomposition or factorisation.

Last week we implemented Gaussian elimination to solve the matrix system with one RHS vector $\pmb{b}$.  

We often have multiple RHS vectors for the same matrix $A$. We could call the same code multiple times to solve all of the corresponding linear systems, but as the elimination algorithm performs the same operations on (the same) $A$ each time, we would be wasting time repeating exactly the same operations.

We could easily generalise the algorithm to include multiple RHS column vectors in the augmented system, perform the same row operations (now only once) to transform the matrix to upper-triangular form, and then perform back substitution on each of the transformed RHS vectors from the augmented system.
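As an illustration of solving for several RHS vectors at once, the sketch below stacks two RHS vectors as the columns of a matrix `B`. Note that it uses NumPy's built-in `numpy.linalg.solve` rather than last week's code; the variable names and the example system are our own.

```python
import numpy as np

A = np.array([[2., 1.],
              [4., 5.]])

# Two RHS vectors stored as the columns of B.
b1 = np.array([4., 14.])
b2 = np.array([5., 7.])
B = np.column_stack((b1, b2))

# Each column of X is the solution for the corresponding column of B.
X = np.linalg.solve(A, B)
print(X)
```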

However, it is often the case that each RHS vector depends on the solutions obtained from some or all of the earlier RHS vectors, in which case this generalisation does not work. An example you will see in NM2 is time-stepping the solution to a differential equation, where the RHS vector depends on the solution at the previous time level.

To deal with this situation efficiently we *decompose* or *factorise* the matrix $A$ in such a way that it is cheap to compute a new solution vector for any given RHS.  This decomposition involves a lower- and an upper-triangular matrix, hence the name LU decomposition. These matrices essentially encode the steps conducted in Gaussian elimination, so we don't have to explicitly conduct all of the operations again and again.

Mathematically, we assume that we have constructed a lower-triangular matrix ($L$ - all entries above the diagonal are zero) and an upper-triangular matrix ($U$ - all entries below the diagonal are zero) such that we can write

$$ A = LU $$

In this case our matrix system becomes

$$ A\pmb{x} = \pmb{b} \iff (LU)\pmb{x} = L(U\pmb{x}) = \pmb{b} $$

Notice that the matrix-vector product $U\pmb{x}$ is itself a vector; let's call it $\pmb{c}$ for the time being (i.e. $\pmb{c}=U\pmb{x}$).

The above system then reads 

$$ L\pmb{c} = \pmb{b} $$

where $L$ is a known matrix and $\pmb{c}$ is the unknown vector.  As $L$ is in lower-triangular form we can use forward substitution (generalising the back substitution algorithm/code we developed last week) to very easily find $\pmb{c}$ in relatively few operations (we don't need to call the entire Gaussian elimination algorithm).

Once we know $\pmb{c}$ we then solve the second linear system 

$$ U\pmb{x} = \pmb{c} $$

where now we can use the fact that $U$ is upper-triangular to use our back substitution algorithm again very efficiently to give the solution $\pmb{x}$ we require.

So for a given $\pmb{b}$ we can find the corresponding $\pmb{x}$ very efficiently, and we can therefore do this repeatedly as each new $\pmb{b}$ is given to us.
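The two-stage solve described above can be sketched as follows, assuming that $L$ and $U$ are already available (here we simply write down a small example pair by hand). The function names are our own and are not necessarily those used in last week's code.

```python
import numpy as np

def forward_substitution(L, b):
    """Solve L c = b for lower-triangular L, working from the top row down."""
    n = len(b)
    c = np.zeros(n)
    for i in range(n):
        c[i] = (b[i] - L[i, :i] @ c[:i]) / L[i, i]
    return c

def back_substitution(U, c):
    """Solve U x = c for upper-triangular U, working from the bottom row up."""
    n = len(c)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (c[i] - U[i, i+1:] @ x[i+1:]) / U[i, i]
    return x

def solve_with_lu(L, U, b):
    """Solve (LU) x = b in two stages: L c = b, then U x = c."""
    c = forward_substitution(L, b)
    return back_substitution(U, c)

# Hand-picked example factors; their product is A = [[2, 1], [4, 5]].
L = np.array([[1., 0.],
              [2., 1.]])
U = np.array([[2., 1.],
              [0., 3.]])
b = np.array([4., 14.])
x = solve_with_lu(L, U, b)
print(x)  # [1. 2.]
```

Each new $\pmb{b}$ then only needs another call to `solve_with_lu` with the same $L$ and $U$; no elimination is repeated.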

Our challenge is therefore to perform the decomposition $A=LU$.