# LU decomposition - pivoting

#### References

* Turing, A.M. (1948). "Rounding-Off Errors in Matrix Processes". The Quarterly Journal of Mechanics and Applied Mathematics. 1: 287–308. doi: [10.1093/qjmam/1.1.287](https://doi.org/10.1093/qjmam/1.1.287)

* Schwarzenberg-Czerny, A. (1995). "On matrix factorization and efficient least squares solution". Astronomy and Astrophysics Supplement. 110: 405–410. Bibliographic Code ADS: [1995A&AS..110..405S](http://adsabs.harvard.edu/abs/1995A%26AS..110..405S)

* Golub, G. H. and Van Loan, C. F. (2013). Matrix Computations, 4th edition. Johns Hopkins University Press. ISBN 978-1-4214-0794-4.

The previous notebook (`lu_decomp_intro`) presents an algorithm for computing the [LU decomposition](https://en.wikipedia.org/wiki/LU_decomposition). That algorithm, however, does not apply any pivoting scheme to prevent errors caused by numerical instability. This notebook shows how partial pivoting changes our previous algorithm for computing the LU decomposition.

Let's consider the solution of a linear system by using the Gaussian elimination with partial pivoting:

<a id='eq1'></a>
$$
\begin{align}
\mathbf{A}^{(0)} = \mathbf{A} & & & \mathbf{y}^{(0)} = \mathbf{y} \tag{1a} \\
\mathbf{A}^{(1)} = \mathbf{Q}^{(1)} \, \mathbf{P}^{(1)} \, \mathbf{A}^{(0)} & & &
\mathbf{y}^{(1)} = \mathbf{Q}^{(1)} \, \mathbf{P}^{(1)} \, \mathbf{y}^{(0)} \tag{1b} \\
\mathbf{A}^{(2)} = \mathbf{Q}^{(2)} \, \mathbf{P}^{(2)} \, \mathbf{A}^{(1)} & & &
\mathbf{y}^{(2)} = \mathbf{Q}^{(2)} \, \mathbf{P}^{(2)} \, \mathbf{y}^{(1)} \tag{1c} \\
\mathbf{A}^{(3)} = \mathbf{Q}^{(3)} \, \mathbf{P}^{(3)} \, \mathbf{A}^{(2)} & & &
\mathbf{y}^{(3)} = \mathbf{Q}^{(3)} \, \mathbf{P}^{(3)} \, \mathbf{y}^{(2)} \tag{1d}
\end{align}
$$

where $\mathbf{P}^{(k)}$, $k = 1, \dots, N-1$, is the permutation matrix used to interchange rows and perform the partial pivoting and $\mathbf{Q}^{(k)}$ is the $k$th Gauss transformation (see the notebooks `gauss-elim-outer` and `lu_decomp_intro`) given by:

<a id='eq2'></a>
$$
\mathbf{Q}^{(k)} = \left( \mathbf{I} - \mathbf{M}^{(k)} \right) \quad . \tag{2}
$$

At the end of this algorithm, the original matrix $\mathbf{A}$ ([equation 1a](#eq1)) is transformed into an upper triangular matrix $\mathbf{A}^{(N-1)}$ ([equation 1d](#eq1)), where $N = 4$.
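The recursion in equations 1a to 1d can be checked numerically. The sketch below is only an illustration, assuming NumPy and an arbitrary $4 \times 4$ test matrix. The helpers `permutation_matrix` and `gauss_transformation` are hypothetical names introduced here; they are not the functions defined in the other notebooks, and they build the matrices $\mathbf{P}^{(k)}$ and $\mathbf{Q}^{(k)}$ explicitly, which is useful for checking the algebra but not efficient.

```python
import numpy as np

def permutation_matrix(A, k):
    # Hypothetical helper: P^(k) swaps row k-1 with the row (at or below it)
    # holding the largest absolute value in column k-1.
    N = A.shape[0]
    pivot_row = (k - 1) + int(np.argmax(np.abs(A[k-1:, k-1])))
    P = np.identity(N)
    P[[k-1, pivot_row]] = P[[pivot_row, k-1]]
    return P

def gauss_transformation(A, k):
    # Hypothetical helper: Q^(k) = I - t^(k) u^(k-1)^T, with the Gauss
    # multipliers stored in the elements [k:] of t^(k).
    N = A.shape[0]
    t = np.zeros(N)
    t[k:] = A[k:, k-1] / A[k-1, k-1]
    u = np.zeros(N)
    u[k-1] = 1.0
    return np.identity(N) - np.outer(t, u)

# arbitrary test matrix (an assumption of this sketch)
A = np.array([[2., 1., 1., 0.],
              [4., 3., 3., 1.],
              [8., 7., 9., 5.],
              [6., 7., 9., 8.]])
N = A.shape[0]

Ak = A.copy()                              # A^(0)
for k in range(1, N):
    Pk = permutation_matrix(Ak, k)
    Qk = gauss_transformation(Pk @ Ak, k)  # multipliers come from the permuted matrix
    Ak = Qk @ Pk @ Ak                      # A^(k) = Q^(k) P^(k) A^(k-1)

# A^(N-1) should have only zeros below its main diagonal
is_upper_triangular = np.allclose(Ak, np.triu(Ak))
```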

Note that, according to the algorithm, the matrix $\mathbf{A}^{(3)}$ ([equation 1d](#eq1)) can be written as follows:

<a id='eq3'></a>
$$
\begin{split}
\mathbf{A}^{(3)} 
&= 
\mathbf{Q}^{(3)} \, \mathbf{P}^{(3)} \,
\mathbf{Q}^{(2)} \, \mathbf{P}^{(2)} \,
\mathbf{Q}^{(1)} \, \mathbf{P}^{(1)} \,
\mathbf{A} \\
&=
\underbrace{\left( \mathbf{I} - \mathbf{M}^{(3)} \right)}_{\tilde{\mathbf{Q}}^{(3)}}
\underbrace{\mathbf{P}^{(3)} \left( \mathbf{I} - \mathbf{M}^{(2)} \right) \mathbf{P}^{(-3)}}_{\tilde{\mathbf{Q}}^{(2)}} \,
\underbrace{\mathbf{P}^{(3)} \, \mathbf{P}^{(2)} \left( \mathbf{I} - \mathbf{M}^{(1)} \right) \mathbf{P}^{(-2)} \mathbf{P}^{(-3)}}_{\tilde{\mathbf{Q}}^{(1)}} \,
\underbrace{\mathbf{P}^{(3)} \, \mathbf{P}^{(2)} \, \mathbf{P}^{(1)}}_{\mathbf{P}} \,
\mathbf{A} \\
&= \tilde{\mathbf{Q}} \, \mathbf{P} \, \mathbf{A}
\end{split} \: , \tag{3}
$$

where 

<a id='eq4'></a>
$$
\mathbf{P}^{(-k)} \equiv \left( \mathbf{P}^{(k)} \right)^{-1} \quad ,\tag{4}
$$

<a id='eq5'></a>
$$
\mathbf{P} = \mathbf{P}^{(N-1)}\mathbf{P}^{(N-2)} \cdots  \mathbf{P}^{(1)} \quad , \tag{5}
$$

<a id='eq6'></a>
$$
\tilde{\mathbf{Q}} = \tilde{\mathbf{Q}}^{(N-1)} \tilde{\mathbf{Q}}^{(N-2)} \cdots \tilde{\mathbf{Q}}^{(1)} \quad , \tag{6}
$$

<a id='eq7'></a>
$$
\tilde{\mathbf{Q}}^{(k)} = 
\begin{cases}
\tilde{\mathbf{P}}^{(k)} \, \mathbf{Q}^{(k)} \, \tilde{\mathbf{P}}^{(-k)} \: &, \quad k < N - 1 \\
\mathbf{Q}^{(k)} \: &, \quad k = N - 1
\end{cases} \tag{7}
$$

and

<a id='eq8'></a>
$$
\tilde{\mathbf{P}}^{(k)} = \mathbf{P}^{(N-1)} \cdots \mathbf{P}^{(k+2)} \, \mathbf{P}^{(k+1)} \: . \tag{8}
$$

It is worth noting the difference between the matrices $\tilde{\mathbf{Q}}^{(k)}$ ([equation 7](#eq7)) and $\mathbf{Q}^{(k)}$ ([equation 2](#eq2)), as well as between the permutation matrices $\mathbf{P}^{(k)}$ and $\tilde{\mathbf{P}}^{(k)}$ ([equation 8](#eq8)).
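Equations 3 and 5 to 8 can be verified with a small numerical experiment. The following sketch assumes NumPy and an arbitrary $4 \times 4$ test matrix; all variable names are illustrative. It stores every $\mathbf{P}^{(k)}$ and $\mathbf{Q}^{(k)}$, assembles $\mathbf{P}$ and $\tilde{\mathbf{Q}}$ from them and checks that $\tilde{\mathbf{Q}} \, \mathbf{P} \, \mathbf{A}$ reproduces $\mathbf{A}^{(N-1)}$. The inverse of each $\tilde{\mathbf{P}}^{(k)}$ is taken as its transpose, which is valid because permutation matrices are orthogonal.

```python
import numpy as np

A = np.array([[2., 1., 1., 0.],
              [4., 3., 3., 1.],
              [8., 7., 9., 5.],
              [6., 7., 9., 8.]])
N = A.shape[0]
I = np.identity(N)

# Gaussian elimination with partial pivoting, keeping P^(k) and Q^(k)
P_list, Q_list = [], []
Ak = A.copy()
for k in range(1, N):
    r = (k - 1) + int(np.argmax(np.abs(Ak[k-1:, k-1])))
    Pk = I.copy()
    Pk[[k-1, r]] = Pk[[r, k-1]]
    Ak = Pk @ Ak
    t = np.zeros(N)
    t[k:] = Ak[k:, k-1] / Ak[k-1, k-1]
    Qk = I - np.outer(t, I[k-1])       # Q^(k) = I - t^(k) u^(k-1)^T
    Ak = Qk @ Ak
    P_list.append(Pk)
    Q_list.append(Qk)

# P = P^(N-1) ... P^(1)  (equation 5)
P = I.copy()
for Pk in P_list:
    P = Pk @ P

# Q_tilde = Q_tilde^(N-1) ... Q_tilde^(1)  (equations 6, 7 and 8)
Q_tilde = I.copy()
for k in range(1, N):
    P_tilde = I.copy()                 # stays equal to I when k = N-1
    for j in range(k + 1, N):          # P_tilde^(k) = P^(N-1) ... P^(k+1)
        P_tilde = P_list[j-1] @ P_tilde
    Qk_tilde = P_tilde @ Q_list[k-1] @ P_tilde.T
    Q_tilde = Qk_tilde @ Q_tilde

equation_3_holds = np.allclose(Q_tilde @ P @ A, Ak)
```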

The permuted matrix $\mathbf{P}\mathbf{A}$ ([equation 3](#eq3)) can be written as follows:

<a id='eq9'></a>
$$
\mathbf{P} \, \mathbf{A} = \tilde{\mathbf{Q}}^{-1} \, \mathbf{A}^{(N-1)} \: , \tag{9}
$$

where

<a id='eq10'></a>
$$
\tilde{\mathbf{Q}}^{-1} =  \tilde{\mathbf{Q}}^{(-1)} \, \tilde{\mathbf{Q}}^{(-2)} \cdots \tilde{\mathbf{Q}}^{(-N+1)} \tag{10}
$$

is the inverse of matrix $\tilde{\mathbf{Q}}$ ([equation 6](#eq6)),

<a id='eq11'></a>
$$
\tilde{\mathbf{Q}}^{(-k)} \equiv \left( \tilde{\mathbf{Q}}^{(k)} \right)^{-1} = 
\begin{cases}
\tilde{\mathbf{P}}^{(k)} \, \mathbf{Q}^{(-k)} \, \tilde{\mathbf{P}}^{(-k)} \: &, \quad k < N - 1 \\
\mathbf{Q}^{(-k)} \: &, \quad k = N - 1
\end{cases} \quad , \tag{11}
$$

is the inverse of matrix $\tilde{\mathbf{Q}}^{(k)}$ ([equation 7](#eq7)),

<a id='eq12'></a>
$$
\mathbf{Q}^{(-k)} = \left( \mathbf{I} + \mathbf{M}^{(k)} \right) \tag{12}
$$

is the inverse of the $k$th Gauss transformation $\mathbf{Q}^{(k)}$ ([equation 2](#eq2)) and

<a id='eq13'></a>
$$
\tilde{\mathbf{P}}^{(-k)} \equiv \left( \tilde{\mathbf{P}}^{(k)} \right)^{-1} \tag{13}
$$

is the inverse of the permutation matrix $\tilde{\mathbf{P}}^{(k)}$ ([equation 8](#eq8)).

Permutation matrices are orthogonal and, consequently, equations [4](#eq4), [5](#eq5) and [13](#eq13) can be rewritten as follows:

<a id='eq14'></a>
$$
\begin{align}
\mathbf{P}^{(-k)} &= \left( \mathbf{P}^{(k)} \right)^{\top} \quad , \tag{14a} \\
\mathbf{P}^{-1} &= \mathbf{P}^{\top} \quad , \tag{14b} \\
\tilde{\mathbf{P}}^{(-k)} &= \left( \tilde{\mathbf{P}}^{(k)} \right)^{\top} \quad . \tag{14c}
\end{align}
$$ 
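As a quick sanity check of equations 14a to 14c, the sketch below (assuming NumPy; the two row swaps are arbitrary choices made for this illustration) builds two elementary permutation matrices, multiplies them and confirms that the transpose acts as the inverse in both cases.

```python
import numpy as np

N = 4
I = np.identity(N)

# two elementary permutations (arbitrary choices for this sketch)
P1 = I[[2, 1, 0, 3]]   # swaps rows 0 and 2
P2 = I[[0, 3, 2, 1]]   # swaps rows 1 and 3
P = P2 @ P1            # a product of permutation matrices is also a permutation matrix

single_is_orthogonal = np.array_equal(P1.T @ P1, I)
product_is_orthogonal = np.array_equal(P.T @ P, I)
```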

Now, let's analyze the matrix $\tilde{\mathbf{Q}}^{(-k)}$ ([equation 11](#eq11)). 

<a id='eq15'></a>
$$
\begin{split}
\tilde{\mathbf{Q}}^{(-k)} 
&= \tilde{\mathbf{P}}^{(k)} \, \mathbf{Q}^{(-k)} \, \tilde{\mathbf{P}}^{(-k)} \\
&= \tilde{\mathbf{P}}^{(k)} \,
\left( \mathbf{I} + \mathbf{M}^{(k)} \right) \,
\tilde{\mathbf{P}}^{(-k)} \\
&= \mathbf{I} + \tilde{\mathbf{P}}^{(k)} \,
\, \mathbf{M}^{(k)} \,
\tilde{\mathbf{P}}^{(-k)} \\
&= \mathbf{I} + \tilde{\mathbf{P}}^{(k)} \,
\mathbf{t}^{(k)} \cdot \left( \mathbf{u}^{(k-1)} \right)^{\top} \,
\tilde{\mathbf{P}}^{(-k)}
\end{split} \quad . \tag{15}
$$

To understand [equation 15](#eq15), let's analyze the permutation matrices $\tilde{\mathbf{P}}^{(k)}$ ([equation 8](#eq8)) and $\tilde{\mathbf{P}}^{(-k)}$ (equations [13](#eq13) and [14c](#eq14)). When matrix $\tilde{\mathbf{P}}^{(k)}$ multiplies an arbitrary vector $\mathbf{v}$, it swaps only the elements $\left[ \, k \, : \, \right]$ of $\mathbf{v}$. This can be verified by remembering that $\mathbf{P}^{(k+1)}$ swaps only the elements $\left[ \, k \, : \, \right]$, $\mathbf{P}^{(k+2)}$ swaps only the elements $\left[ \, k+1 \, : \, \right]$ and so on. The same reasoning can be used to conclude that matrix $\tilde{\mathbf{P}}^{(-k)}$ also swaps only the elements $\left[ \, k \, : \, \right]$ of $\mathbf{v}$.

As a consequence, 

<a id='eq16a'></a>
$$
\tilde{\mathbf{t}}^{(k)} = \tilde{\mathbf{P}}^{(k)} \, \mathbf{t}^{(k)} \tag{16a}
$$ 

is a new vector obtained by permuting the Gauss multipliers, which are stored in the elements $\left[ \, k \, : \, \right]$ of $\mathbf{t}^{(k)}$.

By using [equation 14c](#eq14), we can see that 
$\left( \mathbf{u}^{(k-1)} \right)^{\top} \, \tilde{\mathbf{P}}^{(-k)} = \left( \tilde{\mathbf{P}}^{(k)} \, \mathbf{u}^{(k-1)} \right)^{\top}$ and

<a id='eq16b'></a>
$$
\tilde{\mathbf{P}}^{(k)} \, \mathbf{u}^{(k-1)} = \mathbf{u}^{(k-1)} \tag{16b}
$$ 

because, in this case, $\tilde{\mathbf{P}}^{(k)}$ swaps only null elements of $\mathbf{u}^{(k-1)}$, namely those in positions $\left[ \, k \, : \, \right]$.

Then, by using equations [16a](#eq16a) and [16b](#eq16b), matrix $\tilde{\mathbf{Q}}^{(-k)}$ ([equation 15](#eq15)) can be rewritten as follows:

<a id='eq17'></a>
$$
\tilde{\mathbf{Q}}^{(-k)} = \mathbf{I} + \tilde{\mathbf{t}}^{(k)} \cdot \left( \mathbf{u}^{(k-1)} \right)^{\top} \: . \tag{17}
$$
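Equation 17 can be checked for a concrete case. The sketch below assumes NumPy, $N = 4$ and $k = 1$; the multipliers in `t` and the two row swaps are made-up values used only for illustration. It compares the conjugated matrix $\tilde{\mathbf{P}}^{(k)} \, \mathbf{Q}^{(-k)} \, \tilde{\mathbf{P}}^{(-k)}$ with $\mathbf{I} + \tilde{\mathbf{t}}^{(k)} \cdot \left( \mathbf{u}^{(k-1)} \right)^{\top}$.

```python
import numpy as np

N = 4
k = 1
I = np.identity(N)

# P_tilde^(1) = P^(3) P^(2); both factors act only on the elements [1:]
P2 = I[[0, 3, 2, 1]]                  # swaps elements 1 and 3
P3 = I[[0, 1, 3, 2]]                  # swaps elements 2 and 3
P_tilde = P3 @ P2

t = np.array([0., 0.5, 0.25, 0.75])   # made-up multipliers in the elements [k:]
u = I[k-1]                            # u^(k-1): 1 at position k-1, 0 elsewhere

lhs = P_tilde @ (I + np.outer(t, u)) @ P_tilde.T   # P_tilde Q^(-k) P_tilde^T
t_tilde = P_tilde @ t                              # equation 16a
rhs = I + np.outer(t_tilde, u)                     # equation 17

u_is_unchanged = np.array_equal(P_tilde @ u, u)    # equation 16b
equation_17_holds = np.allclose(lhs, rhs)
```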

By using [equation 17](#eq17), we can verify that 

<a id='eq18'></a>
$$
\begin{split}
\tilde{\mathbf{Q}}^{(-k)} \cdot \tilde{\mathbf{Q}}^{(-k-1)}
&= \left[ \mathbf{I} + \tilde{\mathbf{t}}^{(k)} \cdot \left( \mathbf{u}^{(k-1)} \right)^{\top} \right]
   \left[ \mathbf{I} + \tilde{\mathbf{t}}^{(k+1)} \cdot \left( \mathbf{u}^{(k)} \right)^{\top} \right] \\
&= \mathbf{I} + \tilde{\mathbf{t}}^{(k)} \cdot \left( \mathbf{u}^{(k-1)} \right)^{\top} +
   \tilde{\mathbf{t}}^{(k+1)} \cdot \left( \mathbf{u}^{(k)} \right)^{\top} + 
   \tilde{\mathbf{t}}^{(k)} \cdot 
   \underbrace{ \left( \mathbf{u}^{(k-1)} \right)^{\top} \tilde{\mathbf{t}}^{(k+1)}}_{= \, 0}
   \cdot \left( \mathbf{u}^{(k)} \right)^{\top} \\
&= \mathbf{I} + \tilde{\mathbf{t}}^{(k)} \cdot \left( \mathbf{u}^{(k-1)} \right)^{\top} +
   \tilde{\mathbf{t}}^{(k+1)} \cdot \left( \mathbf{u}^{(k)} \right)^{\top} \quad .
\end{split} \tag{18}
$$

Finally, we use [equation 18](#eq18) to show that matrix $\tilde{\mathbf{Q}}^{-1}$ ([equation 10](#eq10)) is given by:

<a id='eq19'></a>
$$
\begin{split}
\tilde{\mathbf{Q}}^{-1}
&= \left[ \mathbf{I} + \tilde{\mathbf{t}}^{(1)} \cdot \left( \mathbf{u}^{(0)} \right)^{\top} \right]
   \left[ \mathbf{I} + \tilde{\mathbf{t}}^{(2)} \cdot \left( \mathbf{u}^{(1)} \right)^{\top} \right] \dots 
   \left[ \mathbf{I} + \tilde{\mathbf{t}}^{(N-1)} \cdot \left( \mathbf{u}^{(N-2)} \right)^{\top} \right] \\
&= \mathbf{I} + \tilde{\mathbf{t}}^{(1)} \cdot \left( \mathbf{u}^{(0)} \right)^{\top} +
   \tilde{\mathbf{t}}^{(2)} \cdot \left( \mathbf{u}^{(1)} \right)^{\top} + \cdots +
   \tilde{\mathbf{t}}^{(N-1)} \cdot \left( \mathbf{u}^{(N-2)} \right)^{\top}
\end{split} \quad . \tag{19}
$$

For our example with $N = 4$, 

$$
\tilde{\mathbf{Q}}^{-1} = 
\left[ \begin{array}{cccc}
1 & 0 & 0 & 0 \\
\tilde{t}_{0}^{(1)} & 1 & 0 & 0 \\
\tilde{t}_{1}^{(1)} & \tilde{t}_{1}^{(2)} & 1 & 0 \\
\tilde{t}_{2}^{(1)} & \tilde{t}_{2}^{(2)} & \tilde{t}_{2}^{(3)} & 1
\end{array} \right] \quad .
$$

Then, matrix $\tilde{\mathbf{Q}}^{-1}$ is a unit **Lower** triangular matrix $\tilde{\mathbf{L}}$ containing the permuted Gauss multipliers. As in the previous notebook, we define the **Upper** triangular matrix $\mathbf{A}^{(N-1)}$ as $\mathbf{U}$, so that the original matrix $\mathbf{A}$ ([equation 9](#eq9)) is factored as follows:

<a id='eq20'></a>
$$
\mathbf{P} \, \mathbf{A} = \tilde{\mathbf{L}} \, \mathbf{U} \: , \tag{20}
$$

where $\mathbf{P}$ ([equation 5](#eq5)) is the product of all permutation matrices.
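The factorization in equation 20 can be assembled directly from the derivation above. The following sketch assumes NumPy and an arbitrary $4 \times 4$ test matrix, with illustrative variable names. It keeps each $\mathbf{P}^{(k)}$ and $\mathbf{t}^{(k)}$ produced by the elimination, forms $\tilde{\mathbf{t}}^{(k)} = \tilde{\mathbf{P}}^{(k)} \, \mathbf{t}^{(k)}$ and sums the outer products of equation 19. It is meant to mirror the algebra; it is not the in-place algorithm requested in the exercises.

```python
import numpy as np

A = np.array([[2., 1., 1., 0.],
              [4., 3., 3., 1.],
              [8., 7., 9., 5.],
              [6., 7., 9., 8.]])
N = A.shape[0]
I = np.identity(N)

# elimination with partial pivoting, keeping P^(k) and t^(k)
P_list, t_list = [], []
U = A.copy()
for k in range(1, N):
    r = (k - 1) + int(np.argmax(np.abs(U[k-1:, k-1])))
    Pk = I.copy()
    Pk[[k-1, r]] = Pk[[r, k-1]]
    U = Pk @ U
    t = np.zeros(N)
    t[k:] = U[k:, k-1] / U[k-1, k-1]
    U = U - np.outer(t, U[k-1])        # same as (I - t u^T) @ U
    P_list.append(Pk)
    t_list.append(t)

# P = P^(N-1) ... P^(1)  (equation 5)
P = I.copy()
for Pk in P_list:
    P = Pk @ P

# L_tilde = I + sum_k t_tilde^(k) u^(k-1)^T  (equation 19)
L_tilde = I.copy()
for k in range(1, N):
    P_tilde = I.copy()
    for j in range(k + 1, N):          # P_tilde^(k) = P^(N-1) ... P^(k+1)
        P_tilde = P_list[j-1] @ P_tilde
    L_tilde += np.outer(P_tilde @ t_list[k-1], I[k-1])

is_unit_lower = (np.allclose(L_tilde, np.tril(L_tilde))
                 and np.allclose(np.diag(L_tilde), 1.0))
equation_20_holds = np.allclose(P @ A, L_tilde @ U)
```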

## Solving a linear system by using the LU decomposition with partial pivoting

Once the permutation matrix $\mathbf{P}$ and the triangular matrices $\tilde{\mathbf{L}}$ and $\mathbf{U}$ are calculated, we may use them to solve a linear system $\mathbf{A} \mathbf{x} = \mathbf{y}$. Let's first substitute the LU decomposition into the linear system:

<a id='eq21'></a>
$$
\begin{align}
\mathbf{A} \mathbf{x} &= \mathbf{y} \tag{21a} \\
\mathbf{P} \mathbf{A} \mathbf{x} &= \mathbf{P} \mathbf{y} \tag{21b} \\
\tilde{\mathbf{L}} \mathbf{U} \mathbf{x} &= \mathbf{P} \mathbf{y} \tag{21c}
\end{align}
$$

[Equation 21c](#eq21) shows that the original linear system can be represented by two triangular systems:

<a id='eq22'></a>
$$
\begin{align}
\tilde{\mathbf{L}}\mathbf{w} &= \mathbf{P} \mathbf{y} \tag{22a} \\
\mathbf{U}\mathbf{x} &= \mathbf{w} \tag{22b}
\end{align}
$$

where $\mathbf{w}$ is a dummy variable. Therefore, the linear system can be solved in two steps:

1. Solve the lower triangular system for $\mathbf{w}$ ([equation 22a](#eq22)) and
2. Use it to solve the upper triangular system for $\mathbf{x}$ ([equation 22b](#eq22)).
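The two steps above can be sketched as follows, assuming NumPy. To keep the example independent of any particular factorization routine, the factors `L_tilde`, `U` and the permutation list `P` are chosen by hand and the matrix `A` is built from them, so that `A[P]` equals `L_tilde @ U` by construction. The functions `forward_substitution` and `back_substitution` are hypothetical stand-ins for your own triangular solvers.

```python
import numpy as np

def forward_substitution(L, b):
    # solve L w = b for a lower triangular L (hypothetical helper)
    N = b.size
    w = np.zeros(N)
    for i in range(N):
        w[i] = (b[i] - L[i, :i] @ w[:i]) / L[i, i]
    return w

def back_substitution(U, w):
    # solve U x = w for an upper triangular U (hypothetical helper)
    N = w.size
    x = np.zeros(N)
    for i in range(N - 1, -1, -1):
        x[i] = (w[i] - U[i, i+1:] @ x[i+1:]) / U[i, i]
    return x

# hand-picked factors (assumptions of this sketch)
L_tilde = np.array([[ 1.00, 0.0, 0.0],
                    [ 0.50, 1.0, 0.0],
                    [-0.25, 0.5, 1.0]])
U = np.array([[4., 2.,  1.],
              [0., 3., -1.],
              [0., 0.,  2.]])
P = [2, 0, 1]                      # permutation stored as a list of row indices
I = np.identity(3)
A = I[P].T @ L_tilde @ U           # guarantees A[P] = L_tilde U

x_true = np.array([1., -2., 3.])
y = A @ x_true

w = forward_substitution(L_tilde, y[P])   # equation 22a: L_tilde w = P y
x = back_substitution(U, w)               # equation 22b: U x = w
```

Note that the permutation is applied to the data vector by simple indexing, `y[P]`, so the matrix $\mathbf{P}$ never needs to be formed explicitly.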

## Computing inverses by using the LU decomposition

As we have learned in the notebook `gauss-elim-pivoting`, each column of the inverse of a matrix can be computed by solving a linear system. It means that computing the inverse of an $N \times N$ matrix requires the solution of $N$ linear systems. Note that, by using Gaussian elimination (with or without partial pivoting), the resulting triangular matrix is the same for all $N$ systems, but the elimination steps must be applied to a different 'data vector' for each one of the $N$ columns of the inverse matrix. If we instead use the LU decomposition (with or without partial pivoting), we need to compute the matrices $\mathbf{L}$ and $\mathbf{U}$ (and the permutation $\mathbf{P}$, when pivoting is used) just once and then reuse them to solve two triangular systems for each column.
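A minimal sketch of this idea is shown below, assuming NumPy. The factors `L_tilde`, `U` and the permutation list `P` are hand-picked and `A` is built from them, so no factorization routine is required. For brevity, `np.linalg.solve` is used here as a stand-in for dedicated forward and back substitution routines; it does not exploit the triangular structure.

```python
import numpy as np

# hand-picked factors (assumptions of this sketch); A[P] = L_tilde U by construction
L_tilde = np.array([[ 1.00, 0.0, 0.0],
                    [ 0.50, 1.0, 0.0],
                    [-0.25, 0.5, 1.0]])
U = np.array([[4., 2.,  1.],
              [0., 3., -1.],
              [0., 0.,  2.]])
P = [2, 0, 1]
N = 3
I = np.identity(N)
A = I[P].T @ L_tilde @ U

# the same factors are reused for every column of the inverse
A_inv = np.zeros((N, N))
for j in range(N):
    e_j = I[:, j]                            # j-th column of the identity
    w = np.linalg.solve(L_tilde, e_j[P])     # stand-in for forward substitution
    A_inv[:, j] = np.linalg.solve(U, w)      # stand-in for back substitution

inverse_is_correct = np.allclose(A @ A_inv, I)
```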

### Exercises

Create a function `lu_decomp_pivoting` according to the following template:

```python
def lu_decomp_pivoting(A, check_input=True):
    '''
    Compute the LU decomposition for a matrix A by applying partial pivoting.
    
    Parameters
    ----------
    A : numpy array 2d
        Full square matrix of the linear system.
    check_input : boolean
        If True, verify if the input is valid. Default is True.

    Returns
    -------
    P : list of integers
        List containing all permutations.
    C : numpy array 2d
        Full square matrix containing the elements of L below the main diagonal and 
        the elements of U in the upper triangle (including the elements on the main diagonal).
    '''
    N = A.shape[0]
    if check_input is True:
        assert A.ndim == 2, 'A must be a matrix'
        assert A.shape[1] == N, 'A must be square'
    # create matrix C as a copy of A
    C = 
    # initial list
    P = list containing elements varying from 0 to N-1
    
    for k in range(1, N):
        # permutation step
        p, C = permut(C, k-1)
        # update P
        P = P[p]
        # assert the pivot is nonzero
        assert C[k-1,k-1] != 0., 'null pivot!'
        # calculate the Gauss multipliers and store them 
        # in the lower part of C
        C[k:,k-1] = 
        # zeroing of the elements in the (k-1)th column
        C[k:,k:] = 
    # return matrix C
    return P, C
```

The permutation function `permut` is the same one defined in the notebook `gauss-elim-pivoting`.

Note that the function `lu_decomp_pivoting` receives a square matrix $\mathbf{A}$ and returns the permutation matrix $\mathbf{P}$ (actually a list!) and a matrix $\mathbf{C}$ containing the triangular matrices $\tilde{\mathbf{L}}$ and $\mathbf{U}$. The elements of $\tilde{\mathbf{L}}$, except the unitary elements of its main diagonal, are stored below the main diagonal of $\mathbf{C}$. The elements of $\mathbf{U}$ are stored in the upper part of $\mathbf{C}$, including its main diagonal.
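The equivalence between a permutation stored as a list and the corresponding permutation matrix can be illustrated with the short sketch below. It assumes NumPy; the matrix `A` and the list `P` are arbitrary values chosen for this example. Indexing the rows of `A` with the list gives the same result as multiplying `A` by the matrix `I[P]`, and the transpose of `I[P]` undoes the permutation.

```python
import numpy as np

A = np.arange(16.).reshape(4, 4)   # arbitrary matrix for this sketch
P = [2, 3, 1, 0]                   # arbitrary permutation stored as a list
I = np.identity(4)

P_matrix = I[P]                    # permutation matrix built from the list

same_result = np.array_equal(A[P], P_matrix @ A)       # row indexing vs. matrix product
undone = np.array_equal(P_matrix.T @ A[P], A)          # the transpose is the inverse
```

Storing only the list avoids allocating an $N \times N$ matrix and replaces a matrix product with a cheap row indexing.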

Additionally, create a function called `lu_solve_pivoting` that receives the output of the function `lu_decomp_pivoting` and a vector `y` and returns the solution vector `x` of the linear system $\mathbf{A \, x} = \mathbf{y}$. Use the template below:

```python
def lu_solve_pivoting(P, C, y, check_input=True):
    '''
    Solve the linear system Ax = y for x by using the LU decomposition 
    of matrix A with partial pivoting.
    
    Parameters
    ----------
    P : list of integers
        List containing all permutations defined to compute the LU decomposition 
        with partial pivoting (output of function 'lu_decomp_pivoting').
    C : numpy array 2d
        Full square matrix containing the elements of L below the main diagonal and 
        the elements of U in the upper triangle (including the elements on the main diagonal).
        (Output of the function 'lu_decomp_pivoting').
    y : numpy array 1d
        Independent vector of the linear system.
    check_input : boolean
        If True, verify if the input is valid. Default is True.
    Returns
    -------
    x : numpy array 1d
        Solution of the linear system Ax=y.
    '''
    N = C.shape[0]
    if check_input is True:
        assert C.ndim == 2, 'C must be a matrix'
        assert C.shape[1] == N, 'C must be square'
        assert isinstance(P, list), 'P must be a list'
        assert len(P) == N, 'P must have N elements'
        assert y.ndim == 1, 'y must be a vector'
        assert y.size == N, 'C columns must be equal to y size'
    
    # create your code here
    
    return x
```

The function `lu_solve_pivoting` must use your own functions for solving triangular systems.

Finally, create at least **three tests**:

* Define a square matrix `A`, compute a matrix `C` by using the function `lu_decomp_pivoting`, use `C` to create the triangular matrices `L_tilde` and `U` and verify that the condition `A[P] = LU` is satisfied, where `L` stands for `L_tilde` (see the example below).

* Compare the matrices `I[P]`, `L`, `U` obtained by using your function `lu_decomp_pivoting` (where `I` is the identity) with the matrices `P`, `L`, `U` obtained by using the function [`scipy.linalg.lu`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.lu.html). Be careful! The matrix `P` calculated by the routine `scipy.linalg.lu` is equal to the transpose of the matrix `I[P]` calculated by using `lu_decomp_pivoting`.

* Define a matrix `A0` and a vector `x0` and use them to compute the vector `y0 = A0 x0`. Then, use the functions `lu_decomp_pivoting` and `lu_solve_pivoting` to compute a vector `x1` by solving the linear system `A0 x1 = y0`. Finally, compare the computed vector `x1` with the expected vector `x0`.

#### Testing the function `lu_decomp_pivoting`