# Gaussian elimination - pivoting

#### References

* Golub, G. H. and C. F. Van Loan, (2013), Matrix computations, 4th edition, Johns Hopkins University Press, ISBN 978-1-4214-0794-4.

[Gaussian elimination](https://en.wikipedia.org/wiki/Gaussian_elimination) iteratively transforms an unstructured linear system $\mathbf{A}\mathbf{x} = \mathbf{y}$ into an equivalent triangular system $\mathbf{B}\mathbf{x} = \mathbf{z}$ having the same solution $\mathbf{x}$. The file `Content/Gaussian_elimination.pdf` shows an example of Gaussian elimination for $N = 4$. As pointed out in the previous class, the **pivots** needed for computing the **Gauss multipliers** must be nonzero. Small pivots are also a problem: dividing by them can produce arbitrarily poor results, even for well-conditioned problems, which shows that Gaussian elimination without pivoting may be unstable, depending on the elements of $\mathbf{A}$ (Golub and Van Loan, 2013, p. 125).

To avoid dividing by small pivots, we may swap rows at each iteration so that the pivot is the element with the largest absolute value among the candidates in its column. This strategy is known as partial pivoting. The file `Content/Gaussian_elimination_pivoting.pdf` shows an example of this process for $N = 4$.
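The effect of a small pivot can be seen in a $2 \times 2$ system. The sketch below is only illustrative: the matrix `A_small`, the vector `y_small` and the helper `solve_2x2` are hypothetical names that do not appear elsewhere in this class, and the values were chosen so that the exact solution is very close to $[1, 1]$.

```python
import numpy as np

def solve_2x2(A, y, pivot):
    """Solve a 2x2 system by Gaussian elimination, optionally swapping rows first."""
    A = A.astype(float).copy()
    y = y.astype(float).copy()
    # partial pivoting: put the largest |element| of column 0 in the first row
    if pivot and abs(A[1, 0]) > abs(A[0, 0]):
        A = A[[1, 0]]
        y = y[[1, 0]]
    m = A[1, 0] / A[0, 0]    # Gauss multiplier
    A[1] -= m * A[0]         # zero the element below the pivot
    y[1] -= m * y[0]
    # back substitution
    x1 = y[1] / A[1, 1]
    x0 = (y[0] - A[0, 1] * x1) / A[0, 0]
    return np.array([x0, x1])

# hypothetical system with a tiny pivot in position [0, 0]
A_small = np.array([[1e-20, 1.],
                    [1.,    1.]])
y_small = np.array([1., 2.])

x_no_pivot = solve_2x2(A_small, y_small, pivot=False)
x_pivot = solve_2x2(A_small, y_small, pivot=True)

print(x_no_pivot)
print(x_pivot)
```

Without the row swap, the multiplier is of the order of $10^{20}$ and the information in the second row is lost to rounding, so the first component of the computed solution is wrong. With the row swap, the multiplier is of the order of $10^{-20}$ and the computed solution agrees with the expected one.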

Let's recall the linear system $\mathbf{A}\mathbf{x} = \mathbf{y}$ presented in the previous class:

In [1]:
import numpy as np

In [2]:
N = 3

In [3]:
A = np.array([[ 2., 1.,-1.],
              [-3.,-1., 2.],
              [-2., 1., 2.]])

In [4]:
y = np.array([8., -11., -3.])

The solution of this system is given by:

In [5]:
x = np.linalg.solve(A,y)

In [6]:
print(x)

[ 2.  3. -1.]


This system can be solved by Gaussian elimination as follows:

In [7]:
I = np.identity(N)

**Iteration k = 1:**

Notice that, in this case, the pivot of the next Gauss transform is `2`, which is not a small number. However, let's apply partial pivoting anyway to illustrate the procedure and show that it does not change the final result.

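In this case, we interchange the first and second rows of `A` and the first and second elements of `y`, because `-3` is the element with the largest absolute value in the first column. This is equivalent to premultiplying `A` and `y` by the following matrix: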

In [8]:
P1 = np.identity(N)[[1,0,2]]

print(P1)

[[0. 1. 0.]
 [1. 0. 0.]
 [0. 0. 1.]]


In [9]:
print(A)

[[ 2.  1. -1.]
 [-3. -1.  2.]
 [-2.  1.  2.]]


In [10]:
print(np.dot(P1, A))

[[-3. -1.  2.]
 [ 2.  1. -1.]
 [-2.  1.  2.]]


In [11]:
print(y)

[  8. -11.  -3.]


In [12]:
print(np.dot(P1, y))

[-11.   8.  -3.]


Notice that this row permutation changed the pivot from `2` to `-3`.
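The row to be swapped does not need to be chosen by eye, and the permutation matrix does not need to be multiplied explicitly. The sketch below, which assumes the same matrix `A` defined above, locates the pivot row with `np.argmax` and shows that indexing the rows with a permutation list gives the same result as the product by `P1`. The names `imax`, `p` and `A_swapped` are used only for this illustration.

```python
import numpy as np

A = np.array([[ 2.,  1., -1.],
              [-3., -1.,  2.],
              [-2.,  1.,  2.]])

# index of the row with the largest absolute value in column 0
imax = np.argmax(np.abs(A[:, 0]))

# permutation list that swaps rows 0 and imax
p = list(range(A.shape[0]))
p[0], p[imax] = p[imax], p[0]

# permutation matrix obtained by reordering the rows of the identity
P1 = np.identity(A.shape[0])[p]

# reordering the rows of A directly avoids the matrix product
A_swapped = A[p]

print(p)
print(np.array_equal(np.dot(P1, A), A_swapped))
```

Indexing with `p` is usually preferable in an implementation because it avoids storing and multiplying an $N \times N$ matrix.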

In [13]:
u0 = np.array([1., 0., 0.])

In [14]:
t1 = np.array([0., 
               np.dot(P1, A)[1][0]/np.dot(P1, A)[0][0], 
               np.dot(P1, A)[2][0]/np.dot(P1, A)[0][0]])

In [15]:
print(t1)

[ 0.         -0.66666667  0.66666667]


In [16]:
A1 = (I - np.outer(t1, u0)).dot(np.dot(P1,A))

In [17]:
print(A)

[[ 2.  1. -1.]
 [-3. -1.  2.]
 [-2.  1.  2.]]


In [18]:
print(A1)

[[-3.         -1.          2.        ]
 [ 0.          0.33333333  0.33333333]
 [ 0.          1.66666667  0.66666667]]


In [19]:
y1 = (I - np.outer(t1, u0)).dot(np.dot(P1,y))

In [20]:
print(y1)

[-11.           0.66666667   4.33333333]


**Iteration k = 2:**

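Now, we interchange the second and third rows of `A1` and the second and third elements of `y1`, because `1.66666667` is larger in absolute value than `0.33333333`. This is equivalent to premultiplying `A1` and `y1` by the permutation matrix `P2` defined below: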

In [21]:
print(A1)

[[-3.         -1.          2.        ]
 [ 0.          0.33333333  0.33333333]
 [ 0.          1.66666667  0.66666667]]


In [22]:
P2 = np.identity(N)[[0,2,1]]

print(P2)

[[1. 0. 0.]
 [0. 0. 1.]
 [0. 1. 0.]]


In [23]:
print(np.dot(P2, A1))

[[-3.         -1.          2.        ]
 [ 0.          1.66666667  0.66666667]
 [ 0.          0.33333333  0.33333333]]


In [24]:
print(np.dot(P2, y1))

[-11.           4.33333333   0.66666667]


In [25]:
u1 = np.array([0., 1., 0.])

In [26]:
t2 = np.array([0., 0., np.dot(P2, A1)[2][1]/np.dot(P2, A1)[1][1]])

In [27]:
print(t2)

[0.  0.  0.2]


In [28]:
B = (I - np.outer(t2, u1)).dot(np.dot(P2, A1))

In [29]:
print(B)

[[-3.         -1.          2.        ]
 [ 0.          1.66666667  0.66666667]
 [ 0.          0.          0.2       ]]


In [30]:
z = (I - np.outer(t2, u1)).dot(np.dot(P2, y1))

In [31]:
print(z)

[-11.           4.33333333  -0.2       ]


Solution of this equivalent triangular system:

In [32]:
print(np.linalg.solve(B,z))

[ 2.  3. -1.]


Solution of the original system:

In [33]:
print(np.linalg.solve(A,y))

[ 2.  3. -1.]


## Algorithm implementation

Our equivalent triangular system can be iteratively calculated according to the following algorithm:

<a id='eq1'></a>
$$
\begin{align}
\mathbf{A}^{(0)} = \mathbf{A} & & \mathbf{y}^{(0)} = \mathbf{y} \tag{1a} \\
\mathbf{A}^{(1)} = \left(\mathbf{I} - \mathbf{M}^{(1)}\right) \mathbf{P}^{(1)}\mathbf{A}^{(0)} & &
\mathbf{y}^{(1)} = \left(\mathbf{I} - \mathbf{M}^{(1)}\right) \mathbf{P}^{(1)}\mathbf{y}^{(0)} \tag{1b} \\
\mathbf{A}^{(2)} = \left(\mathbf{I} - \mathbf{M}^{(2)}\right) \mathbf{P}^{(2)}\mathbf{A}^{(1)} & &
\mathbf{y}^{(2)} = \left(\mathbf{I} - \mathbf{M}^{(2)}\right) \mathbf{P}^{(2)}\mathbf{y}^{(1)} \tag{1c}
\end{align}
$$

where $\mathbf{P}^{(k)}$, $k = 1, \dots, N-1$, is the permutation matrix used to interchange the rows and perform the partial pivoting.

Notice that a matrix $\mathbf{P}^{(k)}$, $k = 1, \dots, N-1$, may interchange only rows in the set $\left[ \, k - 1 \, : \, \right]$. For example, while the matrix $\mathbf{P}^{(1)}$ can interchange all the rows forming the matrix $\mathbf{A}^{(0)}$ and vector $\mathbf{y}^{(0)}$, the matrix $\mathbf{P}^{(2)}$ can interchange only the set of rows $\left[ \, 1 \, : \, \right]$ (from the second on) forming the matrix $\mathbf{A}^{(1)}$ and vector $\mathbf{y}^{(1)}$.
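The recursion in equations 1a-c can be checked numerically with explicit matrices. The sketch below is not an efficient implementation: it builds $\mathbf{P}^{(k)}$ and $\mathbf{M}^{(k)} = \mathbf{t}^{(k)} \left(\mathbf{u}^{(k-1)}\right)^{\top}$ at each iteration only to mirror the equations. The function name `pivoted_step` is hypothetical.

```python
import numpy as np

A = np.array([[ 2.,  1., -1.],
              [-3., -1.,  2.],
              [-2.,  1.,  2.]])
y = np.array([8., -11., -3.])
N = A.shape[0]
I = np.identity(N)

def pivoted_step(Ak, yk, k):
    """Apply (I - M(k)) P(k) to a matrix and a vector (equations 1b-c)."""
    # P(k) searches for the pivot only in rows [k-1:] of column k-1
    imax = (k - 1) + np.argmax(np.abs(Ak[k-1:, k-1]))
    p = list(range(N))
    p[k-1], p[imax] = p[imax], p[k-1]
    P = I[p]
    PA = np.dot(P, Ak)
    # Gauss vector t(k): zeros in the first k elements, multipliers below
    t = np.zeros(N)
    t[k:] = PA[k:, k-1] / PA[k-1, k-1]
    M = np.outer(t, I[k-1])
    return np.dot(I - M, PA), np.dot(I - M, np.dot(P, yk))

Ak, yk = A, y                      # equation 1a
for k in range(1, N):              # k = 1, ..., N-1
    Ak, yk = pivoted_step(Ak, yk, k)

print(Ak)
print(yk)
```

The final `Ak` and `yk` reproduce the matrix `B` and the vector `z` obtained step by step above.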

In [34]:
print(np.identity(N))

[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]


In [35]:
print(P1)

[[0. 1. 0.]
 [1. 0. 0.]
 [0. 0. 1.]]


In [36]:
print(P2)

[[1. 0. 0.]
 [0. 0. 1.]
 [0. 1. 0.]]


[Equations 1a-c](#eq1) can be conveniently rewritten as follows:

<a id='eq2'></a>
$$
\begin{align}
\mathbf{C}^{(0)} &= \left[ \: \mathbf{A} \: \vert \: \mathbf{y} \: \right] \tag{2a} \\
\mathbf{C}^{(1)} &= \left(\mathbf{I} - \mathbf{M}^{(1)}\right) \mathbf{P}^{(1)}\mathbf{C}^{(0)} \tag{2b} \\
\mathbf{C}^{(2)} &= \left(\mathbf{I} - \mathbf{M}^{(2)}\right) \mathbf{P}^{(2)}\mathbf{C}^{(1)} \tag{2c}
\end{align}
$$

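where $\mathbf{B} = \mathbf{C}^{(2)}[ \, : \, , \, :N-1]$ (first $N$ columns of $\mathbf{C}^{(2)}$) and $\mathbf{z} = \mathbf{C}^{(2)}[ \, : \, , \, N]$ (last column of $\mathbf{C}^{(2)}$). Here the end index of a slice is inclusive, unlike in NumPy, where the same blocks are written `C2[:, :N]` and `C2[:, N]`.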
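As a minimal sketch of equation 2a, the augmented matrix can be assembled and split again with NumPy slices. `np.column_stack` is used here as one possible way of building it, and the names `A_part` and `y_part` are only illustrative.

```python
import numpy as np

A = np.array([[ 2.,  1., -1.],
              [-3., -1.,  2.],
              [-2.,  1.,  2.]])
y = np.array([8., -11., -3.])
N = A.shape[0]

# augmented matrix [A | y] with shape (N, N + 1)
C0 = np.column_stack((A, y))

# NumPy slices exclude the end index, so :N selects columns 0, ..., N-1
A_part = C0[:, :N]
y_part = C0[:, N]

print(C0.shape)
print(np.array_equal(A_part, A), np.array_equal(y_part, y))
```

Because every row operation is applied to the whole augmented matrix, the matrix and the independent vector are guaranteed to be transformed in the same way.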

For convenience, let's define 

<a id='eq3'></a>
$$
\tilde{\mathbf{C}}^{(k-1)} = \mathbf{P}^{(k)} \mathbf{C}^{(k-1)} \: . \tag{3}
$$

In [37]:
C0 = np.vstack([A.T, y]).T

In [38]:
C0_tilde = np.dot(P1, C0)

In [39]:
C1 = np.vstack([A1.T, y1]).T

In [40]:
C1_tilde = np.dot(P2, C1)

In [41]:
C2 = np.vstack([B.T, z]).T

In [42]:
print(C0)

[[  2.   1.  -1.   8.]
 [ -3.  -1.   2. -11.]
 [ -2.   1.   2.  -3.]]


In [43]:
print(C0_tilde)

[[ -3.  -1.   2. -11.]
 [  2.   1.  -1.   8.]
 [ -2.   1.   2.  -3.]]


In [44]:
print(C1)

[[ -3.          -1.           2.         -11.        ]
 [  0.           0.33333333   0.33333333   0.66666667]
 [  0.           1.66666667   0.66666667   4.33333333]]


In [45]:
print(C1_tilde)

[[ -3.          -1.           2.         -11.        ]
 [  0.           1.66666667   0.66666667   4.33333333]
 [  0.           0.33333333   0.33333333   0.66666667]]


In [46]:
print(C2)

[[ -3.          -1.           2.         -11.        ]
 [  0.           1.66666667   0.66666667   4.33333333]
 [  0.           0.           0.2         -0.2       ]]


By using [equation 3](#eq3), matrices $\mathbf{C}^{(k)}$ ([equations 2b-c](#eq2)) can be rewritten as follows:

<a id='eq4'></a>
$$
\begin{split}
\mathbf{C}^{(k)} &= \left( \mathbf{I} - \mathbf{M}^{(k)} \right) \tilde{\mathbf{C}}^{(k-1)} \\\\
&= \tilde{\mathbf{C}}^{(k-1)} - \mathbf{M}^{(k)}\tilde{\mathbf{C}}^{(k-1)} \\\\
&= \tilde{\mathbf{C}}^{(k-1)} - \mathbf{t}^{(k)} \cdot \left(\mathbf{u}^{(k-1)}\right)^{\top}\tilde{\mathbf{C}}^{(k-1)} \\\\
&= \tilde{\mathbf{C}}^{(k-1)} - \mathbf{t}^{(k)} \cdot \tilde{\mathbf{C}}^{(k-1)}
\left[ \, k - 1 \, , \, : \, \right]
\end{split} \quad . \tag{4}
$$

The term $- \, \mathbf{t}^{(k)} \cdot \tilde{\mathbf{C}}^{(k-1)}\left[ \, k - 1 \, , \, : \, \right]$, in which $\mathbf{t}^{(k)} \left[ \, : k - 1 \right] = 0$ and

<a id='eq5'></a>
$$
\mathbf{t}^{(k)}[k:] = \frac{\tilde{\mathbf{C}}^{(k-1)}[k: \, , \, k-1]}{\tilde{\mathbf{C}}^{(k-1)}[k-1, k-1]} \quad , \tag{5}
$$

represents an outer product that affects only the terms $\left[ \, k : \, ,  \, k - 1 : \, \right]$ of matrix $\mathbf{C}^{(k)}$. See the file `Content/Gaussian_elimination_pivoting.pdf`.

In [47]:
# k = 1

print(C0_tilde)

print(C0_tilde - np.outer(t1, C0_tilde[0, :]))

[[ -3.  -1.   2. -11.]
 [  2.   1.  -1.   8.]
 [ -2.   1.   2.  -3.]]
[[ -3.          -1.           2.         -11.        ]
 [  0.           0.33333333   0.33333333   0.66666667]
 [  0.           1.66666667   0.66666667   4.33333333]]


In [48]:
# k = 2

print(C1_tilde)

print(C1_tilde - np.outer(t2, C1_tilde[1, :]))

[[ -3.          -1.           2.         -11.        ]
 [  0.           1.66666667   0.66666667   4.33333333]
 [  0.           0.33333333   0.33333333   0.66666667]]
[[ -3.          -1.           2.         -11.        ]
 [  0.           1.66666667   0.66666667   4.33333333]
 [  0.           0.           0.2         -0.2       ]]


Note that $\mathbf{C}^{(k)}\left[ \, k \, : \, , \, k - 1 \, \right] = 0$.

In [49]:
# k = 1
print(C1)

[[ -3.          -1.           2.         -11.        ]
 [  0.           0.33333333   0.33333333   0.66666667]
 [  0.           1.66666667   0.66666667   4.33333333]]


In [50]:
# k = 2
print(C2)

[[ -3.          -1.           2.         -11.        ]
 [  0.           1.66666667   0.66666667   4.33333333]
 [  0.           0.           0.2         -0.2       ]]


Then, we may simplify [equation 4](#eq4) as follows:

<a id='eq6'></a>
$$
\mathbf{C}^{(k)} \left[ \, k : \, ,  \, k : \, \right] = 
\tilde{\mathbf{C}}^{(k-1)} \left[ \, k : \, ,  \, k : \, \right] - 
\mathbf{t}^{(k)} \left[ \, k : \, \right] \cdot 
\tilde{\mathbf{C}}^{(k-1)}
\left[ \, k - 1 \, , \, k : \, \right] \quad . \tag{6}
$$

Finally, we can store the Gauss vector $\mathbf{t}^{(k)} \left[ \, k : \, \right]$ below the main diagonal of the matrix $\tilde{\mathbf{C}}^{(k-1)}$, at the column $k-1$:

<a id='eq7'></a>
$$
\tilde{\mathbf{C}}^{(k-1)} \left[ \, k : \, ,  \, k - 1 \, \right] = 
\mathbf{t}^{(k)} \left[ \, k : \, \right] \tag{7}
$$

and, consequently (see the file `Content/Gaussian_elimination_pivoting.pdf`),

<a id='eq8'></a>
$$
\mathbf{C}^{(k)} \left[ \, k : \, ,  \, k : \, \right] = 
\tilde{\mathbf{C}}^{(k-1)} \left[ \, k : \, ,  \, k : \, \right] - 
\tilde{\mathbf{C}}^{(k-1)} \left[ \, k : \, ,  \, k - 1 \, \right] \cdot 
\tilde{\mathbf{C}}^{(k-1)}
\left[ \, k - 1 \, , \, k : \, \right] \quad . \tag{8}
$$
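Equations 7 and 8 can be illustrated with a single in-place iteration. The sketch below assumes $k = 1$ and starts from the already permuted matrix $\tilde{\mathbf{C}}^{(0)}$ of the example above, stored in a working array named `C` only for this illustration.

```python
import numpy as np

# C tilde (0): augmented matrix with the rows already permuted
C = np.array([[-3., -1.,  2., -11.],
              [ 2.,  1., -1.,   8.],
              [-2.,  1.,  2.,  -3.]])
k = 1

# equations 5 and 7: compute the Gauss multipliers and store them
# below the pivot, in column k-1
C[k:, k-1] = C[k:, k-1] / C[k-1, k-1]

# equation 8: update only the block [k:, k:] with an outer product
C[k:, k:] -= np.outer(C[k:, k-1], C[k-1, k:])

print(C)
```

After this iteration, the block `C[1:, 1:]` equals the corresponding block of `C1`, while `C[1:, 0]` holds the multipliers instead of the zeros of $\mathbf{C}^{(1)}$. The zeros are therefore implicit in this storage scheme.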

## Inverse matrices

Computing the inverse of a matrix is a very important task. The inverse of an $N \times N$ matrix $\mathbf{A}$ is commonly represented by $\mathbf{A}^{-1}$. The inverse satisfies:

<a id='eq9'></a>
$$
\begin{align}
\mathbf{A}^{-1} \mathbf{A} &= \mathbf{I} \tag{9a} \\
\mathbf{A} \mathbf{A}^{-1} &= \mathbf{I} \tag{9b}
\end{align}
$$

where $\mathbf{I}$ represents the identity matrix.

[Equation 9b](#eq9) can be conveniently rewritten by using a column partition given by:

<a id='eq10'></a>
$$
\mathbf{A} 
\left[ \mathbf{A}^{-1}\left[ \, : \, , \, 0  \right] \cdots \mathbf{A}^{-1} \left[ \, : \, , \, N-1  \right]  \right] = 
\left[ \mathbf{u}_{0} \cdots \mathbf{u}_{N-1} \right] \: , \tag{10}
$$

where $\mathbf{u}_{i}$, $i = 0, \dots, N-1$, is an $N \times 1$ vector with all elements equal to zero, except the $i$th element, which is equal to $1$. The vectors $\mathbf{A}^{-1}\left[ \, : \, , \, i  \right]$ and $\mathbf{u}_{i}$ represent the $i$th column of $\mathbf{A}^{-1}$ and $\mathbf{I}$, respectively. [Equation 10](#eq10) can then be separated into $N$ linear systems:

<a id='eq11'></a>
$$
\begin{split}
\mathbf{A} \, \mathbf{A}^{-1} \left[ \, : \, , \, 0 \right] &= \mathbf{u}_{0} \\
\mathbf{A} \, \mathbf{A}^{-1} \left[ \, : \, , \, 1 \right] &= \mathbf{u}_{1} \\
\vdots \\
\mathbf{A} \, \mathbf{A}^{-1} \left[ \, : \, , \, N-1 \right] &= \mathbf{u}_{N-1}
\end{split} \: . \tag{11}
$$

[Equation 11](#eq11) shows that each column of the inverse matrix $\mathbf{A}^{-1}$ can be calculated by solving an independent linear system. The same strategy used in equations [2a-c](#eq2) can be used here. The difference is that, in the present case, matrix $\mathbf{C}^{(0)}$ ([equation 2a](#eq2)) is given by:

<a id='eq12'></a>
$$
\mathbf{C}^{(0)} = \left[ \: \mathbf{A} \: \vert \: \mathbf{I} \: \right] \quad . \tag{12}
$$

The following steps are exactly the same as those shown by equations [2b-c](#eq2).
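The column-by-column idea of equation 11 can be sketched as follows. Note the assumption: `np.linalg.solve` stands in for the elimination and triangular solve developed in this class, so this only illustrates the structure of the problem and is not the method asked for in the exercises.

```python
import numpy as np

A = np.array([[ 2.,  1., -1.],
              [-3., -1.,  2.],
              [-2.,  1.,  2.]])
N = A.shape[0]
I = np.identity(N)

# equation 12: augmented matrix [A | I] with shape (N, 2N)
C0 = np.hstack([A, I])

# equation 11: each column of the inverse solves A x = u_i
Ainv = np.empty((N, N))
for i in range(N):
    Ainv[:, i] = np.linalg.solve(A, I[:, i])

print(np.allclose(np.dot(A, Ainv), I))
print(np.allclose(np.dot(Ainv, A), I))
```

Since all $N$ systems share the same matrix $\mathbf{A}$, the elimination needs to be performed only once on $\left[ \: \mathbf{A} \: \vert \: \mathbf{I} \: \right]$, which is the motivation for equation 12.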

### Exercise 1

Create a function `Gauss_elim` that implements the Gaussian elimination with partial pivoting according to the following template:

```python
def Gauss_elim(A, y, check_input=True):
    '''
    Compute the equivalent triangular system for a system Ax = y.
    
    Parameters
    ----------
    A : numpy array 2d
        Full square matrix of the linear system.
    y : numpy array 1d
        Independent vector of the linear system.
    check_input : boolean
        If True, verify if the input is valid. Default is True.
    Returns
    -------
    C[:, :N] : numpy array 2d
        Upper triangular matrix of the equivalent system.
    C[:,N] : numpy array 1d
        Independent vector of the equivalent system.
    '''
    N = A.shape[0]
    if check_input is True:
        assert A.ndim == 2, 'A must be a matrix'
        assert y.ndim == 1, 'y must be a vector'
        assert A.shape[1] == N, 'A must be square'
        assert y.size == N, 'y size must be equal to the number of rows of A'
    # create matrix C by stacking A and y
    C = 
    for k in range(1, N):  # k = 1, ..., N-1
        # permutation step (computation of C tilde - eq. 3)
        p, C = permut(C, k-1)
        # assert the pivot is nonzero
        assert C[k-1,k-1] != 0., 'null pivot!'
        # calculate the Gauss multipliers and store them 
        # in the lower part of C (equations 5 and 7)
        C[k:,k-1] = 
        # zeroing of the elements in the (k-1)th column (equation 8)
        C[k:,k:] -= 
    # return the equivalent triangular system and Gauss multipliers
    return C[:,:N], C[:,N]
```

The permutation function can be defined as follows:

In [51]:
def permut (C, i):
    p = [j for j in range(C.shape[0])]
    imax = i + np.argmax(np.abs(C[i:,i]))
    if imax != i:
        p[i], p[imax] = p[imax], p[i]
    return p, C[p,:]

Additionally, create **at least three tests**:

* Create a linear system and the associated equivalent triangular system (do not use the function `Gauss_elim`!). Then use the function `Gauss_elim` to compute an equivalent triangular system. Finally, compare the true and the computed triangular system. They must be equal to each other.

* Create a matrix `A0` and a vector `x0` and use them to compute the vector `y0 = A0 x0`. Then, use the function `Gauss_elim` to compute the equivalent triangular system. Use one of your functions to compute a vector `x1` by solving the equivalent triangular system. Finally, compare the computed vector `x1` and the expected vector `x0`.

* Create a reference input and a reference output for the `permut` function. Then, compare the result produced by the function `permut` and the reference output.

#### Testing the function `Gauss_elim`

### Exercise 2

1) In your functions file, create a function called `Gauss_elim_expanded` to compute the expanded equivalent system. The code must receive a matrix `A` and a matrix `Y` and return the matrix `C` (the same returned by the function `Gauss_elim`) and the matrix `Z` containing all vectors of the equivalent system.

2) In your test file, create two tests. In the first test, create a matrix `A`, compute its inverse `Ainv` by using the function `Gauss_elim_expanded` (and then solving the resulting triangular systems) and verify that the products `A Ainv` and `Ainv A` are equal to the identity matrix. The second test must compare the computed inverse matrix with the one computed by the routine [`numpy.linalg.inv`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.linalg.inv.html).

Implement `Gauss_elim_expanded`, the "expanded" Gaussian elimination, according to the template below:

```python
def Gauss_elim_expanded(A, Y, check_input=True):
    '''
    Compute the equivalent triangular system for an "expanded" system Ax = Y.
    
    Parameters
    ----------
    A : numpy array 2d
        Full square matrix of the linear system.
    Y : numpy array 2d
        Independent vectors of the linear system.
    check_input : boolean
        If True, verify if the input is valid. Default is True.
    Returns
    -------
    C[:, :N] : numpy array 2d
        Upper triangular matrix of the equivalent system.
    C[:,N:] : numpy array 2d
        Independent vectors of the equivalent system.
    '''
    N = A.shape[0]
    if check_input is True:
        assert A.ndim == Y.ndim == 2, 'A and Y must be matrices'
        assert A.shape[1] == N, 'A must be square'
        assert Y.shape[0] == N, 'Y must have the same number of rows as A'
    # create matrix C by stacking A and Y
    C = 
    for k in range(1, N):  # k = 1, ..., N-1
        # permutation step (computation of C tilde - eq. 3)
        p, C = permut(C, k-1)
        # assert the pivot is nonzero
        assert C[k-1,k-1] != 0., 'null pivot!'
        # calculate the Gauss multipliers and store them 
        # in the lower part of C (equations 5 and 7)
        C[k:,k-1] = 
        # zeroing of the elements in the (k-1)th column (equation 8)
        C[k:,k:] -= 
    # return the triangular matrix with Gauss multipliers and 
    # all equivalent vectors
    return C[:,:N], C[:,N:]
```

#### Testing the function `Gauss_elim_expanded`