# Gaussian elimination - pivoting

[Gaussian elimination](https://en.wikipedia.org/wiki/Gaussian_elimination) iteratively transforms a general linear system $\mathbf{A}\mathbf{x} = \mathbf{y}$ into an equivalent triangular system $\mathbf{B}\mathbf{x} = \mathbf{z}$ having the same solution $\mathbf{x}$. As pointed out in the previous class, the **pivots** used to compute the **Gauss multipliers** must be nonzero. Small pivots can produce arbitrarily poor results, even for well-conditioned problems, which shows that Gaussian elimination without pivoting may be unstable, depending on the elements of $\mathbf{A}$ (Golub and Van Loan, 2013).

To avoid dividing by small pivots, we can swap rows at each iteration so that the pivot is the candidate element with the largest absolute value in its column. This strategy is known as partial pivoting.
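The effect of a small pivot can be illustrated with a minimal sketch. The 2×2 system and the helper `eliminate_2x2` below are hypothetical and are not part of the class material. They only show what may happen when the multiplier is computed with a tiny pivot and what changes after swapping the rows.

```python
import numpy as np

def eliminate_2x2(A, y):
    """Solve a 2x2 system by Gaussian elimination with NO row exchange."""
    m = A[1, 0] / A[0, 0]              # Gauss multiplier (divides by the pivot)
    b11 = A[1, 1] - m * A[0, 1]        # element left in the second row
    z1 = y[1] - m * y[0]
    x1 = z1 / b11                      # back substitution
    x0 = (y[0] - A[0, 1] * x1) / A[0, 0]
    return np.array([x0, x1])

# hypothetical system with a tiny pivot; its solution is very close to [1, 1]
A = np.array([[1e-20, 1.],
              [1., 1.]])
y = np.array([1., 2.])

x_no_pivot = eliminate_2x2(A, y)               # pivot = 1e-20
x_pivot = eliminate_2x2(A[[1, 0]], y[[1, 0]])  # rows swapped, pivot = 1

print(x_no_pivot)
print(x_pivot)
```

In double precision, the run without the row swap should lose the first component of the solution, while the run with the swapped rows should recover a result close to `[1, 1]`.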

Let's recall the linear system $\mathbf{A}\mathbf{x} = \mathbf{y}$ presented in the previous class:

In [1]:
import numpy as np

In [2]:
N = 3

In [3]:
A = np.array([[2.,1.,-1.],
              [-3.,-1.,2.],
              [-2.,1.,2.]])

In [4]:
y = np.array([8., -11., -3.])

The solution of this system is given by:

In [5]:
x = np.linalg.solve(A,y)

In [6]:
print(x)

[ 2.  3. -1.]


This system can be solved by Gaussian elimination as follows:

In [7]:
I = np.identity(N)

**Iteration k = 1:**

Notice that, in this case, the pivot of the first Gauss transform is `2`, which is not a small number. However, let's apply partial pivoting to illustrate the procedure and show that it does not change the final result.

In this case, we interchange the first and second rows of `A` and the first and second elements of `y`. This is equivalent to premultiplying `A` and `y` by the following matrix:

In [8]:
P1 = np.identity(N)[[1,0,2]]

print(P1)

[[0. 1. 0.]
 [1. 0. 0.]
 [0. 0. 1.]]


In [9]:
print(np.dot(P1, A))

[[-3. -1.  2.]
 [ 2.  1. -1.]
 [-2.  1.  2.]]


In [10]:
print(np.dot(P1, y))

[-11.   8.  -3.]


Notice that this row permutation changed the pivot from `2` to `-3`.
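As a quick check, the sketch below suggests that premultiplying by the permutation matrix is the same as reordering rows with an index list, and that permuting `A` and `y` together leaves the solution unchanged. The index list `p` and the variable names are illustrative choices.

```python
import numpy as np

A = np.array([[2., 1., -1.],
              [-3., -1., 2.],
              [-2., 1., 2.]])
y = np.array([8., -11., -3.])

p = [1, 0, 2]                     # new row order: rows 0 and 1 interchanged
P1 = np.identity(3)[p]            # permutation matrix built from the identity

# premultiplying by P1 reorders the rows exactly like index-based selection
same_rows = np.allclose(np.dot(P1, A), A[p]) and np.allclose(np.dot(P1, y), y[p])

# the permuted system has the same solution as the original one
x_original = np.linalg.solve(A, y)
x_permuted = np.linalg.solve(np.dot(P1, A), np.dot(P1, y))

print(same_rows)
print(np.allclose(x_original, x_permuted))
```

Index-based selection avoids forming and multiplying by the full permutation matrix, which is why a list of row indices is convenient in an implementation.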

In [11]:
u0 = np.array([1., 0., 0.])

In [12]:
t1 = np.array([0., 
               np.dot(P1, A)[1][0]/np.dot(P1, A)[0][0], 
               np.dot(P1, A)[2][0]/np.dot(P1, A)[0][0]])

In [13]:
print(t1)

[ 0.         -0.66666667  0.66666667]


In [14]:
A1 = (I - np.outer(t1, u0)).dot(np.dot(P1,A))

In [15]:
print(A1)

[[-3.         -1.          2.        ]
 [ 0.          0.33333333  0.33333333]
 [ 0.          1.66666667  0.66666667]]


In [16]:
y1 = (I - np.outer(t1, u0)).dot(np.dot(P1,y))

In [17]:
print(y1)

[-11.           0.66666667   4.33333333]


**Iteration k = 2:**

Now, we interchange the second and third rows of `A1` and the second and third elements of `y1`. This is equivalent to premultiplying `A1` and `y1` by the following matrix:

In [18]:
P2 = np.identity(N)[[0,2,1]]

print(P2)

[[1. 0. 0.]
 [0. 0. 1.]
 [0. 1. 0.]]


In [19]:
print(np.dot(P2, A1))

[[-3.         -1.          2.        ]
 [ 0.          1.66666667  0.66666667]
 [ 0.          0.33333333  0.33333333]]


In [20]:
print(np.dot(P2, y1))

[-11.           4.33333333   0.66666667]


In [21]:
u1 = np.array([0., 1., 0.])

In [22]:
t2 = np.array([0., 0., np.dot(P2, A1)[2][1]/np.dot(P2, A1)[1][1]])

In [23]:
print(t2)

[0.  0.  0.2]


In [24]:
B = (I - np.outer(t2, u1)).dot(np.dot(P2, A1))

In [25]:
print(B)

[[-3.         -1.          2.        ]
 [ 0.          1.66666667  0.66666667]
 [ 0.          0.          0.2       ]]


In [26]:
z = (I - np.outer(t2, u1)).dot(np.dot(P2, y1))

In [27]:
print(z)

[-11.           4.33333333  -0.2       ]


Solution of this equivalent triangular system:

In [28]:
print(np.linalg.solve(B,z))

[ 2.  3. -1.]


Solution of the original system:

In [29]:
print(np.linalg.solve(A,y))

[ 2.  3. -1.]


## Algorithm implementation

The equivalent triangular system above was iteratively calculated according to the following algorithm:

<a id='eq1'></a>
$$
\begin{align}
\mathbf{A}^{(0)} = \mathbf{A} & & \mathbf{y}^{(0)} = \mathbf{y} \tag{1a} \\\\
\mathbf{A}^{(1)} = \left(\mathbf{I} - \mathbf{M}^{(1)}\right) \mathbf{P}^{(1)}\mathbf{A}^{(0)} & &
\mathbf{y}^{(1)} = \left(\mathbf{I} - \mathbf{M}^{(1)}\right) \mathbf{P}^{(1)}\mathbf{y}^{(0)} \tag{1b} \\\\
\mathbf{A}^{(2)} = \left(\mathbf{I} - \mathbf{M}^{(2)}\right) \mathbf{P}^{(2)}\mathbf{A}^{(1)} & &
\mathbf{y}^{(2)} = \left(\mathbf{I} - \mathbf{M}^{(2)}\right) \mathbf{P}^{(2)}\mathbf{y}^{(1)} \tag{1c}
\end{align}
$$

where $\mathbf{P}^{(k)}$, $k = 1, \dots, N-1$, is the permutation matrix used to interchange the rows and perform the partial pivoting.

Notice that a matrix $\mathbf{P}^{(k)}$, $k = 1, \dots, N-1$, may only interchange rows within the set $\left[ \, k - 1 \, : \, \right]$. For example, while the matrix $\mathbf{P}^{(1)}$ can interchange any rows of the matrix $\mathbf{A}^{(0)}$ and vector $\mathbf{y}^{(0)}$, the matrix $\mathbf{P}^{(2)}$ can interchange only rows in the set $\left[ \, 1 \, : \, \right]$ (from the second on) of the matrix $\mathbf{A}^{(1)}$ and vector $\mathbf{y}^{(1)}$.
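The sequence in equations 1a-c can be sketched for the 3×3 example as follows. The helper `gauss_transform` is a hypothetical name. It assumes $\mathbf{M}^{(k)} = \mathbf{t}^{(k)} \left(\mathbf{u}^{(k-1)}\right)^{\top}$, with the multipliers computed from the already permuted matrix.

```python
import numpy as np

A = np.array([[2., 1., -1.],
              [-3., -1., 2.],
              [-2., 1., 2.]])
y = np.array([8., -11., -3.])
N = A.shape[0]
I = np.identity(N)

def gauss_transform(A_perm, k):
    """Return I - M^(k) computed from the permuted matrix A_perm."""
    t = np.zeros(N)
    t[k:] = A_perm[k:, k-1] / A_perm[k-1, k-1]   # Gauss multipliers
    u = np.zeros(N)
    u[k-1] = 1.                                  # unit vector u^(k-1)
    return I - np.outer(t, u)

P1 = I[[1, 0, 2]]    # interchanges rows 0 and 1
P2 = I[[0, 2, 1]]    # interchanges rows 1 and 2

# equation 1b
G1 = gauss_transform(np.dot(P1, A), 1)
A1 = np.dot(G1, np.dot(P1, A))
y1 = np.dot(G1, np.dot(P1, y))

# equation 1c
G2 = gauss_transform(np.dot(P2, A1), 2)
A2 = np.dot(G2, np.dot(P2, A1))
y2 = np.dot(G2, np.dot(P2, y1))

print(np.allclose(A2, np.triu(A2)))   # checks that A2 is upper triangular
print(np.linalg.solve(A2, y2))
```

Forming the matrices explicitly is only meant to mirror the equations. An implementation would normally avoid these full matrix products.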

In [30]:
print(np.identity(N))

[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]


In [31]:
print(P1)

[[0. 1. 0.]
 [1. 0. 0.]
 [0. 0. 1.]]


In [32]:
print(P2)

[[1. 0. 0.]
 [0. 0. 1.]
 [0. 1. 0.]]


The [equations 1a-c](#eq1) can be conveniently rewritten as follows:

<a id='eq2'></a>
$$
\begin{align}
\mathbf{C}^{(0)} &= \left[ \: \mathbf{A} \: \vert \: \mathbf{y} \: \right] \tag{2a} \\\\
\mathbf{C}^{(1)} &= \left(\mathbf{I} - \mathbf{M}^{(1)}\right) \mathbf{P}^{(1)}\mathbf{C}^{(0)} \tag{2b} \\\\
\mathbf{C}^{(2)} &= \left(\mathbf{I} - \mathbf{M}^{(2)}\right) \mathbf{P}^{(2)}\mathbf{C}^{(1)} \tag{2c}
\end{align}
$$

where $\mathbf{B} = \mathbf{C}^{(2)}[ \, : \, , \, :N]$ (first $N$ columns of $\mathbf{C}^{(2)}$) and $\mathbf{z} = \mathbf{C}^{(2)}[ \, : \, , \, N]$ (last column of $\mathbf{C}^{(2)}$).

For convenience, let's define 

<a id='eq3'></a>
$$
\tilde{\mathbf{C}}^{(k-1)} = \mathbf{P}^{(k)} \mathbf{C}^{(k-1)} \: . \tag{3}
$$

In [33]:
C0 = np.vstack([A.T, y]).T

In [34]:
C0_tilde = np.dot(P1, C0)

In [35]:
C1 = np.vstack([A1.T, y1]).T

In [36]:
C1_tilde = np.dot(P2, C1)

In [37]:
C2 = np.vstack([B.T, z]).T

In [38]:
print(C0)

[[  2.   1.  -1.   8.]
 [ -3.  -1.   2. -11.]
 [ -2.   1.   2.  -3.]]


In [39]:
print(C0_tilde)

[[ -3.  -1.   2. -11.]
 [  2.   1.  -1.   8.]
 [ -2.   1.   2.  -3.]]


In [40]:
print(C1)

[[ -3.          -1.           2.         -11.        ]
 [  0.           0.33333333   0.33333333   0.66666667]
 [  0.           1.66666667   0.66666667   4.33333333]]


In [41]:
print(C1_tilde)

[[ -3.          -1.           2.         -11.        ]
 [  0.           1.66666667   0.66666667   4.33333333]
 [  0.           0.33333333   0.33333333   0.66666667]]


In [42]:
print(C2)

[[ -3.          -1.           2.         -11.        ]
 [  0.           1.66666667   0.66666667   4.33333333]
 [  0.           0.           0.2         -0.2       ]]


By using [equation 3](#eq3), matrices $\mathbf{C}^{(k)}$ ([equations 2b-c](#eq2)) can be rewritten as follows:

<a id='eq4'></a>
$$
\begin{split}
\mathbf{C}^{(k)} &= \left( \mathbf{I} - \mathbf{M}^{(k)} \right) \tilde{\mathbf{C}}^{(k-1)} \\\\
&= \tilde{\mathbf{C}}^{(k-1)} - \mathbf{M}^{(k)}\tilde{\mathbf{C}}^{(k-1)} \\\\
&= \tilde{\mathbf{C}}^{(k-1)} - \mathbf{t}^{(k)} \cdot \left(\mathbf{u}^{(k-1)}\right)^{\top}\tilde{\mathbf{C}}^{(k-1)} \\\\
&= \tilde{\mathbf{C}}^{(k-1)} - \mathbf{t}^{(k)} \cdot \tilde{\mathbf{C}}^{(k-1)}
\left[ \, k - 1 \, , \, : \, \right]
\end{split} \quad . \tag{4}
$$

The term $- \, \mathbf{t}^{(k)} \cdot \tilde{\mathbf{C}}^{(k-1)}\left[ \, k - 1 \, , \, : \, \right]$, in which $\mathbf{t}^{(k)} \left[ \, : k \, \right] = 0$ and

<a id='eq5'></a>
$$
\mathbf{t}^{(k)}[k:] = \frac{\tilde{\mathbf{C}}^{(k-1)}[k: \, , \, k-1]}{\tilde{\mathbf{C}}^{(k-1)}[k-1, k-1]} \quad , \tag{5}
$$

represents an outer product that affects only the elements $\left[ \, k : \, ,  \, k - 1 : \, \right]$ of matrix $\mathbf{C}^{(k)}$.

In [43]:
# k = 1
print(np.outer(-t1, C0_tilde[0, :]))

[[ 0.          0.         -0.          0.        ]
 [-2.         -0.66666667  1.33333333 -7.33333333]
 [ 2.          0.66666667 -1.33333333  7.33333333]]


In [44]:
# k = 2
print(np.outer(-t2, C1_tilde[1, :]))

[[-0.         -0.         -0.         -0.        ]
 [-0.         -0.         -0.         -0.        ]
 [-0.         -0.33333333 -0.13333333 -0.86666667]]


Remember that $\mathbf{C}^{(k)}\left[ \, k \, : \, , \, k - 1 \, \right] = 0$.

In [45]:
# k = 1
print(C1)

[[ -3.          -1.           2.         -11.        ]
 [  0.           0.33333333   0.33333333   0.66666667]
 [  0.           1.66666667   0.66666667   4.33333333]]


In [46]:
# k = 2
print(C2)

[[ -3.          -1.           2.         -11.        ]
 [  0.           1.66666667   0.66666667   4.33333333]
 [  0.           0.           0.2         -0.2       ]]


Then, we may simplify [equation 4](#eq4) as follows:

<a id='eq6'></a>
$$
\mathbf{C}^{(k)} \left[ \, k : \, ,  \, k : \, \right] = 
\tilde{\mathbf{C}}^{(k-1)} \left[ \, k : \, ,  \, k : \, \right] - 
\mathbf{t}^{(k)} \left[ \, k : \, \right] \cdot 
\tilde{\mathbf{C}}^{(k-1)}
\left[ \, k - 1 \, , \, k : \, \right] \quad . \tag{6}
$$
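The sketch below compares the full rank-one update of equation 4 with the block update of equation 6 for $k = 1$, using the matrix `C0_tilde` printed earlier. The variable names `C1_full` and `C1_block` are illustrative.

```python
import numpy as np

# permuted augmented matrix of the example (C0_tilde)
C0_tilde = np.array([[-3., -1., 2., -11.],
                     [2., 1., -1., 8.],
                     [-2., 1., 2., -3.]])
k = 1
N = C0_tilde.shape[0]

# Gauss vector: zeros in the first k positions, multipliers below (equation 5)
t = np.zeros(N)
t[k:] = C0_tilde[k:, k-1] / C0_tilde[k-1, k-1]

# equation 4: rank-one update applied to the whole matrix
C1_full = C0_tilde - np.outer(t, C0_tilde[k-1, :])

# equation 6: update restricted to the block [k:, k:]
C1_block = C0_tilde.copy()
C1_block[k:, k:] -= np.outer(t[k:], C0_tilde[k-1, k:])
C1_block[k:, k-1] = 0.    # these elements are known to vanish

print(np.allclose(C1_full, C1_block))
```

The block update touches fewer elements and never needs to compute the zeros in column $k-1$, which frees that part of the matrix for storing the multipliers.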

Finally, we can store the Gauss vector $\mathbf{t}^{(k)} \left[ \, k : \, \right]$ below the main diagonal of the matrix $\tilde{\mathbf{C}}^{(k-1)}$, at the column $k-1$:

<a id='eq7'></a>
$$
\tilde{\mathbf{C}}^{(k-1)} \left[ \, k : \, ,  \, k - 1 \, \right] = 
\mathbf{t}^{(k)} \left[ \, k : \, \right] \tag{7}
$$

and, consequently,

<a id='eq8'></a>
$$
\mathbf{C}^{(k)} \left[ \, k : \, ,  \, k : \, \right] = 
\tilde{\mathbf{C}}^{(k-1)} \left[ \, k : \, ,  \, k : \, \right] - 
\tilde{\mathbf{C}}^{(k-1)} \left[ \, k : \, ,  \, k - 1 \, \right] \cdot 
\tilde{\mathbf{C}}^{(k-1)}
\left[ \, k - 1 \, , \, k : \, \right] \quad . \tag{8}
$$

The Gaussian elimination with partial pivoting can be implemented as follows:

    def gauss_elim(A, y):
        N = y.size
        assert A.shape == (N, N), 'A must be square with order equal to y size'

        # create matrix C by stacking A and y (equation 2a)
        C = np.column_stack((A, y)).astype(float)

        for k in range(1, N):

            # permutation step (computation of C tilde, equation 3)
            p, C = permut(C, k-1)

            # assert the pivot is nonzero
            assert C[k-1,k-1] != 0., 'null pivot!'

            # calculate the Gauss multipliers and store them
            # in the lower part of C (equations 5 and 7)
            C[k:,k-1] = C[k:,k-1]/C[k-1,k-1]

            # zeroing of the elements below the pivot in column k-1 (equation 8)
            C[k:,k:] = C[k:,k:] - np.outer(C[k:,k-1], C[k-1,k:])

        # return the equivalent triangular system
        return np.triu(C[:,:N]), C[:,N]

The permutation function can be defined as follows:

In [47]:
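def permut(C, i):
    # list of row indices in their original order
    p = list(range(C.shape[0]))
    # row, from the ith on, with the largest absolute value in column i
    imax = i + np.argmax(np.abs(C[i:,i]))
    # interchange rows i and imax only if necessary
    if imax != i:
        p[i], p[imax] = p[imax], p[i]
    # return the row order and a permuted copy of C
    return p, C[p,:]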

In [48]:
N = 5

In [49]:
C = 10*np.round(np.random.rand(N,N+1), decimals=3)

In [50]:
print(C)

[[9.23 2.25 5.94 4.05 5.39 6.05]
 [8.48 6.17 5.23 7.69 5.05 4.46]
 [5.67 8.84 8.   3.25 6.27 5.59]
 [1.95 3.37 5.89 2.95 2.56 7.03]
 [5.23 4.92 4.95 7.6  2.68 7.04]]


In [51]:
p, E = permut(C, 0)

In [52]:
print(E)

[[9.23 2.25 5.94 4.05 5.39 6.05]
 [8.48 6.17 5.23 7.69 5.05 4.46]
 [5.67 8.84 8.   3.25 6.27 5.59]
 [1.95 3.37 5.89 2.95 2.56 7.03]
 [5.23 4.92 4.95 7.6  2.68 7.04]]


In [53]:
print(p)

[0, 1, 2, 3, 4]


In [54]:
print(C[p])

[[9.23 2.25 5.94 4.05 5.39 6.05]
 [8.48 6.17 5.23 7.69 5.05 4.46]
 [5.67 8.84 8.   3.25 6.27 5.59]
 [1.95 3.37 5.89 2.95 2.56 7.03]
 [5.23 4.92 4.95 7.6  2.68 7.04]]
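In the random example above, the largest element of the first column is already in the first row, so no interchange takes place. The sketch below applies the same function to the augmented matrix of the 3×3 example, where an interchange is expected. The function is restated only to keep the block self-contained.

```python
import numpy as np

def permut(C, i):
    # same logic as the function defined above
    p = list(range(C.shape[0]))
    imax = i + np.argmax(np.abs(C[i:, i]))
    if imax != i:
        p[i], p[imax] = p[imax], p[i]
    return p, C[p, :]

# augmented matrix [A | y] of the 3x3 example
C0 = np.array([[2., 1., -1., 8.],
               [-3., -1., 2., -11.],
               [-2., 1., 2., -3.]])

p, C0_tilde = permut(C0, 0)

print(p)          # [1, 0, 2]
print(C0_tilde)
```

The returned list `p` plays the role of the permutation matrix $\mathbf{P}^{(1)}$, and `C0_tilde` reproduces the matrix obtained earlier with `np.dot(P1, C0)`.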


In [55]:
A = 10*np.random.rand(N,N)

In [56]:
x = 3*np.random.rand(N)

In [58]:
y = np.dot(A, x)

In [None]:
B, z = gauss_elim(A, y)

In [None]:
xeq = np.linalg.solve(B, z)

In [None]:
np.allclose(x, xeq)
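Because `B` is upper triangular, the system can also be solved by back substitution instead of a general solver. The function `back_substitution` below is a minimal sketch with a hypothetical name, applied to the triangular system obtained in the 3×3 example.

```python
import numpy as np

def back_substitution(B, z):
    """Solve B x = z for an upper triangular B with nonzero diagonal."""
    N = z.size
    x = np.zeros(N)
    # start at the last equation and move upwards
    for i in range(N - 1, -1, -1):
        x[i] = (z[i] - np.dot(B[i, i+1:], x[i+1:])) / B[i, i]
    return x

# equivalent triangular system of the 3x3 example
B = np.array([[-3., -1., 2.],
              [0., 5./3., 2./3.],
              [0., 0., 0.2]])
z = np.array([-11., 13./3., -0.2])

x_back = back_substitution(B, z)
print(np.allclose(x_back, np.linalg.solve(B, z)))
```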

### Exercise

1) In your `my_functions.py` file, create a function called `gauss_elim` to compute the equivalent triangular system. Note that the algorithm shown above uses the function `permut`, receives the matrix `A` and the vector `y` and returns two numpy arrays containing the equivalent triangular system, as well as the Gauss multipliers. 

2) Create a first test in your `test_my_functions.py` file. For this test, create a linear system and the associated equivalent triangular system (do not use the function `gauss_elim`!). Then use the function `gauss_elim` to compute an equivalent triangular system. Finally, compare the true and the computed triangular system. They must be equal to each other.

3) Create a second test in your `test_my_functions.py` file. In this test, create a matrix `A0` and a vector `x0` and use them to compute the vector `y0 = A0 x0`. Then, use the function `gauss_elim` to compute the equivalent triangular system. Use one of your functions to compute a vector `x1` by solving the equivalent triangular system. Finally, compare the computed vector `x1` and the expected vector `x0`.

4) In your `test_my_functions.py` file, create a test for the function `permut` presented above. For this test, create a reference input and a reference output. Then, compare the result produced by the function `permut` and the reference output.

## Inverse matrices

Sometimes, we need to calculate the inverse of a matrix. The inverse of an $N \times N$ matrix $\mathbf{A}$ is commonly represented by $\mathbf{A}^{-1}$. The inverse satisfies:

$$
\begin{split}
\mathbf{A}^{-1} \mathbf{A} &= \mathbf{I} \\
\mathbf{A} \mathbf{A}^{-1} &= \mathbf{I} 
\end{split} \: ,
$$

where $\mathbf{I}$ represents the identity matrix.

The second equation above can be conveniently rewritten by using a column partition given by:

$$
\mathbf{A} 
\left[ \mathbf{A}^{-1}\left[ \, : \, , \, 0  \right] \cdots \mathbf{A}^{-1} \left[ \, : \, , \, N-1  \right]  \right] = 
\left[ \mathbf{u}_{0} \cdots \mathbf{u}_{N-1} \right] \: ,
$$

where $\mathbf{u}_{i}$, $i = 0, \dots, N-1$, is an $N \times 1$ vector with all elements equal to zero, except the element with index $i$, which is equal to $1$. The vectors $\mathbf{A}^{-1}\left[ \, : \, , \, i  \right]$ and $\mathbf{u}_{i}$ represent the columns with index $i$ of $\mathbf{A}^{-1}$ and $\mathbf{I}$, respectively. This equation can then be separated into $N$ linear systems:

$$
\begin{split}
\mathbf{A} \, \mathbf{A}^{-1} \left[ \, : \, , \, 0 \right] &= \mathbf{u}_{0} \\
\mathbf{A} \, \mathbf{A}^{-1} \left[ \, : \, , \, 1 \right] &= \mathbf{u}_{1} \\
\vdots \\
\mathbf{A} \, \mathbf{A}^{-1} \left[ \, : \, , \, N-1 \right] &= \mathbf{u}_{N-1}
\end{split} \: .
$$

This equation shows that each column of the inverse matrix $\mathbf{A}^{-1}$ can be calculated by solving an independent linear system.
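The column-by-column scheme can be sketched as follows. The name `mat_inverse_sketch` is hypothetical, and `np.linalg.solve` stands in for whatever solver is used for each system. It is not the routine requested in the exercise below.

```python
import numpy as np

def mat_inverse_sketch(A):
    """Compute the inverse of A by solving one linear system per column."""
    N = A.shape[0]
    assert A.shape == (N, N), 'A must be square'
    I = np.identity(N)
    Ainv = np.empty((N, N))
    for i in range(N):
        # column i of the inverse solves A x = u_i (column i of the identity)
        Ainv[:, i] = np.linalg.solve(A, I[:, i])
    return Ainv

A = np.array([[2., 1., -1.],
              [-3., -1., 2.],
              [-2., 1., 2.]])
Ainv = mat_inverse_sketch(A)

print(np.allclose(np.dot(A, Ainv), np.identity(3)))
print(np.allclose(np.dot(Ainv, A), np.identity(3)))
```

All $N$ systems share the same matrix $\mathbf{A}$, so the elimination work could in principle be reused across the columns. The sketch repeats it for simplicity.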

### Exercise

1) In your `my_functions.py` file, create a function called `mat_inverse` to compute the inverse of a real matrix. The code must receive a matrix `A` and calculate its inverse `Ainv`, column by column, according to the scheme presented above.

2) Create two tests in your `test_my_functions.py` file. In the first test, create a matrix `A`, compute its inverse `Ainv` by using the function `mat_inverse` and verify that the products `A Ainv` and `Ainv A` are equal to the identity matrix. The second test must compare the inverse matrix computed by using `mat_inverse` with the inverse matrix computed by using the routine [`numpy.linalg.inv`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.linalg.inv.html).