# Gaussian elimination - pivoting

[Gaussian elimination](https://en.wikipedia.org/wiki/Gaussian_elimination) iteratively transforms an unstructured linear system $\mathbf{A}\mathbf{x} = \mathbf{y}$ into an equivalent triangular system $\mathbf{B}\mathbf{x} = \mathbf{z}$ having the same solution $\mathbf{x}$. As pointed out in the previous class, the **pivots** needed for computing the **Gauss multipliers** must be nonzero. Small pivots can produce arbitrarily poor results, even for well-conditioned problems, which shows that Gaussian elimination without pivoting is an unstable method (Golub and Van Loan, 2013).

To avoid dividing by small pivots, we may swap rows at each iteration so that the pivot is the element with the largest absolute value in its column, on or below the diagonal. This strategy is known as partial pivoting.
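The effect of a small pivot can be seen on a 2x2 system. The sketch below is only an illustration: the helper `solve_2x2` and the test system (first pivot `1e-20`) are assumptions introduced here, not part of the original class. The exact solution of this system is very close to `x1 = 1`, `x2 = 1`.

```python
def solve_2x2(a, y, pivot):
    """Eliminate and back-substitute a 2x2 system given as nested lists.

    Hypothetical helper used only for this illustration.
    """
    (a11, a12), (a21, a22) = a
    y1, y2 = y
    # partial pivoting: bring the larger first-column entry to the top
    if pivot and abs(a21) > abs(a11):
        a11, a12, a21, a22 = a21, a22, a11, a12
        y1, y2 = y2, y1
    m = a21 / a11                # Gauss multiplier
    a22 = a22 - m * a12          # eliminate the entry below the pivot
    y2 = y2 - m * y1
    x2 = y2 / a22                # back substitution
    x1 = (y1 - a12 * x2) / a11
    return x1, x2

a_small = [[1e-20, 1.0],
           [1.0, 1.0]]
y_small = [1.0, 2.0]

x_plain = solve_2x2(a_small, y_small, pivot=False)
x_pivot = solve_2x2(a_small, y_small, pivot=True)
print(x_plain)
print(x_pivot)
```

Without pivoting, the huge multiplier `1/1e-20` swamps the other entries in floating point and the computed `x1` is `0` instead of approximately `1`. With the rows swapped, both unknowns are recovered.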

Let's recall the linear system $\mathbf{A}\mathbf{x} = \mathbf{y}$ presented in the previous class:

In [1]:
import numpy as np
from scipy.linalg import lu

In [2]:
A = np.array([[2.,1.,-1.],
              [-3.,-1.,2.],
              [-2.,1.,2.]])

In [3]:
y = np.array([[8.],
              [-11.],
              [-3.]])

The solution of this system is given by:

In [4]:
x = np.linalg.solve(A,y)

In [5]:
print(x)

[[ 2.]
 [ 3.]
 [-1.]]


This system can be solved by Gaussian elimination as follows:

In [6]:
I = np.identity(3)

**Iteration k = 1:**

Notice that, in this case, the pivot of the first Gauss transform is `2`, which is not a small number. However, let's apply partial pivoting anyway to illustrate the procedure and show that it does not change the final result.

The element with the largest absolute value in the first column is `-3`, located in the second row. We therefore interchange the first and second rows/elements of `A`/`y`. This is equivalent to premultiplying `A` and `y` by the following matrix:

In [7]:
P1 = np.identity(3)[[1,0,2]]

print(P1)

[[ 0.  1.  0.]
 [ 1.  0.  0.]
 [ 0.  0.  1.]]


In [8]:
print(np.dot(P1, A))

[[-3. -1.  2.]
 [ 2.  1. -1.]
 [-2.  1.  2.]]


In [9]:
print(np.dot(P1, y))

[[-11.]
 [  8.]
 [ -3.]]


Notice that this row permutation changed the pivot from `2` to `-3`.
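The row to swap does not have to be picked by eye. As a minimal sketch (the names `k`, `imax`, `perm` and `P` are chosen here for illustration), the pivot row can be located with `np.argmax` and the permutation matrix built by reordering the rows of the identity:

```python
import numpy as np

A = np.array([[2., 1., -1.],
              [-3., -1., 2.],
              [-2., 1., 2.]])

k = 0                                       # current column (zero-based)
imax = k + np.argmax(np.abs(A[k:, k]))      # row holding the largest |entry| in column k
perm = list(range(A.shape[0]))
perm[k], perm[imax] = perm[imax], perm[k]   # swap rows k and imax

P = np.identity(A.shape[0])[perm]           # permutation matrix
print(perm)
# premultiplying by P gives the same result as reordering the rows directly
print(np.allclose(P.dot(A), A[perm]))
```

For this matrix the selected ordering is `[1, 0, 2]`, which reproduces `P1`.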

In [10]:
u = np.array([[1.],
              [0.],
              [0.]])

In [11]:
t = np.array([[0.],
              [np.dot(P1, A)[1][0]/np.dot(P1, A)[0][0]],
              [np.dot(P1, A)[2][0]/np.dot(P1, A)[0][0]]])

In [12]:
A1 = (I - t*u.T).dot(np.dot(P1,A))

In [13]:
A1

array([[-3.        , -1.        ,  2.        ],
       [ 0.        ,  0.33333333,  0.33333333],
       [ 0.        ,  1.66666667,  0.66666667]])

In [14]:
y1 = (I - t*u.T).dot(np.dot(P1,y))

In [15]:
y1

array([[-11.        ],
       [  0.66666667],
       [  4.33333333]])
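The cells above build the Gauss transform `I - t*u.T` by hand. A small helper can do the same for any column; `gauss_transform` is a hypothetical name used only in this sketch, and it assumes that the pivot `A[k, k]` is nonzero.

```python
import numpy as np

def gauss_transform(A, k):
    """Return I - t u^T, which zeroes column k of A below the diagonal.

    Hypothetical helper; assumes A[k, k] is nonzero.
    """
    n = A.shape[0]
    u = np.zeros((n, 1))
    u[k] = 1.0                           # k-th unit vector
    t = np.zeros((n, 1))
    t[k+1:, 0] = A[k+1:, k] / A[k, k]    # Gauss multipliers
    return np.identity(n) - t.dot(u.T)

# P1.dot(A), copied from the output of the cells above
PA = np.array([[-3., -1., 2.],
               [2., 1., -1.],
               [-2., 1., 2.]])

A1_check = gauss_transform(PA, 0).dot(PA)
print(A1_check)
```

The result should agree with `A1`: the first row is untouched and the first column is zero below the pivot.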

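**Iteration k = 2:**

Now, the candidate pivot is `0.33333333`, but the element `1.66666667` below it has a larger absolute value. We therefore interchange the second and third rows/elements of `A1`/`y1`. This is equivalent to premultiplying `A1` and `y1` by the following matrix: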

In [16]:
P2 = np.identity(3)[[0,2,1]]

print(P2)

[[ 1.  0.  0.]
 [ 0.  0.  1.]
 [ 0.  1.  0.]]


In [17]:
print(np.dot(P2, A1))

[[-3.         -1.          2.        ]
 [ 0.          1.66666667  0.66666667]
 [ 0.          0.33333333  0.33333333]]


In [18]:
print(np.dot(P2, y1))

[[-11.        ]
 [  4.33333333]
 [  0.66666667]]


In [19]:
u = np.array([[0.],
              [1.],
              [0.]])

In [20]:
t = np.array([[0.],
              [0.],
              [np.dot(P2, A1)[2][1]/np.dot(P2, A1)[1][1]]])

In [21]:
B = (I - t*u.T).dot(np.dot(P2, A1))

In [22]:
B

array([[-3.        , -1.        ,  2.        ],
       [ 0.        ,  1.66666667,  0.66666667],
       [ 0.        ,  0.        ,  0.2       ]])

In [23]:
z = (I - t*u.T).dot(np.dot(P2, y1))

In [24]:
z

array([[-11.        ],
       [  4.33333333],
       [ -0.2       ]])

Solution of this equivalent triangular system:

In [25]:
print(np.linalg.solve(B,z))

[[ 2.]
 [ 3.]
 [-1.]]


Solution of the original system:

In [26]:
print(np.linalg.solve(A,y))

[[ 2.]
 [ 3.]
 [-1.]]
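The point of reducing the system to triangular form is that it can then be solved by back substitution, without a general-purpose solver. The following is a minimal sketch; `back_substitution` is a name introduced here, and `B_tri` and `z_tri` are copies of the values of `B` and `z` obtained above, written as exact fractions.

```python
import numpy as np

def back_substitution(B, z):
    """Solve B x = z for an upper triangular B with nonzero diagonal."""
    n = B.shape[0]
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        # subtract the contribution of the unknowns already computed,
        # then divide by the diagonal element
        x[i] = (z[i] - B[i, i+1:].dot(x[i+1:])) / B[i, i]
    return x

B_tri = np.array([[-3., -1., 2.],
                  [0., 5./3., 2./3.],
                  [0., 0., 0.2]])
z_tri = np.array([-11., 13./3., -0.2])

x_tri = back_substitution(B_tri, z_tri)
print(x_tri)
```

Up to rounding, this recovers the same solution `[2, 3, -1]` as `np.linalg.solve`.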


The equivalent triangular system above was computed iteratively according to the following algorithm:

$$
\begin{array}{ccccc}
\mathbf{A}^{(0)} = \mathbf{A} & & & \mathbf{y}^{(0)} = \mathbf{y} \\\\
\mathbf{A}^{(1)} = \left(\mathbf{I} - \mathbf{M}^{(1)}\right) \mathbf{P}^{(1)}\mathbf{A}^{(0)} & & &
\mathbf{y}^{(1)} = \left(\mathbf{I} - \mathbf{M}^{(1)}\right) \mathbf{P}^{(1)}\mathbf{y}^{(0)} \\\\
\mathbf{A}^{(2)} = \left(\mathbf{I} - \mathbf{M}^{(2)}\right) \mathbf{P}^{(2)}\mathbf{A}^{(1)} & & &
\mathbf{y}^{(2)} = \left(\mathbf{I} - \mathbf{M}^{(2)}\right) \mathbf{P}^{(2)}\mathbf{y}^{(1)}
\end{array} \: ,$$

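where $\mathbf{M}^{(k)} = \mathbf{t}^{(k)}\mathbf{u}^{(k)\top}$ is formed from the vector of Gauss multipliers $\mathbf{t}^{(k)}$ and the unit vector $\mathbf{u}^{(k)}$ (`t` and `u` in the cells above), and $\mathbf{P}^{(k)}$ is the permutation matrix that interchanges the rows to perform the partial pivoting. Alternatively, using the augmented matrix,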

$$
\mathbf{C}^{(0)} = \left[ \: \mathbf{A} \: \vert \: \mathbf{y} \: \right] \: ,
$$

$$
\begin{array}{c}
\mathbf{C}^{(1)} = \left(\mathbf{I} - \mathbf{M}^{(1)}\right) \mathbf{P}^{(1)}\mathbf{C}^{(0)} \\\\
\mathbf{C}^{(2)} = \left(\mathbf{I} - \mathbf{M}^{(2)}\right) \mathbf{P}^{(2)}\mathbf{C}^{(1)} 
\end{array} \: ,$$

where, in Python's zero-based indexing, $\mathbf{B} = \mathbf{C}^{(2)}[ \, : \, , \, :N]$ (the first $N$ columns) and $\mathbf{z} = \mathbf{C}^{(2)}[ \, : \, , \, N]$ (the last column).
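The recursion on the augmented matrix can be written as a loop that forms $\mathbf{P}^{(k)}$ and $\mathbf{M}^{(k)}$ explicitly. This is a sketch for illustration only, with variable names (`C_aug`, `perm`, `PC`, `B_aug`, `z_aug`) chosen here; building full matrices at every step is wasteful and is done only to mirror the formulas.

```python
import numpy as np

A = np.array([[2., 1., -1.],
              [-3., -1., 2.],
              [-2., 1., 2.]])
y = np.array([[8.],
              [-11.],
              [-3.]])

N = A.shape[0]
C_aug = np.hstack((A, y))                    # C^(0) = [A | y]

for k in range(N - 1):
    # P^(k): swap row k with the row holding the largest |entry| in column k
    perm = list(range(N))
    imax = k + np.argmax(np.abs(C_aug[k:, k]))
    perm[k], perm[imax] = perm[imax], perm[k]
    P = np.identity(N)[perm]
    PC = P.dot(C_aug)

    # M^(k) = t u^T, built from the Gauss multipliers of the permuted matrix
    t = np.zeros((N, 1))
    t[k+1:, 0] = PC[k+1:, k] / PC[k, k]
    u = np.zeros((N, 1))
    u[k] = 1.0
    M = t.dot(u.T)

    C_aug = (np.identity(N) - M).dot(PC)     # C^(k+1)

B_aug, z_aug = C_aug[:, :N], C_aug[:, N]     # split into B and z
print(B_aug)
print(z_aug)
```

The two blocks should agree with the `B` and `z` computed step by step in the cells above.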

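Gaussian elimination with partial pivoting can be implemented as follows (the helper functions `permut` and `outer` are defined below):

    def gauss_elim_pivoting(A, y):
        N = y.size
        # augmented matrix [A | y]; hstack already returns a new array
        C = np.hstack((A, y.reshape(N, 1)))
        # accumulated row permutation
        D = np.identity(N)

        for i in range(N-1):

            # permutation step
            p, C = permut(C, i)
            D = D[p]

            # assert the pivot is nonzero
            assert C[i,i] != 0., 'null pivot!'

            # calculate the Gauss multipliers and store them
            # in the lower part of C
            C[i+1:,i] = C[i+1:,i]/C[i,i]

            # zeroing of the elements in the ith column
            C[i+1:,i+1:] = C[i+1:,i+1:] - outer(C[i+1:,i], C[i,i+1:])

        # the strict lower triangle of C[:,:N] holds the multipliers;
        # the upper triangle is B and the last column is z
        return C[:,:N], C[:,N]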

The permutation function can be defined as follows:

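    def permut(C, i):
        # identity ordering of the row indices
        p = list(range(C.shape[0]))
        # row, on or below the diagonal, with the largest |entry| in column i
        imax = i + np.argmax(np.abs(C[i:,i]))
        if imax != i:
            p[i], p[imax] = p[imax], p[i]
        return p, C[p,:]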

In [27]:
def permut(C,i):
    P = [j for j in range(C.shape[0])]
    imax = i + np.argmax(np.abs(C[i:,i]))
    P[i], P[imax] = P[imax], P[i]
    return P, C[P,:]

In [28]:
def outer(x,y):
    C = np.zeros((x.ravel().size,y.ravel().size))
    for i, c in enumerate(C):
        c += x[i]*y.ravel()
    return C

In [29]:
N = y.size
C = np.hstack((A, y))
D = np.identity(N)

In [30]:
for i in range(N-1):
    
    p, C = permut(C,i)
    D = D[p]
        
    C[i+1:,i] = C[i+1:,i]/C[i,i]
    C[i+1:,i+1:] = C[i+1:,i+1:] - outer(C[i+1:,i], C[i,i+1:])

In [36]:
D.T

array([[ 0.,  0.,  1.],
       [ 1.,  0.,  0.],
       [ 0.,  1.,  0.]])

In [37]:
C

array([[ -3.        ,  -1.        ,   2.        , -11.        ],
       [  0.66666667,   1.66666667,   0.66666667,   4.33333333],
       [ -0.66666667,   0.2       ,   0.2       ,  -0.2       ]])

In [31]:
P, L, U = lu(A)

In [32]:
P

array([[ 0.,  0.,  1.],
       [ 1.,  0.,  0.],
       [ 0.,  1.,  0.]])

In [33]:
L

array([[ 1.        ,  0.        ,  0.        ],
       [ 0.66666667,  1.        ,  0.        ],
       [-0.66666667,  0.2       ,  1.        ]])

In [34]:
U

array([[-3.        , -1.        ,  2.        ],
       [ 0.        ,  1.66666667,  0.66666667],
       [ 0.        ,  0.        ,  0.2       ]])
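The outputs above suggest how the compact result relates to the factors returned by `scipy.linalg.lu`, which satisfy `A = P L U`: the strict lower triangle of the left block of `C` holds the multipliers (the entries of `L` below the diagonal), its upper triangle is `U`, and `D.T` plays the role of `P`. The sketch below re-runs the elimination on `A` alone using only NumPy and checks these relations; the names `LU`, `D_own`, `L_own` and `U_own` are introduced here, and `np.outer` replaces the hand-written `outer`.

```python
import numpy as np

A = np.array([[2., 1., -1.],
              [-3., -1., 2.],
              [-2., 1., 2.]])

N = A.shape[0]
LU = A.copy()                 # overwritten with the multipliers and U
D_own = np.identity(N)        # accumulated row permutation

for i in range(N - 1):
    # partial pivoting: bring the largest |entry| of column i to row i
    imax = i + np.argmax(np.abs(LU[i:, i]))
    perm = list(range(N))
    perm[i], perm[imax] = perm[imax], perm[i]
    LU, D_own = LU[perm], D_own[perm]
    # store the Gauss multipliers below the diagonal
    LU[i+1:, i] /= LU[i, i]
    # update the remaining submatrix
    LU[i+1:, i+1:] -= np.outer(LU[i+1:, i], LU[i, i+1:])

L_own = np.tril(LU, -1) + np.identity(N)   # unit lower triangular factor
U_own = np.triu(LU)                        # upper triangular factor

print(np.allclose(D_own.dot(A), L_own.dot(U_own)))          # D A = L U
print(np.allclose(A, D_own.T.dot(L_own).dot(U_own)))        # A = D^T L U
```

Both checks should print `True`, since a permutation matrix is orthogonal and `D.T` therefore undoes the row interchanges.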