# Solving linear equations

# I. $LU$ factorization of a square matrix

Let us consider a decomposition of a square $n \times n$ matrix $A$ as follows:
$$A = L \cdot U, \; \mbox{where} \; A = \begin{pmatrix} 
                                            a_{11} & a_{12} & a_{13} & \ldots & a_{1n} \\
                                            a_{21} & a_{22} & a_{23} & \ldots & a_{2n} \\
                                            a_{31} & a_{32} & a_{33} & \ldots & a_{3n} \\
                                            \vdots & \vdots & \vdots & \ddots & \vdots \\
                                            a_{n1} & a_{n2} & a_{n3} & \ldots & a_{nn} \\
                                        \end{pmatrix}
                               , \; L = \begin{pmatrix} 
                                            1 & 0 & 0 & \ldots & 0 \\
                                            * & 1 & 0 & \ldots & 0 \\
                                            * & * & 1 & \ldots & 0 \\
                                            \vdots & \vdots & \vdots & \ddots & \vdots \\
                                            * & * & * & \ldots & 1 \\
                                        \end{pmatrix}
                               , \; U = \begin{pmatrix} 
                                            u_{11} & * & * & \ldots & * \\
                                            0 & u_{22} & * & \ldots & * \\
                                            0 & 0 & u_{33} & \ldots & * \\
                                            \vdots & \vdots & \vdots & \ddots & \vdots \\
                                            0 & 0 & 0 & \ldots & u_{nn} \\
                                        \end{pmatrix} .$$

Let's start with Gaussian elimination. Working on the first column, we subtract from the second row the first row multiplied by the coefficient $$\gamma_{21} = \cfrac{a_{21}}{a_{11}};$$ then we subtract from the third row the first row multiplied by $$\gamma_{31} = \cfrac{a_{31}}{a_{11}},$$ and so on.

Thus, to eliminate all elements below $a_{11}$, we multiply the matrix $A$ from the left by the matrix $$\Lambda_1 = \begin{pmatrix} 
                                                        1 & 0 & 0 & \ldots & 0 \\
                                                        -\gamma_{21} & 1 & 0 & \ldots & 0 \\
                                                        -\gamma_{31} & 0 & 1 & \ldots & 0 \\
                                                        \vdots & \vdots & \vdots & \ddots & \vdots \\
                                                        -\gamma_{n1} & 0 & 0 & \ldots & 1 \\
                                                    \end{pmatrix}$$
(check this by applying $\Lambda_1$ to the first column of $A$).
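The action of $\Lambda_1$ can also be checked numerically. The following is a minimal sketch on a small $3 \times 3$ matrix whose values are made up for illustration (they do not come from the text above):

```python
import numpy as np

# Small illustrative matrix (hypothetical values)
A = np.array([[2., 1., 1.],
              [4., 3., 3.],
              [8., 7., 9.]])
n = A.shape[0]

# Multipliers gamma_{i1} = a_{i1} / a_{11} for the rows below the first one
gamma = A[1:, 0] / A[0, 0]

# Lambda_1: identity matrix with -gamma below the diagonal in the first column
lam1 = np.eye(n)
lam1[1:, 0] = -gamma

# Acting on the first column leaves a_{11} and zeroes everything below it
first_column = lam1 @ A[:, 0]
print(first_column)

# Acting on the whole matrix performs the first elimination step
step1 = lam1 @ A
print(step1)
```

Only the rows below the first one change; the first row of `A` is left untouched.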

Likewise, we can construct matrix $\Lambda_2$ as $$\Lambda_2 = \begin{pmatrix} 
                                                                1 & 0 & 0 & \ldots & 0 \\
                                                                0 & 1 & 0 & \ldots & 0 \\
                                                                0 & -\gamma_{32} & 1 & \ldots & 0 \\
                                                                \vdots & \vdots & \vdots & \ddots & \vdots \\
                                                                0 & -\gamma_{n2} & 0 & \ldots & 1 \\
                                                            \end{pmatrix}.$$

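After $n-1$ such steps, we get the upper triangular matrix $$U = \Lambda_{n-1} \cdot \ldots \cdot \Lambda_2 \cdot \Lambda_1 A. $$
Multiplying both sides from the left by $\Lambda_{n-1}^{-1}$, then by $\Lambda_{n-2}^{-1}$, and so on down to $\Lambda_1^{-1}$, gives $A = \Lambda_1^{-1} \cdot \Lambda_2^{-1} \cdot \ldots \cdot \Lambda_{n-1}^{-1} U$. Hence the lower triangular matrix is $L = \Lambda_1^{-1} \cdot \Lambda_2^{-1} \cdot \ldots \cdot \Lambda_{n-1}^{-1}.$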

One can show that, for example, $$\Lambda_1^{-1} = \begin{pmatrix} 
                                                        1 & 0 & 0 & \ldots & 0 \\
                                                        \gamma_{21} & 1 & 0 & \ldots & 0 \\
                                                        \gamma_{31} & 0 & 1 & \ldots & 0 \\
                                                        \vdots & \vdots & \vdots & \ddots & \vdots \\
                                                        \gamma_{n1} & 0 & 0 & \ldots & 1 \\
                                                    \end{pmatrix},$$
that is, the inverse is obtained by flipping the signs of the coefficients $\gamma_{i1}$.
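As a quick numerical check of this inverse, the sketch below builds $\Lambda_1$ and the claimed $\Lambda_1^{-1}$ from made-up coefficients (the values of `g1` and `g2` are hypothetical) and verifies that their product is the identity. It also illustrates why $L$ is cheap to assemble: the product $\Lambda_1^{-1} \Lambda_2^{-1}$ simply places both sets of coefficients side by side.

```python
import numpy as np

n = 4
g1 = np.array([0.5, -2.0, 3.0])   # hypothetical gamma_{21}, gamma_{31}, gamma_{41}
g2 = np.array([1.5, 0.25])        # hypothetical gamma_{32}, gamma_{42}

# Lambda_1 and its claimed inverse differ only in the sign of the first column
lam1 = np.eye(n)
lam1[1:, 0] = -g1
lam1_inv = np.eye(n)
lam1_inv[1:, 0] = g1

# Claimed inverse of Lambda_2
lam2_inv = np.eye(n)
lam2_inv[2:, 1] = g2

# The product of Lambda_1 and the claimed inverse is the identity
is_inverse = np.allclose(lam1 @ lam1_inv, np.eye(n))
print(is_inverse)

# The product of the inverses keeps each column of coefficients in place
product = lam1_inv @ lam2_inv
print(product)
```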

Note that we use `numpy` arrays to represent matrices (do **not** use `np.matrix`).

In [42]:
import numpy as np

def diy_lu_ext(a):
    """
    Construct the LU decomposition of the input matrix.
    
    Naive LU decomposition: work column by column, accumulate elementary triangular matrices.
    No pivoting.
    """
    N = a.shape[0]
    
    #Initializing the factors
    u = a.copy()
    L = np.eye(N)
    
    for j in range(N-1):
        lam = np.eye(N)
        
        #Creating the vector of gammas
        gamma = np.zeros(N-j-1)
        for i in range(N-j-1):
            gamma[i] = u[j+1+i, j]/u[j, j]
        
        #Creating matrix \Lambda_i
        for i in range(N-j-1):
            lam[j+1+i, j] = -gamma[i]
        
        #Acting with \Lambda_i on A to get U
        u_new = np.zeros((N, N))
        for ind_i in range(N):
            for ind_j in range(N):
                for ind_k in range(N):
                    u_new[ind_i, ind_j] += lam[ind_i, ind_k] * u[ind_k, ind_j]
        u = u_new.copy()
        
        #Creating matrix \Lambda_i^{-1}
        for i in range(N-j-1):
            lam[j+1+i, j] = gamma[i]
            
        #Multiplying L and \Lambda_i^{-1} to get new L
        L_new = np.zeros((N, N))
        for ind_i in range(N):
            for ind_j in range(N):
                for ind_k in range(N):
                    L_new[ind_i, ind_j] += L[ind_i, ind_k] * lam[ind_k, ind_j]
        L = L_new.copy()
        
    return L, u

In [43]:
# Now, generate a full rank matrix and test the naive implementation

import numpy as np

N = 6
a = np.zeros((N, N), dtype=float)
for i in range(N):
    for j in range(N):
        a[i, j] = 3. / (0.6*i*j + 1)

np.linalg.matrix_rank(a)

6

In [44]:
L, u = diy_lu_ext(a)
print(L@u - a)

[[ 0.000e+00  0.000e+00  0.000e+00  0.000e+00  0.000e+00  0.000e+00]
 [ 0.000e+00  0.000e+00  0.000e+00  0.000e+00  0.000e+00  0.000e+00]
 [ 0.000e+00  0.000e+00 -1.110e-16  1.110e-16  1.110e-16 -5.551e-17]
 [ 0.000e+00  0.000e+00  3.331e-16 -2.220e-16 -5.551e-17  0.000e+00]
 [ 0.000e+00  0.000e+00  0.000e+00 -1.110e-16 -1.665e-16  0.000e+00]
 [ 0.000e+00  0.000e+00 -1.110e-16 -2.776e-16  1.110e-16  1.110e-16]]


LU can be programmed more simply by using `numpy` slicing and the `@` matrix product.

In [45]:
import numpy as np

def diy_lu(a):
    """
    Construct the LU decomposition of the input matrix.
    
    Naive LU decomposition: work column by column, accumulate elementary triangular matrices.
    No pivoting.
    """
    N = a.shape[0]
    
    u = a.copy()
    L = np.eye(N)
    for j in range(N-1):
        lam = np.eye(N)
        
        #Creating the vector of gammas
        gamma = u[j+1:, j] / u[j, j]
        
        #Creating matrix \Lambda_i
        lam[j+1:, j] = -gamma
        
        #Acting with \Lambda_i on A to get U
        u = lam @ u
        
        #Creating matrix \Lambda_i^{-1}
        lam[j+1:, j] = gamma
            
        #Multiplying L and \Lambda_i^{-1} to get new L
        L = L @ lam
    return L, u

In [46]:
# Tweak the printing of floating-point numbers, for clarity
np.set_printoptions(precision=3)

In [47]:
L, u = diy_lu(a)

print(L, "\n")
print(u, "\n")

# Quick sanity check: L times U must equal the original matrix, up to floating-point errors.
print(L@u - a)

[[1.    0.    0.    0.    0.    0.   ]
 [1.    1.    0.    0.    0.    0.   ]
 [1.    1.455 1.    0.    0.    0.   ]
 [1.    1.714 1.742 1.    0.    0.   ]
 [1.    1.882 2.276 2.039 1.    0.   ]
 [1.    2.    2.671 2.944 2.354 1.   ]] 

[[ 3.000e+00  3.000e+00  3.000e+00  3.000e+00  3.000e+00  3.000e+00]
 [ 0.000e+00 -1.125e+00 -1.636e+00 -1.929e+00 -2.118e+00 -2.250e+00]
 [ 0.000e+00  0.000e+00  2.625e-01  4.574e-01  5.975e-01  7.013e-01]
 [ 0.000e+00  2.220e-16  0.000e+00 -2.197e-02 -4.480e-02 -6.469e-02]
 [ 0.000e+00 -4.528e-16  0.000e+00  6.939e-18  8.080e-04  1.902e-03]
 [ 0.000e+00  4.123e-16  0.000e+00 -1.634e-17  0.000e+00 -1.585e-05]] 

[[ 0.000e+00  0.000e+00  0.000e+00  0.000e+00  0.000e+00  0.000e+00]
 [ 0.000e+00  0.000e+00  0.000e+00  0.000e+00  0.000e+00  0.000e+00]
 [ 0.000e+00  0.000e+00 -1.110e-16  1.110e-16  1.110e-16 -5.551e-17]
 [ 0.000e+00  0.000e+00  3.331e-16 -2.220e-16 -5.551e-17  0.000e+00]
 [ 0.000e+00  0.000e+00  0.000e+00 -1.110e-16 -1.665e-16  0.000e+00]
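
The printed `u` contains entries of order $10^{-16}$ below the diagonal. These are rounding residue, not genuine non-zeros. Below is a minimal sketch of how the structure of the factors could be checked programmatically. It repeats the test matrix and the `diy_lu` routine from the cells above so that it runs on its own. The tolerances are the `np.allclose` defaults, which is an assumption that suits this well-scaled example.

```python
import numpy as np

def diy_lu(a):
    """Same no-pivoting routine as above, repeated so this cell is self-contained."""
    N = a.shape[0]
    u = a.copy()
    L = np.eye(N)
    for j in range(N - 1):
        gamma = u[j+1:, j] / u[j, j]
        lam = np.eye(N)
        lam[j+1:, j] = -gamma
        u = lam @ u
        lam[j+1:, j] = gamma
        L = L @ lam
    return L, u

# Same test matrix as above: a[i, j] = 3 / (0.6*i*j + 1)
N = 6
i, j = np.indices((N, N))
a = 3.0 / (0.6 * i * j + 1)

L, u = diy_lu(a)

# L is lower triangular with ones on the diagonal
L_ok = np.allclose(L, np.tril(L)) and np.allclose(np.diag(L), 1.0)

# u is upper triangular up to rounding errors
u_ok = np.allclose(u, np.triu(u))

# The product reproduces the original matrix up to rounding errors
product_ok = np.allclose(L @ u, a)

print(L_ok, u_ok, product_ok)
```

If exact zeros below the diagonal are wanted, `np.triu(u)` discards the rounding residue.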
 

# II. The need for pivoting

Let's tweak the matrix a little bit by changing a single element:

In [48]:
a1 = a.copy()
a1[1, 1] = 3

The resulting matrix still has full rank, but the naive LU routine breaks down.

In [49]:
np.linalg.matrix_rank(a1)

6

In [50]:
l, u = diy_lu(a1)

print(l, u)

[[nan nan nan nan nan nan]
 [nan nan nan nan nan nan]
 [nan nan nan nan nan nan]
 [nan nan nan nan nan nan]
 [nan nan nan nan nan nan]
 [nan nan nan nan nan nan]] [[nan nan nan nan nan nan]
 [nan nan nan nan nan nan]
 [nan nan nan nan nan nan]
 [nan nan nan nan nan nan]
 [nan nan nan nan nan nan]
 [nan nan nan nan nan nan]]
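
To see where the `nan` values come from, it helps to perform just the first elimination step by hand. The sketch below rebuilds `a1` so that it runs on its own and repeats the first iteration of `diy_lu`. After that step the next pivot `u[1, 1]` is zero, so the multipliers for the second column are infinite, and the subsequent matrix products turn into `nan`.

```python
import numpy as np

# Rebuild a1: the test matrix with a single modified element
N = 6
i, j = np.indices((N, N))
a1 = 3.0 / (0.6 * i * j + 1)
a1[1, 1] = 3

# First elimination step, as in diy_lu
gamma = a1[1:, 0] / a1[0, 0]
lam = np.eye(N)
lam[1:, 0] = -gamma
u = lam @ a1

# The pivot for the second column has been eliminated along with a1[1, 0]
pivot = u[1, 1]
print(pivot)

# Dividing by the zero pivot gives infinite multipliers (warning silenced here)
with np.errstate(divide="ignore"):
    next_gamma = u[2:, 1] / pivot
print(next_gamma)
```

The first two rows of `a1` start with the same two entries, $(3, 3)$, so subtracting the first row from the second one cancels both at once.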




### Test II.1

For a naive LU decomposition to work, all leading principal minors of the matrix must be non-zero. Check whether this requirement is satisfied for the two matrices `a` and `a1`.

In [51]:
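def check_principal_minors(matrix):
    """Return True if the leading principal minors of orders 1..N-1 are all non-zero."""
    # The minors are compared with zero exactly; the full determinant (order N) is not needed.
    return all(np.linalg.det(matrix[:i, :i]) != 0 for i in range(1, matrix.shape[0]))


print(f"All leading principal minors of a are non-zero: {check_principal_minors(a)}")
print(f"All leading principal minors of a1 are non-zero: {check_principal_minors(a1)}")


All leading principal minors of a are non-zero: True
All leading principal minors of a1 are non-zero: False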


### Test II.2

Modify the `diy_lu` routine to implement column pivoting. Keep track of the pivots: you can construct either a permutation matrix or a swap array (your choice).

Implement a function to reconstruct the original matrix from a decomposition. Test your routines on the matrices `a` and `a1`.

In [52]:
def diy_lu_pivot(a):
    
    N = a.shape[0]
    
    P = np.arange(N)
    
    u = a.copy()
    L = np.eye(N)
    for j in range(N - 1):
        lam = np.eye(N)
        max_row_index = np.argmax(abs(u[j:, j])) + j
        P[[j, max_row_index]] = P[[max_row_index, j]]
        u[[j, max_row_index]] = u[[max_row_index, j]]
        gamma = u[j+1:, j] / u[j, j]
        lam[j+1:, j] = -gamma
        u = lam @ u
        lam[j+1:, j] = gamma
        L = L @ lam
    
    return L, u, P

L1, u1, P1 = diy_lu_pivot(a1)
L, u, P = diy_lu_pivot(a)

In [53]:
def original_matrix_reconstruction(l, u):
    return l @ u

print(original_matrix_reconstruction(L1, u1), "\n")
print(original_matrix_reconstruction(L, u))

[[ 3.     3.     3.     3.     3.     3.   ]
 [ 3.     0.75   0.429  0.3    0.231  0.188]
 [ 3.     1.364 -0.506 -0.892 -1.132 -1.295]
 [ 3.     1.071  0.652  0.424  0.292  0.206]
 [ 3.     0.882  0.517  0.366  0.284  0.232]
 [ 3.     3.     2.752  2.661  2.605  2.567]] 

[[3.    3.    3.    3.    3.    3.   ]
 [3.    0.75  0.429 0.3   0.231 0.188]
 [3.    1.364 0.779 0.458 0.253 0.111]
 [3.    1.071 0.652 0.473 0.375 0.313]
 [3.    0.882 0.517 0.366 0.283 0.23 ]
 [3.    1.875 1.467 1.262 1.138 1.055]]
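
Note that the products printed above do not reproduce the input matrices. For `a`, the last row starts with `3, 1.875`, whereas `a[5, 1]` is $3 / (0.6 \cdot 5 + 1) = 0.75$. The likely cause is that `diy_lu_pivot` swaps rows of `u` without swapping the multipliers already stored in `L`, and that the reconstruction ignores the swap array `P`. What follows is a minimal sketch of one possible fix, not a definitive implementation. The names `lu_partial_pivot` and `reconstruct` are hypothetical, and the routine assumes a non-singular input (a zero pivot after the row swap is not handled).

```python
import numpy as np

def lu_partial_pivot(a):
    """LU with row swaps; returns L, u, perm with a[perm] equal to L @ u up to rounding."""
    N = a.shape[0]
    u = a.astype(float)          # astype returns a copy, the input is not modified
    L = np.eye(N)
    perm = np.arange(N)
    for j in range(N - 1):
        # Row holding the largest |entry| of column j, on or below the diagonal
        p = j + np.argmax(np.abs(u[j:, j]))
        if p != j:
            u[[j, p]] = u[[p, j]]
            perm[[j, p]] = perm[[p, j]]
            # Swap the multipliers already stored in columns 0..j-1 of L as well
            L[[j, p], :j] = L[[p, j], :j]
        gamma = u[j+1:, j] / u[j, j]
        L[j+1:, j] = gamma
        # Subtract gamma times the pivot row from every row below it
        u[j+1:, :] -= np.outer(gamma, u[j, :])
        u[j+1:, j] = 0.0         # remove rounding residue below the pivot
    return L, u, perm

def reconstruct(L, u, perm):
    """Undo the row permutation and return the original matrix."""
    a = np.empty_like(u)
    a[perm] = L @ u
    return a

# Same test matrices as above
N = 6
i, j = np.indices((N, N))
a = 3.0 / (0.6 * i * j + 1)
a1 = a.copy()
a1[1, 1] = 3

for matrix in (a, a1):
    L, u, perm = lu_partial_pivot(matrix)
    print(np.allclose(reconstruct(L, u, perm), matrix))
```

Storing the multipliers directly in `L` avoids the explicit products with $\Lambda_j$ and makes it straightforward to apply each row swap to the part of `L` that has already been filled in.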


Sum all elements of the matrices `L1` and `u1` separately (for Google Form).

In [54]:
print(np.sum(L1))
print(np.sum(u1))

15.166249240140564
-3.456659008328081
