# MS 141 Lecture 7 

# Linear systems

### Read: Chapter 6 Newman. Optional but recommended: Chapter 2 of Heath's book.

In [1]:
%matplotlib inline

Linear algebra calculations are ubiquitous in scientific computing. They appear in nearly all scientific problems and in subfields as diverse as data fitting and machine learning, computer graphics and image processing, ordinary and partial differential equations, and network theory.

An essential task is solving a [system of linear equations](https://en.wikipedia.org/wiki/System_of_linear_equations) with the form

\begin{equation}
  A {\bf x} = {\bf b},
\end{equation}

where $A$ is a square matrix with size $N \times N$ and ${\bf b}$ is a vector with $N$ components. In a typical scenario, $A$ and $\mathbf{b}$ are known, and we wish to solve for the unknown solution vector ${\bf x}$. This task is equivalent to solving $N$ simultaneous linear equations for the $N$ unknown components of $\mathbf{x}$. 

Recall that not all linear systems can be solved. If the matrix $A$ is nonsingular, which is equivalent to requiring that $\det(A) \ne 0$, then the matrix can be inverted, and there is a unique solution. We will work under this assumption. 

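A seemingly obvious approach could be to compute the inverse of $A$ and then evaluate ${\bf x} = A^{-1}{\bf b}$. However, inverting matrices (especially large ones) is very computationally expensive, with cost of order $N^3$, where $N$ is the size of the matrix, and with a prefactor roughly three times larger than that of solving the system directly by Gaussian elimination; it also tends to be less accurate. **Matrix inversion is never used in practice** to solve linear systems, and is almost never used in practice in *any* computational method. 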

Another bad idea would be to apply Cramer's rule, in which each component of the solution is computed as a ratio of determinants. This approach would be astronomically expensive even for relatively small matrices. This and other textbook approaches are mostly useful only as theoretical tools.

There are two main families of approaches to solve linear systems of equations:
- **Direct** approaches that obtain an exact solution (within numerical accuracy) with a finite number of steps. 
- **Iterative** approaches in which an approximate solution is found through successive approximations that converge to the exact solution.

## 1. Direct methods 
Most direct methods use the same core approach: convert the linear system to an equivalent problem that is easy to solve. 

Let us consider which linear systems are easy to solve. If the matrix is diagonal, finding the solution is trivial: $x_i = b_i / a_{ii}$.<br> 
More generally, it is straightforward to solve a linear system if the matrix is *triangular*. As an example, consider

\begin{equation}
  \begin{pmatrix} 1 & 2 & 3 \\ 0 & 4 & 5 \\ 0 & 0 & 2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 6 \\ 9 \\ 2 \end{pmatrix}.
\end{equation}

We can perform **back substitution** by solving from the bottom to the top. We first read off and solve the equation defined by the bottom row of the matrix:

\begin{equation}
  2\, x_3 = 2 \qquad \implies \qquad x_3 = 1.
\end{equation}

We then move up to solve the equation defined by the second row from the bottom, using the solution we found for $x_3$:

\begin{equation}
  4 x_2 + 5 x_3 = 9 \qquad \implies \qquad 4 x_2 = 4 \qquad \implies \qquad x_2 = 1.
\end{equation}

Finally, we solve the equation defined by the first row of the matrix, using the solutions we found for $x_2$ and $x_3$:

\begin{equation}
  x_1 + 2 x_2 + 3 x_3 = 6 \qquad \implies \qquad x_1 + 5 = 6 \qquad \implies \qquad x_1 = 1.
\end{equation}

Back substitution will work for any *upper* triangular matrix, provided that no diagonal element is zero, as is guaranteed by our assumption that the matrix is non-singular. If the matrix is lower triangular we would start at the top and work our way down, using **forward substitution**.
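
The two substitution procedures can be sketched in a few lines of NumPy. This is a minimal illustration rather than a library routine: the function names `back_substitution` and `forward_substitution` are our own, and the code assumes a square triangular matrix with non-zero diagonal entries. The upper triangular test case is the $3 \times 3$ example above; the lower triangular test simply reuses its transpose.

```python
import numpy as np

def back_substitution(U, b):
    """Solve U x = b for an upper triangular U with non-zero diagonal."""
    N = len(b)
    x = np.zeros(N)
    for i in range(N - 1, -1, -1):  # from the last row up
        # subtract the contribution of the already-known x[i+1:], then divide
        x[i] = (b[i] - np.dot(U[i, i+1:], x[i+1:])) / U[i, i]
    return x

def forward_substitution(L, b):
    """Solve L x = b for a lower triangular L with non-zero diagonal."""
    N = len(b)
    x = np.zeros(N)
    for i in range(N):  # from the first row down
        x[i] = (b[i] - np.dot(L[i, :i], x[:i])) / L[i, i]
    return x

U = np.array([[1., 2, 3],
              [0, 4, 5],
              [0, 0, 2]])
b = np.array([6., 9, 2])

x_upper = back_substitution(U, b)       # solves U x = b
x_lower = forward_substitution(U.T, b)  # solves U^T x = b (lower triangular)

print(x_upper)
print(np.dot(U.T, x_lower) - b)  # residual of the lower triangular solve
```

Each row costs one dot product of length at most $N$, so both routines require of order $N^2$ operations.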

### 1.1 Gaussian elimination

We would like to convert our linear system to an equivalent system with triangular form. By equivalent, we mean that the new system has the same solution as the original. We will aim to construct an upper triangular matrix. 

Think of each matrix row as corresponding to a single linear equation, and each column corresponding to a single unknown variable.<br> 
The **three basic row operations** we can perform without changing the solution are:

1. Multiply a row by a non-zero scalar (equivalent to rescaling an equation).
2. Add to one row a scalar multiple of another.
3. Swap two rows (equivalent to changing the order of the equations).

The operations need to be performed consistently on both the entries of the matrix $A$ *and* the right-hand side vector ${\bf b}$.

**[Gaussian elimination](https://en.wikipedia.org/wiki/Gaussian_elimination)** is the basic algorithm to reduce the matrix $A$ with arbitrary elements to an upper triangular matrix.<br> 
This result is achieved by performing a sequence of row operations. Once the system is in upper triangular form, it can be solved by back substitution.

The algorithm considers each column in turn, annihilating the elements below the diagonal in each column to obtain an upper triangular matrix:

1. Consider column $j$, where $j = 1 \ldots N$. Its diagonal element $a_{jj}$ is called the *pivot*. We will assume for now that all pivots are non-zero.
2. To annihilate the entries below the pivot $a_{jj}$, for each row $i = j+1 \ldots N$ we multiply row $j$ by the multiplier $m_i = - a_{ij}/a_{jj}$ and add the result to row $i$. 

Explicitly, this means replacing the elements $a_{ik}$ in row $i > j$ as
\begin{equation}
  \boxed{a_{ik} \rightarrow a_{ik} - \frac{a_{ij}}{a_{jj}} a_{jk}.}
\end{equation}

We see that when $k = j$ the coefficients $a_{ij}$ below the pivot will be set to zero, as required.
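
As a worked illustration of one elimination step, consider a $4 \times 4$ system written as an augmented matrix, with the right-hand side ${\bf b}$ placed to the right of the vertical bar (the augmented notation is used here only for compactness). The pivot is $a_{11} = 2$, and the multipliers for rows 2, 3 and 4 are $m_2 = -3/2$, $m_3 = -1/2$ and $m_4 = -1$. Applying the boxed formula to the first column gives

$$
\left(\begin{array}{cccc|c}
  2 & 1 & 4 & 1 & -4 \\
  3 & 4 & -1 & -1 & 3 \\
  1 & -4 & 1 & 5 & 9 \\
  2 & -2 & 1 & 3 & 7
\end{array}\right)
\quad \longrightarrow \quad
\left(\begin{array}{cccc|c}
  2 & 1 & 4 & 1 & -4 \\
  0 & 2.5 & -7 & -2.5 & 9 \\
  0 & -4.5 & -1 & 4.5 & 11 \\
  0 & -3 & -3 & 2 & 11
\end{array}\right)
$$

The same step is then repeated on the second column with pivot $2.5$, and so on.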

If we repeat this algorithm for all columns $j$, each time looping over rows below column $j$, we will obtain an upper triangular matrix.<br> 
Let us write this algorithm using a modified version of the code provided in Newman's book.

In [3]:
# Gaussian elimination with back substitution
import numpy as np

def GaussianEliminationBack(A,b):
    # A and b must be float arrays; both are overwritten in place
    # Store size of system, consistency check
    N = len(b)
    assert(np.all(A.shape == (N, N)))
    
    for j in range(N):
        
        #check that pivot is non-zero
        assert(A[j,j] != 0)
        pivot = A[j,j]
        
        for i in range(j+1,N):
            mult = -A[i,j]/pivot
            
            A[i,:] += mult*A[j,:] #update row i>j, boxed formula
            b[i] += mult*b[j]  #update RHS vector
    
    # debug:
    #print (A)
    #print (b)
    
    # back substitution
    x = np.zeros(N)
    for i in range(N-1,-1,-1): #from the last row up
        x[i] = b[i]/A[i,i]
        
        for j in range(i+1,N):
            x[i] -= A[i,j]*x[j]/A[i,i]
    
    print('Solution: ', x)
    return(x)

In [4]:
A = np.array([[2., 1, 4, 1], \
              [3, 4, -1, -1], \
              [1, -4, 1, 5], \
             [2, -2, 1, 3]])

b = np.array([-4., 3, 9, 7])

# debug:
# print (A)
# print (b)

x = GaussianEliminationBack(A,b)

Solution:  [ 2. -1. -2.  1.]


In [5]:
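# check the result with SciPy
# (A and b were overwritten above; the reduced system is equivalent,
# so its solution is unchanged)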
import scipy.linalg as la
x = la.solve(A, b)

print ('SciPy\'s result:')
print (x)

SciPy's result:
[ 2. -1. -2.  1.]


### 1.2 Gaussian elimination with partial pivoting

There are two severe limitations of plain Gaussian elimination:<br> 
1) The algorithm fails if the matrix $A$ acquires a zero pivot during the elimination process (or if $a_{11}=0$ at the start).<br> 
2) If a pivot is much smaller in magnitude than the entries below it, the multipliers become large, the matrix elements grow progressively during elimination, and accuracy is lost through the build-up of rounding errors. 
This was one of the main concerns of von Neumann and colleagues in the early days of scientific computing.

The solution to these problems is as simple as swapping rows: at each step, we choose as the pivot the entry of largest magnitude on or below the diagonal in the current column.<br> 
The approach of swapping rows to choose the largest pivot is called *partial pivoting* and is always used in practical implementations of Gaussian elimination.<br>
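
The effect of a tiny pivot can be seen on a $2 \times 2$ system. The sketch below is only an illustration: the helper `solve_2x2` and the test matrix with $a_{11} = 10^{-20}$ are our own choices, not taken from Newman's book. The exact solution is very close to $x_1 = x_2 = 1$.

```python
import numpy as np

def solve_2x2(A, b, pivot):
    """Gaussian elimination on a 2x2 system, with an optional row swap."""
    A = np.array(A, dtype=float)  # work on copies, leave the inputs untouched
    b = np.array(b, dtype=float)
    if pivot and abs(A[1, 0]) > abs(A[0, 0]):
        A[[0, 1]] = A[[1, 0]]     # swap the two rows of A
        b[[0, 1]] = b[[1, 0]]     # and the two entries of b
    mult = -A[1, 0] / A[0, 0]
    A[1, :] += mult * A[0, :]     # eliminate the entry below the pivot
    b[1] += mult * b[0]
    x2 = b[1] / A[1, 1]           # back substitution
    x1 = (b[0] - A[0, 1] * x2) / A[0, 0]
    return np.array([x1, x2])

A = [[1e-20, 1.0],
     [1.0,   1.0]]
b = [1.0, 2.0]

x_naive = solve_2x2(A, b, pivot=False)
x_pivot = solve_2x2(A, b, pivot=True)
print('without pivoting:', x_naive)
print('with pivoting:   ', x_pivot)
```

Without the row swap the multiplier is of order $10^{20}$, the original entries of the second row are lost to rounding, and $x_1$ comes out as 0 instead of 1. With the swap the multiplier is of order $10^{-20}$ and both components are recovered.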

The algorithm below implements Gaussian elimination with partial pivoting.

In [6]:
# Gaussian elimination with partial pivoting and backsubstitution (3 loops)
import numpy as np

def GaussianEliminationPivotingBack(A,b):
    # A and b must be float arrays; both are overwritten in place
    # Store size of system, consistency check
    N = len(b)
    assert(np.all(A.shape == (N, N)))
    
    for j in range(N): # Loop 1
        
        #find row with pivot of largest magnitude
        max_row = np.argmax(abs(A[j:,j]))
        #print (max_row)
        
        # swap rows if needed
        if (max_row != 0):
            print ('swapping rows ', j, ' and', j+max_row)
            
            tmp_A = np.copy(A[j, :])
            A[j, :] = np.copy(A[j+max_row, :])
            A[j+max_row, :] = np.copy(tmp_A)
            
            tmp_b = np.copy(b[j])
            b[j] = np.copy(b[j+max_row])
            b[j+max_row] = np.copy(tmp_b)
            
            #debug:
            #print (A,b)
            
        pivot = A[j,j]
        #print (pivot)
        
        for i in range(j+1,N):  # Loop 2
            mult = -A[i,j]/pivot
            
            A[i,:] += mult*A[j,:] # Loop 3: update row i>j
            b[i] += mult*b[j]  #update RHS vector
    
            #debug:
            #print (A,b)
            
    # optional: print the upper triangular matrix
    #print (A)
    #print (b)
    
    # Backsubstitution (two loops)
    x = np.zeros(N)
    for i in range(N-1,-1,-1): #from the last row up
        x[i] = b[i]/A[i,i]
        
        for j in range(i+1,N):
            x[i] -= A[i,j]*x[j]/A[i,i]
    
    print('Solution: ', x)
    return(x)

In [7]:
A = np.array([[0., 1, 4, 1], \
              [3, 4, -1, -1], \
              [1, -4, 1, 5], \
             [2, -2, 1, 3]])

b = np.array([-4., 3, 9, 7])

x = GaussianEliminationPivotingBack(A,b)
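# check the result: A and b were overwritten in place, so this is the
# residual of the reduced (upper triangular) system
print (np.dot(A,x)-b) # Ax-b=0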

swapping rows  0  and 1
swapping rows  1  and 2
Solution:  [ 1.61904762 -0.42857143 -1.23809524  1.38095238]
[ 0.00000000e+00  0.00000000e+00 -4.44089210e-16 -2.22044605e-16]


Partial pivoting reduces rounding errors: because the pivot is the entry of largest magnitude in its column, every multiplier satisfies $|m_i| \le 1$, which limits the growth of the matrix elements during elimination. <br>
In the code above, we kept track of loops (each performing order $N$ operations) in both Gaussian elimination and backsubstitution.<br> 
This allows us to estimate the computational cost:
- Gaussian elimination has 3 nested loops, one of which is hidden in the code above by a vectorized NumPy operation. We thus expect a computational cost of order $N^3$. More precisely, one can show that Gaussian elimination requires about $N^3/3$ multiplications (and a similar number of additions) for a linear system of size $N\times N$.
- Backsubstitution has only 2 nested loops, so its computational cost is of order $N^2$ (about $N^2/2$ multiplications) and is not dominant for large systems.
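
These estimates can be checked by counting operations directly. The sketch below counts only the multiplications in the row updates, and it assumes that each update is restricted to the columns to the right of the pivot, since the entries to the left are already zero (the code above updates the full row for simplicity). The function name `count_multiplications` is our own.

```python
def count_multiplications(N):
    """Count multiplications in elimination and back substitution (N x N system)."""
    elimination = 0
    for j in range(N):                # loop over pivot columns
        for i in range(j + 1, N):     # loop over rows below the pivot
            elimination += N - 1 - j  # one multiplication per column k > j
    back = 0
    for i in range(N - 1, -1, -1):    # loop over rows, last row first
        back += N - 1 - i             # one multiplication per known x[j], j > i
    return elimination, back

for N in (10, 100, 1000):
    elimination, back = count_multiplications(N)
    # ratios to the leading-order estimates N^3/3 and N^2/2
    print(N, elimination / (N**3 / 3), back / (N**2 / 2))
```

Both printed ratios approach 1 as $N$ grows, consistent with the leading-order estimates.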

### 1.3 LU Decomposition

In *decomposition methods*, we rewrite the matrix $A$ as a product of two or more matrices, chosen to obtain a set of easy-to-solve problems. We will show below that Gaussian elimination is equivalent to decomposing the original matrix as $A = LU$, namely as the product of a lower-triangular matrix $L$ and an upper triangular matrix $U$. Storing and using the matrices $L$ and $U$ makes it possible to perform efficiently a wide range of linear algebra calculations.<br> 

In Gaussian elimination, we performed row operations to transform the linear system $ A {\bf x} = {\bf b} $
to an equivalent system that was easy to solve.<br> 
The matrix $A$ was transformed to an upper triangular matrix $U$, and we solved the corresponding system by back substitution. 

The row operations performed on $A$ during Gaussian elimination can equivalently be written as a series of matrix multiplications. Annihilating the entries in column $j$ below pivot $a_{jj}$ can be achieved by multiplying $A$ on the left by a lower-triangular annihilation matrix $M_j$: 

$$
M_j = \begin{bmatrix} 
    1      & \ldots & 0 &  0 & \ldots & 0\\
    \vdots & \ddots & \vdots &  \vdots & \ddots & \vdots  \\
    0 &  \ldots &  1  & 0 & \ldots & 0 \\
    0 & \ldots & m_{j+1} & 1 & \ldots & 0 \\
    \vdots & \ddots & \vdots &  \vdots & \ddots & \vdots  \\
    0 & \ldots & m_N & 0 & \ldots & 1
    \end{bmatrix} $$


with elements $m_i$ equal to the multipliers defined above, $m_i = - a_{ij}\,/\,a_{jj}$, for $i = j+1,\ldots,N$.<br> 
    
The row operations performed in Gaussian elimination without pivoting are equivalent to applying $N-1$ lower triangular matrices $M_j$ to matrix $A$, which reduces it to an upper triangular matrix $U$: 

$$ (M_{N-1} \ldots M_1)\, A = U $$

The inverse of each lower triangular matrix $M_j$ is still lower triangular, and we write it as $L_j = M_j^{-1}$; it is obtained from $M_j$ simply by flipping the sign of the multipliers. 
The product of all matrices $L_j$ is the lower triangular matrix $L = L_1 \,L_2 \ldots L_{N-1}$, with diagonal elements equal to 1 and with the negatives of the multipliers, $-m_i$, as its elements below the diagonal. Therefore, we can write:

$$ A = (M_{N-1} \ldots M_1)^{-1}\,U = (L_1 \,L_2 \ldots L_{N-1})\,U = LU$$

In practice, one can decompose $A$ in its $LU$ form without multiplying all the matrices $M_j$, but simply by keeping track of the lower-diagonal entries (multipliers) used to reduce $A$ to upper triangular form during Gaussian elimination. Typically, the matrices $L$ and $U$ are stored together by overwriting $A$ (the diagonal elements of $L$ are equal to 1 and need not be stored). 
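
A minimal sketch of this bookkeeping is given below, assuming no pivoting is needed (all pivots non-zero). The function name `lu_by_elimination` is our own, and for clarity $L$ and $U$ are stored as separate arrays rather than by overwriting $A$.

```python
import numpy as np

def lu_by_elimination(A):
    """LU decomposition by Gaussian elimination without pivoting.

    Assumes all pivots are non-zero. Returns unit lower triangular L
    and upper triangular U such that A = L U.
    """
    U = np.array(A, dtype=float)  # working copy, reduced to upper triangular form
    N = U.shape[0]
    L = np.eye(N)                 # diagonal of L is 1
    for j in range(N - 1):
        for i in range(j + 1, N):
            L[i, j] = U[i, j] / U[j, j]   # negative of the multiplier m_i
            U[i, :] -= L[i, j] * U[j, :]  # same row update as in elimination
    return L, U

A = np.array([[2., 1, 4, 1],
              [3, 4, -1, -1],
              [1, -4, 1, 5],
              [2, -2, 1, 3]])

L, U = lu_by_elimination(A)
print(np.dot(L, U) - A)  # should vanish up to rounding
```

No extra arithmetic is needed to build $L$: each entry is a ratio that Gaussian elimination computes anyway.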

Once we obtain the $A=LU$ decomposition, the linear system can be solved for an arbitrary number of right-hand side (RHS) vectors $\mathbf{b}$ by solving two equivalent triangular systems:

\begin{equation}
  A {\bf x} = {\bf b} \qquad \Leftrightarrow \qquad \left\{ \begin{aligned} L {\bf y} & = {\bf b}, \\ U {\bf x} & = {\bf y}. \end{aligned} \right. 
\end{equation}

As both $L$ and $U$ are triangular, the two systems they define can easily be solved using forward and backward substitution, respectively.<br> By storing the matrices $L$ and $U$, we can solve the system $A \mathbf{x} = \mathbf{b}$ for any number of RHS vectors without performing Gaussian elimination each time. 

The great advantage is that while Gaussian elimination and $LU$ decomposition each cost $\mathcal{O}(N^3)$,  performing back or forward substitution costs $\mathcal{O}(N^2)$.<br> 
The [`LAPACK` routine `dgesv`](http://www.netlib.org/lapack/explore-html/d7/d3b/group__double_g_esolve_ga5ee879032a8365897c3ba91e3dc8d512.html) performs $LU$ decomposition and returns the matrix $A$ in its $LU$ decomposed form and the solutions to $A \mathbf{x} = \mathbf{b}$ for any number of RHS vectors.
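
In Python, the same factor-once, solve-many pattern is available through SciPy's `lu_factor` and `lu_solve` functions. The sketch below assumes SciPy is installed; the second right-hand side `b2` is an arbitrary vector chosen for illustration.

```python
import numpy as np
import scipy.linalg as la

A = np.array([[2., 1, 4, 1],
              [3, 4, -1, -1],
              [1, -4, 1, 5],
              [2, -2, 1, 3]])

# Factor once: O(N^3). lu holds L and U packed in one array,
# piv records the row swaps made by partial pivoting.
lu, piv = la.lu_factor(A)

# Solve for several right-hand sides: O(N^2) each
b1 = np.array([-4., 3, 9, 7])
b2 = np.array([1., 0, 0, 0])
x1 = la.lu_solve((lu, piv), b1)
x2 = la.lu_solve((lu, piv), b2)

print(x1)
print(np.dot(A, x2) - b2)  # residual, should vanish up to rounding
```

Note that `lu_factor` does not overwrite `A` by default, so the original matrix remains available for checking residuals.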

Lastly, one may wonder when $LU$ decomposition can be performed. One can show that if a matrix is nonsingular, an $LU$ decomposition exists provided one performs partial pivoting, which ensures that all elements $u_{jj} \ne 0$. 
In practice, a condition that is much easier to check is that a matrix is *strictly diagonally dominant*, namely that each diagonal element is greater in magnitude than the sum of the magnitudes of all other elements in its row:
$$| a_{i i} | > \sum_{\substack{j = 1 \\ j \ne i}}^N | a_{i j} |,
        \quad (1 \le i \le N).$$

One can show that every strictly diagonally dominant matrix is nonsingular and has an $LU$ decomposition, without any need for pivoting. 
Many matrices involved in numerical methods for solving PDEs are diagonally dominant. 
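
The condition takes only a few lines to check in NumPy. The function name `is_strictly_diagonally_dominant` and the tridiagonal test matrix are our own illustrative choices.

```python
import numpy as np

def is_strictly_diagonally_dominant(A):
    """Return True if |a_ii| > sum_{j != i} |a_ij| for every row i."""
    A = np.asarray(A)
    diag = np.abs(np.diag(A))
    off_diag = np.sum(np.abs(A), axis=1) - diag  # row sums without the diagonal
    return bool(np.all(diag > off_diag))

T = np.array([[4., 1, 0],
              [1, 4, 1],
              [0, 1, 4]])

A = np.array([[2., 1, 4, 1],
              [3, 4, -1, -1],
              [1, -4, 1, 5],
              [2, -2, 1, 3]])

print(is_strictly_diagonally_dominant(T))  # tridiagonal example
print(is_strictly_diagonally_dominant(A))  # matrix used in the examples
```

The tridiagonal matrix `T` satisfies the condition, while the $4 \times 4$ example matrix does not (in its first row, $|2| < 1 + 4 + 1$). The condition is sufficient but not necessary: the example matrix still has an $LU$ decomposition.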

The $LU$ decomposition is in general not unique as $L$ and $U$ together have $N^2+N$ free coefficients while $A$ only has $N^2$. We can freely choose $N$ coefficients. The Doolittle approach discussed below removes this ambiguity by setting the diagonal elements of $L$ to 1. 

The code below performs $LU$ decomposition using the [Doolittle algorithm](https://www.geeksforgeeks.org/doolittle-algorithm-lu-decomposition/). By multiplying the elements of $L$ and $U$ and equating them to $A$, one can obtain $L$ and $U$ by looping over rows and columns using three nested loops, with cost of order $N^3$. This example with a $3\times3$ matrix illustrates the approach.

$$
\left(\begin{array}{ccc}
    1  & 0 & 0 \\
    l_{21} & 1 & 0 \\
    l_{31} & l_{32} & 1 
\end{array}\right)
\left(\begin{array}{ccc}
    u_{11}  & u_{12} & u_{13} \\
    0  & u_{22} & u_{23} \\
    0  & 0 & u_{33}
\end{array}\right)
=
\left(\begin{array}{ccc}
    a_{11}  & a_{12} & a_{13} \\
    a_{21}  & a_{22} & a_{23} \\
    a_{31}  & a_{32} & a_{33}
\end{array}\right)
$$

The matrix elements of $L$ and $U$ are obtained by working from the first row / column of $A$ to the last:<br>

$\text{1st row:}\,\,\,\,\, u_{11} = a_{11}\,, \,\,\,\,\, u_{12} = a_{12}\,, \,\,\,\,\, u_{13} = a_{13} $<br>

$\text{1st column:}\,\,\,\,\, l_{21}u_{11}  = a_{21}\,, \,\,\,\,\, l_{31}u_{11}  = a_{31} $<br>

$\text{2nd row:}\,\,\,\,\, l_{21}u_{12}  + u_{22} = a_{22}\,, \,\,\,\,\, l_{21}u_{13}  + u_{23} = a_{23} $<br>

$\text{2nd column:}\,\,\,\,\, l_{31}u_{12}  + l_{32} u_{22} = a_{32} $<br>

$\text{3rd row:}\,\,\,\,\,  l_{31}u_{13} + l_{32} u_{23}  + u_{33} = a_{33} $<br>


More general formulas can be found at the Doolittle algorithm link given above.<br>
This algorithm fails if the pivot $u_{ii}=0$. As with Gaussian elimination, pivoting is necessary for general matrices.
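
For reference, the pattern of the $3 \times 3$ example can be written for general $N$ as follows (a sketch in our own index notation, with $l_{kk} = 1$ and empty sums equal to zero). For each $k = 1, \ldots, N$, first compute row $k$ of $U$ and then column $k$ of $L$:

\begin{equation}
  u_{kj} = a_{kj} - \sum_{s=1}^{k-1} l_{ks}\, u_{sj}, \qquad j = k, \ldots, N,
\end{equation}

\begin{equation}
  l_{ik} = \frac{1}{u_{kk}} \left( a_{ik} - \sum_{s=1}^{k-1} l_{is}\, u_{sk} \right), \qquad i = k+1, \ldots, N.
\end{equation}

Each element requires a sum of at most $N$ terms and there are $N^2$ elements, which gives the cost of order $N^3$. The division by $u_{kk}$ is where a zero pivot causes the algorithm to fail.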

In [8]:
# LU decomposition using the Doolittle factorization
def LU_decomposition(A):
    
    L = np.zeros_like(A, dtype=float)  # float storage, even if A holds integers
    U = np.zeros_like(A, dtype=float)
    N = np.size(A, 0)
    
    for k in range(N):
        #diagonal elements of L and U
        L[k, k] = 1
        U[k, k] = (A[k, k] - np.dot(L[k, :k], U[:k, k])) / L[k, k]
        
        # elements above (U) or below (L) the diagonal
        for j in range(k+1, N):
            U[k, j] = (A[k, j] - np.dot(L[k, :k], U[:k, j])) / L[k, k]
        for i in range(k+1, N):
            L[i, k] = (A[i, k] - np.dot(L[i, :k], U[:k, k])) / U[k, k]
    
    return L, U

We validate this code for the example matrix used above. While this simple code does not include pivoting (row permutation),<br> 
accurate $LU$ decomposition code should always implement pivoting. 

In [9]:
A = np.array([[2., 1, 4, 1], \
              [3, 4, -1, -1], \
              [1, -4, 1, 5], \
             [2, -2, 1, 3]])

L, U = LU_decomposition(A)
print ('L: \n',L,'\n')
print ('U: \n',U,'\n')

print ('check:')
print (np.dot(L,U) - A,'\n')

L: 
 [[ 1.          0.          0.          0.        ]
 [ 1.5         1.          0.          0.        ]
 [ 0.5        -1.8         1.          0.        ]
 [ 1.         -1.2         0.83823529  1.        ]] 

U: 
 [[  2.    1.    4.    1. ]
 [  0.    2.5  -7.   -2.5]
 [  0.    0.  -13.6   0. ]
 [  0.    0.    0.   -1. ]] 

check:
[[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]] 



**Pivoting in LU decomposition.** When pivoting is performed in $LU$ decomposition, swapping two rows can be achieved by multiplying on the left by a permutation matrix $P$. For example, the matrix that interchanges the first two rows of a $3 \times 3$ matrix can be written as:

$$ P = \begin{bmatrix} 
    0 & 1 & 0 \\
    1 & 0 & 0 \\
    0 & 0 & 1
    \end{bmatrix}$$
    
Generalizing the treatment discussed above, each step in the LU decomposition can be seen as a multiplication by a lower-triangular matrix $M$, preceded, as needed, by a multiplication by a permutation matrix $P$. Therefore, we can write

$$ (M_{N-1}\,P_{N-1} \ldots M_1\,P_1)\, A = U, $$

where $P_k$ is the identity matrix if no permutation is performed at step $k$. Since the product $P = P_{N-1}P_{N-2}\ldots P_1$ is a permutation matrix performing all necessary row reorderings on $A$, we can rewrite the resulting decomposition as:

$$ P A = L U $$

In the presence of partial pivoting, the matrix $P$ (or equivalently, a list of row exchanges) is also returned together with the matrices $L$ and $U$.
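
A minimal sketch of $LU$ decomposition with partial pivoting is shown below. It assumes a nonsingular matrix; the function name `lu_partial_pivoting` is our own, and $P$ is stored as a full matrix for clarity, whereas library routines typically return a list of row exchanges instead.

```python
import numpy as np

def lu_partial_pivoting(A):
    """Return P, L, U with P A = L U, using partial pivoting.

    Assumes A is square and nonsingular.
    """
    U = np.array(A, dtype=float)  # working copy, reduced to upper triangular form
    N = U.shape[0]
    L = np.eye(N)
    P = np.eye(N)
    for j in range(N - 1):
        # row (on or below the diagonal) with the largest entry in column j
        p = j + np.argmax(np.abs(U[j:, j]))
        if p != j:
            U[[j, p], :] = U[[p, j], :]    # swap rows of U
            P[[j, p], :] = P[[p, j], :]    # record the swap in P
            L[[j, p], :j] = L[[p, j], :j]  # swap the multipliers already stored
        for i in range(j + 1, N):
            L[i, j] = U[i, j] / U[j, j]
            U[i, :] -= L[i, j] * U[j, :]
    return P, L, U

A = np.array([[0., 1, 4, 1],
              [3, 4, -1, -1],
              [1, -4, 1, 5],
              [2, -2, 1, 3]])

P, L, U = lu_partial_pivoting(A)
print(np.dot(P, A) - np.dot(L, U))  # should vanish up to rounding
```

The test matrix has $a_{11} = 0$, so the decomposition without pivoting would fail at the first step. With partial pivoting, all entries of $L$ have magnitude at most 1.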

### 1.4 Determinant and matrix inversion

The determinant and the inverse of a matrix can also be computed from its $LU$ decomposition: 

The determinant of a triangular matrix is simply the product of its diagonal entries. We can obtain $\det(A)$ from the product of the diagonal entries in $U$: 

$$\det(A) = \det(LU) = \det(L) \det(U) = \det(U) = u_{11} \times u_{22}\, \ldots \times u_{NN},$$ 

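where we used the fact that $\det(L)=1$ because its diagonal elements are equal to 1. If partial pivoting is used, so that $PA = LU$, each row swap flips the sign of the determinant, and $\det(A) = (-1)^s \det(U)$, where $s$ is the number of row swaps. Using $LU$ decomposition, finding the determinant is an ${\cal O}(N^3)$ calculation, while expansion in minors is an ${\cal O}(N!)$ task.<br>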
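
The determinant calculation can be sketched as follows, using elimination with partial pivoting and flipping the sign once per row swap. The function name `det_by_elimination` is our own; NumPy's `np.linalg.det` is used only as a cross-check.

```python
import numpy as np

def det_by_elimination(A):
    """Determinant from the diagonal of U, with a sign flip per row swap."""
    U = np.array(A, dtype=float)  # working copy
    N = U.shape[0]
    sign = 1.0
    for j in range(N - 1):
        p = j + np.argmax(np.abs(U[j:, j]))  # partial pivoting
        if U[p, j] == 0.0:
            return 0.0                       # no non-zero pivot: singular matrix
        if p != j:
            U[[j, p], :] = U[[p, j], :]
            sign = -sign                     # each swap flips the sign
        for i in range(j + 1, N):
            U[i, :] -= (U[i, j] / U[j, j]) * U[j, :]
    return sign * np.prod(np.diag(U))

A = np.array([[2., 1, 4, 1],
              [3, 4, -1, -1],
              [1, -4, 1, 5],
              [2, -2, 1, 3]])

print(det_by_elimination(A))
print(np.linalg.det(A))  # cross-check
```

For this matrix the result agrees with the product of the diagonal entries of $U$ obtained without pivoting, $2 \times 2.5 \times (-13.6) \times (-1) = 68$.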

We set up the problem of finding the inverse as finding the matrix $X$ that satisfies $ A X = I $. Solving for column $j$ of the unknown inverse $X$, denoted here as $\mathbf{x}_j$, is equivalent to solving the linear system $A\mathbf{x}_j = \mathbf{I}_j$, where $\mathbf{I}_j$ is the $j^{\rm th}$ column vector of the identity matrix, with components all equal to zero except component $j$, which is equal to 1. 
In principle, one could solve these $N$ linear systems by running Gaussian elimination $N$ times, but that would be very costly, of order $N^4$. Rather, once we have computed the $LU$ decomposition of matrix $A$ at a cost of $\mathcal{O}(N^3)$, each column requires only one forward and one back substitution at a cost of $\mathcal{O}(N^2)$, so the full inverse is obtained at a total cost of order $N^3$. In the lab assignment, you will develop a routine to compute the inverse of a matrix $A$ using LU decomposition.
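
The column-by-column setup can be illustrated with NumPy's general solver. This sketch deliberately takes the costly route, calling `np.linalg.solve` (which refactors $A$ on every call) once per column; it only demonstrates the structure of the problem $AX = I$ and is not the $LU$-based routine of the lab assignment.

```python
import numpy as np

A = np.array([[2., 1, 4, 1],
              [3, 4, -1, -1],
              [1, -4, 1, 5],
              [2, -2, 1, 3]])

N = A.shape[0]
I = np.eye(N)
X = np.zeros((N, N))

for j in range(N):
    # column j of the inverse solves A x_j = I_j
    X[:, j] = np.linalg.solve(A, I[:, j])

print(np.dot(A, X) - I)  # should vanish up to rounding
```

Replacing the repeated call to the general solver with a single factorization followed by $N$ pairs of triangular solves removes the redundant work.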

### 1.5 Special matrices
So far we have assumed that the matrix $A$ of the linear system is a general matrix and is *dense*, meaning that nearly all of the matrix entries are non-zero.<br> 
If the matrix has some special properties, then workload and storage can often be saved in solving the linear system. Examples of properties that one can leverage to design an algorithm with lower computational cost and storage include:

- Symmetric: $A=A^{T}$, and thus $a_{ij} = a_{ji}$ for all $i,j$. 
- Hermitian: $A=A^{\dagger}$, where $A^\dagger$ is the Hermitian conjugate. This implies $a_{ij}=(a_{ji})^*$ for all $i,j$.
- Positive definite: $\mathbf{x^T}A\,\mathbf{x} > 0$ for all $\mathbf{x} \ne 0$; equivalently, all the eigenvalues are positive.
- Banded: $a_{ij}=0$ for all $|i-j|>\beta$, where $\beta$ is the bandwidth of the matrix. An important special case is tridiagonal matrices, for which $\beta = 1$.
- Sparse: most entries of the matrix $A$ are zero.

Techniques handling symmetric and banded systems are relatively straightforward variations of Gaussian elimination and $LU$ decomposition for dense systems. An example is Cholesky factorization, the equivalent of $LU$ decomposition for a symmetric (or Hermitian) matrix that is also positive definite.<br>
This method is analyzed in the example below and in the Lab assignment. 

Sparse linear systems, on the other hand, require more sophisticated algorithms and data structures that avoid storing or operating on the zeros in the matrix. Sparse systems are often best solved using *iterative* methods, which are discussed in the next lecture.

### Example: Cholesky factorization

If the matrix we want to decompose is symmetric and positive definite, the $LU$ factorization can be arranged so that $U=L^T$, that is, we decompose as $A=LL^T$ using only a lower triangular matrix (and its transpose) with positive diagonal entries that are in general different from 1.<br> 
This approach is known as [Cholesky decomposition](https://www.geeksforgeeks.org/cholesky-decomposition-matrix-decomposition/) and is illustrated below with an example for a $2\times2$ matrix.

  \begin{equation}
    L =
    \begin{pmatrix}
      l_{11} & 0 \\
      l_{21} & l_{22}
    \end{pmatrix}
  \end{equation}
  
  Since $A = L L^T$, we equate the matrix elements explicitly and obtain:
  
  \begin{align}
                && A & = L L^T \\
    \Rightarrow &&
    \begin{pmatrix}
      a_{11} & a_{12} \\
      a_{21} & a_{22}
    \end{pmatrix}
    & =
    \begin{pmatrix}
      l_{11}^2 & l_{11}\, l_{21} \\
      l_{11}\, l_{21} & l_{21}^2 +  l_{22}^2
    \end{pmatrix}
  \end{align}

This implies that we can find the elements of $L$ working from the first row / column to the last:

$$l_{11} = \sqrt{a_{11}},\,\,\,\,\, l_{21} = a_{21}\,/\,l_{11},\,\,\,\,\, l_{22} = \sqrt{a_{22} - l_{21}^2}$$
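
The $2 \times 2$ formulas can be checked numerically against NumPy's built-in routine `np.linalg.cholesky`, which returns the lower triangular factor. The test matrix below is an arbitrary symmetric positive definite example chosen for illustration; this is a check of the formulas above, not a general Cholesky implementation.

```python
import numpy as np

A = np.array([[4., 2],
              [2, 3]])  # symmetric and positive definite

# elements of L from the explicit 2x2 formulas
l11 = np.sqrt(A[0, 0])
l21 = A[1, 0] / l11
l22 = np.sqrt(A[1, 1] - l21**2)

L = np.array([[l11, 0.],
              [l21, l22]])

L_ref = np.linalg.cholesky(A)  # NumPy's lower triangular factor

print(L)
print(L - L_ref)            # should vanish up to rounding
print(np.dot(L, L.T) - A)   # check A = L L^T
```

The argument of each square root must be positive, which is guaranteed when $A$ is positive definite.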

The great advantage is that Cholesky factorization can be accomplished with only about $N^3/6$ multiplications and additions, and thus about half the computational cost of $LU$ factorization. Only the lower triangular portion of $A$ is needed, and thus the upper triangular portion need not be stored.<br> 
In the Lab assignment, you will write code to perform the Cholesky factorization of a symmetric positive definite matrix. 

## Linear algebra libraries and resources

- The [LAPACK library](http://www.netlib.org/lapack/) provides a variety of [modules](http://www.netlib.org/lapack/explore-html/modules.html) to solve linear systems and perform a wide range of other linear algebra tasks. LAPACK relies on the lower-level BLAS library that includes matrix-vector and matrix-matrix operations for better utilization of hierarchical CPU memory and optimal data reuse. Generic versions of BLAS and LAPACK are available from Netlib, and many computer vendors (e.g., Intel and Cray) provide custom versions that are optimized for higher performance on their particular system.  
- A standard reference for numerical linear algebra is Trefethen and Bau, Numerical Linear Algebra, SIAM.