# Iterative and Conjugate Gradient Methods

When solving large linear systems, particularly sparse ones, iterative methods can often reach a sufficiently accurate approximate solution with less computation and memory than direct methods. This notebook provides an overview of the iterative methods that have been developed:

1. Classical methods
    1. Jacobi
    2. Gauss-Seidel
    3. Successive over-relaxation
    4. Chebyshev
2. Krylov subspace methods
3. Conjugate gradient methods

This notebook is an important precursor to the 'Preconditioner Series', which assumes a working knowledge of iterative methods.

Many of the formulas and much of the notation used here come from [Matrix Computations by Gene H. Golub and Charles F. Van Loan (4th Ed.)](https://www.amazon.com/Computations-Hopkins-Studies-Mathematical-Sciences/dp/1421407949).

## Imports

In [29]:
import numpy as np
from scipy.linalg import solve_triangular
import matplotlib.pyplot as plt

## Classical methods

### Jacobi method

For a 3-by-3 linear system $Ax=b$,

$$ \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\
a_{31} & a_{32} & a_{33} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3
\end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} $$

the equations can be rearranged to express each of $x_1$, $x_2$, and $x_3$ in terms of the other two variables, provided the diagonal entries $a_{ii}$ are nonzero:

$$ \begin{align} x_1 &= (b_1-a_{12}x_2-a_{13}x_3)/a_{11} \\ x_2 &= (b_2-a_{21}x_1-a_{23}x_3)/a_{22} \\ x_3 &= (b_3-a_{31}x_1-a_{32}x_2)/a_{33} \end{align} $$

Each equation expresses one unknown $x_i$ in terms of the other two, so the unknowns appear on both sides of the system. This suggests an iteration: the current approximations $x_j^{(k-1)}$ are substituted into the right-hand sides to calculate new approximations $x_i^{(k)}$. Rewritten, the equations from above become:

$$ \begin{align} x_1^{(k)} &= (b_1-a_{12}x_2^{(k-1)}-a_{13}x_3^{(k-1)})/a_{11} \\ x_2^{(k)} &= (b_2-a_{21}x_1^{(k-1)}-a_{23}x_3^{(k-1)})/a_{22} \\ x_3^{(k)} &= (b_3-a_{31}x_1^{(k-1)}-a_{32}x_2^{(k-1)})/a_{33} \end{align} $$

These equations can be written in matrix form using three matrices $D_A$ (main diagonal), $L_A$ (strictly lower triangular), and $U_A$ (strictly upper triangular) such that $A=D_A+L_A+U_A$. Note that $L_A$ and $U_A$ do not include the main diagonal. Below is the iteration represented in matrix form:

$$ M_Jx^{(k)}=N_Jx^{(k-1)}+b $$

where $M_J=D_A$ and $N_J=-(L_A+U_A)$. Rearranging the equation to calculate $x^{(k)}$ yields:

$$ x^{(k)}=M_J^{-1}(N_Jx^{(k-1)}+b) $$
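As a quick sanity check, the sketch below builds the splitting for a small system and confirms that one Jacobi step in matrix form agrees with the component-wise equations. The 3-by-3 matrix, right-hand side, and starting vector are made-up example values, and the variable names are illustrative.

```python
import numpy as np

# Hypothetical example values (not taken from the text above)
A = np.array([[4., -1., 1.],
              [2., 5., -1.],
              [1., -2., 6.]])
b = np.array([3., 4., 5.])
x_prev = np.array([1., 1., 1.])

# Splitting A = D_A + L_A + U_A
D_A = np.diag(np.diagonal(A))   # main diagonal
L_A = np.tril(A, -1)            # strictly lower triangular part
U_A = np.triu(A, 1)             # strictly upper triangular part

M_J = D_A
N_J = -(L_A + U_A)

# One Jacobi step in matrix form: M_J x = N_J x_prev + b
x_matrix = np.linalg.solve(M_J, N_J @ x_prev + b)

# The same step written component by component
x_component = np.array([
    (b[0] - A[0, 1] * x_prev[1] - A[0, 2] * x_prev[2]) / A[0, 0],
    (b[1] - A[1, 0] * x_prev[0] - A[1, 2] * x_prev[2]) / A[1, 1],
    (b[2] - A[2, 0] * x_prev[0] - A[2, 1] * x_prev[1]) / A[2, 2],
])

print(np.allclose(D_A + L_A + U_A, A))
print(np.allclose(x_matrix, x_component))
```

Because $M_J$ is diagonal, solving with it amounts to dividing each entry by $a_{ii}$; `np.linalg.solve` is used here only to mirror the formula.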

In [40]:
def jacobi(A, x0, b, rtol=1e-5, atol=1e-8, max_iter=10000):
    
    # inverse of the diagonal part D_A, and the off-diagonal remainder R = L_A + U_A
    D_inv = np.diag(np.reciprocal(np.diagonal(A).astype(float)))
    R = A - np.diag(np.diagonal(A))
    
    i = 0
    x = x0
    
    # loop for max_iter iterations or until converged
    while (i < max_iter):
        x_new = D_inv @ (b - R @ x)
        i += 1
        
        # test convergence
        if (np.allclose(x, x_new, rtol, atol)):
            print('Iterations:' + str(i))
            return x_new
        
        # prep for next iteration
        x = x_new
        
    # solution did not converge
    print('Maximum number of iterations reached. Did not converge.')

Here, a linear system is defined for which the solution is known: $x=\begin{bmatrix} 1 & 2 & -1 & 1\end{bmatrix}^T$

In [41]:
A = np.array([[10., -1., 2., 0.],
              [-1., 11., -1., 3.],
              [2., -1., 10., -1.],
              [0.0, 3., -1., 8.]])

b = np.array([6., 25., -11., 15.])
x0 = np.ones_like(b)

x = jacobi(A, x0, b); x

Iterations:15


array([ 0.9999985 ,  2.00000237, -1.00000187,  1.0000028 ])
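Whether the iteration converges depends on the iteration matrix $M_J^{-1}N_J$: a splitting iteration of this form converges for every starting vector exactly when the spectral radius of $M^{-1}N$ is below 1, and strict row diagonal dominance of $A$ is a sufficient condition for Jacobi. The sketch below checks both for the system above; the names `G_J`, `rho_J`, and `is_sdd` are illustrative.

```python
import numpy as np

A = np.array([[10., -1., 2., 0.],
              [-1., 11., -1., 3.],
              [2., -1., 10., -1.],
              [0., 3., -1., 8.]])

# Jacobi iteration matrix G_J = M_J^{-1} N_J with M_J = D_A, N_J = -(L_A + U_A)
D_A = np.diag(np.diagonal(A))
G_J = np.linalg.solve(D_A, -(A - D_A))

# Spectral radius: largest eigenvalue magnitude of the iteration matrix
rho_J = max(abs(np.linalg.eigvals(G_J)))

# Strict row diagonal dominance: |a_ii| > sum of |a_ij| over j != i
abs_A = np.abs(A)
off_diag_sums = abs_A.sum(axis=1) - np.diagonal(abs_A)
is_sdd = bool(np.all(np.diagonal(abs_A) > off_diag_sums))

print('spectral radius below 1:', rho_J < 1)
print('strictly diagonally dominant:', is_sdd)
```

A smaller spectral radius generally means faster asymptotic convergence, which is one way to compare splittings on the same matrix.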

### Gauss-Seidel method

The Gauss-Seidel method makes a small change to the Jacobi method: each newly calculated component is used immediately, within the same sweep, to calculate the remaining components. In other words, $x_1^{(k)}$ is plugged into the formula in place of $x_1^{(k-1)}$ to calculate $x_2^{(k)}$. Then, both $x_1^{(k)}$ and $x_2^{(k)}$ are used to calculate $x_3^{(k)}$.

In matrix form, the equations become:

$$ M_{GS}x^{(k)}=N_{GS}x^{(k-1)}+b $$

where $M_{GS}=D_A+L_A$ and $N_{GS}=-U_A$.

Because $M_{GS}$ is a lower triangular matrix, $x^{(k)}$ can be calculated by forward substitution.
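To make the connection between the component-wise description and the matrix form concrete, here is a minimal sketch of a single Gauss-Seidel sweep written as an explicit loop, compared against one step of the matrix form. The helper name `gauss_seidel_sweep` and the 3-by-3 example values are assumptions made for illustration.

```python
import numpy as np

def gauss_seidel_sweep(A, x_prev, b):
    """One Gauss-Seidel sweep, component by component (illustrative helper)."""
    x = np.array(x_prev, dtype=float)  # work on a copy
    for i in range(len(b)):
        # x[:i] already holds new values; x[i+1:] still holds old values
        s = A[i, :i] @ x[:i] + A[i, i + 1:] @ x[i + 1:]
        x[i] = (b[i] - s) / A[i, i]
    return x

# Hypothetical example values
A = np.array([[4., -1., 1.],
              [2., 5., -1.],
              [1., -2., 6.]])
b = np.array([3., 4., 5.])
x_prev = np.ones(3)

x_sweep = gauss_seidel_sweep(A, x_prev, b)

# Matrix form: M_GS x = N_GS x_prev + b, with M_GS = D_A + L_A and N_GS = -U_A
M_GS = np.tril(A)
N_GS = -np.triu(A, 1)
x_matrix = np.linalg.solve(M_GS, N_GS @ x_prev + b)

print(np.allclose(x_sweep, x_matrix))
```

The loop is exactly forward substitution applied to the lower triangular system; `np.linalg.solve` is used for the comparison only to keep the sketch NumPy-only, and a triangular solver would exploit the structure of $M_{GS}$.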

In [43]:
def gauss_seidel(A, x0, b, rtol=1e-8, atol=1e-8, max_iter=10000):
    M, N = np.tril(A), -np.triu(A, 1)
    
    i = 0
    x = x0
    
    # loop for max_iter iterations or until converged
    while (i < max_iter):
        x_new = solve_triangular(M, N @ x + b, lower=True)
        i += 1
            
        # test convergence
        if (np.allclose(x, x_new, rtol, atol)):
            print('Iterations:' + str(i))
            return x_new
        
        # prep for next iteration
        x = x_new
        
    # solution did not converge
    print('Maximum number of iterations reached. Did not converge.')

The same linear system of equations is solved.

In [44]:
x = gauss_seidel(A, x0, b); x

Iterations:10


array([ 1.,  2., -1.,  1.])

### Successive over-relaxation

In [None]:
def sor():

In [None]:
x = sor(A, x0, b); x
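For reference, a minimal sketch of successive over-relaxation, assuming the standard splitting $M_\omega x^{(k)}=N_\omega x^{(k-1)}+\omega b$ with $M_\omega=D_A+\omega L_A$ and $N_\omega=(1-\omega)D_A-\omega U_A$. The function name `sor_sketch`, the parameter `omega`, and its default value of 1.1 are illustrative choices rather than recommended settings; with `omega=1` the update reduces to Gauss-Seidel.

```python
import numpy as np

def sor_sketch(A, x0, b, omega=1.1, rtol=1e-8, atol=1e-8, max_iter=10000):
    """Illustrative SOR iteration; returns (solution, iterations used)."""
    D = np.diag(np.diagonal(A))
    M = D + omega * np.tril(A, -1)                  # M_omega = D_A + omega * L_A
    N = (1.0 - omega) * D - omega * np.triu(A, 1)   # N_omega = (1 - omega) D_A - omega * U_A

    x = np.asarray(x0, dtype=float)
    for i in range(1, max_iter + 1):
        # M is lower triangular, so a triangular solver could be used here
        x_new = np.linalg.solve(M, N @ x + omega * b)
        if np.allclose(x, x_new, rtol, atol):
            return x_new, i
        x = x_new
    raise RuntimeError('Maximum number of iterations reached. Did not converge.')

A = np.array([[10., -1., 2., 0.],
              [-1., 11., -1., 3.],
              [2., -1., 10., -1.],
              [0., 3., -1., 8.]])
b = np.array([6., 25., -11., 15.])
x0 = np.ones_like(b)

x_sor, n_iter = sor_sketch(A, x0, b)
print(x_sor)
```

The example matrix is symmetric and strictly diagonally dominant with a positive diagonal, hence symmetric positive definite, and for such matrices SOR is known to converge for any $0<\omega<2$. The best choice of $\omega$ depends on the matrix.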

### Chebyshev method

In [None]:
def chebyshev():

In [None]:
x = chebyshev(A, x0, b); x

## Krylov subspace methods

## Conjugate gradient methods

## Resources

[Matrix Computations by Gene H. Golub and Charles F. Van Loan (4th Ed.)](https://www.amazon.com/Computations-Hopkins-Studies-Mathematical-Sciences/dp/1421407949)