## 3.7 Iterative Methods for Solving Linear Systems

In [2]:
import numpy as np

**The simple Richardson iteration**

In [1]:
def richardson(A, b, x, it=1000, omega=1, tol=1e-5):
    x = x.copy()
    for i in range(it):
        w = b - np.dot(A, x)
        if np.linalg.norm(w) < tol:
            break
        x += omega * w
    return x, i

We test the simple Richardson iteration on the linear system $Ax=b$ with
$$A=\begin{pmatrix}3 & 1.8 & 1\\ 1.4 & 2.3 & -0.7\\ 0.8 & 0.3 & 1.5 \end{pmatrix}\qquad
b = \begin{pmatrix} 1.2\\-2.1\\0.6\end{pmatrix}.$$
Using `numpy`, we obtain the following 'exact' solution:

In [None]:
A = np.array([[3.0, 1.8, 1],
              [1.4, 2.3, -0.7],
              [0.8, 0.3, 1.5]])
b = np.array([1.2, -2.1, 0.6])

x_np = np.linalg.solve(A, b)
print(x_np)

We apply the Richardson iteration to this system with the relaxation parameter $\omega=1$ and the initial guess $x_0 = (1, -1, 0)$, which is reasonably close to the exact solution.

In [None]:
x0 = np.array([1.0, -1.0, 0.0])
x, n = richardson(A, b, x0, it=100, omega=1)
print(f'x = {x} after {n} steps')
print(f'||x - x_ex||_2 = {np.linalg.norm(x - x_np)}')

The method clearly diverges. However, if we take a smaller relaxation parameter $\omega$, we can achieve convergence:

In [None]:
x, n = richardson(A, b, x0, it=50, omega=0.4)
print(f'x = {x} after {n} steps')
print(f'||x - x_ex||_2 = {np.linalg.norm(x - x_np)}')
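The different behaviour for $\omega=1$ and $\omega=0.4$ can be traced back to the iteration matrix $I-\omega A$: the Richardson iteration converges for every initial guess exactly when the spectral radius of this matrix is smaller than one. The following minimal sketch evaluates this spectral radius for both parameters; the helper `richardson_spectral_radius` is a name introduced here for illustration only.

```python
import numpy as np

A = np.array([[3.0, 1.8, 1.0],
              [1.4, 2.3, -0.7],
              [0.8, 0.3, 1.5]])


def richardson_spectral_radius(A, omega):
    """Spectral radius of the Richardson iteration matrix I - omega * A (illustrative helper)."""
    G = np.eye(A.shape[0]) - omega * A
    return np.max(np.abs(np.linalg.eigvals(G)))


for omega in (1.0, 0.4):
    rho = richardson_spectral_radius(A, omega)
    print(f'omega = {omega}: rho(I - omega A) = {rho:.3f}')
```

A value above one for $\omega=1$ and below one for $\omega=0.4$ is consistent with the divergence and convergence observed above.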

**Jacobi method**

This `numpy` implementation is **NOT OPTIMAL**, since it uses Python loops to compute inner products rather than the functionality provided by `numpy`. We chose this approach to highlight the difference between the Jacobi and Gauss-Seidel methods.

In [9]:
def jacobi(A, b, x, it=1000, tol=1e-5):
    n, m = A.shape
    x, x_neu = x.copy(), x.copy()
    for k in range(it):
        if np.linalg.norm(b - np.dot(A, x)) < tol:
            break
        for i in range(n):
            s = 0
            for j in range(i):
                s += A[i, j] * x[j]
            for j in range(i + 1, n):
                s += A[i, j] * x[j]
            x_neu[i] = (b[i] - s) / A[i, i]
        x[:] = x_neu
    return x, k

Applying this to our previous example yields

In [None]:
x, n = jacobi(A, b, x0, it=50)
print(f'x = {x} after {n} steps')
print(f'||x - x_ex||_2 = {np.linalg.norm(x - x_np)}')

**Gauss-Seidel method**

In [12]:
def gauss_seidel(A, b, x, it=100, tol=1e-5):
    n, m = A.shape
    x, x_neu = x.copy(), x.copy()
    for k in range(it):
        if np.linalg.norm(b - np.dot(A, x)) < tol:
            break
        for i in range(n):
            s = 0
            for j in range(i):
                s += A[i, j] * x_neu[j]
            for j in range(i + 1, n):
                s += A[i, j] * x[j]
            x_neu[i] = (b[i] - s) / A[i, i]
        x[:] = x_neu
    return x, k

Applying this to our previous example yields

In [None]:
x, n = gauss_seidel(A, b, x0, it=50)
print(f'x = {x} after {n} steps')
print(f'||x - x_ex||_2 = {np.linalg.norm(x - x_np)}')

We see that the Gauss-Seidel method requires only about half as many steps as the Jacobi method to achieve the same accuracy.
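The step counts can be related to the spectral radii of the two iteration matrices. Writing $A = L + D + R$ with the strictly lower part $L$, the diagonal $D$ and the strictly upper part $R$, the Jacobi iteration matrix is $-D^{-1}(L+R)$ and the Gauss-Seidel iteration matrix is $-(D+L)^{-1}R$. The sketch below computes both spectral radii for the $3\times 3$ example; the helper `spectral_radius` and the variable names are chosen here for illustration.

```python
import numpy as np

A = np.array([[3.0, 1.8, 1.0],
              [1.4, 2.3, -0.7],
              [0.8, 0.3, 1.5]])


def spectral_radius(M):
    """Largest absolute value of the eigenvalues of M (illustrative helper)."""
    return np.max(np.abs(np.linalg.eigvals(M)))


D = np.diag(np.diag(A))   # diagonal part
L = np.tril(A, -1)        # strictly lower triangular part
R = np.triu(A, 1)         # strictly upper triangular part

rho_J = spectral_radius(-np.linalg.solve(D, L + R))    # Jacobi
rho_GS = spectral_radius(-np.linalg.solve(D + L, R))   # Gauss-Seidel

print(f'rho_Jacobi       = {rho_J:.4f}')
print(f'rho_Gauss-Seidel = {rho_GS:.4f}')
# Asymptotic ratio of the step counts needed for the same error reduction
print(f'ln(rho_J) / ln(rho_GS) = {np.log(rho_J) / np.log(rho_GS):.2f}')
```

Since $A$ is strictly diagonally dominant by rows, both spectral radii are below one; the smaller one belongs to the method that asymptotically needs fewer steps.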

### 3.7.1 Convergence criteria for the Jacobi and Gauss-Seidel methods

#### Example 3.47 (Jacobi and Gauss-Seidel methods for the model matrix)

We consider the linear system $Ax=b$ with the matrix $A\in\mathbb{R}^{n\times n}$ given by
$$ A = \begin{pmatrix}2 & -1 \\ -1 & 2 & -1 \\ & \ddots & \ddots & \ddots \\ && -1 & 2 & -1\\ &&& -1 & 2 \end{pmatrix},$$
and the right-hand side vector $b\in\mathbb{R}^n$ with $b=(1,\dots,1)^T$. The matrix is irreducible and diagonally dominant, and its first and last rows are strictly diagonally dominant. Therefore, both the Jacobi and Gauss-Seidel methods converge.
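The properties used in this argument can be checked numerically for a small instance. The following sketch is only meant for small $n$; the helpers `model_matrix`, `diagonal_dominance` and `is_irreducible` are hypothetical names introduced here. Irreducibility is tested with the standard criterion that $(I+|A|)^{n-1}$ has only positive entries.

```python
import numpy as np


def model_matrix(n):
    """Tridiagonal model matrix with 2 on the diagonal and -1 next to it."""
    return 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)


def diagonal_dominance(A):
    """Return two boolean arrays: rows that are weakly / strictly diagonally dominant."""
    d = np.abs(np.diag(A))
    off = np.sum(np.abs(A), axis=1) - d
    return d >= off, d > off


def is_irreducible(A):
    """A is irreducible iff (I + |A|)^(n-1) is entrywise positive (small n only)."""
    n = A.shape[0]
    M = np.eye(n) + (A != 0)
    return bool(np.all(np.linalg.matrix_power(M, n - 1) > 0))


A = model_matrix(8)
weak, strict = diagonal_dominance(A)
print('weakly dominant in every row:', bool(weak.all()))
print('strictly dominant rows:      ', np.flatnonzero(strict))
print('irreducible:                 ', is_irreducible(A))
```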

In [15]:
import time

In [None]:
for i in range(4):
    n = 10 * 2**i
    A = np.diag(2 * np.ones(n), k=0) + np.diag(-1 * np.ones(n - 1), k=1) + np.diag(-1 * np.ones(n - 1), k=-1)
    b = np.ones(n)
    x0 = np.zeros(n)
    
    t = time.perf_counter()
    x, m = jacobi(A, b, x0, it=int(2e6), tol=1e-4)
    t = time.perf_counter() - t
    
    res = np.linalg.norm(b - np.dot(A, x))
    print(f'Jacobi: n = {n:03d}, steps = {m:07d} time = {t:07.3f}sec, res = {res:4.2e}')

In [None]:
for i in range(4):
    n = 10 * 2**i
    A = np.diag(2 * np.ones(n), k=0) + np.diag(-1 * np.ones(n - 1), k=1) + np.diag(-1 * np.ones(n - 1), k=-1)
    b = np.ones(n)
    x0 = np.zeros(n)
    
    t = time.perf_counter()
    x, m = gauss_seidel(A, b, x0, it=int(2e6), tol=1e-4)
    t = time.perf_counter() - t
    
    res = np.linalg.norm(b - np.dot(A, x))
    print(f'Gauss-Seidel: n = {n:03d}, steps = {m:07d} time = {t:07.3f}sec, res = {res:4.2e}')

This shows that the Gauss-Seidel method needs almost exactly half as many steps as the Jacobi method. Consequently, it is twice as fast, provided that both methods are implemented equally efficiently.
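For the model matrix, the factor of two can be explained by the spectral radii of the iteration matrices: the Jacobi iteration matrix has spectral radius $\rho_J = \cos\big(\frac{\pi}{n+1}\big)$ and the Gauss-Seidel iteration matrix has spectral radius $\rho_{GS} = \rho_J^2$, so that $\ln\rho_{GS} = 2\ln\rho_J$. The following sketch verifies both identities numerically for a small $n$; the variable names are chosen for illustration.

```python
import numpy as np

n = 10
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

D = np.diag(np.diag(A))   # diagonal part
L = np.tril(A, -1)        # strictly lower triangular part
R = np.triu(A, 1)         # strictly upper triangular part

# Spectral radii of the Jacobi and Gauss-Seidel iteration matrices
rho_J = np.max(np.abs(np.linalg.eigvals(-np.linalg.solve(D, L + R))))
rho_GS = np.max(np.abs(np.linalg.eigvals(-np.linalg.solve(D + L, R))))

print(f'rho_J  = {rho_J:.6f}, cos(pi / (n + 1)) = {np.cos(np.pi / (n + 1)):.6f}')
print(f'rho_GS = {rho_GS:.6f}, rho_J**2          = {rho_J**2:.6f}')
```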

Using `numpy`'s functionality to compute inner products, we can implement the methods significantly more efficiently.

In [17]:
def jacobi_np(A, b, x, it=1000, tol=1e-5):
    n, m = A.shape
    d = np.diag(A)
    x = x.copy()
    for k in range(it):
        res = b - A.dot(x)
        if np.linalg.norm(res) < tol:
            break
        x += res / d
    return x, k
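The update `x += res / d` in `jacobi_np` is the Jacobi step written in terms of the residual. As a short derivation, writing $x^{(k)}$ for the $k$-th iterate and $D$ for the diagonal of $A$ (notation assumed here):
$$x_i^{(k+1)} = \frac{1}{a_{ii}}\Big(b_i - \sum_{j\neq i} a_{ij}x_j^{(k)}\Big) = x_i^{(k)} + \frac{1}{a_{ii}}\Big(b_i - \sum_{j=1}^{n} a_{ij}x_j^{(k)}\Big), \qquad\text{i.e.}\qquad x^{(k+1)} = x^{(k)} + D^{-1}\big(b - Ax^{(k)}\big).$$
The residual $b - Ax^{(k)}$ is therefore computed once per step and reused both for the stopping criterion and for the update.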

In [20]:
def gauss_seidel_np(A, b, x, it=100, tol=1e-5):
    n, m = A.shape
    x = x.copy()
    for k in range(it):
        if np.linalg.norm(b - np.dot(A, x)) < tol:
            break
        x_alt = x.copy()
        for i in range(n):
            s1 = np.dot(A[i, :i], x[:i])
            s2 = np.dot(A[i, i + 1:], x_alt[i + 1:])
            x[i] = (b[i] - s1 - s2) / A[i, i]
    return x, k

In [None]:
for i in range(6):
    n = 10 * 2**i
    A = np.diag(2 * np.ones(n), k=0) + np.diag(-1 * np.ones(n - 1), k=1) + np.diag(-1 * np.ones(n - 1), k=-1)
    b = np.ones(n)
    x0 = np.zeros(n)
    
    t = time.perf_counter()
    x, m = jacobi_np(A, b, x0, it=int(2e6), tol=1e-4)
    t = time.perf_counter() - t
    
    res = np.linalg.norm(b - np.dot(A, x))
    print(f'Jacobi: n = {n:03d}, steps = {m:07d} time = {t:07.3f}sec, res = {res:4.2e}')

In [None]:
for i in range(5):
    n = 10 * 2**i
    A = np.diag(2 * np.ones(n), k=0) + np.diag(-1 * np.ones(n - 1), k=1) + np.diag(-1 * np.ones(n - 1), k=-1)
    b = np.ones(n)
    x0 = np.zeros(n)
    
    t = time.perf_counter()
    x, m = gauss_seidel_np(A, b, x0, it=int(1e6), tol=1e-4)
    t = time.perf_counter() - t
    
    res = np.linalg.norm(b - np.dot(A, x))
    print(f'Gauss-Seidel: n = {n:03d}, steps = {m:07d} time = {t:07.3f}sec, res = {res:4.2e}')

Here, we see that accessing individual entries of our vector and matrix from Python is significantly more expensive than the computation of inner products by `numpy`. This results in the Jacobi method beating the Gauss-Seidel method, even though the latter needs half as many iterations. However, by using the matrix form of the Gauss-Seidel method, and using an optimized `scipy` routine for the forward solve step, we get a similar level of efficiency.

In [22]:
import scipy as sp
import scipy.linalg  # load the linalg submodule explicitly so that sp.linalg is available

def gauss_seidel_sp(A, b, x, it=100, tol=1e-5):
    n, m = A.shape
    x = x.copy()
    R = np.triu(A, 1)
    LD = np.tril(A, 0)
    for k in range(it):
        if np.linalg.norm(b - np.dot(A, x)) < tol:
            break
        x = sp.linalg.solve_triangular(LD, b - np.dot(R, x), lower=True)
    return x, k
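The function above relies on the matrix form of the Gauss-Seidel step, $(D+L)\,x^{\text{new}} = b - R\,x$, where $D+L$ is the lower triangle of $A$ including the diagonal and $R$ is the strictly upper triangle. As a sanity check, the following sketch compares one sweep in this form with one sweep of the entry-wise update; the helper names `gs_sweep_loop` and `gs_sweep_matrix` are introduced here for illustration only.

```python
import numpy as np
from scipy.linalg import solve_triangular


def gs_sweep_loop(A, b, x):
    """One Gauss-Seidel sweep, updating the entries one after another."""
    x = x.copy()
    for i in range(len(b)):
        # x[:i] already holds new values, x[i + 1:] still holds old values
        s = A[i, :i] @ x[:i] + A[i, i + 1:] @ x[i + 1:]
        x[i] = (b[i] - s) / A[i, i]
    return x


def gs_sweep_matrix(A, b, x):
    """One Gauss-Seidel sweep as the forward solve (D + L) x_new = b - R x."""
    return solve_triangular(np.tril(A, 0), b - np.triu(A, 1) @ x, lower=True)


A = np.array([[3.0, 1.8, 1.0],
              [1.4, 2.3, -0.7],
              [0.8, 0.3, 1.5]])
b = np.array([1.2, -2.1, 0.6])
x0 = np.array([1.0, -1.0, 0.0])

print(np.allclose(gs_sweep_loop(A, b, x0), gs_sweep_matrix(A, b, x0)))
```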

In [None]:
for i in range(5):
    n = 10 * 2**i
    A = np.diag(2 * np.ones(n), k=0) + np.diag(-1 * np.ones(n - 1), k=1) + np.diag(-1 * np.ones(n - 1), k=-1)
    b = np.ones(n)
    x0 = np.zeros(n)
    
    t = time.perf_counter()
    x, m = gauss_seidel_sp(A, b, x0, it=int(1e6), tol=1e-4)
    t = time.perf_counter() - t
    
    res = np.linalg.norm(b - np.dot(A, x))
    print(f'Gauss-Seidel: n = {n:03d}, steps = {m:07d} time = {t:07.3f}sec, res = {res:4.2e}')

This highlights that an efficient implementation of an algorithm is key to its performance in practice.

### 3.7.2 Relaxation methods: The SOR method

We implement the SOR method in the same manner as the Jacobi and Gauss-Seidel methods, i.e., with loops in Python and direct access to individual entries. Consequently, we can compare its efficiency against these implementations of the latter methods.

In [24]:
def sor(A, b, x, omega, it=100, tol=1e-5):
    assert 0 < omega < 2, 'omega not contained in (0, 2)'
    n, m = A.shape
    x, x_neu = x.copy(), x.copy()
    for k in range(it):
        if np.linalg.norm(b - np.dot(A, x)) < tol:
            break
        for i in range(n):
            s = 0
            for j in range(i):
                s += A[i, j] * x_neu[j]
            for j in range(i + 1, n):
                s += A[i, j] * x[j]
            x_neu[i] = omega * (b[i] - s) / A[i, i] + (1 - omega) * x[i]
        x[:] = x_neu
    return x, k

We first consider again the system $Ax=b$ with $$A=\begin{pmatrix}3 & 1.8 & 1\\ 1.4 & 2.3 & -0.7\\ 0.8 & 0.3 & 1.5 \end{pmatrix}\qquad b = \begin{pmatrix} 1.2\\-2.1\\0.6\end{pmatrix}.$$

In [None]:
A = np.array([[3.0, 1.8, 1],
              [1.4, 2.3, -0.7],
              [0.8, 0.3, 1.5]])
b = np.array([1.2, -2.1, 0.6])

x_np = np.linalg.solve(A, b)

x0 = np.array([1.0, -1.0, 0.0])
x, n = sor(A, b, x0, it=100, omega=1.2)
print(f'x = {x} after {n} steps')
print(f'||x - x_ex||_2 = {np.linalg.norm(x - x_np)}')

The method converges faster than the previous two methods. However, this strongly depends on the correct choice of $\omega$. Try different values for $\omega$.
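One way to explore the influence of $\omega$ without running the full iteration is to look at the spectral radius of the SOR iteration matrix $(D+\omega L)^{-1}\big((1-\omega)D - \omega R\big)$, which follows from multiplying the update formula in `sor` by $a_{ii}$. The sketch below scans a few values of $\omega$ for the $3\times 3$ example; the helper `sor_spectral_radius` and the chosen grid of values are assumptions made for illustration.

```python
import numpy as np

A = np.array([[3.0, 1.8, 1.0],
              [1.4, 2.3, -0.7],
              [0.8, 0.3, 1.5]])


def sor_spectral_radius(A, omega):
    """Spectral radius of the SOR iteration matrix for a given omega (illustrative helper)."""
    D = np.diag(np.diag(A))
    L = np.tril(A, -1)
    R = np.triu(A, 1)
    H = np.linalg.solve(D + omega * L, (1 - omega) * D - omega * R)
    return np.max(np.abs(np.linalg.eigvals(H)))


for omega in np.arange(0.6, 1.81, 0.2):
    print(f'omega = {omega:.1f}: rho = {sor_spectral_radius(A, omega):.3f}')
```

A spectral radius below one indicates convergence, and smaller values indicate faster asymptotic convergence. For $\omega=1$ the matrix reduces to the Gauss-Seidel iteration matrix.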

#### Example 3.49 (Model matrix system with the SOR method)

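We return to the model matrix. As relaxation parameter we use $\omega = 2\big(1-\sqrt{1-\lambda^2}\big)/\lambda^2$, where $\lambda = 1 - \frac{\pi^2}{2(n+1)^2}$ approximates the spectral radius $\cos\big(\frac{\pi}{n+1}\big)$ of the Jacobi iteration matrix.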

In [None]:
for i in range(5):
    n = 10 * 2**i
    A = np.diag(2 * np.ones(n), k=0) + np.diag(-1 * np.ones(n - 1), k=1) + np.diag(-1 * np.ones(n - 1), k=-1)
    b = np.ones(n)
    x0 = np.zeros(n)
    
    lam = 1 - np.pi**2 / (2 * (n+1)**2)
    omega = 2 * (1 - np.sqrt(1 - lam**2)) / lam**2
    
    t = time.perf_counter()
    x, m = sor(A, b, x0, omega=omega, it=int(2e6), tol=1e-4)
    t = time.perf_counter() - t
    
    res = np.linalg.norm(b - np.dot(A, x))
    print(f'SOR: n = {n:03d}, steps = {m:07d} time = {t:07.3f}sec, res = {res:4.2e}')

For $n=80$, the SOR method is about 40 times faster than the Gauss-Seidel method.
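The relaxation parameter computed in the loop above can be examined in isolation. Since $\lambda^2 = (1-s)(1+s)$ with $s=\sqrt{1-\lambda^2}$, the expression $2(1-s)/\lambda^2$ equals $2/(1+s)$, which is the classical optimal SOR parameter when $\lambda$ is the spectral radius of the Jacobi iteration matrix. The sketch below compares the value based on the approximation of $\lambda$ used in the code with the value based on the exact $\cos\big(\frac{\pi}{n+1}\big)$; the helper `omega_from_rho` is a name introduced here.

```python
import numpy as np


def omega_from_rho(rho_J):
    """Relaxation parameter 2 / (1 + sqrt(1 - rho_J^2)) (illustrative helper)."""
    return 2.0 / (1.0 + np.sqrt(1.0 - rho_J**2))


for n in (10, 20, 40, 80):
    lam = 1 - np.pi**2 / (2 * (n + 1)**2)               # approximation used in the loop above
    omega_code = 2 * (1 - np.sqrt(1 - lam**2)) / lam**2  # formula used in the loop above
    omega_exact = omega_from_rho(np.cos(np.pi / (n + 1)))
    print(f'n = {n:03d}: omega (code) = {omega_code:.6f}, omega (exact rho_J) = {omega_exact:.6f}')
```

The two values approach each other as $n$ grows, and both tend towards $2$, which is why the gain over the Gauss-Seidel method ($\omega=1$) increases with $n$.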

Using `numpy`, we can again improve the performance somewhat:

In [29]:
def sor_np(A, b, x, omega, it=100, tol=1e-5):
    n, m = A.shape
    x, x_alt = x.copy(), x.copy()
    for k in range(it):
        if np.linalg.norm(b - np.dot(A, x)) < tol:
            break
        x_alt[:] = x
        for i in range(n):
            s1 = np.dot(A[i, :i], x[:i])
            s2 = np.dot(A[i, i + 1:], x_alt[i + 1:])
            x[i] = omega * (b[i] - s1 - s2) / A[i, i] + (1 - omega) * x_alt[i]
    return x, k

In [None]:
for i in range(6):
    n = 10 * 2**i
    A = np.diag(2 * np.ones(n), k=0) + np.diag(-1 * np.ones(n - 1), k=1) + np.diag(-1 * np.ones(n - 1), k=-1)
    b = np.ones(n)
    x0 = np.zeros(n)
    
    lam = 1 - np.pi**2 / (2 * (n+1)**2)
    omega = 2 * (1 - np.sqrt(1 - lam**2)) / lam**2
    
    t = time.perf_counter()
    x, m = sor_np(A, b, x0, omega=omega, it=int(2e6), tol=1e-4)
    t = time.perf_counter() - t
    
    res = np.linalg.norm(b - np.dot(A, x))
    print(f'SOR: n = {n:03d}, steps = {m:07d} time = {t:07.3f}sec, res = {res:4.2e}')

Due to the significant reduction in the number of necessary iteration steps, the SOR method is faster than the Jacobi method using only `numpy` inner products, even though we have to access parts of the matrix and vectors for the SOR method.
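In analogy to `gauss_seidel_sp`, the SOR step can also be written in matrix form, $(D+\omega L)\,x^{\text{new}} = \omega b + \big((1-\omega)D - \omega R\big)x$, so that the inner loop is replaced by a single triangular solve. The following is a minimal sketch under that assumption; the function `sor_sp` is not part of the implementations above and is introduced here only for illustration.

```python
import numpy as np
from scipy.linalg import solve_triangular


def sor_sp(A, b, x, omega, it=100, tol=1e-5):
    """SOR iteration in matrix form (hypothetical variant, analogous to gauss_seidel_sp)."""
    D = np.diag(np.diag(A))
    M = D + omega * np.tril(A, -1)               # lower triangular system matrix
    N = (1 - omega) * D - omega * np.triu(A, 1)  # acts on the previous iterate
    x = x.copy()
    for k in range(it):
        if np.linalg.norm(b - A @ x) < tol:
            break
        x = solve_triangular(M, omega * b + N @ x, lower=True)
    return x, k


n = 20
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)

lam = 1 - np.pi**2 / (2 * (n + 1)**2)
omega = 2 * (1 - np.sqrt(1 - lam**2)) / lam**2

x, m = sor_sp(A, b, np.zeros(n), omega=omega, it=10000, tol=1e-10)
print(f'steps = {m}, res = {np.linalg.norm(b - A @ x):4.2e}')
```

Whether this variant is faster than `sor_np` in practice depends on the matrix size and would have to be measured in the same way as above.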