## 6.2 The Conjugate Gradient Method

In [None]:
import numpy as np
import time

**Implementation 6.6: Conjugate Gradient Method**

We implement the CG method using `numpy` functions for matrix-vector and scalar products. We also make sure that we compute each product only once and reuse the result.

In [None]:
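def cg(A, b, x, it=100, tol=1e-5):
    x = x.copy()
    # Work vectors: search direction d, product A @ d, residual r
    d, Ad, r = [np.zeros_like(x) for _ in range(3)]
    r[:] = b - A.dot(x)
    d[:] = r  # the first search direction is the initial residual
    nrm_r2 = r.dot(r)
    tol = tol**2  # compare squared norms to avoid taking square roots

    for i in range(it):
        if nrm_r2 < tol:
            break
        Ad[:] = A.dot(d)  # the only matrix-vector product per step
        dAd = d.dot(Ad)
        alpha = nrm_r2 / dAd
        x[:] += alpha * d
        r[:] -= alpha * Ad
        nrm_r2_new = r.dot(r)

        beta = nrm_r2_new / nrm_r2
        d[:] = r + beta * d
        nrm_r2 = nrm_r2_new
    else:
        print(f'CG method did not converge after {it} iterations.')
    return x, i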
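
For reference, one pass of the loop in `cg` corresponds to the standard CG recurrences. The index notation below is ours and is only meant as a reading aid for the code: $r_k$ is the residual `r`, $d_k$ the search direction `d`, and the method starts from $d_0 = r_0 = b - A x_0$.

$$
\begin{aligned}
\alpha_k &= \frac{r_k^T r_k}{d_k^T A d_k}, \\
x_{k+1} &= x_k + \alpha_k d_k, \\
r_{k+1} &= r_k - \alpha_k A d_k, \\
\beta_k &= \frac{r_{k+1}^T r_{k+1}}{r_k^T r_k}, \\
d_{k+1} &= r_{k+1} + \beta_k d_k.
\end{aligned}
$$

The product $A d_k$ appears in both $\alpha_k$ and $r_{k+1}$, and $r_{k+1}^T r_{k+1}$ appears in $\beta_k$ and again in $\alpha_{k+1}$, which is why the code stores them in `Ad` and `nrm_r2_new`.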

**Example 6.9**

To compare the CG method with the previous iterative methods, we consider the model matrix again.

In [None]:
for i in range(1, 11):
    m = i * 10
    n = m**2
    N = np.diag(np.ones(m - 1), 1) + np.diag(np.ones(m - 1), -1)
    B = 4 * np.eye(m) - N
    A = np.kron(np.eye(m), B) - np.kron(N, np.eye(m))
    b = np.ones(n)
    x0 = np.zeros(n)
    
    t = time.perf_counter()
    x, steps = cg(A, b, x0, it=int(1e6), tol=1e-6)
    t = time.perf_counter() - t

    res = np.linalg.norm(b - np.dot(A, x))
    print(f'CG Method: n={n:05d}, Steps={steps:05d} Time={t:07.4f} sec, res={res:4.2e}')

Because our implementation of the CG method never accesses individual entries of matrices and vectors and relies entirely on vectorized `numpy` operations, it is even more efficient than the SOR method with the optimal relaxation parameter.
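
Since CG touches the matrix only through products $A v$, the matrix does not even have to be stored. The following is a minimal sketch under that assumption: the helper `apply_model_matrix` is hypothetical (it is not part of the implementation above) and evaluates the product with the model matrix by array slicing, then checks the result against the dense matrix for a small case.

```python
import numpy as np

def apply_model_matrix(v, m):
    """Return A @ v for the model matrix without forming A (hypothetical helper)."""
    V = v.reshape(m, m)      # row i of V holds the i-th block of v
    W = 4.0 * V              # diagonal part (creates a new array)
    W[1:, :] -= V[:-1, :]    # coupling to the previous block
    W[:-1, :] -= V[1:, :]    # coupling to the next block
    W[:, 1:] -= V[:, :-1]    # left neighbour within a block
    W[:, :-1] -= V[:, 1:]    # right neighbour within a block
    return W.ravel()

# Compare against the dense model matrix for a small case.
m = 6
N = np.diag(np.ones(m - 1), 1) + np.diag(np.ones(m - 1), -1)
B = 4 * np.eye(m) - N
A = np.kron(np.eye(m), B) - np.kron(N, np.eye(m))
v = np.random.default_rng(0).standard_normal(m * m)
matches = np.allclose(A.dot(v), apply_model_matrix(v, m))
print(matches)
```

A CG variant that accepts such a function instead of `A` would need memory proportional to $n$ rather than $n^2$; the `cg` function above would have to be adapted for that, since it calls `A.dot`.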

## 6.3 Preconditioned CG Method

**CG Method with Jacobi Preconditioning**

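For an efficient implementation of the preconditioned CG method, we have to decide which preconditioner to use, that is, a matrix $P\approx A$ whose inverse is cheap to apply. The simplest choice is the Jacobi preconditioner $P=D$, where $D$ is the diagonal part of the matrix $A$. Applying $P^{-1}=D^{-1}$ then amounts to an entrywise division by the diagonal.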

In [None]:
def cg_pre_jacobi(A, b, x, it=100, tol=1e-5):
    x = x.copy()
    r = b - A.dot(x)
    P_inv = 1 / np.diag(A)  # inverse of the diagonal, stored as a vector
    p = P_inv * r  # preconditioned residual
    d = p.copy()
    rp = r.dot(p)
    tol = tol**2

    for i in range(1, it + 1):
        # Stop once r^T P^{-1} r drops below the squared tolerance
        if abs(rp) < tol:
            break
        Ad = A.dot(d)
        alpha = rp / d.dot(Ad)
        x[:] += alpha * d
        r[:] -= alpha * Ad
        p[:] = P_inv * r  # applying the preconditioner is an entrywise product
        rp2 = r.dot(p)
        beta = rp2 / rp
        d[:] = p + beta * d
        rp = rp2
    else:
        print(f'The Jacobi preconditioned CG method did not converge after {it} iterations.')
    return x, i
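
To get a feeling for what diagonal scaling can and cannot do, the sketch below compares condition numbers before and after the symmetric scaling $D^{-1/2} A D^{-1/2}$, to which Jacobi preconditioned CG is equivalent. The helpers `model_matrix` and `jacobi_scaled` and the badly scaled matrix `A_bad` are assumptions introduced for illustration only; they do not appear elsewhere in this section.

```python
import numpy as np

def model_matrix(m):
    """Dense model matrix of size m**2, built as in the examples."""
    N = np.diag(np.ones(m - 1), 1) + np.diag(np.ones(m - 1), -1)
    B = 4 * np.eye(m) - N
    return np.kron(np.eye(m), B) - np.kron(N, np.eye(m))

def jacobi_scaled(A):
    """Return D^{-1/2} A D^{-1/2} for a matrix with positive diagonal D."""
    s = 1 / np.sqrt(np.diag(A))
    return s[:, None] * A * s[None, :]

A = model_matrix(5)
# Hypothetical badly scaled variant: rows and columns rescaled by factors 1..1000.
scale = np.logspace(0, 3, A.shape[0])
A_bad = scale[:, None] * A * scale[None, :]

cond_model = np.linalg.cond(A)
cond_model_scaled = np.linalg.cond(jacobi_scaled(A))
cond_bad = np.linalg.cond(A_bad)
cond_bad_scaled = np.linalg.cond(jacobi_scaled(A_bad))

print(f'model matrix:  {cond_model:9.2e} -> {cond_model_scaled:9.2e}')
print(f'badly scaled:  {cond_bad:9.2e} -> {cond_bad_scaled:9.2e}')
```

For a matrix with constant diagonal the scaling leaves the condition number unchanged, whereas for the artificially rescaled matrix it removes the bad scaling completely.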

**Example 6.11**

Applied to the model matrix, the Jacobi preconditioned CG method yields the following results:

In [None]:
for i in range(1, 11):
    m = i * 10
    n = m**2
    N = np.diag(np.ones(m - 1), 1) + np.diag(np.ones(m - 1), -1)
    B = 4 * np.eye(m) - N
    A = np.kron(np.eye(m), B) - np.kron(N, np.eye(m))
    b = np.ones(n)
    x0 = np.zeros(n)
    
    t = time.perf_counter()
    x, steps = cg_pre_jacobi(A, b, x0, it=int(1e6), tol=1e-6)
    t = time.perf_counter() - t

    res = np.linalg.norm(b - np.dot(A, x))
    print(f'Jacobi preconditioned CG method: n={n:05d}, Steps={steps:05d} Time={t:07.4f} sec, res={res:4.2e}')

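Jacobi preconditioning has not significantly accelerated the method. This is expected, since the diagonal of the model matrix is constant, so the Jacobi preconditioner only rescales the system. To implement the SSOR preconditioner, we implement the method with a general matrix $P$ and hide the application of $P^{-1}$ in `np.linalg.solve(P, r)`.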

In [None]:
def cg_pre(A, P, b, x, it=100, tol=1e-5):
    x = x.copy()
    r = b - A.dot(x)
    p = np.linalg.solve(P, r)  # preconditioned residual p = P^{-1} r
    d = p.copy()
    rp = r.dot(p)
    tol = tol**2

    for i in range(1, it + 1):
        # Stop once r^T P^{-1} r drops below the squared tolerance
        if abs(rp) < tol:
            break
        Ad = A.dot(d)
        alpha = rp / d.dot(Ad)
        x[:] += alpha * d
        r[:] -= alpha * Ad
        p[:] = np.linalg.solve(P, r)  # one linear solve with P per step
        rp2 = r.dot(p)

        beta = rp2 / rp
        d[:] = p + beta * d
        rp = rp2
    else:
        print(f'The preconditioned CG method did not converge after {it} iterations.')
    return x, i

Applied to the model matrix with the optimal choice $\omega=2 - \frac{2\pi}{\sqrt{n}}$, we obtain the following results:

In [None]:
for i in range(1, 9):
    m = i * 10
    n = m**2
    N = np.diag(np.ones(m - 1), 1) + np.diag(np.ones(m - 1), -1)
    B = 4 * np.eye(m) - N
    A = np.kron(np.eye(m), B) - np.kron(N, np.eye(m))
    b = np.ones(n)
    x0 = np.zeros(n)
    
    omega = 2 - 2 * np.pi / np.sqrt(n)
    D = np.diag(np.diag(A))
    D1 = np.diag(1 / np.diag(A))
    L = np.tril(A, -1)
    R = np.triu(A, 1)
    P = (D + omega * L) @ D1 @ (D + omega * R)
    
    t = time.perf_counter()
    x, steps = cg_pre(A, P, b, x0, it=int(1e6), tol=1e-6)
    t = time.perf_counter() - t

    res = np.linalg.norm(b - np.dot(A, x))
    print(f'SSOR preconditioned CG method: n={n:04d}, Steps={steps:05d} Time={t:07.4f} sec, res={res:4.2e}')

The convergence of the scheme has improved significantly, and the number of necessary steps increases only very slowly with $n$. However, solving the linear system in `p[:] = np.linalg.solve(P, r)` is much more expensive than a matrix-vector product, so each step takes significantly longer. This underlines that a key property of an effective preconditioner is that it is cheap to apply.
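
One possible way to make the SSOR preconditioner cheaper to apply is to exploit its factored form $P = (D + \omega L)\, D^{-1} (D + \omega R)$: instead of a general solve with $P$, one forward and one backward substitution suffice. The sketch below illustrates this idea only; the helpers `forward_substitution`, `backward_substitution` and `apply_ssor_inverse` are hypothetical names, and the result is checked against `np.linalg.solve` on a small model matrix.

```python
import numpy as np

def forward_substitution(T, r):
    """Solve T y = r for a lower triangular matrix T."""
    y = np.zeros_like(r, dtype=float)
    for i in range(len(r)):
        y[i] = (r[i] - T[i, :i].dot(y[:i])) / T[i, i]
    return y

def backward_substitution(T, r):
    """Solve T y = r for an upper triangular matrix T."""
    y = np.zeros_like(r, dtype=float)
    for i in reversed(range(len(r))):
        y[i] = (r[i] - T[i, i + 1:].dot(y[i + 1:])) / T[i, i]
    return y

def apply_ssor_inverse(A, omega, r):
    """Return P^{-1} r for P = (D + omega L) D^{-1} (D + omega R)."""
    d = np.diag(A)
    lower = np.diag(d) + omega * np.tril(A, -1)
    upper = np.diag(d) + omega * np.triu(A, 1)
    y = forward_substitution(lower, r)         # (D + omega L) y = r
    return backward_substitution(upper, d * y)  # (D + omega R) p = D y

# Small model matrix, built as in the examples.
m = 4
N = np.diag(np.ones(m - 1), 1) + np.diag(np.ones(m - 1), -1)
B = 4 * np.eye(m) - N
A = np.kron(np.eye(m), B) - np.kron(N, np.eye(m))
omega = 2 - 2 * np.pi / m  # equals 2 - 2*pi/sqrt(n) for n = m**2

d = np.diag(A)
P = (np.diag(d) + omega * np.tril(A, -1)) @ np.diag(1 / d) @ (np.diag(d) + omega * np.triu(A, 1))
r = np.arange(1.0, m * m + 1)

p_dense = np.linalg.solve(P, r)
p_tri = apply_ssor_inverse(A, omega, r)
agree = np.allclose(p_dense, p_tri)
print(agree)
```

The explicit Python loops are for illustration and are slow in practice. A library triangular solver such as `scipy.linalg.solve_triangular` would be the natural replacement, assuming SciPy is available, and for dense triangular factors each substitution costs on the order of $n^2$ operations instead of the $n^3$ of a general dense solve.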