Preconditioning
===

We call $C$ a preconditioner for the matrix $A$ if
* $C^{-1} A \approx I$
* the matrix-vector multiplication $w = C^{-1} r$ is cheap

One extreme case is $C = A$, where the first condition is optimally satisfied, but (in general) not the second. The opposite extreme is $C = I$, where applying $C^{-1}$ costs nothing, but nothing is gained in the approximation of $A$.

If $A$ is an SPD matrix, we would like the preconditioner $C$ to be SPD as well. In this case, the quality of the approximation can be measured by the spectral bounds

$$
0 < \gamma_1 \leq \frac{x^T A x }{x^T C x} \leq \gamma_2 \qquad \forall \, 0 \neq x \in {\mathbb R^n}
$$

These spectral bounds are bounds for the eigenvalues $\lambda$ of the generalized eigenvalue problem

$$
A x = \lambda C x
$$

If $\lambda_i$ is an eigenvalue with eigenvector $x_i$, then $x_i^T A x_i = \lambda_i x_i^T C x_i$. Dividing by $x_i^T C x_i > 0$ shows $\lambda_i \in [\gamma_1, \gamma_2]$.
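
For a small matrix the spectral bounds can be computed directly. The following NumPy sketch is independent of the NGSolve example; the model matrix `A` (a 1D finite-difference Laplacian plus a variable reaction term) and the diagonal choice of `C` are assumptions for illustration only. It uses the fact that the generalized eigenvalues of $A x = \lambda C x$ coincide with the eigenvalues of the symmetric matrix $C^{-1/2} A C^{-1/2}$.

```python
import numpy as np

# Assumed model matrix: 1D Laplacian stencil plus a variable diagonal term (SPD).
n = 8
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
A += np.diag(np.linspace(0.0, 5.0, n))

# Assumed preconditioner: C = diag(A), stored as a vector of diagonal entries.
c = np.diag(A).copy()

# Eigenvalues of A x = lambda C x are those of C^{-1/2} A C^{-1/2}.
c_inv_sqrt = 1.0 / np.sqrt(c)
A_sym = c_inv_sqrt[:, None] * A * c_inv_sqrt[None, :]
lam = np.linalg.eigvalsh(A_sym)          # sorted in ascending order
gamma1, gamma2 = lam[0], lam[-1]
print("gamma1 =", gamma1, "gamma2 =", gamma2)

# Rayleigh quotients x^T A x / x^T C x of random vectors lie in [gamma1, gamma2].
rng = np.random.default_rng(0)
X = rng.standard_normal((n, 100))
q = np.einsum("ij,ij->j", X, A @ X) / np.einsum("ij,ij->j", X, c[:, None] * X)
print("quotient range:", q.min(), q.max())
```

The sharpest possible bounds $\gamma_1$ and $\gamma_2$ are the extreme generalized eigenvalues, which is what the sketch computes.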

The preconditioned Richardson iteration
---
We use the preconditioner to obtain the correction from the residual:

$$
\qquad  x^{k+1} = x^k + \alpha C^{-1} (b - A x^k)
$$

Since $b = A x^\ast$, the error $e^k = x^k - x^\ast$ is now propagated as

$$
e^{k+1} = M e^k = (I - \alpha C^{-1} A) e^k
$$

The error-propagation matrix $M$ is self-adjoint in the energy inner product $\left< x, y \right>_A = x^T A y$, since $A$ and $C$ are symmetric:
\begin{eqnarray*}
\left< M x, y \right>_A & = & \left\{ A (x - \alpha C^{-1} A x) \right\}^T y  \\
& = & x^T (A - \alpha A C^{-1} A) y \\
& = & \left< x, M y \right>_A
\end{eqnarray*}

as well as in the inner products
$$
\left< x, y \right>_C \qquad \text{and} \qquad \left< x, y \right>_{AC^{-1} A}.
$$
For a suitable step size (for example $0 < \alpha < 2/\gamma_2$), the error decreases monotonically in the corresponding norms. The last one is of particular practical interest, since it can be computed without knowing the solution $x^\ast$:

\begin{eqnarray*}
\| x - x^\ast \|_{AC^{-1} A}^2 & = & \| A (x - x^\ast) \|_{C^{-1}}^2 \\
& = & \| A x - b \|_{C^{-1}}^2
\end{eqnarray*}

With the residual $r$ and the preconditioned residual $w$, i.e.

$$
r = b - A x \qquad \text{and} \qquad w = C^{-1} r
$$

the squared error norm becomes

$$
\| x - x^\ast \|_{AC^{-1}A}^2 = r^T w
$$
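
This identity can be checked numerically on a small example. The sketch below uses plain NumPy instead of NGSolve; the model matrix, the right-hand side, the diagonal preconditioner and the step size $\alpha = 1/\lambda_{\max}(C^{-1}A)$ are assumptions for illustration. It compares the directly evaluated error norm, which needs the exact solution, with the computable quantity $r^T w$ during a preconditioned Richardson iteration.

```python
import numpy as np

# Assumed SPD model matrix and right-hand side.
n = 8
A = 3.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
c = np.diag(A).copy()                 # assumed preconditioner C = diag(A)
x_star = np.linalg.solve(A, b)        # exact solution, only used for the check

# Step size 1 / lambda_max(C^{-1} A), via the symmetric form C^{-1/2} A C^{-1/2}.
lam = np.linalg.eigvalsh(A / np.sqrt(np.outer(c, c)))
alpha = 1.0 / lam[-1]

x = np.zeros(n)
history = []
for k in range(30):
    r = b - A @ x                     # residual
    w = r / c                         # preconditioned residual w = C^{-1} r
    e = x - x_star
    err_sq_direct = e @ (A @ ((A @ e) / c))   # e^T A C^{-1} A e
    err_sq_cheap = r @ w                      # r^T w, no exact solution needed
    history.append((err_sq_direct, err_sq_cheap))
    x = x + alpha * w                 # Richardson update

for k in (0, 10, 29):
    print(k, history[k])
```

Both columns agree up to rounding, and the values decrease from one iteration to the next.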

In [None]:
from ngsolve import *
from netgen.geom2d import unit_square

# model problem: -Laplace(u) + 10 u = x*y on the unit square, lowest-order H1 elements
mesh = Mesh(unit_square.GenerateMesh(maxh=0.1))
fes = H1(mesh, order=1)
u,v = fes.TnT()
a = BilinearForm(grad(u)*grad(v)*dx+10*u*v*dx).Assemble()
f = LinearForm(x*y*v*dx).Assemble()
gfu = GridFunction(fes)

A very simple preconditioner is the Jacobi preconditioner:

$$
C = \text{diag} A
$$ 

If $A$ is SPD, then all diagonal entries $a_{ii} = e_i^T A e_i$ are positive, and $C$ is SPD as well.

In NGSolve, we can obtain a Jacobi preconditioner as follows. The result is a linear operator providing the linear operation

$$
w := C^{-1} r
$$

In [None]:
# Jacobi preconditioner as a linear operator: cinv * r computes w = C^{-1} r
cinv = a.mat.CreateSmoother()

In [None]:
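# power iteration for the largest eigenvalue of C^{-1} A
hv = gfu.vec.CreateVector()
hv2 = gfu.vec.CreateVector()
hv3 = gfu.vec.CreateVector()
hv.SetRandom()
hv.data /= Norm(hv)
for k in range(20):
    hv2.data = a.mat * hv      # hv2 = A hv
    hv3.data = cinv * hv2      # hv3 = C^{-1} A hv
    rho = Norm(hv3)            # eigenvalue estimate, since hv has norm 1
    print (rho)
    hv.data = 1/rho * hv3      # normalize for the next step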
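
The loop above is a power iteration: it repeatedly applies $C^{-1} A$ to a normalized vector, and the norm of the result approaches the largest eigenvalue of $C^{-1} A$. The following NumPy sketch repeats the idea on a small assumed model matrix (not the finite element matrix of this notebook) and compares the estimate with a directly computed eigenvalue. The iteration count of 500 is an arbitrary choice for this small example.

```python
import numpy as np

# Assumed SPD model matrix and its diagonal as preconditioner.
n = 8
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
A += np.diag(np.linspace(0.0, 5.0, n))
c = np.diag(A).copy()

rng = np.random.default_rng(0)
v = rng.standard_normal(n)
v /= np.linalg.norm(v)
for k in range(500):
    z = (A @ v) / c              # z = C^{-1} A v
    rho = np.linalg.norm(z)      # eigenvalue estimate, since v has norm 1
    v = z / rho                  # normalize for the next step

# Reference value: largest eigenvalue of the symmetric matrix C^{-1/2} A C^{-1/2}.
lam_max = np.linalg.eigvalsh(A / np.sqrt(np.outer(c, c)))[-1]
print("power iteration:", rho, " reference:", lam_max)
```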

In [None]:
alpha = 1 / rho   # step size from the eigenvalue estimate above
r = f.vec.CreateVector()
w = f.vec.CreateVector()
gfu.vec[:] = 0

# initial error measure sqrt(r^T C^{-1} r) for the start vector 0, where r = f
w.data = cinv * f.vec
err0 = sqrt(InnerProduct(f.vec, w))
its = 0
while True:
    r.data = f.vec - a.mat * gfu.vec   # residual
    w.data = cinv * r                  # preconditioned residual
    err = sqrt(InnerProduct(r,w))      # error in the A C^{-1} A norm
    print ("iteration", its, "res=", err)
    gfu.vec.data += alpha * w
    if err < 1e-8 * err0 or its > 10000: break
    its = its+1
print ("needed", its, "iterations")

For this problem, the diagonal preconditioner does not improve the situation considerably. However, if we have a bilinear form with variable coefficients, or large coefficients in the Robin boundary condition such as

$$
A(u,v) = \int_\Omega \nabla u \cdot \nabla v \, dx + 10^8 \int_{\Gamma_R} u v \, ds,
$$

the Jacobi preconditioner captures these parameters (experiment in the exercise, theory soon).
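
A toy experiment hints at this effect. The NumPy sketch below does not use NGSolve; the 1D model matrix and the value $10^8$ added to a single diagonal entry are assumptions that only mimic a large Robin coefficient at one boundary node. It compares the condition number of $A$ with that of the diagonally scaled matrix $C^{-1/2} A C^{-1/2}$.

```python
import numpy as np

# Assumed model: 1D Laplacian stencil with a huge coefficient at one "boundary" unknown.
n = 20
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
A[-1, -1] += 1e8

# Jacobi scaling: C = diag(A), symmetric form C^{-1/2} A C^{-1/2}.
d = np.diag(A)
A_jac = A / np.sqrt(np.outer(d, d))

def cond_spd(M):
    """Condition number of an SPD matrix from its extreme eigenvalues."""
    lam = np.linalg.eigvalsh(M)
    return lam[-1] / lam[0]

kappa_plain = cond_spd(A)
kappa_jacobi = cond_spd(A_jac)
print("cond(A)                   =", kappa_plain)
print("cond(C^{-1/2} A C^{-1/2}) =", kappa_jacobi)
```

In this toy setting the large coefficient dominates the condition number of $A$, while the scaled matrix is no longer affected by it.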

The preconditioned gradient method
---

To introduce the preconditioner into the gradient method, we proceed as follows. Since $C$ is SPD, we can form its square root

$$
C^{1/2}
$$

as well as the inverse $C^{-1/2}$ of the square root. The linear system $A x = b$ is equivalent to

$$
C^{-1/2} A C^{-1/2} \; C^{1/2} x = C^{-1/2} b.
$$

With the definition of transformed quantities 

$$
\tilde A = C^{-1/2} A C^{-1/2}, \qquad
\tilde b = C^{-1/2} b, \qquad
\tilde x = C^{1/2} x
$$

we have the linear system

$$
\tilde A \tilde x = \tilde b
$$

The transformed matrix $\tilde A$ is SPD as well.
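
The reasoning behind this can be spelled out briefly. Symmetry follows from the symmetry of $A$ and $C^{-1/2}$, and for $\tilde x \neq 0$ the substitution $y = C^{-1/2} \tilde x \neq 0$ gives

$$
\tilde x^T \tilde A \tilde x = (C^{-1/2} \tilde x)^T A \, (C^{-1/2} \tilde x) = y^T A y > 0.
$$

The same substitution shows that $\tilde A \tilde x = \lambda \tilde x$ is equivalent to $A y = \lambda C y$, so the eigenvalues of $\tilde A$ lie in $[\gamma_1, \gamma_2]$.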

We apply the gradient method for the transformed system:

Given $\tilde x^0$ <br>
$\tilde r^0 = \tilde b - \tilde A \tilde x^0$ <br>
for $k = 0, 1, 2, \ldots$ <br>
$\qquad \tilde p = \tilde A \tilde r^k$ <br>
$\qquad \alpha = {\tilde r^k}^T \tilde r^k \, / \, {\tilde r^k}^T \tilde p$ <br>
$\qquad \tilde x^{k+1} = \tilde x^k + \alpha \tilde r^k$ <br>
$\qquad \tilde r^{k+1} = \tilde r^k - \alpha \tilde p$ <br>


Now we transform back via
$$
\tilde x^k = C^{1/2} x^k, \qquad \tilde r^k = C^{-1/2} r^k, 
\qquad \tilde p = C^{-1/2} p
$$
and obtain

Given $x^0$ <br>
$C^{-1/2} r^0 = C^{-1/2} b  - C^{-1/2} A C^{-1/2} C^{1/2} x^0$ <br>
for $k = 0, 1, 2, \ldots$ <br>
$\qquad C^{-1/2} p = C^{-1/2} A C^{-1/2} C^{-1/2} r^k$ <br>
$\qquad \alpha = \frac{\{C^{-1/2} r^k \}^T C^{-1/2} r^k \, }{ \, \{ C^{-1/2} r^k\}^T C^{-1/2} p}$ <br>
$\qquad C^{1/2} x^{k+1} = C^{1/2} x^k + \alpha C^{-1/2} r^k$ <br>
$\qquad C^{-1/2} r^{k+1} = C^{-1/2} r^k - \alpha C^{-1/2} p$ <br>

Now we simplify by multiplying each line with $C^{1/2}$ or $C^{-1/2}$ as appropriate, and introduce $w = C^{-1} r^k$:

Given $x^0$ <br>
$r^0 = b - A x^0$ <br>
for $k = 0, 1, 2, \ldots$ <br>
$\qquad w = C^{-1} r^k$ <br>
$\qquad p = A w$ <br>
$\qquad \alpha = \frac{w^T r^k}{w^T p}$ <br>
$\qquad x^{k+1} = x^k + \alpha w$ <br>
$\qquad r^{k+1} = r^k - \alpha p$ <br>
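
The equivalence of the two formulations can be verified numerically. The sketch below uses NumPy with an assumed small model matrix and a diagonal preconditioner (so that $C^{1/2}$ is trivial to form); the function names are hypothetical. It runs the gradient method on the transformed system and the preconditioned gradient method on the original system, and compares the iterates via $x = C^{-1/2} \tilde x$.

```python
import numpy as np

def gradient_method(A, b, x0, steps):
    """Plain gradient method as stated for the transformed system."""
    x = x0.copy()
    r = b - A @ x
    for _ in range(steps):
        p = A @ r
        alpha = (r @ r) / (r @ p)
        x = x + alpha * r
        r = r - alpha * p
    return x

def preconditioned_gradient_method(A, c, b, x0, steps):
    """Preconditioned gradient method with diagonal C given by the vector c."""
    x = x0.copy()
    r = b - A @ x
    for _ in range(steps):
        w = r / c                      # w = C^{-1} r
        p = A @ w
        alpha = (w @ r) / (w @ p)
        x = x + alpha * w
        r = r - alpha * p
    return x

# Assumed SPD model matrix, right-hand side and preconditioner C = diag(A).
n = 8
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
A += np.diag(np.linspace(0.0, 5.0, n))
b = np.ones(n)
c = np.diag(A).copy()
s = np.sqrt(c)                         # diagonal of C^{1/2}

# Transformed system: A~ = C^{-1/2} A C^{-1/2}, b~ = C^{-1/2} b.
A_t = A / np.outer(s, s)
b_t = b / s
x_t = gradient_method(A_t, b_t, np.zeros(n), steps=10)
x = preconditioned_gradient_method(A, c, b, np.zeros(n), steps=10)

# Back-transformation x = C^{-1/2} x~ should reproduce the preconditioned iterate.
print("max difference:", np.max(np.abs(x - x_t / s)))
```

The practical point of the simplified form is that only applications of $C^{-1}$ are needed, never the square root $C^{1/2}$.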



In [None]:
r = f.vec.CreateVector()
w = f.vec.CreateVector()
p = f.vec.CreateVector()

gfu.vec[:] = 0
r.data = f.vec                 # residual for the start vector 0
w.data = cinv*r                # reuse the Jacobi preconditioner cinv from above
err0 = sqrt(InnerProduct(r,w))
its = 0
while True:
    w.data = cinv*r            # preconditioned residual
    p.data = a.mat * w
    err2 = InnerProduct(w,r)
    alpha = err2 / InnerProduct(w,p)   # optimal step size for the direction w

    print ("iteration", its, "res=", sqrt(err2))
    gfu.vec.data += alpha * w
    r.data -= alpha * p        # update the residual without another product with A
    if sqrt(err2) < 1e-8 * err0 or its > 10000: break
    its = its+1
print ("needed", its, "iterations")