# Conjugate Gradient and Preconditioned Conjugate Gradient methods


### Error Analysis

Stationary methods

* Jacobi method

$$x^{(k+1)}_i = \frac{1}{A_{i,i}}\left(b_i-\sum_{j\neq i}A_{i,j}x^{(k)}_j\right)$$

In vectorized form, with $D$ the diagonal part of $A$:
$${\bf x}^{(k+1)} = D^{-1}({\bf b}-(A-D){\bf x}^{(k)})$$

* Gauss-Seidel method

$$x^{(k+1)}_i = \frac{1}{A_{i,i}}\left(b_i-\sum_{j<i}A_{i,j}x^{(k+1)}_j-\sum_{j>i}A_{i,j}x^{(k)}_j\right)$$

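In vectorized form, with $L$ the lower triangular part of $A$ (including the diagonal) and $U$ the strictly upper triangular part, so that $A=L+U$: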
$${\bf x}^{(k+1)} = L^{-1}({\bf b}-U{\bf x}^{(k)})$$

* Successive Over-Relaxation (SOR)

$$x^{(k+1)}_i = \omega\frac{1}{A_{i,i}}\left(b_i-\sum_{j<i}A_{i,j}x^{(k+1)}_j-\sum_{j>i}A_{i,j}x^{(k)}_j\right) + (1-\omega)x^{(k)}_i$$

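In vectorized form, with $D$ the diagonal of $A$, $L$ its lower triangular part (including the diagonal), and $U$ its strictly upper triangular part:
$$(\omega L+(1-\omega)D){\bf x}^{(k+1)} = \omega({\bf b}-U{\bf x}^{(k)}) + (1-\omega)D{\bf x}^{(k)}$$
Each component is a weighted average of a Gauss-Seidel style update and its previous value, and $\omega=1$ recovers the Gauss-Seidel method.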
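
The component-wise update rules above can be sketched in NumPy as follows. This is a minimal illustration, not a tuned implementation: the function names `jacobi_sweep` and `sor_sweep`, the small test matrix, the relaxation factor $\omega=1.1$, and the fixed count of 50 sweeps are all assumptions made for the example.

```python
import numpy as np

def jacobi_sweep(A, b, x):
    """One Jacobi sweep: every component uses only the previous iterate."""
    x_new = np.empty_like(x)
    for i in range(len(b)):
        off_diag = A[i, :] @ x - A[i, i] * x[i]   # sum over j != i
        x_new[i] = (b[i] - off_diag) / A[i, i]
    return x_new

def sor_sweep(A, b, x, omega=1.0):
    """One SOR sweep; omega = 1 reduces to Gauss-Seidel."""
    x = x.copy()
    for i in range(len(b)):
        # x[:i] already holds updated values, x[i+1:] still holds old ones
        sigma = A[i, :i] @ x[:i] + A[i, i + 1:] @ x[i + 1:]
        x[i] = omega * (b[i] - sigma) / A[i, i] + (1.0 - omega) * x[i]
    return x

# Illustrative diagonally dominant system (an assumption for this example)
A = np.array([[4.0, -1.0, 0.0],
              [-1.0, 4.0, -1.0],
              [0.0, -1.0, 4.0]])
b = np.array([2.0, 4.0, 10.0])

x_j = np.zeros(3)
x_gs = np.zeros(3)
x_sor = np.zeros(3)
for _ in range(50):
    x_j = jacobi_sweep(A, b, x_j)
    x_gs = sor_sweep(A, b, x_gs, omega=1.0)
    x_sor = sor_sweep(A, b, x_sor, omega=1.1)
```

Jacobi needs a second array because every component reads the old iterate, while the Gauss-Seidel and SOR sweeps can overwrite `x` in place.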

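All of these are *stationary methods*: for a suitable matrix $M$ (for example, $M=D$, the diagonal of $A$, gives the Jacobi method, and $M=L$, the lower triangle of $A$ including the diagonal, gives Gauss-Seidel), each can be written as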

$${\bf x}^{(k+1)} = {\bf x}^{(k)}+M^{-1}({\bf b}-A{\bf x}^{(k)})$$
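
As a rough sketch of this common form, the loop below applies the same update with three different choices of $M$. The helper name `stationary_solve`, the relative-residual stopping rule, and the SOR splitting $M=\frac{1}{\omega}D+E$ (with $E$ the strictly lower triangle of $A$) are assumptions for this illustration rather than something stated in the notes; the SOR choice follows from rearranging the component-wise SOR update.

```python
import numpy as np

def stationary_solve(A, b, M, x0=None, tol=1e-10, max_iter=10_000):
    """Iterate x <- x + M^{-1} (b - A x) until the relative residual is small."""
    x = np.zeros_like(b, dtype=float) if x0 is None else x0.astype(float)
    b_norm = np.linalg.norm(b)
    for k in range(max_iter):
        r = b - A @ x                       # residual of the current iterate
        if np.linalg.norm(r) <= tol * b_norm:
            return x, k
        x = x + np.linalg.solve(M, r)       # apply M^{-1} without forming it
    return x, max_iter

# Illustrative system (an assumption for this example)
A = np.array([[4.0, -1.0, 0.0],
              [-1.0, 4.0, -1.0],
              [0.0, -1.0, 4.0]])
b = np.array([2.0, 4.0, 10.0])

D = np.diag(np.diag(A))      # diagonal part
E = np.tril(A, k=-1)         # strictly lower triangular part
omega = 1.1                  # hypothetical relaxation factor

splittings = {
    "jacobi": D,
    "gauss_seidel": D + E,
    "sor": D / omega + E,
}
results = {name: stationary_solve(A, b, M) for name, M in splittings.items()}
for name, (x, iters) in results.items():
    print(f"{name:>12}: {iters} iterations, x = {x}")
```

In practice $M$ is triangular or diagonal, so a dedicated triangular solve would be cheaper than the general `np.linalg.solve` used here for brevity.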

<div style="background-color:rgba(256, 256, 0, 0.1); padding:10px 0;font-family:monospace;">
    <b>Theorem: Stationary Method Convergence.</b><br><hr>
    For the linear problem $A{\bf x}={\bf b}$, consider the iterative method $${\bf x}^{(k+1)}={\bf x}^{(k)}+M^{-1}{\bf r}^{(k)}$$ where ${\bf r}^{(k)}={\bf b}-A{\bf x}^{(k)}$ is the residual, and define the <b>iteration matrix</b> $T=I-M^{-1}A$. Then the method converges for every initial guess ${\bf x}^{(0)}$ if and only if the spectral radius of the iteration matrix satisfies $$\rho(T)<1$$ The smaller $\rho(T)$, the faster the asymptotic convergence.
</div>
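
To make the criterion concrete, the sketch below computes $\rho(T)$ numerically for the Jacobi and Gauss-Seidel choices of $M$. The helper name `spectral_radius_of_iteration` and the $5\times 5$ tridiagonal model matrix with stencil $(-1, 2, -1)$ are assumptions chosen for the example. Dense eigenvalue computation is only practical for small matrices.

```python
import numpy as np

def spectral_radius_of_iteration(A, M):
    """Spectral radius of T = I - M^{-1} A."""
    T = np.eye(A.shape[0]) - np.linalg.solve(M, A)
    return np.max(np.abs(np.linalg.eigvals(T)))

# Model problem (an assumption): tridiagonal matrix with stencil (-1, 2, -1)
n = 5
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

D = np.diag(np.diag(A))   # Jacobi: M = D
L = np.tril(A)            # Gauss-Seidel: M = lower triangle including diagonal

rho_jacobi = spectral_radius_of_iteration(A, D)
rho_gs = spectral_radius_of_iteration(A, L)
print(f"rho(T_Jacobi)       = {rho_jacobi:.4f}")
print(f"rho(T_Gauss-Seidel) = {rho_gs:.4f}")
```

Both radii are below 1 for this matrix, so both methods converge, and the smaller Gauss-Seidel radius indicates that it should need fewer iterations than Jacobi here.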

### Conjugate Gradient Method

The conjugate gradient method builds on the **steepest descent method**, so we start there. <br>
For a symmetric positive definite matrix $A$, solving $A{\bf x}={\bf b}$ is equivalent to minimizing the following quadratic functional:
$$\phi({\bf x})=\frac{1}{2}{\bf x}^TA{\bf x}-{\bf b}^T{\bf x}$$
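The negative gradient of $\phi$ at ${\bf x}^{(k)}$ is the residual ${\bf r}^{(k)}={\bf b}-A{\bf x}^{(k)}$, which gives the gradient descent iteration:
$${\bf x}^{(k+1)}={\bf x}^{(k)}+\alpha_k{\bf r}^{(k)}$$
The step size $\alpha_k$ is chosen to minimize $\phi$ along this direction, that is, to minimize:
$$\frac{1}{2}({\bf x}^{(k)}+\alpha_k{\bf r}^{(k)})^TA({\bf x}^{(k)}+\alpha_k{\bf r}^{(k)})-{\bf b}^T({\bf x}^{(k)}+\alpha_k{\bf r}^{(k)})$$
Setting the derivative with respect to $\alpha_k$ to zero (and using the symmetry of $A$), we get 
$$\alpha_k{\bf r}^{(k)T}A{\bf r}^{(k)}+{\bf r}^{(k)T}A{\bf x}^{(k)}-{\bf r}^{(k)T}{\bf b}=0$$
Since ${\bf r}^{(k)T}({\bf b}-A{\bf x}^{(k)})={\bf r}^{(k)T}{\bf r}^{(k)}$, this leads us to 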
$$\alpha_k=\dfrac{{\bf r}^{(k)T}{\bf r}^{(k)}}{{\bf r}^{(k)T}A{\bf r}^{(k)}}$$
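
A minimal sketch of steepest descent with this exact step size is shown below. The function name `steepest_descent`, the relative-residual stopping rule, and the $2\times 2$ symmetric positive definite test system are assumptions for the example.

```python
import numpy as np

def steepest_descent(A, b, x0, tol=1e-10, max_iter=10_000):
    """Steepest descent for SPD A, with alpha_k = (r^T r) / (r^T A r)."""
    x = x0.astype(float)
    r = b - A @ x                        # residual = negative gradient of phi
    b_norm = np.linalg.norm(b)
    for k in range(max_iter):
        if np.linalg.norm(r) <= tol * b_norm:
            return x, k
        Ar = A @ r
        alpha = (r @ r) / (r @ Ar)       # exact line search along r
        x = x + alpha * r
        r = r - alpha * Ar               # same as b - A x, without a second product
    return x, max_iter

# Illustrative SPD system (an assumption for this example)
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])

x_sd, iters_sd = steepest_descent(A, b, np.zeros(2))
print(x_sd, iters_sd)
```

Updating the residual as `r - alpha * Ar` reuses the product `A @ r`, so each iteration costs one matrix-vector product.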

The **Conjugate Gradient Method** no longer uses the residual vector ${\bf r}^{(k)}$ as the search direction. Instead, each search direction ${\bf p}^{(k)}$ is required to be $A$-conjugate ($A$-orthogonal) to all previous search directions, that is, ${\bf p}^{(k)T}A{\bf p}^{(j)}=0$ for $j<k$. 

<div style="background-color:rgba(0, 256, 0, 0.1); padding:10px 0;font-family:monospace;">
    <b>Algorithm: Conjugate Gradient Method.</b><br><hr>
    Given an initial guess ${\bf x}_0$ and a tolerance $\epsilon$, first set ${\bf r}_0={\bf b}-A{\bf x}_0$, $\delta_0=\langle {\bf r}_0, {\bf r}_0\rangle$, $b_{\delta} = \langle {\bf b}, {\bf b}\rangle$, $k=0$, and ${\bf p}_0={\bf r}_0$. Then:<br>
    &nbsp;&nbsp;&nbsp;&nbsp; while $\delta_k>\epsilon^2 b_{\delta}$:<br>
    &nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp; ${\bf s}_k=A{\bf p}_k$<br>
    &nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp; $\alpha_k = \dfrac{\delta_k}{\langle {\bf p}_k, {\bf s}_k\rangle}$<br>
    &nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp; ${\bf x}_{k+1} = {\bf x}_{k}+\alpha_k{\bf p}_k$<br>
    &nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp; ${\bf r}_{k+1} = {\bf r}_{k}-\alpha_k{\bf s}_k$<br>
    &nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp; $\delta_{k+1} = \langle {\bf r}_{k+1}, {\bf r}_{k+1}\rangle$<br>
    &nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp; ${\bf p}_{k+1} = {\bf r}_{k+1}+\dfrac{\delta_{k+1}}{\delta_{k}} {\bf p}_{k}$<br>
    &nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp; $k = k+1$
</div>
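
The algorithm above translates almost line by line into NumPy. The sketch below is one possible transcription: the function name `conjugate_gradient`, the `max_iter` safety cap, and the small test system are assumptions added for the example and are not part of the algorithm box.

```python
import numpy as np

def conjugate_gradient(A, b, x0, eps=1e-10, max_iter=None):
    """Conjugate gradient for symmetric positive definite A."""
    x = x0.astype(float)
    r = b - A @ x                  # r_0 = b - A x_0
    p = r.copy()                   # p_0 = r_0
    delta = r @ r                  # delta_0 = <r_0, r_0>
    b_delta = b @ b                # <b, b>, used in the stopping test
    if max_iter is None:
        max_iter = 10 * len(b)     # assumed safety cap, not in the algorithm box
    k = 0
    while delta > eps**2 * b_delta and k < max_iter:
        s = A @ p
        alpha = delta / (p @ s)
        x = x + alpha * p
        r = r - alpha * s
        delta_new = r @ r
        p = r + (delta_new / delta) * p
        delta = delta_new
        k += 1
    return x, k

# Illustrative SPD system (an assumption for this example)
A = np.array([[4.0, -1.0, 0.0],
              [-1.0, 4.0, -1.0],
              [0.0, -1.0, 4.0]])
b = np.array([2.0, 4.0, 10.0])

x_cg, iters_cg = conjugate_gradient(A, b, np.zeros(3))
print(x_cg, iters_cg)
```

In exact arithmetic the method terminates in at most $n$ iterations for an $n\times n$ system; for this small, well-conditioned example the floating-point run is expected to behave the same way.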