In [None]:
from IPython.html.services.config import ConfigManager
from IPython.utils.path import locate_profile
cm = ConfigManager(profile_dir=locate_profile(get_ipython().profile))
cm.update('livereveal', {
              'theme': 'sky',
              'transition': 'zoom',
              'start_slideshow_at': 'selected',
})

# Lecture 11. Multigrid continued

## Previous lecture
- Basic elements of the multigrid

## Today's lecture
- The basic elements of the multigrid.

## (Approximate) Syllabus
- **Week 1:** Intro & basic integral equations (turning PDEs into IEs, typical kernels, Nystrom, collocation, Galerkin, quadrature for singular/hypersingular integrals).
- **Week 2:** Translation-invariant kernels and convolutions, FFT. Concept of close and far interactions, precorrected FFT. Barnes-Hut method.
- **Week 3:**  Fast multipole methods. Algebraic analogue of fast multipole method, hierarchical matrices
- **Week 4:**  Multigrid methods, domain decomposition
- **Week 5:** Wavelets, best N-term approximation
- **Week 6:**  Sparse grids, tensors
- **Week 7:** Exam & test
- **Week 8:** App Period

## Multigrid scheme
We have a sequence of matrices $A_1, A_2, \ldots$ of decreasing sizes, where $A_j$ corresponds to mesh size $h_j$.

## Basic idea of a two-grid scheme


Two-grid method: given $A_1$, the prolongation $P$ and the **smoother** $S$, we proceed as follows:

1. Set $R = P^{\top}$ and form the coarse-grid operator $A_{2} = R A_1 P = P^{\top} A_1 P$.
2. Smooth ($\nu$ steps): $u_1 := S^{\nu} u_1$, then compute the residual $r_1 = f_1 - A_1 u_1$.
3. Restrict the residual: $r_{2} = R r_1$.
4. Solve on the coarse grid: $e_{2} = A_{2}^{-1} r_{2}$.
5. Interpolate back: $u_1 := u_1+ P e_{2}$.


In the multigrid scheme, the equation for $e_2$ is solved recursively.
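The two-grid steps can be sketched in a few lines of NumPy. This is only an illustration under assumptions that are not fixed by the text: a 1D Dirichlet Laplacian as $A_1$, linear interpolation as $P$, and damped Jacobi with $\omega = 2/3$ and $\nu = 2$ as the smoother. The helper names `laplace_1d`, `linear_prolongation` and `two_grid_step` are hypothetical.

```python
import numpy as np

def laplace_1d(n):
    """1D Dirichlet Laplacian on n interior points, h = 1/(n+1) (assumed model problem)."""
    h = 1.0 / (n + 1)
    return (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2

def linear_prolongation(nc):
    """Linear interpolation from nc coarse points to 2*nc + 1 fine points."""
    P = np.zeros((2 * nc + 1, nc))
    for j in range(nc):
        P[2 * j:2 * j + 3, j] = [0.5, 1.0, 0.5]
    return P

def two_grid_step(A, P, f, u, nu=2, omega=2.0 / 3.0):
    """One two-grid iteration: smooth, restrict residual, coarse solve, interpolate."""
    d_inv = 1.0 / np.diag(A)
    for _ in range(nu):                       # damped Jacobi smoothing (assumed smoother)
        u = u + omega * d_inv * (f - A @ u)
    r1 = f - A @ u                            # fine-grid residual
    A2 = P.T @ A @ P                          # Galerkin coarse-grid operator
    e2 = np.linalg.solve(A2, P.T @ r1)        # restrict with R = P^T and solve exactly
    return u + P @ e2                         # coarse-grid correction

nc = 15
n = 2 * nc + 1
A = laplace_1d(n)
P = linear_prolongation(nc)
rng = np.random.default_rng(0)
u_exact = rng.standard_normal(n)
f = A @ u_exact

u = np.zeros(n)
errors = []
for _ in range(5):
    u = two_grid_step(A, P, f, u)
    errors.append(np.linalg.norm(u - u_exact))
```

Replacing the exact coarse solve by a recursive call to the same routine on $A_2$ would turn this sketch into a multigrid cycle.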

## Algebraic formulation


We need some assumptions on $A_j$. Suppose that $A_j = A^*_j > 0$ and

$$c_j h_j \leq \frac{(A_j u_j, u_j)}{(u_j, u_j)} \leq c_j h^{-1}_j$$

for all $u_j \ne 0$.

We can introduce $A$-scalar product as

$$(u, v)_A = (Au, v),$$ 

$$\Vert u \Vert_A = \sqrt{(u, u)_A}.$$

Also, 

$$\Vert M \Vert_A = \Vert A^{\frac{1}{2}} M A^{-\frac{1}{2}} \Vert = \max_{u \ne 0} \frac{\Vert M u \Vert_A}{\Vert u \Vert_A}.$$

## Algebraic formulation (cont.)

Let $$\widehat{A} = P^{\top} A P$$ be the coarse-grid matrix.

From the definition, for any coarse-grid vector $v$ we have

$$ \Vert P v \Vert_{A} = \Vert v \Vert_{\widehat{A}}.$$

An important role will be played by the matrix

$$Q = \widehat{A}^{-1} P^{\top} A,$$

for which, for any coarse-grid matrix $M$,

$$\Vert P M Q \Vert_A = \Vert M \Vert_{\widehat{A}},$$

and which has the following properties:

$$(PQ)^{2} = PQ,\quad (I - PQ)^{2} = I - PQ.$$

$$\left(A^{\frac{1}{2}} P Q A^{\frac{-1}{2}}\right)^{\top} = A^{\frac{1}{2}} PQ A^{-\frac{1}{2}}.$$

$$\left((I - PQ)u, PQ v\right)_A  = 0.$$ 

$$\Vert (I - P Q) u + PQ u \Vert^2_A = \Vert (I - PQ)  u\Vert^2_A + \Vert PQ u \Vert^2_A.$$

Thus, $PQ$ is an **A-orthogonal projector**, and $$QP = I.$$
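These identities are easy to check numerically. The sketch below assumes a small tridiagonal SPD matrix as $A$ and linear interpolation as $P$; both are illustrative choices, not part of the statement.

```python
import numpy as np

nc = 7
n = 2 * nc + 1
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # assumed SPD model matrix
P = np.zeros((n, nc))
for j in range(nc):
    P[2 * j:2 * j + 3, j] = [0.5, 1.0, 0.5]               # linear interpolation

A_hat = P.T @ A @ P                      # coarse-grid matrix
Q = np.linalg.solve(A_hat, P.T @ A)      # Q = A_hat^{-1} P^T A
PQ = P @ Q

def a_dot(x, y):
    """A-scalar product (x, y)_A = (A x, y)."""
    return x @ (A @ y)

rng = np.random.default_rng(1)
u, v = rng.standard_normal(n), rng.standard_normal(n)

is_projector = np.allclose(PQ @ PQ, PQ)
qp_is_identity = np.allclose(Q @ P, np.eye(nc))
a_orthogonal = np.isclose(a_dot(u - PQ @ u, PQ @ v), 0.0, atol=1e-10)
pythagoras = np.isclose(a_dot(u, u),
                        a_dot(u - PQ @ u, u - PQ @ u) + a_dot(PQ @ u, PQ @ u))
```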

## Smoother revisited

Consider any simple iteration method that converges:

$$u_k = u_{k-1} + \Phi (f - A u_{k-1}),$$

where $\Phi = \Phi^* > 0$.

The **smoother** is a matrix $$S= (I - \Phi A).$$

The matrix $S$ is not necessarily symmetric, but it is $A$-symmetric and has only real eigenvalues.


## Assumptions
1. The eigenvalues $\lambda(S) = \lambda (I - \Phi A)$ satisfy $$0 < \lambda(S) < 1.$$
2. The minimal eigenvalue of the matrix $\Phi$ is bounded from below by $ch$. 
   An **equivalent statement** is that
   $$(Au, u)_A \leq \frac{1}{ch}((I - S) u, u)_A.$$
3. Every fine-grid vector $u$ can be approximated by a prolonged coarse-grid vector $Pv$ with the estimate (interpolation property)
   $$\Vert u - Pv \Vert_2 \leq ch \Vert Au \Vert_2.$$

## Lemma

We will need the following lemma:

$$ \Vert (I - PQ) u \Vert^2_A \leq ch \Vert Au \Vert^2_2.$$

**Proof.**

Since $QP = I$, we have $(I - PQ) P v = 0$ for any coarse-grid vector $v$, and hence

$$(I - PQ) u = (I - PQ) (u - Pv),$$

therefore, because $I - PQ$ is an $A$-orthogonal projector,

$$\Vert(I - PQ) u \Vert^2_A \leq \Vert u - P v \Vert^2_A \leq \frac{c}{h} \Vert u - Pv \Vert^2_2 \leq ch \Vert Au \Vert^2_2.$$



## Smoother

For any integer $m > 0$ we have     
$$((I - S)S^m u, u)_A \leq \frac{1}{m} ((I - S^m) u, u)_A.$$

**Proof**

For any $j < m$ we have (since the eigenvalues of $S$ lie between $0$ and $1$)

$$((I - S) S^j u, u)_A \geq ((I - S) S^m u, u)_A.$$

Then it is enough to sum over $j$ and use the fact that

$$(I - S^m) = \sum_{j=0}^{m-1} (I - S)S^j.$$
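A quick numerical check of this inequality, as a sketch only: it assumes a Richardson smoother $S = I - \tau A$ with $\tau = 0.9/\lambda_{\max}(A)$ on a small tridiagonal model matrix, so that the eigenvalues of $S$ lie strictly between $0$ and $1$.

```python
import numpy as np

n = 31
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # assumed SPD model matrix
tau = 0.9 / np.linalg.eigvalsh(A).max()                   # keeps 0 < lambda(S) < 1
I = np.eye(n)
S = I - tau * A

def a_dot(x, y):
    """A-scalar product (x, y)_A = (A x, y)."""
    return x @ (A @ y)

rng = np.random.default_rng(2)
u = rng.standard_normal(n)

holds = []
for m in range(1, 9):
    Sm = np.linalg.matrix_power(S, m)
    lhs = a_dot((I - S) @ Sm @ u, u)          # ((I - S) S^m u, u)_A
    rhs = a_dot((I - Sm) @ u, u) / m          # (1/m) ((I - S^m) u, u)_A
    holds.append(lhs <= rhs + 1e-10)
```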

## Smoother (cont.)

The following estimate holds (it will be required to verify the smoothing property):

$$\Vert (I - PQ) S^m u \Vert^2_A \leq \frac{c}{m} ((I - S^{2m}) u, u)_A  $$

**Proof:**

We have

$$\Vert (I - PQ) S^m u \Vert^2_A \leq ch (A S^m u, S^m u)_A \leq c\,((I - S) S^m u, S^m u)_A = c\,((I - S) S^{2m} u, u)_A \leq \frac{c}{m} ((I - S^{2m}) u, u)_A.$$

Here we also used $A$-symmetry of $S$.

## Multigrid convergence analysis


Let us consider the multigrid case. Then the multigrid operator $B$, with error propagation operator $I - BA$, is defined recursively from the relation:

$$e = (I - B A) u = S^m (S^m u - P v) = S^m (S^m - P Q S^m + P (I - \widehat{B} \widehat{A})^s Q S^m) u.$$

For $s = 1$ we get $V$-cycle, for $s=2$ we get $W$-cycle.

Thus, we get that

$$I - BA = S^m \left(S^m - P Q S^m + P (I - \widehat{B} \widehat{A})^s Q S^m\right). $$

It relates the iteration matrix on the fine level to the iteration matrix on the coarse level.

By construction, both $B$ and $\widehat{B}$ are symmetric, thus the eigenvalues of $I - BA$ and $I - \widehat{B} \widehat{A}$ are real. If $I - \widehat{B} \widehat{A}$ has non-negative eigenvalues, then so does $I - B A$.

Indeed, if $I - \widehat{B}\widehat{A}$ has non-negative eigenvalues, we can take a square root:

$$F^2 = (I - \widehat{B} \widehat{A}),$$

and $F$ is $\widehat{A}$-symmetric, thus

$$\Vert F^s \Vert_{\widehat{A}} = \Vert F \Vert^s_{\widehat{A}}.$$

## Main inequality

From the main equality for the iteration matrix,

$$I - BA = S^m \left(S^m - P Q S^m + P (I - \widehat{B} \widehat{A})^s Q S^m\right), $$

we have the following upper bound:

$$((I - B A) u, u)_A \leq \Vert (I - PQ) S^m u \Vert^2_A + \Vert I - \widehat{B} \widehat{A} \Vert^s_{\widehat{A}} \Vert P Q S^m u \Vert^2_A. $$ 

## V-cycle analysis

V-cycle is obtained by setting $s = 1$.

Suppose that

$$\Vert I - \widehat{B} \widehat{A} \Vert_{\widehat{A}} \leq \gamma < 1.$$

Now, select a vector $u$ such that

$$((I - B A) u, u)_A = \Vert I - B A \Vert_A, \quad \Vert u \Vert_A = 1.$$

Also, 


$$\Vert PQ S^m u \Vert^2_A = \Vert S^m u \Vert^2_A - \Vert (I - P Q) S^m u \Vert^2_A.$$ 

Bounding each term we have

$$\Vert I - B A \Vert_A = ((I - B A) u, u)_A \leq  (1 - \gamma) \frac{c}{m}((I - S^{2m}) u, u)_A +  \gamma (S^{2m} u, u)_A.$$

We want to have 

$$\Vert (I - BA) \Vert_A \leq \gamma.$$

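If we have that, the bound carries over from the coarse level to the fine level, so it can be used recursively in the multigrid version.

If we have

$$(1 - \gamma) \frac{c}{m} \leq \gamma,$$

then we obviously have such an estimate, since $((I - S^{2m}) u, u)_A + (S^{2m} u, u)_A = \Vert u \Vert^2_A = 1$;

it requires

$$ \gamma \geq \frac{c}{c + m}.$$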
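To get a feeling for this threshold, one can tabulate it for a few numbers of smoothing steps. The value $c = 1$ below is purely an assumption for illustration, since the constant depends on the problem; `vcycle_gamma` is a hypothetical helper name.

```python
def vcycle_gamma(c, m):
    """Smallest gamma satisfying (1 - gamma) * c / m <= gamma."""
    return c / (c + m)

c = 1.0                                   # assumed value of the constant
gammas = {m: vcycle_gamma(c, m) for m in (1, 2, 4, 8)}

# At the threshold the condition holds with equality.
residuals = {m: (1.0 - g) * c / m - g for m, g in gammas.items()}
```

More smoothing steps $m$ give a smaller admissible $\gamma$, that is, a better guaranteed contraction factor.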

## Theorem

The following theorem has been proven.

If the multigrid matrix $B_j$ satisfies

$$\Vert I - B_j A_j \Vert_{A_j} \leq \gamma < 1, $$

and $$\gamma \geq \frac{c}{c + m},$$

then for all $k \geq j$ we have

$$\Vert I - B_k A_k \Vert_{A_k} \leq \gamma.$$

## W-cycle analysis

For the W-cycle we have

$$\gamma^2 \leq \frac{c}{c+m}.$$

## Simplest (uniform) example.

In the simplest case of 1D uniform grids and linear interpolation, we have 

$$\Phi = \tau I, \quad \tau = \frac{1}{\lambda_{\max} (A)}.$$

The only thing that needs to be tested is

$$\Vert u - P v \Vert_2 \leq ch \Vert A u \Vert_2,$$

where $P$ is the linear interpolation matrix

$$P = \begin{bmatrix} \frac{1}{2} & 0 & 0 \\
1 & 0 & 0 \\
\frac{1}{2} & \frac{1}{2} & 0 \\
0 & 1 & 0 \\
\vdots & \vdots & \vdots
\end{bmatrix}.$$

## Subspace correction methods

Multigrid methods can  be considered as a special case of more general **subspace correction methods**.

Suppose we want to solve 

$$A u = f,$$

and $u \in V$, and $V = V_1 + \ldots + V_m.$

Consider projection equations

$$Q_i A P_i u_i = Q_i f, \quad u_i \in V_i,$$

and $P_i$ is an $A$-orthogonal projector on $V_i$, and $Q_i$ is an orthogonal projector on $V_i$.

Then the operator

$$A_i = Q_i A P_i$$ satisfies

$$A_i P_i = Q_i A,$$

indeed

$$(A_i P_i x, y) = (A P_i x, Q_i y) = (A x, P_i Q_i y) = (Ax, Q_i y) = (Q_i A x, y). $$


Let $R_i$ be a preconditioner for the $i$-th equation. Then the matrix

$$B = R_1 Q_1 + \ldots + R_m Q_m$$

is a good **preconditioner** for the matrix $A$.
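As a minimal sketch of such an additive preconditioner, take non-overlapping coordinate blocks as the subspaces $V_i$ and exact subspace solves as $R_i$, which gives block Jacobi. The block size, the model matrix and the matrix form $E_i A_i^{-1} E_i^{\top}$ of $R_i Q_i$ are assumptions made for this example.

```python
import numpy as np

n, block = 32, 8
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # assumed SPD model matrix

B = np.zeros((n, n))
for start in range(0, n, block):
    E = np.eye(n)[:, start:start + block]   # basis of the coordinate subspace V_i
    A_i = E.T @ A @ E                       # operator restricted to V_i
    B += E @ np.linalg.solve(A_i, E.T)      # R_i Q_i with the exact solve R_i = A_i^{-1}

eig_A = np.linalg.eigvalsh(A)
eig_BA = np.sort(np.linalg.eigvals(B @ A).real)   # B A is similar to an SPD matrix

cond_A = eig_A.max() / eig_A.min()
cond_BA = eig_BA.max() / eig_BA.min()
```

For this example `cond_BA` comes out smaller than `cond_A`; how much is gained depends on the choice of subspaces and of $R_i$.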

## Subspace correction & multigrid

In the multigrid setting we have

$$V_1 \subset V_2 \subset \ldots \subset V.$$

and if $S_i$ is the smoother, we can take

$$R_i = (I - S^k_i) A^{-1}_i,$$

which is called **BPX-preconditioner**.

## Convergence theorem

Assumptions (checking can be quite cumbersome). 

- Every vector $v$ can be decomposed as $v = \sum_{i} v_i$ and 
  $$\sum_{i} (R^{-1}_i v_i, v_i) \leq K_0 (Av, v).$$
- For operators $$T_i = R_i A_i P_i $$ we have
  $$\sum_{i, j} (T_i x, T_j y)_A \leq K_1 \left( \sum_{i=1}^m (T_i x, x)_A \right)^{1/2}\left( \sum_{i=1}^m (T_i y, y)_A\right)^{1/2}.$$
  
  Then the ratio between the maximal and minimal eigenvalues of $BA$ does not exceed $K_0 K_1$.

## Algebraic multigrid

The concept of geometric multigrid requires additional knowledge about the coarse spaces and the interpolation.

The concept of algebraic multigrid, first introduced by Ruge and Stuben in 1987, is to work only with the **sparsity pattern** in the spirit of sparse solvers.

The main question of AMG is how to construct **coarse meshes** and **interpolation operators**, since everything else is already there.

## Smoother for AMG

As a smoother we can use weighted Jacobi, Gauss-Seidel, Incomplete LU.

## Smoothing property (again)

The operator $S$ is a smoother, if

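$$\Vert S v \Vert^2_A \leq \Vert v \Vert^2_A - \sigma \Vert v \Vert^2_2, \quad \sigma > 0. $$

Here $\Vert v \Vert_2$ is not the Euclidean norm but the residual-based norm $\Vert v \Vert^2_2 = (D^{-1} A v, A v)$ with $D = \mathrm{diag}(A)$.

It means that $S$ reduces the error efficiently when $\Vert v \Vert_2$ is relatively large with respect to $\Vert v \Vert_A$.

The error is called **algebraically smooth** if $\Vert v \Vert_2 \ll \Vert v \Vert_A.$

A damped Jacobi method satisfies the smoothing property.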
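The behaviour of damped Jacobi on smooth and oscillatory errors can be illustrated on the 1D Laplacian, where the eigenvectors are known explicitly. The damping parameter $\omega = 2/3$ and the model matrix are assumptions chosen for this sketch.

```python
import numpy as np

n = 31
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # assumed model matrix
omega = 2.0 / 3.0                                         # assumed damping parameter
S = np.eye(n) - omega * np.diag(1.0 / np.diag(A)) @ A     # damped Jacobi smoother

def a_norm(x):
    """A-norm ||x||_A = sqrt((A x, x))."""
    return np.sqrt(x @ (A @ x))

grid = np.arange(1, n + 1) / (n + 1)
smooth_mode = np.sin(np.pi * grid)        # lowest-frequency eigenvector of A
rough_mode = np.sin(n * np.pi * grid)     # highest-frequency eigenvector of A

smooth_ratio = a_norm(S @ smooth_mode) / a_norm(smooth_mode)
rough_ratio = a_norm(S @ rough_mode) / a_norm(rough_mode)
```

The oscillatory mode is damped by roughly a factor of three per step, while the smooth mode is almost unchanged; the latter is what has to be handled by the coarse-grid correction.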

## Coarsening

Finally, we have to talk about coarsening. Given the graph of a sparse matrix, we have to decide which nodes stay only on the fine mesh and which nodes also form the coarse mesh.

The nodes are called $F$-nodes and $C$-nodes.
 

## Strong coupling

A variable $i$ is said to be **strongly coupled** to the node $j$ if

$$|a_{ij}| \geq \varepsilon_{str} \max_{a_{ik} < 0} |a_{ik}|.$$
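A direct transcription of this criterion for a dense matrix might look as follows. The threshold $\varepsilon_{str} = 0.25$, the exclusion of the diagonal entry and the function name `strong_couplings` are assumptions of this sketch.

```python
import numpy as np

def strong_couplings(A, eps_str=0.25):
    """Return, for every i, the set of j with |a_ij| >= eps_str * max_{a_ik < 0} |a_ik|."""
    n = A.shape[0]
    strong = []
    for i in range(n):
        row = A[i].copy()
        row[i] = 0.0                               # the diagonal is not a coupling
        negative = np.abs(row[row < 0])
        if negative.size == 0:                     # no negative off-diagonal entries
            strong.append(set())
            continue
        threshold = eps_str * negative.max()
        strong.append({j for j in range(n)
                       if row[j] != 0.0 and abs(row[j]) >= threshold})
    return strong

# Small example with one weak connection between variables 0 and 2.
A = np.array([[ 4.0, -1.0, -0.1],
              [-1.0,  4.0, -1.0],
              [-0.1, -1.0,  4.0]])
strong = strong_couplings(A)
```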

## Standard coarsening procedure

A standard coarsening procedure: 

1. Take the first variable as a $C$-node.
2. All variables that are strongly coupled with it become $F$-nodes.
3. Pick another undecided variable as a $C$-node and add the variables strongly coupled to it to the $F$-nodes.

To control "uniformity" of C-nodes, some measure has to be used. 

Given a vertex $i$ and the list $S_i$ of variables that are strongly coupled to it, we count

$$\lambda_i = \left|S_i \cap U\right| + 2 \left|S_i \cap F\right|,$$

where $U$ is the set of all **undecided** variables, and $F$ is the set of all **fine-nodes**.

In the beginning, the set of fine nodes is empty, but then the balance is changed.

Let us try to illustrate it on Laplace.
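A minimal sketch for the 1D Laplace matrix, where every interior node is strongly coupled to its two neighbours. It is a simplified greedy variant of the procedure: the next $C$-node is the undecided node with the largest measure $\lambda_i$, and ties are broken by the smallest index (an assumption). The function name `standard_coarsening` is hypothetical.

```python
def standard_coarsening(strong):
    """Greedy C/F splitting driven by lambda_i = |S_i & U| + 2 |S_i & F|."""
    undecided = set(range(len(strong)))
    coarse, fine = set(), set()
    while undecided:
        measure = {i: len(strong[i] & undecided) + 2 * len(strong[i] & fine)
                   for i in undecided}
        # max() returns the first maximal element, so ties go to the smallest index
        i = max(sorted(undecided), key=lambda k: measure[k])
        undecided.remove(i)
        coarse.add(i)                          # i becomes a C-node
        new_fine = strong[i] & undecided       # its undecided strong neighbours
        fine |= new_fine                       # become F-nodes
        undecided -= new_fine
    return sorted(coarse), sorted(fine)

# 1D Laplace on 9 nodes: node i is strongly coupled to i - 1 and i + 1.
n = 9
strong = [{j for j in (i - 1, i + 1) if 0 <= j < n} for i in range(n)]
coarse, fine = standard_coarsening(strong)
```

For this example the procedure selects every second node as a $C$-node, which reproduces the usual geometric coarsening in 1D.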

## Prolongation

The prolongation in AMG is a part of the art. We have to interpolate the error from C-nodes to F-nodes.


## Discussion on algebraically smooth error

$$Se \approx e$$ also means, in terms of residuals, that

$$(D^{-1} r, r) \ll (r, e)$$

or in the index form it means that

$$a_{ii} e_i + \sum_{j \in N_i} a_{ij} e_j \approx 0.$$

## Prolongation

From the "algebraically smooth" error we have

$$e_i \approx -\frac{1}{a_{ii}} \sum_{j \in N_i} a_{ij} e_j.$$

If $N_i = P_i$ (the set of interpolation nodes), then we have

$$e_i = \sum_{j \in P_i} w_{ij} e_j,$$

with $$w_{ij} = - \frac{a_{ij}}{a_{ii}}.$$

However, the set of interpolation nodes $P_i$ is in general smaller than $N_i$, and we assume that the averages are similar:

$$\frac{1}{\sum_{j \in P_i} a_{ij}} \sum_{j \in P_i} a_{ij} e_j \approx \frac{1}{\sum_{j \in N_i} a_{ij}} \sum_{j \in N_i} a_{ij} e_j,$$

giving the following equations for the weights:

$$w_{ij} = -\left(\frac{\sum_{k \in N_i} a_{ik}}{\sum_{k \in P_i} a_{ik}}\right) \frac{a_{ij}}{a_{ii}}.$$

If the row sum is zero, we have

$$\sum_{j} w_{ij} = 1, $$

so at least the constant is preserved.
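The weight formula can be checked on a single row. The sketch below assumes a dense matrix, takes $N_i$ to be all off-diagonal nonzeros of row $i$, and uses a hypothetical helper `direct_interpolation_weights`; the example row is a 5-point-stencil-like row with zero row sum.

```python
import numpy as np

def direct_interpolation_weights(A, i, interp_nodes):
    """Weights w_ij for j in P_i, rescaled by the ratio of sums over N_i and P_i."""
    row = A[i]
    neighbors = [k for k in range(A.shape[0]) if k != i and row[k] != 0.0]   # N_i
    scale = sum(row[k] for k in neighbors) / sum(row[k] for k in interp_nodes)
    return {j: -scale * row[j] / row[i] for j in interp_nodes}

# Node 0 is coupled to nodes 1..4; only nodes 1 and 3 are interpolation nodes.
A = np.eye(5)
A[0] = [4.0, -1.0, -1.0, -1.0, -1.0]
A[1:, 0] = -1.0

weights = direct_interpolation_weights(A, 0, [1, 3])
weight_sum = sum(weights.values())
```

Since row 0 has zero row sum, the two weights are $1/2$ each and add up to one.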



## Summary
- Theory of the multigrid.

## Next lecture
- Iterative methods & preconditioners

In [40]:
from IPython.core.display import HTML
def css_styling():
    # Load the custom stylesheet and close the file afterwards
    with open("./styles/custom.css", "r") as f:
        styles = f.read()
    return HTML(styles)
css_styling()