In [None]:
from IPython.html.services.config import ConfigManager
from IPython.utils.path import locate_profile
cm = ConfigManager(profile_dir=locate_profile(get_ipython().profile))
cm.update('livereveal', {
              'theme': 'sky',
              'transition': 'zoom',
              'start_slideshow_at': 'selected',
})

# Lecture 13. Sparse grids

## (Approximate) Syllabus
- **Week 1:** Intro & basic integral equations (turning PDEs into IEs, typical kernels, Nystrom, collocation, Galerkin, quadrature for singular/hypersingular integrals).
- **Week 2:** Translation-invariant kernels and convolutions, FFT. Concept of close and far interactions, precorrected FFT. Barnes-Hut method.
- **Week 3:**  Fast multipole methods. Algebraic analogue of fast multipole method, hierarchical matrices
- **Week 4:**  Multigrid methods, domain decomposition, sparse grids

## Previous lecture
- Finalize fast direct solvers
- Basics of Domain decomposition


## Today's lecture

- I will talk about sparse grids.

## History of sparse grids

The idea of sparse grids was first proposed by **Smolyak** in the paper

Smolyak A. S. Quadrature and interpolation formulas on tensor products of certain classes of functions // Dokl. Akad. Nauk SSSR. 1963. Vol. 148, No. 5. P. 1042–1053.

It was later popularized by German mathematicians in the 1990s.

## Model problem

Consider a model problem


$$ \Delta u = f.$$

We use rectangular elements and **uniform grids** with step sizes $h_x$ and $h_y$.


We will use **different mesh sizes** in the two directions.

## Error expression 

The Taylor expansion of the error with respect to the grid sizes contains only even powers:

$$u^{h_x, h_y} = u + h^2_x e^{2, 0} + h^2_y e^{0, 2} + h^4_x e^{4, 0} + h^4_y e^{0, 4} + h^2_x h^2_y e^{2, 2} + \ldots$$

The functions $e^{i, j}$ depend on $x, y$ but not on $h_x$ or $h_y$.

## Richardson extrapolation

To see where sparse grids come from, we recall the concept of Richardson extrapolation.

Suppose $h_x = h_y = h$. Then

$$u_h = u + h^2 e_1 + h^4 e_2 + \ldots. $$


Then we can define **extrapolated solutions** as 

$$u^{h, h}_0 = u^{h, h},$$

$$u^{h, h}_k = \frac{2^{2k}}{2^{2k} - 1} I^{h, h}_{h/2, h/2} u^{h/2, h/2}_{k-1} + \frac{1}{1 - 2^{2k}} u^{h, h}_{k-1},$$

where $I$ is the **injection operator**.

That is, we solve on a finer mesh, bring the solution back to the coarse mesh, and combine the two to eliminate the **leading error terms**.

The order is then $\mathcal{O}(h^{2k + 2}).$
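The recursion can be tried on a scalar model problem. The sketch below is only an illustration under stated assumptions: instead of a grid function it extrapolates the central difference quotient of $\sin$ at $x = 1$, whose error also expands in even powers of $h$, so the injection operator reduces to the identity. The names `central_diff` and `richardson` are made up for this example.

```python
import math

def central_diff(f, x, h):
    # Second-order approximation of f'(x); its error expands in even powers of h.
    return (f(x + h) - f(x - h)) / (2.0 * h)

def richardson(f, x, h, k):
    # Level-k extrapolated value, following
    # u_k(h) = 4^k / (4^k - 1) * u_{k-1}(h/2) + 1 / (1 - 4^k) * u_{k-1}(h).
    if k == 0:
        return central_diff(f, x, h)
    c = 4.0 ** k
    fine = richardson(f, x, h / 2.0, k - 1)
    coarse = richardson(f, x, h, k - 1)
    return c / (c - 1.0) * fine + coarse / (1.0 - c)

exact = math.cos(1.0)
errors = [abs(richardson(math.sin, 1.0, 0.4, k) - exact) for k in range(3)]
for k, err in enumerate(errors):
    print(k, err)
```

Each extrapolation level should reduce the error by roughly two more powers of $h$, at the price of evaluating the approximation at a halved step size.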

## Multivariate Richardson extrapolation

The key idea of multivariate Richardson extrapolation is to combine solutions on several anisotropically refined meshes, each **coarser** than the fully refined one, so that the leading error terms cancel:


$$u^{h_x, h_y}_{extra} := \sum_{i=0}^p \sum_{j=0}^q \alpha_{ij} I^{h_x, h_y}_{2^{-i} h_x, 2^{-j} h_y} u^{2^{-i} h_x, 2^{-j} h_y}.$$

## Example of multivariate Richardson
For example, to eliminate $h^2_x$ and $h^2_y$ we may use a combination like

$$u^{h_x, h_y}_{extra} = \frac{4}{3} I^{h_x, h_y}_{h_x/2, h_y} u^{h_x/2, h_y} + \frac{4}{3} I^{h_x, h_y}_{h_x, h_y/2} u^{h_x, h_y/2} - \frac{5}{3} u^{h_x, h_y}.$$

The error of the extrapolation scheme is 

$$\mathcal{O}(h^2_x h^2_y + h^4_x + h^4_y).$$

This scheme requires solving problems with $N$, $2N$ and $2N$ unknowns, whereas standard **Richardson extrapolation** requires solving problems with $N$ and $4N$ unknowns. A single step involves $5N$ unknowns in both cases; the difference in complexity shows up when the extrapolation is repeated.
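The cancellation can be checked with exact rational arithmetic. The sketch below assumes a synthetic approximate solution whose exact value is zero and whose error consists only of the terms up to fourth order from the expansion; the injection operator is omitted because the model is a single number. The helper names `model` and `extrapolated` are illustrative.

```python
from fractions import Fraction as F

def model(hx, hy, e20=0, e02=0, e40=0, e04=0, e22=0):
    # Synthetic u^{hx,hy} with exact solution u = 0 and prescribed error coefficients.
    return (hx**2 * e20 + hy**2 * e02 + hx**4 * e40
            + hy**4 * e04 + hx**2 * hy**2 * e22)

def extrapolated(hx, hy, **e):
    # 4/3 u^{hx/2,hy} + 4/3 u^{hx,hy/2} - 5/3 u^{hx,hy}
    return (F(4, 3) * model(hx / 2, hy, **e)
            + F(4, 3) * model(hx, hy / 2, **e)
            - F(5, 3) * model(hx, hy, **e))

hx, hy = F(1, 4), F(1, 8)
print(extrapolated(hx, hy, e20=1, e02=1))             # second-order terms cancel: 0
print(extrapolated(hx, hy, e22=1) / (hx**2 * hy**2))  # mixed term survives: -1
print(extrapolated(hx, hy, e40=1) / hx**4)            # fourth-order term survives: -1/4
```

The surviving mixed and fourth-order terms match the stated error $\mathcal{O}(h^2_x h^2_y + h^4_x + h^4_y)$.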


## Multivariate Richardson extrapolation (2)

Multivariate extrapolation has better complexity if we **extrapolate many times**. The error terms form the table

$$\begin{bmatrix} 1 & h^2_x & h^4_x & h^6_x & \ldots \\
 h^2_y & h^2_x h^2_y & \ldots \\
 h^4_y & h^2_x h^4_y & \ldots \\
 h^6_y & \ldots 
 \end{bmatrix}
 $$
 
 And the approximate solution table has the form
 
$$\begin{bmatrix} u^{h_x, h_y} & u^{h_x/2, h_y} & u^{h_x/4, h_y} & u^{h_x/8, h_y} & \ldots \\
 u^{h_x, h_y/2} & u^{h_x/2, h_y/2} & \ldots \\
 u^{h_x, h_y/4} & u^{h_x/2, h_y/4} & \ldots \\
 & \ldots 
 \end{bmatrix}
 $$
 

If we want to eliminate the error terms $h^{2i}_x h^{2j}_y$ for $2i + 2j \leq 2p$, then using the triangular structure we can combine the solutions

$$u^{h_x 2^{-i}, h_y 2^{-j}}$$

for $i + j \leq p$.
 

## Work and complexity estimates

Assume optimal solvers, i.e. solvers with $\mathcal{O}(N)$ complexity for a problem with $N$ unknowns. Then computing

$$u^{h_x 2^{-i}, h_y 2^{-j}}$$

involves $2^{i+j} N$ unknowns and costs $\mathcal{O}(2^{i+j} N)$ operations.

The total work is then

$$W = \sum_{i=0}^p \sum_{j=0}^{p-i} 2^{(i+j)} N = N (2^{p+1} p +1 ) = N \mathcal{O}(p 2^p),$$

as compared to

$$W = \sum_{i=0}^p 4^i N = N \left(\frac{4^{p+1}}{3} - \frac{1}{3} \right)  = N \mathcal{O}(4^p)$$

for standard Richardson extrapolation with the same formal order.
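Both closed forms are easy to confirm by brute-force summation. The following sketch counts unknowns only, with $N = 1$ by default; the function names are chosen for this example.

```python
def work_multivariate(p, n=1):
    # Sum of 2^(i+j) * N over the triangle i + j <= p.
    return sum(2 ** (i + j) * n for i in range(p + 1) for j in range(p - i + 1))

def work_richardson(p, n=1):
    # Sum of 4^i * N over the isotropically refined grids.
    return sum(4 ** i * n for i in range(p + 1))

for p in range(1, 8):
    assert work_multivariate(p) == 2 ** (p + 1) * p + 1
    assert work_richardson(p) == (4 ** (p + 1) - 1) // 3
    print(p, work_multivariate(p), work_richardson(p))
```

For $p = 1$ both counts equal $5N$; the gap opens quickly as $p$ grows.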


## Combination technique

Another approach is the combination technique, proposed by Griebel et al.

We assume the **error splitting**

$$ u^{h_x, h_y} - u  = e_x(h_x) + e_y(h_y) + R(h_x, h_y), $$

where $e_x$ depends only on $h_x$, $x$, $y$, whereas $e_y$ depends only on $h_y$, $x$, $y$.

Furthermore $$|R(h_x, h_y)| \leq c (h_x h_y)^{\nu}.$$


## Combination technique(2)

Assume that

$$e_x(h_x) = \mathcal{O}(h^2_x), \quad e_y(h_y) = \mathcal{O}(h^2_y).$$

The combined solution is

$$\widehat{u}^{h_x, h_y}_k = \sum_{i=1}^k u^{h_x 2^{-i}, h_y 2^{i-k-1}} - \sum_{i=1}^{k-1} u^{h_x 2^{-i}, h_y 2^{i-k}}.$$

Note that the combined solution is defined not on a regular full grid, but on a **sparse grid**.

In the **triangular scheme** it corresponds to summing all the solutions along one diagonal and subtracting all the solutions along the adjacent coarser diagonal.
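The effect of the combination can be observed on a simpler stand-in problem. The sketch below is not a PDE solve: it assumes that the "solution" on each anisotropic grid is the tensor-product trapezoidal approximation of $\int_0^1\int_0^1 e^{x+y}\,dx\,dy$, which has an error splitting of the same type, and it takes $h_x = h_y = 1$. The function names are illustrative.

```python
import math

def trapezoid_2d(f, nx, ny):
    # Tensor-product trapezoidal rule on [0, 1]^2 with nx * ny cells.
    total = 0.0
    for a in range(nx + 1):
        wx = 0.5 if a in (0, nx) else 1.0
        for b in range(ny + 1):
            wy = 0.5 if b in (0, ny) else 1.0
            total += wx * wy * f(a / nx, b / ny)
    return total / (nx * ny)

def combined(f, k):
    # Grids with steps (2^-i, 2^(i-k-1)) enter with +1,
    # grids with steps (2^-i, 2^(i-k)) enter with -1.
    plus = sum(trapezoid_2d(f, 2 ** i, 2 ** (k + 1 - i)) for i in range(1, k + 1))
    minus = sum(trapezoid_2d(f, 2 ** i, 2 ** (k - i)) for i in range(1, k))
    return plus - minus

def integrand(x, y):
    return math.exp(x + y)

exact = (math.e - 1.0) ** 2
for k in (2, 4, 6):
    print(k, abs(combined(integrand, k) - exact))
```

Every grid used here has about $2^{k+1}$ cells, yet the combined error should behave like that of a full grid with step $2^{-k}$ in both directions, up to the remainder term.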

## Error bound

Substituting the error splitting into the combination formula gives

$$\widehat{u}^{h_x, h_y}_k - u = e_x(2^{-k} h_x) + e_y(2^{-k} h_y) + \widehat{R},$$

where

$$|\widehat{R}| \leq \left| \sum_{i=1}^k R(2^{-i} h_x, 2^{i-k-1} h_y) - \sum_{i=1}^{k-1} R(2^{-i} h_x, 2^{i-k} h_y) \right| \leq \widehat{c} \left(k 2^{-(k+1) \nu} + (k-1) 2^{-k \nu} \right) (h_x h_y)^{\nu}.$$

## Sparse grids

Sparse grids are closely related to the combination technique. The sparse grid is defined as the union of the anisotropic full grids used in the combination:

$$G^{h_x, h_y}_k := \bigcup_{i=1}^k G^{2^{-i} h_x, 2^{i-k-1} h_y}.$$

The picture below shows $G^{(1, 1)}_7.$

<img src='pic/sparse-grid.png' />
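The union in the definition can be built directly. This is a minimal sketch assuming $h_x = h_y = 1$ on the unit square, with points stored as integer coordinates on the finest mesh ($2^k$ cells per direction) to avoid floating-point duplicates. The function name `sparse_grid` is made up for this example.

```python
def sparse_grid(k):
    # Union over i = 1..k of the full grids with steps (2^-i, 2^(i-k-1)).
    n = 2 ** k
    points = set()
    for i in range(1, k + 1):
        step_x = 2 ** (k - i)   # mesh size 2^-i in units of 2^-k
        step_y = 2 ** (i - 1)   # mesh size 2^(i-k-1) in units of 2^-k
        for a in range(0, n + 1, step_x):
            for b in range(0, n + 1, step_y):
                points.add((a, b))
    return points

for k in range(2, 8):
    print(k, len(sparse_grid(k)), (2 ** k + 1) ** 2)
```

The second printed column is the number of sparse grid points, the third the size of the full grid with $2^k$ cells per direction.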

## Interpolation approximation

We compute the values on the sparse grid and use the combination technique to compute the approximation.

The typical behaviour of the $L_2$ error is $\mathcal{O}(h^2 | \log h|)$, where $h$ is the **minimal step size**,

whereas the total number of points in $d$ dimensions grows as

$$\mathcal{O}\left(h^{-1} \left(\log h^{-1}\right)^{d-1}\right),$$

compared with $\mathcal{O}(h^{-d})$ for the full grid.

## Application to FEM

We can also interpret the discrete system as an approximation of the energy functional, in which the true solution $u$ is replaced by its combination-technique approximation.

$$E^{h_x, h_y} = \frac{1}{2} (u, A u) - (f, u).$$

In the most straightforward way, the Richardson extrapolation gives 

$$A^{h_x, h_y}_k u_k = f_k,$$


where $$A^{h_x, h_y}_0 = A^{h_x, h_y},$$

$$A^{h_x, h_y}_k = \frac{2^{2k}}{2^{2k} - 1}A^{h_x, h_y}_{k-1} + \frac{1}{1-2^{2k}}I^{\top} A^{2h_x, 2h_y}_{k-1}  I.$$

## Combination technique and nested basis

The combination technique can be explained using the **nested basis idea**.

Given a sequence of nested spaces $V_l$, we construct the **difference spaces**

$$W_l = V_l \setminus \bigoplus_{t=1}^d V_{l - e_t},$$

where $e_t$ is the $t$-th column of the identity matrix.

Given the difference spaces, we can define the hierarchical decomposition

$$V_l = \bigoplus_{k \leq l} W_k,$$

where the inequality $k \leq l$ is understood componentwise.



## ANOVA decomposition

Sparse grids are closely related to the ANOVA decomposition of a function:

$$f(x_1, \ldots, x_d) = f_1(x_1) + \ldots + f_d(x_d) + f_{12}(x_1, x_2) + f_{13}(x_1, x_3) + \ldots, $$

i.e. we first sum single terms, then pairwise interactions, then triples.

How can we find the terms of this decomposition? What do you think?

There are two ways of doing so: by integration and by interpolation.
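The integration route can be sketched for $d = 2$. The example below rests on several assumptions that go beyond the formula above: the domain is $[0,1]^2$, a constant term $f_0$ (the mean of $f$) is included in the decomposition, and all integrals are approximated by the midpoint rule on an $n \times n$ grid. The function name `anova_2d` is illustrative.

```python
def anova_2d(f, n=64):
    # Midpoint-rule nodes on [0, 1]; every integral becomes an average over them.
    xs = [(a + 0.5) / n for a in range(n)]
    vals = [[f(x1, x2) for x2 in xs] for x1 in xs]
    # Constant term: mean of f over the square.
    f0 = sum(sum(row) for row in vals) / n ** 2
    # Univariate terms: integrate out the other variable, then subtract f0.
    f1 = [sum(row) / n - f0 for row in vals]
    f2 = [sum(vals[a][b] for a in range(n)) / n - f0 for b in range(n)]
    # Pairwise interaction: whatever is left.
    f12 = [[vals[a][b] - f0 - f1[a] - f2[b] for b in range(n)] for a in range(n)]
    return f0, f1, f2, f12

f0, f1, f2, f12 = anova_2d(lambda x1, x2: x1 ** 2 + 3.0 * x2, n=16)
print(max(abs(v) for row in f12 for v in row))  # interaction of an additive function
```

For an additive function the interaction term vanishes up to rounding, while for $f = x_1 x_2$ it equals $(x_1 - \tfrac12)(x_2 - \tfrac12)$ at the grid nodes.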

## Literature 



[Sparse grids in a nutshell](http://garcke.ins.uni-bonn.de/research/pub/sparse_grids_nutshell.pdf)

## Next lecture
- Wavelets & Tensors

In [40]:
from IPython.core.display import HTML
def css_styling():
    # Load the custom stylesheet that ships with the notebook
    with open("./styles/custom.css", "r") as f:
        styles = f.read()
    return HTML(styles)
css_styling()