# MLMC Estimation of the covariance (matrix 2x2)

Let's compute the MLMC estimate of the covariance $\Sigma = \text{Cov}(Y)$.

### Definitions :
As described in "Multilevel Monte Carlo covariance estimation for the computation of Sobol’ indices" ([Mycek & De Lozzo](https://hal.science/hal-01894503/document), (2.1)), we consider an abstract numerical simulator described by the function :

$$
\begin{align*}
    f_\ell : \mathcal{X}=\mathbb{R}^d & \rightarrow \mathbb{R}^2 \\
    X & \mapsto (v^T \cdot X , w^T \cdot X), \quad v,w \in \mathbb{R}^d, 
\end{align*}
$$

We denote $Y_\ell = (Y_{\ell,1},Y_{\ell,2}) =f_\ell(X)$.

Let $Y_L = (Y_{L,1},Y_{L,2}) =f_L(X)$ be the random vector produced by the numerical simulator with the highest fidelity $L$ (so $L \ge \ell$ for every level $\ell$).

We have :
$$\Sigma =
\text{cov}(Y_L) = 
\begin{bmatrix}
\text{cov}(Y_{L,1}, Y_{L,1}) & \text{cov}(Y_{L,1}, Y_{L,2}) & \cdots & \text{cov}(Y_{L,1}, Y_{L,d}) \\
\text{cov}(Y_{L,2}, Y_{L,1}) & \text{cov}(Y_{L,2}, Y_{L,2}) & \cdots & \text{cov}(Y_{L,2}, Y_{L,d}) \\
\vdots & \vdots & \ddots & \vdots \\
\text{cov}(Y_{L,d}, Y_{L,1}) & \text{cov}(Y_{L,d}, Y_{L,2}) & \cdots & \text{cov}(Y_{L,d}, Y_{L,d}) \\
\end{bmatrix}
$$

<details>
$$\Sigma =
\begin{bmatrix}
\text{Var}(Y_1) & \text{cov}(Y_1,Y_2) \\
\text{cov}(Y_2,Y_1) & \text{Var}(Y_2) \\
\end{bmatrix}$$
</details>


#### Sampling

Let $\ell \in \{ 0, \dots, L\}$ denote the level of fidelity of our numerical simulator $f_\ell$, and let $n_\ell$ be the number of samples evaluated at level $\ell$.

The sample sizes decrease as the fidelity increases : $n_0 > n_1 > \dots > n_\ell > \dots > n_L$.

We now draw a sample of $n \ge n_0$ inputs $\left(X^{(1)},\dots,X^{(n)}\right)$, $n\in\mathbb{N}$, and evaluate the first $n_\ell$ of them at each level of fidelity $\ell$ :
$$
\begin{align}
\left(Y_\ell^{(1)},\dots,Y_\ell^{(n_\ell)}\right) & = \left(f_\ell(X^{(1)}), \dots, f_\ell(X^{(n_\ell)}) \right) \\
& = \left( \begin{pmatrix} Y_{\ell,1}^{(1)} \\[7pt] Y_{\ell,2}^{(1)} \end{pmatrix}, \dots, \begin{pmatrix} Y_{\ell,1}^{(n_\ell)} \\[7pt] Y_{\ell,2}^{(n_\ell)} \end{pmatrix}  \right)
\end{align}
$$

Once we have this sample, we want to compute $\hat{\Sigma}_{\ell}$. For each level of fidelity $\ell$, we proceed as follows :

1. Compute the empirical mean $\hat{\mu}_{\ell}$ of the random vector by averaging each component over the $n_\ell$ samples. For $d$ components, we get $d$ empirical means :
    $$
    \hat{\mu}_{\ell,i} = \frac{1}{n_{\ell}} \sum_{k=1}^{n_{\ell}}(Y_{\ell,i}^{(k)})
    $$
    Here, $i \in \{1,2\}$ because $d=2$ in our case. So we have 2 empirical means.

2. Then we estimate the covariances for all $(i,j) \in \{1,\dots,d \}^2$ :
    $$
    \hat{\Sigma}_{\ell,ij} = \frac{1}{n_\ell} \sum_{k=1}^{n_\ell}\left(Y_{\ell,i}^{(k)}-\hat{\mu}_{\ell,i} \right)\left(Y_{\ell,j}^{(k)}-\hat{\mu}_{\ell,j} \right)
    $$

    We finally have our $(d \times d)$ matrix :
    $$
    \hat{\Sigma}_\ell =
    \begin{bmatrix}
    \hat{\Sigma}_{\ell,11} & \hat{\Sigma}_{\ell,12} & \cdots & \hat{\Sigma}_{\ell,1d} \\
    \hat{\Sigma}_{\ell,21} & \hat{\Sigma}_{\ell,22} & \cdots & \hat{\Sigma}_{\ell,2d} \\
    \vdots & \vdots & \ddots & \vdots \\
    \hat{\Sigma}_{\ell,d1} & \hat{\Sigma}_{\ell,d2} & \cdots & \hat{\Sigma}_{\ell,dd} \\
    \end{bmatrix}
    $$

    Which, in our case for $d=2$, is :

    $$\hat{\Sigma}_\ell =
    \begin{bmatrix}
    \hat{\Sigma}_{\ell,11}  & \hat{\Sigma}_{\ell,12} \\
    \hat{\Sigma}_{\ell,21}  & \hat{\Sigma}_{\ell,22} \\
    \end{bmatrix}$$
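
The two steps above can be sketched with NumPy. This is only an illustration under stated assumptions: the Gaussian inputs, the vectors `v` and `w`, the input dimension `d_in` and the sample size are all hypothetical choices, not values taken from the references.

```python
import numpy as np

def empirical_covariance(Y):
    """Biased (1/n) empirical mean and covariance of a sample Y of shape (n, d)."""
    Y = np.asarray(Y, dtype=float)
    n = Y.shape[0]
    mu_hat = Y.mean(axis=0)                 # step 1: one empirical mean per component
    centered = Y - mu_hat                   # subtract the mean from every sample
    sigma_hat = centered.T @ centered / n   # step 2: (d, d) matrix of covariances
    return mu_hat, sigma_hat

# Hypothetical setup: standard Gaussian inputs and a linear simulator (assumption).
rng = np.random.default_rng(0)
d_in = 5
v, w = rng.normal(size=d_in), rng.normal(size=d_in)
X = rng.normal(size=(1000, d_in))
Y = np.column_stack((X @ v, X @ w))         # each row is (v^T X, w^T X)

mu_hat, sigma_hat = empirical_covariance(Y)
print(sigma_hat)
```

The result should agree with `np.cov(Y, rowvar=False, bias=True)`, which uses the same $1/n_\ell$ normalisation.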

### 1. Classic Estimator of the covariance matrix :

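To estimate $\Sigma$ with the classic (single-level) method, we apply the procedure of the Sampling section at the highest level only, $\ell = L$, so that $\hat{\Sigma}_L$ is computed from the $n_L$ high-fidelity samples.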

### 2. MLMC Covariance Estimation in Euclidean Geometry

Based on the work of Aimee Maurais, Terrence Alsup, Benjamin Peherstorfer, and Youssef Marzouk, [“Multi-Fidelity Covariance Estimation in the Log-Euclidean Geometry”](https://arxiv.org/pdf/2301.13749.pdf) (**Section 2**), and on "Multilevel Monte Carlo covariance estimation for the computation of Sobol’ indices" ([Mycek & De Lozzo](https://hal.science/hal-01894503/document), (2.22, 2.23, 2.24)), the MLMC covariance estimator in Euclidean geometry, denoted $\hat{\Sigma}_{L}^{EMF}$, is :

$$
\hat{\Sigma}_{L}^{EMF} = \hat{\Sigma}_{n_0}[Y_0] + \sum_{\ell=1}^{L}\left(\hat{\Sigma}_{n_\ell} [Y_\ell] - \hat{\Sigma}_{n_\ell}[Y_{\ell-1}] \right) \qquad \text{with} \; n_{\ell-1} \ge n_\ell
$$
where the entries $\hat{\Sigma}_{n_\ell,ij} [Y_{\ell'}]$ of the matrix $\hat{\Sigma}_{n_\ell} [Y_{\ell'}]$ are defined $\forall(i,j) \in \{1,\dots,d\}^2$ as :

$$
\begin{align}
\hat{\Sigma}_{n_\ell,ij} [Y_{\ell'}] & = \frac{n_\ell}{n_\ell-1} \cdot \frac{1}{n_\ell} \sum_{k=1}^{n_\ell} \left( Y_{\ell',i}^{(k)} - \hat{\mathbb{E}}_{\ell,i}[Y_{\ell',i}] \right) \left( Y_{\ell',j}^{(k)} - \hat{\mathbb{E}}_{\ell,j}[Y_{\ell',j}] \right) \\
& = \frac{n_\ell}{n_\ell-1} \left[ \frac{1}{n_\ell} \sum_{k=1}^{n_\ell} Y_{\ell',i}^{(k)} Y_{\ell',j}^{(k)} - \left( \frac{1}{n_\ell} \sum_{k=1}^{n_\ell} Y_{\ell',i}^{(k)} \right) \left( \frac{1}{n_\ell} \sum_{k=1}^{n_\ell} Y_{\ell',j}^{(k)} \right) \right]
\end{align}
$$

where $\hat{\mathbb{E}}_{\ell,i}[Y_{\ell',i}] = \frac{1}{n_\ell} \sum_{k=1}^{n_\ell} Y_{\ell',i}^{(k)}$ is the empirical mean over the $n_\ell$ samples, $f_{\ell'}(X^{(k)}) = \left( Y_{\ell',1}^{(k)}, \dots, Y_{\ell',d}^{(k)} \right)$, and $X^{(k)}$ is the $k$-th element of our sample of size $n_\ell$.
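
The centered and expanded forms of the unbiased sample covariance can be compared numerically. The sketch below is illustrative only: the function names and the synthetic, deliberately non-centered sample are assumptions made for the example.

```python
import numpy as np

def cov_centered_form(Y):
    """n/(n-1) times the average of products of centered components."""
    n = Y.shape[0]
    centered = Y - Y.mean(axis=0)
    return (n / (n - 1)) * (centered.T @ centered) / n

def cov_expanded_form(Y):
    """n/(n-1) times (mean of products minus product of means)."""
    n = Y.shape[0]
    mean_of_products = Y.T @ Y / n
    mean = Y.mean(axis=0)
    return (n / (n - 1)) * (mean_of_products - np.outer(mean, mean))

# Synthetic sample with a non-zero mean (assumption, for illustration only).
rng = np.random.default_rng(1)
Y = rng.normal(size=(200, 2)) + np.array([3.0, -1.0])

sigma_a = cov_centered_form(Y)
sigma_b = cov_expanded_form(Y)
print(np.allclose(sigma_a, sigma_b))
```

Both forms should also match `np.cov(Y, rowvar=False)`, whose default normalisation is $1/(n_\ell-1)$. The expanded form needs only running sums, but it can lose precision when the mean is large compared with the standard deviation.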

In practice, we use the second (expanded) form. If $Y$ is a centered vector, meaning that its mean is known to be zero, $\mathbb{E}[Y_{\ell',i}] = 0$ for $i \in \{1,\dots,d\}$, the empirical means need not be estimated and the equation becomes :

$$
\begin{align}
\hat{\Sigma}_{n_\ell,ij} [Y_{\ell'}] & = 1 \cdot \frac{1}{n_\ell} \sum_{k=1}^{n_\ell} Y_{\ell',i}^{(k)} Y_{\ell',j}^{(k)}
\end{align}
$$

where the correction factor $\frac{n_\ell}{n_\ell-1}$ is replaced by $\frac{n_\ell}{n_\ell} = 1$, since no degree of freedom is spent on estimating the mean.
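
When the mean is known to be zero, the estimator is simply the average of the products $Y_i Y_j$. A minimal sketch, assuming a synthetic zero-mean Gaussian sample (the function name and the data are hypothetical):

```python
import numpy as np

def cov_known_zero_mean(Y):
    """Covariance estimate for a sample whose true mean is known to be zero."""
    Y = np.asarray(Y, dtype=float)
    return Y.T @ Y / Y.shape[0]   # (1/n) * sum_k Y^(k) Y^(k)^T, no mean subtracted

# Zero-mean synthetic sample (assumption, for illustration only).
rng = np.random.default_rng(3)
Y = rng.normal(size=(500, 2))
sigma_centered = cov_known_zero_mean(Y)
print(sigma_centered)
```

This shortcut only applies when the true mean is zero. On a sample whose mean is not zero, it overestimates each entry by the product of the corresponding means.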

Once we have those matrices, we can use the MLMC Euclidean Estimator $\hat{\Sigma}_{L}^{EMF}$ or the MLMC Log-Euclidean Estimator $\hat{\Sigma}_{L}^{LEMF}$ (Eq. (7), [Aimee Maurais, Terrence Alsup, Benjamin Peherstorfer, and Youssef Marzouk](https://arxiv.org/pdf/2301.13749.pdf)) described in the next section.
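
The telescoping sum of the Euclidean estimator can be sketched as follows. Everything specific here is an assumption made for illustration: `toy_simulator` is a hypothetical family of linear simulators whose perturbation shrinks as $2^{-\ell}$, the inputs are standard Gaussian, and each level reuses the first $n_\ell$ inputs as in the Sampling section (drawing independent inputs for each level is a common alternative).

```python
import numpy as np

def toy_simulator(X, level, v, w, a, b):
    """Hypothetical level-`level` simulator: a linear map whose
    perturbation (a, b) shrinks as the fidelity level grows."""
    eps = 2.0 ** (-level)
    return np.column_stack((X @ (v + eps * a), X @ (w + eps * b)))

def sample_cov(Y):
    """Unbiased sample covariance of a sample Y of shape (n, d)."""
    return np.cov(Y, rowvar=False)

def mlmc_euclidean_cov(X, sizes, simulate):
    """Telescoping estimator; sizes[l] = n_l, simulate(X, level) -> (n, d)."""
    estimate = sample_cov(simulate(X[:sizes[0]], 0))       # coarsest level
    for level in range(1, len(sizes)):
        X_l = X[:sizes[level]]                             # first n_l inputs
        fine = sample_cov(simulate(X_l, level))
        coarse = sample_cov(simulate(X_l, level - 1))      # same inputs, coarser model
        estimate = estimate + (fine - coarse)              # level correction
    return estimate

rng = np.random.default_rng(2)
d_in = 4
v, w, a, b = (rng.normal(size=d_in) for _ in range(4))
sizes = [4000, 1000, 250]                                  # n_0 > n_1 > n_2
X = rng.normal(size=(sizes[0], d_in))
simulate = lambda X_in, level: toy_simulator(X_in, level, v, w, a, b)

sigma_emf = mlmc_euclidean_cov(X, sizes, simulate)
print(sigma_emf)
```

Because the estimator is built from differences of covariance matrices, the result is symmetric but not guaranteed to be positive definite, which is one motivation for the Log-Euclidean variant.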

### 3. MLMC Covariance Estimation in Log-Euclidean Geometry

Based on the work of [Aimee Maurais, Terrence Alsup, Benjamin Peherstorfer, and Youssef Marzouk](https://arxiv.org/pdf/2301.13749.pdf) (Eq. (7)), we have the MLMC Log-Euclidean Estimator $\hat{\Sigma}_{L}^{LEMF}$ :