# Distributions

Suppose we are trying to emulate a simulator's outputs $f(x) \in \mathbb{R}^d$. For a batch of $n$ outputs, we want to retrieve the output mean $\mu \in \mathbb{R}^{nd}$ and covariance $\Sigma \in \mathbb{R}^{nd \times nd}$:
$$
\Sigma =
\begin{bmatrix}
\Sigma_{11} & \Sigma_{12} & \cdots & \Sigma_{1n} \\[6pt]
\Sigma_{21} & \Sigma_{22} & \cdots & \Sigma_{2n} \\[6pt]
\vdots      & \vdots      & \ddots & \vdots      \\[6pt]
\Sigma_{n1} & \Sigma_{n2} & \cdots & \Sigma_{nn}
\end{bmatrix} \in \mathbb{R}^{nd \times nd}
\quad \text{s.t.} \quad 
\Sigma_{ij} \in \mathbb{R}^{d \times d}
$$
where $\Sigma_{ii}$ is the *marginal covariance* of sample $i$ and $\Sigma_{ij}$ (for $i \neq j$) is the *cross-covariance* between samples $i$ and $j$.
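
To make the block notation concrete, here is a minimal sketch of reading $\Sigma_{ij}$ out of a full covariance. It uses plain `torch` only (not the `autoemulate` classes) and assumes the $nd$ entries are ordered sample by sample, i.e. entry $(i, a)$ of the $(n, d)$ outputs sits at flat index $i \cdot d + a$. The variable names are illustrative.

```python
import torch

torch.manual_seed(0)
n, d = 4, 3

# A random symmetric positive semi-definite (nd, nd) covariance.
a = torch.randn(n * d, n * d, dtype=torch.float64)
sigma = a @ a.T

# View the matrix as (n, d, n, d): blocks[i, :, j, :] is Sigma_ij (0-indexed).
blocks = sigma.reshape(n, d, n, d)

# Cross-covariance between samples 0 and 1.
cross_01 = blocks[0, :, 1, :]           # (d, d)

# All marginal covariances Sigma_ii, stacked along the first axis.
idx = torch.arange(n)
marginals = blocks[idx, :, idx, :]      # (n, d, d)
```

Under this ordering the stacked `marginals` tensor has the same $(n, d, d)$ layout as the block-diagonal covariance used further down.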

There are four scenarios we want to consider:
1. **Full**: no simplification to the above covariance matrix.
2. **Block-diagonal**: $\Sigma_{ij} = 0$ for all $i\neq j$, i.e. no correlations between samples.
3. **Diagonal**: $\Sigma_{ij} = 0$ for all $i\neq j$ and $\Sigma_{ii}^{(a, b)} = 0$ for all $a \neq b$, i.e. no correlations between samples or between output dimensions.
4. **Separable**: $\Sigma = \Sigma_{\text{N}} \otimes \Sigma_{\text{D}}$ s.t. $\Sigma_{\text{N}} \in \mathbb{R}^{n \times n}$ and $\Sigma_{\text{D}} \in \mathbb{R}^{d \times d}$, i.e. correlations between samples and correlations between dimensions are modelled separately.

Note that:
$$
\begin{aligned}
\text{Full} &\supseteq \text{Block-Diagonal} \supseteq \text{Diagonal} \\
\text{Full} &\supseteq \text{Separable}
\end{aligned}
$$
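
One way to see these inclusions is to expand each structured form into the full $(nd, nd)$ matrix. The sketch below does this with plain `torch` functions rather than the `autoemulate` classes. It assumes the flattened outputs are ordered sample by sample, and `random_spd` is a hypothetical helper introduced only for this illustration.

```python
import torch

torch.manual_seed(0)
n, d = 4, 3


def random_spd(size):
    """Random symmetric positive-definite matrix (hypothetical helper)."""
    a = torch.randn(size, size, dtype=torch.float64)
    return a @ a.T + 1e-4 * torch.eye(size, dtype=torch.float64)


# Block-diagonal: n marginal (d, d) covariances placed along the diagonal.
blocks = torch.stack([random_spd(d) for _ in range(n)])   # (n, d, d)
full_from_blocks = torch.block_diag(*blocks)              # (nd, nd)

# Diagonal: one variance per sample and output dimension.
variances = torch.rand(n, d, dtype=torch.float64) + 0.1   # (n, d)
full_from_diag = torch.diag(variances.reshape(-1))        # (nd, nd)

# Separable: Kronecker product of sample and dimension covariances.
cov_n, cov_d = random_spd(n), random_spd(d)
full_from_sep = torch.kron(cov_n, cov_d)                  # (nd, nd)

# Block (i, j) of the separable covariance equals cov_n[i, j] * cov_d.
i, j = 1, 2
block_ij = full_from_sep[i * d:(i + 1) * d, j * d:(j + 1) * d]
```

Each structured form is therefore a special case of the full matrix, while storing far fewer numbers: $n d^2$ for block-diagonal, $nd$ for diagonal and $n^2 + d^2$ for separable, compared with $(nd)^2$ for the full covariance.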

In [1]:
import torch, time
from autoemulate.experimental.data.gaussian import *
%load_ext autoreload
%autoreload 2

In [2]:
n, d = 4, 3
mean = torch.randn(n, d)

## Full $(nd, nd)$ covariance

In [3]:
cov = torch.randn(n*d, n*d)
cov = cov @ cov.T
dist = Dense(mean, cov)
dist.logdet(), dist.trace(), dist.max_eig()

(tensor(18.2819), tensor(155.3120), tensor(37.7297))

## Block-diagonal $(n, d, d)$ covariance

In [4]:
cov = torch.randn(n, d, d)
cov = cov @ cov.transpose(-1, -2) + torch.eye(d) * 1e-4
dist = Block_Diagonal(mean, cov)
dist.logdet(), dist.trace(), dist.max_eig()

(tensor(3.0497), tensor(41.9208), tensor(9.6233))

## Diagonal $(n, d)$ covariance

In [5]:
cov = torch.abs(torch.randn(n, d))
dist = Diagonal(mean, cov)
dist.logdet(), dist.trace(), dist.max_eig()

(tensor(-6.2500), tensor(9.7560), tensor(2.3465))

## Separable $(n, n)$ and $(d, d)$ covariance

In [6]:
cov_n = torch.randn(n, n)
cov_n = cov_n @ cov_n.T + torch.eye(n) * 1e-4
cov_d = torch.randn(d, d)
cov_d = cov_d @ cov_d.T + torch.eye(d) * 1e-4
dist = Separable(mean, cov_n, cov_d)
dist.logdet(), dist.trace(), dist.max_eig()

(tensor(-10.1732), tensor(90.1683), tensor(46.8669))

## Dirac

In [7]:
dist = Dirac(mean)
dist.logdet(), dist.trace(), dist.max_eig()

(tensor(-inf), tensor(0.), tensor(0.))

## Empirical distribution

In [8]:
k, n, d = 100, 3, 2

In [9]:
dist = Empirical_Dense(torch.randn(k, n, d))
dist.logdet(), dist.trace(), dist.max_eig()

(tensor(-0.0492), tensor(6.1424), tensor(1.4122))

In [10]:
dist = Empirical_Block_Diagonal(torch.randn(k, n, d))
dist.logdet(), dist.trace(), dist.max_eig()

(tensor(-0.2205), tensor(5.8254), tensor(1.1238))

In [11]:
dist = Empirical_Diagonal(torch.randn(k, n, d))
dist.logdet(), dist.trace(), dist.max_eig()

(tensor(-0.3838), tensor(5.7599), tensor(1.4086))

## Ensemble of Gaussians

Let $K \in \mathbb{N}$ be the number of Gaussians, where the $i$-th Gaussian has mean $\mu_i$ and covariance $\Sigma_i$. Then the mean and covariance of the ensemble are given by

$$
\mu
= \frac{1}{K}
  \sum_{i=1}^{K}
    \mu_i,
$$
$$
\Sigma
= \frac{1}{K}
  \sum_{i=1}^{K}
    \Sigma_i
+
  \frac{1}{K - 1}
  \sum_{i=1}^{K}
    \bigl(\mu_i - \mu\bigr)
    \bigl(\mu_i - \mu\bigr)^\top.
$$
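
The two formulas can be sketched directly in `torch`. This is an illustration of the equations above, not necessarily how `Ensemble` computes them internally. `ensemble_moments` is a hypothetical helper, and the means are assumed to be already flattened to vectors of length $m = nd$.

```python
import torch


def ensemble_moments(means, covs):
    """Combine K Gaussians following the formulas above (hypothetical helper).

    means: (K, m) stacked mean vectors, covs: (K, m, m) stacked covariances.
    Requires K >= 2 because of the 1 / (K - 1) factor.
    """
    K = means.shape[0]
    mu = means.mean(dim=0)                       # (m,)
    centered = means - mu                        # (K, m)
    within = covs.mean(dim=0)                    # (1/K) * sum_i Sigma_i
    between = centered.T @ centered / (K - 1)    # spread of the means
    return mu, within + between


torch.manual_seed(0)
K, m = 5, 6
means = torch.randn(K, m, dtype=torch.float64)
a = torch.randn(K, m, m, dtype=torch.float64)
covs = a @ a.transpose(-1, -2)                   # K random PSD covariances
mu, sigma = ensemble_moments(means, covs)
```

The first term averages the uncertainty reported by each member, and the second adds the disagreement between members. If all members share the same mean, the second term vanishes and the ensemble covariance is just the average of the $\Sigma_i$.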

In [19]:
dist = Ensemble(dists := [
    Empirical_Dense(torch.randn(k, n, d)),
    Empirical_Block_Diagonal(torch.randn(k, n, d)),
    Empirical_Diagonal(torch.randn(k, n, d)),
    Dirac(torch.randn(k, n, d).mean(dim=0))
])
dist.logdet(), dist.trace(), dist.max_eig()

(tensor(-1.7073), tensor(4.5409), tensor(0.8772))