In [None]:
from resources.workspace import *

$
% START OF MACRO DEF
% DO NOT EDIT IN INDIVIDUAL NOTEBOOKS, BUT IN macros.py
%
\newcommand{\Reals}{\mathbb{R}}
\newcommand{\Expect}[0]{\mathbb{E}}
\newcommand{\NormDist}{\mathcal{N}}
%
\newcommand{\DynMod}[0]{\mathscr{M}}
\newcommand{\ObsMod}[0]{\mathscr{H}}
%
\newcommand{\mat}[1]{{\mathbf{{#1}}}} 
%\newcommand{\mat}[1]{{\pmb{\mathsf{#1}}}}
\newcommand{\bvec}[1]{{\mathbf{#1}}} 
%
\newcommand{\trsign}{{\mathsf{T}}} 
\newcommand{\tr}{^{\trsign}} 
\newcommand{\tn}[1]{#1} 
\newcommand{\ceq}[0]{\mathrel{≔}}
%
\newcommand{\I}[0]{\mat{I}} 
\newcommand{\K}[0]{\mat{K}}
\newcommand{\bP}[0]{\mat{P}}
\newcommand{\bH}[0]{\mat{H}}
\newcommand{\bF}[0]{\mat{F}}
\newcommand{\R}[0]{\mat{R}}
\newcommand{\Q}[0]{\mat{Q}}
\newcommand{\B}[0]{\mat{B}}
\newcommand{\C}[0]{\mat{C}}
\newcommand{\Ri}[0]{\R^{-1}}
\newcommand{\Bi}[0]{\B^{-1}}
\newcommand{\X}[0]{\mat{X}}
\newcommand{\A}[0]{\mat{A}}
\newcommand{\Y}[0]{\mat{Y}}
\newcommand{\E}[0]{\mat{E}}
\newcommand{\U}[0]{\mat{U}}
\newcommand{\V}[0]{\mat{V}}
%
\newcommand{\x}[0]{\bvec{x}}
\newcommand{\y}[0]{\bvec{y}}
\newcommand{\z}[0]{\bvec{z}}
\newcommand{\q}[0]{\bvec{q}}
\newcommand{\br}[0]{\bvec{r}}
\newcommand{\bb}[0]{\bvec{b}}
%
\newcommand{\bx}[0]{\bvec{\bar{x}}}
\newcommand{\by}[0]{\bvec{\bar{y}}}
\newcommand{\barB}[0]{\mat{\bar{B}}}
\newcommand{\barP}[0]{\mat{\bar{P}}}
\newcommand{\barC}[0]{\mat{\bar{C}}}
\newcommand{\barK}[0]{\mat{\bar{K}}}
%
\newcommand{\D}[0]{\mat{D}}
\newcommand{\Dobs}[0]{\mat{D}_{\text{obs}}}
\newcommand{\Dmod}[0]{\mat{D}_{\text{mod}}}
%
\newcommand{\ones}[0]{\bvec{1}} 
\newcommand{\AN}[0]{\big( \I_N - \ones \ones\tr / N \big)}
%
% END OF MACRO DEF
$
# The ensemble (Monte-Carlo) approach
is an approximate method for doing Bayesian inference. Instead of computing the full posterior distributions (whether as grid values or as parameters), we try to generate ensembles from them.

An ensemble is an *iid* sample, i.e. a set of "members" ("particles", "realizations", or "sample points") that have been drawn ("sampled") independently from the same distribution. With the EnKF, these assumptions are generally tenuous, but pragmatic.

Ensembles can be used to characterize uncertainty: either by reconstructing (estimating) the distribution from which they are assumed to be drawn, or by computing various *statistics* such as the mean, median, variance, covariance, skewness, confidence intervals, etc. (any function of the ensemble can be seen as a "statistic"). This is illustrated by the code below.

In [None]:
# Parameters
b   = 0
B   = 25    
B12 = sqrt(B)

def true_pdf(x):
    return ss.norm.pdf(x,b,sqrt(B))

# Plot true pdf
xx = 3*linspace(-B12,B12,201)
fig, ax = plt.subplots()
ax.plot(xx,true_pdf(xx),label="True");

# Sample and plot ensemble
M = 1   # length of state vector
N = 100 # ensemble size
E = b + B12*randn((N,M))
ax.plot(E, zeros(N), '|k', alpha=0.3, ms=100)

# Plot histogram
nbins = max(10,N//30)
heights, bins, _ = ax.hist(E,density=1,bins=nbins,label="Histogram estimate")

# Plot parametric estimate
x_bar = np.mean(E)
B_bar = np.var(E, ddof=1)  # ddof=1: normalize by N-1 (unbiased estimate)
ax.plot(xx,ss.norm.pdf(xx,x_bar,sqrt(B_bar)),label="Parametric estimate")

ax.legend();

# Uncomment AFTER Exc 5:
# dx = bins[1]-bins[0]
# c = 0.5/sqrt(2*pi*B)
# for height, x in zip(heights,bins):
#     ax.add_patch(mpl.patches.Rectangle((x,0),dx,c*height/true_pdf(x+dx/2),alpha=0.3))
# Also set
#  * N = 10**4
#  * nbins = 50

The plot demonstrates that the true distribution can be represented by a sample thereof (since we can almost reconstruct the Gaussian distribution by estimating the moments from the sample). However, there are other ways to reconstruct (estimate) a distribution from a sample. For example: a histogram.

**Exc 2:** Which approximation to the true pdf looks better: the histogram or the parametric estimate?  
Does one of the approximations actually start with more information? The EnKF takes advantage of this.

#### Exc 4*:
Use `gaussian_kde` from `scipy.stats` to make a "continuous histogram" and plot it above.
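
As a starting point, the following is a minimal, self-contained sketch of how `gaussian_kde` can be called. It does not reuse the notebook's workspace: the generator `rng`, the grid `xx`, and the names `kde` and `density` are assumptions made for illustration. Note that `gaussian_kde` expects a 1-D array (or a `(d, n)` array), so an `(N, 1)` ensemble like the one above would need `E.ravel()`.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
b, B = 0, 25
E = b + np.sqrt(B) * rng.standard_normal(100)  # 1-D sample of size N=100

kde = stats.gaussian_kde(E)      # default bandwidth (Scott's rule)
xx = np.linspace(-15, 15, 201)   # evaluation grid: +/- 3 standard deviations
density = kde(xx)                # estimated pdf evaluated on the grid

# To overlay on an existing axes object `ax`:
# ax.plot(xx, density, label="KDE estimate")
```

The result is a smooth curve that integrates to (approximately) one, which is why it can be thought of as a "continuous histogram".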

In [None]:
#show_answer("KDE")

**Exc 5*:** Suppose the histogram bars get normalized (divided) by the value of the pdf at their location.  
How do you expect the resulting histogram to look?  
Test your answer by uncommenting the block in the above code.

Being able to sample a Gaussian distribution is a building block of the EnKF.
In the previous example, we generated samples from a Gaussian distribution using the `randn` function.
However, that was just for a scalar (univariate) case, i.e. with `M=1`. We need to be able to sample a multivariate Gaussian distribution. That is the objective of the following exercise.

**Exc 6 (Multivariate Gaussian sampling):**
Suppose $\z$ is a standard Gaussian,
i.e. $p(\z) = \mathcal{N}(\z \mid \bvec{0},\I_M)$,
where $\I_M$ is the $M$-dimensional identity matrix.  
Let $\x = \mat{L}\z + \bb$. 
Recall [Exc 3.7](T3%20-%20Univariate%20Kalman%20filtering.ipynb#Exc-3.7:-The-forecast-step:),
which yields $p(\x) = \mathcal{N}(\x \mid \bb, \mat{L}\mat{L}^T)$.
    
 * (a). $\z$ can be sampled using `randn((M,1))`. How (where) is `randn` defined?
 * (b). Consider the above definition of $\x$ and the code below.
 Complete it so as to generate a random realization of $\x$.  
 Hint: matrix-vector multiplication can be done using the symbol `@`. 
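
For reference, the mean and covariance quoted above can be checked with a short calculation. This is only a sketch using the definitions in this exercise and the linearity of the expectation; that $\x$ is Gaussian follows because it is an affine function of the Gaussian $\z$.

$$\begin{align}
\Expect[\x] &= \mat{L}\,\Expect[\z] + \bb = \bb \, , \\
\Expect\big[(\x - \bb)(\x - \bb)^T\big] &= \mat{L}\,\Expect[\z\z^T]\,\mat{L}^T = \mat{L}\,\I_M\,\mat{L}^T = \mat{L}\mat{L}^T \, .
\end{align}$$

Hence, to sample with a prescribed covariance $\B$, any $\mat{L}$ with $\mat{L}\mat{L}^T = \B$ will do, for example its Cholesky factor.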

In [None]:
M   = 3 # ndim
b   = 10*ones(M)
B   = diag(1+arange(M))
L   = np.linalg.cholesky(B) # B12
print("True mean and cov:")
print(b)
print(B)

### INSERT ANSWER (b) ###

In [None]:
#show_answer('Gaussian sampling a')

In [None]:
#show_answer('Gaussian sampling b')

 * (c). In the code cell below, sample $N = 100$ realizations of $\x$
 and collect them in an $M$-by-$N$ "ensemble matrix" $\E$.  
   - Try to avoid `for` loops (the main thing to figure out is: how to add a (mean) vector to a matrix).
   - Run the cell and inspect the computed mean and covariance to see if they're close to the true values, printed in the cell above.
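
The question of adding a vector to a matrix comes down to numpy's broadcasting rules. The sketch below illustrates them on a toy example that is deliberately not the exercise's answer; the names `A`, `v`, and `shifted` are placeholders chosen for illustration.

```python
import numpy as np

M, N = 3, 5
A = np.zeros((M, N))             # stand-in for an M-by-N matrix
v = np.array([10., 20., 30.])    # length-M vector, shape (M,)

# `A + v` would raise an error here: broadcasting aligns the trailing axes,
# and 5 (columns of A) does not match 3 (length of v).
shifted = A + v[:, None]         # v[:, None] has shape (M, 1): added to every column

print(shifted[:, 0])
```

Reshaping the vector to a column (shape `(M, 1)`) is what allows it to be added to each of the `N` columns without a loop.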

In [None]:
N  = 100 # ensemble size

E = ### INSERT ANSWER (c) ###

# Use the code below to assess whether you got it right
x_bar = np.mean(E,axis=1)
B_bar = np.cov(E)

with printoptions(precision=1):
    print("Estimated mean:")
    print(x_bar)
    print("Estimated covariance:")
    print(B_bar)
plt.matshow(B_bar,cmap="Blues"); plt.grid(False); plt.colorbar();

In [None]:
#show_answer('Gaussian sampling c')

**Exc 8*:** How erroneous are the ensemble estimates on average?

In [None]:
#show_answer('Average sampling error')

**Exc 10:** Above, we used numpy's (`np`) functions to compute the sample-estimated mean and covariance matrix,
$\bx$ and $\barB$,
from the ensemble matrix $\E$.
Now, instead, implement these estimators yourself:
$$\begin{align}\bx &\ceq \frac{1}{N}   \sum_{n=1}^N \x_n \, , \\
   \barB &\ceq \frac{1}{N-1} \sum_{n=1}^N (\x_n - \bx) (\x_n - \bx)^T \, . \end{align}$$

In [None]:
# Don't use numpy's mean, cov
def estimate_mean_and_cov(E):
    M, N = E.shape
    
    ### INSERT ANSWER ###
    
    return x_bar, B_bar

x_bar, B_bar = estimate_mean_and_cov(E)
with printoptions(precision=1):
    print(x_bar)
    print(B_bar)

In [None]:
#show_answer('ensemble moments')

**Exc 12:** Why is the normalization by $(N-1)$ for the covariance computation?
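
Before reasoning about it analytically, it may help to look at a numerical experiment. The sketch below is an illustration under assumed settings (a scalar Gaussian with variance `true_var = 4`, ensembles of size `N = 5`, and `K` repetitions); it compares the average of the two candidate normalizations over many small ensembles.

```python
import numpy as np

rng = np.random.default_rng(0)
true_var = 4.0
N, K = 5, 200_000                # small ensembles, many repetitions

samples = np.sqrt(true_var) * rng.standard_normal((K, N))
anomalies = samples - samples.mean(axis=1, keepdims=True)
sum_sq = (anomalies**2).sum(axis=1)          # sum of squared anomalies, per ensemble

avg_with_N   = (sum_sq / N).mean()           # normalizing by N
avg_with_Nm1 = (sum_sq / (N - 1)).mean()     # normalizing by N-1

print(avg_with_N, avg_with_Nm1)
```

With these settings, the `N`-normalized average should come out close to `(N-1)/N * true_var`, while the `(N-1)`-normalized one should come out close to `true_var`.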

In [None]:
#show_answer('Why (N-1)')

**Exc 14:** Like Matlab, Python (numpy) is quicker if you "vectorize" loops.
This is eminently possible with computations of ensemble moments.  
Let $\X \ceq 
\begin{bmatrix}
		\x_1 -\bx, & \ldots & \x_n -\bx, & \ldots & \x_N -\bx
	\end{bmatrix} \, .$
 * (a). Show that $\X = \E \AN$, where $\ones$ is the column vector of length $N$ with all elements equal to $1$.   
 Hint: consider column $n$ of $\X$.
 * (b). Show that $\barB = \X \X^T /(N-1)$.
 * (c). Code up this latest formula for $\barB$ and insert it in `estimate_mean_and_cov(E)`.
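
The identities in (a) and (b) can also be checked numerically before (or after) proving them. The following sketch does so on an arbitrary random ensemble; the names `A_N`, `X_matrix`, and `X_direct` are assumptions made for illustration, and `np.cov` is used only as a reference value.

```python
import numpy as np

rng = np.random.default_rng(1)
M, N = 3, 100
E = rng.standard_normal((M, N))                # arbitrary test ensemble

ones = np.ones((N, 1))
A_N = np.eye(N) - ones @ ones.T / N            # the matrix (I_N - 1 1^T / N)

X_matrix = E @ A_N                             # anomalies via matrix product, as in (a)
X_direct = E - E.mean(axis=1, keepdims=True)   # anomalies via subtracting the mean

assert np.allclose(X_matrix, X_direct)
assert np.allclose(X_direct @ X_direct.T / (N - 1), np.cov(E))
```

Forming the $N \times N$ matrix explicitly costs $O(N^2)$ memory, so in practice subtracting the mean directly is likely the cheaper route; the matrix form is mainly useful for derivations.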

In [None]:
#show_answer('ensemble moments vectorized')

**Exc 16:** The cross-covariance between two random vectors, $\x$ and $\y$, is estimated by
$$\begin{align}
\barC_{\x,\y}
&\ceq \frac{1}{N-1} \sum_{n=1}^N 
(\x_n - \bx) (\y_n - \by)^T \\
&= \X \Y^T /(N-1)
\end{align}$$
where $\Y$ is, similar to $\X$, the matrix whose columns are $\y_n - \by$ for $n=1,\ldots,N$.  
Note that this is simply the covariance formula, but for two different variables.  
I.e. if $\Y = \X$, then $\barC_{\x,\y} = \barC_{\x}$ (which we have denoted $\barB$ in the above).

Implement the cross-covariance estimator in the code-cell below.
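
To validate your implementation, one option is to compare against a reference value obtained from `np.cov` applied to the stacked ensembles. The sketch below shows how such a reference could be built; the test ensembles `Ex`, `Ey` and the names `joint_cov`, `C_ref` are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
Mx, My, N = 3, 2, 1000
Ex = rng.standard_normal((Mx, N))
# Ey is constructed to be correlated with the first My rows of Ex
Ey = 2 * Ex[:My] + 0.1 * rng.standard_normal((My, N))

joint_cov = np.cov(np.vstack([Ex, Ey]))   # (Mx+My)-by-(Mx+My), normalized by N-1
C_ref = joint_cov[:Mx, Mx:]               # upper-right block: cross-covariance, Mx-by-My

print(C_ref.round(2))
```

The upper-right block of the joint covariance matrix is the cross-covariance of the two ensembles, so `estimate_cross_cov(Ex, Ey)` should reproduce `C_ref` up to round-off.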

In [None]:
def estimate_cross_cov(Ex,Ey):
    ### INSERT ANSWER ###

In [None]:
#show_answer('estimate cross')

**Exc 18 (error notions)*:**
 * (a). What's the difference between error and residual?
 * (b). What's the difference between error and bias?
 * (c). Show `MSE = RMSE^2 = Bias^2 + Var`.
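
The identity in (c) can be checked numerically, which is not a proof but may help clarify what each term means. The sketch below uses a hypothetical estimator whose outputs are offset from the truth by `0.5` and have standard deviation `2`; all names and numbers are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
truth = 1.0
# Hypothetical estimates: biased by +0.5, with standard deviation 2
estimates = truth + 0.5 + 2.0 * rng.standard_normal(10_000)

errors = estimates - truth
mse  = np.mean(errors**2)
rmse = np.sqrt(mse)
bias = np.mean(errors)
var  = np.var(estimates)   # ddof=0, so the identity holds exactly for the sample

assert np.isclose(mse, rmse**2)
assert np.isclose(mse, bias**2 + var)
```

Note that the decomposition is exact for the sample only when the variance is normalized by the sample size (`ddof=0`); with `ddof=1` it holds approximately.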

In [None]:
#show_answer('errors')

### Next: [Writing your own EnKF](T8%20-%20Writing%20your%20own%20EnKF.ipynb)