# Kalman Filter on the Stationary Dynamics

The inverse problem 

$$y = \mathcal{G}(\theta) + \eta$$

can be solved by first introducing a (mean-field) stochastic dynamical system in which the parameter-to-data map is embedded and then employing techniques from nonlinear Kalman filtering.

Consider a family of stochastic dynamical systems

$$\begin{align}
  &\textrm{evolution:}    &&\theta_{n+1} = r + \alpha (\theta_{n}  - r) +  \omega_{n+1}, &&\omega_{n+1} \sim \mathcal{N}(0,\Sigma_{\omega}),\\
  &\textrm{observation:}  &&x_{n+1} = \mathcal{F}(\theta_{n+1}) + \nu_{n+1}, &&\nu_{n+1} \sim \mathcal{N}(0,\Sigma_{\nu}).
\end{align}$$

Applying different Kalman filters to these stochastic dynamical systems leads to different Kalman inversion algorithms.


# The Gaussian Approximation Algorithm

Let $Y_n=\{x_{\ell}\}_{\ell=1}^{n}$ denote the observations up to step $n$. The Kalman filter aims to approximate $\rho_n(\theta)$, the probability density function of the conditional distribution of $\theta_n|Y_n$. 
In the prediction step, we predict $\rho_n 
\mapsto \hat{\rho}_{n+1}$, where $\hat{\rho}_{n+1}$ is the probability density function of the distribution
of $\theta_{n+1}|Y_n$; in the analysis step, we update 
$\hat{\rho}_{n+1} \mapsto \rho_{n+1}$ by conditioning on the new observation $x_{n+1}$.

This conceptual algorithm maps Gaussians into Gaussians.
We refer to it henceforth as the Gaussian Approximation Algorithm.

## Prediction Step
Assume that $\rho_n \approx \mathcal{N}(m_n,C_n)$. 
Note that, under the linear evolution,
$\hat{\rho}_{n+1}$ is also Gaussian with mean and covariance

$$ \hat{m}_{n+1} = \mathbb{E}[\theta_{n+1}|Y_n] =  r + \alpha (m_n  - r) \qquad 
\hat{C}_{n+1} = \mathrm{Cov}[\theta_{n+1}|Y_n] = \alpha^2C_{n} + \Sigma_{\omega}$$
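
The prediction formulas above can be sketched in a few lines of NumPy. This is a minimal illustration, assuming a scalar $\alpha$; the function name `predict` and the numerical values are hypothetical and chosen only for the example.

```python
import numpy as np

def predict(m, C, r, alpha, Sigma_omega):
    """Gaussian prediction step: (m_n, C_n) -> (m_hat_{n+1}, C_hat_{n+1})."""
    m_hat = r + alpha * (m - r)           # mean relaxes toward r
    C_hat = alpha**2 * C + Sigma_omega    # covariance is scaled, then inflated
    return m_hat, C_hat

# Illustrative values (assumptions, not taken from the text)
m = np.array([1.0, 2.0])
C = np.eye(2)
r = np.zeros(2)
m_hat, C_hat = predict(m, C, r, alpha=0.5, Sigma_omega=0.1 * np.eye(2))
```

With $\alpha = 1$ the mean is carried over unchanged and only the covariance is inflated by $\Sigma_{\omega}$.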

## Analysis Step
The algorithm proceeds by introducing the joint distribution
of $\theta_{n+1}, x_{n+1}|Y_n$,  projecting this onto a Gaussian by computing
its mean and covariance, and then conditioning this Gaussian to obtain
a Gaussian approximation  $\mathcal{N}(m_{n+1},C_{n+1})$ to $\rho_{n+1}.$

In the analysis step, we assume that the joint distribution of  $\{\theta_{n+1}, x_{n+1}\}|Y_{n}$ can be approximated by a Gaussian distribution

$$\begin{equation}
     \mathcal{N}\Bigl(
    \begin{bmatrix}
    \hat{m}_{n+1}\\
    \hat{x}_{n+1}
    \end{bmatrix}, 
    \begin{bmatrix}
   \hat{C}_{n+1} & \hat{C}_{n+1}^{\theta x}\\
    {\hat{C}_{n+1}^{\theta x}}{}^{T} & \hat{C}_{n+1}^{xx}
    \end{bmatrix}
    \Bigr).
\end{equation}$$

Then, with $\mathbb{E}$ denoting expectation with respect to 
$\theta_{n+1}|Y_n \sim \mathcal{N}( \hat{m}_{n+1},\hat{C}_{n+1})$, 

$$\begin{align*}
    \hat{x}_{n+1} =     & \mathbb{E}[\mathcal{F}(\theta_{n+1})|Y_n], \\
    \hat{C}_{n+1}^{\theta x} =     &  \mathrm{Cov}[\theta_{n+1}, \mathcal{F}(\theta_{n+1})|Y_n],\\
    \hat{C}_{n+1}^{xx} = &  \mathrm{Cov}[\mathcal{F}(\theta_{n+1})|Y_n] + \Sigma_{\nu}.
\end{align*}$$

Conditioning the Gaussian to find $\theta_{n+1}|\{Y_n,x_{n+1}\}=\theta_{n+1}|Y_{n+1}$ gives the following
expressions for the mean $m_{n+1}$ and covariance $C_{n+1}$ of the
approximation to $\rho_{n+1}:$

$$
\begin{equation}
\label{eq:KF_analysis}
    \begin{split}
        m_{n+1} &= \hat{m}_{n+1} + \hat{C}_{n+1}^{\theta x} (\hat{C}_{n+1}^{xx})^{-1} (x_{n+1} - \hat{x}_{n+1}),\\
        C_{n+1} &= \hat{C}_{n+1} - \hat{C}_{n+1}^{\theta x}(\hat{C}_{n+1}^{xx})^{-1} {\hat{C}_{n+1}^{\theta x}}{}^{T}.
    \end{split}
\end{equation}
$$

These equations give a conceptual description of the Kalman filter. When $\mathcal{F}$ is nonlinear, however, the integrals defining
$\hat{x}_{n+1}$, $\hat{C}_{n+1}^{\theta x}$ and $\hat{C}_{n+1}^{xx}$ cannot in general be evaluated in closed form; different nonlinear Kalman filters correspond to different ways of approximating them.
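
Once the three moments are available, the conditioning step itself is a short computation. The sketch below assumes a linear forward map $\mathcal{F}(\theta) = H\theta$, for which the moments are exact; the function name `analysis`, the matrix `H` and all numerical values are hypothetical.

```python
import numpy as np

def analysis(m_hat, C_hat, x_hat, C_tx, C_xx, x_obs):
    """Condition the joint Gaussian on the observation x_obs."""
    # Gain K = C_tx C_xx^{-1}, computed with a linear solve instead of an explicit inverse
    K = np.linalg.solve(C_xx.T, C_tx.T).T
    m = m_hat + K @ (x_obs - x_hat)
    C = C_hat - K @ C_tx.T
    return m, C

# Scalar example with F(theta) = H theta (an assumption made for illustration)
H = np.array([[1.0]])
m_hat, C_hat = np.array([0.0]), np.array([[1.0]])
Sigma_nu = np.array([[1.0]])
x_hat = H @ m_hat                      # E[F(theta)]
C_tx = C_hat @ H.T                     # Cov[theta, F(theta)]
C_xx = H @ C_hat @ H.T + Sigma_nu      # Cov[F(theta)] + Sigma_nu
m, C = analysis(m_hat, C_hat, x_hat, C_tx, C_xx, x_obs=np.array([2.0]))
```

In this scalar case the prior and the observation noise have equal variance, so the updated mean lies halfway between the prior mean and the observation, and the variance is halved.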




# Unscented Kalman Inversion

When the [unscented Kalman filter](KalmanFilter.ipynb) is applied, the conceptual Gaussian approximation algorithm becomes 

* Prediction step :

    $$\begin{align*}
    \hat{m}_{n+1} = & r+\alpha(m_n-r)\\
    \hat{C}_{n+1} = & \alpha^2 C_{n} + \Sigma_{\omega}
    \end{align*}$$
    
* Generate sigma points :
    
    $$\begin{align*}
    &\hat{\theta}_{n+1}^0 = \hat{m}_{n+1} \\
    &\hat{\theta}_{n+1}^j = \hat{m}_{n+1} + c_j [\sqrt{\hat{C}_{n+1}}]_j \quad (1\leq j\leq N_\theta)\\ 
    &\hat{\theta}_{n+1}^{j+N_\theta} = \hat{m}_{n+1} - c_j [\sqrt{\hat{C}_{n+1}}]_j\quad (1\leq j\leq N_\theta)
    \end{align*}$$
    
*  Analysis step :
    
   $$
   \begin{align*}
        &\hat{x}^j_{n+1} = \mathcal{F}(\hat{\theta}^j_{n+1}) \qquad \hat{x}_{n+1} = \hat{x}^0_{n+1}\\
         &\hat{C}^{\theta x}_{n+1} = \sum_{j=1}^{2N_\theta}W_j^{c}
        (\hat{\theta}^j_{n+1} - \hat{m}_{n+1} )(\hat{x}^j_{n+1} - \hat{x}_{n+1})^T \\
        &\hat{C}^{xx}_{n+1} = \sum_{j=1}^{2N_\theta}W_j^{c}
        (\hat{x}^j_{n+1} - \hat{x}_{n+1} )(\hat{x}^j_{n+1} - \hat{x}_{n+1})^T + \Sigma_{\nu}\\
        &m_{n+1} = \hat{m}_{n+1} + \hat{C}^{\theta x}_{n+1}(\hat{C}^{xx}_{n+1})^{-1}(x - \hat{x}_{n+1})\\
        &C_{n+1} = \hat{C}_{n+1} - \hat{C}^{\theta x}_{n+1}(\hat{C}^{xx}_{n+1})^{-1}{\hat{C}^{\theta x}_{n+1}}{}^{T}\\
    \end{align*}
    $$
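
The three steps above can be combined into a single update. The following is a sketch, not a reference implementation. The text does not specify $c_j$ or $W_j^{c}$, so the code assumes $c_j = a\sqrt{N_\theta}$ and $W_j^{c} = 1/(2a^2N_\theta)$ with $a = \min\{\sqrt{4/N_\theta}, 1\}$, and it takes the columns of the Cholesky factor as $[\sqrt{\hat{C}_{n+1}}]_j$. The function name `uki_step` and the linear test problem are hypothetical.

```python
import numpy as np

def uki_step(m, C, x_obs, F, r, alpha, Sigma_omega, Sigma_nu):
    N = m.size
    # Assumed hyperparameters (not specified in the text)
    a = min(np.sqrt(4.0 / N), 1.0)
    c = a * np.sqrt(N)
    W_c = 1.0 / (2.0 * a**2 * N)

    # Prediction step
    m_hat = r + alpha * (m - r)
    C_hat = alpha**2 * C + Sigma_omega

    # Sigma points: row 0 is the mean, rows 1..2N are m_hat +/- c * (column j of L)
    L = np.linalg.cholesky(C_hat)
    thetas = np.vstack([m_hat, m_hat + c * L.T, m_hat - c * L.T])

    # Analysis step
    xs = np.array([F(t) for t in thetas])
    x_hat = xs[0]                       # x_hat is the forward map at the mean
    d_theta = thetas[1:] - m_hat
    d_x = xs[1:] - x_hat
    C_tx = W_c * d_theta.T @ d_x
    C_xx = W_c * d_x.T @ d_x + Sigma_nu
    K = np.linalg.solve(C_xx, C_tx.T).T
    return m_hat + K @ (x_obs - x_hat), C_hat - K @ C_tx.T

# Hypothetical linear test problem with noise-free data
H = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
F = lambda theta: H @ theta
theta_true = np.array([1.0, -1.0])
x_obs = H @ theta_true
r = np.zeros(2)
Sigma_omega, Sigma_nu = 0.1 * np.eye(2), 0.01 * np.eye(3)

m, C = np.zeros(2), np.eye(2)
for _ in range(10):
    m, C = uki_step(m, C, x_obs, F, r, 1.0, Sigma_omega, Sigma_nu)
```

Under the assumed weights, $2 W_j^{c} c_j^2 = 1$, so for a linear forward map the sigma-point sums reproduce the exact Kalman filter moments.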




# Stochastic Ensemble Kalman Inversion

When the [stochastic ensemble Kalman filter](Kalman.ipynb) is applied, the conceptual Gaussian approximation algorithm becomes 

* Prediction step :

$$
\begin{align*}
\hat{\theta}_{n+1}^{j} &= \alpha\theta_{n}^{j}+ (1-\alpha)r  + \omega_{n+1}^{j},\\ 
\hat{m}_{n+1} &= \frac{1}{J}\sum_{j=1}^{J}\hat{\theta}_{n+1}^{j}
\end{align*}
$$
    
    
*  Analysis step :
    
   $$
   \begin{align*}
         &\hat{x}_{n+1}^{j} = \mathcal{F}(\hat{\theta}_{n+1}^{j})  \qquad \hat{x}_{n+1} = \frac{1}{J}\sum_{j=1}^{J}\hat{x}_{n+1}^{j},\\
         %
        &\hat{C}_{n+1}^{\theta x} = \frac{1}{J-1}\sum_{j=1}^{J}(\hat{\theta}_{n+1}^{j} - \hat{m}_{n+1})(\hat{x}_{n+1}^{j} - \hat{x}_{n+1})^T,  \\
                      %
        &\hat{C}_{n+1}^{xx} = \frac{1}{J-1}\sum_{j=1}^{J}(\hat{x}_{n+1}^{j} - \hat{x}_{n+1})(\hat{x}_{n+1}^{j} - \hat{x}_{n+1})^T +\Sigma_{\nu}, \\
                      %
        &\theta_{n+1}^{j} = \hat{\theta}_{n+1}^{j} + \hat{C}_{n+1}^{\theta x}\left(\hat{C}_{n+1}^{xx}\right)^{-1}(x - \hat{x}_{n+1}^{j} - \nu_{n+1}^{j}),\\
                      %
        &m_{n+1} = \frac{1}{J} \sum_{j=1}^{J} \theta_{n+1}^{j}.\\ 
    \end{align*}
    $$
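
A minimal sketch of one stochastic ensemble update follows, assuming the ensemble is stored as a $J \times N_\theta$ array with one member per row. The function name `eki_step`, the linear forward map and the ensemble size are hypothetical choices for illustration.

```python
import numpy as np

def eki_step(thetas, x_obs, F, r, alpha, Sigma_omega, Sigma_nu, rng):
    J, N = thetas.shape
    # Prediction: each member is relaxed toward r and perturbed by omega ~ N(0, Sigma_omega)
    omega = rng.multivariate_normal(np.zeros(N), Sigma_omega, size=J)
    thetas_hat = alpha * thetas + (1 - alpha) * r + omega
    m_hat = thetas_hat.mean(axis=0)

    # Analysis: empirical moments of the predicted ensemble
    xs = np.array([F(t) for t in thetas_hat])
    x_hat = xs.mean(axis=0)
    d_theta, d_x = thetas_hat - m_hat, xs - x_hat
    C_tx = d_theta.T @ d_x / (J - 1)
    C_xx = d_x.T @ d_x / (J - 1) + Sigma_nu
    K = np.linalg.solve(C_xx, C_tx.T).T

    # Each member is updated with its own perturbation nu ~ N(0, Sigma_nu)
    nu = rng.multivariate_normal(np.zeros(x_obs.size), Sigma_nu, size=J)
    return thetas_hat + (x_obs - xs - nu) @ K.T

# Hypothetical linear test problem
rng = np.random.default_rng(0)
H = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
F = lambda theta: H @ theta
theta_true = np.array([1.0, -1.0])
x_obs = H @ theta_true

thetas0 = rng.normal(size=(50, 2))
thetas = thetas0.copy()
for _ in range(10):
    thetas = eki_step(thetas, x_obs, F, np.zeros(2), 1.0,
                      1e-3 * np.eye(2), 0.01 * np.eye(3), rng)
m = thetas.mean(axis=0)
```

Passing the random generator explicitly keeps the run reproducible; the ensemble mean `m` is the parameter estimate.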


**Remark**
When $\Sigma_{\omega} = \gamma C_n$, the prediction step can be treated deterministically, as follows,

$$
\begin{align*}
\hat{m}_{n+1} &= r + \alpha(m_n - r) \\
\hat{\theta}_{n+1}^{j} &= \hat{m}_{n+1} + \sqrt{\alpha^2 + \gamma} (\theta_n^{j} - m_n)\\ 
\end{align*}
$$
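
The deterministic prediction rescales the ensemble deviations so that the empirical covariance becomes $(\alpha^2 + \gamma)C_n$ without drawing any noise. A short sketch, with a hypothetical function name and illustrative values:

```python
import numpy as np

def predict_ensemble_deterministic(thetas, r, alpha, gamma):
    """Deterministic prediction for Sigma_omega = gamma * C_n (one member per row)."""
    m = thetas.mean(axis=0)
    m_hat = r + alpha * (m - r)
    # Scaling the deviations by sqrt(alpha^2 + gamma) scales the covariance by alpha^2 + gamma
    return m_hat + np.sqrt(alpha**2 + gamma) * (thetas - m)

rng = np.random.default_rng(0)
thetas = rng.normal(size=(20, 3))
r, alpha, gamma = np.zeros(3), 0.9, 0.5
thetas_hat = predict_ensemble_deterministic(thetas, r, alpha, gamma)
```

The empirical covariance of `thetas_hat` equals $\alpha^2 C_n + \gamma C_n$, which is $\hat{C}_{n+1}$ for this choice of $\Sigma_{\omega}$.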

**Remark**
As a precursor to understanding the adjustment and transform filters that follow, we point out that,
because of the stochastic treatment of the analysis step, Stochastic Ensemble Kalman Inversion does not exactly replicate the covariance
update equation.
To see this,
define the matrix square roots $\hat{Z}_{n+1},\, Z_{n+1} \in \mathbb{R}^{N_{\theta}\times J}$ of $\hat{C}_{n+1},\,C_{n+1}$, together with the matrix $\hat{\mathcal{Y}}_{n+1}$ of scaled observation deviations, as follows:

$$
\begin{align*}
    \hat{Z}_{n+1} &= \frac{1}{\sqrt{J-1}}\Big(\hat{\theta}_{n+1}^{1} - \hat{m}_{n+1}\quad \hat{\theta}_{n+1}^{2} - \hat{m}_{n+1}\quad...\quad\hat{\theta}_{n+1}^{J} - \hat{m}_{n+1} \Big),\\
    Z_{n+1} &= \frac{1}{\sqrt{J-1}}\Big(\theta_{n+1}^{1} - m_{n+1}\quad \theta_{n+1}^{2} - m_{n+1}\quad...\quad\theta_{n+1}^{J} - m_{n+1} \Big),\\
    \hat{\mathcal{Y}}_{n+1} &= \frac{1}{\sqrt{J-1}}\Big(\hat{x}_{n+1}^{1} - \hat{x}_{n+1}\quad \hat{x}_{n+1}^{2} - \hat{x}_{n+1}\quad...\quad\hat{x}_{n+1}^{J} - \hat{x}_{n+1} \Big).
\end{align*}
$$

Then the covariance update equation does not hold exactly:
$$
\begin{align*}
        \hat{C}_{n+1} - \hat{C}_{n+1}^{\theta x}(\hat{C}_{n+1}^{xx})^{-1} {\hat{C}_{n+1}^{\theta x}}{}^{T} &= \hat{Z}_{n+1}\hat{Z}_{n+1}^T - \hat{Z}_{n+1}\hat{\mathcal{Y}}_{n+1}^T (\hat{\mathcal{Y}}_{n+1}\hat{\mathcal{Y}}_{n+1}^T + \Sigma_{\nu})^{-1}\hat{\mathcal{Y}}_{n+1}\hat{Z}_{n+1}^T \\
                &\neq Z_{n+1} Z_{n+1}^T = C_{n+1}.
\end{align*}
$$
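
The mismatch can be observed numerically. The sketch below builds the square-root matrices for a small random ensemble, applies one stochastic analysis step, and compares the empirical covariance of the updated ensemble with the Kalman covariance update. The linear forward map `H`, the ensemble size and the seed are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
J, N = 10, 2
H = np.array([[1.0, 0.0], [1.0, 1.0]])     # hypothetical linear forward map
Sigma_nu = 0.1 * np.eye(2)
x_obs = np.array([1.0, 0.5])

thetas_hat = rng.normal(size=(J, N))       # predicted ensemble, one member per row
xs = thetas_hat @ H.T
m_hat, x_hat = thetas_hat.mean(axis=0), xs.mean(axis=0)

Z_hat = (thetas_hat - m_hat).T / np.sqrt(J - 1)   # N_theta x J
Y_hat = (xs - x_hat).T / np.sqrt(J - 1)           # N_x x J

C_tx = Z_hat @ Y_hat.T
C_xx = Y_hat @ Y_hat.T + Sigma_nu
K = np.linalg.solve(C_xx, C_tx.T).T

# Covariance prescribed by the Kalman update equation
C_kalman = Z_hat @ Z_hat.T - K @ C_tx.T

# Covariance actually produced by the stochastic analysis step
nu = rng.multivariate_normal(np.zeros(2), Sigma_nu, size=J)
thetas = thetas_hat + (x_obs - xs - nu) @ K.T
Z = (thetas - thetas.mean(axis=0)).T / np.sqrt(J - 1)
C_ensemble = Z @ Z.T

gap = np.linalg.norm(C_ensemble - C_kalman)   # nonzero for a finite ensemble
```

The gap comes from the sampled perturbations $\nu_{n+1}^{j}$, whose empirical covariance and cross-covariance with the ensemble differ from their expected values when $J$ is finite.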


# Ensemble Adjustment Kalman Inversion

When the ensemble adjustment Kalman filter is applied, the conceptual Gaussian approximation algorithm becomes 

* Prediction step :

$$
\begin{align*}
\hat{\theta}_{n+1}^{j} &= \alpha\theta_{n}^{j}+ (1-\alpha)r  + \omega_{n+1}^{j},\\ 
\hat{m}_{n+1} &= \frac{1}{J}\sum_{j=1}^{J}\hat{\theta}_{n+1}^{j}
\end{align*}
$$
    
    
*  Analysis step :
    
   $$
   \begin{align*}
         &m_{n+1} = \hat{m}_{n+1} + \hat{C}_{n+1}^{\theta x}\left(\hat{C}_{n+1}^{xx}\right)^{-1}(x - \hat{x}_{n+1})\\
                      %
        &\theta_{n+1}^{j} = m_{n+1} + A(\hat{\theta}_{n+1}^{j} - \hat{m}_{n+1}) 
    \end{align*}
    $$

where $A = P \hat{D}^{\frac{1}{2}} U D^{\frac{1}{2}}\hat{D}^{-\frac{1}{2}}P^T $ with 

$$
\begin{align*}
   \textrm{SVD :}     \quad       &\hat{Z}_{n+1} =  P \hat{D}^{\frac{1}{2}} V^T,\\
   \textrm{SVD :}     \quad      &V^T\Big(\mathbb{I} + \hat{\mathcal{Y}}_{n+1}^T \Sigma_{\nu}^{-1}  \hat{\mathcal{Y}}_{n+1}\Big)^{-1} V = U D U^T,
\end{align*}
$$

where both $\hat{D}$ and $D$ are non-singular diagonal matrices of dimension rank($\hat{Z}_{n+1}$), and $\hat{Z}_{n+1}$ and $\hat{\mathcal{Y}}_{n+1}$ are as defined above.
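
The adjustment analysis step can be sketched as follows, assuming the predicted ensemble and its forward evaluations are stored one member per row. The second factorization is computed with a symmetric eigendecomposition, which coincides with the SVD for a symmetric positive definite matrix. The function name `eaki_analysis`, the rank tolerance and the linear test map `H` are hypothetical.

```python
import numpy as np

def eaki_analysis(thetas_hat, xs, x_obs, Sigma_nu):
    J = thetas_hat.shape[0]
    m_hat, x_hat = thetas_hat.mean(axis=0), xs.mean(axis=0)
    Z_hat = (thetas_hat - m_hat).T / np.sqrt(J - 1)
    Y_hat = (xs - x_hat).T / np.sqrt(J - 1)

    # Mean update with the ensemble Kalman gain
    C_tx = Z_hat @ Y_hat.T
    C_xx = Y_hat @ Y_hat.T + Sigma_nu
    m_new = m_hat + C_tx @ np.linalg.solve(C_xx, x_obs - x_hat)

    # Thin SVD Z_hat = P diag(s) V^T, truncated to the numerical rank
    P, s, Vt = np.linalg.svd(Z_hat, full_matrices=False)
    keep = s > 1e-12 * s.max()
    P, s, Vt = P[:, keep], s[keep], Vt[keep]

    # V^T (I + Y_hat^T Sigma_nu^{-1} Y_hat)^{-1} V = U D U^T
    inner = np.eye(J) + Y_hat.T @ np.linalg.solve(Sigma_nu, Y_hat)
    M = Vt @ np.linalg.solve(inner, Vt.T)
    D, U = np.linalg.eigh((M + M.T) / 2)

    # A = P D_hat^{1/2} U D^{1/2} D_hat^{-1/2} P^T
    A = P @ np.diag(s) @ U @ np.diag(np.sqrt(D)) @ np.diag(1.0 / s) @ P.T
    return m_new + (thetas_hat - m_hat) @ A.T

rng = np.random.default_rng(0)
H = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]])   # hypothetical linear forward map
Sigma_nu = 0.1 * np.eye(3)
x_obs = np.array([1.0, 0.5, -1.0])
thetas_hat = rng.normal(size=(10, 2))
xs = thetas_hat @ H.T
thetas = eaki_analysis(thetas_hat, xs, x_obs, Sigma_nu)
```

Unlike the stochastic update, the empirical covariance of `thetas` matches the Kalman covariance update up to round-off, because $A\hat{Z}_{n+1}\hat{Z}_{n+1}^TA^T = \hat{Z}_{n+1}(\mathbb{I} + \hat{\mathcal{Y}}_{n+1}^T\Sigma_{\nu}^{-1}\hat{\mathcal{Y}}_{n+1})^{-1}\hat{Z}_{n+1}^T$.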

**Remark**
When $\Sigma_{\omega} = \gamma C_n$, the prediction step can be treated deterministically, as follows,

$$
\begin{align*}
\hat{m}_{n+1} &= r + \alpha(m_n - r) \\
\hat{\theta}_{n+1}^{j} &= \hat{m}_{n+1} + \sqrt{\alpha^2 + \gamma} (\theta_n^{j} - m_n)\\ 
\end{align*}
$$




# Ensemble Transform Kalman Inversion

When the ensemble transform Kalman filter is applied, the conceptual Gaussian approximation algorithm becomes 

* Prediction step :

$$
\begin{align*}
\hat{\theta}_{n+1}^{j} &= \alpha\theta_{n}^{j}+ (1-\alpha)r  + \omega_{n+1}^{j},\\ 
\hat{m}_{n+1} &= \frac{1}{J}\sum_{j=1}^{J}\hat{\theta}_{n+1}^{j}
\end{align*}
$$
    
    
*  Analysis step :
    
   $$
   \begin{align*}
         &m_{n+1} = \hat{m}_{n+1} + \hat{C}_{n+1}^{\theta x}\left(\hat{C}_{n+1}^{xx}\right)^{-1}(x - \hat{x}_{n+1})\\
                      %
        &Z_{n+1} = \hat{Z}_{n+1} T
    \end{align*}
    $$

where $T = P(\Gamma + I)^{-\frac{1}{2}}P^T$, with 
$$
\begin{align*}
\textrm{SVD:} \quad \hat{\mathcal{Y}}_{n+1}^T \Sigma_{\nu}^{-1} \hat{\mathcal{Y}}_{n+1} = P\Gamma P^T.
\end{align*}
$$
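
The transform analysis step can be sketched in the same way. The code assumes one ensemble member per row and factorizes the symmetric $J \times J$ matrix built from the scaled observation deviations with a symmetric eigendecomposition, so that $T$ is symmetric and leaves the ensemble mean unchanged. The function name `etki_analysis` and the linear test map `H` are hypothetical.

```python
import numpy as np

def etki_analysis(thetas_hat, xs, x_obs, Sigma_nu):
    J = thetas_hat.shape[0]
    m_hat, x_hat = thetas_hat.mean(axis=0), xs.mean(axis=0)
    Z_hat = (thetas_hat - m_hat).T / np.sqrt(J - 1)
    Y_hat = (xs - x_hat).T / np.sqrt(J - 1)

    # Mean update with the ensemble Kalman gain
    C_tx = Z_hat @ Y_hat.T
    C_xx = Y_hat @ Y_hat.T + Sigma_nu
    m_new = m_hat + C_tx @ np.linalg.solve(C_xx, x_obs - x_hat)

    # J x J matrix Y_hat^T Sigma_nu^{-1} Y_hat = P Gamma P^T
    S = Y_hat.T @ np.linalg.solve(Sigma_nu, Y_hat)
    Gamma, P = np.linalg.eigh((S + S.T) / 2)

    # T = P (Gamma + I)^{-1/2} P^T and Z = Z_hat T
    T = P @ np.diag(1.0 / np.sqrt(Gamma + 1.0)) @ P.T
    Z = Z_hat @ T

    # Rebuild the ensemble from the updated mean and square root
    return m_new + np.sqrt(J - 1) * Z.T

rng = np.random.default_rng(0)
H = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]])   # hypothetical linear forward map
Sigma_nu = 0.1 * np.eye(3)
x_obs = np.array([1.0, 0.5, -1.0])
thetas_hat = rng.normal(size=(10, 2))
xs = thetas_hat @ H.T
thetas = etki_analysis(thetas_hat, xs, x_obs, Sigma_nu)
```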
**Remark**
When $\Sigma_{\omega} = \gamma C_n$, the prediction step can be treated deterministically, as follows,

$$
\begin{align*}
\hat{m}_{n+1} &= r + \alpha(m_n - r) \\
\hat{\theta}_{n+1}^{j} &= \hat{m}_{n+1} + \sqrt{\alpha^2 + \gamma} (\theta_n^{j} - m_n)\\ 
\end{align*}
$$



