## Chapter 4: Kalman Filtering and Exact Solutions

Markovian Model:


$$
x_k \sim p(x_k | x_{k-1})
$$

$$
y_k \sim p(y_k | x_{k})
$$

- Markov Property
- Conditional independence of measurements

Bayesian Filtering: Finding the marginal posterior

$$
p(x_k | y_{1:k})
$$
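
For reference, a sketch of the standard recursion that produces this marginal posterior, stated here from the general Bayesian filtering theory rather than taken from these notes: a prediction step (the Chapman-Kolmogorov equation) followed by an update step (Bayes' rule),

$$
\begin{aligned}
p\left(x_k \mid y_{1:k-1}\right) & =\int p\left(x_k \mid x_{k-1}\right) p\left(x_{k-1} \mid y_{1:k-1}\right) \mathrm{d} x_{k-1}, \\
p\left(x_k \mid y_{1:k}\right) & =\frac{p\left(y_k \mid x_k\right) p\left(x_k \mid y_{1:k-1}\right)}{\int p\left(y_k \mid x_k\right) p\left(x_k \mid y_{1:k-1}\right) \mathrm{d} x_k} .
\end{aligned}
$$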

# Kalman Filtering

The Kalman filter is the closed-form solution to the recursive Bayesian filtering equations for the linear Gaussian model

$$
\begin{aligned}
\mathbf{x}_k & =\mathbf{A}_{k-1} \mathbf{x}_{k-1}+\mathbf{q}_{k-1} \\
\mathbf{y}_k & =\mathbf{H}_k \mathbf{x}_k+\mathbf{r}_k
\end{aligned}
$$

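Here $\mathbf{q}_{k-1} \sim \mathrm{N}(\mathbf{0}, \mathbf{Q}_{k-1})$ is the process noise and $\mathbf{r}_k \sim \mathrm{N}(\mathbf{0}, \mathbf{R}_k)$ is the measurement noise, so the dynamic and measurement models are linear Gaussian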

$$
\begin{aligned}
p\left(\mathbf{x}_k \mid \mathbf{x}_{k-1}\right) & =\mathrm{N}\left(\mathbf{x}_k \mid \mathbf{A}_{k-1} \mathbf{x}_{k-1}, \mathbf{Q}_{k-1}\right), \\
p\left(\mathbf{y}_k \mid \mathbf{x}_k\right) & =\mathrm{N}\left(\mathbf{y}_k \mid \mathbf{H}_k \mathbf{x}_k, \mathbf{R}_k\right) .
\end{aligned}
$$

Evaluating the filtering equations in closed form for this model results in Gaussian distributions:

$$
p\left(\mathbf{x}_k \mid \mathbf{y}_{1: k}\right)=\mathrm{N}\left(\mathbf{x}_k \mid \mathbf{m}_k, \mathbf{P}_k\right),
$$

The iterative process of estimating a new state involves a prediction step

$$
\begin{aligned}
\mathbf{m}_k^{-} & =\mathbf{A}_{k-1} \mathbf{m}_{k-1} \\
\mathbf{P}_k^{-} & =\mathbf{A}_{k-1} \mathbf{P}_{k-1} \mathbf{A}_{k-1}^{\top}+\mathbf{Q}_{k-1},
\end{aligned}
$$

and an update step

$$
\begin{aligned}
\mathbf{v}_k & =\mathbf{y}_k-\mathbf{H}_k \mathbf{m}_k^{-}, \\
\mathbf{S}_k & =\mathbf{H}_k \mathbf{P}_k^{-} \mathbf{H}_k^{\top}+\mathbf{R}_k, \\
\mathbf{K}_k & =\mathbf{P}_k^{-} \mathbf{H}_k^{\top} \mathbf{S}_k^{-1}, \\
\mathbf{m}_k & =\mathbf{m}_k^{-}+\mathbf{K}_k \mathbf{v}_k, \\
\mathbf{P}_k & =\mathbf{P}_k^{-}-\mathbf{K}_k \mathbf{S}_k \mathbf{K}_k^{\top} .
\end{aligned}
$$
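
The prediction and update steps above can be sketched in code. The following is a minimal illustration for a scalar state, so that matrix products reduce to ordinary multiplication and the inverse of $\mathbf{S}_k$ becomes a division; the function names and the random-walk parameter values are assumptions made for the example, not part of the notes.

```python
def kalman_predict(m, P, A, Q):
    """Prediction step for the scalar model x_k = A x_{k-1} + q_{k-1}."""
    m_pred = A * m
    P_pred = A * P * A + Q
    return m_pred, P_pred


def kalman_update(m_pred, P_pred, y, H, R):
    """Update step: condition the predicted Gaussian on the measurement y."""
    v = y - H * m_pred          # innovation v_k
    S = H * P_pred * H + R      # innovation variance S_k
    K = P_pred * H / S          # Kalman gain K_k
    m = m_pred + K * v
    P = P_pred - K * S * K
    return m, P


# Hypothetical random-walk example: A = H = 1, Q = 0.1, R = 1.
m, P = 0.0, 1.0
for y in [1.2, 0.9, 1.1]:
    m_pred, P_pred = kalman_predict(m, P, A=1.0, Q=0.1)
    m, P = kalman_update(m_pred, P_pred, y, H=1.0, R=1.0)
```

In the matrix case the same structure applies, with `A * P * A` replaced by $\mathbf{A} \mathbf{P} \mathbf{A}^{\top}$ and the division by `S` replaced by a linear solve.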

## Chapter 5: Extended and Unscented Kalman filter

$$
\begin{aligned}
\mathbf{x}_k & =\mathbf{f}\left(\mathbf{x}_{k-1}\right)+\mathbf{q}_{k-1}, \\
\mathbf{y}_k & =\mathbf{h}\left(\mathbf{x}_k\right)+\mathbf{r}_k,
\end{aligned}
$$

For this non-linear model, we assume Gaussian approximations of the filtering distribution

$$
p\left(\mathbf{x}_k \mid \mathbf{y}_{1: k}\right) \simeq \mathrm{N}\left(\mathbf{x}_k \mid \mathbf{m}_k, \mathbf{P}_k\right)
$$

In Extended Kalman Filtering (EKF), we use a Taylor series to approximate the non-linearities. The prediction step then becomes

$$
\begin{aligned}
\mathbf{m}_k^{-} & =\mathbf{f}\left(\mathbf{m}_{k-1}\right), \\
\mathbf{P}_k^{-} & =\mathbf{F}_{\mathbf{x}}\left(\mathbf{m}_{k-1}\right) \mathbf{P}_{k-1} \mathbf{F}_{\mathbf{x}}^{\top}\left(\mathbf{m}_{k-1}\right)+\mathbf{Q}_{k-1},
\end{aligned}
$$

and the update step becomes

$$
\begin{aligned}
\mathbf{v}_k & =\mathbf{y}_k-\mathbf{h}\left(\mathbf{m}_k^{-}\right), \\
\mathbf{S}_k & =\mathbf{H}_{\mathbf{x}}\left(\mathbf{m}_k^{-}\right) \mathbf{P}_k^{-} \mathbf{H}_{\mathbf{x}}^{\top}\left(\mathbf{m}_k^{-}\right)+\mathbf{R}_k, \\
\mathbf{K}_k & =\mathbf{P}_k^{-} \mathbf{H}_{\mathbf{x}}^{\top}\left(\mathbf{m}_k^{-}\right) \mathbf{S}_k^{-1}, \\
\mathbf{m}_k & =\mathbf{m}_k^{-}+\mathbf{K}_k \mathbf{v}_k, \\
\mathbf{P}_k & =\mathbf{P}_k^{-}-\mathbf{K}_k \mathbf{S}_k \mathbf{K}_k^{\top} .
\end{aligned}
$$


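In essence, this means replacing the matrices $\mathbf{A}_{k-1}$ and $\mathbf{H}_k$ of the Kalman filter with the Jacobian $\mathbf{F}_{\mathbf{x}}$ of the transition function and the Jacobian $\mathbf{H}_{\mathbf{x}}$ of the measurement function, respectively. The form above assumes additive noise; other versions exist that handle non-additive noise, which could be relevant.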
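
As a rough sketch of one EKF step for a scalar state, the derivative of each function plays the role of the Jacobian. The function name `ekf_step`, the example model `f(x) = x + 0.1 sin(x)`, `h(x) = x^2`, and all numeric values are assumptions chosen for illustration only.

```python
import math


def ekf_step(m, P, y, f, F, h, H, Q, R):
    """One EKF prediction and update for a scalar state with additive noise.

    f, h are the dynamic and measurement functions; F, H return their
    derivatives (the scalar Jacobians).
    """
    # Prediction: linearize f around the previous mean.
    m_pred = f(m)
    F_m = F(m)
    P_pred = F_m * P * F_m + Q
    # Update: linearize h around the predicted mean.
    H_m = H(m_pred)
    v = y - h(m_pred)
    S = H_m * P_pred * H_m + R
    K = P_pred * H_m / S
    return m_pred + K * v, P_pred - K * S * K


# Hypothetical non-linear model, for illustration only.
f = lambda x: x + 0.1 * math.sin(x)
F = lambda x: 1.0 + 0.1 * math.cos(x)
h = lambda x: x * x
H = lambda x: 2.0 * x

m, P = ekf_step(1.0, 0.5, y=1.3, f=f, F=F, h=h, H=H, Q=0.01, R=0.1)
```

With linear `f` and `h` the step reduces exactly to the Kalman filter, which is a convenient sanity check.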

Unscented Kalman Filtering (UKF) uses the unscented transformation to form Gaussian approximations of the same form as in the EKF.

Prediction:


1. Form the sigma points:
$$
\begin{aligned}
\mathcal{X}_{k-1}^{(0)} & =\mathbf{m}_{k-1}, \\
\mathcal{X}_{k-1}^{(i)} & =\mathbf{m}_{k-1}+\sqrt{n+\lambda}\left[\sqrt{\mathbf{P}_{k-1}}\right]_i, \\
\mathcal{X}_{k-1}^{(i+n)} & =\mathbf{m}_{k-1}-\sqrt{n+\lambda}\left[\sqrt{\mathbf{P}_{k-1}}\right]_i, \quad i=1, \ldots, n,
\end{aligned}
$$



2. Propagate the sigma points through the dynamic model:
$$
\hat{\mathcal{X}}_k^{(i)}=\mathbf{f}\left(\mathcal{X}_{k-1}^{(i)}\right), \quad i=0, \ldots, 2 n .
$$


3. Compute the predicted mean $\mathbf{m}_k^{-}$ and the predicted covariance $\mathbf{P}_k^{-}$:
$$
\begin{aligned}
& \mathbf{m}_k^{-}=\sum_{i=0}^{2 n} W_i^{(\mathrm{m})} \hat{\mathcal{X}}_k^{(i)}, \\
& \mathbf{P}_k^{-}=\sum_{i=0}^{2 n} W_i^{(\mathrm{c})}\left(\hat{\mathcal{X}}_k^{(i)}-\mathbf{m}_k^{-}\right)\left(\hat{\mathcal{X}}_k^{(i)}-\mathbf{m}_k^{-}\right)^{\top}+\mathbf{Q}_{k-1},
\end{aligned}
$$
where the weights $W_i^{(\mathrm{m})}$ and $W_i^{(\mathrm{c})}$ were defined in Equation (5.77).

Update:

1. Form the sigma points.
2. Propagate the sigma points through the measurement model.
3. Compute the predicted measurement mean $\mu_k$, the measurement covariance $S_k$, and the cross-covariance $C_k$.
4. Compute the filter gain $K_k = C_k S_k^{-1}$, the filtered state mean $m_k$, and the covariance $P_k$.
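
The UKF prediction and update steps can be sketched for a scalar state ($n = 1$), where the matrix square root is an ordinary square root. The weights of Equation (5.77) are not reproduced in these notes, so the code assumes the commonly used unscented transform weights with parameters `alpha`, `beta`, `kappa` and $\lambda = \alpha^2 (n + \kappa) - n$; the function names and default parameter values are also assumptions.

```python
import math


def ut_weights(alpha, beta, kappa, n=1):
    """Assumed standard unscented transform weights (mean and covariance)."""
    lam = alpha ** 2 * (n + kappa) - n
    wm = [lam / (n + lam)] + [1.0 / (2.0 * (n + lam))] * (2 * n)
    wc = [lam / (n + lam) + (1.0 - alpha ** 2 + beta)] + wm[1:]
    return lam, wm, wc


def sigma_points(m, P, lam, n=1):
    """Sigma points for a scalar state: the mean and one point on each side."""
    s = math.sqrt((n + lam) * P)
    return [m, m + s, m - s]


def ukf_step(m, P, y, f, h, Q, R, alpha=1.0, beta=0.0, kappa=2.0):
    """One UKF prediction and update for a scalar state with additive noise."""
    lam, wm, wc = ut_weights(alpha, beta, kappa)
    # Prediction: propagate sigma points through the dynamic model.
    X_hat = [f(x) for x in sigma_points(m, P, lam)]
    m_pred = sum(w * x for w, x in zip(wm, X_hat))
    P_pred = sum(w * (x - m_pred) ** 2 for w, x in zip(wc, X_hat)) + Q
    # Update: new sigma points, propagated through the measurement model.
    X = sigma_points(m_pred, P_pred, lam)
    Y = [h(x) for x in X]
    mu = sum(w * yy for w, yy in zip(wm, Y))
    S = sum(w * (yy - mu) ** 2 for w, yy in zip(wc, Y)) + R
    C = sum(w * (x - m_pred) * (yy - mu) for w, x, yy in zip(wc, X, Y))
    K = C / S
    return m_pred + K * (y - mu), P_pred - K * S * K


# Hypothetical non-linear model, for illustration only.
m, P = ukf_step(1.0, 0.5, y=1.3,
                f=lambda x: x + 0.1 * math.sin(x),
                h=lambda x: x * x, Q=0.01, R=0.1)
```

Note that no derivatives are needed here, in contrast to the EKF; only evaluations of `f` and `h` at the sigma points.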

## Chapter 8: Rauch-Tung-Striebel (RTS) smoother

$$
p\left(\mathbf{x}_k \mid \mathbf{y}_{1:T}\right)=\mathrm{N}\left(\mathbf{x}_k \mid \mathbf{m}_k^s, \mathbf{P}_k^s\right),
$$

The predictions are the same as in the Kalman filter

$$
\begin{aligned}
\mathbf{m}_{k+1}^{-} & =\mathbf{A}_{k} \mathbf{m}_{k} \\
\mathbf{P}_{k+1}^{-} & =\mathbf{A}_{k} \mathbf{P}_{k} \mathbf{A}_{k}^{\top}+\mathbf{Q}_{k},
\end{aligned}
$$

Backward recursion


$$
\begin{aligned}
\mathbf{G}_k & =\mathbf{P}_k \mathbf{A}_k^{\top}\left[\mathbf{P}_{k+1}^{-}\right]^{-1}, \\
\mathbf{m}_k^s & =\mathbf{m}_k+\mathbf{G}_k\left[\mathbf{m}_{k+1}^s-\mathbf{m}_{k+1}^{-}\right], \\
\mathbf{P}_k^s & =\mathbf{P}_k+\mathbf{G}_k\left[\mathbf{P}_{k+1}^s-\mathbf{P}_{k+1}^{-}\right] \mathbf{G}_k^{\top} .
\end{aligned}
$$
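
A minimal sketch of the backward recursion for a scalar state is given below. It assumes the filtered means and variances from a Kalman filter are already available as lists, and that $A$ and $Q$ are constant over time; the function name and the example numbers are hypothetical.

```python
def rts_smoother(ms, Ps, A, Q):
    """Scalar RTS smoother.

    ms, Ps: filtered means and variances for steps 0..T (Kalman filter output).
    Returns the smoothed means and variances for the same steps.
    """
    ms_s, Ps_s = list(ms), list(Ps)   # the last step equals the filter result
    for k in range(len(ms) - 2, -1, -1):
        m_pred = A * ms[k]                # m_{k+1}^-
        P_pred = A * Ps[k] * A + Q        # P_{k+1}^-
        G = Ps[k] * A / P_pred            # smoother gain G_k
        ms_s[k] = ms[k] + G * (ms_s[k + 1] - m_pred)
        Ps_s[k] = Ps[k] + G * (Ps_s[k + 1] - P_pred) * G
    return ms_s, Ps_s


# Hypothetical filter output for two time steps.
ms_s, Ps_s = rts_smoother([0.0, 1.0], [1.0, 0.5], A=1.0, Q=1.0)
```

The recursion starts from the last filtering result, since at $k = T$ the smoothing and filtering distributions coincide.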

## Particle Filtering

The main issue in Bayesian inference is computing the posterior, or an expectation over the posterior, i.e.,

$$
\mathbb{E}\left[\mathrm{g}\left(\mathbf{x}\right) \mid {y}_{1:T}\right]=\int\mathrm{g}\left(\mathbf{x}\right) p(x \mid y_{1:T}) dx,
$$

where $g$ is an arbitrary (possibly nonlinear) function. Monte Carlo methods provide a numerical way of approximating the expectation. In a perfect setting, we simply draw $N$ samples $\mathbf{x}^{(i)}$ from the posterior and compute the average, i.e.,

$$
\mathbb{E}\left[\mathrm{g}\left(\mathbf{x}\right)  \mid {y}_{1:T}\right] \approx \frac{1}{N} \sum_{i=1}^N \mathrm{g}\left(\mathbf{x}^{(i)}\right) .
$$

However, this method can be computationally expensive, especially if $g$ is nonlinear, and it might even be impossible to sample from the posterior. Hence, we introduce importance sampling.

Importance sampling builds on the assumption that we can approximate the posterior using an importance distribution $\pi$ from which we can easily draw samples. We draw $N$ samples,

$$
\mathbf{x}^{(i)} \sim \pi(\mathbf{x} \mid \mathbf{y}_{1:T}), \quad i = 1,\ldots,N,
$$

and form the approximation 

$$
\begin{aligned}
\mathrm{E}\left[\mathrm{g}(\mathbf{x}) \mid \mathbf{y}_{1: T}\right] & \approx \frac{1}{N} \sum_{i=1}^N \frac{p\left(\mathbf{x}^{(i)} \mid \mathbf{y}_{1: T}\right)}{\pi\left(\mathbf{x}^{(i)} \mid \mathbf{y}_{1: T}\right)} \mathrm{g}\left(\mathbf{x}^{(i)}\right) \\
& =\sum_{i=1}^N \tilde{w}^{(i)} \mathrm{g}\left(\mathbf{x}^{(i)}\right),
\end{aligned}
$$

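where $\tilde{w}^{(i)}$ is the weight of sample $i$, i.e., the ratio of the posterior density to the importance density at $\mathbf{x}^{(i)}$ divided by $N$, and the support of $\pi$ must be greater than or equal to the support of the posterior. In this way, weights are used to attribute $\textit{importance}$ to samples approximating the posterior probability density.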
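
As an illustrative sketch of the importance sampling approximation, the code below estimates $\mathrm{E}[x^2]$ under a standard normal target using a wider normal as the importance distribution. The target, the importance distribution, the function names, and the sample size are all assumptions chosen so that the true value (1) is known; in a real filtering problem the posterior density would not be available this conveniently.

```python
import math
import random


def normal_pdf(x, mean, std):
    """Density of a univariate normal distribution."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def importance_estimate(g, target_pdf, proposal_pdf, proposal_sample, n, rng):
    """Approximate E[g(x)] under target_pdf using samples from the proposal."""
    total = 0.0
    for _ in range(n):
        x = proposal_sample(rng)                 # x^(i) ~ pi
        w = target_pdf(x) / proposal_pdf(x)      # ratio p / pi
        total += w * g(x)
    return total / n


rng = random.Random(0)  # fixed seed for reproducibility
estimate = importance_estimate(
    g=lambda x: x * x,
    target_pdf=lambda x: normal_pdf(x, 0.0, 1.0),    # assumed "posterior"
    proposal_pdf=lambda x: normal_pdf(x, 0.0, 2.0),  # wider importance density
    proposal_sample=lambda r: r.gauss(0.0, 2.0),
    n=20000,
    rng=rng,
)
```

The importance distribution is deliberately wider than the target here, which keeps the ratio `w` bounded and satisfies the support condition.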


Importance sampling approximations can be used sequentially to approximate the filtering distributions of generic state space models. At each step, the expected value can be found using importance sampling. A problem in sequential importance sampling (SIS) arises when nearly all weights are equal to zero. This is called the $\textit{degeneracy problem}$. It is addressed by resampling the particles, for which several methods exist; SIS combined with a resampling step is called sequential importance resampling (SIR). Briefly, the resampling procedure can be described as follows:

1. Interpret each weight $w_k^{(i)}$ as the probability of obtaining the sample index $i$.
2. Draw new samples from this discrete distribution.
3. Set all weights to $1/N$.

This procedure is normally what is referred to as the particle filter.
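
The three resampling steps can be sketched as follows. This is the simplest (multinomial) variant, chosen here only for illustration; the function name and the example particles are hypothetical.

```python
import random


def resample(particles, weights, rng):
    """Multinomial resampling of a weighted particle set.

    Each weight is interpreted as the probability of drawing that particle's
    index; afterwards all weights are reset to 1/N.
    """
    n = len(particles)
    # choices() draws with replacement from the discrete distribution
    # defined by the weights.
    new_particles = rng.choices(particles, weights=weights, k=n)
    new_weights = [1.0 / n] * n
    return new_particles, new_weights


rng = random.Random(0)  # fixed seed for reproducibility
# Degenerate example: almost all weight sits on one particle.
particles = [-1.0, 0.0, 1.0, 5.0]
weights = [0.01, 0.01, 0.01, 0.97]
new_particles, new_weights = resample(particles, weights, rng)
```

After resampling, particles with negligible weight tend to disappear and high-weight particles are duplicated, which is how the procedure counteracts degeneracy.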


With the Rao-Blackwellized particle filter, we exploit the fact that some of the filter equations can sometimes be evaluated analytically, leaving only the rest to Monte Carlo. This is an improvement because it reduces the variance of the estimates.