<a href="https://colab.research.google.com/github/jp2011/kalman-filters-examples/blob/main/kalman_filters_basics.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>



# Kalman Filter

This is a detailed derivation of the filtering updates for the Kalman filter. The presentation closely follows the book **Bayesian Filtering and Smoothing** by *Simo Särkkä*.


Note that all letters in the text refer to vectors or matrices rather than scalars, hence no boldface in the notation below.


## Preliminaries

Before we proceed, we go through the derivation of some crucial properties of matrices and Gaussian random variables.




### Inversion of a partitioned matrix

Let $K$ be an invertible matrix partitioned as
\begin{equation}
K = \begin{pmatrix}
    A & B\\
    C & D
    \end{pmatrix}.
\end{equation}
Its inverse is then given by
\begin{equation}
    \begin{pmatrix}
    A & B\\
    C & D
    \end{pmatrix}^{-1}
    = 
    \begin{pmatrix}
    M         & - M B D^{-1}\\
    -D^{-1}CM & D^{-1} + D^{-1} C M B D^{-1}
    \end{pmatrix},
\end{equation}
where $M = (A - BD^{-1}C)^{-1}$, assuming that $D$ and the Schur complement $A - BD^{-1}C$ are invertible.
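
The formula can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the block sizes, the random seed, and the variable names are illustrative choices and not part of the derivation.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the check is reproducible

# Hypothetical block sizes: A is 3x3 and D is 2x2.
n, m = 3, 2
# Adding a multiple of the identity makes it very likely that A, D and
# the Schur complement A - B D^{-1} C are well conditioned.
A = rng.normal(size=(n, n)) + 5.0 * np.eye(n)
B = rng.normal(size=(n, m))
C = rng.normal(size=(m, n))
D = rng.normal(size=(m, m)) + 5.0 * np.eye(m)

K = np.block([[A, B], [C, D]])

D_inv = np.linalg.inv(D)
M = np.linalg.inv(A - B @ D_inv @ C)

# Assemble the inverse block by block, following the formula above.
K_inv_blocks = np.block([
    [M, -M @ B @ D_inv],
    [-D_inv @ C @ M, D_inv + D_inv @ C @ M @ B @ D_inv],
])

# Compare against a direct inverse of the full matrix.
max_abs_error = np.max(np.abs(K_inv_blocks - np.linalg.inv(K)))
print(max_abs_error)
```

The printed error should be at the level of floating-point round-off.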

### Joint distribution of a Gaussian conditioned on another Gaussian

To derive the Kalman filter, we require several properties of Gaussian random variables.

Let $x$ and $y$ be random variables with Gaussian distributions:
\begin{align} 
x        & \sim \mathcal{N}(m, P), \\
y \mid x & \sim \mathcal{N}(Hx + u, R).
\end{align}
The joint distribution of $x$ and $y$ is then given by
\begin{equation}
\left(\begin{array}{l}
x \\
y
\end{array}\right) \sim \mathcal{N}\left(\left(\begin{array}{c}
m \\
H m+u
\end{array}\right),\left(\begin{array}{cc}
P & P H^{\top} \\
H P & H P H^{\top}+R
\end{array}\right)\right).
\end{equation}
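
As an empirical sanity check of this statement, one can sample from the two-stage model and compare the sample moments with the stated joint mean and covariance. This is only a sketch: the numerical values of $m$, $P$, $H$, $u$ and $R$ below, the seed, and the sample size are hypothetical and chosen purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical parameters: x is 2-dimensional, y is 1-dimensional.
m = np.array([1.0, -0.5])
P = np.array([[2.0, 0.3], [0.3, 1.0]])
H = np.array([[1.0, 2.0]])
u = np.array([0.5])
R = np.array([[0.4]])

n_samples = 200_000
x = rng.multivariate_normal(m, P, size=n_samples)                # x ~ N(m, P)
noise = rng.multivariate_normal(np.zeros(1), R, size=n_samples)  # noise ~ N(0, R)
y = x @ H.T + u + noise                                          # y | x ~ N(Hx + u, R)

# Stack the samples of (x, y) and estimate the joint moments.
z = np.hstack([x, y])
empirical_mean = z.mean(axis=0)
empirical_cov = np.cov(z, rowvar=False)

# Moments predicted by the joint distribution stated above.
predicted_mean = np.concatenate([m, H @ m + u])
predicted_cov = np.block([[P, P @ H.T], [H @ P, H @ P @ H.T + R]])

print(np.abs(empirical_mean - predicted_mean).max())
print(np.abs(empirical_cov - predicted_cov).max())
```

Both printed discrepancies should shrink as `n_samples` grows, at the usual Monte Carlo rate.
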
#### Proof
Let $p(x,y)$ be the joint density.
\begin{align}
\log p(x, y) & = \log p(y \mid x) + \log p(x) \\
             & = -\frac{1}{2}(y - Hx - u)^\top R^{-1}(y - Hx - u)
                 -\frac{1}{2}(x - m)^\top P^{-1} (x - m) + \text{const}
\end{align}
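This is a quadratic function of $x$ and $y$, so the joint distribution must be Gaussian. Its mean follows from the law of total expectation, $\mathbb{E}[y] = H\,\mathbb{E}[x] + u = Hm + u$. Collecting the second-order terms, we obtain: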
\begin{align}
& - \frac{1}{2} y^\top R^{-1} y - \frac{1}{2}x^{\top}\big(H^\top R^{-1} H + P^{-1}\big)x + \frac{1}{2} y^\top R^{-1} H x + \frac{1}{2} x^\top H^\top R^{-1} y \\
& = -\frac{1}{2} \begin{pmatrix} x \\ y \end{pmatrix} ^ \top 
     \begin{pmatrix}
         H^\top R^{-1} H + P^{-1}      & - H^\top R^{-1} \\
         -R^{-1}H                      & R^{-1}
     \end{pmatrix}
     \begin{pmatrix} x \\ y \end{pmatrix},
\end{align}
from which the covariance matrix is obtained by inverting the precision matrix, using the partitioned-inverse formula above with $M = P$:
\begin{equation}
     \begin{pmatrix}
         H^\top R^{-1} H + P^{-1}      & - H^\top R^{-1} \\
         -R^{-1}H                      & R^{-1}
     \end{pmatrix} ^ {-1}
     = \begin{pmatrix}
           P        & P H^\top \\
           H P      & R + H P H^\top
       \end{pmatrix}.
\end{equation}
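
The inversion step can also be verified numerically: the precision matrix read off from the quadratic form, multiplied by the joint covariance with blocks $P$, $PH^\top$, $HP$ and $HPH^\top + R$, should give the identity. The sketch below assumes NumPy and uses hypothetical values for $P$, $H$ and $R$.

```python
import numpy as np

# Hypothetical parameters: x is 2-dimensional, y is 1-dimensional.
P = np.array([[2.0, 0.3], [0.3, 1.0]])
H = np.array([[1.0, 2.0]])
R = np.array([[0.4]])

P_inv = np.linalg.inv(P)
R_inv = np.linalg.inv(R)

# Precision matrix taken from the second-order terms of log p(x, y).
precision = np.block([
    [H.T @ R_inv @ H + P_inv, -H.T @ R_inv],
    [-R_inv @ H, R_inv],
])

# Joint covariance of (x, y).
covariance = np.block([
    [P, P @ H.T],
    [H @ P, R + H @ P @ H.T],
])

# If the two matrices are inverses, their product is the identity.
residual = np.max(np.abs(precision @ covariance - np.eye(3)))
print(residual)
```

Note that the lower-left block must be $HP$ for the product to be the identity and for the covariance to be symmetric.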



## Linear Gaussian Model

We have a time-evolving state process $x_k$, of which we observe a noisy version $y_k$ at time $k$. Assuming linear state transitions and Gaussian noise for both the process and the measurements, the model is defined as follows:
\begin{align}
x_k & = A_{k-1} x_{k-1} + q_{k-1}, \\
y_k & = H_k x_k + r_k,
\end{align}
where $A_{k}$, $H_k$, $Q_k$, and $R_k$ are given and 
\begin{align}
q_{k-1} & \sim \mathcal{N}(0, Q_{k-1}), \\
r_{k}   & \sim \mathcal{N}(0, R_k),
\end{align}
are the process and measurement noise, respectively.
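
To make the model concrete, here is a minimal simulation sketch. Everything specific in it is an assumption made for illustration: a constant-velocity state (position and velocity), time-invariant matrices $A$, $H$, $Q$ and $R$, a fixed initial state $x_0$, and the chosen noise levels.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical time-invariant model: state = [position, velocity].
dt = 1.0
A = np.array([[1.0, dt], [0.0, 1.0]])  # state transition matrix
H = np.array([[1.0, 0.0]])             # only the position is observed
Q = 0.01 * np.eye(2)                   # process noise covariance
R = np.array([[0.25]])                 # measurement noise covariance

n_steps = 50
x = np.zeros((n_steps + 1, 2))
y = np.full((n_steps + 1, 1), np.nan)  # y[0] stays NaN: no measurement at k = 0
x[0] = np.array([0.0, 1.0])            # hypothetical initial state x_0

for k in range(1, n_steps + 1):
    q = rng.multivariate_normal(np.zeros(2), Q)  # q_{k-1} ~ N(0, Q)
    r = rng.multivariate_normal(np.zeros(1), R)  # r_k ~ N(0, R)
    x[k] = A @ x[k - 1] + q                      # state transition
    y[k] = H @ x[k] + r                          # noisy measurement

print(x[-1], y[-1])
```

The arrays `x` and `y` hold one simulated state trajectory and the corresponding measurements, indexed by the time step $k$.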

### Filtering equations

WIP, check back soon.