# 3.2 Kalman Filter

Table 3.1 The Kalman filter algorithm for linear Gaussian state transitions and measurements


Algorithm Kalman_filter($\mu_{t-1}$, $\Sigma_{t-1}$, $u_t$, $z_t$)

- Prediction

  - $\bar{\mu}_t = A_t \mu_{t-1} + B_t u_t $

  - $\bar{\Sigma}_t = A_t \Sigma_{t-1} A^T_t + R_t$

- Update

  - $K_t = \bar{\Sigma}_t C^T_t \left(C_t \bar{\Sigma}_t C^T_t + Q_t \right)^{-1}$

  - $ \mu_t = \bar{\mu}_t + K_t \left(z_t - C_t \bar{\mu}_t \right)$

  - $ \Sigma_t = \left(I - K_t C_t\right) \bar{\Sigma}_t $

return $\mu_t$, $\Sigma_t$

- $C_t$: Measurement matrix of size $k \times n$, where $k$ is the dimension of the measurement vector $z_t$ and $n$ is the dimension of the state $x_t$
- $z_t$: Measurement vector
- $Q_t$: Covariance matrix of the measurement noise
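The steps of Table 3.1 can be sketched in NumPy as follows. This is a minimal illustration, not a reference implementation: the function name `kalman_filter` and the choice to pass $A_t$, $B_t$, $C_t$, $R_t$, $Q_t$ as explicit arguments are assumptions of this sketch, and the one-dimensional numbers at the end are made up for illustration.

```python
import numpy as np


def kalman_filter(mu_prev, Sigma_prev, u, z, A, B, C, R, Q):
    """One Kalman filter step following Table 3.1.

    A, B, C, R, Q are passed explicitly (an assumption of this sketch);
    the table treats them as known, time-indexed matrices.
    """
    # Prediction
    mu_bar = A @ mu_prev + B @ u
    Sigma_bar = A @ Sigma_prev @ A.T + R

    # Update
    K = Sigma_bar @ C.T @ np.linalg.inv(C @ Sigma_bar @ C.T + Q)  # Kalman gain
    mu = mu_bar + K @ (z - C @ mu_bar)
    Sigma = (np.eye(len(mu)) - K @ C) @ Sigma_bar
    return mu, Sigma


# Hypothetical 1-D example: static state, no process noise, direct measurement.
mu0, Sigma0 = np.array([0.0]), np.array([[1.0]])
A = B = C = np.array([[1.0]])
R, Q = np.array([[0.0]]), np.array([[1.0]])
mu1, Sigma1 = kalman_filter(mu0, Sigma0, np.array([0.0]), np.array([2.0]), A, B, C, R, Q)
```

With equal prior and measurement variances, the gain is $0.5$, so the updated mean lies halfway between the prior mean $0$ and the measurement $2$, and the variance is halved.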


## Mathematical Derivation of the KF

### Linear Gaussian System

$x_t = A_tx_{t-1} + B_tu_t + \epsilon_t$

$ z_t = C_t x_t + \delta_t$
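To make the model concrete, the sketch below draws one state transition and one measurement from these two equations, taking $\epsilon_t$ and $\delta_t$ to be zero-mean Gaussian noise with covariances $R_t$ and $Q_t$. The helper name `simulate_step` and the constant-velocity matrices are hypothetical choices for illustration only.

```python
import numpy as np


def simulate_step(x_prev, u, A, B, C, R, Q, rng):
    """Draw x_t and z_t from the linear Gaussian model for one time step."""
    eps = rng.multivariate_normal(np.zeros(A.shape[0]), R)    # process noise
    x = A @ x_prev + B @ u + eps
    delta = rng.multivariate_normal(np.zeros(C.shape[0]), Q)  # measurement noise
    z = C @ x + delta
    return x, z


# Hypothetical constant-velocity model: state = (position, velocity).
dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.5 * dt**2], [dt]])   # control input = acceleration
C = np.array([[1.0, 0.0]])            # only the position is measured
R = 0.01 * np.eye(2)
Q = np.array([[0.25]])

rng = np.random.default_rng(0)
x, z = simulate_step(np.array([0.0, 1.0]), np.array([0.2]), A, B, C, R, Q, rng)
```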

### Prediction

$\bar{bel}(x_t) = \int p(x_t \mid x_{t-1}, u_t) bel(x_{t-1})dx_{t-1}$

$bel(x_{t-1}) \sim \mathcal{N}(x_{t-1}; \mu_{t-1}, \Sigma_{t-1})$

$p(x_t \mid x_{t-1}, u_t) \sim \mathcal{N}(x_t; A_t x_{t-1} + B_t u_t, R_t)$

We begin by writing the equation in its Gaussian form:

$\bar{bel}(x_t) = \eta \int \exp\left\lbrace-\frac{1}{2}\left(x_t - A_t x_{t-1} - B_t u_t\right)^TR^{-1}_t\left(x_t - A_t x_{t-1} - B_t u_t\right) \right\rbrace \exp\left\lbrace-\frac{1}{2}(x_{t-1}-\mu_{t-1})^T\Sigma^{-1}_{t-1}(x_{t-1}-\mu_{t-1})\right\rbrace dx_{t-1}$

In short, we have

$\bar{bel}(x_t) = \eta \int \exp\left\lbrace-L_t \right\rbrace dx_{t-1}$

$L_t = \frac{1}{2}\left(x_t - A_t x_{t-1} - B_t u_t\right)^TR^{-1}_t\left(x_t - A_t x_{t-1} - B_t u_t\right) + \frac{1}{2}(x_{t-1}-\mu_{t-1})^T\Sigma^{-1}_{t-1}(x_{t-1}-\mu_{t-1})$

Notice that $L_t$ is quadratic in $x_{t-1}$; it is also quadratic in $x_t$.

We can decompose $L_t$ into $L_t(x_{t-1},x_t)$ and $L_t(x_t)$ to eliminate the integral form.

$L_t = L_t(x_{t-1}, x_t) + L_t(x_t)$

$\bar{bel}(x_t) = \eta \int \exp\left\lbrace-L_t \right\rbrace dx_{t-1}$

$= \eta \int \exp\left\lbrace-L_t(x_{t-1}, x_t) - L_t(x_t)\right\rbrace dx_{t-1}$

$= \eta \exp\left\lbrace-L_t(x_t)\right\rbrace \int \exp\left\lbrace-L_t(x_{t-1}, x_t)\right\rbrace dx_{t-1}$

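If $L_t(x_{t-1}, x_t)$ is chosen so that the value of the remaining integral does not depend on $x_t$, that integral is a constant with respect to the problem of estimating the belief distribution over $x_t$ and can be absorbed into the normalizer $\eta$.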

$\bar{bel}(x_t) = \eta \exp\left\lbrace-L_t(x_t)\right\rbrace $

Let's now perform this decomposition.

$L_t = \frac{1}{2}\left(x_t - A_t x_{t-1} - B_t u_t\right)^TR^{-1}_t\left(x_t - A_t x_{t-1} - B_t u_t\right) + \frac{1}{2}(x_{t-1}-\mu_{t-1})^T\Sigma^{-1}_{t-1}(x_{t-1}-\mu_{t-1})$

$\frac{\partial L_t}{\partial x_{t-1}} = -A^T_tR^{-1}_t\left(x_t - A_t x_{t-1}- B_t u_t\right) + \Sigma^{-1}_{t-1}\left(x_{t-1} - \mu_{t-1}\right)$

$\frac{\partial^2 L_t}{\partial x^2_{t-1}} = A^T_t R^{-1}_t A_t + \Sigma^{-1}_{t-1} = \Psi^{-1}_t$

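$\Psi_t$ defines the curvature of $L_t(x_{t-1}, x_t)$: its inverse $\Psi^{-1}_t$ is the second derivative with respect to $x_{t-1}$.

Setting $\frac{\partial L_t}{\partial x_{t-1}} = 0$ gives us the mean of the corresponding Gaussian over $x_{t-1}$: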

$A^T_tR^{-1}_t\left(x_t - A_t x_{t-1}- B_t u_t\right) = \Sigma^{-1}_{t-1}\left(x_{t-1} - \mu_{t-1}\right)$

This expression is now solved for $x_{t-1}$:

$A^T_tR^{-1}_t\left(x_t - A_t x_{t-1}- B_t u_t\right) = \Sigma^{-1}_{t-1}\left(x_{t-1} - \mu_{t-1}\right)$

$\Longleftrightarrow A^T_tR^{-1}_t\left(x_t - B_t u_t\right) - A^T_tR^{-1}_tA_t x_{t-1}= \Sigma^{-1}_{t-1}x_{t-1} - \Sigma^{-1}_{t-1}\mu_{t-1}$

$\Longleftrightarrow A^T_tR^{-1}_tA_t x_{t-1} + \Sigma^{-1}_{t-1}x_{t-1}= A^T_tR^{-1}_t\left(x_t - B_t u_t\right) + \Sigma^{-1}_{t-1}\mu_{t-1}$

$\Longleftrightarrow \left(A^T_tR^{-1}_tA_t + \Sigma^{-1}_{t-1}\right)x_{t-1}= A^T_tR^{-1}_t\left(x_t - B_t u_t\right) + \Sigma^{-1}_{t-1}\mu_{t-1}$

$\Longleftrightarrow \Psi^{-1}_t x_{t-1}= A^T_tR^{-1}_t\left(x_t - B_t u_t\right) + \Sigma^{-1}_{t-1}\mu_{t-1}$

$\Longleftrightarrow x_{t-1} = \Psi_t \left[A^T_tR^{-1}_t\left(x_t - B_t u_t\right) + \Sigma^{-1}_{t-1}\mu_{t-1}\right]$

Thus, we now have a quadratic function $L_t(x_{t-1}, x_t)$ defined as follows

$L_t(x_{t-1}, x_t) = \frac{1}{2}\left(x_{t-1} - \Psi_t \left[A^T_tR^{-1}_t\left(x_t - B_t u_t\right) + \Sigma^{-1}_{t-1}\mu_{t-1}\right]\right)^T \Psi^{-1}_t\left(x_{t-1} - \Psi_t \left[A^T_tR^{-1}_t\left(x_t - B_t u_t\right) + \Sigma^{-1}_{t-1}\mu_{t-1}\right]\right)$

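Because $\exp\left\lbrace -L_t(x_{t-1}, x_t) \right\rbrace$ is an unnormalized Gaussian in $x_{t-1}$ with covariance $\Psi_t$, we have

$\int \det\left(2\pi\Psi_t\right)^{-\frac{1}{2}} \exp\left\lbrace -L_t(x_{t-1}, x_t) \right\rbrace dx_{t-1} = 1$

$\int \exp\left\lbrace -L_t(x_{t-1}, x_t) \right\rbrace dx_{t-1} = \det\left(2\pi\Psi_t\right)^{\frac{1}{2}}$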

The important thing to notice is that the value of this integral is independent of $x_t$, our target variable.
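This independence can be checked numerically in the scalar case, where the integral should equal $\sqrt{2\pi\Psi_t}$ for every $x_t$. The sketch below uses arbitrary illustrative values for $A_t$, $B_t$, $R_t$, $\Sigma_{t-1}$, $\mu_{t-1}$ and $u_t$ (all scalars here, an assumption of this example) and a plain Riemann sum on a fixed grid.

```python
import numpy as np

# Arbitrary scalar values standing in for A_t, B_t, R_t, Sigma_{t-1}, mu_{t-1}, u_t.
a, b, r, sigma2, mu, u = 0.9, 0.5, 0.2, 1.5, 1.0, 2.0
psi = 1.0 / (a * a / r + 1.0 / sigma2)

# Fixed integration grid over x_{t-1}, wide enough to cover the Gaussian mass.
x_prev = np.linspace(-50.0, 50.0, 400001)
dx = x_prev[1] - x_prev[0]


def integral_over_x_prev(x_t):
    """Riemann-sum approximation of the integral of exp(-L_t(x_{t-1}, x_t))."""
    m = psi * (a * (x_t - b * u) / r + mu / sigma2)   # mean in x_{t-1}
    L = 0.5 * (x_prev - m) ** 2 / psi
    return np.sum(np.exp(-L)) * dx


values = [integral_over_x_prev(x_t) for x_t in (-3.0, 0.0, 4.0)]
expected = np.sqrt(2.0 * np.pi * psi)
```

Changing $x_t$ only shifts the location of the Gaussian in $x_{t-1}$; the area under it stays the same.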

$ \bar{bel}(x_t) = \eta \exp\left\lbrace-L_t(x_t)\right\rbrace \int \exp\left\lbrace-L_t(x_{t-1}, x_t)\right\rbrace dx_{t-1}$

$ = \eta \exp\left\lbrace-L_t(x_t)\right\rbrace$

Notice that the normalizers $\eta$ are not the same in both lines.

It remains to determine the function $L_t(x_t)$ which is the difference of $L_t$ and $L_t(x_{t-1}, x_t)$

$L_t(x_t) = L_t - L_t(x_{t-1},x_t)$

$ = \frac{1}{2}\left(x_t - A_t x_{t-1} - B_t u_t\right)^TR^{-1}_t\left(x_t - A_t x_{t-1} - B_t u_t\right) + \frac{1}{2}(x_{t-1}-\mu_{t-1})^T\Sigma^{-1}_{t-1}(x_{t-1}-\mu_{t-1}) - \frac{1}{2}\left(x_{t-1} - \Psi_t \left[A^T_tR^{-1}_t\left(x_t - B_t u_t\right) + \Sigma^{-1}_{t-1}\mu_{t-1}\right]\right)^T \Psi^{-1}_t\left(x_{t-1} - \Psi_t \left[A^T_tR^{-1}_t\left(x_t - B_t u_t\right) + \Sigma^{-1}_{t-1}\mu_{t-1}\right]\right)$

where $\Psi_t = \left[A^T_t R^{-1}_t A_t + \Sigma^{-1}_{t-1}\right]^{-1}$

$L_t(x_t)$ does not depend on $x_{t-1}$ because all terms that contain $x_{t-1}$ cancel out.

$L_t(x_t) = \frac{1}{2}(x_t - B_tu_t)^T R^{-1}_t (x_t - B_tu_t) + \frac{1}{2}\mu^T_{t-1}\Sigma^{-1}_{t-1}\mu_{t-1} - \frac{1}{2}\left[A^T_t R^{-1}_t (x_t - B_tu_t) + \Sigma^{-1}_{t-1}\mu_{t-1}\right]^T(A^T_t R^{-1}_tA_t + \Sigma^{-1}_{t-1})^{-1}\left[A^T_t R^{-1}_t (x_t - B_tu_t) + \Sigma^{-1}_{t-1}\mu_{t-1}\right]$

$L_t(x_t)$ is quadratic in $x_t$. This observation means that $\bar{bel}(x_t)$ is indeed a normal distribution.

- The mean of this distribution: the minimum of $L_t(x_t)$.
- The covariance of this distribution: the inverse of the curvature (second derivative) of $L_t(x_t)$.

$\frac{\partial L_t(x_t)}{\partial x_t} = R^{-1}_t (x_t - B_t u_t) - R^{-1}_tA_t(A^T_t R^{-1}_tA_t + \Sigma^{-1}_{t-1})^{-1}\left[A^T_tR^{-1}_t(x_t - B_tu_t) + \Sigma^{-1}_{t-1} \mu_{t-1}\right]$

$= \left[R^{-1}_t - R^{-1}_tA_t\left(A^T_tR^{-1}_tA_t + \Sigma^{-1}_{t-1}\right)^{-1}A^T_tR^{-1}_t \right](x_t - B_tu_t) - R^{-1}_tA_t\left(A^T_t R^{-1}_tA_t + \Sigma^{-1}_{t-1}\right)^{-1}\Sigma^{-1}_{t-1}\mu_{t-1}$

By the matrix inversion lemma, the bracketed factor equals $\left(R_t + A_t\Sigma_{t-1}A^T_t\right)^{-1}$, so

$\frac{\partial L_t(x_t)}{\partial x_t} = \left(R_t + A_t\Sigma_{t-1}A^T_t\right)^{-1}(x_t - B_tu_t) - R^{-1}_tA_t\left(A^T_t R^{-1}_tA_t + \Sigma^{-1}_{t-1}\right)^{-1}\Sigma^{-1}_{t-1}\mu_{t-1}$
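The matrix inversion lemma states that, for invertible $R$ and $Q$ and any conformable $P$, $\left(R + PQP^T\right)^{-1} = R^{-1} - R^{-1}P\left(Q^{-1} + P^TR^{-1}P\right)^{-1}P^TR^{-1}$; here $P = A_t$, $R = R_t$ and $Q = \Sigma_{t-1}$. As a sanity check, the identity can be verified numerically on random matrices. The helper `random_spd` and the dimension are hypothetical choices for this sketch.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3


def random_spd(n, rng):
    """Random symmetric positive definite matrix (hypothetical helper)."""
    M = rng.standard_normal((n, n))
    return M @ M.T + n * np.eye(n)


A = rng.standard_normal((n, n))
R = random_spd(n, rng)
Sigma_prev = random_spd(n, rng)

R_inv = np.linalg.inv(R)
Psi = np.linalg.inv(A.T @ R_inv @ A + np.linalg.inv(Sigma_prev))

lhs = R_inv - R_inv @ A @ Psi @ A.T @ R_inv       # bracketed factor of the derivative
rhs = np.linalg.inv(R + A @ Sigma_prev @ A.T)     # form given by the lemma
```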

The minimum of $L_t(x_t)$ is attained when the first derivative is zero.

$\left(R_t + A_t\Sigma_{t-1}A^T_t\right)^{-1}(x_t - B_tu_t) = R^{-1}_tA_t\left(A^T_t R^{-1}_tA_t + \Sigma^{-1}_{t-1}\right)^{-1}\Sigma^{-1}_{t-1}\mu_{t-1}$

Solving this for the target variable $x_t$ gives us

$x_t = B_tu_t + \left(R_t + A_t\Sigma_{t-1}A^T_t\right)R^{-1}_tA_t\left(A^T_t R^{-1}_tA_t + \Sigma^{-1}_{t-1}\right)^{-1}\Sigma^{-1}_{t-1}\mu_{t-1}$

$= B_tu_t + \left(A_t + A_t\Sigma_{t-1}A^T_tR^{-1}_tA_t\right)\left(\Sigma_{t-1}A^T_t R^{-1}_tA_t + I\right)^{-1}\mu_{t-1}$

$= B_tu_t + A_t\left(I + \Sigma_{t-1}A^T_tR^{-1}_tA_t\right)\left(\Sigma_{t-1}A^T_t R^{-1}_tA_t + I\right)^{-1}\mu_{t-1}$

$=B_tu_t + A_t\mu_{t-1}$

$\frac{\partial^2 L_t(x_t)}{\partial x^2_t} = \left(R_t + A_t\Sigma_{t-1}A^T_t\right)^{-1}$

This is the curvature of the quadratic function $L_t(x_t)$, whose inverse is the covariance of the belief $\bar{bel}(x_t)$

$\bar{\mu}_t = A_t\mu_{t-1} + B_tu_t$

$\bar{\Sigma}_t = A_t\Sigma_{t-1}A^T_t + R_t$
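The simplification of the minimizer to $A_t\mu_{t-1} + B_tu_t$ can also be checked numerically. The sketch below evaluates the unsimplified expression for the minimum of $L_t(x_t)$ on random inputs and compares it with the predicted mean. The dimensions, the seed, and the helper `random_spd` are arbitrary choices made for this example.

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 3, 2   # state and control dimensions (arbitrary)


def random_spd(n, rng):
    """Random symmetric positive definite matrix (hypothetical helper)."""
    M = rng.standard_normal((n, n))
    return M @ M.T + n * np.eye(n)


A = rng.standard_normal((n, n))
B = rng.standard_normal((n, m))
u = rng.standard_normal(m)
mu_prev = rng.standard_normal(n)
R = random_spd(n, rng)
Sigma_prev = random_spd(n, rng)

R_inv = np.linalg.inv(R)
Sigma_prev_inv = np.linalg.inv(Sigma_prev)
Psi = np.linalg.inv(A.T @ R_inv @ A + Sigma_prev_inv)

# Minimizer of L_t(x_t) before simplification.
x_min = B @ u + (R + A @ Sigma_prev @ A.T) @ R_inv @ A @ Psi @ Sigma_prev_inv @ mu_prev

# Predicted mean after simplification.
mu_bar = A @ mu_prev + B @ u
```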



### Measurement Update

$bel(x_t) = \eta \cdot p(z_t \mid x_t) \bar{bel}(x_t)$

$p(z_t \mid x_t) \sim \mathcal{N}\left(z_t; C_tx_t, Q_t\right)$

$\bar{bel}(x_t) \sim \mathcal{N}\left(x_t;\bar{\mu}_t, \bar{\Sigma}_t\right)$

$bel(x_t) = \eta \exp\left\lbrace -J_t \right\rbrace$

$J_t = \frac{1}{2}\left(z_t - C_tx_t\right)^TQ^{-1}_t\left(z_t - C_tx_t\right) + \frac{1}{2}\left(x_t - \bar{\mu}_t\right)^T\bar{\Sigma}^{-1}_t\left(x_t - \bar{\mu}_t\right)$

To calculate its parameters, we calculate the first two derivatives of $J_t$ with respect to $x_t$.

$\frac{\partial J_t}{\partial x_t}=-C^T_tQ^{-1}_t\left(z_t - C_tx_t\right) + \bar{\Sigma}^{-1}_t\left(x_t - \bar{\mu}_t\right)$

$\frac{\partial^2 J_t}{\partial x^2_t} = C^T_t Q^{-1}_tC_t + \bar{\Sigma}^{-1}_t$

The second derivative is the inverse of the covariance of $bel(x_t)$.

$\Sigma_t = \left(C^T_t Q^{-1}_tC_t + \bar{\Sigma}^{-1}_t\right)^{-1}$

The mean of $bel(x_t)$ is the minimum of this quadratic function, which is now calculated by setting the first derivative of $J_t$ to zero and substituting $\mu_t$ for $x_t$.

$C^T_tQ^{-1}_t\left(z_t - C_t\mu_t\right) = \bar{\Sigma}^{-1}_t\left(\mu_t - \bar{\mu}_t\right)$

The expression on the left of the equal sign can be transformed as follows

$C^T_tQ^{-1}_t\left(z_t - C_t\mu_t\right)$

$ = C^T_tQ^{-1}_t\left(z_t - C_t\mu_t + C_t\bar{\mu}_t - C_t\bar{\mu}_t\right)$

$ = C^T_tQ^{-1}_t\left(z_t - C_t\bar{\mu}_t\right) - C^T_tQ^{-1}_tC_t\left(\mu_t - \bar{\mu}_t\right)$


Substituting this back into the equation above gives

$C^T_tQ^{-1}_t\left(z_t - C_t\mu_t\right) = \bar{\Sigma}^{-1}_t\left(\mu_t - \bar{\mu}_t\right)$

$\Longleftrightarrow C^T_tQ^{-1}_t\left(z_t - C_t\bar{\mu}_t\right) - C^T_tQ^{-1}_tC_t\left(\mu_t - \bar{\mu}_t\right) = \bar{\Sigma}^{-1}_t\left(\mu_t - \bar{\mu}_t\right)$

$\Longleftrightarrow C^T_tQ^{-1}_t\left(z_t - C_t\bar{\mu}_t\right) = C^T_tQ^{-1}_tC_t\left(\mu_t - \bar{\mu}_t\right) + \bar{\Sigma}^{-1}_t\left(\mu_t - \bar{\mu}_t\right)$

$\Longleftrightarrow C^T_tQ^{-1}_t\left(z_t - C_t\bar{\mu}_t\right) = \left(C^T_tQ^{-1}_tC_t + \bar{\Sigma}^{-1}_t\right)\left(\mu_t - \bar{\mu}_t\right)$

$\Longleftrightarrow C^T_tQ^{-1}_t\left(z_t - C_t\bar{\mu}_t\right) = \Sigma^{-1}_t \left(\mu_t - \bar{\mu}_t\right)$

Hence we have

$\Sigma_t C^T_tQ^{-1}_t\left(z_t - C_t\bar{\mu}_t\right) = \left(\mu_t - \bar{\mu}_t\right)$

We now define the Kalman gain as

$K_t = \Sigma_t C^T_tQ^{-1}_t$

and obtain

$\mu_t = \bar{\mu}_t + K_t\left(z_t - C_t \bar{\mu}_t\right)$

The Kalman gain, defined above as a function of $\Sigma_t$, can also be expressed in terms of $\bar{\Sigma}_t$ alone (refer to 3.45).

$K_t = \bar{\Sigma}_t C^T_t \left(C_t \bar{\Sigma}_t C^T_t + Q_t\right)^{-1}$
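The intermediate steps can be reconstructed as follows, as a worked sketch that uses only $\Sigma^{-1}_t = C^T_t Q^{-1}_t C_t + \bar{\Sigma}^{-1}_t$ from above. The trick is to multiply by the identity $\left(C_t \bar{\Sigma}_t C^T_t + Q_t\right)\left(C_t \bar{\Sigma}_t C^T_t + Q_t\right)^{-1}$ and then factor out $\bar{\Sigma}_t C^T_t$:

$K_t = \Sigma_t C^T_t Q^{-1}_t$

$ = \Sigma_t C^T_t Q^{-1}_t \left(C_t \bar{\Sigma}_t C^T_t + Q_t\right)\left(C_t \bar{\Sigma}_t C^T_t + Q_t\right)^{-1}$

$ = \Sigma_t \left(C^T_t Q^{-1}_t C_t \bar{\Sigma}_t C^T_t + C^T_t\right)\left(C_t \bar{\Sigma}_t C^T_t + Q_t\right)^{-1}$

$ = \Sigma_t \left(C^T_t Q^{-1}_t C_t \bar{\Sigma}_t C^T_t + \bar{\Sigma}^{-1}_t \bar{\Sigma}_t C^T_t\right)\left(C_t \bar{\Sigma}_t C^T_t + Q_t\right)^{-1}$

$ = \Sigma_t \left(C^T_t Q^{-1}_t C_t + \bar{\Sigma}^{-1}_t\right)\bar{\Sigma}_t C^T_t\left(C_t \bar{\Sigma}_t C^T_t + Q_t\right)^{-1}$

$ = \Sigma_t \Sigma^{-1}_t \bar{\Sigma}_t C^T_t\left(C_t \bar{\Sigma}_t C^T_t + Q_t\right)^{-1}$

$ = \bar{\Sigma}_t C^T_t\left(C_t \bar{\Sigma}_t C^T_t + Q_t\right)^{-1}$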

By the matrix inversion lemma,

$\left(\bar{\Sigma}^{-1}_t + C^T_t Q^{-1}_t C_t\right)^{-1} = \bar{\Sigma}_t - \bar{\Sigma}_t C^T_t\left(Q_t + C_t\bar{\Sigma}_tC^T_t\right)^{-1}C_t \bar{\Sigma}_t$

$\Sigma_t = \left(C^T_t Q^{-1}_t C_t + \bar{\Sigma}^{-1}_t\right)^{-1}$

$ = \bar{\Sigma}_t - \bar{\Sigma}_t C^T_t\left(Q_t + C_t\bar{\Sigma}_tC^T_t\right)^{-1}C_t \bar{\Sigma}_t$

$ = \left[I - K_tC_t\right]\bar{\Sigma}_t$
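Both results of the measurement update can be checked numerically: the two expressions for the Kalman gain should agree, and so should the two expressions for $\Sigma_t$. The sketch below does this on random matrices. The dimensions, the seed, and the helper `random_spd` are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 4, 2   # state and measurement dimensions (arbitrary)


def random_spd(n, rng):
    """Random symmetric positive definite matrix (hypothetical helper)."""
    M = rng.standard_normal((n, n))
    return M @ M.T + n * np.eye(n)


C = rng.standard_normal((k, n))
Q = random_spd(k, rng)
Sigma_bar = random_spd(n, rng)
Q_inv = np.linalg.inv(Q)

# Covariance and gain as first derived (from the curvature of J_t).
Sigma_curv = np.linalg.inv(C.T @ Q_inv @ C + np.linalg.inv(Sigma_bar))
K_curv = Sigma_curv @ C.T @ Q_inv

# Covariance and gain in the form used by the algorithm.
K = Sigma_bar @ C.T @ np.linalg.inv(C @ Sigma_bar @ C.T + Q)
Sigma = (np.eye(n) - K @ C) @ Sigma_bar
```

The second form only inverts a $k \times k$ matrix, which is the reason it is the one used in the algorithm when the measurement dimension is smaller than the state dimension.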

