# Learning about Kalman filter / Extended Kalman Filter Part 1

**Resources**

`Kalman Filter from Ground Up`; author Alex Becker; https://www.kalmanfilter.net

**Overview**

Background information from chapter 13.

See also the other notebooks, which explore in more detail the linearisation methods required for the `EKF`.



---

## Extended Kalman Filter / Equations

If the system dynamics are nonlinear, the state transition matrix $\mathbf{F}$ is replaced by the nonlinear system function $f(\mathbf{\hat{x}_{n,n}} )$. Propagating the uncertainty of the system state requires the `Jacobian`-matrix $\frac{\partial \mathbf{f}}{\partial \mathbf{x}}$.

For a state vector with $k$ components the `Jacobian`-matrix is expressed like this:

$$
\frac{\partial \mathbf{f}}{\partial \mathbf{x}} = \left[\begin{array}{ccc}
\frac{\partial {f_1}}{\partial \hat{x}_1} & \cdots & \frac{\partial {f_1}}{\partial \hat{x}_k} \\
\vdots & \ddots & \vdots \\
\frac{\partial {f_k}}{\partial \hat{x}_1} & \cdots & \frac{\partial {f_k}}{\partial \hat{x}_k}
\end{array}\right] 
$$

$\frac{\partial \mathbf{f}}{\partial \mathbf{x}}$ must be recomputed at every iteration / time step, because it is evaluated at the current state estimate. To make this dependency explicit we write $\frac{\partial \mathbf{f}}{\partial \mathbf{x}}_{(n)}$.
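
To illustrate why the `Jacobian`-matrix changes from step to step, here is a minimal sketch. The pendulum model `f_pendulum`, its constants `DT` and `G_OVER_L`, and the helper `numerical_jacobian` are assumptions made for this example and are not taken from the book. The central-difference helper is only a cross-check for the analytic `Jacobian`-matrix.

```python
import numpy as np

DT = 0.1          # assumed time step in seconds
G_OVER_L = 9.81   # assumed ratio g / L of the pendulum


def f_pendulum(x):
    """Hypothetical nonlinear system function, state = [angle, angular rate]."""
    theta, omega = x
    return np.array([theta + DT * omega,
                     omega - DT * G_OVER_L * np.sin(theta)])


def jacobian_pendulum(x):
    """Analytic Jacobian of f_pendulum, evaluated at the estimate x."""
    theta, _ = x
    return np.array([[1.0, DT],
                     [-DT * G_OVER_L * np.cos(theta), 1.0]])


def numerical_jacobian(func, x, eps=1e-6):
    """Central-difference approximation of the Jacobian of func at x."""
    x = np.asarray(x, dtype=float)
    n_out = np.asarray(func(x)).size
    jac = np.zeros((n_out, x.size))
    for j in range(x.size):
        step = np.zeros_like(x)
        step[j] = eps
        jac[:, j] = (func(x + step) - func(x - step)) / (2.0 * eps)
    return jac


# The Jacobian depends on the point where it is evaluated.
x_a = np.array([0.1, 0.0])
x_b = np.array([1.2, -0.5])
print(jacobian_pendulum(x_a))
print(jacobian_pendulum(x_b))
```

In a filter loop the analytic version would be called once per time step with the latest estimate; the numerical version is useful mainly for testing a hand-derived `Jacobian`-matrix.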

If the measurements $\mathbf{z}_n$ are nonlinearly related to the system state, the observation matrix $\mathbf{H}$ is replaced by a nonlinear function $h(\mathbf{\hat{x}_{n}})$.

To compute the uncertainty we need the `Jacobian`-matrix of $h(\mathbf{\hat{x}_{n}})$. For a measurement vector with $p$ components and a state vector with $k$ components it is a $p \times k$ matrix:

$$
\frac{\partial \mathbf{h}}{\partial \mathbf{x}}_{(n)} = \left[\begin{array}{ccc}
\frac{\partial {h_1}}{\partial \hat{x}_1} & \cdots & \frac{\partial {h_1}}{\partial \hat{x}_k} \\
\vdots & \ddots & \vdots \\
\frac{\partial {h_p}}{\partial \hat{x}_1} & \cdots & \frac{\partial {h_p}}{\partial \hat{x}_k}
\end{array}\right] 
$$
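
As a worked illustration (an assumed example, not taken from the book): suppose the state holds a two-dimensional position, $\mathbf{\hat{x}} = [\hat{x}_1, \hat{x}_2]^T$, and the sensor measures range and bearing, so that $k = 2$ and $p = 2$. With $r = \sqrt{\hat{x}_1^2 + \hat{x}_2^2}$ the measurement function and its `Jacobian`-matrix are:

$$
h(\mathbf{\hat{x}}) = \left[\begin{array}{c}
\sqrt{\hat{x}_1^2 + \hat{x}_2^2} \\
\mathrm{atan2}(\hat{x}_2, \hat{x}_1)
\end{array}\right]
\qquad
\frac{\partial \mathbf{h}}{\partial \mathbf{x}} = \left[\begin{array}{cc}
\frac{\hat{x}_1}{r} & \frac{\hat{x}_2}{r} \\
-\frac{\hat{x}_2}{r^2} & \frac{\hat{x}_1}{r^2}
\end{array}\right]
$$

Every entry depends on the current estimate, which is why the matrix carries the index $(n)$.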

---

### State Extrapolation Equation

**Linear Kalman Filter**

$$
\mathbf{\hat{x}_{n+1,n}} = \mathbf{F} \cdot \mathbf{\hat{x}_{n,n}} + \mathbf{G} \cdot \mathbf{u_n} + \mathbf{w_n}
$$

or dropping the noise term:

$$
\mathbf{\hat{x}_{n+1,n}} = \mathbf{F} \cdot \mathbf{\hat{x}_{n,n}} + \mathbf{G} \cdot \mathbf{u_n} 
$$


**Extended Kalman Filter**

For a nonlinear system the state transition matrix $\mathbf{F}$ is replaced by the nonlinear system function $f(\mathbf{\hat{x}_{n,n}} )$:

$$
\mathbf{\hat{x}_{n+1,n}} = f(\mathbf{\hat{x}_{n,n}} ) + \mathbf{G} \cdot \mathbf{u_n} 
$$

---

### Covariance Extrapolation Equation

**Linear Kalman Filter**

$$
\mathbf{P_{n+1,n}} = \mathbf{F} \cdot \mathbf{P_{n,n}} \cdot \mathbf{F}^T  + \mathbf{Q_n}
$$

**Extended Kalman Filter**

$$
\mathbf{P_{n+1,n}} = \frac{\partial \mathbf{f}}{\partial \mathbf{x}}_{(n)}  \cdot \mathbf{P_{n,n}} \cdot \frac{\partial \mathbf{f}}{\partial \mathbf{x}}_{(n)}^T  + \mathbf{Q_n}
$$
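
The two extrapolation equations can be sketched as one prediction function. This is a minimal sketch under assumptions: the function name `ekf_predict`, its argument names and the pendulum model used to exercise it are hypothetical and not part of the book.

```python
import numpy as np

DT = 0.1          # assumed time step
G_OVER_L = 9.81   # assumed pendulum constant g / L


def f(x):
    """Hypothetical nonlinear system function (pendulum, explicit Euler)."""
    theta, omega = x
    return np.array([theta + DT * omega,
                     omega - DT * G_OVER_L * np.sin(theta)])


def jac_f(x):
    """Jacobian of f, evaluated at x."""
    return np.array([[1.0, DT],
                     [-DT * G_OVER_L * np.cos(x[0]), 1.0]])


def ekf_predict(x_est, P_est, Q, G=None, u=None):
    """State and covariance extrapolation of the EKF."""
    # State extrapolation: x_{n+1,n} = f(x_{n,n}) + G u_n
    x_pred = f(x_est)
    if G is not None and u is not None:
        x_pred = x_pred + G @ u
    # Covariance extrapolation with the Jacobian evaluated at x_{n,n}
    F_n = jac_f(x_est)
    P_pred = F_n @ P_est @ F_n.T + Q
    return x_pred, P_pred


x_est = np.array([0.0, 1.0])
P_est = np.eye(2)
Q = 0.01 * np.eye(2)
x_pred, P_pred = ekf_predict(x_est, P_est, Q)
print(x_pred)
print(P_pred)
```

Note that the state is pushed through the nonlinear function itself, while only the covariance uses the linearisation.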


---

### State Update Equation

**Linear Kalman Filter**

$$
\mathbf{\hat{x}_{n,n} } = \mathbf{\hat{x}_{n,n-1}} + \mathbf{K_n} \cdot \left(\mathbf{z_n} - \mathbf{H} \cdot \mathbf{\hat{x}_{n,n-1}}   \right)
$$


**Extended Kalman Filter**

$$
\mathbf{\hat{x}_{n,n} } = \mathbf{\hat{x}_{n,n-1}} + \mathbf{K_n} \cdot \left(\mathbf{z_n} - h(\mathbf{\hat{x}_{n,n-1}})  \right)
$$

---

### Covariance Update Equation

**Linear Kalman Filter**

$$
\mathbf{P_{n,n}} = \left(\mathbf{I} - \mathbf{K_n} \cdot \mathbf{H}\right) \cdot \mathbf{P_{n,n-1}} \cdot \left(\mathbf{I} - \mathbf{K_n} \cdot \mathbf{H}\right)^T + \mathbf{K_n} \cdot \mathbf{R_n} \cdot \mathbf{K_n}^T 
$$ 

It is not easy to see how to modify this equation for the nonlinear case, so let us derive it from the state update equation for the nonlinear case.

**Extended Kalman Filter**

We start with the state update equation for the nonlinear case:

$$
\mathbf{\hat{x}_{n,n} } = \mathbf{\hat{x}_{n,n-1}} + \mathbf{K_n} \cdot \left(\mathbf{z_n} - h(\mathbf{\hat{x}_{n,n-1}})  \right)
$$

Inserting the expression for the measurement vector $\mathbf{z_n}$ (with the true state $\mathbf{x}_n$ and the measurement noise $\mathbf{v_n}$)

$$
\mathbf{z}_n = h(\mathbf{x}_n) + \mathbf{v_n}
$$

gives

$$
\mathbf{\hat{x}_{n,n} } = \mathbf{\hat{x}_{n,n-1}} + \mathbf{K_n} \cdot \left(h(\mathbf{x}_n) + \mathbf{v_n} - h(\mathbf{\hat{x}_{n,n-1}})  \right)
$$

The term $h(\mathbf{\hat{x}_{n,n-1}})$ is linearly approximated (first-order Taylor expansion around $\mathbf{x}_n$) by:

$$
h(\mathbf{\hat{x}_{n,n-1}}) \approx h(\mathbf{x}_n) + \frac{\partial \mathbf{h}}{\partial \mathbf{x}}_{(n)} \cdot (\mathbf{\hat{x}_{n,n-1}} - \mathbf{x}_n)
$$

Substituting this approximation yields:

$$\begin{align}
\mathbf{\hat{x}_{n,n} } &= \mathbf{\hat{x}_{n,n-1}} + \mathbf{K_n} \cdot \left(\mathbf{v_n} - \frac{\partial \mathbf{h}}{\partial \mathbf{x}}_{(n)} \cdot (\mathbf{\hat{x}_{n,n-1}} - \mathbf{x}_n) \right) \\
&= \mathbf{\hat{x}_{n,n-1}} + \mathbf{K_n} \cdot \left(\frac{\partial \mathbf{h}}{\partial \mathbf{x}}_{(n)} \cdot \mathbf{x}_n - \frac{\partial \mathbf{h}}{\partial \mathbf{x}}_{(n)} \cdot \mathbf{\hat{x}_{n,n-1}}  + \mathbf{v_n} \right)
\end{align}
$$

We define the estimation error $\mathbf{e}_n$ as:

$$\begin{align}
\mathbf{e}_n &= \mathbf{x}_n - \mathbf{\hat{x}_{n,n}}   \\
&= (\mathbf{x}_n - \mathbf{\hat{x}_{n,n-1}}) - \mathbf{K_n} \cdot \left(\frac{\partial \mathbf{h}}{\partial \mathbf{x}}_{(n)} \cdot \mathbf{x}_n - \frac{\partial \mathbf{h}}{\partial \mathbf{x}}_{(n)} \cdot \mathbf{\hat{x}_{n,n-1}}  + \mathbf{v_n} \right) \\
&= (\mathbf{x}_n - \mathbf{\hat{x}_{n,n-1}}) - \mathbf{K_n} \cdot \left(\frac{\partial \mathbf{h}}{\partial \mathbf{x}}_{(n)} \cdot (\mathbf{x}_n - \mathbf{\hat{x}_{n,n-1}})  + \mathbf{v_n} \right) \\
&= \left(\mathbf{I} - \mathbf{K_n} \cdot \frac{\partial \mathbf{h}}{\partial \mathbf{x}}_{(n)} \right) \cdot (\mathbf{x}_n - \mathbf{\hat{x}_{n,n-1}}) - \mathbf{K_n} \cdot \mathbf{v_n}
\end{align}
$$

and compute its covariance matrix:

$$\begin{align}
\mathbf{P_{n,n}} &= E\left(\mathbf{e_n} \cdot \mathbf{e_n}^T \right) \\
&=E\left( \left(\left(\mathbf{I} - \mathbf{K_n} \cdot \frac{\partial \mathbf{h}}{\partial \mathbf{x}}_{(n)} \right) \cdot (\mathbf{x}_n - \mathbf{\hat{x}_{n,n-1}}) - \mathbf{K_n} \cdot \mathbf{v_n} \right) \cdot \left( \left(\mathbf{I} - \mathbf{K_n} \cdot \frac{\partial \mathbf{h}}{\partial \mathbf{x}}_{(n)} \right) \cdot (\mathbf{x}_n - \mathbf{\hat{x}_{n,n-1}}) - \mathbf{K_n} \cdot \mathbf{v_n} \right)^T \right) \\
&= E\left( \left(\left(\mathbf{I} - \mathbf{K_n} \cdot \frac{\partial \mathbf{h}}{\partial \mathbf{x}}_{(n)} \right) \cdot (\mathbf{x}_n - \mathbf{\hat{x}_{n,n-1}}) - \mathbf{K_n} \cdot \mathbf{v_n} \right) \cdot \left( (\mathbf{x}_n - \mathbf{\hat{x}_{n,n-1}})^T \cdot \left(\mathbf{I} - \frac{\partial \mathbf{h}}{\partial \mathbf{x}}_{(n)}^T \cdot \mathbf{K_n}^T \right) - \mathbf{v_n}^T \cdot \mathbf{K_n}^T \right) \right) \\
&= \left(\mathbf{I} - \mathbf{K_n} \cdot \frac{\partial \mathbf{h}}{\partial \mathbf{x}}_{(n)} \right) \cdot E\left((\mathbf{x}_n - \mathbf{\hat{x}_{n,n-1}}) \cdot (\mathbf{x}_n - \mathbf{\hat{x}_{n,n-1}})^T \right) \cdot \left(\mathbf{I} - \frac{\partial \mathbf{h}}{\partial \mathbf{x}}_{(n)}^T \cdot \mathbf{K_n}^T  \right) + \mathbf{K_n} \cdot E\left(\mathbf{v_n} \cdot \mathbf{v_n}^T \right) \cdot \mathbf{K_n}^T \\
\mathbf{P_{n,n}} &= \left(\mathbf{I} - \mathbf{K_n} \cdot \frac{\partial \mathbf{h}}{\partial \mathbf{x}}_{(n)} \right) \cdot \mathbf{P}_{n,n-1} \cdot \left(\mathbf{I} - \frac{\partial \mathbf{h}}{\partial \mathbf{x}}_{(n)}^T \cdot \mathbf{K_n}^T  \right) + \mathbf{K_n} \cdot \mathbf{R}_n \cdot \mathbf{K_n}^T
\end{align}
$$

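Here we have used $E\left( (\mathbf{x}_n - \mathbf{\hat{x}_{n,n-1}}) \cdot \mathbf{v_n}^T  \right) = \mathbf{0}$ (the prediction error is uncorrelated with the measurement noise, so the cross terms vanish), together with the definitions $\mathbf{P}_{n,n-1} = E\left((\mathbf{x}_n - \mathbf{\hat{x}_{n,n-1}}) \cdot (\mathbf{x}_n - \mathbf{\hat{x}_{n,n-1}})^T \right)$ and $\mathbf{R}_n = E\left(\mathbf{v_n} \cdot \mathbf{v_n}^T \right)$.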
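
The derivation above never assumed a particular gain, so the resulting covariance update holds for any $\mathbf{K_n}$. The following minimal sketch illustrates this with an arbitrary, non-optimal gain. The function name `covariance_update` and all numeric values are assumptions chosen for the example.

```python
import numpy as np


def covariance_update(P_pred, K, H_n, R):
    """P_{n,n} = (I - K H) P_{n,n-1} (I - K H)^T + K R K^T."""
    A = np.eye(P_pred.shape[0]) - K @ H_n
    return A @ P_pred @ A.T + K @ R @ K.T


# Assumed example values: two states, one measurement.
P_pred = np.array([[2.0, 0.5],
                   [0.5, 1.0]])
H_n = np.array([[0.6, 0.8]])   # stands in for the Jacobian of h at the current estimate
R = np.array([[0.1]])
K_arbitrary = np.array([[0.3],
                        [0.1]])  # deliberately not the optimal gain

P_upd = covariance_update(P_pred, K_arbitrary, H_n, R)
print(P_upd)
print(np.linalg.eigvalsh(P_upd))
```

Both terms of the sum are symmetric and positive semi-definite by construction, so the result stays a valid covariance matrix even when the gain is not optimal.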

---

### Kalman Gain

**Linear Kalman Filter**

$$
\mathbf{K_n} = \mathbf{P_{n,n-1}} \cdot  \mathbf{H}^T \cdot \left( \mathbf{H} \cdot \mathbf{P_{n,n-1}} \cdot  \mathbf{H}^T + \mathbf{R_n} \right)^{-1}  
$$

**Extended Kalman Filter**

We start with the covariance matrix $\mathbf{P_{n,n}}$ and expand the product:

$$\begin{align}
\mathbf{P_{n,n}} &= \left(\mathbf{I} - \mathbf{K_n} \cdot \frac{\partial \mathbf{h}}{\partial \mathbf{x}}_{(n)} \right) \cdot \mathbf{P}_{n,n-1} \cdot \left(\mathbf{I} - \frac{\partial \mathbf{h}}{\partial \mathbf{x}}_{(n)}^T \cdot \mathbf{K_n}^T  \right) + \mathbf{K_n} \cdot \mathbf{R}_n \cdot \mathbf{K_n}^T \\
&= \left(\mathbf{P}_{n,n-1} - \mathbf{K_n} \cdot \frac{\partial \mathbf{h}}{\partial \mathbf{x}}_{(n)} \cdot \mathbf{P}_{n,n-1} \right)  \cdot \left(\mathbf{I} - \frac{\partial \mathbf{h}}{\partial \mathbf{x}}_{(n)}^T \cdot \mathbf{K_n}^T  \right) + \mathbf{K_n} \cdot \mathbf{R}_n \cdot \mathbf{K_n}^T \\
&= \mathbf{P}_{n,n-1} - \mathbf{P}_{n,n-1} \cdot \frac{\partial \mathbf{h}}{\partial \mathbf{x}}_{(n)}^T \cdot \mathbf{K_n}^T - \mathbf{K_n} \cdot \frac{\partial \mathbf{h}}{\partial \mathbf{x}}_{(n)} \cdot \mathbf{P}_{n,n-1} + \mathbf{K_n} \cdot \left( \frac{\partial \mathbf{h}}{\partial \mathbf{x}}_{(n)} \cdot \mathbf{P}_{n,n-1} \cdot \frac{\partial \mathbf{h}}{\partial \mathbf{x}}_{(n)}^T + \mathbf{R}_n \right) \cdot \mathbf{K_n}^T 
\end{align}
$$

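We need to find the `Kalman gain` $\mathbf{K_n}$ that minimises the variances of the state estimate, i.e. the diagonal elements of $\mathbf{P_{n,n}}$. The sum of these variances is the trace $tr\left(\mathbf{P_{n,n}}\right)$ of the matrix.

Because $\mathbf{P}_{n,n-1}$ is symmetric and $tr\left(\mathbf{A}\right) = tr\left(\mathbf{A}^T\right)$, the second and third term of the expanded expression have the same trace. For $tr\left(\mathbf{P_{n,n}}\right)$ we therefore get: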

$$
tr\left(\mathbf{P_{n,n}}\right) = tr\left(\mathbf{P}_{n,n-1}\right) - 2 \cdot tr\left( \mathbf{K_n} \cdot \frac{\partial \mathbf{h}}{\partial \mathbf{x}}_{(n)} \cdot \mathbf{P}_{n,n-1} \right) + tr\left( \mathbf{K_n} \cdot \left( \frac{\partial \mathbf{h}}{\partial \mathbf{x}}_{(n)} \cdot \mathbf{P}_{n,n-1} \cdot \frac{\partial \mathbf{h}}{\partial \mathbf{x}}_{(n)}^T + \mathbf{R}_n \right) \cdot \mathbf{K_n}^T \right)
$$

$tr\left(\mathbf{P_{n,n}}\right)$ is differentiated with respect to the elements of the Kalman gain matrix $\mathbf{K}_n$. To do this we use these two formulas:

**rule#1**

$$
\frac{d}{d \mathbf{A}} \left(tr\left( \mathbf{A} \cdot \mathbf{B} \right) \right) = \mathbf{B}^T
$$

**rule#2** (valid for symmetric $\mathbf{B}$)

$$
\frac{d}{d \mathbf{A}} \left(tr\left( \mathbf{A} \cdot \mathbf{B} \cdot \mathbf{A}^T \right) \right) = 2 \cdot \mathbf{A} \cdot \mathbf{B}  
$$
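
Both rules can be checked numerically. The sketch below compares them with central finite differences on random matrices. The helper name `grad_wrt_matrix` and the matrix sizes are assumptions for this example. Rule#2 in the form given here requires a symmetric $\mathbf{B}$, so a symmetric matrix is constructed for that check.

```python
import numpy as np


def grad_wrt_matrix(scalar_fn, A, eps=1e-6):
    """Finite-difference derivative of scalar_fn(A) w.r.t. every element of A."""
    grad = np.zeros_like(A)
    for i in range(A.shape[0]):
        for j in range(A.shape[1]):
            dA = np.zeros_like(A)
            dA[i, j] = eps
            grad[i, j] = (scalar_fn(A + dA) - scalar_fn(A - dA)) / (2.0 * eps)
    return grad


rng = np.random.default_rng(0)
A = rng.normal(size=(3, 2))

# rule#1: d/dA tr(A B) = B^T, with B shaped so that A B is square
B1 = rng.normal(size=(2, 3))
rule1_numeric = grad_wrt_matrix(lambda X: np.trace(X @ B1), A)

# rule#2: d/dA tr(A B A^T) = 2 A B, with a symmetric B
M = rng.normal(size=(2, 2))
B2 = M @ M.T
rule2_numeric = grad_wrt_matrix(lambda X: np.trace(X @ B2 @ X.T), A)

print(np.max(np.abs(rule1_numeric - B1.T)))
print(np.max(np.abs(rule2_numeric - 2.0 * A @ B2)))
```

The printed values are the largest deviations between the numerical derivative and the formula.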

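Applying rule#1 to the second term (with $\mathbf{B} = \frac{\partial \mathbf{h}}{\partial \mathbf{x}}_{(n)} \cdot \mathbf{P}_{n,n-1}$) and rule#2 to the third term (with the symmetric matrix $\mathbf{B} = \frac{\partial \mathbf{h}}{\partial \mathbf{x}}_{(n)} \cdot \mathbf{P}_{n,n-1} \cdot \frac{\partial \mathbf{h}}{\partial \mathbf{x}}_{(n)}^T + \mathbf{R}_n$) yields: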

$$\begin{align}
\frac{d}{d \mathbf{K}_n} \left(tr\left( \mathbf{P_{n,n}} \right) \right) &= - 2 \cdot \mathbf{P}_{n,n-1} \cdot \frac{\partial \mathbf{h}}{\partial \mathbf{x}}_{(n)}^T + 2 \cdot \mathbf{K_n} \cdot \left( \frac{\partial \mathbf{h}}{\partial \mathbf{x}}_{(n)} \cdot \mathbf{P}_{n,n-1} \cdot \frac{\partial \mathbf{h}}{\partial \mathbf{x}}_{(n)}^T + \mathbf{R}_n \right)
\end{align}
$$


Setting the derivative to $\mathbf{0}$ and dividing by 2 yields:

$$
\mathbf{P}_{n,n-1} \cdot \frac{\partial \mathbf{h}}{\partial \mathbf{x}}_{(n)}^T = \mathbf{K_n} \cdot \left( \frac{\partial \mathbf{h}}{\partial \mathbf{x}}_{(n)} \cdot \mathbf{P}_{n,n-1} \cdot \frac{\partial \mathbf{h}}{\partial \mathbf{x}}_{(n)}^T + \mathbf{R}_n \right)
$$

which is solved for the `Kalman`-gain:

$$
\mathbf{K_n} = \mathbf{P}_{n,n-1} \cdot \frac{\partial \mathbf{h}}{\partial \mathbf{x}}_{(n)}^T \cdot \left( \frac{\partial \mathbf{h}}{\partial \mathbf{x}}_{(n)} \cdot \mathbf{P}_{n,n-1} \cdot \frac{\partial \mathbf{h}}{\partial \mathbf{x}}_{(n)}^T + \mathbf{R}_n \right)^{-1}
$$
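
A minimal sketch of the gain computation, together with a numerical check that this gain does minimise the trace of the updated covariance. The function names `kalman_gain` and `updated_trace` as well as all numeric values are assumptions made for this example.

```python
import numpy as np


def kalman_gain(P_pred, H_n, R):
    """K = P H^T (H P H^T + R)^-1, using a linear solve instead of an explicit inverse."""
    S = H_n @ P_pred @ H_n.T + R
    # S and P_pred are symmetric, so K^T = S^-1 (H P); solve for K^T and transpose.
    return np.linalg.solve(S, H_n @ P_pred).T


def updated_trace(P_pred, H_n, R, K):
    """Trace of the updated covariance for an arbitrary gain K."""
    A = np.eye(P_pred.shape[0]) - K @ H_n
    return np.trace(A @ P_pred @ A.T + K @ R @ K.T)


# Assumed example values: two states, one measurement.
P_pred = np.array([[2.0, 0.5],
                   [0.5, 1.0]])
H_n = np.array([[0.6, 0.8]])   # stands in for the Jacobian of h at the current estimate
R = np.array([[0.1]])

K_opt = kalman_gain(P_pred, H_n, R)
trace_opt = updated_trace(P_pred, H_n, R, K_opt)

# Any perturbation of the optimal gain should increase the trace.
rng = np.random.default_rng(42)
perturbed = [
    updated_trace(P_pred, H_n, R, K_opt + 0.05 * rng.normal(size=K_opt.shape))
    for _ in range(5)
]
print(trace_opt, min(perturbed))
```

Solving the linear system avoids forming the inverse explicitly, which is a common numerical choice; the result is the same matrix as in the equation above.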

---

## Summary / Equations of the Extended Kalman Filter

| Description |  Equation | 
|-------------|-----------|
| State Extrapolation | $\mathbf{\hat{x}_{n+1,n}} = f(\mathbf{\hat{x}_{n,n}} ) + \mathbf{G} \cdot \mathbf{u_n}$ |
| Covariance Extrapolation | $\mathbf{P_{n+1,n}} = \frac{\partial \mathbf{f}}{\partial \mathbf{x}}_{(n)}  \cdot \mathbf{P_{n,n}} \cdot \frac{\partial \mathbf{f}}{\partial \mathbf{x}}_{(n)}^T  + \mathbf{Q_n}$ |
| State Update | $\mathbf{\hat{x}_{n,n} } = \mathbf{\hat{x}_{n,n-1}} + \mathbf{K_n} \cdot \left(\mathbf{z_n} - h(\mathbf{\hat{x}_{n,n-1}})  \right)$ |
| Covariance Update | $\mathbf{P_{n,n}} = \left(\mathbf{I} - \mathbf{K_n} \cdot \frac{\partial \mathbf{h}}{\partial \mathbf{x}}_{(n)} \right) \cdot \mathbf{P}_{n,n-1} \cdot \left(\mathbf{I} - \frac{\partial \mathbf{h}}{\partial \mathbf{x}}_{(n)}^T \cdot \mathbf{K_n}^T  \right) + \mathbf{K_n} \cdot \mathbf{R}_n \cdot \mathbf{K_n}^T$ |
| Kalman Gain | $\mathbf{K_n} = \mathbf{P}_{n,n-1} \cdot \frac{\partial \mathbf{h}}{\partial \mathbf{x}}_{(n)}^T \cdot \left( \frac{\partial \mathbf{h}}{\partial \mathbf{x}}_{(n)} \cdot \mathbf{P}_{n,n-1} \cdot \frac{\partial \mathbf{h}}{\partial \mathbf{x}}_{(n)}^T + \mathbf{R}_n \right)^{-1}$ |

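The five equations of the table above can be combined into one filter iteration: update with the measurement $\mathbf{z_n}$, then extrapolate to step $n+1$. The following is a minimal sketch under assumptions, not the book's implementation. The pendulum model, the measurement function $h(\mathbf{x}) = \sin(\theta)$, the noise levels and all function names are hypothetical, and the control term $\mathbf{G} \cdot \mathbf{u_n}$ is omitted.

```python
import numpy as np

DT = 0.05         # assumed time step
G_OVER_L = 9.81   # assumed pendulum constant g / L


def f(x):
    """Hypothetical system function: pendulum, state = [angle, angular rate]."""
    theta, omega = x
    return np.array([theta + DT * omega,
                     omega - DT * G_OVER_L * np.sin(theta)])


def jac_f(x):
    return np.array([[1.0, DT],
                     [-DT * G_OVER_L * np.cos(x[0]), 1.0]])


def h(x):
    """Hypothetical measurement function: horizontal displacement sin(angle)."""
    return np.array([np.sin(x[0])])


def jac_h(x):
    return np.array([[np.cos(x[0]), 0.0]])


def ekf_step(x_pred, P_pred, z, Q, R):
    """One EKF iteration: update with z_n, then extrapolate to n+1."""
    # Kalman gain (Jacobian of h evaluated at the prior estimate)
    H_n = jac_h(x_pred)
    S = H_n @ P_pred @ H_n.T + R
    K = np.linalg.solve(S, H_n @ P_pred).T
    # State update
    x_est = x_pred + K @ (z - h(x_pred))
    # Covariance update
    A = np.eye(len(x_pred)) - K @ H_n
    P_est = A @ P_pred @ A.T + K @ R @ K.T
    # State and covariance extrapolation (Jacobian of f at the updated estimate)
    F_n = jac_f(x_est)
    x_next = f(x_est)
    P_next = F_n @ P_est @ F_n.T + Q
    return x_est, P_est, x_next, P_next


rng = np.random.default_rng(1)
Q = 1e-5 * np.eye(2)
R = np.array([[0.01]])
x_true = np.array([0.5, 0.0])
x_pred = np.array([0.3, 0.0])   # deliberately wrong initial guess
P_pred = 0.1 * np.eye(2)

for n in range(40):
    z = h(x_true) + rng.normal(scale=0.1, size=1)   # simulated noisy measurement
    x_est, P_est, x_pred, P_pred = ekf_step(x_pred, P_pred, z, Q, R)
    x_true = f(x_true)

print(x_est)
print(P_est)
```

The loop feeds simulated measurements into the filter; with real data only the line that generates `z` would change.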




