## Notebook Goal ##
This notebook gives a high-level overview of the Extended Kalman Filter (EKF). It presents the EKF equations and a worked example, but not the derivation.
It also provides a short mathematical background needed to follow the EKF.

## Short Background on Taylor Expansions and Jacobian Matrices ##
## Jacobian Matrices

The Jacobian of a vector function $f:\mathbb{R}^n \to \mathbb{R}^m$ is the matrix of partial derivatives:

$$
J_f(x) =
\begin{bmatrix}
\frac{\partial f_1}{\partial x_1} & \cdots & \frac{\partial f_1}{\partial x_n} \\
\vdots & \ddots & \vdots \\
\frac{\partial f_m}{\partial x_1} & \cdots & \frac{\partial f_m}{\partial x_n}
\end{bmatrix}
$$

- Each row = gradient of one output component ($f_i$) with respect to all inputs $(x_1, \dots, x_n)$.  
<br>
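
As a quick illustration of the definition above, the following sketch approximates a Jacobian column by column with central finite differences and compares it to the analytic matrix. The helper `numerical_jacobian` and the polar-to-Cartesian test function are illustrative choices made here, not part of the EKF itself.

```python
import numpy as np

def numerical_jacobian(f, x, eps=1e-6):
    """Approximate the m-by-n Jacobian of f at x with central differences."""
    x = np.asarray(x, dtype=float)
    m = np.asarray(f(x)).size
    J = np.zeros((m, x.size))
    for j in range(x.size):
        step = np.zeros_like(x)
        step[j] = eps
        # Column j holds the partial derivatives of every output w.r.t. x_j.
        J[:, j] = (np.asarray(f(x + step)) - np.asarray(f(x - step))) / (2 * eps)
    return J

# Illustrative function (an assumption): f(r, theta) = (r cos(theta), r sin(theta)).
def polar_to_cartesian(x):
    r, theta = x
    return np.array([r * np.cos(theta), r * np.sin(theta)])

x0 = np.array([2.0, np.pi / 6])
J_numeric = numerical_jacobian(polar_to_cartesian, x0)

# Analytic Jacobian: row i is the gradient of output component f_i.
J_exact = np.array([
    [np.cos(x0[1]), -x0[0] * np.sin(x0[1])],
    [np.sin(x0[1]),  x0[0] * np.cos(x0[1])],
])
print(np.allclose(J_numeric, J_exact, atol=1e-6))
```

A finite-difference check of this kind is a common way to validate a hand-derived Jacobian before using it in a filter.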

## Taylor Expansions

A Taylor expansion is a way of approximating a nonlinear function with a polynomial around some point.  
For a scalar function $f(x)$, expanded around $x_0$ :


$$f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(x_0)}{n!} (x - x_0)^n$$ or
$$f(x) = f(x_0) + f'(x_0)(x - x_0) + \frac{1}{2}f''(x_0)(x - x_0)^2 + \dots$$

- The more terms we keep, the more accurate the approximation.  
- In the EKF we use only the first-order term, which gives a local linear approximation.  
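
To make the first bullet concrete, here is a minimal sketch that truncates the Taylor series of $e^x$ around $x_0 = 0$ at increasing orders and prints the approximation error at $x = 0.5$. The choice of function and evaluation point is arbitrary and only for illustration.

```python
import math

def taylor_exp(x, x0, order):
    """Taylor polynomial of exp around x0, keeping terms up to (x - x0)**order."""
    # Every derivative of exp at x0 equals exp(x0).
    return sum(math.exp(x0) * (x - x0) ** n / math.factorial(n)
               for n in range(order + 1))

x0, x = 0.0, 0.5
orders = range(1, 5)
errors = [abs(math.exp(x) - taylor_exp(x, x0, order)) for order in orders]

for order, err in zip(orders, errors):
    print(f"order {order}: abs error = {err:.2e}")
```

The error shrinks as terms are added; the EKF accepts the order-1 error in exchange for a model that is linear in the state.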

## First-Order Taylor Expansion

For a general nonlinear vector function $f(x)$, expanded around $x_0$:

$$f(x) \approx f(x_0) + J_f(x_0)(x - x_0)$$

- $f(x_0)$: the function evaluated at $x_0$  
- $J_f(x_0)$: the Jacobian matrix (matrix of first derivatives) of $f$ at $x_0$  
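
The sketch below applies this first-order expansion to an illustrative vector function (polar-to-Cartesian conversion, chosen here as an assumption) and measures how the approximation error grows as $x$ moves away from $x_0$.

```python
import numpy as np

def f(x):
    """Illustrative nonlinear map: (r, theta) -> (r cos(theta), r sin(theta))."""
    r, theta = x
    return np.array([r * np.cos(theta), r * np.sin(theta)])

def jacobian_f(x):
    r, theta = x
    return np.array([
        [np.cos(theta), -r * np.sin(theta)],
        [np.sin(theta),  r * np.cos(theta)],
    ])

def linearize(x, x0):
    """First-order Taylor approximation of f around x0, evaluated at x."""
    return f(x0) + jacobian_f(x0) @ (x - x0)

x0 = np.array([2.0, 0.5])
direction = np.array([0.0, 1.0])   # perturb the angle only
steps = [0.01, 0.1, 0.5]
errors = [
    np.linalg.norm(f(x0 + d * direction) - linearize(x0 + d * direction, x0))
    for d in steps
]
for d, err in zip(steps, errors):
    print(f"step {d}: linearization error = {err:.2e}")
```

The approximation is accurate close to $x_0$ and degrades with distance, which is why the EKF linearizes around its most recent estimate.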

# Extended Kalman Filter (EKF)


## What is the EKF?

The EKF is an extension of the Kalman Filter for systems that are nonlinear.  
Instead of assuming linear dynamics:
$$x_k = A x_{k-1} + B u_k$$
$$y_k = C x_k + D u_k$$
we consider general nonlinear functions:
$$x_k = f(x_{k-1}) + B w_k$$
$$y_{k} = h(x_k) + D v_k$$

- $f(\cdot)$: nonlinear state transition  
- $h(\cdot)$: nonlinear measurement function  
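- $w_k, v_k$: zero-mean Gaussian process and measurement noise; in the equations below, $Q_k$ and $R_k$ denote the covariances of the additive noise terms $B w_k$ and $D v_k$, respectively  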

In the EKF, we linearize the nonlinear dynamics and measurement functions around the current best state estimate using a first-order Taylor expansion.
The required Jacobian matrices are:
  - $F_k = \frac{\partial f}{\partial x}\big|_{x=\hat{x}_{k-1|k-1}}$: linearization of the state transition  
  - $H_k = \frac{\partial h}{\partial x}\big|_{x=\hat{x}_{k|k-1}}$: linearization of the measurement model

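With these Jacobians defined, we can linearize the state and observation equations using the first-order Taylor expansion. Each function is linearized around the best available estimate of its argument: $\hat{x}_{k-1|k-1}$ for $f$ (the estimate of $x_{k-1}$) and $\hat{x}_{k|k-1}$ for $h$ (the predicted estimate of $x_k$):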
  - $f(x_{k-1}) \approx f(\hat{x}_{k-1|k-1}) + F_k(x_{k-1} - \hat{x}_{k-1|k-1})$
  - $h(x_{k}) \approx h(\hat{x}_{k|k-1}) + H_{k}(x_k - \hat{x}_{k|k-1})$

## How Does It Work?
Like the linear KF, the EKF alternates between two steps: predict and update.

**Predict step**
1. $\hat{x}_{k|k-1} = f(\hat{x}_{k-1|k-1})$
2. $P_{k|k-1} = F_k P_{k-1|k-1}F_k^\top + Q_{k-1}$

**Update step**
1. The innovation: $e_k = y_k - h(\hat{x}_{k|k-1})$
2. The innovation (residual) covariance: $S_{k} = H_{k} P_{k|k-1} H_{k}^\top + R_{k}$
3. The near-optimal Kalman gain: $K_k = P_{k|k-1}H_k^\top S_k^{-1}$
4. Updated state estimate: $\hat{x}_{k|k} = \hat{x}_{k|k-1} + K_ke_k$
5. Updated state covariance estimate: $P_{k|k} = (I - K_k H_k)P_{k|k-1}$

The EKF operates like the KF: we run the predict step before observing the next measurement, and run the update step once that measurement is available. As $k$ increases and more measurements are processed, the EKF estimate typically tracks the underlying state more closely, provided the linearization remains a good approximation.
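
The predict and update steps above can be sketched in NumPy as follows. This is a minimal illustration rather than a reference implementation: the function names, the explicit matrix inverse (used to mirror step 3), and the scalar model at the bottom are all assumptions made for this example.

```python
import numpy as np

def ekf_predict(x_est, P_est, f, jac_f, Q):
    """Predict step: state through f, covariance through the Jacobian F."""
    F = jac_f(x_est)                      # F_k at the previous estimate
    x_pred = f(x_est)                     # the nonlinear f is used directly
    P_pred = F @ P_est @ F.T + Q
    return x_pred, P_pred

def ekf_update(x_pred, P_pred, y, h, jac_h, R):
    """Update step: correct the prediction with the measurement y."""
    H = jac_h(x_pred)                     # H_k at the predicted state
    e = y - h(x_pred)                     # innovation
    S = H @ P_pred @ H.T + R              # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)   # Kalman gain
    x_new = x_pred + K @ e
    P_new = (np.eye(x_pred.size) - K @ H) @ P_pred
    return x_new, P_new

# Hypothetical scalar model, for illustration only.
def f(x):
    return x + 0.1 * np.sin(x)

def jac_f(x):
    return np.array([[1.0 + 0.1 * np.cos(x[0])]])

def h(x):
    return x ** 2

def jac_h(x):
    return np.array([[2.0 * x[0]]])

x_est, P_est = np.array([1.0]), np.array([[0.5]])
Q, R = np.array([[0.01]]), np.array([[0.1]])
y = np.array([1.3])                       # made-up measurement

x_pred, P_pred = ekf_predict(x_est, P_est, f, jac_f, Q)
x_new, P_new = ekf_update(x_pred, P_pred, y, h, jac_h, R)
print("predicted:", x_pred, P_pred)
print("updated:  ", x_new, P_new)
```

In practice, solving the linear system with `np.linalg.solve` instead of forming the inverse is usually preferred for numerical stability; the inverse is kept here only so the code reads like the equations.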

It is instructive to see where the linearization is actually used. It is not needed to propagate the state estimate or to predict the measurement: we know the nonlinear functions $f$ and $h$, so we evaluate them directly.
The linearization is needed for the covariances. We won't go into the maths here, but the Jacobians $F_k$ and $H_k$ allow us to approximate $P_{k|k-1}$, $S_k$ and $P_{k|k}$, and hence the gain $K_k$.

## When should we use it?
- The system dynamics or measurements are nonlinear, but approximately linear in a small region.  
- The first-order approximation is valid. If the nonlinearity is strong, or the uncertainty is large compared with the region where the linear approximation holds, the EKF may diverge.
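
The effect of uncertainty on the covariance approximation can be illustrated with a small Monte Carlo experiment. The sketch below compares the linearized covariance $F P F^\top$ with the sample covariance of $f(x)$ for a small and a large input uncertainty. The polar-to-Cartesian function, the sample size, and the standard deviations are illustrative assumptions.

```python
import numpy as np

def f(x):
    """Polar-to-Cartesian map; works on one (r, theta) pair or an (N, 2) array."""
    r, theta = x[..., 0], x[..., 1]
    return np.stack([r * np.cos(theta), r * np.sin(theta)], axis=-1)

def jacobian_f(x):
    r, theta = x
    return np.array([
        [np.cos(theta), -r * np.sin(theta)],
        [np.sin(theta),  r * np.cos(theta)],
    ])

def relative_error(mean, sigmas, n_samples=100_000, seed=0):
    """Relative gap between the linearized and the sampled output covariance."""
    rng = np.random.default_rng(seed)
    P = np.diag(np.square(sigmas))
    F = jacobian_f(mean)
    P_lin = F @ P @ F.T                              # EKF-style propagation
    samples = mean + rng.standard_normal((n_samples, 2)) * sigmas
    P_mc = np.cov(f(samples), rowvar=False)          # Monte Carlo reference
    return np.linalg.norm(P_lin - P_mc) / np.linalg.norm(P_mc)

mean = np.array([2.0, 0.5])
err_small = relative_error(mean, np.array([0.01, 0.01]))
err_large = relative_error(mean, np.array([0.01, 0.8]))
print(f"small uncertainty: relative error = {err_small:.3f}")
print(f"large uncertainty: relative error = {err_large:.3f}")
```

With a tight input distribution the two covariances nearly agree; with a wide angular spread the linearized covariance is a poor description of the true one, which is the regime where an EKF is at risk.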


## Example: Coordinated Turn Model

Consider a 2D target moving with a constant turn rate $\omega$.  
The state vector includes position, velocity components, and turn rate:

$$
x =
\begin{bmatrix}
p_x \\
p_y \\
v_x \\
v_y \\
\omega
\end{bmatrix}
$$

where:
- $(p_x, p_y)$: position  
- $(v_x, v_y)$: Cartesian velocity components  
- $\omega$: turn rate (rad/s)  

<br>

**Nonlinear dynamics**

With sampling time $\Delta t$:

Let  
$$
\Omega = \omega \Delta t, \qquad
A = \frac{\sin(\Omega)}{\omega}, \qquad
B = \frac{1 - \cos(\Omega)}{\omega}, \qquad
c = \cos(\Omega), \qquad s = \sin(\Omega).
$$

Then

$$
f(x) =
\begin{bmatrix}
p_x + A v_x - B v_y \\
p_y + B v_x + A v_y \\
c v_x - s v_y \\
s v_x + c v_y \\
\omega
\end{bmatrix}.
$$

The position rows are the time integral of the rotating velocity over one sampling interval, so they use the same counterclockwise rotation (for $\omega > 0$) as the velocity rows.

<br>

**Measurement model**

Suppose the sensor measures position only:

$$
h(x) =
\begin{bmatrix}
p_x \\
p_y
\end{bmatrix}
$$


At each time step, the EKF requires:

$$
F_k = \frac{\partial f}{\partial x}\bigg|_{\hat{x}_{k-1|k-1}}, \qquad
H_k = \frac{\partial h}{\partial x}\bigg|_{\hat{x}_{k|k-1}}.
$$

With $\Omega$, $A$, $B$, $c$ and $s$ as defined above, let

$$
A_\omega = \frac{\partial A}{\partial \omega} = \frac{\Omega c - \sin(\Omega)}{\omega^2}, \qquad
B_\omega = \frac{\partial B}{\partial \omega} = \frac{\Omega s - (1 - \cos(\Omega))}{\omega^2}.
$$

Then

$$
F_k =
\begin{bmatrix}
1 & 0 & A & -B & v_x A_\omega - v_y B_\omega \\
0 & 1 & B & \;\,A & v_x B_\omega + v_y A_\omega \\
0 & 0 & c & -s & -\Delta t (s v_x + c v_y) \\
0 & 0 & s & \;\,c & \;\,\Delta t (c v_x - s v_y) \\
0 & 0 & 0 & 0 & 1
\end{bmatrix},
$$

$$
H_k =
\begin{bmatrix}
1 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0
\end{bmatrix}.
$$

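After the linearization, the calculations proceed just like those of a linear Kalman filter, with $F_k$ re-evaluated at the latest estimate at every time step ($H_k$ is constant here because the measurement model is linear).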
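
The following sketch puts the example together on simulated data. It takes the position increment as the time integral of the velocity, with the velocity rotating counterclockwise for $\omega > 0$. It also checks the analytic Jacobian against finite differences and then runs the EKF on noisy position measurements. The noise levels, the initial guess, the tuning of $Q$, and the small-$\omega$ guard (which substitutes the $\omega \to 0$ limits) are all assumptions chosen for illustration.

```python
import numpy as np

def ct_coeffs(w, dt):
    """Return (A, B, A_w, B_w, c, s), using the w -> 0 limits near zero turn rate."""
    Om = w * dt
    c, s = np.cos(Om), np.sin(Om)
    if abs(w) < 1e-9:                     # avoid division by zero
        return dt, 0.0, 0.0, 0.5 * dt**2, c, s
    A, B = s / w, (1 - c) / w
    return A, B, (Om * c - s) / w**2, (Om * s - (1 - c)) / w**2, c, s

def f(x, dt):
    px, py, vx, vy, w = x
    A, B, _, _, c, s = ct_coeffs(w, dt)
    return np.array([px + A*vx - B*vy, py + B*vx + A*vy,
                     c*vx - s*vy, s*vx + c*vy, w])

def jac_f(x, dt):
    _, _, vx, vy, w = x
    A, B, A_w, B_w, c, s = ct_coeffs(w, dt)
    return np.array([
        [1, 0, A, -B, vx*A_w - vy*B_w],
        [0, 1, B,  A, vx*B_w + vy*A_w],
        [0, 0, c, -s, -dt*(s*vx + c*vy)],
        [0, 0, s,  c,  dt*(c*vx - s*vy)],
        [0, 0, 0,  0, 1],
    ])

H = np.array([[1.0, 0, 0, 0, 0], [0, 1.0, 0, 0, 0]])
dt = 1.0

# Sanity check: analytic Jacobian versus central finite differences.
x_check, eps = np.array([0.0, 0.0, 10.0, 0.0, 0.1]), 1e-6
F_numeric = np.column_stack([
    (f(x_check + eps*d, dt) - f(x_check - eps*d, dt)) / (2*eps) for d in np.eye(5)
])
jacobian_ok = np.allclose(jac_f(x_check, dt), F_numeric, atol=1e-6)

# Simulate a noise-free turn and noisy position measurements.
rng = np.random.default_rng(0)
n_steps, sigma_meas = 100, 1.0
x_true = np.array([0.0, 0.0, 10.0, 0.0, 0.1])
truth, measurements = [], []
for _ in range(n_steps):
    x_true = f(x_true, dt)
    truth.append(x_true)
    measurements.append(H @ x_true + sigma_meas * rng.standard_normal(2))
truth, measurements = np.array(truth), np.array(measurements)

# EKF: initial position from the first measurement, rough guesses elsewhere.
Q = np.diag([0.01, 0.01, 0.01, 0.01, 1e-4])
R = sigma_meas**2 * np.eye(2)
x_est = np.array([measurements[0, 0], measurements[0, 1], 8.0, 2.0, 0.05])
P = np.diag([1.0, 1.0, 25.0, 25.0, 0.01])
estimates = [x_est]
for y in measurements[1:]:
    F = jac_f(x_est, dt)                  # linearize at the previous estimate
    x_pred = f(x_est, dt)
    P_pred = F @ P @ F.T + Q
    e = y - H @ x_pred                    # h is linear, so h(x) = H x
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    x_est = x_pred + K @ e
    P = (np.eye(5) - K @ H) @ P_pred
    estimates.append(x_est)
estimates = np.array(estimates)

def rmse(a, b):
    return np.sqrt(np.mean(np.sum((a - b)**2, axis=1)))

tail = slice(n_steps // 2, None)          # skip the initial transient
rmse_meas = rmse(measurements[tail], truth[tail, :2])
rmse_ekf = rmse(estimates[tail, :2], truth[tail, :2])
print(f"Jacobian matches finite differences: {jacobian_ok}")
print(f"position RMSE, raw measurements: {rmse_meas:.2f}")
print(f"position RMSE, EKF estimate:     {rmse_ekf:.2f}")
```

Comparing the position error of the filtered estimate with that of the raw measurements is a simple way to confirm that the filter is doing useful work on this model.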

## Summary

- EKF extends KF to nonlinear systems by linearizing at each step.  
- Same recursive predict–update form, but with Jacobians instead of fixed matrices.  
- Works well for mildly nonlinear problems.  
- Limitations appear if the system is highly nonlinear or uncertainty is large.
