### Modeling a dynamic system

To turn a linear dynamical system into the form a Kalman filter requires, we need to go from higher-order, continuous, linear differential equations to first-order, linear, discrete equations. The first step is to go from higher order to first order, which is easily achieved by creating new variables. E.g. if we have $\ddot{x}_1 = 6\dot{x}_1 - 9 x_1 + u$ we can set $x_2 = \dot{x}_1$. Then we have $\dot{x}_2 = 6x_2 - 9x_1 + u$, so $\dot{x}_1 = x_2$ together with this equation serves as our system of first-order linear differential equations. From here we can write $\bold{\dot{x}} = \bold{A}\bold{x} + \bold{B}u$ where now $\bold{x} = [x_1, x_2]^\intercal$, $\bold{A} = \begin{bmatrix}0&1\\-9&6\end{bmatrix}$ and $\bold{B} = [0, 1]^\intercal$.

The next step is to go from continuous to discrete, i.e. we need to formulate an $\bold{F}$ such that $\mathbf x(t_k) = \mathbf F(\Delta t)\mathbf x(t_{k-1})$ or more succinctly $\bold{x_k} = \bold{F}\bold{x_{k-1}}$ (ignoring the control input for now). In general, the solution to $\dot{x} = kx$ is $x = x_0 e^{kt}$. Likewise, the solution to $\bold{\dot{x}} = \bold{Ax}$ is $\bold{x} = e^{\bold{A}t}\bold{x_0}$, so $\bold{F} = e^{\bold{A}\Delta t}$. The matrix exponential is defined by the same Taylor series as the scalar one: $e^{\bold{A}t} = \bold{I} + \bold{A}t + \frac{(\bold{A}t)^2}{2!} + \dots$ As an exercise you can try this with:

$$\begin{bmatrix}\dot x \\ \dot v\end{bmatrix} =\begin{bmatrix}0&1\\0&0\end{bmatrix} \begin{bmatrix}x \\ v\end{bmatrix}$$

and you will find:

$$
\begin{aligned}
\mathbf x_k &=\begin{bmatrix}1&\Delta t\\0&1\end{bmatrix}\mathbf x_{k-1}
\end{aligned}$$
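
The exercise can also be checked numerically. Below is a minimal sketch that sums a truncated Taylor series for $e^{\mathbf A \Delta t}$ with NumPy. The helper name `expm_taylor`, the number of terms, and the value of `dt` are my own choices for illustration, not something from the book. For the constant-velocity matrix, $\mathbf A^2 = 0$, so the series stops after the linear term and the result is exact. The second matrix is the $\mathbf A$ implied by the example $\ddot{x}_1 = 6\dot{x}_1 - 9x_1 + u$ with the input ignored.

```python
import numpy as np

def expm_taylor(M, terms=20):
    """Approximate the matrix exponential e^M with a truncated Taylor series.

    Hypothetical helper for illustration; fine for small, well-scaled M only.
    """
    M = np.asarray(M, dtype=float)
    result = np.eye(M.shape[0])   # n = 0 term: identity
    term = np.eye(M.shape[0])
    for n in range(1, terms):
        term = term @ M / n       # builds M^n / n! incrementally
        result = result + term
    return result

dt = 0.1  # assumed time step

# Constant-velocity model: x' = v, v' = 0
A_cv = np.array([[0.0, 1.0],
                 [0.0, 0.0]])
F_cv = expm_taylor(A_cv * dt)
print(F_cv)

# System matrix from the second-order example, input u ignored
A_ex = np.array([[0.0, 1.0],
                 [-9.0, 6.0]])
F_ex = expm_taylor(A_ex * dt)
print(F_ex)
```

A truncated series like this is only reasonable when the entries of $\mathbf A \Delta t$ are small; a library routine for the matrix exponential would be the safer choice in real code.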


### Design of the Process Noise Matrix

This chapter goes on to discuss techniques for computing $Q$. I feel the author glosses over a lot of signal theory, so for now I don't feel the need to dig deep here.

### Stable computation of the posterior covariance

When we previously "derived" the Kalman filter equations starting from the Bayesian filter framework we had

$$
\bold{P} = (\bold{I} - \bold{KH})\bold{\bar{P}}
$$

This equation is correct assuming that we have the optimal Kalman gain $\bold{K}$, but in practice we don't necessarily have the optimal $\bold{K}$, both because the real world is never truly linear or Gaussian and because of numerical errors.

What if we derive a more robust form for the update to the posterior covariance? This chapter does that and comes out with the Joseph equation.

$$\mathbf P = (\mathbf I-\mathbf {KH})\mathbf{\bar P}(\mathbf I-\mathbf{KH})^\mathsf T + \mathbf{KRK}^\mathsf T$$

and if we substitute in $\mathbf K = \mathbf{\bar P H^\mathsf T}(\mathbf{H \bar P H}^\mathsf T + \mathbf R)^{-1}$, which is how we computed $\bold{K}$ in the Bayesian formulation, we recover $\bold{P} = (\bold{I} - \bold{KH})\bold{\bar{P}}$.

Therefore the original expression we had and Joseph's equation are equivalent assuming we have the optimal $\bold{K}$. It's just that Joseph's equation is correct even if we don't have the optimal $\bold{K}$.
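
A small numerical check of this claim, as a sketch only: the matrices `P_prior`, `H`, `R` and the deliberately wrong gain `K_bad` are made-up values, and the function names are mine. With the optimal gain the two forms agree. With a suboptimal gain the short form can produce a matrix that is not even symmetric, while the Joseph form stays symmetric by construction.

```python
import numpy as np

def update_simple(P_prior, K, H):
    """Short form: P = (I - KH) P_prior. Only valid for the optimal K."""
    I = np.eye(P_prior.shape[0])
    return (I - K @ H) @ P_prior

def update_joseph(P_prior, K, H, R):
    """Joseph form: P = (I - KH) P_prior (I - KH)^T + K R K^T."""
    I = np.eye(P_prior.shape[0])
    I_KH = I - K @ H
    return I_KH @ P_prior @ I_KH.T + K @ R @ K.T

# Made-up example: two states, one measurement of the first state
P_prior = np.array([[4.0, 1.0],
                    [1.0, 2.0]])
H = np.array([[1.0, 0.0]])
R = np.array([[0.5]])

# Optimal gain K = P_prior H^T (H P_prior H^T + R)^-1
S = H @ P_prior @ H.T + R
K_opt = P_prior @ H.T @ np.linalg.inv(S)

P_simple_opt = update_simple(P_prior, K_opt, H)
P_joseph_opt = update_joseph(P_prior, K_opt, H, R)
print(np.allclose(P_simple_opt, P_joseph_opt))  # the two forms agree here

# An arbitrary, non-optimal gain
K_bad = np.array([[0.5],
                  [0.5]])
P_simple_bad = update_simple(P_prior, K_bad, H)
P_joseph_bad = update_joseph(P_prior, K_bad, H, R)
print(P_simple_bad)   # not symmetric, so not a valid covariance
print(P_joseph_bad)   # symmetric
```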

### Deriving the Kalman gain equation

The book goes on to derive the Kalman gain equation, following on from the previous section (i.e. setting the Bayesian framework aside). We formulate the goal of the Kalman filter as minimizing the trace of $\bold{P}$. That is, we want to minimize the sum of the variances of the state variables, ignoring the off-diagonal covariances. To do this we take the derivative $\frac{d\, \mathrm{trace}(\mathbf P)}{d\mathbf K}$, set it to 0, and solve for $\bold{K}$. The author does this, recovering $\mathbf K = \mathbf{\bar P H^\mathsf T}(\mathbf{H \bar P H}^\mathsf T + \mathbf R)^{-1}$.
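
A sketch of how that calculation can go, starting from the Joseph equation. This is my own reconstruction using standard matrix-calculus identities, not necessarily the exact route the book takes. The shorthand $\mathbf S = \mathbf{H \bar P H}^\mathsf T + \mathbf R$ is introduced here for convenience; both $\mathbf{\bar P}$ and $\mathbf S$ are symmetric.

$$
\begin{aligned}
\mathbf P &= \mathbf{\bar P} - \mathbf{K H \bar P} - \mathbf{\bar P H}^\mathsf T \mathbf K^\mathsf T + \mathbf{K S K}^\mathsf T \\
\mathrm{trace}(\mathbf P) &= \mathrm{trace}(\mathbf{\bar P}) - 2\,\mathrm{trace}(\mathbf{K H \bar P}) + \mathrm{trace}(\mathbf{K S K}^\mathsf T) \\
\frac{d\, \mathrm{trace}(\mathbf P)}{d\mathbf K} &= -2\,\mathbf{\bar P H}^\mathsf T + 2\,\mathbf{K S} = 0 \\
\mathbf K &= \mathbf{\bar P H}^\mathsf T \mathbf S^{-1}
\end{aligned}
$$

The second line uses $\mathrm{trace}(\mathbf M) = \mathrm{trace}(\mathbf M^\mathsf T)$, and the third uses the identities $\frac{d\, \mathrm{trace}(\mathbf{K A})}{d\mathbf K} = \mathbf A^\mathsf T$ and $\frac{d\, \mathrm{trace}(\mathbf{K S K}^\mathsf T)}{d\mathbf K} = 2\mathbf{K S}$ for symmetric $\mathbf S$.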

### Numeric integration

Euler's method just uses the Taylor expansion up to first order. The trick is to make $\Delta t$ small enough that significant errors do not accumulate. The author then goes on to cite Runge-Kutta as the workhorse for numeric integration and gives the equations without explaining their derivation. I'll leave it at that.
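
To get a feel for the difference, here is a minimal sketch comparing a first-order Euler step with the classic fourth-order Runge-Kutta step. The test problem $\dot{x} = -x$ with $x(0) = 1$, the step size, and the function names are my own choices for illustration, picked because the exact answer $e^{-t}$ is known.

```python
import math

def euler_step(f, t, x, dt):
    """One Euler step: first-order Taylor expansion of x(t + dt)."""
    return x + dt * f(t, x)

def rk4_step(f, t, x, dt):
    """One classic fourth-order Runge-Kutta step."""
    k1 = f(t, x)
    k2 = f(t + dt / 2, x + dt / 2 * k1)
    k3 = f(t + dt / 2, x + dt / 2 * k2)
    k4 = f(t + dt, x + dt * k3)
    return x + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

def integrate(step, f, x0, t_end, dt):
    """Apply `step` repeatedly from t = 0 to t_end with fixed step dt."""
    n_steps = round(t_end / dt)
    t, x = 0.0, x0
    for _ in range(n_steps):
        x = step(f, t, x, dt)
        t += dt
    return x

def f(t, x):
    # Assumed test problem: dx/dt = -x, exact solution x(t) = exp(-t)
    return -x

exact = math.exp(-1.0)
euler_err = abs(integrate(euler_step, f, 1.0, 1.0, 0.1) - exact)
rk4_err = abs(integrate(rk4_step, f, 1.0, 1.0, 0.1) - exact)
print(f"Euler error: {euler_err:.2e}, RK4 error: {rk4_err:.2e}")
```

With the same step size, the Runge-Kutta error comes out several orders of magnitude smaller than the Euler error, at the cost of four evaluations of `f` per step instead of one.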