
# Graph methods for imaging, Vision and computing (B31RX) 2025

## Tutorial 7: Bayesian smoothing with Gaussian densities: the Rauch-Tung-Striebel (RTS) smoother

In this tutorial, we will apply the sum-product algorithm to extend the Kalman filter from the THA and implement a Bayesian smoother for estimating a multivariate Gaussian state. Following the reasoning of Tutorial 6, we will first derive the smoothing equations using Bayes' rule (and variable elimination), and then investigate how the sum-product rules can be used to compute the marginal distributions of interest directly.


### Background

#### Bayesian model:

We consider a multivariate state denoted $ \mathbf{p}_t \in \mathbb{R}^4 $. This state can vary over time with $ t \in \{1, \dots, T\} $.

The variations of $ \mathbf{p} $ over time are modelled a priori by a homogeneous order-1 Markov chain with transition kernel:

$$
f(\mathbf{p}_t \mid \mathbf{p}_{t-1}) = \mathcal{N}(\mathbf{p}_t ; \mathbf{Q} \mathbf{p}_{t-1}, \mathbf{R}),
$$

where $ \mathbf{Q} $ and $ \mathbf{R} $ are defined in the THA.

The state $ \mathbf{p} $ is not observed directly. Instead, it is partially observed via the observations $ \mathbf{y}_t \in \mathbb{R}^2 $, such that:

$$
\mathbf{y}_t = \mathbf{B} \mathbf{p}_t + \mathbf{w}_t,
$$

where $ \mathbf{B} $ is also defined in the THA and $ \mathbf{w}_t \sim \mathcal{N}(\mathbf{w}_t; \mathbf{0}, \sigma_n^2 \mathbf{I}_2) $.
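
To make the generative model concrete, here is a minimal sketch that draws a state trajectory and its observations from the two equations above. The numerical values of `Q`, `R`, `B` and `sigma_n` below are placeholders (a constant-velocity model with position-only observations), not the values from the THA, and the function name `simulate` and the fixed initial state `p1` are assumptions made for illustration, since the prior on $ \mathbf{p}_1 $ is not specified here.

```python
import numpy as np

def simulate(Q, R, B, sigma_n, p1, T, rng):
    """Draw p_1..p_T and y_1..y_T from the model, starting from a given p_1."""
    d, m = Q.shape[0], B.shape[0]
    P = np.zeros((T, d))  # states, one row per time step
    Y = np.zeros((T, m))  # observations, one row per time step
    P[0] = p1
    for t in range(T):
        if t > 0:
            # Transition kernel: p_t ~ N(Q p_{t-1}, R)
            P[t] = rng.multivariate_normal(Q @ P[t - 1], R)
        # Observation model: y_t = B p_t + w_t, with w_t ~ N(0, sigma_n^2 I)
        Y[t] = B @ P[t] + sigma_n * rng.standard_normal(m)
    return P, Y

# Placeholder matrices (assumed for illustration, NOT the THA values).
dt = 1.0
Q = np.array([[1, 0, dt, 0],
              [0, 1, 0, dt],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], dtype=float)
R = 0.01 * np.eye(4)
B = np.array([[1, 0, 0, 0],
              [0, 1, 0, 0]], dtype=float)

rng = np.random.default_rng(0)
p1 = np.array([0.0, 0.0, 1.0, 0.5])
P, Y = simulate(Q, R, B, sigma_n=0.5, p1=p1, T=50, rng=rng)
```

Replacing the placeholder matrices with those of the THA should reproduce data of the same kind as used there.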

In the THA, we investigated a sequential, online filtering method to compute

$$
f(\mathbf{p}_t \mid \mathbf{Y}_t),
$$

with $ \mathbf{Y}_t = \{ \mathbf{y}_1, \dots, \mathbf{y}_t \} $, i.e., the posterior distribution of $ \mathbf{p}_t $ conditioned on all observations up to and including time $ t $ (and not on future observations).  
Here we will compute the marginal distributions

$$
f(\mathbf{p}_t \mid \mathbf{Y}_T), \quad \forall t
$$

which can be computed once the whole sequence of observations has been collected. The Bayesian smoother associated with the Kalman filter is called the Rauch-Tung-Striebel (RTS) smoother.

We will first show that the marginal distributions above can be computed analytically using a brute-force approach, by first computing the joint posterior distribution

$$
f(\mathbf{p}_1, \dots, \mathbf{p}_T \mid \mathbf{Y}_T),
$$

and marginalising all but one state.

### Question 1

Does the joint prior distribution $ f(\mathbf{p}_1, \dots, \mathbf{p}_T) $ belong to a known family of parametric distributions?  
If so, explain which family and why.

### Question 2

Using Bayes' rule, show that $ f(\mathbf{p}_1, \dots, \mathbf{p}_T \mid \mathbf{Y}_T) $ belongs to a known family of parametric distributions.

### Question 3

Using the previous results, which family does $ f(\mathbf{p}_t \mid \mathbf{Y}_T) $ belong to?  
How can one compute its moments (e.g., mean and covariance)?

Computing these marginals from the joint posterior $ f(\mathbf{p}_1, \dots, \mathbf{p}_T \mid \mathbf{Y}_T) $ can be extremely expensive, especially for long sequences, because marginalising variables requires inverting large matrices. However, the marginals can be computed more efficiently using the **sum-product algorithm**.
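
To illustrate where the cost comes from, the following sketch builds the precision matrix of the joint posterior explicitly and inverts it. It assumes a Gaussian prior $ \mathbf{p}_1 \sim \mathcal{N}(\mathbf{m}_1, \mathbf{P}_1) $ on the first state (the names `m1` and `P1` are introduced here for illustration) and no missing observations; the function name `joint_posterior_marginals` is also hypothetical. It is one possible way to organise the brute-force computation, not a reference solution.

```python
import numpy as np

def joint_posterior_marginals(Y, Q, R, B, sigma_n, m1, P1):
    """Brute-force marginals of f(p_1..p_T | Y_T) via the full joint precision."""
    T, d = Y.shape[0], Q.shape[0]
    Rinv = np.linalg.inv(R)
    J = np.zeros((T * d, T * d))  # joint precision matrix (block tridiagonal)
    h = np.zeros(T * d)           # joint information vector, h = J @ mean

    def blk(t):
        return slice(t * d, (t + 1) * d)

    # Assumed Gaussian prior on the first state: p_1 ~ N(m1, P1).
    P1inv = np.linalg.inv(P1)
    J[blk(0), blk(0)] += P1inv
    h[blk(0)] += P1inv @ m1

    for t in range(T):
        # Likelihood term f(y_t | p_t).
        J[blk(t), blk(t)] += B.T @ B / sigma_n**2
        h[blk(t)] += B.T @ Y[t] / sigma_n**2
        if t > 0:
            # Transition term f(p_t | p_{t-1}).
            J[blk(t), blk(t)] += Rinv
            J[blk(t - 1), blk(t - 1)] += Q.T @ Rinv @ Q
            J[blk(t), blk(t - 1)] -= Rinv @ Q
            J[blk(t - 1), blk(t)] -= Q.T @ Rinv

    cov = np.linalg.inv(J)  # dense inverse of a (dT x dT) matrix: the costly step
    means = (cov @ h).reshape(T, d)
    covs = np.stack([cov[blk(t), blk(t)] for t in range(T)])
    return means, covs
```

The dense inverse scales cubically with $ dT $ (here $ d = 4 $), whereas message passing along the chain only ever manipulates $ d \times d $ matrices.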

### Question 4

Draw the factor graph (FG) associated with the Bayesian model:

$$
f(\mathbf{p}_1, \dots, \mathbf{p}_T, \mathbf{y}_1, \dots, \mathbf{y}_T).
$$

### Question 5

Is this graph a tree?

### Question 6

Compute the message from each leaf variable node to its neighbouring factor node.

### Question 7

Starting from $ \mathbf{p}_1 $, compute the messages propagating from $ t = 1 $ to $ t = T $.

### Question 8

Starting from $ \mathbf{p}_T $, compute the messages propagating from $ t = T $ to $ t = 1 $.
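
Both passes repeatedly multiply Gaussian messages arriving at a variable node. As a hedged aid, here is a minimal sketch of that primitive: the normalised product of two Gaussian densities over the same variable, computed by adding precisions and information vectors. The function name `gaussian_product` is hypothetical, and both covariances are assumed to be invertible.

```python
import numpy as np

def gaussian_product(m_a, P_a, m_b, P_b):
    """Mean and covariance of the normalised product N(x; m_a, P_a) N(x; m_b, P_b)."""
    J_a = np.linalg.inv(P_a)  # precision of the first message
    J_b = np.linalg.inv(P_b)  # precision of the second message
    P = np.linalg.inv(J_a + J_b)       # precisions add
    m = P @ (J_a @ m_a + J_b @ m_b)    # information vectors add
    return m, P

# Two unit-variance scalar messages centred at 0 and 2.
m, P = gaussian_product(np.array([0.0]), np.eye(1), np.array([2.0]), np.eye(1))
print(m, P)  # mean 1.0, variance 0.5
```

A message that is flat in some directions (for instance, one carrying information about only part of the state) has no finite covariance, so in that case it is safer to store precisions and information vectors directly rather than means and covariances.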

### Question 9

Implement the sum-product algorithm for this Bayesian model and, using the data of the **THA** (with and without missing data), compare the estimation results of the **Bayesian filter (THA)** and the **Bayesian smoother**.

What do you observe in the case of missing data?
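
As a possible starting point for the implementation, the sketch below writes the forward pass as a standard Kalman filter and the backward pass in the usual RTS form. Several things are assumptions rather than part of this tutorial: the function names, the convention that a missing observation is a row of `NaN` in `Y`, the Gaussian prior `N(m1, P1)` on the first state, and all numerical values (placeholders, not the THA data or matrices).

```python
import numpy as np

def kalman_filter(Y, Q, R, B, sigma_n, m1, P1):
    """Forward pass. Rows of Y containing NaN are treated as missing."""
    T, d = Y.shape[0], Q.shape[0]
    m_pred, P_pred = np.zeros((T, d)), np.zeros((T, d, d))
    m_filt, P_filt = np.zeros((T, d)), np.zeros((T, d, d))
    for t in range(T):
        if t == 0:
            m_pred[t], P_pred[t] = m1, P1  # assumed prior on p_1
        else:
            m_pred[t] = Q @ m_filt[t - 1]
            P_pred[t] = Q @ P_filt[t - 1] @ Q.T + R
        if np.any(np.isnan(Y[t])):
            # No observation: the filtered density equals the prediction.
            m_filt[t], P_filt[t] = m_pred[t], P_pred[t]
        else:
            S = B @ P_pred[t] @ B.T + sigma_n**2 * np.eye(B.shape[0])
            K = np.linalg.solve(S, B @ P_pred[t]).T  # K = P_pred B^T S^{-1}
            m_filt[t] = m_pred[t] + K @ (Y[t] - B @ m_pred[t])
            P_filt[t] = P_pred[t] - K @ S @ K.T
    return m_pred, P_pred, m_filt, P_filt

def rts_smoother(Q, m_pred, P_pred, m_filt, P_filt):
    """Backward pass: turns filtered moments into smoothed moments."""
    T = m_filt.shape[0]
    m_s, P_s = m_filt.copy(), P_filt.copy()  # at t = T both coincide
    for t in range(T - 2, -1, -1):
        # Smoother gain G = P_filt[t] Q^T P_pred[t+1]^{-1}
        G = np.linalg.solve(P_pred[t + 1], Q @ P_filt[t]).T
        m_s[t] = m_filt[t] + G @ (m_s[t + 1] - m_pred[t + 1])
        P_s[t] = P_filt[t] + G @ (P_s[t + 1] - P_pred[t + 1]) @ G.T
    return m_s, P_s

# Placeholder model and synthetic data (assumed, NOT the THA values).
dt = 1.0
Q = np.array([[1, 0, dt, 0],
              [0, 1, 0, dt],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], dtype=float)
R = 0.01 * np.eye(4)
B = np.array([[1, 0, 0, 0],
              [0, 1, 0, 0]], dtype=float)
sigma_n = 0.5
T = 30
rng = np.random.default_rng(0)
Y = np.outer(np.arange(T), [1.0, 0.5]) + sigma_n * rng.standard_normal((T, 2))
Y[10:15] = np.nan  # simulate a gap of missing observations

m_pred, P_pred, m_filt, P_filt = kalman_filter(
    Y, Q, R, B, sigma_n, np.zeros(4), 10.0 * np.eye(4))
m_smooth, P_smooth = rts_smoother(Q, m_pred, P_pred, m_filt, P_filt)
```

Comparing `np.trace(P_filt[t])` with `np.trace(P_smooth[t])` across the gap is one simple way to quantify how much the future observations help at each time step.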