# Internship Finn Sherry @ Sioux Mathware

---

# Bayesian grey-box system identification for thermal effects: UKF using ForneyLab
In this notebook, we will try to apply an unscented Kalman filter (UKF), for which we use [ForneyLab.jl](https://github.com/biaslab/ForneyLab.jl), a package for Julia developed by [BIASlab](https://biaslab.github.io/).

Last update: 27-07-2022

$\renewcommand{\vec}[1]{\boldsymbol{\mathrm{#1}}}$
$\newcommand{\covec}[1]{\hat{\vec{#1}}}$
$\newcommand{\mat}[1]{\boldsymbol{\mathrm{#1}}}$
$\newcommand{\inv}[1]{#1^{-1}}$
$\newcommand{\given}{\, \vert \,}$
$\newcommand{\problaw}[1]{p(#1)}$
$\newcommand{\Expectation}{\mathbb{E}}$
$\newcommand{\Variance}{\mathbb{V}}$
$\newcommand{\Geometric}{\textrm{Geom}}$
$\newcommand{\NegBin}{\textrm{NB}}$
$\newcommand{\Poisson}{\textrm{Pois}}$
$\newcommand{\Bernoulli}{\textrm{Bern}}$
$\newcommand{\Uniform}{\textrm{Uni}}$
$\newcommand{\NormDist}{\mathcal{N}}$
$\newcommand{\GammaDist}{\textrm{Gamma}}$
$\newcommand{\ExpDist}{\textrm{Exp}}$
$\renewcommand{\Uniform}{\textrm{Uniform}}$
$\newcommand{\Binomial}{\textrm{Binom}}$
$\newcommand{\BetaDist}{\textrm{Beta}}$
$\newcommand{\BetaFunc}{\textrm{B}}$
$\newcommand{\setify}[1]{\mathbb{#1}}$
$\newcommand{\NatSet}{\setify{N}}$
$\newcommand{\IntSet}{\setify{Z}}$
$\newcommand{\RealSet}{\setify{R}}$
$\newcommand{\CompSet}{\setify{C}}$
$\newcommand{\QuatSet}{\setify{H}}$
$\newcommand{\FieldSet}{\setify{K}}$
$\newcommand{\define}{:=}$
$\newcommand{\enifed}{=:}$
$\newcommand{\loss}{\ell}$
$\newcommand{\risk}{\textrm{R}}$
$\newcommand{\MSE}{\textrm{MSE}}$
$\newcommand{\norm}[1]{\lVert #1 \rVert}$
$\newcommand{\InnerProduct}[2]{\left( #1 , #2 \right)}$
$\newcommand{\kilogram}{\textrm{kg}}$
$\newcommand{\metre}{\textrm{m}}$
$\newcommand{\watt}{\textrm{W}}$
$\newcommand{\joule}{\textrm{J}}$
$\newcommand{\kelvin}{\textrm{K}}$
$\newcommand{\second}{\textrm{s}}$
$\newcommand{\centi}{\textrm{c}}$
$\newcommand{\bigO}{\mathcal{O}}$

In [None]:
using ForneyLab, LinearAlgebra, Random # Computational
using StatsPlots, LaTeXStrings, Measures # Formatting
rng = MersenneTwister(987654321)

## Physical Model
To get started, we reduce the system to 1 material with 1 temperature sensor:
$$ T_{n + 1} = \underbrace{\left(1 - \Delta t \frac{h_a A_1}{m_1 c_{p, 1}}\right)}_{\theta} T_n + \underbrace{\Delta t \frac{h_a A_1}{m_1 c_{p, 1}}}_{1 - \theta} T_a \, . $$
This reduced model has neither conduction nor radiation, and we do not put any heat into the system. We can view it as an autoregressive model with exogenous input, where the constant ambient temperature $T_a$ plays the role of the input.
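
The update rule can be read as a forward-Euler step of a lumped heat balance. As a sketch, assuming the block exchanges heat only by convection to its surroundings (the continuous-time equation below is an assumption made for illustration, not something stated in this notebook):

$$ m_1 c_{p, 1} \frac{\mathrm{d}T}{\mathrm{d}t} = h_a A_1 (T_a - T) \quad \Longrightarrow \quad \frac{T_{n + 1} - T_n}{\Delta t} = \frac{h_a A_1}{m_1 c_{p, 1}} (T_a - T_n) \, . $$

Solving the right-hand equation for $T_{n + 1}$ gives the recursion in terms of $\theta$. Under this reading, $0 < \theta < 1$ holds exactly when $\Delta t < m_1 c_{p, 1} / (h_a A_1)$, in which case each step moves the temperature a fraction $1 - \theta$ of the way towards $T_a$ without overshooting.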

## Simulate Data
Let's start by visualising the system dynamics first.

In [None]:
# Time horizon
N = 200
Δt = 5e1
time = [n * Δt for n in 0:N]
# Material properties (SI units assumed)
m = 0.6    # Mass (kg)
cp = 1.5e3 # Specific heat capacity (J / (kg K))
A = 5e-2   # Surface area (m^2)
# Temperature state
T_ = zeros(N + 1)
# Known temperatures
T_a = 297.
T_[1] = 255.
# Unknown parameter
h_a = 10. # Heat transfer coefficient to the ambient (W / (m^2 K), SI units assumed)
# Transition coefficient
real_θ = 1 - Δt * h_a * A / (m * cp)
# Simulate evolution
for n = 1:N
    T_[n + 1] = real_θ * T_[n] + (1 - real_θ) * T_a
end

In [None]:
p1 = plot(time, T_, ylabel = L"T~(\textrm{K})", xlabel = L"t~(\textrm{s})", label = L"T_1", xlim = (time[1], time[end]), ylim = (250, 300), rightmargin = 6mm)
hline!(p1, [T_a], label = L"T_a")

This is reasonable: we see that the temperature in the block equilibrates to the ambient temperature.
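
As a sanity check on this behaviour, the recursion has the closed-form solution $T_n = T_a + \theta^n (T_0 - T_a)$, so the deviation from ambient shrinks by a factor $\theta$ per step. The sketch below compares the two and counts how many steps are needed to get within 1 K of ambient. It repeats the notebook's parameter values so that it runs on its own; the names `T_rec`, `T_closed`, `max_gap` and `n_within_1K` are introduced here for illustration.

```julia
# Parameter values copied from the simulation cell
Δt, h_a, A, m, cp = 5e1, 10.0, 5e-2, 0.6, 1.5e3
T_a, T_0, N = 297.0, 255.0, 200

θ = 1 - Δt * h_a * A / (m * cp)

# Recursion, as in the simulation
T_rec = zeros(N + 1)
T_rec[1] = T_0
for n in 1:N
    T_rec[n + 1] = θ * T_rec[n] + (1 - θ) * T_a
end

# Closed form: the deviation from ambient shrinks by a factor θ per step
T_closed = [T_a + θ^n * (T_0 - T_a) for n in 0:N]

# Largest difference between the recursion and the closed form
max_gap = maximum(abs.(T_rec .- T_closed))

# Smallest n with |T_n - T_a| <= 1 K, from θ^n * |T_0 - T_a| <= 1
n_within_1K = ceil(Int, log(1 / abs(T_0 - T_a)) / log(θ))
```

With these values the two curves agree to floating-point precision, and the remaining gap to ambient at the end of the horizon is well below 1 K.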

## Probabilistic Model
Next, we define the corresponding probabilistic model. We assume that $T_a$ is known and $\theta$ is unknown. Our observations will typically be noisy, and we model this noise as white. If we call our observations $y$, we can write our measurement model as
$$y_n = T_n + e_n = \theta T_{n - 1} + (1 - \theta) T_a + e_n,$$
where $e_n \sim \mathcal{N}(0, \tau^{-1})$ is white noise with precision parameter $\tau$. We assume in this example that the precision of the measurement noise is known. Because both $\theta$ and the state are unknown, their product makes the model nonlinear in the unknowns, so we apply [nonlinear Kalman filtering](https://github.com/biaslab/ForneyLab.jl/blob/master/demo/nonlinear_kalman_filter.ipynb) from ForneyLab.
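
To give a feel for what an unscented approximation does with the product of state and parameter, here is a minimal hand-written unscented transform in plain Julia. It is a sketch under stated assumptions, not ForneyLab's implementation: the function name `unscented_transform` and the tuning constants `α`, `β`, `κ` (and their defaults) are illustrative choices, and the input moments are example values.

```julia
using LinearAlgebra

# Unscented transform of N(μ, Σ) through a scalar-valued function f.
# α, β, κ are the usual tuning constants; the defaults are illustrative choices.
function unscented_transform(f, μ, Σ; α = 1.0, β = 0.0, κ = 1.0)
    n = length(μ)
    λ = α^2 * (n + κ) - n
    L = cholesky(Symmetric((n + λ) * Σ)).L            # square root of the scaled covariance
    points = [μ, [μ + L[:, i] for i in 1:n]..., [μ - L[:, i] for i in 1:n]...]
    w_m = [λ / (n + λ); fill(1 / (2 * (n + λ)), 2n)]  # weights for the mean
    w_c = copy(w_m)                                   # weights for the (co)variance
    w_c[1] += 1 - α^2 + β
    z = [f(p) for p in points]                        # propagated sigma points
    m = dot(w_m, z)                                   # approximate mean of f
    v = dot(w_c, (z .- m) .^ 2)                       # approximate variance of f
    c = sum(w_c[i] * (points[i] - μ) * (z[i] - m) for i in eachindex(points))  # cross-covariance with the input
    return m, v, c
end

# Example: independent Gaussian beliefs x ~ N(300, 400) and θ ~ N(2, 1), ambient T_a = 297
T_a = 297.0
g(x, θ) = θ * x + (1 - θ) * T_a
m_pred, v_pred, c_pred = unscented_transform(p -> g(p[1], p[2]), [300.0, 2.0], [400.0 0.0; 0.0 1.0])
```

For these inputs the transform gives a mean of 303, which is exact for independent Gaussian inputs, and a variance of 1609. The exact variance is 2009: the transform misses the product term $400 \cdot 1$ of the two input variances, which illustrates that the result is an approximation.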

In [None]:
real_τ = 0.04 # Observation precision τ = inv(σ^2), where σ is the observation standard deviation (here σ = 5)
generate_data(rng, n, states, τ) = states + randn(rng, n) * inv(sqrt(τ))

y_ = generate_data(rng, N + 1, T_, real_τ)
plot(time, T_, xlabel = L"t~(\textrm{s})", ylabel = L"T~(\textrm{K})", color = "orange", label = "True", xlim = (time[1], time[end]), ylim = (240, 320))
scatter!(time, y_, color = "orange", label = "Observed")

Now that we have the data, we can perform inference. This involves first defining the priors, and telling ForneyLab the relation between the parameters and the states.

In [None]:
# Parameters for prior distributions
m_x_0 = 300 # Temperature
v_x_0 = 400
m_θ_0 = 2 # Parameter to identify
v_θ_0 = 1
# Probabilistic Model
fg = FactorGraph()                                                          # Start model specification
@RV θ ~ Gaussian(placeholder(:m_θ), placeholder(:v_θ), id=:θ)               # Prior for θ
@RV x_tmin1 ~ Gaussian(placeholder(:m_x), placeholder(:v_x), id=:x_tmin1)   # Define previous state
g(x_tmin1, θ) = θ * x_tmin1 + (1 - θ) * T_a                                 # Nonlinear state transition function
@RV x_t ~ Delta{Unscented}(x_tmin1, θ; g=g, id=:x_t)                        # State transition node
@RV y_t ~ Gaussian{Precision}(x_t, real_τ, id=:y_t)                         # Observation likelihood
placeholder(y_t, :y_t);                                                     # Tell ForneyLab that variable y_t will be observed later on
# Define sum-product message passing procedure with state x_t and parameter θ as variables of interest
algo = messagePassingAlgorithm([x_t, θ])
# Compile message passing procedure to an inference algorithm
code = algorithmSourceCode(algo);
# Import compiled functions to workspace
eval(Meta.parse(code));

In [None]:
# Initialize arrays for storing parameter estimates
m_x_t = Vector{Float64}(undef, N + 1)
v_x_t = Vector{Float64}(undef, N + 1)
m_θ_t = Vector{Float64}(undef, N + 1)
v_θ_t = Vector{Float64}(undef, N + 1)

# Initialize previous parameter estimates
m_x_tmin1 = m_x_0
v_x_tmin1 = v_x_0
m_θ_tmin1 = m_θ_0
v_θ_tmin1 = v_θ_0

# Recursive estimation procedure (posteriors at t => priors at t+1)
for t ∈ 1:(N + 1)    
    # Store data for current time-step
    data = Dict(:y_t => y_[t],
                :m_x => m_x_tmin1,
                :v_x => v_x_tmin1,
                :m_θ => m_θ_tmin1,
                :v_θ => v_θ_tmin1)
    # Estimate marginal distributions of interest (x_t and θ)
    marginals = step!(data)  
    # Extract parameters of estimated marginal distributions
    (m_x, v_x) = ForneyLab.unsafeMeanCov(marginals[:x_t])
    (m_θ, v_θ) = ForneyLab.unsafeMeanCov(marginals[:θ])
    # Store the estimates and reuse the posteriors as priors for the next time-step
    m_x_tmin1 = m_x_t[t] = m_x
    v_x_tmin1 = v_x_t[t] = v_x
    m_θ_tmin1 = m_θ_t[t] = m_θ
    v_θ_tmin1 = v_θ_t[t] = v_θ
end

We can now visualise the results. The UKF simultaneously estimates the parameter and the true state. As we make more measurements, we would expect our parameter estimates to become better, which should also improve the state estimates.
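
To make "simultaneously" concrete: a single observation updates both the state and the parameter, because the predicted state is correlated with both. The sketch below writes out such a Gaussian conditioning step by hand. It is illustrative only and is not ForneyLab's generated code. The function name `condition_on_observation` is hypothetical, the observation `y` is made up, and the predicted moments are example values worked out by hand for priors $x \sim \mathcal{N}(300, 400)$, $\theta \sim \mathcal{N}(2, 1)$ and $T_a = 297$.

```julia
using LinearAlgebra

# Condition a joint Gaussian over (z, x) on an observation y = x + e, e ~ N(0, 1/τ).
# m_z, V_z: mean and covariance of z = (previous state, θ)
# m_x, v_x: predicted mean and variance of the current state x
# c_zx:     cross-covariance between z and x
function condition_on_observation(y, m_z, V_z, m_x, v_x, c_zx, τ)
    S = v_x + 1 / τ                        # innovation variance
    r = y - m_x                            # innovation
    m_x_post = m_x + (v_x / S) * r         # updated state mean
    v_x_post = v_x - v_x^2 / S             # updated state variance
    m_z_post = m_z + (c_zx / S) * r        # updated mean of (previous state, θ)
    V_z_post = V_z - (c_zx * c_zx') / S    # updated covariance of (previous state, θ)
    return m_x_post, v_x_post, m_z_post, V_z_post
end

# Illustrative inputs (assumed, see the text above)
m_z, V_z = [300.0, 2.0], [400.0 0.0; 0.0 1.0]
m_x, v_x, c_zx = 303.0, 1609.0, [800.0, 3.0]
y, τ = 255.0, 0.04                         # hypothetical observation and the notebook's precision

m_x_post, v_x_post, m_z_post, V_z_post = condition_on_observation(y, m_z, V_z, m_x, v_x, c_zx, τ)
m_θ_post, v_θ_post = m_z_post[2], V_z_post[2, 2]
```

The state variance drops below the observation noise variance $1 / \tau = 25$ in one step, while the variance of $\theta$ shrinks only slightly, so the parameter needs many observations to be pinned down. The recursion in this notebook carries only the marginal means and variances of the state and $\theta$ forward, so any correlation between them, such as the off-diagonal entries of `V_z_post` here, is discarded between steps.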

In [None]:
p_T = plot(time, T_, xlabel = L"t~(\textrm{s})", ylabel = L"T~(\textrm{K})", color = "orange", label = "True", xlim = (time[1], time[end]), ylim = (240, 320))
plot!(p_T, time, m_x_t, color = "blue", ribbon = 2 * sqrt.(v_x_t), alpha = 0.25, fill_color = "blue", label = "Bayesian")
scatter!(p_T, time, y_, color = "orange", label = "Observed")
# savefig(p_T, "Results\\Explore\\UKF\\LSSM_1_block_Forney_temp.pdf")

The plot above shows the posterior mean in blue (for a Gaussian posterior this is also the mode), with a ribbon extending two standard deviations on each side, which covers roughly 95 % of the probability mass. Around $t = 2000~\second$, the state is significantly overestimated, but this error appears to decay over time.

We can also visualise the evolution of the parameter posterior.

In [None]:
p_p = plot(time, m_θ_t, color = "blue", ribbon = 2 * sqrt.(v_θ_t), alpha = 0.25, fill_color = "blue", xlabel = L"t~(\textrm{s})", ylabel = L"θ", label = "Bayesian", xlim = (time[1], time[end]), ylim = (0, 3))
hline!(p_p, [real_θ], color = "orange", label = "True")
# savefig(p_p, "Results\\Explore\\UKF\\LSSM_1_block_Forney_param.pdf")

In [None]:
(last(m_θ_t) - real_θ) / real_θ # Relative error of the mean of the final posterior of θ

As the plot above suggests, this approach has been very successful: the posterior mean of $\theta$ quickly approaches the true value of $\theta$. The relative error is only about 0.5 %.

We can further quantify how good our estimate is by determining the risk. A common choice of risk is the mean squared error (MSE), which corresponds to a quadratic loss function. Here we evaluate it as the expected squared error under the Gaussian prior or posterior over $\theta$.
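
The function in the next cell adds the squared error of the mean to the variance. A short derivation of why this is the expected quadratic loss, assuming a Gaussian belief $\theta \sim \mathcal{N}(\mu, \sigma^2)$ and writing $\theta^*$ for the true value (this notation is introduced here):

$$ \mathbb{E}\left[(\theta - \theta^*)^2\right] = \mathbb{E}\left[(\theta - \mu)^2\right] + 2 (\mu - \theta^*) \, \mathbb{E}\left[\theta - \mu\right] + (\mu - \theta^*)^2 = \sigma^2 + (\mu - \theta^*)^2 \, . $$

As a worked example, the simulation values give $\theta^* = 1 - 25 / 900 = 35 / 36 \approx 0.972$, so the prior $\mathcal{N}(2, 1)$ has risk $1 + (37 / 36)^2 \approx 2.06$.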

In [None]:
using Distributions
MSE(true_θ, post_mean, post_var) = (post_mean - true_θ)^2 + post_var

mse_prior = MSE(real_θ, m_θ_0, v_θ_0) # Prior mean and variance defined with the model
mse_post = MSE(real_θ, m_θ_t[end], v_θ_t[end])
mse_prior, mse_post

Clearly, our risk has decreased significantly after the inference. We have to put the prior and posterior in separate panels, because they have vastly different scales:

In [None]:
p_MSE_prior = plot(Distributions.Normal(m_θ_0, sqrt(v_θ_0)), ylim = (0, 0.5), xlabel = L"θ", ylabel = "Density", label = L"Prior $θ$", title = "Prior")
vline!(p_MSE_prior, [real_θ], label = L"True $θ$")
annotate!(p_MSE_prior, (0.05, 0.95), (L"\textrm{MSE} = %$(round(mse_prior, sigdigits = 2))", 12, :left))
p_MSE_post = plot(Distributions.Normal(m_θ_t[end], sqrt(v_θ_t[end])), ylim = (0, 15), xlabel = L"θ", ylabel = "Density", label = L"Posterior $θ$",  title = "Posterior")
vline!(p_MSE_post, [real_θ], label = L"True $θ$")
annotate!(p_MSE_post, (0.05, 0.95), (L"\textrm{MSE} = %$(round(mse_post, sigdigits = 2))", 12, :left))
p_MSE = plot(p_MSE_prior, p_MSE_post, layout = (1, 2), size = (800, 400), leftmargin = 6mm)
# savefig(p_MSE, "Results\\Explore\\UKF\\LSSM_1_block_Forney_MSE.pdf")

## Conclusion
Unfortunately, it does not seem to be possible to use a nonlinear Kalman filter in ForneyLab for higher-dimensional versions of our problem, and we do not have the time to implement such a filter ourselves. Hence, we must look to other approximate methods. We return to the [main Julia notebook](sysid-thermal-AR.ipynb).