## Silverbox

Silverbox refers to one of the nonlinear system identification benchmarks on http://nonlinearbenchmark.org/#Silverbox. 
It is a simulation of a [Duffing oscillator](https://en.wikipedia.org/wiki/Duffing_equation), occurring for instance in nonlinear spring pendulums.

The system is described by a second-order differential equation with a noisy observation:

$$\begin{align}
m \frac{d^2 x(t)}{dt^2} + v \frac{d x(t)}{dt} + a x(t) + b x^3(t) =&\ u(t) + w(t) \\
y(t) =&\ x(t) + e(t)
\end{align}$$

where
$$\begin{align}
m     =&\ \text{mass} \\
v     =&\ \text{viscous damping} \\
a     =&\ \text{linear stiffness} \\
b     =&\ \text{nonlinear stiffness} \\
y(t)    =&\ \text{observation (displacement)} \\
x(t)    =&\ \text{state (displacement)} \\
u(t)    =&\ \text{force} \\
e(t)    =&\ \text{measurement noise} \\
w(t)    =&\ \text{process noise}
\end{align}$$
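
To get a feel for these dynamics, the noise-free version of the equation can be integrated numerically. The following is a minimal sketch using forward Euler. The function names `duffing_rhs` and `simulate_duffing`, the parameter values, the step size and the sinusoidal force are all hypothetical choices for illustration and are not taken from the benchmark.

```julia
# Noise-free Duffing equation written as two first-order equations.
# All parameter values are hypothetical, chosen only for illustration.
function duffing_rhs(x, dx, u; m=1.0, v=0.1, a=1.0, b=0.5)
    ddx = (u - v*dx - a*x - b*x^3) / m
    return dx, ddx
end

# Forward-Euler integration with step size dt, returning the displacement x(t)
function simulate_duffing(u::Vector{Float64}; dt=0.01, x0=0.0, dx0=0.0, kwargs...)
    x, dx = x0, dx0
    xs = zeros(length(u))
    for (k, u_k) in enumerate(u)
        xs[k] = x
        vel, acc = duffing_rhs(x, dx, u_k; kwargs...)
        x, dx = x + dt*vel, dx + dt*acc
    end
    return xs
end

u = [sin(0.05k) for k in 1:1000]   # hypothetical sinusoidal force
xs = simulate_duffing(u)
```

Setting `b=0.0` in the call gives the linear system studied in the solution steps.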

### Solution steps

#### 1. Ignore nonlinear stiffness

For now, we ignore the nonlinear stiffness component by setting the parameter $b$ to 0. The state transition thus reduces to:

$$\begin{align}
m x''(t) + v x'(t) + a x(t) = u(t) + w(t) 
\end{align}$$

#### 2. Divide by leading coefficient m

We will reduce the equation to standard form by dividing through by the mass coefficient $m$.

$$\begin{align}
x''(t) + \frac{v}{m} x'(t) + \frac{a}{m} x(t) = \frac{1}{m} u(t) + \frac{1}{m}w(t) 
\end{align}$$

#### 3. Reduce to first-order differential equation

We make the following substitutions:

$$\begin{align} 
z_1(t) =&\ x(t) \\
z_2(t) =&\ x'(t) \, , 
\end{align}$$

which produces:

$$\begin{align}
z_1'(t) =&\ z_2(t) \\ 
z_2'(t) =&\ -\frac{v}{m} z_2(t) - \frac{a}{m} z_1(t) + \frac{1}{m} u(t) + \frac{1}{m}w(t)  \, .
\end{align}$$

We can rewrite this in matrix form:
$$\begin{align}
\underbrace{\begin{bmatrix} z_1'(t) \\ z_2'(t) \end{bmatrix}}_{\frac{d}{dt}z(t)} = \underbrace{\begin{bmatrix} 0 & 1 \\ -\frac{a}{m} & -\frac{v}{m} \end{bmatrix}}_{A} \underbrace{\begin{bmatrix} z_1(t) \\ z_2(t) \end{bmatrix}}_{z(t)} + \underbrace{\begin{bmatrix} 0 \\ \frac{1}{m} \end{bmatrix}}_{B} u(t) + \begin{bmatrix} 0 \\ \frac{1}{m} \end{bmatrix} w(t)  \, .
\end{align}$$
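
As a quick check of the matrix form, the sketch below builds $A$ and $B$ for hypothetical values of $m$, $v$ and $a$ (not estimates from the Silverbox data) and evaluates the noise-free derivative $\frac{d}{dt}z(t) = A z(t) + B u(t)$. The helper name `dz` is an assumption for illustration.

```julia
using LinearAlgebra

# Hypothetical physical parameters, for illustration only
m, v, a = 2.0, 1.0, 0.2

# Transition matrix A and control vector B from the matrix form above
A = [0.0 1.0; -a/m -v/m]
B = [0.0, 1/m]

# Noise-free time derivative of the state z = [x, x']
dz(z, u) = A*z + B*u

z0 = [0.5, 0.0]
dz0 = dz(z0, 1.0)

# With positive m, v and a, both eigenvalues of A have negative real part
stable = all(real.(eigvals(A)) .< 0)
```

The second entry of `dz0` matches the scalar equation $z_2' = -\frac{v}{m} z_2 - \frac{a}{m} z_1 + \frac{1}{m} u$ evaluated at the same point.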


#### 4. Discretize using Euler-Maruyama

We can perform an approximate discretization using Euler-Maruyama:

$$\begin{align}
\frac{z_{t+1} - z_{t}}{\Delta t} =&\ A z_t + B u_t + B w_t \\
z_{t+1} - z_{t} =&\ A z_t \Delta t + B u_t \Delta t + B \Delta \beta_t \, .
\end{align}$$

Here, $\Delta \beta_t \sim \mathcal{N}(0, \tau^{-1} \Delta t)$ is the process noise increment over one step, with $\tau$ the process noise precision, and $\Delta t = (t+1) - t = 1$. Plugging in $\Delta t = 1$ produces:

$$\begin{align}
z_{t+1} = (I + A) z_t + B u_t + B \Delta \beta_t \, .
\end{align}$$
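
The discretized recursion can be simulated directly. This is a minimal sketch under assumed values: the function name `simulate_linear`, the default parameters and the random input are hypothetical and only serve to illustrate the update $z_{t+1} = (I + A) z_t + B u_t + B \Delta \beta_t$.

```julia
using LinearAlgebra, Random

# Simulate the Euler-Maruyama recursion with Δt = 1.
# Default parameter values are hypothetical.
function simulate_linear(u; m=2.0, v=1.0, a=0.2, τ=100.0, rng=MersenneTwister(1))
    A = [0.0 1.0; -a/m -v/m]
    B = [0.0, 1/m]
    T = length(u)
    z = zeros(2, T+1)                      # z[:, 1] is the initial state
    for t in 1:T
        Δβ = sqrt(1/τ) * randn(rng)        # Δβ_t ~ N(0, τ⁻¹) for Δt = 1
        z[:, t+1] = (I + A)*z[:, t] + B*u[t] + B*Δβ
    end
    return z
end

# Hypothetical random input signal
z_sim = simulate_linear(randn(MersenneTwister(2), 200))
```

Passing `τ=Inf` switches the noise off, which is convenient for checking the deterministic part of the update by hand.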

#### 5. Convert to Gaussian probabilities

We now have a standard discrete-time state transition with white noise. We can therefore cast it to:

$$\begin{align}
z_{t+1} \sim&\ \mathcal{N}((I + A) z_{t} + B u_t, C) \\
y_t \sim&\ \mathcal{N}(c^{\top} z_t, \sigma^2) \, ,
\end{align}$$

for $C = \tau^{-1} B B^{\top} = \begin{bmatrix} 0 & 0 \\ 0 & \frac{\tau^{-1}}{m^2} \end{bmatrix}$, $c = \begin{bmatrix} 1 & 0 \end{bmatrix}^{\top}$ and $e_t \sim \mathcal{N}(0, \sigma^2)$.

#### 6. Choose priors

I will first study a situation with known measurement noise (so $\sigma$ is fixed). Shorthand notation for coefficients:

$$\begin{align} 
\theta_1 =&\ \frac{-v}{m} \\
\theta_2 =&\ \frac{-a}{m} \\
\eta =&\ \frac{1}{m} \\
\zeta =&\ \frac{\tau^{-1}}{m^2} \, .
\end{align}$$

Given four equations and four unknowns, I can recover $m$, $v$, $a$ and $\tau$ from $\theta_1$, $\theta_2$, $\eta$ and $\zeta$. Each of the two $\theta$'s can be either negative or positive, while $\eta$ and $\zeta$ are strictly positive. I have thus chosen the following priors:
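
The recovery of the physical parameters can be sketched as follows. The function name `recover_physical` is hypothetical, and the sketch assumes that $\zeta$ is the $(2,2)$ entry of the process noise covariance $\tau^{-1} B B^{\top}$, that is $\zeta = \tau^{-1}/m^2$. If $\zeta$ is defined with a different power of $m$, only the line computing `τ` changes.

```julia
# Invert the shorthand coefficients:
#   θ1 = -v/m, θ2 = -a/m, η = 1/m, ζ = τ⁻¹/m²  (the last one is an assumption of this sketch)
function recover_physical(θ1, θ2, η, ζ)
    m = 1/η
    v = -θ1*m
    a = -θ2*m
    τ = η^2/ζ          # from ζ = 1/(τ m²)
    return (m=m, v=v, a=a, τ=τ)
end

# Round trip with hypothetical values m = 2, v = 1, a = 0.2, τ = 100
phys = recover_physical(-0.5, -0.1, 0.5, 0.0025)
```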

$$\begin{align}
\theta \sim&\ \mathcal{N}(m^{0}_\theta, V^{0}_\theta) \\
\log(\eta) \sim&\ \mathcal{N}(m^{0}_\eta, v^{0}_\eta) \\
\zeta \sim&\ \Gamma(a^{0}_\zeta, b^{0}_\zeta) 
\end{align}$$

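I need a normal approximation for $\eta$ because it is a coefficient in the mean of the state transition, so I place a Gaussian prior on $\log(\eta)$, which also keeps $\eta$ positive. I don't need that for $\zeta$, because it only appears in the process noise.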
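
A Gaussian on $\log(\eta)$ implies a log-normal distribution on $\eta$. The sketch below draws samples to illustrate two of its properties: the samples are always positive, and the mean of $\eta$ is $\exp(m + v/2)$ rather than $\exp(m)$, which is the median. The prior values `m_η` and `v_η` are hypothetical.

```julia
using Random, Statistics

rng = MersenneTwister(42)

# Hypothetical prior mean and variance of log(η)
m_η, v_η = -1.0, 0.25

# Sample log(η) from the Gaussian prior and map to η
log_η = m_η .+ sqrt(v_η) .* randn(rng, 100_000)
η_samples = exp.(log_η)

all_positive = all(η_samples .> 0)
sample_mean = mean(η_samples)          # close to exp(m_η + v_η/2)
sample_median = median(η_samples)      # close to exp(m_η)
```

This matters when reporting estimates: exponentiating the posterior mean of $\log(\eta)$ gives the median of $\eta$, not its mean.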

### Data

Let's first have a look at the data.

In [1]:
using Revise
using CSV
using DataFrames

In [2]:
using Plots

viz = false

false

In [3]:
# Read data from CSV file
df = CSV.read("../data/SNLS80mV.csv", ignoreemptylines=true)
df = select(df, [:V1, :V2])

# Shorthand
input = df[:,1]
output = df[:,2]

# Time horizon
T = size(df, 1);

In [4]:
if viz
    # Plot every n-th time-point to avoid figure size exploding
    n = 10
    p1 = Plots.scatter(1:n:T, output[1:n:T], color="black", label="output", markersize=2, size=(1600,800), xlabel="time (t)", ylabel="response")
    # Plots.savefig(p1, "viz/output_signal.png")
end

In [5]:
if viz
    p2 = Plots.scatter(1:n:T, input[1:n:T], color="blue", label="input", markersize=2, size=(1600,800), xlabel="time (t)", ylabel="control")
    # Plots.savefig(p2, "viz/input_signal.png")
end

## Estimating parameters via Bayesian filtering

Implementation with ForneyLab and AR node. The AR node is locally modified from the package LAR (LAR is in dev mode).

In [6]:
using ForneyLab
using ForneyLab: unsafeMean, unsafeCov, unsafeVar, unsafePrecision
using ProgressMeter

I will use the Nonlinear node to cope with a multivariate log-normal distribution.

In [52]:
# Start graph
graph = FactorGraph()

# Static parameters
@RV θ ~ GaussianMeanPrecision(placeholder(:m_θ, dims=(2,)), placeholder(:W_θ, dims=(2,2)))
@RV η ~ GaussianMeanPrecision(placeholder(:m_η), placeholder(:w_η))
@RV ζ ~ Wishart(placeholder(:v_ζ), placeholder(:n_ζ))

# Nonlinear node
g(x) = exp.(x) # g_inv(x) = log.(x) # Not used in Nonlinear due to DomainError
@RV eη ~ Nonlinear{Unscented}(η; g=g, dims=(1,))

# Observation selection variable
c1 = [1. , 0.]
c2 = [0. , 1.]

# Measurement std
σ = 0.1

# Control signal
# @RV u_t; placeholder(u_t, :u_t)
@RV u_t ~ GaussianMeanPrecision(placeholder(:m_u), [1e8 0.;0. 1e8])

# State prior
@RV z_t ~ GaussianMeanPrecision(placeholder(:m_z, dims=(2,)), placeholder(:W_z, dims=(2, 2)), id=:z_t)

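# Mean of state transition: (I + A)*z_t + B*u_t
# Note: in this matrix θ[1] multiplies the displacement (-a/m) and θ[2] the velocity (-v/m),
# so the order is swapped relative to the shorthand θ_1, θ_2 in the priors section.
f(θ, x, η, u) = [1. 1.; θ[1] θ[2]+1]*x + [0., η*u]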
@RV f_t ~ Nonlinear{Unscented}(θ, z_t, eη, u_t, g=f, dims=(2,), id=:Az)

# State transition
@RV x_t ~ GaussianMeanPrecision(f_t, ζ, id=:x_t)

# Observation likelihood
@RV y_t ~ GaussianMeanVariance(dot(c1, x_t), σ^2, id=:y_t)

# Placeholder for observation
placeholder(y_t, :y_t)

# Draw time-slice subgraph
ForneyLab.draw(graph)

In [53]:
# Infer an algorithm
q = PosteriorFactorization(z_t, x_t, θ, η, eη, ζ, ids=[:z, :x, :θ, :η, :eη, :ζ])
algo = variationalAlgorithm(q, free_energy=false)
source_code = algorithmSourceCode(algo, free_energy=false)
eval(Meta.parse(source_code))
println(source_code)

begin

function initeη()

messages = Array{Message}(undef, 12)


return messages

end

function stepeη!(data::Dict, marginals::Dict=Dict(), messages::Vector{Message}=Array{Message}(undef, 12))

messages[1] = ruleVBGaussianMeanPrecisionOut(nothing, ProbabilityDistribution(Multivariate, PointMass, m=data[:m_θ]), ProbabilityDistribution(MatrixVariate, PointMass, m=data[:W_θ]))
messages[2] = ruleVBGaussianMeanPrecisionM(marginals[:x_t], nothing, marginals[:ζ])
messages[3] = ruleVBGaussianMeanPrecisionOut(nothing, ProbabilityDistribution(Univariate, PointMass, m=data[:m_u]), ProbabilityDistribution(MatrixVariate, PointMass, m=[1.0e8 0.0; 0.0 1.0e8]))
messages[4] = ruleVBGaussianMeanPrecisionOut(nothing, ProbabilityDistribution(Univariate, PointMass, m=data[:m_η]), ProbabilityDistribution(Univariate, PointMass, m=data[:w_η]))
messages[5] = ruleSPNonlinearUTOutNG(g, nothing, messages[4])
messages[6] = ruleVBGaussianMeanPrecisionOut(nothing, ProbabilityDistribution(Multivariate, PointMass, m=d

In [54]:
# Looking at only the first few timepoints
# T = 100
T = size(df, 1);

# Inference parameters
num_iterations = 10

# Initialize marginal distribution and observed data dictionaries
data = Dict()
marginals = Dict()

# Initialize arrays of parameterizations
params_x = (zeros(2,T+1), repeat(.1 .*float(eye(2)), outer=(1,1,T+1)))
params_θ = (zeros(2,T+1), repeat(.1 .*float(eye(2)), outer=(1,1,T+1)))
params_η = (zeros(1,T+1), 0.1 *ones(1,T+1))
params_ζ = (repeat([1e8 0.;0. 1.], outer=(1,1,T+1)), 2*ones(1,T+1))

# Start progress bar
p = Progress(T, 1, "At time ")

# Perform inference at each time-step
for t = 1:T

    # Update progress bar
    update!(p, t)

    # Initialize marginals
    marginals[:f_t] = ProbabilityDistribution(Multivariate, GaussianMeanPrecision, m=params_x[1][:,t], w=params_x[2][:,:,t])
    marginals[:z_t] = ProbabilityDistribution(Multivariate, GaussianMeanPrecision, m=params_x[1][:,t], w=params_x[2][:,:,t])
    marginals[:x_t] = ProbabilityDistribution(Multivariate, GaussianMeanPrecision, m=params_x[1][:,t], w=params_x[2][:,:,t])
    marginals[:θ] = ProbabilityDistribution(Multivariate, GaussianMeanPrecision, m=[0., 0.], w=[1. 0.; 0. 1.])
    marginals[:η] = ProbabilityDistribution(Univariate, GaussianMeanPrecision, m=0., w=1.)
    marginals[:eη] = ProbabilityDistribution(Univariate, GaussianMeanPrecision, m=0., w=1.)
    marginals[:ζ] = ProbabilityDistribution(MatrixVariate, Wishart, v=params_ζ[1][:,:,t], nu=params_ζ[2][1,t])
    
    data = Dict(:y_t => output[t],
                :m_u => input[t],
                :m_z => params_x[1][:,t],
                :W_z => params_x[2][:,:,t],
                :m_θ => params_θ[1][:,t],
                :W_θ => params_θ[2][:,:,t],
                :m_η => params_η[1][1,t],
                :w_η => params_η[2][1,t],
                :v_ζ => params_ζ[1][:,:,t],
                :n_ζ => params_ζ[2][1,t])

    # Iterate variational parameter updates
    for i = 1:num_iterations

#         stepz!(data, marginals)
        stepx!(data, marginals)
        stepθ!(data, marginals)
        stepη!(data, marginals)
        stepζ!(data, marginals)
#         steplθ!(data, marginals)
#         steplη!(data, marginals)
    end

    # Store current parameterizations of marginals
    params_x[1][:,t+1] = unsafeMean(marginals[:x_t])
    params_x[2][:,:,t+1] = marginals[:x_t].params[:w]
    params_θ[1][:,t+1] = unsafeMean(marginals[:θ])
    params_θ[2][:,:,t+1] = marginals[:θ].params[:w]
    params_η[1][1,t+1] = unsafeMean(marginals[:η])
    params_η[2][1,t+1] = marginals[:η].params[:w]
    params_ζ[1][:,:,t+1] = marginals[:ζ].params[:v]
    params_ζ[2][1,t+1] = marginals[:ζ].params[:nu]

end

MethodError: MethodError: no method matching ruleSPNonlinearUTInGX(::typeof(f), ::Int64, ::Message{GaussianMeanPrecision,Multivariate}, ::Message{GaussianMeanPrecision,Multivariate}, ::Message{GaussianMeanPrecision,Multivariate}, ::Message{GaussianMeanVariance,Univariate}, ::Message{GaussianMeanPrecision,Univariate})
Closest candidates are:
  ruleSPNonlinearUTInGX(::Function, ::Int64, ::Message{#s36,V} where #s36<:Gaussian, !Matched::Message{#s35,V} where #s35<:Gaussian...; alpha) where V<:ForneyLab.VariateType at /home/wmkouw/.julia/dev/ForneyLab/src/engines/julia/update_rules/nonlinear_unscented.jl:214
  ruleSPNonlinearUTInGX(::Function, !Matched::Function, ::Message{#s40,V} where #s40<:Gaussian, !Matched::Union{Nothing, Message{#s39,V} where #s39<:Gaussian}...; alpha) where V<:ForneyLab.VariateType at /home/wmkouw/.julia/dev/ForneyLab/src/engines/julia/update_rules/nonlinear_unscented.jl:183
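
Judging from the error message alone, the closest candidates take a variable number of messages that all share one variate type `V`, whereas the failing call mixes `Multivariate` and `Univariate` messages. The toy sketch below reproduces that dispatch behaviour in plain Julia. The types `ToyMsg`, `ToyUni`, `ToyMulti` and the function `toy_rule` are hypothetical stand-ins and are not part of ForneyLab.

```julia
# Hypothetical stand-ins for variate types and messages
abstract type ToyVariate end
struct ToyUni <: ToyVariate end
struct ToyMulti <: ToyVariate end
struct ToyMsg{V<:ToyVariate} end

# Like the candidates in the error message: every message must share the same V
toy_rule(g::Function, msg1::ToyMsg{V}, rest::ToyMsg{V}...) where {V<:ToyVariate} = V

# All messages multivariate: a method matches
same = toy_rule(identity, ToyMsg{ToyMulti}(), ToyMsg{ToyMulti}())

# Mixed variate types: no method matches, so Julia throws a MethodError
mixed_fails = try
    toy_rule(identity, ToyMsg{ToyMulti}(), ToyMsg{ToyUni}())
    false
catch err
    err isa MethodError
end
```

Under this reading, the Nonlinear node for `f_t` would need inputs of a single variate type, for instance by making all of `θ`, `z_t`, `eη` and `u_t` multivariate. Whether that is sufficient for this model is not verified here.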

### Visualize results

In [None]:
viz = true

In [None]:
# Extract mean of state marginals
estimated_states = params_x[1][1,2:end]

if viz
    # Plot every n-th time-point to avoid figure size exploding
    n = 10
    p1 = Plots.scatter(1:n:T, output[1:n:T], color="black", label="output", markersize=2, size=(1600,800), xlabel="time (t)", ylabel="response")
    Plots.plot!(1:n:T, estimated_states[1:n:T], color="red", linewidth=1, label="estimated")
#     Plots.savefig(p1, "viz/estimated_states01.png")
end

In [None]:
# Extract mean of coefficient marginals
estimated_coeffs_1_mean = params_θ[1][1,2:end]
estimated_coeffs_1_std = sqrt.(inv.(params_θ[2][1,1,2:end]))
estimated_coeffs_2_mean = params_θ[1][2,2:end]
estimated_coeffs_2_std = sqrt.(inv.(params_θ[2][2,2,2:end]))

if viz
    # Plot both coefficients next to each other
    p2a = Plots.plot(1:n:T, estimated_coeffs_1_mean[1:n:T], ribbon=[estimated_coeffs_1_std[1:n:T], estimated_coeffs_1_std[1:n:T]], color="red", label="θ_1", xlabel="time (t)", ylim=[0., 0.8])
    p2b = Plots.plot(1:n:T, estimated_coeffs_2_mean[1:n:T], ribbon=[estimated_coeffs_2_std[1:n:T], estimated_coeffs_2_std[1:n:T]], color="blue", label="θ_2", xlabel="time (t)", ylim=[-0.1, 0.3])
    p2 = plot(p2a, p2b, size=(1600,600))
#     Plots.savefig(p2, "viz/estimated_coeffs.png")
end
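
The standard deviations above are computed from the diagonal of the precision matrix only. When the precision has off-diagonal entries, the marginal standard deviation comes from the diagonal of the inverse instead. A minimal sketch with a hypothetical precision matrix `W` shows the difference:

```julia
using LinearAlgebra

# Hypothetical 2x2 precision matrix with correlation between the coefficients
W = [4.0 1.0; 1.0 2.0]

# Marginal standard deviations: square root of the diagonal of the covariance
std_marginal = sqrt.(diag(inv(W)))

# Shortcut that inverts only the diagonal of the precision
std_diag_only = sqrt.(1 ./ diag(W))
```

The two agree when `W` is diagonal. Otherwise the shortcut understates the marginal uncertainty.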

In [None]:
# Extract mean of control coefficient marginals
estimated_ccoeff_mean = exp.(params_η[1][1,2:end])
estimated_ccoeff_std = sqrt.(inv.(params_η[2][1,2:end]))

if viz
    # Plot the control coefficient
    p3 = Plots.plot(1:n:T, estimated_ccoeff_mean[1:n:T], ribbon=[estimated_ccoeff_std[1:n:T], estimated_ccoeff_std[1:n:T]], color="blue", label="η", xlabel="time (t)", size=(800,600), ylim=[0.0, 0.25])
#     Plots.savefig(p3, "viz/estimated_ccoeff.png")
end

In [None]:
# Extract mean and std of the process precision marginals.
# ζ is Wishart with scale V and degrees of freedom ν, so E[ζ] = ν V and Var[ζ_ii] = 2 ν V_ii^2.
# Only the (2,2) entry is shown, since the process noise enters through the second state.
estimated_pnoise_mean = params_ζ[2][1,2:end] .* params_ζ[1][2,2,2:end]
estimated_pnoise_std = sqrt.(2 .* params_ζ[2][1,2:end]) .* params_ζ[1][2,2,2:end]

if viz
    # Plot the process precision
    p4 = Plots.plot(1:n:T, estimated_pnoise_mean[1:n:T], ribbon=[estimated_pnoise_std[1:n:T], estimated_pnoise_std[1:n:T]],color="blue", label="ζ[2,2]", xlabel="time (t)")
#     Plots.savefig(p4, "viz/estimated_pnoise.png")
end