## Silverbox

Silverbox is a nonlinear system identification benchmark proposed in 2004. Data, baselines, and further information are available at http://nonlinearbenchmark.org/#Silverbox.

State-space formulation of Silverbox's dynamics:

$$\begin{align}
\mu \frac{d^2 x(t)}{dt^2} + \nu \frac{d x(t)}{dt} + \kappa(x(t)) x(t) =&\ u(t) + w(t) \\
\kappa(x(t)) =&\ \alpha + \beta x^2(t) \\
y(t) =&\ x(t) + e(t)
\end{align}$$

where
$$\begin{align}
\mu     =&\ \text{mass} \\
\nu     =&\ \text{viscous damping} \\
\kappa(x(t)) =&\ \text{nonlinear spring} \\
y(t)    =&\ \text{observation (displacement)} \\
x(t)    =&\ \text{state (displacement)} \\
u(t)    =&\ \text{force} \\
e(t)    =&\ \text{measurement noise} \\
w(t)    =&\ \text{process noise}
\end{align}$$
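To get a feel for these dynamics before rewriting them, here is a minimal simulation sketch. The parameter values, the forcing signal, the step size `h`, and the noise level are all assumptions chosen for illustration; they are not the benchmark's values. Process noise $w(t)$ is omitted, and the integrator is semi-implicit Euler, which is not the forward or backward Euler scheme used in the steps further down.

```julia
# Nonlinear spring coefficient κ(x) = α + β x²
κ(x, p) = p.α + p.β * x^2

# Acceleration implied by μ ẍ + ν ẋ + κ(x) x = u (process noise w omitted)
accel(x, v, u, p) = (u - p.ν * v - κ(x, p) * x) / p.μ

function simulate(u, p; h=0.01)
    x = zeros(length(u) + 1)   # displacement
    v = zeros(length(u) + 1)   # velocity
    for k in eachindex(u)
        # Semi-implicit Euler: update the velocity first, then use it for the position
        v[k+1] = v[k] + h * accel(x[k], v[k], u[k], p)
        x[k+1] = x[k] + h * v[k+1]
    end
    return x
end

# Hypothetical parameters and a synthetic forcing signal
p = (μ = 1.0, ν = 0.2, α = 1.0, β = 0.5)
u = [sin(0.05 * k) for k in 1:2000]

x = simulate(u, p)
y = x .+ 0.01 .* randn(length(x))   # noisy observation y = x + e
```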

### Steps to solve

I now take a series of steps to re-write this problem:

#### 1. Assume constant spring coefficient κ

$$ \mu \frac{d^2 x(t)}{dt^2} + \nu \frac{d x(t)}{dt} + \kappa x(t) = u(t) + w(t)$$

#### 2. Divide by leading coefficient

$$ \frac{d^2 x(t)}{dt^2} + \frac{\nu}{\mu} \frac{d x(t)}{dt} + \frac{\kappa}{\mu} x(t) = \frac{1}{\mu} u(t) + \frac{1}{\mu} w(t)$$

#### 3. Substitute standard variables

$$ \frac{d^2 x(t)}{dt^2} + 2\zeta \omega_0 \frac{d x(t)}{dt} + \omega_0^2 x(t) - \frac{u(t)}{\mu} = \frac{w(t)}{\mu}$$

where $$\begin{align} 
\zeta =&\ \frac{\nu}{2\sqrt{\mu \kappa}} \\ 
\omega_0 =&\ \sqrt{\frac{\kappa}{\mu}} \, .
\end{align}$$
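As a quick sanity check on this substitution, the sketch below computes $\zeta$ and $\omega_0$ for some made-up values of $\mu$, $\nu$ and $\kappa$ (assumptions, not benchmark values) and confirms that $2\zeta\omega_0 = \nu/\mu$ and $\omega_0^2 = \kappa/\mu$.

```julia
# Hypothetical physical parameters
μ, ν, κ = 2.0, 0.5, 8.0

# Standard second-order variables
ω0 = sqrt(κ / μ)             # natural frequency
ζ = ν / (2 * sqrt(μ * κ))    # damping ratio

# The substituted coefficients should match those of step 2
damping_matches = 2ζ * ω0 ≈ ν / μ
stiffness_matches = ω0^2 ≈ κ / μ
```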

#### 4. Apply Euler's method to obtain difference equation (step size is 1)

-> Forward Euler:

$$\begin{align}
\frac{x(t+2h)-2x(t+h)+x(t)}{h^2} + 2\zeta \omega_0 \frac{x(t+h)-x(t)}{h} + \omega_0^2 x(t) - \frac{u(t)}{\mu} =&\ \frac{w(t)}{\mu} \\
x(t+2) + 2(\zeta \omega_0 - 1) x(t+1) + (1 - 2 \zeta \omega_0 + \omega_0^2) x(t) - \frac{u(t)}{\mu} =&\ \frac{w(t)}{\mu} 
\end{align}$$
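The step from the finite-difference form to the collected form with $h = 1$ is easy to get wrong, so here is a small numerical check of the left-hand sides. The values of $\zeta$, $\omega_0$ and the three displacements are arbitrary assumptions.

```julia
# Hypothetical damping ratio and natural frequency
ζ, ω0 = 0.0625, 2.0

# Arbitrary test values standing in for x(t), x(t+1), x(t+2)
x0, x1, x2 = 0.3, -1.2, 0.7

# Left-hand side before collecting terms (finite differences with h = 1)
lhs_raw = (x2 - 2x1 + x0) + 2ζ * ω0 * (x1 - x0) + ω0^2 * x0

# Left-hand side after collecting terms per time index
lhs_collected = x2 + 2 * (ζ * ω0 - 1) * x1 + (1 - 2ζ * ω0 + ω0^2) * x0
```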

-> Backward Euler:

$$\begin{align}
\frac{x(t)-2x(t-h)+x(t-2h)}{h^2} + 2\zeta \omega_0 \frac{x(t)-x(t-h)}{h} + \omega_0^2 x(t) - \frac{u(t)}{\mu} =&\ \frac{w(t)}{\mu}  \\
(1 + 2 \zeta \omega_0 + \omega_0^2)x(t) - 2(1 + \zeta \omega_0)x(t-1) + x(t-2) - \frac{u(t)}{\mu} =&\ \frac{w(t)}{\mu} \\
x(t) - \frac{2(1 + \zeta \omega_0)}{(1 + 2\zeta \omega_0 + \omega_0^2)}x(t-1) + \frac{1}{(1 + 2\zeta \omega_0 + \omega_0^2)}x(t-2) - \frac{u(t)}{\mu(1 + 2\zeta \omega_0 + \omega_0^2)} =&\ \frac{w(t)}{\mu(1 + 2\zeta \omega_0 + \omega_0^2)}
\end{align}$$

Change to shorthand notation (from here on, $\alpha$ and $\beta$ denote difference-equation coefficients, not the spring parameters in $\kappa(x(t))$):

$$x_t - \alpha x_{t-1} + \beta x_{t-2} - \gamma u_t = \gamma w_t$$

where 
$$\begin{align} 
\alpha =&\ \frac{2(1 + \zeta \omega_0)}{1 + 2\zeta \omega_0 + \omega_0^2} \\
\beta =&\ \frac{1}{1 + 2\zeta \omega_0 + \omega_0^2} \\
\gamma =&\ \frac{1}{\mu(1 + 2 \zeta \omega_0 + \omega_0^2)}
\end{align}$$

#### 5. Convert to multivariate first-order difference form

Stick to backward Euler (matches AR structure)
- Backward Euler:

    $$z_t = M z_{t-1} + N u_t + N w_t$$

    where $z_t = [x_t\ \ x_{t-1}]^\top$, $M = [\alpha \ \ -\beta;\ 1\ \ 0]$, $N = [\gamma\ \ 0]^\top$
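The idea behind this step can be checked numerically: stacking two consecutive values into one vector turns a second-order scalar recursion into a first-order vector recursion. The sketch below uses generic coefficients `a1`, `a2` and `g` (hypothetical names and values) for the terms multiplying $x_{t-1}$, $x_{t-2}$ and $u_t$, leaves out the noise, and starts from zero initial conditions.

```julia
# Scalar second-order recursion: x_t = a1 x_{t-1} + a2 x_{t-2} + g u_t
function scalar_recursion(a1, a2, g, u)
    x_prev, x_prev2 = 0.0, 0.0
    xs = Float64[]
    for u_t in u
        x_t = a1 * x_prev + a2 * x_prev2 + g * u_t
        push!(xs, x_t)
        x_prev2, x_prev = x_prev, x_t
    end
    return xs
end

# Same recursion in first-order vector form: z_t = M z_{t-1} + N u_t
function companion_recursion(a1, a2, g, u)
    M = [a1 a2; 1.0 0.0]   # first row applies the coefficients, second row shifts x_{t-1} down
    N = [g, 0.0]           # the input only enters the first component
    z = zeros(2)
    xs = Float64[]
    for u_t in u
        z = M * z + N * u_t
        push!(xs, z[1])    # first component of z_t is x_t
    end
    return xs
end

u = [sin(0.3 * t) for t in 1:50]
xs_scalar = scalar_recursion(0.4, -0.2, 0.1, u)
xs_vector = companion_recursion(0.4, -0.2, 0.1, u)
```

Both functions should return the same trajectory, which is what makes the vector form a drop-in replacement for the scalar one.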

#### 6. Convert to Gaussian probability

- Backward Euler:

$$z_t \sim \mathcal{N}(M z_{t-1} + N u_t,\ N N^\top \tau)$$

where $w_t \sim \mathcal{N}(0, \tau)$

#### 7. Observation likelihood

$$y_t \sim \mathcal{N}(c z_t, \sigma)$$

where $e_t \sim \mathcal{N}(0, \sigma)$, $c = [1\ \ 0]$

Now, I need priors for $\alpha$, $\beta$, $\gamma$, $\tau$, $\sigma$. Given three equations and three unknowns, I can recover $\zeta$, $\omega_0$ and $\mu$ from $\alpha$, $\beta$, and $\gamma$. The variables are all strictly positive, which means they can be modeled by gamma distributions:

$$\begin{align}
\alpha \sim&\ \Gamma(1, 10^3) \\
\beta \sim&\ \Gamma(1, 10^3) \\
\gamma \sim&\ \Gamma(1, 10^3) \\
\tau \sim&\ \Gamma(1, 10^3) \\
\sigma \sim&\ \Gamma(1, 10^3) 
\end{align}$$
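The claim that $\zeta$, $\omega_0$ and $\mu$ can be recovered from $\alpha$, $\beta$ and $\gamma$ can be checked with a round trip. The sketch below is my own rearrangement of the three definitions from step 4; the function names and numerical values are assumptions. The inverse only yields a real $\omega_0$ when $1 - \alpha + \beta > 0$.

```julia
# Forward map: physical parameters -> difference-equation coefficients
function coefficients(ζ, ω0, μ)
    D = 1 + 2ζ * ω0 + ω0^2
    return (α = 2 * (1 + ζ * ω0) / D, β = 1 / D, γ = 1 / (μ * D))
end

# Inverse map, obtained by solving the three definitions for ζ, ω0 and μ
function physical(α, β, γ)
    ζω0 = α / (2β) - 1            # from α / β = 2 (1 + ζ ω0)
    ω0 = sqrt((1 - α + β) / β)    # from 1 / β = 1 + 2 ζ ω0 + ω0²
    return (ζ = ζω0 / ω0, ω0 = ω0, μ = β / γ)   # μ from γ = β / μ
end

# Round trip with hypothetical values
c = coefficients(0.0625, 2.0, 2.0)
p = physical(c.α, c.β, c.γ)
```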

--> Implementation with ForneyLab and AR node

In [1]:
# Generate time-series
using CSV
using DataFrames
using Plots

# Read data from CSV file
df = CSV.read("../data/SNLS80mV.csv", ignoreemptylines=true)
df = select(df, [:V1, :V2])

# Shorthand
input = df[:,1]
output = df[:,2]

# Time horizon
T = size(df, 1);

131072

In [7]:
using ForneyLab
using LAR
using LAR.Node, LAR.Data
using ProgressMeter

I will introduce another shorthand: $\theta = [\alpha\ \ \beta]$ and use the Nonlinear node to provide the AR node with a Gaussian form for $\theta$.
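A small sketch of what the exponential map does here, independent of ForneyLab: any real-valued vector is mapped to strictly positive entries, and the logarithm undoes the map. The variable name `η_sample` and the random seed are assumptions for illustration.

```julia
using Random

# Same elementwise maps as used for the Nonlinear node
g(x) = exp.(x)
g_inv(x) = log.(x)

Random.seed!(1)
η_sample = randn(2)      # unconstrained, Gaussian-distributed values
θ_sample = g(η_sample)   # strictly positive by construction

all_positive = all(θ_sample .> 0)
round_trip_ok = g_inv(θ_sample) ≈ η_sample
```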

In [138]:
# Start graph
graph = FactorGraph()

# Static parameters
@RV η ~ GaussianMeanPrecision(placeholder(:η_m, dims=(2,)), placeholder(:η_w, dims=(2,2)))
@RV τ ~ Gamma(placeholder(:τ_a), placeholder(:τ_b))

# Nonlinear function
g(x) = exp.(x)
g_inv(x) = log.(x)

# Nonlinear node
# @RV θ ~ Nonlinear(log_θ; g=g, g_inv=g_inv, dims=(2,))
@RV θ ~ Nonlinear{Unscented}(η; g=g, dims=(2,))

# I'm fixing measurement noise σ
σ = 0.0001

# Observation selection variable
c = [1, 0]

# State prior
@RV z_t ~ GaussianMeanPrecision(placeholder(:z_m, dims=(2,)), placeholder(:z_w, dims=(2, 2)))

# Autoregressive node
@RV x_t ~ Autoregressive(θ, z_t, τ)

# Specify likelihood
@RV y_t ~ GaussianMeanVariance(dot(c, x_t), σ)

# Placeholder for data
placeholder(y_t, :y_t)

# Draw time-slice subgraph
ForneyLab.draw(graph)

In [139]:
# Infer an algorithm
q = PosteriorFactorization(z_t, x_t, θ, η, τ, ids=[:z, :x, :θ, :η, :τ])
algo = variationalAlgorithm(q, free_energy=false)
source_code = algorithmSourceCode(algo, free_energy=false)
eval(Meta.parse(source_code));
println(source_code)

begin

function stepτ!(data::Dict, marginals::Dict=Dict(), messages::Vector{Message}=Array{Message}(undef, 2))

messages[1] = ruleVBGammaOut(nothing, ProbabilityDistribution(Univariate, PointMass, m=data[:τ_a]), ProbabilityDistribution(Univariate, PointMass, m=data[:τ_b]))
messages[2] = ruleVariationalARIn3PPPN(marginals[:x_t], marginals[:z_t], marginals[:θ], nothing)

marginals[:τ] = messages[1].dist * messages[2].dist

return marginals

end

function stepz!(data::Dict, marginals::Dict=Dict(), messages::Vector{Message}=Array{Message}(undef, 2))

messages[1] = ruleVBGaussianMeanPrecisionOut(nothing, ProbabilityDistribution(Multivariate, PointMass, m=data[:z_m]), ProbabilityDistribution(MatrixVariate, PointMass, m=data[:z_w]))
messages[2] = ruleVariationalARIn1PNPP(marginals[:x_t], nothing, marginals[:θ], marginals[:τ])

marginals[:z_t] = messages[1].dist * messages[2].dist

return marginals

end

function initθ()

messages = Array{Message}(undef, 4)


return messages

end

func

In [140]:
# Looking at only the first few timepoints
# T = 1000
T = size(df, 1);

# Inference parameters
num_iterations = 10

# Initialize marginal distribution and observed data dictionaries
data = Dict()
marginals = Dict()

# Initialize arrays of parameterizations
params_x = (zeros(2,T+1), repeat(1. .*float(eye(2)), outer=(1,1,T+1)))
params_z = (zeros(2,T+1), repeat(1. .*float(eye(2)), outer=(1,1,T+1)))
params_θ = (zeros(2,T+1), repeat(1. .*float(eye(2)), outer=(1,1,T+1)))
params_η = (zeros(2,T+1), repeat(1. .*float(eye(2)), outer=(1,1,T+1)))
params_τ = (1e-3.*ones(1,T+1), ones(1,T+1))

# Start progress bar
p = Progress(T, 1, "At time ")

# Perform inference at each time-step
for t = 1:T

    # Update progress bar
    update!(p, t)

    # Initialize marginals
    marginals[:x_t] = ProbabilityDistribution(Multivariate, GaussianMeanPrecision, m=params_x[1][:,t], w=params_x[2][:,:,t])
    marginals[:z_t] = ProbabilityDistribution(Multivariate, GaussianMeanPrecision, m=params_z[1][:,t], w=params_z[2][:,:,t])
    marginals[:η] = ProbabilityDistribution(Multivariate, GaussianMeanPrecision, m=params_η[1][:,t], w=params_η[2][:,:,t])
    marginals[:θ] = ProbabilityDistribution(Multivariate, GaussianMeanPrecision, m=params_θ[1][:,t], w=params_θ[2][:,:,t])
    marginals[:τ] = ProbabilityDistribution(Univariate, Gamma, a=params_τ[1][1,t], b=params_τ[2][1,t])

    data = Dict(:y_t => output[t],
                :θ_m => params_θ[1][:,t],
                :η_m => params_η[1][:,t],
                :z_m => params_z[1][:,t],
                :θ_w => params_θ[2][:,:,t],
                :η_w => params_η[2][:,:,t],
                :z_w => params_z[2][:,:,t],
                :τ_a => params_τ[1][1,t],
                :τ_b => params_τ[2][1,t])

    # Iterate variational parameter updates
    for i = 1:num_iterations

        stepx!(data, marginals)
        stepz!(data, marginals)
        stepθ!(data, marginals)
        stepη!(data, marginals)
        stepτ!(data, marginals)
    end

    # Store current parameterizations of marginals
    params_x[1][:,t+1] = unsafeMean(marginals[:x_t])
    params_z[1][:,t+1] = unsafeMean(marginals[:z_t])
    params_η[1][:,t+1] = unsafeMean(marginals[:η])
    params_θ[1][:,t+1] = unsafeMean(marginals[:θ])
    params_x[2][:,:,t+1] = marginals[:x_t].params[:w]
    params_z[2][:,:,t+1] = marginals[:z_t].params[:w]
    params_η[2][:,:,t+1] = marginals[:η].params[:w]
    params_θ[2][:,:,t+1] = marginals[:θ].params[:w]
    params_τ[1][1,t+1] = marginals[:τ].params[:a]
    params_τ[2][1,t+1] = marginals[:τ].params[:b]

end

At time 100%|███████████████████████████████████████████| Time: 0:05:08


In [141]:
params_τ[1] ./ params_τ[2]

1×131073 Array{Float64,2}:
 0.001  0.37091  0.663558  0.939494  …  366.23  366.233  366.236  366.238

In [142]:
estimated_states = params_x[1][1,2:end]

131072-element Array{Float64,1}:
  0.009397542105525122 
  0.012970406788125501 
  0.010824396276956453 
  0.004071953667434906 
 -0.003963772672221289 
 -0.009236040733642854 
 -0.009374452836156384 
 -0.004777936609386501 
  0.002040311226880663 
  0.007914778825148303 
  0.01031214272242364  
  0.008164737109141932 
  0.0025451650829010626
  ⋮                    
 -0.016075313038084654 
 -0.018502431971815562 
 -0.013416680122668617 
 -0.0036379907534132955
  0.007176561875299693 
  0.015286173090410174 
  0.016796850508230866 
  0.011990859165621079 
  0.0051064844633993325
 -0.001115931898292594 
 -0.005810212434372635 
 -0.006941482050576612 

In [143]:
params_z[1]

2×131073 Array{Float64,2}:
 0.0  2.81928e-14  3.43293e-14  3.72058e-14  …  2.98595e-13  2.98594e-13
 0.0  0.00209809   0.00414058   0.00513575      0.00755062   0.00755025 

### Visualize results

In [144]:
using PyPlot

In [146]:
# Plot every n-th time-point to avoid figure size exploding
n = 10
p1 = Plots.scatter(1:n:T, output[1:n:T], color="black", label="output", markersize=2, size=(1600,800), xlabel="time (t)", ylabel="response")
Plots.plot!(1:n:T, estimated_states[1:n:T], color="red", linewidth=1, label="estimated")
Plots.savefig(p1, "viz/estimated_states01.png")
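
Besides the plot, a single number can summarize how closely the estimated states follow the measured output. A minimal sketch, assuming root-mean-square error is the metric of interest; the helper name `rmse` and the toy vectors are hypothetical.

```julia
using Statistics

# Root-mean-square error between two equally long vectors
rmse(estimate, target) = sqrt(mean((estimate .- target) .^ 2))

# Toy example: errors are 0, 0 and 2, so the RMSE is sqrt(4/3)
toy_target = [0.0, 1.0, 2.0]
toy_estimate = [0.0, 1.0, 4.0]
toy_error = rmse(toy_estimate, toy_target)
```

In this notebook the call would be `rmse(estimated_states, output)`.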