# One Float

Consider a factor graph with one variable $x \in [0,1]$ and two functions:

$$f_1(x) = (x < 1/2) \qquad f_2(x) = (x \geq 1/2)$$

In [1]:
using AutomotiveDrivingModels
using AutoScenes
using Base.Test
import QuadGK: quadgk

### Entity

We define our entity as a single float that has no definition, an integer id, and no roadway.

In [2]:
const FloatScene = Frame{Entity{Float64, Void, Int}};

The `Vars` type stores a particular instance of the variables for a factor graph.
In AutoScenes, `Vars` is constructed from a `scene` and a `roadway`. We must define this constructor in order to generate factor graphs automatically. In this very simple problem we only create a one-variable factor graph.

In [3]:
function AutoScenes.Vars(scene::FloatScene, roadway::Void)
    x = scene[1].state
    return Vars(Float64[x], # vector containing the one variable
                [StateBounds(-x, 1-x)], # bounds are deltas so we can vary x by this much to remain in [0,1]
                [:x], # variable symbol
                [1])  # index of entity in the scene
end
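
To make the "bounds are deltas" comment concrete, here is a minimal sketch in plain Julia. The helper `delta_bounds` is hypothetical and not part of AutoScenes; it only mirrors the arithmetic of `StateBounds(-x, 1-x)` for the interval $[0,1]$.

```julia
# Hypothetical helper (not an AutoScenes function): the deltas that keep
# x + Δ inside the interval [lo, hi].
delta_bounds(x, lo, hi) = (lo - x, hi - x)

x = 0.25
Δlo, Δhi = delta_bounds(x, 0.0, 1.0)   # same numbers as StateBounds(-x, 1-x)

# Shifting x by either extreme delta lands exactly on an edge of [0, 1].
@assert x + Δlo == 0.0
@assert x + Δhi == 1.0
```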

### Features

We define each feature, starting with the extraction functions:

In [4]:
function f1(
    vars::Vars,
    assignment::Assignment, # indices of variables in vars
    roadway::Void,
    )::Float64

    return vars.values[assignment[1]] < 0.5
end
function f2(
    vars::Vars,
    assignment::Assignment, # indices of variables in vars
    roadway::Void,
    )::Float64

    return vars.values[assignment[1]] ≥ 0.5
end;

We also define an assignment function for each feature which takes a scene and assigns features to the entities within. This is used to construct the factor graph.

In [5]:
function AutoScenes.assign_feature{F <: typeof(f1)}(
    f::F,
    scene::FloatScene,
    roadway::Void,
    vars::Vars,
    )

    return Assignment[(1,0)]
end
function AutoScenes.assign_feature{F <: typeof(f2)}(
    f::F,
    scene::FloatScene,
    roadway::Void,
    vars::Vars,
    )

    return Assignment[(1,0)]
end;

### Concrete tests

We now construct our factor graphs from a dataset $\mathcal{D} = \left\{0,1/4,3/4,3/4,3/4,1\right\}$.

In [6]:
roadway = nothing
scenes = [
    FloatScene([Entity{Float64,Void,Int}(0.0,roadway,1)], 1),
    FloatScene([Entity{Float64,Void,Int}(1/4,roadway,1)], 1),
    FloatScene([Entity{Float64,Void,Int}(3/4,roadway,1)], 1),
    FloatScene([Entity{Float64,Void,Int}(3/4,roadway,1)], 1),
    FloatScene([Entity{Float64,Void,Int}(3/4,roadway,1)], 1),
    FloatScene([Entity{Float64,Void,Int}(1.0,roadway,1)], 1),
]
features = (f1, f2)
factorgraphs = [FactorGraph(features, scene, roadway) for scene in scenes];

#### Probability

Factor graphs represent probability distributions. We first check the normalizer $Z$ and the resulting density against their closed forms.

$$p(x) = \frac{\tilde{p}(x \mid -)}{\tilde{p}(-)} = \frac{\exp(\theta_1 f_1 + \theta_2 f_2)}{\int_0^1 \exp(\theta_1 f_1 + \theta_2 f_2) \> dx} = \frac{\exp(\theta_1 f_1 + \theta_2 f_2)}{\frac{1}{2}e^{\theta_1} + \frac{1}{2}e^{\theta_2}} = \frac{\exp(\theta_1 f_1 + \theta_2 f_2)}{Z}$$
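
As an independent sanity check of the closed-form normalizer, the following sketch approximates the integral with a midpoint rule in plain Julia. It makes no AutoScenes calls; the names `g1`, `g2`, `unnormalized`, and `midpoint_Z` are hypothetical stand-ins introduced only for this illustration.

```julia
# Plain-Julia stand-ins for the two indicator features (hypothetical names).
g1(x) = x < 0.5 ? 1.0 : 0.0
g2(x) = x >= 0.5 ? 1.0 : 0.0

# Unnormalized density exp(θ[1]*f1(x) + θ[2]*f2(x)).
unnormalized(x, θ) = exp(θ[1]*g1(x) + θ[2]*g2(x))

# Midpoint-rule approximation of the integral over [0, 1] with n equal cells.
function midpoint_Z(θ, n)
    total = 0.0
    for k in 1:n
        total += unnormalized((k - 0.5)/n, θ)
    end
    return total / n
end

θ = [1/3, 2/3]
Z_closed = exp(θ[1])/2 + exp(θ[2])/2
Z_numeric = midpoint_Z(θ, 1000)
@assert isapprox(Z_numeric, Z_closed, atol=1e-8)
```

Because the integrand is piecewise constant with its only jump at $x = 1/2$, an even number of cells places no midpoint on the jump, so the midpoint rule agrees with the closed form up to rounding.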

In [7]:
θ = Float64[1,1]
Z = exp(θ[1])/2 + exp(θ[2])/2
fg = factorgraphs[1]
Z′ = ptilde_denom(1, features, θ, fg.vars, fg.assignments, roadway)
@test isapprox(Z′, Z, atol=1e-8)

Test Passed

In [8]:
θ = Float64[1/3,2/3]
Z = exp(θ[1])/2 + exp(θ[2])/2
fg = factorgraphs[1]
Z′ = ptilde_denom(1, features, θ, fg.vars, fg.assignments, roadway)
@test isapprox(Z′, Z, atol=1e-8)

Test Passed

In [9]:
f1(x::Real) = 1.0*(x < 1/2)
f2(x::Real) = 1.0*(x ≥ 1/2)

θ = Float64[1,1]
Z = exp(θ[1])/2 + exp(θ[2])/2
fg = factorgraphs[1]
x = fg.vars.values[1]
P = exp(θ[1]*f1(x) + θ[2]*f2(x)) / Z
P′ = ptilde(features, θ, fg.vars, fg.assignments, roadway) / Z
@test isapprox(P′, P, atol=1e-8)

Test Passed

In [10]:
θ = Float64[1/3,2/3]
Z = exp(θ[1])/2 + exp(θ[2])/2
fg = factorgraphs[1]
x = fg.vars.values[1]
P = exp(θ[1]*f1(x) + θ[2]*f2(x)) / Z
P′ = ptilde(features, θ, fg.vars, fg.assignments, roadway) / Z
@test isapprox(P′, P, atol=1e-8)

Test Passed

#### Pseudolikelihood loss

Single instance:
$$
\ell_\text{PL}(x) = \sum_j \ln p(x_j \mid x_{-j}) = \ln p(x) = \ln \tilde{p} - \ln Z = \begin{cases}\theta_1 - \ln Z & \text{if } x < 1/2 \\ \theta_2 - \ln Z & \text{if } x \geq 1/2 \end{cases}
$$

In [11]:
θ = Float64[1,1]
Z = exp(θ[1])/2 + exp(θ[2])/2
fg = factorgraphs[1]
x = fg.vars.values[1]
PL = x < 1/2 ? θ[1] - log(Z) : θ[2] - log(Z)
PL′ = log_pseudolikelihood(features, θ, fg)
@test isapprox(PL′, PL, atol=1e-8)

Test Passed

In [12]:
fg = factorgraphs[3]
x = fg.vars.values[1]
PL = x < 1/2 ? θ[1] - log(Z) : θ[2] - log(Z)
PL′ = log_pseudolikelihood(features, θ, fg)
@test isapprox(PL′, PL, atol=1e-8)

Test Passed

In [13]:
θ = Float64[1/3,2/3]
Z = exp(θ[1])/2 + exp(θ[2])/2
fg = factorgraphs[1]
x = fg.vars.values[1]
PL = x < 1/2 ? θ[1] - log(Z) : θ[2] - log(Z)
PL′ = log_pseudolikelihood(features, θ, fg)
@test isapprox(PL′, PL, atol=1e-8)

Test Passed

In [14]:
fg = factorgraphs[3]
x = fg.vars.values[1]
PL = x < 1/2 ? θ[1] - log(Z) : θ[2] - log(Z)
PL′ = log_pseudolikelihood(features, θ, fg)
@test isapprox(PL′, PL, atol=1e-8)

Test Passed

Full dataset:
$$\ell_\text{PL}(\mathcal{D}) = \sum_m \ell_\text{PL}(x^{(m)}) = 2(\theta_1 - \ln Z) + 4(\theta_2 - \ln Z) = 2\theta_1+ 4\theta_2 - 6\ln Z$$
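
The dataset total can also be reproduced without AutoScenes by summing the per-sample log-density directly. This is only a sketch under the closed-form model above; `logp` is a hypothetical helper, and `data` lists the values used in the `scenes` cell.

```julia
# Log-density of the one-variable model. With a single variable this is
# also the log pseudolikelihood of one sample.
function logp(x, θ)
    Z = exp(θ[1])/2 + exp(θ[2])/2
    return (x < 0.5 ? θ[1] : θ[2]) - log(Z)
end

data = [0.0, 1/4, 3/4, 3/4, 3/4, 1.0]   # two samples below 1/2, four at or above
θ = [1/3, 2/3]

total = sum(logp(x, θ) for x in data)
closed = 2θ[1] + 4θ[2] - 6*log(exp(θ[1])/2 + exp(θ[2])/2)
@assert isapprox(total, closed, atol=1e-8)
```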

In [15]:
θ = Float64[1,1]
Z = exp(θ[1])/2 + exp(θ[2])/2
PL = 2θ[1] + 4θ[2] - 6*log(Z)
PL′ = log_pseudolikelihood(features, θ, factorgraphs)
@test isapprox(PL′, PL, atol=1e-8)

Test Passed

In [16]:
θ = Float64[1/3,2/3]
Z = exp(θ[1])/2 + exp(θ[2])/2
PL = 2θ[1] + 4θ[2] - 6*log(Z)
PL′ = log_pseudolikelihood(features, θ, factorgraphs)
@test isapprox(PL′, PL, atol=1e-8)

Test Passed

#### Log pseudolikelihood gradient

Before we compute the log pseudolikelihood gradient, we first compute the expected value: $\mathbb{E}_{x_j \sim p(x_j \mid x_{-j} ; \theta)}[f_i(x_j \mid x_{-j})]$

Our network only has one variable, so here it is simply:

$$\mathbb{E}_{x \sim p(x;\theta)}[f_i(x)] = \int_0^1 f_i(x) p_\theta(x) \> dx$$

For $f_1$ this is:
$$\mathbb{E}_{x \sim p(x;\theta)}[f_1(x)] = \int_0^1 (x < 1/2) p_\theta(x) \> dx = \int_0^{1/2} \frac{\exp(\theta_1 f_1 + \theta_2 f_2)}{\frac{1}{2}e^{\theta_1} + \frac{1}{2}e^{\theta_2}} \> dx = \int_0^{1/2} \frac{1}{Z} \exp(\theta_1) \> dx = \frac{1}{2Z}e^{\theta_1}$$

For $f_2$ this is:
$$\mathbb{E}_{x \sim p(x;\theta)}[f_2(x)] = \frac{1}{2Z}e^{\theta_2}$$
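
A quick consistency check on these two closed forms: since $f_1 + f_2 = 1$ everywhere on $[0,1]$, the two expectations must sum to one. The sketch below verifies this in plain Julia; `expected_features` is a hypothetical helper, not an AutoScenes function.

```julia
# Closed-form expectations of the two indicator features under p(x; θ).
function expected_features(θ)
    Z = exp(θ[1])/2 + exp(θ[2])/2
    return [exp(θ[1])/(2Z), exp(θ[2])/(2Z)]
end

E = expected_features([1/3, 2/3])
# f1 + f2 = 1 on [0, 1], so the expectations sum to one.
@assert isapprox(E[1] + E[2], 1.0, atol=1e-12)
# With equal weights each half of the interval has probability 1/2.
@assert isapprox(expected_features([1.0, 1.0])[1], 0.5, atol=1e-12)
```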

In [17]:
θ = Float64[1,1]
Z = exp(θ[1])/2 + exp(θ[2])/2
fg = factorgraphs[1]
x = fg.vars.values[1]
E1 = exp(θ[1])/(2Z)
E1′ = calc_expectation_x_given_other(1, 1, features, θ, fg.vars, fg.assignments, roadway)
@test isapprox(E1′, E1, atol=1e-8)
E2 = exp(θ[2])/(2Z)
E2′ = calc_expectation_x_given_other(2, 1, features, θ, fg.vars, fg.assignments, roadway)
@test isapprox(E2′, E2, atol=1e-8)

Test Passed

In [18]:
θ = Float64[1/3,2/3]
Z = exp(θ[1])/2 + exp(θ[2])/2
fg = factorgraphs[1]
x = fg.vars.values[1]
E1 = exp(θ[1])/(2Z)
E1′ = calc_expectation_x_given_other(1, 1, features, θ, fg.vars, fg.assignments, roadway)
@test isapprox(E1′, E1, atol=1e-8)
E2 = exp(θ[2])/(2Z)
E2′ = calc_expectation_x_given_other(2, 1, features, θ, fg.vars, fg.assignments, roadway)
@test isapprox(E2′, E2, atol=1e-8)

Test Passed

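Next we test the log pseudolikelihood gradient. Note that, as written, the gradient is averaged over the $M$ samples, so it is the derivative of $\frac{1}{M}\ell_\text{PL}(\mathcal{D})$ rather than of the summed loss:
$$
\frac{\partial}{\partial \theta_i} \ell_\text{PL}(\mathcal{D}) = \sum_{j : x_j \in \text{scope}[f_i]} \frac{1}{M}\sum_m \left(f_i(x_{\cdot}^{(m)}) - \mathbb{E}[f_i(x_j' \mid x_{-j}^{(m)})]\right)
$$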

For $f_1$ we get:
$$
    \frac{\partial}{\partial \theta_1} \ell_\text{PL}(\mathcal{D}) = \frac{1}{6}\sum_m \left(f_1(x^{(m)}) - \mathbb{E}[f_1(x)]\right) = \frac{1}{6}\left(2-6\mathbb{E}[f_1(x)]\right) = \frac{1}{3} - \mathbb{E}[f_1(x)] = \frac{1}{3} - \frac{1}{2Z}e^{\theta_1}
$$

Similarly, for $f_2$ we get:
$$
\frac{\partial}{\partial \theta_2} \ell_\text{PL}(\mathcal{D}) = \frac{2}{3} - \frac{1}{2Z}e^{\theta_2}
$$
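
These closed-form derivatives can be cross-checked with central finite differences on the mean log pseudolikelihood. The sketch below is plain Julia under the one-variable model of this notebook; `mean_logpl` and `fd_partial` are hypothetical helpers, and the step size `1e-5` is an arbitrary choice.

```julia
# Mean log pseudolikelihood, (1/M) * sum over samples of ln p(x), for the
# one-variable model.
function mean_logpl(θ, data)
    Z = exp(θ[1])/2 + exp(θ[2])/2
    total = 0.0
    for x in data
        total += (x < 0.5 ? θ[1] : θ[2]) - log(Z)
    end
    return total / length(data)
end

# Central finite difference in coordinate i with step h.
function fd_partial(θ, data, i, h)
    θp = copy(θ); θp[i] += h
    θm = copy(θ); θm[i] -= h
    return (mean_logpl(θp, data) - mean_logpl(θm, data)) / (2h)
end

data = [0.0, 1/4, 3/4, 3/4, 3/4, 1.0]
θ = [1/3, 2/3]
Z = exp(θ[1])/2 + exp(θ[2])/2

@assert isapprox(fd_partial(θ, data, 1, 1e-5), 1/3 - exp(θ[1])/(2Z), atol=1e-7)
@assert isapprox(fd_partial(θ, data, 2, 1e-5), 2/3 - exp(θ[2])/(2Z), atol=1e-7)
```

The agreement holds only for the mean over the six samples; differentiating the summed loss $2\theta_1 + 4\theta_2 - 6\ln Z$ would give values six times larger.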

In [19]:
srand(0)
θ = Float64[1,1]
Z = exp(θ[1])/2 + exp(θ[2])/2
∂ = 1/3 - exp(θ[1])/(2Z)
∂′ = log_pseudolikelihood_derivative_complete(1, features, θ, factorgraphs)
@test isapprox(∂′, ∂, atol=1e-8)

Test Passed

In [20]:
Z = exp(θ[1])/2 + exp(θ[2])/2
∂ = 2/3 - exp(θ[2])/(2Z)
∂′ = log_pseudolikelihood_derivative_complete(2, features, θ, factorgraphs)
@test isapprox(∂′, ∂, atol=1e-8)

Test Passed

In [21]:
θ = Float64[1/3,2/3]
Z = exp(θ[1])/2 + exp(θ[2])/2
∂ = 1/3 - exp(θ[1])/(2Z)
∂′ = log_pseudolikelihood_derivative_complete(1, features, θ, factorgraphs)
@test isapprox(∂′, ∂, atol=1e-8)

Test Passed

In [22]:
θ = Float64[1/3,2/3]
Z = exp(θ[1])/2 + exp(θ[2])/2
∂ = 2/3 - exp(θ[2])/(2Z)
∂′ = log_pseudolikelihood_derivative_complete(2, features, θ, factorgraphs)
@test isapprox(∂′, ∂, atol=1e-8)

Test Passed

Finally, we compute the full gradient:

In [23]:
θ = Float64[1,1]
Z = exp(θ[1])/2 + exp(θ[2])/2
∇ = [1/3 - exp(θ[1])/(2Z), 2/3 - exp(θ[2])/(2Z)]
∇′ = Array{Float64}(length(θ))
log_pseudolikelihood_gradient!(∇′, features, θ, factorgraphs)
@test isapprox(∇′[1], ∇[1], atol=1e-8)
@test isapprox(∇′[2], ∇[2], atol=1e-8)

Test Passed

In [24]:
θ = Float64[1/3,2/3]
Z = exp(θ[1])/2 + exp(θ[2])/2
∇ = [1/3 - exp(θ[1])/(2Z), 2/3 - exp(θ[2])/(2Z)]
∇′ = Array{Float64}(length(θ))
log_pseudolikelihood_gradient!(∇′, features, θ, factorgraphs)
@test isapprox(∇′[1], ∇[1], atol=1e-8)
@test isapprox(∇′[2], ∇[2], atol=1e-8)

Test Passed