# Another 1D GP Example

We generate realizations of a simple one-dimensional Gaussian process (GP) with zero mean and Matérn covariance. Then we use the LaplaceInterpolation procedure to interpolate values that were held out.
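
As a brief reminder of why the sampling step in this notebook works, here is the standard identity it relies on. The symbols $f$ and $v$ are notation introduced here for illustration; $\mu$ and $\Sigma$ stand for the mean and covariance returned by the GP prediction, and $L$ is the lower Cholesky factor of $\Sigma$:

$$
f = \mu + L v, \qquad v \sim \mathcal{N}(0, I), \qquad \Sigma = L L^{\top},
$$

so that $\mathbb{E}[f] = \mu$ and $\operatorname{Cov}(f) = L\,\mathbb{E}[v v^{\top}]\,L^{\top} = L L^{\top} = \Sigma$.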

In [None]:
using GaussianProcesses, LaplaceInterpolation, Random, LinearAlgebra, Plots, Statistics, StatsPlots
Random.seed!(20140430);

In [None]:
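# Build the GP conditioned on the training data. Note that `x` and `y` are read
# from global scope, so they must be defined before this function is called.
function get_gp(ν, ll, lσ)
    kern = Matern(ν, ll, lσ)
    # We will use no observation noise here for the purposes of generating data
    return GP(x, y, MeanZero(), kern)
end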

function gprealiz(x, y, ν, ll, lσ, n2, nrealiz = 1)
    kern = Matern(ν, ll, lσ)                   
    # We will use no observation noise here for the purposes of generating data
    gp = GP(x, y, MeanZero(), kern)
    # Generate realizations of the data as μ + L * V, where L is the lower Cholesky
    # factor of the predictive covariance Σ and V is drawn from an iid N(0,1) distribution
    μ, Σ = predict_f(gp, collect(1.0:n2), full_cov = true)
    L = cholesky(Σ)
    realiz = zeros(n2, nrealiz)
    for i in 1:nrealiz
        realiz[:, i] = μ + L.L * randn(n2)
    end
    return realiz
end

## Generate some data with Matérn covariance

We use the GaussianProcesses.jl package to first define a GP and then generate realizations from it. The length scale and signal standard deviation given to this package are on a log scale.
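
To make the log-scale convention concrete, here is a minimal sketch of the Matérn covariance for $\nu = 3/2$ written directly in terms of the log length scale `ll` and the log signal standard deviation `lσ`. The function `matern32` is a hypothetical helper introduced for illustration and is not part of GaussianProcesses.jl; it assumes the usual closed form $\sigma^2 (1 + \sqrt{3}\,r/\ell)\exp(-\sqrt{3}\,r/\ell)$.

```julia
using LinearAlgebra

# Hypothetical helper (not a GaussianProcesses.jl function): Matérn ν = 3/2
# covariance parameterized by the log length scale `ll` and the log signal
# standard deviation `lσ`.
function matern32(r, ll, lσ)
    ℓ = exp(ll)             # length scale on the natural scale
    σ2 = exp(2 * lσ)        # signal variance on the natural scale
    s = sqrt(3) * abs(r) / ℓ
    return σ2 * (1 + s) * exp(-s)
end

ll, lσ = 0.5, 0.0
grid = 1.0:20.0

# Covariance matrix on the unit-spaced grid 1, 2, ..., 20
K = [matern32(a - b, ll, lσ) for a in grid, b in grid]

# A valid covariance matrix is symmetric positive definite, so the Cholesky
# factorization should succeed
L = cholesky(Symmetric(K)).L
```

With `lσ = 0.0` the signal variance is 1, so the diagonal of `K` is all ones, and `ll = 0.5` corresponds to a length scale of `exp(0.5)` grid units.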

In [None]:
# Training data
n = 10
n2 = 2 * n
v = Float64.(randperm(n2))
x = sort(v[1:n])
y = sin.(2π * x * 0.1) + 0.05*randn(n)

keep = Int64.(x)
discard = sort(Int64.(v[(n+1):n2]))

# Matérn parameters: ll (length scale) and lσ (signal standard deviation) are set
# on a log scale; the smoothness ν is not
ν = 1.5
ll = 0.5
lσ = 0.0 #0.05
nrealiz = 100

realiz = gprealiz(x, y, ν, ll, lσ, n2, nrealiz)

We plot the realizations on top of the simple 1D GP.

In [None]:
gp = get_gp(ν, ll, lσ)
plot(gp; xlabel="x", ylabel="y", title="Gaussian process", legend=true, label = ["μ" "y"],
        xlim = [0,21]) 
plot!(realiz, c = RGBA(0,0,0,0.1), label = "")

## Interpolate using Matérn and Laplace interpolation

Here we assume the Matérn parameters are known and interpolate using them. We need to convert between the log-parameters given to the GP function above and the notation we've used, $m$ and $\epsilon$.
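
The conversion can be sketched as a small function, using the relations $m = \nu + d/2$ and $\epsilon = \sqrt{2\nu}/\ell$ with $\ell = \exp(\texttt{ll})$. The name `matern_to_m_eps` is a hypothetical helper introduced here for illustration; it does not come from either package.

```julia
# Hypothetical helper: map the smoothness ν and the log length scale `ll`
# to the (m, ε) notation, for spatial dimension d.
function matern_to_m_eps(ν, ll; d = 1)
    m = ν + d / 2               # m = ν + d/2
    ε = sqrt(2ν) / exp(ll)      # ε = sqrt(2ν) / length scale
    return m, ε
end

m_val, ε_val = matern_to_m_eps(1.5, 0.5)
```

For $\nu = 1.5$, `ll = 0.5`, and $d = 1$ this gives `m_val = 2.0` and `ε_val ≈ 1.05`. The notebook passes `m` as an integer, so `Int(m_val)` would be the natural conversion whenever $\nu + d/2$ is a whole number.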

In [None]:
h = 1.0   # grid spacing

d = 1     # spatial dimension
# m = nu + d/2
m = Int(ν + d/2)
# epsilon = sqrt(2*nu)/length_scale, where length_scale = exp(ll)
epsilon = sqrt(2ν)/exp(ll)

y_lap = mapslices(z -> matern_1d_grid(z, discard, 1, 1.0, h), realiz, dims = 1)
y_mat = mapslices(z -> matern_1d_grid(z, discard, m, epsilon, h), realiz, dims = 1)


Looking at one of the realizations and the interpolations we get, we see that the squared distance between the interpolation and the truth, summed over the discarded points, is smaller for the Matérn interpolation.
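
Written out, the comparison metric can be stated as follows. The symbols $D_i$, $r_{j,i}$, and $\hat{r}_{j,i}$ are notation introduced here for illustration: $r_{j,i}$ is the value of realization $i$ at grid point $j$, $\hat{r}_{j,i}$ is the corresponding interpolated value, and the sum runs only over the discarded indices:

$$
D_i = \sum_{j \in \text{discard}} \left( r_{j,i} - \hat{r}_{j,i} \right)^2 .
$$

The kept points are excluded because the interpolation takes their values as given, so only the discarded points measure how well each method recovers missing data.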

In [None]:
function sqdist(r, interp, discard)
    return sum(abs2, r[discard] .- interp[discard])
end

i = 1

println("Squared distance between Laplace interpolation and the truth: $(sqdist(realiz[:,i], y_lap[:,i], discard))")
println("Squared distance between Matern interpolation and the truth: $(sqdist(realiz[:,i], y_mat[:,i], discard))")

plot(gp, label = "GP", alpha = 0.05, legend = :outertopright)
scatter!(keep, realiz[keep, i], label="Known")
plot!(realiz[:, i], label = "Realiz. $i")
scatter!(discard, y_lap[discard, i], label = "∇² Interp.")
scatter!(discard, y_mat[discard, i], label = "Mat($m, $(round(epsilon, sigdigits = 3))) Interp.")


In [None]:
# Compute the squared distance for all the interpolations
lap_dist = map(i -> sqdist(realiz[:, i], y_lap[:, i], discard), 1:nrealiz)
mat_dist = map(i -> sqdist(realiz[:, i], y_mat[:, i], discard), 1:nrealiz)

println("Mean squared distance between the Laplace interpolation and the truth: $(mean(lap_dist)).")
println("Mean squared distance between the Matern interpolation and the truth: $(mean(mat_dist)).")

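# Compare the distributions of squared distances for both methods side by side
boxplot(["Laplace" "Matérn"], [lap_dist mat_dist], legend = false, ylabel = "Squared distance")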