## Probabilistic Programming 3: Assignment

In this assignment, we will use a regression model for forecasting. You will need to compute the posterior predictive distribution and visualize its estimates. If this doesn't ring a bell, have another look at the lecture on [regression](https://nbviewer.jupyter.org/github/bertdv/BMLIP/blob/master/lessons/notebooks/Regression.ipynb).

In [None]:
using Pkg
Pkg.activate("workspace/")
Pkg.instantiate();

In [None]:
using Random
using ForneyLab
using Plots

In [None]:
Random.seed!(250) # Do not change this!

In [None]:
# Generate data
sample_size = 100
covariates = collect(range(0., stop=10., length=sample_size))
responses = covariates .+ randn(sample_size,)

# Visualize data
scatter(covariates, responses, color="black", label="", xlabel="covariates", ylabel="responses", size=(800,300))

In [None]:
# Start factor graph
g = FactorGraph();

# Variance of likelihood
σ2_Y = 1.

# Covariates
@RV X; placeholder(X, :X, dims=(2,))

# Define a prior over the weights
@RV θ ~ GaussianMeanVariance([0.0, 0.0], [1.0 0.;0. 1.0])

# Likelihood of the responses
@RV Y ~ GaussianMeanVariance(dot(X,θ), σ2_Y)
placeholder(Y, :Y)

# Define and compile the algorithm
algorithm = messagePassingAlgorithm(θ, free_energy=true) 
source_code = algorithmSourceCode(algorithm)
eval(Meta.parse(source_code));

# Visualise the graph
ForneyLab.draw()

In [None]:
# Initialize posteriors dictionary
posteriors = Dict()
for i = 1:sample_size
    
    # Load i-th data point
    data = Dict(:X => [covariates[i], 1],
                :Y => responses[i])

    # Update posterior for θ
    step!(data, posteriors)
end

# Moments of posterior distribution for regression parameters
mθ = mean(posteriors[:θ])
Vθ = cov(posteriors[:θ])
println("Mean = "*string(mθ))
println("Covariance = "*string(Vθ))

We now have a posterior distribution for the regression parameters $\theta$. Below, we generate new covariates $x_{\bullet}$ and want to infer the unknown responses $y_{\bullet}$. The lectures describe the predictive distribution as $p(y_{\bullet} \mid x_{\bullet}, D)$, where $D$ is the previously observed data. Under the regression model defined above, this distribution has a particular form.
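As a reminder of where that form comes from, the predictive distribution is obtained by averaging the likelihood of a new response over the parameter posterior. The expression below is a sketch of that marginalization, assuming the likelihood from the model above, $\mathcal{N}(y_{\bullet} \mid \theta^{\top} x_{\bullet}, \sigma^2_Y)$, with $x_{\bullet}$ extended by a constant 1 in the same way as the training covariates:

$$p(y_{\bullet} \mid x_{\bullet}, D) = \int \mathcal{N}(y_{\bullet} \mid \theta^{\top} x_{\bullet}, \sigma^2_Y) \, p(\theta \mid D) \, \mathrm{d}\theta$$

Since the likelihood and the posterior $p(\theta \mid D)$ are both Gaussian, this integral is itself a Gaussian, so it is fully described by a mean and a variance. Those are the two quantities you are asked to compute below.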

### **1) Compute the mean of the predictive distribution.**

Tip: use `.*` and `.+` to multiply each element of an array by a number or to add a number to each element; `3 .* [1 2 3]` gives `[3 6 9]` and `[1 2 3] .+ 3` gives `[4 5 6]` (see [broadcasting](https://docs.julialang.org/en/v1/manual/arrays/#Broadcasting)).
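The broadcasting operators can also be chained in one expression. The snippet below is a small illustration only; the names `xs`, `slope` and `intercept` and their values are made up for this example and are not part of the assignment.

```julia
# Hypothetical values, chosen only to illustrate broadcasting
xs = [1.0, 2.0, 3.0]
slope = 2.0
intercept = 0.5

# Scale every element of xs, then shift every element
line = slope .* xs .+ intercept

# Broadcasting also works elementwise between two arrays of equal length
squares = xs .* xs

println(line)
println(squares)
```

Note that `xs` here is a column vector (comma-separated), whereas `[1 2 3]` in the tip is a 1×3 row matrix; the dotted operators behave elementwise on both.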

In [None]:
# Generate future covariates
num_future = 10
x_bullet = collect(range(10.0, stop=12, length=num_future))

# Create a variable "mean_y_bullet" with the mean of the predictive distribution
### YOUR CODE HERE

# Visualize forecasts
scatter(covariates, responses, color="black", label="data", xlabel="covariates", ylabel="responses", xlims=[0., 15.], ylims=[0., 15.], legend=:topleft)
plot!(x_bullet, mean_y_bullet, label="forecast", color="red", size=(800,300))

Your visualization should look like this:

![](figures/mean_predictive.png)

### **2) Compute the variance of the predictive distribution.**

In [None]:
# Create a variable "var_y_bullet" with the variance of the predictive distribution
### YOUR CODE HERE

# Visualize forecasts
scatter(covariates, responses, color="black", label="data", xlabel="covariates", ylabel="responses", xlims=[0., 15.], ylims=[0., 15.], legend=:topleft)
plot!(x_bullet, mean_y_bullet, ribbon=[var_y_bullet, var_y_bullet], label="forecast", color="red", size=(800,300))

Your visualization should look like this:

![](figures/full_predictive.png)