# Linear Regression

Turing is powerful when applied to complex hierarchical models, but it can also be used for common statistical procedures such as [linear regression](https://en.wikipedia.org/wiki/Linear_regression). This tutorial shows how to implement a linear regression model in Turing.

We begin by importing all the necessary libraries.

In [None]:
# Import Turing and Distributions.
using Turing, Distributions

# Import RDatasets.
using RDatasets

# Import MCMCChain, Plots, and StatPlots for visualizations and diagnostics.
using MCMCChain, Plots, StatPlots

# MLDataUtils provides a sample splitting tool that's very handy.
using MLDataUtils

# Set a seed for reproducibility.
using Random
Random.seed!(0);

In [None]:
# Import the "mtcars" dataset.
data = RDatasets.dataset("datasets", "mtcars");

# Show the first six rows of the dataset.
head(data)

In [None]:
# Split our dataset 70/30 into training/test sets.
train, test = MLDataUtils.splitobs(data, at = 0.7);

# Create our labels. These are the values we are trying to predict.
train_label = train[:MPG]
test_label = test[:MPG]

# Get the list of predictor columns to keep (everything except the target and the model name).
keep_names = filter(x->!in(x, [:MPG, :Model]), names(data))

# Keep only those columns in the train and test sets.
train = Matrix(train[keep_names]);
test = Matrix(test[keep_names]);

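# Rescale both matrices using the training set's column means and standard
# deviations, so the test set is transformed exactly like the training set.
μ, σ = mean(train, dims=1), std(train, dims=1);
train = (train .- μ) ./ σ;
test = (test .- μ) ./ σ;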

In [None]:
using LinearAlgebra

# Bayesian linear regression.
@model linear_regression(x, y, n_obs, n_vars) = begin
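    # Prior on the noise parameter σ₂.
    σ₂ ~ InverseGamma(2,3)
    # Prior on the intercept.
    intercept ~ Normal(0, 1)
    # Independent standard normal priors on the regression coefficients.
    coefficients = TArray{Real}(undef, n_vars)
    for i in eachindex(coefficients)
        coefficients[i] ~ Normal(0, 1)
    end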

    for i = 1:n_obs
        v = intercept + coefficients ⋅ x[i,:]
        # Normal takes a standard deviation, so take the square root of the variance σ₂.
        y[i] ~ Normal(v, sqrt(σ₂))
    end
end;

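# Build the model from the training data.
n_obs, n_vars = size(train)
model = linear_regression(train, train_label, n_obs, n_vars)

# Draw 4000 samples with HMC, using a leapfrog step size of 0.001 and 20 leapfrog steps.
chain = sample(model, HMC(4000, 0.001, 20));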
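
Written out, the model above appears intended to be the following, where $\alpha$ is `intercept`, $\beta$ is `coefficients`, and $x_i$ is row $i$ of `x`. This reading assumes `σ₂` is the noise variance, as its name and inverse-gamma prior suggest; note that `Normal` in Distributions takes a standard deviation, not a variance, as its second argument.

$$
\begin{aligned}
\sigma^2 &\sim \mathrm{InverseGamma}(2, 3) \\
\alpha &\sim \mathcal{N}(0, 1) \\
\beta_j &\sim \mathcal{N}(0, 1), \quad j = 1, \dots, n_{\text{vars}} \\
y_i &\sim \mathcal{N}\!\left(\alpha + \beta^\top x_i,\ \sigma^2\right), \quad i = 1, \dots, n_{\text{obs}}
\end{aligned}
$$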

In [None]:
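function prediction(chain, x)
    # Find the index of the sample with the highest log probability.
    _, max_lp = findmax(chain[:lp])

    # Use that sample's parameters as point estimates.
    α = chain[:intercept][max_lp]
    β = chain[:coefficients][max_lp]

    # Alternatively, use the posterior means:
    # α = mean(chain[:intercept])
    # β = mean(chain[:coefficients])

    return α .+ x * β
end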

prediction(chain, test)
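
One way to judge predictions such as those returned by `prediction(chain, test)` is to compare them with `test_label` using the mean squared error. The following is a minimal sketch, not part of the original tutorial: the helper `mse` is hypothetical, and the two short vectors are made-up numbers standing in for real predictions and labels.

```julia
using Statistics

# Hypothetical helper (an assumption, not from the tutorial): mean squared
# error between predicted values ŷ and observed values y.
mse(ŷ, y) = mean((ŷ .- y) .^ 2)

# Made-up numbers standing in for prediction(chain, test) and test_label.
ŷ_demo = [21.0, 18.5, 30.0]
y_demo = [20.0, 19.5, 28.0]

mse(ŷ_demo, y_demo)
```

With the tutorial's own objects, the analogous call would be `mse(prediction(chain, test), test_label)`; a lower value indicates predictions closer to the observed MPG.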