# Heights problem

We estimate a simple linear regression model (intercept only) with a half-T prior on `σ`.

In [1]:
using DynamicHMCModels

ProjDir = rel_path_d("..", "scripts", "04")
cd(ProjDir)

Import the dataset.

In [2]:
howell1 = CSV.read(rel_path("..", "data", "Howell1.csv"), delim=';');
df = convert(DataFrame, howell1);

Use only adults (age 18 or older).

In [3]:
df2 = filter(row -> row[:age] >= 18, df);

Show the first six rows of the dataset.

In [4]:
first(df2, 6)

| Row | height | weight | age | male |
|-----|--------|--------|-----|------|
|  | Float64 | Float64 | Float64 | Int64 |
| 1 | 151.765 | 47.8256 | 63.0 | 1 |
| 2 | 139.7 | 36.4858 | 63.0 | 0 |
| 3 | 136.525 | 31.8648 | 65.0 | 0 |
| 4 | 156.845 | 53.0419 | 41.0 | 1 |
| 5 | 145.415 | 41.2769 | 51.0 | 0 |
| 6 | 163.83 | 62.9926 | 35.0 | 1 |


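Define a problem type that holds the observations and the degrees of freedom of the half-T prior on `σ` (the prior itself is applied in the callable below).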

In [5]:
struct HeightsProblem{TY <: AbstractVector, Tν <: Real}
    "Observations."
    y::TY
    "Degrees of freedom for prior on sigma."
    ν::Tν
end;

Then make the type callable with the parameters *as a single argument*.

In [6]:
function (problem::HeightsProblem)(θ)
    @unpack y, ν = problem   # extract the data
    @unpack μ, σ = θ
    loglikelihood(Normal(μ, σ), y) + logpdf(TDist(ν), σ)
end;
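
Written out, the value returned by this callable is an unnormalized log posterior. The following is a sketch in the notation of the code, where $t_\nu$ denotes the Student-t density with $\nu$ degrees of freedom and $n$ is the number of observations:

```latex
\ell(\mu, \sigma) \;=\; \sum_{i=1}^{n} \log \mathcal{N}(y_i \mid \mu, \sigma) \;+\; \log t_\nu(\sigma)
```

Because $\sigma > 0$, the half-T density is $2\,t_\nu(\sigma)$, so the expression above differs from the half-T log prior only by the constant $\log 2$, which does not affect sampling. No explicit prior term is added for $\mu$.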

Set up the problem with the data, then evaluate the log density at some initial values.

In [7]:
obs = convert(Vector{Float64}, df2.height);  # property access avoids the deprecated df2[:height]
p = HeightsProblem(obs, 1.0);
p((μ = 178, σ = 5.0,))


-5170.976519811121

Write a function that returns a properly dimensioned transformation.

In [8]:
problem_transformation(p::HeightsProblem) =
    as((σ = asℝ₊, μ = as(Real, 100, 250)))

problem_transformation (generic function with 1 method)
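
To make the role of the transformation concrete, here is a minimal sketch of the kind of maps involved, assuming the usual choices of `exp` for a positive parameter and a scaled logistic for an interval. The function names `to_positive` and `to_interval` are hypothetical stand-ins written in plain Julia, not part of TransformVariables.jl.

```julia
# Hand-written stand-ins for the two transformations (hypothetical names).

# Map an unconstrained real number to (0, ∞).
to_positive(x) = exp(x)

# Map an unconstrained real number to the open interval (lo, hi)
# using a scaled and shifted logistic function.
to_interval(x, lo, hi) = lo + (hi - lo) / (1 + exp(-x))

# The unconstrained point (0, 0) maps to σ = 1 and the midpoint of (100, 250).
θ = (σ = to_positive(0.0), μ = to_interval(0.0, 100, 250))
```

The sampler works on the unconstrained scale, so every proposal corresponds to a valid `σ > 0` and a `μ` inside (100, 250).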

Wrap the problem with a transformation, then use ForwardDiff for the gradient.

In [9]:
P = TransformedLogDensity(problem_transformation(p), p)
∇P = LogDensityRejectErrors(ADgradient(:ForwardDiff, P));
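
The idea behind wrapping a log density in a transformation can be illustrated with a one-parameter toy example. This is a sketch in plain Julia under stated assumptions: the density, the function names, and the finite-difference gradient are all hypothetical stand-ins, with the finite difference playing the role that automatic differentiation plays in the cell above.

```julia
# Toy log density on the constrained scale (unnormalized), defined for σ > 0.
logdensity_constrained(σ) = -σ^2 / 2          # half-normal kernel

# Log density on the unconstrained scale x = log(σ):
# evaluate at σ = exp(x) and add the log Jacobian, log|dσ/dx| = x.
logdensity_unconstrained(x) = logdensity_constrained(exp(x)) + x

# Central finite difference, standing in for automatic differentiation.
fd_gradient(f, x; h = 1e-6) = (f(x + h) - f(x - h)) / (2h)

x = 0.3
g = fd_gradient(logdensity_unconstrained, x)  # analytic gradient: 1 - exp(2x)
```

Omitting the log-Jacobian term would make the sampler target a different distribution on the constrained scale, which is why the wrapper adds it automatically.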

Tune and sample.

In [10]:
chain, NUTS_tuned = NUTS_init_tune_mcmc(∇P, 1000);

MCMC, adapting ϵ (75 steps)
0.0001 s/step ...done
MCMC, adapting ϵ (25 steps)
0.00017 s/step ...done
MCMC, adapting ϵ (50 steps)
0.00013 s/step ...done
MCMC, adapting ϵ (100 steps)
8.1e-5 s/step ...done
MCMC, adapting ϵ (200 steps)
6.6e-5 s/step ...done
MCMC, adapting ϵ (400 steps)
6.7e-5 s/step ...done
MCMC, adapting ϵ (50 steps)
9.3e-5 s/step ...done
MCMC (1000 steps)
7.0e-5 s/step ...done


We use the transformation to obtain the posterior from the chain.

In [11]:
posterior = TransformVariables.transform.(Ref(problem_transformation(p)), get_position.(chain));

Extract the parameter posterior means: `μ`,

In [12]:
posterior_μ = mean(last, posterior)

154.61962750596464

then `σ`:

In [13]:
posterior_σ = mean(first, posterior)

7.755062081852629
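
The calls `mean(last, posterior)` and `mean(first, posterior)` rely on the field order `(σ, μ)` fixed by the transformation. A minimal sketch of the same summaries selected by field name, using a few made-up draws (the `draws` vector is hypothetical and only mimics the layout of `posterior`):

```julia
using Statistics

# Hypothetical draws laid out like `posterior`: one NamedTuple per draw,
# with fields in the order (σ, μ).
draws = [(σ = 7.5, μ = 154.0), (σ = 8.0, μ = 155.0), (σ = 7.75, μ = 154.6)]

# Selecting by field name does not depend on the field order.
μ_mean = mean(d -> d.μ, draws)
σ_mean = mean(d -> d.σ, draws)

# Posterior standard deviation of μ across the draws.
μ_sd = std(map(d -> d.μ, draws))
```

Selecting by name keeps the summaries correct if the order of the fields in the transformation is ever changed.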

Effective sample sizes (of untransformed draws)

In [14]:
ess = mapslices(effective_sample_size,
                get_position_matrix(chain); dims = 1)

1×2 Array{Float64,2}:
 919.47  887.082
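
For intuition about what an effective sample size measures, here is a simplified estimator based on summed autocorrelations. The function name `ess_sketch` and its truncation rule are illustrative assumptions; this is not the estimator that `effective_sample_size` uses.

```julia
using Statistics

# Simplified effective-sample-size estimate (hypothetical helper).
function ess_sketch(x::AbstractVector{<:Real})
    n = length(x)
    centered = x .- mean(x)
    variance = sum(abs2, centered) / n
    ρ_sum = 0.0
    for lag in 1:(n - 1)
        # Autocorrelation of the chain at this lag.
        ρ = sum(centered[1:(n - lag)] .* centered[(lag + 1):n]) / (n * variance)
        ρ <= 0 && break   # truncate at the first non-positive autocorrelation
        ρ_sum += ρ
    end
    return n / (1 + 2 * ρ_sum)
end

ess_alternating = ess_sketch(repeat([1.0, -1.0], 50))  # negatively correlated draws
ess_trending = ess_sketch(collect(1.0:100.0))          # strongly correlated draws
```

Positively correlated draws carry less information than independent ones, so the estimate falls below the number of draws. Values near 1000 for a chain of 1000 draws, as above, indicate little autocorrelation.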

NUTS-specific statistics

In [15]:
NUTS_statistics(chain)

CmdStan results, for comparison.

cmdstan_result = "
Iterations = 1:1000
Thinning interval = 1
Chains = 1,2,3,4
Samples per chain = 1000

Empirical Posterior Estimates:
          Mean        SD       Naive SE      MCSE      ESS
sigma   7.7641872 0.29928194 0.004732063 0.0055677898 1000
   mu 154.6055177 0.41989355 0.006639100 0.0085038356 1000

Quantiles:
         2.5%      25.0%       50.0%      75.0%       97.5%
sigma   7.21853   7.5560625   7.751355   7.9566775   8.410391
   mu 153.77992 154.3157500 154.602000 154.8820000 155.431000
";

Compare with the posterior means of `μ` and `σ` obtained above:

In [16]:
[posterior_μ, posterior_σ]

2-element Array{Float64,1}:
 154.61962750596464 
   7.755062081852629
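
One way to judge the agreement is to express the difference in means in units of the posterior standard deviation. The sketch below copies the numbers from the outputs above; the variable names are illustrative.

```julia
# Posterior means from this run and from the CmdStan summary.
dhmc = (μ = 154.61962750596464, σ = 7.755062081852629)
cmdstan_mean = (μ = 154.6055177, σ = 7.7641872)
cmdstan_sd = (μ = 0.41989355, σ = 0.29928194)

# Difference in means, expressed in posterior standard deviations.
z_μ = (dhmc.μ - cmdstan_mean.μ) / cmdstan_sd.μ
z_σ = (dhmc.σ - cmdstan_mean.σ) / cmdstan_sd.σ
```

Both differences are well under a tenth of a posterior standard deviation, so the two samplers agree closely on this model.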

*This notebook was generated using [Literate.jl](https://github.com/fredrikekre/Literate.jl).*