In [1]:
using DynamicHMCModels

ProjDir = rel_path_d("..", "scripts", "12")

df = CSV.read(rel_path( "..", "data",  "Kline.csv"), delim=';');
size(df) # Should be 10x5

(10, 5)

Add a society index and a new column `logpop` holding the natural log of the population.

In [2]:
df[!, :society] = 1:10;
df[!, :logpop] = log.(df[!, :population]);
#df[!, :total_tools] = convert(Vector{Int64}, df[!, :total_tools])
first(df[!, [:total_tools, :logpop, :society]], 5)

| Row | total_tools | logpop  | society |
|-----|-------------|---------|---------|
|     | Int64       | Float64 | Int64   |
| 1   | 13          | 7.00307 | 1       |
| 2   | 22          | 7.31322 | 2       |
| 3   | 24          | 8.18869 | 3       |
| 4   | 43          | 8.47449 | 4       |
| 5   | 33          | 8.90924 | 5       |


Define problem data structure

In [3]:
struct m_12_06d{TY <: AbstractVector, TX <: AbstractMatrix,
  TS <: AbstractVector}
    "Observations (total_tools)."
    y::TY
    "Covariates (logpop)"
    X::TX
    "Society"
    S::TS
    "Number of observations (10)"
    N::Int
    "Number of societies (also 10)"
    N_societies::Int
end;

Make the type callable with the parameters *as a single argument*.

In [4]:
function (problem::m_12_06d)(θ)
    @unpack y, X, S, N, N_societies = problem   # extract the data
    @unpack β, α, s = trans(θ)  # β : a, bp, α : a_society, s
    σ = s[1]^2                  # s is unconstrained; squaring keeps σ nonnegative
    ll = 0.0
    ll += logpdf(Cauchy(0, 1), σ)           # prior on sigma
    ll += sum(logpdf.(Normal(0, σ), α))     # prior on α[1:10]
    ll += logpdf(Normal(0, 10), β[1])       # prior on a
    ll += logpdf(Normal(0, 1), β[2])        # prior on bp
    # Poisson likelihood with a log link: log λ[i] = α[S[i]] + X[i, :] ⋅ β
    ll += sum(logpdf(Poisson(exp(α[S[i]] + dot(X[i, :], β))), y[i]) for i in 1:N)
    return ll
end
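Read off the function above, the accumulated log density corresponds to the model below. This is a sketch for orientation only: the names $a$, $b_p$ and $a_{\mathrm{society}}$ follow the code comments, the row `X[i, :]` is assumed to be `[1, logpop_i]` as built in the next cell, and the code evaluates the Cauchy(0, 1) density directly at $\sigma = s^2$.

```latex
\begin{aligned}
y_i &\sim \mathrm{Poisson}(\lambda_i) \\
\log \lambda_i &= a + a_{\mathrm{society}[i]} + b_p \, \mathrm{logpop}_i \\
a &\sim \mathrm{Normal}(0, 10) \\
b_p &\sim \mathrm{Normal}(0, 1) \\
a_{\mathrm{society}[j]} &\sim \mathrm{Normal}(0, \sigma), \quad j = 1, \dots, 10 \\
\sigma &= s^2, \qquad \sigma \sim \mathrm{Cauchy}(0, 1)
\end{aligned}
```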

Instantiate the model with data and inits.

In [5]:
N = size(df, 1)
N_societies = length(unique(df[!, :society]))
X = hcat(ones(Int64, N), df[!, :logpop]);
S = df[!, :society];
y = df[!, :total_tools];
γ = (β = [1.0, 0.25], α = rand(Normal(0, 1), N_societies), s = [0.2]);
p = m_12_06d(y, X, S, N, N_societies);

Define the transformation that converts a single vector of parameters into a NamedTuple of parameters.

In [6]:
trans = as((β = as(Array, 2), α = as(Array, 10), s = as(Array, 1)));
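To make the layout of the flat parameter vector concrete, here is a minimal plain-Julia sketch of the slicing that `trans` is understood to perform. The function `unpack` and the vector `θ_demo` are hypothetical names introduced for illustration, and the sketch assumes each `as(Array, n)` block is an unconstrained (identity) transformation, so the mapping reduces to slicing in field order.

```julia
# Hypothetical stand-in for `trans`: split a flat 13-vector into named blocks.
# Assumes each `as(Array, n)` block leaves its values unchanged.
unpack(θ::AbstractVector) = (β = θ[1:2], α = θ[3:12], s = θ[13:13])

θ_demo = collect(1.0:13.0)   # illustrative values only
nt = unpack(θ_demo)

println(nt.β)           # the two regression coefficients (a, bp)
println(length(nt.α))   # the ten society offsets
println(nt.s)           # a one-element vector, hence `s[1]` in the log density
```

This ordering is consistent with the 13-element vectors printed later in the notebook: two coefficients, ten society offsets, then `s`.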

Define the input parameter vector and evaluate the log density at it.

In [7]:
θ = inverse(trans, γ);
p(θ)

-4158.936566437799

Maximum a posteriori (MAP) estimate

In [8]:
using Optim

x0 = θ;
lower = vcat([0.0, 0.0], -3ones(10), [0.0]);
upper = vcat([2.0, 1.0], 3ones(10), [5.0]);
ll(x) = -p(x);

inner_optimizer = GradientDescent()

res = optimize(ll, lower, upper, x0, Fminbox(inner_optimizer));
res

 * Status: failure

 * Candidate solution
    Minimizer: [1.07e+00, 2.66e-01, -3.38e-13,  ...]
    Minimum:   -1.429975e+02

 * Found with
    Algorithm:     Fminbox with Gradient Descent
    Initial Point: [1.00e+00, 2.50e-01, -2.84e-01,  ...]

 * Convergence measures
    |x - x'|               = 1.01e-11 ≰ 0.0e+00
    |x - x'|/|x'|          = 9.12e-12 ≰ 0.0e+00
    |f(x) - f(x')|         = 2.40e-06 ≰ 0.0e+00
    |f(x) - f(x')|/|f(x')| = 1.68e-08 ≰ 0.0e+00
    |g(x)|                 = 4.47e+05 ≰ 1.0e-08

 * Work counters
    Iterations:    1000
    f(x) calls:    754557212
    ∇f(x) calls:   754557212


The minimizer of the negative log density gives the MAP estimate:

In [9]:
Optim.minimizer(res)

13-element Array{Float64,1}:
  1.0698724931479604    
  0.2658419562959115    
 -3.381328344345346e-13 
  6.423182312971422e-14 
 -1.349966807244842e-13 
  7.938212520010843e-13 
  4.053721034092625e-14 
 -7.721224913515355e-13 
  3.158313271020159e-13 
 -5.422554760912981e-13 
  7.676419143178036e-13 
 -3.5282745361778868e-12
  5.299625548047583e-5  
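One possible reading of the `failure` status, the large gradient norm, and the tiny last entry (s ≈ 5.3e-5, so σ = s² ≈ 2.8e-9) is that the joint density has no finite mode in this region: with every α at zero, the Normal(0, σ) prior term keeps growing as σ shrinks. The sketch below illustrates this with a hand-written normal log density; `normal_logpdf` and `alpha_prior` are hypothetical helper names, not part of the notebook.

```julia
# Log density of Normal(0, σ) at x, written out by hand.
normal_logpdf(x, σ) = -log(σ) - 0.5 * log(2π) - 0.5 * (x / σ)^2

# Joint log prior of n society offsets that are all exactly zero.
alpha_prior(σ; n = 10) = n * normal_logpdf(0.0, σ)

# The prior contribution increases without bound as σ approaches zero.
for σ in (1.0, 1e-3, 1e-6, 1e-9)
    println("σ = ", σ, "  log prior of α = ", alpha_prior(σ))
end
```

Under this reading, the reported minimizer is better treated as a symptom of the hierarchical prior than as a usable point estimate, which is one reason to rely on the sampled posterior instead.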

Write a function to return properly dimensioned transformation.

In [10]:
problem_transformation(p::m_12_06d) =
  as( Vector, length(θ) )

problem_transformation (generic function with 1 method)

Wrap the problem with a transformation, then use ForwardDiff for the gradient.

In [11]:
P = TransformedLogDensity(problem_transformation(p), p)
∇P = LogDensityRejectErrors(ADgradient(:ForwardDiff, P));
#∇P = ADgradient(:ForwardDiff, P);
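ForwardDiff obtains gradients by forward-mode automatic differentiation with dual numbers. The toy `Dual` type below is a hypothetical illustration of that idea for a scalar function; it is not ForwardDiff's implementation and supports only `+`, `*` and `exp`.

```julia
# A value together with its derivative with respect to the input.
struct Dual
    val::Float64
    der::Float64
end

# Propagate derivatives through each primitive using the usual calculus rules.
Base.:+(a::Dual, b::Dual) = Dual(a.val + b.val, a.der + b.der)
Base.:*(a::Dual, b::Dual) = Dual(a.val * b.val, a.der * b.val + a.val * b.der)
Base.exp(a::Dual) = Dual(exp(a.val), exp(a.val) * a.der)

f(x) = x * x + exp(x)      # f'(x) = 2x + exp(x)

d = f(Dual(1.0, 1.0))      # seed the derivative of the input with 1
println(d.val)             # f(1)
println(d.der)             # f'(1)
```

A single evaluation of `f` yields both the value and the derivative, which is why the log density function needs no hand-written gradient.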

Tune and sample.

In [12]:
chain, NUTS_tuned = NUTS_init_tune_mcmc(∇P, 4000);

MCMC, adapting ϵ (75 steps)
0.0057 s/step ...done
MCMC, adapting ϵ (25 steps)
0.0061 s/step ...done
MCMC, adapting ϵ (50 steps)
0.0052 s/step ...done
MCMC, adapting ϵ (100 steps)
0.0024 s/step ...done
MCMC, adapting ϵ (200 steps)
0.0015 s/step ...done
MCMC, adapting ϵ (400 steps)
0.00085 s/step ...done
MCMC, adapting ϵ (50 steps)
0.0015 s/step ...done
MCMC (4000 steps)
step 1376 (of 4000), 0.00073 s/step
step 2741 (of 4000), 0.00073 s/step
0.00072 s/step ...done


We use the transformation to obtain the posterior from the chain.

In [13]:
posterior = TransformVariables.transform.(Ref(problem_transformation(p)),
  get_position.(chain));
posterior[1:5]

5-element Array{Array{Float64,1},1}:
 [1.1253772383240437, 0.2622669279848795, 0.12220828838470743, -0.1369849997766338, 0.02271722912313025, 0.49454421718002867, 0.15341007499761383, -0.05991801907704529, 0.2384526685667614, -0.32467509184525245, 0.21486652403184658, 0.03631578358564913, -0.3827019937005194]      
 [1.4259718961173498, 0.22738864681711948, -0.15879319565078936, -0.011617306363702652, -0.09403823223438239, 0.18913989177938156, -0.044102485189420025, -0.37557920610303447, 0.048000652553465244, -0.296828840347033, 0.0797573803924331, -0.0813660084647233, -0.5032575715241459]
 [1.4896572470500349, 0.232544582476378, -0.13263337832673855, -0.03320904446541051, -0.15565347332116952, 0.12616334869529966, -0.16879530401467072, -0.32439652378280576, 0.13086403156261034, -0.2740054411023427, 0.18129734000097128, -0.12150760812066619, -0.5089619013242039]  
 [0.2935812564862246, 0.32891430656041887, -0.3570150568598451, 0.2245955372364483, 0.3898509932638105, 0.46697636082484634

Extract the parameter posterior means.

In [14]:
posterior_β = mean(trans(posterior[i]).β for i in 1:length(posterior))
posterior_α = mean(trans(posterior[i]).α for i in 1:length(posterior))
posterior_σ = mean(trans(posterior[i]).s for i in 1:length(posterior))[1]^2

0.27249165372476997
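Note that `posterior_σ` as computed above squares the mean of `s`, which is generally not the same as the posterior mean of σ = s². The sketch below contrasts the two using only the `s` components (last entry) of the first three draws printed earlier; three draws are far too few for an estimate and serve only to show the direction of the difference.

```julia
using Statistics

# The `s` component of the first three posterior draws shown above.
s_draws = [-0.3827019937005194, -0.5032575715241459, -0.5089619013242039]

square_of_mean  = mean(s_draws)^2       # what the notebook's expression computes
mean_of_squares = mean(s_draws .^ 2)    # mean of σ = s² over the draws

println((square_of_mean, mean_of_squares))
```

The mean of squares is never smaller than the square of the mean, so the notebook's value may understate the posterior mean of σ relative to an average of `s^2` over all draws.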

Effective sample sizes (of untransformed draws)

In [15]:
ess = mapslices(effective_sample_size, get_position_matrix(chain); dims = 1)
ess

1×13 Array{Float64,2}:
 3296.24  3301.37  3971.67  3626.47  …  2827.36  1926.44  3197.35  488.611

NUTS-specific statistics

In [16]:
NUTS_statistics(chain)

Hamiltonian Monte Carlo sample of length 4000
  acceptance rate mean: 0.93, min/25%/median/75%/max: 0.0 0.91 0.97 0.99 1.0
  termination: AdjacentDivergent => 0% AdjacentTurn => 7% DoubledTurn => 93%
  depth: 0 => 0% 1 => 0% 2 => 1% 3 => 6% 4 => 92% 5 => 1% 6 => 0%


CmdStan result

In [17]:
m_12_6_result = "
Iterations = 1:1000
Thinning interval = 1
Chains = 1,2,3,4
Samples per chain = 1000

Empirical Posterior Estimates:
                            Mean                SD               Naive SE             MCSE            ESS
            a          1.076167468  0.7704872560 0.01218247319 0.0210530022 1000.000000
           bp         0.263056273  0.0823415805 0.00130193470 0.0022645077 1000.000000
  a_society.1   -0.191723568  0.2421382537 0.00382854195 0.0060563054 1000.000000
  a_society.2    0.054569029  0.2278506876 0.00360263570 0.0051693148 1000.000000
  a_society.3   -0.035935050  0.1926364647 0.00304584994 0.0039948433 1000.000000
  a_society.4    0.334355037  0.1929971201 0.00305155241 0.0063871707  913.029080
  a_society.5    0.049747513  0.1801287716 0.00284808595 0.0043631095 1000.000000
  a_society.6   -0.311903245  0.2096126337 0.00331426674 0.0053000536 1000.000000
  a_society.7    0.148637507  0.1744680594 0.00275858223 0.0047660246 1000.000000
  a_society.8   -0.164567976  0.1821341074 0.00287979309 0.0034297298 1000.000000
  a_society.9    0.277066965  0.1758237250 0.00278001719 0.0055844175  991.286501
 a_society.10   -0.094149204  0.2846206232 0.00450024719 0.0080735022 1000.000000
sigma_society    0.310352849  0.1374834682 0.00217380450 0.0057325226  575.187461
";

Show means

In [18]:
[posterior_β, posterior_α, posterior_σ]

3-element Array{Any,1}:
  [1.11643807294995, 0.25946232932934726]                                                                                                                                                                               
  [-0.1897505644554755, 0.040701885090751795, -0.04180146915130906, 0.31265832957357764, 0.036828532743372715, -0.29682419939231225, 0.13967380955792097, -0.16330897270956143, 0.264447984204219, -0.08758409001758367]
 0.27249165372476997                                                                                                                                                                                                    
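To compare these means with the CmdStan summary, a small sketch such as the following can help. The numbers are copied from the two outputs above; the variable names `dhmc`, `stan` and `labels` are hypothetical.

```julia
# Posterior means from this notebook (β, α, σ) in the order of the Stan table.
dhmc = [1.11643807294995, 0.25946232932934726,
        -0.1897505644554755, 0.040701885090751795, -0.04180146915130906,
        0.31265832957357764, 0.036828532743372715, -0.29682419939231225,
        0.13967380955792097, -0.16330897270956143, 0.264447984204219,
        -0.08758409001758367, 0.27249165372476997]

# Means from the CmdStan result string.
stan = [1.076167468, 0.263056273,
        -0.191723568, 0.054569029, -0.035935050, 0.334355037, 0.049747513,
        -0.311903245, 0.148637507, -0.164567976, 0.277066965, -0.094149204,
        0.310352849]

labels = vcat(["a", "bp"], ["a_society.$i" for i in 1:10], ["sigma_society"])

# Print the difference for each parameter, then the largest absolute gap.
for (name, d, s) in zip(labels, dhmc, stan)
    println(rpad(name, 14), round(d - s; digits = 4))
end
println("max |difference| = ", maximum(abs.(dhmc .- stan)))
```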

End of m12.6d1.jl

*This notebook was generated using [Literate.jl](https://github.com/fredrikekre/Literate.jl).*