In [1]:
using DynamicHMCModels

ProjDir = rel_path_d("..", "scripts", "12")

df = CSV.read(rel_path("..", "data", "Kline.csv"), delim=';');
size(df) # Should be 10x5

(10, 5)

Add a society index and a new column `logpop` holding the log of the population.

In [2]:
df[:society] = 1:10;
df[:logpop] = log.(df[:population]);
#df[:total_tools] = convert(Vector{Int64}, df[:total_tools])
first(df[[:total_tools, :logpop, :society]], 5)

│ Row │ total_tools │ logpop  │ society │
│     │ Int64⍰      │ Float64 │ Int64   │
├─────┼─────────────┼─────────┼─────────┤
│ 1   │ 13          │ 7.00307 │ 1       │
│ 2   │ 22          │ 7.31322 │ 2       │
│ 3   │ 24          │ 8.18869 │ 3       │
│ 4   │ 43          │ 8.47449 │ 4       │
│ 5   │ 33          │ 8.90924 │ 5       │


Define the problem data structure.

In [3]:
struct m_12_06d{TY <: AbstractVector, TX <: AbstractMatrix,
  TS <: AbstractVector}
    "Observations (total_tools)."
    y::TY
    "Covariates (logpop)"
    X::TX
    "Society"
    S::TS
    "Number of observations (10)"
    N::Int
    "Number of societies (also 10)"
    N_societies::Int
end;

Make the type callable with the parameters *as a single argument*.

In [4]:
function (problem::m_12_06d)(θ)
    @unpack y, X, S, N, N_societies = problem   # extract the data
    @unpack β, α, s = trans(θ)  # β : a, bp, α : a_society, s
    σ = s[1]^2
    ll = 0.0
    ll += logpdf(Cauchy(0, 1), σ)          # prior on sigma
    ll += sum(logpdf.(Normal(0, σ), α))    # prior on α[1:10] (a_society)
    ll += logpdf(Normal(0, 10), β[1])      # prior on a
    ll += logpdf(Normal(0, 1), β[2])       # prior on bp
    # Poisson likelihood with a log link: one term per observation
    ll += sum(logpdf(Poisson(exp(α[S[i]] + dot(X[i, :], β))), y[i]) for i in 1:N)
    return ll
end
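
As a reading aid, the log density above appears to correspond to the following varying-intercept Poisson model. This is read off the code rather than stated by the notebook, and the symbols $\lambda_i$ and $\mathrm{logpop}_i$ are introduced here for illustration only. The code evaluates the Cauchy density at $\sigma = s_1^2$ and contains no Jacobian term for the squaring.

```latex
\begin{aligned}
y_i &\sim \mathrm{Poisson}(\lambda_i), \\
\log \lambda_i &= \alpha_{S_i} + \beta_1 + \beta_2 \,\mathrm{logpop}_i, \\
\beta_1 &\sim \mathrm{Normal}(0, 10), \\
\beta_2 &\sim \mathrm{Normal}(0, 1), \\
\alpha_j &\sim \mathrm{Normal}(0, \sigma), \quad j = 1, \dots, 10, \\
\sigma &\sim \mathrm{Cauchy}(0, 1), \quad \sigma = s_1^2 .
\end{aligned}
```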

Instantiate the model with data and inits.

In [5]:
N = size(df, 1)
N_societies = length(unique(df[:society]))
X = hcat(ones(Int64, N), df[:logpop]);
S = df[:society];
y = df[:total_tools];
γ = (β = [1.0, 0.25], α = rand(Normal(0, 1), N_societies), s = [0.2]);
p = m_12_06d(y, X, S, N, N_societies);

Define the transformation that converts a single vector of parameters into a NamedTuple of parameters.

In [6]:
trans = as((β = as(Array, 2), α = as(Array, 10), s = as(Array, 1)));
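
To make the layout concrete, here is a minimal sketch in plain Julia of what such a transformation amounts to for unconstrained arrays: slicing a flat 13-element vector into named blocks, and concatenating them back. The helpers `split_params` and `flatten_params` are hypothetical names used only for illustration; they are not part of TransformVariables, and the block order (β, α, s) is assumed from the NamedTuple passed to `as`.

```julia
# Hypothetical helper: split a flat 13-element vector into named blocks,
# mirroring the layout (β = 2, α = 10, s = 1) declared for `trans`.
function split_params(θ::AbstractVector{<:Real})
    length(θ) == 13 || throw(ArgumentError("expected 13 parameters, got $(length(θ))"))
    return (β = θ[1:2], α = θ[3:12], s = θ[13:13])
end

# Hypothetical inverse: concatenate the blocks back into one flat vector.
flatten_params(nt::NamedTuple) = vcat(nt.β, nt.α, nt.s)

θ_demo = collect(1.0:13.0)      # stand-in values, not model parameters
nt = split_params(θ_demo)
println(nt.β)                   # first two entries
println(length(nt.α))           # ten society intercepts
println(flatten_params(nt) == θ_demo)
```

Under this layout the first two entries of any parameter vector are `a` and `bp`, the next ten are the society intercepts, and the last is `s`.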

Define input parameter vector

In [7]:
θ = inverse(trans, γ);
p(θ)

-2457.2905378716437

Maximum a posteriori (MAP) estimate

In [8]:
using Optim

x0 = θ;
lower = vcat([0.0, 0.0], -3ones(10), [0.0]);
upper = vcat([2.0, 1.0], 3ones(10), [5.0]);
ll(x) = -p(x);

inner_optimizer = GradientDescent()

res = optimize(ll, lower, upper, x0, Fminbox(inner_optimizer));
res

Results of Optimization Algorithm
 * Algorithm: Fminbox with Gradient Descent
 * Starting Point: [1.0,0.25, ...]
 * Minimizer: [1.0990724279018136,0.2629112145980903, ...]
 * Minimum: -1.316389e+02
 * Iterations: 1000
 * Convergence: false
   * |x - x'| ≤ 0.0e+00: false 
     |x - x'| = 1.67e-11 
   * |f(x) - f(x')| ≤ 0.0e+00 |f(x)|: false
     |f(x) - f(x')| = -2.79e-08 |f(x)|
   * |g(x)| ≤ 1.0e-08: false 
     |g(x)| = 2.13e+05 
   * Stopped by an increasing objective: false
   * Reached Maximum Number of Iterations: false
 * Objective Calls: 885092394
 * Gradient Calls: 885092394

The minimizer gives the MAP estimate. The optimizer reports `Convergence: false` above, so treat these values as approximate:

In [9]:
Optim.minimizer(res)

13-element Array{Float64,1}:
  1.0990724279018136    
  0.2629112145980903    
 -1.5659551996404486e-13
  4.6910955661802434e-14
 -3.904296229466099e-14 
  4.279773064698081e-13 
  6.697555078089509e-14 
 -3.3247285601867734e-13
  2.1131500920488918e-13
 -1.9596080577126818e-13
  4.694558976242896e-13 
  9.431369293268382e-13 
  9.388310974199122e-5  

Write a function that returns a properly dimensioned transformation.

In [10]:
problem_transformation(p::m_12_06d) = as(Vector, length(θ))
# Wrap the problem with a transformation, then use ForwardDiff for the gradient.
P = TransformedLogDensity(problem_transformation(p), p)
∇P = LogDensityRejectErrors(ADgradient(:ForwardDiff, P));
#∇P = ADgradient(:ForwardDiff, P);

Tune and sample.

In [11]:
chain, NUTS_tuned = NUTS_init_tune_mcmc(∇P, 1000);

MCMC, adapting ϵ (75 steps)
0.0054 s/step ...done
MCMC, adapting ϵ (25 steps)
0.0064 s/step ...done
MCMC, adapting ϵ (50 steps)
0.0061 s/step ...done
MCMC, adapting ϵ (100 steps)
0.0028 s/step ...done
MCMC, adapting ϵ (200 steps)
0.002 s/step ...done
MCMC, adapting ϵ (400 steps)
0.0012 s/step ...done
MCMC, adapting ϵ (50 steps)
0.0016 s/step ...done
MCMC (1000 steps)
0.00094 s/step ...done


We use the transformation to obtain the posterior from the chain.

In [12]:
posterior = TransformVariables.transform.(Ref(problem_transformation(p)), get_position.(chain));
posterior[1:5]

5-element Array{Array{Float64,1},1}:
 [1.3642302969170115, 0.2333151153228049, -0.5152032048456298, -0.24522924855319422, -0.0461566411093301, 0.3533528087673454, 0.05010767951190351, -0.541440441639268, 0.2281522559078383, 0.047260428309055455, 0.38167657871089566, 0.09007293303338929, 0.7154275563260439]        
 [1.1184064653413248, 0.2575161753611441, 0.01693104151124023, 0.16180105861351557, 0.07274953811640811, 0.16337367638364317, -0.04744203586903211, -0.0403426346340567, 0.14773057731708616, -0.21701277827141968, 0.10171857683913672, -0.09735893724763592, 0.27301400174299395]   
 [1.6076155728230748, 0.21252372852369553, -0.16692246445527723, -0.0251776068606131, -0.017744286350968842, 0.06911524805195697, -0.04605253304583832, -0.08959260296669631, 0.02806401016892899, -0.23961137516199732, 0.1065903139631784, 0.09065358006217313, 0.29056347230880286]
 [1.0035683337736914, 0.27439659737416827, -0.20196084907313858, -0.08184286860686514, -0.09753715726809313, 0.330351161049445

Extract the parameter posterior means.

In [13]:
posterior_β = mean(trans(posterior[i]).β for i in 1:length(posterior))
posterior_α = mean(trans(posterior[i]).α for i in 1:length(posterior))
posterior_σ = mean(trans(posterior[i]).s for i in 1:length(posterior))[1]^2

0.26118667039771004
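
One detail worth keeping in mind when reading `posterior_σ`: the cell above squares the mean of `s`, which is generally not the same as the mean of `s^2`. The sketch below illustrates the difference on made-up draws; `s_draws` is a hypothetical vector and is not taken from the chain.

```julia
using Statistics

# Hypothetical draws of the scalar s (not taken from the chain above).
s_draws = [0.3, 0.5, 0.7, 0.9]

square_of_mean  = mean(s_draws)^2        # the form used for posterior_σ above
mean_of_squares = mean(abs2, s_draws)    # the mean of σ = s^2 over the draws

println((square_of_mean, mean_of_squares))
```

The square of the mean can never exceed the mean of the squares, so the first summary tends to come out smaller.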

Effective sample sizes (of untransformed draws)

In [14]:
ess = mapslices(effective_sample_size, get_position_matrix(chain); dims = 1)
ess

1×13 Array{Float64,2}:
 651.636  659.513  756.322  1000.0  …  808.073  707.056  760.414  246.369
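
For intuition about what an effective sample size measures, here is a simplified estimator in plain Julia: N divided by one plus twice the sum of the leading positive autocorrelations. It is a sketch under that assumption and is not necessarily the estimator that `effective_sample_size` implements; `autocor_lag` and `simple_ess` are hypothetical names.

```julia
using Statistics, Random

# Sample autocorrelation of x at lag k, normalised by the lag-0 sum of squares.
function autocor_lag(x::AbstractVector{<:Real}, k::Integer)
    n = length(x)
    μ = mean(x)
    num = sum((x[i] - μ) * (x[i + k] - μ) for i in 1:(n - k))
    den = sum(abs2, x .- μ)
    return num / den
end

# Simplified ESS: n / (1 + 2 Σ ρ_k), summing lags until the first non-positive ρ_k.
function simple_ess(x::AbstractVector{<:Real})
    n = length(x)
    s = 0.0
    for k in 1:(n - 1)
        ρ = autocor_lag(x, k)
        ρ <= 0 && break
        s += ρ
    end
    return min(n, n / (1 + 2s))   # cap at the number of draws
end

Random.seed!(1)
iid = randn(2000)                 # independent draws
ar1 = similar(iid)                # strongly autocorrelated AR(1) series
ar1[1] = iid[1]
for t in 2:length(ar1)
    ar1[t] = 0.9 * ar1[t - 1] + iid[t]
end

println(simple_ess(iid))
println(simple_ess(ar1))
```

Independent draws should give a value close to the number of draws, while the autocorrelated series should give a much smaller one. The same reading applies to the table above, where the last column (`s`) has the lowest value.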

NUTS-specific statistics

In [15]:
NUTS_statistics(chain)

Hamiltonian Monte Carlo sample of length 1000
  acceptance rate mean: 0.94, min/25%/median/75%/max: 0.11 0.92 0.97 0.99 1.0
  termination: AdjacentTurn => 8% DoubledTurn => 92%
  depth: 2 => 0% 3 => 7% 4 => 92% 5 => 1%


CmdStan result

In [16]:
m_12_6_result = "
Iterations = 1:1000
Thinning interval = 1
Chains = 1,2,3,4
Samples per chain = 1000

Empirical Posterior Estimates:
                            Mean                SD               Naive SE             MCSE            ESS
            a          1.076167468  0.7704872560 0.01218247319 0.0210530022 1000.000000
           bp         0.263056273  0.0823415805 0.00130193470 0.0022645077 1000.000000
  a_society.1   -0.191723568  0.2421382537 0.00382854195 0.0060563054 1000.000000
  a_society.2    0.054569029  0.2278506876 0.00360263570 0.0051693148 1000.000000
  a_society.3   -0.035935050  0.1926364647 0.00304584994 0.0039948433 1000.000000
  a_society.4    0.334355037  0.1929971201 0.00305155241 0.0063871707  913.029080
  a_society.5    0.049747513  0.1801287716 0.00284808595 0.0043631095 1000.000000
  a_society.6   -0.311903245  0.2096126337 0.00331426674 0.0053000536 1000.000000
  a_society.7    0.148637507  0.1744680594 0.00275858223 0.0047660246 1000.000000
  a_society.8   -0.164567976  0.1821341074 0.00287979309 0.0034297298 1000.000000
  a_society.9    0.277066965  0.1758237250 0.00278001719 0.0055844175  991.286501
 a_society.10   -0.094149204  0.2846206232 0.00450024719 0.0080735022 1000.000000
sigma_society    0.310352849  0.1374834682 0.00217380450 0.0057325226  575.187461
";

Show the posterior means for comparison with the CmdStan result.

In [17]:
[posterior_β, posterior_α, posterior_σ]

3-element Array{Any,1}:
  [1.1526552512751642, 0.2553011935767466]                                                                                                                                                                             
  [-0.1729653273692572, 0.03915919637251105, -0.035605675377336624, 0.30502483084599386, 0.027783015568052682, -0.27845457637397797, 0.136842683567227, -0.15559851884446968, 0.27141151365217836, -0.0762853229507955]
 0.26118667039771004                                                                                                                                                                                                   
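
A quick way to compare the two sets of estimates is to difference them element by element. The sketch below copies the means printed above and the `Mean` column of the CmdStan table; the variable names `dhmc`, `stan`, `labels`, and `diffs` are introduced here for illustration only.

```julia
# Posterior means copied from the DynamicHMC output above (β, α, σ in that order).
dhmc = vcat([1.1526552512751642, 0.2553011935767466],
            [-0.1729653273692572, 0.03915919637251105, -0.035605675377336624,
             0.30502483084599386, 0.027783015568052682, -0.27845457637397797,
             0.136842683567227, -0.15559851884446968, 0.27141151365217836,
             -0.0762853229507955],
            0.26118667039771004)

# `Mean` column of the CmdStan table, in the same order.
stan = [1.076167468, 0.263056273, -0.191723568, 0.054569029, -0.035935050,
        0.334355037, 0.049747513, -0.311903245, 0.148637507, -0.164567976,
        0.277066965, -0.094149204, 0.310352849]

labels = vcat(["a", "bp"], ["a_society.$i" for i in 1:10], ["sigma_society"])
diffs = dhmc .- stan

# Print one line per parameter: name and DynamicHMC minus CmdStan.
for (l, d) in zip(labels, diffs)
    println(rpad(l, 14), round(d; digits = 3))
end
```

The differences can then be judged against the `SD` column of the CmdStan table.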

End of m12.6d1.jl

*This notebook was generated using [Literate.jl](https://github.com/fredrikekre/Literate.jl).*