# Form the NB Regression Random Intercept Model: Simulated data set

We fit the model using a block update of β and an MM update of the variance components, via the file "fit_old.jl".

Because this approach does not require the gradient and Hessian with respect to the variance component vector, the fit finishes in 67 iterations and 9 seconds.

Next, we may try joint estimation using Newton's method with IPOPT.

In [1]:
using DataFrames, Random, GLM, GLMCopula
using ForwardDiff, Test, LinearAlgebra
using LinearAlgebra: BlasReal, copytri!

Random.seed!(1234)

# sample size
N = 10000
# observations per subject
n = 5

variance_component_1 = 0.9
variance_component_2 = 0.1

r = 1
p = 0.7
μ = r * (1-p) * inv(p)

#var = r * (1-p) * inv(p^2)

# true beta
β_true = log(μ)

dist = NegativeBinomial

Γ = variance_component_1 * ones(n, n) + variance_component_2 * Matrix(I, n, n)
vecd = [dist(r, p) for i in 1:n]
nonmixed_multivariate_dist = NonMixedMultivariateDistribution(vecd, Γ)

Y_Nsample = simulate_nobs_independent_vectors(nonmixed_multivariate_dist, N)

┌ Info: Precompiling GLMCopula [c47b6ae2-b804-4668-9957-eb588c99ffbc]
└ @ Base loading.jl:1278


10000-element Array{Array{Float64,1},1}:
 [0.0, 1.0, 0.0, 0.0, 1.0]
 [2.0, 0.0, 0.0, 0.0, 1.0]
 [1.0, 0.0, 0.0, 4.0, 1.0]
 [0.0, 0.0, 0.0, 0.0, 1.0]
 [3.0, 3.0, 3.0, 1.0, 1.0]
 [0.0, 0.0, 0.0, 0.0, 0.0]
 [3.0, 0.0, 0.0, 0.0, 0.0]
 [0.0, 1.0, 0.0, 0.0, 2.0]
 [0.0, 1.0, 0.0, 0.0, 0.0]
 [0.0, 0.0, 0.0, 0.0, 0.0]
 [1.0, 0.0, 0.0, 0.0, 0.0]
 [0.0, 0.0, 0.0, 0.0, 0.0]
 [1.0, 0.0, 0.0, 0.0, 0.0]
 ⋮
 [0.0, 0.0, 1.0, 0.0, 0.0]
 [1.0, 0.0, 0.0, 0.0, 0.0]
 [0.0, 1.0, 0.0, 0.0, 1.0]
 [0.0, 0.0, 1.0, 2.0, 0.0]
 [1.0, 0.0, 0.0, 0.0, 3.0]
 [0.0, 0.0, 0.0, 0.0, 0.0]
 [0.0, 0.0, 1.0, 2.0, 1.0]
 [2.0, 2.0, 0.0, 0.0, 0.0]
 [0.0, 0.0, 0.0, 0.0, 0.0]
 [0.0, 0.0, 0.0, 1.0, 0.0]
 [2.0, 1.0, 2.0, 0.0, 0.0]
 [0.0, 0.0, 1.0, 0.0, 0.0]
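
The covariance Γ used in the simulation is a sum of two components: a shared block (all ones) scaled by the first variance component, and an identity matrix scaled by the second. The same two basis matrices reappear as `V` when the model is built. The following is a minimal sketch of that structure using only the `LinearAlgebra` standard library; the names `σ1`, `σ2`, `V1`, and `V2` are illustrative and are not part of GLMCopula.

```julia
using LinearAlgebra

n = 5                           # observations per subject
σ1, σ2 = 0.9, 0.1               # variance components used in the simulation

V1 = ones(n, n)                 # shared (random intercept) component
V2 = Matrix{Float64}(I, n, n)   # independent component

# Γ = σ1 * V1 + σ2 * V2
Γ = σ1 * V1 + σ2 * V2

# Diagonal entries are σ1 + σ2; off-diagonal entries are σ1
@show diag(Γ)
@show Γ[1, 2]
@show issymmetric(Γ)
```

With these values every diagonal entry equals σ1 + σ2 and every off-diagonal entry equals σ1, so observations within a subject share a common covariance.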

In [2]:
d = NegativeBinomial()
link = LogLink()
D = typeof(d)
Link = typeof(link)
T = Float64
gcs = Vector{GLMCopulaVCObs{T, D, Link}}(undef, N)
for i in 1:N
    y = Float64.(Y_Nsample[i])
    X = ones(n, 1)
    V = [ones(n, n), Matrix(I, n, n)]
    gcs[i] = GLMCopulaVCObs(y, X, V, d, link)
end
gcm = GLMCopulaVCModel(gcs);

In [3]:
initialize_model!(gcm)
@show gcm.β
@show gcm.Σ

initializing β using Newton's Algorithm under Independence Assumption
1 0.0 -57421.69453199051 39999
2 -57421.69453199051 -57421.69453199051 39999
initializing variance components using MM-Algorithm
gcm.β = [-0.3620034006976286]
gcm.Σ = [0.18367019483213523, 4.683717988259242e-5]


2-element Array{Float64,1}:
 0.18367019483213523
 4.683717988259242e-5

In [4]:
β_true

-0.8472978603872034
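
This value can be checked by hand. With the `NegativeBinomial(r, p)` parametrization used in the simulation cell (mean $r(1-p)/p$) and a log link with an intercept-only design, the true coefficient is the log of the marginal mean:

```latex
\mu = \frac{r(1-p)}{p} = \frac{1 \times (1 - 0.7)}{0.7} = \frac{0.3}{0.7} \approx 0.4286,
\qquad
\beta_{\text{true}} = \log \mu \approx -0.8473 .
```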

Initialize β and σ2; here I simply copy over the solutions for β and σ2 from MixedModels.jl.

In [5]:
GLMCopula.loglikelihood!(gcm, true, true)

-56845.87658829459

In [6]:
@time fit2!(gcm, IpoptSolver(print_level = 5, max_iter = 500, mehrotra_algorithm="yes", warm_start_init_point="yes", hessian_approximation = "exact"))

gcm.β = [-0.3620034006976286]
gcm.Σ = [0.18367013231033563, 4.570070064885234e-5]
gcm.β = [-0.3620034006976286]
gcm.Σ = [0.18367007132300728, 4.459186136154603e-5]

******************************************************************************
This program contains Ipopt, a library for large-scale nonlinear optimization.
 Ipopt is released as open source code under the Eclipse Public License (EPL).
         For more information visit https://github.com/coin-or/Ipopt
******************************************************************************

This is Ipopt version 3.13.4, running with linear solver mumps.
NOTE: Other linear solvers might be more efficient (see Ipopt documentation).

Number of nonzeros in equality constraint Jacobian...:        0
Number of nonzeros in inequality constraint Jacobian.:        0
Number of nonzeros in Lagrangian Hessian.............:        1

Total number of variables............................:        1
                     variables with only lower bo

iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
  40  5.5569750e+04 0.00e+00 3.00e-06 -11.0 6.59e-09    -  1.00e+00 1.00e+00h  1
gcm.β = [-0.8574310436235242]
gcm.Σ = [0.9825261931848392, 0.12942803780659948]
  41  5.5569750e+04 0.00e+00 2.74e-06 -11.0 6.01e-09    -  1.00e+00 1.00e+00h  1
gcm.β = [-0.8574310491061284]
gcm.Σ = [0.9825263504925689, 0.12942813320097524]
  42  5.5569750e+04 0.00e+00 2.50e-06 -11.0 5.48e-09    -  1.00e+00 1.00e+00f  1
gcm.β = [-0.8574310541090265]
gcm.Σ = [0.9825264940626082, 0.12942822027400017]
  43  5.5569750e+04 0.00e+00 2.28e-06 -11.0 5.00e-09    -  1.00e+00 1.00e+00h  1
gcm.β = [-0.8574310586744394]
gcm.Σ = [0.9825266250968376, 0.12942829975107223]
  44  5.5569750e+04 0.00e+00 2.08e-06 -11.0 4.57e-09    -  1.00e+00 1.00e+00h  1
gcm.β = [-0.8574310628408025]
gcm.Σ = [0.9825267446914231, 0.12942837229448995]
  45  5.5569750e+04 0.00e+00 1.90e-06 -11.0 4.17e-09    -  1.00e+00 1.00e+00h  1
gcm.β = [-0.8574310666431278]
gcm

In [7]:
@show β_true
@show gcm.β
@show gcm.Σ
@show gcm.∇β
@show GLMCopula.loglikelihood!(gcm, true, true);

β_true = -0.8472978603872034
gcm.β = [-0.8574311005581475]
gcm.Σ = [0.9825278276937094, 0.12942902934012274]
gcm.∇β = [-9.758218519784201e-6]
GLMCopula.loglikelihood!(gcm, true, true) = -55569.7498058729


In [8]:
println("estimated β = $(gcm.β[1]); true β = $β_true")
println("estimated variance component 1 = $(gcm.Σ[1]); true variance component 1 = $variance_component_1")
println("estimated variance component 2 = $(gcm.Σ[2]); true variance component 2 = $variance_component_2");

estimated β = -0.8574311005581475; true β = -0.8472978603872034
estimated variance component 1 = 0.9825278276937094; true variance component 1 = 0.9
estimated variance component 2 = 0.12942902934012274; true variance component 2 = 0.1
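
To put the gaps between the estimates and the true values on a common scale, one option is to compute relative errors. The sketch below hard-codes the numbers printed above so that it runs without GLMCopula; the helper `relerr` is a hypothetical name introduced only for this illustration.

```julia
# Hypothetical helper: relative error of an estimate with respect to the truth
relerr(est, truth) = abs(est - truth) / abs(truth)

# Values copied from the printed output above
β_est,  β_truth  = -0.8574311005581475, -0.8472978603872034
vc1_est, vc1_truth = 0.9825278276937094, 0.9
vc2_est, vc2_truth = 0.12942902934012274, 0.1

err_β   = relerr(β_est, β_truth)
err_vc1 = relerr(vc1_est, vc1_truth)
err_vc2 = relerr(vc2_est, vc2_truth)

println("relative error in β                    = ", round(100 * err_β; digits = 1), "%")
println("relative error in variance component 1 = ", round(100 * err_vc1; digits = 1), "%")
println("relative error in variance component 2 = ", round(100 * err_vc2; digits = 1), "%")
```

On these numbers the regression coefficient is recovered much more closely than the second variance component, which has the largest relative error of the three.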


In [9]:
# Still needs some work (I hypothesize the issue comes from GLM's loglikeobs function)
using BenchmarkTools

@benchmark loglikelihood!($gcm, true, true)

BenchmarkTools.Trial: 
  memory estimate:  22.89 MiB
  allocs estimate:  250000
  --------------
  minimum time:     31.666 ms (0.00% GC)
  median time:      34.955 ms (0.00% GC)
  mean time:        38.332 ms (6.67% GC)
  maximum time:     102.631 ms (60.42% GC)
  --------------
  samples:          131
  evals/sample:     1