## Let n_i range between 50 and 100 observations simulate under GLMM

Instead of simulating the random vector element by element and then doing this 100,000 times, here we check the robustness of the estimation procedure by using a more standard way to simulate from the GLMM model. 

We simulate the multivariate normal systematic component with the random effects and then use the inverse link function for each entry to let $\mu_i = g^{-1}(\eta_i)$ be the rate of the GLMM poisson response.

$\beta = \begin{pmatrix} 1 \\ 1\\ 1 \end{pmatrix}$

$n_i = 20$

$V_1 = 1_n 1_n^t, V_2 = I_n$

$\Gamma = 0.1 * V_1 + 0.1 * V_2$

In [1]:
using GLMCopula, Random, LinearAlgebra, GLM

Random.seed!(123)

n = 1000  # number of observations
# # n_i = 5
ns = rand(50:100, n) # ni in each observation
ns = [n_i for i in 1:n]
p = 3   # number of mean parameters
m = 2   # number of variance components

dist = Poisson()
link = LogLink()
D = typeof(dist)

gcs = Vector{GLMCopulaVCObs{Float64, D}}(undef, n)
# true parameter values
βtruth = ones(p)
σ2truth = [0.1; 0.1]
for i in 1:n
    ni = ns[i]
    # set up covariance matrix
    V1 = ones(ni, ni)
    V2 = Matrix(I, ni, ni)
    Ω = σ2truth[1] * V1 + σ2truth[2] * V2
    
    Ωchol = cholesky(Symmetric(Ω))
    # simulate design matrix
    X = [ones(ni) randn(ni, p-1)]
    # generate mvn systematic 
    η = X * βtruth + Ωchol.L * randn(ni)
    
    # generate mu for glm 
    μ = GLM.linkinv.(link, η)
    # generate poisson response

    y = Float64.(rand.(Poisson.(μ)))
    
    # add to data
    gcs[i] = GLMCopulaVCObs(y, X, [V1, V2], dist)
end

gcm = GLMCopulaVCModel(gcs);

┌ Info: Precompiling GLMCopula [c47b6ae2-b804-4668-9957-eb588c99ffbc]
└ @ Base loading.jl:1278


In [2]:
@info "Initial point:"
initialize_model!(gcm);
@show gcm.β
fill!(gcm.Σ, 1)
update_Σ!(gcm)
@show inv.(gcm.τ)
@show gcm.Σ
@show loglikelihood!(gcm, true, true);

┌ Info: Initial point:
└ @ Main In[2]:1


1 0.0 -1.5225484584499385e7 39999
2 -1.5225484584499385e7 -8.654824756260166e6 9999
3 -8.654824756260166e6 -3.704095406036299e6 9999
4 -3.704095406036299e6 -1.803238753228862e6 9999
5 -1.803238753228862e6 -1.4208361950781124e6 9999
6 -1.4208361950781124e6 -1.3989401662315985e6 9999
7 -1.3989401662315985e6 -1.3988427030428154e6 9999
8 -1.3988427030428154e6 -1.398842700848381e6 9999
gcm.β = [1.100169656898577, 0.998566223906785, 0.9956295138855477]
inv.(gcm.τ) = [1.0]
gcm.Σ = [113.3815332845778, 16.087371166535945]
loglikelihood!(gcm, true, true) = -1.3792553331086917e6


In [3]:
@time fit2!(gcm, IpoptSolver(print_level = 5, max_iter = 500, hessian_approximation = "exact"))

gcm.Σ = [186.0186079639302, 26.43536935910852]
gcm.Σ = [251.72588182761857, 35.796288129745406]

******************************************************************************
This program contains Ipopt, a library for large-scale nonlinear optimization.
 Ipopt is released as open source code under the Eclipse Public License (EPL).
         For more information visit https://github.com/coin-or/Ipopt
******************************************************************************

This is Ipopt version 3.13.4, running with linear solver mumps.
NOTE: Other linear solvers might be more efficient (see Ipopt documentation).

Number of nonzeros in equality constraint Jacobian...:        0
Number of nonzeros in inequality constraint Jacobian.:        0
Number of nonzeros in Lagrangian Hessian.............:        6

Total number of variables............................:        3
                     variables with only lower bounds:        0
                variables with lower and upper bounds

In [4]:
@show gcm.β
@show inv.(gcm.τ)
@show gcm.Σ
@show gcm.∇β
@show loglikelihood!(gcm, true, true);

gcm.β = [1.1038263650669933, 0.9975545506624504, 0.9947757178780307]
inv.(gcm.τ) = [1.0]
gcm.Σ = [1026.2709133743042, 145.15673800395103]
gcm.∇β = [-6.051012917396292e-7, 3.4235327461829e-8, 4.798690156349039e-8]
loglikelihood!(gcm, true, true) = -1.379245023661551e6


### Now with different V1, V2

In [5]:
n = 10000  # number of observations
# n_i = 5
ns = rand(5:100, n) # ni in each observation
# ns = [n_i for i in 1:n]
p = 3   # number of mean parameters
m = 2   # number of variance components

dist = Poisson()
link = LogLink()
D = typeof(dist)

gcs = Vector{GLMCopulaVCObs{Float64, D}}(undef, n)
# true parameter values
βtruth = ones(p)
σ2truth = [0.1; 0.1]
for i in 1:n
    ni = ns[i]
    # set up covariance matrix
    V1 = convert(Matrix, Symmetric([Float64(i * (ni - j + 1)) for i in 1:ni, j in 1:ni])) # a pd matrix
    V1 ./= norm(V1) / sqrt(ni) # scale to have Frobenius norm sqrt(n)
    prob = fill(1/ni, ni)
    V2 = ni .* (Diagonal(prob) - prob * transpose(prob))
    V2 ./= norm(V2) / sqrt(ni) # scale to have Frobenious norm sqrt(n)
    Ω = σ2truth[1] * V1 + σ2truth[2] * V2
    
#     V1 = ones(ni, ni)
#     V2 = Matrix(I, ni, ni)
#     Ω = σ2truth[1] * V1 + σ2truth[2] * V2
    
    Ωchol = cholesky(Symmetric(Ω))
    # simulate design matrix
    X = [ones(ni) randn(ni, p-1)]
    # generate mvn systematic 
    η = X * βtruth + Ωchol.L * randn(ni)
    
    # generate mu for glm 
    μ = GLM.linkinv.(link, η)
    # generate poisson response

    y = Float64.(rand.(Poisson.(μ)))
    
    # add to data
    gcs[i] = GLMCopulaVCObs(y, X, [V1, V2], dist)
end

gcm = GLMCopulaVCModel(gcs);

In [6]:
@info "Initial point:"
initialize_model!(gcm);
@show gcm.β
fill!(gcm.Σ, 1)
update_Σ!(gcm)
@show inv.(gcm.τ)
@show gcm.Σ
loglikelihood!(gcm, true, true);

┌ Info: Initial point:
└ @ Main In[6]:1


1 0.0 -1.4755855524274368e7 39999
2 -1.4755855524274368e7 -8.3548837492566295e6 9999
3 -8.3548837492566295e6 -3.5120074376095557e6 9999
4 -3.5120074376095557e6 -1.6336893070059638e6 9999
5 -1.6336893070059638e6 -1.252653457568916e6 9999
6 -1.252653457568916e6 -1.2303635850738666e6 9999
7 -1.2303635850738666e6 -1.230259963694212e6 9999
8 -1.230259963694212e6 -1.2302599611448387e6 9999
gcm.β = [1.060414667922023, 0.9996920374294288, 0.9995099876005409]
inv.(gcm.τ) = [1.0]
gcm.Σ = [2639.0715303065854, 138.65963246741035]


In [7]:
@time fit2!(gcm, IpoptSolver(print_level = 5, max_iter = 500, hessian_approximation = "exact"))

gcm.Σ = [5272.016320847243, 277.0313453309274]
gcm.Σ = [7904.8725982142405, 415.39840494976755]
This is Ipopt version 3.13.4, running with linear solver mumps.
NOTE: Other linear solvers might be more efficient (see Ipopt documentation).

Number of nonzeros in equality constraint Jacobian...:        0
Number of nonzeros in inequality constraint Jacobian.:        0
Number of nonzeros in Lagrangian Hessian.............:        6

Total number of variables............................:        3
                     variables with only lower bounds:        0
                variables with lower and upper bounds:        0
                     variables with only upper bounds:        0
Total number of equality constraints.................:        0
Total number of inequality constraints...............:        0
        inequality constraints with only lower bounds:        0
   inequality constraints with lower and upper bounds:        0
        inequality constraints with only upper bounds:  

In [8]:
@show gcm.β
@show inv.(gcm.τ)
@show gcm.Σ
@show loglikelihood!(gcm, true, true);

gcm.β = [1.0588268523995237, 1.0002899286031137, 1.0000401120507074]
inv.(gcm.τ) = [1.0]
gcm.Σ = [52671.02982336238, 2781.5444770909526]
loglikelihood!(gcm, true, true) = -1.222115361321756e6
