### Simulating from the GLMM instead of element-wise simulation, under equal signal-to-noise ratio

Rather than simulating each of the 100,000 random vectors element by element, here we check the robustness of the estimation procedure by simulating directly from the GLMM, which is the more standard approach.

We simulate the multivariate normal systematic component, including the random effects, and then apply the inverse link function to each entry, so that $\mu_i = g^{-1}(\eta_i)$ is the rate of the Poisson response under the GLMM.
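
Written out for a single cluster, the scheme described above can be sketched as follows. Here $i$ indexes the cluster and $j$ the entry within it. The sketch assumes the random-effects term is drawn as $L z_i$ with $L L^t = \Gamma$ (a Cholesky factor, as in the code below) and that $g$ is the log link; the symbols $L$ and $z_i$ are introduced only for illustration.

$$
\begin{aligned}
z_i &\sim N(0, I_{n_i}), \\
\eta_i &= X_i \beta + L z_i \sim N(X_i \beta, \Gamma), \\
\mu_{ij} &= g^{-1}(\eta_{ij}) = \exp(\eta_{ij}), \\
y_{ij} \mid \mu_{ij} &\sim \text{Poisson}(\mu_{ij}).
\end{aligned}
$$

Because $\exp(\eta_{ij})$ is lognormal, the marginal mean of the simulated response is $E[y_{ij}] = \exp(x_{ij}^t \beta + \Gamma_{jj}/2)$. On the log scale, the intercept of the marginal mean is therefore shifted by $\Gamma_{jj}/2$ relative to the first entry of $\beta$, which may be worth keeping in mind when comparing a fitted intercept with its true value.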

$\beta = \begin{pmatrix} 1 \\ 1\\ 1 \end{pmatrix}$

$n_i = 20$

$V_1 = 1_{n_i} 1_{n_i}^t, \quad V_2 = I_{n_i}$

$\Gamma = 0.1 V_1 + 0.1 V_2$ (the matrix `Ω` in the code below)
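
As a quick sanity check on this covariance specification, the following minimal sketch draws many vectors $L z$ with $L$ the Cholesky factor of $\Gamma$ and compares their empirical covariance with $\Gamma$. It uses only Julia standard libraries; the names `nsim`, `Γhat`, and `maxerr` are hypothetical and not part of GLMCopula.

```julia
using LinearAlgebra, Random, Statistics

Random.seed!(123)

ni = 20
V1 = ones(ni, ni)                  # all-ones matrix: shared random intercept
V2 = Matrix{Float64}(I, ni, ni)    # identity: independent noise per entry
Γ  = 0.1 * V1 + 0.1 * V2

# Lower Cholesky factor, so that L * L' == Γ up to rounding
L = cholesky(Symmetric(Γ)).L

# Each column of E is one draw of the random-effects term L * z
nsim = 50_000                      # hypothetical number of Monte Carlo draws
E = L * randn(ni, nsim)

# dims = 2 treats columns as observations, giving an ni-by-ni covariance
Γhat = cov(E; dims = 2)
maxerr = maximum(abs.(Γhat - Γ))

println("largest entrywise deviation from Γ: ", maxerr)
```

The deviation should shrink as `nsim` grows. Every diagonal entry of $\Gamma$ equals $0.2$ here, so all entries of a cluster share the same marginal variance.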

In [1]:
using GLMCopula, Random, LinearAlgebra, GLM

Random.seed!(123)

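n = 100000  # number of independent clusters (response vectors)
n_i = 20    # number of observations in each cluster
# ns = rand(5:100, n) # alternative: random cluster sizes
ns = fill(n_i, n)  # every cluster has the same size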
p = 3   # number of mean parameters
m = 2   # number of variance components

dist = Poisson()
link = LogLink()
D = typeof(dist)
Link = typeof(link)

gcs = Vector{GLMCopulaVCObs{Float64, D, Link}}(undef, n)
# true parameter values
βtruth = ones(p)
σ2truth = [0.1; 0.1]
for i in 1:n
    ni = ns[i]
    # set up covariance matrix
    V1 = ones(ni, ni)
    V2 = Matrix(I, ni, ni)
    Ω = σ2truth[1] * V1 + σ2truth[2] * V2
    
    Ωchol = cholesky(Symmetric(Ω))
    # simulate design matrix
    X = [ones(ni) randn(ni, p-1)]
    # generate mvn systematic 
    η = X * βtruth + Ωchol.L * randn(ni)
    
    # generate mu for glm 
    μ = GLM.linkinv.(link, η)
    # generate poisson response

    y = Float64.(rand.(Poisson.(μ)))
    
    # add to data
    gcs[i] = GLMCopulaVCObs(y, X, [V1, V2], dist, link)
end

gcm = GLMCopulaVCModel(gcs);

In [2]:
@info "Initial point:"
initialize_model!(gcm);
@show gcm.β
fill!(gcm.Σ, 1)
update_Σ!(gcm)
@show inv.(gcm.τ)
@show gcm.Σ
loglikelihood!(gcm, true, true);

┌ Info: Initial point:
└ @ Main In[2]:1


1 0.0 -5.851294412415099e7 399999
2 -5.851294412415099e7 -3.3254867828476127e7 99999
3 -3.3254867828476127e7 -1.4292270737835135e7 99999
4 -1.4292270737835135e7 -6.946083518230645e6 99999
5 -6.946083518230645e6 -5.459909530314097e6 99999
6 -5.459909530314097e6 -5.37328944152139e6 99999
7 -5.37328944152139e6 -5.372890144048035e6 99999
8 -5.372890144048035e6 -5.3728901343954895e6 99999
gcm.β = [1.101864801076517, 0.997759869854687, 0.9999854726500036]
inv.(gcm.τ) = [1.0]
gcm.Σ = [1400.39060321597, 416.3170881522573]


In [3]:
@time fit2!(gcm, IpoptSolver(print_level = 5, max_iter = 500, hessian_approximation = "exact"))

gcm.β = [1.101864801076517, 0.997759869854687, 0.9999854726500036]
gcm.Σ = [2799.1875796181153, 832.2280738769048]
gcm.β = [1.101864801076517, 0.997759869854687, 0.9999854726500036]
gcm.Σ = [4197.986133077334, 1248.139528029994]

******************************************************************************
This program contains Ipopt, a library for large-scale nonlinear optimization.
 Ipopt is released as open source code under the Eclipse Public License (EPL).
         For more information visit https://github.com/coin-or/Ipopt
******************************************************************************

This is Ipopt version 3.13.4, running with linear solver mumps.
NOTE: Other linear solvers might be more efficient (see Ipopt documentation).

Number of nonzeros in equality constraint Jacobian...:        0
Number of nonzeros in inequality constraint Jacobian.:        0
Number of nonzeros in Lagrangian Hessian.............:        6

Total number of variables.......................

In [4]:
@show gcm.β
@show inv.(gcm.τ)
@show gcm.Σ
@show gcm.∇β
@show loglikelihood!(gcm, true, true);

gcm.β = [1.1086677531123705, 0.996202599821598, 0.998419247481194]
inv.(gcm.τ) = [1.0]
gcm.Σ = [28085.54531721324, 8257.588635580676]
gcm.∇β = [5.240789455740469e-7, 2.807070673682688e-7, 2.7682615311164227e-7]
loglikelihood!(gcm, true, true) = -5.2348788210895e6


### Now with different $V_1$, $V_2$

This checks how the estimation procedure behaves under a more complicated covariance structure.
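
To make the structure of the two matrices concrete, here is a minimal sketch that rebuilds them for a single cluster of size 20 and inspects their norms and smallest eigenvalues. It mirrors the construction used in the cell below but depends only on `LinearAlgebra`; the index names `r` and `c` are chosen here for readability.

```julia
using LinearAlgebra

ni = 20

# V1 has entries min(r, c) * (ni - max(r, c) + 1);
# Symmetric(...) reads only the upper triangle of the comprehension
V1 = convert(Matrix, Symmetric([Float64(r * (ni - c + 1)) for r in 1:ni, c in 1:ni]))
V1 ./= norm(V1) / sqrt(ni)         # rescale to Frobenius norm sqrt(ni)

# Before rescaling, V2 is the centering matrix I - 11'/ni
prob = fill(1 / ni, ni)
V2 = ni .* (Diagonal(prob) - prob * transpose(prob))
V2 ./= norm(V2) / sqrt(ni)         # rescale to Frobenius norm sqrt(ni)

Ω = 0.1 * V1 + 0.1 * V2

println("Frobenius norms: ", norm(V1), ", ", norm(V2), " (target ", sqrt(ni), ")")
println("smallest eigenvalue of V1: ", eigmin(Symmetric(V1)))
println("smallest eigenvalue of V2: ", eigmin(Symmetric(V2)))
println("Ω positive definite: ", isposdef(Symmetric(Ω)))
println("range of diag(Ω): ", extrema(diag(Ω)))
```

Under this construction `V2` is only positive semidefinite, since constant vectors lie in its null space, so the Cholesky factorization of `Ω` relies on `V1` being positive definite. The diagonal of `Ω` is also no longer constant, so the marginal variance differs across entries within a cluster.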

In [7]:
gcs = Vector{GLMCopulaVCObs{Float64, D, Link}}(undef, n)
# true parameter values
βtruth = ones(p)
σ2truth = [0.1; 0.1]
for i in 1:n
    ni = ns[i]
    # set up covariance matrix
    V1 = convert(Matrix, Symmetric([Float64(r * (ni - c + 1)) for r in 1:ni, c in 1:ni])) # a pd matrix
    V1 ./= norm(V1) / sqrt(ni) # scale to have Frobenius norm sqrt(ni)
    prob = fill(1/ni, ni)
    V2 = ni .* (Diagonal(prob) - prob * transpose(prob)) # centering matrix I - 11'/ni (psd, rank ni - 1)
    V2 ./= norm(V2) / sqrt(ni) # scale to have Frobenius norm sqrt(ni)
    Ω = σ2truth[1] * V1 + σ2truth[2] * V2
    
    Ωchol = cholesky(Symmetric(Ω))
    # simulate design matrix
    X = [ones(ni) randn(ni, p-1)]
    # generate mvn systematic 
    η = X * βtruth + Ωchol.L * randn(ni)
    
    # generate mu for glm 
    μ = GLM.linkinv.(link, η)
    # generate poisson response

    y = Float64.(rand.(Poisson.(μ)))
    
    # add to data
    gcs[i] = GLMCopulaVCObs(y, X, [V1, V2], dist, link)
end

gcm = GLMCopulaVCModel(gcs);

In [8]:
@info "Initial point:"
initialize_model!(gcm);
@show gcm.β
fill!(gcm.Σ, 1)
update_Σ!(gcm)
@show inv.(gcm.τ)
@show gcm.Σ
loglikelihood!(gcm, true, true);

┌ Info: Initial point:
└ @ Main In[8]:1


1 0.0 -5.6236598796058424e7 399999
2 -5.6236598796058424e7 -3.17852294779903e7 99999
3 -3.17852294779903e7 -1.342192145560639e7 99999
4 -1.342192145560639e7 -6.304214362333441e6 99999
5 -6.304214362333441e6 -4.86652151107163e6 99999
6 -4.86652151107163e6 -4.782807622908095e6 99999
7 -4.782807622908095e6 -4.782421643830048e6 99999
8 -4.782421643830048e6 -4.782421634489963e6 99999
gcm.β = [1.0682143061222635, 0.99798730010118, 0.9986766236086236]
inv.(gcm.τ) = [1.0]
gcm.Σ = [3056.05972766961, 773.6497498688857]


In [9]:
@time fit2!(gcm, IpoptSolver(print_level = 5, max_iter = 500, hessian_approximation = "exact"))

gcm.β = [1.0682143061222635, 0.99798730010118, 0.9986766236086236]
gcm.Σ = [6110.128057559592, 1546.8555708589943]
gcm.β = [1.0682143061222635, 0.99798730010118, 0.9986766236086236]
gcm.Σ = [9164.21275208016, 2320.0655323704664]
This is Ipopt version 3.13.4, running with linear solver mumps.
NOTE: Other linear solvers might be more efficient (see Ipopt documentation).

Number of nonzeros in equality constraint Jacobian...:        0
Number of nonzeros in inequality constraint Jacobian.:        0
Number of nonzeros in Lagrangian Hessian.............:        6

Total number of variables............................:        3
                     variables with only lower bounds:        0
                variables with lower and upper bounds:        0
                     variables with only upper bounds:        0
Total number of equality constraints.................:        0
Total number of inequality constraints...............:        0
        inequality constraints with only lower boun

In [10]:
@show gcm.β
@show inv.(gcm.τ)
@show gcm.Σ
@show loglikelihood!(gcm, true, true);

gcm.β = [1.0618304344447238, 1.0002279474992415, 1.0008964340240643]
inv.(gcm.τ) = [1.0]
gcm.Σ = [60792.35827363276, 15715.040604181368]
loglikelihood!(gcm, true, true) = -4.714054927793511e6
