# Real Example Datasets: Intercept-Only Random Intercept Model

In this notebook we compare the maximum likelihood estimates (MLEs) and log-likelihoods of the intercept-only quasi-copula random intercept model with those of a GLM and a GLMM fit using `GLM.jl` and `MixedModels.jl`, respectively.

The example datasets come from various R packages and are loaded with the `RCall` and `RDatasets` packages.
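
For orientation, the GLMM comparator fitted in every section below can be written as follows. This is a sketch read off the formulas `y ~ 1 + (1|group)` and the way the cells compute `θ` and `τ`; the index symbols $i$, $j$, $m$, $n_i$ are notation introduced here, not taken from the packages.

$$
g\bigl(\mathbb{E}[y_{ij} \mid b_i]\bigr) = \beta + b_i, \qquad b_i \sim N(0, \theta), \qquad i = 1, \dots, m, \quad j = 1, \dots, n_i .
$$

With a Gaussian response and identity link this reduces to

$$
y_{ij} = \beta + b_i + \varepsilon_{ij}, \qquad \varepsilon_{ij} \sim N(0, \tau^{-1}),
$$

and the GLM is the special case $\theta = 0$. The quasi-copula model also reports a parameter named `θ`, but it enters a different likelihood, so its magnitude need not be on the same scale as the GLMM variance component.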

In [1]:
using QuasiCopula, DataFrames, LinearAlgebra, RCall, RDatasets
using GLM, MixedModels

# dyestuff data: lme4 (Gaussian)

In [2]:
df = dataset("lme4", "Dyestuff")
y = :Yield
grouping = :Batch

d = Normal()
link = IdentityLink()
Gaussian_VC_model = VC_model(df, y, grouping, d, link)

Quasi-Copula Variance Component Model
  * base distribution: Normal
  * link function: IdentityLink
  * number of clusters: 6
  * cluster size min, max: 5, 5
  * number of variance components: 1
  * number of fixed effects: 1

In [3]:
# fit using QuasiCopula
QuasiCopula.fit!(Gaussian_VC_model);

gcm.β = [1527.4999999999998]
initializing dispersion using residual sum of squares
gcm.τ = [0.0002604449267498643]
initializing variance components using MM-Algorithm
gcm.θ = [0.8636149140056555]

******************************************************************************
This program contains Ipopt, a library for large-scale nonlinear optimization.
 Ipopt is released as open source code under the Eclipse Public License (EPL).
         For more information visit https://github.com/coin-or/Ipopt
******************************************************************************

Total number of variables............................:        3
                     variables with only lower bounds:        2
                variables with lower and upper bounds:        0
                     variables with only upper bounds:        0
Total number of equality constraints.................:        0
Total number of inequality constraints...............:        0
        inequality constraints wi

### fit dyestuff with QC:

In [4]:
# qc estimates
@show Gaussian_VC_model.β
@show Gaussian_VC_model.θ
@show Gaussian_VC_model.τ;

Gaussian_VC_model.β = [1541.7932221729745]
Gaussian_VC_model.θ = [12850.230761962066]
Gaussian_VC_model.τ = [0.0003461995259151126]


### fit dyestuff with GLM:

In [5]:
# fit with glm: GLM.jl
dyestuff_lm = lm(@formula(Yield ~ 1), df)

dyestuff_lm_β = GLM.coef(dyestuff_lm)
dyestuff_lm_θ = 0.0
dyestuff_lm_τ = inv(deviance(dyestuff_lm)/dof_residual(dyestuff_lm))
@show dyestuff_lm_β
@show dyestuff_lm_θ
@show dyestuff_lm_τ;

dyestuff_lm_β = [1527.4999999999998]
dyestuff_lm_θ = 0.0
dyestuff_lm_τ = 0.0002517634291915355
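
The precision computed in the cell above, `inv(deviance / dof_residual)`, is the reciprocal of the usual residual variance. For an intercept-only linear model this can be checked by hand; the sketch below uses hypothetical toy responses (not the Dyestuff data) and only the `Statistics` standard library.

```julia
using Statistics

# Hypothetical toy responses, for illustration only.
y = [1545.0, 1440.0, 1440.0, 1520.0, 1580.0, 1540.0]

n = length(y)
β_hat = mean(y)                 # intercept-only least squares estimate
rss = sum(abs2, y .- β_hat)     # residual sum of squares (the deviance of a linear model)
τ_hat = inv(rss / (n - 1))      # precision = 1 / residual variance, n - 1 residual dof

@show β_hat τ_hat
@show τ_hat ≈ inv(var(y));      # var uses the n - 1 denominator by default
```

The initial `gcm.τ` printed by QuasiCopula for this dataset is larger than `dyestuff_lm_τ` by a factor of about 30/29. That would be consistent with dividing the residual sum of squares by n = 30 rather than n - 1, although this is an inference from the printed numbers, not something the output states.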


### fit dyestuff with GLMM:

In [6]:
# fit with lmm: MixedModels.jl
dyestuff_formula = @formula(Yield ~ 1 + (1|Batch));
mdl = LinearMixedModel(dyestuff_formula, df)
MixedModels.fit!(mdl)

dyestuff_LMM_β = mdl.beta
dyestuff_LMM_θ = mdl.σs[1][1]^2
dyestuff_LMM_τ = inv(mdl.σ^2)
@show dyestuff_LMM_β
@show dyestuff_LMM_θ
@show dyestuff_LMM_τ;



dyestuff_LMM_β = [1527.4999999999989]
dyestuff_LMM_θ = 1388.3331672465852
dyestuff_LMM_τ = 0.00040795511647148805
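
To make the roles of `β`, `θ`, and `τ` in the linear mixed model concrete, the sketch below evaluates the marginal Gaussian log-likelihood of a random intercept model directly, where each cluster has covariance `θ·11ᵀ + (1/τ)·I`. The function name `cluster_loglik`, the toy clusters, and the parameter values are all hypothetical; it is assumed, not verified here, that this is the quantity `loglikelihood(mdl)` reports for a maximum likelihood fit.

```julia
using LinearAlgebra

# Marginal log-likelihood of one cluster under y ~ N(β·1, θ·11ᵀ + (1/τ)·I).
function cluster_loglik(y::AbstractVector, β::Real, θ::Real, τ::Real)
    n = length(y)
    Σ = Symmetric(fill(float(θ), n, n) + I / τ)   # compound-symmetric covariance
    C = cholesky(Σ)
    r = y .- β                                    # residuals from the common intercept
    return -0.5 * (n * log(2π) + logdet(C) + dot(r, C \ r))
end

# Hypothetical toy clusters of unequal size and illustrative parameter values.
clusters = [[1.0, 2.0, 1.5], [3.0, 2.5, 3.5, 3.0]]
β, θ, τ = 2.5, 0.5, 2.0

total = sum(cluster_loglik(y, β, θ, τ) for y in clusters)
@show total;
```

Setting `θ = 0` makes the observations independent, which recovers the log-likelihood of the plain linear model.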


### dyestuff: compare loglikelihoods

In [7]:
# qc logl
logl_dyestuff_QC = logl(Gaussian_VC_model)

# fit with glm: GLM.jl
logl_dyestuff_LM = loglikelihood(dyestuff_lm)

# fit with lmm: MixedModels.jl
logl_dyestuff_LMM = loglikelihood(mdl)

@show logl_dyestuff_QC
@show logl_dyestuff_LM
@show logl_dyestuff_LMM;

logl_dyestuff_QC = -163.35547256352643
logl_dyestuff_LM = -166.36494298739052
logl_dyestuff_LMM = -163.6635299405715
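
The three values can be compared pairwise. Because the linear model is the `θ = 0` special case of the mixed model, twice their gap is a likelihood-ratio statistic for the random intercept; the quasi-copula and mixed models are not nested, so their gap is only descriptive. The snippet below is a small sketch using the numbers printed above; the variable names are introduced here for illustration.

```julia
# Log-likelihoods copied from the output above.
logl_vals = (QC = -163.35547256352643, LM = -166.36494298739052, LMM = -163.6635299405715)

# LM is nested in LMM (θ = 0), so this is a likelihood-ratio statistic.
lr_lm_vs_lmm = 2 * (logl_vals.LMM - logl_vals.LM)

# QC and LMM are different model families; this gap is a descriptive comparison only.
gap_qc_vs_lmm = logl_vals.QC - logl_vals.LMM

@show lr_lm_vs_lmm gap_qc_vs_lmm;
```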


# Oxboys data: mlmRev (Gaussian)

In [8]:
df = dataset("mlmRev", "Oxboys")
y = :Height
grouping = :Subject

Gaussian_VC_model2 = VC_model(df, y, grouping, d, link)

Quasi-Copula Variance Component Model
  * base distribution: Normal
  * link function: IdentityLink
  * number of clusters: 26
  * cluster size min, max: 9, 9
  * number of variance components: 1
  * number of fixed effects: 1

In [9]:
# fit using QuasiCopula
QuasiCopula.fit!(Gaussian_VC_model2);

gcm.β = [149.51940170940168]
initializing dispersion using residual sum of squares
gcm.τ = [0.012119112908652608]
initializing variance components using MM-Algorithm
gcm.θ = [0.8429685809722144]
Total number of variables............................:        3
                     variables with only lower bounds:        2
                variables with lower and upper bounds:        0
                     variables with only upper bounds:        0
Total number of equality constraints.................:        0
Total number of inequality constraints...............:        0
        inequality constraints with only lower bounds:        0
   inequality constraints with lower and upper bounds:        0
        inequality constraints with only upper bounds:        0


Number of Iterations....: 10

                                   (scaled)                 (unscaled)
Objective...............:   8.2657948607667993e+02    8.2657948607667993e+02
Dual infeasibility......:   2.3491180187725940e-0

### fit Oxboys with QC:

In [10]:
# qc estimates
@show Gaussian_VC_model2.β
@show Gaussian_VC_model2.θ
@show Gaussian_VC_model2.τ;

Gaussian_VC_model2.β = [149.16733835278978]
Gaussian_VC_model2.θ = [0.910413443061632]
Gaussian_VC_model2.τ = [0.014262433641989388]


### fit Oxboys with GLM:

In [11]:
# fit Oxboys with glm: GLM.jl
Oxboys_lm = lm(@formula(Height ~ 1), df);
Oxboys_lm_β = GLM.coef(Oxboys_lm)
Oxboys_lm_θ = 0.0
Oxboys_lm_τ = inv(deviance(Oxboys_lm)/dof_residual(Oxboys_lm))
@show Oxboys_lm_β
@show Oxboys_lm_θ
@show Oxboys_lm_τ;

Oxboys_lm_β = [149.51940170940168]
Oxboys_lm_θ = 0.0
Oxboys_lm_τ = 0.012067321827846403


### fit Oxboys with GLMM:

In [12]:
# fit with lmm: MixedModels.jl
Oxboys_formula = @formula(Height ~ 1 + (1|Subject));
mdl = LinearMixedModel(Oxboys_formula, df)
MixedModels.fit!(mdl);

Oxboys_LMM_β = mdl.beta
Oxboys_LMM_θ = mdl.σs[1][1]^2
Oxboys_LMM_τ = inv(mdl.σ^2)
@show Oxboys_LMM_β
@show Oxboys_LMM_θ
@show Oxboys_LMM_τ;

Oxboys_LMM_β = [149.51940170940276]
Oxboys_LMM_θ = 60.78727660622461
Oxboys_LMM_τ = 0.046025641233561725


### Oxboys: compare loglikelihoods

In [13]:
# qc logl
logl_Oxboys_QC = logl(Gaussian_VC_model2)

# fit with glm: GLM.jl
logl_Oxboys_LM = loglikelihood(Oxboys_lm)

# fit with lmm: MixedModels.jl
logl_Oxboys_LMM = loglikelihood(mdl);

In [14]:
@show logl_Oxboys_QC
@show logl_Oxboys_LM
@show logl_Oxboys_LMM;

logl_Oxboys_QC = -826.5794860766799
logl_Oxboys_LM = -848.3492814947749
logl_Oxboys_LMM = -734.6676665245718


# Sleepstudy data: lme4 (Gaussian)

In [15]:
df = dataset("lme4", "sleepstudy")
y = :Reaction
grouping = :Subject

Gaussian_VC_model3 = VC_model(df, y, grouping, d, link)

Quasi-Copula Variance Component Model
  * base distribution: Normal
  * link function: IdentityLink
  * number of clusters: 18
  * cluster size min, max: 10, 10
  * number of variance components: 1
  * number of fixed effects: 1

In [16]:
# fit using QuasiCopula
QuasiCopula.fit!(Gaussian_VC_model3);

gcm.β = [298.5078916666667]
initializing dispersion using residual sum of squares
gcm.τ = [0.00031692692475632976]
initializing variance components using MM-Algorithm
gcm.θ = [0.17963770582374433]
Total number of variables............................:        3
                     variables with only lower bounds:        2
                variables with lower and upper bounds:        0
                     variables with only upper bounds:        0
Total number of equality constraints.................:        0
Total number of inequality constraints...............:        0
        inequality constraints with only lower bounds:        0
   inequality constraints with lower and upper bounds:        0
        inequality constraints with only upper bounds:        0


Number of Iterations....: 35

                                   (scaled)                 (unscaled)
Objective...............:   9.7311733466615851e+02    9.7311733466615851e+02
Dual infeasibility......:   1.4739948630225240e

### fit sleepstudy with QC:

In [17]:
# qc estimates
@show Gaussian_VC_model3.β
@show Gaussian_VC_model3.θ
@show Gaussian_VC_model3.τ;

Gaussian_VC_model3.β = [290.5454941370383]
Gaussian_VC_model3.θ = [0.32200488770275487]
Gaussian_VC_model3.τ = [0.00034901417385826684]


### fit sleepstudy with GLM:

In [18]:
# fit with glm: GLM.jl
sleepstudy_lm = lm(@formula(Reaction ~ 1), df)
sleepstudy_lm_β = GLM.coef(sleepstudy_lm)
sleepstudy_lm_θ = 0.0
sleepstudy_lm_τ = inv(deviance(sleepstudy_lm)/dof_residual(sleepstudy_lm))
@show sleepstudy_lm_β
@show sleepstudy_lm_θ
@show sleepstudy_lm_τ;

sleepstudy_lm_β = [298.5078916666667]
sleepstudy_lm_θ = 0.0
sleepstudy_lm_τ = 0.00031516621961879464


### fit sleepstudy with GLMM:

In [19]:
# fit with lmm: MixedModels.jl
sleepstudy_formula = @formula(Reaction ~ 1 + (1|Subject));
mdl = LinearMixedModel(sleepstudy_formula, df)
MixedModels.fit!(mdl)

sleepstudy_LMM_β = mdl.beta
sleepstudy_LMM_θ = mdl.σs[1][1]^2
sleepstudy_LMM_τ = inv(mdl.σ^2)
@show sleepstudy_LMM_β
@show sleepstudy_LMM_θ
@show sleepstudy_LMM_τ;

sleepstudy_LMM_β = [298.5078916666664]
sleepstudy_LMM_θ = 1196.4362986993276
sleepstudy_LMM_τ = 0.0005104996523418869


### sleepstudy: compare loglikelihoods

In [20]:
# qc logl
logl_sleepstudy_QC = logl(Gaussian_VC_model3)

# fit with glm: GLM.jl
logl_sleepstudy_LM = loglikelihood(sleepstudy_lm)

# fit with lmm: MixedModels.jl
logl_sleepstudy_LMM = loglikelihood(mdl);

In [21]:
@show logl_sleepstudy_QC
@show logl_sleepstudy_LM
@show logl_sleepstudy_LMM;

logl_sleepstudy_QC = -973.1173346661585
logl_sleepstudy_LM = -980.5244758509474
logl_sleepstudy_LMM = -955.270529036532


# Gcsemv data: mlmRev (Gaussian)

In [22]:
df = dataset("mlmRev", "Gcsemv")
df = df[completecases(df), :]
y = :Course
grouping = :School

Gaussian_VC_model4 = VC_model(df, y, grouping, d, link)

Quasi-Copula Variance Component Model
  * base distribution: Normal
  * link function: IdentityLink
  * number of clusters: 73
  * cluster size min, max: 1, 83
  * number of variance components: 1
  * number of fixed effects: 1

In [23]:
# fit using QuasiCopula
QuasiCopula.fit!(Gaussian_VC_model4);

gcm.β = [73.38138542350625]
initializing dispersion using residual sum of squares
gcm.τ = [0.003703899254589357]
initializing variance components using MM-Algorithm
gcm.θ = [0.5145178403876451]
Total number of variables............................:        3
                     variables with only lower bounds:        2
                variables with lower and upper bounds:        0
                     variables with only upper bounds:        0
Total number of equality constraints.................:        0
Total number of inequality constraints...............:        0
        inequality constraints with only lower bounds:        0
   inequality constraints with lower and upper bounds:        0
        inequality constraints with only upper bounds:        0


Number of Iterations....: 20

                                   (scaled)                 (unscaled)
Objective...............:   6.3611970355006670e+03    6.3611970355006670e+03
Dual infeasibility......:   6.7714640667166823e-08

### fit Gcsemv with QC:

In [24]:
# qc estimates
@show Gaussian_VC_model4.β
@show Gaussian_VC_model4.θ
@show Gaussian_VC_model4.τ;

Gaussian_VC_model4.β = [72.56416347963582]
Gaussian_VC_model4.θ = [0.5484470406404645]
Gaussian_VC_model4.τ = [0.003967426482597063]


### fit Gcsemv with GLM:

In [25]:
# fit with glm: GLM.jl
Gcsemv_lm = lm(@formula(Course ~ 1), df);
Gcsemv_lm_β = GLM.coef(Gcsemv_lm)
Gcsemv_lm_θ = 0.0
Gcsemv_lm_τ = inv(deviance(Gcsemv_lm)/dof_residual(Gcsemv_lm))
@show Gcsemv_lm_β
@show Gcsemv_lm_θ
@show Gcsemv_lm_τ;

Gcsemv_lm_β = [73.38138542350625]
Gcsemv_lm_θ = 0.0
Gcsemv_lm_τ = 0.0037014672787163494


### fit Gcsemv with GLMM:

In [26]:
# fit with lmm: MixedModels.jl
Gcsemv_formula = @formula(Course ~ 1 + (1|School));
mdl = LinearMixedModel(Gcsemv_formula, df)
MixedModels.fit!(mdl);

Gcsemv_LMM_β = mdl.beta
Gcsemv_LMM_θ = mdl.σs[1][1]^2
Gcsemv_LMM_τ = inv(mdl.σ^2)
@show Gcsemv_LMM_β
@show Gcsemv_LMM_θ
@show Gcsemv_LMM_τ;

Gcsemv_LMM_β = [73.81102130954413]
Gcsemv_LMM_θ = 73.28906077041454
Gcsemv_LMM_τ = 0.005123038781782747


### Gcsemv: compare loglikelihoods

In [27]:
# qc logl
logl_Gcsemv_QC = logl(Gaussian_VC_model4)

# fit with glm: GLM.jl
logl_Gcsemv_LM = loglikelihood(Gcsemv_lm)

# fit with lmm: MixedModels.jl
logl_Gcsemv_LMM = loglikelihood(mdl);

In [28]:
@show logl_Gcsemv_QC
@show logl_Gcsemv_LM
@show logl_Gcsemv_LMM;

logl_Gcsemv_QC = -6361.197035500667
logl_Gcsemv_LM = -6424.201502669516
logl_Gcsemv_LMM = -6247.905067227811


# respiratory data: geepack (Bernoulli)

In [29]:
R"""
    data(respiratory, package="geepack")
    respiratory_df <- respiratory[order(respiratory$id),]
"""
@rget respiratory_df;

df = respiratory_df
df[!, :id] = string.(df[!, :id])
y = :outcome
grouping = :id

# Bernoulli
d = Bernoulli()
link = LogitLink()

Bernoulli_VC_model = VC_model(df, y, grouping, d, link)

Quasi-Copula Variance Component Model
  * base distribution: Bernoulli
  * link function: LogitLink
  * number of clusters: 56
  * cluster size min, max: 4, 8
  * number of variance components: 1
  * number of fixed effects: 1

In [30]:
# fit using QuasiCopula
QuasiCopula.fit!(Bernoulli_VC_model);

initializing β using Newton's Algorithm under Independence Assumption
gcm.β = [0.23531408536427043]
initializing variance components using MM-Algorithm
gcm.θ = [0.2643698235606835]
Total number of variables............................:        2
                     variables with only lower bounds:        1
                variables with lower and upper bounds:        0
                     variables with only upper bounds:        0
Total number of equality constraints.................:        0
Total number of inequality constraints...............:        0
        inequality constraints with only lower bounds:        0
   inequality constraints with lower and upper bounds:        0
        inequality constraints with only upper bounds:        0


Number of Iterations....: 6

                                   (scaled)                 (unscaled)
Objective...............:   2.8889498858347241e+02    2.8889498858347241e+02
Dual infeasibility......:   6.4801335408759542e-09    6.48013354

### fit respiratory with QC:

In [31]:
# qc estimates
@show Bernoulli_VC_model.β
@show Bernoulli_VC_model.θ;

Bernoulli_VC_model.β = [0.1469681313371347]
Bernoulli_VC_model.θ = [0.2790494318995162]


### fit respiratory with GLM:

In [32]:
# fit with glm: GLM.jl
respiratory_glm = glm(@formula(outcome ~ 1), df, d, link)

respiratory_glm_β = GLM.coef(respiratory_glm)
respiratory_glm_θ = 0.0
@show respiratory_glm_β
@show respiratory_glm_θ;

respiratory_glm_β = [0.23531408536427043]
respiratory_glm_θ = 0.0
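
For an intercept-only logistic regression the MLE has a closed form: the intercept is the logit of the sample proportion. The sketch below checks this on hypothetical binary outcomes (not the respiratory data); the helper `logit` is defined locally for illustration.

```julia
using Statistics

logit(p) = log(p / (1 - p))

# Hypothetical binary outcomes, for illustration only.
y = [1, 0, 1, 1, 0, 1, 0, 1]

p_hat = mean(y)                # sample proportion of ones
β_hat = logit(p_hat)           # closed-form intercept-only logistic MLE

# Bernoulli log-likelihood evaluated at the MLE.
logl_hat = sum(yi == 1 ? log(p_hat) : log(1 - p_hat) for yi in y)

@show β_hat logl_hat;
```

Read the other way, the estimate `respiratory_glm_β ≈ 0.235` corresponds to a fitted success probability of about 0.56.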


### fit respiratory with GLMM:

In [33]:
# fit with glmm: MixedModels.jl
respiratory_formula = @formula(outcome ~ 1 + (1|id));
mdl = GeneralizedLinearMixedModel(respiratory_formula, df, d, link)
MixedModels.fit!(mdl, fast=true)

respiratory_GLMM_β = mdl.beta
respiratory_GLMM_θ = mdl.σs[1][1]^2
@show respiratory_GLMM_β
@show respiratory_GLMM_θ;



respiratory_GLMM_β = [0.2739692386302114]
respiratory_GLMM_θ = 1.5273309064378675


### respiratory: compare loglikelihoods

In [34]:
# qc logl
logl_respiratory_QC = logl(Bernoulli_VC_model)

# fit with glm: GLM.jl
logl_respiratory_GLM = loglikelihood(respiratory_glm)

# fit with glmm: MixedModels.jl
logl_respiratory_GLMM = loglikelihood(mdl);

In [35]:
@show logl_respiratory_QC
@show logl_respiratory_GLM
@show logl_respiratory_GLMM;

logl_respiratory_QC = -288.8949885834724
logl_respiratory_GLM = -304.70530346181135
logl_respiratory_GLMM = -280.997334355401


# Contraception data: mlmRev (Bernoulli)

In [36]:
df = dataset("mlmRev", "Contraception")
binary_use = map(x -> string(x) == "N" ? 0.0 : 1.0, df[!, :Use])  # recode "N" to 0.0, anything else to 1.0
df[!, :outcome] = binary_use
y = :outcome
grouping = :District

Bernoulli_VC_model = VC_model(df, y, grouping, d, link)

Quasi-Copula Variance Component Model
  * base distribution: Bernoulli
  * link function: LogitLink
  * number of clusters: 60
  * cluster size min, max: 2, 118
  * number of variance components: 1
  * number of fixed effects: 1
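
The recoding step in the cell above maps the label `"N"` to 0.0 and every other label to 1.0. A minimal sketch on a hypothetical plain string vector (not the `Use` column itself, which may be stored as a categorical array) shows the `map` form next to an equivalent broadcast form.

```julia
# Hypothetical labels, for illustration only.
use = ["N", "Y", "N", "N", "Y"]

# Element-wise recoding with map, as in the cell above.
outcome_map = map(x -> string(x) == "N" ? 0.0 : 1.0, use)

# Equivalent broadcast form for a plain string vector.
outcome_bcast = Float64.(use .!= "N")

@show outcome_map == outcome_bcast;
```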

In [37]:
# fit using QuasiCopula
QuasiCopula.fit!(Bernoulli_VC_model);

initializing β using Newton's Algorithm under Independence Assumption
gcm.β = [-0.43702156143913895]
initializing variance components using MM-Algorithm
gcm.θ = [0.08460354610678963]
Total number of variables............................:        2
                     variables with only lower bounds:        1
                variables with lower and upper bounds:        0
                     variables with only upper bounds:        0
Total number of equality constraints.................:        0
Total number of inequality constraints...............:        0
        inequality constraints with only lower bounds:        0
   inequality constraints with lower and upper bounds:        0
        inequality constraints with only upper bounds:        0


Number of Iterations....: 6

                                   (scaled)                 (unscaled)
Objective...............:   1.2764000166885739e+03    1.2764000166885739e+03
Dual infeasibility......:   3.4385887690431207e-09    3.438588

### fit Contraception with QC:

In [38]:
# qc estimates
@show Bernoulli_VC_model.β
@show Bernoulli_VC_model.θ;

Bernoulli_VC_model.β = [-0.4652690484907957]
Bernoulli_VC_model.θ = [0.08470619962374443]


### fit Contraception with GLM:

In [39]:
# fit with glm: GLM.jl
Contraception_glm = glm(@formula(outcome ~ 1), df, d, link);
Contraception_glm_β = GLM.coef(Contraception_glm)
Contraception_glm_θ = 0.0
@show Contraception_glm_β
@show Contraception_glm_θ;

Contraception_glm_β = [-0.43702156143913895]
Contraception_glm_θ = 0.0


### fit Contraception with GLMM:

In [40]:
# fit with glmm: MixedModels.jl
Contraception_formula = @formula(outcome ~ 1 + (1|District));
mdl = GeneralizedLinearMixedModel(Contraception_formula, df, d, link)
MixedModels.fit!(mdl, fast=true);

Contraception_GLMM_β = mdl.beta
Contraception_GLMM_θ = mdl.σs[1][1]^2
@show Contraception_GLMM_β
@show Contraception_GLMM_θ;

Contraception_GLMM_β = [-0.5257726102897046]
Contraception_GLMM_θ = 0.24556070264990815


### Contraception: compare loglikelihoods

In [41]:
# qc logl
logl_Contraception_QC = logl(Bernoulli_VC_model)

# fit with glm: GLM.jl
logl_Contraception_GLM = loglikelihood(Contraception_glm)

# fit with glmm: MixedModels.jl
logl_Contraception_GLMM = loglikelihood(mdl);

In [42]:
@show logl_Contraception_QC
@show logl_Contraception_GLM
@show logl_Contraception_GLMM;

logl_Contraception_QC = -1276.400016688574
logl_Contraception_GLM = -1295.4546621368916
logl_Contraception_GLMM = -1267.2367038450316


# epilepsy data: gcmr (Poisson)

In [43]:
R"""
    library("gcmr")
    data("epilepsy", package = "gcmr")
"""
@rget epilepsy;

df = epilepsy
y = :counts
grouping = :id

# Poisson
d = Poisson()
link = LogLink()
Poisson_VC_model = VC_model(df, y, grouping, d, link)



Quasi-Copula Variance Component Model
  * base distribution: Poisson
  * link function: LogLink
  * number of clusters: 59
  * cluster size min, max: 5, 5
  * number of variance components: 1
  * number of fixed effects: 1

In [44]:
# fit using QuasiCopula
QuasiCopula.fit!(Poisson_VC_model);

initializing β using Newton's Algorithm under Independence Assumption
gcm.β = [2.5544643397090105]
initializing variance components using MM-Algorithm
gcm.θ = [6.362206769115058]
Total number of variables............................:        2
                     variables with only lower bounds:        1
                variables with lower and upper bounds:        0
                     variables with only upper bounds:        0
Total number of equality constraints.................:        0
Total number of inequality constraints...............:        0
        inequality constraints with only lower bounds:        0
   inequality constraints with lower and upper bounds:        0
        inequality constraints with only upper bounds:        0


Number of Iterations....: 14

                                   (scaled)                 (unscaled)
Objective...............:   2.9453773284413332e+03    2.9453773284413332e+03
Dual infeasibility......:   4.3645371761158458e-09    4.364537176

### fit epilepsy (Poisson) with QC:

In [45]:
# qc estimates
@show Poisson_VC_model.β
@show Poisson_VC_model.θ;

Poisson_VC_model.β = [2.5334666652093887]
Poisson_VC_model.θ = [9.357653001537265]


### fit epilepsy (Poisson) with GLM:

In [46]:
# fit with glm: GLM.jl
epilepsy_glm = glm(@formula(counts ~ 1), df, Poisson(), LogLink());
epilepsy_glm_β = GLM.coef(epilepsy_glm)
epilepsy_glm_θ = 0.0
@show epilepsy_glm_β
@show epilepsy_glm_θ;

epilepsy_glm_β = [2.5544643397090105]
epilepsy_glm_θ = 0.0
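
With a log link and only an intercept, the Poisson MLE is the log of the sample mean, which is why the score (the sum of observed minus fitted counts) vanishes at the estimate. The sketch below verifies this on hypothetical counts, not the epilepsy data.

```julia
using Statistics

# Hypothetical counts, for illustration only.
y = [3, 0, 7, 12, 5, 9]

β_hat = log(mean(y))             # closed-form intercept-only Poisson MLE
score = sum(y .- exp(β_hat))     # derivative of the log-likelihood in β, zero at the MLE

@show β_hat score;
```

Applied to the estimate above, `exp(2.554)` gives a fitted mean of roughly 12.9 counts per observation.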


### fit epilepsy (Poisson) with GLMM:

In [47]:
# fit with glmm: MixedModels.jl
df[!, :id] = string.(df[!, :id])
epilepsy_formula = @formula(counts ~ 1 + (1|id));
mdl = GeneralizedLinearMixedModel(epilepsy_formula, df, Poisson(), LogLink())
MixedModels.fit!(mdl, fast=true);

epilepsy_GLMM_β = mdl.beta
epilepsy_GLMM_θ = mdl.σs[1][1]^2
@show epilepsy_GLMM_β
@show epilepsy_GLMM_θ;

epilepsy_GLMM_β = [2.226516544978358]
epilepsy_GLMM_θ = 0.6078216515856457


### epilepsy (Poisson): compare loglikelihoods

In [48]:
# qc logl
logl_epilepsy_QC = logl(Poisson_VC_model)

# fit with glm: GLM.jl
logl_epilepsy_GLM = loglikelihood(epilepsy_glm)

# fit with glmm: MixedModels.jl
logl_epilepsy_GLMM = loglikelihood(mdl);

In [49]:
@show logl_epilepsy_QC
@show logl_epilepsy_GLM
@show logl_epilepsy_GLMM;

logl_epilepsy_QC = -2945.3773284413332
logl_epilepsy_GLM = -3092.972481024413
logl_epilepsy_GLMM = -1785.335988439955


# Mmmec data: mlmRev (Poisson)

In [50]:
Mmmec = dataset("mlmRev", "Mmmec");

df = Mmmec
y = :Deaths
grouping = :Nation

Poisson_VC_model = VC_model(df, y, grouping, d, link)

Quasi-Copula Variance Component Model
  * base distribution: Poisson
  * link function: LogLink
  * number of clusters: 9
  * cluster size min, max: 3, 95
  * number of variance components: 1
  * number of fixed effects: 1

In [51]:
# fit using QuasiCopula
QuasiCopula.fit!(Poisson_VC_model);

initializing β using Newton's Algorithm under Independence Assumption
gcm.β = [3.326031339981156]
initializing variance components using MM-Algorithm
gcm.θ = [1.0]
Total number of variables............................:        2
                     variables with only lower bounds:        1
                variables with lower and upper bounds:        0
                     variables with only upper bounds:        0
Total number of equality constraints.................:        0
Total number of inequality constraints...............:        0
        inequality constraints with only lower bounds:        0
   inequality constraints with lower and upper bounds:        0
        inequality constraints with only upper bounds:        0


Number of Iterations....: 57

                                   (scaled)                 (unscaled)
Objective...............:   6.6230027620134115e+03    6.6230027620134115e+03
Dual infeasibility......:   1.1068841262247114e-11    1.1068841262247114e-11
Con

### fit Mmmec (Poisson) with QC:

In [52]:
# qc estimates
@show Poisson_VC_model.β
@show Poisson_VC_model.θ;

Poisson_VC_model.β = [3.323833822069927]
Poisson_VC_model.θ = [351405.66779142304]


### fit Mmmec (Poisson) with GLM:

In [53]:
# fit with glm: GLM.jl
Mmmec_glm = glm(@formula(Deaths ~ 1), df, Poisson(), LogLink());
Mmmec_glm_β = GLM.coef(Mmmec_glm)
Mmmec_glm_θ = 0.0
@show Mmmec_glm_β
@show Mmmec_glm_θ;

Mmmec_glm_β = [3.326031339981156]
Mmmec_glm_θ = 0.0


### fit Mmmec (Poisson) with GLMM:

In [54]:
# fit with glmm: MixedModels.jl
Mmmec_formula = @formula(Deaths ~ 1 + (1|Nation));
mdl = GeneralizedLinearMixedModel(Mmmec_formula, df, Poisson(), LogLink())
MixedModels.fit!(mdl, fast=true);

Mmmec_GLMM_β = mdl.beta
Mmmec_GLMM_θ = mdl.σs[1][1]^2
@show Mmmec_GLMM_β
@show Mmmec_GLMM_θ;

Mmmec_GLMM_β = [3.1193025178733476]
Mmmec_GLMM_θ = 1.0985820979715717


### Mmmec (Poisson): compare loglikelihoods

In [55]:
# qc logl
logl_Mmmec_QC = logl(Poisson_VC_model)

# fit with glm: GLM.jl
logl_Mmmec_GLM = loglikelihood(Mmmec_glm)

# fit with glmm: MixedModels.jl
logl_Mmmec_GLMM = loglikelihood(mdl);

In [56]:
@show logl_Mmmec_QC
@show logl_Mmmec_GLM
@show logl_Mmmec_GLMM;

logl_Mmmec_QC = -6623.0027620134115
logl_Mmmec_GLM = -6672.358128381808
logl_Mmmec_GLMM = -3762.8923979880137


# epilepsy data: gcmr (NB)

In [57]:
R"""
    library("gcmr")
    data("epilepsy", package = "gcmr")
"""
@rget epilepsy;

df = epilepsy
y = :counts
grouping = :id

# Negative binomial
d = NegativeBinomial()
link = LogLink()
NB_VC_model = VC_model(df, y, grouping, d, link)

Quasi-Copula Variance Component Model
  * base distribution: NegativeBinomial
  * link function: LogLink
  * number of clusters: 59
  * cluster size min, max: 5, 5
  * number of variance components: 1
  * number of fixed effects: 1

In [58]:
# fit using QuasiCopula
QuasiCopula.fit!(NB_VC_model);

initializing β using GLM.jl
gcm.β = [2.5544643337320343]
initializing variance components using MM-Algorithm
gcm.θ = [0.5657670131706649]
initializing r using Newton update
Converging when tol ≤ 1.0e-6 (max block iter = 10)
Block iter 1 r = 0.95, logl = -1041.21, tol = 1041.2064746554054
Block iter 2 r = 0.95, logl = -1041.2, tol = 9.169060760170463e-6


### fit epilepsy (NB) with QC:

In [59]:
# qc estimates
@show NB_VC_model.β
@show NB_VC_model.θ
@show NB_VC_model.r;

NB_VC_model.β = [2.4792606205031866]
NB_VC_model.θ = [0.5209466565320946]
NB_VC_model.r = [0.9466528028104012]


### fit epilepsy (NB) with GLM:

In [60]:
# fit with glm: GLM.jl
epilepsy_glm = glm(@formula(counts ~ 1), df, d, link);
epilepsy_glm_β = GLM.coef(epilepsy_glm)
epilepsy_glm_θ = 0.0
epilepsy_glm_r = inv(deviance(epilepsy_glm)/dof_residual(epilepsy_glm))
@show epilepsy_glm_β
@show epilepsy_glm_θ
@show epilepsy_glm_r;

epilepsy_glm_β = [2.5544643337320343]
epilepsy_glm_θ = 0.0
epilepsy_glm_r = 0.7015333873207771


### fit epilepsy (NB) with GLMM:

In [61]:
# fit with glmm: MixedModels.jl
df[!, :id] = string.(df[!, :id])
epilepsy_formula = @formula(counts ~ 1 + (1|id));
mdl = GeneralizedLinearMixedModel(epilepsy_formula, df, d, link)
MixedModels.fit!(mdl, fast=true)

epilepsy_GLMM_β = mdl.beta
epilepsy_GLMM_θ = mdl.σs[1][1]^2
epilepsy_GLMM_r = inv(mdl.σ^2)
@show epilepsy_GLMM_β
@show epilepsy_GLMM_θ
@show epilepsy_GLMM_r;

│ It is best to avoid trying to fit such models in MixedModels until
│ the authors gain a better understanding of those cases.
└ @ MixedModels /Users/sarahji/.julia/packages/MixedModels/mag0C/src/generalizedlinearmixedmodel.jl:374


epilepsy_GLMM_β = [2.255186388238702]
epilepsy_GLMM_θ = 0.3446862449721136
epilepsy_GLMM_r = 1.2158558474684866


### epilepsy (NB): compare loglikelihoods

In [62]:
# qc logl
logl_epilepsy_QC_nb = logl(NB_VC_model)

# fit with glm: GLM.jl
logl_epilepsy_GLM_nb = loglikelihood(epilepsy_glm)

# fit with glmm: MixedModels.jl
logl_epilepsy_GLMM_nb = loglikelihood(mdl);

In [63]:
@show logl_epilepsy_QC_nb
@show logl_epilepsy_GLM_nb
@show logl_epilepsy_GLMM_nb;

logl_epilepsy_QC_nb = -1041.196587062723
logl_epilepsy_GLM_nb = -1059.746665536632
logl_epilepsy_GLMM_nb = -1025.4137630663238


# Mmmec data: mlmRev (NB)

In [64]:
Mmmec = dataset("mlmRev", "Mmmec");

df = Mmmec
y = :Deaths
grouping = :Nation
# d = NegativeBinomial()
# link = LogLink()

NB_VC_model = VC_model(df, y, grouping, d, link)

Quasi-Copula Variance Component Model
  * base distribution: NegativeBinomial
  * link function: LogLink
  * number of clusters: 9
  * cluster size min, max: 3, 95
  * number of variance components: 1
  * number of fixed effects: 1

In [65]:
# fit using QuasiCopula
QuasiCopula.fit!(NB_VC_model);

initializing β using GLM.jl
gcm.β = [3.3260313399805215]
initializing variance components using MM-Algorithm
gcm.θ = [1.0]
initializing r using Newton update
Converging when tol ≤ 1.0e-6 (max block iter = 10)
Block iter 1 r = 0.95, logl = -1517.43, tol = 1517.4256786488381


### fit Mmmec (NB) with QC:

In [66]:
@show NB_VC_model.β
@show NB_VC_model.θ
@show NB_VC_model.r;

NB_VC_model.β = [3.261185431678419]
NB_VC_model.θ = [19.40906970423017]
NB_VC_model.r = [0.947526017253861]


### fit Mmmec (NB) with GLM:

In [67]:
# fit with glm: GLM.jl
Mmmec_glm = glm(@formula(Deaths ~ 1), df, d, link);
Mmmec_glm_β = GLM.coef(Mmmec_glm)
Mmmec_glm_θ = 0.0
Mmmec_glm_r = inv(deviance(Mmmec_glm)/dof_residual(Mmmec_glm))
@show Mmmec_glm_β
@show Mmmec_glm_θ
@show Mmmec_glm_r;

Mmmec_glm_β = [3.3260313399805215]
Mmmec_glm_θ = 0.0
Mmmec_glm_r = 0.7981758693672868


### fit Mmmec (NB) with GLMM:

In [68]:
# fit with glmm: MixedModels.jl
Mmmec_formula = @formula(Deaths ~ 1 + (1|Nation));
mdl = GeneralizedLinearMixedModel(Mmmec_formula, df, d, link);
MixedModels.fit!(mdl, fast=true);

Mmmec_GLMM_β = mdl.beta
Mmmec_GLMM_θ = mdl.σs[1][1]^2
Mmmec_GLMM_r = inv(mdl.σ^2)
@show Mmmec_GLMM_β
@show Mmmec_GLMM_θ
@show Mmmec_GLMM_r;

Mmmec_GLMM_β = [3.1384199624537628]
Mmmec_GLMM_θ = 1.1046946523166352
Mmmec_GLMM_r = 0.9266658961422913


│ It is best to avoid trying to fit such models in MixedModels until
│ the authors gain a better understanding of those cases.
└ @ MixedModels /Users/sarahji/.julia/packages/MixedModels/mag0C/src/generalizedlinearmixedmodel.jl:374


### Mmmec (NB): compare loglikelihoods

In [69]:
# qc logl
logl_Mmmec_QC_nb = logl(NB_VC_model)

# fit with glm: GLM.jl
logl_Mmmec_GLM_nb = loglikelihood(Mmmec_glm)

# fit with glmm: MixedModels.jl
logl_Mmmec_GLMM_nb = loglikelihood(mdl);

In [70]:
@show logl_Mmmec_QC_nb
@show logl_Mmmec_GLM_nb
@show logl_Mmmec_GLMM_nb;

logl_Mmmec_QC_nb = -1517.4254928293658
logl_Mmmec_GLM_nb = -1537.7008165848106
logl_Mmmec_GLMM_nb = -1452.072432624533
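
The log-likelihoods reported throughout this notebook can be gathered into one table for side-by-side reading. The sketch below copies the printed values; the container `results` and the two counters are names introduced here for illustration. The GLMM fits for non-Gaussian responses used `fast=true`, so those values come from that approximate fitting mode.

```julia
# (dataset, QC, GLM, GLMM) log-likelihoods copied from the outputs above.
results = [
    ("Dyestuff (Gaussian)",       -163.35547256352643, -166.36494298739052, -163.6635299405715),
    ("Oxboys (Gaussian)",         -826.5794860766799,  -848.3492814947749,  -734.6676665245718),
    ("sleepstudy (Gaussian)",     -973.1173346661585,  -980.5244758509474,  -955.270529036532),
    ("Gcsemv (Gaussian)",         -6361.197035500667,  -6424.201502669516,  -6247.905067227811),
    ("respiratory (Bernoulli)",   -288.8949885834724,  -304.70530346181135, -280.997334355401),
    ("Contraception (Bernoulli)", -1276.400016688574,  -1295.4546621368916, -1267.2367038450316),
    ("epilepsy (Poisson)",        -2945.3773284413332, -3092.972481024413,  -1785.335988439955),
    ("Mmmec (Poisson)",           -6623.0027620134115, -6672.358128381808,  -3762.8923979880137),
    ("epilepsy (NB)",             -1041.196587062723,  -1059.746665536632,  -1025.4137630663238),
    ("Mmmec (NB)",                -1517.4254928293658, -1537.7008165848106, -1452.072432624533),
]

# Print a fixed-width table rounded to two decimals.
println(rpad("dataset", 28), lpad("QC", 12), lpad("GLM", 12), lpad("GLMM", 12))
for (name, qc, glm, glmm) in results
    println(rpad(name, 28),
            lpad(round(qc; digits = 2), 12),
            lpad(round(glm; digits = 2), 12),
            lpad(round(glmm; digits = 2), 12))
end

# Count how often the quasi-copula fit attains the higher log-likelihood.
qc_beats_glm  = count(r -> r[2] > r[3], results)
qc_beats_glmm = count(r -> r[2] > r[4], results)
@show qc_beats_glm qc_beats_glmm;
```

In these ten fits the quasi-copula log-likelihood exceeds the GLM value every time and exceeds the GLMM value only for Dyestuff.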
