# Bernoulli(p = 0.5) with CS rho = 0.9, sigma2 = 1.0

This notebook uses a Bernoulli base distribution with $p = 0.5$ and a compound symmetric (CS) covariance with correlation $\rho = 0.9$ and variance $\sigma^2 = 1.0$.

Under the quasi-copula (QC) model, the theoretical correlation is a function of the cluster size $d_i$, $\rho$, $\sigma^2$, and the kurtosis $kt$ of the base distribution:

$$\operatorname{Corr}(Y_k, Y_l)
 = \frac{\rho \sigma^2}{1 + \frac{1}{2} \sigma^2 \, kt + \frac{1}{2} \sigma^2 (d_i - 1)}$$

With $p = 0.5$, the (non-excess) kurtosis is $kt = kt(p) = 1$.

Fixing $\rho = 0.9$ and $\sigma^2 = 1.0$, we vary the cluster size $d_i$ and compare the theoretical and empirical correlations. The formula reduces to

$$\operatorname{Corr}(Y_k, Y_l)
 = \frac{0.9}{1 + \frac{1}{2} \cdot 1 + \frac{1}{2} (d_i - 1)} = \frac{0.9}{1 + \frac{d_i}{2}}$$
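As a quick check of the simplified formula above, plugging in the four cluster sizes used in this notebook gives the following values (a worked evaluation, rounded to four decimals):

$$\begin{aligned}
d_i = 2:&\quad \frac{0.9}{1 + 2/2} = \frac{0.9}{2} = 0.45 \\
d_i = 5:&\quad \frac{0.9}{1 + 5/2} = \frac{0.9}{3.5} \approx 0.2571 \\
d_i = 10:&\quad \frac{0.9}{1 + 10/2} = \frac{0.9}{6} = 0.15 \\
d_i = 25:&\quad \frac{0.9}{1 + 25/2} = \frac{0.9}{13.5} \approx 0.0667
\end{aligned}$$

Equivalently, the correlation equals $1.8 / (d_i + 2)$, so it shrinks steadily as the clusters grow.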


# We will show:

1) Simulating under the QC model, the theoretical and empirical correlation IS a function of the cluster size $d_i$.

2) Simulating under the GLMM, the theoretical and empirical correlation is NOT a function of the cluster size $d_i$.




## TOC:

# di = 2
* [CS di = 2](#ex0)
* [simulate under GLMM CS di = 2](#ex0g)

# di = 5
* [CS di = 5](#ex1)
* [simulate under GLMM CS di = 5](#ex1g)

# di = 10 
* [CS di = 10](#ex2)
* [simulate under GLMM CS di = 10](#ex2g)

# di = 25
* [CS di = 25](#ex3)
* [simulate under GLMM CS di = 25](#ex3g)

# Comparisons
* [Theoretical vs. Empirical Correlation Simulated under QC](#ex4)
* [Empirical Correlation Simulated under GLMM](#ex4g)

In [1]:
using GLMCopula, DelimitedFiles, LinearAlgebra, Random, GLM, MixedModels, CategoricalArrays
using Roots, SpecialFunctions, StatsBase
using DataFrames, Statistics, ToeplitzMatrices
import StatsBase: sem

In [2]:
# Build the n×n compound symmetric (CS) correlation matrix: 1 on the diagonal, ρ elsewhere.
function get_V_CS(ρ, n)
    vec = zeros(n)
    vec[1] = 1.0
    for i in 2:n
        vec[i] = ρ
    end
    V = ToeplitzMatrices.SymmetricToeplitz(vec)
    V
end

get_V_CS (generic function with 1 method)
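For readers without ToeplitzMatrices installed, the same compound symmetric matrix can be built with only the standard library. This is a minimal sketch: `dense_V_CS` is a hypothetical name that does not appear in GLMCopula, and it returns a dense `Matrix` rather than a `SymmetricToeplitz`.

```julia
using LinearAlgebra

# Hypothetical helper (not part of GLMCopula): dense n×n compound symmetric
# correlation matrix with 1 on the diagonal and ρ everywhere else.
function dense_V_CS(ρ, n)
    V = fill(float(ρ), n, n)   # start with ρ in every entry
    V[diagind(V)] .= 1.0       # overwrite the diagonal with ones
    return V
end

V = dense_V_CS(0.9, 3)
println(V)
```

The dense form costs $O(n^2)$ memory, which is negligible for the cluster sizes used here (at most 25).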

In [3]:
# true parameter values
p = [0.5]
βtrue = [log(1.0)]
σ2true = [1.0]
ρtrue = [0.9]

samplesize = 100000 # number of sampling units

d = Bernoulli()
link = LogitLink()
D = typeof(d)
Link = typeof(link)
T = Float64

Float64

# Kurtosis of each Bernoulli(0.5) base distribution is 1.0

The call `kurtosis(d, false)` below returns the non-excess kurtosis, which is $kt = kt(p) = 1.0$ for $p = 0.5$.

We fix $\rho = 0.9, \sigma^2 = 1.0, kt = 1.0$ and vary the cluster size $d_i$ to see how the theoretical and empirical correlations respond.

In [4]:
d = Bernoulli(p[1])
μ, σ², sk, kt = mean(d), var(d), skewness(d), kurtosis(d, false)

(0.5, 0.25, 0.0, 1.0)
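The value `kt = 1.0` can be cross-checked against the closed form for the non-excess kurtosis of a Bernoulli($p$) variable, $(1 - 3p(1-p)) / (p(1-p))$. The sketch below uses only base Julia; `bernoulli_kurtosis` is an illustrative name, not a function from Distributions or GLMCopula.

```julia
# Illustrative helper: non-excess kurtosis of Bernoulli(p),
# i.e. the fourth central moment divided by the squared variance.
function bernoulli_kurtosis(p)
    q = 1 - p
    return (1 - 3 * p * q) / (p * q)
end

println(bernoulli_kurtosis(0.5))   # 1.0, matching kurtosis(d, false) above
```

The kurtosis is smallest at $p = 0.5$ and grows as $p$ moves toward 0 or 1, which by the correlation formula lowers the QC correlation for a given $d_i$.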

## CS di = 2 <a class="anchor" id="ex0"></a>

$d_i = 2$

$p = 0.5, \rho = 0.9, \sigma^2 = 1.0$

In [5]:
di = 2 # number of observations per cluster
V_CS = get_V_CS(ρtrue[1], di)

# true Gamma
Γ_CS = σ2true[1] * V_CS

2×2 Matrix{Float64}:
 1.0  0.9
 0.9  1.0

In [6]:
vecd = Vector{DiscreteUnivariateDistribution}(undef, di)
for i in 1:di
    vecd[i] = Bernoulli(p[1])
end
nonmixed_multivariate_dist = NonMixedMultivariateDistribution(vecd, Γ_CS)

Random.seed!(12345)
@time Y_nsample = simulate_nobs_independent_vectors(nonmixed_multivariate_dist, samplesize)

gcs = Vector{GLMCopulaCSObs{T, D, Link}}(undef, samplesize)
for i in 1:samplesize
    X = ones(di, 1)             # intercept-only design matrix
    y = Float64.(Y_nsample[i])  # responses for cluster i
    gcs[i] = GLMCopulaCSObs(y, X, d, link)
end

# form model
gcm = GLMCopulaCSModel(gcs);

N = length(gcm.data)
di = length(gcm.data[1].y)
Y_CS = zeros(N, di)
for j in 1:di
    Y_CS[:, j] = [gcm.data[i].y[j] for i in 1:N]
end
empirical_cor_2 = StatsBase.cor(Y_CS)

  0.837941 seconds (5.52 M allocations: 360.511 MiB, 10.08% gc time, 66.66% compilation time)


2×2 Matrix{Float64}:
 1.0       0.446784
 0.446784  1.0

In [7]:
theoretical_rho_2_kurtosis = (ρtrue[1] * σ2true[1]) / (1 + ((di/2) * σ2true[1]) + (0.5 * (kt - 1) * σ2true[1]))

0.45
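The same expression is evaluated again for each cluster size later in the notebook. As a sketch, it could be wrapped in a small function; `theoretical_cor` is a hypothetical name introduced here for illustration, with keyword defaults set to this notebook's values.

```julia
# Hypothetical helper: theoretical QC correlation under a CS covariance,
# ρσ² / (1 + σ²·kt/2 + σ²(di - 1)/2).
function theoretical_cor(di; ρ = 0.9, σ2 = 1.0, kt = 1.0)
    return (ρ * σ2) / (1 + 0.5 * σ2 * kt + 0.5 * σ2 * (di - 1))
end

for di in (2, 5, 10, 25)
    println("di = ", di, ": ", round(theoretical_cor(di), digits = 4))
end
```

Algebraically this matches the form used in the cell above, since $\frac{1}{2}\sigma^2 kt + \frac{1}{2}\sigma^2 (d_i - 1) = \frac{d_i}{2}\sigma^2 + \frac{1}{2}(kt - 1)\sigma^2$.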

In [8]:
mean_empirical_cor_2 = mean(GLMCopula.offdiag(empirical_cor_2))

0.4467842937984511
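`GLMCopula.offdiag` is used here to pull out the off-diagonal entries before averaging. Its implementation is not shown in this notebook, so the following is only a sketch of the assumed behavior using the standard library; `mean_offdiag` is a hypothetical name.

```julia
using Statistics

# Hypothetical stand-in: average of all off-diagonal entries of a square matrix.
function mean_offdiag(C::AbstractMatrix)
    n = size(C, 1)
    return mean(C[i, j] for i in 1:n, j in 1:n if i != j)
end

C = [1.0 0.446784; 0.446784 1.0]   # empirical correlation reported above
println(mean_offdiag(C))
```

Averaging over all pairs matters more for larger clusters, where each individual pairwise estimate is noisier than their mean.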

## simulate under GLMM CS di = 2 <a class="anchor" id="ex0g"></a>

$d_i = 2$

$p = 0.5, \rho = 0.9, \sigma^2 = 1.0$

In [9]:
function __get_distribution(dist::Type{D}, μ) where D <: UnivariateDistribution
    return dist(μ)
end

for i in 1:samplesize
    X = ones(di, 1)
    η = X * βtrue
    # draw the cluster's latent linear predictor from MVN(Xβ, Γ_CS)
    mvn_d = MvNormal(η, Γ_CS)
    mvn_η = rand(mvn_d)
    μ = GLM.linkinv.(link, mvn_η)
    y = Float64.(rand.(__get_distribution.(D, μ)))
    gcs[i] = GLMCopulaCSObs(y, X, d, link)
end

# form model
gcm = GLMCopulaCSModel(gcs);

N = length(gcm.data)
Y_CS = zeros(N, di)
for j in 1:di
    Y_CS[:, j] = [gcm.data[i].y[j] for i in 1:N]
end
empirical_cor_2_GLMM = StatsBase.cor(Y_CS)

2×2 Matrix{Float64}:
 1.0       0.163437
 0.163437  1.0

In [10]:
mean_empirical_cor_2_GLMM = mean(GLMCopula.offdiag(empirical_cor_2_GLMM))

0.16343678921893148
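The GLMM simulation above draws each cluster's linear predictor from a multivariate normal with CS covariance, so the joint distribution of any pair $(\eta_k, \eta_l)$, and hence of $(Y_k, Y_l)$, does not involve $d_i$. The sketch below illustrates this with only the standard library, writing the CS covariance as a shared component plus independent noise. The function names, seed, and sample size are assumptions made for illustration, so the numbers will not reproduce the notebook's output exactly.

```julia
using Random, Statistics

logistic(x) = 1 / (1 + exp(-x))

# Sketch of a logistic-normal GLMM with CS covariance σ2 * [(1 - ρ)I + ρ11'].
# Equivalent construction: η_j = β + sqrt(σ2ρ) b + sqrt(σ2(1 - ρ)) e_j,
# with b and e_j independent standard normals.
function simulate_glmm_cs(rng, n, di; β = 0.0, ρ = 0.9, σ2 = 1.0)
    Y = zeros(n, di)
    for i in 1:n
        b = randn(rng)  # component shared by the whole cluster
        for j in 1:di
            η = β + sqrt(σ2 * ρ) * b + sqrt(σ2 * (1 - ρ)) * randn(rng)
            Y[i, j] = rand(rng) < logistic(η) ? 1.0 : 0.0
        end
    end
    return Y
end

rng = MersenneTwister(2024)          # arbitrary seed, chosen for this sketch
Y2 = simulate_glmm_cs(rng, 50_000, 2)
Y10 = simulate_glmm_cs(rng, 50_000, 10)
cor_2 = cor(Y2[:, 1], Y2[:, 2])      # correlation of the first pair, di = 2
cor_10 = cor(Y10[:, 1], Y10[:, 2])   # correlation of the first pair, di = 10
println((cor_2, cor_10))
```

Both estimates should land close to each other, differing only by Monte Carlo noise, regardless of the cluster size passed in.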

## CS di = 5 <a class="anchor" id="ex1"></a>

$d_i = 5$

$p = 0.5, \rho = 0.9, \sigma^2 = 1.0$

In [11]:
di = 5 # number of observations per cluster
V_CS = get_V_CS(ρtrue[1], di)

# true Gamma
Γ_CS = σ2true[1] * V_CS

5×5 Matrix{Float64}:
 1.0  0.9  0.9  0.9  0.9
 0.9  1.0  0.9  0.9  0.9
 0.9  0.9  1.0  0.9  0.9
 0.9  0.9  0.9  1.0  0.9
 0.9  0.9  0.9  0.9  1.0

In [12]:
vecd = Vector{DiscreteUnivariateDistribution}(undef, di)
for i in 1:di
    vecd[i] = Bernoulli(p[1])
end
nonmixed_multivariate_dist = NonMixedMultivariateDistribution(vecd, Γ_CS)

Random.seed!(12345)
@time Y_nsample = simulate_nobs_independent_vectors(nonmixed_multivariate_dist, samplesize)

gcs = Vector{GLMCopulaCSObs{T, D, Link}}(undef, samplesize)
for i in 1:samplesize
    X = ones(di, 1)             # intercept-only design matrix
    y = Float64.(Y_nsample[i])  # responses for cluster i
    gcs[i] = GLMCopulaCSObs(y, X, d, link)
end

# form model
gcm = GLMCopulaCSModel(gcs);

N = length(gcm.data)
di = length(gcm.data[1].y)
Y_CS = zeros(N, di)
for j in 1:di
    Y_CS[:, j] = [gcm.data[i].y[j] for i in 1:N]
end
empirical_cor_5 = StatsBase.cor(Y_CS)

  0.934538 seconds (8.50 M allocations: 651.550 MiB, 31.05% gc time)


5×5 Matrix{Float64}:
 1.0       0.258186  0.254597  0.258721  0.250135
 0.258186  1.0       0.256919  0.2537    0.257459
 0.254597  0.256919  1.0       0.254902  0.256095
 0.258721  0.2537    0.254902  1.0       0.257842
 0.250135  0.257459  0.256095  0.257842  1.0

In [13]:
theoretical_rho_5_kurtosis = (ρtrue[1] * σ2true[1]) / (1 + ((di/2) * σ2true[1]) + (0.5 * (kt - 1) * σ2true[1]))

0.2571428571428572

In [14]:
mean_empirical_cor_5 = mean(GLMCopula.offdiag(empirical_cor_5))

0.25585568847847295

## simulate under GLMM CS di = 5 <a class="anchor" id="ex1g"></a>

$d_i = 5$

$p = 0.5, \rho = 0.9, \sigma^2 = 1.0$

In [15]:
for i in 1:samplesize
    X = ones(di, 1)
    η = X * βtrue
    # draw the cluster's latent linear predictor from MVN(Xβ, Γ_CS)
    mvn_d = MvNormal(η, Γ_CS)
    mvn_η = rand(mvn_d)
    μ = GLM.linkinv.(link, mvn_η)
    y = Float64.(rand.(__get_distribution.(D, μ)))
    gcs[i] = GLMCopulaCSObs(y, X, d, link)
end

# form model
gcm = GLMCopulaCSModel(gcs);

N = length(gcm.data)
Y_CS = zeros(N, di)
for j in 1:di
    Y_CS[:, j] = [gcm.data[i].y[j] for i in 1:N]
end
empirical_cor_5_GLMM = StatsBase.cor(Y_CS)

5×5 Matrix{Float64}:
 1.0      0.15434   0.1588    0.14886   0.15482
 0.15434  1.0       0.155104  0.153285  0.158922
 0.1588   0.155104  1.0       0.15193   0.151655
 0.14886  0.153285  0.15193   1.0       0.151914
 0.15482  0.158922  0.151655  0.151914  1.0

In [16]:
mean_empirical_cor_5_GLMM = mean(GLMCopula.offdiag(empirical_cor_5_GLMM))

0.15396313253845206

## CS di = 10 <a class="anchor" id="ex2"></a>

$d_i = 10$

$p = 0.5, \rho = 0.9, \sigma^2 = 1.0$

In [17]:
di = 10 # number of observations per cluster
V_CS = get_V_CS(ρtrue[1], di)

# true Gamma
Γ_CS = σ2true[1] * V_CS

10×10 Matrix{Float64}:
 1.0  0.9  0.9  0.9  0.9  0.9  0.9  0.9  0.9  0.9
 0.9  1.0  0.9  0.9  0.9  0.9  0.9  0.9  0.9  0.9
 0.9  0.9  1.0  0.9  0.9  0.9  0.9  0.9  0.9  0.9
 0.9  0.9  0.9  1.0  0.9  0.9  0.9  0.9  0.9  0.9
 0.9  0.9  0.9  0.9  1.0  0.9  0.9  0.9  0.9  0.9
 0.9  0.9  0.9  0.9  0.9  1.0  0.9  0.9  0.9  0.9
 0.9  0.9  0.9  0.9  0.9  0.9  1.0  0.9  0.9  0.9
 0.9  0.9  0.9  0.9  0.9  0.9  0.9  1.0  0.9  0.9
 0.9  0.9  0.9  0.9  0.9  0.9  0.9  0.9  1.0  0.9
 0.9  0.9  0.9  0.9  0.9  0.9  0.9  0.9  0.9  1.0

In [18]:
vecd = Vector{DiscreteUnivariateDistribution}(undef, di)
for i in 1:di
    vecd[i] = Bernoulli(p[1])
end
nonmixed_multivariate_dist = NonMixedMultivariateDistribution(vecd, Γ_CS)

Random.seed!(1234)
@time Y_nsample = simulate_nobs_independent_vectors(nonmixed_multivariate_dist, samplesize)

gcs = Vector{GLMCopulaCSObs{T, D, Link}}(undef, samplesize)
for i in 1:samplesize
    X = ones(di, 1)             # intercept-only design matrix
    y = Float64.(Y_nsample[i])  # responses for cluster i
    gcs[i] = GLMCopulaCSObs(y, X, d, link)
end

# form model
gcm = GLMCopulaCSModel(gcs);

N = length(gcm.data)
di = length(gcm.data[1].y)
Y_CS = zeros(N, di)
for j in 1:di
    Y_CS[:, j] = [gcm.data[i].y[j] for i in 1:N]
end
empirical_cor_10 = StatsBase.cor(Y_CS)

  2.550523 seconds (16.50 M allocations: 1.691 GiB, 44.33% gc time)


10×10 Matrix{Float64}:
 1.0       0.149147  0.147695  0.150449  …  0.152096  0.145724  0.153344
 0.149147  1.0       0.155174  0.149649     0.148683  0.145699  0.148599
 0.147695  0.155174  1.0       0.152998     0.149787  0.151998  0.154418
 0.150449  0.149649  0.152998  1.0          0.147116  0.150745  0.154045
 0.144407  0.154657  0.147994  0.149709     0.144703  0.146199  0.149659
 0.143604  0.147694  0.146687  0.14942   …  0.146474  0.149268  0.152408
 0.147018  0.153922  0.158544  0.153598     0.152639  0.151621  0.156001
 0.152096  0.148683  0.149787  0.147116     1.0       0.145982  0.147921
 0.145724  0.145699  0.151998  0.150745     0.145982  1.0       0.148779
 0.153344  0.148599  0.154418  0.154045     0.147921  0.148779  1.0

In [19]:
theoretical_rho_10_kurtosis = (ρtrue[1] * σ2true[1]) / (1 + ((di/2) * σ2true[1]) + (0.5 * (kt - 1) * σ2true[1]))

0.15

In [20]:
mean_empirical_cor_10 = mean(GLMCopula.offdiag(empirical_cor_10))

0.1500729994570417

## simulate under GLMM CS di = 10 <a class="anchor" id="ex2g"></a>

$d_i = 10$

$p = 0.5, \rho = 0.9, \sigma^2 = 1.0$

In [21]:
for i in 1:samplesize
    X = ones(di, 1)
    η = X * βtrue
    # draw the cluster's latent linear predictor from MVN(Xβ, Γ_CS)
    mvn_d = MvNormal(η, Γ_CS)
    mvn_η = rand(mvn_d)
    μ = GLM.linkinv.(link, mvn_η)
    y = Float64.(rand.(__get_distribution.(D, μ)))
    gcs[i] = GLMCopulaCSObs(y, X, d, link)
end

# form model
gcm = GLMCopulaCSModel(gcs);

N = length(gcm.data)
Y_CS = zeros(N, di)
for j in 1:di
    Y_CS[:, j] = [gcm.data[i].y[j] for i in 1:N]
end
empirical_cor_10_GLMM = StatsBase.cor(Y_CS)

10×10 Matrix{Float64}:
 1.0       0.149764  0.150673  0.154224  …  0.155914  0.154378  0.158806
 0.149764  1.0       0.151921  0.149119     0.152386  0.15302   0.15918
 0.150673  0.151921  1.0       0.15209      0.154892  0.154996  0.15611
 0.154224  0.149119  0.15209   1.0          0.155853  0.151676  0.152803
 0.152452  0.159039  0.153844  0.153233     0.152477  0.15257   0.157533
 0.157994  0.155321  0.155877  0.158489  …  0.152812  0.150516  0.14927
 0.157974  0.150719  0.156005  0.153472     0.159119  0.152728  0.152051
 0.155914  0.152386  0.154892  0.155853     1.0       0.151371  0.153356
 0.154378  0.15302   0.154996  0.151676     0.151371  1.0       0.154374
 0.158806  0.15918   0.15611   0.152803     0.153356  0.154374  1.0

In [22]:
mean_empirical_cor_10_GLMM = mean(GLMCopula.offdiag(empirical_cor_10_GLMM))

0.15404187675007466

## CS di = 25 <a class="anchor" id="ex3"></a>

$d_i = 25$

$p = 0.5, \rho = 0.9, \sigma^2 = 1.0$

In [23]:
di = 25 # number of observations per cluster
V_CS = get_V_CS(ρtrue[1], di)

# true Gamma
Γ_CS = σ2true[1] * V_CS

25×25 Matrix{Float64}:
 1.0  0.9  0.9  0.9  0.9  0.9  0.9  0.9  …  0.9  0.9  0.9  0.9  0.9  0.9  0.9
 0.9  1.0  0.9  0.9  0.9  0.9  0.9  0.9     0.9  0.9  0.9  0.9  0.9  0.9  0.9
 0.9  0.9  1.0  0.9  0.9  0.9  0.9  0.9     0.9  0.9  0.9  0.9  0.9  0.9  0.9
 0.9  0.9  0.9  1.0  0.9  0.9  0.9  0.9     0.9  0.9  0.9  0.9  0.9  0.9  0.9
 0.9  0.9  0.9  0.9  1.0  0.9  0.9  0.9     0.9  0.9  0.9  0.9  0.9  0.9  0.9
 0.9  0.9  0.9  0.9  0.9  1.0  0.9  0.9  …  0.9  0.9  0.9  0.9  0.9  0.9  0.9
 0.9  0.9  0.9  0.9  0.9  0.9  1.0  0.9     0.9  0.9  0.9  0.9  0.9  0.9  0.9
 0.9  0.9  0.9  0.9  0.9  0.9  0.9  1.0     0.9  0.9  0.9  0.9  0.9  0.9  0.9
 0.9  0.9  0.9  0.9  0.9  0.9  0.9  0.9     0.9  0.9  0.9  0.9  0.9  0.9  0.9
 0.9  0.9  0.9  0.9  0.9  0.9  0.9  0.9     0.9  0.9  0.9  0.9  0.9  0.9  0.9
 0.9  0.9  0.9  0.9  0.9  0.9  0.9  0.9  …  0.9  0.9  0.9  0.9  0.9  0.9  0.9
 0.9  0.9  0.9  0.9  0.9  0.9  0.9  0.9     0.9  0.9  0.9  0.9  0.9  0.9  0.9
 0.9  0.9  0.9  0.9  0.9  0.9  0.9  0.9     0.9  0.9  0.9  0.9  0.9  0.9  0.9
 ⋮

In [24]:
vecd = Vector{DiscreteUnivariateDistribution}(undef, di)
for i in 1:di
    vecd[i] = Bernoulli(p[1])
end
nonmixed_multivariate_dist = NonMixedMultivariateDistribution(vecd, Γ_CS)

Random.seed!(1234)
@time Y_nsample = simulate_nobs_independent_vectors(nonmixed_multivariate_dist, samplesize)

gcs = Vector{GLMCopulaCSObs{T, D, Link}}(undef, samplesize)
for i in 1:samplesize
    X = ones(di, 1)             # intercept-only design matrix
    y = Float64.(Y_nsample[i])  # responses for cluster i
    gcs[i] = GLMCopulaCSObs(y, X, d, link)
end

# form model
gcm = GLMCopulaCSModel(gcs);

N = length(gcm.data)
di = length(gcm.data[1].y)
Y_CS = zeros(N, di)
for j in 1:di
    Y_CS[:, j] = [gcm.data[i].y[j] for i in 1:N]
end
empirical_cor_25 = StatsBase.cor(Y_CS)

  8.440451 seconds (40.50 M allocations: 11.326 GiB, 28.20% gc time)


25×25 Matrix{Float64}:
 1.0        0.0641161  0.0624908  …  0.0686719  0.0621779  0.0646885
 0.0641161  1.0        0.0712733     0.0722922  0.0660648  0.0632414
 0.0624908  0.0712733  1.0           0.066537   0.0657191  0.0622896
 0.0658242  0.0675776  0.0716813     0.0689011  0.0645604  0.065305
 0.068169   0.0700323  0.0714229     0.0629226  0.0655409  0.0629704
 0.0667056  0.0635742  0.062509   …  0.0655682  0.0696232  0.0649494
 0.0612348  0.0690485  0.063898      0.0692782  0.0659394  0.0620341
 0.0691603  0.0684422  0.0620998     0.0678799  0.0680999  0.0613603
 0.0679979  0.069205   0.067979      0.0693591  0.0659797  0.0689977
 0.0739359  0.0678674  0.0703984     0.0704985  0.0715595  0.0724955
 0.0720895  0.0678318  0.0701831  …  0.0676028  0.067061   0.0635309
 0.0629031  0.0698589  0.0616009     0.0665408  0.0679202  0.0706639
 0.0690699  0.0654998  0.0698302     0.0678112  0.0632378  0.0699455
 0.0670348  0.0717286  0.069498      0.0667182  0.0639794  0.0656742
 0.0638035  

In [25]:
theoretical_rho_25_kurtosis = (ρtrue[1] * σ2true[1]) / (1 + ((di/2) * σ2true[1]) + (0.5 * (kt - 1) * σ2true[1]))

0.06666666666666667

In [26]:
mean_empirical_cor_25 = mean(GLMCopula.offdiag(empirical_cor_25))

0.06726269843459454

## simulate under GLMM CS di = 25 <a class="anchor" id="ex3g"></a>

$d_i = 25$

$p = 0.5, \rho = 0.9, \sigma^2 = 1.0$

In [27]:
for i in 1:samplesize
    X = ones(di, 1)
    η = X * βtrue
    # draw the cluster's latent linear predictor from MVN(Xβ, Γ_CS)
    mvn_d = MvNormal(η, Γ_CS)
    mvn_η = rand(mvn_d)
    μ = GLM.linkinv.(link, mvn_η)
    y = Float64.(rand.(__get_distribution.(D, μ)))
    gcs[i] = GLMCopulaCSObs(y, X, d, link)
end

# form model
gcm = GLMCopulaCSModel(gcs);

N = length(gcm.data)
Y_CS = zeros(N, di)
for j in 1:di
    Y_CS[:, j] = [gcm.data[i].y[j] for i in 1:N]
end
empirical_cor_25_GLMM = StatsBase.cor(Y_CS)

25×25 Matrix{Float64}:
 1.0       0.15568   0.1569    0.153718  …  0.153319  0.156479  0.155419
 0.15568   1.0       0.15846   0.156402     0.156321  0.156241  0.156621
 0.1569    0.15846   1.0       0.160542     0.1529    0.15562   0.15628
 0.153718  0.156402  0.160542  1.0          0.157031  0.160671  0.158372
 0.153802  0.15924   0.15654   0.155891     0.153685  0.159725  0.159705
 0.150098  0.154901  0.159521  0.155528  …  0.151934  0.154694  0.154115
 0.152159  0.15048   0.15658   0.154595     0.151477  0.158597  0.156018
 0.161759  0.1566    0.1557    0.160274     0.155837  0.156077  0.156337
 0.156741  0.15458   0.15892   0.159626     0.156663  0.152863  0.152322
 0.153539  0.1513    0.15628   0.158098     0.153619  0.146619  0.153439
 0.157181  0.16022   0.1566    0.154028  …  0.152663  0.154703  0.157203
 0.158119  0.149204  0.156424  0.155971     0.152987  0.158027  0.156129
 0.158399  0.15984   0.15698   0.166353     0.157836  0.156236  0.157696
 0.152761  0.15016   0.15586 

In [28]:
mean_empirical_cor_25_GLMM = mean(GLMCopula.offdiag(empirical_cor_25_GLMM))

0.15515390099002302

## Comparisons 

## 1. Theoretical vs. Empirical Correlation simulated under QC <a class="anchor" id="ex4"></a>

### Takeaway: Quasi-Copula correlation IS a function of $d_i$, $kt$, $\rho$ and $\sigma^2$

The theoretical correlation is a function of $d_i$, $\rho$, $\sigma^2$, and $kt(p)$:

$$\operatorname{Corr}(Y_k, Y_l)
 = \frac{\rho \sigma^2}{1 + \frac{1}{2} \sigma^2 \, kt + \frac{1}{2} \sigma^2 (d_i - 1)}$$

With $p = 0.5$ the kurtosis is $kt = kt(p) = 1$, so for fixed $\rho = 0.9$ and $\sigma^2 = 1.0$ the correlation depends only on the cluster size $d_i$:

$$\operatorname{Corr}(Y_k, Y_l)
 = \frac{0.9}{1 + \frac{1}{2} \cdot 1 + \frac{1}{2} (d_i - 1)} = \frac{0.9}{1 + \frac{d_i}{2}}$$

Each cell below pairs the theoretical value (left) with the mean empirical correlation (right).


In [29]:
# di = 2
[theoretical_rho_2_kurtosis mean_empirical_cor_2]

1×2 Matrix{Float64}:
 0.45  0.446784

In [30]:
# di = 5
[theoretical_rho_5_kurtosis mean_empirical_cor_5]

1×2 Matrix{Float64}:
 0.257143  0.255856

In [31]:
# di = 10
[theoretical_rho_10_kurtosis mean_empirical_cor_10]

1×2 Matrix{Float64}:
 0.15  0.150073

In [32]:
# di = 25
[theoretical_rho_25_kurtosis mean_empirical_cor_25]

1×2 Matrix{Float64}:
 0.0666667  0.0672627
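The four comparisons above can be gathered into one summary. The sketch below hard-codes the values printed in this notebook rather than reusing the session variables, so it runs on its own; the variable names are chosen for illustration.

```julia
# Values copied from the outputs reported above.
di_values   = [2, 5, 10, 25]
theoretical = [0.45, 0.257143, 0.15, 0.0666667]
empirical   = [0.446784, 0.255856, 0.150073, 0.0672627]

abs_diff = abs.(theoretical .- empirical)
for (di, t, e, δ) in zip(di_values, theoretical, empirical, abs_diff)
    println("di = ", di, "  theoretical = ", t, "  empirical = ", e,
            "  |difference| = ", round(δ, digits = 6))
end
println("largest |difference| = ", round(maximum(abs_diff), digits = 6))
```

The theoretical and empirical values agree to within roughly 0.003 at every cluster size, and both decrease as $d_i$ grows.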

## 2. Empirical Correlation simulated under GLMM <a class="anchor" id="ex4g"></a>

### Takeaway: GLMM correlation is NOT a function of the cluster size (only a function of $\rho$ and $\sigma^2$)

In [33]:
# glmm correlation di = 2 
mean_empirical_cor_2_GLMM

0.16343678921893148

In [34]:
# glmm correlation di = 5 
mean_empirical_cor_5_GLMM

0.15396313253845206

In [35]:
# glmm correlation di = 10
mean_empirical_cor_10_GLMM

0.15404187675007466

In [36]:
# glmm correlation di = 25
mean_empirical_cor_25_GLMM

0.15515390099002302
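One way to summarize the contrast between the two simulation schemes is the range of the mean empirical correlation across $d_i = 2, 5, 10, 25$. The sketch below hard-codes the values reported in this notebook; `spread` is a helper name introduced here for illustration.

```julia
# Mean empirical correlations copied from the outputs reported above.
qc_empirical   = [0.446784, 0.255856, 0.150073, 0.0672627]
glmm_empirical = [0.163437, 0.153963, 0.154042, 0.155154]

spread(x) = maximum(x) - minimum(x)   # range across di = 2, 5, 10, 25
println("QC spread:   ", round(spread(qc_empirical), digits = 4))
println("GLMM spread: ", round(spread(glmm_empirical), digits = 4))
```

Under the GLMM the mean correlation moves by less than 0.01 across cluster sizes, while under the QC model it moves by about 0.38.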