# Form the Poisson Regression Random Intercept Model: Simulated set

In this notebook we simulate a dataset and use it to test the fit of our copula model on a Poisson regression outcome.

For a single observation, $i = 1$, we will use ForwardDiff.jl to check the following calculations:

    (1) Loglikelihood

    (2) Gradient with respect to Beta

In [1]:
using DataFrames, Random, GLM, GLMCopula
using ForwardDiff, Test, LinearAlgebra
using LinearAlgebra: BlasReal, copytri!

Random.seed!(1235)

# sample size
N = 10000
# observations per subject
n = 5

variance_component_1 = 0.1
variance_component_2 = 0.9

mean = 5

dist = Poisson

Γ = variance_component_1 * ones(n, n) + variance_component_2 * Matrix(I, n, n)
vecd = [dist(mean) for i in 1:n]
nonmixed_multivariate_dist = NonMixedMultivariateDistribution(vecd, Γ)

Y_Nsample = simulate_nobs_independent_vectors(nonmixed_multivariate_dist, N)

10000-element Array{Array{Float64,1},1}:
 [5.0, 6.0, 1.0, 2.0, 10.0]
 [7.0, 1.0, 10.0, 5.0, 13.0]
 [6.0, 6.0, 3.0, 4.0, 2.0]
 [1.0, 7.0, 3.0, 8.0, 7.0]
 [5.0, 1.0, 3.0, 4.0, 6.0]
 [6.0, 8.0, 6.0, 7.0, 6.0]
 [11.0, 6.0, 3.0, 1.0, 5.0]
 [3.0, 5.0, 3.0, 1.0, 4.0]
 [3.0, 5.0, 10.0, 8.0, 4.0]
 [5.0, 6.0, 2.0, 4.0, 1.0]
 [4.0, 7.0, 7.0, 2.0, 3.0]
 [3.0, 4.0, 7.0, 5.0, 8.0]
 [5.0, 10.0, 4.0, 3.0, 3.0]
 ⋮
 [7.0, 6.0, 6.0, 5.0, 0.0]
 [7.0, 3.0, 4.0, 7.0, 3.0]
 [2.0, 7.0, 5.0, 2.0, 3.0]
 [7.0, 6.0, 5.0, 6.0, 5.0]
 [5.0, 6.0, 3.0, 7.0, 12.0]
 [2.0, 8.0, 9.0, 7.0, 10.0]
 [4.0, 1.0, 6.0, 6.0, 5.0]
 [7.0, 4.0, 3.0, 5.0, 5.0]
 [4.0, 1.0, 7.0, 7.0, 2.0]
 [8.0, 8.0, 5.0, 6.0, 4.0]
 [7.0, 5.0, 6.0, 2.0, 3.0]
 [2.0, 6.0, 3.0, 4.0, 4.0]

In [3]:
d = Poisson()
link = LogLink()
D = typeof(d)
Link = typeof(link)
T = Float64
gcs = Vector{GLMCopulaVCObs{T, D, Link}}(undef, N)
for i in 1:N
    y = Float64.(Y_Nsample[i])
    X = ones(n, 1)
    V = [ones(n, n), Matrix(I, n, n)]
    gcs[i] = GLMCopulaVCObs(y, X, V, d, link)
end
gcm = GLMCopulaVCModel(gcs);

initialize_model!(gcm)
@show gcm.β;

1 0.0 -117045.2600382745 39999
2 -117045.2600382745 -117045.2600382745 39999
gcm.β = [1.6340954024564622]


## MM update for Sigma

$$
  \begin{eqnarray}
\mathcal{L}(\sigma) = - \sum_i^n \ln \left(1 + \sum_k^m \sigma_k^2 t_{ik}\right) + \sum_i^n \ln \left(1 + \sum_k^m \sigma_k^2 q_{ik}\right)
\end{eqnarray}
$$

where
$$
\begin{eqnarray*}
	t_{ik} &=& \frac 12 \text{tr}(\mathbf{V}_{ik}), \quad \mathbf{t}_i = (t_{i1}, \ldots, t_{im})^T \\
	q_{ik} &=& \frac{1}{2}  \mathbf{r}_i( \boldsymbol{\beta})^T \mathbf{V}_{ik}  \mathbf{r}_i (\boldsymbol{\beta}), \quad \mathbf{q}_i = (q_{i1}, \ldots, q_{im})^T.
\end{eqnarray*}
$$

The MM update for the $k^{th}$ element of the $m \times 1$ vector of variance components is:

$$\begin{eqnarray*}
\sigma_{k}^{2^{(t+1)}} & = & \sigma_{k}^{2^{(t)}} \left(\frac{\sum_{i=1}^n
\frac{q_{ik}}{1+ q_i^{(t)}}}{\sum_{i=1}^n \frac{t_{ik}}{1+ t_i^{(t)}}} \right).
\end{eqnarray*}
$$

where $q_i^{(t)} = \sum_{k=1}^m \sigma_k^{2(t)} q_{ik}$ and $t_i^{(t)} = \sum_{k=1}^m \sigma_k^{2(t)} t_{ik}$. 


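In this case $m = 2$: the first variance component ($\mathbf{V}_{i1} = \mathbf{1}\mathbf{1}^T$) models the random intercept and the second ($\mathbf{V}_{i2} = \mathbf{I}$) scales the identity.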
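
The update above can be sketched without the package. This is a minimal illustration, assuming the $q_{ik}$ and $t_{ik}$ have already been collected into plain matrices `Q` and `T`; the names `mm_objective`, `mm_update`, `Q`, and `T` are hypothetical and are not part of GLMCopula, and the toy numbers are made up.

```julia
# Q[i, k] = q_ik and T[i, k] = t_ik; both are hypothetical precomputed
# N × m matrices, not fields of the GLMCopula model object.

# Objective L(σ²) = Σ_i [ -ln(1 + t_i) + ln(1 + q_i) ]
mm_objective(σ2, Q, T) = sum(log1p.(Q * σ2)) - sum(log1p.(T * σ2))

# One MM step: σ_k² ← σ_k² · (Σ_i q_ik / (1 + q_i)) / (Σ_i t_ik / (1 + t_i))
function mm_update(σ2, Q, T)
    qsum = Q * σ2                      # q_i^(t) for every group i
    tsum = T * σ2                      # t_i^(t) for every group i
    num = Q' * (1 ./ (1 .+ qsum))      # numerator, one entry per k
    den = T' * (1 ./ (1 .+ tsum))      # denominator, one entry per k
    return σ2 .* num ./ den
end

# Toy inputs: N = 3 groups, m = 2 variance components
Q = [1.0 0.5; 2.0 0.3; 0.2 0.9]
T = [0.5 0.5; 0.5 0.5; 0.5 0.5]
σ2_old = [1.0, 1.0]
σ2_new = mm_update(σ2_old, Q, T)
```

Because the update is multiplicative, positive starting values stay positive, and each step should not decrease `mm_objective`.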

β was initialized above by `initialize_model!`. Here I set every σ2 to 1, update σ2 with `update_Σ!`, and evaluate the loglikelihood.

In [4]:
fill!(gcm.Σ, 1.0)
update_Σ!(gcm)
@show gcm.Σ
GLMCopula.loglikelihood!(gcm, true, true)

gcm.Σ = [0.09165668470529689, 0.703193469117031]


-116209.33278544483

## Closer Look at Observation 1 ($i = 1$)

In [5]:
gc = gcm.data[1]
β  = gcm.β
Σ  = gcm.Σ
τ  = gcm.τ

@show β
@show Σ
@show τ

n_i  = length(gc.y)

β = [1.6340954024564622]
Σ = [0.09165668470529689, 0.703193469117031]
τ = [1.0]


5

## Loglikelihood for observation i = 1, j in [1, n_1]
$$\mathcal{L}(\mathbf{\beta})_1 =  - \ln \Big[1\! +\! \frac{1}{2}\text{tr}(\mathbf{\Gamma_{1}})\Big] +
\ln \Big\{1\!+\!\frac{1}{2}\mathbf{r_1}(\mathbf{\beta})^t \mathbf{\Gamma_1} \mathbf{r_1}(\mathbf{\beta})\Big\} +  \sum_{j=1}^{n_1} \Big[ y_{1j} \log(\mu_{1j}(\mathbf{\beta})) - \mu_{1j}(\mathbf{\beta}) - \log(y_{1j}!) \Big]$$

### First I want to check if the mean and residuals are updated and standardized at this point

In [6]:
@test gc.η == gc.X*β                         # systematic linear component
@test gc.μ == exp.(gc.η)                     # mu = ginverse of XB = mean component for GLM
@test gc.varμ == exp.(gc.η)                  # variance of the GLM response as a function of mean mu
@test gc.res ≈ (gc.y - gc.μ)./sqrt.(gc.varμ) # standardized residual for GLM outcome

[32m[1mTest Passed[22m[39m

### Next I want to check if the hard coded terms in the loglikelihood are correct

$$\text{Term 1 }= - \ln \Big[1\! +\! \frac{1}{2}tr(\mathbf{\Gamma_{1}})\Big]$$

In [7]:
Γ_est = Σ[1] * gc.V[1] + Σ[2] * gc.V[2]
trace_gamma = tr(Γ_est)

term1 = -log(1 + 0.5 * trace_gamma)
@show term1;

term1 = -1.0943115151041685


In [8]:
tsum = dot(Σ, gc.t)
@show -log(1 + tsum);

-(log(1 + tsum)) = -1.0943115151041687
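
The two cells agree because $\mathbf{\Gamma_1} = \sum_k \sigma_k^2 \mathbf{V}_{1k}$ and $t_{1k} = \frac{1}{2}\text{tr}(\mathbf{V}_{1k})$. As a rough hand check with the printed Σ, and noting that both $\mathbf{V}_{11} = \mathbf{1}\mathbf{1}^T$ and $\mathbf{V}_{12} = \mathbf{I}$ have trace $n_1 = 5$:

$$
\begin{eqnarray*}
\frac{1}{2}\text{tr}(\mathbf{\Gamma_1}) &=& \sum_{k=1}^{2} \sigma_k^2 \, \frac{1}{2}\text{tr}(\mathbf{V}_{1k}) = \sum_{k=1}^{2} \sigma_k^2 t_{1k} \\
&\approx& \frac{5}{2}\,(0.09166 + 0.70319) \approx 1.98713 \\
\text{Term 1} &\approx& -\ln(1 + 1.98713) \approx -1.09431
\end{eqnarray*}
$$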


$$\text{Term 2} = \ln \Big\{1\!+\!\frac{1}{2}\mathbf{r_1}(\mathbf{\beta})^t \mathbf{\Gamma_1} \mathbf{r_1}(\mathbf{\beta})\Big\}$$

In [9]:
# term 2:
quad_form_standardized_res = transpose(gc.res) * Γ_est * gc.res
@show quad_form_standardized_res

term2 = log(1 + 0.5 * quad_form_standardized_res) 
@show term2

quad_form_standardized_res = 7.089996124595516
term2 = 1.5140273012922243


1.5140273012922243

### In the loglikelihood function I have:

$$\text{Term1 + Term2} =  - \ln \Big[1\! +\! \frac{1}{2}tr(\mathbf{\Gamma_{1}})\Big] +
\ln \Big\{1\!+\!\frac{1}{2}\mathbf{r_1}(\mathbf{\beta})^t \mathbf{\Gamma_1} \mathbf{r_1}(\mathbf{\beta})\Big\}$$

In [10]:
logl_hard_coded_obs1 = term1 + term2
copula_logl_function = GLMCopula.copula_loglikelihood_addendum(gc, Σ)
@show logl_hard_coded_obs1
@show copula_logl_function
@test copula_logl_function ≈ logl_hard_coded_obs1

logl_hard_coded_obs1 = 0.41971578618805583
copula_logl_function = 0.4197157861880556


[32m[1mTest Passed[22m[39m

### Part of Loglikelihood that comes from the Density using GLM.jl

$$\text{Term 3} = \sum_{j=1}^{n_1} \Big[ y_{1j} \log(\mu_{1j}(\beta)) - \mu_{1j}(\beta) - \log(y_{1j}!) \Big]$$

In [11]:
function poisson_density(y, μ)
    logl = 0.0
    for j in 1:length(y)
        logl += y[j] * log(μ[j]) - μ[j] - log(factorial(y[j]))
    end
    logl
end

term3 = poisson_density(gc.y, gc.μ)

-13.57011304947251

In [12]:
logl_component_poisson = 0.0
logl_component_poisson += component_loglikelihood(gc, τ[1], logl_component_poisson)

-13.570113049472514

In [13]:
@test logl_component_poisson ≈ term3

[32m[1mTest Passed[22m[39m

$$\mathcal{L}(\mathbf{\beta})_1 =  - \ln \Big[1\! +\! \frac{1}{2}\text{tr}(\mathbf{\Gamma_{1}})\Big] +
\ln \Big\{1\!+\!\frac{1}{2}\mathbf{r_1}(\mathbf{\beta})^t \mathbf{\Gamma_1} \mathbf{r_1}(\mathbf{\beta})\Big\} +  \sum_{j=1}^{n_1} \Big[ y_{1j} \log(\mu_{1j}(\beta)) - \mu_{1j}(\beta) - \log(y_{1j}!) \Big]$$

In [14]:
logl_hard = term1 + term2 + term3

-13.150397263284455

In [15]:
logl_my_function = copula_loglikelihood(gc, β, τ[1], Σ)

-13.150397263284459

In [16]:
@test logl_hard ≈ logl_my_function

[32m[1mTest Passed[22m[39m

# A Closer Look at the Gradient for observation i=1

$$\begin{eqnarray*}
\nabla_\beta \mathcal{L} &=& \sum_{i=1}^n \sum_j \nabla \ln f_{ij}(y_{ij} \mid \mathbf{\beta}) + \sum_{i=1}^n
\frac{\nabla \mathbf{r_i(\mathbf{\beta})}^\top\mathbf{\Gamma_i}\mathbf{r_i(\mathbf{\beta})}}{1+\frac{1}{2}\mathbf{r_i}(\mathbf{\beta})^t \mathbf{\Gamma_i} \mathbf{r_i(\mathbf{\beta})}}
\end{eqnarray*}
$$

The gradient is made of two terms. The first comes from the GLM component loglikelihood, which here is the Poisson regression density. The second is specific to our copula model. We start with Term 1 for observation 1:

$$\begin{eqnarray*}
    \text{Term 1} &=& \sum_{j=1}^{n_1} \frac{(y_{1j}-\mu_{1j}) \mu_{1j}'(\eta_{1j})}{\sigma_{1j}^2} \mathbf{x}_{1j}
\end{eqnarray*}
$$

We will check if the field $\mu_{1j}'$ or `mueta` from the GLM.jl package matches our theoretical value

In [17]:
# these are slightly off by small decimals
@test gc.dμ == exp.(gc.η)                  # derivative of mean with respect to systematic component

[32m[1mTest Passed[22m[39m

In [18]:
function poisson_gradient(y, X, dμ, σ2, μ)
    grad = zeros(size(X, 2))
    for j in 1:length(y)
        grad += (y[j] - μ[j]) * dμ[j]/σ2[j] * X[j, :]
    end
    grad
end

# check if glm gradient is right
term1_gradient = poisson_gradient(gc.y, gc.X, gc.dμ, gc.varμ, gc.μ)

1-element Array{Float64,1}:
 -1.6240999999999985
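
This value can be checked by hand. For the Poisson model with log link, $\mu_{1j}'(\eta_{1j}) = \mu_{1j}$ and $\sigma_{1j}^2 = \mu_{1j}$, so the weights cancel. With an intercept-only design ($\mathbf{x}_{1j} = 1$), the common mean $\mu = e^{\beta} \approx 5.12482$, and the first simulated vector $\mathbf{y}_1 = (5, 6, 1, 2, 10)$:

$$
\begin{eqnarray*}
\text{Term 1} &=& \sum_{j=1}^{n_1} \frac{(y_{1j}-\mu_{1j})\, \mu_{1j}}{\mu_{1j}} \mathbf{x}_{1j} = \sum_{j=1}^{n_1} (y_{1j}-\mu_{1j})\, \mathbf{x}_{1j} \\
&\approx& (5 + 6 + 1 + 2 + 10) - 5\,(5.12482) = 24 - 25.6241 = -1.6241
\end{eqnarray*}
$$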

In [19]:
term1_grad_fctn = GLMCopula.glm_gradient(gc, β, τ)
@test term1_gradient == term1_grad_fctn

[32m[1mTest Passed[22m[39m

### Copula density specific gradient portion

$$ \text{Term 2} = \frac{\nabla \mathbf{r_1(\mathbf{\beta})}^\top\mathbf{\Gamma_1}\mathbf{r_1(\mathbf{\beta})}}{1+\frac{1}{2}\mathbf{r_1}(\mathbf{\beta})^t \mathbf{\Gamma_1} \mathbf{r_1(\mathbf{\beta})}}
$$

Notice the second term uses a critical value that will come up in the Hessian as well. For the first observation, $i = 1$, we have $n_1 = 5$ and $p = 1$ in the simulated dataset, so $\nabla \mathbf{r_1(\mathbf{\beta})}$ is a $5 \times 1$ matrix of differentials.

Each column of $\nabla \mathbf{r_1(\mathbf{\beta})}^\top$ is a $p \times 1$ vector $\nabla r_{1j}(\mathbf{\beta})$

$$
\begin{eqnarray*}
\nabla r_{1j}(\mathbf{\beta}) &=&  -\frac{1}{\sigma_{1j}(\mathbf{\beta})} \nabla \mu_{1j}(\mathbf{\beta})- \frac{1}{2} \frac{y_{1j}-\mu_{1j}(\mathbf{\beta})}{\sigma_{1j}^3(\mathbf{\beta})} \nabla \sigma_{1j}^2(\mathbf{\beta})\\
&=&  -\frac{1}{\sigma_{1j}(\mathbf{\beta})} \nabla \mu_{1j}(\mathbf{\beta})- \frac{1}{2\sigma_{1j}^2(\mathbf{\beta})}r_{1j}(\mathbf{\beta}) \nabla \sigma_{1j}^2(\mathbf{\beta})
\end{eqnarray*}
$$


where 

$$
\begin{eqnarray*}
\nabla \mu_{1j}(\mathbf{\beta}) &=& e^{\eta_{1j}(\mathbf{\beta})} * \mathbf{x_{1j}}
\end{eqnarray*}
$$

In [20]:
# for j = 1 and j = 2, ..., j = end; lets take a look at the first two columns 
∇μβ1 = exp.(gc.η[1]) .* gc.X[1, :]
∇μβ2 = exp.(gc.η[2]) .* gc.X[2, :]
# ...
∇μβend = exp.(gc.η[end]) .* gc.X[end, :]

@show ∇μβ1
@show ∇μβ2
@show ∇μβend

∇μβ1 = [5.12482]
∇μβ2 = [5.12482]
∇μβend = [5.12482]


1-element Array{Float64,1}:
 5.12482

In [21]:
∇ηβ = gc.X

∇μβ = transpose(∇ηβ) * Diagonal(gc.dμ)

1×5 Array{Float64,2}:
 5.12482  5.12482  5.12482  5.12482  5.12482

and  
$$  \begin{eqnarray*}
    \nabla \sigma_{ij}^2(\beta) = \frac{d\sigma_{ij}^2(\beta)}{d\mu_{ij}(\beta)} \frac{d\mu_{ij}(\beta)}{d\eta_{ij}(\beta)} \frac{d\eta_{ij}(\beta)}{d\beta} = 1 * e^{\eta_{ij}(\beta)} \mathbf{x}_{ij} = e^{\eta_{ij}(\beta)} \mathbf{x}_{ij}
\end{eqnarray*}$$

In [22]:
# for j = 1 and j = 2 ,... , j = end; lets take a look at the first two columns 
∇σ2β1 = 1 * gc.dμ[1] .* gc.X[1, :]
∇σ2β2 = 1 * gc.dμ[2] .* gc.X[2, :]
# ...
∇σ2βend = 1 * gc.dμ[end] .* gc.X[end, :]

@show ∇σ2β1
@show ∇σ2β2
@show ∇σ2βend

∇σ2β = transpose(gc.X) * Diagonal(gc.dμ)

∇σ2β1 = [5.12482]
∇σ2β2 = [5.12482]
∇σ2βend = [5.12482]


1×5 Array{Float64,2}:
 5.12482  5.12482  5.12482  5.12482  5.12482

$$
\begin{eqnarray*}
\nabla r_{1j}(\mathbf{\beta}) &=&  -\frac{1}{\sigma_{1j}(\mathbf{\beta})} \nabla \mu_{1j}(\mathbf{\beta})- \frac{1}{2} \frac{y_{1j}-\mu_{1j}(\mathbf{\beta})}{\sigma_{1j}^3(\mathbf{\beta})} \nabla \sigma_{1j}^2(\mathbf{\beta})\\
&=&  -\frac{1}{\sigma_{1j}(\mathbf{\beta})} \nabla \mu_{1j}(\mathbf{\beta})- \frac{1}{2\sigma_{1j}^2(\mathbf{\beta})}r_{1j}(\mathbf{\beta}) \nabla \sigma_{1j}^2(\mathbf{\beta})
\end{eqnarray*}
$$
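
For this Poisson model the expression collapses to a closed form, which gives a quick hand check. Substituting $\sigma_{1j}^2 = \mu_{1j}$ and $\nabla \mu_{1j} = \nabla \sigma_{1j}^2 = \mu_{1j}\mathbf{x}_{1j}$ into the second line:

$$
\begin{eqnarray*}
\nabla r_{1j}(\mathbf{\beta}) &=& -\left(\sqrt{\mu_{1j}} + \frac{1}{2} r_{1j}\right)\mathbf{x}_{1j} = -\frac{y_{1j} + \mu_{1j}}{2\sqrt{\mu_{1j}}}\,\mathbf{x}_{1j}
\end{eqnarray*}
$$

For $j = 1$, with $y_{11} = 5$ and $\mu_{11} \approx 5.12482$, this is roughly $-10.12482 / (2 \times 2.26381) \approx -2.23624$.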


In [30]:
# for j = 1 and j = 2, ..., j = end; lets take a look at the first two columns
∇μβ1 = exp.(gc.η[1]) .* gc.X[1, :]
∇μβ2 = exp.(gc.η[2]) .* gc.X[2, :]
# ...
∇μβend = exp.(gc.η[end]) .* gc.X[end, :]

@show ∇μβ1
@show ∇μβ2
@show ∇μβend

∇μη = exp.(gc.η)
@show gc.varμ ≈ ∇μη
@test gc.dμ ≈ gc.varμ
∇ηβ = gc.X

∇μβ = transpose(∇ηβ)*Diagonal(∇μη)

# for j = 1 and j = 2 ,... , j = end; lets take a look at the first two columns
∇σ2β1 = gc.dμ[1] .* gc.X[1, :]
∇σ2β2 = gc.dμ[2] .* gc.X[2, :]
# ...
∇σ2βend = gc.dμ[end] .* gc.X[end, :]

@show ∇σ2β1
@show ∇σ2β2
@show ∇σ2βend

∇σ2β = transpose(gc.X)* Diagonal(gc.dμ)

update_res!(gc, β)
standardize_res!(gc)
std_res_differential!(gc)
@test gc.∇resβ[1, :] ≈ -1/(sqrt(gc.varμ[1])) .* ∇μβ1 - ((1/2gc.varμ[1]) * gc.res[1]) .* ∇σ2β1

∇μβ1 = [5.12482]
∇μβ2 = [5.12482]
∇μβend = [5.12482]
gc.varμ ≈ ∇μη = true
∇σ2β1 = [5.12482]
∇σ2β2 = [5.12482]
∇σ2βend = [5.12482]


[32m[1mTest Passed[22m[39m

In [31]:
@test gc.∇resβ[2, :] == -1/(sqrt(gc.varμ[2])) .* ∇μβ2 - ((1/2gc.varμ[2]) * gc.res[2]) .* ∇σ2β2

[32m[1mTest Passed[22m[39m

In [32]:
@test gc.∇resβ[end, :] == -1/(sqrt(gc.varμ[end])) .* ∇μβend - ((1/2gc.varμ[end]) * gc.res[end]) .* ∇σ2βend

[32m[1mTest Passed[22m[39m

### Gradient portion of Copula specific model

$$\begin{eqnarray*}
\text{Term 2} &=& \sum_{i=1}^n
\frac{\nabla \mathbf{r_i(\mathbf{\beta})}^\top\mathbf{\Gamma_i}\mathbf{r_i(\mathbf{\beta})}}{1+\frac{1}{2}\mathbf{r_i}(\mathbf{\beta})^t \mathbf{\Gamma_i} \mathbf{r_i(\mathbf{\beta})}}
\end{eqnarray*}
$$

Again for observation $ i = 1$ we have:  

$$ \text{Term 2} = \frac{\nabla \mathbf{r_1(\mathbf{\beta})}^\top\mathbf{\Gamma_1}\mathbf{r_1(\mathbf{\beta})}}{1+\frac{1}{2}\mathbf{r_1}(\mathbf{\beta})^t \mathbf{\Gamma_1} \mathbf{r_1(\mathbf{\beta})}}
$$

In [33]:
Γ_est = Σ[1] * gc.V[1] + Σ[2] * gc.V[2]

grad_t2_numerator = transpose(gc.∇resβ) * Γ_est * gc.res       # new term ∇resβ^t * Γ * res
@show grad_t2_numerator

quadratic_form = transpose(gc.res) * Γ_est * gc.res
@show quadratic_form 
@test quadratic_form ≈ quad_form_standardized_res # from the loglikelihood 'qsum'

grad_t2_denominator = inv(1 + 0.5 * quadratic_form)
@show grad_t2_denominator

gradient_term2 = grad_t2_numerator * grad_t2_denominator

grad_t2_numerator = [-1.6586434409554274]
quadratic_form = 7.089996124595516
grad_t2_denominator = 0.22002209600380832


1-element Array{Float64,1}:
 -0.364938206401982

In [34]:
gradient_term2_function = GLMCopula.copula_gradient_addendum(gc, β, τ[1], Σ)

1-element Array{Float64,1}:
 -0.3649382064019818

In [35]:
@test gradient_term2 ≈ gradient_term2_function

[32m[1mTest Passed[22m[39m

In [36]:
gradient_hard_code = term1_gradient + gradient_term2

1-element Array{Float64,1}:
 -1.9890382064019805

In [37]:
full_gradient_my_function = copula_gradient(gc, β, τ, Σ)

1-element Array{Float64,1}:
 -1.9890382064019803

In [38]:
@test full_gradient_my_function ≈ gradient_hard_code

[32m[1mTest Passed[22m[39m

## Let's now use the ForwardDiff.jl package to check if our matrix calculus is correct.

I want to start by checking my calculation of the gradient. 

    (1) I will modify the functions that reflect parts of the loglikelihood above in section 1 to use the package properly. 
    (2) I will then compare the results to that from my gradient functions above in section 2 
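
As a package-free cross-check of the same idea, the sketch below rebuilds the observation-1 loglikelihood from the values printed above and compares an analytic gradient against central finite differences. This is an illustration only: `loglik`, `analytic_gradient`, and `fd_gradient` are hypothetical helper names, finite differences stand in for ForwardDiff.jl, and the constant $-\sum_j \log(y_{1j}!)$ is dropped because it does not depend on β.

```julia
using LinearAlgebra

# Values copied from the notebook output for observation i = 1
y  = [5.0, 6.0, 1.0, 2.0, 10.0]
X  = ones(5, 1)
β0 = [1.6340954024564622]
Σ0 = [0.09165668470529689, 0.703193469117031]
Γ  = Σ0[1] * ones(5, 5) + Σ0[2] * Matrix(1.0I, 5, 5)

# Loglikelihood of one group, without the constant -Σ log(y!)
function loglik(β)
    μ = exp.(X * β)                      # log link: μ = exp(η)
    r = (y .- μ) ./ sqrt.(μ)             # Poisson: variance equals μ
    return sum(y .* log.(μ) .- μ) - log(1 + 0.5 * tr(Γ)) +
           log(1 + 0.5 * dot(r, Γ * r))
end

# Analytic gradient: GLM score plus the copula addendum
function analytic_gradient(β)
    μ  = exp.(X * β)
    r  = (y .- μ) ./ sqrt.(μ)
    # ∇r_j = -(√μ_j + r_j / 2) x_j, from ∇μ = ∇σ² = μ x and σ² = μ
    ∇r = -Diagonal(sqrt.(μ) .+ r ./ 2) * X
    return X' * (y .- μ) + (∇r' * (Γ * r)) ./ (1 + 0.5 * dot(r, Γ * r))
end

# Central finite differences (stand-in for ForwardDiff.gradient)
function fd_gradient(f, β; h = 1e-6)
    g = similar(β)
    for k in eachindex(β)
        e = zeros(length(β)); e[k] = h
        g[k] = (f(β .+ e) - f(β .- e)) / (2h)
    end
    return g
end

g_analytic = analytic_gradient(β0)
g_fd = fd_gradient(loglik, β0)
```

Both `g_analytic` and `g_fd` should land close to the value of about -1.98904 reported by `copula_gradient` above.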


###  Test if the gradient of the component loglikelihood matches our gradient function

In [39]:
function poisson_density(β::Vector)
    η = gc.X*β                        # systematic linear component
    μ = exp.(η)   # mu = ginverse of XB = mean component for GLM = [p]
    dμ = exp.(η)
    varμ = dμ
    # the constant -sum(log(y!)) is dropped: it does not depend on β, so the gradient is unchanged
    logl = sum(gc.y .* log.(μ) .- μ)
end

logl_term3 = poisson_density(β)
@show logl_term3

g = x -> ForwardDiff.gradient(poisson_density, x)

gradientmagictest = g(β)
@show gradientmagictest
@show term1_grad_fctn
@test term1_grad_fctn == gradientmagictest

logl_term3 = 13.594189658955099
gradientmagictest = [-1.6240999999999985]
term1_grad_fctn = [-1.6240999999999985]


[32m[1mTest Passed[22m[39m

# Now we check the matrix of differentials of the standardized residual vector

Let's start with $i = 1, j = 1$: the first observation of the first group has the following gradient of the standardized residual.

In [40]:
function standardized_residual_firstobs(β::Vector)
    η = gc.X*β                        # systematic linear component
    μ = exp.(η) # mu = ginverse of XB = mean component for GLM = [p]
    varμ = exp.(η)
    res = (gc.y[1] - μ[1]) / sqrt(varμ[1])
end

g2 = x -> ForwardDiff.gradient(standardized_residual_firstobs, x)
gradientmagictest2 = g2(β)
@show gradientmagictest2

@test gc.∇resβ[1, :] == gradientmagictest2

gradientmagictest2 = [-2.2362379185306214]


[32m[1mTest Passed[22m[39m

## Now I will check the part of the Loglikelihood that is specific to our density

In [41]:
function copula_loglikelihood_addendum1(β::Vector)
  m = length(gc.V)
  η = gc.X*β                        # systematic linear component
  μ = exp.(η) # mu = ginverse of XB = mean component for GLM = [p]
  varμ = exp.(η)
  res = (gc.y .- μ) ./ sqrt.(varμ)
  trace_gamma = tr(Γ_est)

  term1 = -log(1 + 0.5 * trace_gamma)
  quad_form_standardized_res = transpose(res)* Γ_est * res
  term2 = log(1 + 0.5 * quad_form_standardized_res)
  logl_hard_coded_obs1 = term1 + term2
  logl_hard_coded_obs1
end

g3 = x -> ForwardDiff.gradient(copula_loglikelihood_addendum1, x)

@show copula_loglikelihood_addendum1(β)

gradientmagictest3 = g3(β)
@show gradientmagictest3
@show gradient_term2_function
@test gradient_term2_function ≈ gradientmagictest3


copula_loglikelihood_addendum1(β) = 0.41971578618805583
gradientmagictest3 = [-0.364938206401982]
gradient_term2_function = [-0.3649382064019818]


[32m[1mTest Passed[22m[39m

## Now I will put together both parts of the loglikelihood and both parts of the gradient to check them all together

In [42]:
function full_loglikelihood(β::Vector)
    logl = 0.0
    logl = poisson_density(β) + copula_loglikelihood_addendum1(β)
    logl
end

@show full_loglikelihood(β)

g4 = x -> ForwardDiff.gradient(full_loglikelihood, x)

gradientmagictest4 = g4(β)
@show gradientmagictest4

full_gradient_function = copula_gradient(gc, β, τ, Σ)
@show full_gradient_function
@test full_gradient_function ≈ gradientmagictest4

full_loglikelihood(β) = 14.013905445143154
gradientmagictest4 = [-1.9890382064019805]
full_gradient_function = [-1.9890382064019803]


[32m[1mTest Passed[22m[39m

# Now I show that using my Hessian we get convergence to estimates close to the values used in the simulation

In [46]:
d = Poisson()
link = LogLink()
D = typeof(d)
Link = typeof(link)
T = Float64
gcs = Vector{GLMCopulaVCObs{T, D, Link}}(undef, N)
for i in 1:N
    y = Float64.(Y_Nsample[i])
    X = ones(n, 1)
    V = [ones(n, n), Matrix(I, n, n)]
    gcs[i] = GLMCopulaVCObs(y, X, V, d, link)
end
gcm = GLMCopulaVCModel(gcs);

initialize_model!(gcm)
@show gcm.β;
fill!(gcm.Σ, 1.0)
update_Σ!(gcm)
GLMCopula.loglikelihood!(gcm, true, true)

1 0.0 -117045.2600382745 39999
2 -117045.2600382745 -117045.2600382745 39999
gcm.β = [1.6340954024564622]


-116209.33278544483

In [47]:
@time fit2!(gcm, IpoptSolver(print_level = 5, max_iter = 100, hessian_approximation = "exact"))

gcm.Σ = [0.09165656173880622, 0.7031918157799482]
gcm.Σ = [0.09165644216061264, 0.7031902079939716]
This is Ipopt version 3.13.4, running with linear solver mumps.
NOTE: Other linear solvers might be more efficient (see Ipopt documentation).

Number of nonzeros in equality constraint Jacobian...:        0
Number of nonzeros in inequality constraint Jacobian.:        0
Number of nonzeros in Lagrangian Hessian.............:        1

Total number of variables............................:        1
                     variables with only lower bounds:        0
                variables with lower and upper bounds:        0
                     variables with only upper bounds:        0
Total number of equality constraints.................:        0
Total number of inequality constraints...............:        0
        inequality constraints with only lower bounds:        0
   inequality constraints with lower and upper bounds:        0
        inequality constraints with only upper bound

gcm.Σ = [0.10031036433657615, 0.8610171034955616]
  52  1.1614334e+05 0.00e+00 1.93e-05  -8.6 1.12e-09    -  1.00e+00 4.88e-04f 12
gcm.Σ = [0.10031039124353809, 0.8610174826537456]
  53  1.1614334e+05 0.00e+00 3.10e-05  -8.6 5.03e-09    -  1.00e+00 1.53e-05f 17
gcm.Σ = [0.10031040773203312, 0.861017719388987]
  54  1.1614334e+05 0.00e+00 3.83e-05  -8.6 8.06e-09    -  1.00e+00 1.91e-06f 20
gcm.Σ = [0.10031041647387001, 0.861017845308664]
  55  1.1614334e+05 0.00e+00 4.22e-05  -8.6 9.96e-09    -  1.00e+00 1.91e-06f 20
gcm.Σ = [0.10031042014938757, 0.8610178931394405]
  56  1.1614334e+05 0.00e+00 4.35e-05  -8.6 1.10e-08    -  1.00e+00 3.91e-03f  9
gcm.Σ = [0.10031042258040132, 0.8610179254264329]
  57  1.1614334e+05 0.00e+00 4.41e-05  -8.6 1.13e-08    -  1.00e+00 7.81e-03f  8
gcm.Σ = [0.10031042330149391, 0.8610179325351913]
  58  1.1614334e+05 0.00e+00 1.07e-06  -8.6 1.15e-08    -  1.00e+00 1.00e+00f  1
gcm.Σ = [0.10031042739374355, 0.8610179730944845]
  59  1.1614334e+05 0.00e+00 2.37e-

In [48]:
@show exp.(gcm.β)

@show gcm.Σ;

exp.(gcm.β) = [4.986032236259763]
gcm.Σ = [0.10031043421216335, 0.8610180602709369]


In [49]:
d = Poisson()
link = LogLink()
D = typeof(d)
Link = typeof(link)
T = Float64
gcs = Vector{GLMCopulaVCObs{T, D, Link}}(undef, N)
for i in 1:N
    y = Float64.(Y_Nsample[i])
    X = ones(n, 1)
    V = [ones(n, n), Matrix(I, n, n)]
    gcs[i] = GLMCopulaVCObs(y, X, V, d, link)
end
gcm = GLMCopulaVCModel(gcs);

initialize_model!(gcm)
@show gcm.β;
fill!(gcm.Σ, 1.0)
update_Σ!(gcm)
GLMCopula.loglikelihood!(gcm, true, true)
fill!(gcm.Σ, 1.0)
update_Σ!(gcm)
@show gcm.Σ
initial_logl = GLMCopula.loglikelihood!(gcm, true, true)
@show initial_logl
@show gcm.Hβ

1 0.0 -117045.2600382745 39999
2 -117045.2600382745 -117045.2600382745 39999
gcm.β = [1.6340954024564622]
gcm.Σ = [0.09165668470529689, 0.703193469117031]
initial_logl = -116209.33278544483
gcm.Hβ = [-191298.6845107316]


1×1 Array{Float64,2}:
 -191298.6845107316

In [50]:
@time fit2!(gcm, IpoptSolver(print_level = 5, max_iter = 500, point_perturbation_radius = 0.01, derivative_test = "first-order", warm_start_init_point="yes", limited_memory_initialization = "constant", limited_memory_init_val = -gcm.Hβ[1], hessian_approximation = "limited-memory"))

gcm.Σ = [0.09230397547736052, 0.7501323074019326]
gcm.Σ = [0.09165664315970626, 0.7031929107670822]
gcm.Σ = [0.09165652133885283, 0.7031912728126108]
This is Ipopt version 3.13.4, running with linear solver mumps.
NOTE: Other linear solvers might be more efficient (see Ipopt documentation).

Starting derivative checker for first derivatives.

* grad_f[          1] =  3.2030006861441693e+03    ~  3.1992723888728970e+03  [ 1.165e-03]

Derivative checker detected 1 error(s).

Number of nonzeros in equality constraint Jacobian...:        0
Number of nonzeros in inequality constraint Jacobian.:        0
Number of nonzeros in Lagrangian Hessian.............:        0

Total number of variables............................:        1
                     variables with only lower bounds:        0
                variables with lower and upper bounds:        0
                     variables with only upper bounds:        0
Total number of equality constraints.................:        0
Total num

  51  1.1614334e+05 0.00e+00 1.44e-05 -11.0 2.73e-09    -  1.00e+00 1.00e+00f  1
gcm.Σ = [0.10031016460138147, 0.8610137479095301]
  52  1.1614334e+05 0.00e+00 3.82e-05 -11.0 2.24e-09    -  1.00e+00 2.50e-01f  3
gcm.Σ = [0.10031017298811036, 0.8610139445005375]
  53  1.1614334e+05 0.00e+00 2.16e-05 -11.0 5.94e-09    -  1.00e+00 1.00e+00f  1
gcm.Σ = [0.1003101818527073, 0.8610141337799347]
  54  1.1614334e+05 0.00e+00 1.92e-06 -11.0 7.72e-09    -  1.00e+00 1.00e+00f  1
gcm.Σ = [0.10031019928661557, 0.861014484486974]
  55  1.1614334e+05 0.00e+00 1.09e-05 -11.0 6.32e-10    -  1.00e+00 1.00e+00F  1
gcm.Σ = [0.1003102078373102, 0.8610146472155117]
  56  1.1614334e+05 0.00e+00 1.38e-05 -11.0 5.37e-10    -  1.00e+00 1.00e+00f  1
gcm.Σ = [0.10031024014199172, 0.8610152230215464]
  57  1.1614334e+05 0.00e+00 3.05e-05 -11.0 6.78e-10    -  1.00e+00 2.50e-01f  3
gcm.Σ = [0.10031026874043, 0.8610156956091026]
  58  1.1614334e+05 0.00e+00 4.35e-05 -11.0 1.50e-09    -  1.00e+00 2.50e-01f  3
gcm.Σ = 

In [51]:
@show gcm.β
@show gcm.Σ

@show gcm.∇β
@show gcm.∇Σ
GLMCopula.loglikelihood!(gcm, true, true)

gcm.β = [1.6066404505444691]
gcm.Σ = [0.10031043407683647, 0.8610180586357418]
gcm.∇β = [-1.7648436096528997e-5]
gcm.∇Σ = [0.0, 0.0]


-116143.34140684143