# Fit negative binomial model

Taking inspiration from [GLM.jl](https://github.com/JuliaStats/GLM.jl/blob/master/src/negbinfit.jl#L68), we will:
+ Initialize $r$ from a Poisson regression fit
+ Perform block updates:
    - Fix $r$, fit the negative binomial copula model until convergence
    - Fix $\beta$ and the variance component parameters, update $r$ using Newton's method
    - Repeat until convergence
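
The alternating scheme above can be illustrated on a deliberately simplified problem: an intercept-only negative binomial model with no copula term. This is a sketch, not GLMCopula's implementation. The function names (`nb_score_r`, `nb_hess_r`, `newton_r`, `fit_nb_blocks`), the toy data, and the moment-based starting value for $r$ are all assumptions made for illustration. The derivatives use the identity $\log\Gamma(y + r) - \log\Gamma(r) = \sum_{k=0}^{y-1} \log(r + k)$ for integer $y$, so only the standard library is needed.

```julia
using Statistics  # stdlib: mean, var

# First derivative in r of the NB log-likelihood at fixed mean μ (integer counts y).
function nb_score_r(y, μ, r)
    s = 0.0
    for yi in y
        for k in 0:(yi - 1)
            s += 1 / (r + k)
        end
        s += log(r / (r + μ)) + 1 - (r + yi) / (r + μ)
    end
    return s
end

# Second derivative in r of the same log-likelihood.
function nb_hess_r(y, μ, r)
    h = 0.0
    for yi in y
        for k in 0:(yi - 1)
            h -= 1 / (r + k)^2
        end
        h += 1 / r - 1 / (r + μ) + (yi - μ) / (r + μ)^2
    end
    return h
end

# Newton iterations for r with μ held fixed; steps are halved to keep r positive.
function newton_r(y, μ, r0; tol = 1e-10, maxiter = 50)
    r = r0
    for _ in 1:maxiter
        step = nb_score_r(y, μ, r) / nb_hess_r(y, μ, r)
        rnew = r - step
        while rnew <= 0
            step /= 2
            rnew = r - step
        end
        abs(rnew - r) < tol && return rnew
        r = rnew
    end
    return r
end

# Alternate the two blocks until r stops changing.
function fit_nb_blocks(y; outer = 20, tol = 1e-8)
    μ = mean(y)
    r = μ^2 / (var(y) - μ)   # assumed moment-based start; needs var(y) > mean(y)
    β = log(μ)
    for _ in 1:outer
        β = log(mean(y))               # block 1: fix r, fit β (closed form here)
        rnew = newton_r(y, exp(β), r)  # block 2: fix β, update r by Newton
        converged = abs(rnew - r) < tol * (abs(r) + 1)
        r = rnew
        converged && break
    end
    return β, r
end

y = [0, 1, 1, 2, 3, 5, 8, 13, 2, 4, 0, 7]   # toy overdispersed counts
β̂, r̂ = fit_nb_blocks(y)
```

In this toy setting block 1 has a closed form that does not depend on $r$, so the outer loop stops after two passes. With covariates or the copula term, block 1 would itself be an iterative fit and the two blocks would interact.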
    
# No block updates (only initialize $r$)

In [28]:
using Revise
using DataFrames, Random, GLM, GLMCopula
using ForwardDiff, Test, LinearAlgebra
using LinearAlgebra: BlasReal, copytri!

Random.seed!(1234)

# sample size
N = 10000
# observations per subject
n = 5

variance_component_1 = 0.1
variance_component_2 = 0.5

r = 100
p = 0.7
μ = r * (1-p) * inv(p)

# var = r * (1-p) * inv(p^2)

# true beta
β_true = log(μ)

dist = NegativeBinomial

Γ = variance_component_1 * ones(n, n) + variance_component_2 * Matrix(I, n, n)
vecd = [dist(r, p) for i in 1:n]
nonmixed_multivariate_dist = NonMixedMultivariateDistribution(vecd, Γ)

Y_Nsample = simulate_nobs_independent_vectors(nonmixed_multivariate_dist, N)

10000-element Vector{Vector{Float64}}:
 [35.0, 32.0, 49.0, 37.0, 53.0]
 [55.0, 40.0, 39.0, 45.0, 35.0]
 [34.0, 42.0, 43.0, 60.0, 34.0]
 [41.0, 39.0, 34.0, 43.0, 55.0]
 [25.0, 60.0, 26.0, 32.0, 53.0]
 [42.0, 41.0, 46.0, 44.0, 47.0]
 [58.0, 47.0, 45.0, 42.0, 43.0]
 [48.0, 52.0, 39.0, 37.0, 56.0]
 [43.0, 33.0, 38.0, 39.0, 38.0]
 [44.0, 44.0, 39.0, 37.0, 36.0]
 [34.0, 36.0, 45.0, 35.0, 39.0]
 [39.0, 50.0, 51.0, 32.0, 42.0]
 [34.0, 47.0, 36.0, 50.0, 37.0]
 ⋮
 [49.0, 46.0, 53.0, 42.0, 34.0]
 [54.0, 46.0, 48.0, 36.0, 49.0]
 [49.0, 54.0, 38.0, 45.0, 32.0]
 [36.0, 47.0, 53.0, 54.0, 45.0]
 [34.0, 39.0, 42.0, 41.0, 59.0]
 [35.0, 38.0, 37.0, 49.0, 48.0]
 [35.0, 46.0, 53.0, 29.0, 34.0]
 [30.0, 56.0, 48.0, 46.0, 40.0]
 [40.0, 47.0, 35.0, 38.0, 48.0]
 [45.0, 37.0, 37.0, 28.0, 40.0]
 [29.0, 35.0, 53.0, 44.0, 46.0]
 [47.0, 48.0, 54.0, 38.0, 47.0]

In [29]:
d = NegativeBinomial()
link = LogLink()
D = typeof(d)
Link = typeof(link)
T = Float64
gcs = Vector{NBCopulaVCObs{T, D, Link}}(undef, N)
for i in 1:N
    y = Float64.(Y_Nsample[i])
    X = ones(n, 1)
    V = [ones(n, n), Matrix(I, n, n)]
    gcs[i] = NBCopulaVCObs(y, X, V, d, link)
end
gcm = NBCopulaVCModel(gcs);
# gcm.r[1] = r
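
The list `V` passed to each observation holds the basis matrices of the covariance used in the simulation cell, where `Γ` was built as `variance_component_1 * ones(n, n) + variance_component_2 * Matrix(I, n, n)`. A small standalone check of that structure, with the variable name `σ2` chosen here for illustration:

```julia
using LinearAlgebra

n = 5
σ2 = [0.1, 0.5]                               # true variance components from the simulation
V = [ones(n, n), Matrix{Float64}(I, n, n)]    # same basis matrices as in the model setup
Γ = sum(σ2[k] * V[k] for k in eachindex(V))   # Γ = σ2[1] * V[1] + σ2[2] * V[2]

println("diagonal entry:     ", Γ[1, 1])
println("off-diagonal entry: ", Γ[1, 2])
```

Every diagonal entry is the sum of the two components and every off-diagonal entry is the first component alone, so the first component controls within-subject dependence.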

In [30]:
initialize_model!(gcm)
@show gcm.β
# @show gcm.Σ
@show gcm.r

Initializing NegBin r to Poisson regression values
initializing β using Newton's Algorithm under Independence Assumption
1 0.0 -183965.880323221 39999
2 -183965.880323221 -183965.880323221 39999
initializing variance components using MM-Algorithm
gcm.Σ = [225.63027521074963, 1857.7441889422705]
gcm.θ = [-5.404042280046205, 224.7201241053222, 1864.4405333056088]
gcm.θ = [3.76335082451623, 225.63027521074963, 1857.7441889422705]
gcm.θ = [3.76335082451623, 225.63027521074963, 1857.7441889422705]
This is Ipopt version 3.12.10, running with linear solver mumps.
NOTE: Other linear solvers might be more efficient (see Ipopt documentation).

Starting derivative checker for first derivatives.

* grad_f[          2] = -1.1624128938060323e+01    ~ -1.1610857440697405e+01  [ 1.143e-03]
* grad_f[          3] =  1.4000239622658115e+00    ~  1.3984520467760686e+00  [ 1.124e-03]

Derivative checker detected 2 error(s).

Number of nonzeros in equality constraint Jacobian...:        0
Number of nonzeros

1-element Vector{Float64}:
 64.78937959114738
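
The derivative checker lines in the log above compare each analytic gradient entry with a finite-difference estimate and report the relative discrepancy in brackets; here both flagged entries differ by roughly `1e-3`. The idea can be sketched on a toy function. The helper `gradient_check`, the test function, and the use of central differences are assumptions for illustration and are not how Ipopt or GLMCopula implement the check.

```julia
# Relative discrepancy between an analytic gradient g and a central finite difference of f.
function gradient_check(f, g, x; h = 1e-6)
    relerr = similar(x)
    gx = g(x)
    for i in eachindex(x)
        e = zeros(length(x))
        e[i] = h
        fd = (f(x .+ e) - f(x .- e)) / (2h)          # finite-difference estimate
        relerr[i] = abs(gx[i] - fd) / max(abs(fd), 1e-12)
    end
    return relerr
end

f(x) = x[1]^2 * x[2] + exp(x[2])                 # toy objective
g(x) = [2x[1] * x[2], x[1]^2 + exp(x[2])]        # its exact gradient
errs = gradient_check(f, g, [1.5, -0.3])
println(errs)
```

For an exact gradient the discrepancies are many orders of magnitude below `1e-3`, which is why entries at that level are worth inspecting.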

In [31]:
β_true

3.757872325600888
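
The value above can be checked by hand from the simulation parameters $r = 100$ and $p = 0.7$, using the mean and variance expressions that appear in the setup cell:

$$
\mu = \frac{r(1-p)}{p} = \frac{100 \times 0.3}{0.7} \approx 42.857,
\qquad
\beta_{\text{true}} = \log \mu \approx 3.7579,
\qquad
\operatorname{Var}(y) = \frac{r(1-p)}{p^2} = \frac{\mu}{p} \approx 61.22 .
$$

The variance exceeds the mean only by the factor $1/p \approx 1.43$, so the simulated counts are mildly overdispersed relative to a Poisson model.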

Initialize β and σ2. Here I simply copy over the solution for β and σ2 from MixedModels.jl.

In [32]:
GLMCopula.loglikelihood!(gcm, true, true)

-178909.94656519336

## Fitting routine

In [33]:
# Quasi-Newton
@time GLMCopula.fit!(gcm, IpoptSolver(print_level = 5, max_iter = 100,
#     tol = 10^-6, mu_strategy = "adaptive", mu_oracle = "loqo",
#     mehrotra_algorithm="yes", warm_start_init_point="yes",
    hessian_approximation = "limited-memory"))

This is Ipopt version 3.12.10, running with linear solver mumps.
NOTE: Other linear solvers might be more efficient (see Ipopt documentation).

Number of nonzeros in equality constraint Jacobian...:        0
Number of nonzeros in inequality constraint Jacobian.:        0
Number of nonzeros in Lagrangian Hessian.............:        0

Total number of variables............................:        3
                     variables with only lower bounds:        2
                variables with lower and upper bounds:        0
                     variables with only upper bounds:        0
Total number of equality constraints.................:        0
Total number of inequality constraints...............:        0
        inequality constraints with only lower bounds:        0
   inequality constraints with lower and upper bounds:        0
        inequality constraints with only upper bounds:        0

iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
   0 

-178909.59176759253

In [34]:
println("estimated β = $(gcm.β[1]); true β = $β_true")
println("estimated variance component 1 = $(gcm.Σ[1]); true variance component 1 = $variance_component_1")
println("estimated variance component 2 = $(gcm.Σ[2]); true variance component 2 = $variance_component_2")
println("estimated r = $(gcm.r[1]); true r = $r");

estimated β = 3.7611786562926066; true β = 3.757872325600888
estimated variance component 1 = 0.04829707124136521; true variance component 1 = 0.1
estimated variance component 2 = 0.04440991310090281; true variance component 2 = 0.5
estimated r = 64.78937959114738; true r = 100


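The output below is retained from an earlier run that simulated with $r = 10$, hence the different true β:

In [27]: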
println("estimated β = $(gcm.β[1]); true β = $β_true")
println("estimated variance component 1 = $(gcm.Σ[1]); true variance component 1 = $variance_component_1")
println("estimated variance component 2 = $(gcm.Σ[2]); true variance component 2 = $variance_component_2")
println("estimated r = $(gcm.r[1]); true r = $r");

estimated β = 1.456226847936331; true β = 1.4552872326068422
estimated variance component 1 = 0.10133551134062899; true variance component 1 = 0.1
estimated variance component 2 = 0.4371874111348784; true variance component 2 = 0.5
estimated r = 9.668018779781647; true r = 10


# Block updates

In [73]:
using Revise
using DataFrames, Random, GLM, GLMCopula
using ForwardDiff, Test, LinearAlgebra
using LinearAlgebra: BlasReal, copytri!

# Random.seed!(1234)
Random.seed!(12345)

# sample size
N = 10000
# observations per subject
n = 5

variance_component_1 = 0.1
variance_component_2 = 0.5

r = 100
p = 0.7
μ = r * (1-p) * inv(p)

# var = r * (1-p) * inv(p^2)

# true beta
β_true = log(μ)

dist = NegativeBinomial

Γ = variance_component_1 * ones(n, n) + variance_component_2 * Matrix(I, n, n)
vecd = [dist(r, p) for i in 1:n]
nonmixed_multivariate_dist = NonMixedMultivariateDistribution(vecd, Γ)

Y_Nsample = simulate_nobs_independent_vectors(nonmixed_multivariate_dist, N)

10000-element Vector{Vector{Float64}}:
 [49.0, 55.0, 38.0, 39.0, 38.0]
 [38.0, 30.0, 45.0, 27.0, 42.0]
 [49.0, 63.0, 49.0, 43.0, 35.0]
 [50.0, 40.0, 35.0, 61.0, 38.0]
 [27.0, 52.0, 51.0, 42.0, 39.0]
 [42.0, 47.0, 39.0, 68.0, 36.0]
 [33.0, 52.0, 38.0, 51.0, 50.0]
 [50.0, 44.0, 39.0, 47.0, 56.0]
 [49.0, 53.0, 56.0, 56.0, 48.0]
 [43.0, 44.0, 54.0, 45.0, 36.0]
 [44.0, 36.0, 34.0, 42.0, 29.0]
 [48.0, 37.0, 40.0, 35.0, 49.0]
 [44.0, 39.0, 43.0, 43.0, 33.0]
 ⋮
 [40.0, 31.0, 46.0, 47.0, 51.0]
 [38.0, 53.0, 58.0, 40.0, 36.0]
 [44.0, 53.0, 37.0, 57.0, 30.0]
 [36.0, 33.0, 40.0, 43.0, 37.0]
 [44.0, 35.0, 28.0, 35.0, 42.0]
 [42.0, 26.0, 51.0, 34.0, 42.0]
 [42.0, 33.0, 47.0, 36.0, 36.0]
 [43.0, 49.0, 37.0, 48.0, 55.0]
 [37.0, 32.0, 27.0, 34.0, 46.0]
 [50.0, 43.0, 38.0, 32.0, 32.0]
 [45.0, 43.0, 54.0, 47.0, 42.0]
 [41.0, 38.0, 38.0, 36.0, 63.0]

In [74]:
d = NegativeBinomial()
link = LogLink()
D = typeof(d)
Link = typeof(link)
T = Float64
gcs = Vector{NBCopulaVCObs{T, D, Link}}(undef, N)
for i in 1:N
    y = Float64.(Y_Nsample[i])
    X = ones(n, 1)
    V = [ones(n, n), Matrix(I, n, n)]
    gcs[i] = NBCopulaVCObs(y, X, V, d, link)
end
gcm = NBCopulaVCModel(gcs);
# gcm.r[1] = r

In [75]:
initialize_model!(gcm)
@show gcm.β
# @show gcm.Σ
@show gcm.r

Initializing NegBin r to Poisson regression values
initializing β using Newton's Algorithm under Independence Assumption
1 0.0 -183674.41930971833 39999
2 -183674.41930971833 -183674.41930971833 39999
initializing variance components using MM-Algorithm
gcm.Σ = [204.3806181911617, 1875.9810978049666]
gcm.θ = [-5.404691326231122, 203.47046708573427, 1882.6774421683049]
gcm.θ = [3.7627017783313135, 204.3806181911617, 1875.9810978049666]
gcm.θ = [3.7627017783313135, 204.3806181911617, 1875.9810978049666]
This is Ipopt version 3.12.10, running with linear solver mumps.
NOTE: Other linear solvers might be more efficient (see Ipopt documentation).

Starting derivative checker for first derivatives.

* grad_f[          2] = -1.2103501333214426e+01    ~ -1.2157011253807358e+01  [ 4.402e-03]
* grad_f[          3] =  1.3070683144568718e+00    ~  1.3081313671431383e+00  [ 8.126e-04]

Derivative checker detected 2 error(s).

Number of nonzeros in equality constraint Jacobian...:        0
Number of 

1-element Vector{Float64}:
 65.68792280627983

In [76]:
β_true

3.757872325600888

In [77]:
GLMCopula.loglikelihood!(gcm, true, true)

-178754.66061963522

In [78]:
# Quasi-Newton
@time GLMCopula.fit!(gcm, IpoptSolver(print_level = 5, max_iter = 10,
#     tol = 10^-6, mu_strategy = "adaptive", mu_oracle = "loqo",
#     mehrotra_algorithm="yes", warm_start_init_point="yes",
    hessian_approximation = "limited-memory"))

This is Ipopt version 3.12.10, running with linear solver mumps.
NOTE: Other linear solvers might be more efficient (see Ipopt documentation).

Number of nonzeros in equality constraint Jacobian...:        0
Number of nonzeros in inequality constraint Jacobian.:        0
Number of nonzeros in Lagrangian Hessian.............:        0

Total number of variables............................:        3
                     variables with only lower bounds:        2
                variables with lower and upper bounds:        0
                     variables with only upper bounds:        0
Total number of equality constraints.................:        0
Total number of inequality constraints...............:        0
        inequality constraints with only lower bounds:        0
   inequality constraints with lower and upper bounds:        0
        inequality constraints with only upper bounds:        0

iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
   0 

-178726.81186502095

## 1 iteration each

In [58]:
println("estimated β = $(gcm.β[1]); true β = $β_true")
println("estimated variance component 1 = $(gcm.Σ[1]); true variance component 1 = $variance_component_1")
println("estimated variance component 2 = $(gcm.Σ[2]); true variance component 2 = $variance_component_2")
println("estimated r = $(gcm.r[1]); true r = $r");

estimated β = 3.7640471229274333; true β = 3.757872325600888
estimated variance component 1 = 0.010091096783959048; true variance component 1 = 0.1
estimated variance component 2 = 0.010061528161806528; true variance component 2 = 0.5
estimated r = 59.22108970817275; true r = 100


## 5 iterations each

In [45]:
println("estimated β = $(gcm.β[1]); true β = $β_true")
println("estimated variance component 1 = $(gcm.Σ[1]); true variance component 1 = $variance_component_1")
println("estimated variance component 2 = $(gcm.Σ[2]); true variance component 2 = $variance_component_2")
println("estimated r = $(gcm.r[1]); true r = $r");

estimated β = 3.761120679347187; true β = 3.757872325600888
estimated variance component 1 = 0.04200544450800426; true variance component 1 = 0.1
estimated variance component 2 = 0.020655323963911138; true variance component 2 = 0.5
estimated r = 64.2226158070482; true r = 100


## 10 iterations each

In [51]:
println("estimated β = $(gcm.β[1]); true β = $β_true")
println("estimated variance component 1 = $(gcm.Σ[1]); true variance component 1 = $variance_component_1")
println("estimated variance component 2 = $(gcm.Σ[2]); true variance component 2 = $variance_component_2")
println("estimated r = $(gcm.r[1]); true r = $r");

estimated β = 3.7577367229152965; true β = 3.757872325600888
estimated variance component 1 = 0.08454686818713725; true variance component 1 = 0.1
estimated variance component 2 = 0.3859156499408988; true variance component 2 = 0.5
estimated r = 96.07852027325903; true r = 100


## 1 IPOPT iter + complete Newton fit

In [65]:
println("estimated β = $(gcm.β[1]); true β = $β_true")
println("estimated variance component 1 = $(gcm.Σ[1]); true variance component 1 = $variance_component_1")
println("estimated variance component 2 = $(gcm.Σ[2]); true variance component 2 = $variance_component_2")
println("estimated r = $(gcm.r[1]); true r = $r");

estimated β = 3.762274167394871; true β = 3.757872325600888
estimated variance component 1 = 0.044468809218469255; true variance component 1 = 0.1
estimated variance component 2 = 0.046354142614646404; true variance component 2 = 0.5
estimated r = 67.38079696512439; true r = 100


## 5 IPOPT iter + complete Newton fit

In [72]:
println("estimated β = $(gcm.β[1]); true β = $β_true")
println("estimated variance component 1 = $(gcm.Σ[1]); true variance component 1 = $variance_component_1")
println("estimated variance component 2 = $(gcm.Σ[2]); true variance component 2 = $variance_component_2")
println("estimated r = $(gcm.r[1]); true r = $r");

estimated β = 3.759537350779348; true β = 3.757872325600888
estimated variance component 1 = 0.05312813376763176; true variance component 1 = 0.1
estimated variance component 2 = 0.07266779547787298; true variance component 2 = 0.5
estimated r = 70.93622181411288; true r = 100


## 10 IPOPT iter + complete Newton fit (this reached the maximum iteration count)

In [79]:
println("estimated β = $(gcm.β[1]); true β = $β_true")
println("estimated variance component 1 = $(gcm.Σ[1]); true variance component 1 = $variance_component_1")
println("estimated variance component 2 = $(gcm.Σ[2]); true variance component 2 = $variance_component_2")
println("estimated r = $(gcm.r[1]); true r = $r");

estimated β = 3.757598085312234; true β = 3.757872325600888
estimated variance component 1 = 0.08847643797375318; true variance component 1 = 0.1
estimated variance component 2 = 0.418885780706873; true variance component 2 = 0.5
estimated r = 98.01186301721411; true r = 100
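
To compare the block-update runs at a glance, the relative error of the estimated $r$ can be tabulated. The numbers below are copied from the printed outputs of the six runs; the helper `relerr` and the labels are introduced here only for illustration.

```julia
# (label, estimated r) copied from the printed outputs; true r = 100
runs = [
    ("1 iteration each",           59.22108970817275),
    ("5 iterations each",          64.2226158070482),
    ("10 iterations each",         96.07852027325903),
    ("1 IPOPT iter + Newton fit",  67.38079696512439),
    ("5 IPOPT iter + Newton fit",  70.93622181411288),
    ("10 IPOPT iter + Newton fit", 98.01186301721411),
]
r_true = 100.0

# Signed relative error of an estimate
relerr(est, truth) = (est - truth) / truth

for (label, est) in runs
    println(rpad(label, 28), round(100 * relerr(est, r_true); digits = 2), " %")
end
```

In these runs the estimate of $r$ is always below the true value, and the two configurations with 10 iterations come closest.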


## Copying GLM.jl

The first output below is retained from an earlier run that simulated with $r = 10$:

In [7]:
println("estimated β = $(gcm.β[1]); true β = $β_true")
println("estimated variance component 1 = $(gcm.Σ[1]); true variance component 1 = $variance_component_1")
println("estimated variance component 2 = $(gcm.Σ[2]); true variance component 2 = $variance_component_2")
println("estimated r = $(gcm.r[1]); true r = $r");

estimated β = 1.456226847936331; true β = 1.4552872326068422
estimated variance component 1 = 0.10133551134062899; true variance component 1 = 0.1
estimated variance component 2 = 0.4371874111348784; true variance component 2 = 0.5
estimated r = 9.69061414063922; true r = 10


In [73]:
println("estimated β = $(gcm.β[1]); true β = $β_true")
println("estimated variance component 1 = $(gcm.Σ[1]); true variance component 1 = $variance_component_1")
println("estimated variance component 2 = $(gcm.Σ[2]); true variance component 2 = $variance_component_2")
println("estimated r = $(gcm.r[1]); true r = $r");

estimated β = 3.758435534322703; true β = 3.757872325600888
estimated variance component 1 = 0.0934850410967605; true variance component 1 = 0.1
estimated variance component 2 = 0.3827355454688751; true variance component 2 = 0.5
estimated r = 94.45211633725886; true r = 100
