# n = 50 Autoregressive covariance structure 

We parameterize the covariance Gamma using only two parameters, rho and sigma2, following the AR(1) structure.

For n = 50, this autoregressive model converges in 35 iterations with Quasi-Newton and in 15 with Newton.

We initialize in three steps: first beta under independent GLM assumptions, then sigma2 using the MM algorithm with rho fixed at 0, and finally rho by the method of moments (MOM) using the empirical covariance between Y_1 and Y_2.

We set the IPOPT convergence tolerance to 10^-6, with the adaptive mu (barrier parameter) strategy turned on.

$$\mu_i= 5, \rho = 0.9, \sigma^2 = 0.1, n_i = 50$$

$$\Gamma_i = \sigma^2 \begin{pmatrix} 1 & \rho & \rho^2 & \cdots & \rho^{n_i - 1} \\ \rho & 1 & \rho & \cdots & \rho^{n_i - 2} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \rho^{n_i - 2} & \rho^{n_i - 3} & \rho^{n_i - 4} & \cdots & \rho \\ \rho^{n_i - 1} & \rho^{n_i - 2} & \rho^{n_i - 3} & \cdots & 1 \end{pmatrix}, \quad i = 1, \ldots, N, \quad N = 10{,}000$$
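
Entrywise, the structure above can be restated as follows, with j and k indexing observations within cluster i:

$$(\Gamma_i)_{jk} = \sigma^2 \rho^{|j - k|}, \qquad j, k = 1, \ldots, n_i$$

With rho = 0.9 and sigma2 = 0.1, the first row starts 0.1, 0.09, 0.081, and the corner entry is $0.1 \cdot 0.9^{49} \approx 0.000573$.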

1. Create a new structure
2. Modify the loglikelihood function and related functions for this structure
3. Add gradient and hessian with respect to AR(1) parameterization
4. Quasi-Newton works
5. Newton works, with Hessian cross terms for (rho, sigma2) and (beta, sigma2)
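
For step 3, the derivatives of the covariance entries $(\Gamma_i)_{jk} = \sigma^2 \rho^{|j-k|}$ with respect to the two AR(1) parameters follow directly. This is only the building block. How these terms enter the gradient and Hessian of the copula loglikelihood depends on the GLMCopula implementation and is not reproduced here.

$$\frac{\partial (\Gamma_i)_{jk}}{\partial \sigma^2} = \rho^{|j-k|}, \qquad \frac{\partial (\Gamma_i)_{jk}}{\partial \rho} = \sigma^2 \, |j-k| \, \rho^{|j-k|-1}, \qquad \frac{\partial^2 (\Gamma_i)_{jk}}{\partial \rho \, \partial \sigma^2} = |j-k| \, \rho^{|j-k|-1}$$

The diagonal entries (j = k) do not depend on rho, so their rho-derivatives vanish, and every entry is linear in sigma2, so the second derivative with respect to sigma2 is zero.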

In [1]:
using DataFrames, Random, GLM, GLMCopula, Test
using LinearAlgebra, BenchmarkTools, Revise

Random.seed!(1234)

# sample size
N = 10000
# observations per subject
n = 50
ρ = 0.9
σ2 = 0.1

V = zeros(n, n) # will store the AR(1) structure without sigma2

mean = 5

dist = Poisson

"""
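    get_AR_cov(n, ρ, σ2, V)

Fill `V` in place with the `n × n` AR(1) correlation matrix, `V[i, j] = ρ^|i - j|`, and return it.
`n` is the cluster size and `ρ` the correlation parameter. `σ2` (noise parameter) is not used
inside the function; the covariance is formed afterwards as `Γ = σ2 * V`.
"""
function get_AR_cov(n, ρ, σ2, V)
    @inbounds for i in 1:n
        V[i, i] = 1.0
        for j in i+1:n
            V[i, j] = ρ^(j - i)  # correlation decays geometrically with the lag
            V[j, i] = V[i, j]    # fill the lower triangle by symmetry
        end
    end
    return V
end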

V = get_AR_cov(n, ρ, σ2, V)

# true Gamma
Γ = σ2 * V

50×50 Array{Float64,2}:
 0.1          0.09         0.081        …  0.000636269  0.000572642
 0.09         0.1          0.09            0.000706965  0.000636269
 0.081        0.09         0.1             0.000785517  0.000706965
 0.0729       0.081        0.09            0.000872796  0.000785517
 0.06561      0.0729       0.081           0.000969774  0.000872796
 0.059049     0.06561      0.0729       …  0.00107753   0.000969774
 0.0531441    0.059049     0.06561         0.00119725   0.00107753
 0.0478297    0.0531441    0.059049        0.00133028   0.00119725
 0.0430467    0.0478297    0.0531441       0.00147809   0.00133028
 0.038742     0.0430467    0.0478297       0.00164232   0.00147809
 0.0348678    0.038742     0.0430467    …  0.0018248    0.00164232
 0.0313811    0.0348678    0.038742        0.00202756   0.0018248
 0.028243     0.0313811    0.0348678       0.00225284   0.00202756
 ⋮                                      ⋱               
 0.0018248    0.00202756   0.00225284      

In [2]:
vecd = [dist(mean) for i in 1:n]
nonmixed_multivariate_dist = NonMixedMultivariateDistribution(vecd, Γ)

Y_Nsample = simulate_nobs_independent_vectors(nonmixed_multivariate_dist, N)

10000-element Array{Array{Float64,1},1}:
 [7.0, 8.0, 7.0, 4.0, 8.0, 2.0, 6.0, 6.0, 6.0, 7.0  …  5.0, 3.0, 4.0, 6.0, 4.0, 5.0, 5.0, 6.0, 4.0, 7.0]
 [3.0, 7.0, 6.0, 7.0, 6.0, 6.0, 3.0, 3.0, 8.0, 5.0  …  3.0, 4.0, 5.0, 8.0, 2.0, 5.0, 4.0, 4.0, 8.0, 6.0]
 [6.0, 3.0, 3.0, 7.0, 4.0, 3.0, 6.0, 2.0, 1.0, 8.0  …  2.0, 4.0, 3.0, 4.0, 6.0, 3.0, 4.0, 4.0, 3.0, 5.0]
 [9.0, 4.0, 5.0, 3.0, 5.0, 5.0, 4.0, 12.0, 4.0, 5.0  …  4.0, 8.0, 7.0, 9.0, 5.0, 3.0, 8.0, 5.0, 4.0, 4.0]
 [5.0, 7.0, 4.0, 7.0, 4.0, 1.0, 7.0, 3.0, 5.0, 4.0  …  9.0, 7.0, 3.0, 1.0, 3.0, 5.0, 3.0, 6.0, 5.0, 5.0]
 [6.0, 6.0, 10.0, 2.0, 4.0, 2.0, 6.0, 7.0, 1.0, 5.0  …  2.0, 6.0, 3.0, 6.0, 3.0, 6.0, 4.0, 4.0, 4.0, 5.0]
 [5.0, 4.0, 6.0, 3.0, 4.0, 6.0, 0.0, 3.0, 3.0, 3.0  …  4.0, 6.0, 3.0, 3.0, 5.0, 1.0, 4.0, 1.0, 5.0, 10.0]
 [6.0, 4.0, 5.0, 7.0, 5.0, 8.0, 5.0, 3.0, 10.0, 1.0  …  5.0, 4.0, 5.0, 3.0, 4.0, 6.0, 5.0, 4.0, 7.0, 2.0]
 [4.0, 5.0, 8.0, 5.0, 5.0, 4.0, 7.0, 3.0, 3.0, 6.0  …  5.0, 2.0, 4.0, 7.0, 6.0, 5.0, 6.0, 5.0, 7.0, 4.0]
 [4.0, 6.0

In [3]:
Random.seed!(1234)

d = Poisson()
link = LogLink()
D = typeof(d)
Link = typeof(link)
T = Float64
gcs = Vector{GLMCopulaARObs{T, D, Link}}(undef, N)

for i in 1:N
    y = Float64.(Y_Nsample[i])  # responses for cluster i
    X = ones(n, 1)              # intercept-only design matrix
    gcs[i] = GLMCopulaARObs(y, X, d, link)
end

gcm = GLMCopulaARModel(gcs);

In [4]:
initialize_model!(gcm)
@show gcm.β
@show exp.(gcm.β)
@show gcm.ρ
@show gcm.σ2;

initializing β using Newton's Algorithm under Independence Assumption
1 0.0 -1.1094692620433543e6 39999
2 -1.1094692620433543e6 -1.1094692620433543e6 39999
gcm.β = [1.6135629925799966]
exp.(gcm.β) = [5.020668000000001]
gcm.ρ = [1.0]
gcm.σ2 = [0.06614485202244537]


In [5]:
Y_1 = [Y_Nsample[i][1] for i in 1:N]
Y_2 = [Y_Nsample[i][2] for i in 1:N]

update_rho!(gcm, Y_1, Y_2)
@show exp.(gcm.β)
@show gcm.ρ
@show gcm.σ2;

exp.(gcm.β) = [5.020668000000001]
gcm.ρ = [0.6837238623452251]
gcm.σ2 = [0.06614485202244537]
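
To illustrate the idea of a moment-based starting value for rho, here is a minimal sketch on synthetic data. It assumes the estimate is the lag-1 sample correlation between the first and second observation across clusters. The helper `lag1_correlation` is hypothetical and not part of GLMCopula, and the package's `update_rho!` may use a different formula (for example one that also involves sigma2).

```julia
using Statistics, Random

# Hypothetical helper (not the GLMCopula API): sample correlation between
# the first and second observation of each cluster.
function lag1_correlation(y1::AbstractVector, y2::AbstractVector)
    return cov(y1, y2) / (std(y1) * std(y2))
end

# Synthetic Gaussian pairs with a known lag-1 correlation of 0.9
Random.seed!(1)
z  = randn(10_000)
y1 = z
y2 = 0.9 .* z .+ sqrt(1 - 0.9^2) .* randn(10_000)

ρ_init = lag1_correlation(y1, y2)
@show ρ_init
```

A starting value of this kind only needs to land in the right region; the optimizer refines it afterwards.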


In [6]:
loglikelihood!(gcm, true, true)

gcm1 = deepcopy(gcm);
gcm2 = deepcopy(gcm);

# Quasi-Newton w/ adaptive mu

## Number of Iterations....: 20

## 7.8 seconds

In [7]:
@time GLMCopula.fit!(gcm1, IpoptSolver(print_level = 5, max_iter = 100, tol = 10^-6, mu_strategy = "adaptive", mu_oracle = "loqo", hessian_approximation = "limited-memory"))

gcm.θ = [1.6135629925799966, 0.6837238623452251, 0.06614485202244537]
gcm.θ = [1.6135629925799966, 0.6837238623452251, 0.06614485202244537]

******************************************************************************
This program contains Ipopt, a library for large-scale nonlinear optimization.
 Ipopt is released as open source code under the Eclipse Public License (EPL).
         For more information visit https://github.com/coin-or/Ipopt
******************************************************************************

This is Ipopt version 3.13.4, running with linear solver mumps.
NOTE: Other linear solvers might be more efficient (see Ipopt documentation).

Number of nonzeros in equality constraint Jacobian...:        0
Number of nonzeros in inequality constraint Jacobian.:        0
Number of nonzeros in Lagrangian Hessian.............:        0

Total number of variables............................:        3
                     variables with only lower bounds:        1
         

-1.1087314092415082e6

# Newton w/ adaptive mu
## Number of Iterations....: 12

## 7.38 seconds

In [8]:
@time GLMCopula.fit!(gcm2, IpoptSolver(print_level = 5, max_iter = 100, tol = 10^-6, mu_strategy = "adaptive", mu_oracle = "loqo", hessian_approximation = "exact"))

gcm.θ = [1.6135629925799966, 0.6837238623452251, 0.06614485202244537]
gcm.θ = [1.6135629925799966, 0.6837238623452251, 0.06614485202244537]
This is Ipopt version 3.13.4, running with linear solver mumps.
NOTE: Other linear solvers might be more efficient (see Ipopt documentation).

Number of nonzeros in equality constraint Jacobian...:        0
Number of nonzeros in inequality constraint Jacobian.:        0
Number of nonzeros in Lagrangian Hessian.............:        5

Total number of variables............................:        3
                     variables with only lower bounds:        1
                variables with lower and upper bounds:        1
                     variables with only upper bounds:        0
Total number of equality constraints.................:        0
Total number of inequality constraints...............:        0
        inequality constraints with only lower bounds:        0
   inequality constraints with lower and upper bounds:        0
        ineq

-1.1087314092415036e6

In [9]:
@show gcm1.∇θ   # default quasi newton
@show gcm2.∇θ;  # newton default adaptive

gcm1.∇θ = [3.1941392375500754e-5, -1.1120928782304418e-5, -1.853362216586163e-6]
gcm2.∇θ = [-4.452884005345936e-5, 1.7802322949123095e-7, 1.8150069429623272e-6]


In [10]:
@show gcm1.θ  # default quasi newton
@show gcm2.θ; # newton default adaptive

gcm1.θ = [1.610834973594887, 0.8961115503514183, 0.09889395343093092]
gcm2.θ = [1.6108349736379017, 0.8961115501575265, 0.09889395333350837]


In [11]:
@show loglikelihood!(gcm1, true, true)  # default quasi newton
@show loglikelihood!(gcm2, true, true); # newton default adaptive

loglikelihood!(gcm1, true, true) = -1.1087314092415082e6
loglikelihood!(gcm2, true, true) = -1.1087314092415036e6
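
As a quick check on the results, the sketch below compares the two fitted parameter vectors with each other and with the simulation truth. The estimates are copied from the printed output above. The variable names and the ordering θ = [β, ρ, σ2] with β = log(μ) under the log link are assumptions read off that output, not part of the package.

```julia
# Estimates copied from the printed output of gcm1.θ and gcm2.θ
θ_qn     = [1.610834973594887, 0.8961115503514183, 0.09889395343093092]   # quasi-Newton
θ_newton = [1.6108349736379017, 0.8961115501575265, 0.09889395333350837]  # Newton

# Simulation truth on the same scale: β = log(5), ρ = 0.9, σ2 = 0.1
θ_true = [log(5), 0.9, 0.1]

# Largest disagreement between the two solvers
max_solver_gap = maximum(abs.(θ_qn .- θ_newton))

# Absolute error of the Newton estimate for each parameter
abs_error = abs.(θ_newton .- θ_true)

@show max_solver_gap
@show abs_error
```

Both solvers reach essentially the same point, and each parameter lands close to the value used in the simulation.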
