Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Problems with the VE-Step optimization #100

Open
khaledbouguila opened this issue Jul 19, 2023 · 2 comments
Open

Problems with the VE-Step optimization #100

khaledbouguila opened this issue Jul 19, 2023 · 2 comments

Comments

@khaledbouguila
Copy link

khaledbouguila commented Jul 19, 2023

Two problems related to the VE-Step optimization:

  • The VE-Step optimization on $(m_1, s_1)$ and $(m_2, s_2)$ individually produces different results compared to the simultaneous optimization on $((m_1,s_1), (m_2, s_2))$. In theory, this should not be the case as we are optimizing $\mathrm{J}(M, S) = \sum_i \mathrm{g}(m_i, s_i)$. Is this behavior to be expected in terms of the optimization algorithm?
  • When initialized with self$var_par$M and self$var_par$S, which are the optimum parameters found by the VEM optimization, the VE-Step optimization changes their values and finds new optimums $M$ and $S$ (the other parameters $B$ and $\Omega$ are fixed).

This is an example on trichoptera data that illustrates the problems:

library("PLNmodels")

# choose data
data("trichoptera")

# Prepare data 
PLNdata = prepare_data(trichoptera$Abundance, trichoptera$Covariate)

# Data dimensions
n = dim(PLNdata$Abundance)[1]
p = dim(PLNdata$Abundance)[2]

# fitting data to a PLN model
PLNmodel = PLN(Abundance ~ 1, data = PLNdata)

# VE (all M_i at once):
#######################
# New data
ind = 1:n # choose from train data
Newdata = PLNdata$Abundance[ind,]

# New data size
n.new = dim(Newdata)[1]

# Parameters for the VE-Step
covariates = matrix(1, n.new, 1) # intercept
offsets = matrix(0, n.new, p) # no offsets
responses = matrix(Newdata, nrow = n.new)
weights = rep(1, n.new)

B = PLNmodel$model_par$B # Fixed
Omega = PLNmodel$model_par$Omega # Fixed
control = PLN_param(backend = "nlopt",
                    config_optim = list(maxeval=1e4, ftol_rel=1e-8, xtol_rel=1e-6))

# Initialize with the optimal M, S
M_init = matrix(PLNmodel$var_par$M[ind,], nrow = n.new)
S_init = matrix(PLNmodel$var_par$S[ind,], nrow = n.new)
# M_init = matrix(0, n.new, p)
# S_init = matrix(0.1, n.new, p)

# The VE-Step
args <- list(data = list(Y = responses, X = covariates, O = offsets, w = weights),
             params = list(M = M_init, S = S_init), 
             B = as.matrix(B),
             Omega = as.matrix(Omega),
             config = control$config_optim)
VE = do.call(PLNmodels:::nlopt_optimize_vestep, args)

# Result: the parameters M, S change
print(sum(abs(VE$M - M_init))) ## Should be zero
print(sum(abs(VE$S - S_init))) ## Should be zero
print(sum(abs(VE$Ji - PLNmodel$loglik_vec[ind,]))) ## Should be zero

# VE (for each M_i):
####################
VE_for = list(M = array(dim = c(n, p)),
              S = array(dim = c(n, p)),
              Ji = c())
for(i in ind){
  # New data
  Newdata = PLNdata$Abundance[i,]
  
  # Parameters for the VE-Step
  covariates = matrix(1, 1, 1) # intercept
  offsets = matrix(0, 1, p) # no offsets
  responses = matrix(Newdata, nrow = 1)
  weights = 1

  # Initialization with training data
  M_init_i = matrix(PLNmodel$var_par$M[i,], nrow = 1)
  S_init_i = matrix(PLNmodel$var_par$S[i,], nrow = 1)
  # M_init_i = matrix(0, 1, p)
  # S_init_i = matrix(0.1, 1, p)
  
  # The VE-Step
  args <- list(data = list(Y = responses, X = covariates, O = offsets, w = weights),
               params = list(M = M_init_i, S = S_init_i), 
               B = as.matrix(B),
               Omega = as.matrix(Omega),
               config = control$config_optim)
  VE_ = do.call(PLNmodels:::nlopt_optimize_vestep, args)
  
  # Stock results
  VE_for$M[i, ] = VE_$M
  VE_for$S[i, ] = VE_$S
  VE_for$Ji[i] = VE_$Ji
}

# Different results obtained (VE for all M_i at once - VE for each M_i):
sum(abs(VE_for$M[ind,] - VE$M)) ## Should be zero
sum(abs(VE_for$S[ind,] - VE$S)) ## Should be zero
sum(abs(VE_for$Ji[ind] - VE$Ji)) ## Should be zero

@jchiquet
Copy link
Member

jchiquet commented Jul 19, 2023

Thanks for the report.

I am on it, and it seems that there is a sign error in the objective of the VE-step

double objective = accu(w.t() * (A - Y % Z - 0.5 * log(S2))) - 0.5 * trace(Omega * nSigma);

Since we are optimizing something related to the negative log-likelihood, the trace term should be

   + 0.5 * trace(Omega * nSigma);

right?

If anybody e.g. (@mahendra-mariadassou) can double check/confirm this...

  • Ok, I double-checked myself ;-)

jchiquet added a commit that referenced this issue Jul 19, 2023
@jchiquet
Copy link
Member

Ok, now the following piece of code

library("PLNmodels")

# choose data
n <- 100
p <- 2
counts <- matrix(rpois(n*p, c(5,11)), n, p)
covariates <- matrix(1, n, 1)

# Fit PLN on data
data  <- prepare_data(counts, covariates)
model <- PLN(Abundance ~ 1 + offset(log(Offset)), data = data)

# take training data
# get parameters for the VE-step
new_n <- n
ind <- 1:new_n
new_data <- data[ind, , drop=FALSE]
new_responses  <- new_data$Abundance
new_covariates <- matrix(new_data$V1, ncol=1)
new_offsets    <- matrix(rep(log(new_data$Offset), p), ncol = p)
new_weights <- rep(1, new_n)

B <- model$model_par$B
Omega <- model$model_par$Omega
M_init <- model$var_par$M[ind, , drop=FALSE]
S_init <- model$var_par$S[ind, , drop=FALSE]

M_init <- matrix(0  , new_n, p)
S_init <- matrix(0.1, new_n, p)

args <- list(data = list(Y = new_responses, X = new_covariates, O = new_offsets, w = new_weights),
             ## Initialize the variational parameters with the new dimension of the data
             params = list(M = M_init, S = S_init),
             B = as.matrix(B),
             Omega = as.matrix(Omega) ,
             config = PLN_param()$config_optim)
VE <- do.call(PLNmodels:::nlopt_optimize_vestep, args)

mse <- function(a, b) sum((a-b)^2)
print(mse(VE$S**2, S_init**2))
print(mse(VE$M, M_init))
print(mse(VE$Ji, model$loglik_vec[ind]))

is returning, as expected

[1] 0.01935975
[1] 1.701245e-05
[1] 4.389937e-10

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants