{{ message }}

Problems with convergence of hierarchical models.#281

Closed
opened this issue May 7, 2019 · 5 comments
Closed

Problems with convergence of hierarchical models.#281

opened this issue May 7, 2019 · 5 comments urchgene commented May 7, 2019 • edited

 Hi @goldingn, I saw that you had a discussion about this exact issue in here [https://github.com//issues/107] but I am still having same issues. First of all I am modeling a cov matrix using: Omega1 = diag(sig_vec1) %*% LKJ(n) %*% diag(sig_vec1) Omega2 = diag(sig_vec2) %*% LKJ(n) %*% diag(sig_vec2) Then I found your post trying to avoid cholesky for a mvNormal using this cov matrix by doing: Omega_U <- greta:::as.greta_array(Omega\$node\$representations\$cholesky_factor) U <- sweep(Omega_U, 2, sigma, '*') I use Omega1 to draw samples from mvNormal as a conditional for mu as: mu = XB + z %*% Omega1 where z = N(0,1). To complicate mine, i have different mu for my mvNormal so I tried to write a function to get the density of observations given Omega using the re-parameterization trick: logdet <- function (cholx){ ll <- 2 * sum(log(diag(cholx))) ll } ### Sigma here should be chol(Sigma) dmvn <- function(x, mu, Sigma){ k <- nrow(Sigma) Omega <- chol2inv(Sigma) ss <- x - mu z <- rowSums({ss %*% Omega} * ss) #zz <- c() #for (i in 1:nrow(x)){ zz[i] <- sum(z[i,])} dens <- -0.5 * (k * log(2 * pi) + logdet(Sigma)) - (0.5 * z) return(dens) } Finally, I try to pass this to greta distribution as: # get Likelihood and logp from Greta distribution(Y) <- dmvn(Y, mu, Omega2) But I get the error message: Error: cannot convert objects with missing or infinite values to greta_arrays. I know for sure I am doing many things wrong here since the parameters cannot converge! My questions will be: How can I recover the cholesky_factor for the LKJ in greta because If I follow what is above I have an error saying the greta_array is an atomic vector. Some of the parameters in Omega2 converge but NONE converged for Omega1 if I do distribution(Y) <- multivariate_normal(mu, Rcov, n_realisations=nrow(mu)) instead of using dmvn function I have . This is the reason I am trying to use the function to avoid the problem. Your guidance and answers will be very valuable. Thanks. The text was updated successfully, but these errors were encountered: goldingn commented May 8, 2019 • edited

 The error is unsurprising, as the object on the right hand side of distribution<- needs to be a greta array with a probability distribution, not an R function. So you either need to use greta's existing probability distribution constructors (like multivariate_normal()) or write your own distribution R6 class, and function to create a variable greta array following that distribution. You can see examples of how these are written here and here, but this is not intended for users, is not documented, requires writing tensorflow code, and is not at all easy to do. I'm not sure you need to do that at all though. If you observe the variable (Y) that follows a multivariate normal distribution, then there is no need to a decentred parameterisation. That parameterisation is to avoid awkward shapes in the posterior that arise from both the variable and its covariance being unknown. If you know one of them, then it isn't a problem. There is still a computational benefit (and reduction in numerical issues) to avoiding an extra Cholesky decomposition though, so I'm assuming you want to work with the Cholesky factor for that reason. If not (and there's no point playing with Cholesky factors unless Omega2 is very large), then I would suggest you just use distribution(Y) <- multivariate_normal(mu, Omega2). If you really do need to skip the additional Cholesky factorisation that multivariate_normal() would do in that case, then you can adapt the following code. The trick is to get the Cholesky factor for the correlation matrix, do the operations to yield the Cholesky factor for the covariance matrix, and then convert the Cholesky factor for the correlation matrix back to the symmetric matrix, so it can be passed into multivariate_normal(). If the covariance matrix passed into multivariate_normal() has a Cholesky factor representation (and chol_to_symmetric makes sure it does), then it uses that in exactly the way you describe above, avoiding the need to do another Cholesky factorisation. # get some internal greta functions as.greta_array <- .internals\$greta_arrays\$as.greta_array get_node <- .internals\$greta_arrays\$get_node chol_to_symmetric <- .internals\$utils\$greta_array_operations\$chol_to_symmetric # fake data n_dim <- 5 n_obs <- 10 Y <- matrix(rnorm(n_obs * n_dim), n_obs, n_dim) # parameters sds <- normal(0, 1, dim = n_dim, truncation = c(0, Inf)) correl <- lkj_correlation(dimension = n_dim, eta = 1) mu <- zeros(n_obs, n_dim) # hack to find the cholesky factor representation correl_U <- as.greta_array(get_node(correl)\$representations\$cholesky) # convert to covariance on the cholesky scale Sigma_U <- sweep(correl_U, 2, sds, '*') # convert back from cholesky representation to a symmetric matrix Sigma <- chol_to_symmetric(Sigma_U) # use in likelihood distribution(Y) <- multivariate_normal(mu, Sigma) This obviously uses some internal functions. Ideally we could use the representations approach to hide this computational trick from the user. That would probably require us to add quad_form() and quad_form_diag() functions to greta, like those in Stan. That way we'll be able to detect that we can propagate Cholesky factors here (whereas with %*%, we wouldn't know the user's intent). urchgene commented May 8, 2019

 Thanks a lot @goldingn ... This is very helpful!! urchgene commented May 8, 2019 • edited

 Hi @goldingn, one more question... How do I obtain the inverse of this Sigma_U matrix # convert to covariance on the cholesky scale Sigma_U <- sweep(correl_U, 2, sds, '*') When I try chol2inv(), I get an error downstream that it cannot be converted to greta_array cos of infinite values. I am trying to use it in an operation to simulate from MvNormal using the precision matrix where: Sigma_A = kronecker( inv(Sigma_U), Ainv) Sigma_A is a very large dense matrix and I do not want to use the multivariate_normal () for it. Thanks. goldingn commented May 8, 2019

 That sounds like a separate issue. Could you please open an new issue and provide a reproducible example so I can recreate the error. urchgene commented May 9, 2019

 Hi @goldingn, I tested and I think it's not a general issue. I solved it by using solve(chol_to_symmetric(Sigma_U)) For a fake data example I tested chol2inv(sweep(correl_U, 2, sds, '*')) works but doesn't for my real data so I won't regard it as an issue. Thanks again.