Covariate-dependent latent variables #31
Hei @aminorberg
In personal communication, @ovaskain said that they encountered multiple challenges with this class of models, both in sensitivity to prior assumptions and in exploring the posterior typical set with the available MCMC algorithms. |
Thanks @gtikhonov! By "working example" I mean e.g. an example using the data provided with the package, since I can't even get that working, and I think it's just that I don't know the syntax. This is what I tried:

studyDesign <- data.frame(sample = as.factor(1:50),
                          plot = as.factor(sample(1:20, 50, replace = TRUE)))
rL2 <- HmscRandomLevel(units = TD$studyDesign$plot)
rL1 <- HmscRandomLevel(xData = data.frame(x1 = rep(1, length(TD$X$x1)),
                                          x2 = TD$X$x2))
m <- Hmsc(Y = TD$Y, XData = TD$X,
          XFormula = ~x1+x2,
          studyDesign = studyDesign,
          ranLevels = list("sample" = rL1, "plot" = rL2))
ps <- sampleMcmc(m, samples = 1000)

When attempting to run sampleMcmc, I get an error:

Error in (Eta[[r]][Pi[, r], ] * rL[[r]]$x[as.character(dfPi[, r]), k]) %*% :
  non-conformable arguments
In addition: Warning message:
In Ops.factor(Eta[[r]][Pi[, r], ], rL[[r]]$x[as.character(dfPi[, :
  ‘*’ not meaningful for factors |
Looks like a bug. Will have a close look at this. |
Looks like this comes from

for(r in seq_len(nr)){
   if(rL[[r]]$xDim == 0){
      ...
   } else{
      LRan[[r]] = matrix(0,ny,ns)
      for(k in 1:rL[[r]]$xDim)
         LRan[[r]] = LRan[[r]] + (Eta[[r]][Pi[,r],]*rL[[r]]$x[as.character(dfPi[,r]),k]) %*% Lambda[[r]][,,k]
   }
}

and basically the error is caused by the statement in the innermost loop. Here

rL[[r]]$x[as.character(dfPi[,r]),k]
 [1] o o o o o o o o o o o o o o o o o o o o o o o o o c c c c c c c c c c c c c
[39] c c c c c c c c c c c c
Levels: c o

is a factor, which is not meaningful in multiplication (as the warning said). When it is used within Eta[[r]][Pi[, r], ] * rL[[r]]$x[as.character(dfPi[,r]),k], the result drops the dimensions (50,2) and returns a vector of 100 elements. |
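The failure described above can be reproduced in base R without Hmsc. This is only a minimal sketch: Eta, Lambda_k and x_factor are made-up stand-ins with the dimensions mentioned in the thread (50 sampling units, 2 latent factors), and the number of species (4) is an arbitrary assumption.

```r
set.seed(1)
ny <- 50; nf <- 2; ns <- 4                  # ns = 4 is an assumed value
Eta      <- matrix(rnorm(ny * nf), ny, nf)  # stand-in for Eta[[r]][Pi[, r], ]
Lambda_k <- matrix(rnorm(nf * ns), nf, ns)  # stand-in for Lambda[[r]][, , k]
x_factor <- factor(rep(c("o", "c"), c(25, 25)))

# Multiplying a matrix by a factor triggers the Ops.factor warning and
# returns a plain NA vector, so the (50, 2) dimensions are lost.
res <- suppressWarnings(Eta * x_factor)
is.null(dim(res))
length(res)

# The dimensionless vector can no longer be multiplied with a 2 x ns matrix.
ok <- tryCatch({ res %*% Lambda_k; TRUE }, error = function(e) FALSE)

# With a numeric contrast column taken from the model matrix, the same
# expression keeps its shape and the product is ny x ns.
x_num <- unname(model.matrix(~ x_factor)[, 2])
good  <- (Eta * x_num) %*% Lambda_k
dim(good)
```

The numeric column comes from model.matrix rather than from as.numeric(x_factor), so the coding is an explicit contrast and not the internal integer codes of the factor.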
Thanks, @jarioksa, I was going to mark the same thing. We have not yet implemented the
|
@gtikhonov : It works in this case where model.matrix(reformulate(names(TD$X[, -1])), TD$X)
|
Hi, I have also been exploring this recently. Despite the minor bugs, the models seem to fit. Regarding how to compute the Lambdas per site as a function of the covariates, I have tried (continuing from Gleb's working example above):

#Covariates for plot-level latent variable
xmat = as.matrix(m$rL[[1]]$x)
dim(xmat)
#Residual correlations given observed covariates
OmegaCor = getPostEstimate(ps, "OmegaCor", r=1, x=xmat)$mean
OmegaCor
#Residual correlations given observed covariates for first site
OmegaCor = getPostEstimate(ps, "OmegaCor", r=1, x=xmat[1,])$mean
OmegaCor |
@oysteiop, this does not look like a valid approach:
The package |
Thanks for spotting this typo. I have edited my comment above to use your much simpler solution. |
I think you are on a dangerous path: these tricks work when you study a case with a two-level factor, where the internal numeric coding gives the correct contrast. However, this approach fails when you have a factor with three levels, where the internal numeric coding just changes the factor into an arbitrary continuous variable. Most dangerously, it will fail silently: no error messages, but wrong results. You must study the model matrix, which changes a three-level factor into two contrast variables (and a p-level factor into p-1 contrast variables). Solving a marginal case will bring trouble and grievance later in life. |
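The difference between the internal numeric coding and the model matrix can be checked in base R. This is a small illustration with made-up factors f2 and f3 (hypothetical names, not objects from the package data):

```r
# Two-level factor: the internal codes minus one coincide with the
# treatment contrast column of the model matrix.
f2  <- factor(c("c", "o", "o", "c"))
mm2 <- model.matrix(~ f2)
two_level_agrees <- all(as.numeric(f2) - 1 == mm2[, 2])

# Three-level factor: as.numeric() yields one pseudo-continuous variable,
# while the model matrix yields two separate contrast columns.
f3    <- factor(c("a", "b", "c", "a", "b", "c"))
codes <- as.numeric(f3)
mm3   <- model.matrix(~ f3)
colnames(mm3)
```

With the default treatment contrasts, the three-level factor expands to an intercept plus the two columns f3b and f3c, whereas the integer codes would silently impose equal spacing between the levels a, b and c.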
Hmm, @gtikhonov, running your example results in a new error: |
@jarioksa, I am not sure whom you were addressing, but here are some thoughts of mine on this aspect. To my mind, any of the package's internal functionality for simplifying the user's life when specifying the covariate matrix (e.g. the XData + XFormula way) is potentially unsafe, since a user can easily define something very different from what he/she intends without noticing it. However, I also fully recognise that it is often much more handy this way, and Otso had some ideas on using this formula-based approach for more informative postprocessing (not implemented yet). |
@aminorberg, is the problem still current? As far as I understood from the responses by other collaborators on this branch, that example code was executing without issues for them... |
@gtikhonov I reinstalled Hmsc (master version), and now your example works without errors. So only the CRAN version gives the error. Thank you! |
@gtikhonov my problems with cov-dep latent variables continued as I tried to compute predicted values:

cv_partition <- createPartition(ps, nfolds = 2, column = "sample")
cv_preds <- computePredictedValues(ps, partition = cv_partition, expected = FALSE)
Error in LambdaPostMean[k, ] : incorrect number of dimensions

Related to the |
Immediate reason for this is that in your example, internal item |
@gtikhonov Is there any chance that the predict function would be updated for cov-dep latent variables some time soon? |
Hi @aminorberg - I've been following this discussion and also have a few projects where I'm hoping to use the covariate-dependent association features. Am I correct in understanding that covariate(s) must be categorical? |
Hi, I'm still struggling to get the covariate-dependent functionality to work.

Error in dimnames(x) <- dn :

This seems to be because, in the postList object created by poolMcmcChains in computeAssociations, the element of Lambda that corresponds to the covariate-dependent random level (rL1) has two species * species matrices (see also @jarioksa's comment on 9th Dec 2019):

postList[[1]]$Lambda
[[1]]
[,1] [,2] [,3] [,4]
, , 2
[,1] [,2] [,3] [,4]
[[2]]

I can update computeAssociations to work around this, but I'm not certain how to use these two matrices to understand how the species associations are affected by the environmental variable. Am I right in thinking that the first matrix is the species associations independent of the environmental variable and the second matrix is the species associations dependent on the selected environmental variable? Presumably this is explained in the Matlab code for the 2017 paper, but that's not accessible to me.

convertToCodaObject, the predict functions, etc. also fail for the same reason, I think. Again, I can fix this for my data, but I would need to understand what the two Lambda matrices are. Thanks for any light you're able to shed! |
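One possible reading of the two slices follows from the loop quoted earlier in this thread, LRan[[r]] + (Eta * x[, k]) %*% Lambda[[r]][,,k]. Judging by that loop, each slice Lambda[,,k] is latent factors by species, and the loading that applies to a unit is the covariate-weighted sum of the slices. The sketch below uses made-up arrays, and lambda_at is a hypothetical helper, not a package function. Computing the association matrix as the cross-product of the combined loadings is an assumption here, not something confirmed in the thread.

```r
set.seed(42)
nf <- 2; ns <- 4; ny <- 5                              # made-up dimensions
Lambda <- array(rnorm(nf * ns * 2), dim = c(nf, ns, 2))
Eta    <- matrix(rnorm(ny * nf), ny, nf)
x      <- cbind(x1 = 1, x2 = c(0, 1, 0, 1, 0))         # intercept + one covariate

# Random-effect term accumulated slice by slice, as in the quoted loop.
LRan <- matrix(0, ny, ns)
for (k in seq_len(dim(Lambda)[3]))
  LRan <- LRan + (Eta * x[, k]) %*% Lambda[, , k]

# Hypothetical helper: loadings for one unit with covariate row xrow.
lambda_at <- function(xrow) {
  L <- matrix(0, nf, ns)
  for (k in seq_along(xrow)) L <- L + xrow[k] * Lambda[, , k]
  L
}

# Same term computed unit by unit from the combined loadings.
LRan2 <- t(sapply(seq_len(ny), function(i) Eta[i, ] %*% lambda_at(x[i, ])))
same  <- isTRUE(all.equal(LRan, LRan2, check.attributes = FALSE))

# Assumed association matrices (species x species) at x2 = 0 and x2 = 1.
Omega0 <- crossprod(lambda_at(c(1, 0)))
Omega1 <- crossprod(lambda_at(c(1, 1)))
```

Under this reading, with x1 fixed at 1 the first slice gives the loadings at x2 = 0 and the second slice gives the change in loadings per unit of x2, so neither slice on its own is a species-by-species association matrix.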
Would any of the developers happen to have a full working example of the usage of the covariate-dependent latent variables?