Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sporadic error during fitting or cross-validation in: sampleMcmc -> alignPosterior -> abind #27

Closed
rburner opened this issue Oct 23, 2019 · 20 comments
Labels
bug Something isn't working

Comments

@rburner
Copy link

rburner commented Oct 23, 2019

Hi. When fitting several lists of 3-4 models I've been getting the following error, but not every time.

For example, when I run fit the models using short chains (e.g. 50 samples with thin = 3) as a test, all three models will fit and cross-validate successfully.

But, when I try with more samples (using the mcmc parameters as below) I eventually get the error I show below. Sometimes the error comes up during model fitting, other times during cross-validation. Sometime on the first model in the list, sometimes the second.

The models all have 100 sites, 100 species, 5 continuous environmental covariates, 3 trait covariates (one of which is a 3-level factor), and a phylogeny. Random effects include site (non-spatial), year, and project.

#read in models
redHPC1=readRDS("t02ns.rds")

#make list for cross validation results
crossFit=list("a")

#set fitting parameters
nChains = 2
thin = 5
samples = 1000
transient = 400thin
verbose = 150
thin

for (i in 1:length(redHPC1)) {
#run the model in parallel
redHPC1[[i]] = sampleMcmc(redHPC1[[i]], thin = thin, samples = samples,
transient = transient, nChains = nChains, nParallel = 2,
verbose = verbose, initPar = "fixed effects")

#save results
saveRDS(redHPC1, file = "t02nsFIT.rds")

#do cross-validation
partition = createPartition(redHPC1[[i]], nfolds = 2, column = "Site")
preds = computePredictedValues(redHPC1[[i]],partition=partition,nParallel = 2)
crossFit[[i]] = evaluateModelFit(hM=redHPC1[[i]], predY=preds)

#save cross-validation results
saveRDS(crossFit, file = "t02nsCROSS.rds")
}`

[1] "Setting updater$Gamma2=FALSE due to specified phylogeny matrix"
Error in abind(cpL[[j]]$Delta[[r]], array(1, DeltaAddDim), along = 1) :
arg 'X2' has dims=1, 2; but need dims=X, 1
Calls: sampleMcmc -> alignPosterior -> abind

@gtikhonov
Copy link
Member

I think I fixed this issue just an hour ago - was some unclearly-sourced bug happening at the stage of aligning the number of latent factors on different chains after the HMSC model was fitted (commit 45f6071). Please try again now.

@rburner
Copy link
Author

rburner commented Oct 23, 2019

@gtikhonov Great, thank you! I'll retry

@rburner
Copy link
Author

rburner commented Oct 23, 2019

@gtikhonov I was about to ask about this issue too, but maybe it got fixed by the same modifications?

Same pattern, except this error shows up when I'm using a spatial random effect. Models again run fine with short chains, and sometimes fit and cross-validate with long chains, but eventually I get this error (either at fitting or cross val):

Error in if (alphapw[alpha[h], 1] > 0) { : argument is of length zero
Calls: computePredictedValues -> predict -> predict.Hmsc -> predictLatentFactor

@gtikhonov
Copy link
Member

Yes, this issue can have the same conceptual source - varying number of factors in different chains. Did it happen with recent versions?

@rburner
Copy link
Author

rburner commented Oct 23, 2019

Yes, it happened with an installation from ~16 October

@gtikhonov
Copy link
Member

Well, currently I do not notice any logical flaws which could lead to such situation. Although this does not guarantee that there are none. Just to ensure several things:
0) you are using latest GitHub master branch version of the package

  1. in the call to sampler the aligning argument is set to true: sampleMcmc(..., alignPost=TRUE)
  2. you actually refit the model, not just running post processing using some model fitted with older version of code
  3. if both 1 and 2 are satisfied, try to run alignPosterior(hM) prior to your call to predict(...). If any of the methods throw an error, there is definitely some bug.

You can start with point 3) if refitting your models takes considerable time. If you can replicate this behaviour with any sharable code that I can rerun, that would be helpful.

@rburner
Copy link
Author

rburner commented Oct 23, 2019

Ok, I will work through that list and see what I can figure out. The first thing I notice is that I was using the CRAN version, so I will re-install from github.

I haven't included alignPost=TRUE when using sampleMcmc but can start...but TRUE is maybe the default?

@rburner
Copy link
Author

rburner commented Oct 23, 2019

Ok, I re-installed from github using (correct?):
install_github("hmsc-r/HMSC", build_opts = c("--no-resave-data", "--no-manual"),force=TRUE)

Then I ran the following code and still go the same error. This was on an HPC cluster just fyi.

Maybe I can upload a model object for you to try?
Thanks!

#####################################
R version 3.5.3 (2019-03-11) -- "Great Truth"
Copyright (C) 2019 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

#call packages
library(usdm)
Loading required package: sp
Loading required package: raster
library(Hmsc)
Loading required package: coda
library(MASS)
library(rockchalk)
library(ape)
library(corrplot)
library(MCMCpack)

#set working directory
setwd("/ddnB/work/rburne4/HmscFiles")

#read in models
redHPC1=readRDS("t03.rds")

#make list for cross validation results
crossFit=list("a")

#set fitting parameters
nChains = 2
thin = 5
samples = 1000
transient = 400thin
verbose = 150
thin

for (i in 1:length(redHPC1)) {
#run the model in parallel
redHPC1[[i]] = sampleMcmc(redHPC1[[i]], thin = thin, samples = samples,
transient = transient, nChains = nChains, nParallel = 2,
verbose = verbose, initPar = "fixed effects", alignPost = TRUE)

redHPC1[[i]]=alignPosterior(hM=redHPC1[[i]])

#save results
saveRDS(redHPC1, file = "t03FIT2.rds")

#do cross-validation
partition = createPartition(redHPC1[[i]], nfolds = 2, column = "Site")
preds = computePredictedValues(redHPC1[[i]],partition=partition,nParallel = 2)
crossFit[[i]] = evaluateModelFit(hM=redHPC1[[i]], predY=preds)

#save cross-validation results
saveRDS(crossFit, file = "t03CROSS2.rds")
}
[1] "Setting updater$Gamma2=FALSE due to specified phylogeny matrix"
[1] "Cross-validation, fold 1 out of 2"
[1] "Setting updater$Gamma2=FALSE due to specified phylogeny matrix"
Error in if (alphapw[alpha[h], 1] > 0) { : argument is of length zero
Calls: computePredictedValues -> predict -> predict.Hmsc -> predictLatentFactor
In addition: There were 50 or more warnings (use warnings() to see the first 50)
Execution halted

@gtikhonov gtikhonov added the bug Something isn't working label Oct 24, 2019
@gtikhonov
Copy link
Member

gtikhonov commented Oct 24, 2019

I've fixed the minor bug that caused this misbehaviour. It was just due ti setting some indices to 0 instead of 1. Perhaps I just had too much Python coding recently and it did not catch my eye immediately.
You do probably need to refit the model though, or do some direct manipulation with the

m$postList[[chain_index]][[sample_index]]$Alpha[[spatial_latent_factor_index]]

objects - these are vectors and all values of 0 must be replaced to 1. Then you can use the fitted object in the postprocessing.

@rburner
Copy link
Author

rburner commented Oct 24, 2019

Great, thank you for the quick response! Will give it a try.

@rburner rburner closed this as completed Oct 24, 2019
@plthompson
Copy link

Hi Gleb,

I seem to be getting the same error when running multiple chains using the sampleMcmc function.

Error in abind(cpL[[j]]$Delta[[r]], array(1, DeltaAddDim), along = 1) :
arg 'X2' has dims=1, 2; but need dims=X, 1

I don't get it every time, but it is the majority of times. It seems to be an issue with joining the chains at the end as the chains appear to run fine. I haven't yet had an issue when running a single chain.

Here is my model structure:

m <- Hmsc(Y = Y,
XData = XData,
XFormula = ~ poly(temp, degree = 2, raw = TRUE) * dispersal,
studyDesign = studyDesign,
ranLevels = list(tank = rL, metacommunity = rL_meta),
distr = "poisson")

Temp is continuous.
Dispersal is a three level factor - none, low, high.
Tank as a random factor has 48 levels, metacommunity as a random factor has 12 levels.

I have not encountered this error when I run models with temp or dispersal on their own as fixed effects.

I am running the current GitHub version of Hmsc in R.3.6.1.

@gtikhonov
Copy link
Member

Yes, it is a very annoying problem caused by a small utility that is called from sampleMcmc() just before you get the results back. It is on my to-do list and I will try to fix it as soon as I find enough time.
But you can always disable that - just use sampleMcmc(... , alignPost = TRUE). This should fix the problem.

@plthompson
Copy link

Thanks for the quick reply and solution. I really appreciate the package.

@amybauer
Copy link

amybauer commented Oct 4, 2021

Hi, I just started to explore the package but keep running into the same issue - "Error in if (alphapw[alpha[h], 1] > 0) { : argument is of length zero".
I tried both adding sampleMcmc(... , alignPost = TRUE) and alignPosterior(hM) prior to calling the predict(...), but it did not fix the issue. Maybe I am being blind to something obvious? Any help would be greatly appreciated!

I've been trying to run a relatively basic model for now, with the following setup:

simul <- Hmsc(Y=Y, XData = X,
              XFormula = XFormula,
              # TrData = TrData, 
              # TrFormula = TrFormula , 
              # C = C,
              studyDesign = studyDesign,
              ranLevels=list(ID=rL),
              distr = "probit")  

thin = 10
samples = 1000
nChains = 3
transient = 100

mod_HMSC2 = sampleMcmc(simul,
                      samples = samples,
                      thin = thin,
                      transient = transient,
                      nChains = nChains, 
                      nParallel = 6,
                      verbose = T, 
                      alignPost = TRUE)

and for the prediction:

grid = read.csv("rpt_check.csv")
grid = grid[1:500, ]
xy.grid = as.matrix(cbind(grid$x,grid$y))
XData.grid = data.frame(ndvi_mn_r = grid$ndvi_mn_r, stringsAsFactors = T)

#XData.grid[is.na(XData.grid)] <- 0

m = mod_HMSC2[[1]]
Gradient = prepareGradient(mod_HMSC2, XDataNew = XData.grid,sDataNew = list(ID = xy.grid))

alignPosterior(mod_HMSC2)

#nParallel=7
predY = predict(mod_HMSC2, Gradient = Gradient, predictEtaMean = TRUE)

I got the Hmsc package from github, via
library(devtools)
install_github("hmsc-r/HMSC", ref = "v3.0-2")

and have R 3.6.3 installed.

please let me know if I should provide more information (first time posting something on github), and if you have an idea of how I can fix this.

@rburner
Copy link
Author

rburner commented Oct 4, 2021

Hi Amy,

I'm not sure if this has anything to do with it, but I noticed that the install code refers to v3.0-2, whereas the latest version is 3.0-12.

If you run
packageVersion('Hmsc')
what version does it say you have? If not the latest you could try changing the install code.

Hope that helps!

Ryan

@ovaskain
Copy link
Collaborator

ovaskain commented Oct 5, 2021 via email

@amybauer
Copy link

amybauer commented Oct 5, 2021

Hi Otso,
thanks for getting back to me!
This is how it is set up:

studyDesign = dat[,c("ID")]
studyDesign<-as.data.frame(studyDesign)
names(studyDesign)<-"ID"

xy = as.matrix(cbind(dat$Longitudes,dat$Latitudes))
rownames(xy)=dat$ID
colnames(xy)=c("x-coordinate","y-coordinate")

rL = HmscRandomLevel(sData=xy)
ranlevels = list(ID = rL) 
ranlevels

Weirdly, this did run on another computer that I don't have access to regularly, and ultimately, I need it to run on my computer anyways (at least for the fitting part), so again, any help is greatly appreciated!
The Hmsc package and R versions on both computers are identical.

Thanks,
Amy

@ovaskain
Copy link
Collaborator

ovaskain commented Oct 5, 2021 via email

@amybauer
Copy link

amybauer commented Oct 7, 2021

Hi everyone,

again, thanks so much for trying to help me - I managed to get it to run, though I don't know how exactly.
I just redid my entire code and it worked.
My code was based on the executable script for section 11.1 (available https://www2.helsinki.fi/en/researchgroups/statistical-ecology/hmsc), with some minor additions from https://www.r-bloggers.com/2020/06/guide-to-using-the-hmsc-package-for-the-production-of-joint-species-distribtuion-models/

When I added the trait matrix, it gave me the same error again, but I also was able to work around that:
Some of it seemed due to an error that must have occurred when I saved the matrix in Microsoft excel (on windows 10), when I copied the content and saved it in a new .csv file, it worked.
I also had to add the traits in the trait function in the same order i added them to the TrData - it did not work when I was not maintaining the order.

I am just posting this in case anyone else runs into the same problem and is on the search for possible solutions.
Again, thank you so much for your help and great work with the package!
Amy

@amybauer
Copy link

amybauer commented Oct 10, 2021

EDIT: I added more details to my initial post that may be helpful

Hi everyone,

another update from my end: it looks like I (partially) figured out where the error is sneaking in.
I noticed that whether or not the error appears during the prediction step seems linked to the fitting parameters I choose before running my model, i.e., the number of chains, samples etc. I run the model on.

The model itself runs with any setting I have tried so far, but the prediction does not. The models generally seem to converge nicely, too.

Below, you can see an overview of parameter settings I have fitted my model to, and whether they worked for predictions or not (highlighted in red) - I put them in the same table to hopefully aid with comparison between fittings that were used (un-)successfully. The parameters that I have run a model on and successfully predicted something with vary, but it never works with more than 2 chains.

image

I haven't been able to figure out a pattern behind this.
Any ideas or thoughts regarding a solution to this issue this would be much welcome!
Thanks,
Amy

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

5 participants