Error during conditional cross-validation in a NNGP model #40
Comments
|
I stumbled on this very same error message in another context, and committed a change that fixed this issue (but revealed another). You may try the latest GitHub version to see if this helps in your case. See Hmsc repository front page for instruction for |
|
Hi Jari, many thanks for your reply. I tried again after installing the latest GitHub version and got a different error:
m.spatial = Hmsc(Y = as.matrix(Y),
XData = as.data.frame(XDATA),
XFormula = XFormula,
TrData = TRAITS,
TrFormula = TrFormula,
studyDesign = studyDesign,
ranLevels = list("sample" = rL.spatial),
distr="probit")
m.spatial = sampleMcmc(m.spatial, thin = thin,
samples = samples, transient = transient,
nChains = nChains, verbose = verbose, nParallel=1,
updater=list(GammaEta=FALSE))
partition = createPartition(m.spatial, nfolds = 4, column = "sample")
m.spatial.predsREAL.sp = computePredictedValues(m.spatial, partition=partition, partition.sp=1:ncol(Y), updater=list(GammaEta=FALSE), nParallel=nChains, expected=F)
Cross-validation, fold 1 out of 4
Error: Matrices must have same dimensions in iWs + tmp1
> traceback()
8: stop(gettextf("Matrices must have same dimensions in %s", deparse(sys.call(sys.parent()))),
call. = FALSE, domain = NA)
7: dimCheck(e1, e2)
6: iWs + tmp1
5: iWs + tmp1
4: updateEta(Y = Yc, Z = Z, Beta = sam$Beta, iSigma = 1/sam$sigma,
Eta = Eta, Lambda = sam$Lambda, Alpha = sam$Alpha, rLPar = rLPar,
X = X, Pi = PiNew, dfPi = dfPiNew, rL = rL)
3: predict.Hmsc(hM1, post = postList, X = XVal, XRRR = XRRRVal,
studyDesign = dfPi, Yc = Yc, mcmcStep = mcmcStep, expected = expected)
2: predict(hM1, post = postList, X = XVal, XRRR = XRRRVal, studyDesign = dfPi,
Yc = Yc, mcmcStep = mcmcStep, expected = expected)
1: computePredictedValues(m.spatial, partition = partition, partition.sp = 1:ncol(Y),
updater = list(GammaEta = FALSE), nParallel = nChains, expected = F)
Waiting for your feedback. |
|
Yes, that is the same error that I got with my case after fixing the first problem. I know how this happens and what is the problem. However, I don't know yet how to fix this problem. I'm studying the issue. |
|
I confirm that this is a bug. A reproducible example is:
library(Hmsc)
set.seed(1)
partition <- createPartition(TD$m, nfolds = 2)
predsCV2 <- computePredictedValues(TD$m, partition = partition,
partition.sp = 1:TD$m$ns, mcmcStep = 100)
This is based on ( |
|
Many thanks Jari. I hope this will be fixed in the next release. |
|
@mrkdfb : @gtikhonov made a commit that should fix your issue. Please test. |
|
Hi Jari, now the computePredictedValues function works properly. That said, I got a new error that never happened before. If I construct a gradient like this:
GGradient = constructGradient(m.spatial, focalVariable = "bio_15", ngrid=20)
GGradient $studyDesignNew $rLNew
and then launch the predict function on it, I get the following error:
predG=predict(m.spatial, Gradient=GGradient)
here is the traceback:
7: stop("Number of columns must be same!.") |
|
I cannot reproduce this. Can you provide a reproducible example? |
|
Here it follows (it is based on the vignette 4 example):
Error in get.knnx(data, query, k, algorithm) : |
|
Hi Mirko & Jari, This issue should now be fixed. Cheers, |
|
@MelindadeJonge so it was dropping dimensions. |
|
@jarioksa Yes that was the issue. It only happened when making predictions for the latent variables on only one new sampling unit. |
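For readers unfamiliar with this class of bug, here is a minimal base-R sketch of how dimensions get dropped when only one row is selected. The matrix `xy` and its names are made up for illustration; this is not the Hmsc code itself, only the general mechanism.

```r
# Hypothetical coordinate matrix: three sampling units, two coordinates.
xy <- matrix(c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6), ncol = 2,
             dimnames = list(c("u1", "u2", "u3"), c("x", "y")))

# Selecting several rows keeps the matrix structure.
several <- xy[c("u1", "u2"), ]
dim(several)          # 2 2

# Selecting a single row silently drops to a plain vector.
single <- xy["u3", ]
is.null(dim(single))  # TRUE

# drop = FALSE keeps a 1 x 2 matrix, so later matrix code still works.
single_ok <- xy["u3", , drop = FALSE]
dim(single_ok)        # 1 2
```

Code that later calls `ncol()` or does matrix arithmetic on `single` will fail or misbehave, which is why such bugs tend to show up only with exactly one new unit.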
|
Hi Melinda & Jari, I confirm that the last commit by Melinda fixed the issue when predicting on a gradient. Notwithstanding, another issue (maybe linked with the previous one) still stands when predicting on a different covariates dataset. Specifically, if I create new variables and studydesign like:
and try to predict the
I get the following error: Error in rL$s[unitsAll, ] : subscript out of bounds. In addition, if I try to run the same procedure on my own data, I get another error:
Error in get.knnx(data, query, k, algorithm) : Data non-numeric
Are there any errors in my coding maybe? |
|
@mrkdfb @jarioksa From what I understand, you are not fitting a model with factor covariates, right? In that case it's probably not related to issue #31. The first issue is related to closed issue #19. When you are making predictions to new spatial units, those units should have been specified when first defining the random levels. This is the case for all spatial models. The second error I have not seen yet. Could you make a reproducible example for me so I can see what's going on? |
|
Hi Melinda, you are correct. The XData object in the code above is generated as described in vignette 4 on spatial models. Accordingly, it contains only the numerical predictor "x1". This means that the first issue is related to #19. In this regard, the #19 post does not provide any code describing how to specify all the spatial units before training the model, or how to indicate which units to use in calibration and which to set aside for prediction. Could you kindly provide a couple of lines of code clarifying that? I suspect the second issue also derives from the same problem. |
|
Hi Mirko, The spatial locations of the units that you want to predict to need to be included in the sData that is given to
xycoords_new = matrix(runif(2*n),ncol=2)
xycoordsFull = rbind(xycoords, xycoords_new)
colnames(xycoordsFull) = c("x-coordinate","y-coordinate")
rownames(xycoordsFull) = 1:nrow(xycoordsFull)
rL.nngp = HmscRandomLevel(sData = xycoordsFull, sMethod = 'NNGP', nNeighbours = 20)
studyDesign = data.frame(sample = as.factor(1:n))
m.nngp = Hmsc(Y=yprob, XData=XData, XFormula=~x1, studyDesign=studyDesign, ranLevels=list("sample"=rL.nngp),distr="probit")
m.nngp = sampleMcmc(m.nngp, thin = 1, samples = 10, transient = 5, nChains = 2, verbose = 0, updater=list(GammaEta=FALSE))
XData_pred=rbind(XData, XData)
studyDesign_pred=data.frame(sample=as.factor(1:nrow(XData_pred)))
predict(m.nngp, post=poolMcmcChains(m.nngp$postList)[1], XData=XData_pred, expected=F, studyDesign=studyDesign_pred)
I hope this helps. Cheers, |
|
Hi Melinda, many thanks! The code you provided works perfectly in fixing the first issue regarding the studydesign in spatial models. Unfortunately, it does not fix the second issue that I get when working with my own data. Can I send you a script and a .RData workspace to allow you to reproduce the error? |
|
Hi Mirko, I actually ran into the same error message today while trying to fix another issue with the nngp predictions. I think the last commit should fix your issue. But if not, let me know, in that case you can send me the script and workspace. Cheers, |
|
Hi Melinda, I confirm that your last commit also fixed the remaining issue. Now everything works properly. |
|
Hi Melinda, sorry to bother you again, but the original issue that appeared when predicting on a gradient has come back after the last commit. If you run the reproducible example I attached in one of the posts above ("Here it follows...") again, you'll see it. |
|
Hi Mirko, Apologies, I messed up something in the last commit, it should now be fixed again. Regarding the other issue you mentioned in one of the posts above: Cheers, |
|
Hi Melinda, you are right, that was exactly the problem. After a lot of attempts, I found that the variables for prediction need to be a data.frame, while xydata must be a matrix with column and row names provided. That was the only setting that made both predictions on a gradient and predictions on external variables work. |
|
I think we are not too user-friendly: some data must be data.frames or we fail, other data must be a matrix or we fail, and then perhaps it must be a specifically constructed matrix (or was it a data frame?) with certain names for rows and columns or we fail. In all cases the error messages are obscure and not at all related to the actual error. I think most of these things should be checked within the code so that users don't need to wrestle with these quirks. In future versions...
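As a rough illustration of the kind of early check described above, here is a small base-R sketch. The helper `checkPredictionInputs` and its argument names are hypothetical and not part of Hmsc; the rules it enforces (prediction variables as a data.frame, coordinates as a numeric matrix with row and column names) are simply the ones reported to work earlier in this thread.

```r
# Hypothetical helper, not part of Hmsc: fail early with a readable message.
checkPredictionInputs <- function(XData, xy) {
  if (!is.data.frame(XData))
    stop("XData must be a data.frame, got ", class(XData)[1])
  if (!is.matrix(xy) || !is.numeric(xy))
    stop("xy must be a numeric matrix, got ", class(xy)[1])
  if (is.null(rownames(xy)) || is.null(colnames(xy)))
    stop("xy must have both row names and column names")
  invisible(TRUE)
}

# Example with made-up data: the coordinates lack names at first.
XData <- data.frame(x1 = c(0.5, 1.2, -0.3))
xy <- matrix(c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6), ncol = 2)
msg <- tryCatch(checkPredictionInputs(XData, xy),
                error = function(e) conditionMessage(e))

# After naming rows and columns the check passes.
rownames(xy) <- 1:3
colnames(xy) <- c("x-coordinate", "y-coordinate")
ok <- checkPredictionInputs(XData, xy)
```

A check like this, run before any model code, would turn errors such as "subscript out of bounds" or "Data non-numeric" into a message that names the offending argument.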
Hi,
I report the following error when trying to run a conditional cross-validation on a NNGP model:
here is the traceback():
Diving into the functions, I discovered the issue appears when launching the updateEta function
Any ideas?
Thanks in advance for your precious help.
Mirko