
error for predict along a gradient with NNGP #96

Open
FabianRoger opened this issue Apr 29, 2021 · 14 comments
Labels
bug Something isn't working

Comments

@FabianRoger

I fitted a model with a spatial random factor with the 'NNGP' method. Now I tried to predict with this model along a gradient.

My code is

Gradient = constructGradient(models$m1, focalVariable = "Granskog")
predY = predict(models$m1, Gradient = Gradient, expected = TRUE)

This gives me the error

    Error in get.knnx(data, query, k, algorithm) : 
    Number of columns must be same!.

Googling the error message brings me here, but I am not sure if this is what's going on internally. If necessary, I can try to provide a reprex later.

Thanks!

@jarioksa
Collaborator

I can reproduce this in an NNGP model. I'll have a look at this.

@jarioksa
Collaborator

This is now fixed in commit ee1367a. Please try the GitHub version of Hmsc to see if it cures your problem (the cause was sloppy coding).

@jarioksa jarioksa added the bug Something isn't working label Apr 29, 2021
@FabianRoger
Author

FabianRoger commented Apr 29, 2021

thanks!

I tried it with the updated version and the original error seems to be fixed. However, I now get an error saying

Error in h(simpleError(msg, call)) : 
  error in evaluating the argument 'x' in selecting a method for function 't': the leading minor of order 1 is not positive definite

Also, weirdly, when testing the same code on another model that is identical to m1 except that it uses abundance data conditional on presence (the second part of the hurdle model), the original error still occurs.

@jarioksa
Collaborator

This error does not come directly from Hmsc, but from some other package or R function that we use. After getting the error, run traceback() and post the output.

@FabianRoger
Author

This is what I get:

traceback()
9: h(simpleError(msg, call))
8: .handleSimpleError(function (cond) 
   .Internal(C_tryCatchHelper(addr, 1L, cond)), "the leading minor of order 1 is not positive definite", 
   base::quote(chol.default(W)))
7: chol.default(W)
6: chol(W)
5: t(chol(W))
4: predictLatentFactor(unitsPred = levels(dfPiNew[, r]), units = levels(object$dfPi[, 
   r]), postEta = postEta, postAlpha = postAlpha, rL = rL[[r]], 
   predictMean = predictEtaMean, predictMeanField = predictEtaMeanField)
3: predict.Hmsc(models$m1, Gradient = Gradient, expected = TRUE)
2: predict(models$m1, Gradient = Gradient, expected = TRUE)
1: predict(models$m1, Gradient = Gradient, expected = TRUE)

@ovaskain
Collaborator

ovaskain commented Apr 30, 2021 via email

@jarioksa
Collaborator

What bugs me about this error is that you should never reach the command t(chol(W)): it is executed only in full spatial models, never in NNGP models. Now I really need a reproducible example to see how you can get to this line in NNGP models.

@jarioksa
Collaborator

jarioksa commented Apr 30, 2021

For anybody trying to inspect the issue, here is how I generated the reproducible test case:

m <- TD$m
rL1 <- TD$rL1
rL1 <- update(rL1, sMethod="NNGP", nNeighbours=4)
m <- update(m, ranLevels = list(sample = TD$rL2, plot = rL1))
m <- sampleMcmc(m, samples=100, thin=1, transient=50, nChains=2, nParallel=2)
Gradient <- constructGradient(m, focalVariable = "x2")
predY <- predict(m, Gradient = Gradient, expected = TRUE) # fails before ee1367a

Can you use something similar to reproduce your problem?

@FabianRoger
Author

FabianRoger commented Apr 30, 2021

Re: distance between points:

It shouldn't be, no. The coordinates are centroids, about 20 km apart from each other. I checked, and the minimum distance between any two points is 26 km.

@FabianRoger
Author

Weirdly enough, the reprex above runs without error for me.

@fkeppeler

Are there any updates about this issue?
I'm having similar problems when using both GPP and NNGP models.

traceback()
8: h(simpleError(msg, call))
7: .handleSimpleError(function (cond) 
   .Internal(C_tryCatchHelper(addr, 1L, cond)), "the leading minor of order 1 is not positive definite", 
   base::quote(chol.default(W)))
6: chol.default(W)
5: chol(W)
4: t(chol(W))
3: predictLatentFactor(unitsPred = levels(dfPiNew[, r]), units = levels(object$dfPi[, 
   r]), postEta = postEta, postAlpha = postAlpha, rL = rL[[r]], 
   predictMean = predictEtaMean, predictMeanField = predictEtaMeanField)
2: predict.Hmsc(m2, Gradient = Gradient)
1: predict(m2, Gradient = Gradient)

@cgoetsch

I am also getting this error for a non-spatial model, although I have a temporal effect of year. I am just trying to create the gradient plots by covariate, not make spatial predictions - those are working fine.

studyDesign = data.frame(
  tow = as.factor(da$UniqueID),
  grid_id = as.factor(da$grid_id),
  year = as.factor(da$YEAR))

#set up the random effect of sample level to get species co-occurrence at the sample level
rL.tow = HmscRandomLevel(units = studyDesign$tow)
#set up the temporal random effect - temporally explicit for temporal autocorrelation
rL.time = HmscRandomLevel(sData=time_da) 

m = Hmsc(Y = Y, XData = X, XFormula = XFormula.1.d,
         distr = "probit", studyDesign = studyDesign,
         ranLevels = list("tow" = rL.tow, "year" = rL.time))

Gradient_cov = constructGradient(m, focalVariable = "rugosity")
predY_cov = predict(m, Gradient=Gradient_cov, expected = TRUE)

traceback()
9: h(simpleError(msg, call))
8: .handleSimpleError(function (cond) 
   .Internal(C_tryCatchHelper(addr, 1L, cond)), "the leading minor of order 1 is not positive definite", 
   base::quote(chol.default(W)))
7: chol.default(W)
6: chol(W)
5: t(chol(W))
4: predictLatentFactor(unitsPred = levels(dfPiNew[, r]), units = levels(object$dfPi[, 
   r]), postEta = postEta, postAlpha = postAlpha, rL = rL[[r]], 
   predictMean = predictEtaMean, predictMeanField = predictEtaMeanField)
3: predict.Hmsc(m, Gradient = Gradient_cov, expected = TRUE)
2: predict(m, Gradient = Gradient_cov, expected = TRUE)
1: predict(m, Gradient = Gradient_cov, expected = TRUE)

@Yo-B

Yo-B commented Nov 26, 2021

I have been having the same error with my temporal random effect. I cannot make predictions if I define it with the argument sData when training the model. I also have a spatial random effect that works fine with NNGP or GPP, and removing it does not help in passing my temporal factor as sData. If I pass the temporal factor as units, i.e. not as a longitudinal factor, it works, but I'm surely losing some information by doing this. I attach below a minimal reproducible example that triggers the same error. It is quite big, yet minimal enough I hope. Any suggestion on how I should define my longitudinal random effects here? BTW thanks guys for your awesome work!

library("dplyr")
library("Hmsc")

## data dimensions
n_cov <- 10
n_sample <- 100
n_species <- 20
n_trait <- 5

## generate data set with species occurrence, covariates, spatiotemporal coordinates and traits
XData <- data.frame(matrix(rnorm(n_sample*n_cov), ncol=n_cov))
colnames(XData) <- c("x", "y", paste0("var_", 1:(n_cov-2)))
XData$year <- sample(2000:2010, n_sample, replace=TRUE)

YData <- data.frame(matrix(rnorm(n_sample*n_species), ncol=n_species)) %>% 
  mutate_all(function(x) (x>.75)*1)
colnames(YData) <- paste0("sp_", 1:n_species)
TrData <- data.frame(matrix(rnorm(n_trait*n_species), ncol=n_trait))
colnames(TrData) <- paste0("tr_", 1:n_trait)
rownames(TrData) <- paste0("sp_", 1:n_species)

## subset the data set in train/test sets
train <- sample(n_sample, n_sample*.6)
designTrain <- XData[train, c("year", "x", "y")]
xTrain <- XData[train, paste0("var_", 1:(n_cov-2))]
yTrain <- YData[train, ]

## study design
studyDesign <- data.frame(year=factor(designTrain$year),
                          xy=factor(paste(designTrain$x, designTrain$y, sep="_")))
rownames(studyDesign) <- rownames(yTrain) <- rownames(xTrain) <- 1:nrow(yTrain)

## formulas
f_cov <- as.formula(paste0("~", paste(paste0("var_", 1:(n_cov-2)), collapse = "+")))
f_tra <- as.formula(paste0("~", paste(colnames(TrData), collapse = "+")))

## prepare temporal random effect
time <- data.frame(year=sort(unique(designTrain$year)))
rownames(time) <- time$year
## prepare spatial random effect
space <- data.frame(unique(designTrain[, c("x", "y")]))
rownames(space) <- paste(space$x, space$y, sep="_")
xyknots <- constructKnots(sData=space, knotDist = 1)
plot(space, asp=1) ; points(xyknots, col="red")
## define random effects
rL3 <- HmscRandomLevel(sData = time)
rL4 <- HmscRandomLevel(sData = space, sMethod = "GPP", sKnot = xyknots)

## Define the model
(m1 <- Hmsc(Y = yTrain,
            XData = xTrain, 
            studyDesign = studyDesign,
            TrData = TrData,
            TrFormula =  f_tra, 
            XFormula = f_cov,
            ranLevels = list(year=rL3, xy=rL4),
            distr = "probit"))

## train the model
m1 <- sampleMcmc(m1, thin=2, samples=100, nParallel=3, transient=10, nChains=3, verbose=10)

## predict on test and train data sets
designTest <- XData[!(1:n_sample %in% train), c("year", "x", "y")]
xTest <- XData[!(1:n_sample %in% train), paste0("var_", 1:(n_cov-2))]
yTest <- YData[!(1:n_sample %in% train), ]

## Try to make predictions
## (commenting out the `year` element below and removing rL3 from the model
## definition above will allow predictions)
Gradient <- prepareGradient(m1, XDataNew = data.frame(rbind(xTest, xTrain)), 
                            sDataNew = list(
                              year=data.frame(year=rbind(designTest, designTrain)$year),
                              xy=data.frame(rbind(designTest, designTrain)[, c('x','y')])))
pred_occ <- predict(m1, Gradient = Gradient) #### ERROR will occur here



@jarioksa
Collaborator

jarioksa commented Dec 2, 2021

@Yo-B This is one of the issues I have been studying recently: thank you for supplying a reproducible example.

I have an experimental, dirty and noisy fix for this, discussed in issue #123. You can try it by installing the experimental branch with the command devtools::install_github("hmsc-r/HMSC", ref="updater-errors"). It is experimental because I have not thoroughly tested it; it is dirty because I have not thoroughly thought it over; and it is noisy because it prints a large number of error messages that can be ignored, and you will get a result despite these messages.

The reason for failure in several of these cases seems to be that the Gradient values duplicate data, and therefore you have zero distances between Gradient points and data (training) points. This in turn gives you a zero W matrix, and a matrix of zeros is not positive definite. The purpose of this calculation is to get the standard error for a random normal variate; that standard error is distance-dependent, and zero when distance is zero.
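This failure mode can be illustrated with base R alone (an illustrative sketch only, not Hmsc's internal code): when a Gradient point coincides with a training point, the distance-dependent prediction variance vanishes, so the matrix passed to the Cholesky factorization is all zeros, and chol() rejects it with exactly the error quoted in the tracebacks above.

```r
## Sketch, not Hmsc internals: duplicated coordinates give zero distances,
## which here collapse the conditional variance matrix W to all zeros.
coords <- rbind(c(0, 0), c(0, 0))         # a Gradient point duplicating a data point
D <- as.matrix(dist(coords))              # all pairwise distances are zero
W <- matrix(0, nrow(D), ncol(D))          # degenerate (all-zero) variance matrix
msg <- tryCatch(t(chol(W)), error = conditionMessage)
msg
## "the leading minor of order 1 is not positive definite"
```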

The risk of duplicating data points is particularly high for temporal random variables which, as in your case, are integer years, so the generated Gradient values will duplicate data values.

This fix may also work for the cases of @cgoetsch, @fkeppeler and @FabianRoger, but I haven't tested it with their data. Remember: you will get a lot of error messages, but they will not stop the analysis, and you will still get a result.
