
predict over millions of sites (non-spatially explicit); out of memory #145

Open

stephanJG opened this issue Jul 6, 2022 · 0 comments
Hi Hmsc team,
I am trying to predict over many new sites (~30 million sites) with a large model ("142 sampling units, 232 species, 8 covariates, 1 traits and 3 random levels"), but I am starting to wonder whether this is computationally feasible.
None of the random levels is spatially explicit. I am using the predict.Hmsc function after creating new data with the prepareGradient function:
gradient <- prepareGradient(model, XDataNew = XData.grid)
predY <- predict(object = model, Gradient = gradient, expected = TRUE)

Until now I have:

  • only tested using the data from the model and increased the number of copies (1 to 5 times the data the model was built with)
  • split the data so I can run the chunks in parallel on the HPC (see the sketch after this list)
  • saved predY in an array and rounded the predicted values to 4 digits to reduce the file size
  • (in another project I successfully predicted over 20 million sites with a single-species model using GPP)
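
To make the splitting concrete, here is a rough sketch of how I batch the new sites (the number of chunks, object names and the file pattern are placeholders I chose for illustration); it reuses my prepareGradient/predict calls from above and keeps only the rounded posterior mean per chunk:

n.chunks <- 100                                     # placeholder; chosen so one chunk fits in memory
chunk.id <- cut(seq_len(nrow(XData.grid)), breaks = n.chunks, labels = FALSE)
for (i in seq_len(n.chunks)) {
  XData.i  <- XData.grid[chunk.id == i, , drop = FALSE]
  gradient <- prepareGradient(model, XDataNew = XData.i)
  predY.i  <- predict(object = model, Gradient = gradient, expected = TRUE)
  # predY.i is a list with one sites x species matrix per posterior draw;
  # averaging over the draws before saving keeps only one small matrix per chunk
  EpredY.i <- round(Reduce("+", predY.i) / length(predY.i), 4)
  saveRDS(EpredY.i, file = sprintf("EpredY_chunk_%03d.rds", i))
}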

If I use 4 times the data (568 sites), the prediction takes 2.5 hours. Already at 5 times (710 sites), however, the HPC gives me an out-of-memory message, which may be understandable, as that is already 164,720,000 numbers (142 sites * 5 copies * 232 species * 1000 draws).
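
For reference, the raw size of predY alone (assuming 8-byte doubles and the 1000 retained posterior draws mentioned above) would be roughly:

n.values <- 142 * 5 * 232 * 1000    # 164,720,000 predicted values
n.values * 8 / 1024^3               # ~1.2 GB for the raw numbers alone

and I assume the intermediate objects that predict() builds per posterior draw come on top of that.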

How can I make predictions over all ~30 million sites without running out of memory?
Many thanks in advance
Best
Jörg
