number of samples for cross-validation #85

FabianRoger · 2021-03-03T09:05:34Z

Hi,

I am trying to evaluate the model fit by two-fold cross-validation using the computePredictedValues function. However, I am confused by what I should specify for the sampling. I looked in both the documentation and the book but didn't find an explanation of how the models are fitted to each fold. Sorry if I missed it!

The computePredictedValues function has the parameters start, thin, and mcmcStep, each of which defaults to 1. Does that mean the model would be fitted with a single iteration? That seems implausible. Or do I need to specify the sampling to match the sampling from the model fitting?

For the model I fitted I used the following parameters:

nChains = 10
samples = 200
thin = 200
transient = 0.5*thin*samples

Sorry if I missed the obvious.

PS: I also get the errors

keeping only two first columns of 'distr' matrix
setting updater$GammaEta=FALSE: not implemented for spatial methods 'GPP' and 'NNGP'

(I set the updater to FALSE because it was suggested to me that this would speed up the MCMC sampling significantly. I didn't get an error during the initial model fitting.)

Is that anything to worry about?

Thank you for your help!

jarioksa · 2021-03-03T09:47:23Z

Neither of those is an error. A characteristic feature of an error is that the message starts with the word Error: and the execution of code stops at that moment.

Both are informative messages. The first implies that you had an old model fitted with an earlier version of Hmsc (earlier than 3.0-9), which had two unused columns in the distribution matrix. We now ignore those two columns (they really were unused earlier, too, but if you imagined that you could put something useful there, you are informed that it does not work and never worked).

The second tells you that one of the updaters (GammaEta) will not be used for your model because you had a spatial method which is not supported by that updater.

FabianRoger · 2021-03-03T11:25:09Z

Sorry about that, I meant warning. Thanks for the explanation.

jarioksa · 2021-03-03T14:06:33Z

See issue #86 that touches the same problem.

jarioksa · 2021-03-03T15:33:30Z

@FabianRoger computePredictedValues computes predictions for all posterior samples, but you can skip some by setting thin and start. Then you get smaller prediction arrays (all sampling units, all species, but fewer posterior samples). MCMC sampling is only performed with cross-validated predictions, but there we use the same parameters as in the original Hmsc models (samples, thin, transient). The argument mcmcStep is used to update random effects in conditional models: see the help pages for predict.Hmsc in addition to computePredictedValues.
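A rough sketch of what this means for the settings quoted in the question. The arithmetic is plain base R. It assumes that each chain runs transient + samples * thin iterations and that start and thin pick stored samples as a regular sequence. The helper keepIdx, the wrapper runTwoFoldCV, and the object name m (a fitted Hmsc model) are hypothetical names for illustration only; the wrapper is defined but not called here.

```r
# Settings from the original model fit
nChains   <- 10
samples   <- 200
thin      <- 200
transient <- 0.5 * thin * samples   # 20000 burn-in iterations per chain

# Assumption: each chain runs transient + samples * thin iterations
iterPerChain <- transient + samples * thin
nPosterior   <- nChains * samples   # stored posterior samples, pooled over chains

# Cross-validation refits the model once per fold with the same settings,
# so two-fold CV costs roughly twice the original sampling effort
nfolds       <- 2
cvIterations <- nfolds * nChains * iterPerChain

# Assumption: start/thin in computePredictedValues only subset the stored
# samples of each chain; they do not control how the model is refitted
keepIdx <- function(nStored, start = 1, thin = 1) {
  seq(from = start, to = nStored, by = thin)
}
nDefault <- length(keepIdx(samples))                        # defaults keep all
nReduced <- length(keepIdx(samples, start = 101, thin = 2)) # second half, every 2nd

# Hypothetical wrapper around the Hmsc calls for two-fold cross-validation;
# `m` must be a model already fitted with sampleMcmc
runTwoFoldCV <- function(m) {
  partition <- Hmsc::createPartition(m, nfolds = 2)
  predsCV   <- Hmsc::computePredictedValues(m, partition = partition)
  Hmsc::evaluateModelFit(hM = m, predY = predsCV)
}
```

With these numbers, leaving start and thin at 1 keeps all 200 stored samples per chain, while the refit in each fold still uses the original samples, thin and transient.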

FabianRoger · 2021-03-05T12:59:03Z

MCMC sampling is only performed with cross-validated predictions, but there we use the same parameters as in the original Hmsc models

Thanks! This answers my question.

FabianRoger closed this as completed Mar 5, 2021
