Gaps in the projected ensemble maps #242

jpmaalouf · 2023-04-17T14:43:40Z

Issue

We are having an issue with the final projections from ensemble models. In all ensemble models except CV, the map includes projection gaps corresponding to NA projection values.

This is surprising as CV incorporates the mean in its formula. So if CV is computed, how come the mean is not?

Here is the basic biomod code we used, as well as three projection maps:

Mean showing a white gap (northwest)
Median showing a white gap (northwest)
Coefficient of variation (no gaps)

Code

Biomod parameters

Parametres <- BIOMOD_ModelingOptions(
  GLM = list(type = "quadratic", interaction.level = 0),
  GBM = list(n.trees = 1000),
  GAM = list(algo = "GAM_mgcv", k = 4))

Biomod modeling

MODEL <- BIOMOD_Modeling(bm.format = BioDATA, # BioDATA is an object returned from BIOMOD_FormatingData()
                        modeling.id = "Test",
                        models = c("GLM", "FDA", "GBM", "RF", "MARS", "GAM"),
                        bm.options = Parametres,
                        nb.rep = 2,
                        data.split.perc = 70,
                        metric.eval = c('TSS','ROC'),
                        var.import = 2,
                        do.full.models = FALSE)

EnsMOD <- BIOMOD_EnsembleModeling(bm.mod = MODEL,
                                  models.chosen = "all",
                                  em.by = "all",
                                  metric.select = "TSS",
                                  metric.select.thresh = THRESH, # THRESH is defined elsewhere (not shown)
                                  metric.eval = c("ROC", "TSS"),
                                  prob.mean = TRUE, prob.cv = TRUE, prob.ci = TRUE,
                                  prob.ci.alpha = 0.05,
                                  prob.median = TRUE,
                                  committee.averaging = TRUE,
                                  prob.mean.weight = TRUE,
                                  prob.mean.weight.decay = "proportional")

EnsPROJ<-BIOMOD_EnsembleForecasting(bm.em = EnsMOD,
                                    proj.name = "Test",
                                    new.env = EnvData,
                                    models.chosen = 'all',
                                    metric.binary = 'all',
                                    metric.filter = 'all')

Projections

Mean (gap in the northwest)

Median (gap in the northwest)

Coefficient of variation (no gaps)


rpatin · 2023-04-17T15:45:09Z

Bonjour Jean-Paul,
Thank you for submitting the issue on github, it is really appreciated 🙏
Your issue highlights a code difference between the mean, median and CV calculations. Mean and median use na.rm = FALSE while CV uses na.rm = TRUE. So EMcv is able to calculate mean/sd where EMmean and EMmedian are not. This also implies that some of your individual models predict NA in the area concerned.
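
The effect of na.rm can be reproduced in base R on a toy example. The sketch below assumes a single pixel with invented predictions from five individual models; it only illustrates the NA handling described above and is not the biomod2 internal code.

```r
# Hypothetical predictions of five individual models for one pixel;
# one of the models returned NA for this pixel.
pixel_pred <- c(620, 580, NA, 640, 600)

# na.rm = FALSE (the base R default): a single NA propagates to the result.
em_mean   <- mean(pixel_pred)
em_median <- median(pixel_pred)

# na.rm = TRUE: the NA is dropped, so sd / mean is still defined.
em_cv <- sd(pixel_pred, na.rm = TRUE) / mean(pixel_pred, na.rm = TRUE)

print(c(mean = em_mean, median = em_median, cv = em_cv))
```

Here the mean and median come out as NA while the coefficient of variation is a finite number, which matches the pattern seen in the maps.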

A few conclusions:

About na.rm in BIOMOD_EnsembleForecasting:
1. We are discussing internally how to harmonize the na.rm behaviour and will likely set na.rm = TRUE as the default for all ensemble algorithms.
2. We will also likely add an option to set na.rm = FALSE.
3. These changes will likely be available on github in the next few days, but depending on your version you may need to adjust other things or re-run your workflow.
4. Alternatively, you can update your CV map by masking out the zone predicted as NA in the other maps.
5. If your workflow takes too long to re-run but you want values in the zone in question (i.e. na.rm = TRUE for all), I can also set up a dedicated biomod2 branch for you, depending on the version you are currently using.

About the missing values in your individual model projections:
1. They may be caused by missing values in the environmental data.
2. They can also be caused by NaN predictions where environmental variables are outside the calibration range.
3. If you want to understand what is happening, I would look at the individual model predictions to identify which algorithms or runs are affected, and at the distribution of your environmental variables in the given zone.
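
As a minimal sketch of the last point, assuming the individual model predictions have already been extracted into a matrix with one row per pixel and one column per model (the values and column names below are invented for illustration), base R is enough to list the affected models and pixels:

```r
# Hypothetical prediction matrix: rows = pixels, columns = individual models.
# Column names only imitate biomod2-style labels; they are invented.
pred <- matrix(
  c(610, 590, 630,
    600,  NA, 620,
    580,  NA, 640,
    570, 560, 600),
  ncol = 3, byrow = TRUE,
  dimnames = list(NULL, c("PA1_RUN1_GLM", "PA2_RUN1_GLM", "PA1_RUN2_RF"))
)

# Number of NA pixels per individual model
na_per_model <- colSums(is.na(pred))

# Models that predict NA somewhere, and the pixels where that happens
affected_models <- names(na_per_model)[na_per_model > 0]
affected_pixels <- which(rowSums(is.na(pred)) > 0)

print(na_per_model)
print(affected_models)
print(affected_pixels)
```

Grouping the affected column names by algorithm, pseudo-absence dataset or run then shows which of these factors the NA values follow.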

Note that you can also identify out of range predictions by using build.clamping.mask = TRUE in BIOMOD_Projection or BIOMOD_EnsembleForecasting. Some areas will then be filled with NA when variables are out of calibration ranges.
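
Conceptually, an out-of-range check compares each new environmental value with the range seen during calibration. The sketch below uses invented data frames and base R only; it illustrates the idea and is an assumption about the principle, not the way biomod2 actually builds its clamping mask.

```r
# Hypothetical calibration data and new environment (one row per pixel)
calib   <- data.frame(temp = c(5, 12, 18, 22), precip = c(400, 650, 900, 1200))
new_env <- data.frame(temp = c(10, 25, 15),    precip = c(500, 800, 1500))

# For each variable, flag new values outside the calibration range
out_of_range <- mapply(
  function(new, cal) new < min(cal) | new > max(cal),
  new_env, calib
)

# Number of out-of-range variables per pixel (0 = fully within range)
n_out <- rowSums(out_of_range)

print(out_of_range)
print(n_out)
```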

Feel free to update the issue if you have additional information or questions. I will update it when the na.rm fix is available.

Best regards,
The biomod2 team

jpmaalouf · 2023-04-17T16:15:33Z

Bonjour the biomod2 team!

Thank you very much for your prompt and efficient reply.

Great news if we'll be able to manage this issue with the upcoming github version in the coming days. We'd rather have complete maps than exclude NA pixels from the CV map, so we'll just wait for this version to be published. Any information on how long it will take you to publish it?
Also, thank you for offering to dedicate a specific branch to save us from running our workflow from the start :). No need to do it. Once the new version is published, we'll just try to adapt the code to the individual models which are already computed (on the latest version: 4.2-2). I'll let you know if things don't work properly.
Thank you for the tips that help identify the individual models that failed to predict. I was expecting that models from specific methods (e.g. all FDA models) would fail to predict, but surprisingly this is not the case. Maybe this behavior could be linked to differences in the sampled pseudo-absences, which relates to your idea of some observation datasets not covering the variable range widely enough.

Cheers!

Jean Paul

rpatin · 2023-04-18T07:54:22Z

Bonjour Jean Paul,

We just published the new version this morning with the fix to na.rm. You can download it with devtools::install_github('biomodhub/biomod2'). It will likely be available on CRAN in the following weeks.
If you are lucky you may have to re-run only the BIOMOD_EnsembleForecasting part, but it is also likely that you'll have to start again. Let me know if that happens and you need a dedicated branch with the correction applied to version 4.2-2.
I would also have expected some methods to fail, but it is interesting to see that it is more closely related to the pseudo-absence dataset (or cross-validation repetition).

Cheers,
Rémi

jpmaalouf · 2023-04-18T15:46:32Z

Bonjour Rémi !

Thank you very much for the na.rm fix. I downloaded the updated version from github. The help documentation of the function now includes the na.rm argument. I tried re-launching the whole workflow from BIOMOD_Modeling(), and it stopped at BIOMOD_EnsembleForecasting() with the following error: Error in { : task 6 failed - "[write] unknown option(s): na.rm".

Cheers

Jean Paul

rpatin · 2023-04-18T16:37:57Z

Bonjour Jean Paul,

Indeed, there was a small mistake on my side. The na.rm argument was on the wrong side of a bracket in some of the calculations (especially for the EMci algorithm). This should be fixed now if you update again. You should only have to re-launch BIOMOD_EnsembleForecasting and it should (hopefully) work.
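
To illustrate how a misplaced bracket can produce this kind of error, here is a minimal sketch in base R. The function write_values() is a hypothetical stand-in for a writing function that rejects options it does not know; it is not the actual biomod2 or terra code.

```r
# Hypothetical writer that refuses any option it does not recognise
write_values <- function(x, ...) {
  extra <- list(...)
  if (length(extra) > 0) {
    stop("[write] unknown option(s): ", paste(names(extra), collapse = ", "))
  }
  x
}

vals <- c(0.2, NA, 0.6)

# Misplaced bracket: na.rm is handed to the writer instead of mean()
wrong <- tryCatch(
  write_values(mean(vals), na.rm = TRUE),
  error = function(e) conditionMessage(e)
)

# Correct placement: na.rm is an argument of mean()
right <- write_values(mean(vals, na.rm = TRUE))

print(wrong)
print(right)
```

In the first call the closing bracket of mean() comes too early, so na.rm reaches the outer function, which raises the error; in the second call mean() receives it and drops the NA.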

Cheers,
Rémi

rpatin · 2023-04-19T06:56:08Z

Bonjour Jean Paul,
There was a tiny mistake on my end when implementing the na.rm argument (a misplaced bracket). It is now corrected, sorry for that 🙏
If you update again to the current github version, it should now work, and hopefully you will only have to rerun BIOMOD_EnsembleForecasting.
Cheers,
Rémi

jpmaalouf · 2023-07-26T12:26:41Z

Bonjour Rémi,

I never got the chance to say a big thank you for what you did on this issue :). It is never too late. Our project is now done and delivered.

All the best,

JP

jpmaalouf changed the title ~~Gaps in the proejected ensemble maps~~ Gaps in the projected ensemble maps Apr 17, 2023

rpatin added the modeling question label Apr 17, 2023

rpatin closed this as completed May 17, 2023
