Variable Importance Output Issue with MARS Algorithm in SSDM package #124

Montyx23 · 2023-04-11T00:16:23Z

Hi there @sylvainschmitt, @lukasbaumbach,

I am writing to report an issue I encountered while using SSDM for a large modelling project. The project covers 10 species, each modelled with species-specific variables, in every one of the 14 regions of a stratified study area (140 model outputs in total). I used the ensemble modelling function with the RF, MARS, SVM, and ANN algorithms and 5 repeats for all model runs. Note that I only have presence data, so pseudo-absences were generated automatically for all models.

For the first ~70 models, I used a loop to iterate over all combinations of species, regions, and species-specific variables with the same parameters as the following code snippet:

ESDM <- ensemble_modelling(
  c('ANN', 'SVM', 'CTA', 'MARS', 'RF'),
  ensemble.metric = c('prop.correct'),
  ensemble.thresh = c(0.75),
  occurrence_data,
  predictor_variables,
  rep = 5,
  Xcol = 'x',
  Ycol = 'y'
)

However, I ran into memory and speed issues, so I built the remaining 70 ESDMs manually via the GUI. There I used the same parameters and left everything else at its default.

All model projections, evaluation results, and other files came out fine. However, the variable importance outputs for some models looked like this:

Presence	Presence	Presence	Presence	Presence	Presence	Presence	Presence	Presence	Presence	Presence	Presence
Axes.evaluation	6.06506260362906	3.22714547662735	12.9818109189702	4.25058836645605	6.9628498591444	2.60560023216018	5.63802728195239	26.7530966936013	13.3709808348051	5.19605061823909	9.42792640373865

The number of 'Presence' columns is the same as the number of original predictor variables I used for that particular model, and this is the same for other outputs too, indicating that these numbers are likely the variable importance numbers for the predictor variables.

I tested some single-algorithm SDMs in the GUI with the MARS algorithm on a number of species and regions, and the variable importance table always came out looking like that. The problem is that I am now writing a report and producing summary statistics on the modelling, and I have no idea how to explain the chunk of models whose variable importance names are all 'Presence'. Below is a boxplot I created which gives a high-level overview of variable importance across all the species and regions; as you can see, 'Presence' stands out clearly and accounts for 32.58% of the frequency of variables across all models.

I would like to ask for help understanding whether this is a result of my configuration or a genuine bug. I am not very experienced with these algorithms, especially MARS. Ideally, I would like to be able to link these 'Presence' columns to their respective variable names.

Thank you for your time and for developing such an awesome package! :)
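
For the request above to link the 'Presence' columns back to variable names, one possible stopgap is to relabel the columns by position. This is only a minimal sketch in base R: the table `vi`, its values, and `predictor_names` are hypothetical, and it assumes the columns appear in the same order as the predictors were supplied, which this thread does not confirm.

```r
# Hypothetical variable importance table shaped like the output above:
# one row named "Axes.evaluation", every column named "Presence".
vi <- data.frame(6.07, 3.23, 12.98, check.names = FALSE)
names(vi) <- rep("Presence", 3)
rownames(vi) <- "Axes.evaluation"

# Hypothetical predictor names, in the order the layers were supplied.
# ASSUMPTION: the column order of vi matches this order.
predictor_names <- c("bio1", "bio12", "elev")

# Relabel by position, but only when the counts match; otherwise stop.
stopifnot(ncol(vi) == length(predictor_names))
names(vi) <- predictor_names
print(vi)
```

If the counts do not match, or the order cannot be verified, re-running the affected models is the safer route than relabelling.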


lukasbaumbach · 2023-08-09T14:18:10Z

Hi,
to me this sounds like a bug in the evaluate.axes function when it creates the variable importance data.frame (probably here: names(obj@variable.importance) <- names(obj@data)[4:(length(obj@data)-1)]).
I'll try to look into it.
Best,
Lukas
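
To illustrate what the quoted indexing expression selects, here is a minimal sketch in base R. The toy data frame and its column names (`X`, `Y`, `Presence`, `bio1`, `bio12`, `elev`, `train`) are assumptions for illustration, not SSDM's actual internal layout, and the sketch does not claim to reproduce the bug itself.

```r
# Toy stand-in for obj@data; the column layout is assumed, not taken from SSDM.
toy_data <- data.frame(
  X = c(1.0, 2.0),
  Y = c(3.0, 4.0),
  Presence = c(1, 0),
  bio1 = c(12.1, 13.4),
  bio12 = c(800, 950),
  elev = c(120, 340),
  train = c(TRUE, FALSE)
)

# Same indexing expression as in the quoted line: columns 4 to n - 1.
picked <- names(toy_data)[4:(length(toy_data) - 1)]
print(picked)

# With a different column order, the same positions pick up a
# non-predictor name such as "Presence".
shifted <- toy_data[, c("X", "Y", "bio1", "Presence", "bio12", "elev", "train")]
picked_shifted <- names(shifted)[4:(length(shifted) - 1)]
print(picked_shifted)
```

The point is that position-based naming only works while the data slot keeps exactly the layout the expression expects. Selecting predictor columns by name would be less fragile.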

Montyx23 · 2023-11-15T16:47:18Z

Thanks Lukas, hope you manage to fix it.

sylvainschmitt · 2024-05-15T07:56:27Z

@Montyx23, this is fixed. I'll push the updated version to GitHub today.

lukasbaumbach added the bug label Aug 9, 2023

sylvainschmitt added a commit that referenced this issue May 15, 2024

#124

db80940

sylvainschmitt closed this as completed May 15, 2024
