Test frame as tibble - wrong calculations for variable importance and confusing warning messages #15
If the test frame is a tibble (which is easy to end up with when using the tidyverse), two things go wrong.
For variable importance, you get the same value for full_model and for every variable except baseline, which is clearly wrong.
For single_variable, you get only a warning, and the output is of limited value.
Casting the tibble to a regular data.frame solves the issue. Having the training data as a tibble does not seem to affect the calculations at all.
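One documented difference between the two classes that may be relevant here (an assumption, not a confirmed diagnosis of this bug) is single-column subsetting: `[` on a data.frame drops to a vector, while `[` on a tibble always returns a tibble. A minimal sketch on a toy frame, with `df` and `tbl` as made-up names:

```r
library(tibble)

# Toy data, not from the issue; used only to show the subsetting difference
df  <- data.frame(x = 1:3, y = c(2.5, 3.5, 4.5))
tbl <- as_tibble(df)

col_df  <- df[, "x"]    # data.frame: drops to a plain integer vector
col_tbl <- tbl[, "x"]   # tibble: stays a one-column tibble

is.numeric(col_df)      # TRUE
is.data.frame(col_tbl)  # TRUE

# Casting back to a data.frame restores the base-R behaviour
col_back <- as.data.frame(tbl)[, "x"]
identical(col_df, col_back)  # TRUE
```

Code that expects a vector from `data[, variable]` can therefore behave differently when handed a tibble.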
`library(dplyr)  # provides %>% and as_tibble()
library(DALEX)  # loaded after dplyr so that DALEX::explain() is not masked by dplyr::explain()
# Test frame converted to a tibble; this is what triggers the problem
apartmentsTest_tibble <- apartmentsTest %>% as_tibble()
model_liniowy <- lm(m2.price ~ construction.year + surface + floor + no.rooms + district, data = apartments)
explainer_lm <- explain(model_liniowy, data = apartmentsTest_tibble[,2:6], y = apartmentsTest_tibble$m2.price)
vi_lm <- variable_importance(explainer_lm, loss_function = loss_root_mean_square)
sv_lm <- single_variable(explainer_lm, variable = "construction.year", type = "pdp")`
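The workaround described above (casting the tibble to a regular data.frame) could look roughly like this. It is a sketch that reuses the function names from this report; `apartmentsTest_df` is a name introduced here for illustration, and the DALEX functions may be named differently in other versions of the package.

```r
library(DALEX)

# Same setup as in the report: the test frame is a tibble
apartmentsTest_tibble <- tibble::as_tibble(apartmentsTest)

# Workaround: cast back to a plain data.frame before building the explainer
apartmentsTest_df <- as.data.frame(apartmentsTest_tibble)

model_liniowy <- lm(m2.price ~ construction.year + surface + floor + no.rooms + district,
                    data = apartments)
explainer_lm <- explain(model_liniowy,
                        data = apartmentsTest_df[, 2:6],
                        y = apartmentsTest_df$m2.price)

# Variable importance computed on the data.frame version of the test frame
vi_lm <- variable_importance(explainer_lm, loss_function = loss_root_mean_square)
```

The same `explainer_lm` can then be passed to `single_variable` as in the original call.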