selected_features for learners that don't support it should be the entirety of features seen in training #935

mb706 · 2023-06-29T19:06:23Z

This way we could correctly query a pipeline that first selects features and then passes the result to a learner. The GraphLearner could then ask the learner at the end which features it used. If that learner supports embedded feature selection (e.g. rpart), this gives the correct value, and even for learners that do not support it the result would still make sense.

Also this would solve mlr-org/mlr3fselect#87

be-marc · 2024-08-17T12:15:31Z

library(mlr3)
library(mlr3learners)

learner = lrn("classif.rpart")
task = tsk("spam")

learner$train(task)
learner$selected_features()

#> [1] "charDollar"      "hp"             
#> [3] "remove"          "charExclamation"
#> [5] "capitalTotal"    "free"  

learner = lrn("classif.log_reg")
learner$train(task)
learner$selected_features()
#> Error: attempt to apply non-function
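
For illustration, the fallback proposed above could be approximated outside of mlr3 with a small wrapper. This is only a sketch: `selected_or_all()` is a hypothetical helper, not part of mlr3, and it assumes that learners with embedded feature selection advertise this through the `"selected_features"` entry in `learner$properties`. The task is passed in explicitly so that the fallback can use `task$feature_names`.

```r
library(mlr3)
library(mlr3learners)

# Hypothetical helper (not part of mlr3): return the learner's selected
# features if it advertises support for them, otherwise fall back to all
# features of the task it was trained on.
selected_or_all = function(learner, task) {
  if ("selected_features" %in% learner$properties) {
    learner$selected_features()
  } else {
    task$feature_names
  }
}

task = tsk("spam")

# rpart supports embedded feature selection, so its own result is used
rpart_learner = lrn("classif.rpart")
rpart_learner$train(task)
selected_or_all(rpart_learner, task)

# log_reg does not, so every feature seen in training is returned
logreg_learner = lrn("classif.log_reg")
logreg_learner$train(task)
selected_or_all(logreg_learner, task)
```

With such a fallback, a caller like GraphLearner would not need to special-case learners without embedded feature selection.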

berndbischl added the Workshop label Aug 16, 2024

berndbischl self-assigned this Aug 17, 2024
