DefaultAlgorithm errors out during cross validation with encoder and feature selection interaction
#2904
Labels
bug
Related to #2903
After running a naive pipeline with preprocessing and a random forest estimator, we use the random forest estimator to create a feature selection component. However, because the feature selection component selects columns after the encoder, the selected columns are post-encoding columns. If the encoder is given different values during validation, it may not produce the columns chosen by the feature selector, and the feature selector errors out because it cannot find them. For example, an original column would be `INTERNODE_21`, but post-encoding it would become `INTERNODE_21_X`. During cross validation, however, certain values are not given to the encoder in some folds, so these columns are not created and feature selection errors out.
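The failure mode can be sketched in plain Python. This is a minimal illustration only: `one_hot_columns` and `select` are hypothetical helpers that stand in for an encoder and a feature selector, not the library's actual components, and the category values `X` and `Y` are made up.

```python
def one_hot_columns(rows, column):
    """Return the column names a naive one-hot encoder would emit when
    fitted on `rows` alone: one name per category value it has seen."""
    return sorted({f"{column}_{row[column]}" for row in rows})


def select(encoded_columns, selected):
    """Mimic a feature selector that requires every selected column to exist."""
    missing = [name for name in selected if name not in encoded_columns]
    if missing:
        raise KeyError(f"columns not found after encoding: {missing}")
    return list(selected)


# Hypothetical data: the category "X" appears only in the first two rows.
data = [
    {"INTERNODE_21": "X"},
    {"INTERNODE_21": "X"},
    {"INTERNODE_21": "Y"},
    {"INTERNODE_21": "Y"},
]

# The selector was configured from the full data, so it expects INTERNODE_21_X.
all_columns = one_hot_columns(data, "INTERNODE_21")
selected = [name for name in all_columns if name.endswith("_X")]

# A cross-validation fold that happens to contain no "X" rows.
fold = data[2:]
fold_columns = one_hot_columns(fold, "INTERNODE_21")

try:
    select(fold_columns, selected)
    outcome = "ok"
except KeyError as err:
    outcome = f"failed: {err}"

print(fold_columns)
print(outcome)
```

The selected column exists only when the encoder sees the value `X`, so any fold without that value reproduces the missing-column error described above.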