FeatureSetSelector does not work when not set as the first item in a template. #1250

perib · 2022-05-17T22:08:04Z

FeatureSetSelector only works when set as the first step of a template. When no template is used, or when it is set to be in the middle of a template, the behavior is not well defined. It would be helpful if the FSS could be used without a template. For example, TPOT can set the FSS to be the first step of the pipeline, but then have the rest of the pipeline be unrestricted.

For example, the following works normally:
template='FeatureSetSelector-Transformer-Classifier'
However, "Transformer-FeatureSetSelector-Classifier" does not work, nor does the base model without a template. There are two issues:
1. When using string column names: those are not preserved by the other transformations, so when FSS is not first, it cannot use the feature names and crashes.
2. When using column indexes: the ordering is not guaranteed to be preserved by the transformations in the other steps. This leads to FSS picking out a different subset than intended while also discarding the rest of the data up to that point.
For example, let's say subset 1 is indexes 0 and 1, and our pipeline is SomeTransformer-FSS-Classifier.
Our data is [0, 1, 8, 9].
The first transformation adds two columns.
The data is now [7, 7, 0, 1, 8, 9].
FSS will now select [7, 7] instead of the intended [0, 1] and discard the rest of the data.
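
To make the index problem concrete, here is a minimal sketch in plain Python. It does not use TPOT at all: `add_two_columns` and `select_by_index` are hypothetical stand-ins for "some transformer" and for an index-based FSS, and the assumption that the transformer prepends its new columns is taken from the example above.

```python
def add_two_columns(rows):
    """Hypothetical transformer: prepends two derived columns (the constant 7)."""
    return [[7, 7] + row for row in rows]


def select_by_index(rows, indices):
    """Stand-in for an index-based FSS: keep only the given column positions."""
    return [[row[i] for i in indices] for row in rows]


subset_1 = [0, 1]       # defined against the ORIGINAL column order
data = [[0, 1, 8, 9]]   # one sample with four original features

# FSS as the first step: the indexes still refer to the original columns.
fss_first = select_by_index(data, subset_1)

# FSS after the transformer: the same indexes now point at the added columns.
fss_second = select_by_index(add_two_columns(data), subset_1)

print(fss_first)   # [[0, 1]]
print(fss_second)  # [[7, 7]]
```

The selector itself behaves identically in both cases. The only thing that changed is what sits at positions 0 and 1 by the time the data reaches it.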

perib · 2022-05-17T22:17:00Z

An additional useful feature, though it may be more difficult to implement, would be to have FSS pass different data to different "branches". For example:

    FSS -> classifier
                      \
                        > classifier
                      /
    FSS -> classifier

     (height, age, weight) -> Regression
                                         \
                                          > regression forest
                                         /
      Genes/proteins - > KNN

This could be possible by using FeatureUnion to group the outputs of two branches. TPOT could be initialized with a FeatureUnion containing a user-specified number of items that each begin with FSS, with the rest being determined through GP.
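
A rough sketch of what that branch layout could look like, written against plain scikit-learn rather than TPOT. Everything here is an assumption for illustration: `select_columns` is a hypothetical stand-in for FSS built on `ColumnTransformer`, the column groupings are made up, and the steps after each selector (`StandardScaler`, `PCA`) are fixed by hand where TPOT would let GP choose them. Since `FeatureUnion` only accepts transformers, the branches below end in transformers and a single classifier sits after the union.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.preprocessing import StandardScaler


def select_columns(indices):
    """Hypothetical stand-in for FSS: keep only the given column indexes."""
    return ColumnTransformer([("keep", "passthrough", indices)], remainder="drop")


# Made-up feature sets: columns 0-2 are clinical, columns 3-7 are genes.
clinical = [0, 1, 2]
genes = [3, 4, 5, 6, 7]

# Each branch starts with its own selector, so both see the original columns.
branches = FeatureUnion([
    ("clinical", Pipeline([("select", select_columns(clinical)),
                           ("scale", StandardScaler())])),
    ("genes", Pipeline([("select", select_columns(genes)),
                        ("reduce", PCA(n_components=2))])),
])

model = Pipeline([
    ("branches", branches),
    ("final", RandomForestClassifier(n_estimators=10, random_state=0)),
])

# Synthetic data, only to show that the pipeline fits and predicts.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 8))
y = (X[:, 0] + X[:, 3] > 0).astype(int)

model.fit(X, y)
combined = model.named_steps["branches"].transform(X)
print(combined.shape)  # 3 scaled clinical columns + 2 gene components
```

Because every selector is the first step of its own branch, the column indexes always refer to the raw input, which sidesteps the ordering problem described in the first comment.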

perib mentioned this issue Sep 21, 2023

TPOT2 and the future of TPOT development -- From the Devs #1322

Open
