Class RFECV needs hard-coded n_features_to_select as an input parameter. #6564

@KeithBrodie

Description

In scikit-learn 0.17.1, class RFECV uses a hard-coded 1 for n_features_to_select. For some classifiers with some parameter sets, the classifier will fail when given only one feature. In addition, since RFECV takes some time to run, it would be helpful to allow the class to be initialized with a min_features_to_select value other than one, chosen by the user based on domain knowledge. One may not be interested in any RFECV solution with fewer than, say, five features. In that case, with the default step of 1, the final four iterations of the recursive fit are wasted cycles.
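To make the "wasted cycles" point concrete, here is a minimal pure-Python sketch of the elimination schedule. It is not the library's code: `elimination_schedule` is a hypothetical helper, and it assumes the loop drops `step` features per iteration without going below the floor.

```python
def elimination_schedule(n_features, step=1, min_features_to_select=1):
    """Return the feature counts visited by a recursive elimination loop.

    Hypothetical helper for illustration only; it models the schedule,
    not the actual fitting or ranking done by RFECV.
    """
    counts = [n_features]
    remaining = n_features
    while remaining > min_features_to_select:
        # Never drop below the floor, even if `step` would overshoot it.
        remaining -= min(step, remaining - min_features_to_select)
        counts.append(remaining)
    return counts


default = elimination_schedule(10)                             # floor of 1
floored = elimination_schedule(10, min_features_to_select=5)   # floor of 5

# Number of subset sizes skipped by raising the floor from 1 to 5.
print(len(default) - len(floored))
```

With ten starting features and a step of 1, raising the floor from one to five skips the subset sizes 4, 3, 2 and 1, and that saving is repeated in every cross-validation fold.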

The change is backward-compatible: anyone using RFECV without specifying min_features_to_select gets the argument default of one, the same behavior that's hard-coded today.

It's a trivial change to add it to the set of initialization parameters for the class. I use a customized version of sklearn 0.17.1 with this change. If I weren't such a GitHub noob I'd submit a pull or push or whatever it is.

Here's my code - modified RFECV init:

    def __init__(self, estimator, step=1, cv=None, scoring=None,
                 estimator_params=None, verbose=0, min_features_to_select=1):
        self.estimator = estimator
        self.step = step
        self.cv = cv
        self.scoring = scoring
        self.estimator_params = estimator_params
        self.min_features_to_select = min_features_to_select
        self.verbose = verbose

and in the fit() method:

        # Initialization
        cv = check_cv(self.cv, X, y, is_classifier(self.estimator))
        scorer = check_scoring(self.estimator, scoring=self.scoring)
        n_features = X.shape[1]
        n_features_to_select = self.min_features_to_select
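For illustration, a call with the new argument might look like the sketch below. It assumes a scikit-learn build whose RFECV accepts `min_features_to_select` (such as the patched version described here); the dataset, estimator, and parameter values are arbitrary choices for the example, not part of the proposal.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LogisticRegression

# Synthetic data: 10 features, 3 of them informative (arbitrary choice).
X, y = make_classification(
    n_samples=200, n_features=10, n_informative=3, random_state=0
)

# Assumption: this RFECV accepts min_features_to_select, as proposed above.
selector = RFECV(
    estimator=LogisticRegression(max_iter=1000),
    step=1,
    cv=3,
    min_features_to_select=5,  # never evaluate subsets smaller than 5
)
selector.fit(X, y)

# The selected subset should never be smaller than the requested floor.
print(selector.n_features_)
```

Leaving `min_features_to_select` out should reproduce the current behavior, since the default of one matches the hard-coded value.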

Labels: Easy (Well-defined and straightforward way to resolve)