# Wrapper-based feature selection
* Wrapper-based methods select features by fitting an estimator rather than by applying a scoring function.

### Recursive Feature Elimination (RFE)
* Uses an estimator to recursively remove features.
* Initially fits an estimator on all features.
* Obtains feature importance from the estimator and removes the least important feature.
* Repeats the process, removing features one at a time, until the desired number of features is reached.
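
The steps above can be sketched with scikit-learn's `RFE` class. This is a minimal illustration, not a recommended setup: the synthetic dataset, the choice of `LogisticRegression`, and the target of 4 features are all assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

# Synthetic data, assumed purely for illustration.
X, y = make_classification(
    n_samples=200, n_features=10, n_informative=4, random_state=0
)

# step=1 removes one feature per iteration; the estimator exposes coef_,
# which RFE uses as the feature importance.
rfe = RFE(
    estimator=LogisticRegression(max_iter=1000),
    n_features_to_select=4,
    step=1,
)
rfe.fit(X, y)

print(rfe.support_)   # boolean mask of the features that were kept
print(rfe.ranking_)   # rank 1 marks a selected feature
X_reduced = rfe.transform(X)
```

`ranking_` records the elimination order, so features removed earliest receive the largest rank.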

### Recursive Feature Elimination Cross Validation (RFECV)

* Use this when you do not want to specify the desired number of features for RFE up front.
* It performs RFE in a cross-validation loop to find the optimal number of features.
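
A minimal sketch of `RFECV`, assuming the same kind of synthetic data and a `LogisticRegression` estimator as illustrative choices. The values of `cv` and `scoring` are also assumptions for the example.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LogisticRegression

# Synthetic data, assumed purely for illustration.
X, y = make_classification(
    n_samples=200, n_features=10, n_informative=4, random_state=0
)

# No n_features_to_select here: the number of features is chosen
# by cross-validated score.
rfecv = RFECV(
    estimator=LogisticRegression(max_iter=1000),
    step=1,
    cv=5,
    scoring="accuracy",
    min_features_to_select=1,
)
rfecv.fit(X, y)

print(rfecv.n_features_)  # number of features chosen by cross-validation
X_reduced = rfecv.transform(X)
```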

### SelectFromModel

* Selects the features whose importance, as obtained from the trained estimator, is above a given threshold. The `max_features` parameter optionally caps how many features are kept.
* The feature importance is obtained via `coef_`, `feature_importances_` or an `importance_getter` callable from the trained estimator.
* The feature importance threshold can be specified either numerically or through a string argument based on built-in heuristics such as `mean`, `median`, and float multiples of these like `0.1*mean`.

```python
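from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectFromModel
from sklearn.svm import LinearSVC

X, y = load_iris(return_X_y=True)  # example data (an assumption); any (X, y) works
# The L1 penalty drives the coefficients of unhelpful features to zero.
clf = LinearSVC(C=0.01, penalty="l1", dual=False).fit(X, y)
# prefit=True reuses the fitted estimator and its coef_ instead of refitting.
model = SelectFromModel(clf, prefit=True)
X_new = model.transform(X)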
```
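
The `threshold` and `max_features` parameters described above can be combined. The sketch below assumes a `RandomForestClassifier` (which exposes `feature_importances_`) and synthetic data, both chosen only for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel

# Synthetic data, assumed purely for illustration.
X, y = make_classification(
    n_samples=200, n_features=10, n_informative=4, random_state=0
)

# Keep features whose importance is at least the median importance,
# but never more than 3 of them.
selector = SelectFromModel(
    RandomForestClassifier(n_estimators=50, random_state=0),
    threshold="median",
    max_features=3,
)
selector.fit(X, y)

print(selector.threshold_)     # numeric value the "median" heuristic resolved to
print(selector.get_support())  # boolean mask of the kept features
X_new = selector.transform(X)
```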

### SequentialFeatureSelector

Performs feature selection by selecting or deselecting features one by one in a greedy manner.



* The `direction` parameter controls whether forward or backward SFS is used.
* In general, forward and backward selection do not yield equivalent results.
* Choose the direction that needs fewer iterations for the required number of selected features.
    * For example, to select 7 features from 10:
        * forward selection would perform 7 iterations.
        * backward selection would perform 3 iterations.
* Unlike RFE and SelectFromModel, SFS does not require the underlying model to expose a `coef_` or `feature_importances_` attribute.
* SFS may be slower than RFE and SelectFromModel as it needs to evaluate more models compared to the other two approaches.
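
A minimal sketch of both directions, selecting 7 of 10 features as in the bullets above. The `KNeighborsClassifier` is an illustrative choice because it exposes neither `coef_` nor `feature_importances_`. The synthetic data and `cv=3` are assumptions as well.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.neighbors import KNeighborsClassifier

# Synthetic data, assumed purely for illustration.
X, y = make_classification(
    n_samples=150, n_features=10, n_informative=4, random_state=0
)

knn = KNeighborsClassifier(n_neighbors=3)  # has no coef_ / feature_importances_

# Backward: start from all 10 features and drop 3, one per iteration.
backward = SequentialFeatureSelector(
    knn, n_features_to_select=7, direction="backward", cv=3
).fit(X, y)

# Forward: start from no features and add 7, one per iteration.
forward = SequentialFeatureSelector(
    knn, n_features_to_select=7, direction="forward", cv=3
).fit(X, y)

print(backward.get_support())
print(forward.get_support())  # the two masks need not be identical
```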

To see why SFS can be slower, consider backward selection: the iteration going from $m$ to $m-1$ features using k-fold cross-validation requires fitting $m \times k$ models, whereas:
* RFE would require only a single fit for the same iteration.
* SelectFromModel performs a single fit overall and requires no iterations.
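
The comparison can be made concrete with a little arithmetic. The helper functions below are hypothetical, written only for this illustration. They count model fits under the assumptions stated above (one feature added or removed per iteration, k-fold cross-validation for every SFS candidate) and ignore any final refit on the selected features.

```python
def sfs_fits(n_features, n_select, k, direction="backward"):
    """Fits needed by SFS: each iteration evaluates m candidates with k folds."""
    if direction == "backward":
        # m runs over the current feature counts: n_features down to n_select + 1
        sizes = range(n_features, n_select, -1)
    else:
        # m runs over the features not yet selected at each forward step
        sizes = range(n_features, n_features - n_select, -1)
    return sum(m * k for m in sizes)


def rfe_fits(n_features, n_select):
    """Fits needed by RFE with one feature removed per iteration."""
    return n_features - n_select


# Selecting 7 of 10 features with 5-fold cross-validation:
print(sfs_fits(10, 7, k=5, direction="backward"))  # (10 + 9 + 8) * 5 = 135
print(sfs_fits(10, 7, k=5, direction="forward"))   # (10 + 9 + ... + 4) * 5 = 245
print(rfe_fits(10, 7))                             # 3
```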