# Recursive Feature Elimination (RFE)
This procedure is used to eliminate predictors (independent variables) that have little or no influence on the outcome of a classifier. It works by repeatedly fitting a model, dropping the least important feature (as judged by the model's coefficients or feature importances), and refitting on the remaining features until the desired number of features is left.
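
The elimination loop described above can be sketched by hand. This is a minimal illustration, not scikit-learn's implementation: the helper name `manual_rfe`, the use of squared logistic regression coefficients as the importance score, and the small demo dataset are all assumptions made for this example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression


def manual_rfe(X, y, n_features_to_select):
    """Hypothetical helper: drop the weakest feature one at a time."""
    remaining = list(range(X.shape[1]))
    ranking = np.ones(X.shape[1], dtype=int)
    while len(remaining) > n_features_to_select:
        # Refit the model on the features that are still in play
        model = LogisticRegression().fit(X[:, remaining], y)
        # Assumed importance score: squared coefficient of each feature
        importance = model.coef_[0] ** 2
        weakest = remaining[int(np.argmin(importance))]
        # Features eliminated earlier receive a larger rank number
        ranking[weakest] = len(remaining) - n_features_to_select + 1
        remaining.remove(weakest)
    support = np.zeros(X.shape[1], dtype=bool)
    support[remaining] = True
    return support, ranking


# Small demo dataset (assumed parameters, for illustration only)
X_demo, y_demo = make_classification(n_samples=500, n_features=4,
                                     n_informative=2, n_redundant=2,
                                     n_clusters_per_class=1, random_state=0)
support, ranking = manual_rfe(X_demo, y_demo, n_features_to_select=2)
print(support)
print(ranking)
```

The selected features keep rank 1, and each eliminated feature gets a rank that records how early it was dropped.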

To illustrate how it works, I will use the function `make_classification` from `sklearn.datasets` to create a binary classification problem. In this case I will create a dataset with 4 predictors: 2 informative and 2 redundant (linear combinations of the informative ones). RFE will be asked to keep only 2 of them and discard the rest.

In [1]:
from sklearn.datasets import make_classification
X, y = make_classification(n_samples=5000, n_features=4, n_informative=2,
                           n_redundant=2, n_repeated=0, n_classes=2,
                           n_clusters_per_class=1,
                           random_state=0)

In [3]:
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

# I will use logistic regression as the base estimator for RFE
logreg = LogisticRegression()

# In this case, we will select 2 features
# (n_features_to_select must be passed by keyword in recent scikit-learn versions)
rfe = RFE(logreg, n_features_to_select=2)

In [4]:
rfe = rfe.fit(X, y)

In [5]:
# print summaries for the selection of attributes
print(rfe.support_)
print(rfe.ranking_)

[ True  True False False]
[1 1 3 2]


The information returned by `rfe.support_` indicates which predictors were selected (`True`) or discarded (`False`) by RFE. In this case only the first 2 features were selected.  
`rfe.ranking_` returns the rank of each feature: selected features have rank 1, and a higher rank means the feature was eliminated earlier. Here the first and second features have rank 1, followed by the 4th feature (rank 2, the last one eliminated) and finally the 3rd feature (rank 3, the first one eliminated).
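
Once the selector is fitted, the mask can be applied to the data. The sketch below rebuilds the same dataset and selector so that it runs on its own; the variable names `selected_idx` and `X_reduced` are only illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

# Same dataset and selector as in the cells above
X, y = make_classification(n_samples=5000, n_features=4, n_informative=2,
                           n_redundant=2, n_repeated=0, n_classes=2,
                           n_clusters_per_class=1,
                           random_state=0)
rfe = RFE(LogisticRegression(), n_features_to_select=2).fit(X, y)

# Column indices of the features that RFE kept
selected_idx = rfe.get_support(indices=True)

# Keep only the selected columns
X_reduced = rfe.transform(X)

print(selected_idx)
print(X_reduced.shape)
```

`rfe.transform(X)` is equivalent to indexing `X` with the boolean mask in `rfe.support_`, so the reduced matrix can be passed directly to any downstream model.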