# Recursive Feature Elimination

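Recursive Feature Elimination (RFE) is a feature selection method. It repeatedly fits a model, removes the weakest attribute (or attributes), and refits on those that remain until the requested number of attributes is left. In scikit-learn's implementation, attributes are ranked by the importance the fitted model assigns to them, such as coefficient magnitudes or feature importances, which identifies the attributes (and combination of attributes) that contribute the most to predicting the target attribute.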
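The elimination loop described above can be sketched in a few lines. This is a simplified illustration, not scikit-learn's implementation: the function name `rfe_sketch`, the plain least-squares fit, and the synthetic data are all assumptions made for the example, and comparing coefficient magnitudes is only meaningful when the features share a similar scale.

```python
import numpy as np


def rfe_sketch(X, y, n_keep):
    """Toy RFE: refit least squares, drop the weakest feature, repeat."""
    remaining = list(range(X.shape[1]))
    ranking = np.ones(X.shape[1], dtype=int)  # kept features stay at rank 1
    while len(remaining) > n_keep:
        # Fit a linear model on the features that are still in play.
        coef, *_ = np.linalg.lstsq(X[:, remaining], y, rcond=None)
        # The feature with the smallest absolute coefficient is the weakest.
        weakest = remaining[int(np.argmin(np.abs(coef)))]
        # Features dropped earlier receive larger rank numbers.
        ranking[weakest] = len(remaining) - n_keep + 1
        remaining.remove(weakest)
    support = np.zeros(X.shape[1], dtype=bool)
    support[remaining] = True
    return support, ranking


# Synthetic data (an assumption for illustration): columns 0, 3 and 4 drive
# the target, columns 1 and 2 are pure noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 3 * X[:, 0] - 2 * X[:, 3] + 1.5 * X[:, 4] + rng.normal(scale=0.1, size=200)

support, ranking = rfe_sketch(X, y, n_keep=3)
print(support)
print(ranking)
```

With this synthetic data the three informative columns (0, 3 and 4) are kept with rank 1, and the two noise columns receive ranks 2 and 3 according to the order in which they were dropped.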

In [9]:
import pandas as pd

df = pd.read_csv("sysco.csv")

df.head()

Unnamed: 0,Fresh,Milk,Grocery,Frozen,Detergents_Paper,Delicatessen
0,12669,9656,7561,214,2674,1338
1,7057,9810,9568,1762,3293,1776
2,6353,8808,7684,2405,3516,7844
3,13265,1196,4221,6404,507,1788
4,22615,5410,7198,3915,1777,5185


In [10]:
# Feature and label split
X_all = df.drop('Delicatessen', axis = 1)
y_all = df['Delicatessen']

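# A notebook cell displays only its last expression,
# so only y_all.head() appears in the output below.
X_all.shape
X_all.head()

y_all.shape
y_all.head()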

0    1338
1    1776
2    7844
3    1788
4    5185
Name: Delicatessen, dtype: int64

In [11]:
# Recursive Feature Elimination
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

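# create a base estimator whose coefficients are used to rank the attributes
# note: LogisticRegression is a classifier, so each distinct Delicatessen
# value is treated as its own class
model = LogisticRegression()

# create the RFE model and select 3 attributes
# (n_features_to_select must be passed by keyword in recent scikit-learn releases)
rfe = RFE(model, n_features_to_select=3)
rfe = rfe.fit(X_all, y_all)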

# summarize the selection of the attributes
print(rfe.support_)
print(rfe.ranking_)

[ True False False  True  True]
[1 2 3 1 1]


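Both arrays follow the column order of X_all. The features marked as "True" in support_ are the three that were kept, and each of them has rank 1 in ranking_. The features marked as "False" were eliminated, and their rank shows the order of elimination: the feature ranked 2 (the second column) was the last one dropped, and the feature ranked 3 (the third column) was dropped before it.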
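The printed arrays are easier to read when paired with feature names. The sketch below does this in plain Python using the two arrays printed above. The helper `summarize_rfe` is hypothetical, and the list of names is an assumption: it takes the five spending columns from the df.head() output in order. In the notebook itself, `list(X_all.columns)`, `rfe.support_` and `rfe.ranking_` would be passed instead.

```python
def summarize_rfe(names, support, ranking):
    """Pair feature names with RFE output (hypothetical helper)."""
    # Features flagged True are the ones RFE kept.
    selected = [n for n, keep in zip(names, support) if keep]
    # Eliminated features, sorted so the last one dropped comes first.
    eliminated = sorted(
        (rank, n) for n, keep, rank in zip(names, support, ranking) if not keep
    )
    return selected, eliminated


# Assumed column order; replace with list(X_all.columns) in the notebook.
names = ["Fresh", "Milk", "Grocery", "Frozen", "Detergents_Paper"]
support = [True, False, False, True, True]  # printed rfe.support_
ranking = [1, 2, 3, 1, 1]                   # printed rfe.ranking_

selected, eliminated = summarize_rfe(names, support, ranking)
print("kept:", selected)
for rank, name in eliminated:
    print(f"rank {rank}: {name}")
```

Under the assumed column order, this lists Fresh, Frozen and Detergents_Paper as kept, followed by Milk at rank 2 and Grocery at rank 3.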