Scikit-learn provides the RFE class, which selects features by recursive feature elimination. The method repeatedly fits an estimator and removes the least important features, where importance is read from the fitted estimator's coef_ or feature_importances_ attribute.
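
To make the elimination loop concrete, here is a simplified sketch of the idea written by hand. It is not scikit-learn's actual implementation. The synthetic data from make_regression, the DecisionTreeRegressor estimator, and the variable names are all assumptions chosen only to keep the example small and fast.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data (assumption): 6 features, 3 of them informative.
X, y = make_regression(n_samples=200, n_features=6, n_informative=3,
                       random_state=0)

remaining = list(range(X.shape[1]))  # column indices still in play
eliminated = []                      # columns in the order they were dropped
n_features_to_select = 3

while len(remaining) > n_features_to_select:
    # Refit on the surviving columns only.
    model = DecisionTreeRegressor(random_state=0).fit(X[:, remaining], y)
    # Drop the column with the smallest importance in this round.
    weakest = int(np.argmin(model.feature_importances_))
    eliminated.append(remaining.pop(weakest))

print("Kept columns:", remaining)
print("Elimination order:", eliminated)
```

Each pass refits the estimator on the surviving columns, so importances are re-evaluated after every removal rather than computed once up front.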

In [1]:
# Import libraries
from sklearn.feature_selection import RFE
from sklearn.ensemble import AdaBoostRegressor
# Note: load_boston was removed in scikit-learn 1.2, so this import needs an older release.
from sklearn.datasets import load_boston
from numpy import array

In [2]:
# Load dataset
boston = load_boston()
x = boston.data
y = boston.target

print("Feature data dimension: ", x.shape)  

Feature data dimension:  (506, 13)


The feature data has 506 rows and 13 columns. Our goal is to reduce the number of columns by keeping the 8 best features according to their ranking.
 

Next, we'll define the model using the RFE class.

The class requires an estimator that exposes feature importances after fitting, and the AdaBoostRegressor meta-estimator serves this purpose. The n_features_to_select parameter sets the target number of features to keep, and step sets how many features are removed in each round.

We'll fit the selector on the full x and y data.

In [3]:
estimator = AdaBoostRegressor(random_state=0, n_estimators=100)
selector = RFE(estimator, n_features_to_select=8, step=1)
selector = selector.fit(x, y)
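
To see what step changes, the sketch below fits RFE twice on synthetic stand-in data, once with step=1 and once with step=3. The make_regression data, the fit_rfe helper, and the smaller n_estimators value are assumptions made only to keep the demo quick. They are not part of the example above.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import AdaBoostRegressor
from sklearn.feature_selection import RFE

# Synthetic stand-in data (assumption): 13 features, like the dataset above.
X_demo, y_demo = make_regression(n_samples=200, n_features=13,
                                 n_informative=8, random_state=0)

def fit_rfe(step):
    # Hypothetical helper; fewer boosting rounds than above to keep it fast.
    estimator = AdaBoostRegressor(random_state=0, n_estimators=20)
    return RFE(estimator, n_features_to_select=8, step=step).fit(X_demo, y_demo)

rfe_step1 = fit_rfe(1)
rfe_step3 = fit_rfe(3)

# Both runs keep 8 features; they differ in how many rounds were needed.
print("step=1: kept", rfe_step1.n_features_, "worst rank", rfe_step1.ranking_.max())
print("step=3: kept", rfe_step3.n_features_, "worst rank", rfe_step3.ranking_.max())
```

With step=1 the five surplus features are dropped one per round, so the ranks run up to 6. With step=3 the first round drops three features and the second drops the remaining two, so the ranks only run up to 3. A larger step means fewer refits but a coarser ranking.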

In [4]:
# After fitting we can obtain selected features and their ranking positions.
filter = selector.support_
ranking = selector.ranking_

print("Mask data: ", filter)
print("Ranking: ", ranking) 

Mask data:  [ True False False False  True  True False  True  True  True  True False
  True]
Ranking:  [1 5 3 6 1 1 4 1 1 1 1 2 1]
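
A rank of 1 marks a selected feature, and a higher rank means the feature was eliminated earlier. As a small illustration, the snippet below reconstructs the elimination order from the ranking printed above. The arrays are copied by hand from that output, and the variable names are only illustrative.

```python
import numpy as np

# Values copied from the output printed above.
names = np.array(['CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE',
                  'DIS', 'RAD', 'TAX', 'PTRATIO', 'B', 'LSTAT'])
ranking = np.array([1, 5, 3, 6, 1, 1, 4, 1, 1, 1, 1, 2, 1])

# Features with a rank above 1 were eliminated.
dropped = ranking > 1
# Sort them from highest rank (removed first) to lowest (removed last).
order = np.argsort(-ranking[dropped])
elimination_order = names[dropped][order].tolist()

print("Elimination order:", elimination_order)
```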


In [5]:
features = array(boston.feature_names)
print("All features:")
print(features)

print("Selected features:")
print(features[filter])

All features:
['CRIM' 'ZN' 'INDUS' 'CHAS' 'NOX' 'RM' 'AGE' 'DIS' 'RAD' 'TAX' 'PTRATIO'
 'B' 'LSTAT']
Selected features:
['CRIM' 'NOX' 'RM' 'DIS' 'RAD' 'TAX' 'PTRATIO' 'LSTAT']
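
Once the selector is fitted, it can also reduce the feature matrix directly. The sketch below shows this with transform and get_support on synthetic stand-in data, since it has to run on its own. The make_regression data and the smaller n_estimators value are assumptions. With the selector fitted above, the equivalent call would be selector.transform(x).

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import AdaBoostRegressor
from sklearn.feature_selection import RFE

# Synthetic stand-in data (assumption): 13 features, like the dataset above.
X_demo, y_demo = make_regression(n_samples=200, n_features=13,
                                 n_informative=8, random_state=0)

rfe = RFE(AdaBoostRegressor(random_state=0, n_estimators=20),
          n_features_to_select=8, step=1)
rfe.fit(X_demo, y_demo)

# transform keeps only the selected columns, in their original order.
X_reduced = rfe.transform(X_demo)
# get_support(indices=True) returns column positions instead of a boolean mask.
kept_idx = rfe.get_support(indices=True)

print("Reduced shape:", X_reduced.shape)
print("Kept column indices:", kept_idx)
```

The reduced matrix is the same as indexing the original data with the boolean mask in support_, so either form can feed a downstream model.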
