# Feature Selection Using Models Learned Thus Far...

## Feature Selection Using SelectFromModel

SelectFromModel is a meta-transformer that can be used with any estimator that exposes a coef_ or feature_importances_ attribute after fitting.

A feature is considered unimportant and removed if its coef_ value (taken in absolute value) or feature_importances_ value is below the threshold parameter.

Besides a numeric threshold, you can pass a string argument to use a built-in heuristic for finding the threshold.

Available heuristics are "mean", "median", and float multiples of these, such as "0.1*mean".
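As a small illustration of the threshold options, the sketch below fits SelectFromModel once with a numeric threshold and once with each string heuristic, then counts how many features survive. It uses synthetic data from make_regression rather than the Boston data, and the sample sizes, forest settings, and the `kept` dictionary are assumptions made for this example only.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import SelectFromModel

# Synthetic data (assumption): 10 features, only 3 of which drive the target
X, y = make_regression(n_samples=200, n_features=10, n_informative=3,
                       random_state=0)

kept = {}  # hypothetical helper dict: threshold setting -> number of features kept
for threshold in (0.1, "mean", "median", "0.1*mean"):
    forest = RandomForestRegressor(n_estimators=50, random_state=0)
    sfm = SelectFromModel(forest, threshold=threshold).fit(X, y)
    kept[str(threshold)] = int(sfm.get_support().sum())
    # threshold_ holds the numeric cutoff that was actually applied
    print(threshold, "-> cutoff", sfm.threshold_, "-> kept", kept[str(threshold)])
```

A lower cutoff such as "0.1*mean" can never keep fewer features than "mean", since both are applied to the same importance scores.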

### Example 1: Fit a Random Forest model and use SelectFromModel to keep important features

In [18]:
from sklearn.datasets import load_boston  # removed in scikit-learn 1.2; needs an older release
from sklearn.feature_selection import SelectFromModel
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

boston = load_boston()  # load the data before splitting it

Xtrain, Xtest, ytrain, ytest = train_test_split(boston.data, boston.target,
                                                random_state=0)

# Fit a random forest; each feature gets an importance score
forest = RandomForestRegressor(n_estimators=200)
formodel = forest.fit(Xtrain, ytrain)

print(formodel.feature_importances_)


[0.03809578 0.00089448 0.00756718 0.00120545 0.01594091 0.3939216
 0.01391059 0.03554424 0.00360569 0.01893587 0.02171648 0.01040486
 0.43825687]


In [26]:
# Keep only features whose importance is at least 0.25
sfm = SelectFromModel(formodel, threshold=.25)
sfm.fit(Xtrain, ytrain) # refits a clone of the forest, so importances can differ slightly from those printed above
Xtrain_new = sfm.transform(Xtrain) # transform data to insert into new model

print(Xtrain_new[0:5,:]) # only two variables in X now

print(Xtrain.shape) # compare to original data with 13 variables

[[ 5.605 18.46 ]
 [ 5.927  9.22 ]
 [ 7.267  6.05 ]
 [ 6.471 17.12 ]
 [ 6.782 25.79 ]]
(379, 13)
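One possible way to "insert the transformed data into a new model" is to chain the selector and the downstream model in a Pipeline, so that the same columns are dropped from the training and test data automatically. The sketch below assumes synthetic data from make_regression and a LinearRegression as the downstream model; neither choice comes from the example above.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Synthetic stand-in data (assumption): 13 features, 3 of them informative
X, y = make_regression(n_samples=300, n_features=13, n_informative=3,
                       noise=5.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

pipe = make_pipeline(
    # Step 1: keep features whose forest importance is at least the mean importance
    SelectFromModel(RandomForestRegressor(n_estimators=100, random_state=0),
                    threshold="mean"),
    # Step 2: fit the downstream model on the reduced feature set
    LinearRegression(),
)
pipe.fit(X_tr, y_tr)

selector = pipe.named_steps["selectfrommodel"]
n_kept = int(selector.get_support().sum())
r2 = pipe.score(X_te, y_te)  # R^2 on held-out data, using only the kept features
print("features kept:", n_kept, "test R^2:", round(r2, 3))
```

Because the selector is fit inside the pipeline, calling pipe.predict or pipe.score on new data applies the same column subset without a separate transform call.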


### Example 2: Fit a Lasso model and use SelectFromModel to keep important features

In [39]:
from sklearn.linear_model import Lasso
from sklearn.feature_selection import SelectFromModel

lassomodel = Lasso(alpha=10).fit(Xtrain, ytrain)

# prefit=True tells SelectFromModel the estimator is already fitted, so it is not refit.
# With no threshold given, Lasso models default to 1e-5, which keeps the nonzero coefficients.
model = SelectFromModel(lassomodel, prefit=True)

X_new = model.transform(Xtrain) # transform data to insert into new model

print(lassomodel.coef_)
print(X_new.shape) # down to four variables from 13



[-0.          0.03268736 -0.          0.          0.          0.
  0.         -0.          0.         -0.01155886 -0.          0.00679307
 -0.54971232]


(379, 4)
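To see which columns a prefit Lasso selector keeps, one option is to inspect get_support(). The sketch below runs on synthetic data (an assumption, as are the alpha value and variable names) and compares the mask against the absolute coefficients using the default 1e-5 cutoff that scikit-learn applies to Lasso estimators when no threshold is given.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import Lasso

# Synthetic stand-in data (assumption): 13 features, 4 of them informative
X, y = make_regression(n_samples=200, n_features=13, n_informative=4,
                       noise=10.0, random_state=0)

lasso = Lasso(alpha=5.0).fit(X, y)

# prefit=True: reuse the fitted Lasso as is, without refitting
selector = SelectFromModel(lasso, prefit=True)
mask = selector.get_support()      # boolean mask over the 13 columns
X_new = selector.transform(X)      # only the columns where mask is True

print("kept column indices:", np.flatnonzero(mask))
print("reduced shape:", X_new.shape)
```

The kept indices line up with the nonzero entries of lasso.coef_, which is why an L1-penalized model can act as a feature selector on its own.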

# Using Recursive Feature Elimination to Choose Features

Given an external estimator that assigns weights to features (e.g., the coefficients of a linear model), the goal of recursive feature elimination (RFE) is to select features by recursively considering smaller and smaller sets of features. 

First, the estimator is trained on the initial set of features and the importance of each feature is obtained either through a coef_ attribute or through a feature_importances_ attribute.

Then, the least important features are pruned from the current set of features. That procedure is recursively repeated on the pruned set until the desired number of features to select is eventually reached.

Basic algorithm:
Start with the full model, fit on Xtrain and ytrain.  Rank the features by importance.  Drop the feature that helps least (or the step least important features).  Refit on the remaining n-1 features and repeat until the desired number of features is left.
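The elimination loop described above can be sketched by hand with NumPy. This is a simplified illustration, not scikit-learn's implementation: the function name rfe_sketch is hypothetical, ordinary least squares stands in for the estimator, squared coefficients serve as the importance score, and exactly one feature is dropped per round. The toy features are drawn on a common scale so that coefficient sizes are comparable.

```python
import numpy as np

def rfe_sketch(X, y, n_keep):
    """Hypothetical helper: drop the weakest feature until n_keep remain."""
    remaining = list(range(X.shape[1]))
    ranking = np.ones(X.shape[1], dtype=int)  # kept features end with rank 1
    while len(remaining) > n_keep:
        # Fit least squares (with an intercept column) on the remaining features
        A = np.column_stack([X[:, remaining], np.ones(len(X))])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        importance = coef[:-1] ** 2            # ignore the intercept
        worst = int(np.argmin(importance))
        # Features dropped earlier receive a larger (worse) rank
        ranking[remaining[worst]] = len(remaining) - n_keep + 1
        del remaining[worst]
    return sorted(remaining), ranking

# Toy data (assumption): only columns 0 and 3 influence y
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 6))
y = 3 * X[:, 0] - 2 * X[:, 3] + 0.1 * rng.standard_normal(200)

selected, ranking = rfe_sketch(X, y, n_keep=2)
print("selected:", selected)
print("ranking:", ranking)
```

The ranking follows the same convention as RFE's ranking_ attribute: selected features get rank 1, and higher ranks mean the feature was eliminated earlier.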

In [61]:
# EXAMPLE: RFE to find the 5 features that help the model predict best:

from sklearn.linear_model import LinearRegression
from sklearn.feature_selection import RFE

estimator = LinearRegression().fit(Xtrain, ytrain) # model with all X variables (RFE refits a clone of it)

# step tells RFE how many features to remove each time the model's features are evaluated
selector = RFE(estimator, n_features_to_select=5, step=1)

selector = selector.fit(Xtrain, ytrain) # fit RFE estimator.

print("Num Features: %d" % selector.n_features_)
print("Selected Features: %s" % selector.support_) # True for the five selected features
print("Feature Ranking: %s" % selector.ranking_)  # selected features have rank 1; higher rank = dropped earlier

Num Features: 5
Selected Features: [False False False  True  True  True False  True False False  True False
 False]
Feature Ranking: [3 5 9 1 1 1 8 1 4 6 1 7 2]


In [62]:
# Transform X data for other use in this model or other models:

Xnew = selector.transform(Xtrain) # reduces X to the subset identified above
Xnew.shape

(379, 5)

## Can you transform the following dataset using different feature selection techniques?
How do models differ if you subset the data versus leaving it unchanged?

In [66]:
from sklearn.datasets import load_breast_cancer
bc = load_breast_cancer()
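One possible starting point for this exercise, offered as a sketch rather than a reference answer, is to compare cross-validated accuracy for a model trained on all features against the same model trained on a selected subset. The choice of LogisticRegression, the scaling step, the random forest used for selection, and the "median" threshold are all assumptions; swapping in RFE or a Lasso-style selector follows the same pattern.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

bc = load_breast_cancer()
X, y = bc.data, bc.target  # 30 features, binary target

# Baseline: all 30 features
full = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# Reduced: keep features whose forest importance is at least the median
reduced = make_pipeline(
    StandardScaler(),
    SelectFromModel(RandomForestClassifier(n_estimators=100, random_state=0),
                    threshold="median"),
    LogisticRegression(max_iter=1000),
)

# Selection happens inside each CV fold, so held-out folds do not leak into it
score_full = cross_val_score(full, X, y, cv=5).mean()
score_reduced = cross_val_score(reduced, X, y, cv=5).mean()
print("all features:     ", round(score_full, 3))
print("selected features:", round(score_reduced, 3))
```

Comparing the two scores, and repeating with other selectors or thresholds, gives a concrete way to answer the question above.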