# Wrapper Methods
* In wrapper methods, we train a model on different subsets of the features and keep the subset with which the wrapped model performs best.
* The feature selection process is based on a specific machine learning algorithm that we are trying to fit on a given dataset.
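
The idea can be sketched in its most literal form: score every feature subset with the wrapped model and keep the best one. The snippet below is only an illustration under assumptions not found in the original: the data is synthetic (`make_classification`), the wrapped model is a `LogisticRegression`, and `score_subset` is a hypothetical helper. Exhaustive search is feasible only for a handful of features, which is why the methods discussed here search greedily instead.

```python
from itertools import combinations

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic data (an assumption for illustration): 5 features, 2 informative.
X, y = make_classification(
    n_samples=200, n_features=5, n_informative=2, n_redundant=0, random_state=0
)
model = LogisticRegression(max_iter=1000)


def score_subset(columns):
    """Mean 3-fold cross-validated accuracy of the wrapped model on these columns."""
    return cross_val_score(model, X[:, list(columns)], y, cv=3).mean()


# Brute-force wrapper: enumerate every non-empty subset and keep the best scorer.
all_subsets = [
    subset
    for size in range(1, X.shape[1] + 1)
    for subset in combinations(range(X.shape[1]), size)
]
best_subset = max(all_subsets, key=score_subset)

print("Subsets evaluated:", len(all_subsets))
print("Best subset (column indices):", best_subset)
```

With `n` features there are `2**n - 1` non-empty subsets, so the cost of this approach doubles with every added feature.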

Here we will discuss two kinds of wrapper methods:
1. Recursive Feature Elimination (RFE)
2. Sequential Feature Selection (SFS)
    1. Forward selection
    2. Backward elimination
    3. Bi-directional elimination (Stepwise Selection)
    
    
<i>*</i> Genetic-algorithm-based wrapper feature selection is also widely used.
    
For more information, see these articles:
1. [Recursive Feature Elimination](https://towardsdatascience.com/feature-selection-in-python-recursive-feature-elimination-19f1c39b8d15)
2. [Sequential Feature Selection](https://towardsdatascience.com/feature-selection-using-wrapper-methods-in-python-f0d352b346f)

## Recursive Feature Elimination (RFE)
* RFE considers smaller and smaller subsets of features with each iteration, pruning the least important feature at every step.
* The pruning is repeated recursively until the desired number of features remains.
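
The pruning loop described above can be written out by hand. This is a minimal sketch, not scikit-learn's actual implementation: the data is synthetic, `manual_rfe` is a hypothetical helper, and importance is taken from the random forest's `feature_importances_` attribute, with one feature dropped per iteration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data (an assumption for illustration): 6 features, 3 informative.
X, y = make_classification(
    n_samples=200, n_features=6, n_informative=3, n_redundant=0, random_state=0
)


def manual_rfe(X, y, n_features_to_select):
    """Refit on the surviving columns and drop the least important one each round."""
    remaining = list(range(X.shape[1]))
    eliminated = []  # column indices in the order they were pruned
    while len(remaining) > n_features_to_select:
        model = RandomForestClassifier(n_estimators=50, random_state=1)
        model.fit(X[:, remaining], y)
        # Position (within `remaining`) of the feature with the lowest importance.
        weakest = int(np.argmin(model.feature_importances_))
        eliminated.append(remaining.pop(weakest))
    return remaining, eliminated


kept, dropped = manual_rfe(X, y, n_features_to_select=3)
print("Kept columns:", kept)
print("Dropped columns (first pruned first):", dropped)
```

The elimination order is what a ranking is built from: the features that survive to the end share rank 1, and features pruned earlier get higher rank numbers.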

In [1]:
import pandas as pd

In [2]:
data = pd.read_csv("https://raw.githubusercontent.com/rahul96rajan/sample_datasets/master/diabetes.csv")
display(data.head())

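| | Pregnancies | Glucose | BloodPressure | SkinThickness | Insulin | BMI | DiabetesPedigreeFunction | Age | Outcome |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 6 | 148 | 72 | 35 | 0 | 33.6 | 0.627 | 50 | 1 |
| 1 | 1 | 85 | 66 | 29 | 0 | 26.6 | 0.351 | 31 | 0 |
| 2 | 8 | 183 | 64 | 0 | 0 | 23.3 | 0.672 | 32 | 1 |
| 3 | 1 | 89 | 66 | 23 | 94 | 28.1 | 0.167 | 21 | 0 |
| 4 | 0 | 137 | 40 | 35 | 168 | 43.1 | 2.288 | 33 | 1 |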


In [3]:
X = data.drop("Outcome", axis=1)
y = data['Outcome']

In [4]:
from sklearn.feature_selection import RFE
from sklearn.ensemble import RandomForestClassifier

# The estimator is a random forest, so the variable is named accordingly.
rf_clf = RandomForestClassifier(random_state=1)
rfe = RFE(rf_clf, n_features_to_select=5)  # The 5 best features will be selected

In [5]:
rfe_fit = rfe.fit(X, y)

In [6]:
ref_df = pd.DataFrame({
    "Features": X.columns,
    "Ranking": rfe_fit.ranking_,
    "Support": rfe_fit.support_
})
display(ref_df)

| | Features | Ranking | Support |
|---|---|---|---|
| 0 | Pregnancies | 2 | False |
| 1 | Glucose | 1 | True |
| 2 | BloodPressure | 1 | True |
| 3 | SkinThickness | 4 | False |
| 4 | Insulin | 3 | False |
| 5 | BMI | 1 | True |
| 6 | DiabetesPedigreeFunction | 1 | True |
| 7 | Age | 1 | True |


In [7]:
print("\n\t:: Values selected are :: ")
display(ref_df[rfe_fit.support_])


	:: Values selected are :: 


| | Features | Ranking | Support |
|---|---|---|---|
| 1 | Glucose | 1 | True |
| 2 | BloodPressure | 1 | True |
| 5 | BMI | 1 | True |
| 6 | DiabetesPedigreeFunction | 1 | True |
| 7 | Age | 1 | True |
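
Once the selector is fitted, its boolean `support_` mask can be used to reduce the feature matrix before training a downstream model. The sketch below uses synthetic data and made-up column names (`f0` to `f7`) rather than the diabetes dataset, so it runs without a download.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE

# Synthetic stand-in for a real dataset (an assumption for illustration).
features, target = make_classification(
    n_samples=200, n_features=8, n_informative=4, n_redundant=0, random_state=0
)
X_demo = pd.DataFrame(features, columns=[f"f{i}" for i in range(8)])

rfe_demo = RFE(
    RandomForestClassifier(n_estimators=50, random_state=1),
    n_features_to_select=5,
)
rfe_demo.fit(X_demo, target)

# Index the DataFrame with the boolean mask so the column names are preserved.
X_selected = X_demo.loc[:, rfe_demo.support_]
print("Selected columns:", list(X_selected.columns))
print("Reduced shape:", X_selected.shape)
```

Calling `rfe_demo.transform(X_demo)` performs the same reduction; indexing with `support_` is used here because it keeps the result a labelled DataFrame.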


## Sequential Feature Selection (SFS)
<u>Prerequisite</u>: make sure the **'mlxtend'** package is installed.

If it is not installed, use one of the following:

> pip install mlxtend (Windows) / pip3 install mlxtend (Linux/macOS)

> conda install mlxtend (for Anaconda users)
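
To check from Python whether the package is already available before running the cells that follow, one option is the standard-library `importlib` lookup sketched here (the variable name `mlxtend_available` is just for illustration):

```python
import importlib.util

# find_spec returns None when the package cannot be located on the import path.
mlxtend_available = importlib.util.find_spec("mlxtend") is not None
print("mlxtend installed:", mlxtend_available)
```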

## 1. Forward selection
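In forward selection, we start with a null model (no features), fit the model with each candidate feature one at a time, and add the feature that improves the model the most. The classical statistical version adds the feature with the minimum p-value; mlxtend's `SequentialFeatureSelector`, used here, instead adds the feature that gives the best cross-validated score (R² in this example). The process repeats until `k_features` features have been selected.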
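
The greedy loop behind forward selection can be sketched by hand with scikit-learn alone. This is an illustration under assumptions, not mlxtend's implementation: the data is synthetic (`make_regression`), `forward_select` is a hypothetical helper, and candidates are compared by mean 5-fold cross-validated R².

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Synthetic regression data (an assumption for illustration).
X, y = make_regression(
    n_samples=200, n_features=6, n_informative=3, noise=10.0, random_state=0
)


def forward_select(X, y, k_features, cv=5):
    """Greedily add the column that yields the best cross-validated R^2."""
    selected = []
    remaining = list(range(X.shape[1]))
    while len(selected) < k_features:
        # Score each candidate together with the columns chosen so far.
        scores = {
            col: cross_val_score(
                LinearRegression(), X[:, selected + [col]], y, cv=cv, scoring="r2"
            ).mean()
            for col in remaining
        }
        best = max(scores, key=scores.get)
        selected.append(best)
        remaining.remove(best)
    return selected


chosen = forward_select(X, y, k_features=3)
print("Columns in the order they were added:", chosen)
```

Because a feature is never removed once added, the result depends on the order of additions; the floating variant discussed later relaxes this.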

In [8]:
data = pd.read_csv("https://raw.githubusercontent.com/rahul96rajan/sample_datasets/master/boston_housing.csv")
display(data.head())

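| | CRIM | ZN | INDUS | CHAS | NOX | RM | AGE | DIS | RAD | TAX | PTRATIO | B | LSTAT | MEDV |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.00632 | 18.0 | 2.31 | 0.0 | 0.538 | 6.575 | 65.2 | 4.09 | 1.0 | 296.0 | 15.3 | 396.9 | 4.98 | 24.0 |
| 1 | 0.02731 | 0.0 | 7.07 | 0.0 | 0.469 | 6.421 | 78.9 | 4.9671 | 2.0 | 242.0 | 17.8 | 396.9 | 9.14 | 21.6 |
| 2 | 0.02729 | 0.0 | 7.07 | 0.0 | 0.469 | 7.185 | 61.1 | 4.9671 | 2.0 | 242.0 | 17.8 | 392.83 | 4.03 | 34.7 |
| 3 | 0.03237 | 0.0 | 2.18 | 0.0 | 0.458 | 6.998 | 45.8 | 6.0622 | 3.0 | 222.0 | 18.7 | 394.63 | 2.94 | 33.4 |
| 4 | 0.06905 | 0.0 | 2.18 | 0.0 | 0.458 | 7.147 | 54.2 | 6.0622 | 3.0 | 222.0 | 18.7 | 396.9 | 5.33 | 36.2 |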


In [9]:
X, y = data.drop('MEDV', axis=1), data['MEDV']

In [10]:
from mlxtend.feature_selection import SequentialFeatureSelector as SFS
from sklearn.linear_model import LinearRegression

fwd_sfs = SFS(LinearRegression(), k_features=10, forward=True, scoring='r2')

In [11]:
fwd_sfs.fit(X, y)
print("Features Selected (forward selection): ", fwd_sfs.k_feature_names_)

Features Selected (forward selection):  ('CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'DIS', 'PTRATIO', 'B', 'LSTAT')


## 2. Backward elimination
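In backward elimination, we start with the full model (including all the independent variables) and remove the least useful feature at each step. The classical statistical version removes the feature with the highest p-value; mlxtend's `SequentialFeatureSelector` instead removes the feature whose removal hurts the cross-validated score the least. This process repeats until only the desired number of features (`k_features`) remains.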

In [12]:
bck_sfs = SFS(LinearRegression(), k_features=10, forward=False, scoring='r2')
bck_sfs.fit(X, y)
print("Features Selected (backward elimination):", bck_sfs.k_feature_names_)

Features Selected (backward elimination): ('CRIM', 'ZN', 'NOX', 'AGE', 'DIS', 'RAD', 'TAX', 'PTRATIO', 'B', 'LSTAT')
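
If installing mlxtend is not an option, scikit-learn (version 0.24 and later) ships its own `SequentialFeatureSelector` in `sklearn.feature_selection`, which supports backward elimination through `direction="backward"`. The sketch below runs it on synthetic data rather than the housing dataset; it does not offer a floating mode.

```python
from sklearn.datasets import make_regression
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LinearRegression

# Synthetic regression data (an assumption for illustration).
X, y = make_regression(
    n_samples=200, n_features=6, n_informative=3, noise=10.0, random_state=0
)

# Start from all 6 columns and remove one per step until 3 remain,
# judging each removal by 5-fold cross-validated R^2.
selector = SequentialFeatureSelector(
    LinearRegression(),
    n_features_to_select=3,
    direction="backward",
    scoring="r2",
    cv=5,
)
selector.fit(X, y)

kept = selector.get_support(indices=True)
print("Columns kept after backward elimination:", kept)
```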


## 3. Bi-directional elimination (Stepwise Selection)
It is similar to forward selection, but each time a new feature is added it also re-checks the features that were already selected. If any of them is no longer useful, it is removed through a backward elimination step.
Hence, it is a combination of forward selection and backward elimination. In mlxtend this behaviour is enabled with `floating=True`.
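
The add-then-recheck logic can be sketched as a simplified sequential floating forward selection. This is an assumption-laden illustration, not mlxtend's code: the data is synthetic, and `cv_r2`, `floating_forward_select`, and `best_by_size` are hypothetical names. After every forward step, a feature is removed only if the smaller subset beats the best score recorded so far for that subset size.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Synthetic regression data (an assumption for illustration).
X, y = make_regression(
    n_samples=150, n_features=6, n_informative=3, noise=10.0, random_state=0
)


def cv_r2(columns):
    """Mean 3-fold cross-validated R^2 of a linear model on the given columns."""
    return cross_val_score(
        LinearRegression(), X[:, columns], y, cv=3, scoring="r2"
    ).mean()


def floating_forward_select(n_total, k_features):
    selected = []
    best_by_size = {}  # subset size -> (best score seen, columns)
    while len(selected) < k_features:
        # Forward step: add the candidate that gives the best score.
        candidates = [c for c in range(n_total) if c not in selected]
        best_new = max(candidates, key=lambda c: cv_r2(selected + [c]))
        selected = selected + [best_new]
        score = cv_r2(selected)
        size = len(selected)
        if size not in best_by_size or score > best_by_size[size][0]:
            best_by_size[size] = (score, list(selected))
        # Conditional backward steps: drop a feature only if the reduced subset
        # strictly beats the best subset of that size found so far.
        while len(selected) > 2:
            drop = max(selected, key=lambda c: cv_r2([s for s in selected if s != c]))
            reduced = [s for s in selected if s != drop]
            reduced_score = cv_r2(reduced)
            if reduced_score > best_by_size[len(reduced)][0]:
                selected = reduced
                best_by_size[len(reduced)] = (reduced_score, list(reduced))
            else:
                break
    return best_by_size[k_features][1]


chosen = floating_forward_select(X.shape[1], k_features=3)
print("Columns selected with floating forward selection:", chosen)
```

Requiring a strict improvement before each removal is what keeps the loop from cycling between the same subsets.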

In [13]:
bi_direct_sfs = SFS(LinearRegression(), k_features=10, forward=True, floating=True, scoring='r2')
bi_direct_sfs.fit(X, y)
# Label corrected: this cell runs bi-directional selection, not backward elimination.
print("Features Selected (bi-directional elimination):", bi_direct_sfs.k_feature_names_)

Features Selected (bi-directional elimination): ('CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'DIS', 'PTRATIO', 'B', 'LSTAT')
