## Wrapper Methods
Wrapper methods select features by repeatedly training a model on candidate feature subsets and comparing the resulting scores. Because every candidate requires a full model fit (usually under cross-validation) and the search is driven by heuristics such as greedy or exhaustive search, they are computationally intensive. They are therefore practical mainly for small and mid-sized datasets; on large real-world datasets the cost often rules them out. Common wrapper methods include Forward Feature Selection, Backward Feature Elimination and Exhaustive Feature Search in mlxtend, and Recursive Feature Elimination in sklearn.
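Forward Feature Selection is named above but not demonstrated in this notebook, so here is a minimal pure-Python sketch of the greedy idea. The `forward_select` helper and the `toy_score` function are hypothetical and exist only for illustration; in practice the score would be something like the cross-validated accuracy of a model trained on the candidate subset.

```python
def forward_select(n_features, score_fn, k):
    """Greedily add, one at a time, the feature that most improves the score."""
    selected = []
    remaining = list(range(n_features))
    while len(selected) < k and remaining:
        # Score every candidate subset formed by adding one remaining feature
        best = max(remaining, key=lambda j: score_fn(tuple(selected + [j])))
        selected.append(best)
        remaining.remove(best)
    return tuple(selected)


# Hypothetical stand-in for a cross-validated model score:
# each feature has an individual gain, and features 0 and 3 are redundant.
GAINS = [0.5, 0.1, 0.4, 0.45]

def toy_score(subset):
    score = sum(GAINS[j] for j in subset)
    if 0 in subset and 3 in subset:
        score -= 0.4  # penalty for carrying two redundant features
    return score


print(forward_select(4, toy_score, k=2))  # (0, 2)
```

Feature 3 has the second-highest individual gain, but once feature 0 is selected it adds little, so the greedy search picks feature 2 instead. This dependence on what has already been chosen is what distinguishes a wrapper search from ranking features one by one.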

### Exhaustive Feature Selector in mlxtend
This method takes an estimator (the ML algorithm), a minimum number of features and a maximum number of features as input. It evaluates every feature combination whose size lies between the minimum and the maximum, scores each one with the chosen scoring metric under cross-validation, and keeps the best-scoring subset. Because the number of combinations grows rapidly with the number of features, it is computationally intensive.

In [1]:
from sklearn.neighbors import KNeighborsClassifier
from sklearn.datasets import load_iris
from mlxtend.feature_selection import ExhaustiveFeatureSelector as EFS
from sklearn.exceptions import ConvergenceWarning
import warnings

# Suppress convergence warnings (raised later by LinearSVC) for the whole session
warnings.filterwarnings("ignore", category=ConvergenceWarning)
iris = load_iris()
X = iris.data
y = iris.target

knn = KNeighborsClassifier(n_neighbors=3)

efs1 = EFS(knn, 
           min_features=1,
           max_features=4,
           scoring='accuracy',
           print_progress=True,
           cv=5)

efs1 = efs1.fit(X, y)

print('Best accuracy score: %.2f' % efs1.best_score_)
print('Best subset (indices):', efs1.best_idx_)
print('Best subset (corresponding names):', efs1.best_feature_names_)

Features: 15/15

Best accuracy score: 0.97
Best subset (indices): (0, 2, 3)
Best subset (corresponding names): ('0', '2', '3')
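The progress line `Features: 15/15` in the output above is the number of feature subsets that were evaluated. A small sketch, assuming every subset size from the minimum to the maximum is enumerated, reproduces that count and the `8099` reported for the 13-feature wine run later in this notebook. The helper name `n_subsets` is hypothetical.

```python
from math import comb


def n_subsets(n_features, min_features, max_features):
    """Count the feature subsets with a size between min_features and max_features."""
    return sum(comb(n_features, k) for k in range(min_features, max_features + 1))


print(n_subsets(4, 1, 4))    # 15   (iris: 4 features)
print(n_subsets(13, 1, 10))  # 8099 (wine: 13 features, at most 10 kept)
```

Each of these subsets is fitted once per cross-validation fold, which is why the method becomes impractical as the number of features grows.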


### Recursive Feature Elimination
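In this method we pass an estimator (the ML model), the number of features to keep, and a step size, which is the number of features eliminated in each round. At every round the estimator is refit and the least important features, judged by its coefficients or feature importances, are dropped. The result is a boolean mask over the features, as shown in the output below. Like EFS, this method is not suitable for large datasets.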

In [2]:
from sklearn.feature_selection import RFE
from sklearn.svm import LinearSVC
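estimator = LinearSVC()
# n_features_to_select must be passed by keyword in recent scikit-learn releases
selector = RFE(estimator, n_features_to_select=2, step=1)
selector = selector.fit(X, y)
selector.support_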

array([False,  True, False,  True])
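To make the role of the step size concrete, the sketch below models only the elimination schedule, that is, how many features remain after each round. It is a simplified illustration and not sklearn's implementation; the helper name `rfe_schedule` is hypothetical, and it assumes the final round is clipped so that exactly the requested number of features remains.

```python
def rfe_schedule(n_features, n_features_to_select, step=1):
    """Return the number of features remaining after each elimination round."""
    counts = [n_features]
    remaining = n_features
    while remaining > n_features_to_select:
        # Never remove more features than are needed to reach the target
        remaining -= min(step, remaining - n_features_to_select)
        counts.append(remaining)
    return counts


print(rfe_schedule(4, 2, step=1))    # [4, 3, 2]
print(rfe_schedule(13, 10, step=2))  # [13, 11, 10]
```

A larger step means fewer refits and a faster run, at the cost of a coarser search, since several features are discarded on the basis of a single fit.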

In [3]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns

In [4]:
df=pd.read_csv('https://gist.githubusercontent.com/tijptjik/9408623/raw/b237fa5848349a14a14e5d4107dc7897c21951f5/wine.csv')
df.head()

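|   | Wine | Alcohol | Malic.acid | Ash | Acl | Mg | Phenols | Flavanoids | Nonflavanoid.phenols | Proanth | Color.int | Hue | OD | Proline |
|---|------|---------|------------|-----|-----|----|---------|------------|----------------------|---------|-----------|-----|----|---------|
| 0 | 1 | 14.23 | 1.71 | 2.43 | 15.6 | 127 | 2.80 | 3.06 | 0.28 | 2.29 | 5.64 | 1.04 | 3.92 | 1065 |
| 1 | 1 | 13.20 | 1.78 | 2.14 | 11.2 | 100 | 2.65 | 2.76 | 0.26 | 1.28 | 4.38 | 1.05 | 3.40 | 1050 |
| 2 | 1 | 13.16 | 2.36 | 2.67 | 18.6 | 101 | 2.80 | 3.24 | 0.30 | 2.81 | 5.68 | 1.03 | 3.17 | 1185 |
| 3 | 1 | 14.37 | 1.95 | 2.50 | 16.8 | 113 | 3.85 | 3.49 | 0.24 | 2.18 | 7.80 | 0.86 | 3.45 | 1480 |
| 4 | 1 | 13.24 | 2.59 | 2.87 | 21.0 | 118 | 2.80 | 2.69 | 0.39 | 1.82 | 4.32 | 1.04 | 2.93 | 735 |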


In [5]:
X=df.drop('Wine',axis=1)
y=df['Wine']

In [6]:
knn = KNeighborsClassifier(n_neighbors=3)

efs1 = EFS(knn, 
           min_features=1,
           max_features=10,
           scoring='accuracy',
           print_progress=True,
           cv=5)

efs1 = efs1.fit(X, y)

print('Best accuracy score: %.2f' % efs1.best_score_)
print('Best subset (indices):', efs1.best_idx_)
print('Best subset (corresponding names):', efs1.best_feature_names_)

Features: 8099/8099

Best accuracy score: 0.96
Best subset (indices): (0, 3, 6, 8, 9, 10, 11)
Best subset (corresponding names): ('Alcohol', 'Acl', 'Flavanoids', 'Proanth', 'Color.int', 'Hue', 'OD')


In [7]:
from sklearn.feature_selection import RFE
from sklearn.svm import LinearSVC
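estimator = LinearSVC()
# n_features_to_select must be passed by keyword in recent scikit-learn releases
selector = RFE(estimator, n_features_to_select=10, step=1)
selector = selector.fit(X, y)
selector.support_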



array([ True,  True, False,  True,  True,  True,  True, False,  True,
        True,  True,  True, False])
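The boolean mask is easier to read once it is mapped back to column names. The sketch below does this with the standard library only, copying the column names from the `df.head()` output and the mask from the output above; inside the notebook, `X.columns[selector.support_]` should give the same selection directly.

```python
from itertools import compress

# Column names of X (the wine data without the 'Wine' target column)
feature_names = [
    'Alcohol', 'Malic.acid', 'Ash', 'Acl', 'Mg', 'Phenols', 'Flavanoids',
    'Nonflavanoid.phenols', 'Proanth', 'Color.int', 'Hue', 'OD', 'Proline',
]

# Mask copied from the selector.support_ output above
support = [True, True, False, True, True, True, True, False, True,
           True, True, True, False]

selected = list(compress(feature_names, support))
dropped = [name for name, keep in zip(feature_names, support) if not keep]

print(selected)
print(dropped)  # ['Ash', 'Nonflavanoid.phenols', 'Proline']
```

Note that RFE and EFS do not agree here: EFS kept seven features, while RFE was asked for ten, so the two results are not directly comparable.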