Wrapper methods are feature selection techniques that evaluate candidate subsets of features by the performance of a specific machine learning model trained on them. Unlike filter methods, which score features from statistical properties of the data alone, wrapper methods train and evaluate a model for each candidate subset. They are generally more computationally expensive, but they often find better subsets because they account for interactions among features and between the features and the model.

Types of Wrapper Methods

Forward Selection:
Starts with no features and adds one feature at a time.
At each step, the feature that improves the model performance the most is added.

Backward Elimination:
Starts with all features and removes one feature at a time.
At each step, the feature whose removal least affects model performance is removed.

Recursive Feature Elimination (RFE):
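Recursively removes the least important features, as ranked by the model's own importance scores (for example, coefficients or feature importances).
At each iteration, the model is retrained on the current set of features and the lowest-ranked features are pruned, until the desired number remains.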

Exhaustive Search:
Evaluates every possible feature combination to identify the best subset under the chosen evaluation metric.
Guarantees the best-scoring subset but is computationally expensive, since n features yield 2^n - 1 non-empty subsets.
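The notebook has no cell for the exhaustive approach, so here is a minimal sketch of it. It assumes the iris data is loaded from scikit-learn (rather than seaborn) so that the block runs on its own, and it uses 5-fold cross-validated accuracy as the score. Both are illustrative choices, not settings taken from the cells below.

```python
from itertools import combinations

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Assumption: load iris from scikit-learn so this sketch is self-contained.
X, y = load_iris(return_X_y=True)
n_features = X.shape[1]
model = LogisticRegression(max_iter=1000)

results = {}
# Enumerate every non-empty subset of columns: 2**4 - 1 = 15 candidates for iris.
for k in range(1, n_features + 1):
    for subset in combinations(range(n_features), k):
        scores = cross_val_score(model, X[:, list(subset)], y, cv=5, scoring="accuracy")
        results[subset] = scores.mean()

# The subset with the highest mean cross-validated accuracy wins.
best_subset = max(results, key=results.get)
print(len(results), best_subset, round(results[best_subset], 3))
```

With only four features this is cheap, but the number of model fits doubles with every added feature, which is why the greedy methods above are usually preferred on wider datasets.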

In [4]:
from sklearn.datasets import load_iris
import seaborn as sns

In [8]:
iris = sns.load_dataset('iris')  # returns a pandas DataFrame
iris

index,sepal_length,sepal_width,petal_length,petal_width,species
0,5.1,3.5,1.4,0.2,setosa
1,4.9,3.0,1.4,0.2,setosa
2,4.7,3.2,1.3,0.2,setosa
3,4.6,3.1,1.5,0.2,setosa
4,5.0,3.6,1.4,0.2,setosa
...,...,...,...,...,...
145,6.7,3.0,5.2,2.3,virginica
146,6.3,2.5,5.0,1.9,virginica
147,6.5,3.0,5.2,2.0,virginica
148,6.2,3.4,5.4,2.3,virginica


In [21]:
from sklearn.preprocessing import LabelEncoder

# Encode species names as integers (classes are sorted: setosa=0, versicolor=1, virginica=2)
le = LabelEncoder()
iris['species'] = le.fit_transform(iris['species'])

In [25]:

X = iris.iloc[:, 0:4]  # the four measurement columns
y = iris.iloc[:, -1]   # the encoded species label

In [28]:
from sklearn.model_selection import train_test_split
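# No random_state is set, so the split (and the features selected from it) can vary between runs
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)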

Forward Selection

In [34]:
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression

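model = LogisticRegression(max_iter=1000)
# Greedily add features until 3 are selected, scoring each candidate by 10-fold CV accuracy
sfs1 = SequentialFeatureSelector(model, n_features_to_select=3, direction='forward', scoring='accuracy', cv=10)
sfs1.fit(X_train, y_train)
sfs1.get_support(indices=True)  # column indices of the selected features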

array([0, 2, 3], dtype=int64)
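To make the greedy procedure concrete, the following is a hand-written sketch of forward selection. `forward_select` is a hypothetical helper written for illustration, not a scikit-learn function. The sketch assumes iris is loaded from scikit-learn and uses 5-fold cross-validation, so its result need not match the output shown above.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score


def forward_select(model, X, y, n_select, cv=5):
    """Greedy forward selection (hypothetical helper, for illustration only)."""
    selected = []
    remaining = list(range(X.shape[1]))
    while len(selected) < n_select:
        # Score each remaining feature when added to the ones chosen so far.
        scores = {
            j: cross_val_score(model, X[:, selected + [j]], y, cv=cv).mean()
            for j in remaining
        }
        # Keep the candidate that gives the highest mean CV score.
        best = max(scores, key=scores.get)
        selected.append(best)
        remaining.remove(best)
    return sorted(selected)


X, y = load_iris(return_X_y=True)
chosen = forward_select(LogisticRegression(max_iter=1000), X, y, n_select=3)
print(chosen)
```

Backward elimination follows the same pattern in reverse: start from all columns and, at each step, drop the feature whose removal leaves the highest cross-validated score.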

Backward Elimination

In [36]:
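# Start from all 4 features and greedily drop one, again scored by 10-fold CV accuracy
sfs2 = SequentialFeatureSelector(model, n_features_to_select=3, direction='backward', scoring='accuracy', cv=10)
sfs2.fit(X_train, y_train)
sfs2.get_support(indices=True)  # column indices of the selected features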

array([1, 2, 3], dtype=int64)
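The held-out split created earlier is never scored in this notebook, so one way to compare the two directions is to wrap each selector in a pipeline and evaluate it on the test set. The sketch below is self-contained and makes a few assumptions that are not in the original cells: iris comes from scikit-learn, and `random_state=0` with `stratify=y` are added to make the split reproducible.

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

X, y = load_iris(return_X_y=True)
# Assumption: a fixed random_state and stratification, added for reproducibility.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

test_scores = {}
for direction in ("forward", "backward"):
    selector = SequentialFeatureSelector(
        LogisticRegression(max_iter=1000),
        n_features_to_select=3,
        direction=direction,
        scoring="accuracy",
        cv=10,
    )
    # The pipeline selects features on the training data only, then fits the classifier.
    pipe = make_pipeline(selector, LogisticRegression(max_iter=1000))
    pipe.fit(X_train, y_train)
    test_scores[direction] = pipe.score(X_test, y_test)

print(test_scores)
```

Keeping the selector inside the pipeline means the test rows play no part in choosing features, so the reported accuracy is not inflated by selection.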

Recursive Feature Elimination (RFE)

In [50]:
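from sklearn.feature_selection import RFE

# RFE ranks features by the fitted model's coefficients and drops the weakest until 3 remain
sfs3 = RFE(model, n_features_to_select=3)
sfs3.fit(X_train, y_train)  # fit is enough here; the transformed array was not used
sfs3.support_  # boolean mask over the original columns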

array([False,  True,  True,  True])
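Besides the boolean mask, a fitted `RFE` object exposes a `ranking_` attribute that records the order of elimination. The sketch below shows how to read it. It assumes iris is loaded from scikit-learn and fits on the full dataset for brevity, so the kept features may differ from the mask above.

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

# Assumption: load iris from scikit-learn so this sketch is self-contained.
data = load_iris()
X, y = data.data, data.target

rfe = RFE(LogisticRegression(max_iter=1000), n_features_to_select=3)
rfe.fit(X, y)

# ranking_ is 1 for every kept feature; larger values were eliminated earlier.
for name, kept, rank in zip(data.feature_names, rfe.support_, rfe.ranking_):
    print(f"{name}: kept={kept}, rank={rank}")
```

With four features and three to keep, exactly one feature receives rank 2. On wider datasets the ranking gives a full ordering of the discarded features.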