## Wrapper methods - Forward Selection

   Wrapper methods are a category of feature selection techniques that use the performance of a machine learning model as a criterion for selecting the most relevant features. These methods assess the quality of a subset of features by training and evaluating a model with that subset and making decisions based on the model's performance. The key idea is to search through different combinations of features to find the subset that optimizes the model's predictive performance. Common wrapper methods include:

1. **Forward Selection:**
   - In forward selection, you start with an empty set of features and iteratively add the most informative features one at a time.
   - The process begins by evaluating the performance of the model with each individual feature separately and selecting the one that performs the best.
   - Then, you continue to add features one by one, selecting the feature that contributes the most to the model's performance.
   - This process continues until a predefined stopping criterion is met, such as a maximum number of features or a decrease in model performance.

2. **Backward Elimination:**
   - In backward elimination, you begin with all the available features and iteratively remove the least informative features one at a time.
   - The process starts by evaluating the performance of the model using all the features.
   - You then remove the feature that has the least impact on the model's performance.
   - The process continues, removing one feature at a time, until you reach a stopping criterion.
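
Both procedures above are available in scikit-learn as `SequentialFeatureSelector`. The following is a minimal sketch rather than a recommended configuration: the k-nearest-neighbours estimator, the target of two features, and the 5-fold cross-validation are all assumptions made for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.neighbors import KNeighborsClassifier

# Iris as a DataFrame so that selected columns can be reported by name
X, y = load_iris(return_X_y=True, as_frame=True)

# Assumed estimator for this sketch; any scikit-learn estimator could be used
estimator = KNeighborsClassifier(n_neighbors=3)


def select(direction):
    """Run sequential selection in the given direction and return column names."""
    selector = SequentialFeatureSelector(
        estimator,
        n_features_to_select=2,  # stopping criterion: fixed subset size (assumed)
        direction=direction,     # "forward" adds features, "backward" removes them
        cv=5,                    # each candidate subset is scored by 5-fold CV
    )
    selector.fit(X, y)
    return list(X.columns[selector.get_support()])


forward_features = select("forward")
backward_features = select("backward")
print("Forward selection kept:   ", forward_features)
print("Backward elimination kept:", backward_features)
```

The two directions are not guaranteed to agree, since each one follows a different greedy path through the space of subsets.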

Wrapper methods have some advantages and disadvantages:

Advantages:
- They consider the interaction between features, which can be important in some cases.
- They directly use the model's performance as a criterion, which may lead to better feature selection for specific modeling tasks.

Disadvantages:
- They can be computationally expensive, especially when the feature space is large.
- They are model-dependent, meaning that the choice of the machine learning algorithm can affect the feature selection process.
- The optimal feature subset selected by wrapper methods may not be the same for different machine learning models.
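
To make the computational cost concrete, the sketch below counts how many candidate subsets would have to be evaluated. The helper names are hypothetical and exist only for this illustration. It compares an exhaustive search over every non-empty subset with the worst case of greedy forward selection, where every feature ends up being added.

```python
def exhaustive_fits(n_features: int) -> int:
    """Number of non-empty feature subsets an exhaustive search evaluates."""
    return 2 ** n_features - 1


def forward_selection_fits(n_features: int) -> int:
    """Worst-case number of subsets greedy forward selection evaluates:
    n candidates in round 1, n - 1 in round 2, ..., 1 in the last round."""
    return n_features * (n_features + 1) // 2


for n in (4, 10, 20, 50):
    print(f"{n:>3} features: exhaustive={exhaustive_fits(n):,} "
          f"forward={forward_selection_fits(n):,}")
```

Each evaluated subset means at least one model fit (several if cross-validation is used), so even the quadratic growth of the greedy search adds up quickly with slow models.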

Despite the computational cost, wrapper methods are often used in situations where predictive performance is the primary goal, and a small, informative feature subset is crucial. These methods help to find the most relevant features for a specific machine learning task by considering how features impact the model's performance.

## Forward Selection

Forward feature selection is a feature selection technique that involves iteratively adding one feature at a time to the model based on its performance. Like any method, forward selection has its advantages and disadvantages:

**Advantages of Forward Selection:**

1. **Simplicity:** Forward selection is easy to understand and implement. It starts with an empty set of features and builds up the feature set incrementally, making it accessible to those new to feature selection techniques.

2. **Efficiency:** Forward selection is far cheaper than an exhaustive search over all feature subsets. With n features it evaluates at most n(n+1)/2 candidate subsets, compared with 2^n - 1 for an exhaustive search, because each round only tries adding one more feature to the current subset.

3. **Interpretability:** The resulting feature set can be more interpretable since you can see the sequence of feature additions and understand which features contribute most to the model's performance.

4. **Flexibility:** Forward selection can be used with a variety of machine learning algorithms, making it a versatile feature selection approach.

**Disadvantages of Forward Selection:**

1. **Not Guaranteed to Find the Best Subset:** While forward selection builds up a feature subset iteratively, it does not guarantee that it will find the best subset of features for a given problem. It may get stuck in suboptimal solutions and not explore all possible combinations.

2. **Time-Consuming for Large Feature Spaces:** Each round refits the model once per remaining candidate feature, so the number of model fits grows roughly quadratically with the number of features. With a large feature space or a slow model this becomes expensive, especially when every fit is cross-validated.

3. **Greedy, Irreversible Choices:** Once a feature is added it is never removed, so early choices constrain every later step. A feature picked in the first round may become redundant once others are added, yet it stays in the final subset.

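4. **Limited Interaction Handling:** Each candidate is evaluated together with the features already selected, so some interactions are captured. However, features that are only useful jointly, and uninformative individually, can be missed because none of them improves the score enough on its own to be added first.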

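5. **Potential Overfitting:** Because features are chosen to maximize a score on a particular dataset, the selected subset can overfit that data, especially when many candidates are compared on a small evaluation set. Scoring candidates with cross-validation on the training data and keeping a separate test set for the final evaluation reduces this risk.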

6. **Sensitive to Noise:** Forward selection can be sensitive to noise in the data. Noisy features might be selected if they appear to improve model performance on the training data.
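
One limitation listed above is that a feature which only helps in combination with another can look useless on its own. The sketch below illustrates this with synthetic data, which is an assumption made purely for demonstration: the label depends on the sign of the product of two features, so each feature alone should score close to chance while the pair scores well. The k-nearest-neighbours model is likewise an arbitrary choice.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Synthetic XOR-like data: the label depends on both features jointly
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(400, 2))
y = (X[:, 0] * X[:, 1] > 0).astype(int)

model = KNeighborsClassifier(n_neighbors=5)

# Score each feature on its own, as the first round of forward selection would
score_a = cross_val_score(model, X[:, [0]], y, cv=5).mean()
score_b = cross_val_score(model, X[:, [1]], y, cv=5).mean()

# Score the pair together
score_both = cross_val_score(model, X, y, cv=5).mean()

print(f"feature 0 alone: {score_a:.2f}")
print(f"feature 1 alone: {score_b:.2f}")
print(f"both features:   {score_both:.2f}")
```

In the first round of forward selection neither feature would stand out from pure noise, so a greedy search could add an unrelated feature first or stop early.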

In summary, forward selection is a straightforward and interpretable feature selection method that can work well for certain problems, especially when you have a relatively small feature space. However, it has limitations in terms of not guaranteeing the best subset and being potentially time-consuming for large feature spaces. The choice of feature selection technique should depend on the specific characteristics of your dataset and the problem you are trying to solve.

In [10]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

In [17]:
# Load the Iris dataset from scikit-learn
from sklearn.datasets import load_iris
iris = load_iris()
data = pd.DataFrame(data=iris.data, columns=iris.feature_names)
data['species'] = iris.target_names[iris.target]

In [18]:
# Split the dataset into training and testing sets
X = data.drop('species', axis=1)
y = data['species']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

In [19]:
# Create a Random Forest classifier
classifier = RandomForestClassifier(n_estimators=100, random_state=42)

In [20]:
X_train.shape

(105, 4)

In [21]:
X_train.shape[1]  # Number of feature columns

4

In [22]:
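# Forward Selection for feature selection
# Note: candidates are scored on the test set here to keep the example short,
# which makes the reported accuracy optimistic. Prefer a validation split or
# cross-validation on the training data when selecting features.
selected_features = []  # Start with an empty feature list
best_accuracy = 0.0

for i in range(X_train.shape[1]):
    best_feature = None
    for feature in X_train.columns:
        if feature in selected_features:
            continue  # Skip features that are already selected
        temp_features = selected_features + [feature]
        classifier.fit(X_train[temp_features], y_train)
        y_pred = classifier.predict(X_test[temp_features])
        accuracy = accuracy_score(y_test, y_pred)
        if accuracy > best_accuracy:
            best_accuracy = accuracy
            best_feature = feature
    if best_feature is None:
        break  # Stop when no remaining feature improves accuracy
    selected_features.append(best_feature)

print("Selected features using Forward Selection:", selected_features)
print("Best Accuracy using Forward Selection:", best_accuracy)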

Selected features using Forward Selection: ['petal width (cm)']
Best Accuracy using Forward Selection: 1.0
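
The loop above scores candidate features on the test set, so the reported accuracy is also the quantity being optimized. A more cautious variant, sketched below under stated assumptions, scores candidates with cross-validation on the training data only and touches the test set once at the end. The helper name `forward_select_cv`, the 5-fold setting, and the reduced forest size of 50 trees are all choices made for this sketch.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import cross_val_score, train_test_split

iris = load_iris(as_frame=True)
X, y = iris.data, iris.target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

clf = RandomForestClassifier(n_estimators=50, random_state=42)


def forward_select_cv(estimator, X, y, cv=5):
    """Greedy forward selection scored by mean cross-validated accuracy."""
    selected, best_score = [], 0.0
    remaining = list(X.columns)
    while remaining:
        # Score every remaining feature when added to the current subset
        scores = {
            f: cross_val_score(estimator, X[selected + [f]], y, cv=cv).mean()
            for f in remaining
        }
        candidate = max(scores, key=scores.get)
        if scores[candidate] <= best_score:
            break  # no remaining feature improves the CV score
        selected.append(candidate)
        remaining.remove(candidate)
        best_score = scores[candidate]
    return selected, best_score


selected, cv_score = forward_select_cv(clf, X_train, y_train)

# The test set is used only once, after the subset has been fixed
clf.fit(X_train[selected], y_train)
test_accuracy = accuracy_score(y_test, clf.predict(X_test[selected]))

print("Selected features:", selected)
print("Cross-validated accuracy on training data:", cv_score)
print("Accuracy on held-out test data:", test_accuracy)
```

Because the test set plays no part in choosing features, the final accuracy is a fairer estimate of how the selected subset generalizes.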


#### NOTE:
   Forward selection and backward elimination can be used with any machine learning algorithm, not only the Random Forest classifier. The right choice depends on the problem and the dataset.
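
As a rough illustration of the note above, the sketch below runs forward selection with three different estimators through scikit-learn's `SequentialFeatureSelector`. The estimators, their hyperparameters, and the target of two features are arbitrary choices for this example. The subsets may or may not differ on a dataset as small as Iris.

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True, as_frame=True)

# Assumed set of models to compare; any scikit-learn estimator would work
estimators = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "k_nearest_neighbours": KNeighborsClassifier(n_neighbors=3),
    "decision_tree": DecisionTreeClassifier(random_state=42),
}

results = {}
for name, estimator in estimators.items():
    selector = SequentialFeatureSelector(
        estimator, n_features_to_select=2, direction="forward", cv=5
    )
    selector.fit(X, y)
    results[name] = list(X.columns[selector.get_support()])

for name, features in results.items():
    print(f"{name}: {features}")
```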