# Feature Selection: Recursive Feature Elimination

Adapted from Jason Brownlee. 2020. [Data Preparation for Machine Learning](https://machinelearningmastery.com/data-preparation-for-machine-learning/).

## Overview

Recursive Feature Elimination, or RFE for short, is a popular feature selection algorithm. RFE
is popular because it is easy to configure and use, and because it is effective at selecting the
features (columns) in a training dataset that are most relevant to predicting the target
variable. There are two important configuration options when using RFE: the number of
features to select and the algorithm used to help choose them. Both
of these hyperparameters can be explored, although the performance of the method is not
strongly dependent on these hyperparameters being configured well.
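
To make the procedure concrete, the following is a simplified sketch of the elimination loop that gives RFE its name: fit a model, drop the feature it considers least important, and repeat until the requested number of features remains. The helper `manual_rfe` is hypothetical and written for illustration only. It is not scikit-learn's implementation, and it assumes the wrapped model exposes `feature_importances_`.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier


def manual_rfe(X, y, n_features_to_select):
    """Hypothetical helper: drop the weakest feature until the target count remains."""
    remaining = list(range(X.shape[1]))  # column indices still under consideration
    while len(remaining) > n_features_to_select:
        # Refit the estimator on the surviving columns only
        estimator = DecisionTreeClassifier(random_state=1)
        estimator.fit(X[:, remaining], y)
        # Locate the least important surviving feature and remove it
        weakest = int(np.argmin(estimator.feature_importances_))
        del remaining[weakest]
    return remaining


# Same synthetic dataset shape used throughout this tutorial
X, y = make_classification(
    n_samples=1000, n_features=10, n_informative=5, n_redundant=5, random_state=1
)
selected = manual_rfe(X, y, n_features_to_select=5)
print("Remaining column indices:", selected)
```

In practice the `RFE` class handles this loop, so a sketch like this is mainly useful for building intuition.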

## Learning Objectives

- Learn how RFE is an efficient approach for eliminating features from a training dataset for feature selection
- Learn how to use RFE for feature selection for classification and regression predictive modeling problems
- Learn how to explore the number of selected features and wrapped algorithm used by the RFE procedure
- Understand how to evaluate different algorithms wrapped by RFE for optimal feature selection

### Tasks to complete

- Implement RFE for classification problems
- Implement RFE for regression problems 
- Explore RFE hyperparameters
- Evaluate different estimator algorithms for RFE
- Analyze selected features

## Prerequisites

- A working Python environment and familiarity with Python
- Basic understanding of machine learning concepts
- Familiarity with pandas and numpy libraries
- Knowledge of basic statistical concepts

## Get Started

To start, we install required packages and import the necessary libraries.

### Install packages

In [None]:
# Install necessary Python packages using pip
# 'matplotlib' for plotting
# 'numpy' for numerical operations
# 'scikit-learn' for machine learning tools

%pip install matplotlib numpy scikit-learn  # Install the specified packages

### Import libraries

In [None]:
# Importing necessary libraries and modules for data processing, model building, and evaluation

# Import pyplot from matplotlib for plotting graphs
from matplotlib import pyplot

# Import mean and std from numpy to calculate statistical measures (mean and standard deviation)
from numpy import mean, std

# Import datasets for generating synthetic data
from sklearn.datasets import make_classification, make_regression

# Import ensemble classifiers for building gradient boosting and random forest models
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier

# Import Recursive Feature Elimination (RFE) and its cross-validation version (RFECV) for feature selection
from sklearn.feature_selection import RFE, RFECV

# Import linear models for classification
from sklearn.linear_model import LogisticRegression, Perceptron

# Import model selection techniques for cross-validation
from sklearn.model_selection import (
    RepeatedKFold,  # Repeated k-fold cross-validation
    RepeatedStratifiedKFold,  # Stratified k-fold cross-validation for classification tasks
    cross_val_score,  # Function for performing cross-validation
)

# Import Pipeline for building a sequence of processing steps including preprocessing and model fitting
from sklearn.pipeline import Pipeline

# Import decision tree classifiers and regressors for building decision tree-based models
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor


## RFE for Classification

First, we can use the
**make_classification**() function to create a synthetic binary classification problem with 1,000
examples and 10 input features, five of which are informative and five of which are redundant.

Next, we can evaluate an RFE feature selection algorithm on this dataset. We will use a
**DecisionTreeClassifier** to choose features and set the number of features to five. We will
then fit a new DecisionTreeClassifier model on the selected features. We will evaluate the
model using repeated stratified k-fold cross-validation, with three repeats and 10 folds. We will
report the mean and standard deviation of the accuracy of the model across all repeats and
folds.

In [None]:
# Evaluate Recursive Feature Elimination (RFE) for classification

# Define dataset
# Generate a random binary classification problem with 1000 samples, 10 features,
# 5 informative features, and 5 redundant features. Random state ensures reproducibility.
X, y = make_classification(
    n_samples=1000, n_features=10, n_informative=5, n_redundant=5, random_state=1
)

# Create pipeline
# Initialize RFE with a DecisionTreeClassifier as the estimator
# Set n_features_to_select=5 to keep 5 most important features
rfe = RFE(estimator=DecisionTreeClassifier(), n_features_to_select=5)

# Create the classification model using DecisionTreeClassifier
model = DecisionTreeClassifier()

# Create a pipeline that first applies RFE to select important features and then trains the model
pipeline = Pipeline(steps=[("s", rfe), ("m", model)])

# Evaluate model
# Use RepeatedStratifiedKFold for cross-validation, which ensures the distribution of the target variable is maintained across folds.
# 10 splits and 3 repeats will give more reliable results by testing multiple splits of the data.
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)

# Perform cross-validation to evaluate the pipeline's performance, using accuracy as the evaluation metric
n_scores = cross_val_score(pipeline, X, y, scoring="accuracy", cv=cv, n_jobs=-1)

# Report performance
# Calculate and print the mean and standard deviation of the cross-validation accuracy scores
print("Accuracy: %.3f (%.3f)" % (mean(n_scores), std(n_scores)))

In this case, the pipeline that uses RFE with a decision tree to select five features and
then fits a decision tree on those features achieves a classification accuracy of about 88
percent.

We can also use the RFE model pipeline as a final model and make predictions for classification. First, the RFE and model are fit on all available data, then the predict() function can
be called to make predictions on new data.

In [None]:
# Make a prediction with an RFE (Recursive Feature Elimination) pipeline

# Define dataset
# Generates a synthetic classification dataset with 1000 samples, 10 features (5 informative, 5 redundant)
# The random_state is set for reproducibility of results
X, y = make_classification(
    n_samples=1000, n_features=10, n_informative=5, n_redundant=5, random_state=1
)

# Create pipeline
# Set up an RFE (Recursive Feature Elimination) model to select the top 5 features based on a DecisionTreeClassifier
rfe = RFE(estimator=DecisionTreeClassifier(), n_features_to_select=5)
# Create a basic DecisionTreeClassifier as the final model in the pipeline
model = DecisionTreeClassifier()
# Create a pipeline that first applies RFE for feature selection, then uses the DecisionTreeClassifier for prediction
pipeline = Pipeline(steps=[("s", rfe), ("m", model)])

# Fit the model on all available data
# The pipeline is fit using all the data, where RFE first selects important features and then the DecisionTreeClassifier is trained
pipeline.fit(X, y)

# Make a prediction for one example
# Define a new data point with 10 feature values for prediction
data = [
    [
        2.56999479,
        -0.13019997,
        3.16075093,
        -4.35936352,
        -1.61271951,
        -1.39352057,
        -2.48924933,
        -1.93094078,
        3.26130366,
        2.05692145,
    ]
]
# Make a prediction using the fitted pipeline
yhat = pipeline.predict(data)

# Print the predicted class label
# predict() returns a 1-D array with one entry per input row, so take the first element
print("Predicted: %d" % yhat[0])


## RFE for Regression

Next, we will look at using RFE for a regression problem. First, we can use the
**make_regression**() function to create a synthetic regression problem with 1,000 examples and
10 input features, five of which are informative and five of which are uninformative noise.
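
As a quick check of how the synthetic data is built, the sketch below asks `make_regression` to also return the coefficients of the underlying linear model through its `coef=True` option. Features with a zero coefficient do not contribute to the target. The variable names `informative` and `noise` are chosen here for illustration.

```python
from sklearn.datasets import make_regression

# coef=True additionally returns the ground-truth coefficients of the linear model
X, y, coef = make_regression(
    n_samples=1000, n_features=10, n_informative=5, coef=True, random_state=1
)

# Columns with a nonzero coefficient drive the target; the rest are noise
informative = [i for i, c in enumerate(coef) if c != 0]
noise = [i for i in range(X.shape[1]) if i not in informative]
print("Informative columns:", informative)
print("Noise columns:", noise)
```

A feature selection method that works well on this dataset should tend to keep the informative columns.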

In [None]:
# evaluate RFE for regression

# define dataset

# Generate a random regression problem.
# 'n_samples=1000' specifies the number of data points.
# 'n_features=10' defines the total number of features.
# 'n_informative=5' sets how many features are informative for the model (the rest are noise).
# 'random_state=1' ensures reproducibility by fixing the random number generator seed.
X, y = make_regression(n_samples=1000, n_features=10, n_informative=5, random_state=1)

# Create pipeline to combine feature selection and model fitting
# 'RFE' (Recursive Feature Elimination) selects the top 'n_features_to_select' features using an estimator.
# 'estimator=DecisionTreeRegressor()' uses a Decision Tree Regressor as the model for feature importance.
# 'n_features_to_select=5' specifies that 5 features should be selected after performing RFE.
rfe = RFE(estimator=DecisionTreeRegressor(), n_features_to_select=5)

# Define the model to use after feature selection
# 'DecisionTreeRegressor' is chosen to build the regression model.
model = DecisionTreeRegressor()

# Create a pipeline that combines RFE and the DecisionTreeRegressor model
# The pipeline consists of two steps: first 'RFE' for feature selection ('s'), then 'DecisionTreeRegressor' ('m').
pipeline = Pipeline(steps=[("s", rfe), ("m", model)])


# evaluate model
# Repeated K-Fold cross validator.
# Repeats K-Fold n times with different randomization in each repetition.
cv = RepeatedKFold(n_splits=10, n_repeats=3, random_state=1)

# All scorer objects follow the convention that higher return values are better than lower return values.
# Thus metrics which measure the distance between the model and the data, like metrics.mean_absolute_error,
# are available as neg_mean_absolute_error, which returns the negated value of the metric.
n_scores = cross_val_score(
    pipeline, X, y, scoring="neg_mean_absolute_error", cv=cv, n_jobs=-1
)

# report performance
print("MAE: %.3f (%.3f)" % (mean(n_scores), std(n_scores)))

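In this case, we can see the RFE pipeline with a decision tree model achieves a MAE of
about 27. Note that the printed score is negative: the neg_mean_absolute_error scorer negates
the error so that larger values are always better.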
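
If a positive MAE is easier to read, the negated scores can be flipped back before summarizing them. The sketch below is a minimal illustration of that idea. To keep it quick, it assumes a smaller dataset, a plain `KFold` split, and a bare decision tree rather than the full RFE pipeline.

```python
from numpy import mean, std
from sklearn.datasets import make_regression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.tree import DecisionTreeRegressor

# Small synthetic problem (sizes chosen only to keep the example fast)
X, y = make_regression(n_samples=200, n_features=10, n_informative=5, random_state=1)
cv = KFold(n_splits=5, shuffle=True, random_state=1)

# The scorer returns negated errors, so every value is zero or below
neg_scores = cross_val_score(
    DecisionTreeRegressor(random_state=1),
    X,
    y,
    scoring="neg_mean_absolute_error",
    cv=cv,
)

# Flip the sign to recover the ordinary (positive) MAE per fold
mae_scores = -neg_scores
print("MAE: %.3f (%.3f)" % (mean(mae_scores), std(mae_scores)))
```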

We can also use the RFE as part of the final model and make predictions for regression.
First, the Pipeline is fit on all available data, then the predict() function can be called to
make predictions on new data.

In [None]:
# make a regression prediction with an RFE pipeline

# Import necessary libraries (assumed to be already imported)
# - make_regression: used to generate synthetic regression data
# - RFE: Recursive Feature Elimination for feature selection
# - DecisionTreeRegressor: decision tree model used for regression
# - Pipeline: used to streamline multiple steps (feature selection and modeling) into a single workflow

# Generate a synthetic regression dataset
# X = features, y = target variable
X, y = make_regression(n_samples=1000, n_features=10, n_informative=5, random_state=1)
# n_samples: Number of samples (1000)
# n_features: Number of features (10)
# n_informative: Number of informative features (5)
# random_state: Seed for reproducibility

# Create a pipeline for feature selection and model fitting
# Initialize Recursive Feature Elimination (RFE) with DecisionTreeRegressor as the estimator
# RFE will select the 5 most important features from the dataset
rfe = RFE(estimator=DecisionTreeRegressor(), n_features_to_select=5)

# Initialize a DecisionTreeRegressor model (to be used after feature selection)
model = DecisionTreeRegressor()

# Combine the feature selection (RFE) and model (DecisionTreeRegressor) into a single pipeline
# 'steps' defines the sequence of operations: first RFE for feature selection, then DecisionTreeRegressor for prediction
pipeline = Pipeline(steps=[("s", rfe), ("m", model)])

# Fit the pipeline on the entire dataset (X = features, y = target)
# The pipeline will first perform RFE to select features, then train the DecisionTreeRegressor model
pipeline.fit(X, y)

# Make a prediction using the trained pipeline on a single example (data)
# 'data' represents a single input sample to predict the target variable (y)
data = [
    [
        -2.02220122,
        0.31563495,
        0.82797464,
        -0.30620401,
        0.16003707,
        -1.44411381,
        0.87616892,
        -0.50446586,
        0.23009474,
        0.76201118,
    ]
]

# Use the fitted pipeline to make a prediction for the input 'data'
yhat = pipeline.predict(data)

# Print the predicted value, formatted to 3 decimal places
# predict() returns a 1-D array with one entry per input row, so take the first element
print("Predicted: %.3f" % yhat[0])



## RFE Hyperparameters

In this section, we will take a closer look at some of the hyperparameters you should consider
tuning for the RFE method for feature selection and their effect on model performance.

### Explore Number of Features

An important hyperparameter for the RFE algorithm is the number of features to select. In
the previous section, we used an arbitrary number of selected features, five, which matches
the number of informative features in the synthetic dataset. In practice, we cannot know the
best number of features to select with RFE; instead, it is good practice to test different values.

In [None]:
# explore the number of selected features for RFE

# Define a function to generate a synthetic dataset for classification
def get_dataset():
    # 'make_classification' generates a random classification problem dataset
    # Parameters:
    # - n_samples=1000: 1000 samples (data points) will be generated
    # - n_features=10: 10 features (columns) will be created
    # - n_informative=5: 5 of the features will be informative (useful for predicting the target)
    # - n_redundant=5: 5 of the features will be redundant (correlated with the informative features)
    # - random_state=1: ensures reproducibility by fixing the random seed
    X, y = make_classification(
        n_samples=1000, n_features=10, n_informative=5, n_redundant=5, random_state=1
    )
    
    # Return the generated feature matrix (X) and the target vector (y)
    return X, y



# Function to create a list of models for evaluation using Recursive Feature Elimination (RFE) and Decision Tree Classifier
def get_models():
    models = {}  # Initialize an empty dictionary to store models
    
    # Loop through the range 2 to 9 to create models with different numbers of selected features
    for i in range(2, 10):
        # Create an RFE (Recursive Feature Elimination) instance with a DecisionTreeClassifier as the estimator
        # RFE will select 'i' number of features
        rfe = RFE(estimator=DecisionTreeClassifier(), n_features_to_select=i)
        
        # Initialize a DecisionTreeClassifier model
        model = DecisionTreeClassifier()
        
        # Store the model in the dictionary, with the number of features as the key
        # Use a Pipeline to chain the RFE and the DecisionTreeClassifier together
        models[str(i)] = Pipeline(steps=[("s", rfe), ("m", model)])
    
    # Return the dictionary containing all the models
    return models



# Function to evaluate a given model using cross-validation
def evaluate_model(model, X, y):
    # Initialize a RepeatedStratifiedKFold cross-validation strategy
    # n_splits=10: 10 folds (subsets of data)
    # n_repeats=3: repeat the cross-validation process 3 times for more reliable results
    # random_state=1: ensures reproducibility of results
    cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)
    
    # Perform cross-validation using the model, input features (X), and target labels (y)
    # scoring="accuracy": evaluate performance based on accuracy
    # cv=cv: use the defined cross-validation strategy
    # n_jobs=-1: use all available CPU cores to speed up the process
    scores = cross_val_score(model, X, y, scoring="accuracy", cv=cv, n_jobs=-1)
    
    # Return the accuracy scores from the cross-validation process
    return scores



# define dataset
X, y = get_dataset()

# get the models to evaluate
models = get_models()

# Initialize two empty lists to store the results and model names
results, names = [], []

# Iterate over each model in the 'models' dictionary
for name, model in models.items():
    
    # Evaluate the current model using the 'evaluate_model' function
    # 'X' is the feature set, and 'y' is the target labels
    scores = evaluate_model(model, X, y)
    
    # Append the model's evaluation scores to the 'results' list
    results.append(scores)
    
    # Append the model's name to the 'names' list
    names.append(name)
    
    # Print the model's name, mean, and standard deviation of its evaluation scores
    # 'mean(scores)' calculates the average score for the model, 'std(scores)' gives the standard deviation
    print(">%s %.3f (%.3f)" % (name, mean(scores), std(scores)))


In this case, we can see that performance improves as the number of features increases and
perhaps peaks around four to seven, as we might expect given that only five features are relevant to
the target variable.

A box and whisker plot is created for the distribution of accuracy scores for each configured
number of features.

In [None]:
# Plot a box and whisker plot to compare accuracy across the configurations
# (pyplot was imported in the Import libraries cell)
# 'results' contains the cross-validation accuracy scores for each configuration
# 'names' is a list of the labels corresponding to each entry in 'results'
pyplot.boxplot(results, showmeans=True)  # Create the boxplot and display means on each box

# Set the x-axis tick labels with 'pyplot.xticks' so each box is labeled with its name
pyplot.xticks(ticks=range(1, len(names) + 1), labels=names)

# Display the plot to the user
pyplot.show()

### Automatically Select the Number of Features

It is also possible to automatically select the number of features chosen by RFE. This can be
achieved by performing cross-validation evaluation of different numbers of features as we did in
the previous section and automatically selecting the number of features that resulted in the
best mean score. The **RFECV** class implements this.

In [None]:
# Automatically select the number of features for Recursive Feature Elimination (RFE) using cross-validation

# Define the dataset for classification
# 'X' are the features, 'y' is the target variable
# make_classification generates a synthetic classification dataset
X, y = make_classification(
    n_samples=1000,  # Number of samples in the dataset
    n_features=10,   # Total number of features
    n_informative=5, # Number of informative features (actually useful for prediction)
    n_redundant=5,   # Number of redundant features (linear combinations of the informative ones)
    random_state=1   # Random seed for reproducibility
)

# Create a pipeline that combines feature selection and model training
# Use Recursive Feature Elimination with Cross-Validation (RFECV) to automatically select the optimal number of features
rfe = RFECV(estimator=DecisionTreeClassifier())  # RFECV uses a DecisionTreeClassifier as the estimator
model = DecisionTreeClassifier()  # The model to be used after feature selection
pipeline = Pipeline(steps=[("s", rfe), ("m", model)])  # Pipeline applies RFECV and then the classifier

# Evaluate the model performance using cross-validation
# RepeatedStratifiedKFold splits the dataset into 10 folds and repeats 3 times for more reliable results
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)

# Perform cross-validation using the pipeline, scoring by accuracy, and running in parallel with n_jobs=-1
n_scores = cross_val_score(pipeline, X, y, scoring="accuracy", cv=cv, n_jobs=-1)

# Report the mean and standard deviation of accuracy from the cross-validation results
print("Accuracy: %.3f (%.3f)" % (mean(n_scores), std(n_scores)))  # Display average accuracy and its variation


In this case, the pipeline that uses RFECV with a decision tree to automatically select the
number of features and then fits a decision tree on the selected features achieves a classification
accuracy of about 88 percent.
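
The cross-validated accuracy alone does not show how many features RFECV settled on. A minimal sketch for inspecting that is to fit an `RFECV` object directly and read its `n_features_` and `support_` attributes. The five-fold `StratifiedKFold` and the fixed `random_state` on the tree are assumptions made here to keep the example fast and repeatable.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFECV
from sklearn.model_selection import StratifiedKFold
from sklearn.tree import DecisionTreeClassifier

# Same synthetic dataset shape used throughout this tutorial
X, y = make_classification(
    n_samples=1000, n_features=10, n_informative=5, n_redundant=5, random_state=1
)

# Fit RFECV on its own so the fitted attributes can be inspected
rfecv = RFECV(
    estimator=DecisionTreeClassifier(random_state=1),
    cv=StratifiedKFold(n_splits=5),
    scoring="accuracy",
)
rfecv.fit(X, y)

# n_features_ is the number of features chosen by cross-validation
print("Number of features chosen:", rfecv.n_features_)
# support_ is a boolean mask over the original columns
print("Selected mask:", rfecv.support_)
```

The chosen count can vary with the estimator and the cross-validation split, so it is worth checking rather than assuming it matches the number of informative features.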

### Which Features Were Selected

When using RFE, we may be interested to know which features were selected and which were
removed. This can be achieved by reviewing the attributes of the fit **RFE** object (or fit **RFECV**
object). The **support_** attribute reports True or False for each feature, in order of column
index, indicating whether it was included, and the **ranking_** attribute reports the relative ranking of features in the
same order. The example below fits an RFE model on the whole dataset and selects five features,
then reports each feature column index (0 to 9), whether it was selected or not (True or False),
and the relative feature ranking.

In [None]:
# Report which features were selected by RFE (Recursive Feature Elimination)

# Define the dataset
# 'X' is the feature matrix, and 'y' is the target variable
# Using 'make_classification' to generate a synthetic dataset with 1000 samples and 10 features
# 5 informative features and 5 redundant features are used
X, y = make_classification(
    n_samples=1000,  # number of samples
    n_features=10,   # total number of features
    n_informative=5, # number of informative features
    n_redundant=5,   # number of redundant features
    random_state=1   # for reproducibility
)

# Define the RFE (Recursive Feature Elimination) model
# 'estimator' is the model used to evaluate the feature importance (here a DecisionTreeClassifier)
# 'n_features_to_select' specifies how many features should be selected
rfe = RFE(estimator=DecisionTreeClassifier(), n_features_to_select=5)

# Fit RFE on the dataset
# This recursively removes the weakest features and records a rank for every feature
rfe.fit(X, y)

# Summarize the selected features and their ranks
# 'rfe.support_' indicates which features are selected (True/False)
# 'rfe.ranking_' shows the ranking of all features (every selected feature is assigned rank 1)
for i in range(X.shape[1]):
    print("Column: %d, Selected=%s, Rank: %d" % (i, rfe.support_[i], rfe.ranking_[i]))
    # Print the index of the feature, whether it's selected (True/False), and its ranking
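
Beyond printing the mask, it is often convenient to get the selected column indices directly and to reduce the dataset to those columns. The sketch below uses the `get_support` and `transform` methods of the fitted selector. The fixed `random_state` on the tree is an assumption added so that repeated runs pick the same columns.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.tree import DecisionTreeClassifier

# Same synthetic dataset shape used throughout this tutorial
X, y = make_classification(
    n_samples=1000, n_features=10, n_informative=5, n_redundant=5, random_state=1
)

rfe = RFE(estimator=DecisionTreeClassifier(random_state=1), n_features_to_select=5)
rfe.fit(X, y)

# Integer indices of the columns that were kept
selected_idx = rfe.get_support(indices=True)
# Reduce X to only the selected columns
X_selected = rfe.transform(X)

print("Selected column indices:", selected_idx)
print("Reduced shape:", X_selected.shape)
```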


### Explore Estimator Algorithm

There are many algorithms that can be used in the core RFE, as long as they provide some
indication of variable importance. Most decision tree algorithms are likely to report the same
general trends in feature importance, but this is not guaranteed. It might be helpful to explore
the use of different algorithms wrapped by RFE. The example below demonstrates how you
might explore this configuration option.

In [None]:
# explore the algorithm wrapped by RFE

# Function to generate a dataset for classification
def get_dataset():
    # Create a synthetic classification dataset with 1000 samples, 10 features, 5 informative and 5 redundant features
    X, y = make_classification(
        n_samples=1000, n_features=10, n_informative=5, n_redundant=5, random_state=1
    )
    return X, y  # Return the features (X) and labels (y)


# Function to create a dictionary of models wrapped with RFE (Recursive Feature Elimination)
def get_models():
    models = {}  # Initialize an empty dictionary to store models

    # Logistic Regression (lr)
    # RFE is wrapped around LogisticRegression to select the top 5 features
    rfe = RFE(estimator=LogisticRegression(), n_features_to_select=5)
    model = DecisionTreeClassifier()  # Use DecisionTreeClassifier as the final model
    models["lr"] = Pipeline(steps=[("s", rfe), ("m", model)])  # Create a pipeline with RFE and model

    # Perceptron (per)
    # Wrap RFE around Perceptron estimator to select 5 features
    rfe = RFE(estimator=Perceptron(), n_features_to_select=5)
    model = DecisionTreeClassifier()  # Use DecisionTreeClassifier as the final model
    models["per"] = Pipeline(steps=[("s", rfe), ("m", model)])  # Add to the models dictionary

    # Decision Tree Classifier (dtc)
    # Wrap RFE around DecisionTreeClassifier estimator
    rfe = RFE(estimator=DecisionTreeClassifier(), n_features_to_select=5)
    model = DecisionTreeClassifier()  # Use DecisionTreeClassifier as the final model
    models["dtc"] = Pipeline(steps=[("s", rfe), ("m", model)])

    # Random Forest Classifier (rf)
    # Wrap RFE around RandomForestClassifier estimator
    rfe = RFE(estimator=RandomForestClassifier(), n_features_to_select=5)
    model = DecisionTreeClassifier()  # Use DecisionTreeClassifier as the final model
    models["rf"] = Pipeline(steps=[("s", rfe), ("m", model)])

    # Gradient Boosting Classifier (gbm)
    # Wrap RFE around GradientBoostingClassifier estimator
    rfe = RFE(estimator=GradientBoostingClassifier(), n_features_to_select=5)
    model = DecisionTreeClassifier()  # Use DecisionTreeClassifier as the final model
    models["gbm"] = Pipeline(steps=[("s", rfe), ("m", model)])

    return models  # Return the dictionary containing models with RFE


# Function to evaluate a given model using cross-validation
def evaluate_model(model, X, y):
    # Define cross-validation strategy: RepeatedStratifiedKFold with 10 splits and 3 repeats
    cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)
    # Perform cross-validation and calculate accuracy scores
    scores = cross_val_score(model, X, y, scoring="accuracy", cv=cv, n_jobs=-1)
    return scores  # Return the accuracy scores


# Define the dataset by calling the function get_dataset
X, y = get_dataset()

# Get the models to evaluate by calling get_models
models = get_models()

# Initialize lists to store results and model names
results, names = [], []

# Loop through each model in the models dictionary
for name, model in models.items():
    # Evaluate each model using cross-validation and store the results
    scores = evaluate_model(model, X, y)
    results.append(scores)  # Append the scores for this model
    names.append(name)  # Append the model name
    # Print the model name, mean accuracy, and standard deviation of the accuracy scores
    print(">%s %.3f (%.3f)" % (name, mean(scores), std(scores)))  # Print the results


In this case, the results suggest that linear algorithms like logistic regression might select better features more reliably than the decision tree and
ensemble-of-decision-trees algorithms evaluated here.

In [None]:
# Plot a box and whisker plot to compare accuracy across the wrapped algorithms
# (pyplot was imported in the Import libraries cell)
# 'results' contains the cross-validation accuracy scores for each wrapped algorithm
# 'names' is a list of the labels corresponding to each entry in 'results'
pyplot.boxplot(results, showmeans=True)  # Create the boxplot and display means on each box

# Set the x-axis tick labels with 'pyplot.xticks' so each box is labeled with its name
pyplot.xticks(ticks=range(1, len(names) + 1), labels=names)

# Display the plot to the user
pyplot.show()


A box and whisker plot is created for the distribution of accuracy scores for each configured
wrapped algorithm. We can see the general trend of good performance with logistic regression,
the decision tree (DTC), and perhaps gradient boosting (GBM). The model used within RFE can make an important
difference to which features are selected and, in turn, to the performance on the prediction problem.
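
An estimator can only be wrapped by RFE if it provides some indication of variable importance after fitting. In scikit-learn this is typically a `coef_` attribute (linear models) or a `feature_importances_` attribute (tree-based models). The following sketch checks which of the two a few of the estimators used above expose. The `candidates` and `importance_source` names are illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression, Perceptron
from sklearn.tree import DecisionTreeClassifier

# Same synthetic dataset shape used throughout this tutorial
X, y = make_classification(
    n_samples=1000, n_features=10, n_informative=5, n_redundant=5, random_state=1
)

candidates = {
    "lr": LogisticRegression(),
    "per": Perceptron(),
    "dtc": DecisionTreeClassifier(),
}

importance_source = {}
for name, estimator in candidates.items():
    # The importance attributes only exist after the estimator has been fit
    estimator.fit(X, y)
    if hasattr(estimator, "coef_"):
        importance_source[name] = "coef_"
    elif hasattr(estimator, "feature_importances_"):
        importance_source[name] = "feature_importances_"
    else:
        importance_source[name] = None  # not usable with RFE's default settings
    print(">%s exposes %s" % (name, importance_source[name]))
```

Because linear models rank features by coefficient magnitude and trees by impurity-based importance, it is plausible for them to keep different columns, which is one reason the results above differ between wrapped algorithms.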

## Conclusion

Through this tutorial, we learned how to effectively use Recursive Feature Elimination for feature selection in both classification and regression problems. We explored key hyperparameters like the number of features to select and the choice of wrapped algorithm. The techniques covered demonstrate how RFE can be used to identify the most relevant features for predictive modeling tasks.

## Clean up

Remember to shut down your Jupyter Notebook environment and delete any unnecessary files or resources once you've completed the tutorial.