# RFE with LightGBM Implementation (coreDevX)
In this notebook, we will explore feature selection using Recursive Feature Elimination (RFE) with LightGBM (Light Gradient Boosting Machine). RFE is an efficient algorithm that selects the most relevant features by recursively considering smaller sets of features. LightGBM is a gradient boosting framework that is known for its efficiency and effectiveness. We'll leverage these tools to identify the most important features for a given machine learning model.

In [1]:
# Importing Necessary Libraries
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.feature_selection import RFE
from lightgbm import LGBMClassifier

In [2]:
# Here a demo data
df = pd.read_csv('UCI_Credit_Card.csv')
X = df.drop(['default.payment.next.month', 'ID'], axis="columns")
y = df['default.payment.next.month']

In [3]:
# Split to select the features only in train sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

In [4]:
# Set the number of features to be selected
n_features = 10

In [5]:
# Feature selection with RFE and LightGBM
estimator = LGBMClassifier(n_jobs=-1, num_leaves=31, max_depth=-1, min_child_samples=20, subsample=0.8, colsample_bytree=0.8, learning_rate=0.1, n_estimators=100)
selector_rfe_lgbm = RFE(estimator, n_features_to_select=n_features, step=1)
selector_rfe_lgbm = selector_rfe_lgbm.fit(X, y)
vector_names_rfe_lgbm = list(X.columns[selector_rfe_lgbm.support_])

In [6]:
# Extracting the selected feature to create new train and test sets
X_train_rfe = X_train[vector_names_rfe_lgbm]
X_test_rfe = X_test[vector_names_rfe_lgbm]

In [7]:
# Displaying Selected Features
print("Selected Features using RFE:")
for feature in vector_names_rfe_lgbm:
    print(feature)

Selected Features using RFE:
LIMIT_BAL
AGE
BILL_AMT1
BILL_AMT2
BILL_AMT4
BILL_AMT5
BILL_AMT6
PAY_AMT1
PAY_AMT2
PAY_AMT3
