eval_metric in EarlyStoppingShapRFECV not used for LGBMClassifier #259

Open
PaulZhutovsky opened this issue Apr 30, 2024 · 0 comments
Labels
bug Something isn't working

Comments

@PaulZhutovsky
Contributor

Describe the bug

The eval_metric passed when initializing EarlyStoppingShapRFECV does not appear to be used when the model is an LGBMClassifier (I did not test the other tree-based models).

Environment (please complete the following information):

  • probatus version: 3.1.0
  • python version: 3.11.8
  • OS: linux

To Reproduce

import lightgbm
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification

feature_names = [
    "f1",
    "f2_missing",
    "f3_static",
    "f4",
    "f5",
    "f6",
    "f7",
    "f8",
    "f9",
    "f10",
    "f11",
    "f12",
    "f13",
    "f14",
    "f15",
    "f16",
    "f17",
    "f18",
    "f19",
    "f20",
]

# Generate a synthetic binary classification data set
X, y = make_classification(
    n_samples=1000,
    class_sep=0.05,
    n_informative=6,
    n_features=20,
    random_state=0,
    n_redundant=10,
    n_clusters_per_class=1,
)
X = pd.DataFrame(X, columns=feature_names)

# Set roughly 20% of f2_missing to NaN (seeded for reproducibility) and make f3_static constant
np.random.seed(42)
X["f2_missing"] = X["f2_missing"].apply(lambda x: x if np.random.rand() < 0.8 else np.nan)
X["f3_static"] = 0

from probatus.feature_elimination import EarlyStoppingShapRFECV

model = lightgbm.LGBMClassifier(n_estimators=200, max_depth=3)

# Run feature elimination
shap_elimination = EarlyStoppingShapRFECV(
    model=model,
    step=0.2,
    cv=10,
    scoring="roc_auc",
    eval_metric="auc",
    early_stopping_rounds=5,
    n_jobs=3,
    verbose=2,
)
report = shap_elimination.fit_compute(X, y)

(This is the example data set and EarlyStoppingShapRFECV call from here, with verbose=2 added and the correct model actually passed in (model=model).)

Error traceback

No error is raised, but with verbose=2 the following warning appears:

[LightGBM] [Warning] Unknown parameter: eval_metric

Expected behavior

I would expect the eval_metric value to be used as the evaluation metric for early stopping.

Potential solution
I think the issue is that, for LGBMClassifier, eval_metric is set on the model here, which has no effect because the class does not have this parameter. What should be done (I assume) is this:

  1. Remove this part from the code.
  2. Add a line with eval_metric=self.eval_metric to this, since LGBMClassifier's fit method does have an eval_metric parameter.
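
To illustrate the two steps above, here is a minimal sketch of routing eval_metric into the fit call instead of setting it on the model. It is not probatus code: build_fit_kwargs and fit_accepts are hypothetical helpers, and StandInClassifier is a hypothetical stand-in that copies only the relevant shape of LGBMClassifier (eval_metric is a fit argument, not a constructor parameter), so the snippet runs without lightgbm installed.

```python
import inspect


def fit_accepts(model, name):
    """Return True if model.fit declares a parameter called `name`."""
    return name in inspect.signature(model.fit).parameters


def build_fit_kwargs(model, eval_set, eval_metric):
    """Collect early-stopping arguments to pass to model.fit (hypothetical helper).

    eval_metric is forwarded only when fit() declares it, so it is never
    pushed through set_params(), where the estimator may not know it.
    """
    kwargs = {"eval_set": eval_set}
    if fit_accepts(model, "eval_metric"):
        kwargs["eval_metric"] = eval_metric
    return kwargs


class StandInClassifier:
    """Hypothetical stand-in: eval_metric lives on fit(), not on __init__()."""

    def __init__(self, n_estimators=100):
        self.n_estimators = n_estimators

    def fit(self, X, y, eval_set=None, eval_metric=None):
        # Record which metric the fit call received; no real training happens.
        self.used_metric_ = eval_metric
        return self


model = StandInClassifier(n_estimators=200)
fit_kwargs = build_fit_kwargs(model, eval_set=[([[0.0]], [0])], eval_metric="auc")
model.fit([[0.0], [1.0]], [0, 1], **fit_kwargs)
print(model.used_metric_)
```

With a real LGBMClassifier the same keyword would be passed to fit together with eval_set; whether probatus should detect the parameter by signature or by model type is a design choice for the maintainers.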