BUG: Categorical features in HistGradientBoostingRegressor #3436

Open
2 of 4 tasks
alexis-cvetkov opened this issue Dec 14, 2023 · 0 comments
Labels
bug Indicates an unexpected problem or unintended behaviour

Comments

Issue Description

Hi! I have noticed that shap produces unexpected SHAP values when explaining scikit-learn's HistGradientBoostingRegressor.
This model has a categorical_features parameter that indicates which columns should be treated as categorical, and setting it leads to strange SHAP values.

Below is a simple example with a single feature A taking 5 discrete values (0 to 4) and y = A ** 2. I compare the SHAP values obtained with and without the categorical_features parameter.

Figure 1: SHAP values WITHOUT categorical_features

Figure 2: SHAP values WITH categorical_features

Minimal Reproducible Example

import shap
import numpy as np
import pandas as pd
from sklearn.ensemble import HistGradientBoostingRegressor

X = pd.DataFrame(np.random.randint(0, 5, size=1000), columns=["A"])
y = X["A"] ** 2

# Use HistGradientBoostingRegressor WITHOUT categorical features
model = HistGradientBoostingRegressor()
model.fit(X, y)
explainer = shap.Explainer(model)
shap_values = explainer(X)
shap.plots.scatter(shap_values[:, "A"], color=shap_values, hist=False)

# Use HistGradientBoostingRegressor WITH categorical features
model = HistGradientBoostingRegressor(categorical_features=["A"])
model.fit(X, y)
explainer = shap.Explainer(model)
shap_values = explainer(X)
shap.plots.scatter(shap_values[:, "A"], color=shap_values, hist=False)
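
For reference, here is a rough sketch of what the SHAP values should look like in this setup. With a single feature, additivity means the SHAP value of a sample is just its prediction minus the base value (roughly the mean prediction over the data), so a model that fits y = A ** 2 well should give about A ** 2 - mean(A ** 2) whether or not categorical_features is set. The helper name `expected_single_feature_shap` is hypothetical (it is not part of shap), and the sketch assumes a perfect fit and a base value equal to the mean prediction.

```python
import numpy as np


def expected_single_feature_shap(x, f):
    """Reference SHAP values for a one-feature model f (hypothetical helper).

    With one feature, the SHAP value is f(x) minus the mean of f over the data.
    """
    preds = np.asarray(f(np.asarray(x)), dtype=float)
    return preds - preds.mean()


# Same data shape as the example above, with a fixed seed (an assumption,
# the original example is unseeded).
rng = np.random.default_rng(0)
A = rng.integers(0, 5, size=1000)

# Stand-in for a model that fits y = A ** 2 perfectly.
reference = expected_single_feature_shap(A, lambda a: a ** 2)

# One reference SHAP value per distinct value of A.
for value in np.unique(A):
    print(value, reference[A == value][0])
```

Comparing the scatter plots against these reference values should show which of the two settings deviates.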

Traceback

No response

Expected Behavior

No response

Bug report checklist

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest release of shap.
  • I have confirmed this bug exists on the master branch of shap.
  • I'd be interested in making a PR to fix this bug

Installed Versions

0.44

@alexis-cvetkov alexis-cvetkov added the bug Indicates an unexpected problem or unintended behaviour label Dec 14, 2023