
Error with sklearn.ensemble.GradientBoostingClassifier #2

Closed
james-pearce opened this issue Jun 29, 2020 · 3 comments

Comments

@james-pearce

With a model using GradientBoostingClassifier, I get an error:

AssertionError: len(shap_explainer.expected_value)=1 and len(labels)={len(self.labels)} do not match!

Code:

from explainerdashboard.explainers import ClassifierExplainer
from explainerdashboard.datasets import titanic_survive, titanic_names

from sklearn.ensemble import GradientBoostingClassifier
# from sklearn.ensemble import RandomForestClassifier

# load classifier data
X_train, y_train, X_test, y_test = titanic_survive()
train_names, test_names = titanic_names()

# one-line example:
# model = RandomForestClassifier(n_estimators=50, max_depth=5)  # this works
model = GradientBoostingClassifier(n_estimators=50, max_depth=5)  # this raises
model.fit(X_train, y_train)

explainer = ClassifierExplainer(model, X_test, y_test)
explainer.plot_shap_contributions(index=0)
@oegedijk
Owner

Ah, one of the models that I had not gotten around to writing a test for :)

In any case I can see the issue. shap.TreeExplainer(model).expected_value outputs array([-0.5844817]). For most binary classification models, expected_value outputs an np.ndarray([float probability/logodds negative class, float probability/logodds positive class]), or simply a float with the probability/logodds of the positive class. However, for GradientBoostingClassifier it for some reason outputs an np.ndarray([logodds positive class]). I had not taken care of that corner case yet. (I think shap tries to follow the output format of the underlying model, but this does lead to some confusing heterogeneity in output formats.)

In any case I will add some code to autodetect this (and also force GradientBoostingClassifier to output probabilities by default, and add some integration tests for GradientBoostingClassifier and HistGradientBoostingClassifier).
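The autodetection described above could look roughly like the following minimal sketch. The helper name and the exact normalization are hypothetical (not taken from the explainerdashboard source); it assumes a single-element expected_value is the positive-class log-odds, as observed for GradientBoostingClassifier, and mirrors its sign for the negative class:

```python
import numpy as np

def normalize_expected_value(expected_value):
    """Normalize shap's expected_value into a two-element array:
    [base value negative class, base value positive class].

    Hypothetical helper illustrating the corner case discussed above.
    """
    ev = np.atleast_1d(np.asarray(expected_value, dtype=float))
    if ev.size == 2:
        # already one base value per class (the common case)
        return ev
    if ev.size == 1:
        # GradientBoostingClassifier-style output: a single log-odds
        # value for the positive class; the negative class gets the
        # opposite sign (log-odds are symmetric around zero)
        return np.array([-ev[0], ev[0]])
    raise ValueError(f"unexpected expected_value shape: {ev.shape}")
```

For the array([-0.5844817]) seen above this would yield a length-2 base-value array, so a length check against the two class labels would then pass.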

Thanks for letting me know, and let me know if you run into any other issues with other models!

@oegedijk
Owner

Released a fix with version 0.1.13.

Seems to work on my end; let me know if it works for you as well.

@james-pearce
Author

Works for me! I am astonished by the speed of your response.

Best
James
