feature: Added AdaBoostClassifier and its test_sum_match #1546
base: master
This commit adds AdaBoostClassifier support to shap/explainers/_tree.py. I have also added the corresponding test_sum_match function in tests/explainers/test_tree.py. This is an extension of a previous pull request, shap#1219, which was awaiting the test in test_tree.py.
I only just realised that |
Somehow copy/pasted some incorrect code into shap/explainers/_tree.py
Thanks! It looks like the unit test for AdaBoostClassifier is failing right now. I am happy to review this once that is passing :) |
Hi, I've implemented the changes as necessary but I'm having trouble with the unit test. Everything appears to be working as expected. However, in some cases the values appear to sum well and in other cases there are some issues.
Here's the image below. I am hoping you could give some direction here as I am at a loss as to why this is not summing correctly. |
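For anyone trying to pin down which samples sum correctly and which do not, a small helper along these lines might be useful. This is only a sketch: `additivity_gaps` and `failing_rows` are hypothetical names, not part of shap, and the arrays at the bottom are made-up toy numbers rather than output from this PR. The tolerance of 1e-4 is the one used in the unit test.

```python
import numpy as np

def additivity_gaps(shap_values, expected_value, predicted):
    """Per-row |sum of SHAP values + expected value - model output|."""
    shap_values = np.asarray(shap_values, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return np.abs(shap_values.sum(axis=1) + expected_value - predicted)

def failing_rows(shap_values, expected_value, predicted, tol=1e-4):
    """Indices of rows whose additivity gap is at or above the tolerance."""
    gaps = additivity_gaps(shap_values, expected_value, predicted)
    return np.flatnonzero(gaps >= tol)

# Toy data (hypothetical): rows 0 and 1 are additive, row 2 is not.
toy_shap = np.array([[0.1, 0.2], [0.3, -0.1], [0.0, 0.25]])
toy_expected = 0.5
toy_predicted = np.array([0.8, 0.7, 0.9])

print(failing_rows(toy_shap, toy_expected, toy_predicted))
```

Looking at the failing rows individually (rather than only the maximum error) may make it easier to see whether they share a feature value sitting exactly on a split threshold.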
@slundberg can you provide any guidance here? |
Sorry to leave this hanging! It could be because AdaBoost uses a < comparison for thresholds instead of <= like the other sklearn methods. Or it could be that some param is not passed properly in setup. The best way to sort this out is to generate the smallest, simplest model that still has an issue and then see what broke. I can help with seeing what broke. Do you mind trying to strip the unit test down to a small single-tree ensemble for debugging? |
Hi, I've just seen this now. I will break down the unit test to each of the trees involved in the AdaBoost model. I'll probably need some 'explain like I'm five' level of help on certain parts. I'll come back in a couple of days with some extra info |
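One way to set up the small single-tree case suggested above, without involving shap at all, might look like the sketch below. It assumes a synthetic dataset from `make_classification` instead of the adult dataset, and `weighted_tree_average` is a hypothetical helper, not an existing shap or sklearn function. It compares the ensemble's `predict_proba` with a weight-normalized average of the per-tree probabilities, which is the quantity a `tree_output = "probability"` setup with normalized `estimator_weights_` would be expected to reproduce.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier

def weighted_tree_average(clf, X):
    """Weight-normalized average of each fitted tree's predict_proba."""
    n_fitted = len(clf.estimators_)  # boosting may stop early
    weights = np.asarray(clf.estimator_weights_[:n_fitted], dtype=float)
    weights = weights / weights.sum()
    probas = np.stack([tree.predict_proba(X) for tree in clf.estimators_])
    # Contract the estimator axis: (n_trees,) x (n_trees, n_samples, n_classes)
    return np.tensordot(weights, probas, axes=1)

# Small synthetic problem and a single-tree ensemble (assumed setup).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
clf = AdaBoostClassifier(n_estimators=1, random_state=0).fit(X, y)

averaged = weighted_tree_average(clf, X)
gap = np.abs(averaged - clf.predict_proba(X)).max()
print("max |weighted tree average - predict_proba| =", gap)
```

If the printed gap is not close to zero, that would suggest the model output being explained is not a plain weighted average of tree probabilities, which would be worth ruling out before looking at threshold comparisons.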
Just checking if there are any updates here? Thanks! |
Hi, apologies this completely slipped my mind. I'll take a look this weekend and revert |
Hello, has there been any progress in adding the feature? Unfortunately my paper depends on analysis of the AdaBoost results. Do you know of another way to explain the model? |
Hi, I am trying to solve this problem at the moment. I think I could use some support in identifying the problem @slundberg 😊 Here is the if statement I am using for the AdaBoostClassifier:
...
elif safe_isinstance(model, ("sklearn.ensemble.AdaBoostClassifier",
                             "sklearn.ensemble._weighted_boosting.AdaBoostClassifier",
                             "imblearn.ensemble.RUSBoostClassifier",
                             "imblearn.ensemble._weight_boosting.RUSBoostClassifier")):
    assert hasattr(model, "estimators_"), "Model has no `estimators_`! Have you called `model.fit`?"
    self.internal_dtype = model.estimators_[0].tree_.value.dtype.type
    self.input_dtype = np.float64
    self.trees = [SingleTree(e.tree_, normalize=True, scaling=weight, data=data, data_missing=data_missing)
                  for e, weight in zip(model.estimators_, model.estimator_weights_/sum(model.estimator_weights_))]
    self.objective = objective_name_map.get(model.base_estimator_.criterion, None)  # This line gets the decision criterion, for example gini.
    self.tree_output = "probability"
...
Here is the additivity check:
def test_sum_match_adaboost_classifier():
    X_train, X_test, Y_train, Y_test = sklearn.model_selection.train_test_split(*shap.datasets.adult(), test_size=0.2, random_state=0)
    clf = sklearn.ensemble.AdaBoostClassifier(random_state=202, n_estimators=1)
    clf.fit(X_train, Y_train)
    predicted = clf.predict_proba(X_test)
    ex = shap.TreeExplainer(clf)
    shap_values = ex.shap_values(X_test)
    assert np.abs(shap_values[0].sum(1) + ex.expected_value[0] - predicted[:,0]).max() < 1e-4, \
        "SHAP values don't sum to model output!"
Looking forward to hearing your insight! |