Added support for stacked ensemble pipelines to prediction explanations module #2971

Merged
christopherbunn merged 12 commits into main from 2963_metalearner_explanations on Nov 1, 2021

Conversation

christopherbunn

Resolves #2963

@codecov

codecov bot commented Oct 27, 2021

Codecov Report

Merging #2971 (7bff991) into main (43deb88) will increase coverage by 0.1%.
The diff coverage is 100.0%.

@@           Coverage Diff           @@
##            main   #2971     +/-   ##
=======================================
+ Coverage   99.7%   99.7%   +0.1%     
=======================================
  Files        307     307             
  Lines      29283   29288      +5     
=======================================
+ Hits       29192   29197      +5     
  Misses        91      91             
| Impacted Files | Coverage Δ |
|---|---|
| ...nderstanding/prediction_explanations/explainers.py | 100.0% <ø> (ø) |
| ...s/prediction_explanations_tests/test_explainers.py | 100.0% <100.0%> (ø) |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@christopherbunn christopherbunn marked this pull request as ready for review October 28, 2021 14:34
@christopherbunn christopherbunn requested review from a team and angela97lin October 28, 2021 14:34
Comment on lines 58 to 59
any CatBoost models. For Stacked Ensemble models, the SHAP value for each input pipeline's `predict_proba()`
(or `predict()` for regression pipelines) into the metalearner is used.

This sentence makes sense, but the parentheses make it a little hard to parse. I'm not sure we need to go into so much detail as to differentiate between `predict_proba` and `predict`. Maybe we could rephrase this to: "For Stacked Ensemble models, the SHAP value for each input pipeline's predict function into the metalearner is used."
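For readers unfamiliar with the idea behind the docstring sentence, here is a minimal sketch of what "the SHAP value for each input pipeline's prediction into the metalearner" can mean. It is not EvalML's implementation. It assumes a linear metalearner and independent inputs, in which case the SHAP value of input i reduces to the weight times the input's deviation from its background mean. The weights, intercept, background rows, and function names are all hypothetical.

```python
from statistics import mean


def linear_shap_values(weights, background, row):
    """SHAP values for a linear model f(x) = b + sum(w_i * x_i).

    Assumes independent features, so phi_i = w_i * (x_i - mean_i),
    where mean_i is the average of feature i over the background rows.
    """
    baselines = [mean(column) for column in zip(*background)]
    return [w * (x - b) for w, x, b in zip(weights, row, baselines)]


# Hypothetical metalearner: a linear model over two input pipelines' outputs
# (e.g. each pipeline's positive-class probability). Values are made up.
weights = [2.0, -1.0]
intercept = 0.5


def metalearner_predict(row):
    """Raw score of the hypothetical linear metalearner."""
    return intercept + sum(w * x for w, x in zip(weights, row))


# Each row holds the input pipelines' predictions for one training sample.
background = [
    [0.2, 0.4],
    [0.6, 0.8],
    [0.4, 0.6],
]

# Input pipelines' predictions for the sample being explained.
row = [0.9, 0.3]

phi = linear_shap_values(weights, background, row)
expected_value = mean(metalearner_predict(r) for r in background)

# Additivity check: expected value plus contributions recovers the prediction.
print(phi, expected_value + sum(phi), metalearner_predict(row))
```

In this setup each "feature" being explained is an input pipeline's output rather than an original column, which is why the explanation is reported per input pipeline.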

@freddyaboulton freddyaboulton left a comment

@christopherbunn Thanks for making this enhancement! I left some test suggestions but the implementation looks good to me.

@@ -1596,29 +1592,76 @@ def test_explain_predictions_stacked_ensemble(
    X_y_multi,
    X_y_regression,
):
    classifier_pl = {

I think LIME is supported. Can we test for it?

@@ -2,6 +2,7 @@ Release Notes
-------------
**Future Releases**
* Enhancements
* Added metalearner prediction explanations :pr:`2971`

nit: Maybe "Added support for stacked ensemble pipelines to prediction explanations module"?

@@ -1596,29 +1592,76 @@ def test_explain_predictions_stacked_ensemble(
    X_y_multi,
    X_y_regression,
):
    classifier_pl = {
        "Imputer": ["Imputer", "X", "y"],
        "Regression": ["Logistic Regression Classifier", "Imputer.x", "y"],

Another point: can you test once on the fraud dataset? I think it would be good to have coverage for pipelines with more complex feature engineering, e.g. OHE + DateTimeFeaturizer.

Contributor Author


Agreed, updated the test case for the binary class to use fraud_100.

@christopherbunn christopherbunn changed the title from "Added metalearner prediction explanations" to "Added support for stacked ensemble pipelines to prediction explanations module" on Oct 29, 2021
@angela97lin angela97lin left a comment

I love it, this looks good to me! 🚢 👏

@christopherbunn christopherbunn merged commit ffb7334 into main Nov 1, 2021
@chukarsten chukarsten mentioned this pull request Nov 9, 2021
@freddyaboulton freddyaboulton deleted the 2963_metalearner_explanations branch May 13, 2022 15:23
Successfully merging this pull request may close these issues.

Stacked Ensemble Prediction Explanations: Metalearner Explanations
4 participants