-
Notifications
You must be signed in to change notification settings - Fork 86
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added support for stacked ensemble pipelines to prediction explanations module #2971
Conversation
Codecov Report
@@ Coverage Diff @@
## main #2971 +/- ##
=======================================
+ Coverage 99.7% 99.7% +0.1%
=======================================
Files 307 307
Lines 29283 29288 +5
=======================================
+ Hits 29192 29197 +5
Misses 91 91
Continue to review full report at Codecov.
|
e7cc7bb
to
0f1d400
Compare
any CatBoost models. For Stacked Ensemble models, the SHAP value for each input pipeline's `predict_proba()` | ||
(or `predict()` for regression pipelines) into the metalearner is used. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sentence makes sense, but with the parentheses it's a little hard to parse. I'm not sure we need to go so details as to differentiate between predict_proba
and predict
, maybe we could rephrase this to "For Stacked Ensemble models, the SHAP value for each input pipeline's predict function into the metalearner is used"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@christopherbunn Thanks for making this enhancement! I left some test suggestions but the implementation looks good to me.
@@ -1596,29 +1592,76 @@ def test_explain_predictions_stacked_ensemble( | |||
X_y_multi, | |||
X_y_regression, | |||
): | |||
classifier_pl = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think lime is supported. Can we test for it?
docs/source/release_notes.rst
Outdated
@@ -2,6 +2,7 @@ Release Notes | |||
------------- | |||
**Future Releases** | |||
* Enhancements | |||
* Added metalearner prediction explanations :pr:`2971` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Maybe "Added support for stacked ensemble pipelines to prediction explanations module"?
@@ -1596,29 +1592,76 @@ def test_explain_predictions_stacked_ensemble( | |||
X_y_multi, | |||
X_y_regression, | |||
): | |||
classifier_pl = { | |||
"Imputer": ["Imputer", "X", "y"], | |||
"Regression": ["Logistic Regression Classifier", "Imputer.x", "y"], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another point, can you test once on the fraud dataset? I think it'll be good to have coverage for pipelines with more complex feature engineering, e.g. OHE + DateTimeFeaturizer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, updated the test case for the binary class to use fraud_100.
b8777ec
to
a07b49b
Compare
f9a737b
to
872c832
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I love it, this looks good to me! 🚢 👏
Resolves #2963