Added support for stacked ensemble pipelines to prediction explanations module#2971
Added support for stacked ensemble pipelines to prediction explanations module#2971christopherbunn merged 12 commits intomainfrom
Conversation
Codecov Report
@@ Coverage Diff @@
## main #2971 +/- ##
=======================================
+ Coverage 99.7% 99.7% +0.1%
=======================================
Files 307 307
Lines 29283 29288 +5
=======================================
+ Hits 29192 29197 +5
Misses 91 91
Continue to review full report at Codecov.
|
e7cc7bb to
0f1d400
Compare
| any CatBoost models. For Stacked Ensemble models, the SHAP value for each input pipeline's `predict_proba()` | ||
| (or `predict()` for regression pipelines) into the metalearner is used. |
There was a problem hiding this comment.
This sentence makes sense, but with the parentheses it's a little hard to parse. I'm not sure we need to go so details as to differentiate between predict_proba and predict, maybe we could rephrase this to "For Stacked Ensemble models, the SHAP value for each input pipeline's predict function into the metalearner is used"
freddyaboulton
left a comment
There was a problem hiding this comment.
@christopherbunn Thanks for making this enhancement! I left some test suggestions but the implementation looks good to me.
| X_y_multi, | ||
| X_y_regression, | ||
| ): | ||
| classifier_pl = { |
There was a problem hiding this comment.
I think lime is supported. Can we test for it?
docs/source/release_notes.rst
Outdated
| ------------- | ||
| **Future Releases** | ||
| * Enhancements | ||
| * Added metalearner prediction explanations :pr:`2971` |
There was a problem hiding this comment.
nit: Maybe "Added support for stacked ensemble pipelines to prediction explanations module"?
| ): | ||
| classifier_pl = { | ||
| "Imputer": ["Imputer", "X", "y"], | ||
| "Regression": ["Logistic Regression Classifier", "Imputer.x", "y"], |
There was a problem hiding this comment.
Another point, can you test once on the fraud dataset? I think it'll be good to have coverage for pipelines with more complex feature engineering, e.g. OHE + DateTimeFeaturizer.
There was a problem hiding this comment.
Agreed, updated the test case for the binary class to use fraud_100.
b8777ec to
a07b49b
Compare
f9a737b to
872c832
Compare
angela97lin
left a comment
There was a problem hiding this comment.
I love it, this looks good to me! 🚢 👏
Resolves #2963