Prediction explanations: multiclass catboost support #2224
Codecov Report
@@            Coverage Diff            @@
##             main    #2224     +/-   ##
=========================================
- Coverage   100.0%   100.0%   -0.0%
=========================================
  Files         288      288
  Lines       24495    24489      -6
=========================================
- Hits        24477    24470      -7
- Misses         18       19      +1
pytest.skip("Skipping because estimator and pipeline are not compatible.")

if problem_type == ProblemTypes.MULTICLASS and estimator.model_family == ModelFamily.CATBOOST:
    pytest.skip("Skipping Catboost for multiclass problems.")
After deleting this check, this test does verify that multiclass catboost works. I verified this by changing the multiclass catboost impl to error, and saw that this test errored out.
baseline_message = "You passed in a baseline pipeline. These are simple enough that SHAP values are not needed."
xg_boost_message = "SHAP values cannot currently be computed for xgboost models."
I didn't see this message being used anywhere. I think it's left over from a previous change.
# Because of this issue: https://github.com/slundberg/shap/issues/1215
if estimator.model_family == ModelFamily.CATBOOST and is_multiclass(pipeline.problem_type):
    # Will randomly segfault
    raise NotImplementedError("SHAP values cannot currently be computed for catboost models for multiclass problems.")
I don't think this is true anymore with shap >= 0.36.0!
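For readers who would rather keep a guard than delete the check outright, here is a minimal sketch of a version gate built around the 0.36.0 threshold mentioned above. This is not what the PR does (the PR removes the `NotImplementedError` branch); `parse_version` and `multiclass_catboost_supported` are hypothetical helper names, and the version string is passed in explicitly so the snippet does not depend on shap being installed.

```python
import re

# Threshold quoted in the review comment; treated here as an assumption.
MIN_SHAP_VERSION = (0, 36, 0)


def parse_version(version: str) -> tuple:
    """Turn '0.36.0' into (0, 36, 0), ignoring suffixes such as 'rc1'."""
    parts = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    # Pad so that '0.36' compares equal to '0.36.0'.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def multiclass_catboost_supported(shap_version: str) -> bool:
    """Hypothetical helper: True when shap is at least MIN_SHAP_VERSION."""
    return parse_version(shap_version) >= MIN_SHAP_VERSION
```

In a real code path the argument would likely come from `shap.__version__`, and the caller could raise the old `NotImplementedError` only when the helper returns `False`.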
features = check_array(features.values)

if estimator.model_family.is_tree_estimator():
    # Because of this issue: https://github.com/slundberg/shap/issues/1215
I think this was the wrong issue. This is a discussion about xgboost.
freddyaboulton left a comment
Looks good, @dsherry! Can you update the model understanding docs? There's a sentence saying we don't support prediction explanations for catboost multiclass.
5294e17 to 3c35a5e
Fixes #2136
The following code works after this PR:
which outputs
CatBoost Classifier w/ Imputer

{'Imputer': {'categorical_impute_strategy': 'most_frequent', 'numeric_impute_strategy': 'mean', 'categorical_fill_value': None, 'numeric_fill_value': None}, 'CatBoost Classifier': {'n_estimators': 10, 'eta': 0.03, 'max_depth': 6, 'bootstrap_type': None, 'silent': True, 'allow_writing_files': False}}

    1 of 1

        Class: class_0

        Feature Name       Feature Value   Contribution to Prediction
        ============================================================
        proline            1065.00         ++
        alcohol            14.23           +
        color_intensity    5.64            +

        Class: class_1

        Feature Name       Feature Value   Contribution to Prediction
        ============================================================
        alcohol            14.23           --
        proline            1065.00         --
        color_intensity    5.64            --

        Class: class_2

        Feature Name       Feature Value   Contribution to Prediction
        ==========================================================
        proline            1065.00         -
        flavanoids         3.06            -
        total_phenols      2.80            -
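The "Contribution to Prediction" column shows runs of `+` or `-` rather than raw SHAP values. The sketch below illustrates one way such a mapping could work. It assumes each SHAP value has already been normalized to the range [-1, 1] and that every additional 0.2 of magnitude adds one symbol, up to five. The function name, the step size, and the cap are all assumptions for illustration and are not stated anywhere in this PR.

```python
def contribution_symbols(normalized_value: float,
                         step: float = 0.2,
                         max_symbols: int = 5) -> str:
    """Map a normalized contribution to a string such as '++' or '-'.

    Assumed rule (not taken from the PR): the sign picks the symbol, and
    the magnitude, bucketed in increments of `step`, picks the count.
    """
    symbol = "+" if normalized_value >= 0 else "-"
    count = min(int(abs(normalized_value) // step) + 1, max_symbols)
    return symbol * count
```

Under these assumptions, a value of 0.25 would render as `++` and a value of -0.05 as `-`, which matches the shape of the output above without claiming to reproduce its underlying numbers.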