catboost is unable to calculate shap values when set_scale_and_bias is used #2389

Open
antipisa opened this issue May 15, 2023 · 0 comments
catboost version: 1.0.6
Operating System: Linux
CPU: Y

Problem:
CatBoost cannot compute SHAP values once a non-identity scale has been set on the model with set_scale_and_bias: get_feature_importance raises a CatBoostError.


import catboost as cb
import shap
import numpy as np
import pandas as pd
N = 1000
p = 100
train_data = np.random.randn(N, p)
train_label = np.random.randn(N)


model = cb.CatBoostRegressor(num_boost_round=500, learning_rate=0.05)
pool = cb.Pool(train_data, train_label)
model.fit(pool, verbose_eval=500)
model.set_scale_and_bias(0.5, 0.0)
print(model.get_scale_and_bias())


explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(pool)

Output:

0:	learn: 1.0010493	total: 9.97ms	remaining: 4.97s
499:	learn: 0.2587147	total: 2.02s	remaining: 0us
(0.5, 0.0)
---------------------------------------------------------------------------
CatBoostError                             Traceback (most recent call last)
/var/folders/n2/yv9l5srn20563nwmmh5yrk_w0000gn/T/ipykernel_26189/3254303356.py in <module>
     13 
     14 explainer = shap.TreeExplainer(model)
---> 15 shap_values = explainer.shap_values(pool)

~/miniconda3/lib/python3.7/site-packages/shap/explainers/_tree.py in shap_values(self, X, y, tree_limit, approximate, check_additivity, from_call)
    366                 if type(X) != catboost.Pool:
    367                     X = catboost.Pool(X, cat_features=self.model.cat_feature_indices)
--> 368                 phi = self.model.original_model.get_feature_importance(data=X, fstr_type='ShapValues')
    369 
    370             # note we pull off the last column and keep it as our expected_value

~/miniconda3/lib/python3.7/site-packages/catboost/core.py in get_feature_importance(self, data, type, prettified, thread_count, verbose, fstr_type, shap_mode, model_output, interaction_indices, shap_calc_type, reference_data, log_cout, log_cerr)
   3055             shap_calc_type = enum_from_enum_or_str(EShapCalcType, shap_calc_type).value
   3056             fstr, feature_names = self._calc_fstr(type, data, reference_data, thread_count, verbose, model_output, shap_mode, interaction_indices,
-> 3057                                                   shap_calc_type)
   3058         if type in (EFstrType.PredictionValuesChange, EFstrType.LossFunctionChange, EFstrType.PredictionDiff):
   3059             feature_importances = [value[0] for value in fstr]

~/miniconda3/lib/python3.7/site-packages/catboost/core.py in _calc_fstr(self, type, pool, reference_data, thread_count, verbose, model_output, shap_mode, interaction_indices, shap_calc_type)
   1774 
   1775     def _calc_fstr(self, type, pool, reference_data, thread_count, verbose, model_output, shap_mode, interaction_indices, shap_calc_type):
-> 1776         return self._object._calc_fstr(type.name, pool, reference_data, thread_count, verbose, model_output, shap_mode, interaction_indices, shap_calc_type)
   1777 
   1778     def _calc_ostr(self, train_pool, test_pool, top_size, ostr_type, update_method, importance_values_sign, thread_count, verbose):

_catboost.pyx in _catboost._CatBoost._calc_fstr()

_catboost.pyx in _catboost._CatBoost._calc_fstr()

CatBoostError: catboost/libs/fstr/calc_fstr.cpp:485: Non-identity {Scale} for feature importance is not supported
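A possible workaround, offered as a sketch rather than a confirmed fix: since the scaled prediction is scale * (sum of trees) + bias, and SHAP values are linear in the model output, one can temporarily reset the model to the identity scale, ask CatBoost for the raw SHAP values, and rescale them by hand. The helper name `shap_values_with_scale` is hypothetical and not part of catboost or shap. It assumes a single-output model (for example a regressor) whose ShapValues array has the expected value in the last column, and it calls `get_feature_importance` directly instead of going through `shap.TreeExplainer`.

```python
import numpy as np


def shap_values_with_scale(model, pool):
    """Hypothetical helper: SHAP values for a model with a non-identity scale.

    Assumes a single-output model whose ShapValues result has shape
    (n_objects, n_features + 1), with the expected value in the last column.
    """
    scale, bias = model.get_scale_and_bias()
    # Temporarily switch to the identity scale (and zero bias) so that the
    # feature-importance check on the scale passes.
    model.set_scale_and_bias(1.0, 0.0)
    try:
        raw = np.asarray(model.get_feature_importance(data=pool, type="ShapValues"))
    finally:
        # Always restore the user's scale and bias, even if the call fails.
        model.set_scale_and_bias(scale, bias)
    # Per-feature contributions scale linearly with the model output;
    # the bias only shifts the expected value.
    phi = scale * raw[:, :-1]
    expected_value = scale * raw[:, -1] + bias
    return phi, expected_value
```

With the model and pool from the report, `phi, expected_value = shap_values_with_scale(model, pool)` should satisfy `phi.sum(axis=1) + expected_value` ≈ `model.predict(pool)`, which is a quick way to check whether the rescaling assumption holds for a given model.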
