Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix zero division in compute_expectations #566

Conversation

dvpolyakov
Copy link

Hello, @slundberg ) Thank you for writing this tool.

I have a request with a really short implementation.

I want to add new SHAP functionality to Yandex CatBoost. But due to the special structure of trees in CatBoost, there might be nodes with right and left leafs with both zero sample weights in them. As a result, after calling the _cext.compute_expectations function https://github.com/slundberg/shap/blob/master/shap/explainers/tree.py#L899
which calls compute_expectations function here https://github.com/slundberg/shap/blob/master/shap/_cext.cc#L99 and finally calling compute_expectations function here https://github.com/slundberg/shap/blob/master/shap/tree_shap.h#L509, if (left_weight + right_weight) equals to 0, this leads to zero division and nans in returned Python array of expected values in nodes of trees.

In other tree-based algorithms, such as XGBClassifier and RandomForestClassifier, at least one sample must be in each leaf, so this problem does not appear. But in CatBoost trees it appears.

I just have added additional if-statement to check that left_weight and right_weight are not zeroes.

@dvpolyakov
Copy link
Author

@slundberg , @imatiach-msft , could you, please, give an idea how to fix the continuous-integration conflicts? It looks like the continuous-integration app check failed because of some problems in unit tests for Keras LSTM: https://ci.appveyor.com/project/slundberg/shap/builds/24145185/job/q7dag65j3c8wat0f (at the bottom of the page) . Could you tell me, please, how to deal with the errors in the unit tests? Thank you in advance.

@slundberg
Copy link
Collaborator

Thanks @dvpolyakov! It looks like the unit test failures are due to a Keras bug: tensorflow/tensorflow#28102

@slundberg slundberg merged commit 50a6423 into shap:master May 3, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants