BUG: LightGBM with multiclass interaction TreeShap produces explainer error #3574
Comments
I do not see this as a bug. Internally, lightgbm creates
Edit: The problem I do see here, though, is that the error message is not really informative, and that even if one sets
@CloseChoice I am convinced that your earlier comment is not right. Here are my arguments:
Therefore my conclusion: this check of whether all leaves are covered by the background data should not be there at all if the background data is set to None (as done in my example), because then the tree_path_dependent option is used, which does not require a background dataset (see argument 1). See argument 2 that this is the case if compute interactions is set to False. Only when compute interactions is set to True does this error occur, and it can be functionally bypassed (see argument 3). Finally, your argument (argument 4) does not prove that the error message is right; it only says that the check works as intended. Please let me know if my argumentation makes sense to you. Thank you very much for your invaluable work! Addendum: that tree_path_dependent does not require a background dataset is also mentioned in the original SHAP paper.
I believe there is a central flaw in your argumentation, but let me explain this step by step. First of all, nobody said that you'd need a background dataset; you can extract the cover of each node directly from the model. Now, if a leaf is uncovered (cover = 0), a problem arises in the algorithm (see algorithm 2 in the paper "Consistent Individualized Feature Attribution for Tree Ensembles"), where one divides by the cover. Obviously one cannot do that if the leaf is uncovered. The flaw in your example is that you explain the same data you trained on, so while explaining you never evaluate the (in training) uncovered leaves. I would suspect that you'll run into an error when you evaluate examples that run through uncovered leaves (but I did not test that).
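To make the division-by-cover point concrete, here is a minimal sketch (not shap's actual implementation; all names are made up) of the path-dependent expectation, which is a cover-weighted average over children:

```python
# Minimal sketch: in the path-dependent formulation, a node's expected
# value is the cover-weighted average of its children, so the recursion
# divides by the parent's cover. A zero-cover (uncovered) node breaks it.
def expected_value(node, left, right, cover, value):
    if left[node] == -1:  # leaf node
        return value[node]
    lc, rc = left[node], right[node]
    return (cover[lc] * expected_value(lc, left, right, cover, value) +
            cover[rc] * expected_value(rc, left, right, cover, value)
            ) / cover[node]  # ZeroDivisionError if the node is uncovered

# A tiny covered tree: a root with two leaves, covers 6 and 4.
left, right = [1, -1, -1], [2, -1, -1]
print(expected_value(0, left, right, [10.0, 6.0, 4.0], [0.0, 1.0, 2.0]))  # 1.4
```

Running the same function with all covers set to 0 raises a ZeroDivisionError, which is the failure mode the check guards against.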
Thank you @CloseChoice for your answer! I modified my example from above to have a dedicated train and test set, as you suggested in your comment, and as you can see it still works to compute SHAP explanations with the background dataset set to None and tree_path_dependent, also for data not in the training set (the tst data). Only the one case where interactions is set to True in the multiclass setting throws the error reported above again (which I argue should not be tested for). Therefore, I don't see a flaw in my argumentation. Furthermore, I want to point out that I use tree_path_dependent with background data=None to explain data not seen during training on a regular basis, and I have never run into issues. Here is the code:
I suspect that in your comment above you are mixing up two different things. The point is that there are at least two methods to compute SHAP values with the TreeExplainer. The difference lies in how the conditional expectations are computed, which we need in order to decide how to handle correlated (or otherwise dependent) input features (https://shap.readthedocs.io/en/latest/generated/shap.TreeExplainer.html). Therefore, I am arguing again that in this case of multiclass AND interactions set to True AND background dataset set to None AND feature_perturbation is
PS: Please also remember #3187, where I show that it works if one bypasses the error with a dirty hack.
BTW, if one comments out lines 346 to 357 of _tree.py, i.e. this code where the error is generated,
then it works.
Hmm, I will take a deeper look into this when I am back from vacation, but there are three things:
You raised a valid concern that all examples except multiclass with interactions are working. That is what I'll take a look at. One difference certainly is that for
@CloseChoice thank you very much again for responding, I really appreciate your effort!
Thank you for the information that interactions=False uses a different implementation; I was not aware of that! Have a nice vacation!
@NegatedObjectIdentity thanks for responding and the code examples. Your 99%/1% example is a strong indication that we can in fact remove the check. I will provide code examples concerning 1. once I am back. We'll figure that out ;)
So I have looked deep into this, and while I cannot find an example where the code actually breaks if we remove the check, I still would not like to remove it. There is some indication that it could break, and I do not understand the code well enough to verify that it cannot break if we remove the check. I suspect that running into this error happens less often than we would run into other problems if we removed the check, so I would like to keep it unless you can help verify that removing it causes no problems.
from sklearn.datasets import load_digits
from lightgbm import LGBMClassifier
from shap.explainers import Tree as TreeExplainer
import numpy as np
data_mult = load_digits(as_frame=True)
# Train data
data_mult_trn = data_mult.data.iloc[:1750, :]
target_mult_trn = data_mult.target.iloc[:1750]
# Test data
data_mult_tst = data_mult.data.iloc[1750:, :]
target_mult_tst = data_mult.target.iloc[1750:]
model_mult = LGBMClassifier(verbosity=-1).fit(data_mult_trn, target_mult_trn)
explainer_mult = TreeExplainer(
model_mult,
data=None,
feature_perturbation='tree_path_dependent')
pred = model_mult.predict_proba(data_mult_tst, raw_score=True)
tree_idx_with_uncovered_leafs = [idx for idx, k in enumerate(explainer_mult.model.node_sample_weight) if np.sum(np.abs(k)) == 0]
# check this to find an uncovered leaf
explainer_mult.model.node_sample_weight[tree_idx_with_uncovered_leafs[0]]
explanations_mult_inter = explainer_mult(data_mult_tst, interactions=True)
assert np.allclose(explanations_mult_inter.base_values + explanations_mult_inter.sum((1, 2)).values, pred)
If I additionally set `self.model.node_sample_weight = np.zeros_like(self.model.node_sample_weight)` inside the explainer, then the assert above breaks. This is a rather artificial example, but I guess that we are just not finding the correct examples that actually run into the uncovered leaves. So from my side this is a nofix until we have more information.
@CloseChoice thank you again for your reply!
Please see the TreeNodes dataframe. Some of the nodes have a value of 0 (which is OK, since it means the prediction of that node is zero) and some have a weight of 0 (which is OK as well, because the weight measures the importance of a node, and with a learning rate of 0.1 compounded over 100 boosting rounds some nodes become very unimportant). However, the last column gives you the sample counts for all ~45000 nodes in the lightgbm model. None of them has a count below 20, since this is the min_child_samples parameter that was used during training. Hence, coverage is given for all nodes; there is not a single node without samples associated with it. I even argue that it is not possible to build a decision tree with zero samples in a node, especially as there is a hyperparameter that checks exactly for that (min_child_samples).
Therefore, I argue again that the assert should not be there for the case of tree_path_dependent and background data = None.
Specifically, it says "no background dataset"
If you can point me to the lightgbm docs where it says what you claim, I'll be convinced. But please understand that I have invested 5 hours into this and I do not really see an issue here. So, as mentioned, without further information I see this as a nofix.
@CloseChoice @jameslamb answered the question about nodes with no coverage in microsoft/LightGBM#6388. He confirms that there are several checks in place to prevent nodes without samples. I think we could solve this by not removing the check completely, but by adding
In my local copy this fix works. Happy to hear your opinion!
Issue Description
This is a follow-up to #3187. In #3187 I reported inconsistencies in the returned SHAP explainer object depending on the explanation task (e.g. binary/multiclass classification or regression; with or without feature interactions). This has been solved (see assessment below). Thank you very much for your awesome work! But there is still an error in the multiclass + interactions case.
My assessment:
Kind of task | Explainer object shape | My assessment
Minimal Reproducible Example
Traceback
Expected Behavior
I expected the return of a SHAP explainer object with calculated interactions. Please be aware that I explicitly pass `data=None` as well as `feature_perturbation='tree_path_dependent'` to the TreeExplainer. Hence, I do exactly what the error message says to do to get rid of the error. Please be aware that this works if interactions is set to False. Hence, I suspect that there is an error somewhere in the logic of showing the error message with `interactions=True`.
Bug report checklist
Installed Versions
Win11
Python 3.11.8
LightGBM 4.3.0
SHAP 0.45.0