
Additivity check failed in TreeExplainer! (with check_additivity=False) #28

Open
mahboobctwd opened this issue Jan 2, 2024 · 4 comments

@mahboobctwd

Hi, I am trying to use the survshap library to extract SHAP values from a random survival forest model. I passed the training dataset to the SurvivalModelExplainer; the following code gives an idea:

from sksurv.ensemble import RandomSurvivalForest
from survshap import SurvivalModelExplainer, ModelSurvSHAP

# Fit a random survival forest on the training data
rsf = RandomSurvivalForest(
    n_estimators=8, n_jobs=-1, random_state=random_state
)
rsf.fit(X_train, y_train)

# Explain the fitted model, using the training data as background
explainer = SurvivalModelExplainer(model=rsf, data=X_train, y=y_train)
model_survshap = ModelSurvSHAP(calculation_method="treeshap")
model_survshap.fit(explainer=explainer)

My X_train contains some one-hot encoded features and I'm getting the following error:

ExplainerError: Additivity check failed in TreeExplainer! Please ensure the data matrix you passed to the explainer is the same shape that the model was trained on. If your data shape is correct then please report this on GitHub. This check failed because for one of the samples the sum of the SHAP values was 0.666667, while the model output was 0.000000. If this difference is acceptable you can set check_additivity=False to disable this check.

I passed check_additivity=False, but it didn't change the outcome.
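For reference, my understanding of what the check verifies: the expected value plus the sum of a row's SHAP values should reproduce the model's output for that row. A minimal sketch of the same check on a plain scikit-learn regressor (the model and data below are hypothetical stand-ins, not survshap internals):

import numpy as np
import shap
from sklearn.ensemble import RandomForestRegressor

# Hypothetical stand-in model and data
rng = np.random.default_rng(0)
X = rng.random((100, 5))
y = rng.random(100)
model = RandomForestRegressor(n_estimators=8, random_state=0).fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X, check_additivity=False)

# What check_additivity verifies: expected_value + per-row sum of
# SHAP values should equal the model prediction for that row
reconstructed = explainer.expected_value + shap_values.sum(axis=1)
print(np.abs(reconstructed - model.predict(X)).max())  # should be ~0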

@krzyzinskim
Collaborator

Hi,

I fixed it, and check_additivity=False should now work (there's a new version of the package on PyPI and in the repository).

However, such a large difference in values seems really suspicious. Are you sure it's the same data matrix and that everything is OK?

@mahboobctwd
Author

mahboobctwd commented Jan 4, 2024

Thanks for updating, @krzyzinskim.

Yes, I'm not passing any observation data, so it uses explainer.data for the analysis, and that's the same data I used to train the RSF model. That's the other thing that bugs me: I shouldn't need to disable the additivity check, since it's the same data.

Any pointers on how to go about debugging this?
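For instance, would sanity checks along these lines make sense (a sketch, assuming explainer.data and X_train are pandas DataFrames)?

import numpy as np

# Hypothetical sanity checks: the data held by the explainer should
# match the training matrix exactly in shape and column order
assert explainer.data.shape == X_train.shape
assert list(explainer.data.columns) == list(X_train.columns)

# Rule out dtype surprises from one-hot encoding (bool/object columns),
# since dtype conversions are one way the matrix the explainer sees can
# diverge from what the model was fit on
print(X_train.dtypes.value_counts())
assert np.array_equal(np.asarray(explainer.data, dtype=float),
                      np.asarray(X_train, dtype=float))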

@krzyzinskim
Collaborator

To be honest, I haven't encountered this problem and can't reproduce it, at least on the few datasets I just tested. Could you please send a reproducible example with the dataset on which this error occurs?

@ntnhu-ctump

Dear Dr. @krzyzinskim,

I have the same issue, even with survshap version 0.4.2. The error still occurs when I pass check_additivity=False in the code (as in your example). Thank you in advance for taking the time to look at this issue.

Best,
Noah
