RE: shap values for ngboost model_output=1 #291

AliSamiiXOM opened this issue Jun 3, 2022 · 1 comment

Hi, and thanks for the great work! I am having trouble understanding what the SHAP values for model_output=1 represent. Here is a sample notebook:

https://github.com/AliSamiiXOM/ngboost_question/blob/main/shap_with_ngboost.ipynb

What I expect is that the sum of the SHAP values over all features equals the model's prediction minus the expected (base) value. This holds for the mean output (model_output=0), but as shown in the last cell of the linked notebook, the scale output does not satisfy this. This is most likely a question rather than a bug, but it may still be useful as a future reference to have it asked here.
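
For concreteness, here is a minimal sketch of the check I mean for the mean output (model_output=0); ngb_model and x_test stand in for a fitted NGBRegressor and its test features:

# Minimal sketch: SHAP additivity for the mean output (model_output=0).
# ngb_model and x_test are placeholders for a fitted NGBRegressor and test features.
import numpy as np
import shap

explainer = shap.TreeExplainer(ngb_model, model_output=0)
explanation = explainer(x_test)
base_values = explanation.base_values.reshape(-1)

# For the mean output, SHAP values plus the base value recover the predicted mean
pred_mean = ngb_model.predict(x_test)  # mean of the predictive distribution
assert np.abs(explanation.values.sum(1) + base_values - pred_mean).max() < 1e-5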


Penna88 commented Sep 14, 2023

Hi AliSamiiXOM,

I ran into the same issue when trying to interpret the SHAP values for NGBoost with a Gamma distribution.

Short answer:

The Shapley values are calculated for params[0] or params[1], depending on what you select as model_output.

The problem is that params[0] and params[1] mean different things depending on the distribution you are using.
For example, if you open ngboost/ngboost/distns and look at gamma.py, you can clearly see the relation between alpha, beta, and params at rows 39 and 40 (reproduced below for convenience):

self.alpha = np.exp(params[0])
self.beta = np.exp(params[1])

After applying np.exp, the meaning of the Shapley values is clear.
Below is the code I used for checking.

# SHAP analysis
import numpy as np
import shap

# 0 -> explain params[0] (log alpha), 1 -> explain params[1] (log beta)
model_out = 0

explainer = shap.TreeExplainer(ngb_model, model_output=model_out)
explanation = explainer(x_test)
explanation.base_values = explanation.base_values.reshape(-1)

# Get predictions
predicted = ngb_model.predict(x_test)  # mean of the predictive distribution
pred_alpha = ngb_model.pred_dist(x_test).params['alpha']
pred_beta = ngb_model.pred_dist(x_test).params['beta']

# Check the properties of the Explanation object
assert explanation.values.shape == (*x_test.shape,)
assert explanation.base_values.shape == (len(x_test),)

# The SHAP values sum to the raw parameter (log alpha or log beta),
# so exponentiate before comparing against alpha / beta
if model_out == 0:
    assert (
        np.abs(np.exp(explanation.values.sum(1) + explanation.base_values) - pred_alpha).max()
        < 1e-5
    )
else:
    assert (
        np.abs(np.exp(explanation.values.sum(1) + explanation.base_values) - pred_beta).max()
        < 1e-5
    )
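
The same reasoning should carry over to the Normal distribution from the original question. A minimal sketch, assuming normal.py parametrizes the distribution as loc = params[0] and scale = np.exp(params[1]), so that model_output=1 explains log(scale):

# Sketch for the Normal case: model_output=0 explains loc, model_output=1 explains log(scale).
# Assumes loc = params[0] and scale = np.exp(params[1]) as in ngboost/distns/normal.py.
import numpy as np
import shap

pred_loc = ngb_model.pred_dist(x_test).params['loc']
pred_scale = ngb_model.pred_dist(x_test).params['scale']

# model_output=0: SHAP values plus base value recover loc directly
exp0 = shap.TreeExplainer(ngb_model, model_output=0)(x_test)
assert np.abs(exp0.values.sum(1) + exp0.base_values.reshape(-1) - pred_loc).max() < 1e-5

# model_output=1: SHAP values sum to log(scale), so exponentiate before comparing
exp1 = shap.TreeExplainer(ngb_model, model_output=1)(x_test)
assert np.abs(np.exp(exp1.values.sum(1) + exp1.base_values.reshape(-1)) - pred_scale).max() < 1e-5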

Hope it helps.
