Does SHAP give very high importance to outliers? #960

Closed
pidahbus opened this issue Dec 17, 2019 · 8 comments
Labels
stale Indicates that there has been no recent activity on an issue

Comments

@pidahbus

Hi @slundberg, I have some confusion about using SHAP values to measure feature importance. You have already mentioned that for linear regression the SHAP values are given by:

shap_values = regression_coefficients * (X - X.mean(0))

From that equation, the contribution assigned to a feature for a data point is directly proportional to how far the feature's value lies from its mean. This means that if there is an outlier (a very large value in X) for some feature, that feature may come out as the most important one most of the time, simply because of its large value.

For example, take a fitted regression line y_hat = b0 + b1 * x1 + b2 * x2 with b0 = 3, b1 = 0.08 and b2 = 0.9, i.e. y_hat = 3 + 0.08 * x1 + 0.9 * x2.
Now say we have a data point with x1 = 900 and x2 = 5 for which I want to calculate the SHAP values.
Here, despite b1 being very small, feature 1 comes out as the most important because of the very large value of x1, and despite b2 being large, feature 2 comes out as less important than feature 1 because of its smaller absolute value at this data point.
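Plugging the toy numbers into that formula (a minimal sketch; the feature means of 10 and 4 are made up, since the example above does not specify them):

```python
# shap_values = coefficients * (x - X.mean(0)) for the toy model above.
import numpy as np

coefs = np.array([0.08, 0.9])      # b1, b2 from the fitted line
X_mean = np.array([10.0, 4.0])     # assumed training-set means of x1, x2 (not given above)
x = np.array([900.0, 5.0])         # the test point

shap_values = coefs * (x - X_mean)
print(shap_values)                 # [71.2, 0.9]: x1 dominates despite its tiny coefficient
```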

To describe the issue, I have compared the case with both SHAP and LIME.

  • The SHAP and LIME results for the original test point are shown below.

[Screenshot: SHAP and LIME explanations for the original test point]

  • Now I have introduced an outlier into the test point by changing Capital Loss from 0 to 20,000 (an arbitrarily large number). Doing so, I found that because of the large value of Capital Loss it becomes the most important variable for SHAP, since SHAP takes the feature's value (relative to its mean) into account. LIME, by contrast, measures feature importance through the regression coefficients, which is why the outlier did not affect it as much; the overall result still changes somewhat because the outlier alters the fitted coefficients.
    Is my thought process correct? If yes, is using the value of a feature a good way to measure its importance? If not, it would be really helpful if you could explain the difference between these two results.

[Screenshot: SHAP and LIME explanations after setting Capital Loss to 20,000]

@slundberg
Collaborator

Good questions. In the linear model SHAP does indeed give high importance to outlier feature values. This is correct (in my opinion) because the linear model itself also gives very high importance to those values. You can always multiply a feature by 100 and divide its corresponding coefficient by 100 and leave the linear model unchanged. Since this does not change the model outputs, it should also not change the explanations when those explanations are in the units of the model output (as opposed to the units of the feature). LIME tabular is actually also likely to change in response to outliers, though perhaps less so because of some details about binning etc. Also note that tree-based models are less sensitive to outliers than linear models, so the effect will be smaller there.
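A quick numeric check of that rescaling argument, written as a standalone sketch of the shap_values = coefficients * (X - X.mean(0)) formula from this thread (random data and made-up coefficients):

```python
# Rescaling a feature by 100 and its coefficient by 1/100 leaves both the model
# output and the SHAP values (per the linear formula above) unchanged.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 2))
coefs = np.array([0.08, 0.9])                # made-up coefficients for illustration

shap_orig = coefs * (X - X.mean(0))

X_scaled = X.copy()
X_scaled[:, 0] *= 100                        # feature 0 scaled up by 100
coefs_scaled = np.array([0.08 / 100, 0.9])   # its coefficient scaled down by 100

shap_scaled = coefs_scaled * (X_scaled - X_scaled.mean(0))
print(np.allclose(shap_orig, shap_scaled))   # True: explanations are unchanged
```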

@pidahbus
Author

Thanks for your reply @slundberg .

The explanation I gave in the form of a linear equation was just to make the discussion more interpretable. But if you look at the notebook results, you can see they were generated from a random forest model. There too, I find that introducing an outlier gives that feature a high SHAP value, and therefore a higher feature importance.

Could you please confirm the following points? I will re-use the equation shap_values = regression_coefficients * (X - X.mean(0)) below, which will help me formulate the questions.

  • For a fitted regression line y_hat = b0 + b1 * X1 + b2 * X2, the feature importance of, say, X1 is usually described as the change in y_hat for a one-unit change in X1 (for a continuous variable; for a categorical variable it is the change between levels), which is simply b1. We then run a t-test on b1 to check whether it is significant. To explain a single data point (x1, x2, y1), LIME fits a linear regression on nearby points and uses the resulting coefficients (b1, b2, etc.) as the feature importances for that point, so b1 or b2 itself is the importance. In SHAP, instead of b1 or b2, the importance is b1 * (x1 - X1.mean(0)). These are not the same. In linear regression (or LIME), for a positive coefficient b1 we can say that increasing (or decreasing) X1 will increase (or decrease) y_hat; that is, X1 affects the output positively and b1 measures the strength of that positive effect on the response y. A larger positive b1 means a larger positive effect, and a negative b1 means a negative effect.
    Can we interpret SHAP values the same way? That is, if we get a positive SHAP value for, say, feature X1, can we say that increasing that feature will increase the output response? And can we say that the magnitude of the SHAP value measures the effect of that feature on the output, like b1 or b2 do?

  • If yes, then consider the equation shap_values = regression_coefficients * (X - X.mean(0)) again.
    Let's assume regression_coefficients > 0 and X < X.mean(0), which gives shap_values < 0. This means that while the regression coefficient says the feature has a positive impact, the SHAP value says it has a negative effect. So I find that the explanations from linear regression (or LIME) and from SHAP are not the same, and in some cases even contradictory. It would be really helpful if you could give me some insight into the case where shap_values and regression_coefficients have opposite signs (see the tiny numeric check after this list).

  • The above points are my thoughts after reading both the LIME and SHAP papers. Kindly confirm whether my understanding is correct.
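A tiny numeric check of the sign question in the second bullet (made-up numbers):

```python
# Positive coefficient, but a feature value below the training mean -> negative SHAP value.
b1 = 2.0          # regression coefficient (> 0)
x1 = 3.0          # this data point's feature value
x1_mean = 5.0     # training-set mean of the feature, so x1 < x1_mean

shap_x1 = b1 * (x1 - x1_mean)
print(shap_x1)    # -4.0: the coefficient is positive, yet the SHAP value is negative
```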

@slundberg
Collaborator

There are a few tricky things to keep in mind here, and the best answer will be for me to finish up my tutorial on SHAP that I have been working on. But a few shorter answers here:

  1. For a linear (or additive) model, SHAP values trace out the partial dependence plot for each feature. So a positive SHAP value tells you that your value for that feature increases the model's output relative to typical values for that feature. For example, if you have a systolic blood pressure of 150, the average BP is 120, and higher blood pressure is bad for you, then you will get a positive SHAP value because your BP is worse than average. But if you have a BP of 110 you will get a negative SHAP value, because your BP is better than average (it lowers your risk relative to average). SHAP values tell you about the informational content of each of your features; they don't tell you how to change the model output by manipulating the inputs (other than what would happen if you "hide" those feature values). To know how the model output will change as you change the inputs, you would need to trace out a dependence_plot over many SHAP values. (A small numeric sketch of the BP example follows this list.)
  2. By default LIME behaves similarly to SHAP and does not give you linear model coefficients. If you use the LIME tabular explainer it will be much closer to beta * (X - X.mean(0)) than to beta (the reason it is not exactly the same is due to some binning it does). You could run LIME in the original input space and get beta as your explanation, but that is not how it works in any of its default configurations.
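As a small numeric sketch of the blood-pressure example in point 1 (the 0.03 risk-per-mmHg coefficient is invented purely for illustration):

```python
# Higher BP raises risk (positive coefficient); SHAP compares your value to the average.
risk_per_mmhg = 0.03      # assumed positive coefficient of systolic BP in a linear risk model
mean_bp = 120.0           # average systolic blood pressure in the background data

for bp in (150.0, 110.0):
    shap_bp = risk_per_mmhg * (bp - mean_bp)
    print(bp, shap_bp)    # 150 -> +0.9 (worse than average), 110 -> -0.3 (better than average)
```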

@pidahbus
Author

Thanks @slundberg again for your immediate reply.

  • While I am still struggling to understand the relation between the physical interpretation of SHAP values and regression coefficients, could you please give another example (like the BP example above) in which you interpret both the SHAP values and the regression coefficients, to help me understand the relation (or difference) between the two?

  • I will repeat the second point from my previous comment here. For the equation shap_values = regression_coefficients * (X - X.mean(0)): if regression_coefficients > 0 and X < X.mean(0), then shap_values < 0. This means that while the regression coefficient says the feature has a positive impact, the SHAP value says it has a negative effect. So the explanations from linear regression and SHAP are not the same, and in some cases even contradictory. It would be really helpful if you could give me some insight into the case where shap_values and regression_coefficients have opposite signs.

@slundberg
Collaborator

These are good questions that really need a longer answer than can fit here. I pushed a draft version of a tutorial-style notebook that starts by exploring SHAP on simple linear models. Check it out and see if it helps; I wrote some of it with your questions in mind :)

https://github.com/slundberg/shap/blob/master/notebooks/general/Explainable%20AI%20with%20Shapley%20Values.ipynb
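For anyone following along before reading the notebook, the linear-model part of it boils down to roughly this kind of usage (a sketch, not the notebook verbatim; it assumes a scikit-learn linear model and shap's LinearExplainer):

```python
# Explaining a fitted linear model with shap; for a linear model the result should
# match coef * (X - X.mean(0)) computed by hand.
import numpy as np
import shap
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
y = 3 + 0.08 * X[:, 0] + 0.9 * X[:, 1] + rng.normal(scale=0.1, size=500)

model = LinearRegression().fit(X, y)

explainer = shap.LinearExplainer(model, X)   # background data supplies the feature means
shap_values = explainer.shap_values(X)

manual = model.coef_ * (X - X.mean(0))
print(np.allclose(shap_values, manual, atol=1e-6))   # expected True with default settings
```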

@pidahbus
Author

Hi @slundberg,
Thank you so much for the detailed answer.

One last question. I am working on an NLP regression problem where, given a text response, I need to predict a score. Along with the predictions, I need to identify enablers and disablers.

Enablers are words (or phrases, or sentences) that have a positive effect on the output score, i.e. including the word increases the score.

Disablers are words (or phrases, or sentences) that have a negative effect on the output score, i.e. including the word decreases the score.

Now, can we treat words with positive SHAP values as enablers? If so, can we say that a word with a larger positive SHAP value is a stronger enabler than one with a smaller positive SHAP value? And can we say the same for disablers?
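For concreteness, a minimal sketch of splitting per-token SHAP values into enablers and disablers (the tokens and values below are made up, and how you obtain per-token SHAP values depends on your model and explainer):

```python
# Rank tokens by their SHAP values: positive -> enabler, negative -> disabler.
tokens = ["excellent", "delivery", "late", "refund"]   # made-up example tokens
token_shap_values = [0.42, 0.05, -0.31, -0.08]         # made-up per-token SHAP values

pairs = sorted(zip(tokens, token_shap_values), key=lambda p: abs(p[1]), reverse=True)
enablers  = [(t, v) for t, v in pairs if v > 0]        # push the predicted score up
disablers = [(t, v) for t, v in pairs if v < 0]        # push the predicted score down
print(enablers)
print(disablers)
```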


This issue has been inactive for two years, so it's been automatically marked as 'stale'.

We value your input! If this issue is still relevant, please leave a comment below. This will remove the 'stale' label and keep it open.

If there's no activity in the next 90 days the issue will be closed.

@github-actions github-actions bot added the stale Indicates that there has been no recent activity on an issue label Dec 29, 2023

This issue has been automatically closed due to lack of recent activity.

Your input is important to us! Please feel free to open a new issue if the problem persists or becomes relevant again.

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Mar 30, 2024