How to interpret the SHAP force plot? #977

Open
SSMK-wq opened this issue Jan 2, 2020 · 15 comments


SSMK-wq commented Jan 2, 2020

Hello everyone,

I am trying to practice and learn the Shapley value approach to explain my predictions on a binary classification problem. However, I am having difficulty understanding the plot below.

[screenshot: SHAP force plot for a single sample]

  1. Does it indicate that day_2_balance pushes the prediction toward 1? Or do the blue values lead to a prediction of 1?

  2. What about the axis scale (-4.357 to 5.643)? How is it obtained?

  3. What does the base value mean?

  4. When I hover over the pink region, I see a few more column names with values. What do they indicate?

  5. Does the size of a feature's band represent its importance? That is, PEEP_min=5 has a larger band than the other features?

  6. What do the "higher" and "lower" labels indicate?

  7. Why is -2.92 alone in bold? If it is the predicted value, how can that be, given that I am working on a binary classification problem with labels 1 and 0?

Can someone help me with this?


ibuda commented Jan 2, 2020

Hi, I will try to respond in the order of the questions asked.

  1. As a matter of fact, all of the feature values (day_2_balance, PEEP_min, Fi02_100_max, etc.) lead to the prediction value of -2.92, which is then "transformed" to a value of 1. By "all" I mean even those that are not shown in the plot; SHAP displays only the most influential features for the sample under study. Features in red push the prediction value closer to 1; features in blue do the opposite.
  2. As you may already have understood, the model prediction values are not the discrete 0 and 1 but real (float) numbers: raw values. The scale here is a visualization of a small interval around the output and base values.
  3. The base value is the average of the model's output values over the training set.
  4. The pink (red) features in your example are many, each with a small (low-importance) value. The plot stacks them all together and shows their values on hover. The values you see are the raw values I mentioned above; they represent how much each of those features influences the final output of the model for the sample under study.
  5. Correct. In the case of PEEP_min, it has a negative importance, i.e. the prediction tends more toward 0 because of its value.
  6. "Higher" in pink (red) means that the pink values drag the prediction toward 1 (i.e. increase the raw output value), whereas the blue ones drag it toward 0 (i.e. decrease the model output value).
  7. You are right, -2.92 is the model output for your sample with index 4100. This is the "raw" value, which is then transformed into probability space to give you the final output of 0 or 1 (< 0.5 or > 0.5); see the sketch after this list.
    If you need more details on any of the above questions, or additional ones arise, please consult the tutorial notebooks; they are well explained and illustrate the entire span of use cases of this incredible package.
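To make the arithmetic in points 2, 3, and 7 concrete, here is a minimal sketch. The dataset and model are stand-ins (the adult census data that ships with shap plus an XGBoost classifier), not your pipeline; the additivity check at the end is exactly what places the bold number on the scale:

import shap
import xgboost

# Stand-in data and model; any tree-based binary classifier behaves the same way.
X, y = shap.datasets.adult()
model = xgboost.XGBClassifier().fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

i = 0  # sample to explain
# Additivity: base value + sum of this sample's SHAP values
# equals the model's raw (log-odds) output for that sample.
raw = explainer.expected_value + shap_values[i].sum()
print(raw, model.predict(X.iloc[[i]], output_margin=True))

# Single-sample force plot, analogous to the one discussed above.
shap.force_plot(explainer.expected_value, shap_values[i], X.iloc[i], matplotlib=True)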


SSMK-wq commented Jan 2, 2020

Hi @ibuda, just a quick question regarding point 1 of your response. You wrote "the prediction value of -2.92, which is then 'transformed' to a value of 1" - should this be "transformed to a value of 0"?

Because -2.92 is less than 0, the final predicted output for this observation (4100) should be class 0. Am I right?


ibuda commented Jan 2, 2020

> Hi @ibuda, just a quick question regarding point 1 of your response. You wrote "the prediction value of -2.92, which is then 'transformed' to a value of 1" - should this be "transformed to a value of 0"?
>
> Because -2.92 is less than 0, the final predicted output for this observation (4100) should be class 0. Am I right?

In general, yes and no. Even though it is lower than the base value of 0.6427, it might be 1 in some cases; I have seen this happen when the training dataset is highly imbalanced.
Coming back to your case, I think you are right: it is 0. I said it "transforms" to 1 (i.e. some probability higher than 0.5) because you mentioned "influences prediction to 1?" in your question. So I guess it was my typo from not paying attention to the context. :)
Let me know if you have any other questions.


vinrok commented Dec 21, 2020

Hello All,

I have just started learning about explainable AI and was implementing SHAP for it. But I am facing difficulty interpreting the results of the SHAP force_plot. I would be grateful if someone could help me with an intuitive way of interpreting it. 🙂 🙏

Consider the force_plot below for about 43 test samples of heart_data.

[screenshot: stacked SHAP force plot for 43 test samples]

  1. Does the numbering on the top x-axis represent samples in the dataset?
  2. How do I actually interpret the force_plot result, i.e. which feature contributes more to predicting whether a patient has heart disease or not?

Here is the force_plot for the 10th sample individually, and how we can relate it to the plot above:

[screenshot: SHAP force plot for the 10th sample]

Here is the link to my notebook - Explainable AI using SHAP


ibuda commented Dec 21, 2020

Hi @vinrok, great to hear that you are using shap. The answers to your questions are:

  1. Yes, the upper numbers on the x-axis are the indices of the sample data.
  2. The plot above is just a summary of 43 horizontally stacked individual plots, each rotated vertically. If you rotate one back (or, vice versa, rotate your second plot), you will see that they coincide for the 10th sample. You need to use summary_plot to see which features contribute more or less, and in what way, to the classification of your patients (see the sketch at the end of this comment).

I looked at your notebook; you did use summary_plot. There you can see that cp and thal are the top two features influencing the output of your model.

For force_plot I would add only the following: there is a dropdown in the upper-middle region of the graph. Use it to see what the plot shows for cp, the most important feature, and chol, the least important one. For cp the width of its band is large, whereas for chol it is not.

I also suggest you play with the left-side dropdown as well, and try using it in combination with the dropdown mentioned above.

Hope that answers your questions. If not, do let us know what other questions arise.
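For reference, a minimal sketch of both plot types. The data and model are stand-ins (scikit-learn's breast cancer dataset in place of the heart data, plus an XGBoost classifier), not the notebook's exact setup:

import shap
import xgboost
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

# Stand-in for the heart data: another binary medical classification dataset.
X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = xgboost.XGBClassifier().fit(X_train, y_train)
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)

# Stacked force plot: each column is one sample's force plot rotated 90 degrees.
# (In a notebook, call shap.initjs() first so the interactive plot renders.)
shap.force_plot(explainer.expected_value, shap_values, X_test)

# Global summary: which features push predictions up or down across all samples.
shap.summary_plot(shap_values, X_test)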


vinrok commented Dec 21, 2020

Thank you so much for this intuitive explanation @ibuda 😊🙏.


vinrok commented Dec 21, 2020

But @ibuda, I am still confused about the pink and blue bands. Consider "sample order by similarity" and f(x). Can we say that for this plot we are putting the patients with the most similar features together?

In that case, how do I interpret the prediction in terms of pink and blue, and by hovering over it?

For the samples in range 8-17 from the test dataset, most of them have the label 1 (heart disease), but in this plot the SHAP value goes below the base value.


ibuda commented Dec 21, 2020

I think you are confusing predictions with y_test. SHAP gives you information about what your model predicted, not what the real value is supposed to be.

That is why you get that discrepancy.

Anyway, the blue bands show which features drag the final output value down (toward class 0) and by how much, and the pink bands show those that increase it (toward class 1).

Try to look at your notebook with this in mind, and let me know if that helps.
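Continuing the sketch above (same hypothetical explainer, shap_values, and y_test), a quick way to see the distinction in code: reconstruct the model's raw outputs from the SHAP decomposition, threshold them into class labels, and compare against y_test; the two need not agree.

import numpy as np
from scipy.special import expit  # logistic sigmoid

# Raw output per sample, reconstructed from the SHAP decomposition:
# base value + row-wise sum of SHAP values.
raw = explainer.expected_value + shap_values.sum(axis=1)

# The force plot displays `raw`; predicted labels come from thresholding it.
pred = (expit(raw) > 0.5).astype(int)
print("predicted:", pred[8:17])
print("actual:   ", np.asarray(y_test)[8:17])  # may disagree with predictions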


vinrok commented Dec 21, 2020

Got it, @ibuda. The idea is somewhat clearer to me now. 😊


ibuda commented Dec 21, 2020

Hi @SSMK-wq, for the sake of consistency, if your questions were answered, please consider closing this issue. Thank you.

@herewego321

> You are right, -2.92 is the model output for your sample with index 4100. This is the "raw" value, which is then transformed into probability space to give you the final output of 0 or 1 (< 0.5 or > 0.5).

Hi, you mentioned that -2.92 is the raw output and that it can be transformed into probability space. But how can I do such a transformation? I have run into the same question. My prediction model is LightGBM and I am using shap.TreeExplainer(). Maybe you could give me a hint on where I can find the transformation equation?


hjanh commented Jan 18, 2022

> But how can I do such a transformation? My prediction model is LightGBM and I am using shap.TreeExplainer().

These values are log-odds because this example is a (binary) classification task. You can easily convert them back:

import math

def logodds_to_prob(logit):
    # Invert the log-odds: probability = odds / (1 + odds).
    odds = math.exp(logit)
    return odds / (1 + odds)
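Equivalently, scipy ships this function as scipy.special.expit. Applying it to the number from this thread, -2.92 maps to roughly 0.05, well below the 0.5 threshold:

from scipy.special import expit

print(logodds_to_prob(-2.92))  # ~0.051
print(expit(-2.92))            # same value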

@prashanthin

How to interpret the SHAP force plot below?
Hello everyone,

I am trying to plot a force plot with all points in my data, but I am having difficulty interpreting and understanding the plot below. Here is the code line: shap.force_plot(explainer.expected_value, shap_values, X_test)

In my case, Demand value is my dependent variable and Adobe Visits is my independent variable.

What does the dropdown on the x-axis mean?
What does the dropdown on the y-axis mean?
What is meant by "effects" on the y-axis?
What does it mean when I select "Adobe Visits effects" in the y-axis dropdown?
What does the "average sample" that pops up when hovering over a region of the graph mean?
What does the f(x) in the y-axis dropdown mean? Does it mean a function of all my independent variables?
How do I interpret from this graph how effective Adobe Visits is in driving Demand?
How are the interaction/cross effects of the other variables calculated when we change variables in the dropdown (see the second image)?

[screenshot: interactive SHAP force plot over all samples]

[screenshot: interactive SHAP force plot with a variable selected in the dropdown]

Can someone please help with this?
Thanks.

@thomaschateau


Hello @prashanthin,

Did you find any answer?

Regards

@MarioIuliano87

> These values are log-odds ... you can easily convert them back (see logodds_to_prob above).

This answer was the game changer in my case. Thanks a lot!
