## General SHAP Plots

The plots in this section summarize the SHAP values across all the data passed. By default, they display only the top 15 features according to each plot's ranking metric. To change this number, add a `"display_num"` parameter with an integer value to the config file.
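
As a rough illustration of how such a setting might be read, the sketch below parses a config and falls back to 15 when `display_num` is absent. The JSON format, the `load_display_num` helper, and the validation are assumptions made for this example; only the `display_num` key and the default of 15 come from the text above.

```python
import json

DEFAULT_DISPLAY_NUM = 15  # default stated in the text above


def load_display_num(config_text: str) -> int:
    """Return the number of features to display (hypothetical helper)."""
    config = json.loads(config_text)
    display = config.get("display_num", DEFAULT_DISPLAY_NUM)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(display, bool) or not isinstance(display, int) or display < 1:
        raise ValueError('"display_num" must be a positive integer')
    return display


# The resulting value plays the role of `display` in the plotting cells.
display = load_display_num('{"display_num": 20}')
print(display)
```

The value returned here corresponds to what the cells below pass as `max_display=display`.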

### Heatmap Plot

This plot offers a condensed, high-level overview of the data passed. It shows the data instances on the x-axis, the model inputs (features) on the y-axis, and the SHAP values encoded on a color scale. By default, the samples are ordered by hierarchical clustering of their explanation similarity, so samples that have similar model output for similar reasons are grouped together.

Features are ranked by mean absolute impact, meaning the highest feature in this plot has the highest average impact on the model decisions given the data passed.


In [None]:
shap.plots.heatmap(explanations, max_display=display)
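
To make the ranking metric concrete, here is a minimal NumPy sketch of ranking features by mean absolute impact and keeping the top few. The SHAP matrix, feature names, and `display` value are invented for illustration; this mirrors the idea behind the ranking rather than SHAP's internal code.

```python
import numpy as np

# Toy SHAP matrix: 4 instances (rows) x 3 features (columns); values are made up.
shap_values = np.array([
    [ 0.5, -0.1,  0.0],
    [-0.4,  0.2,  0.1],
    [ 0.6, -0.3,  0.0],
    [-0.5,  0.2, -0.1],
])
feature_names = ["f0", "f1", "f2"]  # hypothetical names

mean_abs = np.abs(shap_values).mean(axis=0)  # mean absolute impact per feature
ranking = np.argsort(-mean_abs)              # feature indices, highest impact first

display = 2  # hypothetical number of features to show
top_features = [feature_names[i] for i in ranking[:display]]
print(top_features)
```

Taking the absolute value before averaging matters: positive and negative contributions would otherwise cancel out, and a feature with large but mixed-sign impact would look unimportant.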

#### Prediction value ordered heatmap plot

This heatmap plot has its x-axis sorted in descending order of the model output value, from the highest output value on the left to the lowest on the right.

This plot can be useful for spotting features that display counter-intuitive behavior or clustering. We expect the feature colors (which represent the SHAP value) to form a gradient if they correlate with the model output. If the colors instead form clusters, then the feature does not necessarily correlate with the output.

In [None]:
shap.plots.heatmap(explanations, instance_order=explanations.sum(1), max_display=display)
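
The ordering used above can be sketched in plain NumPy. The example assumes the usual SHAP additivity property (model output equals the base value plus the sum of an instance's SHAP values), so sorting by the row sums is the same as sorting by model output. The base value and the matrix are invented for illustration.

```python
import numpy as np

base_value = 0.3  # hypothetical expected model output
shap_values = np.array([
    [ 0.2, -0.1],
    [ 0.4,  0.3],
    [-0.3,  0.1],
])

row_sums = shap_values.sum(axis=1)      # one value per instance, like explanations.sum(1)
model_output = base_value + row_sums    # additivity: base value + sum of SHAP values
instance_order = np.argsort(-row_sums)  # instance indices, highest output first

print(instance_order)
print(model_output[instance_order])
```

Because the base value is the same for every instance, adding it does not change the order.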

### Global Bar Plot

This plot gives a quick overview of the SHAP values for the data passed. Features are ranked by mean absolute impact.

The number to the right of each bar is the mean absolute SHAP value of that feature.

The higher the mean absolute SHAP value of a feature, the higher its average contribution to a model decision.


In [1]:
shap.plots.bar(explanations, max_display=display)

#### Minimum mean max impact bar plot

This plot shows features ranked by minimum mean absolute impact: the topmost feature has the lowest mean impact on a model decision overall. The values displayed are the maximum absolute impact of each feature.

This plot is useful if you are trying to catch a feature that does not contribute to the model decision, or contributes less than expected. If a feature is ranked high and its impact value is low, then it is not impacting your model decisions significantly in the data given.

**Caveat**: This plot can be misleading if you select a small sample of data to explain. A feature that ranks high in this plot may simply have low impact in the selected rows that were explained.

**Disclaimer**: The ordering is currently unreliable because SHAP rounds values to $10^{-2}$ before displaying them in a plot. This usually happens with models that have a large feature set. The values displayed are correct. If in doubt, refer to the "mean_shap_values.csv" and "max_shap_values.csv" files, which contain this information without rounding. If a feature has a value of 0 in both files, then it has no impact on the model for the data passed.

In [None]:
shap.plots.bar(explanations.abs.max(0), order=explanations.abs.mean(0)[::-1], max_display=display)
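
To see why rounding can blur the ordering, consider the small NumPy sketch below. The mean absolute values are invented: three of them are distinct but all below 0.005, so after rounding to two decimals they collapse to 0.0 and can no longer be told apart. This only illustrates the effect described in the disclaimer above; it does not reproduce SHAP's plotting code.

```python
import numpy as np

# Hypothetical mean absolute SHAP values for four features.
mean_abs = np.array([0.0041, 0.0012, 0.0038, 0.0200])

ascending = np.argsort(mean_abs)  # feature indices, lowest mean impact first
rounded = np.round(mean_abs, 2)   # two-decimal rounding, as in the disclaimer

# Features that become indistinguishable at the bottom of the ranking.
tied = np.flatnonzero(rounded == rounded.min())

print(ascending)
print(rounded)
print(tied)
```

With unrounded values the order is unambiguous, which is why the text recommends checking the CSV files when in doubt.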

### Beeswarm Plot

The beeswarm plot gives an information-dense overview of your SHAP values. Each row of data (i.e. each model decision) is represented by a dot on each feature row in the plot. The x-axis position of the dot is determined by the SHAP value of that feature for that decision. The further a dot is from 0, the higher the impact of that feature on that decision. This impact can be negative (to the left) or positive (to the right).

The feature value (not the SHAP value!) is marked by the color of the dot. Red signifies a high feature value, blue a low feature value. Features are ranked by their mean absolute impact on a model decision, so the top feature in this plot has the highest mean absolute impact.


In [None]:
shap.plots.beeswarm(explanations, max_display=display)
#plt.gcf().axes[-1].set_aspect(1000)
#plt.gcf().axes[-1].set_box_aspect(1000)

#### Beeswarm ranked by maximum impact

This beeswarm plot is ranked by the maximum absolute impact of your features. The highest-ranked feature in this plot has the highest maximum impact on a model decision. This can be relevant if you want to catch features that do not have a high impact on average but have a high maximum impact.

In [None]:
shap.plots.beeswarm(explanations, order=explanations.abs.max(0), max_display=display)
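
A small NumPy sketch shows how the two rankings can disagree. In the invented data below, the hypothetical feature `rare` contributes nothing for most instances but dominates a single decision, so it ranks first by maximum absolute impact even though `steady` ranks first by mean absolute impact.

```python
import numpy as np

# Toy SHAP matrix: 5 instances x 2 features; values are made up.
shap_values = np.array([
    [0.30, 0.00],
    [0.25, 0.00],
    [0.35, 0.00],
    [0.30, 0.00],
    [0.30, 0.90],  # "rare" only matters for this one instance
])
feature_names = ["steady", "rare"]  # hypothetical names

abs_values = np.abs(shap_values)
mean_abs = abs_values.mean(axis=0)  # average impact per feature
max_abs = abs_values.max(axis=0)    # largest single impact per feature

top_by_mean = feature_names[int(np.argmax(mean_abs))]
top_by_max = feature_names[int(np.argmax(max_abs))]
print(top_by_mean, top_by_max)
```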

#### Absolute mean beeswarm

This plot uses the same ranking as the first beeswarm plot, but shows the absolute SHAP values. This is useful if you want to see how much impact a feature has on average while also seeing where those impact values are clustered. It is a more information-rich version of the simple bar plot in the Global Bar Plot section.

In [None]:
shap.plots.beeswarm(explanations.abs, order=explanations.abs.mean(0), max_display=display)

In [None]:
shap.plots.beeswarm(explanations, order=explanations.abs.mean(0), max_display=display)

In [None]:
shap.plots.beeswarm(explanations.abs, max_display=display)

#### Minimum mean impact beeswarm plot

This is a beeswarm version of the minimum mean max impact bar plot. Here, however, the SHAP values are plotted as absolute values and the features are ranked by minimum mean impact. If all the dots of a feature are concentrated at 0.0, then this feature is likely not contributing to the model decision in a significant way.

The feature order in this plot may differ from the corresponding bar plot. This is due to SHAP rounding small values before display, as described in the disclaimer for the bar plot above.

In [None]:
shap.plots.beeswarm(explanations.abs, order=explanations.abs.mean(0)[::-1], max_display=display)
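
The same check can be done numerically instead of visually. The sketch below flags features whose absolute SHAP values never exceed a small threshold, and separately those that are exactly zero everywhere. The data, the feature names, and the `tolerance` value are assumptions chosen for illustration.

```python
import numpy as np

# Toy SHAP matrix: 3 instances x 3 features; values are made up.
shap_values = np.array([
    [ 0.20, 0.0, -0.001],
    [-0.10, 0.0,  0.002],
    [ 0.30, 0.0,  0.000],
])
feature_names = ["f0", "f1", "f2"]  # hypothetical names

max_abs = np.abs(shap_values).max(axis=0)  # largest impact each feature ever has
tolerance = 1e-2                           # hypothetical "negligible" threshold

negligible = [n for n, m in zip(feature_names, max_abs) if m < tolerance]
exactly_zero = [n for n, m in zip(feature_names, max_abs) if m == 0.0]
print(negligible, exactly_zero)
```

Using the maximum rather than the mean is the stricter test: a feature passes only if it is negligible for every explained instance, not just on average.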