## Model explanation using SHAP values & plots

### Shapley values: a brief introduction

SHAP values are generated using the [SHAP library](https://shap.readthedocs.io/en/latest/index.html) and are approximations of [Shapley values](https://en.wikipedia.org/wiki/Shapley_value), a concept derived from game theory. Briefly, for every model decision passed to the explainer, the value for a feature indicates how that decision would change if the feature were removed. For a more in-depth explanation, please refer to this [summary article](https://towardsdatascience.com/understanding-how-ime-shapley-values-explains-predictions-d75c0fceca5a).
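
To make the idea concrete, here is a minimal sketch that computes exact Shapley values for a toy three-feature model by averaging each feature's marginal contribution over all feature orderings. The model `toy_model`, the baseline, and the example values are invented for illustration and are not part of `rsmexplain` or the SHAP library. "Removing" a feature is simulated here by replacing it with its baseline value, which is only one of several possible conventions.

```python
from itertools import permutations


def toy_model(x):
    """Hypothetical scoring model with an interaction between features 1 and 2."""
    return 2.0 * x[0] + x[1] * x[2]


def exact_shapley(model, example, baseline):
    """Average each feature's marginal contribution over all feature orderings."""
    n = len(example)
    totals = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)  # start with every feature "removed"
        previous = model(current)
        for i in order:
            current[i] = example[i]  # add feature i back in
            value = model(current)
            totals[i] += value - previous
            previous = value
    return [total / len(orderings) for total in totals]


example = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
shapley_values = exact_shapley(toy_model, example, baseline)
base_value = toy_model(baseline)

print(shapley_values)
# the base value plus the Shapley values reconstructs the model output
print(base_value + sum(shapley_values), toy_model(example))
```

The two interacting features share the credit for their product equally, because each one only contributes when the other is already present. Enumerating all orderings grows factorially with the number of features, which is why practical explainers rely on approximations.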

By default, `rsmexplain` uses the [Sampling](https://shap.readthedocs.io/en/latest/generated/shap.explainers.Sampling.html#shap.explainers.Sampling) explainer, which computes SHAP values through random permutations of the features, a method described [here](https://link.springer.com/article/10.1007/s10115-013-0679-x). Although the sampling explainer is model-agnostic and should, in principle, work for any type of model, `rsmexplain` currently only explains regressors, since they are the most commonly used type of model for automated scoring.
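
The permutation-sampling idea can be sketched in plain Python as shown below. This is not the SHAP library's implementation, only an illustration of the general technique: draw a random feature ordering and a random background row, add the example's features one at a time, and average the resulting changes in the model output. The names `linear_model` and `sampled_shap`, the background rows, and the number of samples are all assumptions made for this example.

```python
import random


def linear_model(x):
    """Hypothetical regressor: a fixed linear function of three features."""
    return 1.0 + 3.0 * x[0] - 2.0 * x[1] + 0.5 * x[2]


def sampled_shap(model, example, background, n_samples=200, seed=0):
    """Estimate SHAP values from random feature orderings and background rows."""
    rng = random.Random(seed)
    n = len(example)
    totals = [0.0] * n
    for _ in range(n_samples):
        order = list(range(n))
        rng.shuffle(order)
        # features not yet added take their values from a random background row
        current = list(rng.choice(background))
        previous = model(current)
        for i in order:
            current[i] = example[i]
            value = model(current)
            totals[i] += value - previous
            previous = value
    return [total / n_samples for total in totals]


background = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [2.0, 0.0, 4.0]]
example = [2.0, 1.0, 0.0]
estimates = sampled_shap(linear_model, example, background)
print([round(value, 3) for value in estimates])
```

Because the estimates are averages over random draws, they vary with the seed and become more stable as `n_samples` and the background set grow.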

### Reading SHAP values


In [None]:
msg = (f"SHAP values are additive representations of a feature's impact on a model "
       f"decision. The sum of all SHAP values and the base value for an example yields "
       f"the actual model prediction for that example. A SHAP value for a feature can "
       f"be considered that feature's contribution to that specific prediction. By "
       f"computing the average of all absolute SHAP values for a specific feature, we "
       f"can calculate that feature's average impact on the predictions for the data "
       f"we are trying to explain. The average, maximum, and minimum absolute SHAP "
       f"values for all features can be found in `output/{experiment_id}_absolute_shap_values.csv`.")

display(Markdown(msg))
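
The additivity property and the per-feature summary statistics described above can be illustrated with a small, self-contained sketch. All numbers, the feature names, and the base value are hypothetical. The keys `abs. mean shap` and `abs. max shap` mirror column names used in this report, while `abs. min shap` is an assumed name for the minimum.

```python
# Hypothetical SHAP values for three examples (rows) and two features (columns).
feature_names = ["feature_a", "feature_b"]
shap_rows = [
    [0.5, -0.25],
    [-1.0, 0.25],
    [0.25, 0.0],
]
base_value = 3.0  # hypothetical base value shared by all examples

# Additivity: the base value plus the sum of SHAP values gives each prediction.
predictions = [base_value + sum(row) for row in shap_rows]

# Per-feature summary statistics over absolute SHAP values.
summary = {}
for j, name in enumerate(feature_names):
    abs_values = [abs(row[j]) for row in shap_rows]
    summary[name] = {
        "abs. mean shap": sum(abs_values) / len(abs_values),
        "abs. max shap": max(abs_values),
        "abs. min shap": min(abs_values),  # assumed column name
    }

print(predictions)
print(summary)
```

Taking absolute values before averaging matters: `feature_a` has both positive and negative contributions that would partly cancel in a plain mean, even though the feature clearly moves the predictions.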

### Things to consider

In [None]:
msg = ("- `rsmexplain` can only generate SHAP values for the examples contained in \"explain_data\" "
       "and, if specified, \"sample_range\" or \"sample_size\". If the dataset passed is small, then "
       "the values derived may not be representative of the model as a whole. Plots displaying mean "
       "values for your SHAP values may be unreliable if your dataset was small or not actually "
       "representative of the data the model deals with. \n\n"
       "- It is assumed that a **sufficiently large and diverse** background set is used. \n\n"
       f"- To analyze SHAP values manually, please refer to `output/{experiment_id}_shap_values.csv`.\n\n"
       f"- To use the generated shap explanation object for additional "
       f"processing, unpickle `output/{experiment_id}_explanation.pkl` to "
       f"get access to it. If you chose specific examples via `sample_range` or "
       f"`sample_size`, you can find them in `output/{experiment_id}_ids.pkl` "
       f"where they are stored as a mapping between the position of the example "
       f"in the dataset and the ID of the example.")

display(Markdown(msg))
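
As a rough sketch of the unpickling step, the snippet below loads an ID mapping with the standard `pickle` module. So that it runs without a real experiment, it first writes a stand-in file to a temporary directory; the experiment ID `my_experiment`, the example IDs, and the helper `load_pickle` are hypothetical. Only the `{experiment_id}_ids.pkl` naming pattern comes from the notes above.

```python
import pickle
import tempfile
from pathlib import Path


def load_pickle(path):
    """Load a pickled object; only unpickle files from a source you trust."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


# Stand-in for the `output` directory, so the sketch runs without a real experiment.
output_dir = Path(tempfile.mkdtemp())
experiment_id = "my_experiment"  # hypothetical experiment ID
ids_path = output_dir / f"{experiment_id}_ids.pkl"

# Hypothetical mapping: position of the example in the dataset -> example ID.
with open(ids_path, "wb") as handle:
    pickle.dump({0: "EXAMPLE_1", 7: "EXAMPLE_8"}, handle)

ids = load_pickle(ids_path)
for position, example_id in sorted(ids.items()):
    print(position, example_id)
```

The same helper should also work for `output/{experiment_id}_explanation.pkl`, assuming the `shap` package is importable in the environment where the file is loaded, since unpickling an object requires its class definition.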

### SHAP values summary

This is a quick textual summary of your SHAP values. Please refer to the Plots section below for visualizations. All values are rounded to three decimal places unless specified otherwise.

#### Top 5 features by mean absolute SHAP value

The following table shows the top 5 features in terms of mean absolute SHAP value, i.e., the top 5 features with the biggest mean impact on model predictions. Note that the table also includes the maximum and minimum absolute values for each feature. 

*Note: If your model has 5 or fewer features with non-zero SHAP values, all of them will be displayed.*

In [None]:
mean_abs_nonzero_values = df_abs_shap[df_abs_shap['abs. mean shap'] != 0].copy()
# the table is assumed to be sorted by mean absolute SHAP value in descending order;
# `head(5)` returns every row when there are 5 or fewer
top_5_table = HTML(mean_abs_nonzero_values.head(5).to_html(classes=['sortable'],
                                                           float_format=float_format_func))

display(top_5_table)

#### Bottom 5 features by mean absolute SHAP value

The following features have the lowest non-zero mean absolute SHAP values. Assuming that your dataset is sufficiently large and representative, these features may be the least useful to the model. Note that the table also includes the maximum and minimum absolute SHAP values for each feature.

*Note: If your model has 5 or fewer features with non-zero SHAP values, all of them will be displayed.*

In [None]:
# sort the values in ascending order for this one
mean_abs_nonzero_values.sort_values(by=['abs. mean shap'], inplace=True)
formatter = partial(float_format_func, scientific=True)
# `head(5)` returns every row when there are 5 or fewer
bottom_5_table = HTML(mean_abs_nonzero_values.head(5).to_html(classes=['sortable'],
                                                              float_format=formatter))

display(bottom_5_table)

In [None]:
mean_values = df_abs_shap["abs. mean shap"]
rows_with_zero_mean_values = df_abs_shap[df_abs_shap["abs. mean shap"] == 0].copy()

msg = ("#### Features with zero mean SHAP value\n\nThe features in the table below "
       "likely did not contribute to the model's decisions. Assuming the examples passed "
       "were sufficient in number and representative of the data the model usually "
       "encounters, the features in this table are not useful to the model.\n\n"
       "**IMPORTANT**: Please make sure that the distribution of the feature values "
       "for the actual samples being explained is not significantly "
       "different from that of the data on which the model was trained.")

if len(rows_with_zero_mean_values) > 0:
    display(Markdown(msg))
    
    if len(rows_with_zero_mean_values) <= 10:
        zero_value_table = HTML(rows_with_zero_mean_values.to_html(classes=['sortable'], 
                                                                   float_format=float_format_func))
    else:
        display(Markdown("You have more than 10 features with an absolute mean SHAP value of 0. "
                         "Displaying the first 10 here. Check `absolute_shap_values.csv` for the rest."))
        zero_value_table = HTML(rows_with_zero_mean_values[:10].to_html(classes=['sortable'], 
                                                                       float_format=float_format_func))
    display(zero_value_table)

#### Top 5 features by maximum absolute SHAP value

Features in the table below are the ones with the largest impact according to the maximum absolute SHAP value. If these *do not* overlap with the features with the largest mean impact, then it is likely that they have large outlier values, but lower average impact. 

*Note: If your model has 5 or fewer features, all of them will be displayed.*

In [None]:
# sort by maximum absolute SHAP value; `head(5)` returns every row when there are 5 or fewer
df_sorted_by_max = df_abs_shap.sort_values(by=['abs. max shap'], ascending=False)
top_5_max_table = HTML(df_sorted_by_max.head(5).to_html(classes=['sortable'],
                                                        float_format=float_format_func))
    
display(top_5_max_table)
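
To see why rankings by maximum and by mean absolute SHAP value can disagree, consider the toy sketch below. Both features and all of their values are hypothetical: one feature has a moderate effect on every example, while the other has a single large outlier and no effect otherwise.

```python
# Hypothetical absolute SHAP values per feature across five examples.
abs_shap = {
    "steady_feature": [0.4, 0.5, 0.4, 0.5, 0.4],
    "spiky_feature": [0.0, 0.0, 2.0, 0.0, 0.0],
}

mean_abs = {name: sum(values) / len(values) for name, values in abs_shap.items()}
max_abs = {name: max(values) for name, values in abs_shap.items()}

# The top-ranked feature differs depending on the statistic used.
top_by_mean = max(mean_abs, key=mean_abs.get)
top_by_max = max(max_abs, key=max_abs.get)
print(top_by_mean, top_by_max)
```

Here `steady_feature` ranks first by mean and `spiky_feature` ranks first by maximum, which is the pattern the paragraph above attributes to large outlier values.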