# SHAP (SHapley Additive exPlanations)

## Idea

SHAP explains a model's predictions by using Shapley values, so it helps to understand the Shapley value first.

## Shapley value

$F$ is the set of all features.

$S$ is a subset of the features in $F$.

$f$ is the model.

$f_{S \cup \{i\}}$ is a model trained on the features in $S$ plus the $i$th feature.

$f_{S}$ is a model trained on the features in $S$ only, without the $i$th feature.

$x_{S \cup \{ i \}}$ holds the values of the features in $S$ together with the $i$th feature.

$x_{S}$ holds the values of the features in $S$, without the $i$th feature.

$f_{S \cup \{i\}} (x_{S \cup \{ i \}})$ is the prediction made with the $i$th feature.

$f_{S} (x_{S})$ is the prediction made without the $i$th feature.

$f_{S \cup \{i\}} (x_{S \cup \{ i \}}) - f_{S} (x_{S})$ is the difference between the predictions of the two models, that is, the contribution of adding the $i$th feature to $S$.

The vertical bars in $|F|$ denote cardinality, meaning the number of elements in the set $F$.

$\frac{|S|!(|F| - |S| - 1)!}{|F|!}$ is a weight that depends only on the subset size $|S|$ and the total number of features $|F|$.

$\sum_{S \subseteq F \backslash \{ i \}}$ sums over every subset of features that does not contain the $i$th feature. For each such subset, we compare it with the same subset after adding the $i$th feature.

The weights sum to 1, that is, $\sum_{S \subseteq F \backslash \{ i \}} \frac{|S|!(|F| - |S| - 1)!}{|F|!} = 1$, so the Shapley value is a weighted average.
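To see why the weights sum to 1, we can group the subsets by size. In the sketch below, $n = |F|$ and $s = |S|$ are shorthand introduced here, and $\binom{n-1}{s}$ counts the subsets of size $s$ that can be formed from the $n - 1$ features other than the $i$th one.

$$
\sum_{S \subseteq F \backslash \{ i \}} \frac{|S|!(|F| - |S| - 1)!}{|F|!} = \sum_{s=0}^{n-1} \binom{n-1}{s} \frac{s!(n - s - 1)!}{n!} = \sum_{s=0}^{n-1} \frac{(n-1)!}{n!} = \sum_{s=0}^{n-1} \frac{1}{n} = 1
$$

Each subset size therefore receives a total weight of $\frac{1}{n}$, shared equally among the subsets of that size.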

$S \subseteq F$ means $S$ is a subset of $F$, where $S$ and $F$ can be equal.

$S \subseteq F \backslash \{ i \}$ means that $S$ is a subset of $F$ with the $i$th feature removed, so $S$ never contains the $i$th feature. We sum over these subsets rather than over all $S \subseteq F$, which would include the case $S = F$, because we are interested in the additional effect of adding the $i$th feature to $S$.

$\backslash$ denotes the relative complement (set difference). $A \backslash B$ is the set of elements that belong to $A$ but not to $B$. For example, if $A = \{1, 2\}$ and $B = \{2, 3\}$, then $A \backslash B = \{1\}$.
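The relative complement corresponds to the `-` operator on Python's built-in `set` type. The short sketch below repeats the example above and then removes one feature from a feature set; the feature names are placeholders chosen for illustration.

```python
# Relative complement (set difference) with Python's built-in sets.
A = {1, 2}
B = {2, 3}
difference = A - B  # elements that are in A but not in B
print(difference)  # {1}

# The same operation removes one feature from the full feature set F.
F = {"F_1", "F_2", "F_3"}
F_without_1 = F - {"F_1"}
print(sorted(F_without_1))  # ['F_2', 'F_3']
```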

$\phi_i$ is the Shapley value of the $i$th feature, computed by the following formula.

$$
\phi_i = \sum_{S \subseteq F \backslash \{i\}} \frac{|S|!(|F| - |S| - 1)!}{|F|!} \left[ f_{S \cup \{i\}} (x_{S \cup \{ i \}}) - f_{S} (x_{S}) \right]
$$

This means that the Shapley value of the $i$th feature is a weighted average of the prediction differences between models that include the $i$th feature and models that do not.

For example, suppose we have 3 features and want to know the Shapley value of the 1st feature.

$F = \{ F_1, F_2, F_3 \}, |F| = 3$.

With 3 features, $S$ can be any of the following subsets:

$S_1 = \{ \}, |S_1| = 0$

$S_2 = \{ F_1 \}, |S_2| = 1$

$S_3 = \{ F_2 \}, |S_3| = 1$

$S_4 = \{ F_3 \}, |S_4| = 1$

$S_5 = \{ F_1, F_2 \}, |S_5| = 2$

$S_6 = \{ F_1, F_3 \}, |S_6| = 2$

$S_7 = \{ F_2, F_3 \}, |S_7| = 2$

$S_8 = \{ F_1, F_2, F_3 \}, |S_8| = 3$

The subsets that satisfy $S \subseteq F \backslash \{ F_1 \}$ are $S_1$, $S_3$, $S_4$, and $S_7$. Adding $F_1$ to each of them gives $S_1 \rightarrow S_2$, $S_3 \rightarrow S_5$, $S_4 \rightarrow S_6$, and $S_7 \rightarrow S_8$.

We need to train 8 different models, one for each feature subset $S_1$ through $S_8$.
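The subsets in this example can also be enumerated programmatically. The sketch below uses `itertools.combinations` from the standard library; the helper `all_subsets` and the feature names are illustrative and not part of any library.

```python
from itertools import combinations

features = ["F_1", "F_2", "F_3"]


def all_subsets(items):
    """Return every subset of items, ordered by subset size."""
    return [
        set(combo)
        for size in range(len(items) + 1)
        for combo in combinations(items, size)
    ]


subsets = all_subsets(features)
print(len(subsets))  # 8, one model per subset

# Keep only the subsets without F_1 and show what adding F_1 produces.
without_f1 = [s for s in subsets if "F_1" not in s]
for s in without_f1:
    print(sorted(s), "->", sorted(s | {"F_1"}))
```

The number of subsets is $2^{|F|}$, so the number of models to train doubles with every additional feature.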

The code below shows how to compute the weights.

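```python
from math import factorial


def compute_weight(F, S, output=False):
    """Return the Shapley weight |S|!(|F| - |S| - 1)! / |F|!.

    F is the total number of features |F| and S is the subset size |S|.
    """
    if output:
        # Show each factor of the weight step by step.
        print(f'F: {F}, S: {S}')
        print(f'|F|!: {factorial(F)}')
        print(f'|S|!: {factorial(S)}')
        print(f'(|F| - |S| - 1)!: {factorial(F - S - 1)}')
        print(f'|S|!*(|F| - |S| - 1)! / |F|!: {factorial(S)}*{factorial(F - S - 1)} / {factorial(F)} = {factorial(S) * factorial(F - S - 1)} / {factorial(F)}')
        print()

    return (factorial(S) * factorial(F - S - 1)) / factorial(F)


# Weights for subset sizes 0, 1, and 2 when there are 3 features.
print(compute_weight(3, 0, True))
print(compute_weight(3, 1))
print(compute_weight(3, 2))
print()

# Weights of the four subsets that exclude F_1.
w_1 = compute_weight(3, 0)  # S_1 = {}
w_2 = compute_weight(3, 1)  # S_3 = {F_2}
w_3 = compute_weight(3, 1)  # S_4 = {F_3}
w_4 = compute_weight(3, 2)  # S_7 = {F_2, F_3}

print(f'Sum of weights: {w_1 + w_2 + w_3 + w_4}')
```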

```text
F: 3, S: 0
|F|!: 6
|S|!: 1
(|F| - |S| - 1)!: 2
|S|!*(|F| - |S| - 1)! / |F|!: 1*2 / 6 = 2 / 6

0.3333333333333333
0.16666666666666666
0.3333333333333333

Sum of weights: 1.0
```
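Combining these weights with the prediction differences gives the Shapley value itself. The sketch below assumes a hypothetical table `predictions` that holds $f_S(x_S)$ for each of the 8 feature subsets of one sample. The numbers are made up for illustration, and `shapley_value` is a helper introduced here, not a function from any library.

```python
from itertools import combinations
from math import factorial


def shapley_value(i, features, predict):
    """Return the Shapley value of feature i.

    predict maps a frozenset of feature names S to the prediction f_S(x_S).
    """
    others = [f for f in features if f != i]
    n = len(features)
    phi = 0.0
    for size in range(len(others) + 1):
        # The weight depends only on the subset size.
        weight = factorial(size) * factorial(n - size - 1) / factorial(n)
        for combo in combinations(others, size):
            S = frozenset(combo)
            # Prediction with feature i minus prediction without it.
            phi += weight * (predict[S | {i}] - predict[S])
    return phi


# Hypothetical predictions, one per feature subset (made-up numbers).
predictions = {
    frozenset(): 50.0,
    frozenset({"F_1"}): 40.0,
    frozenset({"F_2"}): 60.0,
    frozenset({"F_3"}): 55.0,
    frozenset({"F_1", "F_2"}): 48.0,
    frozenset({"F_1", "F_3"}): 45.0,
    frozenset({"F_2", "F_3"}): 70.0,
    frozenset({"F_1", "F_2", "F_3"}): 56.0,
}
features = ["F_1", "F_2", "F_3"]

phi = {f: shapley_value(f, features, predictions) for f in features}
for name, value in phi.items():
    print(f"{name}: {value:.4f}")

# The Shapley values add up to the full-model prediction minus the
# prediction of the model with no features.
print(f"Sum: {sum(phi.values()):.4f}")
```

With these made-up numbers, the three Shapley values add up to $56 - 50 = 6$, the gap between the prediction of the model with all features and the model with none.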


## SHAP

$f$ is the machine learning model that we want to explain.

$f(x)$ is the prediction of the machine learning model.

$g(x)$ approximates $f(x)$. 

We call $g$ the **explanation model**.

$g$ is a linear combination of the Shapley values.

$\phi_0$ is the null model output, that is, the prediction when no features are included.
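Written out, an explanation model of this kind is commonly expressed as follows. The symbols $M$ (the number of features) and $x'_i \in \{0, 1\}$ (1 if the $i$th feature is present, 0 if it is absent) are notation added for this sketch and are not defined elsewhere in these notes.

$$
g(x') = \phi_0 + \sum_{i=1}^{M} \phi_i x'_i
$$

When every feature is present, $g$ is $\phi_0$ plus the sum of all Shapley values.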

Because $g$ is a linear combination of the Shapley values of the features, this approach is called an **additive feature attribution method**.
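A minimal sketch of such an additive explanation model is shown below. The values of `phi_0` and `phi` are hypothetical numbers picked for illustration, and `explanation_model` is a helper defined here, not an API of the `shap` package.

```python
def explanation_model(phi_0, phi, present):
    """Additive explanation: phi_0 plus the Shapley values of present features."""
    return phi_0 + sum(value for name, value in phi.items() if present[name])


# Hypothetical null model output and Shapley values (made-up numbers).
phi_0 = 50.0
phi = {"F_1": -11.5, "F_2": 10.5, "F_3": 7.0}

all_present = {name: True for name in phi}
none_present = {name: False for name in phi}

print(explanation_model(phi_0, phi, all_present))   # 56.0
print(explanation_model(phi_0, phi, none_present))  # 50.0
```

Each Shapley value can be read as how much its feature moves the prediction away from the null model output.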



## Resources

- [SHAP Values Explained Exactly How You Wished Someone Explained to You](https://towardsdatascience.com/shap-explained-the-way-i-wish-someone-explained-it-to-me-ab81cc69ef30)
- [Black-Box models are actually more explainable than a Logistic Regression](https://towardsdatascience.com/black-box-models-are-actually-more-explainable-than-a-logistic-regression-f263c22795d)
- [Math Symbols List](https://www.rapidtables.com/math/symbols/Basic_Math_Symbols.html)