# Calculating aSHAP values (aggregated SHAP values)

The SHAP values calculated in notebook ***03*** are now transformed to be compatible with the aSHAP implementation planned for the *DALEX* package (it is not available there yet, so for now the code from the scripts is used).

Theory:

As described at https://ema.drwhy.ai/breakDown.html#BDMethodGen, Shapley values have a property called *local accuracy*:

$f(x_{i}) = v_{0} + \sum^{p}_{j=1} v(x_{i},j)$,

where: 

* $f$ - model
* $v_{0}$ - mean model prediction
* $x_{i}$ - i-th observation from a subset to be explained (let it be $X$)
* $v(x_{i},j)$ - SHAP value for i-th observation for j-th feature

Summing over the whole subset to be explained (of size $N$) and then dividing by $N$, we get:

$\sum_{i=1}^{N} f(x_{i}) = \sum_{i=1}^{N} (v_{0} + \sum^{p}_{j=1} v(x_{i},j))$

$\frac{\sum_{i=1}^{N} f(x_{i})}{N} = \frac{\sum_{i=1}^{N} v_{0} + \sum_{i=1}^{N} \sum^{p}_{j=1} v(x_{i},j)}{N}$

$\overline{f(X)} = v_{0} + \sum^{p}_{j=1} \overline{v(X, j)}$

So, in the plots below, *prediction* is the average of the predictions over the set $X$, *intercept* is the average prediction over the training set, and the contributions are the average contributions across the set $X$.
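The averaged identity can be checked numerically. The sketch below is only an illustration on toy data, not the model used in this notebook: it assumes a linear model, for which SHAP values (in the marginal formulation) have the closed form $v(x_{i},j) = \beta_{j}(x_{ij} - \overline{x_{j}})$. All names (`train`, `fit`, `X`, `shap`) are hypothetical.

```r
set.seed(1)

# Toy "training" data and a linear model (assumption: used only because
# its SHAP values can be written down exactly)
n <- 200
train <- data.frame(x1 = rnorm(n), x2 = runif(n), x3 = rnorm(n, mean = 2))
train$y <- 1 + 2 * train$x1 - 3 * train$x2 + 0.5 * train$x3 + rnorm(n, sd = 0.1)
fit <- lm(y ~ x1 + x2 + x3, data = train)

features <- c("x1", "x2", "x3")
beta <- coef(fit)[features]
feature_means <- colMeans(train[, features])

# v_0: mean model prediction over the training set (the "intercept")
v0 <- mean(predict(fit, train))

# X: the subset of observations to be explained
X <- train[train$x1 > 0, features]

# v(x_i, j) = beta_j * (x_ij - mean_j), one row per observation
shap <- sweep(as.matrix(X), 2, feature_means, "-")
shap <- sweep(shap, 2, beta, "*")

# Average contribution of each feature across X
mean_contributions <- colMeans(shap)

# Average prediction over X
mean_prediction_X <- mean(predict(fit, X))

# Left- and right-hand side of the averaged local accuracy identity
identity_holds <- isTRUE(all.equal(mean_prediction_X, v0 + sum(mean_contributions)))
print(identity_holds)
```

For other model classes the per-observation SHAP values would come from an explainer rather than a closed form, but the averaging step is the same.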

## Calculating predictions for the whole dataset ( $\overline{Y}$ )

In [None]:
library(ranger)
library(tidyr)
library(dplyr)

In [None]:
model <- readRDS('./model/model.rds')
df_preprocessed <- read.csv('./data/preprocessed_data.csv')

In [None]:
# ranger stores the predicted values in the `predictions` element
y_hat <- predict(model, df_preprocessed)$predictions

Predicting.. Progress: 53%. Estimated remaining time: 27 seconds.


In [None]:
mean_prediction <- mean(y_hat)

In [None]:
saveRDS(y_hat, './model/y_hat.RDS')
saveRDS(mean_prediction, './model/mean_prediction.RDS')

In [None]:
y_hat_shap_format <- data.frame(variable_name = 'intercept', contribution = y_hat)

In [None]:
saveRDS(y_hat_shap_format, './model/y_hat_shap_format.RDS')

## Function to transform

The transformation function is available in `./scripts/transform_shap.R`.

In [None]:
source('./scripts/transform_shap.R')

In [None]:
transform_all_tasks('./results', 'ranger', y_hat_shap_format, model)

## Create `shap_aggregated` objects

In [None]:
source('./scripts/aSHAP.R')
source('./scripts/create_shap_aggreated_object.R')

In [None]:
create_shap_aggreated_objects_for_all_tasks('./results', 'ranger')
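The contents of the sourced scripts are not shown in this notebook. As a rough sketch of the aggregation idea from the theory section, the example below averages per-observation contributions by feature. It assumes SHAP values stored in a long format with the `variable_name` and `contribution` columns used above; the `observation` column and the toy values are hypothetical.

```r
# Hypothetical long-format SHAP values: one row per (observation, feature)
shap_long <- data.frame(
  observation   = rep(1:3, times = 2),
  variable_name = rep(c("x1", "x2"), each = 3),
  contribution  = c(0.2, -0.1, 0.5, -0.3, 0.4, 0.2)
)

# Average contribution of each feature across the explained observations
mean_contributions <- aggregate(
  contribution ~ variable_name,
  data = shap_long,
  FUN = mean
)

# Sort by absolute size (for readability only)
mean_contributions <- mean_contributions[
  order(-abs(mean_contributions$contribution)),
]
print(mean_contributions)
```

This is not necessarily how `create_shap_aggreated_objects_for_all_tasks` works internally; it only illustrates the averaging that defines the aggregated contributions.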

## Create plots

In [None]:
library(iBreakDown)
library(ggplot2)

In [None]:
source('./scripts/aSHAP.R')
source('./scripts/create_shap_aggreated_object.R')

In [None]:
create_aSHAP_plots_for_all_tasks('./results', max_features = 55, height = 40, bg = 'white', plot_filename_addition = 'long')

Saving 16.9 x 40 cm image

Saving 16.9 x 40 cm image

Saving 16.9 x 40 cm image



In [None]:
create_aSHAP_plots_for_all_tasks('./results', max_features = 6, height = 20, bg = 'white', plot_filename_addition = 'shorter')

Saving 16.9 x 20 cm image

Saving 16.9 x 20 cm image

Saving 16.9 x 20 cm image


