# Shapash in Jupyter - Overview

<b>In this tutorial you will:</b><br />
Learn how Shapash works in a Jupyter Notebook
through a simple use case<br />

Contents:
- Build a Regressor
- Compile Shapash SmartExplainer
- Compile Shapash SmartExplainer to SmartPredictor
- Save the Shapash SmartPredictor object in a pickle file
- Make a prediction

Data from Kaggle [House Prices](https://www.kaggle.com/c/house-prices-advanced-regression-techniques/data)

In [2]:
import pandas as pd
from category_encoders import OrdinalEncoder
from lightgbm import LGBMRegressor
from sklearn.model_selection import train_test_split

## Building a Supervised Model

In [35]:
from shapash.explainer.smart_predictor import SmartPredictor
from shapash.explainer.smart_explainer import SmartExplainer
from shapash.data.data_loader import data_loading
from shapash.utils.load_smartpredictor import load_smartpredictor

# Load the House Prices dataset and the dictionary of feature labels
house_df, house_dict = data_loading('house_prices')

In [4]:
# Separate the target (SalePrice) from the explanatory features
y_df = house_df['SalePrice'].to_frame()
X_df = house_df[house_df.columns.difference(['SalePrice'])]
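
One side effect of `columns.difference` is worth noting: it returns the remaining labels sorted, so the feature columns end up in alphabetical order rather than in the dataset's original order. The sketch below shows this on a toy frame (the rows and column subset are made up for illustration) and contrasts it with `drop`, which keeps the original order.

```python
import pandas as pd

# Toy frame standing in for house_df; values are illustrative only.
toy_df = pd.DataFrame({
    "SalePrice": [200000, 150000],
    "YrSold": [2008, 2007],
    "1stFlrSF": [856, 1262],
    "BldgType": ["1Fam", "1Fam"],
})

# Index.difference removes the target and returns the other labels sorted.
X_toy = toy_df[toy_df.columns.difference(["SalePrice"])]
print(list(X_toy.columns))

# DataFrame.drop removes the target but preserves the original column order.
X_ordered = toy_df.drop(columns=["SalePrice"])
print(list(X_ordered.columns))
```

Either approach works for training; the choice only affects the column order seen in later outputs.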

In [7]:
house_df.head()


#### Encoding Categorical Features 

In [8]:
from category_encoders import OrdinalEncoder

categorical_features = [col for col in X_df.columns if X_df[col].dtype == 'object']

encoder = OrdinalEncoder(
    cols=categorical_features,
    handle_unknown='ignore',
    return_df=True).fit(X_df)

X_df=encoder.transform(X_df)
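
To make the encoding step concrete, here is a minimal pure-Python sketch of the idea behind ordinal encoding: each category is replaced by an integer code, and the mapping can be reversed later. The helper names are hypothetical and this is a simplified illustration, not the `category_encoders` implementation.

```python
def fit_ordinal_mapping(values):
    """Assign codes 1, 2, 3, ... to categories in order of first appearance."""
    mapping = {}
    for value in values:
        if value not in mapping:
            mapping[value] = len(mapping) + 1
    return mapping


def transform(values, mapping, unknown=-1):
    """Replace each category by its code; unseen categories get `unknown`."""
    return [mapping.get(value, unknown) for value in values]


def inverse_transform(codes, mapping):
    """Map integer codes back to the original categories."""
    reverse = {code: category for category, code in mapping.items()}
    return [reverse.get(code) for code in codes]


bldg_type = ["1Fam", "TwnhsE", "1Fam", "Duplex"]
mapping = fit_ordinal_mapping(bldg_type)
codes = transform(bldg_type, mapping)
restored = inverse_transform(codes, mapping)

print(codes)     # [1, 2, 1, 3]
print(restored)  # ['1Fam', 'TwnhsE', '1Fam', 'Duplex']
```

The reversibility is what matters for explainability: a fitted encoder lets the explanation show the original category labels instead of integer codes.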

#### Train / Test Split

In [9]:
Xtrain, Xtest, ytrain, ytest = train_test_split(X_df, y_df, train_size=0.75, random_state=1)
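
As a quick check of what `train_size=0.75` and `random_state=1` do, the sketch below splits a toy list of eight row identifiers (made up for illustration). Three quarters of the rows go to the training set, and fixing `random_state` makes the split reproducible.

```python
from sklearn.model_selection import train_test_split

# Eight toy row identifiers standing in for the dataset rows.
rows = list(range(8))

# Same arguments, same random_state: the two splits are identical.
train_a, test_a = train_test_split(rows, train_size=0.75, random_state=1)
train_b, test_b = train_test_split(rows, train_size=0.75, random_state=1)

print(len(train_a), len(test_a))
print(train_a == train_b and test_a == test_b)
```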

#### Model Fitting

In [10]:
regressor = LGBMRegressor(n_estimators=200).fit(Xtrain,ytrain)

In [11]:
y_pred = pd.DataFrame(regressor.predict(Xtest),columns=['pred'],index=Xtest.index)

## Understand the Model with Shapash

#### Declare and Compile SmartExplainer 

In [12]:
from shapash.explainer.smart_explainer import SmartExplainer

In [36]:
house_dict.pop("GarageCars")

'Size of garage in car capacity'

In [38]:
xpl = SmartExplainer(features_dict=house_dict)  # Optional: dict mapping feature names to readable labels

In [39]:
xpl.compile(
    x=Xtest,
    model=regressor,
    preprocessing=encoder,  # Optional: lets compile use the encoder's inverse_transform method
    y_pred=y_pred  # Optional: predictions computed beforehand
)

Backend: Shap TreeExplainer


#### Compile SmartExplainer to SmartPredictor

In [40]:
predictor = xpl.to_smartpredictor()

## Save and Load your Predictor

#### Save your predictor to a Pickle file

In [41]:
predictor.save('./predictor.pkl')

#### Load your predictor from a Pickle file

In [42]:
predictor_load = load_smartpredictor('./predictor.pkl')
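
The save and load steps rely on pickling: the object is serialized to a file and later rebuilt from it. The sketch below shows the same round trip with Python's standard `pickle` module on a plain dictionary. The `toy_state` contents and file name are hypothetical stand-ins, not the actual structure of a SmartPredictor.

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical stand-in for the state a predictor needs to keep.
toy_state = {
    "features_dict": {"GrLivArea": "Ground living area square feet"},
    "columns": ["GrLivArea", "OverallQual"],
    "mask_params": {"max_contrib": 10},
}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "toy_predictor.pkl"

    # Serialize the object to disk.
    with open(path, "wb") as f:
        pickle.dump(toy_state, f)

    # Rebuild an equal, independent object from the file.
    with open(path, "rb") as f:
        restored_state = pickle.load(f)

print(restored_state == toy_state)  # True
print(restored_state is toy_state)  # False
```

Because unpickling can execute arbitrary code, only load pickle files that come from a trusted source.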

## Make a prediction with your Predictor

#### Add data

In [43]:
predictor_load.add_input(x=X_df, ypred=y_df)

#### Make prediction

In [44]:
prediction = predictor_load.predict()

In [45]:
prediction.head()

.. table:: 

    +----------+
    | ypred    |
    +==========+
    | 206462.9 |
    +----------+
    | 181128.0 |
    +----------+
    | 221478.1 |
    +----------+
    | 184788.4 |
    +----------+
    | 256637.5 |
    +----------+


#### Get detailed explainability associated with the prediction

In [46]:
detailed_contributions = predictor_load.detail_contributions()

In [48]:
detailed_contributions.head()

[Output: table with a `ypred` column followed by one column per feature (`1stFlrSF`, `2ndFlrSF`, `3SsnPorch`, `BedroomAbvGr`, `BldgType`, `BsmtCond`, ...); remaining columns truncated.]
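
Assuming the contributions are SHAP values from the TreeExplainer backend reported above, they are additive: a prediction equals the model's average output (the base value) plus the sum of that row's contributions. The sketch below checks this with made-up numbers; the base value and contributions are hypothetical, not taken from the house prices model.

```python
# Hypothetical base value (average model output) and per-feature contributions.
base_value = 180000.0
contributions = {
    "GrLivArea": 12500.0,
    "OverallQual": 9800.0,
    "GarageArea": -3200.0,
    "YearBuilt": 1500.0,
}

# Additivity: prediction = base value + sum of contributions.
reconstructed_prediction = base_value + sum(contributions.values())
print(reconstructed_prediction)  # 200600.0
```

Positive contributions push the prediction above the average, negative ones pull it below.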

#### Summarize explainability of the predictions

In [49]:
predictor_load.modify_mask(max_contrib=10)

In [50]:
explanation = predictor_load.summarize()

In [51]:
explanation.head()

[Output: table with a `ypred` column followed by `feature_1`, `value_1`, `contribution_1`, `feature_2`, `value_2`, `contribution_2`, ... for each retained contribution; remaining columns truncated.]
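
The effect of `max_contrib` can be sketched in plain Python. The example assumes the summary keeps the contributions with the largest absolute value, which is an assumption about the ranking rule rather than a description of Shapash internals. The helper name and numbers are hypothetical, and the `value_k` columns are omitted for brevity.

```python
def top_contributions(contributions, max_contrib):
    """Keep the max_contrib contributions with the largest absolute value."""
    ranked = sorted(contributions.items(), key=lambda item: abs(item[1]), reverse=True)
    return ranked[:max_contrib]


# Hypothetical contributions for a single prediction.
row_contributions = {
    "LotArea": -400.0,
    "GrLivArea": 12500.0,
    "GarageArea": -3200.0,
    "YearBuilt": 1500.0,
    "OverallQual": 9800.0,
}

kept = top_contributions(row_contributions, max_contrib=3)

# Flatten into feature_k / contribution_k pairs, as in the summary table.
summary_row = {}
for rank, (feature, contribution) in enumerate(kept, start=1):
    summary_row[f"feature_{rank}"] = feature
    summary_row[f"contribution_{rank}"] = contribution

print(summary_row)
```

With `max_contrib=3`, the two smallest effects (`YearBuilt` and `LotArea`) are left out of the summary.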