# Shapash Library

**Shapash**, by *MAIF*, is a Python toolkit that helps data scientists make machine learning models easier to understand. It also makes it easier to share and discuss model interpretability with people who are not data specialists: business analysts, managers, and end users.

Concretely, Shapash provides easy-to-read visualizations and a web app. It displays results with readable wording by applying inverse preprocessing and postprocessing. Shapash is also useful in an operational context, because it lets data scientists carry explainability from exploration through to production: you can deploy local explainability in production so that each prediction or recommendation comes with a summary of its local explanation.

Documentation: https://shapash.readthedocs.io/en/latest/<br>
Article: https://www.kdnuggets.com/2021/04/shapash-machine-learning-models-understandable.html

In [None]:
# install the library
#!pip install shapash

In [1]:
# import the libraries
import pandas as pd
from shapash.data.data_loader import data_loading

from category_encoders import OrdinalEncoder

from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor

from shapash.explainer.smart_explainer import SmartExplainer

import warnings
# warnings.simplefilter(action='ignore', category=FutureWarning)
warnings.filterwarnings("ignore")

To try it out, we will use the Kaggle "House Prices" dataset to fit a regressor that predicts house prices.

In [2]:
# load the dataset and its dictionary of column labels
house_df, house_dict = data_loading('house_prices')

# inspect the first rows
house_df.head()

Id,MSSubClass,MSZoning,LotArea,Street,LotShape,LandContour,Utilities,LotConfig,LandSlope,Neighborhood,...,EnclosedPorch,3SsnPorch,ScreenPorch,PoolArea,MiscVal,MoSold,YrSold,SaleType,SaleCondition,SalePrice
1,2-Story 1946 & Newer,Residential Low Density,8450,Paved,Regular,Near Flat/Level,"All public Utilities (E,G,W,& S)",Inside lot,Gentle slope,College Creek,...,0,0,0,0,0,2,2008,Warranty Deed - Conventional,Normal Sale,208500
2,1-Story 1946 & Newer All Styles,Residential Low Density,9600,Paved,Regular,Near Flat/Level,"All public Utilities (E,G,W,& S)",Frontage on 2 sides of property,Gentle slope,Veenker,...,0,0,0,0,0,5,2007,Warranty Deed - Conventional,Normal Sale,181500
3,2-Story 1946 & Newer,Residential Low Density,11250,Paved,Slightly irregular,Near Flat/Level,"All public Utilities (E,G,W,& S)",Inside lot,Gentle slope,College Creek,...,0,0,0,0,0,9,2008,Warranty Deed - Conventional,Normal Sale,223500
4,2-Story 1945 & Older,Residential Low Density,9550,Paved,Slightly irregular,Near Flat/Level,"All public Utilities (E,G,W,& S)",Corner lot,Gentle slope,Crawford,...,272,0,0,0,0,2,2006,Warranty Deed - Conventional,Abnormal Sale,140000
5,2-Story 1946 & Newer,Residential Low Density,14260,Paved,Slightly irregular,Near Flat/Level,"All public Utilities (E,G,W,& S)",Frontage on 2 sides of property,Gentle slope,Northridge,...,0,0,0,0,0,12,2008,Warranty Deed - Conventional,Normal Sale,250000


In [3]:
# check the dimensions
house_df.shape

(1460, 73)

In [4]:
# define X (features)
X = house_df.drop(['SalePrice'], axis=1)

# define y (target)
y = house_df['SalePrice'].to_frame()

# list the names of the categorical variables
categorical_features = house_df.select_dtypes('object').columns.tolist()

# instantiate the ordinal encoder
encoder = OrdinalEncoder(cols=categorical_features)

# fit the ordinal encoder
encoder.fit(X)

# transform the variables
X = encoder.transform(X)

# inspect the first transformed rows
X.head(2)

Id,MSSubClass,MSZoning,LotArea,Street,LotShape,LandContour,Utilities,LotConfig,LandSlope,Neighborhood,...,OpenPorchSF,EnclosedPorch,3SsnPorch,ScreenPorch,PoolArea,MiscVal,MoSold,YrSold,SaleType,SaleCondition
1,1,1,8450,1,1,1,1,1,1,1,...,61,0,0,0,0,0,2,2008,1,1
2,2,1,9600,1,1,1,1,2,1,2,...,0,0,0,0,0,0,5,2007,1,1
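
To make the encoding step concrete, here is a minimal pure-Python sketch of ordinal encoding and its inverse. It is not the `category_encoders` implementation: the helper names are hypothetical, and numbering categories from 1 in order of first appearance is an assumption chosen to be consistent with the output above, where every category in the first row maps to 1. The inverse mapping is the idea that lets an explainer show original labels instead of integer codes.

```python
# Illustrative sketch only: hypothetical helpers, not the category_encoders API.

def fit_ordinal(values):
    """Map each distinct value to an integer, starting at 1, in order of first appearance."""
    mapping = {}
    for value in values:
        if value not in mapping:
            mapping[value] = len(mapping) + 1
    return mapping


def transform(values, mapping):
    """Replace each category by its integer code."""
    return [mapping[value] for value in values]


def inverse_transform(codes, mapping):
    """Recover the original categories from their integer codes."""
    reverse = {code: value for value, code in mapping.items()}
    return [reverse[code] for code in codes]


# LotConfig values taken from the first five rows of the dataset shown earlier
lot_config = [
    "Inside lot",
    "Frontage on 2 sides of property",
    "Inside lot",
    "Corner lot",
    "Frontage on 2 sides of property",
]

mapping = fit_ordinal(lot_config)
codes = transform(lot_config, mapping)
decoded = inverse_transform(codes, mapping)

print(codes)
print(decoded == lot_config)
```

Because the mapping is one-to-one, decoding the codes returns exactly the original labels.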


In [5]:
# split the data into training and test sets
Xtrain, Xtest, ytrain, ytest = train_test_split(X, y, test_size=0.3)

# instantiate the model
reg = RandomForestRegressor(n_estimators=200, min_samples_leaf=2)

# train the model
reg.fit(Xtrain, ytrain)

# make predictions on the test set
y_pred = pd.DataFrame(reg.predict(Xtest), columns=['pred'], index=Xtest.index)

## Getting started with the SmartExplainer object

The SmartExplainer class is the main object of the Shapash library. It lets data scientists make results easier to understand by linking encoders, models, predictions, label dictionaries, and datasets. SmartExplainer offers several methods, some of which are used below.

In [6]:
# instantiate the SmartExplainer
xpl = SmartExplainer(features_dict=house_dict)  # optional parameter: readable label for each column name

In [7]:
# compile the explainer on the test set
xpl.compile(
    x=Xtest,
    model=reg,
    preprocessing=encoder,  # optional: the encoder's inverse_transform is used to display original values
    y_pred=y_pred  # optional
)

Backend: Shap TreeExplainer
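
Once compiled, the explainer holds a contribution per feature for each row. The "local explanation summary" idea described in the introduction (keep only the strongest contributions for a prediction and show them with readable labels) can be sketched in plain Python. This is an illustration under stated assumptions, not Shapash's implementation: the `summarize` helper, the labels, and every number below are hypothetical.

```python
# Illustrative sketch only: hypothetical helper, labels, and contribution values.

# Hypothetical subset of a label dictionary such as house_dict
features_dict = {
    "LotArea": "Lot area",
    "Neighborhood": "Neighborhood",
    "YrSold": "Year sold",
    "MoSold": "Month sold",
}


def summarize(contributions, labels, max_contrib=2):
    """Return the max_contrib largest contributions (by absolute value) with readable labels."""
    ranked = sorted(contributions.items(), key=lambda item: abs(item[1]), reverse=True)
    return [(labels.get(name, name), value) for name, value in ranked[:max_contrib]]


# Hypothetical contributions for one prediction, in the units of the target (price)
row_contributions = {
    "LotArea": 3500.0,
    "Neighborhood": -12000.0,
    "YrSold": 150.0,
    "MoSold": -40.0,
}

summary = summarize(row_contributions, features_dict, max_contrib=2)
for label, value in summary:
    print(f"{label}: {value:+.0f}")
```

Ranking by absolute value keeps strong negative contributions in the summary as well as strong positive ones. Shapash exposes this kind of per-row summary through the explainer's `to_pandas` method; check the documentation for the parameters available in your version.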


In [8]:
# start the web app
app = xpl.run_app()

INFO:root:Your Shapash application run on http://LAPTOP-PNGNJISM:8050/
Dash is running on http://0.0.0.0:8050/
INFO:root:Use the method .kill() to down your app.
 * Serving Flask app "shapash.webapp.smart_app" (lazy loading)
 * Environment: production
   Use a production WSGI server instead.
 * Debug mode: off
INFO:werkzeug: * Running on http://0.0.0.0:8050/ (Press CTRL+C to quit)

In [None]:
# shut down the web app
app.kill()