In [None]:
import pandas as pd
pd.options.mode.chained_assignment = None  # silence SettingWithCopyWarning (default='warn')

import matplotlib.pyplot as plt
import numpy as np

from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix
from sklearn.metrics import classification_report
from sklearn.feature_extraction.text import TfidfVectorizer

from lime import lime_tabular

import statsmodels.api as sm

import math
import random

import shap

In [None]:
df = pd.read_csv('./dataset/df2.csv')

In [None]:
df

In [None]:
formula = 'score_factor ~ Q("priors_count") + Q("two_year_recid") + Q("crime_factor") + Q("age_cat_25 - 45") + Q("age_cat_Greater than 45") + Q("age_cat_Less than 25") + Q("race_African-American") + Q("race_Asian") + Q("race_Caucasian") + Q("race_Hispanic") + Q("race_Native American") + Q("race_Other") + Q("sex_Female") + Q("sex_Male")'

In [None]:
modelo = sm.GLM.from_formula(formula, family=sm.families.Binomial(), data=df)
resultado = modelo.fit()
resultado.summary()

# Interpretability

Interpretability is the degree to which a human can understand the cause of a decision. It can also be defined as the degree to which a human can consistently predict the model's result. See [Interpretability](https://christophm.github.io/interpretable-ml-book/interpretability.html).

[Explainable AI: An illuminator in the field of black-box machine learning](https://towardsdatascience.com/explainable-ai-an-illuminator-in-the-field-of-black-box-machine-learning-62d805d54a7a) briefly explains some methods. For more detail, [Molnar](https://christophm.github.io/interpretable-ml-book/index.html) describes further model interpretation methods.

## SHAP

SHAP is based on Shapley values from cooperative game theory, which come with desirable properties. See [An introduction to explainable AI with Shapley values](https://shap.readthedocs.io/en/latest/example_notebooks/overviews/An%20introduction%20to%20explainable%20AI%20with%20Shapley%20values.html).
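
To make the game-theory idea concrete, here is a minimal sketch that computes exact Shapley values for a toy two-player game by averaging each player's marginal contribution over every ordering. The players `a` and `b`, the `payoffs` table, and the helper `shapley_values` are hypothetical and are not part of this notebook or of the `shap` library. The library uses far more efficient estimators than this brute-force loop.

```python
from itertools import permutations


def shapley_values(players, value):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal contribution of p when joining the current coalition.
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: total / len(orderings) for p, total in totals.items()}


# Hypothetical payoffs for every coalition of the two players.
payoffs = {
    frozenset(): 0.0,
    frozenset({"a"}): 10.0,
    frozenset({"b"}): 20.0,
    frozenset({"a", "b"}): 50.0,
}

phi = shapley_values(["a", "b"], lambda coalition: payoffs[coalition])
print(phi)
```

One of the desirable properties is efficiency: the values sum to the payoff of the full coalition minus the payoff of the empty one. In SHAP terms, the feature attributions of an instance sum to its prediction minus the base value.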

In [None]:
etiquetasX = ['priors_count', 'two_year_recid', 'crime_factor', 'age_cat_25 - 45',
       'age_cat_Greater than 45', 'age_cat_Less than 25',
       'race_African-American', 'race_Asian', 'race_Caucasian',
       'race_Hispanic', 'race_Native American', 'race_Other', 'sex_Female',
       'sex_Male']
etiquetaY = ['score_factor']

In [None]:
X = df[etiquetasX].to_numpy()
Y = df[etiquetaY].to_numpy().ravel()  # flatten to 1-D; scikit-learn warns on a column-vector target

In [None]:
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, train_size=0.90, test_size=0.1, stratify=Y, random_state=123)

In [None]:
lr = LogisticRegression()

lr.fit(X_train, Y_train)

## Model explanation

In [None]:
explainer = shap.Explainer(lr, X_train, feature_names=etiquetasX)
shap_values = explainer(X_test)
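
For a linear model such as logistic regression, and assuming the features are treated as independent, the SHAP value of feature j in log-odds space reduces to the coefficient times the deviation of the feature from its background mean. The sketch below checks this by hand in plain Python. The coefficients, intercept, background rows, and instance `x` are hypothetical numbers, not values taken from `lr` or `df`, and the real explainer may summarize the background data differently.

```python
# Hypothetical linear model in log-odds space (not fitted on the notebook data).
coef = [0.8, -0.5, 0.3]
intercept = -0.2

# Hypothetical background sample and the instance to explain.
background = [[1.0, 0.0, 2.0], [3.0, 1.0, 0.0], [2.0, 1.0, 1.0]]
x = [4.0, 0.0, 1.0]


def logit(row):
    """Linear predictor (log-odds) of the model for one row."""
    return intercept + sum(w * v for w, v in zip(coef, row))


n = len(background)
means = [sum(row[j] for row in background) / n for j in range(len(coef))]

# Attribution of each feature: coefficient times deviation from the background mean.
phi = [w * (v - m) for w, v, m in zip(coef, x, means)]

# Base value: average model output over the background sample.
base_value = sum(logit(row) for row in background) / n

print(phi, base_value, logit(x))
```

The attributions plus the base value recover the log-odds of the instance, which is the additivity that the waterfall plot displays.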

### Bar plot

In [None]:
shap.plots.bar(shap_values.abs.max(0))
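
The call above ranks features by their largest absolute SHAP value across instances, whereas the default bar plot uses the mean absolute value. The following sketch shows, on a small hypothetical SHAP matrix (rows are instances, columns are features; names `f0` to `f2` are made up), how the two aggregations can rank features differently.

```python
# Hypothetical SHAP values: 3 instances x 3 features.
shap_matrix = [
    [0.10, -0.40, 0.00],
    [-0.30, 0.40, 0.00],
    [0.05, -0.40, 0.90],
]
feature_names = ["f0", "f1", "f2"]

# Transpose so that each entry holds the values of one feature.
columns = list(zip(*shap_matrix))

max_abs = [max(abs(v) for v in col) for col in columns]
mean_abs = [sum(abs(v) for v in col) / len(col) for col in columns]

top_by_max = feature_names[max_abs.index(max(max_abs))]
top_by_mean = feature_names[mean_abs.index(max(mean_abs))]

print(top_by_max, top_by_mean)
```

The maximum highlights features with rare but large effects on individual instances, while the mean favors features with a consistent effect across the data.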

### Beeswarm plot

In [None]:
shap.plots.beeswarm(shap_values)

## Explaining a single instance

### Waterfall plot

In [None]:
shap.plots.waterfall(shap_values[0], max_display=14)

## References

Molnar, C. (2022). Interpretable Machine Learning. Retrieved 18 February 2022, from https://christophm.github.io/interpretable-ml-book/index.html

An introduction to explainable AI with Shapley values — SHAP latest documentation. (2022). Retrieved 18 February 2022, from https://shap.readthedocs.io/en/latest/example_notebooks/overviews/An%20introduction%20to%20explainable%20AI%20with%20Shapley%20values.html

Explainable AI: An illuminator in the field of black-box machine learning. (2021). Retrieved 18 February 2022, from https://towardsdatascience.com/explainable-ai-an-illuminator-in-the-field-of-black-box-machine-learning-62d805d54a7a