# Wine quality prediction and explanation with LIME
***
* Regression problem
* Data: [Red Wine Quality](https://archive.ics.uci.edu/dataset/186/wine+quality)
    - Variables (the inputs are based on physicochemical tests):
        - 1 - *fixed acidity*
        - 2 - *volatile acidity*
        - 3 - *citric acid*
        - 4 - *residual sugar*
        - 5 - *chlorides*
        - 6 - *free sulfur dioxide*
        - 7 - *total sulfur dioxide*
        - 8 - *density*
        - 9 - *pH*
        - 10 - *sulphates*
        - 11 - *alcohol*
        - 12 - *quality* (score between 0 and 10): the output variable to predict
* A deep neural network (a black-box model) is used to predict quality
* The model is explained with the [LIME](https://dl.acm.org/doi/abs/10.1145/2939672.2939778) method.

# Libraries

In [None]:
import pandas as pd
import numpy as np
np.random.seed(0)
from matplotlib import pyplot as plt

from keras.models import Sequential
from keras.layers import Dense

from sklearn.metrics import mean_absolute_error, mean_squared_error


from lime import lime_tabular

## Data
***
Training, validation, and test sets

In [None]:
# Note: the original UCI export is semicolon-separated; if the columns load as one, pass sep=';'
df = pd.read_csv('data/winequality-red.csv')
df.head()


In [None]:
from sklearn.model_selection import train_test_split

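# Features: the first 11 columns; target: the quality score
X = df.iloc[:, 0:11]
y = np.ravel(df.quality)

# Hold out 20% for testing, then 20% of the remainder for validation
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.2, random_state=42)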
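
As a rough check of the two-stage split above, the sketch below computes the resulting set sizes. It assumes the file holds the 1,599 red-wine rows of the UCI dataset and that a fractional `test_size` is rounded up, which is how `train_test_split` is expected to behave; `split_sizes` is a hypothetical helper, not a library function.

```python
import math

def split_sizes(n_samples, test_size):
    """Return (n_train, n_test) for a fractional test_size, rounding the test share up."""
    n_test = math.ceil(test_size * n_samples)
    return n_samples - n_test, n_test

n_rows = 1599  # assumed number of rows in winequality-red.csv
n_train_val, n_test = split_sizes(n_rows, 0.2)   # first split: test set
n_train, n_val = split_sizes(n_train_val, 0.2)   # second split: validation set
print(n_train, n_val, n_test)
```

Under these assumptions the final proportions are roughly 64% training, 16% validation, and 20% test, because the second 20% is taken from the remaining 80%.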

## Neural network to predict quality
***
Network architecture based on this [example](https://www.analyticsvidhya.com/blog/2021/07/plunging-into-deep-learning-carrying-a-red-wine/)

In [None]:
model = Sequential([
    Dense(512, activation='relu', input_shape=[11]),
    Dense(512, activation='relu'),
    Dense(512, activation='relu'),
    Dense(1),
])
model.compile(
    optimizer='adam',
    loss='mae',
)
history = model.fit(
    X_train, y_train,
    validation_data=(X_val, y_val),
    batch_size=256,
    epochs=50,
    verbose=0
)

In [None]:
model.summary()
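
To sanity-check what `model.summary()` reports, the parameter count of the architecture defined above can be worked out by hand. This is a minimal sketch, assuming every `Dense` layer has one weight per input-unit pair plus one bias per unit; `dense_params` is a hypothetical helper.

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_units + n_units

# 11 input features, three hidden layers of 512 units, one output unit
layer_sizes = [11, 512, 512, 512, 1]
per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total_params = sum(per_layer)
print(per_layer, total_params)
```

With about half a million parameters and on the order of a thousand training rows, the network has far more capacity than data, which is one reason to watch the validation curve in the loss plot.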

## Loss plot

In [None]:
plt.figure()
plt.plot(history.history['loss'], label="Train")
plt.plot(history.history['val_loss'], label="Validation")
plt.xlabel("Epoch")
plt.ylabel("Loss (MAE)")
plt.legend(loc="best")
plt.show()

## Prediction and model performance

In [None]:
y_pred = model.predict(X_test)

In [None]:
mae_pred = mean_absolute_error(y_test, y_pred)

In [None]:
mse_pred = mean_squared_error(y_test, y_pred)

In [None]:
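# RMSE as the square root of the MSE; the `squared` argument was deprecated and later removed in newer scikit-learn releases
rmse_pred = np.sqrt(mean_squared_error(y_test, y_pred))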

In [None]:
print(f"MAE: {mae_pred:.3f}  MSE: {mse_pred:.3f}  RMSE: {rmse_pred:.3f}")
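
For reference, the three metrics computed above can be written out directly. The sketch below uses plain Python and made-up toy values; `regression_errors` is a hypothetical helper, not a scikit-learn function.

```python
import math

def regression_errors(y_true, y_hat):
    """Return (MAE, MSE, RMSE) for two equal-length sequences."""
    residuals = [t - p for t, p in zip(y_true, y_hat)]
    mae = sum(abs(r) for r in residuals) / len(residuals)
    mse = sum(r * r for r in residuals) / len(residuals)
    return mae, mse, math.sqrt(mse)

# Toy values, invented for illustration only
mae, mse, rmse = regression_errors([5, 6, 7], [5.5, 6.0, 6.0])
print(mae, mse, rmse)
```

MAE and RMSE are expressed in quality points, so they are easier to read against the 0 to 10 scale than the MSE; RMSE penalizes large misses more heavily than MAE.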

# LIME

Explanation of the prediction generated by the deep neural network using [LIME](https://github.com/marcotcr/lime)

In [None]:
explainer = lime_tabular.LimeTabularExplainer(
    np.array(X_train),
    feature_names=list(X.columns),
    class_names=['quality'],
    verbose=True,
    mode='regression',
)

In [None]:
#! pip install lime

## Local explanation
***
* LIME explains the prediction that the model generates for one specific instance.
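
The idea behind a local explanation can be sketched in a few lines: perturb the instance, weight each perturbation by its proximity to the instance, and fit a simple linear model to the black-box outputs. The example below is a deliberately reduced illustration with one feature, a deterministic grid, and plain weighted least squares; the library itself handles many features, samples at random, and fits a regularized linear model. `black_box` and `local_linear_surrogate` are hypothetical names.

```python
import math

def black_box(x):
    """Stand-in for an opaque model (hypothetical, not the wine network)."""
    return x ** 2

def local_linear_surrogate(predict, x0, kernel_width=0.75, half_range=2.0, n_steps=200):
    """Fit a proximity-weighted straight line to `predict` around x0."""
    # Perturb the instance on a symmetric grid (LIME itself samples at random).
    xs = [x0 + half_range * (2 * i / n_steps - 1) for i in range(n_steps + 1)]
    ys = [predict(x) for x in xs]
    # Exponential kernel: nearby perturbations count more than distant ones.
    ws = [math.exp(-((x - x0) ** 2) / kernel_width ** 2) for x in xs]
    w_sum = sum(ws)
    x_mean = sum(w * x for w, x in zip(ws, xs)) / w_sum
    y_mean = sum(w * y for w, y in zip(ws, ys)) / w_sum
    # Closed-form weighted least squares for a single feature.
    cov = sum(w * (x - x_mean) * (y - y_mean) for w, x, y in zip(ws, xs, ys))
    var = sum(w * (x - x_mean) ** 2 for w, x in zip(ws, xs))
    slope = cov / var
    intercept = y_mean - slope * x_mean
    return slope, intercept

slope, intercept = local_linear_surrogate(black_box, x0=3.0)
print(slope, intercept)
```

The fitted slope approximates how the black box responds to the feature near `x0` only; a different instance would generally get a different slope, which is why the explanation is called local.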

In [None]:
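id_instancia = 0
# Pass the row as a plain NumPy array; model.predict plays the role of the black-box prediction function
exp = explainer.explain_instance(X_test.iloc[id_instancia].values, model.predict)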

In [None]:
exp.as_list()
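
One way to read the list returned by `exp.as_list()` is as an additive model: each entry pairs a rule on a feature with a signed weight. The sketch below uses invented rules, weights, and intercept (they are not output of the model above) and assumes the explainer's default discretized features, under which every listed rule holds for the explained instance.

```python
# Hypothetical intercept and (rule, weight) pairs in the shape returned by exp.as_list();
# the numbers are invented for illustration only.
intercept = 5.40
explanation = [
    ("alcohol > 11.10", 0.35),
    ("volatile acidity > 0.64", -0.25),
    ("sulphates <= 0.55", -0.10),
]

# Rank rules by the size of their effect, whatever the sign.
ranked = sorted(explanation, key=lambda item: abs(item[1]), reverse=True)
for rule, weight in ranked:
    direction = "raises" if weight > 0 else "lowers"
    print(f"{rule}: {direction} the predicted quality by {abs(weight):.2f}")

# Under the stated assumption, the surrogate's prediction for the instance
# is the intercept plus the sum of the listed weights.
local_prediction = intercept + sum(weight for _, weight in explanation)
print(round(local_prediction, 2))
```

Comparing this local prediction with the network's own prediction for the same instance gives a quick sense of how faithful the linear surrogate is at that point.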

In [None]:
exp.show_in_notebook(show_table=True, show_all=False)
print(X_test.iloc[id_instancia])

In [None]:
p = exp.as_pyplot_figure(label=1)

<div class="alert-success">
    <h3>Question</h3>
    <hr>
    <ul>
    <li> What can you interpret from the explanation that LIME returns for this prediction? </li>
    </ul>
</div>


<div class="alert-success">
    <h2>Assignment</h2>
    <hr>
    <ul>
        <li> Use the SubmodularPick method to generate a global explanation with the SP-LIME method</li>
        <li>Review the documentation in the <a href="https://lime-ml.readthedocs.io/en/latest/lime.html?highlight=submodular_pick#module-lime.submodular_pick">lime docs</a> and the <a href="https://github.com/marcotcr/lime/tree/master/doc/notebooks">lime notebooks</a></li>
    </ul>
   
</div>



In [None]:
from lime import submodular_pick
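
As background for the assignment, the selection idea described in the LIME paper can be sketched without the library: collect the explanation weights of many instances in a matrix, score each feature by a global importance, and greedily pick the instances whose explanations cover the most important features. The code below is a hand-rolled illustration on a toy matrix, not the implementation behind `submodular_pick.SubmodularPick`; `greedy_pick` and `toy_weights` are hypothetical names.

```python
import math

def greedy_pick(weights, budget):
    """Greedily choose `budget` rows of `weights` that cover the most important features."""
    n_rows, n_cols = len(weights), len(weights[0])
    # Global importance of a feature: square root of its summed absolute weight.
    importance = [math.sqrt(sum(abs(weights[i][j]) for i in range(n_rows)))
                  for j in range(n_cols)]

    def coverage(rows):
        # Total importance of the features used by at least one chosen row.
        return sum(importance[j] for j in range(n_cols)
                   if any(weights[i][j] != 0 for i in rows))

    chosen = []
    for _ in range(min(budget, n_rows)):
        remaining = [i for i in range(n_rows) if i not in chosen]
        best = max(remaining, key=lambda i: coverage(chosen + [i]))
        chosen.append(best)
    return chosen

# Toy matrix: rows are explained instances, columns are features.
toy_weights = [
    [0.9, 0.0, 0.0],
    [0.8, 0.1, 0.0],
    [0.0, 0.0, 0.5],
]
picked = greedy_pick(toy_weights, budget=2)
print(picked)
```

The second pick favors the instance that adds a feature not yet covered, which is how a small set of local explanations can give a non-redundant global picture.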