
# Interpretability in Deep Learning

Interpreting deep neural networks (DNNs, CNNs, RNNs) is crucial for understanding how and why they make their decisions.

This notebook extends the analysis to computer vision and natural language processing models using:

- SHAP for dense and convolutional models
- Grad-CAM for CNNs
- Integrated Gradients for dense networks
- LIME for NLP



## 1. Interpretability in Dense Models with SHAP

We build a simple Keras network on the breast cancer dataset (`sklearn.datasets.load_breast_cancer`) and analyze its predictions.


In [None]:

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import tensorflow as tf
import shap

# Dataset
data = load_breast_cancer()
X = data.data
y = data.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Model
model = tf.keras.Sequential([
    tf.keras.layers.Dense(20, activation="relu", input_shape=(X.shape[1],)),
    tf.keras.layers.Dense(10, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid")
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X_train, y_train, epochs=10, verbose=0)

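# SHAP explainer. shap.Explainer selects the algorithm automatically, so this is
# not necessarily DeepExplainer. The wrapper returns one probability per sample.
def predict_fn(a):
    return model.predict(a, verbose=0).flatten()

explainer = shap.Explainer(predict_fn, X_train, feature_names=list(data.feature_names))
shap_values = explainer(X_test[:100])
shap.plots.beeswarm(shap_values)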
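
To make the plotted values more concrete, the sketch below computes exact Shapley values for a tiny model by enumerating every feature coalition. The names `toy_model`, `coalition_value`, and `exact_shapley` are hypothetical helpers written for this illustration and are not part of the `shap` library. Features outside a coalition take their values from background rows, which is one common convention and is assumed here. The attributions should add up to the gap between the model output and the average background prediction.

```python
import itertools
import math
import numpy as np

def toy_model(X):
    # Hypothetical model: two linear terms plus one interaction
    return 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.5 * X[:, 0] * X[:, 2]

def coalition_value(model, x, background, subset):
    # Mean output when the features in `subset` are fixed to x and the
    # remaining features keep their background values
    X = background.copy()
    idx = list(subset)
    if idx:
        X[:, idx] = x[idx]
    return model(X).mean()

def exact_shapley(model, x, background):
    n = x.shape[0]
    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = (math.factorial(size) * math.factorial(n - size - 1)
                      / math.factorial(n))
            for subset in itertools.combinations(others, size):
                gain = (coalition_value(model, x, background, subset + (i,))
                        - coalition_value(model, x, background, subset))
                phi[i] += weight * gain
    return phi

rng = np.random.default_rng(0)
background = rng.normal(size=(200, 3))
x = np.array([1.0, -2.0, 0.5])

phi = exact_shapley(toy_model, x, background)
base_value = toy_model(background).mean()
print("Shapley values:", phi)
print("Base value + sum of Shapley values:", base_value + phi.sum())
print("Model output:", toy_model(x[None, :])[0])
```

Exact enumeration grows exponentially with the number of features, which is why practical explainers rely on sampling or model-specific approximations.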



## 2. Visual Interpretability: Grad-CAM for CNNs

Grad-CAM visualizes which regions of an image drive a CNN's prediction. We will use `tf.keras.applications.MobileNetV2`.


In [None]:

import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras.applications import mobilenet_v2
from tensorflow.keras.preprocessing import image
from tensorflow.keras.models import Model

# Example image
img_path = tf.keras.utils.get_file('elephant.jpg',
    'https://upload.wikimedia.org/wikipedia/commons/6/6b/Pencil_drawing_of_an_elephant.jpg')
img = image.load_img(img_path, target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = mobilenet_v2.preprocess_input(x)

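# Base model (named cnn_model so the dense model from section 1 is not overwritten)
cnn_model = mobilenet_v2.MobileNetV2(weights='imagenet')
preds = cnn_model.predict(x)

# Grad-CAM: gradients of the top class score w.r.t. the last conv feature maps
last_conv_layer = cnn_model.get_layer("Conv_1")
grad_model = Model(cnn_model.inputs, [last_conv_layer.output, cnn_model.output])
with tf.GradientTape() as tape:
    conv_outputs, predictions = grad_model(x)
    top_class = tf.argmax(predictions[0])
    loss = predictions[:, top_class]
grads = tape.gradient(loss, conv_outputs)[0]       # (H, W, channels)
pooled_grads = tf.reduce_mean(grads, axis=(0, 1))  # one weight per channel
conv_outputs = conv_outputs[0]
heatmap = tf.reduce_sum(pooled_grads * conv_outputs, axis=-1)
heatmap = tf.nn.relu(heatmap)                      # keep positive evidence only
heatmap = heatmap / (tf.reduce_max(heatmap) + 1e-8)

# Display: stretch the coarse heatmap over the 224x224 image
plt.imshow(img)
plt.imshow(heatmap.numpy(), cmap='jet', alpha=0.5, extent=(0, 224, 224, 0))
plt.title("Grad-CAM Overlay")
plt.axis('off')
plt.show()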
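
The core Grad-CAM arithmetic can be separated from the model, as in the NumPy sketch below. Random arrays stand in for the feature maps and gradients of the last convolutional layer, and `grad_cam_heatmap` is a hypothetical helper, so the result only illustrates the shapes and steps: average the gradients over the spatial axes to get one weight per channel, take the weighted sum of the feature maps, apply a ReLU, and normalize.

```python
import numpy as np

def grad_cam_heatmap(feature_maps, grads):
    # feature_maps, grads: arrays of shape (H, W, C) for a single image
    weights = grads.mean(axis=(0, 1))             # (C,) one weight per channel
    cam = (feature_maps * weights).sum(axis=-1)   # (H, W) weighted sum of maps
    cam = np.maximum(cam, 0.0)                    # ReLU: keep positive evidence
    peak = cam.max()
    return cam / peak if peak > 0 else cam        # scale to [0, 1] when possible

rng = np.random.default_rng(0)
feature_maps = rng.random((7, 7, 1280))    # stand-in for conv activations
grads = rng.normal(size=(7, 7, 1280))      # stand-in for class-score gradients

heatmap_np = grad_cam_heatmap(feature_maps, grads)
print(heatmap_np.shape, float(heatmap_np.min()), float(heatmap_np.max()))
```

Averaging over axes `(0, 1)` only matters here: reducing over the channel axis as well would collapse the weights to a single scalar and lose the per-channel weighting.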



## 3. Integrated Gradients

Integrated Gradients estimates the importance of each input feature by integrating the model's gradients along a straight-line path from a baseline input to the actual input.
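
For reference, the attribution for feature $i$ is usually written as follows, where $x$ is the input, $x'$ the baseline, and $F$ the model output (these symbols are introduced here for illustration):

$$
\mathrm{IG}_i(x) = (x_i - x'_i) \int_0^1 \frac{\partial F\big(x' + \alpha\,(x - x')\big)}{\partial x_i}\, d\alpha
$$

In practice the integral is approximated by averaging the gradient at $m$ points $\alpha_1, \dots, \alpha_m$ spread along the path:

$$
\mathrm{IG}_i(x) \approx (x_i - x'_i)\, \frac{1}{m} \sum_{k=1}^{m} \frac{\partial F\big(x' + \alpha_k\,(x - x')\big)}{\partial x_i}
$$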

```bash
pip install captum  # (PyTorch version)
```

Captum targets PyTorch, so here we write a small custom implementation in TensorFlow instead.


In [None]:

# Custom IG in TensorFlow
def integrated_gradients(inputs, model, baseline=None, steps=50):
    # inputs: tensor of shape (1, n_features)
    if baseline is None:
        baseline = tf.zeros_like(inputs)
    # Points on the straight line from baseline to inputs: (steps, n_features)
    alphas = tf.reshape(tf.linspace(0., 1., steps), (-1, 1))
    interpolated = baseline + alphas * (inputs - baseline)

    with tf.GradientTape() as tape:
        tape.watch(interpolated)
        preds = model(interpolated)
    grads = tape.gradient(preds, interpolated)
    # Average gradient along the path (Riemann approximation of the integral)
    avg_grads = tf.reduce_mean(grads, axis=0)
    return (inputs - baseline) * avg_grads

# Apply to a single input; `model` must be the dense network from section 1
input_sample = tf.convert_to_tensor([X_test[0]], dtype=tf.float32)
ig_vals = integrated_gradients(input_sample, model)
print("Integrated gradient attributions:", ig_vals.numpy().flatten())
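
A useful sanity check for any Integrated Gradients implementation is the completeness property: the attributions should sum to the difference between the output at the input and the output at the baseline. The NumPy sketch below checks this on a hypothetical function `f` with a known analytic gradient. `f`, `grad_f`, and `integrated_gradients_np` are illustrative names and are independent of the TensorFlow code above.

```python
import numpy as np

def f(x):
    # Hypothetical differentiable function standing in for a model
    return np.sum(x ** 2, axis=-1)

def grad_f(x):
    # Analytic gradient of f
    return 2.0 * x

def integrated_gradients_np(x, baseline, steps=200):
    # Midpoint rule along the straight path from baseline to x
    alphas = (np.arange(steps) + 0.5) / steps
    path = baseline + alphas[:, None] * (x - baseline)   # (steps, n_features)
    avg_grads = grad_f(path).mean(axis=0)
    return (x - baseline) * avg_grads

x = np.array([1.0, -2.0, 3.0])
baseline = np.zeros_like(x)
attributions = integrated_gradients_np(x, baseline)

print("Attributions:", attributions)
print("Sum of attributions:", attributions.sum())
print("f(x) - f(baseline):", f(x) - f(baseline))
```

For a trained network the two quantities will only agree approximately, and the gap usually shrinks as `steps` increases.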



## 4. Text Interpretability with LIME

We use `lime.lime_text` to explain the prediction of a text classifier (TF-IDF + logistic regression).

```bash
pip install lime
```


In [None]:

from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from lime.lime_text import LimeTextExplainer

texts = ["This movie was amazing!", "Worst movie ever.", "I liked the plot but not the actors."]
labels = [1, 0, 1]

pipe = make_pipeline(TfidfVectorizer(), LogisticRegression())
pipe.fit(texts, labels)

explainer = LimeTextExplainer(class_names=["neg", "pos"])
exp = explainer.explain_instance("The movie was great and the music awesome!", pipe.predict_proba)
exp.show_in_notebook()
