# Explainability and Interpretability in Deep Learning

#### "Despite widespread adoption, machine learning models remain mostly black boxes. Understanding the reasons behind predictions is, however, quite important in assessing trust, which is fundamental if one plans to take action based on a prediction, or when choosing whether to deploy a new model. Such understanding also provides insights into the model, which can be used to transform an untrustworthy model or prediction into a trustworthy one."
Source: LIME, https://arxiv.org/pdf/1602.04938.pdf

“Sensitive” industries

Today there are certain application areas that could be called “sensitive”, where the effect of an error can be so catastrophic or critical from a human point of view that automating decision-making with AI is not trivial without knowing in advance how particular predictions or decisions are justified. This includes military targeting, medical diagnosis, bank fraud detection and security, among others. “If a hospital looks at an X-ray and sends the patient for treatment, and the information was wrong, they must take responsibility. The same can happen in the military field, with the launch of a missile or an attack on the wrong target.”

By “explaining a prediction”, we mean presenting textual or
visual artifacts that provide qualitative understanding of the
relationship between the instance’s components (e.g. words
in text, patches in an image) and the model’s prediction.

![image.png](attachment:image.png)

The process of explaining individual predictions is illustrated in Figure 1. It is clear that a doctor is much better
positioned to make a decision with the help of a model if
intelligible explanations are provided. In this case, an explanation is a small list of symptoms with relative weights –
symptoms that either contribute to the prediction (in green)
or are evidence against it (in red). Humans usually have prior
knowledge about the application domain, which they can use
to accept (trust) or reject a prediction if they understand the
reasoning behind it. It has been observed, for example, that
providing explanations can increase the acceptance of movie
recommendations [12] and other automated systems [8].
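
A list of features with signed weights, like the explanation described above, can be obtained by fitting a simple linear model around a single prediction. The following is a minimal sketch in the spirit of LIME, not the LIME library itself: it uses Gaussian perturbations and an exponential proximity kernel, and skips the sparse feature selection of the paper. The function `local_linear_explanation`, the toy `black_box` model and the symptom names are all hypothetical and exist only for illustration.

```python
import numpy as np

def local_linear_explanation(predict_fn, x, n_samples=500, scale=0.5,
                             kernel_width=1.0, seed=0):
    """Fit a proximity-weighted linear surrogate around the instance x.

    predict_fn takes an array of shape (n_samples, n_features) and
    returns one score per row. Returns (weights, intercept).
    """
    rng = np.random.default_rng(seed)
    # 1. Perturb the instance to sample its neighbourhood.
    X = x + scale * rng.standard_normal((n_samples, x.size))
    y = predict_fn(X)
    # 2. Weight each sample by its proximity to x (closer = more important).
    dist2 = ((X - x) ** 2).sum(axis=1)
    sample_w = np.exp(-dist2 / kernel_width ** 2)
    # 3. Weighted least squares on centred features plus an intercept column.
    design = np.hstack([np.ones((n_samples, 1)), X - x])
    root_w = np.sqrt(sample_w)[:, None]
    coef, *_ = np.linalg.lstsq(design * root_w, y * root_w[:, 0], rcond=None)
    return coef[1:], coef[0]

# Hypothetical black-box "risk" model over three made-up symptoms.
feature_names = ["fever", "cough", "vaccinated"]

def black_box(X):
    z = 2.0 * X[:, 0] + 1.5 * X[:, 1] - 1.0 * X[:, 2]
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([1.0, 0.0, 1.0])
weights, intercept = local_linear_explanation(black_box, x)
for name, w in zip(feature_names, weights):
    # Positive weights support the prediction, negative ones count against it.
    print(f"{name:>10}: {w:+.3f}")
```

Positive weights play the role of the green symptoms in the figure and negative weights the red ones. The weights are only meaningful near `x`; a different instance generally gets a different explanation.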


https://www.cenia.cl/2023/07/03/empresas-que-saben-explicar-sus-soluciones-de-ia-pueden-ser-mas-eficientes-y-confiables/

# CNNs

Grad-CAM

Abstract: We propose a technique for producing ‘visual explanations’ for decisions from a large class of Convolutional
Neural Network (CNN)-based models, making them more
transparent and explainable.
Our approach – Gradient-weighted Class Activation Mapping
(Grad-CAM), uses the gradients of any target concept (say
‘dog’ in a classification network or a sequence of words
in captioning network) flowing into the final convolutional
layer to produce a coarse localization map highlighting the
important regions in the image for predicting the concept.


![](GradCAM.png)

Grad-CAM for biased model

![image.png](attachment:image.png)

### Grad-CAM: how it works at a high level
https://arxiv.org/pdf/1610.02391.pdf
- CNN Model Structure: Grad-CAM is applicable to CNN models, typically those used for image classification tasks. These models consist of convolutional layers followed by fully connected layers. The last convolutional layer is of particular interest for Grad-CAM. 

- Forward Pass: Initially, the image is passed through the network to obtain a class prediction.

- Target Class Selection: Choose a class of interest. This could be the predicted class or any other class for which you want to visualize the model's focus.

- Computing Gradients: Once the forward pass is complete, we compute the gradients of the chosen class score (the network output before the softmax) with respect to the feature maps of the last convolutional layer. These gradients measure how sensitive the class score is to each activation in each feature map.

- Pooling Gradients: The gradients are then globally pooled (usually through global average pooling) to obtain the neuron importance weights. This step involves computing the mean of the gradients for each feature map, resulting in a single scalar for each feature map. This differs from CAM, which requires a global pooling layer just before the softmax layer in the network itself, so the architecture has to be altered and the network often retrained.

- Weighted Combination of Feature Maps: These neuron importance weights are then used to weight the feature maps of the last convolutional layer. This step combines the feature maps in a weighted manner, where the weights are derived from the importance of each feature map in predicting the class.

- ReLU Activation: The weighted combination of feature maps is followed by a ReLU activation. This is crucial because we are interested only in the features that have a positive influence on the class of interest. Negative values are likely to belong to other classes and are therefore set to zero.

- Generating the Heatmap: The resulting matrix after the ReLU activation is a coarse heatmap of the same size as the output of the last convolutional layer. This heatmap is then upscaled to the size of the input image. This upscaled heatmap highlights the important regions of the image for the decision of the chosen class.

- Overlaying the Heatmap: Finally, this heatmap is overlaid on the original image to visualize the areas of the image most relevant to the selected class.

This method is effective because the last convolutional layer in a CNN captures high-level visual features, and by weighting these features based on their impact on the class score, we can visualize the parts of the image that led to a particular classification decision.
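
The steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the reference implementation from the paper: it assumes the activations and gradients of the last convolutional layer have already been extracted from the network (in practice an autodiff framework provides them), it upscales with nearest-neighbour repetition instead of smoother interpolation, and the function name `grad_cam` and the toy numbers are hypothetical.

```python
import numpy as np

def grad_cam(activations, gradients, out_size):
    """Compute a Grad-CAM heatmap.

    activations: feature maps of the last conv layer, shape (K, H, W)
    gradients:   d(class score)/d(activations), same shape (K, H, W)
    out_size:    (height, width) of the input image
    """
    # Pooling gradients: global average pooling gives one weight per map.
    weights = gradients.mean(axis=(1, 2))              # shape (K,)
    # Weighted combination of feature maps.
    cam = np.tensordot(weights, activations, axes=1)   # shape (H, W)
    # ReLU: keep only positive evidence for the class.
    cam = np.maximum(cam, 0.0)
    # Scale to [0, 1] so the map can be displayed as a heatmap.
    if cam.max() > 0:
        cam = cam / cam.max()
    # Upscale to the input size (nearest neighbour, for simplicity).
    h, w = cam.shape
    rows = np.arange(out_size[0]) * h // out_size[0]
    cols = np.arange(out_size[1]) * w // out_size[1]
    return cam[np.ix_(rows, cols)]

# Toy example: two 2x2 feature maps. Map 0 fires top-left, map 1 bottom-right.
activations = np.array([[[2.0, 0.0], [0.0, 0.0]],
                        [[0.0, 0.0], [0.0, 3.0]]])
# Assume a linear head: score = sum(head * activations), so the gradient
# of the score with respect to the activations is simply `head`.
head = np.stack([np.ones((2, 2)), -np.ones((2, 2))])
gradients = head

heatmap = grad_cam(activations, gradients, out_size=(4, 4))

# Overlaying the heatmap: simple 50/50 blend with a grayscale image in [0, 1].
image = np.full((4, 4), 0.5)
overlay = 0.5 * image + 0.5 * heatmap
print(heatmap)
```

In this toy case the heatmap highlights only the top-left quadrant: map 0 supports the class (positive weight), while map 1 counts against it and is removed by the ReLU.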

Compared with CAM, the key difference is in how the importance of each feature map is determined: CAM uses the learned weights of the final classification layer, which requires a global average pooling (GAP) layer immediately before that layer, while Grad-CAM uses the gradients of the target class score, which allows it to be applied to a much wider range of CNN architectures without modifying or retraining them. Grad-CAM is therefore more flexible and more widely applicable than CAM.
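
The relationship between the two methods can be made explicit. Using the notation of the Grad-CAM paper, let $A^k$ be the $k$-th feature map of the last convolutional layer, $y^c$ the score of class $c$ before the softmax, and $Z$ the number of spatial positions in a feature map. Grad-CAM computes:

$$
\alpha_k^c = \frac{1}{Z}\sum_i\sum_j \frac{\partial y^c}{\partial A^k_{ij}},
\qquad
L^c_{\text{Grad-CAM}} = \mathrm{ReLU}\Big(\sum_k \alpha_k^c A^k\Big)
$$

For a CAM-style network (GAP followed by a linear layer with weights $w_k^c$), the class score is

$$
y^c = \sum_k w_k^c \, \frac{1}{Z}\sum_i\sum_j A^k_{ij}
\quad\Longrightarrow\quad
\frac{\partial y^c}{\partial A^k_{ij}} = \frac{w_k^c}{Z}
\quad\Longrightarrow\quad
\alpha_k^c = \frac{w_k^c}{Z}
$$

so on that architecture the Grad-CAM weights equal the CAM weights up to the constant factor $1/Z$, which is why Grad-CAM can be seen as a generalization of CAM.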

## SHAP
https://arxiv.org/pdf/1705.07874.pdf

https://github.com/shap/shap#citations

Understanding why a model makes a certain prediction can be as crucial as the
prediction’s accuracy in many applications. However, the highest accuracy for large
modern datasets is often achieved by complex models that even experts struggle to
interpret, such as ensemble or deep learning models, creating a tension between
accuracy and interpretability. In response, various methods have recently been
proposed to help users interpret the predictions of complex models, but it is often
unclear how these methods are related and when one method is preferable over
another. To address this problem, we present a unified framework for interpreting
predictions, SHAP (SHapley Additive exPlanations). SHAP assigns each feature
an importance value for a particular prediction. Its novel components include: (1)
the identification of a new class of additive feature importance measures, and (2)
theoretical results showing there is a unique solution in this class with a set of
desirable properties.


![image.png](attachment:image.png)

![image.png](attachment:image.png)

SHAP uses Shapley values, a concept from game theory, to assign each feature its importance for a specific prediction.
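
To make the idea concrete, here is a minimal brute-force sketch of exact Shapley values for a tiny model. It is not how the `shap` library computes its values (the library relies on approximations and model-specific algorithms, since the exact sum grows exponentially with the number of features). It assumes that a "missing" feature is represented by replacing it with a baseline value; the function `shapley_values` and the toy `model` are hypothetical.

```python
from itertools import combinations
from math import factorial

import numpy as np

def shapley_values(predict_fn, x, baseline):
    """Exact Shapley values of predict_fn at x, relative to a baseline input."""
    n = len(x)

    def value(subset):
        # Features in `subset` take their real value, the rest the baseline.
        z = baseline.copy()
        idx = np.array(subset, dtype=int)
        z[idx] = x[idx]
        return predict_fn(z)

    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Weight of coalitions of this size: |S|! (n - |S| - 1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                # Marginal contribution of feature i to this coalition.
                phi[i] += weight * (value(subset + (i,)) - value(subset))
    return phi

# Toy model with an interaction between features 1 and 2.
def model(z):
    return 2.0 * z[0] + z[1] * z[2]

x = np.array([1.0, 2.0, 3.0])
baseline = np.zeros(3)
phi = shapley_values(model, x, baseline)
print(phi)  # [2. 3. 3.]
```

The values add up to `model(x) - model(baseline)` (here 8), which is the additive property behind the name: feature 0 gets its own contribution of 2, and the interaction term worth 6 is split equally between features 1 and 2.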


# BERT Visualizer

https://colab.research.google.com/drive/1hXIQ77A4TYS4y3UthWF-Ci7V7vVUoxmQ?usp=sharing#scrollTo=twSVFOM9SopW

## Attention weights, counterfactuals, AI Explainability 360