# Explainable Networks

> What do you call it if someone has absolute faith in artificial intelligence?  - [Naive bias](https://en.m.wikipedia.org/wiki/Naive_Bayes_classifier).


Neural network models have acquired a reputation for being **black boxes**: they make accurate decisions, but we have a hard time explaining what actually influenced them. This can be an obstacle to deploying such models when their decisions need to be transparent to enable technical, ethical or legal review. The field of **explainable artificial intelligence** aims to remedy this problem. In the following, we look at an algorithm that aims to explain predictions in image classification.

## Training Clever Hans

But first let me tell you the story of **Clever Hans**: _Der Kluge Hans_ was a trained horse that made headlines in Germany in 1904 due to his amazing intelligence.

![](https://upload.wikimedia.org/wikipedia/commons/thumb/5/57/Osten_und_Hans.jpg/640px-Osten_und_Hans.jpg)

During demonstrations, Hans was seemingly able to count and solve arithmetic exercises, correctly answering the questions of his trainer, a mathematics teacher, by tapping his hooves or shaking his head. Scientists were puzzled. Eventually, a student solved the mystery: he was able to demonstrate that Clever Hans had no concept of mathematics, but was able to pick up on very subtle cues in the body language of the human posing the task. That allowed Hans to detect the right answer for about 90% of the questions.

Ever since, research in animal cognition has been wary of the **Clever Hans effect**, where the animal trainer unwittingly provides cues that are correlated with the right answer. A similar phenomenon can occur when training machine learning systems: rather than learning a generalizable concept, the model learns to pick up on _spurious correlations_ between the input data and the correct answer. [Ribeiro et al.](https://arxiv.org/pdf/1602.04938.pdf) provide a practical example: an image classifier mislabels a picture of a husky as a wolf, even though the classifier was shown to have high accuracy during model engineering. Using the LIME algorithm, which we will introduce shortly, the authors are able to expose the spurious correlation that the model relied on: apparently, the training images for "wolf" had snow in them.

![](graphics/husky-wolf.png)


This is related to the problem of **leakage**: _leaking_ the label to the classifier, allowing it to cheat. Leakage can be difficult to detect, and avoiding it requires careful cleaning of the data. Classifier performance that is "too good to be true" is often a sign of leakage.
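
The effect can be reproduced on a toy problem. The following is a minimal sketch with made-up data, not the setup used by Ribeiro et al.: the features `animal` and `snow`, the helper names and the least-squares classifier are all assumptions chosen for illustration. During training, `snow` coincides exactly with the label "wolf"; at test time it is unrelated to the label.

```python
import numpy

rng = numpy.random.default_rng(0)


def make_data(n, snow_follows_label):
    """Toy 'husky vs. wolf' data: one weak real feature, one 'snow' feature."""
    y = rng.integers(0, 2, size=n)             # 0 = husky, 1 = wolf
    animal = y + rng.normal(0.0, 2.0, size=n)  # weakly informative signal
    if snow_follows_label:
        snow = y.astype(float)                 # snow appears exactly with wolves
    else:
        snow = rng.integers(0, 2, size=n).astype(float)  # snow is unrelated
    return numpy.column_stack([animal, snow]), y


def fit_linear(X, y):
    """Least-squares linear classifier on targets in {-1, +1}."""
    A = numpy.column_stack([X, numpy.ones(len(X))])
    w, *_ = numpy.linalg.lstsq(A, 2.0 * y - 1.0, rcond=None)
    return w


def accuracy(w, X, y):
    """Fraction of samples whose predicted class matches the label."""
    A = numpy.column_stack([X, numpy.ones(len(X))])
    return float(numpy.mean((A @ w > 0) == (y == 1)))


X_train, y_train = make_data(2000, snow_follows_label=True)
X_test, y_test = make_data(2000, snow_follows_label=False)

w = fit_linear(X_train, y_train)
train_acc = accuracy(w, X_train, y_train)
test_acc = accuracy(w, X_test, y_test)
print(f"train accuracy: {train_acc:.2f}, test accuracy without the cue: {test_acc:.2f}")
```

The classifier looks perfect on the training data because it can rely on the snow cue alone, and it drops to roughly chance level once that cue no longer follows the label.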

How can we gain more certainty about what the ML model reacts to? An analysis of [📓Feature Importance](../ml/ml-feature-engineering.ipynb) can give us insights. Inspecting deep learning models with very high-dimensional inputs is more challenging, but there are algorithmic approaches that can help.
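
One common way to measure feature importance is permutation importance: shuffle one feature at a time and record how much the accuracy drops. The sketch below assumes a hypothetical stand-in model `toy_predict` and synthetic data; the linked notebook may use a different technique.

```python
import numpy


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = numpy.random.default_rng(seed)
    baseline = numpy.mean(predict(X) == y)
    importances = numpy.zeros(X.shape[1])
    for j in range(X.shape[1]):
        drops = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Shuffling column j breaks its link to the labels.
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            drops.append(baseline - numpy.mean(predict(X_perm) == y))
        importances[j] = numpy.mean(drops)
    return importances


def toy_predict(X):
    """Hypothetical stand-in model: it only ever looks at feature 1."""
    return (X[:, 1] > 0).astype(int)


rng = numpy.random.default_rng(1)
X = rng.normal(size=(500, 3))
y = (X[:, 1] > 0).astype(int)

importances = permutation_importance(toy_predict, X, y)
print(importances.round(2))
```

Only the feature the model actually uses shows a drop in accuracy. For an image with 299 x 299 x 3 input values, shuffling single pixels says little, which is why image explainers work on larger regions instead.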


## Preamble

In [None]:
from tensorflow import keras
import numpy
import matplotlib.pyplot as plt
import pandas

## Example Model: Inception V3 for Image Classification

[Inception V3](https://keras.io/applications/#inceptionv3) is a deep convolutional neural network architecture for image classification. A model trained on 1000 classes from the [ImageNet](https://www.image-net.org/) benchmark data set is provided with `keras`.

In [None]:
from tensorflow.keras.applications import inception_v3

In [None]:
inception_model = inception_v3.InceptionV3()

Let's see how the Inception network classifies several animal photos. We need to do some preprocessing to get them into the right shape.

In [None]:
image_paths = {
    "dog" : "../.assets/data/xai_images/dog.10.jpg",
    "wolf 1":  "../.assets/data/xai_images/wolves-at-play.jpg",
    "dog guitar": "../.assets/data/xai_images/dog-guitar.jpg",
}

In [None]:
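def preprocess_image(img_path):
    """Load an image, resize it to 299x299 and scale its pixel values to [0, 1]."""
    img = keras.preprocessing.image.load_img(
        img_path,
        target_size=(299, 299)  # input size expected by Inception V3
    )
    img_array = keras.preprocessing.image.img_to_array(img)
    # Scale pixel values from [0, 255] to [-1, 1].
    img_array = inception_v3.preprocess_input(img_array)
    # Shift to [0, 1] so that plt.imshow can display the image; the model is
    # then also fed these [0, 1] values rather than the usual [-1, 1].
    img_array = img_array / 2 + 0.5
    return img_array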

In [None]:
images = {
    key: preprocess_image(img_path)
    for key, img_path in image_paths.items()
}

In [None]:
plt.imshow(images["dog"])

In [None]:
plt.imshow(images["wolf 1"])

In [None]:
plt.imshow(images["dog guitar"])

Here we call the model on these images and obtain a classification including how confident the network is in its prediction.

In [None]:
predictions = inception_model.predict(
    numpy.array(list(images.values()))
)


In [None]:
from tensorflow.keras.applications import imagenet_utils

In [None]:
decoded_predictions = imagenet_utils.decode_predictions(predictions)
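
For each image, `decode_predictions` returns the highest-scoring classes as (class id, label, score) tuples, five by default. Conceptually this is a top-k selection over the probability vector, which can be sketched as follows. The function `top_k` and the four-class toy output are assumptions for illustration, not part of Keras.

```python
import numpy


def top_k(probabilities, labels, k=5):
    """Return the k (label, probability) pairs with the highest probability."""
    order = numpy.argsort(probabilities)[::-1][:k]  # indices, best first
    return [(labels[i], float(probabilities[i])) for i in order]


# Hypothetical four-class output; the real model has 1000 ImageNet classes.
toy_labels = ["husky", "wolf", "chihuahua", "acoustic_guitar"]
toy_probabilities = numpy.array([0.10, 0.05, 0.60, 0.25])

ranked = top_k(toy_probabilities, toy_labels, k=2)
print(ranked)
```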

In [None]:
plt.imshow(images["dog"])

In [None]:
pandas.DataFrame(
    decoded_predictions[0], 
    columns=["class", "label", "confidence"]
).set_index("label").plot(kind="bar", ylim=(0,1))

In [None]:
plt.imshow(images["wolf 1"])

In [None]:
pandas.DataFrame(
    decoded_predictions[1], 
    columns=["class", "label", "confidence"]
).set_index("label").plot(kind="bar", ylim=(0,1))

In [None]:
plt.imshow(images["dog guitar"])

In [None]:
pandas.DataFrame(
    decoded_predictions[2], 
    columns=["class", "label", "confidence"]
).set_index("label").plot(kind="bar", ylim=(0,1))

## Explaining Image Classification with LIME

[**LIME**](https://github.com/marcotcr/lime) (Local Interpretable Model-Agnostic Explanations) is an algorithm and accompanying library that aims to explain the predictions of any classifier, including but not limited to neural networks. The library includes tools specifically for image classification.

In [None]:
import lime

In [None]:
from lime.lime_image import LimeImageExplainer

In [None]:
explainer = LimeImageExplainer()

Explaining an instance is rather compute-intensive: LIME perturbs the image `num_samples` times, queries the classifier on every perturbed copy and then fits a simple surrogate model to the results:

In [None]:
explanation = explainer.explain_instance(
    images["dog"].astype("double"),
    inception_model.predict, 
    top_labels=5, 
    hide_color=0, 
    num_samples=500
)
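
To make the cost of this call more tangible, here is a minimal sketch of the idea behind LIME for images. It is not the library's implementation: the function `lime_sketch`, the grid segmentation, the kernel settings and the black box `toy_predict` are assumptions for illustration, and the image is grayscale to keep the arrays small.

```python
import numpy


def lime_sketch(image, segments, predict, num_samples=500, kernel_width=0.25, seed=0):
    """Estimate one weight per superpixel with a weighted linear surrogate."""
    rng = numpy.random.default_rng(seed)
    n_segments = int(segments.max()) + 1

    # 1. Sample binary vectors: 1 keeps a superpixel, 0 hides it.
    z = rng.integers(0, 2, size=(num_samples, n_segments))
    z[0, :] = 1  # keep the unperturbed image as the first sample

    # 2. Build the perturbed images (hidden pixels become 0) and query the model.
    keep = z[:, segments].astype(bool)  # shape: (num_samples, height, width)
    scores = predict(numpy.where(keep, image, 0.0))

    # 3. Weight each sample by its closeness to the unperturbed image
    #    (cosine distance to the all-ones vector, exponential kernel).
    distance = 1.0 - numpy.sqrt(z.sum(axis=1) / n_segments)
    sample_weights = numpy.sqrt(numpy.exp(-(distance ** 2) / kernel_width ** 2))

    # 4. Fit a weighted linear model with a small ridge penalty.
    design = numpy.column_stack([z, numpy.ones(num_samples)])
    weighted = design.T * sample_weights
    gram = weighted @ design + 1e-3 * numpy.eye(design.shape[1])
    coefficients = numpy.linalg.solve(gram, weighted @ scores)
    return coefficients[:-1]  # drop the intercept


# Toy setup: an 8x8 grayscale image cut into a 4x4 grid of 2x2 superpixels.
rows, cols = numpy.indices((8, 8))
toy_segments = (rows // 2) * 4 + (cols // 2)
toy_image = numpy.ones((8, 8))


def toy_predict(batch):
    """Hypothetical black box: it only reacts to the top-right corner."""
    return batch[:, 0:2, 4:8].mean(axis=(1, 2))


segment_weights = lime_sketch(toy_image, toy_segments, toy_predict)
top_two = sorted(numpy.argsort(segment_weights)[-2:].tolist())
print(top_two)
```

The two superpixels with the largest weights are the ones covering the top-right corner, the only region the toy model looks at. Every sample costs one forward pass through the classifier, so with Inception V3 the runtime grows directly with `num_samples`.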

In [None]:
from skimage.segmentation import mark_boundaries

In [None]:
explained_image, mask = explanation.get_image_and_mask(
    explanation.top_labels[0], 
    positive_only=True, 
    num_features=5, 
    hide_rest=True
)
plt.figure()
plt.imshow(mark_boundaries(explained_image, mask))
plt.figure()
plt.imshow(images["dog"])

It's the ears!

> The skull should be broad, with a long muzzle and **long, hanging ears**.

-- [Wikipedia: Treeing Walker Coonhound](https://en.m.wikipedia.org/wiki/Treeing_Walker_Coonhound)
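
The highlighted region comes from the `get_image_and_mask` call above: with `positive_only=True` and `num_features=5`, it keeps up to five superpixels that support the chosen class. The selection step can be sketched roughly as follows, assuming per-superpixel weights are already available. The helper `positive_support_mask` and the toy arrays are hypothetical and simplify what the library does.

```python
import numpy


def positive_support_mask(segments, weights, num_features=5):
    """Mark the pixels of the strongest positively weighted superpixels with 1."""
    order = numpy.argsort(weights)[::-1]  # strongest support first
    chosen = [s for s in order if weights[s] > 0][:num_features]
    return numpy.isin(segments, chosen).astype(int)


# Hypothetical 2x4 image with four superpixels and made-up weights.
toy_segments = numpy.array([[0, 0, 1, 1],
                            [2, 2, 3, 3]])
toy_weights = numpy.array([0.4, -0.2, 0.0, 0.1])

toy_mask = positive_support_mask(toy_segments, toy_weights, num_features=2)
print(toy_mask)
```

Superpixels with zero or negative weight are never selected, even if fewer than `num_features` remain.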

In [None]:
explanation = explainer.explain_instance(
    images["wolf 1"].astype("double"), 
    inception_model.predict, 
    top_labels=5, 
    hide_color=0, 
    num_samples=500
)

In [None]:
explained_image, mask = explanation.get_image_and_mask(
    explanation.top_labels[0], 
    positive_only=True, 
    num_features=5, 
    hide_rest=True
)
plt.figure()
plt.imshow(mark_boundaries(explained_image, mask))
plt.figure()
plt.imshow(images["wolf 1"])

In the last photo, multiple objects were detected. We can use the explainer to segment the image into the parts that provide support for each of the classes. Let's look at the classes "chihuahua" and "acoustic guitar".

In [None]:
explanation = explainer.explain_instance(
    images["dog guitar"].astype("double"), 
    inception_model.predict, 
    top_labels=5, 
    hide_color=0, 
    num_samples=500
)

In [None]:
explained_image, mask = explanation.get_image_and_mask(
    explanation.top_labels[0], 
    positive_only=True, 
    num_features=5, 
    hide_rest=True
)
plt.figure()
plt.imshow(mark_boundaries(explained_image, mask))
plt.figure()
plt.imshow(images["dog guitar"])

In [None]:
explained_image, mask = explanation.get_image_and_mask(
    explanation.top_labels[4], 
    positive_only=True, 
    num_features=5, 
    hide_rest=True
)
plt.figure()
plt.imshow(mark_boundaries(explained_image, mask))
plt.figure()
plt.imshow(images["dog guitar"])

## References

- [Local Interpretable Model-Agnostic Explanations (LIME): An Introduction](https://www.oreilly.com/learning/introduction-to-local-interpretable-model-agnostic-explanations-lime)
- [Ribeiro et al.: _“Why Should I Trust You?” Explaining the Predictions of Any Classifier_](https://arxiv.org/pdf/1602.04938.pdf)
- [Lapuschkin et al.: _Unmasking Clever Hans predictors and assessing what machines really learn_](https://www.nature.com/articles/s41467-019-08987-4.pdf)
- [Vincent Warmerdam: How to Constrain Artificial Stupidity | PyData London 2019](https://www.youtube.com/watch?v=Z8MEFI7ZJlA)

---
_This notebook is licensed under a [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)](https://creativecommons.org/licenses/by-nc-sa/4.0/). Copyright © 2018-2025 [Point 8 GmbH](https://point-8.de)_