# Explaining Facial Expression Recognition with Simplification and Feature Attribution
# Notebook 1:  XAI for Affective Computing (SoSe2022)

In this notebook you will generate explanations for the predictions of two facial expression recognition models, one trained on tabular data extracted from images and one trained on raw image data. Both were trained on a subset of the [AffectNet dataset](http://mohammadmahoor.com/affectnet/). AffectNet is a dataset of facial expressions in the wild, labeled with 8 facial expression categories: **Neutral, Happy, Sad, Surprise, Fear, Disgust, Anger, and Contempt**. (See the [paper](https://arxiv.org/abs/1708.03985) for more details.)

In **Part 1**, you will explore local and global explanations on a tabular dataset of [Facial Action Units (FAUs)](https://imotions.com/blog/facial-action-coding-system/) using the [LIME Python package](https://github.com/marcotcr/lime). The dataset consists of FAUs that were extracted from the face images of AffectNet using [OpenFace2.0](https://github.com/TadasBaltrusaitis/OpenFace). It was used to train a Random Decision Forest (RDF) classifier (the trained model is loaded in the code below).

In **Part 2**, you will generate local explanations for a Convolutional Neural Network (CNN) using both LIME and GradCAM. The CNN has already been trained on the raw images of AffectNet (the trained model is loaded in the code below). A subset of the test images is provided for generating and evaluating the explanations.

To use this notebook, go through the cells step by step, reviewing the code and comments along the way.

See the **README** to get started.

## Part 0: Notebook Setup

In [None]:
%load_ext autoreload
%autoreload 2

**Import necessary libraries**

(See the README for the necessary package installations if you receive a `module not found` error.)

In [None]:
import pickle
from pathlib import Path

import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3' 

import pandas as pd
import numpy as np

import matplotlib.pyplot as plt
import seaborn as sns
sns.set_style("white")

# import tensorflow for model loading
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.utils import to_categorical

# import sklearn for processing data and results
from sklearn.metrics import confusion_matrix, classification_report, auc, roc_curve, roc_auc_score
from sklearn.preprocessing import LabelBinarizer

# import model loading function
from model import cnn_model, create_bn_replacment

# import plotting helper functions
import utils

from IPython.display import clear_output
import warnings
warnings.filterwarnings('ignore')

## Part 1: Explanations of Facial Action Units

In this part, we will generate explanations for the Random Decision Forest trained using a dataset of Facial Action Units (as described in the notebook introduction).  

First, let's load the data and the trained model. Then we will evaluate the model performance before we start with the explanations.

### Load the data

In [None]:
# Full data from training and evaluation
train_csv = '../data/affectnet_aus/train_aus.csv'
val_csv = '../data/affectnet_aus/val_aus.csv'

# load training and validation data as pandas DataFrames
df_train = pd.read_csv(train_csv)
df_val = pd.read_csv(val_csv)

# smaller dataset for explanations (same data as in Task 1)
xai_csv = '../data/affectnet_aus/eval_aus.csv'
df_xai = pd.read_csv(xai_csv)

# get only the columns storing action units from the dataframe
feature_names = [col for col in df_val if col.startswith('AU')]
categorical_features = [i for i, feat in enumerate(feature_names) if '_c' in feat]

class_names = ['Neutral', 'Happy', 'Sad', 'Surprise', 'Fear', 'Disgust', 'Anger', 'Contempt']  # same class labels as before

In [None]:
# convert data from dataframe to Numpy arrays
X_train = np.array(df_train.loc[:, feature_names])
y_train = np.array(df_train['class'])

X_test = np.array(df_val.loc[:, feature_names])
y_test = np.array(df_val['class'])

X_xai = np.array(df_xai.loc[:, feature_names])
y_xai = np.array(df_xai['class'])

print('Train', X_train.shape, y_train.shape)
print('Test', X_test.shape, y_test.shape)
print('XAI', X_xai.shape, y_xai.shape)

### Load pretrained RDF model
Load the model and validate that it works.  
The accuracy of the model on the training data should be around $99.65\%$.

In [None]:
with open('../models/affect_rdf.pkl', 'rb') as f:
    clf = pickle.load(f)
    
clf.score(X_train, y_train)

### Now evaluate on the test data
Unfortunately, the accuracy is only about $44\%$. This is still well above uniform random guessing, which would give $1 / 8 \cdot 100 = 12.5\%$ accuracy (since there are 8 classes).
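As a quick sanity check on the chance level quoted above, the following minimal sketch computes two simple baselines. It assumes the classes are balanced; the helper names and the toy label list are made up for illustration and are not part of the notebook's code.

```python
from collections import Counter


def chance_accuracy(num_classes: int) -> float:
    """Expected accuracy (in %) of uniform random guessing on balanced classes."""
    return 100.0 / num_classes


def majority_baseline(labels) -> float:
    """Accuracy (in %) of always predicting the most frequent label."""
    counts = Counter(labels)
    return 100.0 * max(counts.values()) / len(labels)


print(chance_accuracy(8))             # 12.5

toy_labels = [0, 0, 0, 1, 2, 3]       # hypothetical, deliberately imbalanced
print(majority_baseline(toy_labels))  # 50.0
```

If the classes were imbalanced, the majority-class baseline would be the more honest reference point than $1 / 8$.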

In [None]:
# get model predictions
y_test_preds = clf.predict(X_test)
y_test_true = y_test

In [None]:
print(classification_report(y_test_true, y_test_preds))

We can also review the confusion matrix to see where the model makes its mistakes

In [None]:
cm_data = confusion_matrix(y_test_true, y_test_preds)
cm = pd.DataFrame(cm_data, columns=class_names, index=class_names)
cm.index.name = 'Actual'
cm.columns.name = 'Predicted'
plt.figure(figsize = (20,10))
plt.title('Confusion Matrix', fontsize = 20)
sns.set(font_scale=1.2)
ax = sns.heatmap(cm, cbar=False, cmap="Blues", annot=True, annot_kws={"size": 16}, fmt='g')

### Evaluate with XAI Data
Here we calculate and evaluate predictions on the smaller `X_xai` dataset. `X_xai` is a subset of the full test data (from above) and will be used throughout the rest of Part 1.

The accuracy should be around $42\%$

In [None]:
# get model predictions
y_xai_preds = clf.predict(X_xai)
y_xai_true = y_xai

print(classification_report(y_xai_true, y_xai_preds))

**Display confusion matrix for `X_xai` dataset**

In [None]:
cm_data = confusion_matrix(y_xai_true, y_xai_preds)
cm = pd.DataFrame(cm_data, columns=class_names, index=class_names)
cm.index.name = 'Actual'
cm.columns.name = 'Predicted'
plt.figure(figsize = (20,10))
plt.title('Confusion Matrix', fontsize = 20)
sns.set(font_scale=1.2)
ax = sns.heatmap(cm, cbar=False, cmap="Blues", annot=True, annot_kws={"size": 16}, fmt='g')

### TASK 1: Implement LIME Local Explanations and SP-LIME for Global Explanations

Now on to implementing LIME explanations.

#### Task 1.0: 
**Identify a Few Images to Explain**

The code below will display images from the XAI dataset.
- Try changing the value of `start` to get a new set of images (there are 10 images for each class; for example, the class Happy is at indexes 10-19)
- Search through the images to find at least 4 to explain 
    - Find classes that you would like to explain, and from each class select 2 images
        - one should be a correct prediction  
        - and one should be an incorrect prediction

In [None]:
# packages needed for the rest of the tasks
from skimage import io
import lime.lime_tabular

In [None]:
# Gets all images from folder
images = [io.imread(f) for f in df_xai.image]

# gets labels for ground truth and predictions
true_labels = [class_names[idx] for idx in y_xai_true]
pred_labels = [class_names[idx] for idx in y_xai_preds]

In [None]:
# displays 9 images from the array, beginning at index `start`
start = 10
utils.display_nine_images(images, true_labels, pred_labels, start)

In [None]:
#### Enter the Indexes Here ### 
###############################
# you will use this array later in the task
img_idxs = []

#### Task 1.1
Now implement a [LimeTabularExplainer](https://lime-ml.readthedocs.io/en/latest/lime.html#module-lime.lime_tabular). You can review the [LIME tutorial](https://marcotcr.github.io/lime/tutorials/Tutorial%20-%20continuous%20and%20categorical%20features.html) for help. Make sure to indicate which features are categorical, as stored in the `categorical_features` variable, because LIME treats continuous and categorical features differently.

Note: In the feature names, you will see features ending in `_c` and `_r`. The `_r` suffix marks the intensity of the action unit (i.e., how strong its presence is), and the `_c` suffix marks a binary feature indicating the presence (value=1) or absence (value=0) of an action unit.
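To make the suffix convention concrete, here is a small sketch that splits a list of feature names into categorical and continuous column indexes. The names in `example_features` are hypothetical OpenFace-style placeholders; the real names come from the CSV header loaded earlier.

```python
# Hypothetical OpenFace-style column names (assumption, for illustration only).
example_features = ["AU01_r", "AU02_r", "AU12_r", "AU01_c", "AU12_c"]

# Binary presence features end in "_c"; intensity features end in "_r".
categorical_idx = [i for i, name in enumerate(example_features) if name.endswith("_c")]
continuous_idx = [i for i, name in enumerate(example_features) if name.endswith("_r")]

print(categorical_idx)  # [3, 4]
print(continuous_idx)   # [0, 1, 2]
```

Indexes like `categorical_idx` are what a tabular explainer needs in order to know which columns to treat as categorical.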

In [None]:
##### YOUR CODE GOES HERE #####
###############################



#### Task 1.2
Generate LIME explanations for the 4 previously identified data instances from the `X_xai` dataset using the `LimeTabularExplainer`, and then plot the explanation for each data instance (see the tutorial mentioned above).

HINT: Before showing an explanation, plot the image using `utils.display_one_image()` utility function.  Use `plt.show()` immediately after calling `utils.display_one_image()` to display the image before the explanation charts.

Make sure to print out the **True** and **Predicted** labels for each instance.

Try experimenting with different parameters for the explainer and explanation.

In [None]:
##### YOUR CODE GOES HERE #####
###############################



#### Task 1.3
Identify the important Facial Action Units and compare them with the images at [Facial Action Units](https://imotions.com/blog/facial-action-coding-system/). What insights do these local explanations provide?

Write your answer here...

- 

#### Task 1.4

Now implement a [Submodular Pick](https://lime-ml.readthedocs.io/en/latest/lime.html#lime-submodular-pick-module) instance to generate global explanations, and see whether it gives you a more global perspective on how the model makes decisions. You can review the [LIME tutorial](https://github.com/marcotcr/lime/blob/master/doc/notebooks/Submodular%20Pick%20examples.ipynb) for help.

Try setting `num_exps_desired` to 16 to try to get 2 examples per class (although this isn't guaranteed).
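The idea behind submodular pick, as described in the LIME paper, can be sketched in a few lines: each feature gets a global importance (the square root of its summed absolute weight across explanations), and instances are chosen greedily so that each new pick covers as much not-yet-covered importance as possible. The following is a simplified illustration of that idea, not the implementation in `lime.submodular_pick`; the function name and the toy weight matrix are assumptions made for this example.

```python
import math


def greedy_submodular_pick(weights, budget):
    """Greedily choose `budget` rows from a matrix of absolute explanation weights.

    weights[i][j] is the absolute weight of feature j in the explanation of
    instance i (0.0 if the feature does not appear in that explanation).
    """
    n_features = len(weights[0])
    # Global importance of a feature: sqrt of its summed absolute weight.
    importance = [math.sqrt(sum(row[j] for row in weights)) for j in range(n_features)]

    chosen, covered = [], set()
    for _ in range(min(budget, len(weights))):
        best_i, best_gain = None, -1.0
        for i, row in enumerate(weights):
            if i in chosen:
                continue
            # Gain = importance of features this instance would newly cover.
            new_features = {j for j, w in enumerate(row) if w > 0} - covered
            gain = sum(importance[j] for j in new_features)
            if gain > best_gain:
                best_i, best_gain = i, gain
        chosen.append(best_i)
        covered |= {j for j, w in enumerate(weights[best_i]) if w > 0}
    return chosen


# Toy example: instances 0 and 1 use the same features, instance 2 uses others.
toy_weights = [
    [0.9, 0.8, 0.0, 0.0],
    [0.9, 0.7, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.4],
]
print(greedy_submodular_pick(toy_weights, budget=2))  # [0, 2]
```

In the toy example the second pick skips instance 1, because its features are already covered by instance 0. This is also why asking for 16 explanations does not guarantee 2 per class.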

In [None]:
# import for submodular pick
from lime import submodular_pick

In [None]:
##### YOUR CODE GOES HERE #####
###############################



#### Task 1.5
Now plot the explanations.

In [None]:
##### YOUR CODE GOES HERE #####
###############################



#### Bonus Task
Generate a pandas dataframe of the explanations and explore the dataframe to gain more insight into the explanations, see the [LIME tutorial](https://github.com/marcotcr/lime/blob/master/doc/notebooks/Submodular%20Pick%20examples.ipynb) for an example.

In [None]:
##### YOUR CODE GOES HERE #####
###############################



#### Task 1.6

How does LIME Submodular Pick select explanations for a global perspective on the model?

Identify important AUs for each of the classes and compare with the images at [Facial Action Units](https://imotions.com/blog/facial-action-coding-system/).  What insights do these explanations provide? Do you now have a better understanding of how the model is working? If not, what is lacking using the LIME approach and/or what could be done differently?

Write your answer here

...

## Part 2: Local Explanations of Facial Expression Recognition with Images

In this part, we will generate explanations for the CNN trained using facial images (as described in the notebook introduction).

First, let's load the data and the trained model. Then we will evaluate the model performance before we start with the explanations.

Set some global variables

In [None]:
SEED = 12
IMG_HEIGHT = 128
IMG_WIDTH = 128
BATCH_SIZE = 80 
NUM_CLASSES = 8
CLASS_LABELS = ['Neutral', 'Happy', 'Sad', 'Surprise', 'Fear', 'Disgust', 'Anger', 'Contempt']

### Load Pretrained CNN Model and Setup Data Generator

In [None]:
# make sure you've downloaded the models from LernraumPlus (see README instructions for Notebook I)
model_path = '../models/affectnet_model_e=60/affectnet_model'

# test loading weights
model_xai = cnn_model(input_shape=(IMG_HEIGHT, IMG_WIDTH, 3), num_classes=NUM_CLASSES)
model_xai.load_weights(model_path).expect_partial()

In [None]:
test_dir = '../data/affectnet/val_class/'

# Load data
test_datagen = ImageDataGenerator(validation_split=0.2,
                                  rescale=1./255)
test_gen = test_datagen.flow_from_directory(directory=test_dir,
                                            target_size=(IMG_HEIGHT, IMG_WIDTH),
                                            batch_size=BATCH_SIZE,
                                            shuffle=False,
                                            color_mode='rgb',
                                            class_mode='categorical', 
                                            seed = SEED)
images, classes = next(test_gen) # since batch size is set to 80, this will load the entire test dataset

### Evaluation and Predictions
Here we evaluate the loaded model to ensure it is working as expected. You should get around $55\%$ accuracy. While this is not a perfect classifier, it is well above uniform random guessing, which would give $1 / 8 \cdot 100 = 12.5\%$ accuracy.

Then we load predictions to use throughout the notebook. 

The prediction results can then be viewed in a confusion matrix to see where the model is confused.

In [None]:
loss, acc = model_xai.evaluate(test_gen, verbose=2)
print("Restored model, accuracy: {:5.2f}%".format(100 * acc))

In [None]:
# get softmax predictions from model
preds = model_xai(images)

# convert predictions to integers
y_pred = np.argmax(preds, axis=-1)
y_true = np.argmax(classes, axis=-1)

# print detailed results
print(classification_report(y_true, y_pred, target_names=CLASS_LABELS))

In [None]:
# we can also review the confusion matrix
cm_data = confusion_matrix(y_true, y_pred)
cm = pd.DataFrame(cm_data, columns=CLASS_LABELS, index = CLASS_LABELS)
cm.index.name = 'Actual'
cm.columns.name = 'Predicted'
plt.figure(figsize = (20,10))
plt.title('Confusion Matrix', fontsize = 20)
sns.set(font_scale=1.2)
ax = sns.heatmap(cm, cbar=False, cmap="Blues", annot=True, annot_kws={"size": 16}, fmt='g')

### TASK 2: LIME Local Prediction Explanations

Now that we have our model setup, we will review the images and predictions to identify a few data instances to explain.  

#### Task 2.0

Identify a Few Images to Explain

The code below will display images from the XAI dataset.

- Try changing the `start` value to get a new set of images (there are 10 images for each class; for example, the class Happy is at indexes 10-19)
- Search through the images to find at least 4 to explain 
    - Find classes that you would like to explain, and from each class select 2 images
        - one should be a correct prediction  
        - and one should be an incorrect prediction
        - These can be the same images as previously selected (but note that the indexes are different because the files are read from disk in a different way)

In [None]:
# displays 9 images from the array, beginning at index `start`
start = 30

true_labels = [CLASS_LABELS[idx] for idx in y_true]
pred_labels = [CLASS_LABELS[idx] for idx in y_pred]
utils.display_nine_images(images, true_labels, pred_labels, start)

In [None]:
#### Enter the Indexes Here ### 
###############################
# you will use this array later in this task
img_idxs = []


#### Task 2.1 
**Implement a LIME Image Explainer**

Implement a [LimeImageExplainer](https://lime-ml.readthedocs.io/en/latest/lime.html#module-lime.lime_image) instance. You can review the [LIME tutorial](https://github.com/marcotcr/lime/blob/master/doc/notebooks/Tutorial%20-%20Image%20Classification%20Keras.ipynb) for help.

In [None]:
import lime
from lime import lime_image
from lime.wrappers.scikit_image import SegmentationAlgorithm

from skimage.segmentation import mark_boundaries # used to get boundaries from explanation for plotting

In [None]:
##### YOUR CODE GOES HERE #####
###############################



#### Task 2.2

Now generate the explanations for each of the selected images.

As we discussed in the seminar, LIME requires images to be segmented into superpixels. For facial expressions the segmentation algorithm is very important, and the default of the `explain_instance()` method may not provide good explanations. Experiment with different segmenters using the LIME `SegmentationAlgorithm` class and pass the segmenter to the `segmentation_fn` argument of the `explain_instance()` method.

- For example: `segmenter = SegmentationAlgorithm('method name', params)`, where params are defined by the skimage segmentation algorithm
- by default LIME uses: 
```
segmenter = SegmentationAlgorithm('quickshift', kernel_size=4,
                                  max_dist=200, ratio=0.2,
                                  random_seed=random_seed)
```

You can find different algorithms via [skimage segmentation](https://scikit-image.org/docs/stable/api/skimage.segmentation.html)
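To see why the segmentation matters, it helps to look at what a superpixel-based perturbation does: each perturbed sample keeps some superpixels and hides the rest, so the explanation can only ever be as fine-grained as the segments. The sketch below illustrates this with NumPy on a toy image. It is a simplified illustration, not LIME's internal code; `perturb_image`, the toy segmentation, and the assumption that segment labels run from 0 to n-1 are all made up for this example.

```python
import numpy as np


def perturb_image(image, segments, keep, hide_value=0.0):
    """Return a copy of `image` where superpixels with keep[label] == 0 are hidden.

    Assumes segment labels are the integers 0..len(keep)-1.
    """
    perturbed = image.copy()
    kept_labels = np.flatnonzero(keep)          # labels of superpixels to keep
    hidden = ~np.isin(segments, kept_labels)    # boolean (H, W) mask of hidden pixels
    perturbed[hidden] = hide_value              # applied across all colour channels
    return perturbed


# Toy 4x4 RGB image split into four 2x2 "superpixels" with labels 0..3.
image = np.ones((4, 4, 3))
segments = np.repeat(np.repeat(np.array([[0, 1], [2, 3]]), 2, axis=0), 2, axis=1)
keep = np.array([1, 0, 1, 0])                   # hide superpixels 1 and 3

out = perturb_image(image, segments, keep)
print(out[..., 0])                              # left half stays 1, right half becomes 0
```

If a single superpixel spans both an eyebrow and part of the forehead, no amount of sampling can separate their contributions, which is one reason to try several segmenters.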

In [None]:
##### YOUR CODE GOES HERE #####
###############################



#### Task 2.3
Print the predicted labels for the top $N$ labels found by the explainer.

In [None]:
##### YOUR CODE GOES HERE #####
###############################



#### Task 2.4: 

**Visualize Explanations**

Visualize the LIME explanations for each of the 4 data points using matplotlib's `imshow` function (see the tutorial above), or pass the explanation to the `display_one_image` function from the `utils` module.

*HINT*: Use the `subplot` parameter of `display_one_image` to plot a 2x2 grid. The value should be an integer formatted as `RCN`, where `R` is the number of rows, `C` is the number of columns, and `N` is the number of the image to plot. For example, `221` means to plot the first image of a 2x2 grid, `222` means to plot the second image, and so forth (also see `display_nine_images` for an example of this usage).
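The three-digit `RCN` convention can be unpacked with simple integer arithmetic. The helpers below are hypothetical (they are not part of `utils`) and only illustrate how a code such as `221` maps to rows, columns, and position.

```python
def decode_subplot_code(code: int):
    """Split a three-digit subplot code RCN into (rows, cols, index)."""
    rows, rest = divmod(code, 100)
    cols, index = divmod(rest, 10)
    if not (1 <= index <= rows * cols):
        raise ValueError(f"index {index} is outside a {rows}x{cols} grid")
    return rows, cols, index


def grid_codes(rows: int, cols: int):
    """All subplot codes for a grid; only valid while rows * cols <= 9."""
    return [rows * 100 + cols * 10 + n for n in range(1, rows * cols + 1)]


print(decode_subplot_code(221))  # (2, 2, 1)
print(grid_codes(2, 2))          # [221, 222, 223, 224]
```

Looping over `grid_codes(2, 2)` together with your four selected images is one way to fill the 2x2 grid.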

Experiment with at least 2 different sets of parameters for the explanation visualizations. For example, view positive and negative contributions, change the number of features for the explanation, or try visualizing a heatmap.

In [None]:
##### YOUR CODE GOES HERE #####
###############################
# (you can use more than one notebook cell for this task)



#### Task 2.5 
**Report on your findings**

What insights have you gained about the predictions? Can you identify any patterns that explain how the model is working? Are you more or less confident in the model's performance after reviewing the explanations?

Write your findings here...

### TASK 3: Grad CAM

#### Task 3.1
**Implement the GradCAM Algorithm**

Implement a version of GradCAM based on the [Keras GradCAM Tutorial](https://keras.io/examples/vision/grad_cam/). To better understand the algorithm, rather than copying and pasting the entire functions, try implementing each main step of the algorithm in a separate cell. Then review the output of that cell, either by printing the tensor or its shape, or by visualizing the output with matplotlib, for example with `plt.imshow()`.

note:  the final convolution layer in our network is named `final_conv_layer`
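Before working with the real model, it can help to see the arithmetic of the core GradCAM steps in isolation. The sketch below uses plain NumPy with random stand-in arrays in place of the convolutional activations and the gradients of the class score; in the actual task these come from the network via `tf.GradientTape`. The function name and array shapes are assumptions for illustration, and this is not a substitute for the Keras-based implementation the task asks for.

```python
import numpy as np


def gradcam_heatmap(activations: np.ndarray, gradients: np.ndarray) -> np.ndarray:
    """Combine activations (H, W, K) and class-score gradients (H, W, K) into an (H, W) map."""
    # 1. Channel weights: average each channel's gradient over the spatial dimensions.
    channel_weights = gradients.mean(axis=(0, 1))   # shape (K,)
    # 2. Weighted sum of the activation maps across channels.
    heatmap = activations @ channel_weights         # shape (H, W)
    # 3. ReLU: keep only regions that push the class score up.
    heatmap = np.maximum(heatmap, 0.0)
    # 4. Scale to [0, 1] for display; leave an all-zero map unchanged.
    peak = heatmap.max()
    return heatmap / peak if peak > 0 else heatmap


rng = np.random.default_rng(0)
acts = rng.random((4, 4, 3))             # stand-in for the last conv layer's output
grads = rng.standard_normal((4, 4, 3))   # stand-in for d(class score)/d(activations)

heatmap = gradcam_heatmap(acts, grads)
print(heatmap.shape)                     # (4, 4)
```

The resulting map has the spatial resolution of the convolutional layer, so it has to be resized to the input image size before it can be superimposed.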

In [None]:
##### YOUR CODE GOES HERE #####
###############################



#### Task 3.2

Wrap the GradCAM Implementation into functions for generating the heatmaps and creating a superimposed image.  Instead of saving the superimposed image to a file (as done in the tutorial), simply generate the superimposed image and return it. We will use it in the next task.

In [None]:
##### YOUR CODE GOES HERE #####
###############################



#### Task 3.3

Using your new GradCAM functions and the `display_one_image()` function in the `utils` module, display the heatmaps for your images. However, instead of displaying only the heatmap for the top predicted class, generate a heatmap for each class label for each of your images; see the file `gradcam_example.png` for an example of what this might look like. Also, for each image, add a border to the image representing the predicted class. See the docstring of `display_one_image()` for details on usage.

In [None]:
##### YOUR CODE GOES HERE #####
###############################



#### Task 3.4

What patterns do you notice in the heatmaps for the different classes? Do the regions of the image where the model focuses make sense? Or do you notice any unusual biases?

### TASK 4: Final Discussion
Between Tasks 1, 2, and 3, which of the two methods (LIME or GradCAM) best helps you understand how the model functions, in the sense we have discussed throughout the seminar? Why?

Write your answer here...

### BONUS TASK

Try plotting the LIME output and the GradCAM output for the same images next to each other. Are the results similar, or are there differences in the regions identified as important?

Or feel free to generate your own plots or visualizations to help understand the explanations and models better.

In [None]:
##### YOUR CODE GOES HERE #####
###############################

