# **Facial Expression Recognition**
## **Class Activation Maps**
### Alejandro Alemany, Sara Manrriquez, and Benjamin Zaretzky

Class activation maps visualize which pixels of an image contribute most to the output of a model. In this notebook we use class activation maps to explore how facial features affect image classification. Specifically, we use gradient-weighted class activation mapping, also known as Grad-CAM. Grad-CAM is class-specific: it creates a heatmap from the input image, the trained CNN, and the class of interest.
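
For reference, the standard Grad-CAM formulation is commonly written as below. The notation is introduced here for illustration only: $A^k$ is the $k$-th feature map of the chosen convolutional layer, $y^c$ is the model's score for class $c$, and $Z$ is the number of spatial positions in the feature map.

$$
\alpha_k^c = \frac{1}{Z} \sum_{i} \sum_{j} \frac{\partial y^c}{\partial A_{ij}^k},
\qquad
L^c_{\text{Grad-CAM}} = \mathrm{ReLU}\!\left( \sum_k \alpha_k^c A^k \right)
$$

The `compute_heatmap` function defined later in this notebook departs from this form in two ways: it masks the gradients before averaging them, and it does not apply the final ReLU.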

## Import Packages

We import all necessary packages.

In [None]:
import os
os.environ["KMP_SETTINGS"] = "false"
import cv2
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import tensorflow as tf

from scipy import stats
from sklearn.ensemble import VotingClassifier
from tensorflow import keras
from tensorflow.keras.models import Sequential, load_model
from tqdm import tqdm 

## Load Data

We load the full data set and isolate the test set. 

In [None]:
# Read in full data set
data = pd.read_csv('../input/challenges-in-representation-learning-facial-expression-recognition-challenge/icml_face_data.csv')
data.columns = ['emotion', 'Usage', 'pixels']
print(data.shape)

In [None]:
# View first five rows
data.head()

In [None]:
# Select only rows that are in the public or private test set
# .copy() avoids a SettingWithCopyWarning when the 'pixels' column is overwritten later
test = data.loc[data["Usage"] != 'Training', ['emotion', 'pixels']].copy()
test.head()

## Preprocess Data

The images in this data set are stored as strings of space-separated pixel values. To visualize the images, we parse each string into a 48x48 array and stack the results into a 4D array of shape (samples, 48, 48, 1).
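
As a rough illustration of this step, the sketch below parses a single made-up pixel string into a 4D array and scales it to [0, 1]. The string `pixel_string` and its 2x2 size are assumptions chosen to keep the example small; the real images are 48x48.

```python
import numpy as np

# Hypothetical 2x2 "image" stored the same way as the data set's pixel strings
pixel_string = "0 64 128 255"

# Parse the space-separated values, then reshape to
# (batch, height, width, channels)
image = np.fromstring(pixel_string, dtype=int, sep=' ').reshape(-1, 2, 2, 1)

# Scale the pixel values from [0, 255] to [0, 1]
scaled = image / 255

print(image.shape)
print(scaled[0, :, :, 0])
```

The leading `-1` in `reshape` lets NumPy infer the batch dimension, so each parsed image becomes a batch of one that can later be concatenated with the others.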

In [None]:
# Reshape the pixels
test['pixels'] = [np.fromstring(x, dtype=int, sep=' ').reshape(-1,48,48,1) for x in test['pixels']]

In [None]:
# Combine pixels into single array
pixels = np.concatenate(test['pixels'].values)
print(pixels.shape)
# Separate emotion values
labels = test.emotion.values
print(labels.shape)

In [None]:
# Scale the pixel values to the range [0, 1]
pixels = pixels / 255

## Load Model

We load our pretrained model from week 3. 

In [None]:
# Week 3 model
wk3_model1 = load_model('../input/dsci-598-fa21/team_01_model_05.h5')

## Week 3 Model Predictions

We predict the labels for the test set.
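
To make the step from probabilities to labels concrete, here is a minimal sketch on made-up values. The array `toy_probs` is an assumption standing in for model output; each row holds one image's probabilities over the seven emotion classes.

```python
import numpy as np

# Made-up class probabilities for two images (7 classes each)
toy_probs = np.array([
    [0.05, 0.05, 0.10, 0.60, 0.10, 0.05, 0.05],
    [0.40, 0.10, 0.10, 0.10, 0.20, 0.05, 0.05],
])

# The predicted class is the index of the largest probability in each row
toy_pred = np.argmax(toy_probs, axis=1)

# The largest probability itself serves as a confidence score
toy_conf = toy_probs.max(axis=1)

print(toy_pred)
print(toy_conf)
```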

In [None]:
# Compute the probabilities
wk3_test_probs1 = wk3_model1.predict(pixels)
# Compute the prediction
wk3_test_pred1 = np.argmax(wk3_test_probs1, axis=1)

# Add the prediction to the data
test['wk3_predictions'] = wk3_test_pred1
test.head()

## Heatmap Functions

We create two functions to display the heatmaps. The first builds a "gradient model" from our CNN that returns both the activations of the last 4D (convolutional) layer and the predictions. The second computes the heatmap for an image and returns it as a NumPy array.

In [None]:
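def create_grad_model(model):
    # Find the last layer with a 4D output (batch, height, width, channels)
    for layer in reversed(model.layers):
        if len(layer.output_shape) == 4:
            last_conv_layer = layer.name
            break

    # Model that returns both that layer's activations and the predictions
    grad_model = tf.keras.models.Model(
        inputs=model.inputs,
        outputs=[model.get_layer(last_conv_layer).output, model.output])

    return grad_model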

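def compute_heatmap(image, class_ix, grad_model):
    # Record the forward pass so the class score can be differentiated
    # with respect to the selected layer's activations
    with tf.GradientTape() as tape:
        inputs = tf.cast(image, tf.float32)
        (conv_outputs, predictions) = grad_model(inputs)
        loss = predictions[:, class_ix]
    grads = tape.gradient(loss, conv_outputs)

    # Guided gradients: keep a gradient only where both the activation
    # and the gradient itself are positive
    cast_conv_outputs = tf.cast(conv_outputs > 0, "float32")
    cast_grads = tf.cast(grads > 0, "float32")
    guided_grads = cast_conv_outputs * cast_grads * grads

    # Drop the batch dimension (a single image is expected)
    conv_outputs = conv_outputs[0]
    guided_grads = guided_grads[0]

    # One weight per channel: average the guided gradients over height and width
    weights = tf.reduce_mean(guided_grads, axis=(0, 1))

    # Weighted sum of the feature maps gives the coarse activation map
    cam = tf.reduce_sum(tf.multiply(weights, conv_outputs), axis=-1)

    # Resize the map to the input image size (cv2.resize takes (width, height))
    (w, h) = (image.shape[2], image.shape[1])
    heatmap = cv2.resize(cam.numpy(), (w, h))

    return heatmap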
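
The masking and weighting arithmetic inside `compute_heatmap` can be traced by hand on a tiny example. The sketch below reproduces those steps in plain NumPy. The arrays `toy_conv` and `toy_grads` are made-up stand-ins for one image's feature maps and gradients (shape: height, width, channels), not values from the model.

```python
import numpy as np

# Made-up 2x2 feature maps with 2 channels, and matching gradients
toy_conv = np.array([[[1.0, -1.0], [2.0, 0.5]],
                     [[0.0,  3.0], [1.0, 1.0]]])
toy_grads = np.array([[[0.2,  0.4], [-0.1, 0.3]],
                      [[0.5, -0.2], [ 0.1, 0.0]]])

# Keep a gradient only where both the activation and the gradient are positive
mask = (toy_conv > 0) & (toy_grads > 0)
toy_guided = np.where(mask, toy_grads, 0.0)

# One weight per channel: average the guided gradients over height and width
toy_weights = toy_guided.mean(axis=(0, 1))

# Weighted sum over channels gives the coarse activation map
toy_cam = (toy_conv * toy_weights).sum(axis=-1)

print(toy_weights)
print(toy_cam)
```

Both channels end up with the same weight here (0.075), so the map is simply a scaled sum of the two feature maps. With real gradients the weights differ per channel, which is what makes the map class-specific.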

## Sample of Images

We view a sample of the images within the test set. 

In [None]:
# Label values
emotions = {0: 'Angry', 1: 'Disgust', 2: 'Fear', 3: 'Happy', 4: 'Sad', 5: 'Surprise', 6: 'Neutral'}

In [None]:
# Plot sample images of each emotion
plt.close()
plt.rcParams["figure.figsize"] = [16,16]

row = 0
for emotion in np.unique(labels):

    all_emotion_images = test[test['emotion'] == emotion]
    for i in range(5):
        
        img = all_emotion_images.iloc[i,].pixels.reshape(48,48)
        lab = emotions[emotion]

        plt.subplot(7,5,row+i+1)
        plt.imshow(img, cmap='binary_r')
        plt.text(-30, 5, s = str(lab), fontsize=10, color='b')
        plt.axis('off')
    row += 5

plt.show()

## Week 3 Model with Grad-CAM

### Heatmap of Predictions

We plot the first 36 test images with their heatmaps, predicted labels, and predicted probabilities.

In [None]:
# Plot images with heatmap and predicted label
gm = create_grad_model(wk3_model1)

plt.figure(figsize=[16,16])

for i in range(36):
    img = pixels[i,:,:,0]
    p_dist = wk3_model1.predict(img.reshape(1, 48, 48, 1))
    k = np.argmax(p_dist)
    p = np.max(p_dist)

    # Compute the heatmap for the predicted class k (not a fixed class index)
    heatmap = compute_heatmap(img.reshape(1, 48, 48, 1), k, gm)

    plt.subplot(6, 6, i+1)
    plt.imshow(img, alpha=0.8, cmap='binary_r')
    plt.imshow(heatmap, alpha=0.6, cmap='coolwarm')
    plt.title(f'{emotions[k]} ({p:.4f})')
    plt.axis('off')
    
plt.tight_layout()
plt.show()

The heatmaps above suggest that the eye, cheek, and mouth regions play an important role in predicting facial expressions.

### Heatmap with Distribution of Predictions

We plot one image for each class alongside its heatmap and the distribution of prediction probabilities.

In [None]:
# Create a list of images
sel_imgs = []
for i in range(0,7):
    index = labels.tolist().index(i)
    sel_imgs.append(index)
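
An alternative way to find the first image of each class, shown here only as a sketch on made-up labels, is `np.unique` with `return_index=True`, which returns the index of the first occurrence of each distinct value. The array `toy_labels` is an assumption for illustration.

```python
import numpy as np

# Made-up label array; the notebook's `labels` plays this role
toy_labels = np.array([3, 0, 3, 1, 0, 2])

# classes: the sorted distinct labels
# first_ix: the position where each label first appears
classes, first_ix = np.unique(toy_labels, return_index=True)

print(classes)
print(first_ix)
```

Unlike `list.index`, this does not raise an error when a class is absent; the class is simply missing from the result.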

In [None]:
# Plot each image label with prediction probabilities
for n in sel_imgs:
    img = pixels[n,:,:,0]
    p_dist = wk3_model1.predict(img.reshape(1, 48, 48, 1))
    k = np.argmax(p_dist)
    
    plt.figure(figsize=[10,3])
    plt.subplot(1, 3, 1)
    plt.imshow(img, cmap='binary_r')
    plt.title(f'True Label: {emotions[labels[n]]}')
    plt.axis('off')
    
    # Compute the heatmap for the predicted class k (not a fixed class index)
    heatmap = compute_heatmap(img.reshape(1, 48, 48, 1), k, gm)
    
    plt.subplot(1, 3, 2)
    plt.imshow(img, alpha=0.8, cmap='binary_r')
    plt.imshow(heatmap, alpha=0.6, cmap='coolwarm')
    plt.title(f'Predicted Label: {emotions[k]}')
    plt.axis('off')
    
    plt.subplot(1, 3, 3)
    plt.bar(emotions.values(), wk3_test_probs1[n, :], color='dodgerblue', edgecolor='k')
    plt.xticks(rotation=45)
    plt.ylim([0,1])
    plt.title('Distribution of Predictions')
    plt.show()

From the plots above, we can see that this model predicts the emotion happy with high confidence; however, it is less confident when predicting other emotions. For example, the model struggles to distinguish between angry and sad.
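
One way to check a claim like this across many images, rather than one image per class, would be a confusion matrix. The sketch below builds one in NumPy from made-up labels. The arrays `toy_true` and `toy_pred_labels` are assumptions for illustration and are not results from this model.

```python
import numpy as np

# Made-up true and predicted labels (0 = Angry, 3 = Happy, 4 = Sad)
toy_true = np.array([0, 0, 0, 4, 4, 4, 3, 3])
toy_pred_labels = np.array([0, 4, 4, 4, 0, 4, 3, 3])

# Rows are true classes, columns are predicted classes
n_classes = 7
confusion = np.zeros((n_classes, n_classes), dtype=int)

# np.add.at accumulates correctly even when an index pair repeats
np.add.at(confusion, (toy_true, toy_pred_labels), 1)

print(confusion)
```

Large off-diagonal counts, such as the Angry row and Sad column in this toy data, indicate pairs of classes that are frequently confused.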

## Resources

[Class Activation Maps in Deep Learning](https://valentinaalto.medium.com/class-activation-maps-in-deep-learning-14101e2ec7e1)<br/>
[Grad-CAM: Visual Explanations from Deep Networks](https://glassboxmedicine.com/2020/05/29/grad-cam-visual-explanations-from-deep-networks/)<br/>
[Class Activation Maps (Cactus)](https://www.kaggle.com/drbeane/class-activation-maps-cactus#Heatmap-Functions)<br/>
[Facial Expression Recognition with CNN & Grad-CAM](https://www.kaggle.com/baotramduong/facial-expression-recognition-with-cnn-grad-cam)