## Emotion Recognition Project

2019 - OT.

Given the Karolinska Directed Emotional Faces (KDEF) collection, train a classifier to predict emotion.
Details about the data (KDEF):
70 subjects (35 male, 35 female), all white, between 20 and 30 years of age, each photographed with 7 emotional expressions:
neutral, happy, angry, afraid, disgusted, sad, surprised

Files are stored in the KDEF folder. Image naming codes:
	Example: AF01ANFL.JPG
		Letter 1: Session 
					A = series one
					B = series two
		Letter 2: Gender 
					F = female
					M = male
		Letter 3 & 4: Identity number
					01 - 35
		Letter 5 & 6: Expression
					AF = afraid
					AN = angry
					DI = disgusted
					HA = happy
					NE = neutral
					SA = sad
					SU = surprised
		Letter 7 & 8: Angle
					FL = full left profile
					HL = half left profile
					S = straight (a single letter, e.g. AF01ANS.JPG)
					HR = half right profile
					FR = full right profile
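
As a quick check of the naming code above, here is a minimal sketch that decodes a file name into its fields. The helper name `parse_kdef_name` and the dictionary names are made up for illustration and are not part of KDEF or of this notebook; the sketch assumes the angle is whatever follows the expression code, so the single-letter `S` also works.

```python
# Lookup tables copied from the naming code above
SESSIONS = {"A": "series one", "B": "series two"}
GENDERS = {"F": "female", "M": "male"}
EXPRESSIONS = {
    "AF": "afraid", "AN": "angry", "DI": "disgusted", "HA": "happy",
    "NE": "neutral", "SA": "sad", "SU": "surprised",
}
ANGLES = {
    "FL": "full left profile", "HL": "half left profile", "S": "straight",
    "HR": "half right profile", "FR": "full right profile",
}


def parse_kdef_name(filename):
    """Hypothetical helper: split a KDEF file name into its coded fields."""
    stem = filename.rsplit(".", 1)[0].upper()  # drop the extension
    return {
        "session": SESSIONS[stem[0]],
        "gender": GENDERS[stem[1]],
        "identity": int(stem[2:4]),
        "expression": EXPRESSIONS[stem[4:6]],
        "angle": ANGLES[stem[6:]],  # one or two letters
    }


print(parse_kdef_name("AF01ANFL.JPG"))
```

An unknown code raises a `KeyError`, which is a convenient way to spot files that do not follow the convention.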


In [1]:

import tensorflow as tf
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import os

os.environ['KMP_DUPLICATE_LIB_OK']='True'

Import the libraries needed for a CNN model 

In [2]:
from keras.preprocessing.image import ImageDataGenerator 
from keras.preprocessing.image import save_img
from keras.preprocessing.image import img_to_array
from keras.models import Sequential 
from keras.layers import Conv2D, MaxPooling2D 
from keras.layers import Activation, Dropout, Flatten, Dense 
from keras import backend as K 
from keras.preprocessing.image import load_img

Using TensorFlow backend.


Take the data from KDEF and copy samples into a directory tree split into a train set (290 images per emotion, focusing for now only on afraid and happy) and a validation set (60 images per emotion).

In [3]:
import os
import fnmatch
import shutil

# Filename patterns: session, gender, two-digit identity number, expression code, angle
afraid = r"[A-Z][A-Z][0-9][0-9]AF*.JPG"
happy = r"[A-Z][A-Z][0-9][0-9]HA*.JPG"
pattern = happy  # use `afraid` (and change basedir) to copy the other class
basedir = 'data/train/happy/'
os.makedirs(basedir, exist_ok=True)  # make sure the destination folder exists

# Number of images copied for this class
k = 0
# Scan the series-one folders of the female (AF01-AF35) and male (AM01-AM35) subjects
for gender in ('F', 'M'):
    for i in range(1, 36):
        n = str(i).zfill(2)  # zero-pad the identity number: 1 -> '01'
        basepath = 'KDEF/KDEF_and_AKDEF/KDEF/A' + gender + n + '/'
        with os.scandir(basepath) as entries:
            for entry in entries:
                # Copy every file that shows the selected expression
                if entry.is_file() and fnmatch.fnmatch(entry.name, pattern):
                    old_name = os.path.join(basepath, entry.name)
                    new_name = os.path.join(basedir, entry.name)
                    shutil.copy(old_name, new_name)
                    k += 1
print(k)

350
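
The cell above copies all 350 happy images (70 subjects x 5 angles) into the train folder; the step that sets aside the validation images is not shown in this notebook. One way to arrive at the stated 290/60 counts, sketched below, is to hold out 12 whole subjects (12 x 5 = 60 images) so that no person appears in both sets. Splitting by subject is an assumption here, not necessarily what was done, and `split_by_subject` is a hypothetical helper.

```python
import random


def split_by_subject(filenames, n_val_subjects, seed=0):
    """Hypothetical helper: hold out whole subjects for validation."""
    # Characters 2-4 of a KDEF name (gender + identity number) identify the person
    subjects = sorted({name[1:4] for name in filenames})
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    val_subjects = set(rng.sample(subjects, n_val_subjects))
    train = [f for f in filenames if f[1:4] not in val_subjects]
    val = [f for f in filenames if f[1:4] in val_subjects]
    return train, val


# Synthetic stand-in for the 350 happy images: 70 subjects x 5 angles
angles = ("FL", "HL", "S", "HR", "FR")
names = [f"A{g}{i:02d}HA{a}.JPG" for g in "FM" for i in range(1, 36) for a in angles]

train, val = split_by_subject(names, n_val_subjects=12)
print(len(train), len(val))  # 290 60
```

The returned lists could then be passed to `shutil.copy` or `shutil.move` to populate `data/train/happy/` and `data/validation/happy/`.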


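All the KDEF images are 562 pixels wide by 762 pixels high, so we assign these dimensions.
Note that Keras orders image sizes as (height, width). The cells below pass (img_width, img_height) for both the input shape and the generators' target_size, so the code runs, but each portrait image is resized into a 562-high by 762-wide array. Passing (img_height, img_width) in both places would preserve the aspect ratio.
Image size and input shape: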


In [3]:
img_width = 562
img_height = 762

In [4]:
if K.image_data_format() == 'channels_first': 
    input_shape = (3, img_width, img_height) 
else: 
    input_shape = (img_width, img_height, 3) 

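Assign the training and validation directories, the number of epochs, and the batch size for the CNN.
There are 290 images per emotion in the train set (580 in total) and 60 images per emotion in the validation set (120 in total). Note that the generator output further down reports 640 training images rather than 580, so the folder contents and nb_train_samples should be reconciled before training.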


In [10]:
train_data_dir = 'data/train'
validation_data_dir = 'data/validation'
nb_train_samples = 580
nb_validation_samples = 120
epochs = 20
batch_size = 16
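
These numbers determine how many batches are drawn per epoch when the step counts are later computed with integer division. A small worked example, using only the values from the cell above, shows that floor division leaves a few images unused in each epoch, while rounding up covers all of them:

```python
import math

nb_train_samples = 580
nb_validation_samples = 120
batch_size = 16

# Floor division, as used when fitting the model
train_steps = nb_train_samples // batch_size
val_steps = nb_validation_samples // batch_size
print(train_steps, nb_train_samples - train_steps * batch_size)     # 36 4
print(val_steps, nb_validation_samples - val_steps * batch_size)    # 7 8

# Rounding up instead covers every image, with a smaller final batch
print(math.ceil(nb_train_samples / batch_size))       # 37
print(math.ceil(nb_validation_samples / batch_size))  # 8
```

With floor division, 4 training images and 8 validation images are left out of each pass.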


Define the convolutional network model and add the layers. We'll use the RMSprop optimizer and ReLU activations. Because this is a binary classification, the output layer uses a sigmoid activation function.

In [11]:
model = Sequential()

# Block 1: 2x2 convolution (32 filters) -> ReLU -> 2x2 max pooling
model.add(Conv2D(32, (2, 2), input_shape=input_shape))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

# Block 2: same layout, 32 filters
model.add(Conv2D(32, (2, 2)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

# Block 3: same layout, 64 filters
model.add(Conv2D(64, (2, 2)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

# Classifier head: flatten -> dense(64) -> dropout -> single sigmoid unit
model.add(Flatten())
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(1))
model.add(Activation('sigmoid'))

model.compile(loss='binary_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])
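
To get a feel for the size of this model, the following sketch traces the feature-map dimensions through the three convolution/pooling blocks by hand. It assumes the Keras defaults used above (no padding and stride 1 for the convolutions, pooling stride equal to the pool size) and the (562, 762, 3) channels-last input; the helper names are illustrative only.

```python
def conv_valid(size, kernel):
    """Output length of an unpadded, stride-1 convolution."""
    return size - kernel + 1


def max_pool(size, window):
    """Output length of unpadded pooling with stride equal to the window."""
    return size // window


def trace_shape(rows, cols, filters=(32, 32, 64), kernel=2, window=2):
    """Follow the spatial dimensions through each conv + pool block."""
    for _ in filters:
        rows = max_pool(conv_valid(rows, kernel), window)
        cols = max_pool(conv_valid(cols, kernel), window)
    return rows, cols, filters[-1]


rows, cols, channels = trace_shape(562, 762)
flat = rows * cols * channels
dense_params = flat * 64 + 64  # weights plus biases of Dense(64)

print((rows, cols, channels))  # (69, 94, 64)
print(flat)                    # 415104
print(dense_params)            # 26566720
```

Almost all of the model's parameters sit in the first dense layer, roughly 26.6 million weights for a few hundred training images, which is one reason to consider a smaller target size or more pooling.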

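Define the data generators. The training generator rescales pixel values to [0, 1] and applies random shear, zoom, and horizontal flips; the validation generator only rescales: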

In [12]:
train_datagen = ImageDataGenerator( 
    rescale=1. / 255, 
    shear_range=0.2, 
    zoom_range=0.2, 
    horizontal_flip=True) 
  
test_datagen = ImageDataGenerator(rescale=1. / 255) 

Generators for the training and validation images:

In [13]:
train_generator = train_datagen.flow_from_directory( 
    train_data_dir, 
    target_size=(img_width, img_height), 
    batch_size=batch_size, 
    class_mode='binary') 
  
validation_generator = test_datagen.flow_from_directory( 
    validation_data_dir, 
    target_size=(img_width, img_height), 
    batch_size=batch_size, 
    class_mode='binary') 

Found 640 images belonging to 2 classes.
Found 120 images belonging to 2 classes.
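
With `class_mode='binary'`, `flow_from_directory` assigns the labels 0 and 1 to the class subfolders in alphanumeric order (the actual mapping is available as `train_generator.class_indices`). The sketch below mimics that mapping in plain Python and turns a sigmoid output into a label. The folder name `afraid` and the 0.5 threshold are assumptions for illustration, as only `data/train/happy/` appears in the notebook.

```python
def class_indices(class_names):
    """Mimic the alphanumeric label assignment of flow_from_directory."""
    return {name: index for index, name in enumerate(sorted(class_names))}


def decode(probability, indices, threshold=0.5):
    """Map a sigmoid output to a class name (hypothetical helper)."""
    labels = {index: name for name, index in indices.items()}
    return labels[int(probability >= threshold)]


indices = class_indices(["happy", "afraid"])  # assumed folder names
print(indices)                # {'afraid': 0, 'happy': 1}
print(decode(0.83, indices))  # happy
print(decode(0.12, indices))  # afraid
```

Under these assumptions the sigmoid output reads as the predicted probability of "happy".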


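Let's fit the model and see how it converges within 20 epochs (in recent Keras versions, `fit_generator` is deprecated and `model.fit` accepts generators directly):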

In [14]:
model.fit_generator( 
    train_generator, 
    steps_per_epoch=nb_train_samples // batch_size, 
    epochs=epochs, 
    validation_data=validation_generator, 
    validation_steps=nb_validation_samples // batch_size) 

Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


<keras.callbacks.History at 0xb366eb160>

In [None]:
Not too bad, particularly given the caveats of very few images and all the downsides of using KDEF stimuli. We got an accuracy of 79%.

In [15]:

model.save_weights('model_saved.h5') 
