# Face Emotion Recognition with TensorFlow and the FER-2013 dataset

In this Jupyter notebook, we’ll explore face emotion recognition using [TensorFlow](https://en.wikipedia.org/wiki/TensorFlow), a powerful open-source library for building machine learning models.

We’ll train a model to recognize different emotions, such as happiness, sadness, or anger, from facial expressions in images. To do this, we’ll use [Convolutional Neural Networks (CNNs)](https://www.tensorflow.org/tutorials/images/cnn), a type of deep learning model that is well suited to recognizing patterns and features in images.

CNNs learn relevant details, like facial features, directly from the training images, which is what lets the model predict emotions without hand-crafted features.

<img src="https://scx2.b-cdn.net/gfx/news/hires/2019/threecnnmode.jpg" alt="cnn" width="550">



### Step 1: import the necessary libraries

In [1]:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from tensorflow.keras.preprocessing.image import ImageDataGenerator

### Step 2: build the model architecture

In this step, we define the architecture of our neural network using the ```Sequential``` model from Keras. This model will consist of several layers designed to extract features from input images and make predictions.

In [2]:
model = Sequential([ # 2D convolutional neural network
    Conv2D(64, (3, 3), activation='relu', input_shape=(48, 48, 1)), # 48x48 image with 1 channel
    MaxPooling2D(2, 2), # 2x2 pooling
    Conv2D(128, (3, 3), activation='relu'), # 128 filters
    MaxPooling2D(2, 2), # 2x2 pooling
    Flatten(), # flatten to 1D
    Dense(128, activation='relu'), # 128 neurons
    Dropout(0.5), # 50% dropout
    Dense(7, activation='softmax')  # 7 for 7 emotion classes
])

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy']) # compile model
model.summary() 


Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d (Conv2D)             (None, 46, 46, 64)        640       
                                                                 
 max_pooling2d (MaxPooling2D  (None, 23, 23, 64)       0         
 )                                                               
                                                                 
 conv2d_1 (Conv2D)           (None, 21, 21, 128)       73856     
                                                                 
 max_pooling2d_1 (MaxPooling  (None, 10, 10, 128)      0         
 2D)                                                             
                                                                 
 flatten (Flatten)           (None, 12800)             0         
                                                                 
 dense (Dense)               (None, 128)               1
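The output shapes and parameter counts in the summary can be reproduced by hand. The sketch below assumes the Keras defaults used in the model definition (`'valid'` padding, stride 1 for convolutions, non-overlapping 2x2 pooling); the helper function names are illustrative and not part of Keras.

```python
def conv_output_size(size, kernel):
    """Spatial size after a 'valid' convolution with stride 1."""
    return size - kernel + 1


def pool_output_size(size, pool):
    """Spatial size after non-overlapping max pooling (rounds down)."""
    return size // pool


def conv_params(kernel, in_channels, filters):
    """Kernel weights plus one bias per filter."""
    return (kernel * kernel * in_channels + 1) * filters


size = 48                                  # input is 48x48x1
size = conv_output_size(size, 3)           # 46 -> (None, 46, 46, 64)
conv1_params = conv_params(3, 1, 64)       # 640
size = pool_output_size(size, 2)           # 23 -> (None, 23, 23, 64)
size = conv_output_size(size, 3)           # 21 -> (None, 21, 21, 128)
conv2_params = conv_params(3, 64, 128)     # 73856
size = pool_output_size(size, 2)           # 10 -> (None, 10, 10, 128)
flat_size = size * size * 128              # 12800 values after Flatten

print(conv1_params, conv2_params, flat_size)
```

Each convolution trims one pixel from every border because no padding is applied, and pooling rounds 21 / 2 down to 10, which is why the flattened vector has 10 x 10 x 128 entries.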

- **Conv2D Layers**: These convolutional layers (with 64 and 128 filters) are responsible for detecting features such as edges and textures in the images.
- **MaxPooling2D Layers**: These layers reduce the spatial dimensions of the image, helping to decrease computation and focus on the most important features.
- **Flatten Layer**: This layer converts the 2D features into a 1D vector, which can be processed by fully connected layers.
- **Dense Layers**: These fully connected layers are where the final classification happens. The last dense layer outputs 7 values, each representing one of the 7 emotion classes (like happiness, sadness, etc.).
- **Dropout Layer**: This layer helps prevent overfitting by randomly setting a fraction of the previous layer's outputs (here 50%) to zero during training. It is inactive at prediction time.

The model is compiled using the Adam optimizer and categorical crossentropy loss function, as it’s a multi-class classification problem.
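To make the loss concrete, here is a simplified pure-Python sketch of what the softmax output and categorical crossentropy compute for a single image. It is an illustration of the formulas, not the Keras implementation; the all-zero scores and the one-hot target are hypothetical inputs chosen for the example.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_crossentropy(one_hot, probs, eps=1e-7):
    """Negative log-probability assigned to the true class."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(one_hot, probs))


scores = [0.0] * 7                  # hypothetical: the model has no preference
probs = softmax(scores)             # every class gets probability 1/7
target = [0, 0, 0, 1, 0, 0, 0]      # hypothetical one-hot label
loss = categorical_crossentropy(target, probs)

print(round(loss, 4))
```

A model that guesses uniformly over 7 classes has a loss of ln 7, roughly 1.946, so a training loss near that value suggests the network has not yet learned anything useful.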


### Step 3: prepare the data for training

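In this step, we use the ```ImageDataGenerator``` from Keras to load and preprocess the training and validation data. As configured here, it only rescales pixel values and splits off a validation subset; no augmentation transforms (such as rotations or flips) are applied. Holding out validation data lets us check how well the model generalizes.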

In [3]:
datagen = ImageDataGenerator(rescale=1./255, validation_split=0.2) # normalize pixel values to [0,1] and split data into training and validation sets

train_generator = datagen.flow_from_directory( # load training data
    'fer_2013/train',  
    target_size=(48, 48), # 48x48 images
    color_mode='grayscale', # grayscale images
    class_mode='categorical', # 7 classes
    subset='training', # training set
    batch_size=64 # 64 images per batch
)

validation_generator = datagen.flow_from_directory( # load validation data
    'fer_2013/test',
    target_size=(48, 48),
    color_mode='grayscale',
    class_mode='categorical',
    subset='validation',
    batch_size=64
)


Found 22968 images belonging to 7 classes.
Found 1432 images belonging to 7 classes.


- **Normalization**: ```ImageDataGenerator``` rescales the pixel values by dividing them by 255, which maps them into the range [0, 1].
- **Training and Validation Split**: ```validation_split=0.2``` reserves **20%** of each directory as a validation subset. As written, training uses the remaining 80% of ```'fer_2013/train'``` (22968 images), while validation uses the 20% subset of ```'fer_2013/test'``` (1432 images).
- **Flow from Directory**: the ```flow_from_directory``` method loads the images from the specified directories (```'fer_2013/train'``` and ```'fer_2013/test'```) and resizes them to *48x48* pixels in grayscale mode, which matches the input shape of the model.

These generators feed batches of images into the model during training and validation, so the model learns from one set of images and is evaluated on another.
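As a quick sanity check, the number of batches each generator yields per epoch follows from the image counts reported above and ```batch_size=64```. This is a small arithmetic sketch; the helper names are illustrative, and it assumes the final partial batch is kept, which is how Keras directory iterators behave by default.

```python
import math


def steps_per_epoch(num_images, batch_size):
    """Number of batches needed to see every image once."""
    return math.ceil(num_images / batch_size)


def last_batch_size(num_images, batch_size):
    """Size of the final, possibly partial, batch."""
    remainder = num_images % batch_size
    return remainder if remainder else batch_size


train_steps = steps_per_epoch(22968, 64)   # 359
val_steps = steps_per_epoch(1432, 64)      # 23
train_last = last_batch_size(22968, 64)    # 56
val_last = last_batch_size(1432, 64)       # 24

print(train_steps, val_steps, train_last, val_last)
```

These are the step counts you should expect to see in the progress bar during training.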

### Step 4: train and save the model

In this step, we train the model using the ```fit``` method with the training and validation data generators we set up earlier. The model will learn to recognize facial emotions by processing the images over several epochs.

In [4]:
history = model.fit( 
    train_generator, 
    validation_data=validation_generator, 
    epochs=3 
)

# save the model
model.save('emotion_recognition_model.h5')


Epoch 1/3
Epoch 2/3
Epoch 3/3


- **Training**: The model is trained for *3* epochs, meaning it will go through the entire training data three times. During each epoch, the model will adjust its weights to minimize the loss and improve accuracy.
- **Validation**: The validation data is used to evaluate the model's performance at the end of each epoch, helping us monitor how well the model generalizes to unseen data.

Finally, we save the trained model as an ```.h5``` file, so it can be reused or deployed later.
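The object returned by ```fit``` keeps the per-epoch metrics in its ```history``` dictionary. Below is a minimal sketch of picking out the best epoch from that dictionary; the helper ```best_epoch``` is hypothetical, and the key name ```'val_accuracy'``` assumes TensorFlow 2.x with ```metrics=['accuracy']``` as in the compile call.

```python
def best_epoch(history_dict, metric="val_accuracy"):
    """Return (1-based epoch number, value) of the highest metric value."""
    values = history_dict[metric]
    best_index = max(range(len(values)), key=values.__getitem__)
    return best_index + 1, values[best_index]


# Usage after training (assumes `history` from the model.fit call above):
# epoch, value = best_epoch(history.history)
# print(f"best validation accuracy {value:.3f} at epoch {epoch}")
```

With only 3 epochs the answer is easy to read off by eye, but the same helper is handy if you increase ```epochs``` and want to see where validation accuracy peaks.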

### Step 5: live demo

In this step, we use the trained model to recognize emotions from faces detected in real-time via a webcam feed.

In [5]:
import cv2
import numpy as np

# load pre-trained model and cascade classifier
model = tf.keras.models.load_model('emotion_recognition_model.h5')
face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')

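# emotion labels, ordered to match the class indices assigned by flow_from_directory
# (alphabetical by folder name; assumes folders named angry, disgust, fear, happy,
# neutral, sad, surprise -- check train_generator.class_indices to confirm)
emotion_labels = ['Angry', 'Disgust', 'Fear', 'Happy', 'Neutral', 'Sad', 'Surprise']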

# start video capture
cap = cv2.VideoCapture(0)

while True:
    ret, frame = cap.read()
    if not ret:
        break
    
    # convert to grayscale
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    
    # detect faces
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5, minSize=(30, 30))
    
    for (x, y, w, h) in faces:
        roi_gray = gray[y:y+h, x:x+w]
        roi_gray = cv2.resize(roi_gray, (48, 48))
        roi_gray = roi_gray / 255.0
        roi_gray = np.expand_dims(roi_gray, axis=0)
        roi_gray = np.expand_dims(roi_gray, axis=-1)
        
        # predict emotion (verbose=0 suppresses the progress bar printed on every call)
        prediction = model.predict(roi_gray, verbose=0)
        emotion = emotion_labels[np.argmax(prediction)]
        
        # draw rectangle and text
        cv2.rectangle(frame, (x, y), (x+w, y+h), (255, 0, 0), 2)
        cv2.putText(frame, emotion, (x, y-10), cv2.FONT_HERSHEY_SIMPLEX, 0.9, (255, 0, 0), 2)
    
    cv2.imshow('Emotion Recognition', frame)
    
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()




- **Load the Model**: the pre-trained emotion recognition model is loaded from the saved ```.h5``` file.
- **Load Haar Cascade**: we also load the Haar Cascade classifier for detecting faces in the video feed.
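- **Emotion Labels**: a list of possible emotion labels (*Angry, Disgust, Fear, Happy, Sad, Surprise, Neutral*) is defined for the output prediction. The order of this list must match the class indices the model was trained with, otherwise predictions will be shown under the wrong name.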
- **Video Capture**: we capture video from the webcam and continuously process the frames.
    - **Face Detection**: each frame is converted to grayscale, and faces are detected using the Haar Cascade classifier.
    - **Emotion Prediction**: the detected face regions are resized, normalized, and passed into the trained model for emotion prediction.
    - **Draw Rectangle and Label**: a rectangle is drawn around the detected face, and the predicted emotion is displayed on the screen.

This process continues in real-time until you press ```'q'``` to exit the webcam feed.
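Because the predicted index is only meaningful relative to the class order used during training, it can be safer to derive the label list from the generator than to hard-code it. Here is a minimal sketch, assuming ```train_generator``` from Step 3 is still available; the helper name is hypothetical, and the folder names in the example dictionary are stand-ins for whatever ```train_generator.class_indices``` actually contains.

```python
def labels_from_class_indices(class_indices):
    """Return label names ordered by their integer class index."""
    ordered = sorted(class_indices.items(), key=lambda item: item[1])
    return [name.capitalize() for name, _ in ordered]


# Hypothetical mapping; in the notebook use train_generator.class_indices instead.
example_indices = {
    "angry": 0, "disgust": 1, "fear": 2, "happy": 3,
    "neutral": 4, "sad": 5, "surprise": 6,
}
emotion_labels = labels_from_class_indices(example_indices)
print(emotion_labels)
```

```flow_from_directory``` assigns indices in alphanumeric order of the subfolder names, so building the list this way keeps ```emotion_labels[np.argmax(prediction)]``` aligned with the model's outputs.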