# Video Classification Project
By Ashley Chacon

This Google Colab notebook is inspired by a [Video CNN Model YouTube tutorial by Data Magic](https://www.youtube.com/watch?v=UHdrxHPRBng&ab_channel=DataMagic%28bySunnyKusawa%29). The dataset used for training and testing is the [FER-2013 dataset](https://www.kaggle.com/datasets/ananthu017/emotion-detection-fer) found on [Kaggle.com](https://kaggle.com). This system creates an emotion classifier using a `tf.keras.Sequential` model built from 2D convolutional layers.

# Import Keras, OpenCV, and other libraries

In [24]:
import cv2
import numpy as np
import PIL
from keras.models import Sequential, model_from_json
from keras.layers import Conv2D, MaxPooling2D, Dense, Dropout, Flatten
from keras.preprocessing.image import ImageDataGenerator
from google.colab.patches import cv2_imshow

This notebook runs from my own Google Drive. To run the project yourself, download the dataset [here](https://www.kaggle.com/datasets/ananthu017/emotion-detection-fer), upload it to this Colab, and set `reextract_dataset` to `True` so that the code runs properly.

In [25]:
%cd /content/drive/MyDrive/Research/VideoEmotionRecog/

/content/drive/MyDrive/Research/VideoEmotionRecog


Before creating a model to classify emotion in video, we must first train a model to classify emotion in images. We use the uploaded dataset to do so.

In [26]:
import pathlib
import zipfile

reextract_dataset = False

if reextract_dataset:
  with zipfile.ZipFile('images-classification.zip', 'r') as zip_ref:
    zip_ref.extractall()

# Rescale data and assign training and validation variables

The pixel values are in the `[0, 255]` range (the images are loaded as single-channel grayscale). This is not ideal for a neural network; in general you should seek to make your input values small, so the generators rescale them to `[0, 1]`. The data is then loaded into its designated split, training or validation.

In [27]:
# Initialize image data generator with rescaling
train_data_gen = ImageDataGenerator(rescale=1./255)
validation_data_gen = ImageDataGenerator(rescale=1./255)

# Preprocess all train images
train_generator = train_data_gen.flow_from_directory(
        './train',
        target_size=(48, 48),
        batch_size=64,
        color_mode="grayscale",
        class_mode='categorical')

# Preprocess all test images (used for validation)
validation_generator = validation_data_gen.flow_from_directory(
        './test',
        target_size=(48, 48),
        batch_size=64,
        color_mode="grayscale",
        class_mode='categorical')


Found 3658 images belonging to 7 classes.
Found 7178 images belonging to 7 classes.

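The rescaling step can be illustrated on its own. The following is a minimal sketch, assuming a small made-up grayscale patch (`patch` is illustrative data, not taken from the dataset); it applies the same multiplication by `1/255` that the `rescale` argument requests.

```python
import numpy as np

# Hypothetical 2x2 grayscale patch with 8-bit pixel values (illustrative data only)
patch = np.array([[0, 64], [128, 255]], dtype=np.uint8)

# Multiply by 1/255, as the `rescale=1./255` argument does for each loaded image
scaled = patch.astype(np.float32) * (1.0 / 255)

# All values now lie in the [0, 1] range
print(scaled.min(), scaled.max())
```
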

# Implemented Sequential model with 2D Convolutional Layers

The following neural network comes from [Data Magic's tutorial](https://www.youtube.com/watch?v=UHdrxHPRBng&ab_channel=DataMagic%28bySunnyKusawa%29), which explains it step by step.

In [28]:
emotion_model = Sequential()

emotion_model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(48, 48, 1)))
emotion_model.add(Conv2D(64, kernel_size=(3, 3), activation='relu'))
emotion_model.add(MaxPooling2D(pool_size=(2, 2)))
emotion_model.add(Dropout(0.25))

emotion_model.add(Conv2D(128, kernel_size=(3, 3), activation='relu'))
emotion_model.add(MaxPooling2D(pool_size=(2, 2)))
emotion_model.add(Conv2D(128, kernel_size=(3, 3), activation='relu'))
emotion_model.add(MaxPooling2D(pool_size=(2, 2)))
emotion_model.add(Dropout(0.25))

emotion_model.add(Flatten())
emotion_model.add(Dense(1024, activation='relu'))
emotion_model.add(Dropout(0.5))
emotion_model.add(Dense(7, activation='softmax'))

cv2.ocl.setUseOpenCL(False)

emotion_model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

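To see how the `48x48` input shrinks through the network, the spatial size can be traced by hand. This is a rough sketch that assumes the Keras defaults the cell above relies on (`'valid'` padding and stride 1 for `Conv2D`, stride equal to the pool size for `MaxPooling2D`); the helper names `conv_out` and `pool_out` are made up for illustration.

```python
def conv_out(size, kernel=3):
    """Output width of a 'valid' convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Output width of max pooling with stride equal to the pool size."""
    return size // pool

size = 48
size = conv_out(size)   # Conv2D(32)   -> 46
size = conv_out(size)   # Conv2D(64)   -> 44
size = pool_out(size)   # MaxPooling2D -> 22
size = conv_out(size)   # Conv2D(128)  -> 20
size = pool_out(size)   # MaxPooling2D -> 10
size = conv_out(size)   # Conv2D(128)  -> 8
size = pool_out(size)   # MaxPooling2D -> 4

# Dropout layers do not change the shape; Flatten sees 4 * 4 * 128 values
flattened = size * size * 128
print(size, flattened)  # 4 2048
```

Under these assumptions, `Flatten` passes 2048 values to `Dense(1024)`.
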

Train the model for 50 epochs, then save its architecture to a JSON file and its weights to an `.h5` file. Training only runs when `retrain_model` is set to `True`.

In [29]:
retrain_model = False

if retrain_model:
  emotion_model_info = emotion_model.fit_generator(
          train_generator,
          steps_per_epoch=28709 // 64,
          epochs=50,
          validation_data=validation_generator,
          validation_steps=7178 // 64)

  # save model structure in JSON file
  model_json = emotion_model.to_json()
  with open("emotion_model.json", "w") as json_file:
      json_file.write(model_json)

  # save trained model weight in .h5 file
  emotion_model.save_weights('emotion_model.h5')

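The step counts passed to training are whole batches per pass over the data. As a small worked example, assuming the batch size of 64 used above (the helper `steps_for` is hypothetical):

```python
def steps_for(num_samples, batch_size=64):
    """Number of full batches in one pass over the data (floor division)."""
    return num_samples // batch_size

print(steps_for(28709))  # 448, the steps_per_epoch value used above
print(steps_for(7178))   # 112, the validation_steps value used above
print(steps_for(3658))   # 57, if only the 3658 training images reported earlier are present
```

If the training folder holds a different number of images than the hard-coded 28709, as the "Found 3658 images" output suggests may be the case here, it may be safer to derive the count from the generator itself; Keras directory iterators expose it as the `samples` attribute.
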

Now, create `emotion_dict` to map the model's output indices to the emotions the video model can choose from. Then load your trained image model and its weights.

In [30]:
emotion_dict = {0: "Angry", 1: "Disgusted", 2: "Fearful", 3: "Happy", 4: "Neutral", 5: "Sad", 6: "Surprised"}

# load json and create model
with open('./emotion_model.json', 'r') as json_file:
    loaded_model_json = json_file.read()
emotion_model = model_from_json(loaded_model_json)

# load weights into new model
emotion_model.load_weights("./emotion_model.h5")
print("Loaded model from disk")


Loaded model from disk

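The order of the labels in `emotion_dict` matters: `flow_from_directory` assigns class indices to the subfolders in sorted (alphanumeric) order. The sketch below assumes the folder names listed in `class_folders`, which are a guess at the dataset layout and should be checked against your extracted `train` directory.

```python
# Assumed folder names under ./train (hypothetical; check your own extracted dataset)
class_folders = ["sad", "happy", "angry", "neutral", "fearful", "surprised", "disgusted"]

# Indices follow the sorted order of the folder names
class_indices = {name: i for i, name in enumerate(sorted(class_folders))}

# Invert the mapping to get index -> label, the form emotion_dict uses
index_to_label = {i: name.capitalize() for name, i in class_indices.items()}
print(index_to_label)
```

On a real run, `train_generator.class_indices` holds the actual mapping and can be compared against `emotion_dict`.
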

Here, you have the option to use your own webcam so that the model predicts the expression on your face. This notebook instead uses a [sample video](https://www.videezy.com/free-video/mp4-videos).

In [37]:
# start the webcam feed
#cap = cv2.VideoCapture(0)

# pass here your video path
# you may download one from here : https://www.pexels.com/video/three-girls-laughing-5273028/
# the following video is being tested in this program : https://www.videezy.com/people/8445-dark-haired-girl-in-disbelief-1
cap = cv2.VideoCapture("./Sample-Vid02.mp4")

# Using a Cascade Classifier to find face and predict emotion it displays

The following code converts each video frame to grayscale, since the image model is trained on grayscale images. A Cascade Classifier then finds the faces in the frame, and the image model predicts the emotion each person is displaying.

In [38]:
label_px_offset = 50
rect_color = (0, 0, 255)  # BGR
rect_thick = 2

rect_px_offset = 10

adjusted_size = (48,48)

x_coord_adjust = 5
y_coord_adjust = 10
font_scale = 1
text_color = (255, 255, 0) # BGR
text_thick = 2

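The constants above control where the bounding box and the label are drawn relative to a detected face. Here is a minimal sketch of that geometry, assuming a made-up detection box; the function `box_and_label_points` is hypothetical and only mirrors the arithmetic used by the drawing calls in this notebook.

```python
label_px_offset = 50
rect_px_offset = 10
x_coord_adjust = 5
y_coord_adjust = 10

def box_and_label_points(x, y, w, h):
    """Return rectangle corners and text origin for one detected face box."""
    top_left = (x, y - label_px_offset)             # extend upward to leave room for the label
    bottom_right = (x + w, y + h + rect_px_offset)  # extend slightly below the face
    text_origin = (x + x_coord_adjust, y - y_coord_adjust)  # bottom-left corner of the text
    return top_left, bottom_right, text_origin

# Hypothetical detection: face at (200, 150), 100 px wide and 100 px tall
print(box_and_label_points(200, 150, 100, 100))
# ((200, 100), (300, 260), (205, 140))
```
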
Download this project to your machine to see the results. Displaying the results makes the file very large, so I am not able to upload the project with its outputs to GitHub.

In [None]:
# Load the Haar cascade once, outside the loop
face_detector = cv2.CascadeClassifier('./haarcascade_frontalface_default.xml')

for _ in range(90):
    # Read the next frame; stop when the video ends or the read fails
    ret, frame = cap.read()
    if not ret:
        break
    frame = cv2.resize(frame, (1280, 720))
    gray_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # detect faces available on camera
    faces = face_detector.detectMultiScale(gray_frame, scaleFactor=1.3, minNeighbors=5)

    # take each face available on the camera and Preprocess it
    for (x, y, w, h) in faces:
        cv2.rectangle(frame, (x, y-label_px_offset), (x+w, y+h+rect_px_offset), rect_color, rect_thick)
        roi_gray_frame = gray_frame[y:y + h, x:x + w]
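        # Resize to 48x48, add channel and batch axes, and scale to [0, 1] to match the training rescale
        cropped_img = np.expand_dims(np.expand_dims(cv2.resize(roi_gray_frame, adjusted_size), -1), 0) / 255.0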

        # predict the emotions
        emotion_prediction = emotion_model.predict(cropped_img)
        maxindex = int(np.argmax(emotion_prediction))
        cv2.putText(frame, emotion_dict[maxindex], (x+x_coord_adjust, y-y_coord_adjust), 
                    cv2.FONT_HERSHEY_SIMPLEX, font_scale, text_color, text_thick, cv2.LINE_AA)

    cv2.waitKey(0)
    cv2_imshow(frame)

cap.release()

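The reshaping and label lookup inside the loop can be tried without a video or a trained model. This is a small sketch under stated assumptions: `face_48` stands in for a face crop already resized to `48x48`, and `emotion_prediction` is a made-up softmax vector, not a real model output.

```python
import numpy as np

emotion_dict = {0: "Angry", 1: "Disgusted", 2: "Fearful", 3: "Happy",
                4: "Neutral", 5: "Sad", 6: "Surprised"}

# Stand-in for a grayscale face crop already resized to 48x48 (illustrative only)
face_48 = np.zeros((48, 48), dtype=np.uint8)

# Add a trailing channel axis and a leading batch axis: (48, 48) -> (1, 48, 48, 1)
model_input = np.expand_dims(np.expand_dims(face_48, -1), 0)
print(model_input.shape)

# Hypothetical softmax output for one face (made-up numbers)
emotion_prediction = np.array([[0.05, 0.01, 0.04, 0.70, 0.10, 0.06, 0.04]])

# The index of the largest probability selects the label
maxindex = int(np.argmax(emotion_prediction))
print(emotion_dict[maxindex])
```

The resulting shape matches the `input_shape=(48, 48, 1)` of the first `Conv2D` layer plus a batch dimension of 1.
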