# Video and Image Emotion Annotation

This script detects faces in videos and images and annotates each face with its recognized emotion. It uses two deep learning models: RetinaFace for face detection and HSEmotionRecognizer for emotion recognition. The goal is to make media content easier to interpret by automatically labeling facial expressions with emotional states.
Components:

## Face Detection using RetinaFace
The detect_faces function uses the RetinaFace model to find faces in a video frame or an image. It returns a bounding box for each face, and later steps use those coordinates to crop and annotate the face.
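
The box conversion can be sketched in isolation. This is a minimal illustration, assuming the detector reports each face as corner coordinates `[x1, y1, x2, y2]`; the helper name `to_xywh` and the clamping to the frame size are assumptions added here, not part of the original script.

```python
def to_xywh(facial_area, frame_width, frame_height):
    """Convert [x1, y1, x2, y2] corners to an (x, y, w, h) box inside the frame.

    Hypothetical helper for illustration: coordinates are clamped so that a
    box partly outside the frame cannot produce negative slice indices.
    """
    x1, y1, x2, y2 = (int(v) for v in facial_area)
    x1 = min(max(x1, 0), frame_width)
    y1 = min(max(y1, 0), frame_height)
    x2 = min(max(x2, x1), frame_width)
    y2 = min(max(y2, y1), frame_height)
    return (x1, y1, x2 - x1, y2 - y1)


print(to_xywh([10, 20, 110, 220], 640, 480))
print(to_xywh([-5, -5, 50, 50], 640, 480))
```

Clamping matters because a negative start index in a NumPy slice counts from the end of the array, which would silently crop the wrong region.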

## Emotion Recognition with HSEmotionRecognizer
The HSEmotionRecognizer model, initialized as recognizer, predicts an emotional state for each cropped face region based on the features it has learned from face images.
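
If a confidence value is wanted next to the label, the raw class scores can be turned into probabilities with a softmax. The sketch below assumes the recognizer exposes one raw score per emotion class; the function name `top_emotion` and the three-label list are hypothetical and used only for illustration.

```python
import math


def top_emotion(scores, labels):
    """Return (label, probability) for the highest-scoring class.

    Applies a numerically stable softmax: subtracting the peak score before
    exponentiating avoids overflow without changing the result.
    """
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


# Hypothetical labels and scores, not taken from the real model
label, prob = top_emotion([0.1, 2.0, -1.0], ["Anger", "Happiness", "Neutral"])
print(label, round(prob, 3))
```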

## Annotation and Visualization
The annotate_frame function labels each detected face with its recognized emotion. It draws a bounding box around the face and writes the predicted emotional state above it.
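
One detail worth sketching is where the label goes when a face touches the top edge of the frame: text drawn a fixed offset above the box would then land outside the image. The helper below is a hypothetical illustration (the name `label_origin` and its default sizes are assumptions); it returns a text origin that could be passed to a drawing call.

```python
def label_origin(x, y, text_height=20, gap=10):
    """Pick the bottom-left corner for a label attached to a box at (x, y).

    The label sits above the box when there is room; otherwise it is moved
    just inside the top of the box so it stays visible.
    """
    baseline = y - gap
    if baseline < text_height:
        baseline = y + text_height + gap
    return (max(x, 0), baseline)


print(label_origin(50, 100))  # room above the box
print(label_origin(50, 5))    # face at the top edge of the frame
```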

## Processing Pipeline
Video Processing:
- process_video_frames: Iterates through the frames of a video, applying face detection and emotion annotation, and saves the processed frames to a temporary video file.
- add_audio_to_video: Adds the audio from the original video back to the processed frames to create the final annotated video.
- process_video: Combines frame processing and audio addition into a single function.

Image Processing:
- process_image: Handles a single image by detecting faces, annotating emotions, and combining the input and annotated images side by side for comparison.
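
The frame-skipping idea used during video processing can be shown without any video at all. This is a minimal sketch, assuming a step of 1 means "annotate every frame"; the helper name `frames_to_annotate` is hypothetical, and clamping the step to at least 1 is a choice made here to avoid a division by zero.

```python
def frames_to_annotate(total_frames, frame_skip=1):
    """Return the indices of frames that would be annotated.

    Every frame is still written to the output; only the listed indices
    go through face detection and emotion annotation.
    """
    step = max(1, int(frame_skip))  # a step of 0 would break the modulo
    return [i for i in range(total_frames) if i % step == 0]


print(frames_to_annotate(6, 1))
print(frames_to_annotate(6, 2))
```

A larger step reduces processing time roughly in proportion, at the cost of labels appearing only on some frames.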

# Usage

- Video Processing: Provide the path to a video file (*.mp4, *.avi, *.mov, *.mkv) to analyze and annotate facial expressions throughout the video.
- Image Processing: For static images (*.jpg, *.jpeg, *.png), the script detects faces, predicts emotions, saves the original and annotated images side by side, and can optionally display them.
- Live Camera Processing: Open the default camera, and the model detects your face and recognizes your emotions frame by frame.
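
Choosing between the video and image paths comes down to the file extension. The sketch below restates that rule as a small function; the name `media_kind` and the two constant sets are hypothetical, while the extensions are the ones listed above.

```python
import os

VIDEO_EXTENSIONS = {".mp4", ".avi", ".mov", ".mkv"}
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def media_kind(path):
    """Classify a path as 'video', 'image', or None by its extension."""
    ext = os.path.splitext(path)[1].lower()  # case-insensitive match
    if ext in VIDEO_EXTENSIONS:
        return "video"
    if ext in IMAGE_EXTENSIONS:
        return "image"
    return None


for name in ("clip.MP4", "photo.jpeg", "notes.txt"):
    print(name, media_kind(name))
```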


## Setup
Install the required libraries:

In [26]:
pip install retina-face

Note: you may need to restart the kernel to use updated packages.


In [22]:
! pip install retina-face hsemotion moviepy
# tensorflow == 2.



In [23]:
pip show tensorflow


Name: tensorflow
Version: 2.16.1
Summary: TensorFlow is an open source machine learning framework for everyone.
Home-page: https://www.tensorflow.org/
Author: Google Inc.
Author-email: packages@tensorflow.org
License: Apache 2.0
Location: c:\Users\Hendy Group\anaconda3\Lib\site-packages
Requires: tensorflow-intel
Required-by: deepface, retina-face
Note: you may need to restart the kernel to use updated packages.


In [2]:
pip install keras

Note: you may need to restart the kernel to use updated packages.


In [11]:
pip install --upgrade keras


Collecting keras
  Using cached keras-3.3.3-py3-none-any.whl.metadata (5.7 kB)
Using cached keras-3.3.3-py3-none-any.whl (1.1 MB)
Installing collected packages: keras
  Attempting uninstall: keras
    Found existing installation: keras 2.15.0
    Uninstalling keras-2.15.0:
      Successfully uninstalled keras-2.15.0
Successfully installed keras-3.3.3
Note: you may need to restart the kernel to use updated packages.


ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
tensorflow-intel 2.15.0 requires keras<2.16,>=2.15.0, but you have keras 3.3.3 which is incompatible.


In [12]:
import keras
print(keras.__version__)


2.15.0
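
The printed version differs from the one pip just reported installing. One plausible explanation, in line with pip's restart note, is that the running kernel still holds the previously imported module. The sketch below compares the version on disk with the version in memory; the helper names are hypothetical, and it only illustrates the check rather than diagnosing this environment.

```python
import sys
from importlib import metadata


def installed_version(dist_name):
    """Version of the distribution on disk, or None if it is not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def loaded_version(module_name):
    """__version__ of a module already imported in this process, or None."""
    module = sys.modules.get(module_name)
    return getattr(module, "__version__", None) if module is not None else None


def needs_restart(dist_name, module_name):
    """True when the in-memory module is older or newer than the one on disk."""
    on_disk = installed_version(dist_name)
    in_memory = loaded_version(module_name)
    return on_disk is not None and in_memory is not None and on_disk != in_memory


print(needs_restart("keras", "keras"))
```

If this prints `True`, restarting the kernel and importing the package again should make the two versions agree.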


In [1]:
from retinaface import RetinaFace





In [2]:
from moviepy.editor import VideoFileClip
from hsemotion.facial_emotions import HSEmotionRecognizer
import cv2
import numpy as np
import os


In [9]:
# Initialize recognizer
recognizer = HSEmotionRecognizer(model_name='enet_b0_8_best_vgaf', device='cpu')

# Face Detection Function
def detect_faces(frame):
    """ Detect faces in the frame using RetinaFace and return (x, y, w, h) boxes """
    faces = RetinaFace.detect_faces(frame)
    # A dict keyed by face id means at least one face was found
    if isinstance(faces, dict):
        face_list = []
        for face in faces.values():
            # facial_area holds the corners [x1, y1, x2, y2]
            x1, y1, x2, y2 = (int(v) for v in face['facial_area'])
            # Clamp the top-left corner so slicing never uses negative indices
            x1, y1 = max(0, x1), max(0, y1)
            face_list.append({'box': (x1, y1, x2 - x1, y2 - y1)})
        return face_list
    return []

# Annotation Function
def annotate_frame(frame, faces):
    """ Annotate the frame with recognized emotions using global recognizer """
    for face in faces:
        x, y, w, h = face['box']
        face_image = frame[y:y+h, x:x+w]  # Extract face region from frame
        emotion = classify_emotions(face_image)
        cv2.rectangle(frame, (x, y), (x+w, y+h), (255, 0, 0), 2)
        cv2.putText(frame, emotion, (x, y-10), cv2.FONT_HERSHEY_SIMPLEX, 0.9, (255, 0, 0), 2)

# Emotion Classification Function
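def classify_emotions(face_image):
    """ Classify emotions for the given face image using global recognizer """
    # An empty crop (zero width or height) cannot be classified
    if face_image is None or face_image.size == 0:
        return 'Unknown'
    # predict_emotions is expected to return (label, scores); keep only the label
    results = recognizer.predict_emotions(face_image)
    if results:
        emotion = results[0]  # Get the most likely emotion
    else:
        emotion = 'Unknown'
    return emotion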

# Process Video Frames
def process_video_frames(video_path, temp_output_path, frame_skip=1):
    """ Annotate every frame_skip-th frame and write all frames to temp_output_path """
    # A step of 0 would raise ZeroDivisionError in the modulo below, so clamp it to 1
    frame_skip = max(1, int(frame_skip))

    # Load the video
    video_clip = VideoFileClip(video_path)
    fps = video_clip.fps

    # Initialize output video writer
    out = cv2.VideoWriter(temp_output_path, cv2.VideoWriter_fourcc(*'mp4v'), fps, (int(video_clip.size[0]), int(video_clip.size[1])))

    # Iterate through frames, detect faces, and annotate emotions
    frame_count = 0
    for frame in video_clip.iter_frames():
        if frame_count % frame_skip == 0:  # Process every frame_skip-th frame
            frame = np.copy(frame)  # Create a writable copy of the frame
            faces = detect_faces(frame)
            annotate_frame(frame, faces)
        frame = cv2.cvtColor(frame, cv2.COLOR_RGB2BGR)  # Convert RGB to BGR for OpenCV
        out.write(frame)
        frame_count += 1

    # Release resources and cleanup
    out.release()
    cv2.destroyAllWindows()
    video_clip.close()

# Add Audio to Processed Video
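def add_audio_to_video(original_video_path, processed_video_path, output_path):
    """ Copy the audio track of the original video onto the processed video """
    # Start as None so the finally block is safe if opening a clip fails
    original_clip = None
    processed_clip = None
    try:
        original_clip = VideoFileClip(original_video_path)
        processed_clip = VideoFileClip(processed_video_path)
        final_clip = processed_clip.set_audio(original_clip.audio)
        final_clip.write_videofile(output_path, codec='libx264', audio_codec='aac')
    except Exception as e:
        print(f"Error while combining with audio: {e}")
    finally:
        # Close only the clips that were actually opened
        if original_clip is not None:
            original_clip.close()
        if processed_clip is not None:
            processed_clip.close()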

# Process Video
def process_video(video_path, output_path):
    temp_output_path = 'temp_output_video.mp4'

    # Process video frames and save to a temporary file
    process_video_frames(video_path, temp_output_path, frame_skip=1)  # Adjust frame_skip as needed

    # Add audio to the processed video
    add_audio_to_video(video_path, temp_output_path, output_path)

# Process Image
def process_image(input_path, output_path):
    # Ensure output path has a valid extension
    if not output_path.lower().endswith(('.jpg', '.jpeg', '.png')):
        output_path += '.jpg'  # Default to .jpg if no valid extension is found

    # Step 1: Read input image
    image = cv2.imread(input_path)
    if image is None:
        print(f"Error: Unable to read image at '{input_path}'")
        return

    # Step 2: Detect faces and annotate emotions
    faces = detect_faces(image)
    annotate_frame(image, faces)

    # Step 3: Write annotated image to output path
    cv2.imwrite(output_path, image)

    # Step 4: Combine input and output images horizontally
    input_image = cv2.imread(input_path)
    combined_image = cv2.hconcat([input_image, image])

    # Step 5: Save the combined image
    combined_output_path = os.path.splitext(output_path)[0] + '_combined.jpg'
    cv2.imwrite(combined_output_path, combined_image)

    # Step 6: Display the combined image (uncomment if running locally)
    # cv2.imshow('Combined Image', combined_image)
    # cv2.waitKey(0)
    # cv2.destroyAllWindows()

C:\Users\Hendy Group\.hsemotion\enet_b0_8_best_vgaf.pt Compose(
    Resize(size=(224, 224), interpolation=bilinear, max_size=None, antialias=True)
    ToTensor()
    Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
)


# Time to process the video or image
**NOTE: You can use your own data by changing the input and output paths.**

In [17]:
if __name__ == "__main__":
    input_path = r'C:\Users\Hendy Group\Downloads\فيديو واتساب بتاريخ 1445-12-12 في 07.06.34_9220b851.mp4'  # Update with your video or image path
    output_path = r'C:\Users\Hendy Group\Downloads\new_11.mp4'  # Update with the desired output path

    if input_path.lower().endswith(('.mp4', '.avi', '.mov', '.mkv')):
        process_video(input_path, output_path)
    elif input_path.lower().endswith(('.jpg', '.jpeg', '.png')):
        process_image(input_path, output_path)
    else:
        print("Unsupported file format. Please provide a video or image file.")

Moviepy - Building video C:\Users\Hendy Group\Downloads\new_11.mp4.
MoviePy - Writing audio in new_11TEMP_MPY_wvf_snd.mp4
MoviePy - Done.
Moviepy - Writing video C:\Users\Hendy Group\Downloads\new_11.mp4
Moviepy - Done !
Moviepy - video ready C:\Users\Hendy Group\Downloads\new_11.mp4


In [16]:
if __name__ == "__main__":
    input_path = r'C:\Users\Hendy Group\Downloads\motaz\IMG-20240618-WA0003.jpg'  # Update with your video or image path
    output_path = r'C:\Users\Hendy Group\Downloads\motaz\mai-2.jpg'  # Update with the desired output path

    if input_path.lower().endswith(('.mp4', '.avi', '.mov', '.mkv')):
        process_video(input_path, output_path)
    elif input_path.lower().endswith(('.jpg', '.jpeg', '.png')):
        process_image(input_path, output_path)
    else:
        print("Unsupported file format. Please provide a video or image file.")

# Open the camera to test the model

In [12]:
def open_camera():
    cap = cv2.VideoCapture(0)  # Open the default camera

    if not cap.isOpened():
        print("Error: Could not open camera.")
        return

    while True:
        ret, frame = cap.read()
        if not ret:
            break

        faces = detect_faces(frame)
        annotate_frame(frame, faces)

        cv2.imshow('Emotion Recognition', frame)

        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

    cap.release()
    cv2.destroyAllWindows()

In [13]:
if __name__ == "__main__":
    open_camera()