# Analysis of Elicited and Acted Emotional Expressions in PEDFE
Group Members:
- Zikun Fu
- Tony Wang

## Objective
The goal of this study is to investigate the differences between elicited and acted emotional expressions in the Padova Emotional Dataset of Facial Expressions (PEDFE).

## Research Question
How do automated emotion recognition systems perform in differentiating between elicited and acted emotional expressions in the PEDFE dataset?

## Hypothesis
We hypothesize that there will be significant differences in classification accuracy between elicited and acted emotional expressions, reflecting the inherent complexity and subtlety of genuine emotional states.

## Methodology
1. **Data Preparation**: Utilize the PEDFE dataset for the analysis.
2. **Feature Extraction**: Use facial expression analysis tools to extract relevant features from the dataset.
3. **Emotion Classification**: Apply classification models to identify patterns and distinctions between elicited and acted emotions.
4. **Result Analysis**: Evaluate the classifier's performance and analyze the distinctions between genuine and posed expressions.
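
One possible way to carry out the result-analysis step is sketched below: compute classification accuracy separately for genuine and posed clips, then compare the two proportions with a two-proportion z-test. The helper names (`accuracy`, `two_proportion_z_test`) and the toy label pairs are illustrative assumptions, not results from PEDFE. The z-test also assumes independent clips, which is only an approximation here because each participant contributes several clips.

```python
import math


def accuracy(pairs):
    """Fraction of (true_label, predicted_label) pairs that match."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("no predictions to score")
    return sum(true == pred for true, pred in pairs) / len(pairs)


def two_proportion_z_test(correct_a, n_a, correct_b, n_b):
    """Two-sided z-test for the difference between two accuracies.

    Returns (z, p_value). Assumes independent samples.
    """
    p_a = correct_a / n_a
    p_b = correct_b / n_b
    pooled = (correct_a + correct_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        # Both groups are all-correct or all-wrong: no measurable difference
        return 0.0, 1.0
    z = (p_a - p_b) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up (true, predicted) pairs, for illustration only
genuine = [("happiness", "happiness"), ("fear", "surprise"), ("disgust", "disgust")]
posed = [("happiness", "happiness"), ("fear", "fear"), ("disgust", "disgust")]

print("genuine accuracy:", accuracy(genuine))
print("posed accuracy:", accuracy(posed))
print(two_proportion_z_test(
    sum(t == p for t, p in genuine), len(genuine),
    sum(t == p for t, p in posed), len(posed),
))
```

With real data, a test that accounts for repeated measures per participant (for example a mixed-effects model) may be more appropriate than this simple comparison.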

## Expected Outcomes
The study aims to provide insights into the nuances of emotional expression and the ability of current technological systems to recognize genuine emotional states.

# Padova Emotional Dataset for FER

[Padova Emotional Dataset Article](https://link.springer.com/article/10.3758/s13428-022-01914-4)

- **Objective**: To offer a rich dataset of both genuine (N = 707) and posed (N = 751) emotional expressions for the six universal emotions, using 56 participants.
- **Versions**: Available as original clips (showing the participant's body and background) and modified clips (showing only the face).
- **Validation**: The original dataset was validated by 122 human raters and the modified dataset by 280 human raters. For each clip, the validation data include hit rates for emotion and genuineness, the mean and standard deviation of genuineness ratings, and perceived intensity.


## Dataset Creation
- **Participants**: 57 participants aged 20 to 30, with one withdrawing consent, resulting in contributions from 56 participants.
- **Experimental Setup**: Designed to record spontaneous emotions with minimal participant awareness of being filmed to preserve natural expressions.
- **Emotion Elicitation Procedure**: Utilized a multi-modal protocol with videos, games, and other tasks to elicit a range of emotions.

## Supplemental Materials

### T1
- **Identification Number**: Each subject is assigned a unique identifier.
- **Gender**: Marked as F (female) or M (male).
- **PEDFE Code**:
  - The leading number is the subject's number.
  - The first letter represents the emotion (d=disgust, h=happiness, f=fear, a=anger, s=surprise, t=sadness).
  - The second letter denotes whether the emotion is genuine (g) or posed (s).
- **Hit Rate Accuracy Scores**: Percentage of raters correctly recognizing the emotion and its authenticity.
- **Intensity**: Rated on a scale from 0 to 9.
- **Genuineness**: Rated on a scale from -7 to +7.
- **Duration**: Length of each clip in seconds.
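
The PEDFE code can be decoded programmatically. The sketch below assumes clip names follow the pattern seen in `1_dg_1.avi`, that is `<subject>_<emotion letter><type letter>_<index>`, and that the trailing number is a clip index; both are assumptions inferred from that one file name. `PedfeCode` and `parse_pedfe_code` are hypothetical helpers, not part of any library.

```python
import re
from dataclasses import dataclass

EMOTIONS = {
    "d": "disgust", "h": "happiness", "f": "fear",
    "a": "anger", "s": "surprise", "t": "sadness",
}
EXPRESSION_TYPES = {"g": "genuine", "s": "posed"}

# Assumed pattern: <subject>_<emotion letter><type letter>_<clip index>
CODE_RE = re.compile(r"^(\d+)_([dhfast])([gs])_(\d+)$")


@dataclass(frozen=True)
class PedfeCode:
    subject: int
    emotion: str
    expression_type: str
    clip_index: int


def parse_pedfe_code(name: str) -> PedfeCode:
    """Parse a PEDFE code or clip file name such as '1_dg_1.avi'."""
    stem = name[:-4] if name.lower().endswith(".avi") else name
    match = CODE_RE.match(stem)
    if match is None:
        raise ValueError(f"not a recognised PEDFE code: {name!r}")
    subject, emotion, kind, index = match.groups()
    return PedfeCode(int(subject), EMOTIONS[emotion], EXPRESSION_TYPES[kind], int(index))


print(parse_pedfe_code("1_dg_1.avi"))
```

Note that `s` means surprise in the emotion position and posed in the type position, so the two letters have to be read by position rather than by value.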

In [None]:
import pandas as pd

# Load Supplemental Material 1 (T1)
t1_desc_path = 'data/Supplemental_Material_T1.csv'
t1 = pd.read_csv(t1_desc_path)

# Display the first few rows of the dataset
print(t1.head(3))

### T2
- **Page 1: Emotion Hit Rate**
  - Details the average and standard deviation of hit rates for each actor in PEDFE, divided by emotion. This data provides insights into the accuracy of emotion recognition across the dataset.

- **Page 2: Genuineness Hit Rate**
   - Presents genuineness hit rates for each actor by emotion, including averages and standard deviations. This section highlights how well raters could distinguish between genuine and posed emotions.

### T3
- **Page 1: Emotion Experience**
   - Summarizes the emotions participants reported experiencing for each task, offering a direct view into the emotional impact of the elicitation protocol.

- **Page 2: Levels of Genuineness**
   - Features ratings from participants on the genuineness of their emotions for each task, ranging from completely not genuine to completely genuine. This self-assessment provides a unique perspective on the authenticity of the expressed emotions.

- **Page 3: Intensity Ratings**
   - Contains participants' self-rated intensity for each task, on a scale from 0 (none) to 9 (strong). These ratings offer a quantifiable measure of the emotional strength participants felt during each task.


# Py-Feat Toolbox
[Py-Feat Documentation](https://py-feat.org/pages/intro.html)

**Overview**

Py-Feat (Python Facial Expression Analysis Toolbox) is a library for detecting faces, facial landmarks, action units, and emotions in images and videos. It provides a single interface for extracting and analyzing facial expressions, which makes it suitable for studies of emotional expression.

## Available Models

### Face Detection

- **RetinaFace**: A robust single-stage detector that performs dense face localization in uncontrolled environments, known for its high accuracy and efficiency.

- **MTCNN**: Multi-task Cascaded Convolutional Networks that excel in face detection and alignment, combining multiple CNN layers for improved performance.

- **FaceBoxes**: Designed for real-time face detection on CPU, offering high accuracy, particularly in resource-constrained environments.

- **img2pose**: Innovatively estimates 6 degrees of freedom (6DoF) face pose, enabling simultaneous face detection and head pose estimation in a single shot.

- **img2pose-c**: A constrained variant of img2pose, optimized for frontal face images, providing superior face pose estimation accuracy within a specified angle range.

Each model has unique strengths, catering to different requirements in terms of accuracy, speed, and computational resources.

### Facial Landmark Detection

- **MobileFaceNet**: Efficient CNNs for real-time face verification, balancing accuracy and computational demand, suitable for mobile devices.

- **MobileNet**: Designed for mobile vision applications, offering fast and efficient facial landmark detection.

- **PFLD (Practical Facial Landmark Detector)**: Tailored for real-time performance with high accuracy in landmark detection.

### Facial Pose Estimation
- **img2pose**: Integrates face detection and pose estimation, providing 6DoF face pose estimation in one shot.

- **img2pose-c**: Specialized for frontal faces, optimized for accurate pose estimation within a limited angle range.

### Action Unit Detection
- **xgb**: XGBoost classifier that returns continuous-valued detections. AU07 is trained with hinge loss, so its outputs tend to look binary and are best read as the proportion of trees detecting the AU.

- **svm**: Uses LinearSVC for binary AU detection. For continuous-valued detections, xgb is recommended.


## Initialize a Detector
- The `Detector` class combines face, landmark, action unit, emotion, and face pose detection models in a single object.
- Available models: https://py-feat.org/pages/models.html

In [1]:
from feat import Detector

detector = Detector(
    face_model="img2pose",
    landmark_model="mobilefacenet",
    au_model='xgb',
    emotion_model="resmasknet",
    facepose_model="img2pose",
    device='cuda'  # Use GPU acceleration
)

### Video Recognition

#### Detector
- `face_model="img2pose"`: Specifies the img2pose model for face detection. img2pose detects faces and estimates their 6DoF pose in a single pass.

- `landmark_model="mobilefacenet"`: Chooses MobileFaceNet as the model for detecting facial landmarks. MobileFaceNet is efficient and designed for mobile and embedded vision applications, balancing accuracy and computational resource use.

- `au_model='xgb'`: Utilizes an XGBoost (Extreme Gradient Boosting) model for recognizing facial action units. Action units are fundamental components of facial expressions, corresponding to contractions of specific facial muscles.

- `emotion_model="resmasknet"`: Employs the ResMaskNet model for emotion detection. This model is capable of classifying various emotional states by analyzing facial features and expressions.

- `facepose_model="img2pose"`: Uses the img2pose model to estimate the pose of the face. This includes the orientation of the face in terms of pitch, yaw, and roll angles, providing context for the facial expressions in relation to the camera perspective.

- `device='cuda'`: Specifies that the GPU should be used for computation, leveraging CUDA for acceleration. 
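
Hard-coding `device='cuda'` fails on machines without a CUDA-capable GPU. A minimal sketch of a fallback is shown below; `pick_device` is a hypothetical helper, and it assumes PyTorch (which Py-Feat builds on) is the right place to query GPU availability.

```python
def pick_device(preferred="cuda"):
    """Return `preferred` if it is usable, otherwise fall back to "cpu".

    Only the "cuda" case is checked; any other value is returned unchanged.
    """
    if preferred != "cuda":
        return preferred
    try:
        import torch
    except ImportError:
        # Without PyTorch there is no way to use the GPU
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


print(pick_device())
```

The result can then be passed as the `device` argument when constructing the detector, for example `device=pick_device()`.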


#### Processing Videos
 - https://py-feat.org/basic_tutorials/03_detector_vids.html

In [2]:
# Display a video inline, given its path
import os
from IPython.display import Video

def play_video(video_path):
    return Video(video_path, embed=True)

video_directory = 'data/PEDFE_set_clips'
play_video(os.path.join(video_directory, "1_dg_1.avi"))

In [3]:

import os

# Directory containing videos
video_directory = 'data/PEDFE_set_clips'

# List all video files in the directory
video_files = [f for f in os.listdir(video_directory) if f.endswith('.avi')]

# Process each video
for video_file in video_files:
    video_path = os.path.join(video_directory, video_file)
    try:
        # Process the video, skipping frames for efficiency, and aggregate the results
        results = detector.detect_video(video_path, skip_frames=24, aggregate=True)
        
        if not results.empty:
            # Since results are aggregated, we get a single row of results summarizing the video
            aggregated_results = results.iloc[0]  # Access the aggregated results
            print(f"Aggregated results for {video_file}:")
            # display how many predictions have been produced
            #print(results.shape)
            print(aggregated_results)

        else:
            print(f"No faces detected in {video_file}.")
    except Exception as e:
        # Report the failure and move on to the next video
        print(f"Error processing {video_path}: {e}")

In [4]:
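import os

# Keep the per-frame predictions for the plots below and store the clip-level mean separately
video_prediction = detector.detect_video(os.path.join(video_directory, "1_dg_1.avi"), skip_frames=10, batch_size=35)
video_mean = video_prediction.extract_mean()
video_mean.to_csv("result", index=False)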

100%|██████████| 1/1 [00:03<00:00,  3.03s/it]
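
To compare a clip-level prediction with its PEDFE label, the averaged emotion scores can be reduced to a single predicted emotion. The sketch below takes a plain dictionary of mean scores keyed by Py-Feat's emotion column names and returns the highest-scoring one. `dominant_emotion` is a hypothetical helper, the scores are made up for illustration, and excluding `neutral` by default is an assumption based on PEDFE containing only the six universal emotions.

```python
EMOTION_COLUMNS = (
    "anger", "disgust", "fear", "happiness", "sadness", "surprise", "neutral",
)


def dominant_emotion(mean_scores, include_neutral=False):
    """Return the emotion with the highest mean score.

    `mean_scores` maps emotion names to averaged probabilities;
    keys that are not emotion columns are ignored.
    """
    candidates = {
        name: float(score)
        for name, score in mean_scores.items()
        if name in EMOTION_COLUMNS and (include_neutral or name != "neutral")
    }
    if not candidates:
        raise ValueError("no emotion scores found")
    return max(candidates, key=candidates.get)


# Made-up scores, for illustration only
example_scores = {"anger": 0.05, "disgust": 0.55, "fear": 0.02, "neutral": 0.30}
print(dominant_emotion(example_scores))
```

The dictionary could be built from the emotion columns of the averaged row; the predicted emotion can then be checked against the emotion letter in the clip's PEDFE code.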


#### Emotion Prediction

In [None]:
# With skip_frames=10, the processed frame labels are multiples of 10
video_prediction.loc[[20, 70]].plot_detections(faceboxes=False, add_titles=True)

#### Emotion Development

In [None]:
axes = video_prediction.emotions.plot()