Facial expressivity v1.0

anzar edited this page Jun 14, 2023 · 1 revision
Date completed April 13, 2023
Release where first appeared OpenWillis v1.0
Researcher / Developer Vijay Yadav

1 – Use

import openwillis as ow

framewise_loc, framewise_disp, summary = ow.facial_expressivity(filepath = 'video.mov', baseline_filepath = 'baseline_video.mov')

2 – Methods

Using framewise displacement in facial landmark coordinates to quantify facial expressivity

Methods without a baseline input:

  1. For every frame of the video, coordinates of 468 unique facial landmarks are calculated using the facemesh model within mediapipe. The framewise x, y, and z coordinates for all landmarks (i.e., their locations), whose values range from 0 to 1, are saved in the framewise_loc output.
  2. For each facial landmark, the framewise displacement is calculated as the Euclidean distance between its coordinates in consecutive frames. An overall framewise displacement value is also calculated as the mean across all facial landmarks. These values range from 0 to 1, with values for the first frame always being 0. The data are saved in the framewise_disp output.
  3. Summary statistics for framewise_disp are saved in the summary output. These include the mean displacement across all facial landmarks over the course of the video, i.e. overall facial expressivity, which is the primary outcome measure of this function.
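
The displacement step above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the OpenWillis implementation: the helper name `framewise_displacement` and the (frames, landmarks, 3) array layout are assumptions made for the example, and the overall value is taken as the mean across landmarks.

```python
import numpy as np

def framewise_displacement(loc):
    """Hypothetical helper: displacement of each landmark between consecutive frames.

    loc: array of shape (n_frames, n_landmarks, 3) holding x, y, z coordinates.
    Returns (per_landmark, overall), with shapes (n_frames, n_landmarks)
    and (n_frames,).
    """
    loc = np.asarray(loc, dtype=float)
    # Euclidean distance between each landmark's position in frame t and frame t-1.
    step = np.linalg.norm(np.diff(loc, axis=0), axis=2)
    # The first frame has no predecessor, so its displacement is set to 0.
    per_landmark = np.vstack([np.zeros((1, loc.shape[1])), step])
    # Overall displacement (assumed here): mean across landmarks for each frame.
    overall = per_landmark.mean(axis=1)
    return per_landmark, overall

# Toy input with made-up coordinates: 3 frames, 2 landmarks.
loc = np.array([
    [[0.0, 0.0, 0.0], [0.5, 0.5, 0.5]],
    [[0.3, 0.4, 0.0], [0.5, 0.5, 0.5]],
    [[0.3, 0.4, 0.0], [0.5, 0.5, 0.6]],
])
per_landmark, overall = framewise_displacement(loc)
```

Averaging `overall` over frames would then give a single number comparable in spirit to the overall facial expressivity measure.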

Methods with a baseline input:

For more information on using a baseline input, see the Research Guidelines on the GitHub Wiki.

  1. All three steps above are performed on the baseline video to obtain the mean displacement over the video for each facial landmark and for all facial landmarks combined (these values are not returned).
  2. Using the main video, framewise_loc is calculated as described above.
  3. For **framewise_disp** on the main video, the framewise displacement for each facial landmark is normalized against the overall value of its displacement in the baseline video. The normalized values range from -1 to 1, with negative values signifying displacement lower than baseline and positive values signifying displacement greater than baseline. The method used for normalization is further explained in the appendix of this document.
  4. Summary statistics are saved in the summary output in the same manner as described above.

The primary outcome measure of this function is the overall displacement across all facial landmarks over the course of the video, which is referred to as overall facial expressivity and is saved in summary.

Note: The user can combine the mean displacement of any combination of facial landmarks they wish and derive their own custom measure of overall facial expressivity (e.g. focusing on specific areas of the face). See mediapipe documentation to figure out which landmarks refer to which parts of the face.
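
As a rough illustration of the note above, a custom measure can be derived by averaging a chosen subset of landmark columns. The helper name `custom_expressivity` is hypothetical, the column names are assumed to follow the `lmkNNN` pattern shown in section 4.2, and the landmark indices are placeholders rather than a real facial region.

```python
import pandas as pd

def custom_expressivity(framewise_disp, landmark_ids):
    """Hypothetical helper: mean displacement over a chosen subset of landmarks."""
    # Build column names such as "lmk001" (naming pattern assumed from section 4.2).
    cols = [f"lmk{i:03d}" for i in landmark_ids]
    # Average across the chosen landmarks per frame, then across frames.
    return framewise_disp[cols].mean(axis=1).mean()

# Toy stand-in for the framewise_disp output (values are made up).
framewise_disp = pd.DataFrame({
    "frame": [0, 1, 2],
    "lmk001": [0.0, 0.02, 0.04],
    "lmk002": [0.0, 0.04, 0.02],
    "lmk003": [0.0, 0.10, 0.10],
})

# Placeholder indices; consult the mediapipe documentation for real regions.
score = custom_expressivity(framewise_disp, [1, 2])
```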


3 – Inputs

3.1 – filepath

Type str
Description path to main video

3.2 – baseline_filepath

Type str, optional
Description path to baseline video

4 – Outputs

4.1 – framewise_loc

Type pandas.DataFrame
Description Framewise coordinates of 468 facial landmarks. Columns refer to landmarks, with every landmark having one column each for its x, y, and z coordinates (ranging between 0 and 1). Rows refer to frames in the video.

What the data frame looks like:

frame lmk001_x lmk002_x ... lmk001_y ... lmk001_z ...
0
1
...
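
For downstream analysis it can be convenient to view these columns as a 3D array. The sketch below assumes zero-padded column names of the form `lmkNNN_x`, `lmkNNN_y`, `lmkNNN_z`; the helper name `loc_to_array` and the toy values are illustrative only.

```python
import numpy as np
import pandas as pd

def loc_to_array(framewise_loc):
    """Hypothetical helper: array of shape (n_frames, n_landmarks, 3)."""
    axes = []
    for axis in ("x", "y", "z"):
        # Zero-padded names sort in landmark order, e.g. lmk000_x, lmk001_x, ...
        cols = sorted(c for c in framewise_loc.columns if c.endswith(f"_{axis}"))
        axes.append(framewise_loc[cols].to_numpy())
    # Stack the x, y and z blocks along a new last dimension.
    return np.stack(axes, axis=2)

# Toy stand-in for framewise_loc: 2 frames, 2 landmarks (values are made up).
framewise_loc = pd.DataFrame({
    "frame": [0, 1],
    "lmk000_x": [0.1, 0.2], "lmk001_x": [0.3, 0.4],
    "lmk000_y": [0.5, 0.6], "lmk001_y": [0.7, 0.8],
    "lmk000_z": [0.0, 0.1], "lmk001_z": [0.2, 0.3],
})
coords = loc_to_array(framewise_loc)
```

Here `coords[t, k]` holds the x, y, z position of landmark `k` in frame `t`.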

4.2 – framewise_disp

Type pandas.DataFrame
Description Framewise displacement (Euclidean distance) for each facial landmark. Columns refer to individual landmarks, with the last column holding the mean displacement across all landmarks. Values range from -1 to 1 when a baseline is used and from 0 to 1 otherwise. Rows refer to frames in the video.

What the data frame looks like:

frame lmk001 lmk002 ... overall
0 0 0 ... 0
1
...

4.3 – summary

Type pandas.DataFrame
Description Summary statistics calculated from the framewise_disp output. Contains the primary outcome measure, i.e. overall facial expressivity, measured as the mean displacement across all facial landmarks.

What the dataframe output looks like:

stats lmk001 lmk002 overall
mean
stdev

The overall mean variable in this table can be considered the primary outcome measure of this function.
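
To make the table layout concrete, the sketch below builds a summary of the same shape from a toy displacement table and reads off the overall mean. It is an illustration under assumptions, not the OpenWillis code: the values are made up, the first (zero) frame is included in the statistics, and pandas' default sample standard deviation is used, neither of which is confirmed by this page.

```python
import pandas as pd

# Toy stand-in for framewise_disp with two landmarks (values are made up).
framewise_disp = pd.DataFrame({
    "frame": [0, 1, 2, 3],
    "lmk001": [0.0, 0.01, 0.03, 0.02],
    "lmk002": [0.0, 0.03, 0.01, 0.02],
})
# Overall column: mean displacement across landmarks for each frame.
framewise_disp["overall"] = framewise_disp[["lmk001", "lmk002"]].mean(axis=1)

# One row per statistic, one column per landmark plus "overall".
values = framewise_disp.drop(columns="frame")
summary = pd.DataFrame({"mean": values.mean(), "stdev": values.std()}).T

# Primary outcome measure: the mean of the overall column.
overall_expressivity = summary.loc["mean", "overall"]
```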


5 – Example use

Here, we use the facial expressivity function to process sample data included in the repository.

import openwillis as ow

framewise_loc, framewise_disp, summary = ow.facial_expressivity(filepath = 'data/subj01.mp4', baseline_filepath = 'data/subj01_base.mp4')
framewise_loc.head(2)
frame lmk000_x lmk001_x lmk002_x lmk003_x lmk004_x lmk005_x ...
0 0.533611 0.534253 0.533953 0.524268 0.534301 0.534436 ...
1 0.533908 0.534034 0.534066 0.524676 0.533993 0.534129 ...

2 rows x 1405 columns


6 – Dependencies

Below are dependencies specific to calculation of this measure.

| Dependency | License | Justification |
| --- | --- | --- |
| mediapipe | Apache 2.0 | Continually maintained, 468 3D facial landmarks, plenty of validation data, good documentation. Does not perform action unit (AU) or emotion detection, but neither is needed for this method. |
| opencv | Apache 2.0 | Open-source computer vision library for basic CV operations. |

Appendix – Normalization

For the justification for baseline normalization, see the Research Guidelines on the GitHub Wiki.

Let’s say that:

A is the framewise displacement for a given landmark in the main video; this value ranges from 0 to 1

B is the overall displacement for that landmark calculated from the baseline video, also ranging from 0 to 1

To normalize, we want to divide A by B to get C, the normalized or baseline-corrected value.

When we do this, three scenarios ensue:

  1. If A > B, C can range from 1 to infinity.
  2. If A < B, C will range between 0 and 1.
  3. If A = B, C will be equal to 1.

To avoid the large possible range of values (0 to infinity), we add 1 to both A and **B** before division, so that C = (A + 1) / (B + 1). Since A and B each lie between 0 and 1:

  1. When A > B, C can range between 1 and 2.
  2. When A < B, C will range between 0.5 and 1.
  3. When A = B, C will be equal to 1.

When we do this, C is bounded between 0.5 and 2. If we subtract 1 from this value, we get a result between -0.5 and 1, which sits inside the -1 to 1 interval quoted in the Methods section, with positive values signifying scenarios where A > B, negative values signifying A < B, and 0 signifying A = B, i.e. no difference from baseline.
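
The add-one normalization described in this appendix can be written as a one-line function. This is a sketch of the arithmetic only; the function name `normalize_to_baseline` is hypothetical and the inputs are assumed to lie between 0 and 1, as stated above.

```python
def normalize_to_baseline(a, b):
    """Hypothetical helper: baseline-correct displacement a against baseline b."""
    # Adding 1 to both terms keeps the ratio bounded even when b is 0;
    # subtracting 1 afterwards centres "no difference from baseline" on 0.
    return (a + 1.0) / (b + 1.0) - 1.0

above = normalize_to_baseline(0.2, 0.1)  # main video moves more than baseline
below = normalize_to_baseline(0.1, 0.2)  # main video moves less than baseline
equal = normalize_to_baseline(0.1, 0.1)  # same displacement as baseline
```

In this sketch `above` is positive, `below` is negative, and `equal` is 0.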
