Facial expressivity v2.0
| | |
| --- | --- |
| Date completed | September 26, 2023 |
| Release where first appeared | OpenWillis v1.4 |
| Researcher / Developer | Vijay Yadav, Georgios Efstathiadis |
```python
import openwillis as ow

framewise_loc, framewise_disp, summary = ow.facial_expressivity(filepath='video.mov', baseline_filepath='baseline_video.mov')
```
Using framewise displacement of facial landmark coordinates to quantify facial expressivity
- For every frame of the video, the coordinates of 468 unique facial landmarks are calculated using the facemesh model within mediapipe. The framewise x, y, and z coordinates of all landmarks (i.e., their locations), whose values range from 0 to 1, are saved in the `framewise_loc` output.
- For each facial landmark, the framewise euclidean distance is calculated. These values range from 0 to 1, with values for the first frame always being 0. The data is saved in the `framewise_disp` output.
- For specific groups of facial landmarks, the average euclidean distance across all landmarks within the group is calculated. These composite values also range from 0 to 1, with values for the first frame always being 0, and are saved in the `framewise_disp` output. The groups are:
  - Overall facial expressivity, saved as `overall`
  - Upper facial expressivity, saved as `upper_face`
  - Lower facial expressivity, saved as `lower_face`
  - Lip expressivity, saved as `lips`
  - Eyebrow expressivity, saved as `eyebrows`
  - Mouth openness, saved as `mouth_openness`
- Summary statistics for `framewise_disp` are saved in the `summary` output. This includes the mean displacement over the course of the video for each of the composite variables listed above.
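The per-landmark displacement described above can be sketched in a few lines of numpy. This is a minimal illustration of the technique, not OpenWillis's actual implementation; the function names and signatures are hypothetical:

```python
import numpy as np

def framewise_displacement(coords):
    """Euclidean distance each landmark travels between consecutive frames.

    coords: array of shape (n_frames, n_landmarks, 3) holding the x, y, z
    location of every landmark per frame, as produced by the facemesh model.
    Returns shape (n_frames, n_landmarks); the first row is always 0.
    """
    disp = np.zeros(coords.shape[:2])
    # distance between each frame's landmark positions and the previous frame's
    disp[1:] = np.linalg.norm(np.diff(coords, axis=0), axis=2)
    return disp

def composite_expressivity(disp, landmark_idx):
    """Average displacement across a group of landmarks, per frame."""
    return disp[:, landmark_idx].mean(axis=1)
```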
For more information on using a baseline input, see the Research Guidelines on the GitHub Wiki.
- All of the steps above are performed on the baseline video to acquire the mean displacement over the video for each facial landmark and for all facial landmarks cumulatively (these values are not included in the function's outputs).
- Using the main video, `framewise_loc` is calculated as described above.
- For `framewise_disp` on the main video, the framewise displacement of each facial landmark is normalized against the overall value of its displacement in the baseline video. The normalized values range from -1 to 1, with negative values signifying displacement lower than baseline and positive values signifying displacement greater than baseline. The method used for normalization is further explained in the appendix of this document.
- Summary statistics are saved in the `summary` output in the same manner as described above.
Additional measures calculated in `framewise_disp` include:
- Framewise displacement of the lower face and the upper face separately.
- Framewise displacement of the mouth.
- Framewise displacement of the eyebrows.
- Mouth openness, calculated as the mouth height divided by the minimum of the lower lip height and the upper lip height.

These additional measures are also summarized in the `summary` output.
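The mouth openness ratio reduces to a one-line formula. A sketch, with a hypothetical function name and heights assumed to be already measured from the landmark coordinates:

```python
def mouth_openness(mouth_height, lower_lip_height, upper_lip_height):
    """Mouth height divided by the height of the thinner lip.

    All three heights are assumed to be distances derived from facemesh
    landmark coordinates; larger values mean a more open mouth.
    """
    return mouth_height / min(lower_lip_height, upper_lip_height)
```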
Note: The user can combine the mean displacement of any combination of facial landmarks to derive their own custom measure of facial expressivity (e.g., focusing on other specific areas of the face). See the mediapipe documentation to determine which landmarks correspond to which parts of the face.
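Deriving such a custom composite is a one-liner over the `framewise_disp` dataframe. A sketch, assuming the zero-padded `lmkNNN` column naming shown in the output tables; the landmark indices below are placeholders, not a verified face-region mapping:

```python
import pandas as pd

# Placeholder landmark indices; consult the mediapipe facemesh documentation
# for the actual landmark-to-face-region mapping.
CHEEK_LANDMARKS = [50, 205, 280, 425]

def custom_expressivity(framewise_disp: pd.DataFrame, landmarks) -> pd.Series:
    """Mean framewise displacement across a user-chosen group of landmarks."""
    cols = [f"lmk{i:03d}" for i in landmarks]
    return framewise_disp[cols].mean(axis=1)
```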
| `filepath` | |
| --- | --- |
| Type | str |
| Description | path to main video |

| `baseline_filepath` | |
| --- | --- |
| Type | str, optional |
| Description | path to baseline video |

| `framewise_loc` | |
| --- | --- |
| Type | data-type |
| Description | framewise coordinates of 468 facial landmarks; columns refer to landmarks, with every landmark having a value each for its x, y, and z coordinates (ranging between 0 and 1); rows refer to frames in the video. |
What the data frame looks like:

| frame | lmk001_x | lmk002_x | ... | lmk001_y | ... | lmk001_z | ... |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | | | | | | | |
| 1 | | | | | | | |
| ... | | | | | | | |
| `framewise_disp` | |
| --- | --- |
| Type | data-type |
| Description | framewise euclidean distance for each facial landmark; columns refer to individual landmarks, with the last few columns representing mean displacement across landmark composites for the entire face, the upper face, the lower face, the lips, and the eyebrows, plus a framewise measure of mouth openness; the range of these values is -1 to 1 when a baseline is used and 0 to 1 otherwise; rows refer to frames in the video. |
What the data frame looks like:

| frame | lmk001 | lmk002 | … | overall | lower_face | upper_face | lips | eyebrows | mouth_openness |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 1 | | | | | | | | | |
| ... | | | | | | | | | |
| `summary` | |
| --- | --- |
| Type | data-type |
| Description | summary statistics calculated from the `framewise_disp` output, namely the mean and standard deviation of overall, upper face, lower face, lip, and eyebrow expressivity, as well as mouth openness. |

What the dataframe output looks like:

| overall_mean | lower_face_mean | upper_face_mean | lips_mean | eyebrows_mean | mouth_openness_mean | overall_std | lower_face_std | upper_face_std | lips_std | eyebrows_std | mouth_openness_std |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
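The relationship between `framewise_disp` and `summary` can be sketched with pandas. The column names come from the tables above; the function itself is a hypothetical illustration, not the library's internal code:

```python
import pandas as pd

# composite columns of framewise_disp, as listed in the output tables above
COMPOSITES = ["overall", "lower_face", "upper_face",
              "lips", "eyebrows", "mouth_openness"]

def summarize(framewise_disp: pd.DataFrame) -> pd.DataFrame:
    """One-row dataframe holding the mean then the std of each composite."""
    stats = {}
    for col in COMPOSITES:
        stats[f"{col}_mean"] = framewise_disp[col].mean()
    for col in COMPOSITES:
        stats[f"{col}_std"] = framewise_disp[col].std()
    return pd.DataFrame([stats])
```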
Here, we use the facial expressivity function to process sample data included in the repository.
```python
import openwillis as ow

framewise_loc, framewise_disp, summary = ow.facial_expressivity(filepath='data/subj01.mp4', baseline_filepath='data/subj01_base.mp4')
framewise_loc.head(2)
```

| frame | lmk000_x | lmk001_x | lmk002_x | lmk003_x | lmk004_x | lmk005_x | ... |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 0.533611 | 0.534253 | 0.533953 | 0.524268 | 0.534301 | 0.534436 | ... |
| 1 | 0.533908 | 0.534034 | 0.534066 | 0.524676 | 0.533993 | 0.534129 | ... |

2 rows x 1405 columns
Below are the dependencies specific to the calculation of this measure.

| Dependency | License | Justification |
| --- | --- | --- |
| mediapipe | Apache 2.0 | Continually maintained, 468 3D facial landmarks, plenty of validation data, good documentation. Doesn't do AU or emotion detection, but that isn't needed for this method. |
| opencv | Apache 2.0 | Open-source computer vision library for basic CV operations. |
To see the justification for baseline normalization, see the Research Guidelines on the GitHub Wiki.
Let’s say that:
- A is the framewise displacement for a given landmark in the main video; this value ranges from 0 to 1.
- B is the overall displacement for that landmark calculated from the baseline video, also ranging from 0 to 1.
To normalize, we want to divide A by B to get C, the normalized or baseline-corrected value.
When we do this, three scenarios ensue:
- If A > B, C can range from 1 to infinity.
- If A < B, C will range between 0 and 1.
- If A = B, C will be equal to 1.
To avoid the large possible range of values (0 to infinity), we add 1 to both A and B before division. So:
- When A > B, C can range between 1 and 2.
- When A < B, C will range between 0 and 1.
- When A = B, C will be equal to 1.
When we do this, we get a range of values between 0 and 2. If we subtract 1 from this value, we get a range between -1 and 1, with negative values signifying scenarios where A < B, positive values signifying scenarios where A > B, and 0 signifying A = B, i.e., no difference from baseline.
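The whole procedure reduces to a one-line formula, C = (A + 1) / (B + 1) - 1. A sketch, with a hypothetical function name:

```python
def baseline_normalize(a, b):
    """Baseline-corrected displacement C = (A + 1) / (B + 1) - 1.

    a: framewise displacement of a landmark in the main video (0 to 1)
    b: overall displacement of that landmark in the baseline video (0 to 1)
    Returns a value between -1 and 1; negative means below baseline,
    positive means above baseline, and 0 means no difference from baseline.
    """
    return (a + 1) / (b + 1) - 1
```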
OpenWillis was developed by a small team of clinicians, scientists, and engineers based in Brooklyn, NY.