**Table of Contents**

1. <a href='#intro'> Introduction  </a>
2. <a href='#lib'> Load libraries </a>
3. <a href='#lmk'> Facial Landmarks </a>
4. <a href='#emo'> Facial Emotion </a>
5. <a href='#aco'> Vocal Acoustic </a>
6. <a href='#trans'> Speech Transcription </a>
7. <a href='#s_char'> Speech Characteristics </a>
8. <a href='#s_sep'> Speaker Separation: Batch </a>
9. <a href='#s_sep_stream'> Speaker Separation: Stream </a>

<a id='intro'></a> 
### 1. Introduction

Digital measurement of health.......

<a id='lib'></a> 
### 2. Load libraries

In [1]:
import openwillis as ow

<a id='lmk'></a> 
### 3. Facial Landmarks

Facial landmark detection and facial expressivity analysis

This function uses MediaPipe’s Face Mesh solution to quantify the framewise 3D positioning of 468 facial landmarks. It then calculates the framewise displacement of those landmarks to quantify movement in the facial musculature as a proxy measure of overall facial expressivity.
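
To make the displacement idea concrete, here is a minimal sketch of one way to turn per-frame landmark positions into a framewise movement score: the Euclidean distance each landmark travels between consecutive frames, averaged over landmarks. This is an illustrative assumption, not necessarily how `ow.facial_expressivity` computes `framewise_disp`; the function name `framewise_displacement` and the synthetic coordinates are hypothetical.

```python
import math


def framewise_displacement(frames):
    """Mean Euclidean displacement of landmarks between consecutive frames.

    `frames` is a list of frames; each frame is a list of (x, y, z) tuples,
    one per landmark, in the same landmark order for every frame.
    The first frame has no predecessor, so its displacement is reported
    as 0.0 (an arbitrary convention chosen for this sketch).
    """
    if not frames:
        return []
    displacements = [0.0]
    for prev, curr in zip(frames, frames[1:]):
        # Distance travelled by each landmark from the previous frame.
        dists = [math.dist(p, c) for p, c in zip(prev, curr)]
        displacements.append(sum(dists) / len(dists))
    return displacements


# Two synthetic landmarks over three frames (made-up coordinates).
frames = [
    [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)],
    [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)],  # nothing moves
    [(3.0, 4.0, 0.0), (1.0, 1.0, 1.0)],  # first landmark moves 5 units
]
print(framewise_displacement(frames))
```

With real output, the `lmkNNN_x` and `lmkNNN_z` columns of `framewise_loc` (plus, presumably, matching `_y` columns) would supply the coordinates for each frame.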


In [4]:
# The second argument (baseline video) is optional
framewise_loc, framewise_disp, summary = ow.facial_expressivity('data/subj01.mp4', 'data/subj01_base.mp4')

In [6]:
framewise_loc.head(2)

| Unnamed: 0 | frame | lmk000_x | lmk001_x | lmk002_x | lmk003_x | lmk004_x | lmk005_x | lmk006_x | lmk007_x | lmk008_x | ... | lmk458_z | lmk459_z | lmk460_z | lmk461_z | lmk462_z | lmk463_z | lmk464_z | lmk465_z | lmk466_z | lmk467_z |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0.533611 | 0.534253 | 0.533953 | 0.524744 | 0.534268 | 0.534301 | 0.534436 | 0.458334 | 0.534459 | ... | -0.048058 | -0.047221 | -0.021558 | -0.051052 | -0.037351 | 0.004636 | -0.002157 | -0.010473 | 0.010571 | 0.010526 |
| 1 | 1 | 0.533908 | 0.534034 | 0.534066 | 0.524676 | 0.533993 | 0.534129 | 0.534511 | 0.458254 | 0.534685 | ... | -0.048947 | -0.048006 | -0.02191 | -0.051918 | -0.038112 | 0.005045 | -0.001838 | -0.010256 | 0.011397 | 0.011328 |


<a id='emo'></a> 
### 4. Facial Emotion

Facial emotion detection and analysis

This function uses serengil/deepface to quantify the framewise expressivity of facial emotions, specifically happiness, sadness, anger, fear, disgust, and surprise – in addition to measuring the absence of any emotion (neutral). The summary output provides overall measurements for the input.
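
As an illustration of what an "overall measurement" over framewise scores could look like, the sketch below computes the mean and population standard deviation of each column across frames. The exact contents of the `summary` returned by `ow.emotional_expressivity` are not described here, so treat this as an assumption. The helper name `summarize` is hypothetical; the two sample rows reuse values from the `framewise.head(2)` output in this section.

```python
from statistics import mean, pstdev


def summarize(rows, columns):
    """Per-column mean and population standard deviation across frames.

    `rows` is a list of dicts, one per frame, keyed by column name.
    """
    summary = {}
    for col in columns:
        values = [row[col] for row in rows]
        summary[col] = {"mean": mean(values), "std": pstdev(values)}
    return summary


# Two frames, using a subset of the columns from the framewise output.
rows = [
    {"frame": 0, "angry": 0.05145, "happiness": -0.11309, "composite": 0.137801},
    {"frame": 1, "angry": 0.006272, "happiness": -0.113109, "composite": 0.13957},
]
emotion_summary = summarize(rows, ["angry", "happiness", "composite"])
for name, stats in emotion_summary.items():
    print(name, stats)
```

With a real pandas DataFrame, `framewise.to_dict('records')` produces rows in the shape this helper expects.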


In [13]:
# The second argument (baseline video) is optional
framewise, summary = ow.emotional_expressivity('data/subj01.mp4', 'data/subj01_base.mp4')

In [12]:
framewise.head(2)

| Unnamed: 0 | frame | angry | disgust | fear | happiness | sadness | surprise | neutral | composite |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0.05145 | 3.1e-05 | 0.250485 | -0.11309 | 0.638163 | -0.000234 | -0.446323 | 0.137801 |
| 1 | 1 | 0.006272 | -4e-06 | 0.281984 | -0.113109 | 0.662542 | -0.000263 | -0.452403 | 0.13957 |


<a id='aco'></a> 
### 5. Vocal Acoustic

Vocal acoustic analysis

In [14]:
framewise, pauses, summary = ow.vocal_acoustics('data/trim.wav')

In [18]:
framewise.head(2)

| Unnamed: 0 | f0 | loudness | hnr | form1freq | form2freq | form3freq | form4freq |
|---|---|---|---|---|---|---|---|
| 0 | 107.723057 | 49.712931 | 7.878497 | 439.767344 | 1720.286908 | 2662.749904 | 4328.910656 |
| 1 | 105.881758 | 48.591601 | 9.100513 | 376.79863 | 2513.842458 | 2667.704351 | 4105.551045 |


<a id='trans'></a> 
### 6. Speech Transcription

Vosk is an open-source speech recognition toolkit that supports more than 20 languages.

Language codes: {'Indian English':'en-in', 'Chinese':'cn', 'Russian':'ru', 'French':'fr', 'German':'de', 'Spanish':'es', 'Portuguese/Brazilian Portuguese':'pt', 'Greek':'gr', 'Turkish':'tr', 'Vietnamese':'vn', 'Italian':'it', 'Dutch':'nl', 'Catalan':'ca', 'Arabic':'ar', 'Farsi':'fa', 'Filipino':'ph', 'Ukrainian':'uk', 'Kazakh':'kz', 'Swedish':'sv', 'Japanese':'ja', 'Esperanto':'eo', 'Hindi':'hi', 'Czech':'cz', 'Polish':'pl'}
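
A small lookup helper can catch a mistyped language before a long transcription run starts. The sketch below is hypothetical and not part of openwillis: `LANGUAGE_CODES` copies the mapping above, and `language_code` is an illustrative name. The call in this section passes `'en-us'`, which the mapping does not list, so the `'US English'` entry is added here as an assumption.

```python
# Mapping copied from the list above.
LANGUAGE_CODES = {
    'Indian English': 'en-in', 'Chinese': 'cn', 'Russian': 'ru',
    'French': 'fr', 'German': 'de', 'Spanish': 'es',
    'Portuguese/Brazilian Portuguese': 'pt', 'Greek': 'gr', 'Turkish': 'tr',
    'Vietnamese': 'vn', 'Italian': 'it', 'Dutch': 'nl', 'Catalan': 'ca',
    'Arabic': 'ar', 'Farsi': 'fa', 'Filipino': 'ph', 'Ukrainian': 'uk',
    'Kazakh': 'kz', 'Swedish': 'sv', 'Japanese': 'ja', 'Esperanto': 'eo',
    'Hindi': 'hi', 'Czech': 'cz', 'Polish': 'pl',
}
# Assumption: 'en-us' is accepted because the example call uses it.
LANGUAGE_CODES['US English'] = 'en-us'


def language_code(name_or_code):
    """Return a language code, given either a language name or a code."""
    if name_or_code in LANGUAGE_CODES.values():
        return name_or_code
    # Match language names case-insensitively.
    by_name = {name.lower(): code for name, code in LANGUAGE_CODES.items()}
    try:
        return by_name[name_or_code.lower()]
    except KeyError:
        raise ValueError(f"Unknown language: {name_or_code!r}") from None


print(language_code('French'))
print(language_code('en-us'))
```

The returned code can then be passed as the second argument of `ow.speech_transcription`.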

In [23]:
json_interval, speech = ow.speech_transcription('data/trim.wav', 'en-us')

<a id='s_char'></a> 
### 7. Speech Characteristics

Speech characteristics related to health and functioning

In [31]:
tag_event, speech_summary = ow.speech_characteristics(json_interval, 'en-us')

<a id='s_sep'></a> 
### 8. Speaker Separation: Batch 

Separates two speakers from a mixed audio file.

- Fast Processing

In [5]:
diart_df = ow.speaker_separation('data/trim.wav', 'data/out_dir')

In [6]:
diart_df.head(2)

| Unnamed: 0 | start_time | interval | speaker | filename | end_time | isOverlap |
|---|---|---|---|---|---|---|
| 0 | 0.009 | 1.143 | speaker0 | trim | 1.152 | 0 |
| 1 | 7.358 | 0.45 | speaker0 | trim | 7.808 | 0 |
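
A common next step with output like this is to total the speaking time per speaker. The sketch below assumes that `interval` is the segment duration in seconds, which is consistent with the two rows shown (`start_time + interval` equals `end_time`) but is not documented here. The helper name `speaking_time` is hypothetical, and the rows are copied from the output above.

```python
import math
from collections import defaultdict


def speaking_time(rows):
    """Total segment duration per speaker.

    `rows` is a list of dicts with 'speaker' and 'interval' keys;
    'interval' is assumed to be a duration in seconds.
    """
    totals = defaultdict(float)
    for row in rows:
        totals[row["speaker"]] += row["interval"]
    return dict(totals)


# Rows copied from the diart_df.head(2) output.
rows = [
    {"start_time": 0.009, "interval": 1.143, "speaker": "speaker0",
     "filename": "trim", "end_time": 1.152, "isOverlap": 0},
    {"start_time": 7.358, "interval": 0.45, "speaker": "speaker0",
     "filename": "trim", "end_time": 7.808, "isOverlap": 0},
]

# Sanity check of the assumption: start_time + interval matches end_time.
consistent = all(
    math.isclose(r["start_time"] + r["interval"], r["end_time"], abs_tol=1e-6)
    for r in rows
)
totals = speaking_time(rows)
print(consistent, totals)
```

With a real pandas DataFrame, `diart_df.to_dict('records')` yields rows in this shape.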


<a id='s_sep_stream'></a> 
### 9. Speaker Separation: Stream

In [2]:
diart_df = ow.speaker_separation_stream('data/trim.wav', 'data/out_dir')