# Interactive Demonstration
## OpenWillis user tutorial

This notebook is an interactive demonstration of the basic OpenWillis functions for processing audio and video files. The demo currently covers single-speaker use cases and is intended to give users a sense of what it's like to work with OpenWillis in a Jupyter notebook environment.

__Note:__ Be sure that you have gone through the OpenWillis [installation steps](https://www.notion.so/brooklynhealth/Installing-OpenWillis-and-jupyter-notebook-14983a8fe047814b88ced7d3831791f2?pvs=12) prior to continuing. 

First, we'll load the necessary libraries. Some warning messages may appear, but these can be safely ignored if your environment is set up correctly.

In [None]:
import openwillis as ow
import tensorflow as tf
import whisperx
import moviepy.editor as mp
import pandas as pd
import os

Before getting into the analysis portion, we need to load some data we can work with. For this demonstration, we will use some sample audio and video files of a person reading from a list of [standardized sentences](https://www.cs.columbia.edu/~hgs/audio/harvard.html).

This data can be loaded straight from GitHub into this Jupyter notebook environment.

In [None]:
!git clone https://github.com/bklynhlth/sample_data.git

Next, we'll collect the file names so they are easy to reference in the code that follows.

In [None]:
audio_dir = 'sample_data/audio_files'
video_dir = 'sample_data/video_files'
baseline_dir = 'sample_data/video_files/baseline_videos'

audio_files = [f for f in os.listdir(audio_dir) if f.endswith('.m4a')]
video_files = [f for f in os.listdir(video_dir) if f.endswith('.mp4')]
bl_files = [f for f in os.listdir(baseline_dir)]

Let's check that each of `audio_files`, `video_files`, and `bl_files` contains the correct files. We're working with five recordings, so we should expect three lists of five files each, named `sentences_N_audio.m4a`, `sentences_N_video.mp4`, and `sentences_N_bl_video.mp4` respectively, with N running from 1 to 5.

In [None]:
[audio_files, video_files, bl_files]
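
To make this check explicit, each listing can be compared against the names we expect. The sketch below assumes the naming pattern used in the rest of this notebook (for example `sentences_1_audio.m4a`); `missing_files` is a hypothetical helper written for this illustration, not part of OpenWillis.

```python
def missing_files(found, template, ids=range(1, 6)):
    """Return the expected filenames that are absent from `found`.

    `template` is a format string with an `{i}` placeholder for the file number.
    """
    expected = [template.format(i=i) for i in ids]
    present = set(found)
    return [name for name in expected if name not in present]


# Example with a hard-coded listing; in the notebook you would pass
# audio_files, video_files, or bl_files instead.
example_listing = ['sentences_1_audio.m4a', 'sentences_2_audio.m4a',
                   'sentences_4_audio.m4a', 'sentences_5_audio.m4a']
print(missing_files(example_listing, 'sentences_{i}_audio.m4a'))
# prints ['sentences_3_audio.m4a']
```

An empty list means every expected file is present.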

### 1 - Vocal acoustics

#### 1.1 - Single file

Before we continue, notice that the audio files in this folder are .m4a files. We need to convert them to .mp3 or .wav before OpenWillis can process them, so let's do that first.

In [None]:
input_folder = 'sample_data/audio_files'
output_folder = 'sample_data/audio_files'

for filename in os.listdir(input_folder):
    if filename.endswith(".m4a"):
        m4a_file = os.path.join(input_folder, filename)
        # Swap the .m4a extension for .wav
        wav_file = os.path.join(output_folder, os.path.splitext(filename)[0] + ".wav")

        # Load the audio with moviepy and write it back out as .wav
        audio = mp.AudioFileClip(m4a_file)
        audio.write_audiofile(wav_file)
        audio.close()

Now we can proceed with the analysis. We'll start by processing a single audio file from the 'audio_files' folder above.

In [None]:
framewise, summary = ow.vocal_acoustics(audio_path = 'sample_data/audio_files/sentences_1_audio.wav', voiced_segments = False, option = 'simple')

Now let's take a look at our summary data to make sure it populated correctly. First, we'll tell pandas to display all rows and columns.

In [None]:
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

In [None]:
summary

Looks good! Notice that some of the final columns are 'NaN'. These features belong to the more advanced options and are not populated when 'simple' is passed to the 'option' parameter.
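
If the empty columns get in the way, pandas can drop them. The snippet below is a small sketch on a made-up dataframe; the column names are placeholders, not actual OpenWillis output columns. The same `dropna` call could be applied to `summary`.

```python
import numpy as np
import pandas as pd

# Made-up stand-in for a summary dataframe; column names are placeholders.
example_summary = pd.DataFrame({
    'feature_a': [1.5],
    'feature_b': [0.2],
    'advanced_only': [np.nan],  # mimics a column that was not populated
})

# Drop columns in which every value is NaN; populated columns are kept.
populated = example_summary.dropna(axis=1, how='all')
print(list(populated.columns))
# prints ['feature_a', 'feature_b']
```

Because `dropna` returns a new dataframe, the original is left unchanged.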

#### 1.2 - Multiple files

Here, let's go ahead and run vocal acoustics on all 5 files in our folder using a for loop. 

In [None]:
folder_path = 'sample_data/audio_files'

framewise_data = pd.DataFrame()
summary_data = pd.DataFrame()

for filename in os.listdir(folder_path):
  if filename.endswith('.wav'):
    audio_path = os.path.join(folder_path, filename)

    # Run vocal acoustics function
    framewise, summary = ow.vocal_acoustics(audio_path = audio_path, voiced_segments = False, option = 'simple')

    # Here, make sure we can identify each file by adding the name in the first column of the dataframe, remove '.wav' from the name
    filename_no_ext = os.path.splitext(filename)[0]

    # Add filename column as the first column using insert()
    framewise.insert(0, 'filename', filename_no_ext)
    summary.insert(0, 'filename', filename_no_ext)

    # Store results for each file in each dataframe
    framewise_data = pd.concat([framewise_data, framewise], ignore_index=True)
    summary_data = pd.concat([summary_data, summary], ignore_index=True)

We'll examine the data to make sure it's looking okay. 

In [None]:
summary_data.head()
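
Calling `pd.concat` inside the loop works well for five files. For larger batches, a common alternative is to collect the per-file dataframes in a list and concatenate once at the end, which avoids repeatedly copying the growing dataframe. The sketch below shows the pattern with a hypothetical `fake_summary` function standing in for `ow.vocal_acoustics`, so it runs without any audio files.

```python
import os
import pandas as pd


def fake_summary(audio_path):
    """Hypothetical stand-in for a per-file analysis; returns a one-row dataframe."""
    return pd.DataFrame({'path_length': [len(audio_path)]})


example_filenames = ['sentences_2_audio.wav', 'sentences_1_audio.wav', 'notes.txt']

example_summaries = []
for filename in sorted(example_filenames):  # sorted() gives a reproducible row order
    if not filename.endswith('.wav'):
        continue
    result = fake_summary(os.path.join('sample_data/audio_files', filename))
    # Add the filename (without extension) as the first column
    result.insert(0, 'filename', os.path.splitext(filename)[0])
    example_summaries.append(result)

# Concatenate once, after the loop
example_summary_data = pd.concat(example_summaries, ignore_index=True)
print(example_summary_data['filename'].tolist())
# prints ['sentences_1_audio', 'sentences_2_audio']
```

Sorting the filenames is optional, but `os.listdir` returns entries in arbitrary order, so sorting keeps the rows in a predictable sequence.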

Next, we'll save this output as a .csv file so it can be analyzed further, for example with statistical tests.

In [None]:
output_dir = 'sample_data' # change to your output path
output_filename = 'summary_data.csv'
output_csv_path = os.path.join(output_dir, output_filename)

summary_data.to_csv(output_csv_path, index = False)
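
If you change `output_dir` to a folder that does not exist yet, `to_csv` will raise an error. A minimal sketch of one way to guard against that is shown below; `prepare_output_path` is a hypothetical helper, and the example writes into a temporary directory so it does not touch your files.

```python
import os
import tempfile


def prepare_output_path(output_dir, output_filename):
    """Create `output_dir` if needed and return the full path to the output file."""
    os.makedirs(output_dir, exist_ok=True)  # no error if the folder already exists
    return os.path.join(output_dir, output_filename)


with tempfile.TemporaryDirectory() as tmp:
    nested_dir = os.path.join(tmp, 'results', 'acoustics')
    example_path = prepare_output_path(nested_dir, 'summary_data.csv')
    example_dir_exists = os.path.isdir(nested_dir)

print(example_dir_exists)
# prints True
```

The returned path can then be passed to `summary_data.to_csv(...)` in place of `output_csv_path`.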

### 2 - Speech characteristics

#### 2.1 - Single file

Now we'll continue with the speech characteristics function. First, we need to transcribe our file. Here we use the Vosk transcription function:

In [None]:
transcript_json, transcript_text = ow.speech_transcription_vosk(filepath = 'sample_data/audio_files/sentences_1_audio.wav')

Then we pass the JSON output of the transcription function directly to the speech characteristics function:

In [None]:
words, turns, summary_sc = ow.speech_characteristics(json_conf = transcript_json, option = 'simple')

In [None]:
# Examine summary data 

summary_sc

#### 2.2 - Multiple files

The code below runs the speech characteristics function on multiple files:

In [None]:
folder_path = 'sample_data/audio_files'

word_data = pd.DataFrame()
turns_data = pd.DataFrame()
summary_sc_data = pd.DataFrame()

for filename in os.listdir(folder_path):
  if filename.endswith('.wav'):
    audio_path = os.path.join(folder_path, filename)

    # Transcribe
    transcript_json, transcript_text = ow.speech_transcription_vosk(filepath = audio_path)
    words, turns, summary_sc = ow.speech_characteristics(json_conf = transcript_json, option = 'simple')

    # Here, make sure we can identify each file by adding the name in the first column of the dataframe, remove '.wav' from the name
    filename_no_ext = os.path.splitext(filename)[0]

    # Add filename column as the first column using insert()
    words.insert(0, 'filename', filename_no_ext)
    turns.insert(0, 'filename', filename_no_ext)
    summary_sc.insert(0, 'filename', filename_no_ext)

    # Store results for each file in each dataframe
    word_data = pd.concat([word_data, words], ignore_index=True)
    turns_data = pd.concat([turns_data, turns], ignore_index=True)
    summary_sc_data = pd.concat([summary_sc_data, summary_sc], ignore_index=True)

In [None]:
summary_sc_data.head()

### 3 - Facial expressivity

#### 3.1 - Single file

From here, let's take a look at the video data. We'll start with just running facial expressivity on a single video file. 

The video used in this example is 18 seconds long, and the estimated runtime of `facial_expressivity` is approximately the same. For longer videos, expect runtimes roughly proportional to the video duration.

In [None]:
framewise_loc, framewise_disp, summary_fe = ow.facial_expressivity(filepath = 'sample_data/video_files/sentences_1_video.mp4', baseline_filepath = 'sample_data/video_files/baseline_videos/sentences_1_bl_video.mp4')

We can look at the output in a couple of ways. The `framewise_disp` output shows the displacement of each facial landmark at each frame. This dataframe contains quite a bit of data, so we can also look at `summary_fe`, which gives an overall displacement summary for each composite facial area.

In [None]:
framewise_disp.head()

In [None]:
summary_fe

Just for demonstration, if we don't include a baseline video, the displacement calculations will differ: 

In [None]:
framewise_loc_nobl, framewise_disp_nobl, summary_fe_nobl = ow.facial_expressivity(filepath = 'sample_data/video_files/sentences_1_video.mp4')

summary_fe_nobl

#### 3.2 - Multiple files

When running this function on multiple video files, make sure to match the video file to the baseline file using a subject identifier. 

In [None]:
folder_path = 'sample_data/video_files'
baseline_folder = 'sample_data/video_files/baseline_videos/'

frames_data = pd.DataFrame()
displacement_data = pd.DataFrame()
summary_fe_data = pd.DataFrame()

for filename in os.listdir(folder_path):
    if filename.endswith('.mp4'):
        video_path = os.path.join(folder_path, filename)
        
        # Extract identifier from filename (assuming a pattern like "sentences_1_video.mp4" -> "sentences_1")
        identifier = "_".join(filename.split("_")[:2])  
        baseline_filename = f"{identifier}_bl_video.mp4"  # Construct baseline filename
        baseline_filepath = os.path.join(baseline_folder, baseline_filename)
        
        # Run facial expressivity - this sample uses the same video as a baseline because the samples are of the same person
        framewise_loc, framewise_disp, summary_fe = ow.facial_expressivity(filepath = video_path, baseline_filepath = baseline_filepath)
    
        # Here, make sure we can identify each file by adding the name in the first column of the dataframe, remove '.mp4' from the name
        filename_no_ext = os.path.splitext(filename)[0]

        # Add filename column as the first column using insert()
        framewise_loc.insert(0, 'filename', filename_no_ext)
        framewise_disp.insert(0, 'filename', filename_no_ext)
        summary_fe.insert(0, 'filename', filename_no_ext)

        # Store results for each file in each dataframe
        frames_data = pd.concat([frames_data, framewise_loc], ignore_index=True)
        displacement_data = pd.concat([displacement_data, framewise_disp], ignore_index=True)
        summary_fe_data = pd.concat([summary_fe_data, summary_fe], ignore_index=True)

In [None]:
summary_fe_data.head()
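
The matching step in the loop above assumes that every video has a baseline file with the expected name. A minimal sketch of how that assumption could be checked before any processing starts is shown below; `baseline_path_for` and `split_by_baseline` are hypothetical helpers, and the file lists are hard-coded for illustration.

```python
import os


def baseline_path_for(video_filename, baseline_folder):
    """Build the baseline path for a video named like 'sentences_1_video.mp4'."""
    # Keep the first two underscore-separated parts, e.g. 'sentences_1'
    identifier = "_".join(video_filename.split("_")[:2])
    return os.path.join(baseline_folder, f"{identifier}_bl_video.mp4")


def split_by_baseline(video_filenames, baseline_folder, available_baselines):
    """Separate videos that have a matching baseline from those that do not."""
    available = set(available_baselines)
    matched, unmatched = {}, []
    for name in sorted(video_filenames):
        baseline = baseline_path_for(name, baseline_folder)
        if os.path.basename(baseline) in available:
            matched[name] = baseline
        else:
            unmatched.append(name)
    return matched, unmatched


matched, unmatched = split_by_baseline(
    ['sentences_1_video.mp4', 'sentences_2_video.mp4'],
    'sample_data/video_files/baseline_videos',
    ['sentences_1_bl_video.mp4'],  # baseline for video 2 is deliberately absent
)
print(unmatched)
# prints ['sentences_2_video.mp4']
```

In the notebook, `video_files` and `bl_files` could be passed in directly, and any names in `unmatched` could then be run without a baseline or skipped.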

### 4 - Emotional expressivity

#### 4.1 - Single file

When running the `emotional_expressivity` function, be aware that it is considerably slower than `facial_expressivity`. For the 18-second video, processing time is approximately 50 seconds. For longer videos, plan for a processing time of about 2.5x the file length.
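
As a rough planning aid, the 2.5x rule of thumb can be written as a one-line calculation. The function below is only a sketch of that arithmetic; `estimated_runtime_seconds` is a hypothetical helper, and actual times will vary with hardware and video content.

```python
def estimated_runtime_seconds(video_seconds, factor=2.5):
    """Rough processing-time estimate: video duration times a slowdown factor."""
    return video_seconds * factor


for minutes in (1, 5, 30):
    seconds = estimated_runtime_seconds(minutes * 60)
    print(f"{minutes} min video: about {seconds / 60:.1f} min of processing")
```

For the 18-second sample, the same rule gives 45 seconds, slightly under the roughly 50 seconds observed, so treat the estimate as a lower bound.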

In [None]:
framewise_ee, summary_ee = ow.emotional_expressivity(filepath = 'sample_data/video_files/sentences_1_video.mp4', baseline_filepath = 'sample_data/video_files/baseline_videos/sentences_1_bl_video.mp4')

In [None]:
summary_ee

Again, just for demonstration, if we don't include a baseline video, the expressivity metrics will differ:

In [None]:
framewise_ee_nobl, summary_ee_nobl = ow.emotional_expressivity(filepath = 'sample_data/video_files/sentences_1_video.mp4')

summary_ee_nobl

#### 4.2 - Multiple files

In [None]:
folder_path = 'sample_data/video_files'
baseline_folder = 'sample_data/video_files/baseline_videos/'  

frames_ee_data = pd.DataFrame()
summary_ee_data = pd.DataFrame()

for filename in os.listdir(folder_path):
    if filename.endswith('.mp4'):
        video_path = os.path.join(folder_path, filename)
        
        # Extract identifier from filename (assuming a pattern like "sentences_1_video.mp4" -> "sentences_1")
        identifier = "_".join(filename.split("_")[:2])  
        baseline_filename = f"{identifier}_bl_video.mp4"  # Construct baseline filename
        baseline_filepath = os.path.join(baseline_folder, baseline_filename)

        # Run emotional expressivity - this sample uses a clip from the same video as a baseline because the samples are of the same person
        framewise_ee, summary_ee = ow.emotional_expressivity(filepath = video_path, baseline_filepath = baseline_filepath)
    
        # Here, make sure we can identify each file by adding the name in the first column of the dataframe, remove '.mp4' from the name
        filename_no_ext = os.path.splitext(filename)[0]

        # Add filename column as the first column using insert()
        framewise_ee.insert(0, 'filename', filename_no_ext)
        summary_ee.insert(0, 'filename', filename_no_ext)

        # Store results for each file in each dataframe
        frames_ee_data = pd.concat([frames_ee_data, framewise_ee], ignore_index=True)
        summary_ee_data = pd.concat([summary_ee_data, summary_ee], ignore_index=True)

In [None]:
summary_ee_data.head()