The first step in preprocessing is extracting information from the emotion evaluation files scattered across the session folders of the IEMOCAP dataset. We compile a regular expression that matches the bracketed annotation lines and then loop over all the evaluation files. From each matching line we extract the start time, end time, wav file name, emotion label, and the valence, activation and dominance scores, stored in the lists start_times, end_times, wav_file_names, emotions, vals, acts and doms.

In [None]:
import re
import os
info_line = re.compile(r'\[.+\]\n', re.IGNORECASE)

start_times, end_times, wav_file_names, emotions, vals, acts, doms = [], [], [], [], [], [], []

for sess in range(1, 6):
    emo_evaluation_dir = './IEMOCAP_full_release/Session{}/dialog/EmoEvaluation/'.format(sess)
    evaluation_files = [l for l in os.listdir(emo_evaluation_dir) if 'Ses' in l]
    for file in evaluation_files:
        with open(emo_evaluation_dir + file) as f:
            content = f.read()
        info_lines = re.findall(info_line, content)
        for line in info_lines[1:]:  # the first line is a header
            start_end_time, wav_file_name, emotion, val_act_dom = line.strip().split('\t')
            start_time, end_time = start_end_time[1:-1].split('-')
            val, act, dom = val_act_dom[1:-1].split(',')
            val, act, dom = float(val), float(act), float(dom)
            start_time, end_time = float(start_time), float(end_time)
            start_times.append(start_time)
            end_times.append(end_time)
            wav_file_names.append(wav_file_name)
            emotions.append(emotion)
            vals.append(val)
            acts.append(act)
            doms.append(dom)
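
To make the parsing step concrete, here is a minimal sketch that applies the same regular expression and splitting logic to an in-memory string instead of the files on disk. The two-line sample (a header plus one utterance line) is an assumption about the evaluation file layout and its values are illustrative only; parse_info_line is a hypothetical helper name, not part of the original script.

```python
import re

info_line = re.compile(r'\[.+\]\n', re.IGNORECASE)

# Assumed layout of an evaluation file: one header line, then one
# tab-separated line per utterance (values here are illustrative).
sample = (
    "% [START_TIME - END_TIME] TURN_NAME EMOTION [V, A, D]\n"
    "[6.2901 - 8.2357]\tSes01F_impro01_F000\tneu\t[2.5000, 2.5000, 2.5000]\n"
)

def parse_info_line(line):
    """Split one matched line into its seven fields (hypothetical helper)."""
    start_end_time, wav_file_name, emotion, val_act_dom = line.strip().split('\t')
    # Drop the surrounding brackets, then split on the separator
    start_time, end_time = (float(t) for t in start_end_time[1:-1].split('-'))
    val, act, dom = (float(v) for v in val_act_dom[1:-1].split(','))
    return start_time, end_time, wav_file_name, emotion, val, act, dom

matches = re.findall(info_line, sample)
# The first match is the header, so it is skipped as in the loop above
parsed = [parse_info_line(line) for line in matches[1:]]
print(parsed)
```

Because the pattern is greedy, each match runs from the first "[" on a line to the last "]" before the newline, which is why a whole annotation line is captured as one string.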

Next, we import the pandas library, collect the extracted information in a DataFrame, and save it as a CSV file.

In [None]:
import pandas as pd

df_iemocap = pd.DataFrame(columns=['start_time', 'end_time', 'wav_file', 'emotion', 'val', 'act', 'dom'])

df_iemocap['start_time'] = start_times
df_iemocap['end_time'] = end_times
df_iemocap['wav_file'] = wav_file_names
df_iemocap['emotion'] = emotions
df_iemocap['val'] = vals
df_iemocap['act'] = acts
df_iemocap['dom'] = doms

df_iemocap.tail()

df_iemocap.to_csv('audio_df.csv', index=False)
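
As a quick sanity check on this step, the sketch below builds a small DataFrame in a single call from a dict of lists and round-trips it through CSV using an in-memory buffer rather than a file. The rows are toy stand-ins for the lists produced by the parsing loop, and the names records, df_toy and df_roundtrip are hypothetical.

```python
import io
import pandas as pd

# Toy stand-ins for the lists built by the parsing loop (illustrative values)
records = {
    'start_time': [6.2901, 10.01],
    'end_time': [8.2357, 11.3925],
    'wav_file': ['Ses01F_impro01_F000', 'Ses01F_impro01_F001'],
    'emotion': ['neu', 'fru'],
    'val': [2.5, 2.0],
    'act': [2.5, 3.5],
    'dom': [2.5, 3.0],
}
# A dict of equal-length lists gives one column per key
df_toy = pd.DataFrame(records)

# Write to an in-memory buffer instead of disk, then read it back
buffer = io.StringIO()
df_toy.to_csv(buffer, index=False)
buffer.seek(0)
df_roundtrip = pd.read_csv(buffer)
print(df_roundtrip)
```

Passing index=False keeps the row index out of the CSV, so the file read back has exactly the seven columns that were written.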

The IEMOCAP dataset is divided into two sets: 1) scripted and 2) improvised. Our analysis focuses on the improvised files [cite source here], although the script below leaves that filter commented out and therefore keeps both sets. Furthermore, following previous standards of emotion evaluation, we restrict our focus to the following emotions: "Happy/Excited", "Sad", "Neutral" and "Angry". The script therefore excludes the remaining emotions from the DataFrame, merges "Excited" into "Happy", and stores the result in a new file.

In [2]:
import pandas as pd
audio_df = pd.read_csv('audio_df.csv')

# Scripted sources are kept here; the improvised-only filter is left disabled
# audio_impro = audio_df[~audio_df.wav_file.str.contains("script")]
# Remove the oth, xxx, dis, fea, fru and sur emotions in a single pass
excluded_emotions = ['oth', 'xxx', 'dis', 'fea', 'fru', 'sur']
audio_impro = audio_df[~audio_df.emotion.isin(excluded_emotions)]
# Merge 'exc' (excited) into 'hap' (happy)
audio_impro_processed = audio_impro.replace('exc', 'hap')

audio_impro_processed.to_csv('audio_df_improvisedandscripted_4emotions.csv', index=False, header=True)
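
The effect of the filtering and relabelling can be checked on a toy table. The sketch below is an illustration under assumptions, not the script used for the dataset: the file names and labels are invented, and it uses Series.isin with a list of excluded labels to drop the unwanted emotions before relabelling 'exc' as 'hap'.

```python
import pandas as pd

# Invented rows covering kept, merged and excluded labels
toy = pd.DataFrame({
    'wav_file': ['utt_000', 'utt_001', 'utt_002', 'utt_003', 'utt_004'],
    'emotion': ['neu', 'exc', 'fru', 'xxx', 'ang'],
})

excluded = ['oth', 'xxx', 'dis', 'fea', 'fru', 'sur']

# Keep only rows whose label is not in the excluded list
kept = toy[~toy['emotion'].isin(excluded)].copy()
# Merge "excited" into "happy", touching only the emotion column
kept['emotion'] = kept['emotion'].replace('exc', 'hap')

counts = kept['emotion'].value_counts().to_dict()
print(kept)
print(counts)
```

Building the mask from the same DataFrame that is being indexed avoids the index-alignment warning that pandas can raise when a mask computed on one frame is applied to an already filtered one.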

