# CHI Square Test
To look further into differences in our data, we run several chi² tests to check whether females and males, and grad students and PhDs, differ significantly in their emotion, affect, level of interest, and arousal/valence attributes.

## Import relevant libraries

In [1]:
import numpy as np
import pandas as pd
from os import listdir
import matplotlib.pyplot as plt
import itertools as it
from statsmodels.stats.multitest import multipletests
import statsmodels.api as sm
#import nltk
import scipy.stats as st
import statsmodels.formula.api as smf
import seaborn as sns
import Helper as hp

## Load .csv data with results of OpenSMILE Analysis
First we load the .csv data and clean it (removing NaNs). Then we store the information from all files in separate pandas DataFrames containing the affect, emotion, level of interest and valence/arousal values for all participants.

In [2]:
data = pd.read_csv("CHI_2019_FULL.csv")

#Set Labels 
emotion_label = ['Anger', 'Boredom', 'Disgust', 'Fear', 'Happiness', 'Emo_Neutral', 'Sadness']
affect_label = ['Aggressiv', 'Cheerful', 'Intoxicated', 'Nervous', 'Aff_Neutral', 'Tired']
loi_label = ['Disinterest', 'Normal', 'High Interest']

#Get specific data and save it into new data frames
# We use the pandas .copy(deep=True) function so that each new data frame owns its data. Adding the Char_ID column
# below would otherwise raise a SettingWithCopyWarning
df_emotion = data[['Anger', 'Boredom', 'Disgust', 'Fear', 'Happiness', 'Emo_Neutral', 'Sadness', 'Filename']].copy(deep=True)
df_affect = data[['Aggressiv', 'Cheerful', 'Intoxicated', 'Nervous', 'Aff_Neutral', 'Tired', 'Filename']].copy(deep=True)
df_loi = data[['Disinterest', 'Normal', 'High Interest', 'Filename']].copy(deep=True)
df_ar_val = data[['Arousal', 'Valence', 'Filename']].copy(deep=True)
#For further use, we append the character ID as a column; it is stored together with other information in the filename
#Since we only want the digits, we remove all non-digit characters from the Filename column and append the result to the df

# Raw strings (r'...') keep Python from treating \D as an invalid escape sequence
df_emotion['Char_ID'] = df_emotion['Filename'].replace(r'\D+', '', regex=True)
df_affect['Char_ID'] = df_affect['Filename'].replace(r'\D+', '', regex=True)
df_loi['Char_ID'] = df_loi['Filename'].replace(r'\D+', '', regex=True)
df_ar_val['Char_ID'] = df_ar_val['Filename'].replace(r'\D+', '', regex=True)
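
As a small illustration of the ID extraction above, the following sketch uses made-up filenames (the real naming scheme may differ). It also shows a caveat: removing all non-digit characters concatenates every number in the name, whereas extracting only the leading digits keeps just the ID.

```python
import pandas as pd

# Hypothetical filenames; '3_2_q.csv' is invented to show the caveat
filenames = pd.Series(['0_a.csv', '12_b.csv', '3_2_q.csv'])

# Same effect as the cell above: remove every non-digit character
all_digits = filenames.str.replace(r'\D+', '', regex=True)

# Alternative: keep only the digits at the start of the name
leading_digits = filenames.str.extract(r'^(\d+)', expand=False)

print(all_digits.tolist())      # ['0', '12', '32']
print(leading_digits.tolist())  # ['0', '12', '3']
```

If the filenames only ever contain one number, both approaches give the same result.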



## Let's load information about the speakers
The speaker information is saved in a single .csv file containing four important columns: ID, Age, Sex and Academic Status. The OpenSMILE .csv files loaded before are named using the corresponding speaker ID (e.g. the speaker with ID 0 has two files, 0_a.csv and 0_b.csv), so that each file can be linked to its speaker.
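
A minimal sketch of how such a link can be checked, using invented toy data. The column names mirror the ones used in this notebook, but the values are assumptions. `validate` asserts that every file maps to at most one speaker row, and `indicator` flags files without a matching speaker.

```python
import pandas as pd

# Toy stand-ins for the OpenSMILE results and the speaker table
files = pd.DataFrame({'Filename': ['0_a.csv', '0_b.csv', '7_a.csv'],
                      'Char_ID': [0, 0, 7]})
speakers = pd.DataFrame({'ID': [0, 1], 'Sex': ['Female', 'Male']})

merged = files.merge(speakers, how='left', left_on='Char_ID', right_on='ID',
                     validate='many_to_one', indicator=True)

# Rows marked 'left_only' have no speaker information (here: ID 7)
unmatched = merged.loc[merged['_merge'] == 'left_only', 'Filename'].tolist()
print(unmatched)  # ['7_a.csv']
```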

In [3]:
char_data = pd.read_csv("CHI_2019_CharacterData.csv")  

#Join above tables and Character Tables

#To Join DataFrames we have to cast the column on which we want to join to int, so that both columns have the same data type
char_data['ID'] = char_data['ID'].astype(int)
df_ar_val['Char_ID'] = df_ar_val['Char_ID'].astype(int)
df_emotion['Char_ID'] = df_emotion['Char_ID'].astype(int)
df_affect['Char_ID'] = df_affect['Char_ID'].astype(int)
df_loi['Char_ID'] = df_loi['Char_ID'].astype(int)

#Save new data frames
df_ar_val_char = df_ar_val.merge(char_data, how = 'left', left_on='Char_ID', right_on='ID')
df_emotion_char = df_emotion.merge(char_data, how = 'left', left_on='Char_ID', right_on= 'ID')
df_affect_char = df_affect.merge(char_data, how = 'left', left_on='Char_ID', right_on= 'ID')
df_loi_char = df_loi.merge(char_data, how = 'left', left_on='Char_ID', right_on= 'ID')

#Now, we only want to keep data about the questions
#For that we need to extract from the Filename column which part of the talk a file belongs to
#a = answer, p = presentation, q = question
#sentence_type should be the same for all tables, but we compute it per table just to be sure
#regex=True is required: newer pandas versions treat the pattern as a literal string by default
arval_sentence_type = df_ar_val_char.Filename.str.replace(r'\d+', '', regex=True).str[3:-4]
df_ar_val_char['SentenceType'] = arval_sentence_type
emo_sentence_type = df_emotion_char.Filename.str.replace(r'\d+', '', regex=True).str[3:-4]
df_emotion_char['SentenceType'] = emo_sentence_type
aff_sentence_type = df_affect_char.Filename.str.replace(r'\d+', '', regex=True).str[3:-4]
df_affect_char['SentenceType'] = aff_sentence_type
loi_sentence_type = df_loi_char.Filename.str.replace(r'\d+', '', regex=True).str[3:-4]
df_loi_char['SentenceType'] = loi_sentence_type

#Now select only the questions (SentenceType == 'q')
df_ar_val_char = df_ar_val_char.loc[df_ar_val_char['SentenceType'] == 'q']
df_emotion_char = df_emotion_char.loc[df_emotion_char['SentenceType'] == 'q']
df_affect_char = df_affect_char.loc[df_affect_char['SentenceType'] == 'q']
df_loi_char = df_loi_char.loc[df_loi_char['SentenceType'] == 'q']

## Chi-squared Test of Independence
We start with the characteristic sex. The null hypothesis states that the two categorical variables, sex and e.g. emotion, are independent.

Our data are floats, but chi² needs integer data such as observation counts, so we have to convert them. To illustrate how this is done, we look at one specific emotion, 'Anger'. We need to make sure that no cell of the observation count table has a value below 5, since such sparse cells can cause errors and distort the result. We therefore calculate the quartiles of 'Anger', which gives us three thresholds against which the float values are compared. This way, we can count how many samples fall into the 1st, 2nd, 3rd or 4th quartile.

Since we want to compare two (or more) groups, we do this counting per group: first for the female values only, then for the male values. This yields a 2x4 table, which is used to calculate the chi² statistic; an example table is printed below. Note that the function 'calcFrequencyTable' takes a pd.DataFrame, not a pd.Series, and returns an array of pd.DataFrames. This means that the function calculates one such table for each emotion defined in e.g. emotion_label.
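
The binning described above can be sketched as follows. This is not the code of hp.calcFrequencyTable, whose implementation is not shown here; the function name `quartile_frequency_table` and the toy data are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

def quartile_frequency_table(df, value_col, group_col):
    """Count, per group, how many values fall into each pooled quartile."""
    values = df[value_col]
    # Three thresholds (25 %, 50 %, 75 %) computed over all groups together
    q1, q2, q3 = values.quantile([0.25, 0.5, 0.75])
    bins = [-np.inf, q1, q2, q3, np.inf]
    labels = ['1st Quartile', '2nd Quartile', '3rd Quartile', '4th Quartile']
    quartile = pd.cut(values, bins=bins, labels=labels)
    # Rows: groups, columns: quartiles, cells: observation counts
    return pd.crosstab(df[group_col], quartile)

# Invented toy values, alternating between the two groups
toy = pd.DataFrame({'Anger': [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
                    'Sex': ['Female', 'Male'] * 4})
table = quartile_frequency_table(toy, 'Anger', 'Sex')
print(table)
```

With real data, identical threshold values (for example many zeros) would make the bin edges non-unique, so a production version would need to handle that case.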

In [4]:
#Example Frequency Table for the emotion 'Anger':
#Since the function does the table calculation for all different emotions, we only want to select the first table
#which holds the table for 'anger' (since it's the first element, see declaration of emotion_label at the start)
anger_table = hp.calcFrequencyTable(df_emotion_char, emotion_label, 'Sex')[0]
anger_table

| | 1st Quartile | 2nd Quartile | 3rd Quartile | 4th Quartile |
|---|---|---|---|---|
| Male | 37 | 34 | 31 | 24 |
| Female | 13 | 16 | 18 | 26 |
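
As a cross-check, the chi² statistic for this table can be reproduced directly with scipy.stats.chi2_contingency. The counts are copied from the table above; whether hp.chi2 calls this function internally is an assumption.

```python
import numpy as np
import scipy.stats as st

# Observed counts from the 'Anger' frequency table (rows: Male, Female)
observed = np.array([[37, 34, 31, 24],
                     [13, 16, 18, 26]])

chi2, p_value, dof, expected = st.chi2_contingency(observed)

print(f'chi2 = {chi2:.3f}, p = {p_value:.4f}, dof = {dof}')
# Sparse cells make the test unreliable, so we also inspect the smallest expected count
print(f'smallest expected count: {expected.min():.2f}')
```

The result should agree with the value reported for Anger in the output of the next cell (a chi² of about 7.98 with a p-value of about 0.046).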


In [20]:
print('EMOTION\n')
emo_sex_chi2 = hp.chi2(df_emotion_char, emotion_label,'Sex',  True)
print('\nAFFECT\n')
aff_sec_chi2 = hp.chi2(df_affect_char, affect_label,'Sex',  True)
print('\nAROUSAL-VALENCE\n')
ar_val_sec_chi2 = hp.chi2(df_ar_val_char, ['Arousal', 'Valence'], 'Sex', True)
print('\nLEVEL OF INTEREST\n')
loi_sec_chi2 = hp.chi2(df_loi_char, loi_label, 'Sex', True)
print('\nResiduals of: '+ affect_label[5])
aff_sec_chi2[1][5]

EMOTION

Chi square of Anger : 7.979400978917333 with p-value of: 0.046439349202881494
Chi square of Boredom : 9.699800666515792 with p-value of: 0.021298116476395644
Chi square of Disgust : 23.612368926696575 with p-value of: 3.0095000070533557e-05
Chi square of Fear : 14.551714392214812 with p-value of: 0.002242724091618584
Chi square of Happiness : 11.534425052473697 with p-value of: 0.009160724484638925
Chi square of Emo_Neutral : 6.944349481475564 with p-value of: 0.07369285579979096
Chi square of Sadness : 25.78966740773282 with p-value of: 1.055509780245254e-05

AFFECT

Chi square of Aggressiv : 3.7390287152042814 with p-value of: 0.2910582116938856
Chi square of Cheerful : 2.8129300069669094 with p-value of: 0.4213758284501843
Chi square of Intoxicated : 6.046367622065134 with p-value of: 0.10937600254719419
Chi square of Nervous : 11.47643405176813 with p-value of: 0.009409817906417777
Chi square of Aff_Neutral : 10.044232064645819 with p-value of: 0.018193864517542818
Chi squ

| | 1st Quartile | 2nd Quartile | 3rd Quartile | 4th Quartile |
|---|---|---|---|---|
| Male | -0.223241 | 0.794123 | 2.039943 | -2.597088 |
| Female | 0.223241 | -0.794123 | -2.039943 | 2.597088 |


Analysing the p-values for the emotion of speakers during a question, we see significant results for anger, boredom, disgust, fear, happiness and sadness. The standardized residuals show where the groups differ:

- Anger: females tend to have higher values (4th quartile ~2.60) than males (1st quartile ~1.81).
- Boredom: females tend to have higher values (4th quartile ~2.94) than males (1st quartile 1.81).
- Disgust: females tend to have higher values (4th quartile 3.95) than males (1st quartile 3.85).
- Fear: females have higher values (4th quartile 3.61), but males also have values greater than the median (3rd quartile ~2.04) as well as lower values (1st quartile ~1.47), so we can say that females have higher values of fear when asking a question.
- Happiness: females have only one positive residual, in the 4th quartile (~3.28), so we can say that females tend to have a happier voice than males when asking questions.
- Sadness: females have lower values (1st quartile 4.63) than males (3rd quartile ~1.70; 4th quartile ~3.17), so we can say that males tend to sound sadder when asking a question.

Turning to affect, we see significant p-values for nervous and tired; Aff_Neutral (p ≈ 0.018) also falls below 0.05 in the output above, but is not examined further here. Regarding nervousness, males tend to be more nervous when asking a question (4th quartile ~2.49) than females (1st quartile ~2.94). The residuals of tiredness indicate that males tend to be more tired than females. The differences lie in the 1st, 2nd and 3rd quartiles, where females have a value of ~3.61 for the 1st quartile and males have values of ~2.16 and ~1.02 for the 2nd and 3rd quartiles.

Looking at the p-values for arousal and valence, we see that females and males differ significantly regarding arousal. The residuals show that females tend to have higher arousal values than males (3rd and 4th quartile positive for females; 1st and 2nd quartile positive for males).

Looking at the p-values for level of interest, we see significant differences in disinterest and high interest. The residuals show that females have lower values for disinterest than males (1st and 2nd quartile positive for females; 3rd and 4th quartile positive for males). For high interest, females have more values within the 4th quartile (~2.60) and males have more values within the 3rd quartile (~2.04).

Now let's look at the native speaker status ('IsNativeSpeaker').

In [6]:
print('EMOTION\n')
emo_age_chi2 = hp.chi2(df_emotion_char, emotion_label,'IsNativeSpeaker', True)
print('\nAFFECT\n')
aff_age_chi2 = hp.chi2(df_affect_char, affect_label, 'IsNativeSpeaker', True)
print('\nAROUSAL-VALENCE\n')
ar_val_age_chi2 = hp.chi2(df_ar_val_char, ['Arousal', 'Valence'],'IsNativeSpeaker' ,True)
print('\nLEVEL OF INTEREST\n')
loi_age_chi2 = hp.chi2(df_loi_char, loi_label, 'IsNativeSpeaker',  True)

EMOTION

Chi square of Anger : 3.4628546739641797 with p-value of: 0.7489049838394384
Chi square of Boredom : 9.575069159420526 with p-value of: 0.1437253285804718
Chi square of Disgust : 5.796508153937251 with p-value of: 0.4463672667898605
Chi square of Fear : 3.3581159203546287 with p-value of: 0.7627409553120313
Chi square of Happiness : 2.6425732810015896 with p-value of: 0.8521826388460916
Chi square of Emo_Neutral : 4.4935598749634975 with p-value of: 0.6101985063475324
Chi square of Sadness : 3.3540875843580675 with p-value of: 0.7632705061484063

AFFECT

Chi square of Aggressiv : 3.0847378867087376 with p-value of: 0.7981386785974371
Chi square of Cheerful : 5.179613843848351 with p-value of: 0.520991492914689
Chi square of Intoxicated : 7.730412087132151 with p-value of: 0.25852678411801183
Chi square of Nervous : 7.988152953749067 with p-value of: 0.23897253639409172
Chi square of Aff_Neutral : 4.409909102154375 with p-value of: 0.6213855200120844
Chi square of Tired : 6.686

## Post-Hoc tests for age and native speaker, as they have three different groups

If a significant p-value for the category 'NativeSpeaker' is found, we do not yet know which groups differ significantly from each other, so post-hoc testing is done for this character feature.
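
The idea behind such a post-hoc test can be sketched as follows: run one chi² test per pair of groups, then correct the resulting p-values for multiple testing. This is not the code of hp.chi2_post_hoc, whose implementation is not shown here; the function name `pairwise_chi2` and the counts are invented for illustration.

```python
import itertools as it
import pandas as pd
import scipy.stats as st
from statsmodels.stats.multitest import multipletests

def pairwise_chi2(table, method='bonferroni', alpha=0.05):
    """Run a chi² test for every pair of rows and correct the p-values."""
    pairs = list(it.combinations(table.index, 2))
    raw_p = []
    for first, second in pairs:
        # Keep only the two groups being compared
        sub_table = table.loc[[first, second]]
        _, p, _, _ = st.chi2_contingency(sub_table)
        raw_p.append(p)
    # multipletests returns the reject decisions and the corrected p-values first
    reject, corrected, _, _ = multipletests(raw_p, alpha=alpha, method=method)
    return pairs, reject, corrected

# Invented counts, for illustration only
toy = pd.DataFrame(
    [[12, 10, 9, 11], [10, 11, 12, 9], [30, 8, 7, 29]],
    index=['Asian Non-Native', 'Europ. Non-Native', 'Native Speaker'],
    columns=['1st Quartile', '2nd Quartile', '3rd Quartile', '4th Quartile'])

pairs, reject, corrected = pairwise_chi2(toy)
for pair, rej, p in zip(pairs, reject, corrected):
    print(pair, rej, round(p, 4))
```

With the Bonferroni method, each raw p-value is multiplied by the number of comparisons (three here) and capped at 1, which is why corrected p-values of exactly 1 appear in the output below.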

In [7]:
print('EMOTION\n')
print('post-hoc emotions and different groups')
emo_reject_list, emo_corrected_p_vals, emo_combinations, emo_residuals= hp.chi2_post_hoc(df_emotion_char,emotion_label, 'IsNativeSpeaker', 'bonferroni', True, True)
print('\nAFFECT\n')
print('\n post-hoc affect and different groups')
aff_reject_list, aff_corrected_p_vals, aff_combinations, aff_residuals = hp.chi2_post_hoc(df_affect_char, affect_label, 'IsNativeSpeaker' ,'bonferroni', True, True)
print('\nAROUSAL-VALENCE\n')
print('\n post-hoc arousal-valence and different groups')
ar_val_reject_list, ar_val_corrected_p_vals, ar_val_combinations, ar_val_residuals = hp.chi2_post_hoc(df_ar_val_char, ['Arousal', 'Valence'], 'IsNativeSpeaker', 'bonferroni',True, True)
print('\nLEVEL OF INTEREST\n')
print('\n post-hoc level of interest and different groups')
loi_reject_list, loi_corrected_p_vals, loi_combinations, loi_residuals = hp.chi2_post_hoc(df_loi_char, loi_label, 'IsNativeSpeaker', 'bonferroni', True, True)

EMOTION

post-hoc emotions and different groups
Anger
Combinations: [('Asian Non-Native', 'Europ. Non-Native'), ('Asian Non-Native', 'Native Speaker'), ('Europ. Non-Native', 'Native Speaker')]
Reject List: [False False False]
Corrected p-values: [1. 1. 1.]
Boredom
Combinations: [('Asian Non-Native', 'Europ. Non-Native'), ('Asian Non-Native', 'Native Speaker'), ('Europ. Non-Native', 'Native Speaker')]
Reject List: [False False False]
Corrected p-values: [0.4682962  0.66744414 1.        ]
Disgust
Combinations: [('Asian Non-Native', 'Europ. Non-Native'), ('Asian Non-Native', 'Native Speaker'), ('Europ. Non-Native', 'Native Speaker')]
Reject List: [False False False]
Corrected p-values: [1. 1. 1.]
Fear
Combinations: [('Asian Non-Native', 'Europ. Non-Native'), ('Asian Non-Native', 'Native Speaker'), ('Europ. Non-Native', 'Native Speaker')]
Reject List: [False False False]
Corrected p-values: [1. 1. 1.]
Happiness
Combinations: [('Asian Non-Native', 'Europ. Non-Native'), ('Asian Non-Native', 

## Further Analysis
Now that we know we have significant p-values, we should investigate in which cells the groups differ from each other. For this, we can calculate the residuals, i.e. the differences between the observed table and a table containing the values we would expect if the chi² null hypothesis of independence were true.
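
A minimal sketch of such a residual calculation, using the 'Anger' frequency table shown earlier. It computes adjusted standardized residuals, which is an assumption about what the helper module does. For this table the formula gives about 2.60 for females in the 4th quartile and about 1.81 for males in the 1st quartile, in line with the values quoted in the analysis above.

```python
import numpy as np
import pandas as pd

# Observed counts from the 'Anger' frequency table
observed = pd.DataFrame(
    [[37, 34, 31, 24], [13, 16, 18, 26]],
    index=['Male', 'Female'],
    columns=['1st Quartile', '2nd Quartile', '3rd Quartile', '4th Quartile'])

n = observed.to_numpy().sum()
row_share = observed.sum(axis=1).to_numpy()[:, None] / n   # shape (2, 1)
col_share = observed.sum(axis=0).to_numpy()[None, :] / n   # shape (1, 4)

# Expected counts if sex and emotion quartile were independent
expected = n * row_share * col_share

# Adjusted standardized residuals:
# (O - E) / sqrt(E * (1 - row share) * (1 - column share))
residuals = (observed - expected) / np.sqrt(
    expected * (1 - row_share) * (1 - col_share))
print(residuals.round(2))
```

In a table with only two rows, the residuals of the two groups are exact mirror images of each other. Cells with an absolute value above roughly 2 are the ones that deviate notably from independence.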