## ZuCo - Discrete emotion multiple linear regression analysis

This analysis looks for evidence of a linear relationship between the ZuCo feature values recorded for the individual words in the dataset's sentences and the emotion intensity scores assigned to those words by the NRC Emotion Intensity Lexicon.
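
As a reading aid, the model fitted for each emotion can be written out explicitly. The form below is inferred from the feature columns selected in the cells of this notebook (MPS, TRT, GD, FFD) and from the constant term added with `sm.add_constant`; the symbols $\beta_j$ and $\varepsilon$ are notation introduced here, not taken from the original.

$$
\text{Score} = \beta_0 + \beta_1\,\text{MPS} + \beta_2\,\text{TRT} + \beta_3\,\text{GD} + \beta_4\,\text{FFD} + \varepsilon
$$

Under this reading, a feature counts as related to the emotion score only if the hypothesis $\beta_j = 0$ is rejected at the significance level SL = 0.05.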

In [None]:
import pandas as pd
import statsmodels.api as sm
from sklearn.preprocessing import MinMaxScaler

import sys
# Make the parent directory importable so the local BackwardElimination module is found.
# Insert at index 1: index 0 is the script path (or '' in a REPL).
sys.path.insert(1, '../')
import BackwardElimination as be

## Import the ZuCo dataset

In [None]:
zuco_ds = pd.read_csv('../Lexicons/ZuCo_words_dataset.csv')

## Normalizing values

In [None]:
scaler = MinMaxScaler()
zuco_ds.iloc[:,1:] = pd.DataFrame(scaler.fit_transform(zuco_ds.iloc[:,1:].values), columns=zuco_ds.columns[1:])
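
For reference, `MinMaxScaler` with its default settings rescales each column to the range [0, 1] using (x - min) / (max - min). The following is a minimal sketch of that arithmetic on made-up values; the toy frame and its numbers are assumptions for illustration and are not taken from the ZuCo dataset.

```python
import pandas as pd

# Toy values for illustration only (not ZuCo data).
toy = pd.DataFrame({
    "Word": ["a", "b", "c"],
    "TRT": [100.0, 200.0, 300.0],
    "FFD": [50.0, 50.0, 150.0],
})

# Every column except the first is treated as a numeric feature.
features = toy.columns[1:]
col_min = toy[features].min()
col_range = toy[features].max() - col_min

# Column-wise min-max scaling: (x - min) / (max - min).
scaled = toy.copy()
scaled[features] = (toy[features] - col_min) / col_range
print(scaled)
```

Note that this hand-written version divides by zero for a constant column, so it is only meant to show the formula.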

## Get discrete emotion intensity values dataset from NRC lexicon

In [None]:
anger_lex = pd.read_csv('../Lexicons/NRC_Emotion_Intensity_Lexicon/NRC-Emotion-Intensity-anger-scores.csv')
anticipation_lex = pd.read_csv('../Lexicons/NRC_Emotion_Intensity_Lexicon/NRC-Emotion-Intensity-anticipation-scores.csv')
disgust_lex = pd.read_csv('../Lexicons/NRC_Emotion_Intensity_Lexicon/NRC-Emotion-Intensity-disgust-scores.csv')
fear_lex = pd.read_csv('../Lexicons/NRC_Emotion_Intensity_Lexicon/NRC-Emotion-Intensity-fear-scores.csv')
joy_lex = pd.read_csv('../Lexicons/NRC_Emotion_Intensity_Lexicon/NRC-Emotion-Intensity-joy-scores.csv')
sadness_lex = pd.read_csv('../Lexicons/NRC_Emotion_Intensity_Lexicon/NRC-Emotion-Intensity-sadness-scores.csv')
surprise_lex = pd.read_csv('../Lexicons/NRC_Emotion_Intensity_Lexicon/NRC-Emotion-Intensity-surprise-scores.csv')
trust_lex = pd.read_csv('../Lexicons/NRC_Emotion_Intensity_Lexicon/NRC-Emotion-Intensity-trust-scores.csv')

## Intersect each discrete emotion lexicon with the ZuCo words dataset

In [None]:
anger_ds = pd.merge(zuco_ds, anger_lex, how ='inner', on =['Word'])
anger_ds = anger_ds.drop(['Word'], axis=1)

anticipation_ds = pd.merge(zuco_ds, anticipation_lex, how ='inner', on =['Word'])
anticipation_ds = anticipation_ds.drop(['Word'], axis=1)

disgust_ds = pd.merge(zuco_ds, disgust_lex, how ='inner', on =['Word'])
disgust_ds = disgust_ds.drop(['Word'], axis=1)

fear_ds = pd.merge(zuco_ds, fear_lex, how ='inner', on =['Word'])
fear_ds = fear_ds.drop(['Word'], axis=1)

joy_ds = pd.merge(zuco_ds, joy_lex, how ='inner', on =['Word'])
joy_ds = joy_ds.drop(['Word'], axis=1)

sadness_ds = pd.merge(zuco_ds, sadness_lex, how ='inner', on =['Word'])
sadness_ds = sadness_ds.drop(['Word'], axis=1)

surprise_ds = pd.merge(zuco_ds, surprise_lex, how ='inner', on =['Word'])
surprise_ds = surprise_ds.drop(['Word'], axis=1)

trust_ds = pd.merge(zuco_ds, trust_lex, how ='inner', on =['Word'])
trust_ds = trust_ds.drop(['Word'], axis=1)
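
The eight merges above follow one pattern: an inner join on `Word` keeps only the words present in both tables, and the key column is then dropped. A small sketch of that pattern on toy frames is shown here; the helper name `intersect_with_lexicon` and the toy values are hypothetical and not part of the original notebook.

```python
import pandas as pd

def intersect_with_lexicon(features_df, lexicon_df, key="Word"):
    """Inner-join the two frames on `key`, then drop the key column."""
    merged = pd.merge(features_df, lexicon_df, how="inner", on=[key])
    return merged.drop([key], axis=1)

# Toy data for illustration only.
toy_zuco = pd.DataFrame({"Word": ["happy", "table", "angry"],
                         "TRT": [0.2, 0.5, 0.9]})
toy_lex = pd.DataFrame({"Word": ["angry", "furious"],
                        "Score": [0.8, 0.95]})

# Only "angry" appears in both frames, so one row survives the join.
toy_ds = intersect_with_lexicon(toy_zuco, toy_lex)
print(toy_ds)
```

Words missing from either table are silently discarded by the inner join, so the number of rows in each per-emotion dataset depends on the overlap between the lexicon and the ZuCo vocabulary.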

## Build multiple linear regression models

In [None]:
#anger_ds = pd.DataFrame(anger_ds, columns=['MPS','TRT','GD','FFD','SCORE'])
#print (anger_ds)
X = anger_ds[['MPS','TRT','GD','FFD']]
y = anger_ds['Score']
X = sm.add_constant(X)
#model = be.backwardEliminationMLR(X,y)
model = sm.OLS(y, X).fit()
model.summary()

No correlation has been noticed between ZuCo values and anger score values (SL = 0.05)

In [None]:
X = anticipation_ds[['MPS','TRT','GD','FFD']]
y = anticipation_ds['Score']
X = sm.add_constant(X)

model = be.backwardEliminationMLR(X,y)
model.summary()

No correlation has been noticed between ZuCo values and anticipation score values (SL = 0.05)

In [None]:
X = disgust_ds[['MPS','TRT','GD','FFD']]
y = disgust_ds['Score']
X = sm.add_constant(X)

model = be.backwardEliminationMLR(X,y)
model.summary()

No correlation has been noticed between ZuCo values and disgust score values (SL = 0.05)

In [None]:
X = fear_ds[['MPS','TRT','GD','FFD']]
y = fear_ds['Score']
X = sm.add_constant(X)

model = be.backwardEliminationMLR(X,y)
model.summary()

No correlation has been noticed between ZuCo values and fear score values (SL = 0.05)

In [None]:
X = joy_ds[['MPS','TRT','GD','FFD']]
y = joy_ds['Score']
X = sm.add_constant(X)

model = be.backwardEliminationMLR(X,y)
model.summary()

No correlation has been noticed between ZuCo values and joy score values (SL = 0.05)

In [None]:
X = sadness_ds[['MPS','TRT','GD','FFD']]
y = sadness_ds['Score']
X = sm.add_constant(X)

model = be.backwardEliminationMLR(X,y)
model.summary()

No correlation has been noticed between ZuCo values and sadness score values (SL = 0.05)

In [None]:
X = surprise_ds[['MPS','TRT','GD','FFD']]
y = surprise_ds['Score']
X = sm.add_constant(X)

model = be.backwardEliminationMLR(X,y)
model.summary()

No correlation has been noticed between ZuCo values and surprise score values (SL = 0.05)

In [None]:
X = trust_ds[['MPS','TRT','GD','FFD']]
y = trust_ds['Score']
X = sm.add_constant(X)

model = be.backwardEliminationMLR(X,y)
model.summary()

No correlation has been noticed between ZuCo values and trust score values (SL = 0.05)

### Conclusions

The results do not show any meaningful sign of a linear correlation between the ZuCo feature values and any of the individual discrete emotions.