<a href="https://colab.research.google.com/github/mhuckvale/pals0039/blob/master/Answers_4_2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

[![PALS0039 Logo](https://www.phon.ucl.ac.uk/courses/pals0039/images/pals0039logo.png)](https://www.phon.ucl.ac.uk/courses/pals0039/)

# Exercise 4.2 Answers

In this exercise we implement a DNN for recognition of emotion from speech.

The data comes from recordings made at the [Enterface05 summer school](http://www.enterface.net/).

The audio recordings have been extracted from the video files, and each has been processed through the [OpenSMILE](https://www.audeering.com/opensmile/) feature extractor. Each recording is represented by a fixed-length vector of 6373 features. Every recording also has an emotion label from the set <tt>['anger', 'disgust', 'fear', 'happiness', 'sadness', 'surprise']</tt>, which was the target emotion given to the speaker. The code loads the data from a CSV file, encodes the emotion classes as numbers, then selects random subsets for training and testing. We then build a DNN classifier and display its performance as a confusion matrix.

---
(a) Set-up. Run the code and add comments.

In [None]:
# import standard libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# import Keras library
%tensorflow_version 2.x
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

---
(b) Download the data set and measure its parameters. Run the code and add comments.

In [None]:
# download the emotion corpus spreadsheet
df=pd.read_csv("https://www.phon.ucl.ac.uk/courses/pals0039/data/emotion.csv",sep=',')

# print size
print("Number of rows=",len(df))
print("Number of columns=",len(df.columns))

# print first rows
df.head()

---
(c) Convert data and labels into numpy arrays. Run the code and add comments.

In [None]:
# convert the EMOTION column to numbers
df['EMOTION']=pd.Categorical(df['EMOTION'])

# get a list of the named categories
emolist=list(df['EMOTION'].cat.categories)
print(emolist)

# convert data frame to numpy arrays - selecting training features
Xdata=np.array(df.iloc[:,4:])
ydata=np.array(df['EMOTION'].cat.codes)

# get total number of rows
ndata=Xdata.shape[0]
# get random order
p=np.random.permutation(ndata)
# train on 90%
ntrain=int(0.9*ndata)

# divide into training and test data
Xtrain=Xdata[p[:ntrain],:]
Xtest=Xdata[p[ntrain:],:]
ytrain=ydata[p[:ntrain]]
ytest=ydata[p[ntrain:]]

print(Xtrain[:10,:20])
print(ytrain[:10])
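
The label encoding used above can be checked on a small example. The sketch below uses made-up labels (not rows from the emotion corpus) to show that `pd.Categorical` sorts the category names and that each code is an index into that sorted list, which is why `emolist` can later be used to name the rows and columns of the confusion matrix.

```python
import pandas as pd

# Toy labels, made up for illustration (not taken from emotion.csv)
labels = pd.Series(["happiness", "anger", "fear", "anger"])
cat = pd.Categorical(labels)

# Categories are the sorted unique labels; each code indexes into that list
categories = list(cat.categories)
codes = cat.codes.tolist()
print(categories)   # ['anger', 'fear', 'happiness']
print(codes)        # [2, 0, 1, 0]

# Map integer codes back to names, e.g. to label model predictions
decoded = [categories[c] for c in codes]
print(decoded)
```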

---
(d) Build the DNN model. Run the code and add comments.

In [None]:
# get some basic sizes of inputs and outputs
isize=Xtrain.shape[1]
osize=len(emolist)
print("inputs",isize,"outputs",osize)

# use Keras sequential model
model = Sequential()
# add dense layer with isize inputs
model.add(Dense(64,activation='tanh',input_shape=(isize,)))
# add a hidden layer
model.add(Dense(16,activation='tanh'))
# add the output layer as set of class probabilities
model.add(Dense(osize, activation='softmax'))
# compile model using the sparse cross-entropy, which allows us to present the classes as integers
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
# summary() prints the table itself and returns None, so no print() is needed
model.summary()
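
The parameter counts reported by the model summary can be checked by hand: a dense layer has one weight per input-unit pair plus one bias per unit. The sketch below assumes the input size is the 6373 features mentioned in the introduction; if `isize` printed above differs, the first layer's count changes accordingly. The helper name `dense_params` is introduced here only for illustration.

```python
def dense_params(n_inputs, n_units):
    """Number of weights plus biases in a fully connected layer."""
    return n_inputs * n_units + n_units

# Assumed sizes: 6373 inputs (from the introduction), then 64, 16 and 6 units
layer_sizes = [6373, 64, 16, 6]
per_layer = [dense_params(i, o)
             for i, o in zip(layer_sizes[:-1], layer_sizes[1:])]
total = sum(per_layer)
print(per_layer)   # [407936, 1040, 102]
print(total)       # 409078
```

Almost all of the parameters sit in the first layer, which is worth keeping in mind when changing the network structure in part (h).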

---
(e) Train the model. Run the code and add comments.

In [None]:
# train the model, using 10% as validation data
history=model.fit(Xtrain,ytrain, epochs=10, batch_size=64, validation_split=0.1)

---
(f) Evaluate the model on the test data. Run the code and add comments.

In [None]:
# evaluate the model
loss,accuracy=model.evaluate(Xtest,ytest)
print("Loss",loss,"Accuracy",accuracy)

---
(g) Print confusion matrix. Run the code and add comments.

In [None]:
# get predictions of the model
ypred=model.predict(Xtest)

# use argmax to choose most probable
ypred=np.argmax(ypred,axis=1)

# set up confusion matrix (rows: true class, columns: predicted class)
counts=np.zeros((osize,osize))
correct=0
# compare predictions with correct answer
for i in range(len(ytest)):
  if (ypred[i]==ytest[i]):
    correct += 1
  counts[ytest[i],ypred[i]] += 1

# print the confusions
print(emolist)
print(counts)
print("Correct %.1f%%" % (100*correct/len(ytest)))
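
Overall accuracy can hide large differences between classes. As a minimal sketch on toy labels (illustrative values, not output from the model above), the code below builds the same kind of confusion matrix without an explicit loop and then derives the per-class recall from it. It follows the convention used above: rows are the true class and columns the predicted class.

```python
import numpy as np

# Toy labels for three classes (illustrative values only)
y_true = np.array([0, 0, 1, 1, 2, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0, 2])
n_classes = 3

# Rows index the true class, columns the predicted class
counts = np.zeros((n_classes, n_classes), dtype=int)
np.add.at(counts, (y_true, y_pred), 1)
print(counts)

# Recall per class: correct predictions divided by the row total
recall = np.diag(counts) / counts.sum(axis=1)
# Overall accuracy: sum of the diagonal divided by the number of items
accuracy = np.trace(counts) / counts.sum()
print(recall, accuracy)
```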

---
(h) Experiment with the example to try to improve performance. Here are some ideas:
<ol>
<li>Change the size and structure of the network.
<li>Change the training regime.
<li>(advanced) Select a subset of the 6373 features according to how much they vary between emotion classes in the training set. Eliminate the unused features (those that don't vary much across classes) from both training and testing data.
</ol>
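
One possible reading of the feature selection idea is sketched below. It scores each feature by the variance of its class means divided by its average within-class variance, then keeps the top-scoring features. This ratio is only one of several reasonable scores, and the function names `f_ratio` and `select_top_features`, the value of `k`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def f_ratio(X, y):
    """Per-feature ratio of between-class to within-class variance."""
    classes = np.unique(y)
    # Mean of every feature within each class: shape (n_classes, n_features)
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    # Average variance of every feature inside the classes
    within = np.array([X[y == c].var(axis=0) for c in classes]).mean(axis=0)
    # Variance of the class means across classes
    between = means.var(axis=0)
    # Small constant avoids division by zero for constant features
    return between / (within + 1e-12)

def select_top_features(X, y, k):
    """Indices of the k features with the highest score, best first."""
    scores = f_ratio(X, y)
    return np.argsort(scores)[::-1][:k]

# Synthetic demonstration: only feature 2 depends on the class
rng = np.random.default_rng(0)
y = np.repeat(np.arange(3), 50)
X = rng.normal(size=(150, 5))
X[:, 2] += 3.0 * y

keep = select_top_features(X, y, k=2)
X_reduced = X[:, keep]
print(keep, X_reduced.shape)
```

In the exercise, the scores would be computed from `Xtrain` and `ytrain` only, and the same index array would then be applied to both sets (`Xtrain[:,keep]` and `Xtest[:,keep]`) so that no information from the test data influences the selection.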