<a href="https://colab.research.google.com/github/mhuckvale/pals0039/blob/master/Answers_4_3.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

[![PALS0039 Logo](https://www.phon.ucl.ac.uk/courses/pals0039/images/pals0039logo.png)](https://www.phon.ucl.ac.uk/courses/pals0039/)

# Exercise 4.3 Answers

In this exercise we implement a DNN to predict the age of a speaker from their speech.

The data comes from the [Accents of the British Isles](http://www.thespeechark.com/abi-2-page.html) corpus. Each speaker read the same short passage, and the audio was processed using the [OpenSMILE](https://www.audeering.com/opensmile/) toolkit into fixed-length vectors of 6373 features. We add GENDER as an additional input feature and normalise all features to z-scores using the statistics of the training set. We scale the speaker AGE to a value between 0 and 1 (by dividing by 100) and build a network with a linear output to predict it. We plot the predictions as a scatter plot, and calculate the correlation, root-mean-square error and mean absolute error.

---
(a) Set up for learning. Run the code and add comments.

In [None]:
# import standard libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# select TensorFlow 2.x (a Colab-only magic command)
%tensorflow_version 2.x
# import Keras model and layer classes
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout

---
(b) Load in the data as two spreadsheets. Run the code and add comments.

In [None]:
# load a CSV file and divide into feats and labels
def LoadCSV(filename,labelname,featcolstart):
  # read CSV file
  df=pd.read_csv(filename,sep=',')
  # convert GENDER column to codes
  gender=pd.Categorical(df['GENDER'])
  # append gender code to audio features
  feats=np.append(np.array(df.iloc[:,featcolstart:]),np.reshape(gender.codes,(len(df),1)),axis=1)
  # convert age to numpy array
  labels=df[labelname].to_numpy()
  # randomise order
  p = np.random.permutation(feats.shape[0])
  return labels[p],feats[p]

# download train and test data
ytrain,Xtrain=LoadCSV("https://www.phon.ucl.ac.uk/courses/pals0039/data/age-train.csv","AGE",4)
ytest,Xtest=LoadCSV("https://www.phon.ucl.ac.uk/courses/pals0039/data/age-test.csv","AGE",4)

# get basic parameters
ntrain=Xtrain.shape[0]
ntest=Xtest.shape[0]
isize=Xtrain.shape[1]
osize=1
print("ntrain",ntrain,"ntest",ntest,"isize",isize,"osize",osize)

---
(c) Normalise the features. Run the code and add comments.

In [None]:
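# get the mean and standard deviation of each feature from the training data
mn=np.mean(Xtrain,axis=0)
sd=np.std(Xtrain,axis=0)
# guard against division by (near) zero for features that hardly vary
sd[(sd<0.001)]=1

# z-score both sets using the training statistics
Xtrain=(Xtrain-mn)/sd
Xtest=(Xtest-mn)/sd

# scale ages in years down to roughly the range 0-1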
ytrain=ytrain/100
ytest=ytest/100
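
To see what the normalisation step does, here is a minimal sketch on toy data. The helper names `zscore_fit` and `zscore_apply` and the toy arrays are hypothetical and not part of the exercise code. It illustrates two points from the cell above: the statistics come from the training data only, and a feature that does not vary has its standard deviation replaced by 1 so that the division is safe.

```python
import numpy as np

def zscore_fit(X, min_sd=0.001):
    """Return per-column mean and standard deviation of X.
    Columns with a standard deviation below min_sd get 1 instead."""
    mn = X.mean(axis=0)
    sd = X.std(axis=0)
    sd = np.where(sd < min_sd, 1.0, sd)
    return mn, sd

def zscore_apply(X, mn, sd):
    """Z-score X using previously computed statistics."""
    return (X - mn) / sd

# toy data: column 0 varies, column 1 is constant in the training set
train = np.array([[1.0, 5.0], [3.0, 5.0], [5.0, 5.0]])
test = np.array([[3.0, 6.0]])

mn, sd = zscore_fit(train)           # statistics from training data only
train_z = zscore_apply(train, mn, sd)
test_z = zscore_apply(test, mn, sd)  # test data reuses the training statistics

print(train_z)
print(test_z)
```

Reusing the training statistics for the test data means that both sets are mapped onto the same scale, and no information from the test set leaks into the preprocessing.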

---
(d) Build a model. Run the code and add comments.

In [None]:
# use Keras sequential model
model = Sequential()
# add first hidden layer of 64 units, taking isize input features
model.add(Dense(64,activation='tanh',input_shape=(isize,)))
# add dropout layer (to encourage generalisation)
model.add(Dropout(0.5))
# add second hidden layer of 32 units
model.add(Dense(32,activation='tanh'))
# add output layer with linear activation
model.add(Dense(osize, activation='linear'))
# compile the model with MSE loss, reporting MAE as an additional metric
model.compile(loss='mean_squared_error', optimizer='rmsprop', metrics=['mae'])
# summary() prints the table itself and returns None, so no print() is needed
model.summary()

---
(e) Fit the model. Run the code and add comments.

In [None]:
# train model - use 10% for validation
hist=model.fit(Xtrain,ytrain, epochs=100, batch_size=32, validation_split=0.1)
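
The `hist` object returned by `fit` holds the loss and metric values for every epoch in its `history` dictionary (with the settings above, the keys are `loss`, `mae`, `val_loss` and `val_mae`). As a minimal sketch of how it could be inspected, the hypothetical helper `best_epoch` below finds the epoch with the lowest validation loss. The toy values stand in for `hist.history` and are not real results.

```python
def best_epoch(history, key="val_loss"):
    """Return (epoch number counted from 1, value) where history[key] is lowest."""
    values = history[key]
    index = min(range(len(values)), key=values.__getitem__)
    return index + 1, values[index]

# toy values standing in for hist.history (not real training results)
toy_history = {
    "loss":     [0.090, 0.050, 0.030, 0.020, 0.015],
    "val_loss": [0.080, 0.060, 0.045, 0.050, 0.055],
}

epoch, value = best_epoch(toy_history)
print("lowest val_loss", value, "at epoch", epoch)
```

In the notebook the call would be `best_epoch(hist.history)`. If the validation loss reaches its minimum well before the final epoch while the training loss keeps falling, the network is probably overfitting in the later epochs.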

---
(f) Evaluate the model on test data. Run the code and add comments.

In [None]:
# get MSE loss and MAE loss on test data
mseloss,maeloss=model.evaluate(Xtest,ytest)
print("mseloss",mseloss,"maeloss",maeloss*100,"years")

---
(g) Plot predictions. Run the code and add comments.

In [None]:
# get predictions of model
ypred=model.predict(Xtest)

# convert back to years
ytest=ytest*100
ypred=ypred.flatten()*100

# calculate some statistics
rmse=np.sqrt(np.mean((ytest-ypred)*(ytest-ypred)))
mae=np.mean(np.abs(ytest-ypred))
corr=np.corrcoef(ytest,ypred);

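# plot male and female in different colours; the gender column was z-scored
# along with the other features, so test its sign rather than using it as an index
colours=['red','blue']
cols=[colours[int(g>0)] for g in Xtest[:,isize-1]]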
# plot a scatter plot
plt.scatter(x=ytest,y=ypred,c=cols,s=20,alpha=0.5);
# calculate a line of best fit
poly = np.polyfit(ytest, ypred, 1)
x=np.array((np.min(ytest),np.max(ytest)))
y=poly[1]+poly[0]*x
plt.plot(x,y,':g')
plt.xlabel("Actual Age (yr)");
plt.ylabel("Predicted Age (yr)");
plt.grid()
plt.title("RMSE=%.3f MAE=%.3f r=%.3f" % (rmse,mae,corr[0,1]));
plt.show()


---
(h) Experiment with the network to try to improve performance. Some ideas:
<ol>
<li>Remove the feature normalisation step - is it useful?
<li>Change the structure of the network or training regime
<li>(advanced) Select a subset of the 6373 features depending on whether the feature has a useful correlation with speaker AGE. Remove features from the training and test sets which do not correlate with age to any useful degree.
</ol>
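
The feature selection idea can be sketched as follows. This is one possible approach under stated assumptions, not a definitive answer: the helper name `select_by_correlation`, the threshold value and the toy data are all hypothetical. It computes the Pearson correlation of each feature column with the labels and keeps the columns whose absolute correlation reaches the threshold.

```python
import numpy as np

def select_by_correlation(X, y, threshold=0.2):
    """Return indices of the columns of X whose absolute Pearson
    correlation with y is at least threshold."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    # product of each column's norm with the label norm
    denom = np.sqrt((Xc ** 2).sum(axis=0)) * np.sqrt((yc ** 2).sum())
    # constant columns have zero norm; treat their correlation as 0
    r = np.divide(Xc.T @ yc, denom, out=np.zeros(X.shape[1]), where=denom > 0)
    return np.flatnonzero(np.abs(r) >= threshold)

# toy data: one feature tied to age, one unrelated, one constant
age = np.linspace(18.0, 80.0, 200)
alternating = np.where(np.arange(200) % 2 == 0, 1.0, -1.0)
X_toy = np.column_stack([2.0 * age + 1.0, alternating, np.ones(200)])

keep = select_by_correlation(X_toy, age)
print(keep)  # only column 0 is kept
```

In the notebook, the indices would be computed from the training data only, for example `keep = select_by_correlation(Xtrain, ytrain)`, and then applied to both sets with `Xtrain[:, keep]` and `Xtest[:, keep]`. `isize` has to be updated before the model is built, and the plotting code in (g) assumes that the gender code is still the last column.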
