<a href="https://colab.research.google.com/github/mhuckvale/pals0039/blob/master/Answers_6_2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

[![PALS0039 Logo](https://www.phon.ucl.ac.uk/courses/pals0039/images/pals0039logo.png)](https://www.phon.ucl.ac.uk/courses/pals0039/)

# Exercise 6.2 Answers

In this exercise we build a small-vocabulary, isolated-word recogniser using a recurrent network classifier.


(a) Import the usual libraries. Run the code and add comments.

In [0]:
# import the data handling, numerical and plotting libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

# import keras toolkit
%tensorflow_version 2.x
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.layers import Dense, Embedding, Flatten, SimpleRNN, LSTM, GRU, Bidirectional, Dropout
from tensorflow.keras.preprocessing.sequence import pad_sequences

---
(b) Download a data set and prepare for processing. Run the code and add comments.

In [0]:
# set up the coding we will use for the commands as a dictionary
COMMANDS={ "yes":0, "no":1, "up":2, "down":3, "left":4, "right":5, "on":6, "off":7, "stop":8, "go":9 }

# download the speech data and unpack into constant-length sequences
def prepare_data(filename,maxseq):
  # read the dataset
  df=pd.read_csv(filename)
  # group the data by recording
  grouped=df.groupby("FILE")
  nseq=len(grouped)
  # set up the array to hold the feature data (19 = number of filterbank channels)
  feats=np.zeros((nseq,maxseq,19))
  # set up the array to hold the command codes
  labels=np.zeros((nseq),dtype='int')
  # loop through each group (=each file)
  i=0
  for name,group in grouped:
    # name = recording identifier (the FILE value), group = frames of speech data for that recording
    n=min(len(group),maxseq)
    # copy the speech data into the feature array
    feats[i,0:n,:] = group.iloc[0:n,2:21].to_numpy()
    # look up the integer code for the command name and store it in the labels array
    labels[i]=COMMANDS[group.LABEL.iat[0]]
    i+=1
  # normalise the data so that the top 50dB is mapped to 0..1
  limit=np.amax(feats)-50
  feats=(feats-limit)/50
  feats[feats<0]=0
  # return the imported data in random order
  p = np.random.permutation(nseq)
  return feats[p,:,:],labels[p]

# load training, validation and test data
Xtrain, ytrain = prepare_data("https://www.phon.ucl.ac.uk/courses/pals0039/data/command-train.csv",100)
Xval, yval = prepare_data("https://www.phon.ucl.ac.uk/courses/pals0039/data/command-valid.csv",100)
Xtest, ytest = prepare_data("https://www.phon.ucl.ac.uk/courses/pals0039/data/command-test.csv",100)

# report what we have got
print(Xtrain.shape,ytrain.shape)
print(Xval.shape,yval.shape)
print(Xtest.shape,ytest.shape)
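
The normalisation step inside `prepare_data` can be looked at on its own. The following is a minimal sketch of the same arithmetic on a handful of made-up dB values; the helper name `normalise_top_range` and the example numbers are illustrative assumptions, not part of the exercise data.

```python
import numpy as np

def normalise_top_range(feats, range_db=50.0):
    """Map the top `range_db` dB of `feats` onto 0..1; anything lower becomes 0."""
    limit = np.amax(feats) - range_db      # lowest level that is kept
    scaled = (feats - limit) / range_db    # global maximum -> 1, limit -> 0
    return np.maximum(scaled, 0.0)         # values below the limit -> 0

# Made-up energies in dB; the 0.0 stands in for a zero-padded frame
example = np.array([0.0, 20.0, 45.0, 70.0, 95.0])
normalised = normalise_top_range(example)
print(normalised)   # values: 0, 0, 0, 0.5, 1
```

Note that zero-padded frames only end up at 0 when the global maximum is at least `range_db`, which seems a reasonable assumption for filterbank energies in dB but is worth checking on the real data.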


---
(c) Display some of the command words. Run the code and add comments.

In [0]:
# get list of command words
labellist=list(COMMANDS.keys())
# for the first five words
for i in range(5):
  # display the spectrogram
  word = Xtrain[i]
  plt.imshow(word.T, origin='lower',cmap='binary')
  # title with the name of command
  plt.title(labellist[int(ytrain[i])])
  plt.show()


---
(d) Build a model. Run the code and add comments.

In [0]:
# get basic sizes of problem
seqlen=Xtrain.shape[1]
isize=Xtrain.shape[2]
osize=len(COMMANDS)

# build a recurrent network with 10 outputs
model = Sequential()
model.add(Bidirectional(LSTM(16, return_sequences=True), merge_mode='ave', input_shape=(seqlen, isize)))
model.add(Flatten())
model.add(Dense(osize, activation='softmax'))
#
# compile the network to produce 10-way classifications specified as integer labels
model.compile(loss='sparse_categorical_crossentropy', optimizer='rmsprop', metrics=['accuracy'])
# summary() prints the table itself; wrapping it in print() would also print "None"
model.summary()
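
As a sanity check on the model summary, the parameter counts can be worked out by hand. This is a rough sketch that assumes the layers use their default Keras settings (bias terms enabled) and that the data has the shape requested above (100 frames of 19 channels); the helper names are hypothetical.

```python
def lstm_params(input_dim, units):
    # Four gates, each with input weights, recurrent weights and a bias vector
    return 4 * (units * (input_dim + units) + units)

def dense_params(input_dim, units):
    # One weight per input-output pair plus one bias per output
    return input_dim * units + units

seqlen, isize, osize, units = 100, 19, 10, 16

bilstm = 2 * lstm_params(isize, units)   # forward LSTM + backward LSTM
flat = seqlen * units                    # merge_mode='ave' keeps 16 values per frame
dense = dense_params(flat, osize)
total = bilstm + dense
print(bilstm, dense, total)
```

Most of the parameters sit in the final dense layer, because flattening the whole output sequence gives it 1600 inputs. That is one thing to keep in mind when trying other configurations.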

---
(e) Train the model. Run the code and add comments.

In [0]:
# train the model for 25 epochs (may not be enough)
history=model.fit(Xtrain,ytrain, validation_data=(Xval,yval), epochs=25)


---
(f) Evaluate model on test set. Run the code and add comments.

In [0]:
# get the accuracy on the test data
score,acc = model.evaluate(Xtest,ytest,verbose=0)
print("Test accuracy: %.2f" % (acc))

# get the actual predictions
ypred = model.predict(Xtest)
ypred=np.argmax(ypred,axis=1)

# get the list of commands
labellist=list(COMMANDS.keys())

# use the pandas crosstab function to calculate and print a row-normalised confusion matrix
y_actu = pd.Categorical.from_codes(ytest, categories=labellist)
y_pred = pd.Categorical.from_codes(ypred, categories=labellist)
df_confusion = pd.crosstab(y_actu, y_pred, margins=False, normalize='index',dropna=False)
df_confusion
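
With `normalize='index'` each row of the confusion matrix sums to 1, so the diagonal gives the proportion of each command that was recognised correctly. The sketch below reproduces that calculation in plain NumPy on a tiny made-up example; the function name and the toy labels are assumptions for illustration only.

```python
import numpy as np

def row_normalised_confusion(y_true, y_pred, n_classes):
    """Confusion matrix with rows = actual class, each row scaled to sum to 1."""
    counts = np.zeros((n_classes, n_classes))
    np.add.at(counts, (y_true, y_pred), 1)            # count each (actual, predicted) pair
    row_totals = counts.sum(axis=1, keepdims=True)
    # rows for classes with no items are left as zeros in this sketch
    return np.divide(counts, row_totals, out=np.zeros_like(counts), where=row_totals > 0)

# Toy labels for a 3-class problem (not real results)
y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0])
cm = row_normalised_confusion(y_true, y_pred, 3)
per_class_recall = np.diag(cm)
print(cm)
print(per_class_recall)
```

Applied to the notebook's variables this would be called as `row_normalised_confusion(ytest, ypred, len(COMMANDS))`.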

---
(g) Experiment with different network configurations and amounts of training. Plot the loss and accuracy curves for the train and validation data. What is the best performance you can obtain on the test set?
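
One possible way to plot the curves is sketched below. It assumes the dictionary stored in `history.history` after training, which for a TensorFlow 2.x model compiled with `metrics=['accuracy']` should contain the keys `loss`, `val_loss`, `accuracy` and `val_accuracy`. The helper names `plot_history` and `best_epoch` are hypothetical.

```python
import matplotlib.pyplot as plt

def plot_history(hist):
    """Plot train/validation loss and accuracy side by side from a history dict."""
    fig, (ax_loss, ax_acc) = plt.subplots(1, 2, figsize=(10, 4))
    epochs = range(1, len(hist["loss"]) + 1)
    # left panel: loss curves
    ax_loss.plot(epochs, hist["loss"], label="train")
    ax_loss.plot(epochs, hist["val_loss"], label="validation")
    ax_loss.set_xlabel("epoch")
    ax_loss.set_ylabel("loss")
    ax_loss.legend()
    # right panel: accuracy curves
    ax_acc.plot(epochs, hist["accuracy"], label="train")
    ax_acc.plot(epochs, hist["val_accuracy"], label="validation")
    ax_acc.set_xlabel("epoch")
    ax_acc.set_ylabel("accuracy")
    ax_acc.legend()
    return fig

def best_epoch(hist, key="val_accuracy"):
    """Return (1-based epoch number, value) where `key` is highest."""
    values = hist[key]
    best = max(range(len(values)), key=values.__getitem__)
    return best + 1, values[best]

# In the notebook, after model.fit(...):
#   plot_history(history.history); plt.show()
#   print(best_epoch(history.history))
```

A validation curve that flattens or turns upwards while the training loss keeps falling suggests overfitting, in which case adding `Dropout` or reducing the number of epochs may be worth trying before comparing configurations on the test set.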