## Building and training the model

Import the data from the data processing notebook

In [1]:
%store -r x_train
%store -r x_test
%store -r y_train
%store -r y_test
%store -r yy
%store -r le

Import Deep Learning frameworks

In [2]:
import numpy as np
from keras.models import Sequential
from keras.models import save_model
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Convolution2D, MaxPooling2D
from keras.optimizers import Adam
from keras.utils import np_utils
from sklearn import metrics

Using TensorFlow backend.


Build the neural network. The Sequential model is a plain stack of layers in which each layer has exactly one input tensor and one output tensor. Each of the two hidden layers has 256 nodes.

Layer 1: Each sample has 40 MFCCs, so the input shape is 40. ReLU (Rectified Linear Unit) is the activation function that computes each node's output: positive values pass through unchanged and negative values become zero. It is a common choice for hidden layers in deep learning classifiers. Dropout is set to 0.5, which randomly excludes half of the nodes at each training update to improve generalisation and reduce overfitting.

Layer 2: A second hidden layer with the same structure as the first.

Layer 3: The output layer, with 2 nodes: one for emergency and one for non-emergency. The activation is softmax, which turns the two outputs into probabilities that sum to 1, so each output can be read as the model's confidence in that class.

In [3]:
num_labels = yy.shape[1]
filter_size = 2

model = Sequential()

model.add(Dense(256, input_shape=(40,)))
model.add(Activation('relu'))
model.add(Dropout(0.5))

model.add(Dense(256))
model.add(Activation('relu'))
model.add(Dropout(0.5))

model.add(Dense(num_labels))
model.add(Activation('softmax'))
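
As a rough illustration of what the three layers compute, the following NumPy sketch runs one forward pass through a network of the same shape. The weights are random stand-ins rather than the trained Keras weights, the helper names (`relu`, `softmax`, `dropout`) are hypothetical, and the dropout shown is the common "inverted" form, which is an assumption about how it is implemented.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    # Pass positive values through, set negative values to zero
    return np.maximum(0.0, x)

def softmax(z):
    # Subtract the row maximum for numerical stability before exponentiating
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def dropout(x, rate, rng):
    # Inverted dropout: zero each unit with probability `rate`,
    # then rescale the survivors so the expected activation is unchanged
    keep = rng.random(x.shape) >= rate
    return x * keep / (1.0 - rate)

# One fake sample with 40 features (stand-in for 40 averaged MFCCs)
x = rng.normal(size=(1, 40))

# Random stand-in weights with the same shapes as the Dense layers
W1, b1 = rng.normal(scale=0.1, size=(40, 256)), np.zeros(256)
W2, b2 = rng.normal(scale=0.1, size=(256, 256)), np.zeros(256)
W3, b3 = rng.normal(scale=0.1, size=(256, 2)), np.zeros(2)

h1 = dropout(relu(x @ W1 + b1), 0.5, rng)   # layer 1 (training-time behaviour)
h2 = dropout(relu(h1 @ W2 + b2), 0.5, rng)  # layer 2
probs = softmax(h2 @ W3 + b3)               # layer 3: two class probabilities

print(probs, probs.sum())
```

Whatever the random weights, the two output values are non-negative and sum to 1. Dropout is only active during training; at prediction time every node is used.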

Categorical crossentropy is the loss function used for classification with one-hot encoded labels across two or more classes. Essentially, it gives the model a score for how it is performing: the lower the score, the closer the predicted probabilities are to the true labels.
The accuracy metric reports the fraction of samples classified correctly. During training it is shown for both the training data and the validation data.
The Adam optimizer takes its name from Adaptive Moment Estimation. It is an extension of stochastic gradient descent that adapts the step size for each weight using running averages of the gradient and of its square.
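
To make the loss described above concrete, here is a minimal NumPy sketch of categorical crossentropy on one-hot labels. The function name, the clipping constant `eps`, and the example predictions are all hypothetical; this is not Keras's own implementation.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    # Clip to avoid log(0), then average -sum(y_true * log(y_pred)) over samples
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    return float(-np.mean(np.sum(y_true * np.log(y_pred), axis=1)))

# Two samples: the first is class 0, the second is class 1 (one-hot encoded)
y_true = np.array([[1.0, 0.0],
                   [0.0, 1.0]])

# Made-up predictions: one confident and correct, one hesitant
confident = np.array([[0.9, 0.1],
                      [0.2, 0.8]])
unsure = np.array([[0.6, 0.4],
                   [0.5, 0.5]])

loss_confident = categorical_crossentropy(y_true, confident)
loss_unsure = categorical_crossentropy(y_true, unsure)
print(loss_confident, loss_unsure)
```

The confident predictions give the lower loss, which matches the idea that a lower score means the predicted probabilities sit closer to the true labels.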

In [4]:
model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer='adam')
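
For readers curious about what `optimizer='adam'` does at each step, the sketch below applies the standard Adam update rule to a toy one-dimensional problem, minimising f(w) = w². The function name `adam_minimise`, the toy objective, and the learning rate of 0.1 are assumptions chosen for illustration; Keras handles all of this internally.

```python
import numpy as np

def adam_minimise(grad_fn, w0, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    w = float(w0)
    m = 0.0  # running average of the gradient (first moment)
    v = 0.0  # running average of the squared gradient (second moment)
    for t in range(1, steps + 1):
        g = grad_fn(w)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        # Bias correction compensates for m and v starting at zero
        m_hat = m / (1 - beta1 ** t)
        v_hat = v / (1 - beta2 ** t)
        # Step size is scaled per weight by the root of the second moment
        w -= lr * m_hat / (np.sqrt(v_hat) + eps)
    return w

# Gradient of f(w) = w**2 is 2*w; start far from the minimum at w = 0
w_final = adam_minimise(lambda w: 2.0 * w, w0=5.0)
print(w_final)
```

Starting from w = 5, the weight ends up close to the minimum at 0. In the real network the same rule is applied to every weight, with gradients computed from each batch of training data.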

In [5]:
model.summary()

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_1 (Dense)              (None, 256)               10496     
_________________________________________________________________
activation_1 (Activation)    (None, 256)               0         
_________________________________________________________________
dropout_1 (Dropout)          (None, 256)               0         
_________________________________________________________________
dense_2 (Dense)              (None, 256)               65792     
_________________________________________________________________
activation_2 (Activation)    (None, 256)               0         
_________________________________________________________________
dropout_2 (Dropout)          (None, 256)               0         
_________________________________________________________________
dense_3 (Dense)              (None, 2)                
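
The parameter counts in the summary can be checked by hand: a Dense layer has one weight per input-unit pair plus one bias per unit. The short sketch below reproduces the arithmetic. The helper name `dense_params` is hypothetical, and the layer sizes are taken from the model definition and output shapes above.

```python
def dense_params(n_inputs, n_units):
    # Weights (n_inputs * n_units) plus one bias per unit
    return n_inputs * n_units + n_units

# 40 input features, two hidden layers of 256 nodes, 2 output nodes
layer_sizes = [40, 256, 256, 2]
counts = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total = sum(counts)

print(counts)  # [10496, 65792, 514]
print(total)   # 76802
```

The first two values match the `Param #` column for `dense_1` and `dense_2`. Activation and Dropout layers have no trainable parameters, which is why they show 0.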

Use the model to evaluate the test data before the network has been trained. This provides a baseline that shows how much training improves the predictions. (If the model has already been trained in this session, the accuracy will be high.)

In [6]:
score = model.evaluate(x_test, y_test, verbose=0)
accuracy = 100*score[1]

print("Pre-training accuracy: %.4f%%" % accuracy)

Pre-training accuracy: 37.3786%


Train the model by fitting it to the training data for 30 epochs, where one epoch is a full pass over the training set. The number of epochs is chosen by the user; ideally it is the number needed for accuracy to stop improving and stabilise.
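
One common way to pick the number of epochs automatically is early stopping: halt once the validation loss has not improved for a set number of epochs. The pure-Python sketch below shows the idea only. The function name `stopping_epoch`, the `patience` value, and the validation losses are hypothetical and are not results from this notebook. Keras offers a comparable `EarlyStopping` callback, which is not used here.

```python
def stopping_epoch(val_losses, patience=3):
    """Return the 1-based epoch at which training would stop."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            # New best validation loss: reset the counter
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    # Never ran out of patience: train for every epoch
    return len(val_losses)

# Made-up validation losses that flatten out after epoch 4
val_losses = [0.70, 0.55, 0.48, 0.45, 0.46, 0.47, 0.45, 0.44]
stop = stopping_epoch(val_losses, patience=3)
print(stop)  # 7
```

With these made-up losses, the best value is reached at epoch 4 and three epochs pass without a strict improvement, so training would stop at epoch 7.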

Batch size defines how many samples pass through the network before its weights are updated. Here, 32 samples are sent through, the network is updated, and so on. This requires less computation than updating after every sample, so the network trains faster.

In [7]:
from datetime import datetime 

num_epochs = 30
num_batch_size = 32

start = datetime.now()

history = model.fit(x_train, y_train, batch_size=num_batch_size, epochs=num_epochs, validation_data=(x_test, y_test), verbose=1)


duration = datetime.now() - start
print("Training completed in time: ", duration)

Train on 1645 samples, validate on 412 samples
Epoch 1/30
Epoch 2/30
Epoch 3/30
Epoch 4/30
Epoch 5/30
Epoch 6/30
Epoch 7/30
Epoch 8/30
Epoch 9/30
Epoch 10/30
Epoch 11/30
Epoch 12/30
Epoch 13/30
Epoch 14/30
Epoch 15/30
Epoch 16/30
Epoch 17/30
Epoch 18/30
Epoch 19/30
Epoch 20/30
Epoch 21/30
Epoch 22/30
Epoch 23/30
Epoch 24/30
Epoch 25/30
Epoch 26/30
Epoch 27/30
Epoch 28/30
Epoch 29/30
Epoch 30/30
Training completed in time:  0:00:13.808711
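
The effect of the batch size can be worked out from the numbers above: 1645 training samples (from the log), a batch size of 32, and 30 epochs. The variable names in this sketch are illustrative.

```python
import math

n_train = 1645    # training samples reported in the log
batch_size = 32
epochs = 30

# The final batch of each epoch holds whatever samples are left over
updates_per_epoch = math.ceil(n_train / batch_size)
last_batch_size = n_train - (updates_per_epoch - 1) * batch_size
total_updates = updates_per_epoch * epochs

print(updates_per_epoch)  # 52
print(last_batch_size)    # 13
print(total_updates)      # 1560
```

So the weights are updated 52 times per epoch instead of 1645 times, which is where the saving in computation comes from.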


Evaluate the model on both the training set and the test set. A training accuracy much higher than the test accuracy is an indication of overfitting.

In [8]:
# Evaluating the model on the training and testing set
from tensorflow.keras.models import load_model

# Note: this loads a saved model from 'my_model', replacing the in-memory model trained above
model = load_model('my_model')
score = model.evaluate(x_train, y_train, verbose=0)
print("Training Accuracy: ", score[1])

score = model.evaluate(x_test, y_test, verbose=0)
print("Testing Accuracy: ", score[1])

Training Accuracy:  0.9410334229469299
Testing Accuracy:  0.9150485396385193
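
To put the two figures side by side, the sketch below computes the gap between training and testing accuracy and converts each accuracy into a count of misclassified samples, using the sample counts from the training log (1645 and 412). The variable names are illustrative.

```python
train_acc = 0.9410334229469299
test_acc = 0.9150485396385193
n_train, n_test = 1645, 412

# A large positive gap would suggest the model has memorised the training data
gap = train_acc - test_acc

train_errors = round((1 - train_acc) * n_train)
test_errors = round((1 - test_acc) * n_test)

print("Accuracy gap: %.4f" % gap)  # Accuracy gap: 0.0260
print(train_errors, test_errors)   # 97 35
```

A gap of about 2.6 percentage points is fairly small, so these numbers do not point to severe overfitting.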


To use the classifier, reuse the feature extraction function from the previous notebook.

In [9]:
import librosa 
import numpy as np 

def extract_feature(file_name):
    """Return the 40 MFCCs averaged over time as an array of shape (1, 40), or None on failure."""
    try:
        audio_data, sample_rate = librosa.load(file_name, res_type='kaiser_fast')
        mfccs = librosa.feature.mfcc(y=audio_data, sr=sample_rate, n_mfcc=40)
        # Average each coefficient across all time frames
        mfccsscaled = np.mean(mfccs.T, axis=0)

    except Exception as e:
        print("Error encountered while parsing file: ", file_name, e)
        return None

    return np.array([mfccsscaled])
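
The averaging step in `extract_feature` is what turns a clip of any length into a fixed-size input. The NumPy sketch below shows the shapes involved, using a random matrix as a stand-in for real MFCCs. The frame count of 173 is an arbitrary assumption; real clips give different values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for librosa output: one row per coefficient, one column per time frame
n_mfcc, n_frames = 40, 173
mfccs = rng.normal(size=(n_mfcc, n_frames))

# Transpose to (frames, coefficients), then average over the frames
mfccs_mean = np.mean(mfccs.T, axis=0)

# Wrap in an outer array so the model sees a batch containing one sample
features = np.array([mfccs_mean])

print(mfccs.shape, mfccs_mean.shape, features.shape)  # (40, 173) (40,) (1, 40)
```

The final shape `(1, 40)` matches the `input_shape=(40,)` of the first Dense layer, with the leading 1 being the batch dimension.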

In [10]:
def print_prediction(file_name):
    prediction_feature = extract_feature(file_name)
    if prediction_feature is None:
        return

    # predict() returns the softmax probabilities; predict_classes() and
    # predict_proba() are not available in recent TensorFlow/Keras releases
    predicted_proba = model.predict(prediction_feature)[0]

    # The predicted class is the index with the highest probability
    predicted_vector = np.array([np.argmax(predicted_proba)])
    predicted_class = le.inverse_transform(predicted_vector)
    print("The predicted class is:", predicted_class[0], '\n')

    for i in range(len(predicted_proba)):
        category = le.inverse_transform(np.array([i]))
        print(category[0], "\t\t : ", format(predicted_proba[i], '.32f'))

In [11]:
test_file_path = '../Datasets/test/train_balanced/nonEmergency/11.wav'
print_prediction(test_file_path)

The predicted class is: non_emergency 

emergency 		 :  0.00110092386603355407714843750000
non_emergency 		 :  0.99889910221099853515625000000000


In [12]:
model.save("MLP")

In [13]:
%store history

Stored 'history' (History)
