# Proof of Concept Prototype

This is a first attempt at building a neural network that can recognize the cry of a child, using Keras and CNTK.

In [37]:
from __future__ import print_function
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import BatchNormalization, LeakyReLU
import numpy as np

Having imported the necessary libraries, we now load the data files.
Each file holds the samples for one class, and the labels are generated here: 1 for cries and 0 for noise.

In [3]:
cries = np.load("../dataset/slow/cries.npy")
cry_labels = np.ones((cries.shape[0], 1), dtype=int)

noise = np.load("../dataset/slow/noise.npy")
noise_labels = np.zeros((noise.shape[0], 1), dtype=int)

print("Cries #:", cries.shape)
print("Noise #:", noise.shape)

Cries #: (9000L, 128L)
Noise #: (25000L, 128L)


The shape describes the two dimensions of each dataset, i.e. the number of rows and the length of each row. The `L` suffix only marks a Python 2 long integer.

For example, a matrix of shape (100L, 128L) has 100 rows, each a vector of length 128.
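As a small illustration of how the shape relates to rows and row length, the sketch below uses a stand-in array of zeros (`example` is a hypothetical name, not one of the datasets loaded above).

```python
import numpy as np

# Stand-in array with the shape used in the example: 100 rows of length 128.
example = np.zeros((100, 128))

# shape is a tuple: (number of rows, length of each row)
rows, length = example.shape

# Indexing with a single integer returns one row as a 1-D vector.
first_row = example[0]

print("rows:", rows, "length:", length, "row shape:", first_row.shape)
```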

Now that the datasets are loaded, we can split them into training and testing sets and design our neural network.

**Note**: These datasets are quite small, but will be expanded in the future.

In [4]:
# Use the first 80% of each dataset for training and the remaining 20% for testing.
c_len = cries.shape[0]
c_cut = (c_len // 10) * 8  # integer division, so the result is a valid slice index
n_len = noise.shape[0]
n_cut = (n_len // 10) * 8

training_data = np.vstack((cries[:c_cut],noise[:n_cut]))
training_labels = np.vstack((cry_labels[:c_cut], noise_labels[:n_cut]))

testing_data = np.vstack((cries[c_cut:],noise[n_cut:]))
testing_labels = np.vstack((cry_labels[c_cut:], noise_labels[n_cut:]))
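The split can also be sketched as a small helper. This is only an illustration under stated assumptions: `split_by_tenths`, `cries_demo` and `noise_demo` are hypothetical names, and the tiny arrays stand in for the real datasets.

```python
import numpy as np

def split_by_tenths(data, tenths=8):
    """Return (first tenths/10 of the rows, remaining rows)."""
    cut = (data.shape[0] // 10) * tenths  # same arithmetic as the cell above
    return data[:cut], data[cut:]

# Tiny stand-ins for the real datasets.
cries_demo = np.ones((10, 128))
noise_demo = np.zeros((20, 128))

c_train, c_test = split_by_tenths(cries_demo)
n_train, n_test = split_by_tenths(noise_demo)

# Stack both classes into one training set and one testing set.
train = np.vstack((c_train, n_train))
test = np.vstack((c_test, n_test))

print(train.shape, test.shape)
```

Note that a split like this keeps the rows in their original order; whether shuffling before splitting would help depends on how the files were recorded.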

In [60]:
model = Sequential()
model.add(BatchNormalization(input_shape=(128,)))
model.add(Dense(units=4))
model.add(LeakyReLU())
model.add(Dense(units=1, activation="sigmoid"))

model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.fit(training_data, training_labels, epochs=5, batch_size=128)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x187f1d30>

In [65]:
# The output of the final evaluation lists the following metrics
mets = model.metrics_names
vals = model.evaluate(testing_data, testing_labels, batch_size=128)

for name, value in zip(mets, vals):
    print(name, value)

loss 0.22400503666961893
acc 0.9202941176470588


In [62]:
# True positive rate: fraction of cry samples classified as cries
tp = np.sum(model.predict_classes(cries))
tp_rate = float(tp)/cries.shape[0]

# False positive rate: fraction of noise samples classified as cries
fp = np.sum(model.predict_classes(noise))
fp_rate = float(fp)/noise.shape[0]

print("tp rate: ", tp_rate, "\nfp rate: ", fp_rate)

tp rate:  0.730222222222 
fp rate:  0.03236
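The rates above are computed over the full `cries` and `noise` arrays, which include the rows used for training. A minimal sketch of computing the same two rates from any pair of predicted and true label arrays (for instance the held-out `testing_data` predictions and `testing_labels`) could look like this. The helper name `tp_fp_rates` and the toy labels are assumptions for illustration.

```python
import numpy as np

def tp_fp_rates(predicted, actual):
    """Return (true positive rate, false positive rate) for binary labels."""
    predicted = np.asarray(predicted).ravel()
    actual = np.asarray(actual).ravel()
    tp = np.sum((predicted == 1) & (actual == 1))  # cries found
    fp = np.sum((predicted == 1) & (actual == 0))  # noise mistaken for cries
    positives = np.sum(actual == 1)
    negatives = np.sum(actual == 0)
    return float(tp) / positives, float(fp) / negatives

# Toy labels: three cries followed by three noise samples.
tp_rate_demo, fp_rate_demo = tp_fp_rates([1, 1, 0, 1, 0, 0],
                                         [1, 1, 1, 0, 0, 0])
print("tp rate:", tp_rate_demo, "fp rate:", fp_rate_demo)
```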


In [67]:
# Save the successful model
model.save("recognizer.h5")

**On saving the model**:
Saving the model stores its architecture together with the trained weights, so that it can be loaded later without having to train it again.
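Loading the saved file back could be sketched as follows, using Keras's `load_model` function. The wrapper `load_recognizer` is a hypothetical helper, and the existence check is an added convenience rather than part of the notebook.

```python
import os

def load_recognizer(path="recognizer.h5"):
    """Return the saved Keras model, or None if the file does not exist."""
    if not os.path.exists(path):
        return None
    # Imported here so the check above works even without Keras installed.
    from keras.models import load_model
    return load_model(path)

restored = load_recognizer("recognizer.h5")
if restored is None:
    print("No saved model found; train and save it first.")
```

If the file is present, `restored` can be used in place of `model` for prediction.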

## Connecting to the NXT
Now that the model has been trained, we can move on to using this recognizer to classify the input from the NXT.

In [23]:
# These are in the keras folder
from receiver import NXTReceiver, unpack_u16

rc = NXTReceiver() # Connect to the NXT via MAC address
count = 15
while count != 0:
    lines = []
    for i in range(0, 5): # Receive 5 buffers
        line = rc.recv(256)
        # Decode each pair of bytes into one 16-bit value (256 bytes -> 128 values)
        lines.append([unpack_u16(line[j:j+2]) for j in range(0, len(line), 2)])
    # Number of buffers (out of 5) classified as a cry
    val = np.sum(model.predict_classes(np.vstack(lines)))
    if val > 3.5:
        rc.sock.send(b'\x01')
        print("Value:", val)
        count -= 1
    else:
        rc.sock.send(b'\x00')

Connecting via Bluetooth...
Connected.
Value: 1.0
Value: 1.0
Value: 0.8
Value: 0.8
Value: 1.0
Value: 1.0
Value: 0.8
Value: 0.8
Value: 0.8
Value: 1.0
Value: 0.8
Value: 0.8
Value: 0.8
Value: 0.8
Value: 1.0
