### Import the relevant libraries

In [1]:
import numpy as np
import tensorflow as tf

### Load the data

In [2]:
# The training, validation and test data are stored in three separate files, so we load each one separately
# and cast it to the required type
extract = np.load('audiobooks_training_data.npz')
train_inputs = extract['inputs'].astype(np.float64)
train_targets = extract['targets'].astype(np.int64)

# We do the same for our validation and test data
extract = np.load('audiobooks_validation_data.npz')
validation_inputs = extract['inputs'].astype(np.float64)
validation_targets = extract['targets'].astype(np.int64)

extract = np.load('audiobooks_test_data.npz')
test_inputs = extract['inputs'].astype(np.float64)
test_targets = extract['targets'].astype(np.int64)

### Build the model

In [3]:
# First, we define the hyperparameters

input_size = 10
output_size = 2
hidden_layer_size = 50 # Number of units in each hidden layer; a larger value gives the network more capacity

model = tf.keras.Sequential([
    tf.keras.layers.Dense(hidden_layer_size, activation='relu'),
    tf.keras.layers.Dense(hidden_layer_size, activation='relu'),
    tf.keras.layers.Dense(output_size, activation='softmax')
]) # Since this is a classifier, the output layer uses softmax to return a probability for each class
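
To make the role of the softmax output layer concrete, here is a minimal framework-free sketch of the function itself. The `softmax` helper and the raw scores are illustrative assumptions and are not part of the notebook's model.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    # Subtracting the maximum improves numerical stability without changing the result
    largest = max(logits)
    exps = [math.exp(z - largest) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for the two output classes
probs = softmax([1.0, 3.0])

# The predicted class is the index with the highest probability
predicted_class = probs.index(max(probs))
print(probs, predicted_class)
```

Because the outputs sum to 1, they can be read as class probabilities, which is what the cross-entropy loss expects.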

### Select the optimizer and loss function

In [4]:
# We use Adam (adaptive moment estimation) as our optimizer. The sparse categorical cross-entropy loss accepts
# integer class labels directly, so we do not need to one-hot encode the targets ourselves. In addition, we
# track accuracy as a metric during the training of our model

# custom_optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)
model.compile(optimizer='Adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
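
The following sketch illustrates why integer targets are enough for this loss: for a single sample, the sparse form and the one-hot form of cross-entropy give the same value. Both helper functions and the example probabilities are assumptions written for illustration; they are not the Keras implementations.

```python
import math

def sparse_categorical_crossentropy(label, probs):
    # `label` is an integer class index, so we simply pick out its probability
    return -math.log(probs[label])

def categorical_crossentropy(one_hot, probs):
    # `one_hot` has a 1.0 at the true class and 0.0 everywhere else
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs))

# Hypothetical softmax output for one sample, and its true class
probs = [0.2, 0.8]
label = 1
one_hot = [1.0 if i == label else 0.0 for i in range(len(probs))]

sparse_loss = sparse_categorical_crossentropy(label, probs)
dense_loss = categorical_crossentropy(one_hot, probs)
print(sparse_loss, dense_loss)
```

Only the probability assigned to the true class contributes to the loss, so the integer label carries all the information the one-hot vector would.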

### Training

In [5]:
BATCH_SIZE = 100
NUM_EPOCHS = 50

# Training the model with the above hyperparameters only, we notice that the validation loss often increases
# from one epoch to the next, which is a sign that we are overfitting. Therefore, we set up an early stopping mechanism

early_stopping = tf.keras.callbacks.EarlyStopping(patience=2) # By default the validation loss is monitored, and training
# stops once it has failed to improve for `patience` consecutive epochs (here, 2)

model.fit(train_inputs, train_targets, batch_size=BATCH_SIZE, epochs=NUM_EPOCHS, callbacks=[early_stopping],
          validation_data=(validation_inputs, validation_targets), validation_steps=1, verbose=2)

Train on 3579 samples, validate on 447 samples
Epoch 1/50
3579/3579 - 0s - loss: 197.6834 - accuracy: 0.5398 - val_loss: 6.6062 - val_accuracy: 0.6500
Epoch 2/50
3579/3579 - 0s - loss: 6.2642 - accuracy: 0.7267 - val_loss: 0.2598 - val_accuracy: 0.6900
Epoch 3/50
3579/3579 - 0s - loss: 1.2197 - accuracy: 0.7664 - val_loss: 0.2104 - val_accuracy: 0.8100
Epoch 4/50
3579/3579 - 0s - loss: 1.0835 - accuracy: 0.7647 - val_loss: 0.2612 - val_accuracy: 0.6900
Epoch 5/50
3579/3579 - 0s - loss: 1.0160 - accuracy: 0.7603 - val_loss: 0.1442 - val_accuracy: 0.7000
Epoch 6/50
3579/3579 - 0s - loss: 0.7411 - accuracy: 0.7631 - val_loss: 0.1043 - val_accuracy: 0.8100
Epoch 7/50
3579/3579 - 0s - loss: 0.7528 - accuracy: 0.7614 - val_loss: 0.0900 - val_accuracy: 0.8200
Epoch 8/50
3579/3579 - 0s - loss: 1.0669 - accuracy: 0.7530 - val_loss: 0.1728 - val_accuracy: 0.7600
Epoch 9/50
3579/3579 - 0s - loss: 0.7039 - accuracy: 0.7773 - val_loss: 0.0939 - val_accuracy: 0.8100


<tensorflow.python.keras.callbacks.History at 0x13c194470>
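
To see why training stopped after epoch 9, the patience rule can be replayed on the validation losses from the log above. This is a simplified sketch of the idea, assuming a plain "strictly lower than the best so far" comparison; the real Keras callback has further options such as `min_delta` and `restore_best_weights`. The function name `early_stopping_epoch` is hypothetical.

```python
def early_stopping_epoch(val_losses, patience):
    """Return the 1-based epoch at which training would stop, or None if it never triggers."""
    best = float("inf")
    wait = 0  # number of consecutive epochs without improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None

# Validation losses copied from the training log above
val_losses = [6.6062, 0.2598, 0.2104, 0.2612, 0.1442, 0.1043, 0.0900, 0.1728, 0.0939]

stop_epoch = early_stopping_epoch(val_losses, patience=2)
print(stop_epoch)
```

The best loss is reached at epoch 7, and epochs 8 and 9 both fail to beat it, so the rule triggers at epoch 9.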

The validation accuracy of roughly 81% shows that our model learned quite a lot, considering that the priors were 50% each. In practical terms, if we are given audiobook information for 10 customers, we can expect to correctly predict whether about 8 of them will convert.
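
The arithmetic behind this estimate is short enough to spell out. The sketch below uses the validation accuracy from the final epoch of the log and the 50% prior as a baseline; the variable names are illustrative only.

```python
val_accuracy = 0.81       # validation accuracy at the final epoch in the training log
baseline_accuracy = 0.50  # accuracy of guessing, given the 50% priors
n_customers = 10

# Expected number of correct predictions out of 10 customers
expected_correct = val_accuracy * n_customers
baseline_correct = baseline_accuracy * n_customers

print(round(expected_correct), round(baseline_correct))
```

These are expected values, so the actual count for any particular group of 10 customers can be higher or lower.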

### Testing the model

In [6]:
test_loss, test_accuracy = model.evaluate(test_inputs, test_targets)

print("Test loss: {0:.2f} Test accuracy: {1:.2f}%".format(test_loss, test_accuracy * 100.0))

Test loss: 0.43 Test accuracy: 79.91%
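
For reference, here is a minimal sketch of what the accuracy metric measures: the share of samples whose most probable class matches the target. The `accuracy` helper, the probabilities and the targets are made-up values for illustration and are not taken from the audiobooks data.

```python
def accuracy(probabilities, targets):
    """Fraction of samples whose highest-probability class equals the target."""
    predictions = [row.index(max(row)) for row in probabilities]
    correct = sum(int(p == t) for p, t in zip(predictions, targets))
    return correct / len(targets)

# Hypothetical softmax outputs for five samples and their true classes
probabilities = [[0.9, 0.1], [0.3, 0.7], [0.6, 0.4], [0.2, 0.8], [0.55, 0.45]]
targets = [0, 1, 1, 1, 0]

example_accuracy = accuracy(probabilities, targets)
print("Accuracy: {0:.2f}%".format(example_accuracy * 100.0))
```

Here four of the five predictions match their targets; the third sample is misclassified because class 0 receives the higher probability while the target is 1.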
