# Business Case for AudioBooks without Early Stopping

The input layer has 10 units, one for each input feature in the csv. There are only two output nodes because the target has only two possible values, 0 and 1. The net has two hidden layers, and each hidden layer has 50 units.

For a prototype, 50 is a good value. Fifty units in each hidden layer provide enough complexity that we can expect the model to be considerably more expressive than a linear or logistic regression. At the same time, we don't want too many units initially, because we want training to finish as fast as possible so we can see whether anything is being learned at all.
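
To make the comparison with logistic regression concrete, the sketch below counts trainable parameters (weights plus biases) for fully connected layers. The helper `dense_param_count` is a hypothetical name introduced only for this illustration; it assumes every layer is a plain dense layer with a bias term, as in the model built later in this notebook.

```python
def dense_param_count(layer_sizes):
    """Count weights and biases in a stack of fully connected layers.

    layer_sizes lists the number of units per layer, inputs first.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        total += n_in * n_out + n_out  # weight matrix plus one bias per unit
    return total


# 10 inputs -> 50 hidden -> 50 hidden -> 2 outputs
network_params = dense_param_count([10, 50, 50, 2])

# A softmax regression on the same inputs has no hidden layers
regression_params = dense_param_count([10, 2])

print(network_params, regression_params)  # 3202 22
```

With about 3,200 parameters the prototype is still small enough to train in seconds, which fits the goal of getting a quick first signal.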

### Create the machine learning algorithm

#### 1. Import the relevant libraries

In [1]:
import numpy as np
import tensorflow as tf

#### 2. Load the data

In [2]:
# npz is a temporary variable that stores each of the three datasets in turn as we load them

npz = np.load('Audiobooks_data_train.npz')

# extract the inputs
# to make sure the model learns correctly, we expect all inputs to be floats
# the astype method creates a copy of the array, cast to the specified type
# np.float64 is used because the np.float alias was removed in NumPy 1.24

train_inputs = npz['inputs'].astype(np.float64)

# extract the targets
# the targets are 0 and 1, but we are not completely certain whether they'll be extracted as integers, floats or booleans
# we can use the same astype method to make sure their data type is an integer
# np.int64 is used because the np.int alias was removed in NumPy 1.24

train_targets = npz['targets'].astype(np.int64)


# load the validation data the same way: float inputs, integer targets
npz = np.load('Audiobooks_data_validation.npz')
validation_inputs, validation_targets = npz['inputs'].astype(np.float64), npz['targets'].astype(np.int64)


# load the test data the same way
npz = np.load('Audiobooks_data_test.npz')
test_inputs, test_targets = npz['inputs'].astype(np.float64), npz['targets'].astype(np.int64)
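
The casting step can be tried without the real files. The sketch below is only an illustration: it writes a tiny made-up archive to memory (standing in for `Audiobooks_data_train.npz`), stores the targets as booleans on purpose, and then applies the same `astype` calls. The array values and the names `buffer`, `demo_inputs` and `demo_targets` are assumptions, not part of the original data.

```python
import io

import numpy as np

# Made-up stand-in for Audiobooks_data_train.npz (values are illustrative only)
buffer = io.BytesIO()
np.savez(buffer,
         inputs=np.array([[1, 2], [3, 4], [5, 6]], dtype=np.int32),
         targets=np.array([False, True, True]))
buffer.seek(0)  # rewind so np.load reads from the start

npz = np.load(buffer)

# Whatever type was stored, astype returns a copy with the requested type
demo_inputs = npz['inputs'].astype(np.float64)
demo_targets = npz['targets'].astype(np.int64)

print(demo_inputs.dtype, demo_targets.dtype)
print(demo_targets)
```

Casting right after loading means the rest of the notebook does not depend on how the arrays happened to be saved.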

#### 3. Model
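
The model in this section is compiled with `sparse_categorical_crossentropy`, which takes integer class labels rather than one-hot vectors. As a rough illustration of why that is equivalent, the NumPy sketch below computes the per-sample cross-entropy both ways on made-up probabilities. The two helper functions are simplified stand-ins written for this example, not the Keras implementations.

```python
import numpy as np


def sparse_cross_entropy(probs, int_targets):
    # Pick the predicted probability of the true class for each sample
    picked = probs[np.arange(len(int_targets)), int_targets]
    return -np.log(picked)


def one_hot_cross_entropy(probs, one_hot_targets):
    # Standard cross-entropy against explicit one-hot target vectors
    return -np.sum(one_hot_targets * np.log(probs), axis=1)


# Made-up softmax outputs for three samples and two classes
probs = np.array([[0.9, 0.1],
                  [0.2, 0.8],
                  [0.6, 0.4]])
int_targets = np.array([0, 1, 1])
one_hot = np.eye(2)[int_targets]  # explicit one-hot version of the same targets

sparse_loss = sparse_cross_entropy(probs, int_targets)
dense_loss = one_hot_cross_entropy(probs, one_hot)

print(np.allclose(sparse_loss, dense_loss))  # True
```

Because the results match, the integer targets loaded earlier can be passed to the model as they are.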

In [3]:
input_size = 10
output_size = 2
hidden_layer_size = 50

model = tf.keras.Sequential([
                            tf.keras.layers.Dense(hidden_layer_size, activation = 'relu'),
                            tf.keras.layers.Dense(hidden_layer_size, activation = 'relu'),
                            tf.keras.layers.Dense(output_size, activation = 'softmax')
                            ])

# sparse_categorical_crossentropy accepts integer class labels directly,
# so the targets do not have to be one-hot encoded by hand

model.compile(optimizer = 'adam', loss = 'sparse_categorical_crossentropy', metrics = ['accuracy'])


# two hyperparameters are still unset: the batch size and the maximum number of epochs
# in this example we do not use iterable objects that contain the data
# instead we use plain arrays, and the batching is specified when we fit the model
# when working with arrays, passing batch_size to fit() batches the data automatically during training



batch_size = 100
max_epochs = 100

model.fit(train_inputs, 
          train_targets,
          batch_size = batch_size,
          epochs = max_epochs,
          validation_data = (validation_inputs, validation_targets),
          verbose = 2
         )


# during training, the training loss decreases fairly consistently
# but the validation loss sometimes increases
# which suggests that the model is overfitting
# in this situation, an early stopping mechanism really does make a difference


Train on 3579 samples, validate on 447 samples
Epoch 1/100
3579/3579 - 1s - loss: 0.6520 - accuracy: 0.6387 - val_loss: 0.5256 - val_accuracy: 0.7494
Epoch 2/100
3579/3579 - 0s - loss: 0.4833 - accuracy: 0.7650 - val_loss: 0.4450 - val_accuracy: 0.7808
Epoch 3/100
3579/3579 - 0s - loss: 0.4247 - accuracy: 0.7818 - val_loss: 0.4001 - val_accuracy: 0.7987
Epoch 4/100
3579/3579 - 0s - loss: 0.3940 - accuracy: 0.7935 - val_loss: 0.3720 - val_accuracy: 0.8188
Epoch 5/100
3579/3579 - 0s - loss: 0.3768 - accuracy: 0.8005 - val_loss: 0.3615 - val_accuracy: 0.8166
Epoch 6/100
3579/3579 - 0s - loss: 0.3656 - accuracy: 0.8005 - val_loss: 0.3471 - val_accuracy: 0.8166
Epoch 7/100
3579/3579 - 0s - loss: 0.3570 - accuracy: 0.8092 - val_loss: 0.3405 - val_accuracy: 0.8345
Epoch 8/100
3579/3579 - 0s - loss: 0.3510 - accuracy: 0.8181 - val_loss: 0.3534 - val_accuracy: 0.8031
Epoch 9/100
3579/3579 - 0s - loss: 0.3499 - accuracy: 0.8122 - val_loss: 0.3330 - val_accuracy: 0.8210
Epoch 10/100
3579/3579 - 0

Epoch 80/100
3579/3579 - 0s - loss: 0.3094 - accuracy: 0.8284 - val_loss: 0.3214 - val_accuracy: 0.8300
Epoch 81/100
3579/3579 - 0s - loss: 0.3047 - accuracy: 0.8346 - val_loss: 0.3193 - val_accuracy: 0.8322
Epoch 82/100
3579/3579 - 0s - loss: 0.3088 - accuracy: 0.8256 - val_loss: 0.3175 - val_accuracy: 0.8233
Epoch 83/100
3579/3579 - 0s - loss: 0.3054 - accuracy: 0.8329 - val_loss: 0.3164 - val_accuracy: 0.8277
Epoch 84/100
3579/3579 - 0s - loss: 0.3071 - accuracy: 0.8293 - val_loss: 0.3212 - val_accuracy: 0.8233
Epoch 85/100
3579/3579 - 0s - loss: 0.3046 - accuracy: 0.8346 - val_loss: 0.3134 - val_accuracy: 0.8389
Epoch 86/100
3579/3579 - 0s - loss: 0.3036 - accuracy: 0.8338 - val_loss: 0.3157 - val_accuracy: 0.8277
Epoch 87/100
3579/3579 - 0s - loss: 0.3078 - accuracy: 0.8265 - val_loss: 0.3191 - val_accuracy: 0.8300
Epoch 88/100
3579/3579 - 0s - loss: 0.3095 - accuracy: 0.8268 - val_loss: 0.3377 - val_accuracy: 0.8143
Epoch 89/100
3579/3579 - 0s - loss: 0.3112 - accuracy: 0.8279 - 

<tensorflow.python.keras.callbacks.History at 0x6411dfa90>
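
To see what an early stopping rule would have done with this run, the sketch below applies a simple patience check to the validation losses of the first nine epochs in the log above. The helper `first_stop_epoch` is hypothetical and deliberately simplified: it stops once the validation loss has failed to improve for `patience` consecutive epochs. Keras ships its own callback for this, `tf.keras.callbacks.EarlyStopping`, whose exact bookkeeping may differ from this sketch.

```python
def first_stop_epoch(val_losses, patience=1):
    """Return the 1-based epoch at which training would stop, or None.

    Stops once the validation loss has not improved on its best value
    for `patience` consecutive epochs (patience is assumed to be >= 1).
    """
    best = float("inf")
    waited = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            waited = 0  # improvement: reset the counter
        else:
            waited += 1
            if waited >= patience:
                return epoch
    return None


# val_loss for epochs 1 to 9, copied from the training log
val_losses = [0.5256, 0.4450, 0.4001, 0.3720, 0.3615,
              0.3471, 0.3405, 0.3534, 0.3330]

print(first_stop_epoch(val_losses, patience=1))  # 8
print(first_stop_epoch(val_losses, patience=2))  # None
```

With a patience of 1 the rule halts at epoch 8, the first time the validation loss rises. With a patience of 2 it keeps going, because epoch 9 improves again. That trade-off is the main thing to tune when adding early stopping.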