### Create the machine learning algorithm

###### Import the relevant packages

In [1]:
import numpy as np
import tensorflow as tf
import os

###### Data

That's where we load and preprocess our data.

In [2]:
pre_processe_data_path = os.path.join(os.path.pardir, 'data', 'processed')

# let's create a temporary variable npz, where we will store each of the three Audiobooks datasets

npz_train = np.load(os.path.join(pre_processe_data_path, 'Audiobooks_data_train.npz'))

# we extract the inputs using the keyword under which we saved them
# to ensure that they are all floats, let's also take care of that

train_inputs = npz_train['inputs'].astype(np.float)
# targets must be int because of sparse_categorical_crossentropy (we want to be able to smoothly one-hot encode them)
train_targets = npz_train['targets'].astype(np.int)

# we load the validation data in the temporary variable
npz_validation = np.load(os.path.join(pre_processe_data_path, 'Audiobooks_data_validation.npz'))

# we can load the inputs and the targets in the same line
validation_inputs = npz_validation['inputs'].astype(np.float)
validation_targets = npz_validation['targets'].astype(np.int)

# we load the test data in the temporary variable
npz_test = np.load(os.path.join(pre_processe_data_path, 'Audiobooks_data_test.npz'))

# we create 2 variables that will contain the test inputs and the test targets
test_inputs = npz_test['inputs'].astype(np.float)
test_targets = npz_test['targets'].astype(np.int)


### Model
Outline, optimizers, loss, early stopping and training

In [3]:
# Set the input and output sizes
input_size = 10
output_size = 2
# Use same hidden layer size for both hidden layers. Not a necessity.
hidden_layer_size = 50
    
# define how the model will look like
model = tf.keras.Sequential([
    # tf.keras.layers.Dense is basically implementing: output = activation(dot(input, weight) + bias)
    # it takes several arguments, but the most important ones for us are the hidden_layer_size and the activation function
    tf.keras.layers.Dense(hidden_layer_size, activation='relu'), # 1st hidden layer
    tf.keras.layers.Dense(hidden_layer_size, activation='relu'), # 2nd hidden layer
    # the final layer is no different, we just make sure to activate it with softmax
    tf.keras.layers.Dense(output_size, activation='softmax') # output layer
])


### Choose the optimizer and the loss function

# we define the optimizer we'd like to use, 
# the loss function, 
# and the metrics we are interested in obtaining at each iteration
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

### Training
# That's where we train the model we have built.

# set the batch size
batch_size = 100

# set a maximum number of training epochs
max_epochs = 100

# set an early stopping mechanism
# let's set patience=2, to be a bit tolerant against random validation loss increases
early_stopping = tf.keras.callbacks.EarlyStopping(patience=2)

# fit the model
# note that this time the train, validation and test data are not iterable
model.fit(train_inputs, # train inputs
          train_targets, # train targets
          batch_size=batch_size, # batch size
          epochs=max_epochs, # epochs that we will train for (assuming early stopping doesn't kick in)
          # callbacks are functions called by a task when a task is completed
          # task here is to check if val_loss is increasing
          callbacks=[early_stopping], # early stopping
          validation_data=(validation_inputs, validation_targets), # validation data
          verbose = 2 # making sure we get enough information about the training process
          )  

Train on 3579 samples, validate on 447 samples
Epoch 1/100
3579/3579 - 0s - loss: 0.5689 - accuracy: 0.7882 - val_loss: 0.4370 - val_accuracy: 0.8591
Epoch 2/100
3579/3579 - 0s - loss: 0.3742 - accuracy: 0.8748 - val_loss: 0.3413 - val_accuracy: 0.8770
Epoch 3/100
3579/3579 - 0s - loss: 0.3220 - accuracy: 0.8829 - val_loss: 0.3166 - val_accuracy: 0.8792
Epoch 4/100
3579/3579 - 0s - loss: 0.2974 - accuracy: 0.8908 - val_loss: 0.3005 - val_accuracy: 0.8792
Epoch 5/100
3579/3579 - 0s - loss: 0.2820 - accuracy: 0.8969 - val_loss: 0.2900 - val_accuracy: 0.8837
Epoch 6/100
3579/3579 - 0s - loss: 0.2712 - accuracy: 0.8989 - val_loss: 0.2828 - val_accuracy: 0.8859
Epoch 7/100
3579/3579 - 0s - loss: 0.2664 - accuracy: 0.9000 - val_loss: 0.2795 - val_accuracy: 0.8881
Epoch 8/100
3579/3579 - 0s - loss: 0.2592 - accuracy: 0.9014 - val_loss: 0.2757 - val_accuracy: 0.8926
Epoch 9/100
3579/3579 - 0s - loss: 0.2549 - accuracy: 0.9033 - val_loss: 0.2723 - val_accuracy: 0.8949
Epoch 10/100
3579/3579 - 0

<tensorflow.python.keras.callbacks.History at 0x12e074ac8>

## Test the model

As we discussed in the lectures, after training on the training data and validating on the validation data, we test the final prediction power of our model by running it on the test dataset that the algorithm has NEVER seen before.

It is very important to realize that fiddling with the hyperparameters overfits the validation dataset. 

The test is the absolute final instance. You should not test before you are completely done with adjusting your model.

If you adjust your model after testing, you will start overfitting the test dataset, which will defeat its purpose.

In [4]:
test_loss, test_accuracy = model.evaluate(test_inputs, test_targets)



In [5]:
print('\nTest loss: {0:.2f}. Test accuracy: {1:.2f}%'.format(test_loss, test_accuracy*100.))


Test loss: 0.26. Test accuracy: 91.07%


Using the initial model and hyperparameters given in this notebook, the final test accuracy should be roughly around 91%.

Note that each time the code is rerun, we get a different accuracy because each training is different. 

We have intentionally reached a suboptimal solution, so you can have space to build on it!

In [16]:
predictions = model(test_inputs)
class_names = ['Buy', 'Not-Buy']
for i, logits in enumerate(predictions):
  class_idx = tf.argmax(logits).numpy()
  p = tf.nn.softmax(logits)[class_idx]
  name = class_names[class_idx]
  print("Example {} prediction: {} ({:4.1f}%)".format(i, name, 100*p))

Example 0 prediction: Not-Buy (69.2%)
Example 1 prediction: Buy (70.9%)
Example 2 prediction: Not-Buy (72.8%)
Example 3 prediction: Not-Buy (55.7%)
Example 4 prediction: Buy (70.5%)
Example 5 prediction: Not-Buy (68.2%)
Example 6 prediction: Not-Buy (73.1%)
Example 7 prediction: Not-Buy (65.7%)
Example 8 prediction: Not-Buy (56.4%)
Example 9 prediction: Not-Buy (61.2%)
Example 10 prediction: Not-Buy (65.8%)
Example 11 prediction: Buy (70.5%)
Example 12 prediction: Not-Buy (65.8%)
Example 13 prediction: Not-Buy (73.1%)
Example 14 prediction: Buy (69.3%)
Example 15 prediction: Not-Buy (67.6%)
Example 16 prediction: Not-Buy (65.3%)
Example 17 prediction: Buy (72.1%)
Example 18 prediction: Buy (73.1%)
Example 19 prediction: Buy (70.9%)
Example 20 prediction: Buy (71.3%)
Example 21 prediction: Not-Buy (62.5%)
Example 22 prediction: Not-Buy (73.1%)
Example 23 prediction: Not-Buy (73.0%)
Example 24 prediction: Buy (68.4%)
Example 25 prediction: Buy (71.7%)
Example 26 prediction: Not-Buy (61.9

Example 431 prediction: Buy (58.2%)
Example 432 prediction: Not-Buy (68.5%)
Example 433 prediction: Not-Buy (72.6%)
Example 434 prediction: Buy (69.6%)
Example 435 prediction: Not-Buy (73.1%)
Example 436 prediction: Not-Buy (65.2%)
Example 437 prediction: Buy (55.4%)
Example 438 prediction: Buy (71.9%)
Example 439 prediction: Buy (71.8%)
Example 440 prediction: Not-Buy (67.4%)
Example 441 prediction: Buy (69.9%)
Example 442 prediction: Not-Buy (66.8%)
Example 443 prediction: Not-Buy (73.0%)
Example 444 prediction: Not-Buy (66.9%)
Example 445 prediction: Not-Buy (68.9%)
Example 446 prediction: Buy (67.1%)
Example 447 prediction: Buy (70.9%)


In [11]:
predictions = model(test_inputs)
predictions

<tf.Tensor: id=6263, shape=(448, 2), dtype=float32, numpy=
array([[9.59809422e-02, 9.04019058e-01],
       [9.44087267e-01, 5.59127033e-02],
       [8.84664897e-03, 9.91153419e-01],
       [3.84557813e-01, 6.15442157e-01],
       [9.35128987e-01, 6.48710206e-02],
       [1.18563950e-01, 8.81435990e-01],
       [6.76299678e-04, 9.99323726e-01],
       [1.75945759e-01, 8.24054182e-01],
       [3.71449560e-01, 6.28550470e-01],
       [2.72902131e-01, 7.27097869e-01],
       [1.72046602e-01, 8.27953398e-01],
       [9.34883416e-01, 6.51165396e-02],
       [1.72966227e-01, 8.27033818e-01],
       [7.18894531e-04, 9.99281108e-01],
       [9.06849086e-01, 9.31508914e-02],
       [1.32682279e-01, 8.67317677e-01],
       [1.84791610e-01, 8.15208435e-01],
       [9.74225521e-01, 2.57744379e-02],
       [9.99478042e-01, 5.22000424e-04],
       [9.46068227e-01, 5.39317951e-02],
       [9.56177354e-01, 4.38226461e-02],
       [2.44730011e-01, 7.55269945e-01],
       [4.40501230e-04, 9.99559462e-01]