## Import the libraries

In [1]:
import numpy as np
import tensorflow as tf

## Load the data

In [2]:
npz = np.load('Audiobooks_data_train.npz') # Loading training data
train_inputs = npz['inputs'].astype(float) # Storing training inputs as floating points
train_targets = npz['targets'].astype(int) # Storing training targets as integers

npz = np.load('Audiobooks_data_validation.npz') # Loading validation data
validation_inputs = npz['inputs'].astype(float) # Storing validation inputs as floating points
validation_targets = npz['targets'].astype(int) # Storing validation targets as integers

npz = np.load('Audiobooks_data_test.npz') # Loading test data
test_inputs = npz['inputs'].astype(float) # Storing test inputs as floating points
test_targets = npz['targets'].astype(int) # Storing test targets as integers


## Building the model

In [22]:
input_size = 10
output_size = 2
hidden_layer_size = 50

# No Flatten layer is needed: the data is already preprocessed

model = tf.keras.Sequential([
    tf.keras.layers.Dense(hidden_layer_size, activation='relu'),
    tf.keras.layers.Dense(hidden_layer_size, activation='relu'),
    tf.keras.layers.Dense(output_size, activation='softmax')
])
# Re-used from the MNIST code. Softmax is best for classifier outputs read as probabilities; otherwise use sigmoid

model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

batch_size = 100

max_epochs = 100

# Early stopping limits overfitting; patience = 2 tolerates two epochs without val_loss improvement
early_stopping = tf.keras.callbacks.EarlyStopping(patience=2)

model.fit(train_inputs, 
          train_targets, 
          batch_size = batch_size, 
          epochs = max_epochs,
          callbacks = [early_stopping],
          validation_data=(validation_inputs, validation_targets),
          verbose = 2)

Epoch 1/100
3579/3579 - 2s - loss: 0.4115 - accuracy: 0.7670 - val_loss: 0.3687 - val_accuracy: 0.7875
Epoch 2/100
3579/3579 - 2s - loss: 0.3686 - accuracy: 0.7972 - val_loss: 0.3901 - val_accuracy: 0.7852
Epoch 3/100
3579/3579 - 2s - loss: 0.3542 - accuracy: 0.7997 - val_loss: 0.3869 - val_accuracy: 0.7450


<tensorflow.python.keras.callbacks.History at 0x2c517e094c0>

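Overfitting is present: the validation loss rises after the first epoch (0.3687 to 0.3901) and the validation accuracy falls (0.7875 to 0.7450), while the training loss keeps decreasing. Early stopping is necessary, but with some tolerance (patience = 2) so that a single noisy epoch does not end training.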
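
To make the stopping rule concrete, here is a minimal sketch of patience-based early stopping applied to the three `val_loss` values from the log above. It is a simplified re-implementation for illustration, not the Keras source; the function name `stopping_epoch` and its `min_delta` argument are assumptions introduced here.

```python
def stopping_epoch(val_losses, patience=2, min_delta=0.0):
    """Return the 1-based epoch after which training would stop, or None.

    Simplified patience rule (an illustration, not the Keras code):
    stop once the monitored loss has failed to improve on its best
    value for `patience` consecutive epochs.
    """
    best = float("inf")
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None  # the rule never triggered


# Validation losses copied from the training log above
val_losses = [0.3687, 0.3901, 0.3869]
stopped_at = stopping_epoch(val_losses, patience=2)
print(stopped_at)
```

Under this rule the best validation loss is the one from epoch 1, and the two following epochs both fail to beat it, which matches the run ending after epoch 3. Note that the model then keeps the weights from the last epoch, not the best one; `EarlyStopping` has a `restore_best_weights` argument that may be worth trying for that reason.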

## Testing the model

In [23]:
test_loss, test_accuracy = model.evaluate(test_inputs, test_targets)
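
The two numbers returned by `model.evaluate` correspond to the loss and the metric chosen in `model.compile`. As a rough sketch of what they mean, the snippet below computes sparse categorical cross-entropy and accuracy with NumPy on a tiny made-up batch. The helper `evaluate_predictions` and the example probabilities are hypothetical, and the sketch skips details such as the probability clipping Keras applies.

```python
import numpy as np


def evaluate_predictions(probabilities, targets):
    """Return (loss, accuracy) for softmax outputs and integer class targets.

    Hypothetical helper for illustration only.
    probabilities: array of shape (n_samples, n_classes), rows sum to 1
    targets:       array of shape (n_samples,), integer class labels
    """
    probabilities = np.asarray(probabilities, dtype=float)
    targets = np.asarray(targets, dtype=int)

    # Probability the model assigned to the correct class of each sample
    picked = probabilities[np.arange(len(targets)), targets]

    # Sparse categorical cross-entropy: mean negative log-likelihood
    loss = -np.mean(np.log(picked))

    # Accuracy: fraction of samples whose most probable class is the target
    accuracy = np.mean(np.argmax(probabilities, axis=1) == targets)
    return loss, accuracy


# Made-up softmax outputs for three samples and two classes
example_probabilities = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]
example_targets = [0, 1, 1]
example_loss, example_accuracy = evaluate_predictions(example_probabilities, example_targets)
print(example_loss, example_accuracy)
```

This also shows why the targets are stored as integers: the sparse loss uses each label directly as an index into the row of predicted probabilities.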



Note: The instructor's accuracy on the training, validation and test data was around 90%, but I got around 80%. Why?

Post-processing:
1. Doubling the hidden layer size has an insignificant effect
2. Using the tanh function on one hidden layer also has an insignificant effect
3. Using a sigmoid on one hidden layer decreases the accuracy by 5%
4. Doubling the number of hidden layers is also inconsequential
5. Using a smaller batch size raises the accuracy by 3%
6. Using SGD instead of Adam decreases the accuracy by 4%