## 10. Introduction to Artificial Neural Networks with Keras

### Building Complex Models using the Functional API

An example of a non-sequential neural network is the <b>Wide & Deep</b> neural network. This architecture connects all or part of the inputs directly to the output layer, so the network can learn both deep patterns (through the deep path) and simple rules (through the short path).

In contrast, a regular network forces all the data to flow through all layers, and simple patterns might end up being distorted.

<img src="img3.png" width="900"/>
(Ref. 1)
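
To make the two paths concrete, here is a minimal NumPy sketch of a single forward pass through a wide & deep network. It is an illustration only: the weights are random and untrained, the helper `dense` and the sizes (4 samples, 13 features, 30 hidden units) are assumptions chosen to mirror the housing example used later, and nothing here is part of Keras.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(x, w, b, activation=None):
    """Fully connected layer: x @ w + b, optionally followed by ReLU."""
    z = x @ w + b
    return np.maximum(z, 0.0) if activation == "relu" else z

# Assumed sizes, for illustration only.
n_samples, n_features, n_units = 4, 13, 30
x = rng.normal(size=(n_samples, n_features))

# Randomly initialised weights (nothing is trained in this sketch).
w1, b1 = rng.normal(size=(n_features, n_units)), np.zeros(n_units)
w2, b2 = rng.normal(size=(n_units, n_units)), np.zeros(n_units)
w_out, b_out = rng.normal(size=(n_features + n_units, 1)), np.zeros(1)

# Deep path: the inputs flow through two hidden layers.
hidden1 = dense(x, w1, b1, activation="relu")
hidden2 = dense(hidden1, w2, b2, activation="relu")

# Wide path: the raw inputs skip the hidden layers and are concatenated
# with the deep features just before the output layer.
concat = np.concatenate([x, hidden2], axis=1)
y_hat = dense(concat, w_out, b_out)

print(concat.shape)  # (4, 43): 13 raw features + 30 deep features
print(y_hat.shape)   # (4, 1): one prediction per sample
```

The output layer therefore sees the untouched inputs next to the learned deep features, which is what lets simple rules pass through undistorted.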

In [1]:
from keras.datasets import boston_housing

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import KFold, train_test_split

from tensorflow import keras as tf_keras

Using TensorFlow backend.


In [2]:
# Ingestion
###########
(train_data, y_train), (test_data, y_test) = boston_housing.load_data()

# Preprocessing
###############
sc = StandardScaler()
x_train = sc.fit_transform(train_data)
x_test = sc.transform(test_data)

x_train__train, x_train__val, y_train__train, y_train__val = train_test_split(x_train, y_train, test_size=0.15,
                                                                             random_state=0)
NUM_FEATURES = x_train.shape[1:]

<b>Single input, Single output</b>

Let's build a wide & deep network to tackle the **housing prices** problem. Take note of the comments describing each layer.

In [3]:
# Instantiate Model
###################

# Input object. This is needed as we might have multiple inputs.
input_layer = tf_keras.layers.Input(shape=NUM_FEATURES)

# Dense layer with 30 neurons & ReLU activation. Notice it is called like a function,
# passing in the input layer.
hidden_layer1 = tf_keras.layers.Dense(30, activation='relu')(input_layer)
# Another Dense layer. Now, the first hidden layer is passed in.
hidden_layer2 = tf_keras.layers.Dense(30, activation='relu')(hidden_layer1)

# Concatenate layer. Joins the raw input with the output of the 2nd hidden layer.
concat_layer = tf_keras.layers.Concatenate()([input_layer, hidden_layer2])

# Output layer. Single neuron and no activation function.
output_layer = tf_keras.layers.Dense(1)(concat_layer)

# Finally, create the Keras model with this architecture.
model0 = tf_keras.models.Model(inputs=[input_layer], outputs=output_layer)

The above model is visually represented by the following network diagram:
<img src="img3a.png" width="150"/>
(Ref. 2)

Once you have built the Keras model, the remaining steps follow the usual workflow: compile the model, train & tune it, and finalise the tuned model.

In [4]:
# Train & Tune Model
####################
model0.compile(optimizer='sgd', loss='mean_squared_error', metrics=['mae'])
history0 = model0.fit(x_train, y_train, epochs=10, verbose=0)

In [5]:
# Save model
# model0.save('model0.h5')

<b>Multiple inputs, Single output</b>

For this model, say we want to feed one subset of the features into one input and another subset into a second input. To do this, we need to make changes to <u>both the architecture</u> and the <u>input data</u>.

In [6]:
# Instantiate Model
###################

# Here, we need to specify the no. of features for each input layer
input_layera = tf_keras.layers.Input(shape=(10,))
input_layerb = tf_keras.layers.Input(shape=(7,))

# The Dense, Concatenate & Output layers are the same as in the previous model
hidden_layer1 = tf_keras.layers.Dense(30, activation='relu')(input_layerb)
hidden_layer2 = tf_keras.layers.Dense(30, activation='relu')(hidden_layer1)
concat_layer = tf_keras.layers.Concatenate()([input_layera, hidden_layer2])
output_layer = tf_keras.layers.Dense(1)(concat_layer)
model1 = tf_keras.models.Model(inputs=[input_layera, input_layerb], outputs=output_layer)

Here, we need to specify the features that will be fed into the different input layers.

In [7]:
# Prepare data for training model
#################################
inputa_cols = list(range(0,10))
inputb_cols = [1,5,6,7,8,11,12]
x_train__trainA = x_train__train[:,inputa_cols]
x_train__trainB = x_train__train[:,inputb_cols]
x_train__val_A = x_train__val[:,inputa_cols]
x_train__val_B = x_train__val[:,inputb_cols]

In [8]:
# Save model
# model1.save('model1.h5')

Here is what the model looks like:

<img src="img3b.png" width="150"/>

When training the model, we now need to pass <u>a pair of matrices</u>, one per input. The same is true when performing validation & testing.

In [9]:
# Train & Tune Model
####################
model1.compile(optimizer='sgd', loss='mean_squared_error', metrics=['mae'])
model1.fit((x_train__trainA, x_train__trainB), y_train__train, epochs=20,
           validation_data=((x_train__val_A, x_train__val_B), y_train__val), verbose=0)

<tensorflow.python.keras.callbacks.History at 0x1339b5908>

In [10]:
# Prepare test data
###################
x_testA = x_test[:,inputa_cols]
x_testB = x_test[:,inputb_cols]

# Evaluation
model1.evaluate((x_testA, x_testB), y_test)

# Prediction
model1.predict((x_testA[:2], x_testB[:2]))



array([[ 7.2005463],
       [19.398804 ]], dtype=float32)

<b>Multiple inputs, Multiple outputs</b>

For multiple outputs, the following code snippets show the changes needed.

```python
input_layera = tf_keras.layers.Input(shape=(10,))
input_layerb = tf_keras.layers.Input(shape=(7,))

hidden_layer1 = tf_keras.layers.Dense(30, activation='relu')(input_layerb)
hidden_layer2 = tf_keras.layers.Dense(30, activation='relu')(hidden_layer1)
concat_layer = tf_keras.layers.Concatenate()([input_layera, hidden_layer2])
output_layer1 = tf_keras.layers.Dense(1)(concat_layer)
output_layer2 = tf_keras.layers.Dense(1)(hidden_layer2) # Add this
model3 = tf_keras.models.Model(inputs=[input_layera, input_layerb], 
                               outputs=[output_layer1, output_layer2]) # Change this
```

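When compiling the model, you can use different metrics for different outputs by passing one list of metrics per output:

```python
# One list per output. A flat list such as ['mae', 'mse'] would instead
# apply both metrics to every output.
model3.compile(optimizer='sgd', loss='mean_squared_error', metrics=[['mae'], ['mse']])
```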
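
With several outputs, Keras computes one loss per output and minimises their weighted sum (the weights come from the `loss_weights` argument of `compile` and default to 1 for each output). The NumPy sketch below only reproduces that arithmetic; the predictions, the variable names and the 0.9/0.1 weights are made-up values for illustration.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error between two 1-D arrays."""
    return float(np.mean((y_true - y_pred) ** 2))

# Made-up targets and predictions for a model with two outputs.
y_true = np.array([10.0, 20.0, 30.0])
pred_main = np.array([12.0, 18.0, 33.0])  # hypothetical main output
pred_aux = np.array([10.0, 25.0, 30.0])   # hypothetical auxiliary output

loss_main = mse(y_true, pred_main)  # (4 + 4 + 9) / 3
loss_aux = mse(y_true, pred_aux)    # (0 + 25 + 0) / 3

# Default behaviour: every output has weight 1, so the total is a plain sum.
total_default = loss_main + loss_aux

# Hypothetical weights that favour the main output.
loss_weights = (0.9, 0.1)
total_weighted = loss_weights[0] * loss_main + loss_weights[1] * loss_aux

print(loss_main, loss_aux, total_default, total_weighted)
```

Giving the main output a larger weight is a common choice when the second output is only used as an auxiliary signal.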

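When evaluating the model, Keras returns the total loss as well as the individual loss of each output. Here both outputs predict the same target, so `y_test` is passed once per output:

```python
model3.evaluate((x_testA, x_testB), (y_test, y_test))
```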

### Building Dynamic Models Using the Subclassing API

To add flexibility, we can use the Subclassing API: subclass `Model`, create the layers in the constructor, and use them in the `call()` method.

This separates the creation of the layers from their usage.

In [11]:
class WideAndDeepModel(tf_keras.models.Model):
    def __init__(self, units=30, activation='relu', **kwargs):
        super().__init__(**kwargs)
        self.hidden_layer1 = tf_keras.layers.Dense(units, activation=activation)
        self.hidden_layer2 = tf_keras.layers.Dense(units, activation=activation)
        self.output_layer = tf_keras.layers.Dense(1)
    
    def call(self, inputs):
        # Unpack the pair of inputs: wide path (a) and deep path (b)
        inputa, inputb = inputs
        hidden1 = self.hidden_layer1(inputb)
        hidden2 = self.hidden_layer2(hidden1)
        # Join the wide input with the output of the deep path
        concat = tf_keras.layers.concatenate([inputa, hidden2])
        return self.output_layer(concat)
        

In [12]:
# Instantiate & train model
model3 = WideAndDeepModel(30, 'relu')
model3.compile(optimizer='sgd', loss='mean_squared_error', metrics=['mae'])
model3.fit((x_train__trainA, x_train__trainB), y_train__train, epochs=20,
           validation_data=((x_train__val_A, x_train__val_B), y_train__val), verbose=0)

<tensorflow.python.keras.callbacks.History at 0x133fb0fd0>

In [13]:
# Evaluate & Predict
model3.evaluate((x_testA, x_testB), y_test)
model3.predict((x_testA[:2], x_testB[:2]))



array([[ 7.9615493],
       [16.662956 ]], dtype=float32)

### Saving & Restoring a Model

This is useful when models take a long time to train or when you need access to a previously trained model.

In [14]:
# Saving a model
model1.save('model1.h5')

In [15]:
# Load & Predict
model1ld = tf_keras.models.load_model('model1.h5')
model1ld.predict((x_testA[10:15], x_testB[10:15]))

array([[16.56538 ],
       [19.089624],
       [17.795437],
       [42.52627 ],
       [20.118984]], dtype=float32)

### Callbacks

Callbacks are useful to perform actions during training. For example, say we want to save the best model during training.

In [16]:
input_layer = tf_keras.layers.Input(shape=NUM_FEATURES)
hidden_layer1 = tf_keras.layers.Dense(30, activation='relu')(input_layer)
hidden_layer2 = tf_keras.layers.Dense(30, activation='relu')(hidden_layer1)
concat_layer = tf_keras.layers.Concatenate()([input_layer, hidden_layer2])
output_layer = tf_keras.layers.Dense(1)(concat_layer)
model0a = tf_keras.models.Model(inputs=[input_layer], outputs=output_layer)
model0a.compile(optimizer='sgd', loss='mean_squared_error', metrics=['mae'])

# Adding a callback to save only the best model (lowest validation loss by default)
save_best_checkpoint = tf_keras.callbacks.ModelCheckpoint('model0a_best.h5', save_best_only=True)
# Train on the training split only, so that the validation split stays unseen
model0a.fit(x_train__train, y_train__train, epochs=10, validation_data=(x_train__val, y_train__val),
            callbacks=[save_best_checkpoint], verbose=0)

<tensorflow.python.keras.callbacks.History at 0x1340d3fd0>

In [17]:
# Adding an EarlyStopping callback to avoid wasting time and resources
# once the validation loss stops improving
stop_early_checkpoint = tf_keras.callbacks.EarlyStopping(patience=5, restore_best_weights=True)

# Combine both callbacks. Use a large number of epochs, since training stops
# automatically once the validation loss has not improved for `patience` epochs.
# Train on the training split only, so that the validation split stays unseen.
model0a.fit(x_train__train, y_train__train, epochs=100, validation_data=(x_train__val, y_train__val),
            callbacks=[save_best_checkpoint, stop_early_checkpoint],
            verbose=0)

<tensorflow.python.keras.callbacks.History at 0x134334dd8>
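
The patience rule can be traced by hand. The following plain-Python sketch is a simplified illustration of the idea, not the Keras implementation: the function `early_stopping_epoch` and the list of validation losses are hypothetical. It tracks the best validation loss seen so far (the same bookkeeping that lets a "save best only" checkpoint decide when to write a file) and stops once `patience` epochs pass without improvement.

```python
def early_stopping_epoch(val_losses, patience=5):
    """Return (stop_epoch, best_epoch), both 0-indexed.

    Training stops at the first epoch where the validation loss has failed
    to improve on the best value for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = None
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # New best: remember it and reset the patience counter.
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    # Patience never ran out: training used every epoch.
    return len(val_losses) - 1, best_epoch

# Made-up validation losses: the best value within reach is at epoch 2.
val_losses = [10.0, 8.0, 7.0, 7.5, 7.2, 7.1, 7.3, 7.4, 6.0]

print(early_stopping_epoch(val_losses, patience=5))   # (7, 2)
print(early_stopping_epoch(val_losses, patience=10))  # (8, 8)
```

With a patience of 5 the run stops at epoch 7 and never sees the later improvement at epoch 8, which is the trade-off behind the choice of `patience`. Restoring the best weights corresponds to going back to the state at `best_epoch`.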

### Visualisation using TensorBoard

In [18]:
import os

In [19]:
def get_run_logdir(root_logdir):
    import time
    run_id = time.strftime("r_%Y%m%d_%H%M%S")
    return os.path.join(root_logdir, run_id)

root_logdirp = os.path.join(os.curdir, "logs")
run_logdir = get_run_logdir(root_logdirp)
print(run_logdir)

./logs/r_20200601_123434


In [20]:
# Create the TensorBoard callback and use it
tensorboard_cb = tf_keras.callbacks.TensorBoard(run_logdir)
# Train on the training split only, so that the validation split stays unseen
model0a.fit(x_train__train, y_train__train, epochs=100, validation_data=(x_train__val, y_train__val),
            callbacks=[save_best_checkpoint, tensorboard_cb],
            verbose=0)

<tensorflow.python.keras.callbacks.History at 0x134418358>

Finally, you can launch TensorBoard with `python -m tensorboard.main --logdir=./logs` (the root log directory used above, so that every run is listed) and open the URL it prints in a browser.

<img src="img3c.png" width="750"/>

Additional Readings:

- (1)  https://ai.googleblog.com/2016/06/wide-deep-learning-better-together-with.html
- (2)  https://github.com/lutzroeder/Netron