# Dense Neural Network (with TensorFlow/Keras)
This first notebook takes us through building our first neural network. If you haven't already, check that you are using GPU processing and, if necessary, switch to it by clicking Runtime > Change runtime type and selecting GPU. We can test that this has worked with the following code:

In [None]:
import tensorflow as tf

print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))

Num GPUs Available:  1


Hopefully your code shows you have 1 GPU available! Next let's get some data. We'll start with a Python favourite:

In [None]:
# load a dataset that ships with scikit-learn (not quite built into Python, but close)
from sklearn.datasets import load_diabetes

import pandas as pd
import numpy as np

# import the data
data = load_diabetes()
data

{'data': array([[ 0.03807591,  0.05068012,  0.06169621, ..., -0.00259226,
          0.01990749, -0.01764613],
        [-0.00188202, -0.04464164, -0.05147406, ..., -0.03949338,
         -0.06833155, -0.09220405],
        [ 0.08529891,  0.05068012,  0.04445121, ..., -0.00259226,
          0.00286131, -0.02593034],
        ...,
        [ 0.04170844,  0.05068012, -0.01590626, ..., -0.01107952,
         -0.04688253,  0.01549073],
        [-0.04547248, -0.04464164,  0.03906215, ...,  0.02655962,
          0.04452873, -0.02593034],
        [-0.04547248, -0.04464164, -0.0730303 , ..., -0.03949338,
         -0.00422151,  0.00306441]]),
 'target': array([151.,  75., 141., 206., 135.,  97., 138.,  63., 110., 310., 101.,
         69., 179., 185., 118., 171., 166., 144.,  97., 168.,  68.,  49.,
         68., 245., 184., 202., 137.,  85., 131., 283., 129.,  59., 341.,
         87.,  65., 102., 265., 276., 252.,  90., 100.,  55.,  61.,  92.,
        259.,  53., 190., 142.,  75., 142., 155., 225.,  59

We are working on a regression problem, with "structured" data which has already been cleaned and standardised. We can skip the usual cleaning/engineering steps. However, we do need to get the data into TensorFlow:

In [None]:
tf_dataset = tf.data.Dataset.from_tensor_slices((data.data, data.target))

Now that our data is stored in tensors, we can do train/test splitting as before. However, as we will care about batch size, let's pick an easy number to work with:

In [None]:
data_size = len(data.data)
print(data_size)

442


Given 442 records we can take 400 as training (roughly 90% ... it is a small dataset) and keep 42 for test. In TF we use _take_ to subset the first $n$ values and _skip_ to ignore these and subset the rest:

In [None]:
train_dataset = tf_dataset.take(400)
test_dataset = tf_dataset.skip(400)
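
As an analogy only, `take` and `skip` behave much like slicing a sequence at position $n$. The sketch below uses plain Python (`itertools.islice`) rather than TensorFlow, and `records` is a hypothetical stand-in for the 442 rows:

```python
from itertools import islice

# Hypothetical stand-in for the 442 (features, target) records
records = range(442)
n_train = 400

# "take": keep only the first n_train records
train = list(islice(records, n_train))

# "skip": ignore the first n_train records and keep the rest
test = list(islice(records, n_train, None))

print(len(train), len(test))  # 400 42
print(train[-1], test[0])     # 399 400
```

Because the split follows the stored order, the test set is simply the last 42 records; no shuffling happens at this step.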

Now we can set up our batches for training. As we have a nice round 400, let's go with batches of 50 (8 batches in total). We'll also separate the features and labels of the first batch so we can inspect them:

In [None]:
train_batch = train_dataset.batch(50) # batch size = 50
features, labels = next(iter(train_batch)) # take the first batch only (50 records) and unpack its features and labels

# print the first 20 records (rows) of features and labels
print(features[0:20])
print(labels[0:20])

tf.Tensor(
[[ 3.80759064e-02  5.06801187e-02  6.16962065e-02  2.18723855e-02
  -4.42234984e-02 -3.48207628e-02 -4.34008457e-02 -2.59226200e-03
   1.99074862e-02 -1.76461252e-02]
 [-1.88201653e-03 -4.46416365e-02 -5.14740612e-02 -2.63275281e-02
  -8.44872411e-03 -1.91633397e-02  7.44115641e-02 -3.94933829e-02
  -6.83315471e-02 -9.22040496e-02]
 [ 8.52989063e-02  5.06801187e-02  4.44512133e-02 -5.67042229e-03
  -4.55994513e-02 -3.41944659e-02 -3.23559322e-02 -2.59226200e-03
   2.86130929e-03 -2.59303390e-02]
 [-8.90629394e-02 -4.46416365e-02 -1.15950145e-02 -3.66560811e-02
   1.21905688e-02  2.49905934e-02 -3.60375700e-02  3.43088589e-02
   2.26877450e-02 -9.36191133e-03]
 [ 5.38306037e-03 -4.46416365e-02 -3.63846922e-02  2.18723855e-02
   3.93485161e-03  1.55961395e-02  8.14208361e-03 -2.59226200e-03
  -3.19876395e-02 -4.66408736e-02]
 [-9.26954778e-02 -4.46416365e-02 -4.06959405e-02 -1.94418262e-02
  -6.89906499e-02 -7.92878444e-02  4.12768238e-02 -7.63945038e-02
  -4.11761669e-02 -9.6
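
The batch arithmetic can be checked by hand. The following is a plain-Python sketch, not TensorFlow code; `make_batches` is a hypothetical helper that groups records into consecutive batches and, like the `batch` call above with its default settings, keeps a smaller final batch when the sizes do not divide evenly:

```python
import math


def make_batches(records, batch_size):
    """Group records into consecutive batches; the last one may be smaller."""
    return [records[i:i + batch_size] for i in range(0, len(records), batch_size)]


train_records = list(range(400))  # stand-in for the 400 training records
batches = make_batches(train_records, 50)

print(len(batches))                         # 8
print(len(batches) == math.ceil(400 / 50))  # True

# With a size that does not divide evenly, the final batch is shorter
uneven = make_batches(list(range(442)), 50)
print(len(uneven), len(uneven[-1]))         # 9 42
```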

Now it's time to build our model. We'll keep it simple: an input of 10 features followed by three _Dense_ (fully connected) hidden layers with 8, 7 and 6 neurons, each using ReLU activation. Our output layer will have size 1, given this is a regression problem and we want a single value output per prediction.

In [None]:
# the layer sizes are a design choice; feel free to try others
model = tf.keras.Sequential([
  tf.keras.layers.Dense(8, activation=tf.nn.relu, input_shape=(10,)),  # 10 input features projected onto 8 neurons
  tf.keras.layers.Dense(7, activation=tf.nn.relu),
  tf.keras.layers.Dense(6, activation=tf.nn.relu),
  tf.keras.layers.Dense(1)  # output layer: a single value for regression
])
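
To get a feel for how small this network is, we can count its trainable parameters by hand. Each Dense layer has one weight per input-to-neuron connection plus one bias per neuron. The sketch below is plain Python; `dense_params` is a hypothetical helper, not part of Keras:

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units


layer_sizes = [10, 8, 7, 6, 1]  # input features, three hidden layers, output

per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]

print(per_layer)       # [88, 63, 48, 7]
print(sum(per_layer))  # 206
```

Calling `model.summary()` on the model above should report the same total.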


Before training, we compile the model (using Keras), which is where we specify some hyperparameters. Specifically, these are an 'AdamW' optimiser (Adam with weight decay), a loss/cost function of MSE, and the same measure used as a metric to report during training and evaluation.

We will then fit the algorithm to the training data and run for 1,500 epochs (1,500 full presentations of the data):

In [None]:
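# compile the model
model.compile(optimizer=tf.keras.optimizers.AdamW(learning_rate=0.001, weight_decay=0.005),  # AdamW = Adam with weight decay, which shrinks the weights slightly at every step (a form of regularisation)
              loss=tf.keras.losses.MeanSquaredError(),
              metrics=['mse'])

# train on all 8 batches (400 records) rather than only the first batch held in features/labels
model.fit(train_batch, epochs=1500)  # epochs = number of full passes over the training data; the loss is the mean squared error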

Epoch 1/1500
Epoch 2/1500
Epoch 3/1500
...

<keras.src.callbacks.History at 0x7e29a06a0dc0>
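
To make the `weight_decay` setting more concrete, here is a simplified, single-parameter sketch of an AdamW-style update in plain Python. It follows the general idea of decoupled weight decay (an Adam step plus a separate shrinkage term proportional to the weight) and is not Keras's exact implementation; `adamw_step` and its default values for `beta1`, `beta2` and `eps` are assumptions for illustration:

```python
import math


def adamw_step(w, grad, m, v, t, lr=0.001, weight_decay=0.005,
               beta1=0.9, beta2=0.999, eps=1e-7):
    """One simplified AdamW-style update for a single scalar weight."""
    # Running averages of the gradient and the squared gradient
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    # Bias correction for the early steps (t starts at 1)
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    # Adam step plus decoupled weight decay, which pulls w towards zero
    w = w - lr * (m_hat / (math.sqrt(v_hat) + eps) + weight_decay * w)
    return w, m, v


# With a zero gradient only the decay term acts:
# w shrinks by a factor of (1 - lr * weight_decay)
w, m, v = adamw_step(w=2.0, grad=0.0, m=0.0, v=0.0, t=1)
print(w)

# With a positive gradient the weight also moves against the gradient
w2, _, _ = adamw_step(w=2.0, grad=1.0, m=0.0, v=0.0, t=1)
print(w2 < 2.0)  # True
```

The point to take away is that weight decay acts on the weights themselves, nudging them towards zero at every step; it does not change the learning rate.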

Testing ...

Here we'll send the full test dataset (42 records) as one batch.

In [None]:
test_batch = test_dataset.batch(42) # the whole test set (42 records) in one batch
test_features, test_labels = next(iter(test_batch))

test_loss, test_mse = model.evaluate(test_features, test_labels, verbose=2)
print('\nTest MSE:', test_mse) # the lower the MSE, the better

2/2 - 0s - loss: 3592.0378 - mse: 3592.0378 - 122ms/epoch - 61ms/step

Test MSE: 3592.037841796875
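
MSE is in squared units of the target, so it can be hard to read directly. Taking the square root gives the RMSE, which is in the same units as the target values. A minimal sketch using the test MSE printed above:

```python
import math

test_mse = 3592.037841796875  # value copied from the output above
test_rmse = math.sqrt(test_mse)

print(round(test_rmse, 2))  # 59.93
```

In other words, predictions on the test set are off by roughly 60 units in a root-mean-square sense.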


Finally it's prediction time ...

In [None]:
y_pred = model.predict(test_features)
for pred, real in zip(y_pred, test_labels):
    print(f"Predicted: {pred[0]};    Real: {real}")

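_, test_mse = model.evaluate(test_features, test_labels, verbose=0)
rmse = np.sqrt(test_mse)  # evaluate returns the MSE; its square root is the RMSE, in the units of the target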

Predicted: 190.12535095214844;    Real: 175.0
Predicted: 77.57962036132812;    Real: 93.0
Predicted: 123.94548034667969;    Real: 168.0
Predicted: 264.8620910644531;    Real: 275.0
Predicted: 196.36781311035156;    Real: 293.0
Predicted: 304.860107421875;    Real: 281.0
Predicted: 71.8263931274414;    Real: 72.0
Predicted: 145.33189392089844;    Real: 140.0
Predicted: 206.8909454345703;    Real: 189.0
Predicted: 157.62112426757812;    Real: 181.0
Predicted: 169.6982879638672;    Real: 209.0
Predicted: 153.8703155517578;    Real: 136.0
Predicted: 211.5583038330078;    Real: 261.0
Predicted: 105.9012222290039;    Real: 113.0
Predicted: 146.96881103515625;    Real: 131.0
Predicted: 162.42919921875;    Real: 174.0
Predicted: 204.90097045898438;    Real: 257.0
Predicted: 135.2172088623047;    Real: 55.0
Predicted: 74.1446762084961;    Real: 84.0
Predicted: 74.00794982910156;    Real: 42.0
Predicted: 106.22685241699219;    Real: 146.0
Predicted: 183.77883911132812;    Real: 212.0
Predicted: 
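
A quick way to put a number on these comparisons is the mean absolute error (MAE). The sketch below uses only the first five pairs printed above, rounded to two decimal places, so it illustrates the calculation rather than scoring the whole test set:

```python
# (predicted, real) pairs copied from the first five output lines, rounded
pairs = [
    (190.13, 175.0),
    (77.58, 93.0),
    (123.95, 168.0),
    (264.86, 275.0),
    (196.37, 293.0),
]

abs_errors = [abs(pred - real) for pred, real in pairs]
mae = sum(abs_errors) / len(abs_errors)

print([round(e, 2) for e in abs_errors])  # [15.13, 15.42, 44.05, 10.14, 96.63]
print(round(mae, 2))                      # 36.27
```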

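Overall these look like decent results for such a small network: most predictions land in the right region, although a few (for example 135.2 predicted against a real value of 55.0) are well off. One neural network down ... well done 👊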