# 001 - Building and Training Networks in Keras
This workbook contains a set of exercises designed to get you comfortable with building and training Neural Networks using the Keras API.

We will cover:
 - Creating Single layer networks
 - Creating Multi-Layer networks

In [None]:
import numpy as np
import tensorflow as tf
import tensorflow.keras as keras
import matplotlib.pyplot as plt
import pandas as pd

# Display graphs inline
%matplotlib inline

In [None]:
def plotTrainingVsModel(X, y, model):
  predicted = model.predict(X)

  # Plot the data out
  plt.plot(X, y, 'o', color='red', label="Expected Value")
  plt.plot(X, predicted, 'x', color='blue', label="Predicted Value")
  plt.title("Expected vs Model Predicted")
  plt.legend()
  plt.grid(True, which='major', color='#666666', linestyle='-')
  plt.show()

## Exercise 1
In this exercise you will use a simple data set and create a single layer network.

The relationship in the data is that the output (_y_data_) is __twice the input plus 5__.

We want to train a network to learn this sequence and be able to predict with reasonable accuracy the next values in the sequence - for example, if given the value 11 we would expect the output to be 27.
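To make the target concrete, here is a minimal sketch in plain NumPy (not Keras) of what a single unit computes: a weight times the input, plus a bias. The names `weight`, `bias` and `single_unit` are illustrative, and the values 2 and 5 are the ideal parameters for this data. A trained network will only approximate them.

```python
import numpy as np

# Illustrative values: the ideal parameters for y = 2x + 5
weight = 2.0
bias = 5.0

def single_unit(x):
    """What one unit with no activation function computes: weight * x + bias."""
    return weight * x + bias

inputs = np.array([1.0, 2.0, 3.0, 11.0])
outputs = single_unit(inputs)
print(outputs.tolist())  # [7.0, 9.0, 11.0, 27.0]
```

Training is the process of finding values for the weight and bias that get as close as possible to these.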

In [None]:
X_data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
y_data = [7.0, 9.0, 11.0, 13.0, 15.0, 17.0, 19.0, 21.0, 23.0, 25.0]

# Convert these to numpy arrays
X_data = np.array(X_data, dtype=float)
y_data = np.array(y_data, dtype=float)

We have started the model by defining it as a Sequential model. Your task is to complete the line:

`model.add( None)` 

by replacing the text `None` with a definition for a layer.

As a hint, we can define a layer using the syntax

`keras.layers.Dense(units=number_of_units, input_shape=[size_of_input])`

You will need to replace __number_of_units__ and __size_of_input__ with values based on:
- _number_of_units_: For this exercise set this to 1
- _size_of_input_: this is the size (shape) of an individual input.

__Hint__: This is almost identical to the example given in 'Developing DL Intuition'

In [None]:
model = keras.Sequential()

# YOUR CHANGES START HERE
model.add( None)
# YOUR CHANGES END HERE

# Compile the model with an optimizer and loss function
model.compile(tf.keras.optimizers.RMSprop(0.001), loss='mean_squared_error')

# Produce a summary of our model
print("\nModel Summary\n")
model.summary()

Now that we have defined the model, we want to train it. We have completed most of this statement, but you need to decide how many _epochs_ to train for.

To do this, change the value of _epochs_ from _None_ to some value (greater than 1).

In [None]:
# Train the model
history = model.fit(X_data, y_data, epochs=None)

Let's see how good our model is.

If the model hasn't performed as well as you want it to, re-run the steps in the exercise and train with a higher number of epochs.
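To see why more epochs can help, the sketch below fits the same data with plain full-batch gradient descent written in NumPy. This is an assumption made for illustration: it is not the RMSprop optimizer that Keras uses in this workbook, and the `train` function and its learning rate are hypothetical. The general pattern is the same, though: the loss shrinks as the number of passes over the data grows.

```python
import numpy as np

X = np.arange(1.0, 11.0)   # 1.0 .. 10.0
y = 2.0 * X + 5.0

def train(epochs, learning_rate=0.01):
    """Fit y = w * x + b with full-batch gradient descent; return (w, b, loss)."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        error = (w * X + b) - y
        # Gradients of the mean squared error with respect to w and b
        w -= learning_rate * 2.0 * np.mean(error * X)
        b -= learning_rate * 2.0 * np.mean(error)
    loss = np.mean(((w * X + b) - y) ** 2)
    return w, b, loss

results = {epochs: train(epochs) for epochs in (50, 500, 5000)}
for epochs, (w, b, loss) in results.items():
    print(f"epochs={epochs:5d}  w={w:.3f}  b={b:.3f}  loss={loss:.6f}")
```

With only a few epochs the weight and bias are still some way from 2 and 5; with more epochs they move closer and the loss falls.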

In [None]:
plotTrainingVsModel(X_data, y_data, model)

Now let's test the model against some values it has not seen before. How well did it do?

Try out some other values to see how well the model has learned the relationship __(2 * x) + 5__

In [None]:
# Change this value from 11 if you want to try other values and re-run the cell
value_to_predict = 11

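# Keras expects a batch of inputs, so wrap the single value in an array of shape (1, 1)
predicted = model.predict(np.array([[value_to_predict]], dtype=float))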

print(predicted)

## Exercise 2
In this exercise we have data where each training instance consists of 3 values, and we want to train a model that learns the relationship between these values and the output value.

The relationship between the input and output is that the output is the middle (second) value of the input.

In [None]:
X_data = [
    [1.0, 5.0, 2.0], 
    [4.0, 4.0, 5.0], 
    [7.0, 3.0, 8.0],
    [10.0, 3.0, 6.0],
    [4.0, 9.0, 5.0],
    [7.0, 2.0, 6.0],
    [1.0, 1.0, 3.0],
    [12.0, 4.0, 2.0],
    [6.0, 6.0, 1.0],
    [3.0, 3.0, 3.0]
]
# Here we use a Python list comprehension to pick out the middle value of each row
y_data = [y for [x, y, z] in X_data]
print(y_data)

# Convert these to numpy arrays
X_data = np.array(X_data, dtype=float)
y_data = np.array(y_data, dtype=float)

Now we need to build our model - this is the same as in the previous exercise so refer back to your work to get some hints.

As before your task is to complete the line:

`model.add( None)` 

by replacing the text `None` with a definition for a layer.

We can define a layer using the syntax

`keras.layers.Dense(units=number_of_units, input_shape=[size_of_input])`

You will need to replace __number_of_units__ and __size_of_input__ with values based on:
- _number_of_units_: this defines how many nodes you want in your layer. Set this to 1.
- _size_of_input_: this is the size (shape) of an individual input.

A key difference between the previous model and this one is that previously you had 1 data item for each training example. Now we have 3, so our _size_of_input_ needs to change.
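As a rough sketch in plain NumPy (not Keras), a single unit with three inputs holds one weight per input plus a bias. The names `weights` and `bias` are illustrative, and the values shown are the ideal ones for picking out the middle value. A trained model will only approximate them.

```python
import numpy as np

# One training instance with three features (same layout as a row of X_data)
instance = np.array([1.0, 5.0, 2.0])
print(instance.shape)  # (3,)

# Illustrative ideal parameters: keep the middle value, ignore the other two
weights = np.array([0.0, 1.0, 0.0])
bias = 0.0

output = float(np.dot(weights, instance) + bias)
print(output)  # 5.0
```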

In [None]:
model = keras.Sequential()

# YOUR CHANGES START HERE
model.add( None)
# YOUR CHANGES END HERE

# Compile the model with an optimizer and loss function
model.compile(tf.keras.optimizers.RMSprop(0.001), loss='mean_squared_error')

# Produce a summary of our model
print("\nModel Summary\n")
model.summary()

In [None]:
# Train the model
history = model.fit(X_data, y_data, epochs=100)

Now let's see how good our model is at predicting the output for a given input

In [None]:
# Change this value if you want to try other values and re-run the cell
value_to_predict = np.array([[1.0, 5.0, 1.0]], dtype=float)

predicted = model.predict(value_to_predict)

print(predicted)

## Exercise 3
In this exercise we will use a more complex set of data and create a multi-layer network.

We are less interested in the accuracy of the model here (we'll return to this dataset again in a later lesson). For this exercise we are only interested in practicing building and training a multi-layer network.

In [None]:
data_file = keras.utils.get_file(fname="auto-mpg.data", origin="https://archive.ics.uci.edu/ml/machine-learning-databases/auto-mpg/auto-mpg.data")
# The data has the following columns
column_names = ['MPG','Cylinders','Displacement','Horsepower','Weight',
            'Acceleration', 'Model Year', 'Origin Country']
# Read the file into a pandas DataFrame
raw_dataset = pd.read_csv(data_file, names=column_names,
                  na_values = "?", comment='\t',
                  sep=" ", skipinitialspace=True)
# We will discard any rows that contain data with missing values
raw_dataset = raw_dataset.dropna()
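# 'Model Year' and 'Origin Country' are kept, so each sample has 7 features.
# (DataFrame.drop returns a new DataFrame, so calling it without assigning the result changes nothing.)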
raw_dataset.to_csv('simple.csv', index=False, header=False)
raw_dataset.head()

In [None]:
my_data = np.genfromtxt('simple.csv', delimiter=',')
X_data = my_data[:,1:].astype(np.float16)
y_data = my_data[:,0].astype(np.float16)

# Print out the first 5 records
for i in range(5):
    print("X_data = {}\t y_data = {}".format(X_data[i], y_data[i]))

We can determine the shape of our training samples using the _shape_ attribute.

In the case of __X_data__ the shape is (392, 7).
- The first number (392) is the number of training samples
- The second number (7) is the number of features in each training sample. We need this to define the size of the input on our input layer

In the case of __y_data__ the shape is (392,)
- The first number (392) is the number of training samples
- It doesn't have a second number since __y_data__ is a list of numbers, so each training sample corresponds to a single output value.
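The difference between the two shapes comes from how the columns are selected. The sketch below uses a small made-up array (the name `table` and its values are purely illustrative, not the auto-mpg data) to show that slicing a range of columns keeps two dimensions, while indexing a single column collapses to one.

```python
import numpy as np

# Made-up stand-in: 4 samples, target in column 0, 3 feature columns
table = np.arange(16, dtype=float).reshape(4, 4)

features = table[:, 1:]   # a slice of columns stays 2-D
targets = table[:, 0]     # a single column index collapses to 1-D

print(features.shape)  # (4, 3)
print(targets.shape)   # (4,)
```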

In [None]:
print("Shape of X_data is {}".format(X_data.shape))
print("Shape of y_data is {}".format(y_data.shape))

Now we need to build our model.

Previously we only created a single-layer network. In this exercise we are going to create a multi-layer network. It will consist of:

__An Input layer__: we create this in the same way as before, but because our data is a bit more complex we will want to increase the number of units from 1 to some higher value such as 32, 64 or 128. The input shape should be set to the number of features for each sample in X_data.

In this exercise, change the values for _units_ and _input_shape_ to replace the value _None_ with more appropriate values.

__A Hidden layer__: we simply add one or more Dense layers. Since this is not an input layer we don't specify the input shape (Keras works this out for us), so we just need to specify the number of units and an activation function: `model.add( keras.layers.Dense(units=??, activation=tf.nn.relu) )`

In this exercise, change the value for _units_ to replace the value _None_ with a more appropriate value. Optionally, add one or more additional layers.

__An Output layer__: this is the layer that produces the prediction. It is just another Dense layer, but the number of units must match the number of target values (1) in our __y_data__.

In this exercise, change the value for _units_ to replace the value _None_ with the appropriate values.
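Once the layers are defined, `model.summary()` reports a parameter count for each one. As a rough check, a Dense layer has one weight per input per unit, plus one bias per unit. The sketch below applies that formula to an illustrative stack of layers. The sizes 64, 64 and 1 are assumptions chosen for the example, and `dense_params` is a hypothetical helper, not part of Keras.

```python
def dense_params(n_inputs, n_units):
    """Trainable parameters in a Dense layer: weights plus biases."""
    return n_inputs * n_units + n_units

# Illustrative layer sizes (an assumption, pick your own in the exercise)
n_features = 7
layer_units = [64, 64, 1]

total = 0
n_inputs = n_features
for n_units in layer_units:
    count = dense_params(n_inputs, n_units)
    print(f"Dense({n_units}) on {n_inputs} inputs: {count} parameters")
    total += count
    n_inputs = n_units  # this layer's outputs feed the next layer

print("Total:", total)  # Total: 4737
```

If the numbers in your own summary differ from what this formula gives for your layer sizes, check the input shape on the first layer.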

In [None]:
model = keras.Sequential()
# YOUR CHANGES START HERE

# Input Layer
model.add(keras.layers.Dense(units=None, activation=tf.nn.relu, input_shape=[None]) )

# Hidden Layer
model.add( keras.layers.Dense(units=None, activation=tf.nn.relu) )

# Output Layer - we want a single value to be output so our output layer has a single unit
model.add( keras.layers.Dense(None) )

# YOUR CHANGES END HERE

# Compile the model with an optimizer and loss function
model.compile(keras.optimizers.RMSprop(0.001), loss='mean_squared_error')

# Produce Summary
model.summary()

We are now ready to train our model. Specify the number of epochs to train for, aiming to get the _loss_ value below 50.

In [None]:
# Train the model
history = model.fit(X_data, y_data, epochs=100)

# Key Learning Points
You have now created a set of models to 'solve' some toy datasets. The key points to remember from this exercise are:

## Networks are like Onions...they have layers
At a simplistic level a Neural Network can be thought of as a series of layers.

__Input Layer__: this layer accepts the raw inputs and passes them to the network for processing. All Neural Networks have this kind of layer. These layers primarily take the data from the outside for processing and typically don't have any learning associated with them. Keras has a special network layer called __Input__ that we can use for this purpose, but we can also specify the input as part of the first layer.

__Hidden Layers__: networks may have one or more hidden layers. These are layers that sit behind the Input layer and perform the learning. Some simple networks might only have 1 hidden layer, but complex networks can have hundreds, or in some cases over a thousand.

We can add as many hidden layers as we want, but there are limits due to the way that optimisation algorithms work. This has led to more advanced architectures designed to overcome what is known as the __Vanishing Gradient__ problem.
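A rough illustration of why depth causes trouble, assuming sigmoid activations and ignoring the weights entirely (both are simplifying assumptions): during training, the gradient reaching an early layer is multiplied by one activation slope per layer it passes through. The sigmoid's slope is never larger than 0.25, so the product shrinks quickly as layers are added.

```python
import math

def sigmoid_derivative(z):
    """Slope of the sigmoid function at z."""
    s = 1.0 / (1.0 + math.exp(-z))
    return s * (1.0 - s)

max_slope = sigmoid_derivative(0.0)  # 0.25, the largest slope the sigmoid has

for depth in (1, 5, 10, 20):
    print(f"{depth:2d} layers: gradient scaled by at most {max_slope ** depth:.2e}")
```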

__Output Layer__: networks contain an output layer that takes the data from the previous layers and marshals this into an output.

## Think about your input data
As data passes between layers, the shape (dimensions) of the data often changes. Within the model, Keras does an excellent job of tracking these changes, so we don't need to be concerned about them.

However, Keras understands little about the interface into the model, so we must specify the shape of the input at the _Input layer_. This needs to match the shape of the data we are passing into the model. To do this we need to look at the shape of the data for a single training instance and use this to tell Keras what shape the data is.

We don't need to tell Keras how many training instances we have, only the shape of a single training instance.
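The sketch below mimics a small two-layer network in plain NumPy to show both points. The layer sizes, the random weights and the `forward` function are illustrative assumptions, not what Keras builds internally. The weight shapes depend only on the number of features per instance, so the same weights work for any number of instances.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: 7 input features, 16 hidden units, 1 output
n_features, n_hidden, n_outputs = 7, 16, 1
W1, b1 = rng.normal(size=(n_features, n_hidden)), np.zeros(n_hidden)
W2, b2 = rng.normal(size=(n_hidden, n_outputs)), np.zeros(n_outputs)

def forward(batch):
    hidden = np.maximum(0.0, batch @ W1 + b1)  # dense layer followed by ReLU
    return hidden @ W2 + b2                    # dense layer, no activation

# The number of instances varies; the shape of each instance does not
for n_samples in (1, 4, 392):
    batch = rng.normal(size=(n_samples, n_features))
    print(batch.shape, "->", forward(batch).shape)
```

Each batch goes in with 7 values per instance and comes out with 1, whatever the number of instances.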

## Think about your output data
Any model we create aims to produce an output, so we need to consider the shape of our output. Typically, if we are predicting a single value (e.g. the next number in a sequence), our _Output Layer_ will be a _Dense_ layer with a single unit. Later we will see instances where we need more than 1 unit in the output layer.