# <center> Keras </center>
## <center>1.5 Building a network</center>

# Building a network

We will now build a neural network for digit classification. 


# Code

Let us run the code from the previous section first to import the dataset and reshape the input and output data.

In [None]:
# Importing the MNIST dataset
from keras.datasets import mnist
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

# Processing the input data
train_images = train_images.reshape((60000, 28 * 28))
train_images = train_images.astype('float32') / 255

test_images = test_images.reshape((10000, 28 * 28))
test_images = test_images.astype('float32') / 255

# Processing the output data
from keras.utils import to_categorical

train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)

Let us now examine the code that creates a new network:

In [None]:
# Build a network
from keras import models
from keras import layers

network = models.Sequential()
network.add(layers.Dense(units=512, activation='relu', input_shape=(28 * 28,)))
network.add(layers.Dense(units=10, activation='softmax'))

### Models
Keras offers two main ways to define a model: the `Sequential` model and the `Model` class used with the functional API.
A `Sequential` model is a linear stack of layers. You can create one by passing a list of layer instances to the constructor, or by adding layers one at a time with `add()`, as in the code above.<br>
In this workshop, we discuss only the `Sequential` model. Interested readers can consult the <a href = "https://keras.io/getting-started">official Keras documentation</a> for more details.
<br><br>
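
The list form of the constructor can be sketched as follows. This is a minimal example, assuming the same Keras installation used in the rest of this section; the variable name `network_from_list` is only illustrative. It should produce the same two-layer architecture as the `add()` calls in the code cell above.

```python
from keras import models
from keras import layers

# Pass the layers as a list to the constructor instead of calling add() repeatedly.
# network_from_list is an illustrative name, not part of the workshop code.
network_from_list = models.Sequential([
    layers.Dense(units=512, activation='relu', input_shape=(28 * 28,)),
    layers.Dense(units=10, activation='softmax'),
])

# Print one row per layer with its output shape and parameter count.
network_from_list.summary()
```

Both styles describe the same stack of layers; the list form is compact, while `add()` is convenient when layers are added in a loop.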

### Layers
There are three types of layers in a neural network:
- input layer
- hidden layer
- output layer
<br>

Keras supports different kinds of layers that can be used in a network. The simplest one, which we also use in this example, is the `Dense` layer. A `Dense` layer is a fully connected neural network layer: every neuron in it receives input from every neuron in the previous layer.
<br>
In the code block above, a fully connected neural network with two layers is created. The first is a hidden layer with 512 neurons, fully connected to the input layer of 28 * 28 = 784 input neurons. The second is the output layer with 10 neurons, fully connected to the hidden layer. <br>

The argument `input_shape` specifies the input dimensions of the neural network. It has to be provided only to the first layer in a sequential model. The argument `units` specifies the number of neurons in the layer. The argument `activation` specifies the activation function used in that layer. More details on units and activation functions are provided in separate sections.
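
To make the role of `units` and `activation` more concrete, here is a rough NumPy sketch of what the two `Dense` layers compute: each layer applies its activation to `dot(input, weights) + bias`. The weights here are random placeholders and the names `w1`, `b1`, `w2`, `b2`, and `forward` are assumptions made for illustration; Keras creates and trains its own weights internally.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical random weights; Keras initializes and trains its own.
w1, b1 = rng.normal(scale=0.01, size=(784, 512)), np.zeros(512)
w2, b2 = rng.normal(scale=0.01, size=(512, 10)), np.zeros(10)

def relu(x):
    return np.maximum(x, 0.0)

def softmax(x):
    # Subtract the row maximum for numerical stability.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def forward(images):
    hidden = relu(images @ w1 + b1)      # shape: (batch, 512)
    return softmax(hidden @ w2 + b2)     # shape: (batch, 10)

batch = rng.random((4, 784))             # four fake flattened 28x28 images
probabilities = forward(batch)
print(probabilities.shape)               # (4, 10)
print(probabilities.sum(axis=1))         # each row sums to 1 (up to rounding)
```

The `units` argument fixes the second dimension of each weight matrix, and the softmax output can be read as one probability per digit class.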

The animation below shows the process of creating a network with Keras for 8 inputs and 1 output with two hidden layers containing 10 and 6 neurons respectively.
<img src="img/keras_layer_stacking.gif" width="100%" />
<br>In our example, we are using an input shape of 784 (number of pixels in each image) and 10 classes for digit classification (from 0-9) with one hidden layer containing 512 neurons.
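
The size of a fully connected network can be estimated by hand: each layer has one weight per connection to the previous layer plus one bias per neuron. The helper below is a small sketch of that arithmetic; `count_dense_parameters` is a hypothetical function written for this illustration, not a Keras API.

```python
def count_dense_parameters(layer_sizes):
    """Return the number of weights and biases in a stack of fully connected layers.

    layer_sizes lists the number of inputs followed by the units of each layer.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out  # one weight per connection, one bias per neuron
    return total

print(count_dense_parameters([784, 512, 10]))  # network in this section: 407050
print(count_dense_parameters([8, 10, 6, 1]))   # network in the animation: 163
```

Almost all of the parameters in our example sit in the first layer (784 * 512 weights), which shows how quickly wide layers increase the size of a network.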

## Best Practices


- There is no golden rule for choosing the number of hidden layers in a network; it has to be determined by experimentation. A large number of hidden layers with many neurons may cause overfitting, while too few hidden layers may not be able to capture the task adequately. <br>
- One hidden layer is often sufficient for simple problems.
- The number of hidden nodes depends on:
    - Number of input and output nodes
    - Amount of training data available
    - Complexity of the function being learned
    - The training algorithm
- To minimize the error and obtain a trained network that generalizes well, you need to pick a suitable number of hidden layers, as well as nodes in each hidden layer:
    - Too few nodes lead to high error, as the predictive factors might be too complex for a small number of nodes to capture
    - Too many nodes tend to overfit the training data and not generalize well

<img src="img/hiddenlayers.png" width="60%" />

<img src="img/structure_layer.PNG" width="60%" />

## Pitfalls
Adding many layers with a lot of neurons does not necessarily increase prediction accuracy on unseen data; it may instead lead to overfitting.
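
One simple way to spot overfitting is to compare the training loss with the validation loss epoch by epoch. The sketch below assumes a dictionary shaped like the `history.history` attribute returned by `fit()`, with `loss` and `val_loss` keys. The function name `generalization_gap` and the numbers in `example_history` are made up for illustration and are not results from this network.

```python
def generalization_gap(history_dict, train_key='loss', val_key='val_loss'):
    """Return validation loss minus training loss for each epoch."""
    return [val - train
            for train, val in zip(history_dict[train_key], history_dict[val_key])]

# Made-up numbers for illustration only; they are not results from this network.
example_history = {
    'loss':     [0.60, 0.30, 0.15, 0.08],
    'val_loss': [0.55, 0.35, 0.33, 0.40],
}

gaps = generalization_gap(example_history)
for epoch, gap in enumerate(gaps, start=1):
    print(f"epoch {epoch}: gap = {gap:+.2f}")
```

A gap that keeps growing while the training loss still falls suggests that the network is memorizing the training data rather than generalizing.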

# Code 

Let us write a plotting function to visualize the prediction accuracy and the loss over the training epochs.

In [None]:
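import matplotlib.pyplot as plt

def plot_training_history(history):
    # Older Keras versions store accuracy under 'acc', newer ones under 'accuracy'
    acc_key = 'acc' if 'acc' in history.history else 'accuracy'

    # Accuracy per epoch on the training and validation data
    plt.plot(history.history[acc_key])
    plt.plot(history.history['val_' + acc_key])
    plt.title('model accuracy')
    plt.ylabel('accuracy')
    plt.xlabel('epoch')
    plt.legend(['train', 'test'], loc='upper left')
    plt.show()

    # Loss per epoch on the training and validation data
    plt.plot(history.history['loss'])
    plt.plot(history.history['val_loss'])
    plt.title('model loss')
    plt.ylabel('loss')
    plt.xlabel('epoch')
    plt.legend(['train', 'test'], loc='upper left')
    plt.show()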

Compile the network and plot the results:

In [None]:
# Compile the network
network.compile(optimizer='rmsprop',
                loss='categorical_crossentropy',
                metrics=['accuracy'])

# Train the network
history = network.fit(train_images, train_labels, epochs=5, batch_size=128, 
                      verbose=1, validation_data=(test_images, test_labels))

# Plot the training results
plot_training_history(history)

# Task
Add or remove layers, and vary the number of neurons in each, to see how the prediction accuracy changes. Document the results.<br>
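
To make such experiments less repetitive, the network construction can be wrapped in a small function. The sketch below is one possible starting point, assuming the same Keras installation as above; `build_network` is a hypothetical helper, not part of Keras, and the layer sizes in the loop are arbitrary. Each candidate still has to be compiled and trained as shown in the previous code cell.

```python
from keras import models
from keras import layers

def build_network(hidden_units):
    """Build a digit classifier with one Dense hidden layer per entry in hidden_units.

    hidden_units must contain at least one entry.
    build_network is a hypothetical helper, not part of Keras or the workshop code.
    """
    network = models.Sequential()
    for index, units in enumerate(hidden_units):
        if index == 0:
            # Only the first layer needs to know the input shape.
            network.add(layers.Dense(units=units, activation='relu',
                                     input_shape=(28 * 28,)))
        else:
            network.add(layers.Dense(units=units, activation='relu'))
    network.add(layers.Dense(units=10, activation='softmax'))
    return network

# Example configurations to compare; the sizes are arbitrary starting points.
for hidden_units in [(64,), (512,), (512, 256)]:
    candidate = build_network(hidden_units)
    print(hidden_units, candidate.count_params())
```

Printing the parameter count next to each configuration helps relate the accuracy you observe to the size of the network.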

# Summary

We have seen in this section how to build a network with different layers.

# Feedback
<a href = "http://goto/ml101_doc/Keras05">Feedback: Build a network</a> <br>

# Navigation

<div>
<span> <h3 style="display:inline">&lt;&lt; Prev: <a href = "keras04.ipynb">Process output data</a></h3> </span>
<span style="float: right"><h3 style="display:inline">Next: <a href = "keras06.ipynb">Activation functions</a> &gt;&gt; </h3></span>
</div>