# Building a Neural Network in Keras
In the previous section, we built a neural network from scratch, that is, we wrote functions that perform forward-propagation and back-propagation.

Here, we build the same kind of network using the **Keras** library, which provides utilities that make building a complex neural network much easier.

## How to do it...
In this section, let's understand the process of building a model in Keras by using the same *toy dataset* that we worked on in the previous sections.

In [91]:
import numpy as np
x = np.array([[1],[2],[3],[4]])
y = np.array([[2],[4],[6],[8]])

1. Instantiate a model to which layers can be added one after another. The `Sequential` class provides this kind of model:

In [92]:
from keras.models import Sequential
model = Sequential()

2. Add a dense layer to the model. A dense (fully connected) layer connects every unit in one layer to every unit in the next. In the following code, we connect the input layer to the hidden layer:

In [93]:
from keras.layers import Dense
model.add(Dense(3, activation='relu', input_shape=(1,)))

We specified `input_shape` because this is the first layer in the model, so Keras cannot infer the shape of the incoming data from a preceding layer.

Additionally, we specified three units in the hidden layer (so each input connects to three hidden units) and ReLU as the activation applied in that layer.

3. Connect the hidden layer to the output layer:

In [94]:
model.add(Dense(1, activation='linear'))

Note that in this dense layer, we don't need to specify the input shape, because the model infers it from the output of the previous layer.

Also, because each target value is one-dimensional, the output layer has one unit, and it uses a linear activation.

The model summary can now be visualized as follows:

In [95]:
model.summary()

Model: "sequential_11"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_15 (Dense)             (None, 3)                 6         
_________________________________________________________________
dense_16 (Dense)             (None, 1)                 4         
Total params: 10
Trainable params: 10
Non-trainable params: 0
_________________________________________________________________


The preceding output confirms our discussion in the previous section: that there will be a total of 6 parameters in the connection from the input layer to the hidden layer (3 weights and 3 bias terms). In addition, 3 weights and 1 bias term connect the hidden layer to the output layer.
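The parameter counts in the summary can be reproduced with a little arithmetic: a dense layer has one weight per input-unit pair plus one bias per unit. The sketch below uses a hypothetical helper, `dense_param_count`, which is not part of Keras and is only meant to illustrate the count.

```python
def dense_param_count(n_inputs, n_units, use_bias=True):
    """Number of trainable parameters in a fully connected layer."""
    weights = n_inputs * n_units          # one weight per (input, unit) pair
    biases = n_units if use_bias else 0   # one bias per unit
    return weights + biases

hidden_params = dense_param_count(n_inputs=1, n_units=3)   # input -> hidden
output_params = dense_param_count(n_inputs=3, n_units=1)   # hidden -> output
total_params = hidden_params + output_params

print(hidden_params, output_params, total_params)  # 6 4 10
```

These values agree with the `Param #` column and the `Total params` line of the summary.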

4. Extract the weight values. The order in which the weights are stored can be seen by inspecting the model's `weights` attribute:

In [96]:
model.weights

[<tf.Variable 'dense_15/kernel:0' shape=(1, 3) dtype=float32>,
 <tf.Variable 'dense_15/bias:0' shape=(3,) dtype=float32>,
 <tf.Variable 'dense_16/kernel:0' shape=(3, 1) dtype=float32>,
 <tf.Variable 'dense_16/bias:0' shape=(1,) dtype=float32>]

<div class="alert alert-block alert-success">
The following randomly generated list of values is the one we used to initialize the neural network in our from-scratch implementation in the previous notebook.
</div>

In [97]:
model.get_weights()

[array([[0.22940898, 0.77437544, 0.38963628]], dtype=float32),
 array([0., 0., 0.], dtype=float32),
 array([[-0.82037127],
        [-0.5443506 ],
        [ 1.193303  ]], dtype=float32),
 array([0.], dtype=float32)]

5. Compile the model. This step defines the loss function, the optimizer (which reduces the loss), and the optimizer's learning rate:

In [98]:
from keras.optimizers import SGD
# Older Keras releases spell this argument `lr` instead of `learning_rate`
s = SGD(learning_rate=0.01)

We chose *stochastic gradient descent* as the optimizer, with a learning rate of 0.01. Next, pass this optimizer to `compile`, together with *mean squared error* as the loss to minimize:

In [99]:
model.compile(optimizer=s,loss='mean_squared_error')
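To make the loss concrete, here is a minimal NumPy sketch of mean squared error on the toy targets. The function name `mean_squared_error` below is an illustrative helper written for this example; it is not the Keras implementation, although it computes the same quantity for this dataset.

```python
import numpy as np

def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

y = np.array([[2], [4], [6], [8]])

perfect_loss = mean_squared_error(y, y)         # predictions equal the targets
off_by_one_loss = mean_squared_error(y, y + 1)  # every prediction is off by 1

print(perfect_loss, off_by_one_loss)  # 0.0 1.0
```

The optimizer's job during training is to change the weights so that this value decreases.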

6. Fit the model. This updates the weights so that the model fits the data better:

In [100]:
model.fit(x, y, epochs=1, batch_size=4, verbose=1);

Epoch 1/1


The `fit` method expects two NumPy arrays: an input array and the corresponding output array. `epochs` is the number of passes made over the entire dataset, and `batch_size` is the number of data points used in each weight update. `verbose=1` prints progress for each epoch, including the training loss (and the validation loss, when validation data is provided).

Extract the weight values:

In [101]:
model.get_weights()

[array([[-0.03451818,  0.59924877,  0.7735418 ]], dtype=float32),
 array([-0.08797573, -0.05837556,  0.12796852], dtype=float32),
 array([[-0.74656653],
        [-0.2952211 ],
        [ 1.3186555 ]], dtype=float32),
 array([0.10723891], dtype=float32)]

<div class="alert alert-block alert-info">
The result obtained matches the one from training the from-scratch neural network in the previous notebook.
</div>
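Because the dataset has four points and `batch_size=4`, one epoch amounts to a single gradient-descent update. The sketch below redoes that update by hand in NumPy, starting from the initial weights printed earlier. It assumes plain SGD (no momentum) and a mean squared error averaged over the batch; all variable names are illustrative and not taken from the previous notebook.

```python
import numpy as np

x = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([[2.0], [4.0], [6.0], [8.0]])

# Initial weights, copied from the first get_weights() output
W1 = np.array([[0.22940898, 0.77437544, 0.38963628]])   # shape (1, 3)
b1 = np.zeros(3)
W2 = np.array([[-0.82037127], [-0.5443506], [1.193303]])  # shape (3, 1)
b2 = np.zeros(1)
lr = 0.01

# Forward pass
z1 = x @ W1 + b1            # hidden pre-activations, shape (4, 3)
h = np.maximum(z1, 0.0)     # ReLU
pred = h @ W2 + b2          # linear output, shape (4, 1)

# Backward pass for loss = mean((pred - y) ** 2)
d_pred = 2.0 * (pred - y) / x.shape[0]
grad_W2 = h.T @ d_pred
grad_b2 = d_pred.sum(axis=0)
d_z1 = (d_pred @ W2.T) * (z1 > 0)   # ReLU passes gradient only where z1 > 0
grad_W1 = x.T @ d_z1
grad_b1 = d_z1.sum(axis=0)

# One gradient-descent update
W1_new = W1 - lr * grad_W1
b1_new = b1 - lr * grad_b1
W2_new = W2 - lr * grad_W2
b2_new = b2 - lr * grad_b2

print(W1_new, b1_new, W2_new.ravel(), b2_new)
```

Up to floating-point rounding, the updated values agree with the weights returned by `model.get_weights()` after one epoch.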

7. Predict the output for a new set of inputs using the `predict` method:

In [102]:
x1 = np.array([[5],[6]])
model.predict(x1)

array([[4.5088406],
       [5.351964 ]], dtype=float32)
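For this small network, `predict` is just a forward pass: a ReLU hidden layer followed by a linear output layer. As a rough check, the NumPy sketch below applies the weights printed after the first epoch to the same inputs. The function name `forward` and the variable names are illustrative assumptions, not Keras internals.

```python
import numpy as np

# Weights after one epoch, copied from the second get_weights() output
W1 = np.array([[-0.03451818, 0.59924877, 0.7735418]])      # input -> hidden
b1 = np.array([-0.08797573, -0.05837556, 0.12796852])
W2 = np.array([[-0.74656653], [-0.2952211], [1.3186555]])  # hidden -> output
b2 = np.array([0.10723891])

def forward(inputs):
    """Forward pass of the 1-3-1 network: ReLU hidden layer, linear output."""
    hidden = np.maximum(inputs @ W1 + b1, 0.0)
    return hidden @ W2 + b2

x1 = np.array([[5.0], [6.0]])
manual_pred = forward(x1)
print(manual_pred)
```

The result agrees with the `model.predict(x1)` output to within floating-point rounding.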

These predictions are far from the expected values of 10 and 12, because the model has been trained for only one epoch. Training for a further 100 epochs gives the following:

In [103]:
model.fit(x, y, epochs=100, batch_size=4, verbose=1);
model.predict(x1)

Epoch 1/100
Epoch 2/100
[... remaining per-epoch progress lines truncated in the original output ...]

array([[ 9.850092],
       [11.777045]], dtype=float32)

The predictions are now close to the expected values (10 and 12), and they move closer still as we train for more epochs.
