### Introduction

In this notebook, we will build a neural network from scratch and code how it performs predictions using forward propagation. Note that deep learning libraries already implement the entire training and prediction process, so in practice you would rarely need to build a neural network from scratch. Working through this notebook is nonetheless useful for understanding how neural networks are actually built.

### Sample Neural Network

Let's look at an example neural network that takes two inputs, has one hidden layer with two nodes, and an output layer with one node.

In [53]:
from IPython.display import Image
Image(url="http://cocl.us/neural_network_example", width=700)

#### Let's build the sample Neural Network

The network has a total of 6 weights and 3 biases, with one bias for each node in the hidden layer and one for the node in the output layer. Let's randomly initialize the weights and biases in the network.

In [38]:
import numpy as np # import the NumPy library to generate random numbers

weights = np.around(np.random.uniform(size=6), decimals=2) # initialize the weights
biases = np.around(np.random.uniform(size=3), decimals=2) # initialize the biases

> Let's print the weights and biases to check them.

In [4]:
print(weights)
print(biases)

[0.9  0.58 0.36 0.79 0.3  0.04]
[0.11 0.06 0.07]


The weights and biases have been assigned successfully. Let's assign values to the input nodes, x1 and x2.

In [6]:
x_1 = 0.5 #input value 1
x_2 = 0.8 #input value 2

print("The values of input nodes are, x1 = {}, and x2 = {}".format(x_1, x_2))

The values of input nodes are, x1 = 0.5, and x2 = 0.8


Let's start by computing the weighted sum of the inputs, $z_{1, 1}$, at the first node of the hidden layer and assign the value to **z_11**. Next, let's compute the weighted sum of the inputs, $z_{1, 2}$, at the second node of the hidden layer and assign the value to **z_12**.

In [9]:
z_11 = x_1 * weights[0] + x_2 * weights[1] + biases[0]
z_12 = x_1 * weights[2] + x_2 * weights[3] + biases[1]

print('The weighted sum of the inputs at the first node in the hidden layer is {}'.format(np.around(z_11, decimals=4)))
print('The weighted sum of the inputs at the second node in the hidden layer is {}'.format(np.around(z_12, decimals=4)))

The weighted sum of the inputs at the first node in the hidden layer is 1.024
The weighted sum of the inputs at the second node in the hidden layer is 0.872
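
Written out with the values printed above, the two weighted sums work out as follows. The symbols $w_1, \dots, w_4$ and $b_{1,1}, b_{1,2}$ are labels assumed here for `weights[0]` to `weights[3]` and `biases[0]` to `biases[1]`; they do not appear in the notebook itself.

$$
z_{1,1} = x_1 w_1 + x_2 w_2 + b_{1,1} = (0.5)(0.9) + (0.8)(0.58) + 0.11 = 1.024
$$

$$
z_{1,2} = x_1 w_3 + x_2 w_4 + b_{1,2} = (0.5)(0.36) + (0.8)(0.79) + 0.06 = 0.872
$$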


Next, assuming a sigmoid activation function, let's compute the activation of the first node, $a_{1, 1}$, in the hidden layer and assign the value to **a_11**. Let's also compute the activation of the second node, $a_{1, 2}$, in the hidden layer. Assign the value to **a_12**.

In [11]:
a_11 = 1.0 / (1.0 + np.exp(-z_11))
a_12 = 1.0 / (1.0 + np.exp(-z_12))

print('The activation of the first node in the hidden layer is {}'.format(np.around(a_11, decimals=4)))
print('The activation of the second node in the hidden layer is {}'.format(np.around(a_12, decimals=4)))

The activation of the first node in the hidden layer is 0.7358
The activation of the second node in the hidden layer is 0.7052
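
For reference, the sigmoid used in the cell above is the standard logistic function. Applied to the two weighted sums computed earlier, it gives the printed activations (rounded to four decimals):

$$
\sigma(z) = \frac{1}{1 + e^{-z}}, \qquad a_{1,1} = \sigma(1.024) \approx 0.7358, \qquad a_{1,2} = \sigma(0.872) \approx 0.7052
$$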


Now these activations will serve as the inputs to the output layer. So, let's compute the weighted sum of these inputs to the node in the output layer. Assign the value to **z_2**.

In [12]:
z_2 = a_11 * weights[4] + a_12 * weights[5] + biases[2]

print('The weighted sum of the inputs at the node in the output layer is {}'.format(np.around(z_2, decimals=4)))

The weighted sum of the inputs at the node in the output layer is 0.3189


Finally, let's compute the output of the network as the activation of the node in the output layer. Assign the value to **a_2**.

In [17]:
a_2 = 1.0 / (1.0 + np.exp(-z_2))

print('The activation of the output layer which is equivalent to the prediction made by the network is {}'.format(np.around(a_2, decimals=4)))

The activation of the output layer which is equivalent to the prediction made by the network is 0.5791
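
As a quick recap, the four manual steps above can be collected into one function. This is only a sketch: the names `sigmoid` and `predict_sample_network` are hypothetical helpers introduced here, and the weights and biases are hard-coded to the values printed earlier rather than drawn at random.

```python
import numpy as np

def sigmoid(z):
    # logistic activation, same formula as used for a_11, a_12 and a_2
    return 1.0 / (1.0 + np.exp(-z))

def predict_sample_network(x_1, x_2, weights, biases):
    # hidden layer: weighted sums, then activations
    z_11 = x_1 * weights[0] + x_2 * weights[1] + biases[0]
    z_12 = x_1 * weights[2] + x_2 * weights[3] + biases[1]
    a_11, a_12 = sigmoid(z_11), sigmoid(z_12)
    # output layer: weighted sum of the hidden activations, then activation
    z_2 = a_11 * weights[4] + a_12 * weights[5] + biases[2]
    return sigmoid(z_2)

# values copied from the printed weights and biases above
weights = np.array([0.9, 0.58, 0.36, 0.79, 0.3, 0.04])
biases = np.array([0.11, 0.06, 0.07])

prediction = predict_sample_network(0.5, 0.8, weights, biases)
print(np.around(prediction, decimals=4))
```

With these hard-coded values the function reproduces the prediction printed above.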


Neural networks for real problems typically have many hidden layers and many more nodes in each layer, so computing the weighted sum and the activation of each node by hand, as we did above, quickly becomes impractical.

To automate predictions, let's generalize our network. A general network takes $n$ inputs, has many hidden layers, each with $m$ nodes, and has an output layer. Although the diagram below shows one hidden layer, we will code the network to support many hidden layers. Similarly, although the diagram shows an output layer with one node, we will code the network to support more than one node in the output layer.

In [54]:
Image(url= "http://cocl.us/general_neural_network", width=700) 

<img src="http://cocl.us/general_neural_network" alt="Neural Network General" width=600px>

### Initialize the network

Let's start by formally defining the structure of the network.

In [33]:
n = 2 # number of inputs
num_hidden_layers = 2 # number of hidden layers
m = [3, 3] # number of nodes in each hidden layer
num_nodes_output = 1 # number of nodes in the output layer

Now that we have defined the structure of the network, let's go ahead and initialize the weights and the biases in the network to random numbers. To do so, we will use the **NumPy** library, which we imported earlier as `np`.

In [36]:
def initialize_network(num_inputs, num_hidden_layers, num_nodes_hidden, num_nodes_output):
    
    num_nodes_previous = num_inputs # number of nodes in the previous layer

    network = {}
    
    # loop through each layer and randomly initialize the weights and biases associated with each layer
    for layer in range(num_hidden_layers + 1):
        
        if layer == num_hidden_layers:
            layer_name = 'output' # name the last layer in the network 'output'
            num_nodes = num_nodes_output
        else:
            layer_name = 'layer_{}'.format(layer + 1) # otherwise give the layer a number
            num_nodes = num_nodes_hidden[layer] 
     
        # initialize weights and bias for each node
        network[layer_name] = {}
        
        for node in range(num_nodes):
            node_name = 'node_{}'.format(node+1)
            network[layer_name][node_name] = {
                'weights': np.around(np.random.uniform(size=num_nodes_previous), decimals=2),
                'bias': np.around(np.random.uniform(size=1), decimals=2),
            }
    
        num_nodes_previous = num_nodes

    return network # return the network

initialize_network(n, num_hidden_layers, m, num_nodes_output)

{'layer_1': {'node_1': {'weights': array([0.32, 0.41]), 'bias': array([0.58])},
  'node_2': {'weights': array([0.7 , 0.96]), 'bias': array([0.03])},
  'node_3': {'weights': array([0.54, 0.97]), 'bias': array([0.69])}},
 'layer_2': {'node_1': {'weights': array([0.18, 0.57, 0.38]),
   'bias': array([0.33])},
  'node_2': {'weights': array([0.23, 0.62, 0.56]), 'bias': array([0.24])},
  'node_3': {'weights': array([0.07, 0.63, 0.83]), 'bias': array([0.76])}},
 'output': {'node_1': {'weights': array([0.65, 0.43, 0.6 ]),
   'bias': array([0.18])}}}
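
The size of each `weights` array follows from the number of nodes in the previous layer, so the total number of parameters can be worked out from the layer sizes alone. The helper below, `count_parameters`, is a hypothetical function added here for illustration and is not part of the notebook's own code.

```python
def count_parameters(num_inputs, num_nodes_hidden, num_nodes_output):
    # sizes of all layers, from the input layer to the output layer
    layer_sizes = [num_inputs] + list(num_nodes_hidden) + [num_nodes_output]
    # every node has one weight per node in the previous layer
    num_weights = sum(prev * cur for prev, cur in zip(layer_sizes[:-1], layer_sizes[1:]))
    # every non-input node has exactly one bias
    num_biases = sum(layer_sizes[1:])
    return num_weights, num_biases

print(count_parameters(2, [2], 1))     # the sample network: (6, 3)
print(count_parameters(2, [3, 3], 1))  # the network printed above: (18, 7)
```

The first call matches the 6 weights and 3 biases of the sample network, and the second matches the array sizes in the printed dictionary.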

### Compute weighted sum at each node

The weighted sum at each node is computed as the dot product of the inputs and the weights plus the bias. So let's create a function called *compute_weighted_sum* that does just that.

In [37]:
def compute_weighted_sum(inputs, weights, bias):
    return np.sum(inputs * weights) + bias
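
To see that the element-wise product followed by a sum is the same as a dot product, here is a small check. The input, weight and bias values are taken from the first hidden node of the sample network; the variable names `elementwise` and `dot_product` are illustrative only.

```python
import numpy as np

def compute_weighted_sum(inputs, weights, bias):
    return np.sum(inputs * weights) + bias

inputs = np.array([0.5, 0.8])
weights = np.array([0.9, 0.58])
bias = np.array([0.11])  # a one-element array, as produced by initialize_network

elementwise = compute_weighted_sum(inputs, weights, bias)
dot_product = np.dot(inputs, weights) + bias

print(elementwise, dot_product)
```

Because the bias is stored as a one-element array, the result is also a one-element array, which is why later cells index it with `[0]`.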

#### Use the *initialize_network* function to create a network that:

1. takes 5 inputs
2. has three hidden layers
3. has 3 nodes in the first layer, 2 nodes in the second layer, and 3 nodes in the third layer
4. has 1 node in the output layer

Call the network **small_network**.

In [41]:
small_network = initialize_network(5, 3, [3, 2, 3], 1)
print(small_network)

{'layer_1': {'node_1': {'weights': array([0.42, 0.42, 0.46, 0.37, 0.47]), 'bias': array([0.04])}, 'node_2': {'weights': array([0.08, 0.73, 0.64, 0.03, 0.3 ]), 'bias': array([0.22])}, 'node_3': {'weights': array([0.06, 0.52, 0.42, 0.05, 0.57]), 'bias': array([0.8])}}, 'layer_2': {'node_1': {'weights': array([0.11, 0.28, 0.64]), 'bias': array([0.49])}, 'node_2': {'weights': array([0.51, 0.46, 0.89]), 'bias': array([0.61])}}, 'layer_3': {'node_1': {'weights': array([0.6 , 0.44]), 'bias': array([0.48])}, 'node_2': {'weights': array([0.89, 0.21]), 'bias': array([0.94])}, 'node_3': {'weights': array([0.07, 0.6 ]), 'bias': array([0.03])}}, 'output': {'node_1': {'weights': array([0.67, 0.64, 0.86]), 'bias': array([0.94])}}}


In [42]:
# Let's generate random input values to feed into the small network
np.random.seed(12)
inputs = np.around(np.random.uniform(size=5), decimals=2)

print('The inputs to the network are {}'.format(inputs))

The inputs to the network are [0.15 0.74 0.26 0.53 0.01]
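
Seeding makes the inputs repeatable. As the cells are ordered here, however, the seed is set after `small_network` was created, so the weights and biases printed above will generally differ from run to run. A minimal sketch of how seeding behaves (the names `first_draw` and `second_draw` are illustrative):

```python
import numpy as np

np.random.seed(12)
first_draw = np.around(np.random.uniform(size=5), decimals=2)

# resetting the seed restarts the same sequence of random numbers
np.random.seed(12)
second_draw = np.around(np.random.uniform(size=5), decimals=2)

print(np.array_equal(first_draw, second_draw))  # True
```

Calling `np.random.seed` before `initialize_network` would make the network's weights and biases repeatable in the same way.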


In [43]:
node_weights = small_network['layer_1']['node_1']['weights']
node_bias = small_network['layer_1']['node_1']['bias']

weighted_sum = compute_weighted_sum(inputs, node_weights, node_bias)
print('The weighted sum at the first node in the first hidden layer is {}'.format(np.around(weighted_sum[0], decimals=4)))

The weighted sum at the first node in the first hidden layer is 0.7342


### Compute node activation

Recall that the output of each node is simply a non-linear transformation of the weighted sum. We use activation functions for this mapping. Let's use the sigmoid function as the activation function here and define a function that takes a weighted sum as input and returns its non-linear transformation using the sigmoid function.

In [45]:
def node_activation(weighted_sum):
    return 1.0 / (1.0 + np.exp(-1 * weighted_sum))
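
A few quick checks illustrate how the sigmoid behaves: it maps 0 to 0.5, keeps every output strictly between 0 and 1, and works element-wise on arrays. The test values below are arbitrary and chosen only for illustration.

```python
import numpy as np

def node_activation(weighted_sum):
    return 1.0 / (1.0 + np.exp(-1 * weighted_sum))

print(node_activation(0.0))  # 0.5

# applied to an array, the function acts on each element separately
values = node_activation(np.array([-4.0, 0.0, 4.0]))
print(np.around(values, decimals=4))

# symmetry: sigmoid(-z) + sigmoid(z) equals 1
print(node_activation(-2.0) + node_activation(2.0))
```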

#### Use the *node_activation* function to compute the output of the first node in the first hidden layer.

In [46]:
node_output  = node_activation(compute_weighted_sum(inputs, node_weights, node_bias))
print('The output of the first node in the first hidden layer is {}'.format(np.around(node_output[0], decimals=4)))

The output of the first node in the first hidden layer is 0.6757


### Forward Propagation

The final piece of building a neural network that can perform predictions is to put everything together. So let's create a function that applies the *compute_weighted_sum* and *node_activation* functions to each node in the network and propagates the data all the way to the output layer and outputs a prediction for each node in the output layer.

The way we are going to accomplish this is through the following procedure:

1. Start with the input layer as the input to the first hidden layer.
2. Compute the weighted sum at the nodes of the current layer.
3. Compute the output of the nodes of the current layer.
4. Set the output of the current layer to be the input to the next layer.
5. Move to the next layer in the network.
6. Repeat steps 2 - 5 until we compute the output of the output layer.

In [47]:
def forward_propagate(network, inputs):
    
    layer_inputs = list(inputs) # start with the input layer as the input to the first hidden layer
    
    for layer in network:
        
        layer_data = network[layer]
        
        layer_outputs = [] 
        for layer_node in layer_data:
        
            node_data = layer_data[layer_node]
        
            # compute the weighted sum and the output of each node at the same time 
            node_output = node_activation(compute_weighted_sum(layer_inputs, node_data['weights'], node_data['bias']))
            layer_outputs.append(np.around(node_output[0], decimals=4))
            
        if layer != 'output':
            print('The outputs of the nodes in hidden layer number {} is {}'.format(layer.split('_')[1], layer_outputs))
    
        layer_inputs = layer_outputs # set the output of this layer to be the input to next layer

    network_predictions = layer_outputs
    return network_predictions
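
The same procedure can also be expressed with one matrix-vector product per layer instead of a loop over nodes. The sketch below is an alternative, not the notebook's method: `sigmoid` and `forward_propagate_vectorized` are hypothetical names, the network is the two-input sample network hard-coded with the weights and biases printed earlier, and the code assumes the dictionary keeps its layers in insertion order (true for Python 3.7 and later).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward_propagate_vectorized(network, inputs):
    activations = np.asarray(inputs, dtype=float)
    for layer_data in network.values():
        # stack the per-node weights into a (num_nodes, num_inputs) matrix
        W = np.array([node['weights'] for node in layer_data.values()])
        # collect the per-node biases into a vector of length num_nodes
        b = np.array([node['bias'][0] for node in layer_data.values()])
        # weighted sums and activations for the whole layer at once
        activations = sigmoid(W @ activations + b)
    return activations

# the sample network from the start of the notebook, in the nested-dict layout
sample_network = {
    'layer_1': {
        'node_1': {'weights': np.array([0.9, 0.58]), 'bias': np.array([0.11])},
        'node_2': {'weights': np.array([0.36, 0.79]), 'bias': np.array([0.06])},
    },
    'output': {
        'node_1': {'weights': np.array([0.3, 0.04]), 'bias': np.array([0.07])},
    },
}

predictions = forward_propagate_vectorized(sample_network, [0.5, 0.8])
print(np.around(predictions, decimals=4))
```

Unlike `forward_propagate`, this version does not round the layer outputs before passing them on, so on deeper networks the two can differ slightly in the last printed decimal.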

#### Use the *forward_propagate* function to compute the prediction of our small network

In [48]:
predictions = forward_propagate(small_network, inputs)
print('The predicted value by the network for the given input is {}'.format(np.around(predictions[0], decimals=4)))

The outputs of the nodes in hidden layer number 1 is [0.6757, 0.7226, 0.7917]
The outputs of the nodes in hidden layer number 2 is [0.7813, 0.8799]
The outputs of the nodes in hidden layer number 3 is [0.7918, 0.8606, 0.6485]
The predicted value by the network for the given input is 0.9295
