# Recap

Let's start by randomly initializing the weights and the biases in the network. The network has 6 weights and 3 biases: one bias for each of the two nodes in the hidden layer and one for the node in the output layer.

In [1]:
import numpy as np

weights = np.around(np.random.uniform(size = 6), decimals = 2)
bias    = np.around(np.random.uniform(size = 3), decimals = 2)

In [2]:
print('Weights:', weights)
print('Bias:', bias)

Weights: [0.44 0.07 0.91 0.75 0.04 0.2 ]
Bias: [0.61 0.4  0.51]


Now that we have the weights and the biases defined for the network, let's compute the output for a given input, $x_1$ and $x_2$.

In [3]:
x_1 = 0.5
x_2 = 0.85
print('x1 is {}, x2 is {}'.format(x_1, x_2))

x1 is 0.5, x2 is 0.85


Let's start by computing the weighted sum of the inputs, $z_{1, 1}$, at the first node of the hidden layer.

In [4]:
z_11 = x_1 * weights[0] + x_2 * weights[1] + bias[0]
print('The weighted sum of the inputs at the first node in the hidden layer is', z_11)

The weighted sum of the inputs at the first node in the hidden layer is 0.8895


Next, let's compute the weighted sum of the inputs, $z_{1, 2}$, at the second node of the hidden layer. Assign the value to z_12.

In [5]:
z_12 = x_1 * weights[2] + x_2 * weights[3] + bias[1]
print('The weighted sum of the inputs at the second node in the hidden layer is', z_12)

The weighted sum of the inputs at the second node in the hidden layer is 1.4925000000000002


Next, assuming a sigmoid activation function, let's compute the activation of the first node, $a_{1,1}$, in the hidden layer.

First, we define the sigmoid activation function.

In [6]:
def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

In [7]:
a_11 = sigmoid(z_11)
print('The activation of the first node in the hidden layer is', a_11)

The activation of the first node in the hidden layer is 0.7087869793423419


Let's also compute the activation of the second node, $a_{1,2}$, in the hidden layer. Assign the value to a_12.

In [8]:
a_12 = sigmoid(z_12)
print('The activation of the second node in the hidden layer is', a_12)

The activation of the second node in the hidden layer is 0.8164532124233763


These activations serve as the inputs to the output layer. So let's compute the weighted sum of these inputs at the node in the output layer. Assign the value to z_2.

In [9]:
z_2 = a_11 * weights[4] + a_12 * weights[5] + bias[2]
print('The weighted sum of the inputs at the node in the output layer is', z_2)

The weighted sum of the inputs at the node in the output layer is 0.701642121658369


Finally, let's compute the output of the network as the activation of the node in the output layer. Assign the value to a_2

In [10]:
a_2 = sigmoid(z_2)
print('The output of the network for x1 = 0.5 and x2 = 0.85 is', a_2)

The output of the network for x1 = 0.5 and x2 = 0.85 is 0.6685517510721143
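
For reference, the computation carried out in the cells above can be written compactly as follows. The symbols $w_1, \dots, w_6$ (standing for weights[0] through weights[5]) and $b_{1,1}$, $b_{1,2}$, $b_2$ (standing for bias[0] through bias[2]) are labels introduced here for illustration only, and the intermediate values are rounded to four decimals.

$$
\begin{aligned}
z_{1,1} &= x_1 w_1 + x_2 w_2 + b_{1,1} = 0.5(0.44) + 0.85(0.07) + 0.61 = 0.8895 \\
z_{1,2} &= x_1 w_3 + x_2 w_4 + b_{1,2} = 0.5(0.91) + 0.85(0.75) + 0.40 = 1.4925 \\
a_{1,j} &= \sigma(z_{1,j}) = \frac{1}{1 + e^{-z_{1,j}}}, \qquad a_{1,1} \approx 0.7088, \quad a_{1,2} \approx 0.8165 \\
z_2 &= a_{1,1} w_5 + a_{1,2} w_6 + b_2 \approx 0.7088(0.04) + 0.8165(0.20) + 0.51 \approx 0.7016 \\
a_2 &= \sigma(z_2) \approx 0.6686
\end{aligned}
$$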


Neural networks for real problems typically have more hidden layers and many more nodes in each layer, so computing the weighted sum and the activation at each node by hand, as we did above, does not scale.

To automate predictions, let's generalize our network. A general network takes $n$ inputs, has many hidden layers, each hidden layer having $m$ nodes, and has an output layer. Although the example network has only one hidden layer, we will code the network to support many hidden layers. Similarly, although the example network has an output layer with a single node, we will code the network to support more than one node in the output layer.

# Initialize a Network

Let's start by formally defining the structure of the network: **n** is the *number of inputs*, **num_hidden_layers** is the *number of hidden layers*, **m** is the *number of nodes in each hidden layer* (one entry per layer), and **num_nodes_output** is the *number of nodes in the output layer*.

In [11]:
n = 2
num_hidden_layers = 2
m = [2, 2]
num_nodes_output = 1

Now that we have defined the structure of the network, let's go ahead and initialize the weights and the biases in the network to random numbers.

In [12]:
def initializeNetwork(num_inputs, num_hidden_layers, num_nodes_hidden, num_nodes_output):
    # Each node needs one weight per node in the previous layer (the inputs, for the first hidden layer).
    num_nodes_previous = num_inputs
    network = {}
    # Loop over the hidden layers, plus one extra iteration for the output layer.
    for layer in range(num_hidden_layers + 1):
        if layer == num_hidden_layers:
            layer_name = 'output'
            num_nodes = num_nodes_output
        else:
            layer_name = 'layer_{}'.format(layer + 1)
            num_nodes = num_nodes_hidden[layer]

        # Randomly initialize the weights and the bias of every node in the current layer.
        network[layer_name] = {}
        for node in range(num_nodes):
            node_name = 'node_{}'.format(node+1)
            network[layer_name][node_name] = {
                'weights': np.around(np.random.uniform(size=num_nodes_previous), decimals=2),
                'bias': np.around(np.random.uniform(size=1), decimals=2),
            }

        # The current layer feeds the next one.
        num_nodes_previous = num_nodes

    return network

Use the initializeNetwork function to create a network that has:
    1. 5 inputs.
    2. 3 hidden layers.
    3. 3 nodes in the first hidden layer, 2 nodes in the second, and 3 nodes in the third.
    4. 1 node in the output layer.

In [13]:
small_network = initializeNetwork(5, 3, [3,2,3], 1)
print(small_network)

{'layer_1': {'node_1': {'weights': array([0.15, 0.47, 0.16, 0.98, 0.48]), 'bias': array([0.62])}, 'node_2': {'weights': array([0.58, 0.21, 0.9 , 0.75, 0.15]), 'bias': array([0.99])}, 'node_3': {'weights': array([0.55, 0.31, 0.06, 0.5 , 0.05]), 'bias': array([0.34])}}, 'layer_2': {'node_1': {'weights': array([0.57, 0.14, 0.81]), 'bias': array([0.78])}, 'node_2': {'weights': array([0.46, 0.34, 0.43]), 'bias': array([0.43])}}, 'layer_3': {'node_1': {'weights': array([0.49, 0.23]), 'bias': array([0.14])}, 'node_2': {'weights': array([0.4, 0.6]), 'bias': array([0.77])}, 'node_3': {'weights': array([0.86, 0.6 ]), 'bias': array([0.83])}}, 'output': {'node_1': {'weights': array([0.2 , 0.34, 0.13]), 'bias': array([0.78])}}}
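
A quick way to check that a network has the intended shape is to count its parameters. The sketch below is only an illustration: the helper names count_parameters and expected_parameters, as well as the hand-written toy_network, are hypothetical and assume the nested-dictionary layout produced by initializeNetwork.

```python
import numpy as np

def count_parameters(network):
    """Count the weights and biases stored in a nested-dict network."""
    num_weights = 0
    num_biases = 0
    for layer in network.values():
        for node in layer.values():
            num_weights += node['weights'].size
            num_biases += node['bias'].size
    return num_weights, num_biases

def expected_parameters(num_inputs, num_nodes_hidden, num_nodes_output):
    """Parameter counts implied by the layer sizes alone."""
    sizes = [num_inputs] + list(num_nodes_hidden) + [num_nodes_output]
    # every node has one weight per node in the previous layer and one bias
    num_weights = sum(prev * curr for prev, curr in zip(sizes[:-1], sizes[1:]))
    num_biases = sum(sizes[1:])
    return num_weights, num_biases

# Hypothetical toy network: 2 inputs, one hidden layer with 2 nodes, 1 output node.
toy_network = {
    'layer_1': {
        'node_1': {'weights': np.array([0.1, 0.2]), 'bias': np.array([0.3])},
        'node_2': {'weights': np.array([0.4, 0.5]), 'bias': np.array([0.6])},
    },
    'output': {
        'node_1': {'weights': np.array([0.7, 0.8]), 'bias': np.array([0.9])},
    },
}

print(count_parameters(toy_network))         # (6, 3)
print(expected_parameters(2, [2], 1))        # (6, 3)
print(expected_parameters(5, [3, 2, 3], 1))  # (30, 9)
```

Under these assumptions, calling count_parameters(small_network) should agree with the last line, since small_network was built with 5 inputs, hidden layers of 3, 2, and 3 nodes, and 1 output node.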


# Compute weighted sum at each node

In [14]:
def compute_weights_sum(inputs, weights, bias):
    # 'inputs' avoids shadowing Python's built-in input()
    return np.sum(inputs * weights) + bias
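
As a small sanity check, the element-wise product followed by a sum is the same as a dot product. The sketch below reuses the inputs, weights, and bias of the first hidden node from the recap; storing the bias as a one-element array is an assumption that mirrors how initializeNetwork stores it.

```python
import numpy as np

def compute_weights_sum(inputs, weights, bias):
    return np.sum(inputs * weights) + bias

inputs = np.array([0.5, 0.85])
weights = np.array([0.44, 0.07])
bias = np.array([0.61])  # one-element array, as in the network dictionary

elementwise = compute_weights_sum(inputs, weights, bias)
dot_product = np.dot(inputs, weights) + bias

print(elementwise, dot_product)
print(elementwise.shape)  # (1,)
```

Because the bias is a one-element array, the result is also a one-element array, which is why the scalar value is read with an index of 0.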

Let's generate 5 inputs that we can feed to **small_network**.

In [15]:
import numpy as np

np.random.seed(12)
inputs = np.around(np.random.uniform(size = 5), decimals = 2)
print('The inputs to the network are', inputs)

The inputs to the network are [0.15 0.74 0.26 0.53 0.01]


Use the **compute_weights_sum** function to compute the weighted sum at the first node in the first hidden layer.

In [16]:
node_weights = small_network['layer_1']['node_1']['weights']
node_bias    = small_network['layer_1']['node_1']['bias']
weighted_sum = compute_weights_sum(inputs, node_weights, node_bias)
print('The weighted sum of the first node in the first hidden layer is', weighted_sum[0])

# Compute node activation

Recall that the output of each node is simply a non-linear transformation of the weighted sum. An activation function performs this mapping, and here we use the sigmoid function. So let's define a function that takes a weighted sum as input and returns its non-linear transformation under the sigmoid function.

In [None]:
def node_activation(weighted_sum):
    return 1.0 / (1.0 + np.exp(-1 * weighted_sum))

Use the **node_activation** function to compute the output of the first node in the first hidden layer.

In [None]:
node_activation_result = node_activation(weighted_sum)
print('The output of the first node in the first hidden layer is', node_activation_result[0])
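
To build some intuition for the sigmoid, the following sketch evaluates node_activation on a few sample values. The sample values are arbitrary and chosen only for illustration.

```python
import numpy as np

def node_activation(weighted_sum):
    return 1.0 / (1.0 + np.exp(-1 * weighted_sum))

# Arbitrary sample weighted sums: strongly negative, zero, strongly positive.
samples = np.array([-10.0, 0.0, 10.0])
activations = node_activation(samples)

print(activations)
```

The function works element-wise on arrays, maps 0 to exactly 0.5, and squashes large negative and large positive values toward 0 and 1 respectively.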

# Forward Propagation

The final piece of building a neural network that can perform predictions is to put everything together. So let's create a function that applies the **compute_weights_sum** and **node_activation** functions to each node in the network, propagates the data all the way to the output layer, and outputs a prediction for each node in the output layer.

The way we are going to accomplish this is through the following procedure:

    1. Start with the input layer as the input to the first hidden layer.
    2. Compute the weighted sum at the nodes of the current layer.
    3. Compute the output of the nodes of the current layer.
    4. Set the output of the current layer to be the input to the next layer.
    5. Move to the next layer in the network.
    6. Repeat steps 2 - 5 until we compute the output of the output layer.

In [None]:
def forward_propagation(network, inputs):
    layer_inputs = list(inputs)  # start with the input layer as the input to the first hidden layer
    for layer in network:
        layer_data = network[layer]
        layer_outputs = []
        for layer_node in layer_data:
            node_data = layer_data[layer_node]
            # compute the weighted sum and then the activation of the current node
            node_output = node_activation(compute_weights_sum(layer_inputs, node_data['weights'], node_data['bias']))
            layer_outputs.append(np.around(node_output[0], decimals = 4))
        if layer != 'output':
            print('The outputs of the nodes in hidden layer number {} are {}'.format(layer.split('_')[1], layer_outputs))
        layer_inputs = layer_outputs  # the output of this layer is the input to the next layer
    network_predictions = layer_outputs
    return network_predictions

In [None]:
forward_propagation_result = forward_propagation(small_network, inputs)
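
The same procedure can also be sketched with one matrix-vector product per layer instead of a loop over nodes. This is only an illustrative variant, not a replacement for forward_propagation: the names forward_propagation_matrix and recap_network are hypothetical, the sketch assumes the nested-dictionary layout used above with layers stored in order, and, unlike forward_propagation, it does not round the activations to 4 decimals.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward_propagation_matrix(network, inputs):
    """Forward pass that handles each layer with one matrix-vector product."""
    activations = np.asarray(inputs, dtype=float)
    for layer in network.values():
        # stack the per-node weights into a (num_nodes, num_inputs) matrix
        W = np.array([node['weights'] for node in layer.values()])
        # collect the per-node biases into a vector of length num_nodes
        b = np.array([node['bias'][0] for node in layer.values()])
        activations = sigmoid(W @ activations + b)
    return activations

# The recap network (2 inputs, 2 hidden nodes, 1 output node) written in the dictionary layout.
recap_network = {
    'layer_1': {
        'node_1': {'weights': np.array([0.44, 0.07]), 'bias': np.array([0.61])},
        'node_2': {'weights': np.array([0.91, 0.75]), 'bias': np.array([0.4])},
    },
    'output': {
        'node_1': {'weights': np.array([0.04, 0.2]), 'bias': np.array([0.51])},
    },
}

prediction = forward_propagation_matrix(recap_network, [0.5, 0.85])
print('Prediction for x1 = 0.5 and x2 = 0.85:', prediction)
```

With the recap weights and inputs, the prediction should agree with the value of a_2 computed by hand in the recap, up to floating-point error.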