# Artificial Neural Networks - Forward Propagation

## Introduction

In this lab, we will build a neural network from scratch and code how it makes predictions using forward propagation. Deep learning libraries already implement the entire training and prediction process, so in practice you would rarely need to build a neural network from scratch. Completing this lab should nonetheless deepen your understanding of how neural networks work.

## Table of Contents

<div class="alert alert-block alert-info" style="margin-top: 20px">

<font size = 3>    
1. <a href="#item1">Recap</a> <br></br> 
2. <a href="#item2">Initialize a Network</a>   <br></br> 
3. <a href="#item3">Compute Weighted Sum at Each Node</a>  <br></br>  
4. <a href="#item4">Compute Node Activation</a>   <br></br> 
5. <a href="#item5">Forward Propagation</a> <br></br> 
</font>
</div>

### Recap

From the videos, let's recap how a neural network makes predictions through the forward propagation process. Here is a neural network that takes two inputs, has one hidden layer with two nodes, and an output layer with one node.


<img src="https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/DL0101EN/labs/neural_network_example.png" alt="Neural Network Example" width=600px>

#### Side note:

Definition of the **sigmoid function** $f$ :

$$
a = f(z) = \frac{1}{1 + e^{-z}}
$$

    a = f(z) = 1 / (1 + e ^ (-z))
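
To make the definition concrete, here is a minimal sketch that evaluates the sigmoid at a few sample points. The sample values of `z` are chosen only for illustration; the function name `sigmoid` matches the helper defined later in this lab.

```python
import numpy as np

def sigmoid(z):
    """Logistic sigmoid: maps any real z into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative inputs (not part of the lab's network):
# a large negative value, zero, and a large positive value.
for z in (-5.0, 0.0, 5.0):
    print(f"f({z}) = {sigmoid(z):.4f}")

# f(-5.0) = 0.0067
# f(0.0) = 0.5000
# f(5.0) = 0.9933
```

Large negative inputs are squashed toward 0, large positive inputs toward 1, and an input of 0 maps to exactly 0.5.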


Let's start by randomly initializing the weights and the biases in the network. We have 6 weights (2 inputs × 2 hidden nodes = 4, plus 2 hidden nodes × 1 output node = 2) and 3 biases: one for each of the two nodes in the hidden layer and one for the node in the output layer.

In [1]:
import numpy as np

In [2]:
np.random.seed(42)

In [3]:
weights = np.around(np.random.uniform(size=6), decimals=2) # initialize the weights
biases = np.around(np.random.uniform(size=3), decimals=2) # initialize the biases

Let's print the weights and biases as a sanity check.

In [4]:
print(weights)
print(biases)

[0.37 0.95 0.73 0.6  0.16 0.16]
[0.06 0.87 0.6 ]


Now that we have the weights and the biases defined for the network, let's compute the output for a given pair of inputs, $x_1$ and $x_2$.

In [5]:
x_1 = 0.5 # input 1
x_2 = 0.85 # input 2

# print('x1 is {} and x2 is {}'.format(x_1, x_2))
print(f'x1 is {x_1} and x2 is {x_2}')

x1 is 0.5 and x2 is 0.85


Let's start by computing the weighted sum of the inputs, $z_{1, 1}$, at the first node of the hidden layer.

In [6]:
z_11 = x_1 * weights[0] + x_2 * weights[1] + biases[0]

# print('The weighted sum of the inputs at the first node in the hidden layer is {}'.format(z_11))
print(f'The weighted sum of the inputs at the first node in the hidden layer is {z_11}')

The weighted sum of the inputs at the first node in the hidden layer is 1.0525


Next, let's compute the weighted sum of the inputs, $z_{1, 2}$, at the second node of the hidden layer. Assign the value to **z_12**.

In [7]:
### type your answer here
z_12 = x_1 * weights[2] + x_2 * weights[3] + biases[1]

Double-click __here__ for the solution.
<!-- The correct answer is:
z_12 = x_1 * weights[2] + x_2 * weights[3] + biases[1]
-->

Print the weighted sum.

In [8]:
# print('The weighted sum of the inputs at the second node in the hidden layer is {}'.format(np.around(z_12, decimals=4)))
print(f'The weighted sum of the inputs at the second node in the hidden layer is {np.around(z_12, decimals=4)}')

The weighted sum of the inputs at the second node in the hidden layer is 1.745
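
The two weighted sums in the hidden layer can also be computed in a single matrix-vector product. The following is a sketch under one assumption: the names `W_hidden`, `b_hidden`, and `z_hidden` are introduced here for illustration, and the row-per-node layout is simply the arrangement that reproduces the indexing used in the cells above (`weights[0:2]` for the first node, `weights[2:4]` for the second).

```python
import numpy as np

# Recreate the lab's weights and biases so this block runs on its own.
np.random.seed(42)
weights = np.around(np.random.uniform(size=6), decimals=2)
biases = np.around(np.random.uniform(size=3), decimals=2)

x = np.array([0.5, 0.85])  # the inputs x_1 and x_2

# Hypothetical layout: one row of weights per hidden node.
W_hidden = weights[:4].reshape(2, 2)
b_hidden = biases[:2]

# Row i of W_hidden dotted with x, plus the bias of node i.
z_hidden = W_hidden @ x + b_hidden
print(np.around(z_hidden, decimals=4))  # approximately 1.0525 and 1.745
```

Each entry of `z_hidden` matches the corresponding value computed node by node.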


Next, assuming a sigmoid activation function, let's compute the activation of the first node, $a_{1, 1}$, in the hidden layer.

In [9]:
# a_11 = 1.0 / (1.0 + np.exp(-z_11))

In [10]:
def sigmoid(z):
    return (1.0 / (1.0 + np.exp(-z)))

In [11]:
a_11 = sigmoid(z_11)

In [12]:
# print('The activation of the first node in the hidden layer is {}'.format(np.around(a_11, decimals=4)))
print(f'The activation of the first node in the hidden layer is {np.around(a_11, decimals=4)}')

The activation of the first node in the hidden layer is 0.7413


Let's also compute the activation of the second node, $a_{1, 2}$, in the hidden layer. Assign the value to **a_12**.

In [13]:
### type your answer here

a_12 = sigmoid(z_12)

Double-click __here__ for the solution.
<!-- The correct answer is:
a_12 = 1.0 / (1.0 + np.exp(-z_12))
-->

Print the activation of the second node.

In [14]:
# print('The activation of the second node in the hidden layer is {}'.format(np.around(a_12, decimals=4)))
print(f'The activation of the second node in the hidden layer is {np.around(a_12, decimals=4)}')

The activation of the second node in the hidden layer is 0.8513


Now these activations will serve as the inputs to the output layer. So, let's compute the weighted sum of these inputs to the node in the output layer. Assign the value to **z_2**.

In [15]:
### type your answer here

z_2 = a_11 * weights[4] + a_12 * weights[5] + biases[2]

Double-click __here__ for the solution.
<!-- The correct answer is:
z_2 = a_11 * weights[4] + a_12 * weights[5] + biases[2]
-->

Print the weighted sum of the inputs at the node in the output layer.

In [16]:
# print('The weighted sum of the inputs at the node in the output layer is {}'.format(np.around(z_2, decimals=4)))
print(f'The weighted sum of the inputs at the node in the output layer is {np.around(z_2, decimals=4)}')

The weighted sum of the inputs at the node in the output layer is 0.8548


Finally, let's compute the output of the network as the activation of the node in the output layer. Assign the value to **a_2**.

In [18]:
### type your answer here

a_2 = sigmoid(z_2)

Double-click __here__ for the solution.
<!-- The correct answer is:
a_2 = 1.0 / (1.0 + np.exp(-z_2))
-->

Print the activation of the node in the output layer, which is the prediction made by the network.

In [19]:
# print('The output of the network for x1 = 0.5 and x2 = 0.85 is {}'.format(np.around(a_2, decimals=4)))
print(f'The output of the network for x1 = 0.5 and x2 = 0.85 is {np.around(a_2, decimals=4)}')

The output of the network for x1 = 0.5 and x2 = 0.85 is 0.7016
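
The step-by-step computation above can be collected into one function. This is a minimal sketch for this specific 2-2-1 network only; the function name `forward_2_2_1` and the way the flat `weights` array is sliced are assumptions made for illustration, chosen to match the indexing used in the cells above.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward_2_2_1(x, weights, biases):
    """Forward pass for a network with 2 inputs, 2 hidden nodes, and 1 output node."""
    # Hidden layer: weights[0:4] viewed as one row per hidden node.
    W_hidden = weights[:4].reshape(2, 2)
    a_hidden = sigmoid(W_hidden @ x + biases[:2])
    # Output layer: weights[4:6] connect the two hidden activations to the output node.
    z_out = weights[4:6] @ a_hidden + biases[2]
    return sigmoid(z_out)

# Recreate the lab's weights and biases so this block runs on its own.
np.random.seed(42)
weights = np.around(np.random.uniform(size=6), decimals=2)
biases = np.around(np.random.uniform(size=3), decimals=2)

prediction = forward_2_2_1(np.array([0.5, 0.85]), weights, biases)
print(f"{prediction:.4f}")  # 0.7016
```

The result agrees with the value of `a_2` obtained one node at a time.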


<hr>

Neural networks for real problems typically have many hidden layers and many more nodes in each layer, so computing the weighted sum and the activation of each node by hand, as we did above, quickly becomes impractical.

To automate predictions, let's generalize our network. A general network takes $n$ inputs, has many hidden layers with $m$ nodes each, and has an output layer. Although the diagram below shows only one hidden layer, we will code the network to support many hidden layers. Similarly, although the diagram shows an output layer with one node, we will code the network to support more than one node in the output layer.


<img src="https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/DL0101EN/labs/neural_network_general.png" alt="Neural Network General" width=600px>
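
To get a feel for how quickly a general network grows, the sketch below counts the weights and biases for a given architecture. It assumes a fully connected network in which every node has exactly one bias; the helper name `count_parameters` and its arguments are hypothetical and are not part of the lab's own code.

```python
def count_parameters(num_inputs, hidden_sizes, num_outputs):
    """Count weights and biases in a fully connected network (illustrative helper)."""
    layer_sizes = [num_inputs] + list(hidden_sizes) + [num_outputs]
    # Each pair of consecutive layers contributes n_in * n_out weights.
    num_weights = sum(
        n_in * n_out for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
    )
    # Every node outside the input layer has one bias.
    num_biases = sum(layer_sizes[1:])
    return num_weights, num_biases

# The small network from the first part of this lab: 2 inputs, one hidden layer of 2 nodes, 1 output.
print(count_parameters(2, [2], 1))        # (6, 3)

# A slightly larger example: 5 inputs, three hidden layers of 3 nodes, 2 outputs.
print(count_parameters(5, [3, 3, 3], 2))  # (39, 11)
```

The first call reproduces the 6 weights and 3 biases used earlier.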

## Initialize a Network

Let's start by formally defining the structure of the network.