# 04-Forward Propagation

![](https://images.unsplash.com/photo-1478796415026-3c85ee65975e?ixlib=rb-1.2.1&ixid=eyJhcHBfaWQiOjEyMDd9&auto=format&fit=crop&w=1050&q=80)
Photo by [Gaelle Marcel](https://unsplash.com/photos/gIj7RJPAkJA)


In this exercise you will implement your own forward propagation of a neural network, using Python and numpy only.

We consider again the example neural network from the lectures:
![](../../../00-Lectures/images/MLP_with_activations.png)

Begin by defining a numpy array `X` with 3 random values.

In [4]:
import numpy as np

# Fix the seed so the random values below are reproducible
np.random.seed(0)

In [5]:
# TODO: define X
### STRIP_START ###
import numpy as np

X=np.random.rand(3)
X
### STRIP_END ###

array([0.5488135 , 0.71518937, 0.60276338])

We will now redefine the sigmoid function, which will be our activation function in our neural network.

Reminder:
$$
\mathrm{sigmoid}(x) = \frac{1}{1 + e^{-x}}
$$

In [18]:
# TODO: Define the sigmoid function as g(x)
### STRIP_START ###
def g(x):
    return 1./(1+np.exp(-x))
### STRIP_END ###
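
As a quick sanity check, a sigmoid implementation can be tested against two known properties: its value at 0 is 0.5, and it satisfies $g(-x) = 1 - g(x)$. The snippet below is a minimal sketch that redefines `g` so it runs on its own; the test vector `z` is an illustrative assumption, not part of the exercise.

```python
import numpy as np

def g(x):
    """Sigmoid activation, applied element-wise to scalars or arrays."""
    return 1.0 / (1.0 + np.exp(-x))

# Illustrative inputs (hypothetical, chosen only for this check)
z = np.array([-2.0, 0.0, 2.0])
values = g(z)

# The value at 0 is 0.5, and every output lies strictly between 0 and 1
print(values)

# Symmetry: g(-z) equals 1 - g(z)
print(np.allclose(g(-z), 1.0 - values))
```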

As you can see in the diagram, $a^{[1]}_1$ is computed from the values $x_1$, $x_2$ and $x_3$, together with the associated weights $W^{[1]}_{1}$ and bias $b^{[1]}_{1}$:

$$
a^{[1]}_{1} = g(W^{[1]}_{1} \times X + b^{[1]}_{1})
$$

Begin by randomly defining $W^{[1]}_{1}$ (a numpy array of three values) and $b^{[1]}_{1}$, then compute $a^{[1]}_{1}$.

In [19]:
# TODO: Compute the activation of the first unit of the first layer a11
### STRIP_START ###
W11 = np.random.rand(3)
b11 = np.random.rand(1)

a11 = g(np.dot(X, np.transpose(W11))+b11)
a11
### STRIP_END ###

array([0.69207739])
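
The result is an array with a single element rather than a plain number. The sketch below shows why, using hand-picked illustrative values (`X_demo`, `W_demo` and `b_demo` are assumptions, not the values drawn in the exercise): the dot product of two 1-D arrays is a scalar, and adding a bias of shape `(1,)` turns it into a one-element array. It also shows that transposing a 1-D array leaves its shape unchanged.

```python
import numpy as np

def g(x):
    return 1.0 / (1.0 + np.exp(-x))

# Illustrative values, not the ones drawn in the exercise
X_demo = np.array([0.5, 0.7, 0.6])
W_demo = np.array([0.1, 0.2, 0.3])
b_demo = np.array([0.4])

# Transposing a 1-D array returns an array with the same shape
print(np.transpose(W_demo).shape)

# Dot product of two 1-D arrays: 0.05 + 0.14 + 0.18 = 0.37 (a scalar)
z = np.dot(X_demo, W_demo)

# Adding a bias of shape (1,) broadcasts the scalar to shape (1,)
a = g(z + b_demo)
print(a.shape)
```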

Now do the same for the three other units of the first layer: compute $a^{[1]}_2$, $a^{[1]}_3$ and $a^{[1]}_4$ as `a12`, `a13` and `a14`.

Reminder:

$$
a^{[1]}_{2} = g(W^{[1]}_{2} \times X + b^{[1]}_{2})
$$

$$
a^{[1]}_{3} = g(W^{[1]}_{3} \times X + b^{[1]}_{3})
$$

$$
a^{[1]}_{4} = g(W^{[1]}_{4} \times X + b^{[1]}_{4})
$$


In [20]:
# TODO: compute a12, a13 and a14
### STRIP_START ###
W12 = np.random.rand(3)
b12 = np.random.rand(1)

W13 = np.random.rand(3)
b13 = np.random.rand(1)

W14 = np.random.rand(3)
b14 = np.random.rand(1)

a12 = g(np.dot(X, np.transpose(W12))+b12)
a13 = g(np.dot(X, np.transpose(W13))+b13)
a14 = g(np.dot(X, np.transpose(W14))+b14)
### STRIP_END ###

Now we want to combine our values $a^{[1]}_i$ (saved in `a11`, `a12`, `a13` and `a14`) into a single vector of 4 values, `a1`.

In [21]:
# TODO: compute a1
### STRIP_START ###
a1 = np.concatenate([a11, a12, a13, a14])
a1
### STRIP_END ###

array([0.69207739, 0.84369753, 0.88187849, 0.76261839])

The first layer is now computed. Next, compute the values of the second layer using the following formulas, again with random values for the weights and biases:

$$
a^{[2]}_{1} = g( W^{[2]}_{1} \times a^{[1]} + b^{[2]}_{1})
$$

$$
a^{[2]}_{2} = g(W^{[2]}_{2} \times a^{[1]} + b^{[2]}_{2})
$$

$$
a^{[2]}_{3} = g(W^{[2]}_{3} \times a^{[1]} + b^{[2]}_{3})
$$

$$
a^{[2]}_{4} = g(W^{[2]}_{4} \times a^{[1]} + b^{[2]}_{4})
$$

Be careful: each $W^{[2]}_i$ now multiplies $a^{[1]}$, which has 4 values, so its dimension differs from that of the first-layer weights, which had 3 values.
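
To illustrate the dimension issue, here is a small sketch under assumed values (`a1_demo`, `W_ok` and `W_bad` are hypothetical names used only here). A weight vector with as many entries as the incoming activations works with `np.dot`, while one sized for the 3 inputs of the first layer raises a `ValueError`.

```python
import numpy as np

# 4 activations coming out of layer 1 (illustrative values)
a1_demo = np.array([0.7, 0.8, 0.9, 0.75])

W_ok = np.ones(4)    # one weight per incoming activation
W_bad = np.ones(3)   # sized for the 3 inputs of layer 1, wrong here

# Works: 4 values against 4 weights
print(np.dot(a1_demo, W_ok))

# Fails: np.dot cannot align shapes (4,) and (3,)
try:
    np.dot(a1_demo, W_bad)
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
print(mismatch_detected)
```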

In [22]:
# TODO: compute a21, a22, a23, a24
### STRIP_START ###
W21 = np.random.rand(4)
b21 = np.random.rand(1)

W22 = np.random.rand(4)
b22 = np.random.rand(1)

W23 = np.random.rand(4)
b23 = np.random.rand(1)

W24 = np.random.rand(4)
b24 = np.random.rand(1)

a21 = g(np.dot(a1, np.transpose(W21))+b21)
a22 = g(np.dot(a1, np.transpose(W22))+b22)
a23 = g(np.dot(a1, np.transpose(W23))+b23)
a24 = g(np.dot(a1, np.transpose(W24))+b24)
### STRIP_END ###

Again, compute the vector `a2`, concatenation of `a21`, `a22`, `a23` and `a24`.

In [23]:
# TODO: compute a2
### STRIP_START ###
a2 = np.concatenate([a21, a22, a23, a24])
a2
### STRIP_END ###

array([0.74922896, 0.86458958, 0.90727622, 0.84250911])

Finally, compute the output value `a3` using the following formula:

$$
a^{[3]}_{1} = g(W^{[3]}_{1} \times a^{[2]} + b^{[3]}_{1})
$$

Again, use random weights and bias.

In [24]:
# TODO: compute a3
### STRIP_START ###
W31 = np.random.rand(4)
b31 = np.random.rand(1)

a3 = g(np.dot(a2, np.transpose(W31))+b31)
print(a3)
### STRIP_END ###

[0.82879239]


You have built your own neural network. Impressive, right?

You now know how a neural network can compute a value (for regression or classification) just by having units and layers.

# Optional: vectorization

This part is optional and a bit more complicated.

You might have noticed that we made a lot of computations that could be vectorized. That is, instead of computing `a11`, `a12`, `a13` and `a14` separately, we could have computed `a1` directly. The same applies to the other layers. We will do it that way now.

We will keep our input vector `X`, and randomly define a weight matrix `W1` and a bias vector `b1`. Then, using those three variables, compute `a1` in just one line of code.

In [34]:
# TODO: compute a1 in one line
### STRIP_START ###
W1 = np.random.rand(3, 4)
b1 = np.random.rand(4)

a1 = g(np.dot(X, W1) + b1)
a1
### STRIP_END ###

array([0.72901137, 0.82103318, 0.82225048, 0.8266251 ])
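
To see that the matrix form computes the same thing as the unit-by-unit version, the following sketch stacks four per-unit weight vectors as the columns of a `(3, 4)` matrix and compares both results. It uses its own local random generator and `_demo` names (assumptions made for illustration), so the numbers differ from the ones in this notebook.

```python
import numpy as np

def g(x):
    return 1.0 / (1.0 + np.exp(-x))

# Local generator, separate from the seed used in the notebook
rng = np.random.default_rng(0)
X_demo = rng.random(3)

# One weight vector and one bias per unit, as in the unit-by-unit version
unit_weights = [rng.random(3) for _ in range(4)]
unit_biases = [rng.random(1) for _ in range(4)]
a_units = np.concatenate(
    [g(np.dot(X_demo, w) + b) for w, b in zip(unit_weights, unit_biases)]
)

# Same parameters packed into a (3, 4) matrix and a length-4 bias vector
W_demo = np.column_stack(unit_weights)
b_demo = np.concatenate(unit_biases)
a_vec = g(np.dot(X_demo, W_demo) + b_demo)

# Both approaches give the same 4 activations
print(np.allclose(a_units, a_vec))
```

Each column of `W_demo` plays the role of one $W^{[1]}_i$, which is why the weight matrix has shape `(3, 4)`: 3 inputs by 4 units.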

It gets easier, right? Now that you have the idea, compute `a2` and finally `a3` in just a couple of lines of code.

In [36]:
# TODO: compute a2 and a3
### STRIP_START ###
W2 = np.random.rand(4, 4)
b2 = np.random.rand(4)
W3 = np.random.rand(4)
b3 = np.random.rand(1)

a2 = g(np.dot(a1, W2) + b2)
a3 = g(np.dot(a2, W3) + b3)

a3
### STRIP_END ###

array([0.94148338])

This process is called vectorization: replacing the unit-by-unit computations with matrix operations makes the code shorter and is usually faster for large layers.

If you still have time, you can try to define a function that takes as parameters an input vector `X`, the number of layers `L` and the units per layer `units` and returns the output of the associated neural network.

# Optional: generalization

If you have more time, try to generalize what you did: build a function (or a class!) that computes the forward propagation, with the number of input features, the number of layers and the units per layer as **parameters** of the function.
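
One possible starting point is sketched below. It is only one way to do it, under a few assumptions: the function name `forward`, the `rng` parameter and the choice to pass the layers as a list `units` (so the number of layers is `len(units)` and the number of input features is read from `X`) are all illustrative, and weights and biases are drawn randomly as in the rest of this exercise.

```python
import numpy as np

def g(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(X, units, rng=None):
    """Forward pass through a fully connected network with random parameters.

    X     : 1-D input vector
    units : list with the number of units in each layer, e.g. [4, 4, 1]
    rng   : optional np.random.Generator, for reproducibility
    """
    if rng is None:
        rng = np.random.default_rng()
    a = np.asarray(X, dtype=float)
    for n_units in units:
        # Weight matrix of shape (inputs to the layer, units in the layer)
        W = rng.random((a.shape[0], n_units))
        b = rng.random(n_units)
        a = g(np.dot(a, W) + b)
    return a

# Same architecture as in this exercise: 3 inputs, two layers of 4 units, 1 output
output = forward(np.array([0.5, 0.7, 0.6]), units=[4, 4, 1],
                 rng=np.random.default_rng(0))
print(output.shape)
```

Because each weight matrix takes its first dimension from the previous activation vector, the dimensions always line up, whatever the values in `units`.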