# Building a Hypothesis Function

### Introduction

Now let's build the weight matrices and bias vectors of a neural network for the MNIST dataset.  MNIST is a classic dataset for practicing with neural networks: it consists of images of handwritten digits.  The task is to train a neural network that can predict the digit shown in each handwritten image.

In this lesson, we won't train the network.  Instead, we'll focus on constructing the weight matrices and bias vectors for a two-layer neural network.

### Architecting our Network

In this lesson, we'll build weight matrices and bias vectors for a network that looks like the following:

$$
\begin{aligned}
z_1 & = xW_1 + b_1 \\
a_1 & = \sigma(z_1) \\
z_2 & = a_1W_2 + b_2 \\
\end{aligned}
$$
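
To make the dimensions explicit, here is a sketch of the shapes involved. The symbols are assumptions introduced for illustration and do not appear elsewhere in the lesson: $x$ holds $m$ observations of $n$ features each, the hidden layer has $h$ neurons, and the output layer has $k$ neurons.

$$
\begin{aligned}
\underbrace{z_1}_{m \times h} & = \underbrace{x}_{m \times n}\,\underbrace{W_1}_{n \times h} + \underbrace{b_1}_{1 \times h} \\
\underbrace{a_1}_{m \times h} & = \sigma(z_1) \\
\underbrace{z_2}_{m \times k} & = \underbrace{a_1}_{m \times h}\,\underbrace{W_2}_{h \times k} + \underbrace{b_2}_{1 \times k} \\
\end{aligned}
$$

Each bias is a single row that is added to every row of the matrix product, and $\sigma$ is applied element by element. For flattened $28 \times 28$ images and ten digit classes, $n = 784$ and $k = 10$, while $h$ is a design choice.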

### Automating the construction

Below we have created a function that follows the general pattern for constructing the linear layers of a neural network.  We execute the function in the cell below.  See if you can understand what the function is doing.  

In [42]:
import numpy as np
np.random.seed(0)

def init_model(n_features, neur_l1, neur_out):
    W1 = np.random.randn(n_features, neur_l1) 
    b1 = np.zeros((1, neur_l1))
    W2 = np.random.randn(neur_l1, neur_out) 
    b2 = np.zeros((1, neur_out))
    model = {'W1': W1, 'b1': b1, 'W2': W2, 'b2': b2}
    return model

In [43]:
# n_features = 28*28, neur_l1 = 15, neur_out = 10
initial_model = init_model(28*28, 15, 10)
# initial_model

Then uncomment `initial_model` to check that the matrices and vectors constructed by the function have the shapes you would expect from the equations above.
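
Printing every entry of a 784-row matrix is hard to read, so one option is to look only at the shapes. The sketch below repeats `init_model` so that it runs on its own. The helper name `model_shapes` is hypothetical and not part of the lesson's code.

```python
import numpy as np

def init_model(n_features, neur_l1, neur_out):
    # Same construction as in the lesson: random weights, zero biases
    W1 = np.random.randn(n_features, neur_l1)
    b1 = np.zeros((1, neur_l1))
    W2 = np.random.randn(neur_l1, neur_out)
    b2 = np.zeros((1, neur_out))
    return {'W1': W1, 'b1': b1, 'W2': W2, 'b2': b2}

def model_shapes(model):
    # Hypothetical helper: map each parameter name to its array shape
    return {name: param.shape for name, param in model.items()}

np.random.seed(0)
shapes = model_shapes(init_model(28*28, 15, 10))
print(shapes)
# {'W1': (784, 15), 'b1': (1, 15), 'W2': (15, 10), 'b2': (1, 10)}
```

The inner dimensions line up: `W1` has 15 columns and `W2` has 15 rows, which is what lets the output of the first layer feed into the second.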

### The forward method

We can translate the equations above into code with the `forward` function.

In [9]:
import numpy as np
def sigma(x):
    return 1/(1 + np.exp(-x))

In [10]:
def forward(X, model):
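    # Look up each parameter by name rather than relying on dictionary order
    W1, b1, W2, b2 = model['W1'], model['b1'], model['W2'], model['b2']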
    z1 = X.dot(W1) + b1 
    a1 = sigma(z1)
    z2 = a1.dot(W2) + b2
    return (z1, a1, z2)

> Here we return the prediction $z_2$ from the model, as well as $z_1$ and $a_1$, because we will need them later when computing the gradients.

In [11]:
forward(X[:1], model)

(array([[1.66364522, 0.82397314, 0.19343284]]),
 array([[0.84072672, 0.69507908, 0.54820799]]),
 array([[0.46533583, 0.55159625]]))

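> The last array shows the predictions ($z_2$) of our neural network. Note that the output has three hidden values and two predictions, so the `model` passed in here evidently has 3 hidden neurons and 2 output neurons, rather than the 15 and 10 of `initial_model`.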
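
The same forward pass can be run on the MNIST-sized model. The following is a minimal sketch, not the lesson's own cell: it assumes a stand-in batch `X_fake` of random values in place of real MNIST images (loading the data is not shown in this lesson), and the names `X_fake` and `model_mnist` are made up for illustration.

```python
import numpy as np

def sigma(x):
    # Sigmoid activation, applied element by element
    return 1 / (1 + np.exp(-x))

def init_model(n_features, neur_l1, neur_out):
    # Random weights and zero biases, as in the lesson
    W1 = np.random.randn(n_features, neur_l1)
    b1 = np.zeros((1, neur_l1))
    W2 = np.random.randn(neur_l1, neur_out)
    b2 = np.zeros((1, neur_out))
    return {'W1': W1, 'b1': b1, 'W2': W2, 'b2': b2}

def forward(X, model):
    W1, b1, W2, b2 = model['W1'], model['b1'], model['W2'], model['b2']
    z1 = X.dot(W1) + b1   # linear layer 1
    a1 = sigma(z1)        # activation
    z2 = a1.dot(W2) + b2  # linear layer 2 (the prediction)
    return (z1, a1, z2)

np.random.seed(0)
model_mnist = init_model(28*28, 15, 10)
# Assumption: 5 fake "images", each flattened to 784 values in [0, 1)
X_fake = np.random.rand(5, 28*28)
z1, a1, z2 = forward(X_fake, model_mnist)
print(z1.shape, a1.shape, z2.shape)  # (5, 15) (5, 15) (5, 10)
```

Each row of `z2` holds ten scores, one per digit, for the corresponding input row. Because the weights are random and untrained, the scores themselves are not meaningful yet; only the shapes are.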
