# Building the Hypothesis Function

### Introduction

In the last lab, we built the weights and biases for a network that classifies the MNIST dataset.

In this lesson, we'll wrap the construction of our layers in a couple of functions, so the code is already familiar when we go on to train the network.

### Architecting our Network

As we know, our network combines its weight matrices and bias vectors as follows:

$$
\begin{aligned}
z_1 & = W_1x + b_1 \\
a_1 & = \sigma(z_1) \\
z_2 & = W_2a_1 + b_2 \\
\end{aligned}
$$
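
To make the shapes concrete, here is a minimal NumPy sketch of these three equations for a single observation. The sizes (784 inputs, 15 hidden neurons, 10 outputs) and the variable names are assumptions chosen for illustration. The sketch follows the column-vector convention of the equations, whereas batch code typically stores one observation per row and computes `X.dot(W)` instead.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed sizes, for illustration only
n_features, n_hidden, n_out = 784, 15, 10

x = rng.standard_normal((n_features, 1))            # one observation as a column vector
W_1 = rng.standard_normal((n_hidden, n_features))   # first-layer weights
b_1 = np.zeros((n_hidden, 1))                       # first-layer bias
W_2 = rng.standard_normal((n_out, n_hidden))        # second-layer weights
b_2 = np.zeros((n_out, 1))                          # second-layer bias

z_1 = W_1 @ x + b_1            # first linear layer
a_1 = 1 / (1 + np.exp(-z_1))   # sigmoid activation
z_2 = W_2 @ a_1 + b_2          # second linear layer

print(z_1.shape, a_1.shape, z_2.shape)
```

Each linear layer maps a vector of one size to a vector of the next size, and the activation leaves the shape unchanged.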

### Automating the construction

Below we have created a function that follows the general pattern for constructing the linear layers of a neural network.  We execute the function in the cell below.  See if you can understand what the function is doing.  

In [145]:
import numpy as np
np.random.seed(0)

def init_model(n_features, neur_l1, neur_out):
    W1 = np.random.randn(n_features, neur_l1) 
    b1 = np.zeros((1, neur_l1))
    W2 = np.random.randn(neur_l1, neur_out) 
    b2 = np.zeros((1, neur_out))
    model = {'W1': W1, 'b1': b1, 'W2': W2, 'b2': b2}
    return model

In [146]:
# n_features = 28*28, neur_l1 = 15, neur_out = 10
model = init_model(28*28, 15, 10)
# model

Then uncomment `model` to check whether the matrices and vectors constructed by the function match the weight matrices and bias vectors that you constructed in the previous lab.
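
Printing every array in full is hard to read, so a quicker way to make that comparison may be to look only at the shape of each entry. The sketch below repeats `init_model` so that it runs on its own; the `shapes` dictionary is a helper introduced here for illustration.

```python
import numpy as np

np.random.seed(0)

def init_model(n_features, neur_l1, neur_out):
    # Same construction as the function above, repeated so this cell is self-contained
    W1 = np.random.randn(n_features, neur_l1)
    b1 = np.zeros((1, neur_l1))
    W2 = np.random.randn(neur_l1, neur_out)
    b2 = np.zeros((1, neur_out))
    return {'W1': W1, 'b1': b1, 'W2': W2, 'b2': b2}

model = init_model(28*28, 15, 10)

# Collect the shape of each parameter instead of printing the full arrays
shapes = {name: param.shape for name, param in model.items()}
for name, shape in shapes.items():
    print(name, shape)
```

Each weight matrix has one row per input to its layer and one column per neuron, and each bias is a single row with one entry per neuron.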

### The forward method

So we just wrote a function that initializes the layers of our neural network.

In [147]:
model.keys()

dict_keys(['W1', 'b1', 'W2', 'b2'])

Next, let's write a function that makes predictions from these weights and biases.  The outputs of the linear layers involve only dot products (plus a bias), but the activation layers apply non-linear functions.

Let's define them below.

In [148]:
import numpy as np
def sigma(z):
    return 1/(1 + np.exp(-z))

In [149]:
def softmax(last_layer):
    # normalize each row separately so every observation's probabilities sum to 1
    exps = np.exp(last_layer)
    return exps/np.sum(exps, axis=1, keepdims=True)
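
When the input holds one row per observation, the probabilities in each row should sum to 1 on their own. The sketch below is one way to write a row-wise softmax that also guards against overflow by subtracting each row's maximum before exponentiating. The name `softmax_rows` and the sample matrix `Z` are hypothetical and are not part of the lesson's code.

```python
import numpy as np

def softmax_rows(Z):
    # Subtracting the row maximum does not change the result mathematically,
    # but it keeps np.exp from overflowing on large inputs.
    shifted = Z - np.max(Z, axis=1, keepdims=True)
    exp_Z = np.exp(shifted)
    # Divide each row by its own sum so every row is a probability distribution
    return exp_Z / np.sum(exp_Z, axis=1, keepdims=True)

# Two example rows; the second would overflow without the shift
Z = np.array([[1.0, 2.0, 3.0],
              [1000.0, 1000.0, 1000.0]])
probs = softmax_rows(Z)
print(probs.sum(axis=1))
```

Equal scores in a row, as in the second example, produce equal probabilities.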

In [150]:
X = np.random.randn(50, 28*28) # 50 observations, 28*28 = 784 pixels each
X.shape

(50, 784)

In [151]:
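def forward(X, model):
    # look parameters up by key rather than relying on dictionary order
    W1, b1, W2, b2 = model['W1'], model['b1'], model['W2'], model['b2']
    Z1 = X.dot(W1) + b1       # first linear layer
    A1 = sigma(Z1)            # sigmoid activation
    Z2 = A1.dot(W2) + b2      # second linear layer
    predictions = softmax(Z2) # scores converted to probabilities
    return (Z1, Z2, predictions)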

In [152]:
forward(X, model)[1].shape
# index 1 is Z2, with shape (n_observations, neur_out)

(50, 10)

> Here we return the predictions from the model, as well as $z_1$ and $z_2$, as we will need them later when computing the gradients.
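
Unpacking the returned tuple by name can be easier to read than indexing into it. The sketch below restates the pieces as one self-contained cell and checks the shape of each returned array. It assumes a row-wise softmax (one probability distribution per observation), and the random input `X` is only a stand-in for real MNIST images.

```python
import numpy as np

np.random.seed(0)

def init_model(n_features, neur_l1, neur_out):
    W1 = np.random.randn(n_features, neur_l1)
    b1 = np.zeros((1, neur_l1))
    W2 = np.random.randn(neur_l1, neur_out)
    b2 = np.zeros((1, neur_out))
    return {'W1': W1, 'b1': b1, 'W2': W2, 'b2': b2}

def sigma(z):
    return 1 / (1 + np.exp(-z))

def softmax(last_layer):
    # Assumption: normalize per row, so each observation's probabilities sum to 1
    exps = np.exp(last_layer)
    return exps / np.sum(exps, axis=1, keepdims=True)

def forward(X, model):
    Z1 = X.dot(model['W1']) + model['b1']
    A1 = sigma(Z1)
    Z2 = A1.dot(model['W2']) + model['b2']
    return (Z1, Z2, softmax(Z2))

model = init_model(28*28, 15, 10)
X = np.random.randn(50, 28*28)  # stand-in data: 50 observations of 784 pixels

# Unpack by name instead of indexing into the tuple
Z1, Z2, predictions = forward(X, model)
print(Z1.shape, Z2.shape, predictions.shape)
```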

In [153]:
W1, b1, W2, b2 = tuple(model.values())

In [154]:
X = np.random.randn(50, 28*28) # 28*28 = 784 pixels
forward(X[:20], model)[-1].shape

(20, 10)

In [155]:
Z1 = forward(X, model)[0]
Z1.shape

(50, 15)

> The last element of the tuple returned by `forward` holds our network's predictions: the softmax of $z_2$.
