# Building Layers

### Introduction

### Multiple Layers

Let's start by taking another look at our single layer matrix:

<img src="./first-layer.png" width="20%">

In [2]:
import numpy as np
np.random.seed(2)
W_1 = np.random.randn(10, 5)

b_1 = np.random.randn(5)
b_1

array([ 1.00036589, -0.38109252, -0.37566942, -0.07447076,  0.43349633])

In [3]:
def sigmoid(value):
    return 1/(1 + np.exp(-value))

x = np.array([.9, .4, .5, .6, .9, .8, .7, .2, .4, .3])
sigmoid(x.dot(W_1) + b_1)

array([0.28594759, 0.91737969, 0.00451515, 0.17750781, 0.32831314])

<img src="./two-layers.png" width="40%">

In [6]:
W_1.shape

(10, 5)

In [8]:
b_1.shape

(5,)

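> So we confirmed that our first layer has five neurons, each with ten weights and one bias. Because the first layer outputs five values, each neuron in the second layer needs five weights.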

In [10]:
W_1.shape

(10, 5)

In [11]:
W_2 = np.random.randn(5,3)

In [12]:
b_2 = np.random.randn(3)

We can place these two layers into a single list.
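As a minimal sketch of what that list could look like, the block below pairs each weight matrix with its bias vector. The name `layers` and the choice of `(W, b)` tuples are assumptions made for illustration; the arrays are re-created here with the same shapes as above only so that the snippet runs on its own.

```python
import numpy as np

np.random.seed(2)

# Same shapes as above: 10 inputs -> 5 neurons -> 3 neurons.
W_1, b_1 = np.random.randn(10, 5), np.random.randn(5)
W_2, b_2 = np.random.randn(5, 3), np.random.randn(3)

# Each entry is a (weights, biases) pair; the list order is the order
# in which the layers are applied.
layers = [(W_1, b_1), (W_2, b_2)]

for W, b in layers:
    print(W.shape, b.shape)
# (10, 5) (5,)
# (5, 3) (3,)
```

Keeping the weights and biases of a layer together means a layer can be added or removed by changing a single list entry.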

Now let's play around with this a little bit.  If we want to execute the first layer, we do the following:

In [16]:
first_output = sigmoid(x.dot(W_1) + b_1)
first_output

array([0.28594759, 0.91737969, 0.00451515, 0.17750781, 0.32831314])

This output is then fed into the second layer. Here we compute the second layer's weighted sums plus biases, before any activation is applied:

In [41]:
first_output.dot(W_2) + b_2

array([-0.92479789, -0.86391654, -0.49823304])

Great.  Now let's use a loop to clean this up.
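One possible version of that loop is sketched here. The function name `feed_forward` and the list of `(W, b)` pairs are assumptions made for this example. Note that this sketch applies the sigmoid at every layer, including the last one, whereas the cell above computes `first_output.dot(W_2) + b_2` without it.

```python
import numpy as np

def sigmoid(value):
    return 1 / (1 + np.exp(-value))

def feed_forward(x, layers):
    """Pass x through each (W, b) pair in order, applying sigmoid each time."""
    output = x
    for W, b in layers:
        output = sigmoid(output.dot(W) + b)
    return output

# Re-create the two layers so the snippet runs on its own.
np.random.seed(2)
W_1, b_1 = np.random.randn(10, 5), np.random.randn(5)
W_2, b_2 = np.random.randn(5, 3), np.random.randn(3)
layers = [(W_1, b_1), (W_2, b_2)]

x = np.array([.9, .4, .5, .6, .9, .8, .7, .2, .4, .3])
prediction = feed_forward(x, layers)
print(prediction.shape)  # (3,)
```

Because the loop only relies on each layer's output length matching the next layer's number of rows, the same function works unchanged for three or more layers.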

So that is how our neural network makes a prediction.  

1. Initialize random weights and biases for each layer. In each weight matrix, every column holds the weights of one neuron and every row corresponds to one input, so a layer with ten inputs and five neurons has a weight matrix of shape `(10, 5)` and a bias vector of shape `(5,)`.

2. Feed forward through each layer with the formula $\sigma(x \cdot W + b)$, where $x$ is the vector of inputs for the first layer and, for every later layer, the vector of the previous layer's outputs.
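Step 1 can be generalized to any number of layers. The sketch below assumes a hypothetical helper, `init_layers`, that takes a list of layer sizes; the helper name, its `seed` parameter, and the use of `np.random.default_rng` are choices made for this example rather than anything defined earlier.

```python
import numpy as np

def init_layers(sizes, seed=0):
    """Return a list of (W, b) pairs for consecutive layer sizes.

    sizes[0] is the number of inputs; each later entry is the number of
    neurons in that layer.
    """
    rng = np.random.default_rng(seed)
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        W = rng.standard_normal((n_in, n_out))  # one column per neuron
        b = rng.standard_normal(n_out)          # one bias per neuron
        layers.append((W, b))
    return layers

layers = init_layers([10, 5, 3])
print([W.shape for W, _ in layers])  # [(10, 5), (5, 3)]
```

Each weight matrix has as many rows as the previous layer has outputs, which is what lets `x.dot(W)` line up from one layer to the next.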

### The last layer

We want our final layer to tell us how likely each possible class is for our observation. So if we want our network to classify an observation as cancerous or benign, our last layer has two outputs. If we would like our network to classify an image as one of twenty-six letters, our last layer has 26 outputs (27 if we include a none-of-the-above category). In other words, the number of neurons in the last layer determines the number of outputs we predict.

To turn the last layer's raw scores into probabilities, we can exponentiate each score and divide by the sum of the exponentials (the softmax function), as the cells below do for an example score vector `z`.

In [59]:
z = np.array([1, 2, 3, 10, 4, 5, 6, 7])
z

array([ 1,  2,  3, 10,  4,  5,  6,  7])

In [60]:
exp_scores = np.exp(z)

In [61]:
exp_scores

array([2.71828183e+00, 7.38905610e+00, 2.00855369e+01, 2.20264658e+04,
       5.45981500e+01, 1.48413159e+02, 4.03428793e+02, 1.09663316e+03])

In [62]:
probs = exp_scores / np.sum(exp_scores)

In [63]:
probs.round(3)

array([0.   , 0.   , 0.001, 0.927, 0.002, 0.006, 0.017, 0.046])
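The steps above (exponentiate, then divide by the sum) are the softmax function, $\text{softmax}(z)_i = e^{z_i} / \sum_j e^{z_j}$. They can be wrapped in a single function, as in the sketch below. The name `softmax` and the subtraction of the maximum score are additions for this example: shifting every score by the same constant leaves the probabilities unchanged but keeps `np.exp` from overflowing on large scores.

```python
import numpy as np

def softmax(scores):
    """Convert a 1-D array of scores into probabilities that sum to 1."""
    scores = np.asarray(scores, dtype=float)
    # Shift so the largest exponent is 0; exp can then never overflow.
    shifted = scores - np.max(scores)
    exp_scores = np.exp(shifted)
    return exp_scores / np.sum(exp_scores)

z = np.array([1, 2, 3, 10, 4, 5, 6, 7])
probs = softmax(z)
print(probs.round(3))
```

For the same `z`, this returns the same probabilities as the step-by-step cells above, with the largest share going to the score of 10.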