# Tutorial

## Implementing the hidden layer

Weights are stored in a **matrix**, indexed as $w_{ij}$. Each **row** in the matrix corresponds to the weights **leading out** of a **single input unit**, and each **column** corresponds to the weights **leading in** to a **single hidden unit**. For our three input units and two hidden units, the weights matrix looks like this:

![caption](./images/multilayer-diagram-weights.png)

To initialize these weights in NumPy, we have to provide the shape of the matrix. If `features` is a 2D array containing the input data:

```
# Number of records and input units
n_records, n_inputs = features.shape
# Number of hidden units
n_hidden = 2
# Draw the weights from a normal distribution with standard deviation 1/sqrt(n_inputs)
weights_input_to_hidden = np.random.normal(0, n_inputs**-0.5, size=(n_inputs, n_hidden))
```
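As a quick check of the initialization above, here is a minimal self-contained sketch. The `features` array is made-up stand-in data (5 records, 3 inputs), chosen only so the snippet can run on its own.

```python
import numpy as np

np.random.seed(0)

# Hypothetical stand-in data: 5 records with 3 input features each
features = np.random.randn(5, 3)

# Number of records and input units
n_records, n_inputs = features.shape
# Number of hidden units
n_hidden = 2

# One row per input unit, one column per hidden unit
weights_input_to_hidden = np.random.normal(0, n_inputs**-0.5, size=(n_inputs, n_hidden))

print(weights_input_to_hidden.shape)  # (3, 2)
```

The shape is `(n_inputs, n_hidden)`, which matches the three-input, two-hidden-unit network described above.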

This creates a 2D array (i.e. a matrix) named `weights_input_to_hidden` with dimensions `n_inputs` by `n_hidden`. Remember that the input to a hidden unit is the sum of all the inputs multiplied by the hidden unit's weights. So for each hidden layer unit, $h_j$, we need to calculate the following:

$$h_j = \sum_{i} w_{ij}x_i$$

To do that, we need matrix multiplication. In this case, we are multiplying the inputs (a row vector) by the weights matrix. To do this, you take the dot (inner) product of the inputs with each column in the weights matrix. For example, to calculate the input to the first hidden unit, $j = 1$, you'd take the dot product of the inputs with the first column of the weights matrix, like so:

![caption2](./images/input-times-weights.png)

Calculating the input to the first hidden unit with the first column of the weights matrix.

$$h_1 = x_1w_{11} + x_2w_{21} + x_3w_{31}$$

For the input to the second hidden unit, you calculate the dot product of the inputs with the second column, and so on for any further hidden units.

In NumPy, you can do this for all the inputs and all the hidden units at once using `np.dot`:

```
hidden_inputs = np.dot(inputs, weights_input_to_hidden)
```
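To see that `np.dot` really computes one dot product per column, the sketch below writes out $h_1$ term by term and compares it with the result of the matrix product. The input and weight values are illustrative only; they are not taken from the figures.

```python
import numpy as np

# Illustrative values only: three inputs and a 3x2 weights matrix
inputs = np.array([1.0, 2.0, 3.0])
weights_input_to_hidden = np.array([[0.1, 0.4],
                                    [0.2, 0.5],
                                    [0.3, 0.6]])

# h_1 written out term by term: x_1*w_11 + x_2*w_21 + x_3*w_31
h_1 = (inputs[0] * weights_input_to_hidden[0, 0]
       + inputs[1] * weights_input_to_hidden[1, 0]
       + inputs[2] * weights_input_to_hidden[2, 0])

# h_2 as the dot product of the inputs with the second column
h_2 = np.dot(inputs, weights_input_to_hidden[:, 1])

# Both hidden unit inputs at once
hidden_inputs = np.dot(inputs, weights_input_to_hidden)

print(np.allclose([h_1, h_2], hidden_inputs))  # True
```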

You could also define your weights matrix such that it has dimensions `n_hidden` by `n_inputs`, then multiply like so, where the inputs form a *column vector*:

![caption3](./images/inputs-matrix.png)

**Note:** The weight indices have changed because, in matrix notation, the row index always precedes the column index.
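The two conventions give the same numbers, just arranged differently. The following is a small sketch with made-up values; the name `weights_hidden_by_input` is hypothetical and simply holds the transposed matrix, and `inputs[:, None]` is one way to turn a 1D array into a column vector.

```python
import numpy as np

# Illustrative values only
inputs = np.array([1.0, 2.0, 3.0])
weights_input_to_hidden = np.array([[0.1, 0.4],
                                    [0.2, 0.5],
                                    [0.3, 0.6]])  # shape (n_inputs, n_hidden)

# Row-vector convention: (3,) times (3, 2) gives (2,)
row_result = np.dot(inputs, weights_input_to_hidden)

# Column-vector convention: (2, 3) times (3, 1) gives (2, 1)
weights_hidden_by_input = weights_input_to_hidden.T  # shape (n_hidden, n_inputs)
inputs_column = inputs[:, None]                      # shape (3, 1)
column_result = np.dot(weights_hidden_by_input, inputs_column)

print(row_result.shape, column_result.shape)           # (2,) (2, 1)
print(np.allclose(row_result, column_result.ravel()))  # True
```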

The important thing with matrix multiplication is that *the dimensions match.* For matrix multiplication to work, there has to be the same number of elements in the dot products. In the first example, there are three columns in the input vector, and three rows in the weights matrix. In the second example, there are three columns in the weights matrix and three rows in the input vector. If the dimensions don't match, you will get an error.

With the matrix on the left, the dot product cannot be computed for a 3x2 matrix and a 3-element array. That is because the 2 columns in the matrix don't match the 3 elements in the array. The rule is that if you are multiplying with the array on the left, the array must have the same number of elements as there are rows in the matrix. And if you are multiplying with the *matrix* on the left, the number of columns in the matrix must equal the number of elements in the array on the right.
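The rule can be checked directly in NumPy. This is a minimal sketch using a 3x2 matrix of ones and a 3-element array (both made up for illustration); `np.dot` raises a `ValueError` when the shapes are not aligned.

```python
import numpy as np

weights = np.ones((3, 2))      # 3 rows, 2 columns
x = np.array([1.0, 2.0, 3.0])  # 3 elements

# Array on the left: 3 elements match the 3 rows, so this works
print(np.dot(x, weights).shape)  # (2,)

# Matrix on the left: 2 columns do not match 3 elements
try:
    np.dot(weights, x)
    mismatch_raised = False
except ValueError as err:
    mismatch_raised = True
    print("ValueError:", err)

# Transposing the matrix gives 3 columns, which match the 3 elements
print(np.dot(weights.T, x).shape)  # (2,)
```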

## Making a column vector

Sometimes you will want a column vector, even though by default NumPy's arrays work like row vectors. It is possible to get the transpose of an array with `arr.T`, but for a 1D array the transpose returns the same 1D array, which still behaves like a row vector. Instead, use `arr[:, None]` to create a column vector.

In [1]:
import numpy as np

In [2]:
features = np.array([0.49671415, -0.1382643 ,  0.64768854])

In [3]:
print(features)

[ 0.49671415 -0.1382643   0.64768854]


In [4]:
print(features.T)

[ 0.49671415 -0.1382643   0.64768854]


In [5]:
print(features[:, None])

[[ 0.49671415]
 [-0.1382643 ]
 [ 0.64768854]]


Alternatively, you can create arrays with two dimensions. Then you can use `arr.T` to get the column vector.

In [7]:
np.array(features, ndmin=2)

array([[ 0.49671415, -0.1382643 ,  0.64768854]])

In [8]:
np.array(features, ndmin=2).T

array([[ 0.49671415],
       [-0.1382643 ],
       [ 0.64768854]])
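Printing the shapes makes the difference between these arrays explicit. The sketch below reuses the `features` values from above; the variable name `row` is only for illustration.

```python
import numpy as np

features = np.array([0.49671415, -0.1382643, 0.64768854])

print(features.shape)           # (3,)
print(features.T.shape)         # (3,), transposing a 1D array changes nothing
print(features[:, None].shape)  # (3, 1)

# A 2D array with one row can be transposed into a column
row = np.array(features, ndmin=2)
print(row.shape)                # (1, 3)
print(row.T.shape)              # (3, 1)
```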

# Programming Quiz

You will implement a forward pass through a 4x3x2 network, with sigmoid activation functions for both layers.

Things to do:
* Calculate the input to the hidden layer.
* Calculate the hidden layer output.
* Calculate the input to the output layer.
* Calculate the output of the network.

In [20]:
import numpy as np

def sigmoid(x):
    """
    Calculate sigmoid
    """
    return 1/(1+np.exp(-x))

# Network size
N_input = 4
N_hidden = 3
N_output = 2

np.random.seed(42)

# Make some fake data
X = np.random.randn(4)

weights_input_to_hidden = np.random.normal(0, scale=0.1, size=(N_input, N_hidden))
weights_hidden_to_output = np.random.normal(0, scale=0.1, size=(N_hidden, N_output))

# TODO: Make a forward pass through the network

hidden_layer_in = np.dot(X, weights_input_to_hidden)
hidden_layer_out = sigmoid(hidden_layer_in)

print('Hidden-layer Output:')
print(hidden_layer_out)

output_layer_in = np.dot(hidden_layer_out, weights_hidden_to_output)
output_layer_out = sigmoid(output_layer_in)

print('Output-layer Output:')
print(output_layer_out)

Hidden-layer Output:
[0.41492192 0.42604313 0.5002434 ]
Output-layer Output:
[0.49815196 0.48539772]
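The same forward pass also works for several records at once, because `np.dot` treats each row of a 2D input as one record. Below is a hedged sketch: `forward_pass` is a hypothetical helper name, not part of the quiz, and the batch of 5 records is made-up data.

```python
import numpy as np

def sigmoid(x):
    """Calculate sigmoid."""
    return 1 / (1 + np.exp(-x))

def forward_pass(X, weights_input_to_hidden, weights_hidden_to_output):
    """Hypothetical helper: forward pass with sigmoid activations on both layers."""
    hidden_layer_out = sigmoid(np.dot(X, weights_input_to_hidden))
    output_layer_out = sigmoid(np.dot(hidden_layer_out, weights_hidden_to_output))
    return hidden_layer_out, output_layer_out

np.random.seed(42)

# Made-up batch: 5 records with 4 input features each
X_batch = np.random.randn(5, 4)
w_ih = np.random.normal(0, scale=0.1, size=(4, 3))
w_ho = np.random.normal(0, scale=0.1, size=(3, 2))

hidden, output = forward_pass(X_batch, w_ih, w_ho)
print(hidden.shape, output.shape)  # (5, 3) (5, 2)
```

Passing a single 1D record, such as `X_batch[0]`, returns 1D arrays of length 3 and 2, as in the quiz solution.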
