# Looking at the Math Behind a Neural Network:
Here is a primer on the basic linear algebra that goes on under the hood. This notebook is not meant to stand alone, because vast resources already exist that dive deep into the mathematics behind neural networks.

Treat this notebook as a way to experiment with some basic code as you read along with more in-depth material. I suggest taking a look at the excellent, free online book found here:

http://neuralnetworksanddeeplearning.com/

In [1]:
import torch

In [2]:
def activation(x):
    """
    sigmoid activation function on x
    """
    return 1/(1 + torch.exp(-x))
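
As a quick sanity check, the sketch below compares this hand-written sigmoid with PyTorch's built-in `torch.sigmoid` and confirms that every output lies strictly between 0 and 1. The sample inputs are arbitrary values chosen only for illustration.

```python
import torch

def activation(x):
    """Sigmoid activation function, as defined above."""
    return 1 / (1 + torch.exp(-x))

# Arbitrary sample inputs, chosen only for illustration
x = torch.tensor([-4.0, -1.0, 0.0, 1.0, 4.0])
y = activation(x)

# The hand-written version should agree with the built-in one
matches_builtin = torch.allclose(y, torch.sigmoid(x))

# sigmoid(0) is 0.5, and every output lies in (0, 1)
print(y)
print(matches_builtin)
```

Large negative inputs are pushed toward 0 and large positive inputs toward 1, which is the "squashing" behaviour used throughout the rest of the notebook.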

## Single Layer Network:
A vanilla, fully-connected network is nothing but a set of matrices representing the weights and biases of the network. 

Each layer of a fully connected neural network is a stack of nodes. A given node in layer $l$ will receive the output of every node in layer $l-1$. Each input to this single node will have a weight assigned to it. The node itself will also have a bias. 

These weights and biases form the network, and they are what we will later learn to train through back-propagation. For now, here is a simplistic look at how feedforward works:

In [3]:
torch.manual_seed(7)
features = torch.randn((1, 5))
weights = torch.randn_like(features)
bias = torch.randn((1,1))

1. A set of inputs to the network called `features` is created. Here, `features` is a row vector.
2. A single 'layer' of weights and a bias are randomly initialized. 
3. We multiply the features by the layer weights (for a single node this is a dot product, giving one weighted sum) and add the bias.
4. We feed the result into an 'activation' function, which squashes it into a desired range; for the sigmoid used here, that range is (0, 1).

In [4]:
# features and weights are both (1, 5) row vectors, so transpose the weights
# to get a (1, 5) x (5, 1) product: a single weighted sum for a single node
output = activation(torch.mm(features, weights.T) + bias)
output

tensor([[0.1595]])

In mathematical notation, we are simply computing the equation of a hyper-plane (a line generalized to higher dimensions):


$$
y = m_1x_1 + m_2x_2 + \dots + m_nx_n + b = \sum_{i = 1}^n m_ix_i + b 
$$
$$
\text{output} = \sigma (y)
$$
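
To connect the formula to code, here is a small sketch that computes $y$ term by term with a plain Python loop and compares it against the vectorized dot product. The tensor names `x`, `m`, and `b` mirror the symbols in the equation and are not used elsewhere in this notebook; the seed is arbitrary.

```python
import torch

torch.manual_seed(0)  # arbitrary seed, only for reproducibility

n = 5
x = torch.randn(n)   # inputs x_1 ... x_n
m = torch.randn(n)   # weights m_1 ... m_n
b = torch.randn(1)   # bias

# Term-by-term sum: y = m_1*x_1 + ... + m_n*x_n + b
y_loop = b.clone()
for i in range(n):
    y_loop = y_loop + m[i] * x[i]

# Same quantity as a single dot product
y_dot = torch.dot(m, x) + b

# Squash through the sigmoid to get the node's output
output = torch.sigmoid(y_dot)
print(y_loop, y_dot, output)
```

Both versions give the same value up to floating-point rounding; the vectorized form is what scales to layers with many nodes.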

## 'Deep' Neural Networks:
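A deep neural network is one that has more than a single hidden layer. Below, we take a look at a small feed-forward example with two layers of weights: one hidden layer with 2 units, followed by an output layer. Stacking further hidden layers follows exactly the same pattern.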

This time our weights are actual 2-dimensional matrices rather than simple row or column vectors. Each element of a weight matrix can be indexed as follows:

$$ w^l_{i, j}$$

Where the specific indices can be described as:

\begin{align}
l& \text{ - Denotes layer $l$, the current layer.} \\
i& \text{ - Denotes the $i$th node in the current layer $l$, which receives the input.} \\
j& \text{ - Denotes the $j$th node in the previous layer $l - 1$.}
\end{align}

Therefore $w^l_{i, j}$ represents the weight of the input from neuron $j$ in the previous layer to neuron $i$ in the current layer. 
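
The indexing convention can be made concrete with a small sketch. Under it, the weight matrix of layer $l$ has one row per node in layer $l$ and one column per node in layer $l-1$, so with column vectors the weighted input is $Wa + b$. The code cells in this notebook instead use row vectors and store the transposed matrix, which gives the same numbers. The names `W`, `a_prev`, and `b` below are illustrative assumptions, not part of the original code.

```python
import torch

torch.manual_seed(0)  # arbitrary seed

n_prev, n_curr = 3, 2            # sizes of layer l-1 and layer l
W = torch.randn(n_curr, n_prev)  # W[i, j]: weight from node j (layer l-1) to node i (layer l)
a_prev = torch.randn(n_prev, 1)  # activations of layer l-1 as a column vector
b = torch.randn(n_curr, 1)       # one bias per node in layer l

# Column-vector convention: z = W a + b
z_col = torch.mm(W, a_prev) + b

# Row-vector convention: z^T = a^T W^T + b^T
z_row = torch.mm(a_prev.T, W.T) + b.T

# Weighted input of a single node i, written out with the indices
i = 1
z_i = sum(W[i, j] * a_prev[j, 0] for j in range(n_prev)) + b[i, 0]

print(z_col.shape, z_row.shape)
```

The two conventions differ only by a transpose, so which one to use is a matter of how the inputs are laid out.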

Below is a functional example in code, but I highly encourage you to read the linked book for a deeper understanding. It is written so well that it is almost pointless for me to write more here.

In [6]:
### Generate some data
torch.manual_seed(7) # Set the random seed so things are predictable

# Features are 3 random normal variables
features = torch.randn((1, 3))

# Define the size of each layer in our network
n_input = features.shape[1]     # Number of input units, must match number of input features
n_hidden = 2                    # Number of hidden units 
n_output = 1                    # Number of output units

# Weights for inputs to hidden layer
W1 = torch.randn(n_input, n_hidden)
# Weights for hidden layer to output layer
W2 = torch.randn(n_hidden, n_output)

# and bias terms for hidden and output layers
B1 = torch.randn((1, n_hidden))
B2 = torch.randn((1, n_output))

# feed forward:
a1 = activation(torch.mm(features, W1) + B1)
output = activation(torch.mm(a1, W2) + B2)
output

tensor([[0.3171]])
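
The same feed-forward pattern extends to any number of layers by looping over (weight, bias) pairs. The following is a minimal sketch, not part of the original notebook: the helper `feed_forward` and the layer sizes are hypothetical, and it follows the row-vector convention used in the cell above.

```python
import torch

def activation(x):
    """Sigmoid activation function."""
    return 1 / (1 + torch.exp(-x))

def feed_forward(x, layers):
    """Pass row vector x through each (W, B) pair in turn (hypothetical helper)."""
    a = x
    for W, B in layers:
        a = activation(torch.mm(a, W) + B)
    return a

torch.manual_seed(7)
features = torch.randn((1, 3))

# Illustrative layer sizes: 3 inputs -> 4 hidden -> 2 hidden -> 1 output
sizes = [3, 4, 2, 1]
layers = [(torch.randn(n_in, n_out), torch.randn(1, n_out))
          for n_in, n_out in zip(sizes[:-1], sizes[1:])]

output = feed_forward(features, layers)
print(output.shape)  # one row, one output unit
```

Each weight matrix has shape (units in, units out), so the output of one layer always has the right shape to feed into the next.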