# Neural Networks

Neural networks are the workhorse of modern machine learning. They are loosely inspired by how the human brain works, and they use statistics and calculus (derivatives in particular) to discover the relationship between inputs and outputs. In this Jupyter Notebook we will explore some of the underlying principles behind how neural networks work and gradually unfold some of the mathematics that underpins them.

## Neural Network Topology

An artificial neural network (ANN) is based on a collection of connected units or nodes called artificial neurons, which loosely model the neurons in a biological brain. Each connection, like the synapses in a biological brain, can transmit a signal from one artificial neuron to another. An artificial neuron that receives a signal can process it and then signal the other artificial neurons connected to it.

There are three key components that make up a neural network: the activation function, the weights and the biases.
- An activation function is a mathematical function applied to the combined inputs of a neuron to produce an output that is usually non-linear.
- The weights are multipliers applied to the connections between neurons; they scale the output of one neuron before it becomes the input of the next neuron.
- Biases are values added to a neuron's weighted input to shift the activation function.

These three components are illustrated in the image below:
![](images/neurons.gif)
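
To make the three components concrete, here is a minimal NumPy sketch of a single neuron. The input, weight and bias values are hypothetical and chosen only for illustration, and the sigmoid activation is an assumption made for this sketch; it is not taken from the image above.

```python
import numpy as np

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical values, chosen only for illustration.
inputs = np.array([0.5, -1.0])   # signals arriving from two upstream neurons
weights = np.array([2.0, 1.0])   # one weight per incoming connection
bias = 0.0                       # shifts the input to the activation function

# Weighted sum of the inputs plus the bias, then the activation function.
z = np.dot(inputs, weights) + bias
output = sigmoid(z)
print(z, output)
```

Changing `bias` moves `z`, and therefore the point on the sigmoid curve at which the neuron's output is read.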

## Feed Forward

Before going in depth into how neural networks work, it is important to understand the concept of 'feed forward'. To help explain it, we will use the previous 'students' dataset to find the relationship between a student's scores on exam 1 and exam 2 and whether they will be admitted or not.
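
In a feed forward pass, values flow in one direction only: from the inputs, through each layer in turn, to the outputs. The sketch below illustrates this with plain NumPy, assuming a small network with two inputs, three hidden neurons and two outputs. The `feed_forward` helper, the random weights and the scaled exam scores are all hypothetical; they are not trained values and not part of the notebook's model.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def feed_forward(x, layers):
    """Pass x through each (weights, bias) pair in turn."""
    activation = x
    for weights, bias in layers:
        # Each layer: weighted sum plus bias, then the activation function.
        activation = sigmoid(activation @ weights + bias)
    return activation

# Hypothetical parameters for a 2 -> 3 -> 2 network (random, not trained).
rng = np.random.default_rng(0)
layers = [
    (rng.normal(size=(2, 3)), np.zeros(3)),  # input -> hidden
    (rng.normal(size=(3, 2)), np.zeros(2)),  # hidden -> output
]

# One hypothetical student: two exam scores scaled to the range 0..1.
x = np.array([[0.45, 0.85]])
output = feed_forward(x, layers)
print(output.shape)  # (1, 2): one row per student, one column per class
```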

In [None]:
import numpy as np
from keras.models import Sequential
from keras.layers import Dense
import keras.utils
from IPython.display import SVG
from keras.utils.vis_utils import model_to_dot

Like before, we will import the dataset. This time, though, we will convert the 'y' field into two categories. This essentially creates two new columns, 'Not Admitted' and 'Admitted', so a row where 'y' was previously 0 now has a 1 in the 'Not Admitted' column and a 0 in the 'Admitted' column.
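
The conversion into two columns is often called one-hot encoding. As a rough illustration of the idea, the following sketch builds the same kind of encoding with NumPy alone; the label values are hypothetical and are not read from the students dataset.

```python
import numpy as np

# Hypothetical labels: 0 = not admitted, 1 = admitted.
labels = np.array([0, 1, 1, 0])

# Row i of the identity matrix is the one-hot vector for class i,
# so indexing with the labels builds the two-column encoding.
one_hot = np.eye(2)[labels]
print(one_hot)
```

Column 0 plays the role of 'Not Admitted' and column 1 the role of 'Admitted'.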

In [None]:
# Admission data: 
# - exam 1 score (x1) 
# - exam 2 score (x2)
# - admitted (y)
data = np.loadtxt('/aiuoa/datasets/students_1.txt', delimiter=',')

In [None]:
# Separate features (x1, x2) from target (y)
X, y = np.hsplit(data, np.array([2]))
y = keras.utils.to_categorical(y)
y_shape = y.shape[1]

Below we train a simple categorical neural network on this data, using the sigmoid function that was introduced in the logistic regression example.

![](images/sigmoid.png)
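
For reference, the sigmoid function is commonly written as

$$\sigma(z) = \frac{1}{1 + e^{-z}}$$

It maps any real input $z$ to a value between 0 and 1, with $\sigma(0) = 0.5$. Here $z$ is assumed to stand for a neuron's weighted input plus its bias; the symbol is introduced only for this formula.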

### Simple Network Architecture

In the code below we declare a neural network that uses the sigmoid function and connects the input layer directly to the output layer. This results in a poor network that appears to be no better than flipping a coin at determining whether a student was admitted or not.

In [None]:
model = Sequential()
# Output layer
model.add(Dense(2, activation='sigmoid', input_dim=2))

# For a binary classification problem
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

model.fit(X, y, epochs=5, batch_size=100)

We can visualise this neural network using the VisualizeNN script, which shows the neurons and the weights connecting them. (Note that this visualisation tool does not show biases.)

In [None]:
import VisualizeNN as VisNN

In [None]:
# Draw the Neural Network with weights
network_structure = np.hstack(([X.shape[1]], [y_shape]))
weights = []
# Keep only the connection weights; skip the bias vectors.
for variable, values in zip(model.weights, model.get_weights()):
    if "bias" not in variable.name:
        weights.append(values)
network = VisNN.DrawNN(network_structure, weights)
network.draw()

### Complex Network Architecture

Now that we have shown that connecting the input straight to the output does not yield promising results, let us introduce a hidden layer with three neurons to see if we can train this network to find the relationship between the input and output.

In [None]:
model = Sequential()
# Hidden Layer 1
model.add(Dense(3, activation='sigmoid', input_dim=2))

# Output layer
model.add(Dense(2, activation='sigmoid'))

# For a binary classification problem
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

model.fit(X, y, epochs=5000, batch_size=100)

Sometimes it is useful to verify the model structure before going any further.

In [None]:
SVG(model_to_dot(model).create(prog='dot', format='svg'))

In [None]:
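# Collect the output size of every layer, then drop the last entry
# (the output layer) so that only the hidden layer sizes remain.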
layer_size = []
for layer in model.layers:
    layer_size.append(int(layer.get_output_at(0).shape[1]))
layer_size.pop()
print(layer_size)

Next, we can visualise the network (again without biases).

In [None]:
# Draw the Neural Network with weights
network_structure = np.hstack(([X.shape[1]], np.asarray(layer_size), [y_shape]))
weights = []
# Keep only the connection weights; skip the bias vectors.
for variable, values in zip(model.weights, model.get_weights()):
    if "bias" not in variable.name:
        weights.append(values)
network = VisNN.DrawNN(network_structure, weights)
network.draw()

We can also inspect the weight values of the network.

In [None]:
model.get_weights()

Finally, we can check that the network behaves as expected by passing in some test values.

In [None]:
test = np.array([[25, 25]])
print('true: 0, predicted:' + str(model.predict_classes(test)))
test = np.array([[100, 100]])
print('true: 1, predicted:' + str(model.predict_classes(test)))
test = np.array([[50, 50]])
print('true: 1/0, predicted:' + str(model.predict_classes(test)))
test = np.array([[60, 60]])
print('true: 1, predicted:' + str(model.predict_classes(test)))
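
For a network with two outputs like this one, turning the outputs into a class label amounts to picking the column with the largest value in each row. In older Keras releases `predict_classes` does this for you; newer releases no longer provide that method, so the step may need to be done by hand on the result of `model.predict`. The sketch below shows the idea with NumPy, using hypothetical output values rather than real predictions from the model above.

```python
import numpy as np

# Hypothetical network outputs for three students:
# column 0 = 'Not Admitted', column 1 = 'Admitted'.
probabilities = np.array([
    [0.90, 0.10],
    [0.20, 0.80],
    [0.45, 0.55],
])

# The predicted class is the column with the largest output in each row.
predicted = np.argmax(probabilities, axis=-1)
print(predicted)  # [0 1 1]
```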