# Neural Network Library Documentation

### Released: 08/05/2018

This documentation describes how to use the Neural Network Library.

To import the Neural Network Library, use the import statements below.

In [1]:
import nn
import numpy as np

# Data Format

The train function accepts datasets: it requires a train_set for training and optionally accepts a validation_set for evaluation during training. The evaluate function accepts a test_set. These sets correspond to the usual machine learning split into training, validation, and test data.

The train_set, test_set, and validation_set are all examples of a dataset. In this documentation, dataset has a specific meaning: it is a list containing two arrays. The first item is an array of feature vectors, and the second item is an array of label vectors. Each feature vector and label vector must be an array in its own right, even if it has only one entry.

The following summary should help explain how to put together an appropriate dataset.

In [None]:
# x is a numpy array where each element is itself an array of features.
# x.shape is (16, 4)
x = np.array([[ 0.25,  0.5 ,  0.75,  0.3 ],
              [ 0.75,  0.25,  0.6 ,  0.5 ],
              [ 0.25,  0.8 ,  0.2 ,  0.25],
              [ 0.7 ,  0.75,  0.1 ,  0.4 ],
              [ 0.2 ,  0.5 ,  0.75,  0.25],
              [ 0.5 ,  0.8 ,  0.75,  0.25],
              [ 0.25,  0.75,  0.5 ,  0.5 ],
              [ 0.1 ,  0.5 ,  0.75,  0.9 ],
              [ 0.7 ,  0.75,  0.4 ,  0.5 ],
              [ 0.25,  0.5 ,  0.75,  0.2 ],
              [ 0.1 ,  0.75,  0.7 ,  0.8 ],
              [ 0.5 ,  0.4 ,  0.75,  0.25],
              [ 0.75,  0.5 ,  0.5 ,  0.9 ],
              [ 0.6 ,  0.75,  0.6 ,  0.5 ],
              [ 0.2 ,  0.7 ,  0.25,  0.75],
              [ 0.25,  0.3 ,  0.75,  0.5 ]])

# y is a numpy array where each element is itself an array of labels. In this case, there is only one label.
# y.shape is (16, 1)
y = np.array([[ 0.6125],
              [ 1.0475],
              [ 0.5325],
              [ 0.915 ],
              [ 0.5525],
              [ 0.95  ],
              [ 0.7375],
              [ 1.01  ],
              [ 1.105 ],
              [ 0.5375],
              [ 1.015 ],
              [ 0.7125],
              [ 1.3125],
              [ 1.095 ],
              [ 0.6025],
              [ 0.6625]])

# train_set is a list of two arrays.
train_set = [x, y]
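
The format rules above can be checked before training. The sketch below is not part of the library: `check_dataset` is a hypothetical helper written only for this illustration, and it assumes that both arrays must be two-dimensional with matching row counts, as the description above implies. It also shows how a flat label vector can be reshaped so that each label becomes an array in its own right.

```python
import numpy as np

def check_dataset(dataset):
    """Raise ValueError if `dataset` is not a [features, labels] pair of 2-D arrays."""
    if len(dataset) != 2:
        raise ValueError("a dataset must contain exactly two arrays")
    x, y = (np.asarray(part) for part in dataset)
    if x.ndim != 2 or y.ndim != 2:
        raise ValueError("features and labels must both be 2-D arrays")
    if x.shape[0] != y.shape[0]:
        raise ValueError("features and labels must have the same number of rows")
    return x.shape, y.shape

# Labels that start out as a flat vector need one extra axis so that
# each label is an array in its own right.
features = np.array([[0.25, 0.5, 0.75, 0.3],
                     [0.75, 0.25, 0.6, 0.5]])
flat_labels = np.array([0.6125, 1.0475])
labels = flat_labels.reshape(-1, 1)   # shape (2,) becomes (2, 1)

small_set = [features, labels]
shapes = check_dataset(small_set)
```

Passing `[features, flat_labels]` to this helper raises a `ValueError`, which is the situation the "array in its own right" rule is meant to prevent.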

# Functions

### toy_data_wiggle

This function returns a toy dataset for a basic wiggly line. There are only 16 datapoints in the training set and 4 points in the test set. There is no validation set. Each feature vector contains 4 features, and each label vector contains 1 label.

In [None]:
nn.toy_data_wiggle()

In [None]:
train_set, valid_set, test_set = nn.toy_data_wiggle()

<hr>

### toy_data_mnist

This function returns a toy dataset for MNIST. There are 50,000 datapoints in the train_set, 10,000 datapoints in the valid_set, and 10,000 datapoints in the test_set. Each feature vector contains 784 features, one for each pixel of the 28x28 image. Each label vector contains 10 entries, a one-hot encoding of the numeral represented by the image.

In [None]:
nn.toy_data_mnist()

In [2]:
train_set, valid_set, test_set = nn.toy_data_mnist()
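
As an illustration of the label format described above, the following sketch builds a one-hot label vector for a single digit and decodes it again. The `one_hot` helper is hypothetical and is not part of the library; it only shows what a 10-entry label looks like.

```python
import numpy as np

def one_hot(digit, num_classes=10):
    """Return a label vector with a 1.0 at index `digit` and 0.0 elsewhere."""
    label = np.zeros(num_classes)
    label[digit] = 1.0
    return label

label_for_3 = one_hot(3)
# The position of the largest entry recovers the original digit.
decoded = int(np.argmax(label_for_3))
```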

<hr>

### init

This function initialises the neural network object. It requires a list of layer sizes.

In [None]:
__init__(layers=[784, 30, 10], 
        cost_function="quadratic", 
        activation_function="sigmoid", 
        regularisation_coefficient=0.001)

In [None]:
my_network = nn.network(layers=[784, 30, 10], 
                        cost_function="quadratic", 
                        activation_function="sigmoid", 
                        regularisation_coefficient=0.001)
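
To get a feel for what a list of layer sizes implies, the sketch below computes the weight and bias shapes of a fully connected network. It assumes fully connected layers and an (outputs, inputs) orientation for the weight matrices; the library's internal layout is not documented here and may differ. The helper names are hypothetical.

```python
def parameter_shapes(layers):
    """Return (weight_shapes, bias_shapes) for a fully connected network."""
    # One weight matrix per pair of adjacent layers, stored as (outputs, inputs).
    weight_shapes = [(n_out, n_in) for n_in, n_out in zip(layers[:-1], layers[1:])]
    # One bias per neuron in every layer except the input layer.
    bias_shapes = [(n_out,) for n_out in layers[1:]]
    return weight_shapes, bias_shapes

def parameter_count(layers):
    """Total number of trainable weights and biases."""
    weight_shapes, bias_shapes = parameter_shapes(layers)
    n_weights = sum(rows * cols for rows, cols in weight_shapes)
    n_biases = sum(size for (size,) in bias_shapes)
    return n_weights + n_biases

weight_shapes, bias_shapes = parameter_shapes([784, 30, 10])
total = parameter_count([784, 30, 10])
```

Under these assumptions, the example network [784, 30, 10] has 23,860 trainable parameters.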

<u>Parameters</u><br><br>
<b>layers : </b> list, required
<br><br>A list of layer sizes, including input and output layers.
<br>Example: [784, 30, 10]

<br>
<b>cost_function : </b> string, optional (default="quadratic"), values=(quadratic, cross-entropy)
<br><br>The cost function used for training and evaluation.
<br>Use cross-entropy with sigmoid neurons.

<br>
<b>activation_function : </b> string, optional (default="sigmoid"), values=(sigmoid, relu, tanh, arctan)
<br><br>The specified activation function for the non-linearity.

<br>
<b>regularisation_coefficient : </b> float, optional (default=0.0)
<br><br>The lambda coefficient for L2 regularisation. If left at 0, no regularisation is applied.
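
For reference, the names accepted by the parameters above correspond to standard functions. The sketch below gives their textbook definitions for a single example using plain Python. It is an assumption that the library uses exactly these forms; in particular, the scaling of the cost functions and of the L2 term varies between implementations.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def relu(z):
    return max(0.0, z)

# tanh and arctan are available directly as math.tanh and math.atan.
ACTIVATIONS = {"sigmoid": sigmoid, "relu": relu,
               "tanh": math.tanh, "arctan": math.atan}

def quadratic_cost(outputs, targets):
    """Half the sum of squared differences for one example."""
    return 0.5 * sum((a - y) ** 2 for a, y in zip(outputs, targets))

def cross_entropy_cost(outputs, targets):
    """Cross-entropy for one example; outputs must lie strictly between 0 and 1."""
    return -sum(y * math.log(a) + (1 - y) * math.log(1 - a)
                for a, y in zip(outputs, targets))

def l2_penalty(weights, regularisation_coefficient):
    """L2 term added to the cost: (lambda / 2) * sum of squared weights."""
    return 0.5 * regularisation_coefficient * sum(w ** 2 for w in weights)

outputs = [sigmoid(0.0), sigmoid(2.0)]
targets = [0.0, 1.0]
q = quadratic_cost(outputs, targets)
c = cross_entropy_cost(outputs, targets)
```

The cross-entropy cost takes logarithms of the outputs, so it needs values strictly between 0 and 1. Sigmoid neurons produce outputs in that range, which is consistent with the advice to pair cross-entropy with sigmoid.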

<hr>

### train

This function trains the network for a number of epochs. State is maintained between calls, so running the function twice in a row trains for double the number of epochs. The usual training parameters are required, including learning rate, number of epochs, and batch size. There are options for reporting training progress, including the frequency of evaluation, the method of output, and whether or not a validation set is used.

In [None]:
nn.train(train_set, epochs, learning_rate, 
         batch_size=32, progress=None, 
         evaluation_method="loss", evaluation_frequency=100, 
         validation_set=None)

In [None]:
my_network.train(train_set=train_set, epochs=100, learning_rate=0.1, 
                 batch_size=50, progress=None, 
                 evaluation_method="loss", evaluation_frequency=1, 
                 validation_set=valid_set)
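
The relationship between epochs, batch size, and the number of weight updates can be sketched as follows. The helper is hypothetical, and it assumes that a final partial batch still counts as one update; the library may handle a leftover batch differently.

```python
import math

def weight_updates(n_examples, batch_size, epochs):
    """Total number of mini-batch weight updates over a training run."""
    batches_per_epoch = math.ceil(n_examples / batch_size)
    return batches_per_epoch * epochs

# The MNIST training set with the settings used in the call above.
updates = weight_updates(n_examples=50_000, batch_size=50, epochs=100)
```

With 50,000 training examples, a batch size of 50 gives 1,000 updates per epoch, so 100 epochs amount to 100,000 updates.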

<u>Parameters</u><br><br>
<b>train_set : </b> dataset
<br><br>The training dataset. See the Data Format section above for details.

<br><b>epochs :</b> int, required
<br><br>The number of epochs to train for. (Complete passes through the training data.)

<br><b>learning_rate :</b> float, required
<br><br>The learning rate controls how large the weight updates are. 
<br>Small values may cause the network to learn very slowly. Large values may prevent convergence.

<br><b>batch_size :</b> int, optional (default=32)
<br><br>The number of training examples to use in Stochastic Gradient Descent every time the weights are updated.

<br><b>progress :</b> string, optional (default=None)
<br><br>Accepts one of three values: "graph", "percentage_only", or "text". If "graph" is specified, a graph of the training losses is displayed. If "text" is specified, the training loss is printed as training proceeds. In both cases, if validation data is provided through the validation_set parameter, the validation results are output as well.

<br><b>evaluation_method :</b> string, optional (default="loss"), values=(loss, accuracy)
<br><br>Determines whether results are displayed as a loss or accuracy. 
<br>If loss is chosen, the training and validation loss will be displayed, using whichever cost function was provided when the model was created. If accuracy is chosen, then accuracy will be displayed.

<br><b>evaluation_frequency :</b> int, optional (default=100)
<br><br>The frequency in epochs to evaluate progress. 
<br>Example: If number of epochs is 1000, and evaluation_frequency is 100, then ten evaluations will be made, giving you ten data points on the loss graphs.
<br>If too many evaluations are made, training will be slowed down.

<br><b>validation_set :</b> dataset, optional (default=None)
<br><br>The validation dataset. See the Data Format section above for details.
<br>If this is provided, the algorithm will evaluate the model on the validation data at each evaluation (see the evaluation_frequency parameter). The validation results are automatically plotted on the output loss graphs as a separate line.

<br><b>dropout :</b> float, optional (default=None)
<br><br>The keep probability for dropout.
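
The effect of evaluation_frequency can be sketched in a few lines. The helper below is hypothetical, and it assumes that an evaluation runs after every evaluation_frequency-th epoch; the exact timing inside the library is not documented here.

```python
def evaluation_epochs(epochs, evaluation_frequency):
    """Epoch numbers (1-based) after which an evaluation would run."""
    return list(range(evaluation_frequency, epochs + 1, evaluation_frequency))

# The example from the evaluation_frequency description above.
points = evaluation_epochs(epochs=1000, evaluation_frequency=100)
```

This reproduces the ten data points mentioned above. Setting evaluation_frequency to 1 would give one evaluation per epoch, which is the most detailed but also the slowest option.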

<br><br>
<u>Outputs</u>

Note that when calculating the training loss, the model uses only a portion of the training set. If a validation set is provided, this portion is the same size as the validation set; otherwise it contains 1000 examples.
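
The sizing rule for this portion can be sketched as below. The helper is hypothetical. It is an assumption that the rows are drawn at random and that the size is capped at the number of available training rows; the library may instead take the first rows or behave differently for small datasets.

```python
import numpy as np

def training_subset(train_set, validation_set=None, default_size=1000, seed=0):
    """Pick the rows of the training set used for the training-loss estimate."""
    x, y = train_set
    size = len(validation_set[0]) if validation_set is not None else default_size
    size = min(size, len(x))  # assumption: never ask for more rows than exist
    rng = np.random.default_rng(seed)
    rows = rng.choice(len(x), size=size, replace=False)
    # Index features and labels with the same rows so they stay aligned.
    return [x[rows], y[rows]]

x = np.arange(40.0).reshape(10, 4)
y = np.arange(10.0).reshape(10, 1)
valid = [x[:3], y[:3]]
subset = training_subset([x, y], validation_set=valid)
```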

<br><b>Textual Loss Output :</b> string
<br><br>Results are output as new lines. The result depends on the value provided for evaluation_method: it is either an accuracy or a loss. If loss is chosen, the cost function used is the one specified at model creation. The output does not indicate which cost function is being used.
<br>This output is produced only if the progress parameter is set to "text".
<br>An example is shown below.

Training/Validation Accuracy: 0.954/0.8796. <br>
Training/Validation Accuracy: 0.955/0.8799. <br>
Training/Validation Accuracy: 0.956/0.88. <br>
Training/Validation Accuracy: 0.959/0.8802. <br>
Training/Validation Accuracy: 0.959/0.8799. <br>
Training/Validation Accuracy: 0.959/0.8799. <br>
Training/Validation Accuracy: 0.96/0.8798. <br>
Training/Validation Accuracy: 0.96/0.8803. <br>
Training/Validation Accuracy: 0.96/0.8809. <br>
Training/Validation Accuracy: 0.96/0.8809.<br>

<br><b>Graphical Loss Output :</b> bokeh graph
<br><br>Results are output as a bokeh graph with a standard set of tools for navigation. The result depends on the value provided for evaluation_method: it is either an accuracy or a loss. If loss is chosen, the cost function used is the one specified at model creation. The graph does not indicate which cost function is being used.
<br>This output is produced only if the progress parameter is set to "graph".
<br>Whilst the network is training, progress is shown as a percentage.
<br>An example is shown below.

![title](graph_accuracy.png)

<hr>

### evaluate

This function evaluates the model once, using the provided test set, and a specified evaluation method. 

In [None]:
nn.evaluate(test_set, evaluation_method="loss")

<u>Parameters</u><br><br>
<b>test_set : </b> dataset
<br><br>The test dataset. See the Data Format section above for details.

<br>
<b>evaluation_method : </b> string, optional (default="loss"), values=(loss, accuracy)
<br><br>Specifies whether the evaluation is done using an accuracy or a loss. If loss is chosen, then the cost function used will be the one specified at model creation.

<br>
<u>Output</u><br><br>
<b>score : </b> float
<br><br>The loss if evaluation_method is "loss", or an accuracy score between 0 and 1 if evaluation_method is "accuracy".
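
For one-hot labels such as those in the MNIST dataset, an accuracy score is commonly computed by comparing the position of the largest network output with the position of the 1 in the label. The sketch below shows that calculation with a hypothetical helper. It is an assumption that the library computes accuracy this way, and this page does not say how accuracy is defined for regression-style data such as the wiggle dataset.

```python
import numpy as np

def classification_accuracy(predictions, labels):
    """Fraction of rows whose largest output matches the one-hot label."""
    predicted = np.argmax(predictions, axis=1)
    expected = np.argmax(labels, axis=1)
    return float(np.mean(predicted == expected))

predictions = np.array([[0.1, 0.8, 0.1],
                        [0.7, 0.2, 0.1],
                        [0.2, 0.3, 0.5],
                        [0.6, 0.3, 0.1]])
labels = np.array([[0, 1, 0],
                   [1, 0, 0],
                   [0, 0, 1],
                   [0, 1, 0]])
# Three of the four rows are classified correctly.
score = classification_accuracy(predictions, labels)
```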
<hr>

### search_parameter

This is an advanced user function that searches over a range of values for a supplied parameter.

In [None]:
nn.search_parameter(train_set, valid_set, 
                    search_parameter, levels, 
                    epochs, evaluation_frequency, 
                    model_parameters, train_parameters)

In [None]:
nn.search_parameter(train_set=train_set, valid_set=valid_set, 
                    search_parameter="regularisation_coefficient", levels=[0.0001, 0.0002, 0.0003, 0.0004, 0.0005],
                    epochs=100, evaluation_frequency=2,
                    model_parameters={"layers":[784, 30, 10],
                                      "cost_function":"quadratic",
                                      "activation_function":"sigmoid",
                                      "regularisation_coefficient":0.001},
                    train_parameters={"learning_rate":1,
                                      "batch_size":50,
                                      "dropout":0})

<u>Parameters</u><br><br>
<b>train_set : </b> dataset
<br><br>The training dataset. See the Data Format section above for details.

<br>
<b>valid_set : </b> dataset
<br><br>The validation dataset. See the Data Format section above for details. Note that this is NOT optional.

<br>
<b>search_parameter : </b> string
<br><br>The name of the parameter to search. It must be one of the model parameters or training parameters.

<br>
<b>levels : </b> list
<br><br>The different levels to search. A neural network will be created and trained for each value in the list. The types of the values will depend on which parameter is being searched.

<br>
<b>epochs : </b> int
<br><br>The number of epochs to train for. See train function for more information.

<br>
<b>evaluation_frequency : </b> int
<br><br>The frequency at which the model is evaluated. See train function for more information.

<br>
<b>model_parameters : </b> dictionary
<br><br>A dictionary of all of the model parameters. Must include entries for "layers", "cost_function", "activation_function", and "regularisation_coefficient". All parameters must be provided, even the one that is being searched. The value of the searched parameter is overridden internally by each of the levels.

<br>
<b>train_parameters : </b> dictionary
<br><br>A dictionary of all of the training parameters. Must include entries for "learning_rate", "batch_size", and "dropout". All parameters must be provided, even the one that is being searched. The value of the searched parameter is overridden internally by each of the levels.

<br><b>Graphical Accuracy Output :</b> bokeh graph
<br><br>Results are output as a bokeh graph with the validation accuracies plotted for each level. 
<br>An example is shown below.

![title](search_parameter.png)
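
The way each level overrides the searched parameter can be sketched as follows. `build_search_configs` is a hypothetical helper, not the library's implementation. It only illustrates the idea that one complete set of model and training parameters is produced per level, with the searched entry replaced.

```python
def build_search_configs(search_parameter, levels, model_parameters, train_parameters):
    """Return one (model_parameters, train_parameters) pair per level."""
    if search_parameter in model_parameters:
        target = "model"
    elif search_parameter in train_parameters:
        target = "train"
    else:
        raise ValueError(f"unknown parameter: {search_parameter}")

    configs = []
    for level in levels:
        # Copy so the caller's dictionaries are left untouched.
        model, train = dict(model_parameters), dict(train_parameters)
        (model if target == "model" else train)[search_parameter] = level
        configs.append((model, train))
    return configs

configs = build_search_configs(
    search_parameter="regularisation_coefficient",
    levels=[0.0001, 0.0002, 0.0003],
    model_parameters={"layers": [784, 30, 10],
                      "cost_function": "quadratic",
                      "activation_function": "sigmoid",
                      "regularisation_coefficient": 0.001},
    train_parameters={"learning_rate": 1, "batch_size": 50, "dropout": 0},
)
```

Each pair in `configs` describes one network to create and train. The original value of 0.001 never appears in the result, which is why a placeholder value for the searched parameter is sufficient.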

<hr>

### reset

This function resets the weights and biases in the network. It takes no parameters.

In [None]:
nn.reset()

<hr>


### show_settings

This function prints out the current settings of the model that were specified at model creation.

In [None]:
my_network.show_settings()

<hr>

### draw_network

This function draws a pretty picture of your network.

In [None]:
my_network.draw_network()

The output will be something like the image below.

![title](draw_network.png)