A multilayer perceptron (MLP) is an artificial neural network with one or more hidden layers between its input and output layers. Refer to the following figure:
Image from Karim, 2016. A multilayer perceptron with six input neurons, two hidden layers, and one output layer.
MLPs are fully connected: every node in one layer is connected to every node in the next. They are trained using backpropagation. MLPs are widely used for pattern classification, recognition, prediction, and approximation, and they can solve problems that are not linearly separable (Neuroph).
This implementation of an MLP is written in C and can perform multi-class classification. Each hidden layer and the output layer can use its own activation function, specified at runtime. Supported activation functions:
- identity: f(x) = x
- sigmoid: f(x) = 1 / (1 + e^(-x))
- tanh: f(x) = tanh(x)
- relu: f(x) = max(0, x)
- softmax: f(x_i) = e^(x_i) / Σ_j e^(x_j)
First, clone the project:
~$ git clone https://github.com/manoharmukku/multilayer-perceptron-in-c
Then change into the cloned directory and compile the project:
~$ cd multilayer-perceptron-in-c
~$ make
Then run the program with your desired parameters, for example:
~$ ./MLP 3 4,5,5 softmax,relu,tanh 1 sigmoid 0.01 10000 data/data_train.csv 1096 5 data/data_test.csv 275 5
Explanation of the program arguments:
Argument 0: Executable file name Ex: ./MLP
Argument 1: Number of hidden layers Ex: 3
Argument 2: Number of units in each hidden layer from left to right separated by comma (no spaces in-between) Ex: 4,5,5
Argument 3: Activation function of each hidden layer from left to right separated by comma (no spaces in-between) Ex: softmax,relu,tanh
Argument 4: Number of units in output layer (Specify 1 for binary classification and k for k-class multi-class classification) Ex: 1
Argument 5: Output activation function Ex: sigmoid
Argument 6: Learning rate parameter Ex: 0.01
Argument 7: Maximum number of iterations to run during training Ex: 10000
Argument 8: Path of the csv file containing the train dataset Ex: data/data_train.csv
Argument 9: Number of rows in the train dataset (Number of samples) Ex: 1096
Argument 10: Number of columns in the train dataset (Number of input features + 1 (output variable)). The output variable should always be in the last column Ex: 5
Argument 11: Path of the csv file containing the test dataset Ex: data/data_test.csv
Argument 12: Number of rows in the test dataset (Number of samples) Ex: 275
Argument 13: Number of columns in the test dataset (Number of input features + 1 (output variable)). The output variable should always be in the last column Ex: 5
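Arguments 2 and 3 pass per-layer settings as comma-separated lists with no spaces. A parser for a size list like "4,5,5" could be sketched as follows (the helper name and buffer size are illustrative, not taken from the repository):

```c
#include <stdlib.h>
#include <string.h>

/* Parse a comma-separated list such as "4,5,5" (argument 2) into an
   int array. Returns the number of values found, at most max_layers. */
int parse_layer_sizes(const char *arg, int *sizes, int max_layers) {
    char buf[256];                       /* local copy: strtok modifies its input */
    strncpy(buf, arg, sizeof buf - 1);
    buf[sizeof buf - 1] = '\0';
    int n = 0;
    for (char *tok = strtok(buf, ","); tok != NULL && n < max_layers;
         tok = strtok(NULL, ","))
        sizes[n++] = atoi(tok);
    return n;
}
```

The same tokenization idea applies to the activation-name list in argument 3, with the tokens compared against the supported function names instead of converted to integers.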
- The datasets should be in .csv format
- All the feature values should be real (numerical)
- There should not be any header row
- There should not be any index column specifying the row number
- The output variable should always be in the last column
- For binary classification, the output variable can take only the values 0 or 1
- For multi-class classification with k classes, the output variable can take only the values 1, 2, ..., k
- Make sure to specify the correct paths of the data files in the arguments
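Given that layout (real-valued features, label in the last column, no header or index), one row of the file could be read like this. This is a hedged sketch of the expected format, not the repository's actual loader:

```c
#include <stdio.h>

/* Read one CSV row: n_cols - 1 comma-separated real features followed
   by the class label in the last column. Returns 1 on success, 0 on
   end-of-file or malformed input. */
int read_csv_row(FILE *fp, double *features, int n_cols, double *label) {
    for (int i = 0; i < n_cols - 1; i++)
        if (fscanf(fp, "%lf,", &features[i]) != 1)
            return 0;
    if (fscanf(fp, "%lf", label) != 1)
        return 0;
    return 1;
}
```

For the banknote dataset used below, n_cols is 5: four features plus the class label.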
Example dataset used for training and testing: Banknote authentication dataset from the UCI Machine Learning Repository
References:
- https://www.coursera.org/lecture/machine-learning/backpropagation-algorithm-1z9WW
- https://gist.github.com/amirmasoudabdol/f1efda29760b97f16e0e
- https://medium.com/@14prakash/back-propagation-is-very-simple-who-made-it-complicated-97b794c97e5c
- https://stackoverflow.com/questions/33058848/generate-a-random-double-between-1-and-1
- https://madalinabuzau.github.io/2016/11/29/gradient-descent-on-a-softmax-cross-entropy-cost-function.html
- http://dai.fmph.uniba.sk/courses/NN/haykin.neural-networks.3ed.2009.pdf
- https://jamesmccaffrey.wordpress.com/2017/06/23/two-ways-to-deal-with-the-derivative-of-the-relu-function/
- https://eli.thegreenplace.net/2016/the-softmax-function-and-its-derivative
- https://theclevermachine.wordpress.com/2014/09/08/derivation-derivatives-for-common-neural-network-activation-functions/
- https://www.geeksforgeeks.org/shuffle-a-given-array/