[MRG] Multi-layer perceptron (MLP) #2120
PR closed in favor of #3204
This is an extension of @larsmans' code.
A multilayer perceptron (MLP) is a feedforward artificial neural network model that learns a function f(X) = y, where X is the input and y is the output. An MLP consists of multiple layers, typically an input layer, one or more hidden layers, and an output layer, where each layer is fully connected to the next. This is a classic algorithm that has been used extensively in neural network research.
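For illustration, here is a minimal NumPy sketch of the forward pass through a single hidden layer. The weights, biases, and the logistic activation here are placeholders to show the idea, not this PR's actual implementation:

```python
import numpy as np

def logistic(z):
    # element-wise logistic (sigmoid) activation
    return 1.0 / (1.0 + np.exp(-z))

def forward(X, W_hidden, b_hidden, W_output, b_output):
    # hidden layer: affine transform followed by a non-linearity
    hidden = logistic(np.dot(X, W_hidden) + b_hidden)
    # output layer: another affine transform (a softmax would follow for classification)
    return np.dot(hidden, W_output) + b_output

# toy example: 5 samples, 3 features, 4 hidden units, 2 outputs
rng = np.random.RandomState(0)
X = rng.rand(5, 3)
y_pred = forward(X,
                 rng.rand(3, 4), np.zeros(4),
                 rng.rand(4, 2), np.zeros(2))
```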
Code check-out:
Tutorial link: http://easymachinelearning.blogspot.com/p/multi-layer-perceptron-tutorial.html
Sample benchmark: `MLP` on scikit-learn's `Digits` dataset gives:
- Score for `tanh-based sgd`: 0.981
- Score for `logistic-based sgd`: 0.987
- Score for `tanh-based l-bfgs`: 0.994
- Score for `logistic-based l-bfgs`: 1.000
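A rough sketch of how such a benchmark could be run. The import path and the `n_hidden`/`activation`/`algorithm`/`max_iter` parameter names are assumptions about this PR's API rather than confirmed details:

```python
from sklearn.datasets import load_digits
# hypothetical location of the class added in this PR
from sklearn.neural_network import MLPClassifier

digits = load_digits()
X, y = digits.data, digits.target

# score each activation/optimizer combination on a simple held-out split
for activation in ("tanh", "logistic"):
    for algorithm in ("sgd", "l-bfgs"):
        clf = MLPClassifier(n_hidden=100, activation=activation,
                            algorithm=algorithm, max_iter=100)
        clf.fit(X[:1500], y[:1500])
        print(activation, algorithm, clf.score(X[1500:], y[1500:]))
```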
I read that Cython code is easy to produce (just a matter of adding some prefixes and compiling the code). I will Cythonize the code and see whether it adds any benefit.
Thanks for the review
@larsmans, using all 20 categories of 20news (not the watered-down version), modeled with scikit-learn's tf-idf vectorizer, which yields an input matrix of 18,828 rows and 74,324 columns (features), and with 100 hidden neurons, the algorithm fits the whole sparse matrix at around 1 second per iteration. That seems like good enough speed for an MLP on data this large, but I might be wrong.
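A sketch of how that timing experiment might look; the `MLPClassifier` import path and its parameter names are assumptions about this PR's API:

```python
import time
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neural_network import MLPClassifier  # hypothetical location of this PR's class

# all 20 categories of the 20 newsgroups corpus
data = fetch_20newsgroups(subset='all')
X = TfidfVectorizer().fit_transform(data.data)   # sparse tf-idf matrix
y = data.target

# n_iter below is a stand-in; divide the total fit time by it to estimate seconds/iteration
n_iter = 10
clf = MLPClassifier(n_hidden=100, max_iter=n_iter)
start = time.time()
clf.fit(X, y)
print("seconds per iteration:", (time.time() - start) / n_iter)
```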
Right now, I have applied the code to 4 categories of the 20news corpus; with 100 iterations and 100 hidden neurons, l_bfgs achieved an average f1-score of 0.87. I might need to let the code run for a long time before it converges (it doesn't converge even after 300 iterations), so I suspect there is a bug in my code.
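For reference, a short sketch of how an average f1-score on a 4-category subset could be computed. The comment does not say which 4 categories were used, so the ones below are a guess, and `MLPClassifier` again stands in for this PR's class:

```python
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import f1_score
from sklearn.neural_network import MLPClassifier  # hypothetical location of this PR's class

# hypothetical choice of 4 categories
categories = ['alt.atheism', 'comp.graphics', 'sci.space', 'talk.religion.misc']
train = fetch_20newsgroups(subset='train', categories=categories)
test = fetch_20newsgroups(subset='test', categories=categories)

vec = TfidfVectorizer()
X_train, X_test = vec.fit_transform(train.data), vec.transform(test.data)

clf = MLPClassifier(n_hidden=100, max_iter=100)  # parameter names are assumptions
clf.fit(X_train, train.target)
print(f1_score(test.target, clf.predict(X_test), average='macro'))
```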
In your pull request you mentioned that you tested your code on a small subset of the 20news corpus and achieved similar results; did you use 4 categories too?
Sorry for being slow to respond; I had a bug in the code that took time to fix because the transposed
Moreover, I just committed a lot of changes, including,
The performance benchmark on the
Will post the test results on the
Some of the remaining TODOs would be:
Thank you for your great reviews!
Updates:
- Replaced scipy `minimize` with `l-bfgs`, so as not to compel users to install scipy 0.13.0+
- Renames done, as per the comments
- Split the class into `MLPClassifier` and `MLPRegressor`
- Set `square_loss` as the default for `MLPRegressor`
- Set `log` as the default for `MLPClassifier`
- Fixed long lines (some lines might still be long)
- Added learning rates: `optimal`, `constant`, and `invscaling` (see the sketch after this list)
- New benchmark on the `digits` dataset (100 iterations, 100 hidden neurons, `log` loss):
  - `tanh-based SGD`: `0.957`
  - `tanh-based l_bfgs`: `0.985`
  - `logistic-based SGD`: `0.992`
  - `logistic-based l_bfgs`: `1.000` (converged in `70` iterations)
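The sketch below shows what the three learning-rate schedules typically compute, assuming they mirror the conventions of scikit-learn's `SGDClassifier`; the exact formulas used in this PR are not shown in the comment, so treat this as an assumption:

```python
def learning_rate(schedule, t, eta0=0.01, power_t=0.25, alpha=0.0001, t0=1.0):
    """Per-iteration step size for iteration t (assumed SGDClassifier-style conventions)."""
    if schedule == 'constant':
        # fixed step size
        return eta0
    if schedule == 'invscaling':
        # step size decays as an inverse power of the iteration count
        return eta0 / (t ** power_t)
    if schedule == 'optimal':
        # step size scaled by the regularization strength alpha
        return 1.0 / (alpha * (t + t0))
    raise ValueError("unknown learning rate schedule: %r" % schedule)
```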
These are interesting results because
The documentation will be updated once the code is deemed reliable.
Thank you for your great reviews and tips! :)
Hi everyone, sorry for being inactive on this; it's been a laborious 2 weeks :). I have updated the code by improving the documentation and eliminating
The code is accepted by the Travis build; however, `MLPRegressor` is yet to be implemented, but that will be done soon.
PS: I'm also writing a blog that aims to help 'newcomers to the field' (maybe even practitioners) get started with neural networks.
Thanks in advance!
Okay, done :). I have fixed the binary classification; I'm getting a 100% score with
Happily, it passed the Travis test. What is left now is to re-use some of scikit-learn's Cython-based loss functions (and logistic) for improved speed, and to implement MLP for regression.
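For reference, plain NumPy versions of the two losses mentioned above (`log` and `square_loss`); this is only a sketch of what those losses compute, not the PR's code, and the 1/2 factor in the squared loss is one common convention:

```python
import numpy as np

def log_loss(y_true, y_prob, eps=1e-10):
    # binary cross-entropy; y_prob are predicted probabilities in (0, 1)
    y_prob = np.clip(y_prob, eps, 1 - eps)
    return -np.mean(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))

def squared_loss(y_true, y_pred):
    # half mean squared error, a common default loss for regression
    return np.mean((y_true - y_pred) ** 2) / 2.0
```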
In addition, the packing and unpacking methods are to be improved.
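As an illustration of what packing and unpacking refer to, here is a hypothetical sketch (not the PR's code): l_bfgs optimizes a single flat parameter vector, so the layer weight matrices and bias vectors have to be flattened before optimization and reshaped back afterwards.

```python
import numpy as np

def pack(*arrays):
    # concatenate all weight matrices and bias vectors into one flat vector
    return np.concatenate([a.ravel() for a in arrays])

def unpack(theta, shapes):
    # cut the flat vector back into arrays with the given shapes
    arrays, start = [], 0
    for shape in shapes:
        size = int(np.prod(shape))
        arrays.append(theta[start:start + size].reshape(shape))
        start += size
    return arrays

# example: one hidden layer with 3 inputs, 4 hidden units, 2 outputs
shapes = [(3, 4), (4,), (4, 2), (2,)]
theta = pack(*[np.zeros(s) for s in shapes])
W_hidden, b_hidden, W_output, b_output = unpack(theta, shapes)
```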