Add dropout for neural networks #130
Comments
Hi Alex, I have a change set in review that includes ensemble techniques, and I'm happy to help with this one as well. Best,
Sure, I'd love to have the help. In practice, although dropout achieves something equivalent to an ensemble, it should be (I think?) much easier to code. We won't have to actually track a large number of different models, since they all occupy the same weight space. During each training iteration you just give each hidden-layer neuron a 50% chance of sending nothing, regardless of its input. Then during actual classification you halve each contribution coming from a hidden-layer neuron, since all neurons are present and the expected value would otherwise double. At least, that is my understanding of it. I'm not an expert :)
Actually, as I reread the paper, it looks like some adjustments need to be made to the weight normalization rules, and some other things. Not as simple as I made it sound in the last comment, but not too hard either.
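For readers following the thread, here is a minimal sketch of the mechanism described in the two comments above: each hidden unit is dropped with probability 0.5 during training, and at prediction time every unit is kept but its output is scaled by the keep probability. This is plain illustrative Java, not Encog code, and all class and method names are made up.

```java
import java.util.Arrays;
import java.util.Random;

/** Illustrative sketch of dropout on one hidden layer; not Encog code. */
public class DropoutSketch {
    private static final double DROP_PROBABILITY = 0.5;
    private static final Random RNG = new Random(42);

    /** Training-time forward pass: each hidden unit is dropped with probability 0.5. */
    static double[] hiddenForwardTraining(double[] input, double[][] weights) {
        double[] out = new double[weights.length];
        for (int j = 0; j < weights.length; j++) {
            if (RNG.nextDouble() < DROP_PROBABILITY) {
                out[j] = 0.0; // dropped unit sends nothing this iteration
                continue;
            }
            double sum = 0.0;
            for (int i = 0; i < input.length; i++) {
                sum += weights[j][i] * input[i];
            }
            out[j] = Math.max(0.0, sum); // e.g. a ReLU activation
        }
        return out;
    }

    /** Prediction-time forward pass: all units present, outputs scaled by the keep probability. */
    static double[] hiddenForwardPrediction(double[] input, double[][] weights) {
        double[] out = new double[weights.length];
        for (int j = 0; j < weights.length; j++) {
            double sum = 0.0;
            for (int i = 0; i < input.length; i++) {
                sum += weights[j][i] * input[i];
            }
            out[j] = (1.0 - DROP_PROBABILITY) * Math.max(0.0, sum);
        }
        return out;
    }

    public static void main(String[] args) {
        double[] input = {0.2, 0.7, 0.1};
        double[][] weights = {{0.5, -0.3, 0.8}, {0.1, 0.4, -0.6}};
        System.out.println(Arrays.toString(hiddenForwardTraining(input, weights)));
        System.out.println(Arrays.toString(hiddenForwardPrediction(input, weights)));
    }
}
```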
Agreed. If you make a fork for a submission, I'd be happy to help. On 30 January 2013 12:24, Alex Robbins notifications@github.com wrote:
Alan
Sorry about joining this a bit late. But if you would like to contribute code for dropout, I would be interested in adding it to Encog.
Although the thread went a bit silent, I have it on my list to add dropout. Has there been any progress on your end in the meantime?
Sorry. No progress to report. Priorities changed at work and I ended up working on other things instead.
No worries, I'll up the priority on adding it then. On 20 March 2013 04:15, Alex Robbins notifications@github.com wrote:
Alan
Thanks both of you. Let me know if you need anything.
Just to let you both know I've started work on this in my private fork today, ETA 2 weeks max. Thanks for your patience!
Alex, in case you wanted more documentation on dropout, I found the original thesis from one of the paper authors at http://www.cs.toronto.edu/~nitish/msc_thesis.pdf
Very interesting. Thanks for the link!
Okay let me know if you need anything and thanks!
Hi Alex, nitbix, jeff. Can you tell me how to implement dropout in a concrete manner?
Hello! I already have a merge request with dropout on a few training algorithms. Alan
Thanks Alan
Oh I see! Your best bet is probably to read the original paper from Hinton's group (http://arxiv.org/abs/1207.0580).
By the way, I do hope to get this merged into the next Encog version. I just have not had the time. There are quite a few changes, and some of them break existing programs based on Encog... so adding it is non-trivial.
Hi Jeff, I completely understand. If you need any help, or want to point out some of the areas I could look at, just let me know. Thanks! On 27 February 2014 04:14, Jeff Heaton notifications@github.com wrote:
Alan
Hi Jeff, Just to let you know that I've merged the latest master HEAD into my fork. Best, Alan On 27 February 2014 at 09:26, Alan Mosca nitbix@nitbix.com wrote:
Alan
Thank you very much! I will take a look. This will be my first priority. Also, my paper on Encog was accepted by the JMLR. I cited your masters thesis as @mastersthesis{Mosca:mscdissertation, ...}. On Wed, Mar 4, 2015 at 12:52 PM, nitbix notifications@github.com wrote:
Hi Jeff, Thanks for the citation, that's perfect. Let me know if there's anything else I can help with. Thanks, Alan
Hi all, is there a way to enable dropout on a layer in the current master?
Yes, in master the constructor for BasicLayer now takes an additional dropout probability parameter. On 19 August 2015 at 09:24, Lachlan Phillips notifications@github.com wrote:
Alan
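A hedged sketch of what building such a network might look like: the extra trailing argument on BasicLayer is an assumption based on the comment above (the fork's exact constructor signature may differ, so this may not compile against stock Encog); the rest is standard Encog 3.x network construction.

```java
import org.encog.engine.network.activation.ActivationSigmoid;
import org.encog.neural.networks.BasicNetwork;
import org.encog.neural.networks.layers.BasicLayer;

public class DropoutNetworkSketch {
    public static void main(String[] args) {
        BasicNetwork network = new BasicNetwork();
        // Input layer: no activation, bias neuron, 10 inputs (standard Encog).
        network.addLayer(new BasicLayer(null, true, 10));
        // Hidden layer: the trailing 0.5 is the assumed per-layer dropout
        // probability added in the fork; the actual parameter may differ.
        network.addLayer(new BasicLayer(new ActivationSigmoid(), true, 20, 0.5));
        // Output layer: no dropout.
        network.addLayer(new BasicLayer(new ActivationSigmoid(), false, 1));
        network.getStructure().finalizeStructure();
        network.reset();
    }
}
```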
This was merged into Encog previously.
Dropout is the concept of training a neural network whose hidden-layer neurons each have only a 0.5 probability of being present in any given training iteration. The end result is that you are effectively training an ensemble of neural networks with massive weight sharing. This speeds training greatly (relative to training the entire ensemble separately), and eliminates some of the overfitting often associated with neural nets. For instance, neural nets with dropout never need to have their training stopped early to avoid overfitting.
Discussed in this paper: http://arxiv.org/abs/1207.0580
And in this Google tech talk: http://www.youtube.com/watch?v=DleXA5ADG78
I'd be willing to contribute code if this is something you'd be willing to include. Thanks!
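The ensemble interpretation above can be illustrated with a small, self-contained sketch (illustrative names, not Encog code): for a single hidden layer feeding a linear output, averaging predictions over many randomly sampled 0.5 dropout masks converges to the single pass in which every hidden unit is kept and its contribution is halved.

```java
import java.util.Random;

/** Illustrative-only sketch of the dropout-as-ensemble view; not Encog code. */
public class DropoutEnsembleSketch {
    static final Random RNG = new Random(7);

    // One hidden ReLU layer feeding a single linear output.
    // mask == null means every hidden unit is present.
    static double predict(double[] x, double[][] w1, double[] w2,
                          boolean[] mask, double scale) {
        double y = 0.0;
        for (int j = 0; j < w1.length; j++) {
            if (mask != null && !mask[j]) continue; // unit dropped
            double h = 0.0;
            for (int i = 0; i < x.length; i++) h += w1[j][i] * x[i];
            y += w2[j] * Math.max(0.0, h) * scale;
        }
        return y;
    }

    public static void main(String[] args) {
        double[] x = {0.3, -0.2, 0.9};
        double[][] w1 = {{0.4, 0.1, -0.5}, {-0.2, 0.7, 0.3},
                         {0.6, -0.1, 0.2}, {0.05, 0.4, -0.3}};
        double[] w2 = {0.8, -0.4, 0.5, 0.3};

        // Monte-Carlo "ensemble": sample many 0.5 dropout masks and average.
        int samples = 200000;
        double sum = 0.0;
        for (int s = 0; s < samples; s++) {
            boolean[] mask = new boolean[w1.length];
            for (int j = 0; j < mask.length; j++) mask[j] = RNG.nextBoolean();
            sum += predict(x, w1, w2, mask, 1.0);
        }
        System.out.println("ensemble average ~ " + (sum / samples));

        // Single pass with every unit present and hidden contributions halved.
        System.out.println("halved-weights pass = " + predict(x, w1, w2, null, 0.5));
    }
}
```

The two printed numbers should agree closely, which is the sense in which the halved single pass stands in for the whole weight-sharing ensemble.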