
Add dropout for neural networks #130

Closed
alexrobbins opened this issue Jan 29, 2013 · 25 comments

Comments

@alexrobbins

Dropout is the concept of training a neural network whose hidden-layer neurons each have only a 0.5 probability of being present in any given training pass. The end result is that you are effectively training an ensemble of neural networks with massive weight sharing. This speeds training greatly (relative to training the entire ensemble separately) and eliminates much of the overfitting often associated with neural nets. For instance, neural nets with dropout generally do not need to have their training stopped early to avoid overfitting.
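
To make the "ensemble with massive weight sharing" picture concrete, here is a minimal, framework-independent Java sketch (hypothetical helper names, not Encog code): every training pass samples a fresh mask over the hidden units, which selects one thinned sub-network, and all of those sub-networks share the same underlying weights.

```java
import java.util.Random;

// Hypothetical sketch (not Encog code): each training pass samples a random
// mask over the hidden units; a dropped unit contributes nothing on that pass.
public final class DropoutMaskSketch {
    private static final Random RNG = new Random();

    // Sample a mask in which each hidden unit is present with probability 0.5.
    static boolean[] sampleHiddenUnitMask(int hiddenCount) {
        boolean[] present = new boolean[hiddenCount];
        for (int i = 0; i < hiddenCount; i++) {
            present[i] = RNG.nextDouble() < 0.5;
        }
        return present;
    }

    // Zero the activations of the dropped units for this pass.
    static double[] maskActivations(double[] activations, boolean[] present) {
        double[] out = new double[activations.length];
        for (int i = 0; i < activations.length; i++) {
            out[i] = present[i] ? activations[i] : 0.0;
        }
        return out;
    }
}
```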

Discussed in this paper: http://arxiv.org/abs/1207.0580
And in this Google tech talk: http://www.youtube.com/watch?v=DleXA5ADG78

I'd be happy to contribute code if this is something you'd be willing to include. Thanks!

@nitbix (Contributor) commented Jan 29, 2013

Hi Alex,

I have a change set in review that includes ensemble techniques, and I'm using that code for a review paper I'm working on. One of the things I wanted to add later in the process was dropout. Maybe we can pool resources and work on it together? I would be more than happy to help! If you want to see what I did for ensembles, look at my submission (there's probably some cleanup still to do, but the substance is mostly there).

Best,
Alan

@alexrobbins (Author)

Sure, I'd love to have the help. In practice, although dropout achieves something equivalent to an ensemble, it should be (I think?) much easier to code. We won't have to track a large number of different models, since they all share the same weight space. During each training iteration you just give each hidden-layer neuron a 50% chance of sending nothing, regardless of its input. Then during the actual classification you halve each input that comes from a hidden-layer neuron, since all neurons are present and the expected value would otherwise double.

At least, that is my understanding of it. I'm not an expert :)
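
A short, hypothetical Java sketch of the recipe described above (illustrative names, not Encog code): drop each hidden output with probability 0.5 during training, and scale hidden outputs by 0.5 during classification so the expected input to the next layer stays the same.

```java
import java.util.Random;

// Hypothetical sketch (not Encog code): drop hidden outputs at training time,
// halve them at classification time.
public final class DropoutForwardSketch {
    private static final double KEEP_PROBABILITY = 0.5;
    private static final Random RNG = new Random();

    // Training-time step: each hidden output has a 50% chance of sending nothing.
    static double[] hiddenOutputsForTraining(double[] hiddenOutputs) {
        double[] out = new double[hiddenOutputs.length];
        for (int i = 0; i < hiddenOutputs.length; i++) {
            out[i] = (RNG.nextDouble() < KEEP_PROBABILITY) ? hiddenOutputs[i] : 0.0;
        }
        return out;
    }

    // Classification-time step: all hidden units are present, so their outputs
    // are scaled by the keep probability (here, divided by 2) to keep the
    // expected input to the next layer the same as during training.
    static double[] hiddenOutputsForClassification(double[] hiddenOutputs) {
        double[] out = new double[hiddenOutputs.length];
        for (int i = 0; i < hiddenOutputs.length; i++) {
            out[i] = hiddenOutputs[i] * KEEP_PROBABILITY;
        }
        return out;
    }
}
```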

@alexrobbins (Author)

Actually, as I reread the paper, it looks like some adjustments need to be made to the weight normalization rules, and some other things. Not as simple as I made it sound in the last comment, but not too hard either.
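
For reference, one of the adjustments the paper describes is an upper bound on the L2 norm of each hidden unit's incoming weight vector rather than a standard weight penalty. Below is a hypothetical sketch of that max-norm constraint (illustrative names, not Encog code).

```java
// Hypothetical sketch (not Encog code) of the max-norm weight constraint from the
// paper: after each update, if the L2 norm of a unit's incoming weight vector
// exceeds an upper bound, the vector is rescaled back onto that bound.
public final class MaxNormConstraintSketch {

    // incomingWeights[i] holds the incoming weight vector of hidden unit i.
    static void constrainIncomingWeights(double[][] incomingWeights, double maxNorm) {
        for (double[] w : incomingWeights) {
            double sumSquares = 0.0;
            for (double v : w) {
                sumSquares += v * v;
            }
            double norm = Math.sqrt(sumSquares);
            if (norm > maxNorm) {
                double scale = maxNorm / norm;
                for (int j = 0; j < w.length; j++) {
                    w[j] *= scale;   // project the weight vector back onto the bound
                }
            }
        }
    }
}
```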

@nitbix (Contributor) commented Feb 7, 2013

Agreed. If you make a fork for a submission, I'd be happy to help.

Alan

@jeffheaton (Owner)

Sorry about joining this a bit late, but if you would like to contribute code for dropout, I would be interested in adding it to Encog.

@nitbix (Contributor) commented Mar 20, 2013

Although the thread went a bit silent, I have it on my list to add dropout within the next two months, unless someone else gets there before me, given that I'm going to need it for a paper I'm writing. Alex, did you end up doing anything?

@alexrobbins (Author)

Sorry. No progress to report. Priorities changed at work and I ended up
working on something else.

@nitbix (Contributor) commented Mar 20, 2013

No worries, I'll up the priority on adding it then.

Alan

@jeffheaton (Owner)

Thanks both of you. Let me know if you need anything.

@nitbix (Contributor) commented May 10, 2013

Just to let you both know I've started work on this in my private fork today, ETA 2 weeks max. Thanks for your patience!

@nitbix (Contributor) commented May 10, 2013

Alex, in case you wanted more documentation on dropout, I found the original thesis from one of the paper authors at http://www.cs.toronto.edu/~nitish/msc_thesis.pdf

@alexrobbins (Author)

Very interesting. Thanks for the link!

@jeffheaton (Owner)

Okay, let me know if you need anything, and thanks!

@challarao

Hi Alex, nitbix, Jeff. Can you tell me how to implement dropout in concrete terms?

@nitbix (Contributor) commented Feb 25, 2014

Hello!

I already have a merge request with dropout on a few training algorithms. If you tell me which ones you are interested in, I'll make sure I have those added. You'll find the unmerged fork in my repos, but it comes without warranty! Feel free to send suggestions/fixes and improvements ;)

Alan

@challarao

Thanks, Alan.
I'm currently not using Encog; I'm just trying to understand dropout, hence the last comment.

@nitbix (Contributor) commented Feb 25, 2014

Oh, I see! Your best bet is probably to read the original paper from Hinton et al. first, if you haven't already. I can dig it up for you if you don't have it.

@jeffheaton (Owner)

By the way, I do hope to get this merged into the next Encog version. I just have not had the time. There are quite a few changes, and some break existing programs based on Encog, so adding it is non-trivial.

@nitbix (Contributor) commented Feb 27, 2014

Hi Jeff,

I completely understand. If you need any help, or want to point out some of the breakages, I can try to fix them.

Thanks!

Alan

@nitbix (Contributor) commented Mar 4, 2015

Hi Jeff,

Just to let you know that I've merged the latest master HEAD into
https://github.com/nitbix/encog-java-core master HEAD, so finishing the
pull request should be a lot less work for you.

Best,

Alan

@jeffheaton (Owner)

Thank you very much! I will take a look. This will be my first priority
on the next Encog version.

Also, my paper on Encog was accepted by JMLR. I cited your master's thesis for the dropout/regularization that you added to the Encog project. Is this how you wanted it? I can still change it; it will be in the final publication queue for several months, I am sure. Here is the BibTeX I used.

@mastersthesis{Mosca:mscdissertation,
  author = {Mosca, A.},
  title = {Extending Encog: A study on classifier ensemble techniques},
  school = {Birkbeck, University of London},
  year = {2012}
}

@nitbix (Contributor) commented Mar 4, 2015

Hi Jeff,

Thanks for the citation, that's perfect. Let me know if there's anything
else I can do to help with getting everything merged.

Thanks,

Alan

@thorinii

Hi all,
Did this get completed?

@nitbix (Contributor) commented Aug 19, 2015

Yes, in master the constructor for BasicLayer now takes an additional
dropoutRate, which gets used by the training algorithms.
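
Usage would look roughly like the sketch below. This assumes the dropoutRate argument is simply appended to the usual (activation, hasBias, neuronCount) BasicLayer constructor, so check the current BasicLayer source for the exact signature.

```java
import org.encog.engine.network.activation.ActivationSigmoid;
import org.encog.neural.networks.BasicNetwork;
import org.encog.neural.networks.layers.BasicLayer;

// Sketch only: the 4-argument BasicLayer constructor with a dropoutRate is an
// assumption based on the description above; verify against the current source.
public final class DropoutNetworkExample {
    public static void main(String[] args) {
        BasicNetwork network = new BasicNetwork();
        network.addLayer(new BasicLayer(null, true, 10));                          // input layer, no dropout
        network.addLayer(new BasicLayer(new ActivationSigmoid(), true, 50, 0.5));  // hidden layer, 50% dropout
        network.addLayer(new BasicLayer(new ActivationSigmoid(), false, 1));       // output layer
        network.getStructure().finalizeStructure();
        network.reset();
    }
}
```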

Alan

@jeffheaton (Owner)

This was merged with Encog previously.
