Add coffee break to allow for accumulation of statistics and reproduce ~100 errors on MNIST #6
Conversation
This commit changes the way coffee breaks work: each break now returns a dict in which statistics are recorded. These dicts can then be recorded by the AccumulateStatistics break and saved to an HDF5 file. This is quite useful for monitoring training.
It can be used to plot data that was written by the AccumulateStatistics break, and it even works during training. This is analogous to the plot_monitor.py script of pylearn2.
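The accumulation pattern described above can be sketched roughly as follows. This is a hypothetical Python illustration only (Mocha itself is written in Julia, and the class and key names here are made up, not the project's actual API): each coffee break reports a dict of statistics for the current iteration, and an accumulator merges them into a history keyed by iteration that can later be dumped to a file (e.g. HDF5) and plotted.

```python
class AccumulateStatistics:
    def __init__(self):
        # iteration -> merged dict of all statistics reported at that step
        self.history = {}

    def record(self, iteration, stats):
        # Merge the dict returned by one coffee break into this
        # iteration's entry, so several breaks can report at the same step.
        self.history.setdefault(iteration, {}).update(stats)

    def series(self, key):
        # Return (iterations, values) for one statistic, sorted by
        # iteration, skipping iterations where the key was not reported.
        # This is the shape a plotting script would consume.
        its = sorted(i for i, s in self.history.items() if key in s)
        return its, [self.history[i][key] for i in its]


acc = AccumulateStatistics()
acc.record(100, {"validation-accuracy": 0.92})
acc.record(100, {"objective": 0.31})
acc.record(200, {"validation-accuracy": 0.95})

print(acc.series("validation-accuracy"))  # ([100, 200], [0.92, 0.95])
```

A plotting tool then only needs to iterate over the stored keys and draw one curve per statistic, which is what makes monitoring during training cheap.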
…ishs dropout paper. This is a first quick sketch that gives the desired result of ~100 errors on MNIST using a fully connected network of two layers with 1200 units each, with dropout. NOTE: this is still missing the constraint on the L2 norm of the weights; I will add it in a later commit.
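The missing L2 norm constraint mentioned in the commit is the max-norm constraint from the dropout paper: after each update, the incoming weight vector of every hidden unit is projected back onto an L2 ball of fixed radius. A minimal sketch in plain Python (not Mocha's actual Julia API; the function name and row-per-unit layout are assumptions for illustration):

```python
import math

def project_max_norm(weight_rows, max_norm):
    # Constrain the L2 norm of each row, treating a row as the incoming
    # weight vector of one hidden unit: if a row's norm exceeds max_norm,
    # rescale the row so its norm equals max_norm; otherwise leave it alone.
    projected = []
    for row in weight_rows:
        norm = math.sqrt(sum(w * w for w in row))
        if norm > max_norm:
            row = [w * (max_norm / norm) for w in row]
        projected.append(row)
    return projected


rows = [[3.0, 4.0], [0.3, 0.4]]  # L2 norms 5.0 and 0.5
rows = project_max_norm(rows, max_norm=1.0)
# first row is rescaled down to norm 1.0, second is left unchanged
```

In practice this projection is applied after every parameter update, which is what allows training with the relatively large learning rates the dropout paper uses.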
Oh and I have already added norm constraints (as mentioned in the commit for the dropout example) but still need to add tests for them.
@stokasto Nice job! Thanks! This is very cool! I have been thinking about allowing coffee breaks to return statistics and save them somewhere, but haven't arrived at a conclusion about the best way to do it yet. I will pull down your changes, try your example locally this afternoon, and merge this PR. Meanwhile, could you add a comment header to the new MNIST example describing it? It would be even better if you could provide the reference to the paper whose results we are reproducing here. I think I might try to convert your MNIST example to an IJulia notebook example at a later stage, since we now have the ability to show some cool plots of the training progress. :) Again, good job! Thank you very much for the contributions!
Sure, I'll do that, so maybe hold off on pulling until then.
@stokasto Yes, I noticed that. I was struggling with that a bit, mainly with how to allow the snapshot coffee break to load at iteration 0 yet not save at iteration 0. I ended up making the morning coffee break one "day" before the evening coffee break. But I agree with you that making them the same day is more consistent. I will do some refactoring to deal with the snapshot coffee break later. BTW: your new MNIST example runs smoothly for me. I'm ready to merge this PR after you add a short comment to the example file. Thanks again!
…iple files for plotting
I added a short description to the example file.
Oh and btw, I actually don't think it is too bad to save a model with iteration count 0.
@stokasto Awesome!
@stokasto I am thinking of doing some refactoring for your statistics accumulation code. Currently statistics are put explicitly in a container
Sure, sounds good. I just added it this way because it was easiest and I wanted to get some plots quickly :).
This is a slightly more involved (but still small) pull request that changes the way the coffee breaks work:
-> Each break now returns a dict in which statistics are recorded.
These dicts can then be recorded by the AccumulateStatistics break and saved to an HDF5 file.
I have also added a plot_statistics.jl script to tools, with which such a statistics file can be plotted. This is quite useful for monitoring training and resembles the way pylearn2 handles plotting, which I quite like :).
I have also added a first example of training a fully connected network with dropout on MNIST; it gives ~103 errors and thus reproduces the results from the dropout paper.
Let me know if you prefer a different way of handling the stats logging or feel free to pull as is.