Initialization weights #4
It turns out that I needed to reduce the learning rate. After reducing it by 10x and increasing the effective batch size by 2x, I was able to train from scratch; less extreme measures are most likely sufficient. @ducha-aiki thanks. LSUV does seem to give a slightly faster start; in this case, my biggest problem was the learning rate.
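For reference, both changes map to fields of a Caffe solver prototxt: `base_lr` sets the learning rate, and `iter_size` accumulates gradients over several forward/backward passes, multiplying the effective batch size. The values below are illustrative, not the ones used in this issue:

```prototxt
# Hypothetical solver fragment: cut the learning rate 10x
# and double the effective batch size via gradient accumulation.
base_lr: 0.001   # was 0.01
iter_size: 2     # effective batch = batch_size * iter_size
```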
@liuyipei with LSUV I was able to converge with a large learning rate. But it is good that other approaches work as well :)
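For readers unfamiliar with LSUV (Layer-Sequential Unit-Variance) initialization: each layer's weights are pre-initialized orthonormally and then rescaled so the layer's output variance over a data batch is close to 1. A minimal NumPy sketch for a single fully connected layer (function names and tolerances are my own, not from this repository):

```python
import numpy as np

def orthonormal(shape, rng):
    # SVD-based orthonormal pre-initialization.
    a = rng.standard_normal(shape)
    u, _, vT = np.linalg.svd(a, full_matrices=False)
    return (u if u.shape == shape else vT).reshape(shape)

def lsuv_init(W, X, tol=0.05, max_iter=10):
    # Rescale W until the pre-activation variance on batch X is ~1.
    # W: (out, in) weights; X: (batch, in) layer inputs.
    for _ in range(max_iter):
        var = (X @ W.T).var()
        if abs(var - 1.0) < tol:
            break
        W = W / np.sqrt(var)
    return W

rng = np.random.default_rng(0)
W = lsuv_init(orthonormal((64, 128), rng),
              rng.standard_normal((256, 128)))
```

In a full network this rescaling is applied layer by layer, feeding the batch forward through the already-initialized layers.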
I like how you have the trainval and solver in one file. Does Caffe accept that? Anyway, it looks convenient!
@forresti it does, see the example in the Caffe master branch: https://github.com/BVLC/caffe/blob/master/examples/mnist/lenet_consolidated_solver.prototxt
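For anyone who has not seen the consolidated format: a Caffe solver prototxt can embed the network inline via the `net_param` field instead of pointing to a separate trainval file with `net:`. A minimal hypothetical sketch (layer names and values are made up):

```prototxt
base_lr: 0.01
max_iter: 10000
snapshot_prefix: "lenet"
# Network embedded directly in the solver,
# instead of `net: "trainval.prototxt"`.
net_param {
  name: "LeNet"
  layer {
    name: "data"
    type: "Input"
    top: "data"
    input_param { shape { dim: 64 dim: 1 dim: 28 dim: 28 } }
  }
  # ... remaining layers ...
}
```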
@liuyipei
This work is very exciting! The provided weights do work as expected, and the prototxt works out of the box with the default ilsvrc2012 lmdb data that comes with Caffe's examples.
However, my training loss from scratch has not decreased even after the full 85k iterations. I tried rebuilding the latest version of Caffe, running a second time, and increasing the batch size by 4x; none of these attempts helped. Am I correct in understanding that the model is meant to be trained end-to-end, without tricks such as layer-by-layer training?
To help me diagnose the problem, would it be possible for you to provide a reference initialization caffemodel (and/or one of your earliest intermediate snapshots)?
Thank you for your help!