Any examples of using .solverstate with python interface? #3651

Closed
5argon opened this issue Feb 9, 2016 · 4 comments

Comments

@5argon

5argon commented Feb 9, 2016

I only saw how to initialize a network using .caffemodel, but what about .solverstate? Thank you.

@zizhaozhang

Use solver.restore('***.solverstate').
No need to use solver.net.copy_from().
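
A minimal sketch of that call sequence, assuming pycaffe is installed. The helper name `resume_training` and both file paths are hypothetical placeholders, not part of Caffe; the helper only wraps `solver.restore()` and `solver.step()`.

```python
def resume_training(solver, solverstate_path, extra_iters=0):
    """Restore a solver from a .solverstate snapshot, then optionally train on.

    `solver` is expected to be a pycaffe solver such as caffe.SGDSolver.
    Returns the solver's iteration counter after restoring and stepping.
    """
    # restore() reloads the saved training state, so no separate
    # solver.net.copy_from() call is needed beforehand.
    solver.restore(solverstate_path)
    if extra_iters > 0:
        solver.step(extra_iters)  # run `extra_iters` more training iterations
    return solver.iter


def main():
    # Imported here so the helper above can be used without Caffe present.
    import caffe

    caffe.set_mode_cpu()
    solver = caffe.SGDSolver("solver.prototxt")  # hypothetical path
    # Hypothetical snapshot name; use whatever your snapshot_prefix produced.
    it = resume_training(solver, "snapshots/run_iter_10000.solverstate",
                         extra_iters=100)
    print("now at iteration", it)
```

Calling `main()` would resume from the snapshot and run 100 further iterations; `main()` is deliberately not invoked here because the paths are placeholders.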

Best Regards,
Zizhao

@5argon
Author

5argon commented Feb 9, 2016

Thank you for the quick reply! May I ask a bit further? It seems like .caffemodel is a subset of .solverstate. What would be missing if I initialized the network using only .caffemodel?

@zizhaozhang

If you use copy_from, it copies only the model parameters whose layer names
appear in your prototxt, i.e., it matches layers by name between your
prototxt and the caffemodel and copies their weights. All other training
parameters restart from the settings in your solver.prototxt. It is mostly
used to fine-tune a model for a new task.
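
For contrast with restoring a snapshot, here is a hedged sketch of the weights-only path described above, assuming pycaffe. The helper name `start_finetuning` and the file paths are hypothetical; the only Caffe calls used are `solver.net.copy_from()`, `solver.step()`, and the `solver.iter` property.

```python
def start_finetuning(solver, caffemodel_path, num_iters=0):
    """Load weights only, then train with a fresh solver state.

    Returns the iteration counter observed right after loading the weights,
    which shows that copy_from() does not carry over training progress.
    """
    # copy_from() matches layers by name and copies only their weights.
    solver.net.copy_from(caffemodel_path)
    start_iter = solver.iter  # unchanged by copy_from(); 0 for a new solver
    if num_iters > 0:
        solver.step(num_iters)
    return start_iter


def main():
    # Imported here so the helper above can be used without Caffe present.
    import caffe

    solver = caffe.SGDSolver("solver.prototxt")        # hypothetical path
    start_finetuning(solver, "pretrained.caffemodel",  # hypothetical path
                     num_iters=100)
```

As with any sketch built on placeholder paths, `main()` is left uncalled; the learning rate schedule and other solver settings would come from the solver.prototxt, not from the weights file.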

If you restore the solverstate, it fully recovers the training process from
the point where the last solverstate was saved, including the iteration,
current learning rate, etc. It also locates the .caffemodel file internally;
if that caffemodel is missing, resuming training will not work. I am not
sure whether the solverstate file holds a complete copy of the weights (its
size suggests that it does).

In addition, please ask questions in the Caffe users Google group instead of here.

Best Regards,
Zizhao

@seanbell

seanbell commented Feb 9, 2016

.caffemodel contains the weights. .solverstate contains the momentum vector. Both are needed to restart training. If you restart training without momentum, the loss will spike up and it will take ~50k iterations to recover. At test time you only need .caffemodel.
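
To illustrate why the momentum state matters when resuming, here is a toy sketch in plain Python. It is not Caffe code: it assumes the classical momentum update (velocity = mu * velocity - lr * gradient) on a one-dimensional quadratic, and the function name and hyperparameters are illustrative. It does not attempt to reproduce the size of the loss spike mentioned above, only the fact that dropping the velocity changes the trajectory.

```python
def sgd_momentum(w, v, steps, lr=0.1, mu=0.9):
    """Minimise f(w) = 0.5 * w**2 with momentum SGD; returns (w, v)."""
    for _ in range(steps):
        grad = w                # df/dw for f(w) = 0.5 * w**2
        v = mu * v - lr * grad  # velocity: the state a weights-only file lacks
        w = w + v
    return w, v


# Uninterrupted run of 20 steps.
w_full, _ = sgd_momentum(1.0, 0.0, 20)

# "Snapshot" after 10 steps, keeping both the weight and the velocity.
w_snap, v_snap = sgd_momentum(1.0, 0.0, 10)

# Resume with the saved velocity: same trajectory as the uninterrupted run.
w_resumed, _ = sgd_momentum(w_snap, v_snap, 10)

# Resume from the weight alone (velocity reset to zero): a different trajectory.
w_weights_only, _ = sgd_momentum(w_snap, 0.0, 10)

print(w_resumed == w_full)        # True
print(w_weights_only == w_full)   # False
```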

I am closing this as this belongs on the user group, thanks!

@seanbell seanbell closed this as completed Feb 9, 2016