Any examples of using .solverstate with python interface? #3651

5argon · 2016-02-09T13:07:28Z

I only saw how to initialize a network using .caffemodel, but what about .solverstate? Thank you.

zizhaozhang · 2016-02-09T13:48:35Z

Use solver.restore('***.solverstate').
No need to use solver.net.copy_from().
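
A minimal sketch of what that looks like from Python, assuming pycaffe is installed and the solver is a plain SGD solver. The helper name `resume_training` and the file names in the usage comment are hypothetical, introduced only for illustration.

```python
def resume_training(solver_prototxt, solverstate_path, extra_iters=0):
    """Rebuild a solver and restore its saved training state."""
    # pycaffe is imported lazily so the helper can be defined without it.
    import caffe

    # Build the solver (and its nets) from the solver definition.
    # Assumption: an SGD solver; other solver classes are used the same way.
    solver = caffe.SGDSolver(solver_prototxt)

    # Restore the saved training state. As noted above, no separate
    # solver.net.copy_from() call is made.
    solver.restore(solverstate_path)

    # Optionally continue training for a few iterations from the restored state.
    if extra_iters > 0:
        solver.step(extra_iters)
    return solver


# Usage (hypothetical paths):
# solver = resume_training('solver.prototxt',
#                          'snapshot_iter_10000.solverstate',
#                          extra_iters=100)
# print(solver.iter)  # iteration counter after resuming
```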

5argon · 2016-02-09T14:05:17Z

Thank you for the quick reply! May I ask a bit further? It seems like .caffemodel is a subset of .solverstate. What will be missing if I initialize the network using only .caffemodel?

zizhaozhang · 2016-02-09T16:20:55Z

If you use copy_from, it only copies the model parameters for layers whose
names match between your prototxt and the caffemodel, i.e., it pairs up
layers with the same name and copies their weights. All other training
parameters restart from the settings in your solver.prototxt. This is
mostly used for fine-tuning a model on a new task.

If you restore the solverstate, it fully recovers the training process as
it stood at that snapshot, including the iteration and the current learning
rate. It also locates the .caffemodel file internally, so if your
caffemodel is missing, resuming training will not work. I am not sure
whether the solverstate file holds a complete copy of the weights (its
size suggests it might).
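
For contrast with restoring a solverstate, here is a rough sketch of the copy_from path used for fine-tuning, assuming pycaffe and an SGD solver. The helper name `start_finetuning` and the file names are hypothetical.

```python
def start_finetuning(solver_prototxt, weights_path):
    """Create a fresh solver and load only the weights from a .caffemodel."""
    # pycaffe is imported lazily so the helper can be defined without it.
    import caffe

    # A fresh solver: training settings come from solver.prototxt,
    # not from any earlier run.
    solver = caffe.SGDSolver(solver_prototxt)

    # Copy weights into layers whose names match those in the .caffemodel.
    # No solver state (iteration, history) is loaded by this call.
    solver.net.copy_from(weights_path)
    return solver


# Usage (hypothetical paths):
# solver = start_finetuning('finetune_solver.prototxt', 'pretrained.caffemodel')
# solver.step(1)  # run one training iteration with the copied weights
```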

In addition, please ask questions like this in the Caffe users Google group instead of here.
seanbell · 2016-02-09T23:23:00Z

.caffemodel contains the weights. .solverstate contains the momentum vector. Both are needed to restart training. If you restart training without momentum, the loss will spike up and it will take ~50k iterations to recover. At test time you only need .caffemodel.
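
As an illustration of the test-time case, a minimal sketch that loads only the .caffemodel into a test-phase net, assuming pycaffe and a deploy-style prototxt. The helper name `load_for_inference` and the file names are hypothetical.

```python
def load_for_inference(deploy_prototxt, weights_path):
    """Build a test-phase net from a network definition and a .caffemodel."""
    # pycaffe is imported lazily so the helper can be defined without it.
    import caffe

    # No solver and no .solverstate are involved: only the weights are loaded.
    return caffe.Net(deploy_prototxt, weights_path, caffe.TEST)


# Usage (hypothetical paths):
# net = load_for_inference('deploy.prototxt', 'snapshot_iter_10000.caffemodel')
# outputs = net.forward()
```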

I am closing this as this belongs on the user group, thanks!

seanbell closed this as completed Feb 9, 2016
