Most optimizers don't save iterations as a weight!! #13027
Comments
@danmoller No, …
Based on a recent experimental run using tensorflow 2.4.1, I'm wondering if this is fully resolved. More specifically, after 10 epochs of optimization using … Or, of course, I might be using it wrong! If folks think the relevant state is saved and restored, I'll try to make this more concrete with a minimum working example and see what happens there (my use is currently embedded in a larger program).
@mfenner1, you can try to use the …
@danmoller Thanks for getting back to me here. I did give that try as well: after an …
@danmoller So, I did make a MWE that allowed me to store and reload models (and I could easily use different optimizers). After doing that, both … For reference, my `fit` calls:

```python
# initial call
model.fit(..., epochs=10, ...)

# subsequent call
model.fit(..., epochs=20, initial_epoch=11, ....)
```

Best,
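A minimal sketch along the lines of that MWE (a reconstruction, not the commenter's actual code): train, save the full model including the optimizer state, reload, and resume with `initial_epoch`, printing the optimizer's step counter before and after so the round trip can be checked.

```python
import numpy as np
import tensorflow as tf

x = np.random.rand(64, 4).astype("float32")
y = np.random.rand(64, 1).astype("float32")

model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])
model.compile(optimizer=tf.keras.optimizers.Adam(), loss="mse")

# initial call
model.fit(x, y, epochs=10, verbose=0)
print("iterations before save:", int(model.optimizer.iterations.numpy()))

# save the whole model (architecture + weights + optimizer state)
model.save("mwe_model.h5")

# reload and resume; compare the counter to see whether the state carried over
restored = tf.keras.models.load_model("mwe_model.h5")
restored.fit(x, y, epochs=20, initial_epoch=10, verbose=0)
print("iterations after resuming:", int(restored.optimizer.iterations.numpy()))
```

Note that `model.save` includes the optimizer state by default (`include_optimizer=True`), which is what makes the before/after comparison meaningful.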
I was checking whether Keras saves the optimizer states, and it happens that it does that based on the `self.weights` variable of the optimizer.

Looking at the source code for the optimizers: https://github.com/keras-team/keras/blob/master/keras/optimizers.py
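As a quick check of that mechanism, here is a sketch (old-style standalone Keras; the `weights` and `iterations` attributes exist under the same names in `tf.keras`) that inspects what the optimizer exposes through `weights` and whether the `iterations` counter is part of that list:

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from keras import backend as K

model = Sequential([Dense(1, input_shape=(4,))])
model.compile(optimizer='adadelta', loss='mse')
model.fit(np.random.rand(32, 4), np.random.rand(32, 1), epochs=2, verbose=0)

opt = model.optimizer
print(K.get_value(opt.iterations))                    # the step counter itself
print(len(opt.weights))                               # what gets written when the model is saved
print(any(w is opt.iterations for w in opt.weights))  # is the counter included?
```

Whatever is missing from `optimizer.weights` is also absent from the optimizer state stored in the saved file.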
When it's about `SGD`, everything seems ok, the `iterations` variable is part of the weights (sketched below).
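The relevant line in `SGD.get_updates` looks roughly like this (paraphrased, not copied verbatim from the file):

```python
# inside SGD.get_updates(), keras/optimizers.py (paraphrased)
moments = [K.zeros(K.int_shape(p)) for p in params]
self.weights = [self.iterations] + moments   # the step counter is saved along with the moments
```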
Now, if you look at most other optimizers in the same file, they just save their accumulators, moments, etc., but they don't save `iterations`.

When there is a `decay` involved, this would spoil saving and loading the optimizers: the decayed learning rate is computed from `iterations`, so a reloaded optimizer whose counter was reset to zero starts the decay schedule over.

This is a suggestion to fix this in all optimizers by adding `iterations` to the list.

For instance, take the `Adadelta` optimizer and replace the line that builds its `weights` list with one that also includes `iterations` (a sketch of the current line and the proposed replacement follows).
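Roughly (paraphrasing the line in `Adadelta.get_updates`; the other optimizers would get the same treatment):

```python
# current line in Adadelta.get_updates() (roughly):
self.weights = accumulators + delta_accumulators

# proposed replacement, so the step counter is saved and restored as well:
self.weights = [self.iterations] + accumulators + delta_accumulators
```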