AdamOptimizer + checkpoints #634

Closed
wchan opened this issue Dec 28, 2015 · 8 comments

Comments

@wchan

wchan commented Dec 28, 2015

When using the AdamOptimizer, does the checkpoint save the optimizer's state? If yes, how do you clear that state (i.e., restart the AdamOptimizer but keep the model weights from the checkpoint)? If no, how do you maintain the Adam state across restarts? The same question applies to AdaGrad as well.

@girving
Contributor

girving commented Dec 28, 2015

This kind of question should be asked on Stack Overflow, since it is not about a bug or feature request for TensorFlow. However, the answer is yes: the state of optimizers is stored in tf.Variable objects just like normal state, and these are saved to checkpoints.
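
As an illustration of what that stored state looks like, here is a minimal sketch (TF 1.x-era API; the checkpoint path is a placeholder) that lists the variables in a checkpoint, where Adam's slot variables typically appear alongside the model weights:

import tensorflow as tf

# List every variable stored in the checkpoint. Adam's per-weight slots usually
# show up as "<var_name>/Adam" and "<var_name>/Adam_1", plus beta1_power/beta2_power.
reader = tf.train.NewCheckpointReader("my_model.ckpt")
for name, shape in reader.get_variable_to_shape_map().items():
    print(name, shape)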

@girving girving closed this as completed Dec 28, 2015
@xksteven

Can this be reopened as a feature request to create an API call to reset the state of an optimizer?

There doesn't seem to be a nice way to do it as far as I know.

@yaroslavvb
Contributor

You could "reset" state of optimizer by creating new optimizer over the
same parameter variable


@xksteven

xksteven commented Sep 21, 2016

That seems more like a "hack": when I save my new model, I'll need a way to find and exclude the old optimizer's variables, saving everything except the old and new optimizer state. This starts to break the code. I was interested in experimenting with this technique, but with other optimization schemes. However, I don't see a clean way to do this once the model has been saved and I need to restart it.

Unless I'm misunderstanding how you intend to code it up?

I was imagining I have my graph:

def train():
  with tf.Session() as sess:
    inputs, targets = get_inputs()
    model_out = infer(inputs)
    loss = loss_fn(model_out, targets)
    train_op = tf.train.AdamOptimizer().minimize(loss)
    saver = tf.train.Saver()
    sess.run(tf.initialize_all_variables())
    saver.restore(sess, checkpoint)

That's the old model; then I imagine the change you're suggesting is to add the snippet below?

    .... (same as above)
    saver.restore(sess, checkpoint)
    train_op = tf.train.AdamOptimizer().minimize(loss)  # new optimizer => fresh Adam state
    uninitialized_names = set(sess.run(tf.report_uninitialized_variables()))
    new_vars = [v for v in tf.all_variables()
                if v.name.split(':')[0].encode() in uninitialized_names]
    sess.run(tf.initialize_variables(new_vars))

@yaroslavvb
Contributor

yaroslavvb commented Sep 21, 2016

You could also load the checkpoint in the regular way, and then run the initializer on all the variables you want to reset: sess.run([v.initializer for v in variables_to_reset])
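
A minimal sketch of that approach, using the TF 1.x-era names seen elsewhere in this thread; the name filter is an assumption based on AdamOptimizer's default slot and accumulator naming:

# Collect the Adam-related variables by name and re-run their initializers.
adam_vars = [v for v in tf.all_variables()
             if "Adam" in v.name
             or "beta1_power" in v.name or "beta2_power" in v.name]
sess.run([v.initializer for v in adam_vars])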

@dylanbfox

@xksteven did that method work for you? Simply overwriting the train_op variable after restoring the checkpoint?

@xksteven

@dylanbfox

If you're using the TensorFlow built-in saver, then "resetting" the Adam optimizer by simply creating a new AdamOptimizer will create an entire extra set of slot parameters for each variable in the graph while keeping the old, unused ones around. This will not work as a long-term solution, as you nearly double the size of the graph every time you reset it this way.

It is not a viable solution.

I honestly still don't have a good way to do this, as one would need to iterate through all of the variables and reset all of the "slot" variables, and I don't know of a way that reliably accomplishes this.

I would suggest keeping this issue open.
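
For what it's worth, a hedged sketch of that slot-variable iteration, assuming `opt` is the tf.train.AdamOptimizer instance used to build the train op (so no second optimizer is created):

# Reset only the optimizer's slot variables ('m' and 'v' for Adam).
slot_vars = []
for slot_name in opt.get_slot_names():
    for var in tf.trainable_variables():
        slot = opt.get_slot(var, slot_name)
        if slot is not None:
            slot_vars.append(slot)
# Note: Adam also keeps beta1_power/beta2_power accumulators, which are not slots.
sess.run(tf.initialize_variables(slot_vars))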

@ghost

ghost commented Jan 11, 2018

To solve this problem, you simply need to pass the list of variables to restore to the Saver object. I had this problem before, and after experimenting, I posted my findings on saving and restoring variables, along with the code, in a Stack Overflow answer. Here is the link:

https://stackoverflow.com/questions/48161147/error-restoring-model-in-tensorflow-after-changing-the-optimizer-paramter/48212514#48212514
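
In short, the idea is to build the Saver over the model variables only, so the optimizer's slot variables are neither saved nor expected at restore time. A minimal sketch, where checkpoint_path is a placeholder:

# Save/restore only the model weights; Adam's slot variables are excluded.
model_vars = tf.trainable_variables()
saver = tf.train.Saver(var_list=model_vars)
saver.restore(sess, checkpoint_path)
# Variables created by a new optimizer still need their initializers run afterwards.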
