keras optimizer.iterations are not properly save & restored #48947
Comments
I was able to reproduce the error in tf 2.4, tf 2.5rc2 and tf-nightly. Please find the gist. Thanks
@blackyang I think this prints the number of individual batches for which updates have been performed. In your case, if you are doing n iterations on your data then
The problem is that OptimizerV2.iterations is exactly what gets passed to a LearningRateSchedule to determine the learning rate. If it reports 4 when it should be 7 then training isn't going to work properly when it saves and resumes.
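To make the impact concrete, here is a minimal pure-Python sketch, not TensorFlow code. It assumes an exponential-decay rule of the form `initial_lr * decay_rate ** (step / decay_steps)`, which is the non-staircase formula documented for `tf.keras.optimizers.schedules.ExponentialDecay`. The function name and the hyperparameter values are made up for illustration.

```python
def exponential_decay(step, initial_lr=0.1, decay_rate=0.5, decay_steps=1):
    """Learning rate at a given step under a plain exponential decay.

    The default hyperparameters are hypothetical and chosen only to
    make the difference easy to see.
    """
    return initial_lr * decay_rate ** (step / decay_steps)


# 3 steps before saving plus 4 steps after resuming: the schedule should see 7.
lr_expected = exponential_decay(7)

# If the counter restarts from 0 on restore, the schedule only sees 4.
lr_after_reset = exponential_decay(4)

print(f"expected lr at step 7: {lr_expected}")
print(f"lr seen after a reset: {lr_after_reset}")
```

With these assumed values, the resumed run trains with a learning rate 8 times larger than intended, and the gap grows with the number of steps taken before the save.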
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you.
@sachinprasadhs thank you for your reply! Yes, I completely understand why it prints 4 instead of 7. My point is that it should print 7 (or there should be an option to choose between resetting, which prints 4, and not resetting, which prints 7); otherwise the learning rate scheduler is wrong. Another way would be to update the learning rate scheduler so that it does not use this iteration count. Any thoughts? Thank you!
any updates? thx
I was able to reproduce the error in tf 2.6. Please find the gist.
System information
You can collect some of this information using our environment capture script.
You can also obtain the TensorFlow version with:
TF 1.x: python -c "import tensorflow as tf; print(tf.GIT_VERSION, tf.VERSION)"
TF 2.x: python -c "import tensorflow as tf; print(tf.version.GIT_VERSION, tf.version.VERSION)"
Describe the current behavior
optimizer.iterations is not properly saved and restored. After restoring, iterations is reset to 0, which leads to a wrong learning rate when the learning rate is computed by an lr_scheduler.
Describe the expected behavior
Provide an option to either reset it or not. For backward compatibility, the default could be to reset.
Standalone code to reproduce the issue
Provide a reproducible test case that is the bare minimum necessary to generate
the problem. If possible, please share a link to Colab/Jupyter/any notebook.
The last print shows 4, but it should show 3 + 4 = 7.
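The arithmetic can be illustrated with a toy model in plain Python. This is not the original reproduction and not TensorFlow code: `ToyOptimizer`, `get_state`, `set_state`, and the `reset_iterations` flag are all hypothetical names, invented here to sketch the reported behavior and the kind of option being requested.

```python
class ToyOptimizer:
    """Hypothetical stand-in for an optimizer; it only tracks a step counter."""

    def __init__(self):
        self.iterations = 0

    def step(self):
        # One update on one batch.
        self.iterations += 1

    def get_state(self):
        return {"iterations": self.iterations}

    def set_state(self, state, reset_iterations=True):
        # reset_iterations=True mimics the reported behavior (counter back to 0);
        # reset_iterations=False mimics the requested behavior (counter kept).
        self.iterations = 0 if reset_iterations else state["iterations"]


def train_save_resume(reset_iterations):
    """Run 3 steps, save, restore into a fresh optimizer, then run 4 more steps."""
    opt = ToyOptimizer()
    for _ in range(3):
        opt.step()
    saved = opt.get_state()

    resumed = ToyOptimizer()
    resumed.set_state(saved, reset_iterations=reset_iterations)
    for _ in range(4):
        resumed.step()
    return resumed.iterations


print(train_save_resume(reset_iterations=True))   # 4, the reported behavior
print(train_save_resume(reset_iterations=False))  # 7, the expected behavior
```

Defaulting the hypothetical flag to `True` mirrors the suggestion to keep resetting as the default for backward compatibility.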