Keras_model_fn global steps dont increase #444

Closed
gautiese opened this issue Oct 26, 2018 · 1 comment

Comments

@gautiese

Please fill out the form below.

System Information

  • Framework (e.g. TensorFlow) / Algorithm (e.g. KMeans): TensorFlow / Keras
  • Framework Version: 1.11
  • Python Version: 2
  • CPU or GPU: GPU
  • Python SDK Version:
  • Are you using a custom image: No

Describe the problem

I am getting a warning that the global step is not increasing. Also, the training seems to use a warm start (I don't know why, or whether that is a problem at all).

Minimal repro / logs

2018-10-26 07:24:04,680 INFO - tensorflow - loss = 0.5054888, step = 0
2018-10-26 07:24:12,735 WARNING - tensorflow - It seems that global step (tf.train.get_global_step) has not been increased. Current value (could be stable): 0 vs previous value: 0. You could increase the global step by passing tf.train.get_global_step() to Optimizer.apply_gradients or Optimizer.minimize.
2018-10-26 07:24:18,908 WARNING - tensorflow - It seems that global step (tf.train.get_global_step) has not been increased. Current value (could be stable): 0 vs previous value: 0. You could increase the global step by passing tf.train.get_global_step() to Optimizer.apply_gradients or Optimizer.minimize.
2018-10-26 07:24:25,089 WARNING - tensorflow - It seems that global step (tf.train.get_global_step) has not been increased. Current value (could be stable): 0 vs previous value: 0. You could increase the global step by passing tf.train.get_global_step() to Optimizer.apply_gradients or Optimizer.minimize.
2018-10-26 07:24:31,263 WARNING - tensorflow - It seems that global step (tf.train.get_global_step) has not been increased. Current value (could be stable): 0 vs previous value: 0. You could increase the global step by passing tf.train.get_global_step() to Optimizer.apply_gradients or Optimizer.minimize.
2018-10-26 07:24:39,229 WARNING - tensorflow - It seems that global step (tf.train.get_global_step) has not been increased. Current value (could be stable): 0 vs previous value: 0. You could increase the global step by passing tf.train.get_global_step() to Optimizer.apply_gradients or Optimizer.minimize.
2018-10-26 07:24:39,230 INFO - tensorflow - loss = 0.5811208, step = 0 (34.550 sec)
2018-10-26 07:24:45,407 WARNING - tensorflow - It seems that global step (tf.train.get_global_step) has not been increased. Current value (could be stable): 0 vs previous value: 0. You could increase the global step by passing tf.train.get_global_step() to Optimizer.apply_gradients or Optimizer.minimize.
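
For a longer log, a quick way to check whether the step counter is really stuck is to pull the `step = N` values out of the INFO entries. The following is only a sketch: the helper names `logged_steps` and `step_is_stuck` are made up for illustration, and the regular expression assumes the `loss = ..., step = N` entry format seen in this log.

```python
import re

# Matches the "loss = <float>, step = <int>" entries printed during training.
STEP_RE = re.compile(r"loss = [0-9.]+, step = (\d+)")


def logged_steps(log_text):
    """Return the global-step value from every 'loss = ..., step = N' entry."""
    return [int(m.group(1)) for m in STEP_RE.finditer(log_text)]


def step_is_stuck(log_text):
    """True when at least two step values were logged and they are all equal."""
    steps = logged_steps(log_text)
    return len(steps) >= 2 and len(set(steps)) == 1


# The two INFO entries from the log in this issue.
sample = (
    "2018-10-26 07:24:04,680 INFO - tensorflow - loss = 0.5054888, step = 0\n"
    "2018-10-26 07:24:39,230 INFO - tensorflow - loss = 0.5811208, step = 0 (34.550 sec)\n"
)

print(logged_steps(sample))   # [0, 0]
print(step_is_stuck(sample))  # True
```

Both entries report step 0 about 34 seconds apart, which matches what the warnings say.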

This happens with any keras_model_fn, even the one in the examples.

@mvsusp
Contributor

mvsusp commented Oct 26, 2018

Hi @gautiese

It seems that in TF 1.11 Keras optimizers no longer work the way they used to. The issue disappears if you replace the Keras optimizer with a TF optimizer. We updated the example to account for this: https://github.com/awslabs/amazon-sagemaker-examples/pull/442/files?utf8=%E2%9C%93&diff=split&w=1#diff-11cd3635a41f977f48d21c3a65d3e774L71
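
A rough sketch of what that swap can look like in a `keras_model_fn` is below. It is not the code from the linked pull request: the layer sizes, layer names, loss, and the `learning_rate` hyperparameter key are placeholders chosen for illustration, and `hyperparameters` is assumed to be a dict. It targets the TF 1.x API (`tf.train.GradientDescentOptimizer`); in TF 2.x that class lives under `tf.compat.v1.train` and this snippet would need adapting.

```python
def keras_model_fn(hyperparameters):
    """Build and compile a Keras model using a native TF 1.x optimizer."""
    # Imported here so the module can be loaded without TensorFlow installed.
    import tensorflow as tf

    # Hypothetical toy architecture; shapes and names are placeholders.
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(16, activation="relu", input_shape=(8,), name="inputs"),
        tf.keras.layers.Dense(1, name="output"),
    ])

    # Instead of a Keras optimizer such as tf.keras.optimizers.SGD(...),
    # pass a tf.train optimizer, which is the change suggested in this thread.
    learning_rate = hyperparameters.get("learning_rate", 0.01)
    optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)

    model.compile(optimizer=optimizer, loss="mean_squared_error")
    return model
```

Only the optimizer line differs from a Keras-optimizer version; the rest of the model definition can stay as it is.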

I will close this ticket now. Feel free to open it again if you have additional questions.

Thanks for using SageMaker!
