INFO:tensorflow:Error reported to Coordinator: <class 'tensorflow.python.framework.errors_impl.InvalidArgumentError'>, Retval[0] does not have value #18
Comments
Please provide the command you are running, including the parameters.
sudo python ./train_seglink.py --dataset_name=icdar2015 --dataset_dir=/home/zst/result --batch_size=1 --train_dir=/home/zst/zst/seglink-master/train_dir --checkpoint_path=/home/zst/zst/seglink
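@dengdan I changed the learning rate to 0.00001.
Why sudo?
Because I use sudo to run Python 2.7. Without sudo, the Anaconda Python is used instead.
I have another question: roughly how many iterations does training take before it starts to converge? I have already run 500,000 iterations and it has not converged yet.
@dengdan I followed your suggestion and installed TensorFlow 1.1.0 on another machine, but the problem is still there. However, each time before the error is raised, the corresponding checkpoint is saved automatically, and when I train again it continues from the previous iteration count. Does this have a large effect on the results?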
Well, I am not sure about it, because the cause of the bug has not been figured out yet, and it is also unknown how to reproduce it.
yes
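Since the thread reports that a checkpoint is saved before each crash and that the next run continues from the saved step, one possible workaround is to relaunch the training command automatically whenever it exits with an error. The following is a minimal sketch under that assumption, not a fix for the underlying bug. The helper name `run_with_restarts` and its parameters are hypothetical, and whether resuming this way affects the final result is unknown, as noted above.

```python
import subprocess
import sys
import time


def run_with_restarts(cmd, max_restarts=5, delay_sec=0.0):
    """Run cmd and rerun it after a non-zero exit, up to max_restarts times.

    Returns the number of launches that were made before a clean exit.
    """
    launches = 0
    while True:
        launches += 1
        returncode = subprocess.call(cmd)
        if returncode == 0:
            return launches  # clean exit, nothing left to restart
        if launches > max_restarts:
            raise RuntimeError("gave up after %d launches" % launches)
        time.sleep(delay_sec)  # optional pause before relaunching


if __name__ == "__main__":
    # Stand-in command that exits cleanly. Replace it with the real
    # training command, for example the train_seglink.py call above.
    print(run_with_restarts([sys.executable, "-c", "pass"]))
```

An uncaught exception such as the `InvalidArgumentError` in the traceback makes the Python process exit with a non-zero status, which is what triggers the relaunch here.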
When training reached step 500,000, I found that the fmean fell, which looks like overfitting. May I know the size of your training set? I guess the fmean declines because the training set is not large enough.
SynthText 0.8M, IC15 1000.
Hello, did you manage to get the loss to decrease properly by tuning the parameters?
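The per-step loss printed in the training log is noisy, so it is hard to judge convergence from individual lines. A rough way to look at the trend is to extract the `global step N: loss = X` pairs from the log and smooth them. The sketch below assumes the log format shown in this issue. The function names and the window size are illustrative choices, not part of the project.

```python
import re

# Matches lines such as:
# INFO:tensorflow:global step 109662: loss = 5.3843 (0.160 sec/step)
STEP_RE = re.compile(r"global step (\d+): loss = (\S+)")


def parse_losses(lines):
    """Return a list of (step, loss) pairs found in the log lines."""
    pairs = []
    for line in lines:
        match = STEP_RE.search(line)
        if match:
            pairs.append((int(match.group(1)), float(match.group(2))))
    return pairs


def moving_average(values, window):
    """Mean over a trailing window at each position (shorter at the start)."""
    averages = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        averages.append(sum(chunk) / float(len(chunk)))
    return averages


if __name__ == "__main__":
    sample = [
        "INFO:tensorflow:global step 109662: loss = 5.3843 (0.160 sec/step)",
        "INFO:tensorflow:global step 109663: loss = 4.5832 (0.256 sec/step)",
        "INFO:tensorflow:Error reported to Coordinator: ...",
        "INFO:tensorflow:global step 109664: loss = 8.8361 (0.098 sec/step)",
    ]
    pairs = parse_losses(sample)
    smoothed = moving_average([loss for _, loss in pairs], window=2)
    for (step, _), avg in zip(pairs, smoothed):
        print(step, avg)
```

On a real log a much larger window (hundreds or thousands of steps) would be more informative. Lines that do not match the pattern, such as the Coordinator error, are skipped.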
INFO:tensorflow:global step 109662: loss = 5.3843 (0.160 sec/step)
INFO:tensorflow:global step 109663: loss = 4.5832 (0.256 sec/step)
INFO:tensorflow:Error reported to Coordinator: <class 'tensorflow.python.framework.errors_impl.InvalidArgumentError'>, Retval[0] does not have value
INFO:tensorflow:global step 109664: loss = 8.8361 (0.098 sec/step)
INFO:tensorflow:Finished training! Saving model to disk.
Traceback (most recent call last):
  File "./train_seglink.py", line 275, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "./train_seglink.py", line 271, in main
    train(train_op)
  File "./train_seglink.py", line 260, in train
    session_config = sess_config
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/slim/python/slim/learning.py", line 759, in train
    sv.saver.save(sess, sv.save_path, global_step=sv.global_step)
  File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/supervisor.py", line 964, in managed_session
    self.stop(close_summary_writer=close_summary_writer)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/supervisor.py", line 792, in stop
    stop_grace_period_secs=self._stop_grace_secs)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/coordinator.py", line 389, in join
    six.reraise(*self._exc_info_to_raise)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/coordinator.py", line 296, in stop_on_exception
    yield
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/coordinator.py", line 494, in run
    self.run_loop()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/supervisor.py", line 994, in run_loop
    self._sv.global_step])
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 789, in run
    run_metadata_ptr)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 997, in _run
    feed_dict_string, options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1132, in _do_run
    target_list, options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1152, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Retval[0] does not have value
zst@zst-robot1:~/zst/seglink-master$
My TensorFlow version is 1.2.1. I also tried running on 1.1.0 and the same error occurs.