Wrong with checkpoint file #14

Open
JohnnieXDU opened this issue Sep 9, 2017 · 4 comments


JohnnieXDU commented Sep 9, 2017

Hello, thanks for providing the checkpoint file for the im2txt project. However, when I followed your instructions to run the code, I hit an error saying that some variables are not found in the checkpoint.

Concretely, the main error message is:
NotFoundError (see above for traceback): Key lstm/basic_lstm_cell/kernel not found in checkpoint [[Node: save/RestoreV2_381 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_arg_save/Const_0_0, save/RestoreV2_381/tensor_names, save/RestoreV2_381/shape_and_slices)]]

The full error output is as follows:
2017-09-09 16:58:27.152998: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:193] kernel reported version is: 381.9.0
2017-09-09 16:58:27.153007: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:300] kernel version seems to match DSO: 381.9.0
INFO:tensorflow:Loading model from checkpoint: model.ckpt-1000000
INFO:tensorflow:Restoring parameters from model.ckpt-1000000
2017-09-09 16:58:27.548519: W tensorflow/core/framework/op_kernel.cc:1158] Not found: Key lstm/basic_lstm_cell/kernel not found in checkpoint
2017-09-09 16:58:27.548988: W tensorflow/core/framework/op_kernel.cc:1158] Not found: Key lstm/basic_lstm_cell/bias not found in checkpoint
Traceback (most recent call last):
  File "/home/server109/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1139, in _do_call
    return fn(*args)
  File "/home/server109/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1121, in _run_fn
    status, run_metadata)
  File "/home/server109/anaconda3/lib/python3.6/contextlib.py", line 89, in __exit__
    next(self.gen)
  File "/home/server109/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.NotFoundError: Key lstm/basic_lstm_cell/kernel not found in checkpoint
         [[Node: save/RestoreV2_381 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_arg_save/Const_0_0, save/RestoreV2_381/tensor_names, save/RestoreV2_381/shape_and_slices)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/server109/Documents/Show_and_Tell/im2txt/bazel-bin/im2txt/run_inference.runfiles/im2txt/im2txt/run_inference.py", line 85, in <module>
    tf.app.run()
  File "/home/server109/anaconda3/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "/home/server109/Documents/Show_and_Tell/im2txt/bazel-bin/im2txt/run_inference.runfiles/im2txt/im2txt/run_inference.py", line 65, in main
    restore_fn(sess)
  File "/home/server109/Documents/Show_and_Tell/im2txt/bazel-bin/im2txt/run_inference.runfiles/im2txt/im2txt/inference_utils/inference_wrapper_base.py", line 96, in _restore_fn
    saver.restore(sess, checkpoint_path)
  File "/home/server109/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1548, in restore
    {self.saver_def.filename_tensor_name: save_path})
  File "/home/server109/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 789, in run
    run_metadata_ptr)
  File "/home/server109/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 997, in _run
    feed_dict_string, options, run_metadata)
  File "/home/server109/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1132, in _do_run
    target_list, options, run_metadata)
  File "/home/server109/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1152, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.NotFoundError: Key lstm/basic_lstm_cell/kernel not found in checkpoint
         [[Node: save/RestoreV2_381 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_arg_save/Const_0_0, save/RestoreV2_381/tensor_names, save/RestoreV2_381/shape_and_slices)]]

Caused by op 'save/RestoreV2_381', defined at:
  File "/home/server109/Documents/Show_and_Tell/im2txt/bazel-bin/im2txt/run_inference.runfiles/im2txt/im2txt/run_inference.py", line 85, in <module>
    tf.app.run()
  File "/home/server109/anaconda3/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "/home/server109/Documents/Show_and_Tell/im2txt/bazel-bin/im2txt/run_inference.runfiles/im2txt/im2txt/run_inference.py", line 51, in main
    FLAGS.checkpoint_path)
  File "/home/server109/Documents/Show_and_Tell/im2txt/bazel-bin/im2txt/run_inference.runfiles/im2txt/im2txt/inference_utils/inference_wrapper_base.py", line 116, in build_graph_from_config
    saver = tf.train.Saver()
  File "/home/server109/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1139, in __init__
    self.build()
  File "/home/server109/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1170, in build
    restore_sequentially=self._restore_sequentially)
  File "/home/server109/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 691, in build
    restore_sequentially, reshape)
  File "/home/server109/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 407, in _AddRestoreOps
    tensors = self.restore_op(filename_tensor, saveable, preferred_shard)
  File "/home/server109/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 247, in restore_op
    [spec.tensor.dtype])[0])
  File "/home/server109/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/gen_io_ops.py", line 640, in restore_v2
    dtypes=dtypes, name=name)
  File "/home/server109/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
    op_def=op_def)
  File "/home/server109/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 2506, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/home/server109/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1269, in __init__
    self._traceback = _extract_stack()

NotFoundError (see above for traceback): Key lstm/basic_lstm_cell/kernel not found in checkpoint [[Node: save/RestoreV2_381 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_arg_save/Const_0_0, save/RestoreV2_381/tensor_names, save/RestoreV2_381/shape_and_slices)]]

How can I fix this? Thanks :)

@KranthiGV
Owner

@JohnnieXDU
This is caused by a difference in variable naming between TensorFlow versions.
Can you try downgrading TensorFlow to 1.0 and testing again? (Temporary fix.)
It should work for now.
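
One way to confirm the naming mismatch is to list the variable names stored in the checkpoint and compare them with the names the graph asks for. The sketch below assumes a TensorFlow 1.x install (it relies on `tf.train.NewCheckpointReader`); the helper names `checkpoint_variable_names` and `missing_keys` are hypothetical and not part of im2txt.

```python
def checkpoint_variable_names(checkpoint_path):
    """Return the sorted variable names stored in a TF 1.x checkpoint."""
    # Imported lazily so the pure-Python helper below works without TensorFlow.
    import tensorflow as tf

    reader = tf.train.NewCheckpointReader(checkpoint_path)
    return sorted(reader.get_variable_to_shape_map())


def missing_keys(graph_names, checkpoint_names):
    """Return graph variable names that have no entry in the checkpoint."""
    stored = set(checkpoint_names)
    return sorted(name for name in graph_names if name not in stored)


# Example use (requires TensorFlow 1.x and the checkpoint files on disk):
#   names = checkpoint_variable_names("model.ckpt-1000000")
#   print([n for n in names if n.startswith("lstm/")])
#   print(missing_keys(["lstm/basic_lstm_cell/kernel",
#                       "lstm/basic_lstm_cell/bias"], names))
```

If the `lstm/...` entries printed from the checkpoint differ from the keys in the error message, the checkpoint was most likely written with a different naming scheme than the installed TensorFlow uses.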

koumick commented Nov 27, 2017

Thank you for the great source.

I get the same error when I run with TF 1.3.
I then tried TF 1.0, but a slightly different error occurred, as follows.

NotFoundError (see above for traceback): Key lstm/basic_lstm_cell/bias not found in checkpoint [[Node: save/RestoreV2_380 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_arg_save/Const_0_0, save/RestoreV2_380/tensor_names, save/RestoreV2_380/shape_and_slices)]]

The only difference is that the word "kernel" (in the error message with TF 1.3) became "bias" with TF 1.0.

Does anyone know what is wrong?
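
If downgrading does not help, another approach sometimes used for this kind of mismatch is to rewrite the checkpoint so its variable names match what the installed TensorFlow expects. The sketch below assumes TensorFlow 1.x and assumes, without confirmation from this thread, that the checkpoint stores the LSTM parameters as `lstm/basic_lstm_cell/weights` and `lstm/basic_lstm_cell/biases`; check the actual keys in your checkpoint before relying on it. `ASSUMED_RENAMES`, `plan_renames`, and `rewrite_checkpoint` are hypothetical names, not part of im2txt.

```python
# Assumed old -> new names; verify against the keys in your own checkpoint.
ASSUMED_RENAMES = {
    "lstm/basic_lstm_cell/weights": "lstm/basic_lstm_cell/kernel",
    "lstm/basic_lstm_cell/biases": "lstm/basic_lstm_cell/bias",
}


def plan_renames(checkpoint_names, renames=ASSUMED_RENAMES):
    """Map every stored name to the name it should have in the new checkpoint."""
    return {name: renames.get(name, name) for name in checkpoint_names}


def rewrite_checkpoint(src_path, dst_path, renames=ASSUMED_RENAMES):
    """Copy a TF 1.x checkpoint, renaming variables according to `renames`."""
    import tensorflow as tf  # lazy import; TF 1.x API assumed

    reader = tf.train.NewCheckpointReader(src_path)
    plan = plan_renames(reader.get_variable_to_shape_map(), renames)
    with tf.Graph().as_default():
        for old_name, new_name in plan.items():
            # Recreate each stored tensor as a variable under its target name.
            tf.Variable(reader.get_tensor(old_name), name=new_name)
        saver = tf.train.Saver()
        with tf.Session() as sess:
            sess.run(tf.global_variables_initializer())
            # Skip the meta graph: only the renamed weights are needed.
            saver.save(sess, dst_path, write_meta_graph=False)
```

Calling `rewrite_checkpoint("model.ckpt-1000000", "model-renamed.ckpt")` would write a new checkpoint that could then be passed to `run_inference.py` as the checkpoint path. This has not been verified against the checkpoint discussed here.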

@hitvoice

Same error with TF 1.8rc0 on Python 2:

NotFoundError (see above for traceback): Key lstm/basic_lstm_cell/bias not found in checkpoint
         [[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]
         [[Node: save/RestoreV2/_755 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_765_save/RestoreV2", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]

@AaratiAkkapeddi

This is happening to me as well in 1.11.0.
