ERROR:tensorflow: Failed to close session after error.Other threads may hang. #44824
@etetteh
I am using TensorFlow version 1.15.4 and a Google Cloud TPU of the same version on GCP.
Pretty much, what I am doing is running the following line of code:
@etetteh
@Saduf2019 The Google ELECTRA code base is written for version 1.15. Or do you mean I should upgrade the version of the TPU?
Closing this issue since it's resolved on another thread. Thanks!
I am trying to pretrain my ELECTRA base model, and I keep getting this output:
Running training
2020-11-13 08:00:18.044763: W tensorflow/core/distributed_runtime/rpc/grpc_session.cc:370] GrpcSession::ListDevices will initialize the session with an empty graph and other defaults because the session has not yet been created.
Model is built!
2020-11-13 08:00:48.956655: W tensorflow/core/distributed_runtime/rpc/grpc_session.cc:370] GrpcSession::ListDevices will initialize the session with an empty graph and other defaults because the session has not yet been created.
ERROR:tensorflow:Error recorded from infeed: From /job:worker/replica:0/task:0:
{{function_node _inference_tf_data_experimental_map_and_batch_69}} Key: segment_ids. Can't parse serialized Example.
[[{{node ParseSingleExample/ParseSingleExample}}]]
[[input_pipeline_task0/while/IteratorGetNext]]
ERROR:tensorflow:Closing session due to error From /job:worker/replica:0/task:0:
{{function_node _inference_tf_data_experimental_map_and_batch_69}} Key: segment_ids. Can't parse serialized Example.
[[{{node ParseSingleExample/ParseSingleExample}}]]
[[input_pipeline_task0/while/IteratorGetNext]]
2020-11-13 08:01:08.642776: W tensorflow/core/distributed_runtime/rpc/grpc_remote_master.cc:157] RPC failed with status = "Unavailable: Socket closed" and grpc_error_string = "{"created":"@1605254468.642525410","description":"Error received from peer","file":"external/grpc/src/core/lib/surface/call.cc","file_line":1039,"grpc_message":"Socket closed","grpc_status":14}", maybe retrying the RPC
2020-11-13 08:01:08.642779: W tensorflow/core/distributed_runtime/rpc/grpc_remote_master.cc:157] RPC failed with status = "Unavailable: Socket closed" and grpc_error_string = "{"created":"@1605254468.642549072","description":"Error received from peer","file":"external/grpc/src/core/lib/surface/call.cc","file_line":1039,"grpc_message":"Socket closed","grpc_status":14}", maybe retrying the RPC
ERROR:tensorflow:Error recorded from outfeed: Step was cancelled by an explicit call to `Session::Close()`.
ERROR:tensorflow:Failed to close session after error.Other threads may hang.
2020-11-13 08:01:50.857700: W tensorflow/core/distributed_runtime/rpc/grpc_session.cc:370] GrpcSession::ListDevices will initialize the session with an empty graph and other defaults because the session has not yet been created.
ERROR:tensorflow:Error recorded from infeed: From /job:worker/replica:0/task:0:
{{function_node _inference_tf_data_experimental_map_and_batch_69}} Key: segment_ids. Can't parse serialized Example.
[[{{node ParseSingleExample/ParseSingleExample}}]]
[[input_pipeline_task0/while/IteratorGetNext]]
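The first error in the log is `Key: segment_ids. Can't parse serialized Example.`, and the session-close failure follows from it. A parse error of this kind often means a record's feature does not match the fixed-length spec used by the input pipeline, for example because the sequence length used to build the TFRecords differs from the one used for training. The snippet below is a minimal sketch for checking that, not the ELECTRA code itself. The feature keys in `ASSUMED_KEYS`, the helper names, and the file path in the usage note are assumptions made for illustration.

```python
from typing import Dict, List, Sequence

# Assumed feature keys; adjust them to match the parsing spec in your input pipeline.
ASSUMED_KEYS = ("input_ids", "input_mask", "segment_ids")


def find_bad_features(features: Dict[str, List[int]],
                      expected_len: int,
                      required_keys: Sequence[str] = ASSUMED_KEYS) -> Dict[str, int]:
    """Return {key: actual_length} for keys whose length differs from expected_len.

    A key that is missing from the record is reported with length -1.
    """
    bad = {}
    for key in required_keys:
        if key not in features:
            bad[key] = -1
        elif len(features[key]) != expected_len:
            bad[key] = len(features[key])
    return bad


def decode_int64_features(serialized: bytes) -> Dict[str, List[int]]:
    """Decode one serialized tf.train.Example into {key: list of int64 values}."""
    import tensorflow as tf  # imported lazily so the pure-Python check has no TF dependency

    example = tf.train.Example.FromString(serialized)
    return {key: list(feature.int64_list.value)
            for key, feature in example.features.feature.items()}


def scan_tfrecord(path: str, expected_len: int, limit: int = 100) -> None:
    """Print the first `limit` records whose features do not match expected_len."""
    import tensorflow as tf

    for index, record in enumerate(tf.compat.v1.io.tf_record_iterator(path)):
        if index >= limit:
            break
        bad = find_bad_features(decode_int64_features(record), expected_len)
        if bad:
            print("record", index, "mismatched features:", bad)
```

To use it, call `scan_tfrecord` with the path of one pretraining TFRecord file (a hypothetical local path such as `"pretrain_data.tfrecord"`) and the `max_seq_length` the training run is configured with. Any record it prints has a feature that a fixed-length parse at that length would reject.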