Usage:
 kill [options] <pid> [...]
Options:
 <pid> [...]            send signal to every <pid> listed
 -<signal>, -s, --signal <signal>
                        specify the <signal> to be sent
 -l, --list=[<signal>]  list all signal names, or convert one to a name
 -L, --table            list all signal names in a nice table
 -h, --help     display this help and exit
 -V, --version  output version information and exit
For more details see kill(1).
result
root 4049 4040 0 09:28 pts/20 00:00:00 grep movielens-
ok
/usr/local/lib/python3.8/dist-packages/scipy/__init__.py:146: UserWarning: A NumPy version >=1.16.5 and <1.23.0 is required for this version of SciPy (detected version 1.24.3
  warnings.warn(f"A NumPy version >={np_minversion} and <{np_maxversion}"
/usr/local/lib/python3.8/dist-packages/tensorflow_recommenders_addons/utils/ensure_tf_install.py:53: UserWarning: Tensorflow Recommenders Addons supports using Python ops for all Tensorflow versions above or equal to 2.5.1 and strictly below 2.5.1 (nightly versions are not supported). The versions of TensorFlow you are currently using is 2.8.3 and is not supported. Some things might work, some things might not. If you were to encounter a bug, do not file an issue. If you want to make sure you're using a tested and supported configuration, either change the TensorFlow version or the Recommenders Addons's version. You can find the compatibility matrix in Recommenders Addon's readme: https://github.com/tensorflow/recommenders-addons
  warnings.warn(
WARNING:tensorflow:dynamic_embedding.GraphKeys has already been deprecated. The Variable will not be added to collections because it does not actully own any value, but only a holder of tables, which may lead to import_meta_graph failed since non-valued object has been added to collection. If you need to use `tf.compat.v1.train.Saver` and access all Variables from collection, you could manually add it to the collection by tf.compat.v1.add_to_collections(names, var) instead.
2023-05-28 09:28:04.792032: E tensorflow/stream_executor/cuda/cuda_driver.cc:271] failed call to cuInit: UNKNOWN ERROR (34) INFO:tensorflow:TF_CONFIG environment variable: {'cluster': {'worker': ['localhost:2222'], 'ps': ['localhost:2223'], 'chief': ['localhost:2224']}, 'task': {'type': 'chief', 'index': 0}} I0528 09:28:04.792262 140045272737600 run_config.py:549] TF_CONFIG environment variable: {'cluster': {'worker': ['localhost:2222'], 'ps': ['localhost:2223'], 'chief': ['localhost:2224']}, 'task': {'type': 'chief', 'index': 0}} INFO:tensorflow:Using config: {'_model_dir': './ckpt', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 100, '_save_checkpoints_secs': None, '_session_config': device_filters: "/job:ps" device_filters: "/job:chief" allow_soft_placement: true graph_options { rewrite_options { meta_optimizer_iterations: ONE } } , '_keep_checkpoint_max': 2, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_checkpoint_save_graph_def': True, '_service': None, '_cluster_spec': ClusterSpec({'chief': ['localhost:2224'], 'ps': ['localhost:2223'], 'worker': ['localhost:2222']}), '_task_type': 'chief', '_task_id': 0, '_evaluation_master': '', '_master': 'grpc://localhost:2224', '_num_ps_replicas': 1, '_num_worker_replicas': 2, '_global_id_in_cluster': 0, '_is_chief': True} I0528 09:28:04.793045 140045272737600 estimator.py:202] Using config: {'_model_dir': './ckpt', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 100, '_save_checkpoints_secs': None, '_session_config': device_filters: "/job:ps" device_filters: "/job:chief" allow_soft_placement: true graph_options { rewrite_options { meta_optimizer_iterations: ONE } } , '_keep_checkpoint_max': 2, 
'_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_checkpoint_save_graph_def': True, '_service': None, '_cluster_spec': ClusterSpec({'chief': ['localhost:2224'], 'ps': ['localhost:2223'], 'worker': ['localhost:2222']}), '_task_type': 'chief', '_task_id': 0, '_evaluation_master': '', '_master': 'grpc://localhost:2224', '_num_ps_replicas': 1, '_num_worker_replicas': 2, '_global_id_in_cluster': 0, '_is_chief': True} INFO:tensorflow:Not using Distribute Coordinator. I0528 09:28:04.793832 140045272737600 estimator_training.py:182] Not using Distribute Coordinator. INFO:tensorflow:Start Tensorflow server. I0528 09:28:04.794151 140045272737600 training.py:777] Start Tensorflow server. WARNING:tensorflow:From /usr/local/lib/python3.8/dist-packages/tensorflow/python/training/training_util.py:396: Variable.initialized_value (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version. Instructions for updating: Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts. W0528 09:28:04.816967 140045272737600 deprecation.py:337] From /usr/local/lib/python3.8/dist-packages/tensorflow/python/training/training_util.py:396: Variable.initialized_value (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version. Instructions for updating: Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts. INFO:tensorflow:Calling model_fn. I0528 09:28:04.874122 140045272737600 estimator.py:1173] Calling model_fn. INFO:tensorflow:Done calling model_fn. I0528 09:28:05.118644 140045272737600 estimator.py:1175] Done calling model_fn. INFO:tensorflow:Create CheckpointSaverHook. 
I0528 09:28:05.120058 140045272737600 basic_session_run_hooks.py:558] Create CheckpointSaverHook.
INFO:tensorflow:Warm-starting from: ckpt1
I0528 09:28:05.166725 140045272737600 warm_start_util.py:113] Warm-starting from: ckpt1
k>>> (None, 1)
k>>> (None, 1)
k>>> (None, 1)
mode>>> train
ids>>>shape (None, 2)
ps_list ['/job:ps/replica:0/task:0/CPU:0']
warm start ckpt1
de_variables>>> []
saveable.op>>> Tensor("Const:0", shape=(), dtype=string, device=/job:chief/task:0/device:CPU:0)
Traceback (most recent call last):
  File "main.py", line 270, in <module>
    app.run(main)
  File "/usr/local/lib/python3.8/dist-packages/absl/app.py", line 312, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.8/dist-packages/absl/app.py", line 258, in _run_main
    sys.exit(main(argv))
  File "main.py", line 260, in main
    train(FLAGS.model_dir)
  File "main.py", line 220, in train
    tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow_estimator/python/estimator/training.py", line 504, in train_and_evaluate
    return executor.run()
  File "/usr/local/lib/python3.8/dist-packages/tensorflow_estimator/python/estimator/training.py", line 671, in run
    getattr(self, task_to_run)()
  File "/usr/local/lib/python3.8/dist-packages/tensorflow_estimator/python/estimator/training.py", line 676, in run_chief
    return self._start_distributed_training(
  File "/usr/local/lib/python3.8/dist-packages/tensorflow_estimator/python/estimator/training.py", line 825, in _start_distributed_training
    self._estimator.train(
  File "/usr/local/lib/python3.8/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 360, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 1186, in _train_model
    return self._train_model_default(input_fn, hooks, saving_listeners)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 1217, in _train_model_default
    return self._train_with_estimator_spec(estimator_spec, worker_hooks,
  File "/usr/local/lib/python3.8/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 1512, in _train_with_estimator_spec
    with training.MonitoredTrainingSession(
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/training/monitored_session.py", line 609, in MonitoredTrainingSession
    return MonitoredSession(
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/training/monitored_session.py", line 1054, in __init__
    super(MonitoredSession, self).__init__(
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/training/monitored_session.py", line 745, in __init__
    h.begin()
  File "/usr/local/lib/python3.8/dist-packages/tensorflow_recommenders_addons/dynamic_embedding/python/ops/warm_start_util.py", line 190, in begin
    self._restore_op = warm_start(
  File "/usr/local/lib/python3.8/dist-packages/tensorflow_recommenders_addons/dynamic_embedding/python/ops/warm_start_util.py", line 142, in warm_start
    with ops.colocate_with(saveable.op._resource_handle.op):
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/framework/ops.py", line 513, in __getattr__
    self.__getattribute__(name)
AttributeError: 'Tensor' object has no attribute '_resource_handle'
/usr/local/lib/python3.8/dist-packages/scipy/__init__.py:146: UserWarning: A NumPy version >=1.16.5 and <1.23.0 is required for this version of SciPy (detected version 1.24.3
  warnings.warn(f"A NumPy version >={np_minversion} and <{np_maxversion}"
/usr/local/lib/python3.8/dist-packages/tensorflow_recommenders_addons/utils/ensure_tf_install.py:53: UserWarning: Tensorflow Recommenders Addons supports using Python ops for all Tensorflow versions above or equal to 2.5.1 and strictly below 2.5.1 (nightly versions are not supported).
The versions of TensorFlow you are currently using is 2.8.3 and is not supported. Some things might work, some things might not. If you were to encounter a bug, do not file an issue. If you want to make sure you're using a tested and supported configuration, either change the TensorFlow version or the Recommenders Addons's version. You can find the compatibility matrix in Recommenders Addon's readme: https://github.com/tensorflow/recommenders-addons warnings.warn( WARNING:tensorflow:dynamic_embedding.GraphKeys has already been deprecated. The Variable will not be added to collections because it does not actully own any value, but only a holder of tables, which may lead to import_meta_graph failed since non-valued object has been added to collection. If you need to use `tf.compat.v1.train.Saver` and access all Variables from collection, you could manually add it to the collection by tf.compat.v1.add_to_collections(names, var) instead. 2023-05-28 09:28:05.647171: E tensorflow/stream_executor/cuda/cuda_driver.cc:271] failed call to cuInit: UNKNOWN ERROR (34) INFO:tensorflow:TF_CONFIG environment variable: {'cluster': {'worker': ['localhost:2222'], 'ps': ['localhost:2223'], 'chief': ['localhost:2224']}, 'task': {'type': 'worker', 'index': 0}} I0528 09:28:05.647387 139833197688640 run_config.py:549] TF_CONFIG environment variable: {'cluster': {'worker': ['localhost:2222'], 'ps': ['localhost:2223'], 'chief': ['localhost:2224']}, 'task': {'type': 'worker', 'index': 0}} INFO:tensorflow:Using config: {'_model_dir': './ckpt', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 100, '_save_checkpoints_secs': None, '_session_config': device_filters: "/job:ps" device_filters: "/job:worker/task:0" allow_soft_placement: true graph_options { rewrite_options { meta_optimizer_iterations: ONE } } , '_keep_checkpoint_max': 2, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': 
None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_checkpoint_save_graph_def': True, '_service': None, '_cluster_spec': ClusterSpec({'chief': ['localhost:2224'], 'ps': ['localhost:2223'], 'worker': ['localhost:2222']}), '_task_type': 'worker', '_task_id': 0, '_evaluation_master': '', '_master': 'grpc://localhost:2222', '_num_ps_replicas': 1, '_num_worker_replicas': 2, '_global_id_in_cluster': 1, '_is_chief': False} I0528 09:28:05.648209 139833197688640 estimator.py:202] Using config: {'_model_dir': './ckpt', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 100, '_save_checkpoints_secs': None, '_session_config': device_filters: "/job:ps" device_filters: "/job:worker/task:0" allow_soft_placement: true graph_options { rewrite_options { meta_optimizer_iterations: ONE } } , '_keep_checkpoint_max': 2, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_checkpoint_save_graph_def': True, '_service': None, '_cluster_spec': ClusterSpec({'chief': ['localhost:2224'], 'ps': ['localhost:2223'], 'worker': ['localhost:2222']}), '_task_type': 'worker', '_task_id': 0, '_evaluation_master': '', '_master': 'grpc://localhost:2222', '_num_ps_replicas': 1, '_num_worker_replicas': 2, '_global_id_in_cluster': 1, '_is_chief': False} INFO:tensorflow:Not using Distribute Coordinator. I0528 09:28:05.649021 139833197688640 estimator_training.py:182] Not using Distribute Coordinator. INFO:tensorflow:Start Tensorflow server. I0528 09:28:05.649334 139833197688640 training.py:777] Start Tensorflow server. INFO:tensorflow:Waiting 5 secs before starting training. 
I0528 09:28:05.661170 139833197688640 training.py:821] Waiting 5 secs before starting training. /usr/local/lib/python3.8/dist-packages/scipy/__init__.py:146: UserWarning: A NumPy version >=1.16.5 and <1.23.0 is required for this version of SciPy (detected version 1.24.3 warnings.warn(f"A NumPy version >={np_minversion} and <{np_maxversion}" /usr/local/lib/python3.8/dist-packages/tensorflow_recommenders_addons/utils/ensure_tf_install.py:53: UserWarning: Tensorflow Recommenders Addons supports using Python ops for all Tensorflow versions above or equal to 2.5.1 and strictly below 2.5.1 (nightly versions are not supported). The versions of TensorFlow you are currently using is 2.8.3 and is not supported. Some things might work, some things might not. If you were to encounter a bug, do not file an issue. If you want to make sure you're using a tested and supported configuration, either change the TensorFlow version or the Recommenders Addons's version. You can find the compatibility matrix in Recommenders Addon's readme: https://github.com/tensorflow/recommenders-addons warnings.warn( WARNING:tensorflow:dynamic_embedding.GraphKeys has already been deprecated. The Variable will not be added to collections because it does not actully own any value, but only a holder of tables, which may lead to import_meta_graph failed since non-valued object has been added to collection. If you need to use `tf.compat.v1.train.Saver` and access all Variables from collection, you could manually add it to the collection by tf.compat.v1.add_to_collections(names, var) instead. 
2023-05-28 09:28:06.595631: E tensorflow/stream_executor/cuda/cuda_driver.cc:271] failed call to cuInit: UNKNOWN ERROR (34) INFO:tensorflow:TF_CONFIG environment variable: {'cluster': {'worker': ['localhost:2222'], 'ps': ['localhost:2223'], 'chief': ['localhost:2224']}, 'task': {'type': 'ps', 'index': 0}} I0528 09:28:06.595910 140074642167616 run_config.py:549] TF_CONFIG environment variable: {'cluster': {'worker': ['localhost:2222'], 'ps': ['localhost:2223'], 'chief': ['localhost:2224']}, 'task': {'type': 'ps', 'index': 0}} INFO:tensorflow:Using config: {'_model_dir': './ckpt', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 100, '_save_checkpoints_secs': None, '_session_config': device_filters: "/job:ps" device_filters: "/job:worker" device_filters: "/job:chief" device_filters: "/job:master" allow_soft_placement: true graph_options { rewrite_options { meta_optimizer_iterations: ONE } } , '_keep_checkpoint_max': 2, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_checkpoint_save_graph_def': True, '_service': None, '_cluster_spec': ClusterSpec({'chief': ['localhost:2224'], 'ps': ['localhost:2223'], 'worker': ['localhost:2222']}), '_task_type': 'ps', '_task_id': 0, '_evaluation_master': '', '_master': 'grpc://localhost:2223', '_num_ps_replicas': 1, '_num_worker_replicas': 2, '_global_id_in_cluster': 2, '_is_chief': False} I0528 09:28:06.596813 140074642167616 estimator.py:202] Using config: {'_model_dir': './ckpt', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 100, '_save_checkpoints_secs': None, '_session_config': device_filters: "/job:ps" device_filters: "/job:worker" device_filters: "/job:chief" device_filters: "/job:master" allow_soft_placement: true graph_options { 
rewrite_options { meta_optimizer_iterations: ONE } } , '_keep_checkpoint_max': 2, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_checkpoint_save_graph_def': True, '_service': None, '_cluster_spec': ClusterSpec({'chief': ['localhost:2224'], 'ps': ['localhost:2223'], 'worker': ['localhost:2222']}), '_task_type': 'ps', '_task_id': 0, '_evaluation_master': '', '_master': 'grpc://localhost:2223', '_num_ps_replicas': 1, '_num_worker_replicas': 2, '_global_id_in_cluster': 2, '_is_chief': False}
INFO:tensorflow:Not using Distribute Coordinator.
I0528 09:28:06.597707 140074642167616 estimator_training.py:182] Not using Distribute Coordinator.
INFO:tensorflow:Start Tensorflow server.
I0528 09:28:06.597994 140074642167616 training.py:777] Start Tensorflow server.
E0528 09:28:06.604324975 4150 server_chttp2.cc:40] {"created":"@1685266086.604274236","description":"No address added out of total 1 resolved","file":"external/com_github_grpc_grpc/src/core/ext/transport/chttp2/server/chttp2_server.cc","file_line":395,"referenced_errors":[{"created":"@1685266086.604272760","description":"Failed to add any wildcard listeners","file":"external/com_github_grpc_grpc/src/core/lib/iomgr/tcp_server_posix.cc","file_line":342,"referenced_errors":[{"created":"@1685266086.604235988","description":"Address family not supported by protocol","errno":97,"file":"external/com_github_grpc_grpc/src/core/lib/iomgr/socket_utils_common_posix.cc","file_line":420,"os_error":"Address family not supported by protocol","syscall":"socket","target_address":"[::]:2223"},{"created":"@1685266086.604272423","description":"Unable to configure socket","fd":6,"file":"external/com_github_grpc_grpc/src/core/lib/iomgr/tcp_server_utils_posix_common.cc","file_line":216,"referenced_errors":[{"created":"@1685266086.604269617","description":"Address already in use","errno":98,"file":"external/com_github_grpc_grpc/src/core/lib/iomgr/tcp_server_utils_posix_common.cc","file_line":189,"os_error":"Address already in use","syscall":"bind"}]}]}]}
2023-05-28 09:28:06.604412: E tensorflow/core/distributed_runtime/rpc/grpc_server_lib.cc:566] UNKNOWN: Could not start gRPC server
Traceback (most recent call last):
  File "main.py", line 270, in <module>
    app.run(main)
  File "/usr/local/lib/python3.8/dist-packages/absl/app.py", line 312, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.8/dist-packages/absl/app.py", line 258, in _run_main
    sys.exit(main(argv))
  File "main.py", line 260, in main
    train(FLAGS.model_dir)
  File "main.py", line 220, in train
    tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow_estimator/python/estimator/training.py", line 504, in train_and_evaluate
    return executor.run()
  File "/usr/local/lib/python3.8/dist-packages/tensorflow_estimator/python/estimator/training.py", line 671, in run
    getattr(self, task_to_run)()
  File "/usr/local/lib/python3.8/dist-packages/tensorflow_estimator/python/estimator/training.py", line 718, in run_ps
    server = self._start_std_server(config)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow_estimator/python/estimator/training.py", line 786, in _start_std_server
    server = server_lib.Server(
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/training/server_lib.py", line 144, in __init__
    self._server = c_api.TF_NewServer(self._server_def.SerializeToString())
tensorflow.python.framework.errors_impl.UnknownError: Could not start gRPC server
WARNING:tensorflow:From /usr/local/lib/python3.8/dist-packages/tensorflow/python/training/training_util.py:396: Variable.initialized_value (from
tensorflow.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.
W0528 09:28:10.672068 139833197688640 deprecation.py:337] From /usr/local/lib/python3.8/dist-packages/tensorflow/python/training/training_util.py:396: Variable.initialized_value (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.
INFO:tensorflow:Calling model_fn.
I0528 09:28:10.704444 139833197688640 estimator.py:1173] Calling model_fn.
INFO:tensorflow:Done calling model_fn.
I0528 09:28:10.922719 139833197688640 estimator.py:1175] Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
I0528 09:28:10.924189 139833197688640 basic_session_run_hooks.py:558] Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
I0528 09:28:11.010834 139833197688640 monitored_session.py:243] Graph was finalized.
INFO:tensorflow:Running local_init_op.
I0528 09:28:11.054974 139833197688640 session_manager.py:527] Running local_init_op.
INFO:tensorflow:Done running local_init_op.
I0528 09:28:11.061418 139833197688640 session_manager.py:530] Done running local_init_op.
INFO:tensorflow:loss = 0.00030233845, step = 3128 I0528 09:28:11.190842 139833197688640 basic_session_run_hooks.py:266] loss = 0.00030233845, step = 3128 INFO:tensorflow:emb_size = 10 I0528 09:28:11.191122 139833197688640 basic_session_run_hooks.py:266] emb_size = 10 INFO:tensorflow:emb_size = 10 (0.147 sec) I0528 09:28:11.338624 139833197688640 basic_session_run_hooks.py:264] emb_size = 10 (0.147 sec) INFO:tensorflow:emb_size = 10 (0.035 sec) I0528 09:28:11.373679 139833197688640 basic_session_run_hooks.py:264] emb_size = 10 (0.035 sec) INFO:tensorflow:emb_size = 10 (0.036 sec) I0528 09:28:11.409201 139833197688640 basic_session_run_hooks.py:264] emb_size = 10 (0.036 sec) INFO:tensorflow:emb_size = 10 (0.035 sec) I0528 09:28:11.444450 139833197688640 basic_session_run_hooks.py:264] emb_size = 10 (0.035 sec) INFO:tensorflow:emb_size = 10 (0.032 sec) I0528 09:28:11.476409 139833197688640 basic_session_run_hooks.py:264] emb_size = 10 (0.032 sec) INFO:tensorflow:emb_size = 10 (0.032 sec) I0528 09:28:11.508870 139833197688640 basic_session_run_hooks.py:264] emb_size = 10 (0.032 sec) INFO:tensorflow:emb_size = 10 (0.033 sec) I0528 09:28:11.541531 139833197688640 basic_session_run_hooks.py:264] emb_size = 10 (0.033 sec) INFO:tensorflow:emb_size = 10 (0.038 sec) I0528 09:28:11.579110 139833197688640 basic_session_run_hooks.py:264] emb_size = 10 (0.038 sec) INFO:tensorflow:emb_size = 10 (0.037 sec) I0528 09:28:11.616489 139833197688640 basic_session_run_hooks.py:264] emb_size = 10 (0.037 sec) INFO:tensorflow:loss = 0.00021921672, step = 3228 (0.463 sec) I0528 09:28:11.654058 139833197688640 basic_session_run_hooks.py:264] loss = 0.00021921672, step = 3228 (0.463 sec) INFO:tensorflow:emb_size = 10 (0.038 sec) I0528 09:28:11.654341 139833197688640 basic_session_run_hooks.py:264] emb_size = 10 (0.038 sec) INFO:tensorflow:emb_size = 10 (0.034 sec) I0528 09:28:11.688720 139833197688640 basic_session_run_hooks.py:264] emb_size = 10 (0.034 sec) INFO:tensorflow:emb_size = 10 
(0.034 sec) I0528 09:28:11.722474 139833197688640 basic_session_run_hooks.py:264] emb_size = 10 (0.034 sec) INFO:tensorflow:emb_size = 10 (0.038 sec) I0528 09:28:11.760245 139833197688640 basic_session_run_hooks.py:264] emb_size = 10 (0.038 sec) INFO:tensorflow:emb_size = 10 (0.036 sec) I0528 09:28:11.796334 139833197688640 basic_session_run_hooks.py:264] emb_size = 10 (0.036 sec) INFO:tensorflow:emb_size = 10 (0.037 sec) I0528 09:28:11.832874 139833197688640 basic_session_run_hooks.py:264] emb_size = 10 (0.037 sec) INFO:tensorflow:emb_size = 10 (0.037 sec) I0528 09:28:11.869731 139833197688640 basic_session_run_hooks.py:264] emb_size = 10 (0.037 sec) INFO:tensorflow:emb_size = 10 (0.038 sec) I0528 09:28:11.907912 139833197688640 basic_session_run_hooks.py:264] emb_size = 10 (0.038 sec) INFO:tensorflow:emb_size = 10 (0.036 sec) I0528 09:28:11.944333 139833197688640 basic_session_run_hooks.py:264] emb_size = 10 (0.036 sec) INFO:tensorflow:emb_size = 10 (0.033 sec) I0528 09:28:11.977528 139833197688640 basic_session_run_hooks.py:264] emb_size = 10 (0.033 sec) INFO:tensorflow:loss = 0.0001237304, step = 3328 (0.360 sec) I0528 09:28:12.013804 139833197688640 basic_session_run_hooks.py:264] loss = 0.0001237304, step = 3328 (0.360 sec) INFO:tensorflow:emb_size = 10 (0.037 sec) I0528 09:28:12.014047 139833197688640 basic_session_run_hooks.py:264] emb_size = 10 (0.037 sec) INFO:tensorflow:emb_size = 10 (0.034 sec) I0528 09:28:12.047687 139833197688640 basic_session_run_hooks.py:264] emb_size = 10 (0.034 sec) INFO:tensorflow:emb_size = 10 (0.036 sec) I0528 09:28:12.083745 139833197688640 basic_session_run_hooks.py:264] emb_size = 10 (0.036 sec) INFO:tensorflow:emb_size = 10 (0.041 sec) I0528 09:28:12.124903 139833197688640 basic_session_run_hooks.py:264] emb_size = 10 (0.041 sec) INFO:tensorflow:emb_size = 10 (0.035 sec) I0528 09:28:12.160395 139833197688640 basic_session_run_hooks.py:264] emb_size = 10 (0.035 sec) INFO:tensorflow:emb_size = 10 (0.034 sec) I0528 
09:28:12.194021 139833197688640 basic_session_run_hooks.py:264] emb_size = 10 (0.034 sec) INFO:tensorflow:emb_size = 10 (0.033 sec) I0528 09:28:12.227432 139833197688640 basic_session_run_hooks.py:264] emb_size = 10 (0.033 sec) INFO:tensorflow:emb_size = 10 (0.036 sec) I0528 09:28:12.263111 139833197688640 basic_session_run_hooks.py:264] emb_size = 10 (0.036 sec) INFO:tensorflow:emb_size = 10 (0.038 sec) I0528 09:28:12.301128 139833197688640 basic_session_run_hooks.py:264] emb_size = 10 (0.038 sec) INFO:tensorflow:emb_size = 10 (0.040 sec) I0528 09:28:12.341221 139833197688640 basic_session_run_hooks.py:264] emb_size = 10 (0.040 sec) INFO:tensorflow:loss = 0.000117327305, step = 3428 (0.376 sec) I0528 09:28:12.389701 139833197688640 basic_session_run_hooks.py:264] loss = 0.000117327305, step = 3428 (0.376 sec) INFO:tensorflow:emb_size = 10 (0.049 sec) I0528 09:28:12.389959 139833197688640 basic_session_run_hooks.py:264] emb_size = 10 (0.049 sec) INFO:tensorflow:emb_size = 10 (0.042 sec) I0528 09:28:12.432028 139833197688640 basic_session_run_hooks.py:264] emb_size = 10 (0.042 sec) INFO:tensorflow:emb_size = 10 (0.040 sec) I0528 09:28:12.472251 139833197688640 basic_session_run_hooks.py:264] emb_size = 10 (0.040 sec) INFO:tensorflow:emb_size = 10 (0.036 sec) I0528 09:28:12.508009 139833197688640 basic_session_run_hooks.py:264] emb_size = 10 (0.036 sec) INFO:tensorflow:emb_size = 10 (0.035 sec) I0528 09:28:12.542820 139833197688640 basic_session_run_hooks.py:264] emb_size = 10 (0.035 sec) INFO:tensorflow:emb_size = 10 (0.040 sec) I0528 09:28:12.582829 139833197688640 basic_session_run_hooks.py:264] emb_size = 10 (0.040 sec) INFO:tensorflow:emb_size = 10 (0.041 sec) I0528 09:28:12.624187 139833197688640 basic_session_run_hooks.py:264] emb_size = 10 (0.041 sec) INFO:tensorflow:emb_size = 10 (0.041 sec) I0528 09:28:12.665272 139833197688640 basic_session_run_hooks.py:264] emb_size = 10 (0.041 sec) INFO:tensorflow:emb_size = 10 (0.033 sec) I0528 09:28:12.697775 
139833197688640 basic_session_run_hooks.py:264] emb_size = 10 (0.033 sec)
INFO:tensorflow:emb_size = 10 (0.033 sec)
I0528 09:28:12.730712 139833197688640 basic_session_run_hooks.py:264] emb_size = 10 (0.033 sec)
2023-05-28 09:28:12.732436: W tensorflow/core/distributed_runtime/rpc/grpc_worker_service.cc:513] RecvTensor cancelled for 69785301723769019
2023-05-28 09:28:12.732525: W tensorflow/core/distributed_runtime/rpc/grpc_worker_service.cc:513] RecvTensor cancelled for 69785301723769019
2023-05-28 09:28:12.732580: W tensorflow/core/distributed_runtime/rpc/grpc_worker_service.cc:513] RecvTensor cancelled for 69785301723769019
2023-05-28 09:28:12.732623: W tensorflow/core/distributed_runtime/rpc/grpc_worker_service.cc:513] RecvTensor cancelled for 69785301723769019
2023-05-28 09:28:12.732663: W tensorflow/core/distributed_runtime/rpc/grpc_worker_service.cc:513] RecvTensor cancelled for 69785301723769019
INFO:tensorflow:Loss for final step: 5.5710232e-05.
I0528 09:28:12.977537 139833197688640 estimator.py:361] Loss for final step: 5.5710232e-05.
k>>> (None, 1)
k>>> (None, 1)
k>>> (None, 1)
mode>>> train
ids>>>shape (None, 2)
ps_list ['/job:ps/replica:0/task:0/CPU:0']
warm start ckpt1