WARNING: Logging before flag parsing goes to stderr.
W0420 07:30:11.011545 140395597309760 deprecation_wrapper.py:119] From /data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/core/ops/py_x_ops.py:26: The name tf.resource_loader.get_path_to_datafile is deprecated. Please use tf.compat.v1.resource_loader.get_path_to_datafile instead.
W0420 07:30:11.038940 140395597309760 deprecation_wrapper.py:119] From /data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/core/py_utils.py:1234: The name tf.get_variable_scope is deprecated. Please use tf.compat.v1.get_variable_scope instead.
W0420 07:30:11.294478 140395597309760 deprecation_wrapper.py:119] From /data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/trainer.py:1556: The name tf.app.run is deprecated. Please use tf.compat.v1.app.run instead.
W0420 07:30:11.295180 140395597309760 deprecation_wrapper.py:119] From /data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/model_imports.py:46: The name tf.logging.info is deprecated. Please use tf.compat.v1.logging.info instead.
I0420 07:30:11.295275 140395597309760 model_imports.py:46] Importing lingvo.tasks.asr.params
W0420 07:30:11.316593 140395597309760 deprecation_wrapper.py:119] From /data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/model_registry.py:121: The name tf.logging.debug is deprecated. Please use tf.compat.v1.logging.debug instead.
I0420 07:30:11.316711 140395597309760 model_registry.py:124] Registering models from module: lingvo.tasks.asr.params.librispeech
I0420 07:30:11.320092 140395597309760 model_imports.py:46] Importing lingvo.tasks.image.params
I0420 07:30:11.322403 140395597309760 model_registry.py:124] Registering models from module: lingvo.tasks.image.params.mnist
I0420 07:30:11.322523 140395597309760 model_imports.py:46] Importing lingvo.tasks.lm.params
I0420 07:30:11.324174 140395597309760 model_registry.py:124] Registering models from module: lingvo.tasks.lm.params.one_billion_wds
I0420 07:30:11.326229 140395597309760 model_imports.py:46] Importing lingvo.tasks.mt.params
I0420 07:30:11.330373 140395597309760 model_registry.py:124] Registering models from module: lingvo.tasks.mt.params.wmt14_en_de
I0420 07:30:11.335803 140395597309760 model_registry.py:124] Registering models from module: lingvo.tasks.mt.params.wmtm16_en_de
I0420 07:30:11.335920 140395597309760 model_imports.py:46] Importing lingvo.tasks.punctuator.params
I0420 07:30:11.337392 140395597309760 model_registry.py:124] Registering models from module: lingvo.tasks.punctuator.params.codelab
W0420 07:30:11.337567 140395597309760 deprecation_wrapper.py:119] From /data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/trainer.py:1515: The name tf.logging.set_verbosity is deprecated. Please use tf.compat.v1.logging.set_verbosity instead.
W0420 07:30:11.337685 140395597309760 deprecation_wrapper.py:119] From /data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/trainer.py:1515: The name tf.logging.INFO is deprecated. Please use tf.compat.v1.logging.INFO instead.
W0420 07:30:11.337846 140395597309760 deprecation_wrapper.py:119] From /data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/trainer.py:1383: The name tf.train.Server is deprecated. Please use tf.distribute.Server instead.
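The warnings above are harmless: each one names its own fix (the `tf.compat.v1` alias). Independently of that, the C++-runtime lines that follow can be quieted with the standard `TF_CPP_MIN_LOG_LEVEL` environment variable. A minimal sketch (the variable and the `tf.compat.v1` names are standard TensorFlow 1.14-era API; exactly which messages remain depends on your TF build):

```python
import os

# TF_CPP_MIN_LOG_LEVEL controls the C++-runtime log lines
# ("Your CPU supports instructions ...", device discovery, XLA setup, etc.).
# It must be set BEFORE TensorFlow is first imported:
#   "0" = all messages, "1" = hide INFO, "2" = also hide WARNING, "3" = also hide ERROR.
os.environ.setdefault("TF_CPP_MIN_LOG_LEVEL", "1")

# The Python-level deprecation warnings come from 1.x names; the log itself
# gives the replacement for each, e.g.:
#   tf.app.run                -> tf.compat.v1.app.run
#   tf.logging.set_verbosity  -> tf.compat.v1.logging.set_verbosity
#   tf.train.Server           -> tf.distribute.Server
```

Updating the call sites (rather than suppressing the warnings) is the durable fix, since the 1.x names were removed in TF 2.x.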
2019-04-20 07:30:11.338234: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-04-20 07:30:11.350665: I tensorflow/stream_executor/platform/default/dso_loader.cc:43] Successfully opened dynamic library libcuda.so.1
2019-04-20 07:30:13.961255: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x6fb62e0 executing computations on platform CUDA. Devices:
2019-04-20 07:30:13.961341: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): TITAN Xp, Compute Capability 6.1
2019-04-20 07:30:13.961383: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (1): TITAN Xp, Compute Capability 6.1
2019-04-20 07:30:13.961422: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (2): TITAN Xp, Compute Capability 6.1
2019-04-20 07:30:13.961434: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (3): TITAN Xp, Compute Capability 6.1
2019-04-20 07:30:13.969634: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2199885000 Hz
2019-04-20 07:30:13.979006: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x711be40 executing computations on platform Host. Devices:
2019-04-20 07:30:13.979064: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): ,
2019-04-20 07:30:13.979776: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1595] Found device 0 with properties: name: TITAN Xp major: 6 minor: 1 memoryClockRate(GHz): 1.582 pciBusID: 0000:02:00.0 totalMemory: 11.91GiB freeMemory: 11.75GiB
2019-04-20 07:30:13.980268: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1595] Found device 1 with properties: name: TITAN Xp major: 6 minor: 1 memoryClockRate(GHz): 1.582 pciBusID: 0000:03:00.0 totalMemory: 11.91GiB freeMemory: 11.75GiB
2019-04-20 07:30:13.980740: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1595] Found device 2 with properties: name: TITAN Xp major: 6 minor: 1 memoryClockRate(GHz): 1.582 pciBusID: 0000:82:00.0 totalMemory: 11.91GiB freeMemory: 11.75GiB
2019-04-20 07:30:13.981206: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1595] Found device 3 with properties: name: TITAN Xp major: 6 minor: 1 memoryClockRate(GHz): 1.582 pciBusID: 0000:83:00.0 totalMemory: 11.91GiB freeMemory: 11.75GiB
2019-04-20 07:30:13.984208: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1718] Adding visible gpu devices: 0, 1, 2, 3
2019-04-20 07:30:13.984778: I tensorflow/stream_executor/platform/default/dso_loader.cc:43] Successfully opened dynamic library libcudart.so.10.0
2019-04-20 07:30:13.991259: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1126] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-04-20 07:30:13.991291: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1132]    0 1 2 3
2019-04-20 07:30:13.991309: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1145] 0: N Y N N
2019-04-20 07:30:13.991319: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1145] 1: Y N N N
2019-04-20 07:30:13.991328: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1145] 2: N N N Y
2019-04-20 07:30:13.991337: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1145] 3: N N Y N
2019-04-20 07:30:13.993076: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1266] Created TensorFlow device (/job:local/replica:0/task:0/device:GPU:0 with 11427 MB memory) -> physical GPU (device: 0, name: TITAN Xp, pci bus id: 0000:02:00.0, compute capability: 6.1)
2019-04-20 07:30:13.993564: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1266] Created TensorFlow device (/job:local/replica:0/task:0/device:GPU:1 with 11427 MB memory) -> physical GPU (device: 1, name: TITAN Xp, pci bus id: 0000:03:00.0, compute capability: 6.1)
2019-04-20 07:30:13.993994: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1266] Created TensorFlow device (/job:local/replica:0/task:0/device:GPU:2 with 11427 MB memory) -> physical GPU (device: 2, name: TITAN Xp, pci bus id: 0000:82:00.0, compute capability: 6.1)
2019-04-20 07:30:13.995082: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1266] Created TensorFlow device (/job:local/replica:0/task:0/device:GPU:3 with 11427 MB memory) -> physical GPU (device: 3, name: TITAN Xp, pci bus id: 0000:83:00.0, compute capability: 6.1)
2019-04-20 07:30:13.999280: I tensorflow/core/distributed_runtime/rpc/grpc_channel.cc:250] Initialize GrpcChannelCache for job local -> {0 -> localhost:40087}
2019-04-20 07:30:14.010769: I tensorflow/core/distributed_runtime/rpc/grpc_server_lib.cc:365] Started server with target: grpc://localhost:40087
I0420 07:30:14.019190 140395597309760 trainer.py:1263] Job controller start
I0420 07:30:14.087660 140395597309760 base_runner.py:67] ============================================================
I0420 07:30:14.100358 140395597309760 base_runner.py:69] allow_implicit_capture : NoneType
I0420 07:30:14.100505 140395597309760 base_runner.py:69] cls : type/lingvo.core.base_model/SingleTaskModel
I0420 07:30:14.100614 140395597309760 base_runner.py:69] cluster.add_summary : NoneType
I0420 07:30:14.100713 140395597309760 base_runner.py:69] cluster.cls : type/lingvo.core.cluster/_Cluster
I0420 07:30:14.100819 140395597309760 base_runner.py:69] cluster.controller.cpus_per_replica : 1
I0420 07:30:14.100914 140395597309760 base_runner.py:69] cluster.controller.devices_per_split : 1
I0420 07:30:14.101003 140395597309760 base_runner.py:69] cluster.controller.gpus_per_replica : 0
I0420 07:30:14.101094 140395597309760 base_runner.py:69] cluster.controller.name : '/job:local'
I0420 07:30:14.101186 140395597309760 base_runner.py:69] cluster.controller.num_tpu_hosts : 0
I0420 07:30:14.101277 140395597309760 base_runner.py:69] cluster.controller.replicas : 1
I0420 07:30:14.101367 140395597309760 base_runner.py:69] cluster.controller.tpus_per_replica : 0
I0420 07:30:14.101458 140395597309760 base_runner.py:69] cluster.decoder.cpus_per_replica : 1
I0420 07:30:14.101547 140395597309760 base_runner.py:69] cluster.decoder.devices_per_split : 1
I0420 07:30:14.101638 140395597309760 base_runner.py:69] cluster.decoder.gpus_per_replica : 1
I0420 07:30:14.101732 140395597309760 base_runner.py:69] cluster.decoder.name : '/job:local'
I0420 07:30:14.101828 140395597309760 base_runner.py:69] cluster.decoder.num_tpu_hosts : 0
I0420 07:30:14.101918 140395597309760 base_runner.py:69] cluster.decoder.replicas : 1
I0420 07:30:14.102009 140395597309760 base_runner.py:69] cluster.decoder.tpus_per_replica : 0
I0420 07:30:14.102098 140395597309760 base_runner.py:69] cluster.evaler.cpus_per_replica : 1
I0420 07:30:14.102188 140395597309760 base_runner.py:69] cluster.evaler.devices_per_split : 1
I0420 07:30:14.102277 140395597309760 base_runner.py:69] cluster.evaler.gpus_per_replica : 1
I0420 07:30:14.102365 140395597309760 base_runner.py:69] cluster.evaler.name : '/job:local'
I0420 07:30:14.102453 140395597309760 base_runner.py:69] cluster.evaler.num_tpu_hosts : 0
I0420 07:30:14.102544 140395597309760 base_runner.py:69] cluster.evaler.replicas : 1
I0420 07:30:14.102632 140395597309760 base_runner.py:69] cluster.evaler.tpus_per_replica : 0
I0420 07:30:14.102726 140395597309760 base_runner.py:69] cluster.input.cpus_per_replica : 1
I0420 07:30:14.102821 140395597309760 base_runner.py:69] cluster.input.devices_per_split : 1
I0420 07:30:14.102912 140395597309760 base_runner.py:69] cluster.input.gpus_per_replica : 0
I0420 07:30:14.103003 140395597309760 base_runner.py:69] cluster.input.name : '/job:local'
I0420 07:30:14.103091 140395597309760 base_runner.py:69] cluster.input.num_tpu_hosts : 0
I0420 07:30:14.103180 140395597309760 base_runner.py:69] cluster.input.replicas : 0
I0420 07:30:14.103271 140395597309760 base_runner.py:69] cluster.input.tpus_per_replica : 0
I0420 07:30:14.103358 140395597309760 base_runner.py:69] cluster.job : 'controller'
I0420 07:30:14.103446 140395597309760 base_runner.py:69] cluster.mode : 'async'
I0420 07:30:14.103534 140395597309760 base_runner.py:69] cluster.ps.cpus_per_replica : 1
I0420 07:30:14.103624 140395597309760 base_runner.py:69] cluster.ps.devices_per_split : 1
I0420 07:30:14.103738 140395597309760 base_runner.py:69] cluster.ps.gpus_per_replica : 0
I0420 07:30:14.103827 140395597309760 base_runner.py:69] cluster.ps.name : '/job:local'
I0420 07:30:14.103914 140395597309760 base_runner.py:69] cluster.ps.num_tpu_hosts : 0
I0420 07:30:14.104001 140395597309760 base_runner.py:69] cluster.ps.replicas : 1
I0420 07:30:14.104088 140395597309760 base_runner.py:69] cluster.ps.tpus_per_replica : 0
I0420 07:30:14.104173 140395597309760 base_runner.py:69] cluster.task : 0
I0420 07:30:14.104264 140395597309760 base_runner.py:69] cluster.worker.cpus_per_replica : 1
I0420 07:30:14.104351 140395597309760 base_runner.py:69] cluster.worker.devices_per_split : 1
I0420 07:30:14.104446 140395597309760 base_runner.py:69] cluster.worker.gpus_per_replica : 4
I0420 07:30:14.104536 140395597309760 base_runner.py:69] cluster.worker.name : '/job:local'
I0420 07:30:14.104620 140395597309760 base_runner.py:69] cluster.worker.num_tpu_hosts : 0
I0420 07:30:14.104707 140395597309760 base_runner.py:69] cluster.worker.replicas : 1
I0420 07:30:14.104811 140395597309760 base_runner.py:69] cluster.worker.tpus_per_replica : 0
I0420 07:30:14.104897 140395597309760 base_runner.py:69] dtype : float32
I0420 07:30:14.104984 140395597309760 base_runner.py:69] fprop_dtype : NoneType
I0420 07:30:14.105072 140395597309760 base_runner.py:69] inference_driver_name : NoneType
I0420 07:30:14.105158 140395597309760 base_runner.py:69] input.allow_implicit_capture : NoneType
I0420 07:30:14.105245 140395597309760 base_runner.py:69] input.append_eos_frame : True
I0420 07:30:14.105331 140395597309760 base_runner.py:69] input.bucket_adjust_every_n : 0
I0420 07:30:14.105418 140395597309760 base_runner.py:69] input.bucket_batch_limit : [64, 32, 32, 32, 32, 32, 32, 32]
I0420 07:30:14.105505 140395597309760 base_runner.py:69] input.bucket_upper_bound : [639, 1062, 1275, 1377, 1449, 1506, 1563, 1710]
I0420 07:30:14.105678 140395597309760 base_runner.py:69] input.cls : type/lingvo.tasks.asr.input_generator/AsrInput
I0420 07:30:14.105782 140395597309760 base_runner.py:69] input.dtype : float32
I0420 07:30:14.105878 140395597309760 base_runner.py:69] input.file_buffer_size : 10000
I0420 07:30:14.105963 140395597309760 base_runner.py:69] input.file_parallelism : 16
I0420 07:30:14.106050 140395597309760 base_runner.py:69] input.file_pattern : 'tfrecord:/data/dingzhenyou/speech_data/librispeech/train/train.tfrecords-*'
I0420 07:30:14.106137 140395597309760 base_runner.py:69] input.file_random_seed : 0
I0420 07:30:14.106223 140395597309760 base_runner.py:69] input.flush_every_n : 0
I0420 07:30:14.106308 140395597309760 base_runner.py:69] input.fprop_dtype : NoneType
I0420 07:30:14.106395 140395597309760 base_runner.py:69] input.frame_size : 80
I0420 07:30:14.106481 140395597309760 base_runner.py:69] input.inference_driver_name : NoneType
I0420 07:30:14.106566 140395597309760 base_runner.py:69] input.is_eval : False
I0420 07:30:14.106653 140395597309760 base_runner.py:69] input.is_inference : NoneType
I0420 07:30:14.106745 140395597309760 base_runner.py:69] input.name : 'input'
I0420 07:30:14.106836 140395597309760 base_runner.py:69] input.num_batcher_threads : 1
I0420 07:30:14.106921 140395597309760 base_runner.py:69] input.num_samples : 281241
I0420 07:30:14.107008 140395597309760 base_runner.py:69] input.pad_to_max_seq_length : False
I0420 07:30:14.107094 140395597309760 base_runner.py:69] input.params_init.method : 'xavier'
I0420 07:30:14.107178 140395597309760 base_runner.py:69] input.params_init.scale : 1.000001
I0420 07:30:14.107264 140395597309760 base_runner.py:69] input.params_init.seed : NoneType
I0420 07:30:14.107350 140395597309760 base_runner.py:69] input.random_seed : NoneType
I0420 07:30:14.107436 140395597309760 base_runner.py:69] input.require_sequential_order : False
I0420 07:30:14.107522 140395597309760 base_runner.py:69] input.skip_lp_regularization : NoneType
I0420 07:30:14.107608 140395597309760 base_runner.py:69] input.source_max_length : 3000
I0420 07:30:14.107693 140395597309760 base_runner.py:69] input.target_max_length : 620
I0420 07:30:14.107789 140395597309760 base_runner.py:69] input.tokenizer.allow_implicit_capture : NoneType
I0420 07:30:14.107877 140395597309760 base_runner.py:69] input.tokenizer.append_eos : True
I0420 07:30:14.107960 140395597309760 base_runner.py:69] input.tokenizer.cls : type/lingvo.core.tokenizers/AsciiTokenizer
I0420 07:30:14.108072 140395597309760 base_runner.py:69] input.tokenizer.dtype : float32
I0420 07:30:14.108164 140395597309760 base_runner.py:69] input.tokenizer.fprop_dtype : NoneType
I0420 07:30:14.108249 140395597309760 base_runner.py:69] input.tokenizer.inference_driver_name : NoneType
I0420 07:30:14.108335 140395597309760 base_runner.py:69] input.tokenizer.is_eval : NoneType
I0420 07:30:14.108431 140395597309760 base_runner.py:69] input.tokenizer.is_inference : NoneType
I0420 07:30:14.108519 140395597309760 base_runner.py:69] input.tokenizer.name : 'tokenizer'
I0420 07:30:14.108606 140395597309760 base_runner.py:69] input.tokenizer.pad_to_max_length : True
I0420 07:30:14.108691 140395597309760 base_runner.py:69] input.tokenizer.params_init.method : 'xavier'
I0420 07:30:14.108793 140395597309760 base_runner.py:69] input.tokenizer.params_init.scale : 1.000001
I0420 07:30:14.108880 140395597309760 base_runner.py:69] input.tokenizer.params_init.seed : NoneType
I0420 07:30:14.108967 140395597309760 base_runner.py:69] input.tokenizer.random_seed : NoneType
I0420 07:30:14.109051 140395597309760 base_runner.py:69] input.tokenizer.skip_lp_regularization : NoneType
I0420 07:30:14.109138 140395597309760 base_runner.py:69] input.tokenizer.target_eos_id : 2
I0420 07:30:14.109224 140395597309760 base_runner.py:69] input.tokenizer.target_sos_id : 1
I0420 07:30:14.109308 140395597309760 base_runner.py:69] input.tokenizer.target_unk_id : 0
I0420 07:30:14.109395 140395597309760 base_runner.py:69] input.tokenizer.vn.global_vn : False
I0420 07:30:14.109479 140395597309760 base_runner.py:69] input.tokenizer.vn.per_step_vn : False
I0420 07:30:14.109565 140395597309760 base_runner.py:69] input.tokenizer.vn.scale : NoneType
I0420 07:30:14.109652 140395597309760 base_runner.py:69] input.tokenizer.vn.seed : NoneType
I0420 07:30:14.109747 140395597309760 base_runner.py:69] input.tokenizer.vocab_size : 76
I0420 07:30:14.109838 140395597309760 base_runner.py:69] input.tokenizer_dict : {}
I0420 07:30:14.109922 140395597309760 base_runner.py:69] input.tpu_infeed_parallism : 1
I0420 07:30:14.110008 140395597309760 base_runner.py:69] input.use_per_host_infeed : False
I0420 07:30:14.110093 140395597309760 base_runner.py:69] input.use_within_batch_mixing : False
I0420 07:30:14.110177 140395597309760 base_runner.py:69] input.vn.global_vn : False
I0420 07:30:14.110264 140395597309760 base_runner.py:69] input.vn.per_step_vn : False
I0420 07:30:14.110349 140395597309760 base_runner.py:69] input.vn.scale : NoneType
I0420 07:30:14.110435 140395597309760 base_runner.py:69] input.vn.seed : NoneType
I0420 07:30:14.110521 140395597309760 base_runner.py:69] is_eval : NoneType
I0420 07:30:14.110605 140395597309760 base_runner.py:69] is_inference : NoneType
I0420 07:30:14.110690 140395597309760 base_runner.py:69] model : 'asr.librispeech.Librispeech960Grapheme@/data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/tasks/asr/params/librispeech.py:181'
I0420 07:30:14.110785 140395597309760 base_runner.py:69] name : ''
I0420 07:30:14.110872 140395597309760 base_runner.py:69] params_init.method : 'xavier'
I0420 07:30:14.110959 140395597309760 base_runner.py:69] params_init.scale : 1.000001
I0420 07:30:14.111044 140395597309760 base_runner.py:69] params_init.seed : NoneType
I0420 07:30:14.111129 140395597309760 base_runner.py:69] random_seed : NoneType
I0420 07:30:14.111215 140395597309760 base_runner.py:69] skip_lp_regularization : NoneType
I0420 07:30:14.111299 140395597309760 base_runner.py:69] task.allow_implicit_capture : NoneType
I0420 07:30:14.111383 140395597309760 base_runner.py:69] task.cls : type/lingvo.tasks.asr.model/AsrModel
I0420 07:30:14.111469 140395597309760 base_runner.py:69] task.decoder.allow_implicit_capture : NoneType
I0420 07:30:14.111555 140395597309760 base_runner.py:69] task.decoder.atten_context_dim : 0
I0420 07:30:14.111640 140395597309760 base_runner.py:69] task.decoder.attention.allow_implicit_capture : NoneType
I0420 07:30:14.111735 140395597309760 base_runner.py:69] task.decoder.attention.atten_dropout_deterministic : False
I0420 07:30:14.111824 140395597309760 base_runner.py:69] task.decoder.attention.atten_dropout_prob : 0.0
I0420 07:30:14.111911 140395597309760 base_runner.py:69] task.decoder.attention.cls : type/lingvo.core.attention/AdditiveAttention
I0420 07:30:14.111996 140395597309760 base_runner.py:69] task.decoder.attention.dtype : float32
I0420 07:30:14.112081 140395597309760 base_runner.py:69] task.decoder.attention.fprop_dtype : NoneType
I0420 07:30:14.112174 140395597309760 base_runner.py:69] task.decoder.attention.hidden_dim : 128
I0420 07:30:14.112262 140395597309760 base_runner.py:69] task.decoder.attention.inference_driver_name : NoneType
I0420 07:30:14.112346 140395597309760 base_runner.py:69] task.decoder.attention.is_eval : NoneType
I0420 07:30:14.112432 140395597309760 base_runner.py:69] task.decoder.attention.is_inference : NoneType
I0420 07:30:14.112517 140395597309760 base_runner.py:69] task.decoder.attention.name : ''
I0420 07:30:14.112601 140395597309760 base_runner.py:69] task.decoder.attention.packed_input : False
I0420 07:30:14.112687 140395597309760 base_runner.py:69] task.decoder.attention.params_init.method : 'uniform_sqrt_dim'
I0420 07:30:14.112781 140395597309760 base_runner.py:69] task.decoder.attention.params_init.scale : 1.73205080757
I0420 07:30:14.112869 140395597309760 base_runner.py:69] task.decoder.attention.params_init.seed : NoneType
I0420 07:30:14.112955 140395597309760 base_runner.py:69] task.decoder.attention.qdomain.default : NoneType
I0420 07:30:14.113039 140395597309760 base_runner.py:69] task.decoder.attention.qdomain.fullyconnected : NoneType
I0420 07:30:14.113125 140395597309760 base_runner.py:69] task.decoder.attention.qdomain.softmax : NoneType
I0420 07:30:14.113209 140395597309760 base_runner.py:69] task.decoder.attention.query_dim : 0
I0420 07:30:14.113293 140395597309760 base_runner.py:69] task.decoder.attention.random_seed : NoneType
I0420 07:30:14.113379 140395597309760 base_runner.py:69] task.decoder.attention.same_batch_size : False
I0420 07:30:14.113464 140395597309760 base_runner.py:69] task.decoder.attention.skip_lp_regularization : NoneType
I0420 07:30:14.113549 140395597309760 base_runner.py:69] task.decoder.attention.source_dim : 0
I0420 07:30:14.113651 140395597309760 base_runner.py:69] task.decoder.attention.vn.global_vn : False
I0420 07:30:14.113740 140395597309760 base_runner.py:69] task.decoder.attention.vn.per_step_vn : False
I0420 07:30:14.113827 140395597309760 base_runner.py:69] task.decoder.attention.vn.scale : NoneType
I0420 07:30:14.113909 140395597309760 base_runner.py:69] task.decoder.attention.vn.seed : NoneType
I0420 07:30:14.113992 140395597309760 base_runner.py:69] task.decoder.attention_plot_font_properties : FontProperties
I0420 07:30:14.114078 140395597309760 base_runner.py:69] task.decoder.beam_search.allow_empty_terminated_hyp : True
I0420 07:30:14.114161 140395597309760 base_runner.py:69] task.decoder.beam_search.allow_implicit_capture : NoneType
I0420 07:30:14.114243 140395597309760 base_runner.py:69] task.decoder.beam_search.batch_major_state : True
I0420 07:30:14.114327 140395597309760 base_runner.py:69] task.decoder.beam_search.beam_size : 3.0
I0420 07:30:14.114411 140395597309760 base_runner.py:69] task.decoder.beam_search.cls : type/lingvo.core.beam_search_helper/BeamSearchHelper
I0420 07:30:14.114494 140395597309760 base_runner.py:69] task.decoder.beam_search.coverage_penalty : 0.0
I0420 07:30:14.114577 140395597309760 base_runner.py:69] task.decoder.beam_search.dtype : float32
I0420 07:30:14.114660 140395597309760 base_runner.py:69] task.decoder.beam_search.ensure_full_beam : False
I0420 07:30:14.114751 140395597309760 base_runner.py:69] task.decoder.beam_search.force_eos_in_last_step : False
I0420 07:30:14.114836 140395597309760 base_runner.py:69] task.decoder.beam_search.fprop_dtype : NoneType
I0420 07:30:14.114919 140395597309760 base_runner.py:69] task.decoder.beam_search.inference_driver_name : NoneType
I0420 07:30:14.115003 140395597309760 base_runner.py:69] task.decoder.beam_search.is_eval : NoneType
I0420 07:30:14.115087 140395597309760 base_runner.py:69] task.decoder.beam_search.is_inference : NoneType
I0420 07:30:14.115169 140395597309760 base_runner.py:69] task.decoder.beam_search.length_normalization : 0.0
I0420 07:30:14.115252 140395597309760 base_runner.py:69] task.decoder.beam_search.merge_paths : False
I0420 07:30:14.115334 140395597309760 base_runner.py:69] task.decoder.beam_search.name : 'beam_search'
I0420 07:30:14.115423 140395597309760 base_runner.py:69] task.decoder.beam_search.num_hyps_per_beam : 8
I0420 07:30:14.115508 140395597309760 base_runner.py:69] task.decoder.beam_search.params_init.method : 'xavier'
I0420 07:30:14.115592 140395597309760 base_runner.py:69] task.decoder.beam_search.params_init.scale : 1.000001
I0420 07:30:14.115674 140395597309760 base_runner.py:69] task.decoder.beam_search.params_init.seed : NoneType
I0420 07:30:14.115767 140395597309760 base_runner.py:69] task.decoder.beam_search.random_seed : NoneType
I0420 07:30:14.115852 140395597309760 base_runner.py:69] task.decoder.beam_search.skip_lp_regularization : NoneType
I0420 07:30:14.115935 140395597309760 base_runner.py:69] task.decoder.beam_search.target_eoc_id : -1
I0420 07:30:14.116018 140395597309760 base_runner.py:69] task.decoder.beam_search.target_eos_id : 2
I0420 07:30:14.116100 140395597309760 base_runner.py:69] task.decoder.beam_search.target_seq_len : 0
I0420 07:30:14.116183 140395597309760 base_runner.py:69] task.decoder.beam_search.target_seq_length_ratio : 1.0
I0420 07:30:14.116266 140395597309760 base_runner.py:69] task.decoder.beam_search.target_sos_id : 1
I0420 07:30:14.116348 140395597309760 base_runner.py:69] task.decoder.beam_search.valid_eos_max_logit_delta : 5.0
I0420 07:30:14.116430 140395597309760 base_runner.py:69] task.decoder.beam_search.vn.global_vn : False
I0420 07:30:14.116513 140395597309760 base_runner.py:69] task.decoder.beam_search.vn.per_step_vn : False
I0420 07:30:14.116596 140395597309760 base_runner.py:69] task.decoder.beam_search.vn.scale : NoneType
I0420 07:30:14.116678 140395597309760 base_runner.py:69] task.decoder.beam_search.vn.seed : NoneType
I0420 07:30:14.116769 140395597309760 base_runner.py:69] task.decoder.cls : type/lingvo.tasks.asr.decoder/AsrDecoder
I0420 07:30:14.116852 140395597309760 base_runner.py:69] task.decoder.contextualizer.allow_implicit_capture : NoneType
I0420 07:30:14.116936 140395597309760 base_runner.py:69] task.decoder.contextualizer.cls : type/lingvo.tasks.asr.contextualizer_base/NullContextualizer
I0420 07:30:14.117021 140395597309760 base_runner.py:69] task.decoder.contextualizer.dtype : float32
I0420 07:30:14.117105 140395597309760 base_runner.py:69] task.decoder.contextualizer.fprop_dtype : NoneType
I0420 07:30:14.117188 140395597309760 base_runner.py:69] task.decoder.contextualizer.inference_driver_name : NoneType
I0420 07:30:14.117273 140395597309760 base_runner.py:69] task.decoder.contextualizer.is_eval : NoneType
I0420 07:30:14.117357 140395597309760 base_runner.py:69] task.decoder.contextualizer.is_inference : NoneType
I0420 07:30:14.117440 140395597309760 base_runner.py:69] task.decoder.contextualizer.name : ''
I0420 07:30:14.117525 140395597309760 base_runner.py:69] task.decoder.contextualizer.params_init.method : 'xavier'
I0420 07:30:14.117609 140395597309760 base_runner.py:69] task.decoder.contextualizer.params_init.scale : 1.000001
I0420 07:30:14.117692 140395597309760 base_runner.py:69] task.decoder.contextualizer.params_init.seed : NoneType
I0420 07:30:14.117789 140395597309760 base_runner.py:69] task.decoder.contextualizer.random_seed : NoneType
I0420 07:30:14.117873 140395597309760 base_runner.py:69] task.decoder.contextualizer.skip_lp_regularization : NoneType
I0420 07:30:14.117958 140395597309760 base_runner.py:69] task.decoder.contextualizer.vn.global_vn : False
I0420 07:30:14.118041 140395597309760 base_runner.py:69] task.decoder.contextualizer.vn.per_step_vn : False
I0420 07:30:14.118124 140395597309760 base_runner.py:69] task.decoder.contextualizer.vn.scale : NoneType
I0420 07:30:14.118207 140395597309760 base_runner.py:69] task.decoder.contextualizer.vn.seed : NoneType
I0420 07:30:14.118290 140395597309760 base_runner.py:69] task.decoder.dropout_prob : 0.0
I0420 07:30:14.118374 140395597309760 base_runner.py:69] task.decoder.dtype : float32
I0420 07:30:14.118457 140395597309760 base_runner.py:69] task.decoder.emb.allow_implicit_capture : NoneType
I0420 07:30:14.118540 140395597309760 base_runner.py:69] task.decoder.emb.cls : type/lingvo.core.layers/EmbeddingLayer
I0420 07:30:14.118630 140395597309760 base_runner.py:69] task.decoder.emb.dtype : float32
I0420 07:30:14.118716 140395597309760 base_runner.py:69] task.decoder.emb.embedding_dim : 0
I0420 07:30:14.118808 140395597309760 base_runner.py:69] task.decoder.emb.fprop_dtype : NoneType
I0420 07:30:14.118892 140395597309760 base_runner.py:69] task.decoder.emb.inference_driver_name : NoneType
I0420 07:30:14.118976 140395597309760 base_runner.py:69] task.decoder.emb.is_eval : NoneType
I0420 07:30:14.119060 140395597309760 base_runner.py:69] task.decoder.emb.is_inference : NoneType
I0420 07:30:14.119143 140395597309760 base_runner.py:69] task.decoder.emb.max_num_shards : 1
I0420 07:30:14.119225 140395597309760 base_runner.py:69] task.decoder.emb.name : ''
I0420 07:30:14.119308 140395597309760 base_runner.py:69] task.decoder.emb.on_ps : True
I0420 07:30:14.119393 140395597309760 base_runner.py:69] task.decoder.emb.params_init.method : 'uniform'
I0420 07:30:14.119476 140395597309760 base_runner.py:69] task.decoder.emb.params_init.scale : 1.0
I0420 07:30:14.119559 140395597309760 base_runner.py:69] task.decoder.emb.params_init.seed : NoneType
I0420 07:30:14.119642 140395597309760 base_runner.py:69] task.decoder.emb.random_seed : NoneType
I0420 07:30:14.119729 140395597309760 base_runner.py:69] task.decoder.emb.scale_sqrt_depth : False
I0420 07:30:14.119817 140395597309760 base_runner.py:69] task.decoder.emb.skip_lp_regularization : NoneType
I0420 07:30:14.119899 140395597309760 base_runner.py:69] task.decoder.emb.vn.global_vn : False
I0420 07:30:14.119982 140395597309760 base_runner.py:69] task.decoder.emb.vn.per_step_vn : False
I0420 07:30:14.120064 140395597309760 base_runner.py:69] task.decoder.emb.vn.scale : NoneType
I0420 07:30:14.120146 140395597309760 base_runner.py:69] task.decoder.emb.vn.seed : NoneType
I0420 07:30:14.120229 140395597309760 base_runner.py:69] task.decoder.emb.vocab_size : 76
I0420 07:30:14.120312 140395597309760 base_runner.py:69] task.decoder.emb_dim : 76
I0420 07:30:14.120395 140395597309760 base_runner.py:69] task.decoder.fprop_dtype : NoneType
I0420 07:30:14.120480 140395597309760 base_runner.py:69] task.decoder.fusion.allow_implicit_capture : NoneType
I0420 07:30:14.120563 140395597309760 base_runner.py:69] task.decoder.fusion.base_model_logits_dim : NoneType
I0420 07:30:14.120646 140395597309760 base_runner.py:69] task.decoder.fusion.cls : type/lingvo.tasks.asr.fusion/NullFusion
I0420 07:30:14.120735 140395597309760 base_runner.py:69] task.decoder.fusion.dtype : float32
I0420 07:30:14.120822 140395597309760 base_runner.py:69] task.decoder.fusion.fprop_dtype : NoneType
I0420 07:30:14.120906 140395597309760 base_runner.py:69] task.decoder.fusion.inference_driver_name : NoneType
I0420 07:30:14.120989 140395597309760 base_runner.py:69] task.decoder.fusion.is_eval : NoneType
I0420 07:30:14.121071 140395597309760 base_runner.py:69] task.decoder.fusion.is_inference : NoneType
I0420 07:30:14.121154 140395597309760 base_runner.py:69] task.decoder.fusion.lm.allow_implicit_capture : NoneType
I0420 07:30:14.121237 140395597309760 base_runner.py:69] task.decoder.fusion.lm.cls : type/lingvo.tasks.lm.layers/NullLm
I0420 07:30:14.121320 140395597309760 base_runner.py:69] task.decoder.fusion.lm.dtype : float32
I0420 07:30:14.121403 140395597309760 base_runner.py:69] task.decoder.fusion.lm.fprop_dtype : NoneType
I0420 07:30:14.121486 140395597309760 base_runner.py:69] task.decoder.fusion.lm.inference_driver_name : NoneType
I0420 07:30:14.121571 140395597309760 base_runner.py:69] task.decoder.fusion.lm.is_eval : NoneType
I0420 07:30:14.121651 140395597309760 base_runner.py:69] task.decoder.fusion.lm.is_inference : NoneType
I0420 07:30:14.121738 140395597309760 base_runner.py:69] task.decoder.fusion.lm.name : ''
I0420 07:30:14.121825 140395597309760 base_runner.py:69] task.decoder.fusion.lm.params_init.method : 'xavier'
I0420 07:30:14.121907 140395597309760 base_runner.py:69] task.decoder.fusion.lm.params_init.scale : 1.000001
I0420 07:30:14.121992 140395597309760 base_runner.py:69] task.decoder.fusion.lm.params_init.seed : NoneType
I0420 07:30:14.122082 140395597309760 base_runner.py:69] task.decoder.fusion.lm.random_seed : NoneType
I0420 07:30:14.122168 140395597309760 base_runner.py:69] task.decoder.fusion.lm.skip_lp_regularization : NoneType
I0420 07:30:14.122251 140395597309760 base_runner.py:69] task.decoder.fusion.lm.vn.global_vn : False
I0420 07:30:14.122334 140395597309760 base_runner.py:69] task.decoder.fusion.lm.vn.per_step_vn : False
I0420 07:30:14.122416 140395597309760 base_runner.py:69] task.decoder.fusion.lm.vn.scale : NoneType
I0420 07:30:14.122499 140395597309760 base_runner.py:69] task.decoder.fusion.lm.vn.seed : NoneType
I0420 07:30:14.122582 140395597309760 base_runner.py:69] task.decoder.fusion.lm.vocab_size : 96
I0420 07:30:14.122665 140395597309760 base_runner.py:69] task.decoder.fusion.name : ''
I0420 07:30:14.122755 140395597309760 base_runner.py:69] task.decoder.fusion.params_init.method : 'xavier'
I0420 07:30:14.122842 140395597309760 base_runner.py:69] task.decoder.fusion.params_init.scale : 1.000001
I0420 07:30:14.122925 140395597309760 base_runner.py:69] task.decoder.fusion.params_init.seed : NoneType
I0420 07:30:14.123008 140395597309760 base_runner.py:69] task.decoder.fusion.random_seed : NoneType
I0420 07:30:14.123091 140395597309760 base_runner.py:69] task.decoder.fusion.skip_lp_regularization : NoneType
I0420 07:30:14.123173 140395597309760 base_runner.py:69] task.decoder.fusion.vn.global_vn : False
I0420 07:30:14.123256 140395597309760 base_runner.py:69] task.decoder.fusion.vn.per_step_vn : False
I0420 07:30:14.123338 140395597309760 base_runner.py:69] task.decoder.fusion.vn.scale : NoneType
I0420 07:30:14.123420 140395597309760 base_runner.py:69] task.decoder.fusion.vn.seed : NoneType
I0420 07:30:14.123503 140395597309760 base_runner.py:69] task.decoder.inference_driver_name : NoneType
I0420 07:30:14.123585 140395597309760 base_runner.py:69] task.decoder.is_eval : NoneType
I0420 07:30:14.123667 140395597309760 base_runner.py:69] task.decoder.is_inference : NoneType
I0420 07:30:14.123760 140395597309760 base_runner.py:69] task.decoder.label_smoothing : NoneType
I0420 07:30:14.123846 140395597309760 base_runner.py:69] task.decoder.logit_types : {'logits': 1.0}
I0420 07:30:14.123929 140395597309760 base_runner.py:69] task.decoder.min_ground_truth_prob : 1.0
I0420 07:30:14.124013 140395597309760 base_runner.py:69] task.decoder.min_prob_step : 1000000.0
I0420 07:30:14.124128 140395597309760 base_runner.py:69] task.decoder.name : ''
I0420 07:30:14.124211 140395597309760 base_runner.py:69] task.decoder.packed_input : False
I0420 07:30:14.124291 140395597309760 base_runner.py:69] task.decoder.parallel_iterations : 30
I0420 07:30:14.124373 140395597309760 base_runner.py:69] task.decoder.params_init.method : 'xavier'
I0420 07:30:14.124455 140395597309760 base_runner.py:69] task.decoder.params_init.scale : 1.000001
I0420 07:30:14.124536 140395597309760 base_runner.py:69] task.decoder.params_init.seed : NoneType
I0420 07:30:14.124619 140395597309760 base_runner.py:69] task.decoder.per_token_avg_loss : True
I0420 07:30:14.124700 140395597309760 base_runner.py:69] task.decoder.prob_decay_start_step : 10000.0
I0420 07:30:14.124792 140395597309760 base_runner.py:69] task.decoder.random_seed : NoneType
I0420 07:30:14.124874 140395597309760 base_runner.py:69] task.decoder.residual_start : 0
I0420 07:30:14.124955 140395597309760 base_runner.py:69] task.decoder.rnn_cell_dim : 1024
I0420 07:30:14.125037 140395597309760 base_runner.py:69] task.decoder.rnn_cell_hidden_dim : 0
I0420 07:30:14.125118 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.allow_implicit_capture : NoneType
I0420 07:30:14.125201 140395597309760
base_runner.py:69] task.decoder.rnn_cell_tpl.apply_pruning : False I0420 07:30:14.125283 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.bias_init.method : 'constant' I0420 07:30:14.125365 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.bias_init.scale : 0.0 I0420 07:30:14.125447 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.bias_init.seed : 0 I0420 07:30:14.125535 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.cell_value_cap : 10.0 I0420 07:30:14.125619 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.cls : type/lingvo.core.rnn_cell/LSTMCellSimple I0420 07:30:14.125703 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.couple_input_forget_gates : False I0420 07:30:14.125794 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.dtype : float32 I0420 07:30:14.125876 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.enable_lstm_bias : True I0420 07:30:14.125958 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.forget_gate_bias : 0.0 I0420 07:30:14.126040 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.fprop_dtype : NoneType I0420 07:30:14.126121 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.inference_driver_name : NoneType I0420 07:30:14.126204 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.inputs_arity : 1 I0420 07:30:14.126285 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.is_eval : NoneType I0420 07:30:14.126365 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.is_inference : NoneType I0420 07:30:14.126446 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.name : '' I0420 07:30:14.126527 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.num_hidden_nodes : 0 I0420 07:30:14.126609 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.num_input_nodes : 0 I0420 07:30:14.126689 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.num_output_nodes : 0 I0420 
07:30:14.126785 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.output_nonlinearity : True I0420 07:30:14.126868 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.params_init.method : 'uniform' I0420 07:30:14.126950 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.params_init.scale : 0.1 I0420 07:30:14.127031 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.params_init.seed : NoneType I0420 07:30:14.127113 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.qdomain.c_state : NoneType I0420 07:30:14.127194 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.qdomain.default : NoneType I0420 07:30:14.127275 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.qdomain.fullyconnected : NoneType I0420 07:30:14.127357 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.qdomain.m_state : NoneType I0420 07:30:14.127439 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.qdomain.weight : NoneType I0420 07:30:14.127520 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.random_seed : NoneType I0420 07:30:14.127602 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.reset_cell_state : False I0420 07:30:14.127684 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.skip_lp_regularization : NoneType I0420 07:30:14.127774 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.vn.global_vn : False I0420 07:30:14.127856 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.vn.per_step_vn : False I0420 07:30:14.127938 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.vn.scale : NoneType I0420 07:30:14.128021 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.vn.seed : NoneType I0420 07:30:14.128102 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.zero_state_init_params.method : 'zeros' I0420 07:30:14.128184 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.zero_state_init_params.seed : NoneType I0420 
07:30:14.128264 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.zo_prob : 0.0 I0420 07:30:14.128346 140395597309760 base_runner.py:69] task.decoder.rnn_layers : 2 I0420 07:30:14.128427 140395597309760 base_runner.py:69] task.decoder.skip_lp_regularization : NoneType I0420 07:30:14.128509 140395597309760 base_runner.py:69] task.decoder.softmax.allow_implicit_capture : NoneType I0420 07:30:14.128592 140395597309760 base_runner.py:69] task.decoder.softmax.apply_pruning : False I0420 07:30:14.128683 140395597309760 base_runner.py:69] task.decoder.softmax.chunk_size : 0 I0420 07:30:14.128774 140395597309760 base_runner.py:69] task.decoder.softmax.cls : type/lingvo.core.layers/SimpleFullSoftmax I0420 07:30:14.128859 140395597309760 base_runner.py:69] task.decoder.softmax.dtype : float32 I0420 07:30:14.128942 140395597309760 base_runner.py:69] task.decoder.softmax.fprop_dtype : NoneType I0420 07:30:14.129024 140395597309760 base_runner.py:69] task.decoder.softmax.inference_driver_name : NoneType I0420 07:30:14.129106 140395597309760 base_runner.py:69] task.decoder.softmax.input_dim : 0 I0420 07:30:14.129188 140395597309760 base_runner.py:69] task.decoder.softmax.is_eval : NoneType I0420 07:30:14.129271 140395597309760 base_runner.py:69] task.decoder.softmax.is_inference : NoneType I0420 07:30:14.129352 140395597309760 base_runner.py:69] task.decoder.softmax.logits_abs_max : NoneType I0420 07:30:14.129432 140395597309760 base_runner.py:69] task.decoder.softmax.name : '' I0420 07:30:14.129514 140395597309760 base_runner.py:69] task.decoder.softmax.num_classes : 76 I0420 07:30:14.129596 140395597309760 base_runner.py:69] task.decoder.softmax.num_sampled : 0 I0420 07:30:14.129678 140395597309760 base_runner.py:69] task.decoder.softmax.num_shards : 1 I0420 07:30:14.129767 140395597309760 base_runner.py:69] task.decoder.softmax.params_init.method : 'uniform' I0420 07:30:14.129851 140395597309760 base_runner.py:69] task.decoder.softmax.params_init.scale : 0.1 I0420 
07:30:14.129931 140395597309760 base_runner.py:69] task.decoder.softmax.params_init.seed : NoneType I0420 07:30:14.130013 140395597309760 base_runner.py:69] task.decoder.softmax.qdomain.default : NoneType I0420 07:30:14.130095 140395597309760 base_runner.py:69] task.decoder.softmax.random_seed : NoneType I0420 07:30:14.130176 140395597309760 base_runner.py:69] task.decoder.softmax.skip_lp_regularization : NoneType I0420 07:30:14.130256 140395597309760 base_runner.py:69] task.decoder.softmax.vn.global_vn : False I0420 07:30:14.130337 140395597309760 base_runner.py:69] task.decoder.softmax.vn.per_step_vn : False I0420 07:30:14.130419 140395597309760 base_runner.py:69] task.decoder.softmax.vn.scale : NoneType I0420 07:30:14.130500 140395597309760 base_runner.py:69] task.decoder.softmax.vn.seed : NoneType I0420 07:30:14.130580 140395597309760 base_runner.py:69] task.decoder.softmax_uses_attention : True I0420 07:30:14.130661 140395597309760 base_runner.py:69] task.decoder.source_dim : 2048 I0420 07:30:14.130753 140395597309760 base_runner.py:69] task.decoder.target_eos_id : 2 I0420 07:30:14.130837 140395597309760 base_runner.py:69] task.decoder.target_seq_len : 620 I0420 07:30:14.130918 140395597309760 base_runner.py:69] task.decoder.target_sequence_sampler.allow_implicit_capture : NoneType I0420 07:30:14.131000 140395597309760 base_runner.py:69] task.decoder.target_sequence_sampler.cls : type/lingvo.core.target_sequence_sampler/TargetSequenceSampler I0420 07:30:14.131082 140395597309760 base_runner.py:69] task.decoder.target_sequence_sampler.dtype : float32 I0420 07:30:14.131164 140395597309760 base_runner.py:69] task.decoder.target_sequence_sampler.fprop_dtype : NoneType I0420 07:30:14.131244 140395597309760 base_runner.py:69] task.decoder.target_sequence_sampler.inference_driver_name : NoneType I0420 07:30:14.131326 140395597309760 base_runner.py:69] task.decoder.target_sequence_sampler.is_eval : NoneType I0420 07:30:14.131407 140395597309760 base_runner.py:69] 
task.decoder.target_sequence_sampler.is_inference : NoneType I0420 07:30:14.131489 140395597309760 base_runner.py:69] task.decoder.target_sequence_sampler.name : 'target_sequence_sampler' I0420 07:30:14.131570 140395597309760 base_runner.py:69] task.decoder.target_sequence_sampler.params_init.method : 'xavier' I0420 07:30:14.131652 140395597309760 base_runner.py:69] task.decoder.target_sequence_sampler.params_init.scale : 1.000001 I0420 07:30:14.131736 140395597309760 base_runner.py:69] task.decoder.target_sequence_sampler.params_init.seed : NoneType I0420 07:30:14.131822 140395597309760 base_runner.py:69] task.decoder.target_sequence_sampler.random_seed : NoneType I0420 07:30:14.131910 140395597309760 base_runner.py:69] task.decoder.target_sequence_sampler.skip_lp_regularization : NoneType I0420 07:30:14.131994 140395597309760 base_runner.py:69] task.decoder.target_sequence_sampler.target_eoc_id : -1 I0420 07:30:14.132076 140395597309760 base_runner.py:69] task.decoder.target_sequence_sampler.target_eos_id : 2 I0420 07:30:14.132157 140395597309760 base_runner.py:69] task.decoder.target_sequence_sampler.target_seq_len : 0 I0420 07:30:14.132236 140395597309760 base_runner.py:69] task.decoder.target_sequence_sampler.target_sos_id : 1 I0420 07:30:14.132318 140395597309760 base_runner.py:69] task.decoder.target_sequence_sampler.temperature : 1.0 I0420 07:30:14.132397 140395597309760 base_runner.py:69] task.decoder.target_sequence_sampler.vn.global_vn : False I0420 07:30:14.132478 140395597309760 base_runner.py:69] task.decoder.target_sequence_sampler.vn.per_step_vn : False I0420 07:30:14.132560 140395597309760 base_runner.py:69] task.decoder.target_sequence_sampler.vn.scale : NoneType I0420 07:30:14.132641 140395597309760 base_runner.py:69] task.decoder.target_sequence_sampler.vn.seed : NoneType I0420 07:30:14.132725 140395597309760 base_runner.py:69] task.decoder.target_sos_id : 1 I0420 07:30:14.132812 140395597309760 base_runner.py:69] 
task.decoder.use_unnormalized_logits_as_log_probs : True I0420 07:30:14.132894 140395597309760 base_runner.py:69] task.decoder.use_while_loop_based_unrolling : False I0420 07:30:14.132977 140395597309760 base_runner.py:69] task.decoder.vn.global_vn : False I0420 07:30:14.133059 140395597309760 base_runner.py:69] task.decoder.vn.per_step_vn : False I0420 07:30:14.133138 140395597309760 base_runner.py:69] task.decoder.vn.scale : NoneType I0420 07:30:14.133220 140395597309760 base_runner.py:69] task.decoder.vn.seed : NoneType I0420 07:30:14.133301 140395597309760 base_runner.py:69] task.dtype : float32 I0420 07:30:14.133383 140395597309760 base_runner.py:69] task.encoder.after_conv_lstm_cnn_tpl.activation : 'RELU' I0420 07:30:14.133462 140395597309760 base_runner.py:69] task.encoder.after_conv_lstm_cnn_tpl.allow_implicit_capture : NoneType I0420 07:30:14.133543 140395597309760 base_runner.py:69] task.encoder.after_conv_lstm_cnn_tpl.batch_norm : True I0420 07:30:14.133625 140395597309760 base_runner.py:69] task.encoder.after_conv_lstm_cnn_tpl.bias : False I0420 07:30:14.133733 140395597309760 base_runner.py:69] task.encoder.after_conv_lstm_cnn_tpl.bn_decay : 0.999 I0420 07:30:14.133817 140395597309760 base_runner.py:69] task.encoder.after_conv_lstm_cnn_tpl.bn_fold_weights : NoneType I0420 07:30:14.133898 140395597309760 base_runner.py:69] task.encoder.after_conv_lstm_cnn_tpl.causal_convolution : False I0420 07:30:14.133977 140395597309760 base_runner.py:69] task.encoder.after_conv_lstm_cnn_tpl.cls : type/lingvo.core.layers/Conv2DLayer I0420 07:30:14.134061 140395597309760 base_runner.py:69] task.encoder.after_conv_lstm_cnn_tpl.conv_last : False I0420 07:30:14.134140 140395597309760 base_runner.py:69] task.encoder.after_conv_lstm_cnn_tpl.dilation_rate : (1, 1) I0420 07:30:14.134221 140395597309760 base_runner.py:69] task.encoder.after_conv_lstm_cnn_tpl.disable_activation_quantization : False I0420 07:30:14.134300 140395597309760 base_runner.py:69] 
task.encoder.after_conv_lstm_cnn_tpl.dtype : float32 I0420 07:30:14.134382 140395597309760 base_runner.py:69] task.encoder.after_conv_lstm_cnn_tpl.filter_shape : [3, 3, 'NoneType', 'NoneType'] I0420 07:30:14.134462 140395597309760 base_runner.py:69] task.encoder.after_conv_lstm_cnn_tpl.filter_stride : [1, 1] I0420 07:30:14.134541 140395597309760 base_runner.py:69] task.encoder.after_conv_lstm_cnn_tpl.fprop_dtype : NoneType I0420 07:30:14.134619 140395597309760 base_runner.py:69] task.encoder.after_conv_lstm_cnn_tpl.inference_driver_name : NoneType I0420 07:30:14.134700 140395597309760 base_runner.py:69] task.encoder.after_conv_lstm_cnn_tpl.is_eval : NoneType I0420 07:30:14.134788 140395597309760 base_runner.py:69] task.encoder.after_conv_lstm_cnn_tpl.is_inference : NoneType I0420 07:30:14.134876 140395597309760 base_runner.py:69] task.encoder.after_conv_lstm_cnn_tpl.name : '' I0420 07:30:14.134958 140395597309760 base_runner.py:69] task.encoder.after_conv_lstm_cnn_tpl.params_init.method : 'truncated_gaussian' I0420 07:30:14.135037 140395597309760 base_runner.py:69] task.encoder.after_conv_lstm_cnn_tpl.params_init.scale : 0.1 I0420 07:30:14.135118 140395597309760 base_runner.py:69] task.encoder.after_conv_lstm_cnn_tpl.params_init.seed : NoneType I0420 07:30:14.135199 140395597309760 base_runner.py:69] task.encoder.after_conv_lstm_cnn_tpl.qdomain.default : NoneType I0420 07:30:14.135278 140395597309760 base_runner.py:69] task.encoder.after_conv_lstm_cnn_tpl.random_seed : NoneType I0420 07:30:14.135358 140395597309760 base_runner.py:69] task.encoder.after_conv_lstm_cnn_tpl.skip_lp_regularization : NoneType I0420 07:30:14.135437 140395597309760 base_runner.py:69] task.encoder.after_conv_lstm_cnn_tpl.vn.global_vn : False I0420 07:30:14.135518 140395597309760 base_runner.py:69] task.encoder.after_conv_lstm_cnn_tpl.vn.per_step_vn : False I0420 07:30:14.135596 140395597309760 base_runner.py:69] task.encoder.after_conv_lstm_cnn_tpl.vn.scale : NoneType I0420 07:30:14.135677 
140395597309760 base_runner.py:69] task.encoder.after_conv_lstm_cnn_tpl.vn.seed : NoneType I0420 07:30:14.135762 140395597309760 base_runner.py:69] task.encoder.after_conv_lstm_cnn_tpl.weight_norm : False I0420 07:30:14.135844 140395597309760 base_runner.py:69] task.encoder.allow_implicit_capture : NoneType I0420 07:30:14.135925 140395597309760 base_runner.py:69] task.encoder.bidi_rnn_type : 'func' I0420 07:30:14.136006 140395597309760 base_runner.py:69] task.encoder.cls : type/lingvo.tasks.asr.encoder/AsrEncoder I0420 07:30:14.136085 140395597309760 base_runner.py:69] task.encoder.cnn_tpl.activation : 'RELU' I0420 07:30:14.136164 140395597309760 base_runner.py:69] task.encoder.cnn_tpl.allow_implicit_capture : NoneType I0420 07:30:14.136245 140395597309760 base_runner.py:69] task.encoder.cnn_tpl.batch_norm : True I0420 07:30:14.136323 140395597309760 base_runner.py:69] task.encoder.cnn_tpl.bias : False I0420 07:30:14.136404 140395597309760 base_runner.py:69] task.encoder.cnn_tpl.bn_decay : 0.999 I0420 07:30:14.136482 140395597309760 base_runner.py:69] task.encoder.cnn_tpl.bn_fold_weights : NoneType I0420 07:30:14.136563 140395597309760 base_runner.py:69] task.encoder.cnn_tpl.causal_convolution : False I0420 07:30:14.136642 140395597309760 base_runner.py:69] task.encoder.cnn_tpl.cls : type/lingvo.core.layers/Conv2DLayer I0420 07:30:14.136727 140395597309760 base_runner.py:69] task.encoder.cnn_tpl.conv_last : False I0420 07:30:14.136810 140395597309760 base_runner.py:69] task.encoder.cnn_tpl.dilation_rate : (1, 1) I0420 07:30:14.136890 140395597309760 base_runner.py:69] task.encoder.cnn_tpl.disable_activation_quantization : False I0420 07:30:14.136971 140395597309760 base_runner.py:69] task.encoder.cnn_tpl.dtype : float32 I0420 07:30:14.137051 140395597309760 base_runner.py:69] task.encoder.cnn_tpl.filter_shape : (0, 0, 0, 0) I0420 07:30:14.137130 140395597309760 base_runner.py:69] task.encoder.cnn_tpl.filter_stride : (0, 0) I0420 07:30:14.137208 140395597309760 
base_runner.py:69] task.encoder.cnn_tpl.fprop_dtype : NoneType I0420 07:30:14.137288 140395597309760 base_runner.py:69] task.encoder.cnn_tpl.inference_driver_name : NoneType I0420 07:30:14.137367 140395597309760 base_runner.py:69] task.encoder.cnn_tpl.is_eval : NoneType I0420 07:30:14.137447 140395597309760 base_runner.py:69] task.encoder.cnn_tpl.is_inference : NoneType I0420 07:30:14.137526 140395597309760 base_runner.py:69] task.encoder.cnn_tpl.name : '' I0420 07:30:14.137604 140395597309760 base_runner.py:69] task.encoder.cnn_tpl.params_init.method : 'gaussian' I0420 07:30:14.137685 140395597309760 base_runner.py:69] task.encoder.cnn_tpl.params_init.scale : 0.001 I0420 07:30:14.137772 140395597309760 base_runner.py:69] task.encoder.cnn_tpl.params_init.seed : NoneType I0420 07:30:14.137852 140395597309760 base_runner.py:69] task.encoder.cnn_tpl.qdomain.default : NoneType I0420 07:30:14.137940 140395597309760 base_runner.py:69] task.encoder.cnn_tpl.random_seed : NoneType I0420 07:30:14.138022 140395597309760 base_runner.py:69] task.encoder.cnn_tpl.skip_lp_regularization : NoneType I0420 07:30:14.138103 140395597309760 base_runner.py:69] task.encoder.cnn_tpl.vn.global_vn : False I0420 07:30:14.138181 140395597309760 base_runner.py:69] task.encoder.cnn_tpl.vn.per_step_vn : False I0420 07:30:14.138261 140395597309760 base_runner.py:69] task.encoder.cnn_tpl.vn.scale : NoneType I0420 07:30:14.138339 140395597309760 base_runner.py:69] task.encoder.cnn_tpl.vn.seed : NoneType I0420 07:30:14.138418 140395597309760 base_runner.py:69] task.encoder.cnn_tpl.weight_norm : False I0420 07:30:14.138498 140395597309760 base_runner.py:69] task.encoder.conv_filter_shapes : [(3, 3, 1, 32), (3, 3, 32, 32)] I0420 07:30:14.138577 140395597309760 base_runner.py:69] task.encoder.conv_filter_strides : [(2, 2), (2, 2)] I0420 07:30:14.138658 140395597309760 base_runner.py:69] task.encoder.conv_lstm_tpl.allow_implicit_capture : NoneType I0420 07:30:14.138742 140395597309760 base_runner.py:69] 
task.encoder.conv_lstm_tpl.cell_shape : ['NoneType', 'NoneType', 'NoneType', 'NoneType'] I0420 07:30:14.138825 140395597309760 base_runner.py:69] task.encoder.conv_lstm_tpl.cell_value_cap : 10.0 I0420 07:30:14.138905 140395597309760 base_runner.py:69] task.encoder.conv_lstm_tpl.cls : type/lingvo.core.rnn_cell/ConvLSTMCell I0420 07:30:14.138986 140395597309760 base_runner.py:69] task.encoder.conv_lstm_tpl.dtype : float32 I0420 07:30:14.139065 140395597309760 base_runner.py:69] task.encoder.conv_lstm_tpl.filter_shape : [1, 3] I0420 07:30:14.139143 140395597309760 base_runner.py:69] task.encoder.conv_lstm_tpl.fprop_dtype : NoneType I0420 07:30:14.139225 140395597309760 base_runner.py:69] task.encoder.conv_lstm_tpl.inference_driver_name : NoneType I0420 07:30:14.139303 140395597309760 base_runner.py:69] task.encoder.conv_lstm_tpl.inputs_arity : 1 I0420 07:30:14.139384 140395597309760 base_runner.py:69] task.encoder.conv_lstm_tpl.inputs_shape : ['NoneType', 'NoneType', 'NoneType', 'NoneType'] I0420 07:30:14.139463 140395597309760 base_runner.py:69] task.encoder.conv_lstm_tpl.is_eval : NoneType I0420 07:30:14.139543 140395597309760 base_runner.py:69] task.encoder.conv_lstm_tpl.is_inference : NoneType I0420 07:30:14.139622 140395597309760 base_runner.py:69] task.encoder.conv_lstm_tpl.name : '' I0420 07:30:14.139702 140395597309760 base_runner.py:69] task.encoder.conv_lstm_tpl.num_input_nodes : 0 I0420 07:30:14.139790 140395597309760 base_runner.py:69] task.encoder.conv_lstm_tpl.num_output_nodes : 0 I0420 07:30:14.139870 140395597309760 base_runner.py:69] task.encoder.conv_lstm_tpl.output_nonlinearity : True I0420 07:30:14.139950 140395597309760 base_runner.py:69] task.encoder.conv_lstm_tpl.params_init.method : 'truncated_gaussian' I0420 07:30:14.140031 140395597309760 base_runner.py:69] task.encoder.conv_lstm_tpl.params_init.scale : 0.1 I0420 07:30:14.140111 140395597309760 base_runner.py:69] task.encoder.conv_lstm_tpl.params_init.seed : NoneType I0420 07:30:14.140191 
140395597309760 base_runner.py:69] task.encoder.conv_lstm_tpl.qdomain.default : NoneType I0420 07:30:14.140269 140395597309760 base_runner.py:69] task.encoder.conv_lstm_tpl.random_seed : NoneType I0420 07:30:14.140348 140395597309760 base_runner.py:69] task.encoder.conv_lstm_tpl.reset_cell_state : False I0420 07:30:14.140429 140395597309760 base_runner.py:69] task.encoder.conv_lstm_tpl.skip_lp_regularization : NoneType I0420 07:30:14.140507 140395597309760 base_runner.py:69] task.encoder.conv_lstm_tpl.vn.global_vn : False I0420 07:30:14.140588 140395597309760 base_runner.py:69] task.encoder.conv_lstm_tpl.vn.per_step_vn : False I0420 07:30:14.140667 140395597309760 base_runner.py:69] task.encoder.conv_lstm_tpl.vn.scale : NoneType I0420 07:30:14.140753 140395597309760 base_runner.py:69] task.encoder.conv_lstm_tpl.vn.seed : NoneType I0420 07:30:14.140834 140395597309760 base_runner.py:69] task.encoder.conv_lstm_tpl.zero_state_init_params.method : 'zeros' I0420 07:30:14.140921 140395597309760 base_runner.py:69] task.encoder.conv_lstm_tpl.zero_state_init_params.seed : NoneType I0420 07:30:14.141004 140395597309760 base_runner.py:69] task.encoder.conv_lstm_tpl.zo_prob : 0.0 I0420 07:30:14.141084 140395597309760 base_runner.py:69] task.encoder.dtype : float32 I0420 07:30:14.141165 140395597309760 base_runner.py:69] task.encoder.extra_per_layer_outputs : False I0420 07:30:14.141244 140395597309760 base_runner.py:69] task.encoder.fprop_dtype : NoneType I0420 07:30:14.141324 140395597309760 base_runner.py:69] task.encoder.highway_skip : False I0420 07:30:14.141405 140395597309760 base_runner.py:69] task.encoder.highway_skip_tpl.allow_implicit_capture : NoneType I0420 07:30:14.141484 140395597309760 base_runner.py:69] task.encoder.highway_skip_tpl.batch_norm : False I0420 07:30:14.141565 140395597309760 base_runner.py:69] task.encoder.highway_skip_tpl.carry_bias_init : 1.0 I0420 07:30:14.141644 140395597309760 base_runner.py:69] task.encoder.highway_skip_tpl.cls : 
type/lingvo.core.layers/HighwaySkipLayer I0420 07:30:14.141730 140395597309760 base_runner.py:69] task.encoder.highway_skip_tpl.couple_carry_transform_gates : False I0420 07:30:14.141813 140395597309760 base_runner.py:69] task.encoder.highway_skip_tpl.dtype : float32 I0420 07:30:14.141894 140395597309760 base_runner.py:69] task.encoder.highway_skip_tpl.fprop_dtype : NoneType I0420 07:30:14.141973 140395597309760 base_runner.py:69] task.encoder.highway_skip_tpl.inference_driver_name : NoneType I0420 07:30:14.142054 140395597309760 base_runner.py:69] task.encoder.highway_skip_tpl.input_dim : 0 I0420 07:30:14.142133 140395597309760 base_runner.py:69] task.encoder.highway_skip_tpl.is_eval : NoneType I0420 07:30:14.142214 140395597309760 base_runner.py:69] task.encoder.highway_skip_tpl.is_inference : NoneType I0420 07:30:14.142293 140395597309760 base_runner.py:69] task.encoder.highway_skip_tpl.name : '' I0420 07:30:14.142373 140395597309760 base_runner.py:69] task.encoder.highway_skip_tpl.params_init.method : 'xavier' I0420 07:30:14.142453 140395597309760 base_runner.py:69] task.encoder.highway_skip_tpl.params_init.scale : 1.000001 I0420 07:30:14.142534 140395597309760 base_runner.py:69] task.encoder.highway_skip_tpl.params_init.seed : NoneType I0420 07:30:14.142612 140395597309760 base_runner.py:69] task.encoder.highway_skip_tpl.random_seed : NoneType I0420 07:30:14.142693 140395597309760 base_runner.py:69] task.encoder.highway_skip_tpl.skip_lp_regularization : NoneType I0420 07:30:14.142781 140395597309760 base_runner.py:69] task.encoder.highway_skip_tpl.vn.global_vn : False I0420 07:30:14.142863 140395597309760 base_runner.py:69] task.encoder.highway_skip_tpl.vn.per_step_vn : False I0420 07:30:14.142941 140395597309760 base_runner.py:69] task.encoder.highway_skip_tpl.vn.scale : NoneType I0420 07:30:14.143021 140395597309760 base_runner.py:69] task.encoder.highway_skip_tpl.vn.seed : NoneType I0420 07:30:14.143100 140395597309760 base_runner.py:69] 
task.encoder.inference_driver_name : NoneType I0420 07:30:14.143178 140395597309760 base_runner.py:69] task.encoder.input_shape : ['NoneType', 'NoneType', 80, 1] I0420 07:30:14.143259 140395597309760 base_runner.py:69] task.encoder.is_eval : NoneType I0420 07:30:14.143338 140395597309760 base_runner.py:69] task.encoder.is_inference : NoneType I0420 07:30:14.143419 140395597309760 base_runner.py:69] task.encoder.lstm_cell_size : 1024 I0420 07:30:14.143497 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.allow_implicit_capture : NoneType I0420 07:30:14.143578 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.apply_pruning : False I0420 07:30:14.143656 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.bias_init.method : 'constant' I0420 07:30:14.143760 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.bias_init.scale : 0.0 I0420 07:30:14.143840 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.bias_init.seed : 0 I0420 07:30:14.143918 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.cell_value_cap : 10.0 I0420 07:30:14.144005 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.cls : type/lingvo.core.rnn_cell/LSTMCellSimple I0420 07:30:14.144085 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.couple_input_forget_gates : False I0420 07:30:14.144164 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.dtype : float32 I0420 07:30:14.144243 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.enable_lstm_bias : True I0420 07:30:14.144320 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.forget_gate_bias : 0.0 I0420 07:30:14.144398 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.fprop_dtype : NoneType I0420 07:30:14.144475 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.inference_driver_name : NoneType I0420 07:30:14.144553 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.inputs_arity : 1 I0420 07:30:14.144632 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.is_eval : 
NoneType I0420 07:30:14.144709 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.is_inference : NoneType I0420 07:30:14.144798 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.name : '' I0420 07:30:14.144877 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.num_hidden_nodes : 0 I0420 07:30:14.144958 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.num_input_nodes : 0 I0420 07:30:14.145035 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.num_output_nodes : 0 I0420 07:30:14.145113 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.output_nonlinearity : True I0420 07:30:14.145191 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.params_init.method : 'uniform' I0420 07:30:14.145270 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.params_init.scale : 0.1 I0420 07:30:14.145348 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.params_init.seed : NoneType I0420 07:30:14.145426 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.qdomain.c_state : NoneType I0420 07:30:14.145504 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.qdomain.default : NoneType I0420 07:30:14.145581 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.qdomain.fullyconnected : NoneType I0420 07:30:14.145661 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.qdomain.m_state : NoneType I0420 07:30:14.145744 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.qdomain.weight : NoneType I0420 07:30:14.145826 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.random_seed : NoneType I0420 07:30:14.145904 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.reset_cell_state : False I0420 07:30:14.145982 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.skip_lp_regularization : NoneType I0420 07:30:14.146061 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.vn.global_vn : False I0420 07:30:14.146140 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.vn.per_step_vn : False I0420 
07:30:14.146218 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.vn.scale : NoneType I0420 07:30:14.146296 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.vn.seed : NoneType I0420 07:30:14.146373 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.zero_state_init_params.method : 'zeros' I0420 07:30:14.146450 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.zero_state_init_params.seed : NoneType I0420 07:30:14.146528 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.zo_prob : 0.0 I0420 07:30:14.146608 140395597309760 base_runner.py:69] task.encoder.name : '' I0420 07:30:14.146686 140395597309760 base_runner.py:69] task.encoder.num_cnn_layers : 2 I0420 07:30:14.146769 140395597309760 base_runner.py:69] task.encoder.num_conv_lstm_layers : 0 I0420 07:30:14.146848 140395597309760 base_runner.py:69] task.encoder.num_lstm_layers : 4 I0420 07:30:14.146925 140395597309760 base_runner.py:69] task.encoder.packed_input : False I0420 07:30:14.147005 140395597309760 base_runner.py:69] task.encoder.pad_steps : 6 I0420 07:30:14.147083 140395597309760 base_runner.py:69] task.encoder.params_init.method : 'xavier' I0420 07:30:14.147161 140395597309760 base_runner.py:69] task.encoder.params_init.scale : 1.000001 I0420 07:30:14.147245 140395597309760 base_runner.py:69] task.encoder.params_init.seed : NoneType I0420 07:30:14.147325 140395597309760 base_runner.py:69] task.encoder.proj_tpl.activation : 'RELU' I0420 07:30:14.147403 140395597309760 base_runner.py:69] task.encoder.proj_tpl.affine_last : False I0420 07:30:14.147480 140395597309760 base_runner.py:69] task.encoder.proj_tpl.allow_implicit_capture : NoneType I0420 07:30:14.147558 140395597309760 base_runner.py:69] task.encoder.proj_tpl.batch_norm : True I0420 07:30:14.147636 140395597309760 base_runner.py:69] task.encoder.proj_tpl.bias_init : 0.0 I0420 07:30:14.147715 140395597309760 base_runner.py:69] task.encoder.proj_tpl.bn_fold_weights : NoneType I0420 07:30:14.147800 140395597309760 
base_runner.py:69] task.encoder.proj_tpl.cls : type/lingvo.core.layers/ProjectionLayer I0420 07:30:14.147881 140395597309760 base_runner.py:69] task.encoder.proj_tpl.dtype : float32 I0420 07:30:14.147958 140395597309760 base_runner.py:69] task.encoder.proj_tpl.fprop_dtype : NoneType I0420 07:30:14.148036 140395597309760 base_runner.py:69] task.encoder.proj_tpl.has_bias : False I0420 07:30:14.148116 140395597309760 base_runner.py:69] task.encoder.proj_tpl.inference_driver_name : NoneType I0420 07:30:14.148194 140395597309760 base_runner.py:69] task.encoder.proj_tpl.input_dim : 0 I0420 07:30:14.148272 140395597309760 base_runner.py:69] task.encoder.proj_tpl.is_eval : NoneType I0420 07:30:14.148350 140395597309760 base_runner.py:69] task.encoder.proj_tpl.is_inference : NoneType I0420 07:30:14.148427 140395597309760 base_runner.py:69] task.encoder.proj_tpl.name : '' I0420 07:30:14.148505 140395597309760 base_runner.py:69] task.encoder.proj_tpl.output_dim : 0 I0420 07:30:14.148583 140395597309760 base_runner.py:69] task.encoder.proj_tpl.params_init.method : 'truncated_gaussian' I0420 07:30:14.148662 140395597309760 base_runner.py:69] task.encoder.proj_tpl.params_init.scale : 0.1 I0420 07:30:14.148745 140395597309760 base_runner.py:69] task.encoder.proj_tpl.params_init.seed : NoneType I0420 07:30:14.148827 140395597309760 base_runner.py:69] task.encoder.proj_tpl.qdomain.default : NoneType I0420 07:30:14.148905 140395597309760 base_runner.py:69] task.encoder.proj_tpl.random_seed : NoneType I0420 07:30:14.148983 140395597309760 base_runner.py:69] task.encoder.proj_tpl.skip_lp_regularization : NoneType I0420 07:30:14.149061 140395597309760 base_runner.py:69] task.encoder.proj_tpl.vn.global_vn : False I0420 07:30:14.149141 140395597309760 base_runner.py:69] task.encoder.proj_tpl.vn.per_step_vn : False I0420 07:30:14.149219 140395597309760 base_runner.py:69] task.encoder.proj_tpl.vn.scale : NoneType I0420 07:30:14.149296 140395597309760 base_runner.py:69] 
task.encoder.proj_tpl.vn.seed : NoneType I0420 07:30:14.149374 140395597309760 base_runner.py:69] task.encoder.proj_tpl.weight_norm : False I0420 07:30:14.149451 140395597309760 base_runner.py:69] task.encoder.project_lstm_output : True I0420 07:30:14.149529 140395597309760 base_runner.py:69] task.encoder.random_seed : NoneType I0420 07:30:14.149607 140395597309760 base_runner.py:69] task.encoder.residual_start : 0 I0420 07:30:14.149687 140395597309760 base_runner.py:69] task.encoder.residual_stride : 1 I0420 07:30:14.149770 140395597309760 base_runner.py:69] task.encoder.skip_lp_regularization : NoneType I0420 07:30:14.149851 140395597309760 base_runner.py:69] task.encoder.vn.global_vn : False I0420 07:30:14.149930 140395597309760 base_runner.py:69] task.encoder.vn.per_step_vn : False I0420 07:30:14.150007 140395597309760 base_runner.py:69] task.encoder.vn.scale : NoneType I0420 07:30:14.150085 140395597309760 base_runner.py:69] task.encoder.vn.seed : NoneType I0420 07:30:14.150163 140395597309760 base_runner.py:69] task.eval.decoder_samples_per_summary : 0 I0420 07:30:14.150242 140395597309760 base_runner.py:69] task.eval.samples_per_summary : 5000 I0420 07:30:14.150321 140395597309760 base_runner.py:69] task.fprop_dtype : NoneType I0420 07:30:14.150398 140395597309760 base_runner.py:69] task.frontend : NoneType I0420 07:30:14.150489 140395597309760 base_runner.py:69] task.inference_driver_name : NoneType I0420 07:30:14.150569 140395597309760 base_runner.py:69] task.input : NoneType I0420 07:30:14.150648 140395597309760 base_runner.py:69] task.is_eval : NoneType I0420 07:30:14.150729 140395597309760 base_runner.py:69] task.is_inference : NoneType I0420 07:30:14.150811 140395597309760 base_runner.py:69] task.name : 'librispeech' I0420 07:30:14.150891 140395597309760 base_runner.py:69] task.online_encoder : NoneType I0420 07:30:14.150969 140395597309760 base_runner.py:69] task.params_init.method : 'xavier' I0420 07:30:14.151047 140395597309760 base_runner.py:69] 
task.params_init.scale : 1.000001 I0420 07:30:14.151125 140395597309760 base_runner.py:69] task.params_init.seed : NoneType I0420 07:30:14.151205 140395597309760 base_runner.py:69] task.random_seed : NoneType I0420 07:30:14.151283 140395597309760 base_runner.py:69] task.skip_lp_regularization : NoneType I0420 07:30:14.151360 140395597309760 base_runner.py:69] task.target_key : '' I0420 07:30:14.151438 140395597309760 base_runner.py:69] task.train.bprop_variable_filter : NoneType I0420 07:30:14.151516 140395597309760 base_runner.py:69] task.train.clip_gradient_norm_to_value : 1.0 I0420 07:30:14.151593 140395597309760 base_runner.py:69] task.train.clip_gradient_single_norm_to_value : 0.0 I0420 07:30:14.151671 140395597309760 base_runner.py:69] task.train.colocate_gradients_with_ops : True I0420 07:30:14.151755 140395597309760 base_runner.py:69] task.train.early_stop.metric_history.jobname : 'eval_dev' I0420 07:30:14.151834 140395597309760 base_runner.py:69] task.train.early_stop.metric_history.local_filesystem : False I0420 07:30:14.151913 140395597309760 base_runner.py:69] task.train.early_stop.metric_history.logdir : '' I0420 07:30:14.151992 140395597309760 base_runner.py:69] task.train.early_stop.metric_history.metric : 'log_pplx' I0420 07:30:14.152070 140395597309760 base_runner.py:69] task.train.early_stop.metric_history.minimize : True I0420 07:30:14.152148 140395597309760 base_runner.py:69] task.train.early_stop.metric_history.name : 'MetricHistory' I0420 07:30:14.152226 140395597309760 base_runner.py:69] task.train.early_stop.metric_history.tfevent_file : False I0420 07:30:14.152304 140395597309760 base_runner.py:69] task.train.early_stop.name : 'EarlyStop' I0420 07:30:14.152383 140395597309760 base_runner.py:69] task.train.early_stop.tolerance : 0.0 I0420 07:30:14.152461 140395597309760 base_runner.py:69] task.train.early_stop.verbose : True I0420 07:30:14.152538 140395597309760 base_runner.py:69] task.train.early_stop.window : 0 I0420 07:30:14.152615 
140395597309760 base_runner.py:69] task.train.ema_decay : 0.0 I0420 07:30:14.152693 140395597309760 base_runner.py:69] task.train.gate_gradients : False I0420 07:30:14.152776 140395597309760 base_runner.py:69] task.train.grad_aggregation_method : 1 I0420 07:30:14.152857 140395597309760 base_runner.py:69] task.train.grad_norm_to_clip_to_zero : 100.0 I0420 07:30:14.152935 140395597309760 base_runner.py:69] task.train.grad_norm_tracker : NoneType I0420 07:30:14.153012 140395597309760 base_runner.py:69] task.train.init_from_checkpoint_rules : {} I0420 07:30:14.153090 140395597309760 base_runner.py:69] task.train.l1_regularizer_weight : NoneType I0420 07:30:14.153168 140395597309760 base_runner.py:69] task.train.l2_regularizer_weight : 1e-06 I0420 07:30:14.153244 140395597309760 base_runner.py:69] task.train.learning_rate : 0.00025 I0420 07:30:14.153321 140395597309760 base_runner.py:69] task.train.lr_schedule.allow_implicit_capture : NoneType I0420 07:30:14.153400 140395597309760 base_runner.py:69] task.train.lr_schedule.cls : type/lingvo.core.lr_schedule/ContinuousLearningRateSchedule I0420 07:30:14.153480 140395597309760 base_runner.py:69] task.train.lr_schedule.dtype : float32 I0420 07:30:14.153557 140395597309760 base_runner.py:69] task.train.lr_schedule.fprop_dtype : NoneType I0420 07:30:14.153634 140395597309760 base_runner.py:69] task.train.lr_schedule.half_life_steps : 100000 I0420 07:30:14.153719 140395597309760 base_runner.py:69] task.train.lr_schedule.inference_driver_name : NoneType I0420 07:30:14.153809 140395597309760 base_runner.py:69] task.train.lr_schedule.initial_value : 1.0 I0420 07:30:14.153887 140395597309760 base_runner.py:69] task.train.lr_schedule.is_eval : NoneType I0420 07:30:14.153983 140395597309760 base_runner.py:69] task.train.lr_schedule.is_inference : NoneType I0420 07:30:14.154061 140395597309760 base_runner.py:69] task.train.lr_schedule.min : 0.01 I0420 07:30:14.154139 140395597309760 base_runner.py:69] task.train.lr_schedule.name : 
'LRSched' I0420 07:30:14.154215 140395597309760 base_runner.py:69] task.train.lr_schedule.params_init.method : 'xavier' I0420 07:30:14.154292 140395597309760 base_runner.py:69] task.train.lr_schedule.params_init.scale : 1.000001 I0420 07:30:14.154370 140395597309760 base_runner.py:69] task.train.lr_schedule.params_init.seed : NoneType I0420 07:30:14.154447 140395597309760 base_runner.py:69] task.train.lr_schedule.random_seed : NoneType I0420 07:30:14.154524 140395597309760 base_runner.py:69] task.train.lr_schedule.skip_lp_regularization : NoneType I0420 07:30:14.154602 140395597309760 base_runner.py:69] task.train.lr_schedule.start_step : 50000 I0420 07:30:14.154678 140395597309760 base_runner.py:69] task.train.lr_schedule.vn.global_vn : False I0420 07:30:14.154759 140395597309760 base_runner.py:69] task.train.lr_schedule.vn.per_step_vn : False I0420 07:30:14.154838 140395597309760 base_runner.py:69] task.train.lr_schedule.vn.scale : NoneType I0420 07:30:14.154913 140395597309760 base_runner.py:69] task.train.lr_schedule.vn.seed : NoneType I0420 07:30:14.154990 140395597309760 base_runner.py:69] task.train.max_steps : 4000000 I0420 07:30:14.155067 140395597309760 base_runner.py:69] task.train.optimizer.allow_implicit_capture : NoneType I0420 07:30:14.155144 140395597309760 base_runner.py:69] task.train.optimizer.beta1 : 0.9 I0420 07:30:14.155221 140395597309760 base_runner.py:69] task.train.optimizer.beta2 : 0.999 I0420 07:30:14.155299 140395597309760 base_runner.py:69] task.train.optimizer.cls : type/lingvo.core.optimizer/Adam I0420 07:30:14.155376 140395597309760 base_runner.py:69] task.train.optimizer.dtype : float32 I0420 07:30:14.155453 140395597309760 base_runner.py:69] task.train.optimizer.epsilon : 1e-06 I0420 07:30:14.155530 140395597309760 base_runner.py:69] task.train.optimizer.fprop_dtype : NoneType I0420 07:30:14.155607 140395597309760 base_runner.py:69] task.train.optimizer.inference_driver_name : NoneType I0420 07:30:14.155685 140395597309760 
base_runner.py:69] task.train.optimizer.is_eval : NoneType I0420 07:30:14.155767 140395597309760 base_runner.py:69] task.train.optimizer.is_inference : NoneType I0420 07:30:14.155847 140395597309760 base_runner.py:69] task.train.optimizer.name : 'Adam' I0420 07:30:14.155924 140395597309760 base_runner.py:69] task.train.optimizer.params_init.method : 'xavier' I0420 07:30:14.156001 140395597309760 base_runner.py:69] task.train.optimizer.params_init.scale : 1.000001 I0420 07:30:14.156078 140395597309760 base_runner.py:69] task.train.optimizer.params_init.seed : NoneType I0420 07:30:14.156155 140395597309760 base_runner.py:69] task.train.optimizer.random_seed : NoneType I0420 07:30:14.156232 140395597309760 base_runner.py:69] task.train.optimizer.skip_lp_regularization : NoneType I0420 07:30:14.156308 140395597309760 base_runner.py:69] task.train.optimizer.vn.global_vn : False I0420 07:30:14.156385 140395597309760 base_runner.py:69] task.train.optimizer.vn.per_step_vn : False I0420 07:30:14.156461 140395597309760 base_runner.py:69] task.train.optimizer.vn.scale : NoneType I0420 07:30:14.156538 140395597309760 base_runner.py:69] task.train.optimizer.vn.seed : NoneType I0420 07:30:14.156616 140395597309760 base_runner.py:69] task.train.pruning_hparams_dict : NoneType I0420 07:30:14.156693 140395597309760 base_runner.py:69] task.train.save_interval_seconds : 600 I0420 07:30:14.156775 140395597309760 base_runner.py:69] task.train.start_up_delay_steps : 200 I0420 07:30:14.156853 140395597309760 base_runner.py:69] task.train.summary_interval_steps : 100 I0420 07:30:14.156939 140395597309760 base_runner.py:69] task.train.tpu_steps_per_loop : 20 I0420 07:30:14.157018 140395597309760 base_runner.py:69] task.train.vn_start_step : 20000 I0420 07:30:14.157095 140395597309760 base_runner.py:69] task.train.vn_std : 0.075 I0420 07:30:14.157172 140395597309760 base_runner.py:69] task.vn.global_vn : True I0420 07:30:14.157249 140395597309760 base_runner.py:69] task.vn.per_step_vn : 
False I0420 07:30:14.157327 140395597309760 base_runner.py:69] task.vn.scale : NoneType I0420 07:30:14.157403 140395597309760 base_runner.py:69] task.vn.seed : NoneType I0420 07:30:14.157480 140395597309760 base_runner.py:69] train.early_stop.metric_history.jobname : 'eval_dev' I0420 07:30:14.157557 140395597309760 base_runner.py:69] train.early_stop.metric_history.local_filesystem : False I0420 07:30:14.157634 140395597309760 base_runner.py:69] train.early_stop.metric_history.logdir : '' I0420 07:30:14.157710 140395597309760 base_runner.py:69] train.early_stop.metric_history.metric : 'log_pplx' I0420 07:30:14.157793 140395597309760 base_runner.py:69] train.early_stop.metric_history.minimize : True I0420 07:30:14.157871 140395597309760 base_runner.py:69] train.early_stop.metric_history.name : 'MetricHistory' I0420 07:30:14.157948 140395597309760 base_runner.py:69] train.early_stop.metric_history.tfevent_file : False I0420 07:30:14.158025 140395597309760 base_runner.py:69] train.early_stop.name : 'EarlyStop' I0420 07:30:14.158102 140395597309760 base_runner.py:69] train.early_stop.tolerance : 0.0 I0420 07:30:14.158179 140395597309760 base_runner.py:69] train.early_stop.verbose : True I0420 07:30:14.158255 140395597309760 base_runner.py:69] train.early_stop.window : 0 I0420 07:30:14.158344 140395597309760 base_runner.py:69] train.ema_decay : 0.0 I0420 07:30:14.158476 140395597309760 base_runner.py:69] train.init_from_checkpoint_rules : {} I0420 07:30:14.158565 140395597309760 base_runner.py:69] train.max_steps : 4000000 I0420 07:30:14.158658 140395597309760 base_runner.py:69] train.save_interval_seconds : 600 I0420 07:30:14.158746 140395597309760 base_runner.py:69] train.start_up_delay_steps : 200 I0420 07:30:14.158827 140395597309760 base_runner.py:69] train.summary_interval_steps : 100 I0420 07:30:14.158905 140395597309760 base_runner.py:69] train.tpu_steps_per_loop : 20 I0420 07:30:14.158983 140395597309760 base_runner.py:69] vn.global_vn : True I0420 
07:30:14.159061 140395597309760 base_runner.py:69] vn.per_step_vn : False I0420 07:30:14.159140 140395597309760 base_runner.py:69] vn.scale : NoneType I0420 07:30:14.159218 140395597309760 base_runner.py:69] vn.seed : NoneType I0420 07:30:14.159296 140395597309760 base_runner.py:69] I0420 07:30:14.159385 140395597309760 base_runner.py:70] ============================================================ I0420 07:30:14.161322 140395597309760 base_runner.py:115] Starting ... W0420 07:30:14.161540 140395597309760 deprecation_wrapper.py:119] From /data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/trainer.py:186: The name tf.gfile.MakeDirs is deprecated. Please use tf.io.gfile.makedirs instead. W0420 07:30:14.162086 140395597309760 deprecation_wrapper.py:119] From /data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/base_runner.py:324: The name tf.summary.FileWriter is deprecated. Please use tf.compat.v1.summary.FileWriter instead. W0420 07:30:14.162875 140395597309760 deprecation_wrapper.py:119] From /data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/trainer.py:192: The name tf.container is deprecated. Please use tf.compat.v1.container instead. I0420 07:30:14.163144 140395597309760 cluster.py:429] _LeastLoadedPlacer : ['/job:local/replica:0/task:0/device:CPU:0'] W0420 07:30:14.169799 140395597309760 deprecation_wrapper.py:119] From /data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/core/py_utils.py:1258: The name tf.variable_scope is deprecated. Please use tf.compat.v1.variable_scope instead. W0420 07:30:14.170238 140395597309760 deprecation_wrapper.py:119] From /data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/core/py_utils.py:1260: The name tf.train.get_or_create_global_step is deprecated. Please use tf.compat.v1.train.get_or_create_global_step instead. 
I0420 07:30:14.174882 140395597309760 cluster.py:447] Place variable global_step on /job:local/replica:0/task:0/device:CPU:0 8 W0420 07:30:14.189157 140395597309760 deprecation_wrapper.py:119] From /data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/core/py_utils.py:1250: The name tf.train.get_global_step is deprecated. Please use tf.compat.v1.train.get_global_step instead. I0420 07:30:14.189383 140395597309760 base_model.py:1116] Training parameters for : { early_stop: { metric_history: { jobname: "eval_dev" local_filesystem: False logdir: "/data/dingzhenyou/speech_data/librispeech/log/" metric: "log_pplx" minimize: True name: "MetricHistory" tfevent_file: False } name: "EarlyStop" tolerance: 0.0 verbose: True window: 0 } ema_decay: 0.0 init_from_checkpoint_rules: {} max_steps: 4000000 save_interval_seconds: 600 start_up_delay_steps: 200 summary_interval_steps: 100 tpu_steps_per_loop: 20 } I0420 07:30:14.208252 140395597309760 base_input_generator.py:510] bucket_batch_limit [64, 32, 32, 32, 32, 32, 32, 32] W0420 07:30:14.209599 140395597309760 deprecation_wrapper.py:119] From /data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/tasks/asr/input_generator.py:47: The name tf.VarLenFeature is deprecated. Please use tf.io.VarLenFeature instead. W0420 07:30:14.209780 140395597309760 deprecation_wrapper.py:119] From /data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/tasks/asr/input_generator.py:51: The name tf.parse_single_example is deprecated. Please use tf.io.parse_single_example instead. 
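(Aside, not part of the log: every `deprecation_wrapper` warning above follows the same pattern — a TF1 symbol with a suggested replacement under `tf.compat.v1` or `tf.io`. A minimal, hypothetical sketch of scripting those renames over one's own source files with the Python standard library; the mapping below is taken only from the warnings shown in this log, not from any official upgrade tool:)

```python
import re

# Renames copied verbatim from the deprecation warnings in this log.
RENAMES = {
    "tf.logging.info": "tf.compat.v1.logging.info",
    "tf.train.get_global_step": "tf.compat.v1.train.get_global_step",
    "tf.gfile.MakeDirs": "tf.io.gfile.makedirs",
    "tf.VarLenFeature": "tf.io.VarLenFeature",
    "tf.parse_single_example": "tf.io.parse_single_example",
}

# Match longest names first so a shorter name never shadows a longer one.
_PATTERN = re.compile(
    "|".join(re.escape(k) for k in sorted(RENAMES, key=len, reverse=True))
)

def upgrade_tf1_names(source: str) -> str:
    """Rewrite deprecated TF1 symbol names to the replacements the log suggests."""
    return _PATTERN.sub(lambda m: RENAMES[m.group(0)], source)
```

For a real migration, TensorFlow ships its own `tf_upgrade_v2` script, which covers far more than these five symbols; the sketch only silences the warnings visible here.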
I0420 07:30:14.308718 140395597309760 cluster.py:447] Place variable librispeech/enc/conv_L0/w/var on /job:local/replica:0/task:0/device:CPU:0 1160 I0420 07:30:14.311394 140395597309760 py_utils.py:1220] Creating var librispeech/enc/conv_L0/w/var:0 shape=(3, 3, 1, 32) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.318114 140395597309760 cluster.py:447] Place variable librispeech/enc/conv_L0/beta/var on /job:local/replica:0/task:0/device:CPU:0 1288 I0420 07:30:14.320282 140395597309760 py_utils.py:1220] Creating var librispeech/enc/conv_L0/beta/var:0 shape=(32,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.323750 140395597309760 cluster.py:447] Place variable librispeech/enc/conv_L0/gamma/var on /job:local/replica:0/task:0/device:CPU:0 1416 I0420 07:30:14.325911 140395597309760 py_utils.py:1220] Creating var librispeech/enc/conv_L0/gamma/var:0 shape=(32,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.330802 140395597309760 cluster.py:447] Place variable librispeech/enc/conv_L0/moving_mean/var on /job:local/replica:0/task:0/device:CPU:0 1544 I0420 07:30:14.332957 140395597309760 py_utils.py:1220] Creating var librispeech/enc/conv_L0/moving_mean/var:0 shape=(32,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.336487 140395597309760 cluster.py:447] Place variable librispeech/enc/conv_L0/moving_variance/var on /job:local/replica:0/task:0/device:CPU:0 1672 I0420 07:30:14.338845 140395597309760 py_utils.py:1220] Creating var librispeech/enc/conv_L0/moving_variance/var:0 shape=(32,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.349708 140395597309760 cluster.py:447] Place variable librispeech/enc/conv_L1/w/var on /job:local/replica:0/task:0/device:CPU:0 38536 I0420 07:30:14.352178 140395597309760 py_utils.py:1220] Creating var librispeech/enc/conv_L1/w/var:0 shape=(3, 3, 32, 32) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.359050 140395597309760 
cluster.py:447] Place variable librispeech/enc/conv_L1/beta/var on /job:local/replica:0/task:0/device:CPU:0 38664 I0420 07:30:14.361227 140395597309760 py_utils.py:1220] Creating var librispeech/enc/conv_L1/beta/var:0 shape=(32,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.364701 140395597309760 cluster.py:447] Place variable librispeech/enc/conv_L1/gamma/var on /job:local/replica:0/task:0/device:CPU:0 38792 I0420 07:30:14.366914 140395597309760 py_utils.py:1220] Creating var librispeech/enc/conv_L1/gamma/var:0 shape=(32,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.371793 140395597309760 cluster.py:447] Place variable librispeech/enc/conv_L1/moving_mean/var on /job:local/replica:0/task:0/device:CPU:0 38920 I0420 07:30:14.373955 140395597309760 py_utils.py:1220] Creating var librispeech/enc/conv_L1/moving_mean/var:0 shape=(32,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.377459 140395597309760 cluster.py:447] Place variable librispeech/enc/conv_L1/moving_variance/var on /job:local/replica:0/task:0/device:CPU:0 39048 I0420 07:30:14.379628 140395597309760 py_utils.py:1220] Creating var librispeech/enc/conv_L1/moving_variance/var:0 shape=(32,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.406703 140395597309760 cluster.py:447] Place variable librispeech/enc/fwd_rnn_L0/wm/var on /job:local/replica:0/task:0/device:CPU:0 27302024 I0420 07:30:14.409185 140395597309760 py_utils.py:1220] Creating var librispeech/enc/fwd_rnn_L0/wm/var:0 shape=(1664, 4096) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.418040 140395597309760 cluster.py:447] Place variable librispeech/enc/fwd_rnn_L0/b/var on /job:local/replica:0/task:0/device:CPU:0 27318408 I0420 07:30:14.420245 140395597309760 py_utils.py:1220] Creating var librispeech/enc/fwd_rnn_L0/b/var:0 shape=(4096,) on device /job:local/replica:0/task:0/device:CPU:0 W0420 07:30:14.428427 140395597309760 deprecation_wrapper.py:119] 
From /data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/core/summary_utils.py:41: The name tf.summary.histogram is deprecated. Please use tf.compat.v1.summary.histogram instead. I0420 07:30:14.449314 140395597309760 cluster.py:447] Place variable librispeech/enc/bak_rnn_L0/wm/var on /job:local/replica:0/task:0/device:CPU:0 54581384 I0420 07:30:14.451812 140395597309760 py_utils.py:1220] Creating var librispeech/enc/bak_rnn_L0/wm/var:0 shape=(1664, 4096) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.460689 140395597309760 cluster.py:447] Place variable librispeech/enc/bak_rnn_L0/b/var on /job:local/replica:0/task:0/device:CPU:0 54597768 I0420 07:30:14.463033 140395597309760 py_utils.py:1220] Creating var librispeech/enc/bak_rnn_L0/b/var:0 shape=(4096,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.496942 140395597309760 cluster.py:447] Place variable librispeech/enc/fwd_rnn_L1/wm/var on /job:local/replica:0/task:0/device:CPU:0 104929416 I0420 07:30:14.499413 140395597309760 py_utils.py:1220] Creating var librispeech/enc/fwd_rnn_L1/wm/var:0 shape=(3072, 4096) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.509780 140395597309760 cluster.py:447] Place variable librispeech/enc/fwd_rnn_L1/b/var on /job:local/replica:0/task:0/device:CPU:0 104945800 I0420 07:30:14.512828 140395597309760 py_utils.py:1220] Creating var librispeech/enc/fwd_rnn_L1/b/var:0 shape=(4096,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.551914 140395597309760 cluster.py:447] Place variable librispeech/enc/bak_rnn_L1/wm/var on /job:local/replica:0/task:0/device:CPU:0 155277448 I0420 07:30:14.555337 140395597309760 py_utils.py:1220] Creating var librispeech/enc/bak_rnn_L1/wm/var:0 shape=(3072, 4096) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.567281 140395597309760 cluster.py:447] Place variable librispeech/enc/bak_rnn_L1/b/var on /job:local/replica:0/task:0/device:CPU:0 
155293832 I0420 07:30:14.570199 140395597309760 py_utils.py:1220] Creating var librispeech/enc/bak_rnn_L1/b/var:0 shape=(4096,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.615386 140395597309760 cluster.py:447] Place variable librispeech/enc/fwd_rnn_L2/wm/var on /job:local/replica:0/task:0/device:CPU:0 205625480 I0420 07:30:14.618185 140395597309760 py_utils.py:1220] Creating var librispeech/enc/fwd_rnn_L2/wm/var:0 shape=(3072, 4096) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.629010 140395597309760 cluster.py:447] Place variable librispeech/enc/fwd_rnn_L2/b/var on /job:local/replica:0/task:0/device:CPU:0 205641864 I0420 07:30:14.631587 140395597309760 py_utils.py:1220] Creating var librispeech/enc/fwd_rnn_L2/b/var:0 shape=(4096,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.664180 140395597309760 cluster.py:447] Place variable librispeech/enc/bak_rnn_L2/wm/var on /job:local/replica:0/task:0/device:CPU:0 255973512 I0420 07:30:14.666968 140395597309760 py_utils.py:1220] Creating var librispeech/enc/bak_rnn_L2/wm/var:0 shape=(3072, 4096) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.676595 140395597309760 cluster.py:447] Place variable librispeech/enc/bak_rnn_L2/b/var on /job:local/replica:0/task:0/device:CPU:0 255989896 I0420 07:30:14.678953 140395597309760 py_utils.py:1220] Creating var librispeech/enc/bak_rnn_L2/b/var:0 shape=(4096,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.717291 140395597309760 cluster.py:447] Place variable librispeech/enc/fwd_rnn_L3/wm/var on /job:local/replica:0/task:0/device:CPU:0 306321544 I0420 07:30:14.720068 140395597309760 py_utils.py:1220] Creating var librispeech/enc/fwd_rnn_L3/wm/var:0 shape=(3072, 4096) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.729692 140395597309760 cluster.py:447] Place variable librispeech/enc/fwd_rnn_L3/b/var on /job:local/replica:0/task:0/device:CPU:0 306337928 I0420 
07:30:14.733038 140395597309760 py_utils.py:1220] Creating var librispeech/enc/fwd_rnn_L3/b/var:0 shape=(4096,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.765098 140395597309760 cluster.py:447] Place variable librispeech/enc/bak_rnn_L3/wm/var on /job:local/replica:0/task:0/device:CPU:0 356669576 I0420 07:30:14.767899 140395597309760 py_utils.py:1220] Creating var librispeech/enc/bak_rnn_L3/wm/var:0 shape=(3072, 4096) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.777565 140395597309760 cluster.py:447] Place variable librispeech/enc/bak_rnn_L3/b/var on /job:local/replica:0/task:0/device:CPU:0 356685960 I0420 07:30:14.779922 140395597309760 py_utils.py:1220] Creating var librispeech/enc/bak_rnn_L3/b/var:0 shape=(4096,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.805918 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L0/w/var on /job:local/replica:0/task:0/device:CPU:0 373463176 I0420 07:30:14.808506 140395597309760 py_utils.py:1220] Creating var librispeech/enc/proj_L0/w/var:0 shape=(2048, 2048) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.816268 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L0/beta/var on /job:local/replica:0/task:0/device:CPU:0 373471368 I0420 07:30:14.818604 140395597309760 py_utils.py:1220] Creating var librispeech/enc/proj_L0/beta/var:0 shape=(2048,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.822424 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L0/gamma/var on /job:local/replica:0/task:0/device:CPU:0 373479560 I0420 07:30:14.824768 140395597309760 py_utils.py:1220] Creating var librispeech/enc/proj_L0/gamma/var:0 shape=(2048,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.830056 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L0/moving_mean/var on /job:local/replica:0/task:0/device:CPU:0 373487752 I0420 07:30:14.832406 140395597309760 
py_utils.py:1220] Creating var librispeech/enc/proj_L0/moving_mean/var:0 shape=(2048,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.836292 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L0/moving_variance/var on /job:local/replica:0/task:0/device:CPU:0 373495944 I0420 07:30:14.838644 140395597309760 py_utils.py:1220] Creating var librispeech/enc/proj_L0/moving_variance/var:0 shape=(2048,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.850867 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L1/w/var on /job:local/replica:0/task:0/device:CPU:0 390273160 I0420 07:30:14.853652 140395597309760 py_utils.py:1220] Creating var librispeech/enc/proj_L1/w/var:0 shape=(2048, 2048) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.861360 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L1/beta/var on /job:local/replica:0/task:0/device:CPU:0 390281352 I0420 07:30:14.863693 140395597309760 py_utils.py:1220] Creating var librispeech/enc/proj_L1/beta/var:0 shape=(2048,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.868551 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L1/gamma/var on /job:local/replica:0/task:0/device:CPU:0 390289544 I0420 07:30:14.873275 140395597309760 py_utils.py:1220] Creating var librispeech/enc/proj_L1/gamma/var:0 shape=(2048,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.879584 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L1/moving_mean/var on /job:local/replica:0/task:0/device:CPU:0 390297736 I0420 07:30:14.882167 140395597309760 py_utils.py:1220] Creating var librispeech/enc/proj_L1/moving_mean/var:0 shape=(2048,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.886128 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L1/moving_variance/var on /job:local/replica:0/task:0/device:CPU:0 390305928 I0420 07:30:14.888609 
140395597309760 py_utils.py:1220] Creating var librispeech/enc/proj_L1/moving_variance/var:0 shape=(2048,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.899238 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L2/w/var on /job:local/replica:0/task:0/device:CPU:0 407083144 I0420 07:30:14.901216 140395597309760 py_utils.py:1220] Creating var librispeech/enc/proj_L2/w/var:0 shape=(2048, 2048) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.906749 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L2/beta/var on /job:local/replica:0/task:0/device:CPU:0 407091336 I0420 07:30:14.908530 140395597309760 py_utils.py:1220] Creating var librispeech/enc/proj_L2/beta/var:0 shape=(2048,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.911384 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L2/gamma/var on /job:local/replica:0/task:0/device:CPU:0 407099528 I0420 07:30:14.913311 140395597309760 py_utils.py:1220] Creating var librispeech/enc/proj_L2/gamma/var:0 shape=(2048,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.917190 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L2/moving_mean/var on /job:local/replica:0/task:0/device:CPU:0 407107720 I0420 07:30:14.919024 140395597309760 py_utils.py:1220] Creating var librispeech/enc/proj_L2/moving_mean/var:0 shape=(2048,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.922070 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L2/moving_variance/var on /job:local/replica:0/task:0/device:CPU:0 407115912 I0420 07:30:14.923866 140395597309760 py_utils.py:1220] Creating var librispeech/enc/proj_L2/moving_variance/var:0 shape=(2048,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.952696 140395597309760 cluster.py:447] Place variable librispeech/dec/emb/var_0/var on /job:local/replica:0/task:0/device:CPU:0 407139016 I0420 07:30:14.954703 
140395597309760 py_utils.py:1220] Creating var librispeech/dec/emb/var_0/var:0 shape=(76, 76) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.963403 140395597309760 cluster.py:447] Place variable librispeech/dec/softmax/weight_0/var on /job:local/replica:0/task:0/device:CPU:0 408072904 I0420 07:30:14.965323 140395597309760 py_utils.py:1220] Creating var librispeech/dec/softmax/weight_0/var:0 shape=(3072, 76) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.972189 140395597309760 cluster.py:447] Place variable librispeech/dec/softmax/bias_0/var on /job:local/replica:0/task:0/device:CPU:0 408073208 I0420 07:30:14.973892 140395597309760 py_utils.py:1220] Creating var librispeech/dec/softmax/bias_0/var:0 shape=(76,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.992157 140395597309760 cluster.py:447] Place variable librispeech/dec/rnn_cell/wm/var on /job:local/replica:0/task:0/device:CPU:0 459650040 I0420 07:30:14.994079 140395597309760 py_utils.py:1220] Creating var librispeech/dec/rnn_cell/wm/var:0 shape=(3148, 4096) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:15.000926 140395597309760 cluster.py:447] Place variable librispeech/dec/rnn_cell/b/var on /job:local/replica:0/task:0/device:CPU:0 459666424 I0420 07:30:15.002634 140395597309760 py_utils.py:1220] Creating var librispeech/dec/rnn_cell/b/var:0 shape=(4096,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:15.021346 140395597309760 cluster.py:447] Place variable librispeech/dec/rnn_cell_1/wm/var on /job:local/replica:0/task:0/device:CPU:0 526775288 I0420 07:30:15.023277 140395597309760 py_utils.py:1220] Creating var librispeech/dec/rnn_cell_1/wm/var:0 shape=(4096, 4096) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:15.030137 140395597309760 cluster.py:447] Place variable librispeech/dec/rnn_cell_1/b/var on /job:local/replica:0/task:0/device:CPU:0 526791672 I0420 07:30:15.031835 140395597309760 
py_utils.py:1220] Creating var librispeech/dec/rnn_cell_1/b/var:0 shape=(4096,) on device /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.242995 140395597309760 cluster.py:447] Place variable librispeech/dec/atten/source_var/var on /job:local/replica:0/task:0/device:CPU:0 527840248
I0420 07:30:15.245058 140395597309760 py_utils.py:1220] Creating var librispeech/dec/atten/source_var/var:0 shape=(2048, 128) on device /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.255265 140395597309760 cluster.py:447] Place variable librispeech/dec/atten/query_var/var on /job:local/replica:0/task:0/device:CPU:0 528364536
I0420 07:30:15.257199 140395597309760 py_utils.py:1220] Creating var librispeech/dec/atten/query_var/var:0 shape=(1024, 128) on device /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.267314 140395597309760 cluster.py:447] Place variable librispeech/dec/atten/hidden_var/var on /job:local/replica:0/task:0/device:CPU:0 528365048
I0420 07:30:15.269248 140395597309760 py_utils.py:1220] Creating var librispeech/dec/atten/hidden_var/var:0 shape=(128,) on device /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.326706 140395597309760 py_utils.py:1277] === worker 0 ===
I0420 07:30:15.328285 140395597309760 py_utils.py:1267] worker 0: decoder.atten.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.328372 140395597309760 py_utils.py:1267] worker 0: decoder.atten.hidden_var /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.328435 140395597309760 py_utils.py:1267] worker 0: decoder.atten.query_var /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.328493 140395597309760 py_utils.py:1267] worker 0: decoder.atten.source_var /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.328548 140395597309760 py_utils.py:1267] worker 0: decoder.beam_search.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.328603 140395597309760 py_utils.py:1267] worker 0: decoder.contextualizer.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.328656 140395597309760 py_utils.py:1267] worker 0: decoder.emb.wm[0] /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.328711 140395597309760 py_utils.py:1267] worker 0: decoder.fusion.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.328783 140395597309760 py_utils.py:1267] worker 0: decoder.fusion.lm.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.328838 140395597309760 py_utils.py:1267] worker 0: decoder.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.328891 140395597309760 py_utils.py:1267] worker 0: decoder.rnn_cell[0].b /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.328944 140395597309760 py_utils.py:1267] worker 0: decoder.rnn_cell[0].global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.328996 140395597309760 py_utils.py:1267] worker 0: decoder.rnn_cell[0].wm /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.329049 140395597309760 py_utils.py:1267] worker 0: decoder.rnn_cell[1].b /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.329102 140395597309760 py_utils.py:1267] worker 0: decoder.rnn_cell[1].global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.329154 140395597309760 py_utils.py:1267] worker 0: decoder.rnn_cell[1].wm
/job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.329206 140395597309760 py_utils.py:1267] worker 0: decoder.softmax.bias_0 /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.329260 140395597309760 py_utils.py:1267] worker 0: decoder.softmax.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.329312 140395597309760 py_utils.py:1267] worker 0: decoder.softmax.weight_0 /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.329365 140395597309760 py_utils.py:1267] worker 0: decoder.target_sequence_sampler.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.329417 140395597309760 py_utils.py:1267] worker 0: encoder.conv[0].bn.beta /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.329471 140395597309760 py_utils.py:1267] worker 0: encoder.conv[0].bn.gamma /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.329523 140395597309760 py_utils.py:1267] worker 0: encoder.conv[0].bn.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.329576 140395597309760 py_utils.py:1267] worker 0: encoder.conv[0].global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.329627 140395597309760 py_utils.py:1267] worker 0: encoder.conv[0].w /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.329680 140395597309760 py_utils.py:1267] worker 0: encoder.conv[1].bn.beta /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.329735 140395597309760 py_utils.py:1267] worker 0: encoder.conv[1].bn.gamma /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.329791 140395597309760 py_utils.py:1267] worker 0: encoder.conv[1].bn.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.329848 140395597309760 py_utils.py:1267] worker 0: encoder.conv[1].global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.329902 140395597309760 py_utils.py:1267] worker 0: encoder.conv[1].w /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.329955 140395597309760 py_utils.py:1267] worker 0: encoder.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.330008 140395597309760 py_utils.py:1267] worker 0: encoder.proj[0].bn.beta /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.330060 140395597309760 py_utils.py:1267] worker 0: encoder.proj[0].bn.gamma /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.330111 140395597309760 py_utils.py:1267] worker 0: encoder.proj[0].bn.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.330164 140395597309760 py_utils.py:1267] worker 0: encoder.proj[0].global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.330216 140395597309760 py_utils.py:1267] worker 0: encoder.proj[0].w /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.330269 140395597309760 py_utils.py:1267] worker 0: encoder.proj[1].bn.beta /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.330321 140395597309760 py_utils.py:1267] worker 0: encoder.proj[1].bn.gamma /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.330373
140395597309760 py_utils.py:1267] worker 0: encoder.proj[1].bn.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.330425 140395597309760 py_utils.py:1267] worker 0: encoder.proj[1].global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.330476 140395597309760 py_utils.py:1267] worker 0: encoder.proj[1].w /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.330528 140395597309760 py_utils.py:1267] worker 0: encoder.proj[2].bn.beta /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.330580 140395597309760 py_utils.py:1267] worker 0: encoder.proj[2].bn.gamma /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.330632 140395597309760 py_utils.py:1267] worker 0: encoder.proj[2].bn.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.330684 140395597309760 py_utils.py:1267] worker 0: encoder.proj[2].global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.330740 140395597309760 py_utils.py:1267] worker 0: encoder.proj[2].w /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.330795 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[0].bak_rnn.cell.b /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.330847 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[0].bak_rnn.cell.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.330899 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[0].bak_rnn.cell.wm /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.330956 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[0].bak_rnn.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.331011 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[0].fwd_rnn.cell.b /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.331063 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[0].fwd_rnn.cell.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.331115 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[0].fwd_rnn.cell.wm /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.331168 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[0].fwd_rnn.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.331222 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[0].global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.331274 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[1].bak_rnn.cell.b /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.331326 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[1].bak_rnn.cell.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.331378 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[1].bak_rnn.cell.wm /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.331430 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[1].bak_rnn.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.331484 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[1].fwd_rnn.cell.b /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.331535
140395597309760 py_utils.py:1267] worker 0: encoder.rnn[1].fwd_rnn.cell.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.331588 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[1].fwd_rnn.cell.wm /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.331640 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[1].fwd_rnn.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.331691 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[1].global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.331751 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[2].bak_rnn.cell.b /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.331804 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[2].bak_rnn.cell.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.331856 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[2].bak_rnn.cell.wm /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.331908 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[2].bak_rnn.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.331960 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[2].fwd_rnn.cell.b /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.332014 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[2].fwd_rnn.cell.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.332071 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[2].fwd_rnn.cell.wm /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.332124 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[2].fwd_rnn.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.332178 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[2].global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.332230 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[3].bak_rnn.cell.b /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.332283 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[3].bak_rnn.cell.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.332334 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[3].bak_rnn.cell.wm /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.332386 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[3].bak_rnn.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.332438 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[3].fwd_rnn.cell.b /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.332490 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[3].fwd_rnn.cell.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.332544 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[3].fwd_rnn.cell.wm /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.332596 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[3].fwd_rnn.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.332648 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[3].global_step
/job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.332700 140395597309760 py_utils.py:1267] worker 0: global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.332758 140395597309760 py_utils.py:1267] worker 0: input._tokenizer_default.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.332813 140395597309760 py_utils.py:1267] worker 0: input.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.332865 140395597309760 py_utils.py:1267] worker 0: lr_schedule.exp.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.332918 140395597309760 py_utils.py:1267] worker 0: lr_schedule.exp.linear.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.332971 140395597309760 py_utils.py:1267] worker 0: lr_schedule.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.333023 140395597309760 py_utils.py:1267] worker 0: optimizer.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.333077 140395597309760 py_utils.py:1283] ==========
W0420 07:30:17.816431 140395597309760 deprecation_wrapper.py:119] From /data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/core/plot.py:239: The name tf.summary.image is deprecated. Please use tf.compat.v1.summary.image instead.
W0420 07:30:18.624974 140395597309760 deprecation_wrapper.py:119] From /data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/core/layers.py:1874: The name tf.logging.vlog is deprecated. Please use tf.compat.v1.logging.vlog instead.
I0420 07:30:18.679635 140395597309760 decoder.py:749] Merging metric loss: (, )
I0420 07:30:18.683741 140395597309760 decoder.py:749] Merging metric fraction_of_correct_next_step_preds: (, )
I0420 07:30:18.687864 140395597309760 decoder.py:749] Merging metric log_pplx: (, )
W0420 07:30:18.731125 140395597309760 deprecation_wrapper.py:119] From /data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/core/summary_utils.py:36: The name tf.summary.scalar is deprecated. Please use tf.compat.v1.summary.scalar instead.
I0420 07:30:22.142855 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: decoder.atten.hidden_var:
I0420 07:30:22.143069 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: decoder.atten.query_var:
I0420 07:30:22.143222 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: decoder.atten.source_var:
I0420 07:30:22.143345 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: decoder.emb.wm_0:
I0420 07:30:22.143487 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: decoder.rnn_cell_0.b:
I0420 07:30:22.143604 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: decoder.rnn_cell_0.wm:
I0420 07:30:22.143743 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: decoder.rnn_cell_1.b:
I0420 07:30:22.143887 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: decoder.rnn_cell_1.wm:
I0420 07:30:22.144016 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: decoder.softmax.bias_0:
I0420 07:30:22.144134 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: decoder.softmax.weight_0:
I0420 07:30:22.144254 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.conv_0.bn.beta:
I0420 07:30:22.144364 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.conv_0.bn.gamma:
I0420 07:30:22.144476 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.conv_0.w:
I0420 07:30:22.144601 140395597309760 py_utils.py:1730]
AdjustGradientsWithLpLoss: encoder.conv_1.bn.beta:
I0420 07:30:22.144728 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.conv_1.bn.gamma:
I0420 07:30:22.144845 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.conv_1.w:
I0420 07:30:22.144972 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.proj_0.bn.beta:
I0420 07:30:22.145085 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.proj_0.bn.gamma:
I0420 07:30:22.145204 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.proj_0.w:
I0420 07:30:22.145322 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.proj_1.bn.beta:
I0420 07:30:22.145435 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.proj_1.bn.gamma:
I0420 07:30:22.145554 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.proj_1.w:
I0420 07:30:22.145672 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.proj_2.bn.beta:
I0420 07:30:22.145792 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.proj_2.bn.gamma:
I0420 07:30:22.145904 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.proj_2.w:
I0420 07:30:22.146023 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_0.bak_rnn.cell.b:
I0420 07:30:22.146132 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_0.bak_rnn.cell.wm:
I0420 07:30:22.146251 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_0.fwd_rnn.cell.b:
I0420 07:30:22.146362 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_0.fwd_rnn.cell.wm:
I0420 07:30:22.146480 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_1.bak_rnn.cell.b:
I0420 07:30:22.146589 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_1.bak_rnn.cell.wm:
I0420 07:30:22.146711 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_1.fwd_rnn.cell.b:
I0420 07:30:22.146826 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_1.fwd_rnn.cell.wm:
I0420 07:30:22.146948 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_2.bak_rnn.cell.b:
I0420 07:30:22.147057 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_2.bak_rnn.cell.wm:
I0420 07:30:22.147182 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_2.fwd_rnn.cell.b:
I0420 07:30:22.147293 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_2.fwd_rnn.cell.wm:
I0420 07:30:22.147413 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_3.bak_rnn.cell.b:
I0420 07:30:22.147520 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_3.bak_rnn.cell.wm:
I0420 07:30:22.147639 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_3.fwd_rnn.cell.b:
I0420 07:30:22.147757 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_3.fwd_rnn.cell.wm:
W0420 07:30:23.409567 140395597309760 deprecation_wrapper.py:119] From /data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/core/optimizer.py:179: The name tf.train.AdamOptimizer is deprecated. Please use tf.compat.v1.train.AdamOptimizer instead.
I0420 07:30:23.411606 140395597309760 cluster.py:447] Place variable beta1_power on /job:local/replica:0/task:0/device:CPU:0 528365052
I0420 07:30:23.414588 140395597309760 cluster.py:447] Place variable beta2_power on /job:local/replica:0/task:0/device:CPU:0 528365056
I0420 07:30:23.900799 140395597309760 cluster.py:447] Place variable librispeech/total_samples/var on /job:local/replica:0/task:0/device:CPU:0 528365064
I0420 07:30:23.902559 140395597309760 py_utils.py:1220] Creating var librispeech/total_samples/var:0 shape=() on device /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:23.909670 140395597309760 cluster.py:447] Place variable total_nan_gradients/var on /job:local/replica:0/task:0/device:CPU:0 528365072
I0420 07:30:23.911412 140395597309760 py_utils.py:1220] Creating var total_nan_gradients/var:0 shape=() on device /job:local/replica:0/task:0/device:CPU:0
W0420 07:30:23.934954 140395597309760 deprecation_wrapper.py:119] From /data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/base_runner.py:156: The name tf.train.Saver is deprecated. Please use tf.compat.v1.train.Saver instead.
W0420 07:30:24.054827 140395597309760 deprecation_wrapper.py:119] From /data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/trainer.py:198: The name tf.summary.merge_all is deprecated. Please use tf.compat.v1.summary.merge_all instead.
I0420 07:30:24.163578 140395597309760 py_utils.py:1267] MODEL ANALYSIS:
I0420 07:30:24.163672 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.decoder.atten.hidden_var (128,) 128 librispeech/dec/atten/hidden_var/var
I0420 07:30:24.163784 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.decoder.atten.query_var (1024, 128) 131072 librispeech/dec/atten/query_var/var
I0420 07:30:24.163866 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.decoder.atten.source_var (2048, 128) 262144 librispeech/dec/atten/source_var/var
I0420 07:30:24.163948 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.decoder.emb.wm[0] (76, 76) 5776 librispeech/dec/emb/var_0/var
I0420 07:30:24.164036 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.decoder.rnn_cell[0].b (4096,) 4096 librispeech/dec/rnn_cell/b/var
I0420 07:30:24.164122 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.decoder.rnn_cell[0].wm (3148, 4096) 12894208 librispeech/dec/rnn_cell/wm/var
I0420 07:30:24.164206 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.decoder.rnn_cell[1].b (4096,) 4096 librispeech/dec/rnn_cell_1/b/var
I0420 07:30:24.164299 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.decoder.rnn_cell[1].wm (4096, 4096) 16777216 librispeech/dec/rnn_cell_1/wm/var
I0420 07:30:24.164386 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.decoder.softmax.bias_0 (76,) 76 librispeech/dec/softmax/bias_0/var
I0420 07:30:24.164469 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.decoder.softmax.weight_0 (3072, 76) 233472 librispeech/dec/softmax/weight_0/var
I0420 07:30:24.164551 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.conv[0].bn.beta (32,) 32 librispeech/enc/conv_L0/beta/var
I0420 07:30:24.164633 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.conv[0].bn.gamma (32,) 32 librispeech/enc/conv_L0/gamma/var
I0420 07:30:24.164714 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.conv[0].w (3, 3, 1, 32) 288 librispeech/enc/conv_L0/w/var
I0420 07:30:24.164803 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.conv[1].bn.beta (32,) 32 librispeech/enc/conv_L1/beta/var
I0420 07:30:24.164881 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.conv[1].bn.gamma (32,) 32 librispeech/enc/conv_L1/gamma/var
I0420 07:30:24.164966 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.conv[1].w (3, 3, 32, 32) 9216 librispeech/enc/conv_L1/w/var
I0420 07:30:24.165045 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.proj[0].bn.beta (2048,) 2048 librispeech/enc/proj_L0/beta/var
I0420 07:30:24.165127 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.proj[0].bn.gamma (2048,) 2048 librispeech/enc/proj_L0/gamma/var
I0420 07:30:24.165208 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.proj[0].w (2048, 2048) 4194304 librispeech/enc/proj_L0/w/var
I0420 07:30:24.165291 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.proj[1].bn.beta (2048,) 2048 librispeech/enc/proj_L1/beta/var
I0420 07:30:24.165373 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.proj[1].bn.gamma (2048,) 2048 librispeech/enc/proj_L1/gamma/var
I0420 07:30:24.165455 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.proj[1].w (2048, 2048) 4194304 librispeech/enc/proj_L1/w/var
I0420 07:30:24.165535 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.proj[2].bn.beta (2048,) 2048 librispeech/enc/proj_L2/beta/var
I0420 07:30:24.165616 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.proj[2].bn.gamma (2048,) 2048 librispeech/enc/proj_L2/gamma/var
I0420 07:30:24.165698 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.proj[2].w (2048, 2048) 4194304 librispeech/enc/proj_L2/w/var
I0420 07:30:24.165782 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.rnn[0].bak_rnn.cell.b (4096,) 4096 librispeech/enc/bak_rnn_L0/b/var
I0420 07:30:24.165864 140395597309760
py_utils.py:1267] MODEL ANALYSIS: _task.encoder.rnn[0].bak_rnn.cell.wm (1664, 4096) 6815744 librispeech/enc/bak_rnn_L0/wm/var
I0420 07:30:24.165945 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.rnn[0].fwd_rnn.cell.b (4096,) 4096 librispeech/enc/fwd_rnn_L0/b/var
I0420 07:30:24.166027 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.rnn[0].fwd_rnn.cell.wm (1664, 4096) 6815744 librispeech/enc/fwd_rnn_L0/wm/var
I0420 07:30:24.166107 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.rnn[1].bak_rnn.cell.b (4096,) 4096 librispeech/enc/bak_rnn_L1/b/var
I0420 07:30:24.166194 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.rnn[1].bak_rnn.cell.wm (3072, 4096) 12582912 librispeech/enc/bak_rnn_L1/wm/var
I0420 07:30:24.166275 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.rnn[1].fwd_rnn.cell.b (4096,) 4096 librispeech/enc/fwd_rnn_L1/b/var
I0420 07:30:24.166356 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.rnn[1].fwd_rnn.cell.wm (3072, 4096) 12582912 librispeech/enc/fwd_rnn_L1/wm/var
I0420 07:30:24.166438 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.rnn[2].bak_rnn.cell.b (4096,) 4096 librispeech/enc/bak_rnn_L2/b/var
I0420 07:30:24.166517 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.rnn[2].bak_rnn.cell.wm (3072, 4096) 12582912 librispeech/enc/bak_rnn_L2/wm/var
I0420 07:30:24.166599 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.rnn[2].fwd_rnn.cell.b (4096,) 4096 librispeech/enc/fwd_rnn_L2/b/var
I0420 07:30:24.166678 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.rnn[2].fwd_rnn.cell.wm (3072, 4096) 12582912 librispeech/enc/fwd_rnn_L2/wm/var
I0420 07:30:24.166765 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.rnn[3].bak_rnn.cell.b (4096,) 4096 librispeech/enc/bak_rnn_L3/b/var
I0420 07:30:24.166847 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.rnn[3].bak_rnn.cell.wm (3072, 4096) 12582912 librispeech/enc/bak_rnn_L3/wm/var
I0420 07:30:24.166928 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.rnn[3].fwd_rnn.cell.b (4096,) 4096 librispeech/enc/fwd_rnn_L3/b/var
I0420 07:30:24.167011 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.rnn[3].fwd_rnn.cell.wm (3072, 4096) 12582912 librispeech/enc/fwd_rnn_L3/wm/var
I0420 07:30:24.167092 140395597309760 py_utils.py:1267] MODEL ANALYSIS: ====================================================================================================
I0420 07:30:24.167171 140395597309760 py_utils.py:1267] MODEL ANALYSIS: total #params: 132078844
I0420 07:30:24.167253 140395597309760 py_utils.py:1267] MODEL ANALYSIS:
I0420 07:30:26.209583 140395597309760 trainer.py:1263] Job trainer start
I0420 07:30:26.219883 140395597309760 base_runner.py:67] ============================================================
I0420 07:30:26.225033 140395597309760 base_runner.py:69] allow_implicit_capture : NoneType
I0420 07:30:26.225131 140395597309760 base_runner.py:69] cls : type/lingvo.core.base_model/SingleTaskModel
I0420 07:30:26.225227 140395597309760 base_runner.py:69] cluster.add_summary : NoneType
I0420 07:30:26.225331 140395597309760 base_runner.py:69] cluster.cls : type/lingvo.core.cluster/_Cluster
I0420 07:30:26.225426 140395597309760 base_runner.py:69] cluster.controller.cpus_per_replica : 1
I0420 07:30:26.225514 140395597309760 base_runner.py:69] cluster.controller.devices_per_split : 1
I0420 07:30:26.225601 140395597309760 base_runner.py:69] cluster.controller.gpus_per_replica : 0
I0420 07:30:26.225688 140395597309760 base_runner.py:69] cluster.controller.name : '/job:local'
I0420 07:30:26.225781 140395597309760 base_runner.py:69] cluster.controller.num_tpu_hosts : 0
I0420 07:30:26.225868 140395597309760 base_runner.py:69] cluster.controller.replicas : 1
I0420 07:30:26.225955 140395597309760 base_runner.py:69] cluster.controller.tpus_per_replica : 0
I0420 07:30:26.226039 140395597309760
base_runner.py:69] cluster.decoder.cpus_per_replica : 1 I0420 07:30:26.226125 140395597309760 base_runner.py:69] cluster.decoder.devices_per_split : 1 I0420 07:30:26.226210 140395597309760 base_runner.py:69] cluster.decoder.gpus_per_replica : 1 I0420 07:30:26.226296 140395597309760 base_runner.py:69] cluster.decoder.name : '/job:local' I0420 07:30:26.226378 140395597309760 base_runner.py:69] cluster.decoder.num_tpu_hosts : 0 I0420 07:30:26.226463 140395597309760 base_runner.py:69] cluster.decoder.replicas : 1 I0420 07:30:26.226557 140395597309760 base_runner.py:69] cluster.decoder.tpus_per_replica : 0 I0420 07:30:26.226644 140395597309760 base_runner.py:69] cluster.evaler.cpus_per_replica : 1 I0420 07:30:26.226736 140395597309760 base_runner.py:69] cluster.evaler.devices_per_split : 1 I0420 07:30:26.226835 140395597309760 base_runner.py:69] cluster.evaler.gpus_per_replica : 1 I0420 07:30:26.226921 140395597309760 base_runner.py:69] cluster.evaler.name : '/job:local' I0420 07:30:26.227004 140395597309760 base_runner.py:69] cluster.evaler.num_tpu_hosts : 0 I0420 07:30:26.227087 140395597309760 base_runner.py:69] cluster.evaler.replicas : 1 I0420 07:30:26.227170 140395597309760 base_runner.py:69] cluster.evaler.tpus_per_replica : 0 I0420 07:30:26.227253 140395597309760 base_runner.py:69] cluster.input.cpus_per_replica : 1 I0420 07:30:26.227339 140395597309760 base_runner.py:69] cluster.input.devices_per_split : 1 I0420 07:30:26.227423 140395597309760 base_runner.py:69] cluster.input.gpus_per_replica : 0 I0420 07:30:26.227508 140395597309760 base_runner.py:69] cluster.input.name : '/job:local' I0420 07:30:26.227591 140395597309760 base_runner.py:69] cluster.input.num_tpu_hosts : 0 I0420 07:30:26.227674 140395597309760 base_runner.py:69] cluster.input.replicas : 0 I0420 07:30:26.227766 140395597309760 base_runner.py:69] cluster.input.tpus_per_replica : 0 I0420 07:30:26.227849 140395597309760 base_runner.py:69] cluster.job : 'trainer' I0420 07:30:26.227933 
140395597309760 base_runner.py:69] cluster.mode : 'async' I0420 07:30:26.228017 140395597309760 base_runner.py:69] cluster.ps.cpus_per_replica : 1 I0420 07:30:26.228100 140395597309760 base_runner.py:69] cluster.ps.devices_per_split : 1 I0420 07:30:26.228184 140395597309760 base_runner.py:69] cluster.ps.gpus_per_replica : 0 I0420 07:30:26.228266 140395597309760 base_runner.py:69] cluster.ps.name : '/job:local' I0420 07:30:26.228353 140395597309760 base_runner.py:69] cluster.ps.num_tpu_hosts : 0 I0420 07:30:26.228435 140395597309760 base_runner.py:69] cluster.ps.replicas : 1 I0420 07:30:26.228518 140395597309760 base_runner.py:69] cluster.ps.tpus_per_replica : 0 I0420 07:30:26.228601 140395597309760 base_runner.py:69] cluster.task : 0 I0420 07:30:26.228687 140395597309760 base_runner.py:69] cluster.worker.cpus_per_replica : 1 I0420 07:30:26.228792 140395597309760 base_runner.py:69] cluster.worker.devices_per_split : 1 I0420 07:30:26.228882 140395597309760 base_runner.py:69] cluster.worker.gpus_per_replica : 4 I0420 07:30:26.228970 140395597309760 base_runner.py:69] cluster.worker.name : '/job:local' I0420 07:30:26.229058 140395597309760 base_runner.py:69] cluster.worker.num_tpu_hosts : 0 I0420 07:30:26.229155 140395597309760 base_runner.py:69] cluster.worker.replicas : 1 I0420 07:30:26.229240 140395597309760 base_runner.py:69] cluster.worker.tpus_per_replica : 0 I0420 07:30:26.229325 140395597309760 base_runner.py:69] dtype : float32 I0420 07:30:26.229409 140395597309760 base_runner.py:69] fprop_dtype : NoneType I0420 07:30:26.229495 140395597309760 base_runner.py:69] inference_driver_name : NoneType I0420 07:30:26.229579 140395597309760 base_runner.py:69] input.allow_implicit_capture : NoneType I0420 07:30:26.229664 140395597309760 base_runner.py:69] input.append_eos_frame : True I0420 07:30:26.229756 140395597309760 base_runner.py:69] input.bucket_adjust_every_n : 0 I0420 07:30:26.229837 140395597309760 base_runner.py:69] input.bucket_batch_limit : [64, 32, 32, 
32, 32, 32, 32, 32] I0420 07:30:26.229921 140395597309760 base_runner.py:69] input.bucket_upper_bound : [639, 1062, 1275, 1377, 1449, 1506, 1563, 1710] I0420 07:30:26.230006 140395597309760 base_runner.py:69] input.cls : type/lingvo.tasks.asr.input_generator/AsrInput I0420 07:30:26.230089 140395597309760 base_runner.py:69] input.dtype : float32 I0420 07:30:26.230174 140395597309760 base_runner.py:69] input.file_buffer_size : 10000 I0420 07:30:26.230257 140395597309760 base_runner.py:69] input.file_parallelism : 16 I0420 07:30:26.230343 140395597309760 base_runner.py:69] input.file_pattern : 'tfrecord:/data/dingzhenyou/speech_data/librispeech/train/train.tfrecords-*' I0420 07:30:26.230433 140395597309760 base_runner.py:69] input.file_random_seed : 0 I0420 07:30:26.230520 140395597309760 base_runner.py:69] input.flush_every_n : 0 I0420 07:30:26.230608 140395597309760 base_runner.py:69] input.fprop_dtype : NoneType I0420 07:30:26.230688 140395597309760 base_runner.py:69] input.frame_size : 80 I0420 07:30:26.230776 140395597309760 base_runner.py:69] input.inference_driver_name : NoneType I0420 07:30:26.230861 140395597309760 base_runner.py:69] input.is_eval : False I0420 07:30:26.230947 140395597309760 base_runner.py:69] input.is_inference : NoneType I0420 07:30:26.231029 140395597309760 base_runner.py:69] input.name : 'input' I0420 07:30:26.231112 140395597309760 base_runner.py:69] input.num_batcher_threads : 1 I0420 07:30:26.231197 140395597309760 base_runner.py:69] input.num_samples : 281241 I0420 07:30:26.231281 140395597309760 base_runner.py:69] input.pad_to_max_seq_length : False I0420 07:30:26.231364 140395597309760 base_runner.py:69] input.params_init.method : 'xavier' I0420 07:30:26.231447 140395597309760 base_runner.py:69] input.params_init.scale : 1.000001 I0420 07:30:26.231534 140395597309760 base_runner.py:69] input.params_init.seed : NoneType I0420 07:30:26.231617 140395597309760 base_runner.py:69] input.random_seed : NoneType I0420 07:30:26.231700 
140395597309760 base_runner.py:69] input.require_sequential_order : False I0420 07:30:26.231795 140395597309760 base_runner.py:69] input.skip_lp_regularization : NoneType I0420 07:30:26.231879 140395597309760 base_runner.py:69] input.source_max_length : 3000 I0420 07:30:26.231966 140395597309760 base_runner.py:69] input.target_max_length : 620 I0420 07:30:26.232048 140395597309760 base_runner.py:69] input.tokenizer.allow_implicit_capture : NoneType I0420 07:30:26.232131 140395597309760 base_runner.py:69] input.tokenizer.append_eos : True I0420 07:30:26.232215 140395597309760 base_runner.py:69] input.tokenizer.cls : type/lingvo.core.tokenizers/AsciiTokenizer I0420 07:30:26.232300 140395597309760 base_runner.py:69] input.tokenizer.dtype : float32 I0420 07:30:26.232384 140395597309760 base_runner.py:69] input.tokenizer.fprop_dtype : NoneType I0420 07:30:26.232472 140395597309760 base_runner.py:69] input.tokenizer.inference_driver_name : NoneType I0420 07:30:26.232552 140395597309760 base_runner.py:69] input.tokenizer.is_eval : NoneType I0420 07:30:26.232635 140395597309760 base_runner.py:69] input.tokenizer.is_inference : NoneType I0420 07:30:26.232718 140395597309760 base_runner.py:69] input.tokenizer.name : 'tokenizer' I0420 07:30:26.232809 140395597309760 base_runner.py:69] input.tokenizer.pad_to_max_length : True I0420 07:30:26.232893 140395597309760 base_runner.py:69] input.tokenizer.params_init.method : 'xavier' I0420 07:30:26.232979 140395597309760 base_runner.py:69] input.tokenizer.params_init.scale : 1.000001 I0420 07:30:26.233062 140395597309760 base_runner.py:69] input.tokenizer.params_init.seed : NoneType I0420 07:30:26.233146 140395597309760 base_runner.py:69] input.tokenizer.random_seed : NoneType I0420 07:30:26.233231 140395597309760 base_runner.py:69] input.tokenizer.skip_lp_regularization : NoneType I0420 07:30:26.233316 140395597309760 base_runner.py:69] input.tokenizer.target_eos_id : 2 I0420 07:30:26.233401 140395597309760 base_runner.py:69] 
input.tokenizer.target_sos_id : 1 I0420 07:30:26.233484 140395597309760 base_runner.py:69] input.tokenizer.target_unk_id : 0 I0420 07:30:26.233570 140395597309760 base_runner.py:69] input.tokenizer.vn.global_vn : False I0420 07:30:26.233652 140395597309760 base_runner.py:69] input.tokenizer.vn.per_step_vn : False I0420 07:30:26.233743 140395597309760 base_runner.py:69] input.tokenizer.vn.scale : NoneType I0420 07:30:26.233827 140395597309760 base_runner.py:69] input.tokenizer.vn.seed : NoneType I0420 07:30:26.233911 140395597309760 base_runner.py:69] input.tokenizer.vocab_size : 76 I0420 07:30:26.233994 140395597309760 base_runner.py:69] input.tokenizer_dict : {} I0420 07:30:26.234086 140395597309760 base_runner.py:69] input.tpu_infeed_parallism : 1 I0420 07:30:26.234169 140395597309760 base_runner.py:69] input.use_per_host_infeed : False I0420 07:30:26.234253 140395597309760 base_runner.py:69] input.use_within_batch_mixing : False I0420 07:30:26.234337 140395597309760 base_runner.py:69] input.vn.global_vn : False I0420 07:30:26.234421 140395597309760 base_runner.py:69] input.vn.per_step_vn : False I0420 07:30:26.234503 140395597309760 base_runner.py:69] input.vn.scale : NoneType I0420 07:30:26.234589 140395597309760 base_runner.py:69] input.vn.seed : NoneType I0420 07:30:26.234672 140395597309760 base_runner.py:69] is_eval : NoneType I0420 07:30:26.234760 140395597309760 base_runner.py:69] is_inference : NoneType I0420 07:30:26.234848 140395597309760 base_runner.py:69] model : 'asr.librispeech.Librispeech960Grapheme@/data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/tasks/asr/params/librispeech.py:181' I0420 07:30:26.234934 140395597309760 base_runner.py:69] name : '' I0420 07:30:26.235013 140395597309760 base_runner.py:69] params_init.method : 'xavier' I0420 07:30:26.235105 140395597309760 base_runner.py:69] params_init.scale : 1.000001 I0420 07:30:26.235184 140395597309760 base_runner.py:69] params_init.seed : NoneType I0420 
07:30:26.235270 140395597309760 base_runner.py:69] random_seed : NoneType I0420 07:30:26.235354 140395597309760 base_runner.py:69] skip_lp_regularization : NoneType I0420 07:30:26.235438 140395597309760 base_runner.py:69] task.allow_implicit_capture : NoneType I0420 07:30:26.235522 140395597309760 base_runner.py:69] task.cls : type/lingvo.tasks.asr.model/AsrModel I0420 07:30:26.235611 140395597309760 base_runner.py:69] task.decoder.allow_implicit_capture : NoneType I0420 07:30:26.235691 140395597309760 base_runner.py:69] task.decoder.atten_context_dim : 0 I0420 07:30:26.235780 140395597309760 base_runner.py:69] task.decoder.attention.allow_implicit_capture : NoneType I0420 07:30:26.235867 140395597309760 base_runner.py:69] task.decoder.attention.atten_dropout_deterministic : False I0420 07:30:26.235975 140395597309760 base_runner.py:69] task.decoder.attention.atten_dropout_prob : 0.0 I0420 07:30:26.236058 140395597309760 base_runner.py:69] task.decoder.attention.cls : type/lingvo.core.attention/AdditiveAttention I0420 07:30:26.236149 140395597309760 base_runner.py:69] task.decoder.attention.dtype : float32 I0420 07:30:26.236237 140395597309760 base_runner.py:69] task.decoder.attention.fprop_dtype : NoneType I0420 07:30:26.236335 140395597309760 base_runner.py:69] task.decoder.attention.hidden_dim : 128 I0420 07:30:26.236421 140395597309760 base_runner.py:69] task.decoder.attention.inference_driver_name : NoneType I0420 07:30:26.236505 140395597309760 base_runner.py:69] task.decoder.attention.is_eval : NoneType I0420 07:30:26.236589 140395597309760 base_runner.py:69] task.decoder.attention.is_inference : NoneType I0420 07:30:26.236674 140395597309760 base_runner.py:69] task.decoder.attention.name : '' I0420 07:30:26.236763 140395597309760 base_runner.py:69] task.decoder.attention.packed_input : False I0420 07:30:26.236850 140395597309760 base_runner.py:69] task.decoder.attention.params_init.method : 'uniform_sqrt_dim' I0420 07:30:26.236936 140395597309760 
base_runner.py:69] task.decoder.attention.params_init.scale : 1.73205080757 I0420 07:30:26.237021 140395597309760 base_runner.py:69] task.decoder.attention.params_init.seed : NoneType I0420 07:30:26.237107 140395597309760 base_runner.py:69] task.decoder.attention.qdomain.default : NoneType I0420 07:30:26.237190 140395597309760 base_runner.py:69] task.decoder.attention.qdomain.fullyconnected : NoneType I0420 07:30:26.237273 140395597309760 base_runner.py:69] task.decoder.attention.qdomain.softmax : NoneType I0420 07:30:26.237356 140395597309760 base_runner.py:69] task.decoder.attention.query_dim : 0 I0420 07:30:26.237441 140395597309760 base_runner.py:69] task.decoder.attention.random_seed : NoneType I0420 07:30:26.237524 140395597309760 base_runner.py:69] task.decoder.attention.same_batch_size : False I0420 07:30:26.237612 140395597309760 base_runner.py:69] task.decoder.attention.skip_lp_regularization : NoneType I0420 07:30:26.237698 140395597309760 base_runner.py:69] task.decoder.attention.source_dim : 0 I0420 07:30:26.237788 140395597309760 base_runner.py:69] task.decoder.attention.vn.global_vn : False I0420 07:30:26.237874 140395597309760 base_runner.py:69] task.decoder.attention.vn.per_step_vn : False I0420 07:30:26.237957 140395597309760 base_runner.py:69] task.decoder.attention.vn.scale : NoneType I0420 07:30:26.238043 140395597309760 base_runner.py:69] task.decoder.attention.vn.seed : NoneType I0420 07:30:26.238126 140395597309760 base_runner.py:69] task.decoder.attention_plot_font_properties : FontProperties I0420 07:30:26.238209 140395597309760 base_runner.py:69] task.decoder.beam_search.allow_empty_terminated_hyp : True I0420 07:30:26.238293 140395597309760 base_runner.py:69] task.decoder.beam_search.allow_implicit_capture : NoneType I0420 07:30:26.238375 140395597309760 base_runner.py:69] task.decoder.beam_search.batch_major_state : True I0420 07:30:26.238461 140395597309760 base_runner.py:69] task.decoder.beam_search.beam_size : 3.0 I0420 
07:30:26.238543 140395597309760 base_runner.py:69] task.decoder.beam_search.cls : type/lingvo.core.beam_search_helper/BeamSearchHelper I0420 07:30:26.238630 140395597309760 base_runner.py:69] task.decoder.beam_search.coverage_penalty : 0.0 I0420 07:30:26.238713 140395597309760 base_runner.py:69] task.decoder.beam_search.dtype : float32 I0420 07:30:26.238805 140395597309760 base_runner.py:69] task.decoder.beam_search.ensure_full_beam : False I0420 07:30:26.238889 140395597309760 base_runner.py:69] task.decoder.beam_search.force_eos_in_last_step : False I0420 07:30:26.238975 140395597309760 base_runner.py:69] task.decoder.beam_search.fprop_dtype : NoneType I0420 07:30:26.239058 140395597309760 base_runner.py:69] task.decoder.beam_search.inference_driver_name : NoneType I0420 07:30:26.239142 140395597309760 base_runner.py:69] task.decoder.beam_search.is_eval : NoneType I0420 07:30:26.239226 140395597309760 base_runner.py:69] task.decoder.beam_search.is_inference : NoneType I0420 07:30:26.239310 140395597309760 base_runner.py:69] task.decoder.beam_search.length_normalization : 0.0 I0420 07:30:26.239393 140395597309760 base_runner.py:69] task.decoder.beam_search.merge_paths : False I0420 07:30:26.239479 140395597309760 base_runner.py:69] task.decoder.beam_search.name : 'beam_search' I0420 07:30:26.239561 140395597309760 base_runner.py:69] task.decoder.beam_search.num_hyps_per_beam : 8 I0420 07:30:26.239645 140395597309760 base_runner.py:69] task.decoder.beam_search.params_init.method : 'xavier' I0420 07:30:26.239734 140395597309760 base_runner.py:69] task.decoder.beam_search.params_init.scale : 1.000001 I0420 07:30:26.239824 140395597309760 base_runner.py:69] task.decoder.beam_search.params_init.seed : NoneType I0420 07:30:26.239908 140395597309760 base_runner.py:69] task.decoder.beam_search.random_seed : NoneType I0420 07:30:26.239993 140395597309760 base_runner.py:69] task.decoder.beam_search.skip_lp_regularization : NoneType I0420 07:30:26.240078 140395597309760 
base_runner.py:69] task.decoder.beam_search.target_eoc_id : -1 I0420 07:30:26.240160 140395597309760 base_runner.py:69] task.decoder.beam_search.target_eos_id : 2 I0420 07:30:26.240243 140395597309760 base_runner.py:69] task.decoder.beam_search.target_seq_len : 0 I0420 07:30:26.240329 140395597309760 base_runner.py:69] task.decoder.beam_search.target_seq_length_ratio : 1.0 I0420 07:30:26.240412 140395597309760 base_runner.py:69] task.decoder.beam_search.target_sos_id : 1 I0420 07:30:26.240497 140395597309760 base_runner.py:69] task.decoder.beam_search.valid_eos_max_logit_delta : 5.0 I0420 07:30:26.240581 140395597309760 base_runner.py:69] task.decoder.beam_search.vn.global_vn : False I0420 07:30:26.240664 140395597309760 base_runner.py:69] task.decoder.beam_search.vn.per_step_vn : False I0420 07:30:26.240751 140395597309760 base_runner.py:69] task.decoder.beam_search.vn.scale : NoneType I0420 07:30:26.240845 140395597309760 base_runner.py:69] task.decoder.beam_search.vn.seed : NoneType I0420 07:30:26.240926 140395597309760 base_runner.py:69] task.decoder.cls : type/lingvo.tasks.asr.decoder/AsrDecoder I0420 07:30:26.241009 140395597309760 base_runner.py:69] task.decoder.contextualizer.allow_implicit_capture : NoneType I0420 07:30:26.241094 140395597309760 base_runner.py:69] task.decoder.contextualizer.cls : type/lingvo.tasks.asr.contextualizer_base/NullContextualizer I0420 07:30:26.241178 140395597309760 base_runner.py:69] task.decoder.contextualizer.dtype : float32 I0420 07:30:26.241261 140395597309760 base_runner.py:69] task.decoder.contextualizer.fprop_dtype : NoneType I0420 07:30:26.241348 140395597309760 base_runner.py:69] task.decoder.contextualizer.inference_driver_name : NoneType I0420 07:30:26.241427 140395597309760 base_runner.py:69] task.decoder.contextualizer.is_eval : NoneType I0420 07:30:26.241512 140395597309760 base_runner.py:69] task.decoder.contextualizer.is_inference : NoneType I0420 07:30:26.241595 140395597309760 base_runner.py:69] 
task.decoder.contextualizer.name : '' I0420 07:30:26.241678 140395597309760 base_runner.py:69] task.decoder.contextualizer.params_init.method : 'xavier' I0420 07:30:26.241766 140395597309760 base_runner.py:69] task.decoder.contextualizer.params_init.scale : 1.000001 I0420 07:30:26.241852 140395597309760 base_runner.py:69] task.decoder.contextualizer.params_init.seed : NoneType I0420 07:30:26.241936 140395597309760 base_runner.py:69] task.decoder.contextualizer.random_seed : NoneType I0420 07:30:26.242022 140395597309760 base_runner.py:69] task.decoder.contextualizer.skip_lp_regularization : NoneType I0420 07:30:26.242106 140395597309760 base_runner.py:69] task.decoder.contextualizer.vn.global_vn : False I0420 07:30:26.242187 140395597309760 base_runner.py:69] task.decoder.contextualizer.vn.per_step_vn : False I0420 07:30:26.242274 140395597309760 base_runner.py:69] task.decoder.contextualizer.vn.scale : NoneType I0420 07:30:26.242360 140395597309760 base_runner.py:69] task.decoder.contextualizer.vn.seed : NoneType I0420 07:30:26.242445 140395597309760 base_runner.py:69] task.decoder.dropout_prob : 0.0 I0420 07:30:26.242528 140395597309760 base_runner.py:69] task.decoder.dtype : float32 I0420 07:30:26.242613 140395597309760 base_runner.py:69] task.decoder.emb.allow_implicit_capture : NoneType I0420 07:30:26.242696 140395597309760 base_runner.py:69] task.decoder.emb.cls : type/lingvo.core.layers/EmbeddingLayer I0420 07:30:26.242794 140395597309760 base_runner.py:69] task.decoder.emb.dtype : float32 I0420 07:30:26.242881 140395597309760 base_runner.py:69] task.decoder.emb.embedding_dim : 0 I0420 07:30:26.242964 140395597309760 base_runner.py:69] task.decoder.emb.fprop_dtype : NoneType I0420 07:30:26.243048 140395597309760 base_runner.py:69] task.decoder.emb.inference_driver_name : NoneType I0420 07:30:26.243130 140395597309760 base_runner.py:69] task.decoder.emb.is_eval : NoneType I0420 07:30:26.243216 140395597309760 base_runner.py:69] task.decoder.emb.is_inference : 
NoneType I0420 07:30:26.243298 140395597309760 base_runner.py:69] task.decoder.emb.max_num_shards : 1 I0420 07:30:26.243381 140395597309760 base_runner.py:69] task.decoder.emb.name : '' I0420 07:30:26.243465 140395597309760 base_runner.py:69] task.decoder.emb.on_ps : True I0420 07:30:26.243550 140395597309760 base_runner.py:69] task.decoder.emb.params_init.method : 'uniform' I0420 07:30:26.243633 140395597309760 base_runner.py:69] task.decoder.emb.params_init.scale : 1.0 I0420 07:30:26.243727 140395597309760 base_runner.py:69] task.decoder.emb.params_init.seed : NoneType I0420 07:30:26.243807 140395597309760 base_runner.py:69] task.decoder.emb.random_seed : NoneType I0420 07:30:26.243894 140395597309760 base_runner.py:69] task.decoder.emb.scale_sqrt_depth : False I0420 07:30:26.243978 140395597309760 base_runner.py:69] task.decoder.emb.skip_lp_regularization : NoneType I0420 07:30:26.244060 140395597309760 base_runner.py:69] task.decoder.emb.vn.global_vn : False I0420 07:30:26.244147 140395597309760 base_runner.py:69] task.decoder.emb.vn.per_step_vn : False I0420 07:30:26.244236 140395597309760 base_runner.py:69] task.decoder.emb.vn.scale : NoneType I0420 07:30:26.244321 140395597309760 base_runner.py:69] task.decoder.emb.vn.seed : NoneType I0420 07:30:26.244405 140395597309760 base_runner.py:69] task.decoder.emb.vocab_size : 76 I0420 07:30:26.244488 140395597309760 base_runner.py:69] task.decoder.emb_dim : 76 I0420 07:30:26.244571 140395597309760 base_runner.py:69] task.decoder.fprop_dtype : NoneType I0420 07:30:26.244653 140395597309760 base_runner.py:69] task.decoder.fusion.allow_implicit_capture : NoneType I0420 07:30:26.244745 140395597309760 base_runner.py:69] task.decoder.fusion.base_model_logits_dim : NoneType I0420 07:30:26.244829 140395597309760 base_runner.py:69] task.decoder.fusion.cls : type/lingvo.tasks.asr.fusion/NullFusion I0420 07:30:26.244915 140395597309760 base_runner.py:69] task.decoder.fusion.dtype : float32 I0420 07:30:26.244997 
140395597309760 base_runner.py:69] task.decoder.fusion.fprop_dtype : NoneType I0420 07:30:26.245081 140395597309760 base_runner.py:69] task.decoder.fusion.inference_driver_name : NoneType I0420 07:30:26.245165 140395597309760 base_runner.py:69] task.decoder.fusion.is_eval : NoneType I0420 07:30:26.245248 140395597309760 base_runner.py:69] task.decoder.fusion.is_inference : NoneType I0420 07:30:26.245331 140395597309760 base_runner.py:69] task.decoder.fusion.lm.allow_implicit_capture : NoneType I0420 07:30:26.245415 140395597309760 base_runner.py:69] task.decoder.fusion.lm.cls : type/lingvo.tasks.lm.layers/NullLm I0420 07:30:26.245498 140395597309760 base_runner.py:69] task.decoder.fusion.lm.dtype : float32 I0420 07:30:26.245584 140395597309760 base_runner.py:69] task.decoder.fusion.lm.fprop_dtype : NoneType I0420 07:30:26.245666 140395597309760 base_runner.py:69] task.decoder.fusion.lm.inference_driver_name : NoneType I0420 07:30:26.245759 140395597309760 base_runner.py:69] task.decoder.fusion.lm.is_eval : NoneType I0420 07:30:26.245843 140395597309760 base_runner.py:69] task.decoder.fusion.lm.is_inference : NoneType I0420 07:30:26.245923 140395597309760 base_runner.py:69] task.decoder.fusion.lm.name : '' I0420 07:30:26.246007 140395597309760 base_runner.py:69] task.decoder.fusion.lm.params_init.method : 'xavier' I0420 07:30:26.246092 140395597309760 base_runner.py:69] task.decoder.fusion.lm.params_init.scale : 1.000001 I0420 07:30:26.246176 140395597309760 base_runner.py:69] task.decoder.fusion.lm.params_init.seed : NoneType I0420 07:30:26.246260 140395597309760 base_runner.py:69] task.decoder.fusion.lm.random_seed : NoneType I0420 07:30:26.246344 140395597309760 base_runner.py:69] task.decoder.fusion.lm.skip_lp_regularization : NoneType I0420 07:30:26.246428 140395597309760 base_runner.py:69] task.decoder.fusion.lm.vn.global_vn : False I0420 07:30:26.246514 140395597309760 base_runner.py:69] task.decoder.fusion.lm.vn.per_step_vn : False I0420 07:30:26.246598 
140395597309760 base_runner.py:69] task.decoder.fusion.lm.vn.scale : NoneType I0420 07:30:26.246678 140395597309760 base_runner.py:69] task.decoder.fusion.lm.vn.seed : NoneType I0420 07:30:26.246769 140395597309760 base_runner.py:69] task.decoder.fusion.lm.vocab_size : 96 I0420 07:30:26.246859 140395597309760 base_runner.py:69] task.decoder.fusion.name : '' I0420 07:30:26.246937 140395597309760 base_runner.py:69] task.decoder.fusion.params_init.method : 'xavier' I0420 07:30:26.247023 140395597309760 base_runner.py:69] task.decoder.fusion.params_init.scale : 1.000001 I0420 07:30:26.247107 140395597309760 base_runner.py:69] task.decoder.fusion.params_init.seed : NoneType I0420 07:30:26.247189 140395597309760 base_runner.py:69] task.decoder.fusion.random_seed : NoneType I0420 07:30:26.247276 140395597309760 base_runner.py:69] task.decoder.fusion.skip_lp_regularization : NoneType I0420 07:30:26.247361 140395597309760 base_runner.py:69] task.decoder.fusion.vn.global_vn : False I0420 07:30:26.247445 140395597309760 base_runner.py:69] task.decoder.fusion.vn.per_step_vn : False I0420 07:30:26.247528 140395597309760 base_runner.py:69] task.decoder.fusion.vn.scale : NoneType I0420 07:30:26.247617 140395597309760 base_runner.py:69] task.decoder.fusion.vn.seed : NoneType I0420 07:30:26.247699 140395597309760 base_runner.py:69] task.decoder.inference_driver_name : NoneType I0420 07:30:26.247792 140395597309760 base_runner.py:69] task.decoder.is_eval : NoneType I0420 07:30:26.247878 140395597309760 base_runner.py:69] task.decoder.is_inference : NoneType I0420 07:30:26.247961 140395597309760 base_runner.py:69] task.decoder.label_smoothing : NoneType I0420 07:30:26.248045 140395597309760 base_runner.py:69] task.decoder.logit_types : {'logits': 1.0} I0420 07:30:26.248128 140395597309760 base_runner.py:69] task.decoder.min_ground_truth_prob : 1.0 I0420 07:30:26.248213 140395597309760 base_runner.py:69] task.decoder.min_prob_step : 1000000.0 I0420 07:30:26.248296 140395597309760 
base_runner.py:69] task.decoder.name : '' I0420 07:30:26.248378 140395597309760 base_runner.py:69] task.decoder.packed_input : False I0420 07:30:26.248464 140395597309760 base_runner.py:69] task.decoder.parallel_iterations : 30 I0420 07:30:26.248547 140395597309760 base_runner.py:69] task.decoder.params_init.method : 'xavier' I0420 07:30:26.248631 140395597309760 base_runner.py:69] task.decoder.params_init.scale : 1.000001 I0420 07:30:26.248712 140395597309760 base_runner.py:69] task.decoder.params_init.seed : NoneType I0420 07:30:26.248805 140395597309760 base_runner.py:69] task.decoder.per_token_avg_loss : True I0420 07:30:26.248888 140395597309760 base_runner.py:69] task.decoder.prob_decay_start_step : 10000.0 I0420 07:30:26.248972 140395597309760 base_runner.py:69] task.decoder.random_seed : NoneType I0420 07:30:26.249056 140395597309760 base_runner.py:69] task.decoder.residual_start : 0 I0420 07:30:26.249140 140395597309760 base_runner.py:69] task.decoder.rnn_cell_dim : 1024 I0420 07:30:26.249222 140395597309760 base_runner.py:69] task.decoder.rnn_cell_hidden_dim : 0 I0420 07:30:26.249308 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.allow_implicit_capture : NoneType I0420 07:30:26.249392 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.apply_pruning : False I0420 07:30:26.249478 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.bias_init.method : 'constant' I0420 07:30:26.249562 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.bias_init.scale : 0.0 I0420 07:30:26.249648 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.bias_init.seed : 0 I0420 07:30:26.249737 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.cell_value_cap : 10.0 I0420 07:30:26.249821 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.cls : type/lingvo.core.rnn_cell/LSTMCellSimple I0420 07:30:26.249906 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.couple_input_forget_gates : False I0420 
07:30:26.249989 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.dtype : float32 I0420 07:30:26.250072 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.enable_lstm_bias : True I0420 07:30:26.250157 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.forget_gate_bias : 0.0 I0420 07:30:26.250240 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.fprop_dtype : NoneType I0420 07:30:26.250324 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.inference_driver_name : NoneType I0420 07:30:26.250407 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.inputs_arity : 1 I0420 07:30:26.250492 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.is_eval : NoneType I0420 07:30:26.250572 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.is_inference : NoneType I0420 07:30:26.250659 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.name : '' I0420 07:30:26.250751 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.num_hidden_nodes : 0 I0420 07:30:26.250835 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.num_input_nodes : 0 I0420 07:30:26.250921 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.num_output_nodes : 0 I0420 07:30:26.251012 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.output_nonlinearity : True I0420 07:30:26.251096 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.params_init.method : 'uniform' I0420 07:30:26.251180 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.params_init.scale : 0.1 I0420 07:30:26.251264 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.params_init.seed : NoneType I0420 07:30:26.251348 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.qdomain.c_state : NoneType I0420 07:30:26.251435 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.qdomain.default : NoneType I0420 07:30:26.251521 140395597309760 base_runner.py:69] 
I0420 07:30:26.251600 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.qdomain.fullyconnected : NoneType
task.decoder.rnn_cell_tpl.qdomain.m_state : NoneType
task.decoder.rnn_cell_tpl.qdomain.weight : NoneType
task.decoder.rnn_cell_tpl.random_seed : NoneType
task.decoder.rnn_cell_tpl.reset_cell_state : False
task.decoder.rnn_cell_tpl.skip_lp_regularization : NoneType
task.decoder.rnn_cell_tpl.vn.global_vn : False
task.decoder.rnn_cell_tpl.vn.per_step_vn : False
task.decoder.rnn_cell_tpl.vn.scale : NoneType
task.decoder.rnn_cell_tpl.vn.seed : NoneType
task.decoder.rnn_cell_tpl.zero_state_init_params.method : 'zeros'
task.decoder.rnn_cell_tpl.zero_state_init_params.seed : NoneType
task.decoder.rnn_cell_tpl.zo_prob : 0.0
task.decoder.rnn_layers : 2
task.decoder.skip_lp_regularization : NoneType
task.decoder.softmax.allow_implicit_capture : NoneType
task.decoder.softmax.apply_pruning : False
task.decoder.softmax.chunk_size : 0
task.decoder.softmax.cls : type/lingvo.core.layers/SimpleFullSoftmax
task.decoder.softmax.dtype : float32
task.decoder.softmax.fprop_dtype : NoneType
task.decoder.softmax.inference_driver_name : NoneType
task.decoder.softmax.input_dim : 0
task.decoder.softmax.is_eval : NoneType
task.decoder.softmax.is_inference : NoneType
task.decoder.softmax.logits_abs_max : NoneType
task.decoder.softmax.name : ''
task.decoder.softmax.num_classes : 76
task.decoder.softmax.num_sampled : 0
task.decoder.softmax.num_shards : 1
task.decoder.softmax.params_init.method : 'uniform'
task.decoder.softmax.params_init.scale : 0.1
task.decoder.softmax.params_init.seed : NoneType
task.decoder.softmax.qdomain.default : NoneType
task.decoder.softmax.random_seed : NoneType
task.decoder.softmax.skip_lp_regularization : NoneType
task.decoder.softmax.vn.global_vn : False
task.decoder.softmax.vn.per_step_vn : False
task.decoder.softmax.vn.scale : NoneType
task.decoder.softmax.vn.seed : NoneType
task.decoder.softmax_uses_attention : True
task.decoder.source_dim : 2048
task.decoder.target_eos_id : 2
task.decoder.target_seq_len : 620
task.decoder.target_sequence_sampler.allow_implicit_capture : NoneType
task.decoder.target_sequence_sampler.cls : type/lingvo.core.target_sequence_sampler/TargetSequenceSampler
task.decoder.target_sequence_sampler.dtype : float32
task.decoder.target_sequence_sampler.fprop_dtype : NoneType
task.decoder.target_sequence_sampler.inference_driver_name : NoneType
task.decoder.target_sequence_sampler.is_eval : NoneType
task.decoder.target_sequence_sampler.is_inference : NoneType
task.decoder.target_sequence_sampler.name : 'target_sequence_sampler'
task.decoder.target_sequence_sampler.params_init.method : 'xavier'
task.decoder.target_sequence_sampler.params_init.scale : 1.000001
task.decoder.target_sequence_sampler.params_init.seed : NoneType
task.decoder.target_sequence_sampler.random_seed : NoneType
task.decoder.target_sequence_sampler.skip_lp_regularization : NoneType
task.decoder.target_sequence_sampler.target_eoc_id : -1
task.decoder.target_sequence_sampler.target_eos_id : 2
task.decoder.target_sequence_sampler.target_seq_len : 0
task.decoder.target_sequence_sampler.target_sos_id : 1
task.decoder.target_sequence_sampler.temperature : 1.0
task.decoder.target_sequence_sampler.vn.global_vn : False
task.decoder.target_sequence_sampler.vn.per_step_vn : False
task.decoder.target_sequence_sampler.vn.scale : NoneType
task.decoder.target_sequence_sampler.vn.seed : NoneType
task.decoder.target_sos_id : 1
task.decoder.use_unnormalized_logits_as_log_probs : True
task.decoder.use_while_loop_based_unrolling : False
task.decoder.vn.global_vn : False
task.decoder.vn.per_step_vn : False
task.decoder.vn.scale : NoneType
task.decoder.vn.seed : NoneType
task.dtype : float32
task.encoder.after_conv_lstm_cnn_tpl.activation : 'RELU'
task.encoder.after_conv_lstm_cnn_tpl.allow_implicit_capture : NoneType
task.encoder.after_conv_lstm_cnn_tpl.batch_norm : True
task.encoder.after_conv_lstm_cnn_tpl.bias : False
task.encoder.after_conv_lstm_cnn_tpl.bn_decay : 0.999
task.encoder.after_conv_lstm_cnn_tpl.bn_fold_weights : NoneType
task.encoder.after_conv_lstm_cnn_tpl.causal_convolution : False
task.encoder.after_conv_lstm_cnn_tpl.cls : type/lingvo.core.layers/Conv2DLayer
task.encoder.after_conv_lstm_cnn_tpl.conv_last : False
task.encoder.after_conv_lstm_cnn_tpl.dilation_rate : (1, 1)
task.encoder.after_conv_lstm_cnn_tpl.disable_activation_quantization : False
task.encoder.after_conv_lstm_cnn_tpl.dtype : float32
task.encoder.after_conv_lstm_cnn_tpl.filter_shape : [3, 3, 'NoneType', 'NoneType']
task.encoder.after_conv_lstm_cnn_tpl.filter_stride : [1, 1]
task.encoder.after_conv_lstm_cnn_tpl.fprop_dtype : NoneType
task.encoder.after_conv_lstm_cnn_tpl.inference_driver_name : NoneType
task.encoder.after_conv_lstm_cnn_tpl.is_eval : NoneType
task.encoder.after_conv_lstm_cnn_tpl.is_inference : NoneType
task.encoder.after_conv_lstm_cnn_tpl.name : ''
task.encoder.after_conv_lstm_cnn_tpl.params_init.method : 'truncated_gaussian'
task.encoder.after_conv_lstm_cnn_tpl.params_init.scale : 0.1
task.encoder.after_conv_lstm_cnn_tpl.params_init.seed : NoneType
task.encoder.after_conv_lstm_cnn_tpl.qdomain.default : NoneType
task.encoder.after_conv_lstm_cnn_tpl.random_seed : NoneType
task.encoder.after_conv_lstm_cnn_tpl.skip_lp_regularization : NoneType
task.encoder.after_conv_lstm_cnn_tpl.vn.global_vn : False
task.encoder.after_conv_lstm_cnn_tpl.vn.per_step_vn : False
task.encoder.after_conv_lstm_cnn_tpl.vn.scale : NoneType
task.encoder.after_conv_lstm_cnn_tpl.vn.seed : NoneType
task.encoder.after_conv_lstm_cnn_tpl.weight_norm : False
task.encoder.allow_implicit_capture : NoneType
task.encoder.bidi_rnn_type : 'func'
task.encoder.cls : type/lingvo.tasks.asr.encoder/AsrEncoder
task.encoder.cnn_tpl.activation : 'RELU'
task.encoder.cnn_tpl.allow_implicit_capture : NoneType
task.encoder.cnn_tpl.batch_norm : True
task.encoder.cnn_tpl.bias : False
task.encoder.cnn_tpl.bn_decay : 0.999
task.encoder.cnn_tpl.bn_fold_weights : NoneType
task.encoder.cnn_tpl.causal_convolution : False
task.encoder.cnn_tpl.cls : type/lingvo.core.layers/Conv2DLayer
task.encoder.cnn_tpl.conv_last : False
task.encoder.cnn_tpl.dilation_rate : (1, 1)
task.encoder.cnn_tpl.disable_activation_quantization : False
task.encoder.cnn_tpl.dtype : float32
task.encoder.cnn_tpl.filter_shape : (0, 0, 0, 0)
task.encoder.cnn_tpl.filter_stride : (0, 0)
task.encoder.cnn_tpl.fprop_dtype : NoneType
task.encoder.cnn_tpl.inference_driver_name : NoneType
task.encoder.cnn_tpl.is_eval : NoneType
task.encoder.cnn_tpl.is_inference : NoneType
task.encoder.cnn_tpl.name : ''
task.encoder.cnn_tpl.params_init.method : 'gaussian'
task.encoder.cnn_tpl.params_init.scale : 0.001
task.encoder.cnn_tpl.params_init.seed : NoneType
task.encoder.cnn_tpl.qdomain.default : NoneType
task.encoder.cnn_tpl.random_seed : NoneType
task.encoder.cnn_tpl.skip_lp_regularization : NoneType
task.encoder.cnn_tpl.vn.global_vn : False
task.encoder.cnn_tpl.vn.per_step_vn : False
task.encoder.cnn_tpl.vn.scale : NoneType
task.encoder.cnn_tpl.vn.seed : NoneType
task.encoder.cnn_tpl.weight_norm : False
task.encoder.conv_filter_shapes : [(3, 3, 1, 32), (3, 3, 32, 32)]
task.encoder.conv_filter_strides : [(2, 2), (2, 2)]
task.encoder.conv_lstm_tpl.allow_implicit_capture : NoneType
task.encoder.conv_lstm_tpl.cell_shape : ['NoneType', 'NoneType', 'NoneType', 'NoneType']
task.encoder.conv_lstm_tpl.cell_value_cap : 10.0
task.encoder.conv_lstm_tpl.cls : type/lingvo.core.rnn_cell/ConvLSTMCell
task.encoder.conv_lstm_tpl.dtype : float32
task.encoder.conv_lstm_tpl.filter_shape : [1, 3]
task.encoder.conv_lstm_tpl.fprop_dtype : NoneType
task.encoder.conv_lstm_tpl.inference_driver_name : NoneType
task.encoder.conv_lstm_tpl.inputs_arity : 1
task.encoder.conv_lstm_tpl.inputs_shape : ['NoneType', 'NoneType', 'NoneType', 'NoneType']
task.encoder.conv_lstm_tpl.is_eval : NoneType
task.encoder.conv_lstm_tpl.is_inference : NoneType
task.encoder.conv_lstm_tpl.name : ''
task.encoder.conv_lstm_tpl.num_input_nodes : 0
task.encoder.conv_lstm_tpl.num_output_nodes : 0
task.encoder.conv_lstm_tpl.output_nonlinearity : True
task.encoder.conv_lstm_tpl.params_init.method : 'truncated_gaussian'
task.encoder.conv_lstm_tpl.params_init.scale : 0.1
task.encoder.conv_lstm_tpl.params_init.seed : NoneType
task.encoder.conv_lstm_tpl.qdomain.default : NoneType
task.encoder.conv_lstm_tpl.random_seed : NoneType
task.encoder.conv_lstm_tpl.reset_cell_state : False
task.encoder.conv_lstm_tpl.skip_lp_regularization : NoneType
task.encoder.conv_lstm_tpl.vn.global_vn : False
task.encoder.conv_lstm_tpl.vn.per_step_vn : False
task.encoder.conv_lstm_tpl.vn.scale : NoneType
task.encoder.conv_lstm_tpl.vn.seed : NoneType
task.encoder.conv_lstm_tpl.zero_state_init_params.method : 'zeros'
task.encoder.conv_lstm_tpl.zero_state_init_params.seed : NoneType
task.encoder.conv_lstm_tpl.zo_prob : 0.0
task.encoder.dtype : float32
task.encoder.extra_per_layer_outputs : False
task.encoder.fprop_dtype : NoneType
task.encoder.highway_skip : False
task.encoder.highway_skip_tpl.allow_implicit_capture : NoneType
task.encoder.highway_skip_tpl.batch_norm : False
task.encoder.highway_skip_tpl.carry_bias_init : 1.0
task.encoder.highway_skip_tpl.cls : type/lingvo.core.layers/HighwaySkipLayer
task.encoder.highway_skip_tpl.couple_carry_transform_gates : False
task.encoder.highway_skip_tpl.dtype : float32
task.encoder.highway_skip_tpl.fprop_dtype : NoneType
task.encoder.highway_skip_tpl.inference_driver_name : NoneType
task.encoder.highway_skip_tpl.input_dim : 0
task.encoder.highway_skip_tpl.is_eval : NoneType
task.encoder.highway_skip_tpl.is_inference : NoneType
task.encoder.highway_skip_tpl.name : ''
task.encoder.highway_skip_tpl.params_init.method : 'xavier'
task.encoder.highway_skip_tpl.params_init.scale : 1.000001
task.encoder.highway_skip_tpl.params_init.seed : NoneType
task.encoder.highway_skip_tpl.random_seed : NoneType
task.encoder.highway_skip_tpl.skip_lp_regularization : NoneType
task.encoder.highway_skip_tpl.vn.global_vn : False
task.encoder.highway_skip_tpl.vn.per_step_vn : False
task.encoder.highway_skip_tpl.vn.scale : NoneType
task.encoder.highway_skip_tpl.vn.seed : NoneType
task.encoder.inference_driver_name : NoneType
task.encoder.input_shape : ['NoneType', 'NoneType', 80, 1]
task.encoder.is_eval : NoneType
task.encoder.is_inference : NoneType
task.encoder.lstm_cell_size : 1024
task.encoder.lstm_tpl.allow_implicit_capture : NoneType
task.encoder.lstm_tpl.apply_pruning : False
task.encoder.lstm_tpl.bias_init.method : 'constant'
task.encoder.lstm_tpl.bias_init.scale : 0.0
task.encoder.lstm_tpl.bias_init.seed : 0
task.encoder.lstm_tpl.cell_value_cap : 10.0
task.encoder.lstm_tpl.cls : type/lingvo.core.rnn_cell/LSTMCellSimple
task.encoder.lstm_tpl.couple_input_forget_gates : False
task.encoder.lstm_tpl.dtype : float32
task.encoder.lstm_tpl.enable_lstm_bias : True
task.encoder.lstm_tpl.forget_gate_bias : 0.0
task.encoder.lstm_tpl.fprop_dtype : NoneType
task.encoder.lstm_tpl.inference_driver_name : NoneType
task.encoder.lstm_tpl.inputs_arity : 1
task.encoder.lstm_tpl.is_eval : NoneType
task.encoder.lstm_tpl.is_inference : NoneType
task.encoder.lstm_tpl.name : ''
task.encoder.lstm_tpl.num_hidden_nodes : 0
task.encoder.lstm_tpl.num_input_nodes : 0
task.encoder.lstm_tpl.num_output_nodes : 0
task.encoder.lstm_tpl.output_nonlinearity : True
task.encoder.lstm_tpl.params_init.method : 'uniform'
task.encoder.lstm_tpl.params_init.scale : 0.1
task.encoder.lstm_tpl.params_init.seed : NoneType
task.encoder.lstm_tpl.qdomain.c_state : NoneType
task.encoder.lstm_tpl.qdomain.default : NoneType
task.encoder.lstm_tpl.qdomain.fullyconnected : NoneType
task.encoder.lstm_tpl.qdomain.m_state : NoneType
task.encoder.lstm_tpl.qdomain.weight : NoneType
task.encoder.lstm_tpl.random_seed : NoneType
task.encoder.lstm_tpl.reset_cell_state : False
task.encoder.lstm_tpl.skip_lp_regularization : NoneType
task.encoder.lstm_tpl.vn.global_vn : False
task.encoder.lstm_tpl.vn.per_step_vn : False
task.encoder.lstm_tpl.vn.scale : NoneType
task.encoder.lstm_tpl.vn.seed : NoneType
task.encoder.lstm_tpl.zero_state_init_params.method : 'zeros'
task.encoder.lstm_tpl.zero_state_init_params.seed : NoneType
task.encoder.lstm_tpl.zo_prob : 0.0
task.encoder.name : ''
task.encoder.num_cnn_layers : 2
task.encoder.num_conv_lstm_layers : 0
task.encoder.num_lstm_layers : 4
task.encoder.packed_input : False
task.encoder.pad_steps : 6
task.encoder.params_init.method : 'xavier'
task.encoder.params_init.scale : 1.000001
task.encoder.params_init.seed : NoneType
task.encoder.proj_tpl.activation : 'RELU'
task.encoder.proj_tpl.affine_last : False
task.encoder.proj_tpl.allow_implicit_capture : NoneType
task.encoder.proj_tpl.batch_norm : True
task.encoder.proj_tpl.bias_init : 0.0
task.encoder.proj_tpl.bn_fold_weights : NoneType
task.encoder.proj_tpl.cls : type/lingvo.core.layers/ProjectionLayer
task.encoder.proj_tpl.dtype : float32
task.encoder.proj_tpl.fprop_dtype : NoneType
task.encoder.proj_tpl.has_bias : False
task.encoder.proj_tpl.inference_driver_name : NoneType
task.encoder.proj_tpl.input_dim : 0
task.encoder.proj_tpl.is_eval : NoneType
task.encoder.proj_tpl.is_inference : NoneType
task.encoder.proj_tpl.name : ''
task.encoder.proj_tpl.output_dim : 0
task.encoder.proj_tpl.params_init.method : 'truncated_gaussian'
task.encoder.proj_tpl.params_init.scale : 0.1
task.encoder.proj_tpl.params_init.seed : NoneType
task.encoder.proj_tpl.qdomain.default : NoneType
task.encoder.proj_tpl.random_seed : NoneType
task.encoder.proj_tpl.skip_lp_regularization : NoneType
task.encoder.proj_tpl.vn.global_vn : False
task.encoder.proj_tpl.vn.per_step_vn : False
task.encoder.proj_tpl.vn.scale : NoneType
task.encoder.proj_tpl.vn.seed : NoneType
task.encoder.proj_tpl.weight_norm : False
task.encoder.project_lstm_output : True
task.encoder.random_seed : NoneType
task.encoder.residual_start : 0
task.encoder.residual_stride : 1
task.encoder.skip_lp_regularization : NoneType
task.encoder.vn.global_vn : False
task.encoder.vn.per_step_vn : False
task.encoder.vn.scale : NoneType
task.encoder.vn.seed : NoneType
task.eval.decoder_samples_per_summary : 0
task.eval.samples_per_summary : 5000
task.fprop_dtype : NoneType
task.frontend : NoneType
task.inference_driver_name : NoneType
task.input : NoneType
task.is_eval : NoneType
task.is_inference : NoneType
task.name : 'librispeech'
task.online_encoder : NoneType
task.params_init.method : 'xavier'
task.params_init.scale : 1.000001
task.params_init.seed : NoneType
task.random_seed : NoneType
task.skip_lp_regularization : NoneType
task.target_key : ''
task.train.bprop_variable_filter : NoneType
task.train.clip_gradient_norm_to_value : 1.0
task.train.clip_gradient_single_norm_to_value : 0.0
task.train.colocate_gradients_with_ops : True
task.train.early_stop.metric_history.jobname : 'eval_dev'
task.train.early_stop.metric_history.local_filesystem : False
task.train.early_stop.metric_history.logdir : ''
task.train.early_stop.metric_history.metric : 'log_pplx'
task.train.early_stop.metric_history.minimize : True
task.train.early_stop.metric_history.name : 'MetricHistory'
task.train.early_stop.metric_history.tfevent_file : False
task.train.early_stop.name : 'EarlyStop'
task.train.early_stop.tolerance : 0.0
task.train.early_stop.verbose : True
task.train.early_stop.window : 0
task.train.ema_decay : 0.0
task.train.gate_gradients : False
task.train.grad_aggregation_method : 1
task.train.grad_norm_to_clip_to_zero : 100.0
task.train.grad_norm_tracker : NoneType
task.train.init_from_checkpoint_rules : {}
task.train.l1_regularizer_weight : NoneType
task.train.l2_regularizer_weight : 1e-06
task.train.learning_rate : 0.00025
task.train.lr_schedule.allow_implicit_capture : NoneType
task.train.lr_schedule.cls : type/lingvo.core.lr_schedule/ContinuousLearningRateSchedule
task.train.lr_schedule.dtype : float32
task.train.lr_schedule.fprop_dtype : NoneType
task.train.lr_schedule.half_life_steps : 100000
task.train.lr_schedule.inference_driver_name : NoneType
task.train.lr_schedule.initial_value : 1.0
task.train.lr_schedule.is_eval : NoneType
task.train.lr_schedule.is_inference : NoneType
task.train.lr_schedule.min : 0.01
task.train.lr_schedule.name : 'LRSched'
task.train.lr_schedule.params_init.method : 'xavier'
task.train.lr_schedule.params_init.scale : 1.000001
task.train.lr_schedule.params_init.seed : NoneType
task.train.lr_schedule.random_seed : NoneType
task.train.lr_schedule.skip_lp_regularization : NoneType
task.train.lr_schedule.start_step : 50000
task.train.lr_schedule.vn.global_vn : False
task.train.lr_schedule.vn.per_step_vn : False
task.train.lr_schedule.vn.scale : NoneType
task.train.lr_schedule.vn.seed : NoneType
task.train.max_steps : 4000000
task.train.optimizer.allow_implicit_capture : NoneType
task.train.optimizer.beta1 : 0.9
task.train.optimizer.beta2 : 0.999
task.train.optimizer.cls : type/lingvo.core.optimizer/Adam
task.train.optimizer.dtype : float32
task.train.optimizer.epsilon : 1e-06
task.train.optimizer.fprop_dtype : NoneType
task.train.optimizer.inference_driver_name : NoneType
task.train.optimizer.is_eval : NoneType
task.train.optimizer.is_inference : NoneType
task.train.optimizer.name : 'Adam'
task.train.optimizer.params_init.method : 'xavier'
task.train.optimizer.params_init.scale : 1.000001
task.train.optimizer.params_init.seed : NoneType
task.train.optimizer.random_seed : NoneType
task.train.optimizer.skip_lp_regularization : NoneType
task.train.optimizer.vn.global_vn : False
task.train.optimizer.vn.per_step_vn : False
task.train.optimizer.vn.scale : NoneType
task.train.optimizer.vn.seed : NoneType
task.train.pruning_hparams_dict : NoneType
task.train.save_interval_seconds : 600
task.train.start_up_delay_steps : 200
task.train.summary_interval_steps : 100
task.train.tpu_steps_per_loop : 20
task.train.vn_start_step : 20000
task.train.vn_std : 0.075
task.vn.global_vn : True
task.vn.per_step_vn : False
task.vn.scale : NoneType
task.vn.seed : NoneType
train.early_stop.metric_history.jobname : 'eval_dev'
train.early_stop.metric_history.local_filesystem : False
train.early_stop.metric_history.logdir : ''
train.early_stop.metric_history.metric : 'log_pplx'
train.early_stop.metric_history.minimize
: True I0420 07:30:26.283200 140395597309760 base_runner.py:69] train.early_stop.metric_history.name : 'MetricHistory' I0420 07:30:26.283282 140395597309760 base_runner.py:69] train.early_stop.metric_history.tfevent_file : False I0420 07:30:26.283366 140395597309760 base_runner.py:69] train.early_stop.name : 'EarlyStop' I0420 07:30:26.283451 140395597309760 base_runner.py:69] train.early_stop.tolerance : 0.0 I0420 07:30:26.283533 140395597309760 base_runner.py:69] train.early_stop.verbose : True I0420 07:30:26.283624 140395597309760 base_runner.py:69] train.early_stop.window : 0 I0420 07:30:26.283704 140395597309760 base_runner.py:69] train.ema_decay : 0.0 I0420 07:30:26.283792 140395597309760 base_runner.py:69] train.init_from_checkpoint_rules : {} I0420 07:30:26.283878 140395597309760 base_runner.py:69] train.max_steps : 4000000 I0420 07:30:26.283962 140395597309760 base_runner.py:69] train.save_interval_seconds : 600 I0420 07:30:26.284043 140395597309760 base_runner.py:69] train.start_up_delay_steps : 200 I0420 07:30:26.284126 140395597309760 base_runner.py:69] train.summary_interval_steps : 100 I0420 07:30:26.284204 140395597309760 base_runner.py:69] train.tpu_steps_per_loop : 20 I0420 07:30:26.284286 140395597309760 base_runner.py:69] vn.global_vn : True I0420 07:30:26.284372 140395597309760 base_runner.py:69] vn.per_step_vn : False I0420 07:30:26.284456 140395597309760 base_runner.py:69] vn.scale : NoneType I0420 07:30:26.284537 140395597309760 base_runner.py:69] vn.seed : NoneType I0420 07:30:26.284621 140395597309760 base_runner.py:69] I0420 07:30:26.284714 140395597309760 base_runner.py:70] ============================================================ I0420 07:30:26.286209 140395597309760 base_runner.py:115] Starting ... 
I0420 07:30:26.286483 140395597309760 cluster.py:429] _LeastLoadedPlacer : ['/job:local/replica:0/task:0/device:CPU:0'] I0420 07:30:26.294176 140395597309760 cluster.py:447] Place variable global_step on /job:local/replica:0/task:0/device:CPU:0 8 I0420 07:30:26.304714 140395597309760 base_model.py:1116] Training parameters for : { early_stop: { metric_history: { "eval_dev" local_filesystem: False "/data/dingzhenyou/speech_data/librispeech/log/" "log_pplx" minimize: True "MetricHistory" tfevent_file: False } "EarlyStop" tolerance: 0.0 verbose: True window: 0 } ema_decay: 0.0 init_from_checkpoint_rules: {} max_steps: 4000000 save_interval_seconds: 600 start_up_delay_steps: 200 summary_interval_steps: 100 tpu_steps_per_loop: 20 } I0420 07:30:26.318938 140395597309760 base_input_generator.py:510] bucket_batch_limit [256, 128, 128, 128, 128, 128, 128, 128] I0420 07:30:26.584214 140395597309760 cluster.py:447] Place variable librispeech/enc/conv_L0/w/var on /job:local/replica:0/task:0/device:CPU:0 1160 I0420 07:30:26.586292 140395597309760 py_utils.py:1220] Creating var librispeech/enc/conv_L0/w/var:0 shape=(3, 3, 1, 32) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.591540 140395597309760 cluster.py:447] Place variable librispeech/enc/conv_L0/beta/var on /job:local/replica:0/task:0/device:CPU:0 1288 I0420 07:30:26.593260 140395597309760 py_utils.py:1220] Creating var librispeech/enc/conv_L0/beta/var:0 shape=(32,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.595974 140395597309760 cluster.py:447] Place variable librispeech/enc/conv_L0/gamma/var on /job:local/replica:0/task:0/device:CPU:0 1416 I0420 07:30:26.597687 140395597309760 py_utils.py:1220] Creating var librispeech/enc/conv_L0/gamma/var:0 shape=(32,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.601490 140395597309760 cluster.py:447] Place variable librispeech/enc/conv_L0/moving_mean/var on /job:local/replica:0/task:0/device:CPU:0 1544 I0420 
07:30:26.603195 140395597309760 py_utils.py:1220] Creating var librispeech/enc/conv_L0/moving_mean/var:0 shape=(32,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.605947 140395597309760 cluster.py:447] Place variable librispeech/enc/conv_L0/moving_variance/var on /job:local/replica:0/task:0/device:CPU:0 1672 I0420 07:30:26.607652 140395597309760 py_utils.py:1220] Creating var librispeech/enc/conv_L0/moving_variance/var:0 shape=(32,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.616122 140395597309760 cluster.py:447] Place variable librispeech/enc/conv_L1/w/var on /job:local/replica:0/task:0/device:CPU:0 38536 I0420 07:30:26.618041 140395597309760 py_utils.py:1220] Creating var librispeech/enc/conv_L1/w/var:0 shape=(3, 3, 32, 32) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.623229 140395597309760 cluster.py:447] Place variable librispeech/enc/conv_L1/beta/var on /job:local/replica:0/task:0/device:CPU:0 38664 I0420 07:30:26.625087 140395597309760 py_utils.py:1220] Creating var librispeech/enc/conv_L1/beta/var:0 shape=(32,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.627810 140395597309760 cluster.py:447] Place variable librispeech/enc/conv_L1/gamma/var on /job:local/replica:0/task:0/device:CPU:0 38792 I0420 07:30:26.629525 140395597309760 py_utils.py:1220] Creating var librispeech/enc/conv_L1/gamma/var:0 shape=(32,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.633286 140395597309760 cluster.py:447] Place variable librispeech/enc/conv_L1/moving_mean/var on /job:local/replica:0/task:0/device:CPU:0 38920 I0420 07:30:26.635004 140395597309760 py_utils.py:1220] Creating var librispeech/enc/conv_L1/moving_mean/var:0 shape=(32,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.637748 140395597309760 cluster.py:447] Place variable librispeech/enc/conv_L1/moving_variance/var on /job:local/replica:0/task:0/device:CPU:0 39048 I0420 07:30:26.639466 
140395597309760 py_utils.py:1220] Creating var librispeech/enc/conv_L1/moving_variance/var:0 shape=(32,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.660010 140395597309760 cluster.py:447] Place variable librispeech/enc/fwd_rnn_L0/wm/var on /job:local/replica:0/task:0/device:CPU:0 27302024 I0420 07:30:26.662070 140395597309760 py_utils.py:1220] Creating var librispeech/enc/fwd_rnn_L0/wm/var:0 shape=(1664, 4096) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.668963 140395597309760 cluster.py:447] Place variable librispeech/enc/fwd_rnn_L0/b/var on /job:local/replica:0/task:0/device:CPU:0 27318408 I0420 07:30:26.670681 140395597309760 py_utils.py:1220] Creating var librispeech/enc/fwd_rnn_L0/b/var:0 shape=(4096,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.688720 140395597309760 cluster.py:447] Place variable librispeech/enc/bak_rnn_L0/wm/var on /job:local/replica:0/task:0/device:CPU:0 54581384 I0420 07:30:26.690670 140395597309760 py_utils.py:1220] Creating var librispeech/enc/bak_rnn_L0/wm/var:0 shape=(1664, 4096) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.697510 140395597309760 cluster.py:447] Place variable librispeech/enc/bak_rnn_L0/b/var on /job:local/replica:0/task:0/device:CPU:0 54597768 I0420 07:30:26.699352 140395597309760 py_utils.py:1220] Creating var librispeech/enc/bak_rnn_L0/b/var:0 shape=(4096,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.721532 140395597309760 cluster.py:447] Place variable librispeech/enc/fwd_rnn_L1/wm/var on /job:local/replica:0/task:0/device:CPU:0 104929416 I0420 07:30:26.723484 140395597309760 py_utils.py:1220] Creating var librispeech/enc/fwd_rnn_L1/wm/var:0 shape=(3072, 4096) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.730324 140395597309760 cluster.py:447] Place variable librispeech/enc/fwd_rnn_L1/b/var on /job:local/replica:0/task:0/device:CPU:0 104945800 I0420 07:30:26.732203 140395597309760 
py_utils.py:1220] Creating var librispeech/enc/fwd_rnn_L1/b/var:0 shape=(4096,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.750165 140395597309760 cluster.py:447] Place variable librispeech/enc/bak_rnn_L1/wm/var on /job:local/replica:0/task:0/device:CPU:0 155277448 I0420 07:30:26.752114 140395597309760 py_utils.py:1220] Creating var librispeech/enc/bak_rnn_L1/wm/var:0 shape=(3072, 4096) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.759006 140395597309760 cluster.py:447] Place variable librispeech/enc/bak_rnn_L1/b/var on /job:local/replica:0/task:0/device:CPU:0 155293832 I0420 07:30:26.760740 140395597309760 py_utils.py:1220] Creating var librispeech/enc/bak_rnn_L1/b/var:0 shape=(4096,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.782478 140395597309760 cluster.py:447] Place variable librispeech/enc/fwd_rnn_L2/wm/var on /job:local/replica:0/task:0/device:CPU:0 205625480 I0420 07:30:26.784432 140395597309760 py_utils.py:1220] Creating var librispeech/enc/fwd_rnn_L2/wm/var:0 shape=(3072, 4096) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.791290 140395597309760 cluster.py:447] Place variable librispeech/enc/fwd_rnn_L2/b/var on /job:local/replica:0/task:0/device:CPU:0 205641864 I0420 07:30:26.793705 140395597309760 py_utils.py:1220] Creating var librispeech/enc/fwd_rnn_L2/b/var:0 shape=(4096,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.811717 140395597309760 cluster.py:447] Place variable librispeech/enc/bak_rnn_L2/wm/var on /job:local/replica:0/task:0/device:CPU:0 255973512 I0420 07:30:26.813673 140395597309760 py_utils.py:1220] Creating var librispeech/enc/bak_rnn_L2/wm/var:0 shape=(3072, 4096) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.820574 140395597309760 cluster.py:447] Place variable librispeech/enc/bak_rnn_L2/b/var on /job:local/replica:0/task:0/device:CPU:0 255989896 I0420 07:30:26.822328 140395597309760 py_utils.py:1220] Creating 
var librispeech/enc/bak_rnn_L2/b/var:0 shape=(4096,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.843966 140395597309760 cluster.py:447] Place variable librispeech/enc/fwd_rnn_L3/wm/var on /job:local/replica:0/task:0/device:CPU:0 306321544 I0420 07:30:26.845961 140395597309760 py_utils.py:1220] Creating var librispeech/enc/fwd_rnn_L3/wm/var:0 shape=(3072, 4096) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.852847 140395597309760 cluster.py:447] Place variable librispeech/enc/fwd_rnn_L3/b/var on /job:local/replica:0/task:0/device:CPU:0 306337928 I0420 07:30:26.854681 140395597309760 py_utils.py:1220] Creating var librispeech/enc/fwd_rnn_L3/b/var:0 shape=(4096,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.873220 140395597309760 cluster.py:447] Place variable librispeech/enc/bak_rnn_L3/wm/var on /job:local/replica:0/task:0/device:CPU:0 356669576 I0420 07:30:26.875176 140395597309760 py_utils.py:1220] Creating var librispeech/enc/bak_rnn_L3/wm/var:0 shape=(3072, 4096) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.882061 140395597309760 cluster.py:447] Place variable librispeech/enc/bak_rnn_L3/b/var on /job:local/replica:0/task:0/device:CPU:0 356685960 I0420 07:30:26.883821 140395597309760 py_utils.py:1220] Creating var librispeech/enc/bak_rnn_L3/b/var:0 shape=(4096,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.898005 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L0/w/var on /job:local/replica:0/task:0/device:CPU:0 373463176 I0420 07:30:26.899913 140395597309760 py_utils.py:1220] Creating var librispeech/enc/proj_L0/w/var:0 shape=(2048, 2048) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.905128 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L0/beta/var on /job:local/replica:0/task:0/device:CPU:0 373471368 I0420 07:30:26.906843 140395597309760 py_utils.py:1220] Creating var 
librispeech/enc/proj_L0/beta/var:0 shape=(2048,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.909568 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L0/gamma/var on /job:local/replica:0/task:0/device:CPU:0 373479560 I0420 07:30:26.911293 140395597309760 py_utils.py:1220] Creating var librispeech/enc/proj_L0/gamma/var:0 shape=(2048,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.915091 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L0/moving_mean/var on /job:local/replica:0/task:0/device:CPU:0 373487752 I0420 07:30:26.916820 140395597309760 py_utils.py:1220] Creating var librispeech/enc/proj_L0/moving_mean/var:0 shape=(2048,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.919569 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L0/moving_variance/var on /job:local/replica:0/task:0/device:CPU:0 373495944 I0420 07:30:26.921300 140395597309760 py_utils.py:1220] Creating var librispeech/enc/proj_L0/moving_variance/var:0 shape=(2048,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.929511 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L1/w/var on /job:local/replica:0/task:0/device:CPU:0 390273160 I0420 07:30:26.931543 140395597309760 py_utils.py:1220] Creating var librispeech/enc/proj_L1/w/var:0 shape=(2048, 2048) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.936654 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L1/beta/var on /job:local/replica:0/task:0/device:CPU:0 390281352 I0420 07:30:26.938420 140395597309760 py_utils.py:1220] Creating var librispeech/enc/proj_L1/beta/var:0 shape=(2048,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.941274 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L1/gamma/var on /job:local/replica:0/task:0/device:CPU:0 390289544 I0420 07:30:26.942984 140395597309760 py_utils.py:1220] Creating var 
librispeech/enc/proj_L1/gamma/var:0 shape=(2048,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.946660 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L1/moving_mean/var on /job:local/replica:0/task:0/device:CPU:0 390297736 I0420 07:30:26.948585 140395597309760 py_utils.py:1220] Creating var librispeech/enc/proj_L1/moving_mean/var:0 shape=(2048,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.951374 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L1/moving_variance/var on /job:local/replica:0/task:0/device:CPU:0 390305928 I0420 07:30:26.953089 140395597309760 py_utils.py:1220] Creating var librispeech/enc/proj_L1/moving_variance/var:0 shape=(2048,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.961455 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L2/w/var on /job:local/replica:0/task:0/device:CPU:0 407083144 I0420 07:30:26.963459 140395597309760 py_utils.py:1220] Creating var librispeech/enc/proj_L2/w/var:0 shape=(2048, 2048) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.969258 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L2/beta/var on /job:local/replica:0/task:0/device:CPU:0 407091336 I0420 07:30:26.970977 140395597309760 py_utils.py:1220] Creating var librispeech/enc/proj_L2/beta/var:0 shape=(2048,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.973731 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L2/gamma/var on /job:local/replica:0/task:0/device:CPU:0 407099528 I0420 07:30:26.975578 140395597309760 py_utils.py:1220] Creating var librispeech/enc/proj_L2/gamma/var:0 shape=(2048,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.979238 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L2/moving_mean/var on /job:local/replica:0/task:0/device:CPU:0 407107720 I0420 07:30:26.980963 140395597309760 py_utils.py:1220] Creating var 
librispeech/enc/proj_L2/moving_mean/var:0 shape=(2048,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.983851 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L2/moving_variance/var on /job:local/replica:0/task:0/device:CPU:0 407115912 I0420 07:30:26.985569 140395597309760 py_utils.py:1220] Creating var librispeech/enc/proj_L2/moving_variance/var:0 shape=(2048,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:27.012496 140395597309760 cluster.py:447] Place variable librispeech/dec/emb/var_0/var on /job:local/replica:0/task:0/device:CPU:0 407139016 I0420 07:30:27.014452 140395597309760 py_utils.py:1220] Creating var librispeech/dec/emb/var_0/var:0 shape=(76, 76) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:27.023178 140395597309760 cluster.py:447] Place variable librispeech/dec/softmax/weight_0/var on /job:local/replica:0/task:0/device:CPU:0 408072904 I0420 07:30:27.025145 140395597309760 py_utils.py:1220] Creating var librispeech/dec/softmax/weight_0/var:0 shape=(3072, 76) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:27.032419 140395597309760 cluster.py:447] Place variable librispeech/dec/softmax/bias_0/var on /job:local/replica:0/task:0/device:CPU:0 408073208 I0420 07:30:27.034135 140395597309760 py_utils.py:1220] Creating var librispeech/dec/softmax/bias_0/var:0 shape=(76,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:27.052334 140395597309760 cluster.py:447] Place variable librispeech/dec/rnn_cell/wm/var on /job:local/replica:0/task:0/device:CPU:0 459650040 I0420 07:30:27.054295 140395597309760 py_utils.py:1220] Creating var librispeech/dec/rnn_cell/wm/var:0 shape=(3148, 4096) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:27.061111 140395597309760 cluster.py:447] Place variable librispeech/dec/rnn_cell/b/var on /job:local/replica:0/task:0/device:CPU:0 459666424 I0420 07:30:27.062843 140395597309760 py_utils.py:1220] Creating var 
librispeech/dec/rnn_cell/b/var:0 shape=(4096,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:27.077316 140395597309760 cluster.py:447] Place variable librispeech/dec/rnn_cell_1/wm/var on /job:local/replica:0/task:0/device:CPU:0 526775288 I0420 07:30:27.079365 140395597309760 py_utils.py:1220] Creating var librispeech/dec/rnn_cell_1/wm/var:0 shape=(4096, 4096) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:27.086210 140395597309760 cluster.py:447] Place variable librispeech/dec/rnn_cell_1/b/var on /job:local/replica:0/task:0/device:CPU:0 526791672 I0420 07:30:27.087944 140395597309760 py_utils.py:1220] Creating var librispeech/dec/rnn_cell_1/b/var:0 shape=(4096,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:27.102298 140395597309760 cluster.py:447] Place variable librispeech/dec/atten/source_var/var on /job:local/replica:0/task:0/device:CPU:0 527840248 I0420 07:30:27.104248 140395597309760 py_utils.py:1220] Creating var librispeech/dec/atten/source_var/var:0 shape=(2048, 128) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:27.114279 140395597309760 cluster.py:447] Place variable librispeech/dec/atten/query_var/var on /job:local/replica:0/task:0/device:CPU:0 528364536 I0420 07:30:27.116240 140395597309760 py_utils.py:1220] Creating var librispeech/dec/atten/query_var/var:0 shape=(1024, 128) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:27.127183 140395597309760 cluster.py:447] Place variable librispeech/dec/atten/hidden_var/var on /job:local/replica:0/task:0/device:CPU:0 528365048 I0420 07:30:27.129138 140395597309760 py_utils.py:1220] Creating var librispeech/dec/atten/hidden_var/var:0 shape=(128,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:27.232089 140395597309760 py_utils.py:1277] === worker 0 === I0420 07:30:27.233619 140395597309760 py_utils.py:1267] worker 0: decoder.atten.global_step /job:local/replica:0/task:0/device:GPU:0 -> 
/job:local/replica:0/task:0/device:GPU:0 I0420 07:30:27.233711 140395597309760 py_utils.py:1267] worker 0: decoder.atten.hidden_var /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:1 I0420 07:30:27.233795 140395597309760 py_utils.py:1267] worker 0: decoder.atten.query_var /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:2 I0420 07:30:27.233886 140395597309760 py_utils.py:1267] worker 0: decoder.atten.source_var /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:3 I0420 07:30:27.233968 140395597309760 py_utils.py:1267] worker 0: decoder.beam_search.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:0 I0420 07:30:27.234051 140395597309760 py_utils.py:1267] worker 0: decoder.contextualizer.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:1 I0420 07:30:27.234136 140395597309760 py_utils.py:1267] worker 0: decoder.emb.wm[0] /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:27.234220 140395597309760 py_utils.py:1267] worker 0: decoder.fusion.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:2 I0420 07:30:27.234311 140395597309760 py_utils.py:1267] worker 0: decoder.fusion.lm.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:3 I0420 07:30:27.234388 140395597309760 py_utils.py:1267] worker 0: decoder.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:0 I0420 07:30:27.234468 140395597309760 py_utils.py:1267] worker 0: decoder.rnn_cell[0].b /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:1 I0420 07:30:27.234544 140395597309760 py_utils.py:1267] worker 0: decoder.rnn_cell[0].global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:2 I0420 
07:30:27.234625 140395597309760 py_utils.py:1267] worker 0: decoder.rnn_cell[0].wm /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:3 I0420 07:30:27.234702 140395597309760 py_utils.py:1267] worker 0: decoder.rnn_cell[1].b /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:0 I0420 07:30:27.234791 140395597309760 py_utils.py:1267] worker 0: decoder.rnn_cell[1].global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:1 I0420 07:30:27.234868 140395597309760 py_utils.py:1267] worker 0: decoder.rnn_cell[1].wm /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:2 I0420 07:30:27.234951 140395597309760 py_utils.py:1267] worker 0: decoder.softmax.bias_0 /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:3 I0420 07:30:27.235027 140395597309760 py_utils.py:1267] worker 0: decoder.softmax.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:0 I0420 07:30:27.235106 140395597309760 py_utils.py:1267] worker 0: decoder.softmax.weight_0 /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:1 I0420 07:30:27.235183 140395597309760 py_utils.py:1267] worker 0: decoder.target_sequence_sampler.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:2 I0420 07:30:27.235264 140395597309760 py_utils.py:1267] worker 0: encoder.conv[0].bn.beta /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:GPU:3 I0420 07:30:27.235340 140395597309760 py_utils.py:1267] worker 0: encoder.conv[0].bn.gamma /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:0 I0420 07:30:27.235421 140395597309760 py_utils.py:1267] worker 0: encoder.conv[0].bn.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:1 I0420 07:30:27.235497 140395597309760 
py_utils.py:1267] worker 0: encoder.conv[0].global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:2 I0420 07:30:27.235578 140395597309760 py_utils.py:1267] worker 0: encoder.conv[0].w /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:GPU:3 I0420 07:30:27.235655 140395597309760 py_utils.py:1267] worker 0: encoder.conv[1].bn.beta /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:GPU:0 I0420 07:30:27.235740 140395597309760 py_utils.py:1267] worker 0: encoder.conv[1].bn.gamma /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:1 I0420 07:30:27.235817 140395597309760 py_utils.py:1267] worker 0: encoder.conv[1].bn.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:2 I0420 07:30:27.235903 140395597309760 py_utils.py:1267] worker 0: encoder.conv[1].global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:3 I0420 07:30:27.235980 140395597309760 py_utils.py:1267] worker 0: encoder.conv[1].w /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:GPU:0 I0420 07:30:27.236061 140395597309760 py_utils.py:1267] worker 0: encoder.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:1 I0420 07:30:27.236138 140395597309760 py_utils.py:1267] worker 0: encoder.proj[0].bn.beta /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:GPU:2 I0420 07:30:27.236217 140395597309760 py_utils.py:1267] worker 0: encoder.proj[0].bn.gamma /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:3 I0420 07:30:27.236293 140395597309760 py_utils.py:1267] worker 0: encoder.proj[0].bn.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:0 I0420 07:30:27.236375 140395597309760 py_utils.py:1267] worker 0: encoder.proj[0].global_step 
/job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:1 I0420 07:30:27.236449 140395597309760 py_utils.py:1267] worker 0: encoder.proj[0].w /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:GPU:2 I0420 07:30:27.236531 140395597309760 py_utils.py:1267] worker 0: encoder.proj[1].bn.beta /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:GPU:3 I0420 07:30:27.236605 140395597309760 py_utils.py:1267] worker 0: encoder.proj[1].bn.gamma /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:0 I0420 07:30:27.236687 140395597309760 py_utils.py:1267] worker 0: encoder.proj[1].bn.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:1 I0420 07:30:27.236771 140395597309760 py_utils.py:1267] worker 0: encoder.proj[1].global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:2 I0420 07:30:27.236854 140395597309760 py_utils.py:1267] worker 0: encoder.proj[1].w /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:GPU:3 I0420 07:30:27.236928 140395597309760 py_utils.py:1267] worker 0: encoder.proj[2].bn.beta /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:GPU:0 I0420 07:30:27.237009 140395597309760 py_utils.py:1267] worker 0: encoder.proj[2].bn.gamma /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:1 I0420 07:30:27.237085 140395597309760 py_utils.py:1267] worker 0: encoder.proj[2].bn.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:2 I0420 07:30:27.237164 140395597309760 py_utils.py:1267] worker 0: encoder.proj[2].global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:3 I0420 07:30:27.237241 140395597309760 py_utils.py:1267] worker 0: encoder.proj[2].w /job:local/replica:0/task:0/device:CPU:0 -> 
/job:local/replica:0/task:0/device:GPU:0
I0420 07:30:27.237322 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[0].bak_rnn.cell.b /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:1
I0420 07:30:27.237397 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[0].bak_rnn.cell.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:2
I0420 07:30:27.237498 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[0].bak_rnn.cell.wm /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:3
I0420 07:30:27.237581 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[0].bak_rnn.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:0
I0420 07:30:27.237664 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[0].fwd_rnn.cell.b /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:1
I0420 07:30:27.237750 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[0].fwd_rnn.cell.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:2
I0420 07:30:27.237853 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[0].fwd_rnn.cell.wm /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:3
I0420 07:30:27.237926 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[0].fwd_rnn.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:0
I0420 07:30:27.238007 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[0].global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:1
I0420 07:30:27.238082 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[1].bak_rnn.cell.b /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:2
I0420 07:30:27.238162 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[1].bak_rnn.cell.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:3
I0420 07:30:27.238239 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[1].bak_rnn.cell.wm /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:0
I0420 07:30:27.238321 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[1].bak_rnn.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:1
I0420 07:30:27.238395 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[1].fwd_rnn.cell.b /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:2
I0420 07:30:27.238476 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[1].fwd_rnn.cell.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:3
I0420 07:30:27.238553 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[1].fwd_rnn.cell.wm /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:0
I0420 07:30:27.238634 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[1].fwd_rnn.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:1
I0420 07:30:27.238708 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[1].global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:2
I0420 07:30:27.238795 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[2].bak_rnn.cell.b /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:3
I0420 07:30:27.238872 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[2].bak_rnn.cell.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:0
I0420 07:30:27.238953 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[2].bak_rnn.cell.wm /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:1
I0420 07:30:27.239028 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[2].bak_rnn.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:2
I0420 07:30:27.239109 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[2].fwd_rnn.cell.b /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:3
I0420 07:30:27.239190 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[2].fwd_rnn.cell.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:0
I0420 07:30:27.239272 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[2].fwd_rnn.cell.wm /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:1
I0420 07:30:27.239350 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[2].fwd_rnn.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:2
I0420 07:30:27.239430 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[2].global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:3
I0420 07:30:27.239506 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[3].bak_rnn.cell.b /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:0
I0420 07:30:27.239588 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[3].bak_rnn.cell.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:1
I0420 07:30:27.239665 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[3].bak_rnn.cell.wm /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:2
I0420 07:30:27.239753 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[3].bak_rnn.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:3
I0420 07:30:27.239830 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[3].fwd_rnn.cell.b /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:0
I0420 07:30:27.239909 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[3].fwd_rnn.cell.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:1
I0420 07:30:27.239985 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[3].fwd_rnn.cell.wm /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:2
I0420 07:30:27.240067 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[3].fwd_rnn.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:3
I0420 07:30:27.240142 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[3].global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:0
I0420 07:30:27.240221 140395597309760 py_utils.py:1267] worker 0: global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:1
I0420 07:30:27.240297 140395597309760 py_utils.py:1267] worker 0: input._tokenizer_default.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:2
I0420 07:30:27.240376 140395597309760 py_utils.py:1267] worker 0: input.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:3
I0420 07:30:27.240453 140395597309760 py_utils.py:1267] worker 0: lr_schedule.exp.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:0
I0420 07:30:27.240534 140395597309760 py_utils.py:1267] worker 0: lr_schedule.exp.linear.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:1
I0420 07:30:27.240608 140395597309760 py_utils.py:1267] worker 0: lr_schedule.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:2
I0420 07:30:27.240689 140395597309760 py_utils.py:1267] worker 0: optimizer.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:3
I0420 07:30:27.240773 140395597309760 py_utils.py:1283] ==========
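The placement block above shows the trainer assigning successive model variables to the four visible GPUs in strict round-robin order (GPU:1, GPU:2, GPU:3, GPU:0, and so on). A minimal sketch of that placement strategy, with hypothetical names (this is not the actual `py_utils` implementation):

```python
import itertools

def make_round_robin_placer(devices):
    """Return a function that assigns each variable the next device in a cycle."""
    pool = itertools.cycle(devices)
    return lambda var_name: next(pool)

devices = ["/job:local/replica:0/task:0/device:GPU:%d" % i for i in range(4)]
place = make_round_robin_placer(devices)

# Successive variables land on GPU:0, GPU:1, GPU:2, GPU:3, then wrap around.
assignments = [(name, place(name)) for name in
               ["encoder.rnn[0].bak_rnn.cell.b",
                "encoder.rnn[0].bak_rnn.cell.global_step",
                "encoder.rnn[0].bak_rnn.cell.wm",
                "encoder.rnn[0].bak_rnn.global_step",
                "encoder.rnn[0].fwd_rnn.cell.b"]]
```

Spreading variables this way balances parameter storage and gradient traffic across the worker's GPUs rather than pinning everything to GPU:0.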
I0420 07:30:30.333229 140395597309760 decoder.py:749] Merging metric loss: (, )
I0420 07:30:30.337625 140395597309760 decoder.py:749] Merging metric fraction_of_correct_next_step_preds: (, )
I0420 07:30:30.341779 140395597309760 decoder.py:749] Merging metric log_pplx: (, )
I0420 07:30:33.729021 140395597309760 decoder.py:749] Merging metric loss: (, )
I0420 07:30:33.733247 140395597309760 decoder.py:749] Merging metric fraction_of_correct_next_step_preds: (, )
I0420 07:30:33.737484 140395597309760 decoder.py:749] Merging metric log_pplx: (, )
I0420 07:30:36.879219 140395597309760 decoder.py:749] Merging metric loss: (, )
I0420 07:30:36.883528 140395597309760 decoder.py:749] Merging metric fraction_of_correct_next_step_preds: (, )
I0420 07:30:36.887641 140395597309760 decoder.py:749] Merging metric log_pplx: (, )
I0420 07:30:40.128005 140395597309760 decoder.py:749] Merging metric loss: (, )
I0420 07:30:40.132308 140395597309760 decoder.py:749] Merging metric fraction_of_correct_next_step_preds: (, )
I0420 07:30:40.136519 140395597309760 decoder.py:749] Merging metric log_pplx: (, )
I0420 07:30:53.594866 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: decoder.atten.hidden_var:
I0420 07:30:53.595122 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: decoder.atten.query_var:
I0420 07:30:53.595273 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: decoder.atten.source_var:
I0420 07:30:53.595398 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: decoder.emb.wm_0:
I0420 07:30:53.595546 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: decoder.rnn_cell_0.b:
I0420 07:30:53.595670 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: decoder.rnn_cell_0.wm:
I0420 07:30:53.595813 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: decoder.rnn_cell_1.b:
I0420 07:30:53.595932 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: decoder.rnn_cell_1.wm:
I0420 07:30:53.596065 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: decoder.softmax.bias_0:
I0420 07:30:53.596189 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: decoder.softmax.weight_0:
I0420 07:30:53.596312 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.conv_0.bn.beta:
I0420 07:30:53.596432 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.conv_0.bn.gamma:
I0420 07:30:53.596545 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.conv_0.w:
I0420 07:30:53.596672 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.conv_1.bn.beta:
I0420 07:30:53.596795 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.conv_1.bn.gamma:
I0420 07:30:53.596910 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.conv_1.w:
I0420 07:30:53.597038 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.proj_0.bn.beta:
I0420 07:30:53.597156 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.proj_0.bn.gamma:
I0420 07:30:53.597265 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.proj_0.w:
I0420 07:30:53.597389 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.proj_1.bn.beta:
I0420 07:30:53.597506 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.proj_1.bn.gamma:
I0420 07:30:53.597620 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.proj_1.w:
I0420 07:30:53.597750 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.proj_2.bn.beta:
I0420 07:30:53.597863 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.proj_2.bn.gamma:
I0420 07:30:53.597970 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.proj_2.w:
I0420 07:30:53.598094 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_0.bak_rnn.cell.b:
I0420 07:30:53.598207 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_0.bak_rnn.cell.wm:
I0420 07:30:53.598339 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_0.fwd_rnn.cell.b:
I0420 07:30:53.598453 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_0.fwd_rnn.cell.wm:
I0420 07:30:53.598577 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_1.bak_rnn.cell.b:
I0420 07:30:53.598686 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_1.bak_rnn.cell.wm:
I0420 07:30:53.598819 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_1.fwd_rnn.cell.b:
I0420 07:30:53.598931 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_1.fwd_rnn.cell.wm:
I0420 07:30:53.599054 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_2.bak_rnn.cell.b:
I0420 07:30:53.599164 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_2.bak_rnn.cell.wm:
I0420 07:30:53.599287 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_2.fwd_rnn.cell.b:
I0420 07:30:53.599399 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_2.fwd_rnn.cell.wm:
I0420 07:30:53.599523 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_3.bak_rnn.cell.b:
I0420 07:30:53.599636 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_3.bak_rnn.cell.wm:
I0420 07:30:53.599764 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_3.fwd_rnn.cell.b:
I0420 07:30:53.599878 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_3.fwd_rnn.cell.wm:
I0420 07:30:54.965993 140395597309760 cluster.py:447] Place variable beta1_power on /job:local/replica:0/task:0/device:CPU:0 528365052
I0420 07:30:54.969089 140395597309760 cluster.py:447] Place variable beta2_power on /job:local/replica:0/task:0/device:CPU:0 528365056
I0420 07:30:55.471826 140395597309760 cluster.py:447] Place variable librispeech/total_samples/var on /job:local/replica:0/task:0/device:CPU:0 528365064
I0420 07:30:55.473663 140395597309760 py_utils.py:1220] Creating var librispeech/total_samples/var:0 shape=() on device /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:55.480046 140395597309760 cluster.py:447] Place variable total_nan_gradients/var on /job:local/replica:0/task:0/device:CPU:0 528365072
I0420 07:30:55.481794 140395597309760 py_utils.py:1220] Creating var total_nan_gradients/var:0 shape=() on device /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:55.505249 140395597309760 trainer.py:392] Trainer number of enqueue ops: 0
I0420 07:30:55.505595 140395597309760 trainer.py:401] AttributeError. Expected for single task models.
I0420 07:31:00.711345 140395597309760 trainer.py:1329] Starting runners
I0420 07:31:00.712577 140375142418176 base_runner.py:195] controller started.
I0420 07:31:00.712903 140395597309760 trainer.py:1336] Total num runner.enqueue_ops: 0
I0420 07:31:00.713491 140375134025472 base_runner.py:195] trainer started.
I0420 07:31:00.713701 140395597309760 trainer.py:1336] Total num runner.enqueue_ops: 0
I0420 07:31:00.714096 140395597309760 trainer.py:1346] Waiting for runners to finish...
2019-04-20 07:31:01.873354: W tensorflow/compiler/jit/mark_for_compilation_pass.cc:1485] (One-time warning): Not using XLA:CPU for cluster because envvar TF_XLA_FLAGS=--tf_xla_cpu_global_jit was not set. If you want XLA:CPU, either set that envvar, or use experimental_jit_scope to enable XLA:CPU. To confirm that XLA is active, pass --vmodule=xla_compilation_cache=1 (as a proper command-line flag, not via TF_XLA_FLAGS) or set the envvar XLA_FLAGS=--xla_hlo_profile.
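The XLA warning above is informational only: XLA:CPU clustering is off unless explicitly requested. If it is actually wanted, the warning itself describes the opt-in; a sketch of the environment setup before launching the trainer (flag values taken verbatim from the warning text):

```shell
# Opt in to XLA:CPU JIT compilation before starting the trainer.
export TF_XLA_FLAGS=--tf_xla_cpu_global_jit
# Optional: make XLA emit HLO profiles so its activity can be confirmed.
export XLA_FLAGS=--xla_hlo_profile
```

For GPU-only training this warning can safely be ignored.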
I0420 07:31:02.337265 140375142418176 trainer.py:302] Uninitialized var list: ['global_step', 'librispeech/enc/conv_L0/w/var', 'librispeech/enc/conv_L0/beta/var', 'librispeech/enc/conv_L0/gamma/var', 'librispeech/enc/conv_L0/moving_mean/var', 'librispeech/enc/conv_L0/moving_variance/var', 'librispeech/enc/conv_L1/w/var', 'librispeech/enc/conv_L1/beta/var', 'librispeech/enc/conv_L1/gamma/var', 'librispeech/enc/conv_L1/moving_mean/var', 'librispeech/enc/conv_L1/moving_variance/var', 'librispeech/enc/fwd_rnn_L0/wm/var', 'librispeech/enc/fwd_rnn_L0/b/var', 'librispeech/enc/bak_rnn_L0/wm/var', 'librispeech/enc/bak_rnn_L0/b/var', 'librispeech/enc/fwd_rnn_L1/wm/var', 'librispeech/enc/fwd_rnn_L1/b/var', 'librispeech/enc/bak_rnn_L1/wm/var', 'librispeech/enc/bak_rnn_L1/b/var', 'librispeech/enc/fwd_rnn_L2/wm/var', 'librispeech/enc/fwd_rnn_L2/b/var', 'librispeech/enc/bak_rnn_L2/wm/var', 'librispeech/enc/bak_rnn_L2/b/var', 'librispeech/enc/fwd_rnn_L3/wm/var', 'librispeech/enc/fwd_rnn_L3/b/var', 'librispeech/enc/bak_rnn_L3/wm/var', 'librispeech/enc/bak_rnn_L3/b/var', 'librispeech/enc/proj_L0/w/var', 'librispeech/enc/proj_L0/beta/var', 'librispeech/enc/proj_L0/gamma/var', 'librispeech/enc/proj_L0/moving_mean/var', 'librispeech/enc/proj_L0/moving_variance/var', 'librispeech/enc/proj_L1/w/var', 'librispeech/enc/proj_L1/beta/var', 'librispeech/enc/proj_L1/gamma/var', 'librispeech/enc/proj_L1/moving_mean/var', 'librispeech/enc/proj_L1/moving_variance/var', 'librispeech/enc/proj_L2/w/var', 'librispeech/enc/proj_L2/beta/var', 'librispeech/enc/proj_L2/gamma/var', 'librispeech/enc/proj_L2/moving_mean/var', 'librispeech/enc/proj_L2/moving_variance/var', 'librispeech/dec/emb/var_0/var', 'librispeech/dec/softmax/weight_0/var', 'librispeech/dec/softmax/bias_0/var', 'librispeech/dec/rnn_cell/wm/var', 'librispeech/dec/rnn_cell/b/var', 'librispeech/dec/rnn_cell_1/wm/var', 'librispeech/dec/rnn_cell_1/b/var', 'librispeech/dec/atten/source_var/var', 'librispeech/dec/atten/query_var/var', 
'librispeech/dec/atten/hidden_var/var', 'beta1_power', 'beta2_power', 'librispeech/dec/atten/hidden_var/var/Adam', 'librispeech/dec/atten/hidden_var/var/Adam_1', 'librispeech/dec/atten/query_var/var/Adam', 'librispeech/dec/atten/query_var/var/Adam_1', 'librispeech/dec/atten/source_var/var/Adam', 'librispeech/dec/atten/source_var/var/Adam_1', 'librispeech/dec/emb/var_0/var/Adam', 'librispeech/dec/emb/var_0/var/Adam_1', 'librispeech/dec/rnn_cell/b/var/Adam', 'librispeech/dec/rnn_cell/b/var/Adam_1', 'librispeech/dec/rnn_cell/wm/var/Adam', 'librispeech/dec/rnn_cell/wm/var/Adam_1', 'librispeech/dec/rnn_cell_1/b/var/Adam', 'librispeech/dec/rnn_cell_1/b/var/Adam_1', 'librispeech/dec/rnn_cell_1/wm/var/Adam', 'librispeech/dec/rnn_cell_1/wm/var/Adam_1', 'librispeech/dec/softmax/bias_0/var/Adam', 'librispeech/dec/softmax/bias_0/var/Adam_1', 'librispeech/dec/softmax/weight_0/var/Adam', 'librispeech/dec/softmax/weight_0/var/Adam_1', 'librispeech/enc/conv_L0/beta/var/Adam', 'librispeech/enc/conv_L0/beta/var/Adam_1', 'librispeech/enc/conv_L0/gamma/var/Adam', 'librispeech/enc/conv_L0/gamma/var/Adam_1', 'librispeech/enc/conv_L0/w/var/Adam', 'librispeech/enc/conv_L0/w/var/Adam_1', 'librispeech/enc/conv_L1/beta/var/Adam', 'librispeech/enc/conv_L1/beta/var/Adam_1', 'librispeech/enc/conv_L1/gamma/var/Adam', 'librispeech/enc/conv_L1/gamma/var/Adam_1', 'librispeech/enc/conv_L1/w/var/Adam', 'librispeech/enc/conv_L1/w/var/Adam_1', 'librispeech/enc/proj_L0/beta/var/Adam', 'librispeech/enc/proj_L0/beta/var/Adam_1', 'librispeech/enc/proj_L0/gamma/var/Adam', 'librispeech/enc/proj_L0/gamma/var/Adam_1', 'librispeech/enc/proj_L0/w/var/Adam', 'librispeech/enc/proj_L0/w/var/Adam_1', 'librispeech/enc/proj_L1/beta/var/Adam', 'librispeech/enc/proj_L1/beta/var/Adam_1', 'librispeech/enc/proj_L1/gamma/var/Adam', 'librispeech/enc/proj_L1/gamma/var/Adam_1', 'librispeech/enc/proj_L1/w/var/Adam', 'librispeech/enc/proj_L1/w/var/Adam_1', 'librispeech/enc/proj_L2/beta/var/Adam', 
'librispeech/enc/proj_L2/beta/var/Adam_1', 'librispeech/enc/proj_L2/gamma/var/Adam', 'librispeech/enc/proj_L2/gamma/var/Adam_1', 'librispeech/enc/proj_L2/w/var/Adam', 'librispeech/enc/proj_L2/w/var/Adam_1', 'librispeech/enc/bak_rnn_L0/b/var/Adam', 'librispeech/enc/bak_rnn_L0/b/var/Adam_1', 'librispeech/enc/bak_rnn_L0/wm/var/Adam', 'librispeech/enc/bak_rnn_L0/wm/var/Adam_1', 'librispeech/enc/fwd_rnn_L0/b/var/Adam', 'librispeech/enc/fwd_rnn_L0/b/var/Adam_1', 'librispeech/enc/fwd_rnn_L0/wm/var/Adam', 'librispeech/enc/fwd_rnn_L0/wm/var/Adam_1', 'librispeech/enc/bak_rnn_L1/b/var/Adam', 'librispeech/enc/bak_rnn_L1/b/var/Adam_1', 'librispeech/enc/bak_rnn_L1/wm/var/Adam', 'librispeech/enc/bak_rnn_L1/wm/var/Adam_1', 'librispeech/enc/fwd_rnn_L1/b/var/Adam', 'librispeech/enc/fwd_rnn_L1/b/var/Adam_1', 'librispeech/enc/fwd_rnn_L1/wm/var/Adam', 'librispeech/enc/fwd_rnn_L1/wm/var/Adam_1', 'librispeech/enc/bak_rnn_L2/b/var/Adam', 'librispeech/enc/bak_rnn_L2/b/var/Adam_1', 'librispeech/enc/bak_rnn_L2/wm/var/Adam', 'librispeech/enc/bak_rnn_L2/wm/var/Adam_1', 'librispeech/enc/fwd_rnn_L2/b/var/Adam', 'librispeech/enc/fwd_rnn_L2/b/var/Adam_1', 'librispeech/enc/fwd_rnn_L2/wm/var/Adam', 'librispeech/enc/fwd_rnn_L2/wm/var/Adam_1', 'librispeech/enc/bak_rnn_L3/b/var/Adam', 'librispeech/enc/bak_rnn_L3/b/var/Adam_1', 'librispeech/enc/bak_rnn_L3/wm/var/Adam', 'librispeech/enc/bak_rnn_L3/wm/var/Adam_1', 'librispeech/enc/fwd_rnn_L3/b/var/Adam', 'librispeech/enc/fwd_rnn_L3/b/var/Adam_1', 'librispeech/enc/fwd_rnn_L3/wm/var/Adam', 'librispeech/enc/fwd_rnn_L3/wm/var/Adam_1', 'librispeech/total_samples/var', 'total_nan_gradients/var']
I0420 07:31:02.337812 140375142418176 trainer.py:313] Initialize ALL variables: ['global_step', 'librispeech/enc/conv_L0/w/var', 'librispeech/enc/conv_L0/beta/var', 'librispeech/enc/conv_L0/gamma/var', 'librispeech/enc/conv_L0/moving_mean/var', 'librispeech/enc/conv_L0/moving_variance/var', 'librispeech/enc/conv_L1/w/var', 'librispeech/enc/conv_L1/beta/var',
'librispeech/enc/conv_L1/gamma/var', 'librispeech/enc/conv_L1/moving_mean/var', 'librispeech/enc/conv_L1/moving_variance/var', 'librispeech/enc/fwd_rnn_L0/wm/var', 'librispeech/enc/fwd_rnn_L0/b/var', 'librispeech/enc/bak_rnn_L0/wm/var', 'librispeech/enc/bak_rnn_L0/b/var', 'librispeech/enc/fwd_rnn_L1/wm/var', 'librispeech/enc/fwd_rnn_L1/b/var', 'librispeech/enc/bak_rnn_L1/wm/var', 'librispeech/enc/bak_rnn_L1/b/var', 'librispeech/enc/fwd_rnn_L2/wm/var', 'librispeech/enc/fwd_rnn_L2/b/var', 'librispeech/enc/bak_rnn_L2/wm/var', 'librispeech/enc/bak_rnn_L2/b/var', 'librispeech/enc/fwd_rnn_L3/wm/var', 'librispeech/enc/fwd_rnn_L3/b/var', 'librispeech/enc/bak_rnn_L3/wm/var', 'librispeech/enc/bak_rnn_L3/b/var', 'librispeech/enc/proj_L0/w/var', 'librispeech/enc/proj_L0/beta/var', 'librispeech/enc/proj_L0/gamma/var', 'librispeech/enc/proj_L0/moving_mean/var', 'librispeech/enc/proj_L0/moving_variance/var', 'librispeech/enc/proj_L1/w/var', 'librispeech/enc/proj_L1/beta/var', 'librispeech/enc/proj_L1/gamma/var', 'librispeech/enc/proj_L1/moving_mean/var', 'librispeech/enc/proj_L1/moving_variance/var', 'librispeech/enc/proj_L2/w/var', 'librispeech/enc/proj_L2/beta/var', 'librispeech/enc/proj_L2/gamma/var', 'librispeech/enc/proj_L2/moving_mean/var', 'librispeech/enc/proj_L2/moving_variance/var', 'librispeech/dec/emb/var_0/var', 'librispeech/dec/softmax/weight_0/var', 'librispeech/dec/softmax/bias_0/var', 'librispeech/dec/rnn_cell/wm/var', 'librispeech/dec/rnn_cell/b/var', 'librispeech/dec/rnn_cell_1/wm/var', 'librispeech/dec/rnn_cell_1/b/var', 'librispeech/dec/atten/source_var/var', 'librispeech/dec/atten/query_var/var', 'librispeech/dec/atten/hidden_var/var', 'beta1_power', 'beta2_power', 'librispeech/dec/atten/hidden_var/var/Adam', 'librispeech/dec/atten/hidden_var/var/Adam_1', 'librispeech/dec/atten/query_var/var/Adam', 'librispeech/dec/atten/query_var/var/Adam_1', 'librispeech/dec/atten/source_var/var/Adam', 'librispeech/dec/atten/source_var/var/Adam_1', 
'librispeech/dec/emb/var_0/var/Adam', 'librispeech/dec/emb/var_0/var/Adam_1', 'librispeech/dec/rnn_cell/b/var/Adam', 'librispeech/dec/rnn_cell/b/var/Adam_1', 'librispeech/dec/rnn_cell/wm/var/Adam', 'librispeech/dec/rnn_cell/wm/var/Adam_1', 'librispeech/dec/rnn_cell_1/b/var/Adam', 'librispeech/dec/rnn_cell_1/b/var/Adam_1', 'librispeech/dec/rnn_cell_1/wm/var/Adam', 'librispeech/dec/rnn_cell_1/wm/var/Adam_1', 'librispeech/dec/softmax/bias_0/var/Adam', 'librispeech/dec/softmax/bias_0/var/Adam_1', 'librispeech/dec/softmax/weight_0/var/Adam', 'librispeech/dec/softmax/weight_0/var/Adam_1', 'librispeech/enc/conv_L0/beta/var/Adam', 'librispeech/enc/conv_L0/beta/var/Adam_1', 'librispeech/enc/conv_L0/gamma/var/Adam', 'librispeech/enc/conv_L0/gamma/var/Adam_1', 'librispeech/enc/conv_L0/w/var/Adam', 'librispeech/enc/conv_L0/w/var/Adam_1', 'librispeech/enc/conv_L1/beta/var/Adam', 'librispeech/enc/conv_L1/beta/var/Adam_1', 'librispeech/enc/conv_L1/gamma/var/Adam', 'librispeech/enc/conv_L1/gamma/var/Adam_1', 'librispeech/enc/conv_L1/w/var/Adam', 'librispeech/enc/conv_L1/w/var/Adam_1', 'librispeech/enc/proj_L0/beta/var/Adam', 'librispeech/enc/proj_L0/beta/var/Adam_1', 'librispeech/enc/proj_L0/gamma/var/Adam', 'librispeech/enc/proj_L0/gamma/var/Adam_1', 'librispeech/enc/proj_L0/w/var/Adam', 'librispeech/enc/proj_L0/w/var/Adam_1', 'librispeech/enc/proj_L1/beta/var/Adam', 'librispeech/enc/proj_L1/beta/var/Adam_1', 'librispeech/enc/proj_L1/gamma/var/Adam', 'librispeech/enc/proj_L1/gamma/var/Adam_1', 'librispeech/enc/proj_L1/w/var/Adam', 'librispeech/enc/proj_L1/w/var/Adam_1', 'librispeech/enc/proj_L2/beta/var/Adam', 'librispeech/enc/proj_L2/beta/var/Adam_1', 'librispeech/enc/proj_L2/gamma/var/Adam', 'librispeech/enc/proj_L2/gamma/var/Adam_1', 'librispeech/enc/proj_L2/w/var/Adam', 'librispeech/enc/proj_L2/w/var/Adam_1', 'librispeech/enc/bak_rnn_L0/b/var/Adam', 'librispeech/enc/bak_rnn_L0/b/var/Adam_1', 'librispeech/enc/bak_rnn_L0/wm/var/Adam', 'librispeech/enc/bak_rnn_L0/wm/var/Adam_1', 
'librispeech/enc/fwd_rnn_L0/b/var/Adam', 'librispeech/enc/fwd_rnn_L0/b/var/Adam_1', 'librispeech/enc/fwd_rnn_L0/wm/var/Adam', 'librispeech/enc/fwd_rnn_L0/wm/var/Adam_1', 'librispeech/enc/bak_rnn_L1/b/var/Adam', 'librispeech/enc/bak_rnn_L1/b/var/Adam_1', 'librispeech/enc/bak_rnn_L1/wm/var/Adam', 'librispeech/enc/bak_rnn_L1/wm/var/Adam_1', 'librispeech/enc/fwd_rnn_L1/b/var/Adam', 'librispeech/enc/fwd_rnn_L1/b/var/Adam_1', 'librispeech/enc/fwd_rnn_L1/wm/var/Adam', 'librispeech/enc/fwd_rnn_L1/wm/var/Adam_1', 'librispeech/enc/bak_rnn_L2/b/var/Adam', 'librispeech/enc/bak_rnn_L2/b/var/Adam_1', 'librispeech/enc/bak_rnn_L2/wm/var/Adam', 'librispeech/enc/bak_rnn_L2/wm/var/Adam_1', 'librispeech/enc/fwd_rnn_L2/b/var/Adam', 'librispeech/enc/fwd_rnn_L2/b/var/Adam_1', 'librispeech/enc/fwd_rnn_L2/wm/var/Adam', 'librispeech/enc/fwd_rnn_L2/wm/var/Adam_1', 'librispeech/enc/bak_rnn_L3/b/var/Adam', 'librispeech/enc/bak_rnn_L3/b/var/Adam_1', 'librispeech/enc/bak_rnn_L3/wm/var/Adam', 'librispeech/enc/bak_rnn_L3/wm/var/Adam_1', 'librispeech/enc/fwd_rnn_L3/b/var/Adam', 'librispeech/enc/fwd_rnn_L3/b/var/Adam_1', 'librispeech/enc/fwd_rnn_L3/wm/var/Adam', 'librispeech/enc/fwd_rnn_L3/wm/var/Adam_1', 'librispeech/total_samples/var', 'total_nan_gradients/var']
I0420 07:31:03.263567 140375134025472 trainer.py:455] Probably the expected race on global_step: Attempting to use uninitialized value global_step [[{{node _send_global_step_0}}]]
I0420 07:31:04.268563 140375134025472 retry.py:68] Retry: caught exception: _WaitTillInit while running FailedPreconditionError: Attempting to use uninitialized value global_step [[{{node _send_global_step_0}}]] .
Call failed at (most recent call last):
  File "/home/dingzhenyou/anaconda2/lib/python2.7/threading.py", line 774, in __bootstrap
    self.__bootstrap_inner()
  File "/home/dingzhenyou/anaconda2/lib/python2.7/threading.py", line 801, in __bootstrap_inner
    self.run()
  File "/home/dingzhenyou/anaconda2/lib/python2.7/threading.py", line 754, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/trainer.py", line 420, in Start
    self._RunLoop('trainer', self._Loop)
  File "/data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/core/retry.py", line 50, in wrapper
    return func(*args, **kwargs)
  File "/data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/base_runner.py", line 196, in _RunLoop
    loop_func(*loop_args)
Traceback for above exception (most recent call last):
  File "/data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/core/retry.py", line 50, in wrapper
    return func(*args, **kwargs)
  File "/data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/trainer.py", line 453, in _WaitTillInit
    global_step = sess.run(py_utils.GetGlobalStep())
  File "/home/dingzhenyou/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 933, in run
    run_metadata_ptr)
  File "/home/dingzhenyou/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1156, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/dingzhenyou/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1333, in _do_run
    run_metadata)
  File "/home/dingzhenyou/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1353, in _do_call
    raise type(e)(node_def, op, message)
Waiting for 1.53 seconds before retrying.
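The retry.py wrapper above treats the FailedPreconditionError as an expected race: the trainer thread polls global_step before the controller thread has finished initializing variables, so it just waits and retries (1.53 s here, 2.37 s on the next attempt). A rough sketch of that retry-with-growing-backoff pattern, with hypothetical names (not the actual lingvo/core/retry.py code):

```python
import random
import time

def retry_with_backoff(func, max_attempts=5, initial_wait=1.0, growth=1.5):
    """Call func(); on exception, sleep a jittered, growing interval and retry."""
    wait = initial_wait
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception as exc:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the real error
            delay = wait * random.uniform(0.8, 1.2)  # jitter the wait
            print("Retry: caught %r. Waiting %.2f seconds before retrying." % (exc, delay))
            time.sleep(delay)
            wait *= growth  # back off for the next attempt
```

Here the swallowed exception corresponds to the FailedPreconditionError in the log; once the controller logs "Initialize variables done.", the retried sess.run succeeds and training proceeds.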
I0420 07:31:04.272396 140375134025472 trainer.py:455] Probably the expected race on global_step: Attempting to use uninitialized value global_step [[{{node _send_global_step_0}}]]
I0420 07:31:05.105427 140375142418176 trainer.py:315] Initialize variables done.
I0420 07:31:05.805938 140375134025472 retry.py:68] Retry: caught exception: _WaitTillInit while running FailedPreconditionError: Attempting to use uninitialized value global_step [[{{node _send_global_step_0}}]] .
Call failed at (most recent call last):
  File "/home/dingzhenyou/anaconda2/lib/python2.7/threading.py", line 774, in __bootstrap
    self.__bootstrap_inner()
  File "/home/dingzhenyou/anaconda2/lib/python2.7/threading.py", line 801, in __bootstrap_inner
    self.run()
  File "/home/dingzhenyou/anaconda2/lib/python2.7/threading.py", line 754, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/trainer.py", line 420, in Start
    self._RunLoop('trainer', self._Loop)
  File "/data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/core/retry.py", line 50, in wrapper
    return func(*args, **kwargs)
  File "/data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/base_runner.py", line 196, in _RunLoop
    loop_func(*loop_args)
Traceback for above exception (most recent call last):
  File "/data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/core/retry.py", line 50, in wrapper
    return func(*args, **kwargs)
  File "/data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/trainer.py", line 453, in _WaitTillInit
    global_step = sess.run(py_utils.GetGlobalStep())
  File "/home/dingzhenyou/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 933, in run
    run_metadata_ptr)
  File "/home/dingzhenyou/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1156, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/dingzhenyou/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1333, in _do_run
    run_metadata)
  File "/home/dingzhenyou/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1353, in _do_call
    raise type(e)(node_def, op, message)
Waiting for 2.37 seconds before retrying.
I0420 07:31:05.807986 140375134025472 base_runner.py:115] step: 0
I0420 07:31:05.836642 140375142418176 trainer.py:371] Steps/second: 0.000000, Examples/second: 0.000000
I0420 07:31:05.837508 140375142418176 trainer.py:268] Save checkpoint
W0420 07:31:08.370887 140375142418176 meta_graph.py:447] Issue encountered when serializing __batch_norm_update_dict. Type is unsupported, or the types of the items don't match field type in CollectionDef. Note this is a warning and probably safe to ignore. 'dict' object has no attribute 'name'
W0420 07:31:08.371248 140375142418176 meta_graph.py:447] Issue encountered when serializing __model_split_id_stack. Type is unsupported, or the types of the items don't match field type in CollectionDef. Note this is a warning and probably safe to ignore. 'list' object has no attribute 'name'
I0420 07:31:08.566843 140375142418176 trainer.py:270] Save checkpoint done: /data/dingzhenyou/speech_data/librispeech/log/train/ckpt-00000000
I0420 07:31:12.045002 140375142418176 trainer.py:371] Steps/second: 0.000000, Examples/second: 0.000000
I0420 07:31:22.052608 140375142418176 trainer.py:371] Steps/second: 0.000000, Examples/second: 0.000000
2019-04-20 07:31:25.052466: I tensorflow/stream_executor/platform/default/dso_loader.cc:43] Successfully opened dynamic library libcublas.so.10.0
2019-04-20 07:31:25.262119: I ./lingvo/core/ops/input_common.h:68] Create RecordProcessor
2019-04-20 07:31:25.278404: I lingvo/core/ops/input_common.cc:30] Input source weights are empty, fall back to legacy behavior.
2019-04-20 07:31:25.280768: I lingvo/core/ops/record_yielder.cc:288] 0x7fa805806f60 Record yielder start
2019-04-20 07:31:25.280785: I lingvo/core/ops/record_yielder.cc:290] Randomly seed RecordYielder.
2019-04-20 07:31:25.281236: I ./lingvo/core/ops/input_common.h:73] Create batcher
2019-04-20 07:31:25.281280: I lingvo/core/ops/record_yielder.cc:341] Epoch 1 /data/dingzhenyou/speech_data/librispeech/train/train.tfrecords-*
2019-04-20 07:31:29.134968: I tensorflow/stream_executor/platform/default/dso_loader.cc:43] Successfully opened dynamic library libcudnn.so.7
I0420 07:31:32.057389 140375142418176 trainer.py:371] Steps/second: 0.000000, Examples/second: 0.000000
I0420 07:31:40.928052 140375134025472 trainer.py:511] time: 35.119649
I0420 07:31:40.929809 140375134025472 base_runner.py:115] step: 1 fraction_of_correct_next_step_preds:0.021143124 fraction_of_correct_next_step_preds/logits:0.021143124 grad_norm/all:95.786835 grad_scale_all:0.010439848 log_pplx:4.8166261 log_pplx/logits:4.8166261 loss:4.8166261 loss/logits:4.8166261 num_samples_in_batch:128 var_norm/all:704.50604
I0420 07:31:42.068264 140375142418176 trainer.py:371] Steps/second: 0.099892, Examples/second: 12.786187
I0420 07:31:42.069178 140375142418176 trainer.py:275] Write summary @1
2019-04-20 07:31:49.473411: I ./lingvo/core/ops/input_common.h:68] Create RecordProcessor
2019-04-20 07:31:49.476861: I lingvo/core/ops/input_common.cc:30] Input source weights are empty, fall back to legacy behavior.
2019-04-20 07:31:49.477990: I lingvo/core/ops/record_yielder.cc:288] 0x7fa7920a4690 Record yielder start
2019-04-20 07:31:49.478007: I lingvo/core/ops/record_yielder.cc:290] Randomly seed RecordYielder.
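The throughput figures in this stretch of the log are simple ratios: one step took roughly 10 seconds, giving ~0.1 steps/second, and with num_samples_in_batch:128 that yields ~12.8 examples/second, consistent with the logged "Examples/second: 12.786187" (0.099892 × 128). A small sketch of the arithmetic:

```python
def throughput(steps, seconds, batch_size):
    """Steps/s and examples/s over a measurement interval."""
    steps_per_sec = steps / float(seconds)
    return steps_per_sec, steps_per_sec * batch_size

# Approximate values from the log: 1 step in ~10 s, batches of 128 samples.
steps_per_sec, examples_per_sec = throughput(1, 10.0, 128)
```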
2019-04-20 07:31:49.478464: I ./lingvo/core/ops/input_common.h:73] Create batcher
2019-04-20 07:31:49.478492: I lingvo/core/ops/record_yielder.cc:341] Epoch 1 /data/dingzhenyou/speech_data/librispeech/train/train.tfrecords-*
I0420 07:31:50.867839 140375134025472 trainer.py:511] time: 9.934763
I0420 07:31:50.874564 140375134025472 trainer.py:522] step: 2 fraction_of_correct_next_step_preds:0.1764628 fraction_of_correct_next_step_preds/logits:0.1764628 grad_norm/all:21.061581 grad_scale_all:0.047479816 log_pplx:3.8171189 log_pplx/logits:3.8171189 loss:3.8171189 loss/logits:3.8171189 num_samples_in_batch:128 var_norm/all:704.50598
I0420 07:32:01.550363 140375134025472 trainer.py:511] time: 10.675241
I0420 07:32:01.553199 140375134025472 trainer.py:522] step: 3 fraction_of_correct_next_step_preds:0.18393762 fraction_of_correct_next_step_preds/logits:0.18393762 grad_norm/all:26.088888 grad_scale_all:0.038330495 log_pplx:3.4706612 log_pplx/logits:3.4706612 loss:3.4706612 loss/logits:3.4706612 num_samples_in_batch:128 var_norm/all:704.50604
I0420 07:32:13.798919 140375134025472 trainer.py:511] time: 12.244547
I0420 07:32:13.802406 140375134025472 trainer.py:522] step: 4 fraction_of_correct_next_step_preds:0.10271726 fraction_of_correct_next_step_preds/logits:0.10271726 grad_norm/all:22.408554 grad_scale_all:0.044625815 log_pplx:3.4633482 log_pplx/logits:3.4633482 loss:3.4633482 loss/logits:3.4633482 num_samples_in_batch:128 var_norm/all:704.50623
I0420 07:32:24.322017 140375134025472 trainer.py:511] time: 10.517802
I0420 07:32:24.333794 140375134025472 trainer.py:522] step: 5 fraction_of_correct_next_step_preds:0.063654847 fraction_of_correct_next_step_preds/logits:0.063654847 grad_norm/all:13.663131 grad_scale_all:0.073189668 log_pplx:3.3751349 log_pplx/logits:3.3751349 loss:3.3751349 loss/logits:3.3751349 num_samples_in_batch:128 var_norm/all:704.50647
I0420 07:32:33.058339 140375134025472 trainer.py:511] time: 8.714539
I0420 07:32:33.064846 140375134025472 trainer.py:522] step: 6 fraction_of_correct_next_step_preds:0.17381285 fraction_of_correct_next_step_preds/logits:0.17381285 grad_norm/all:8.5088272 grad_scale_all:0.11752501 log_pplx:3.1021802 log_pplx/logits:3.1021802 loss:3.1021802 loss/logits:3.1021802 num_samples_in_batch:128 var_norm/all:704.50677
2019-04-20 07:32:33.077313: I lingvo/core/ops/record_batcher.cc:344] 68 total seconds passed. Total records yielded: 1930. Total records skipped: 1
2019-04-20 07:32:33.077468: I lingvo/core/ops/record_batcher.cc:349] Out-of-range sample: 1720
I0420 07:32:43.407505 140375142418176 trainer.py:284] Write summary done: step 1
I0420 07:32:43.408030 140375142418176 base_runner.py:115] step: 1, steps/sec: 0.10, examples/sec: 12.79
I0420 07:32:43.427865 140375142418176 trainer.py:371] Steps/second: 0.084068, Examples/second: 10.760739
I0420 07:32:43.763868 140375134025472 trainer.py:511] time: 10.698063
I0420 07:32:43.765400 140375134025472 trainer.py:522] step: 7 fraction_of_correct_next_step_preds:0.18137592 fraction_of_correct_next_step_preds/logits:0.18137592 grad_norm/all:11.258716 grad_scale_all:0.088820077 log_pplx:3.1270289 log_pplx/logits:3.1270289 loss:3.1270289 loss/logits:3.1270289 num_samples_in_batch:128 var_norm/all:704.50696
I0420 07:32:51.922163 140375134025472 trainer.py:511] time: 8.156427
I0420 07:32:51.923819 140375134025472 trainer.py:522] step: 8 fraction_of_correct_next_step_preds:0.1022212 fraction_of_correct_next_step_preds/logits:0.1022212 grad_norm/all:10.946689 grad_scale_all:0.091351829 log_pplx:3.1191299 log_pplx/logits:3.1191299 loss:3.1191299 loss/logits:3.1191299 num_samples_in_batch:128 var_norm/all:704.5072
I0420 07:32:53.429886 140375142418176 trainer.py:371] Steps/second: 0.098313, Examples/second: 12.584109
I0420 07:33:01.255381 140375134025472 trainer.py:511] time: 9.331202
I0420 07:33:01.257316 140375134025472 trainer.py:522] step: 9 fraction_of_correct_next_step_preds:0.090148062 fraction_of_correct_next_step_preds/logits:0.090148062 grad_norm/all:5.2309275 grad_scale_all:0.19117069 log_pplx:3.0450623 log_pplx/logits:3.0450623 loss:3.0450623 loss/logits:3.0450623 num_samples_in_batch:128 var_norm/all:704.50745
I0420 07:33:03.441674 140375142418176 trainer.py:371] Steps/second: 0.098485, Examples/second: 12.606123
I0420 07:33:10.237653 140375134025472 trainer.py:511] time: 8.979791
I0420 07:33:10.239054 140375134025472 trainer.py:522] step: 10 fraction_of_correct_next_step_preds:0.18519549 fraction_of_correct_next_step_preds/logits:0.18519549 grad_norm/all:3.8206174 grad_scale_all:0.26173779 log_pplx:2.9197507 log_pplx/logits:2.9197507 loss:2.9197507 loss/logits:2.9197507 num_samples_in_batch:128 var_norm/all:704.50757
I0420 07:33:13.448293 140375142418176 trainer.py:371] Steps/second: 0.098628, Examples/second: 12.624399
I0420 07:33:18.439397 140375134025472 trainer.py:511] time: 8.200120
I0420 07:33:18.440263 140375134025472 trainer.py:522] step: 11 fraction_of_correct_next_step_preds:0.18182154 fraction_of_correct_next_step_preds/logits:0.18182154 grad_norm/all:7.2013311 grad_scale_all:0.13886322 log_pplx:2.9674447 log_pplx/logits:2.9674447 loss:2.9674447 loss/logits:2.9674447 num_samples_in_batch:128 var_norm/all:704.50751
I0420 07:33:23.461621 140375142418176 trainer.py:371] Steps/second: 0.098740, Examples/second: 12.638698
I0420 07:33:25.952996 140375134025472 trainer.py:511] time: 7.512431
I0420 07:33:25.954210 140375134025472 trainer.py:522] step: 12 fraction_of_correct_next_step_preds:0.1485029 fraction_of_correct_next_step_preds/logits:0.1485029 grad_norm/all:3.4744987 grad_scale_all:0.28781131 log_pplx:2.9354002 log_pplx/logits:2.9354002 loss:2.9354002 loss/logits:2.9354002 num_samples_in_batch:128 var_norm/all:704.50739
I0420 07:33:32.858994 140375134025472 trainer.py:511] time: 6.904439
I0420 07:33:32.860662 140375134025472 trainer.py:522] step: 13 fraction_of_correct_next_step_preds:0.16566102
fraction_of_correct_next_step_preds/logits:0.16566102 grad_norm/all:2.9549899 grad_scale_all:0.33841065 log_pplx:2.9054689 log_pplx/logits:2.9054689 loss:2.9054689 loss/logits:2.9054689 num_samples_in_batch:128 var_norm/all:704.50702 I0420 07:33:33.468287 140375142418176 trainer.py:371] Steps/second: 0.107074, Examples/second: 13.705518 I0420 07:33:38.838541 140375134025472 trainer.py:511] time: 5.977578 I0420 07:33:38.839858 140375134025472 trainer.py:522] step: 14 fraction_of_correct_next_step_preds:0.18039839 fraction_of_correct_next_step_preds/logits:0.18039839 grad_norm/all:2.9032726 grad_scale_all:0.34443888 log_pplx:2.8980184 log_pplx/logits:2.8980184 loss:2.8980184 loss/logits:2.8980184 num_samples_in_batch:128 var_norm/all:704.50647 I0420 07:33:43.207788 140375134025472 trainer.py:511] time: 4.367637 I0420 07:33:43.209603 140375134025472 trainer.py:522] step: 15 fraction_of_correct_next_step_preds:0.17200515 fraction_of_correct_next_step_preds/logits:0.17200515 grad_norm/all:2.4388075 grad_scale_all:0.41003647 log_pplx:2.9262927 log_pplx/logits:2.9262927 loss:2.9262927 loss/logits:2.9262927 num_samples_in_batch:256 var_norm/all:704.50562 I0420 07:33:43.493648 140375142418176 trainer.py:371] Steps/second: 0.114124, Examples/second: 15.581691 I0420 07:33:52.574059 140375134025472 trainer.py:511] time: 9.364165 I0420 07:33:52.575781 140375134025472 trainer.py:522] step: 16 fraction_of_correct_next_step_preds:0.18149137 fraction_of_correct_next_step_preds/logits:0.18149137 grad_norm/all:2.0045273 grad_scale_all:0.49887073 log_pplx:2.8566251 log_pplx/logits:2.8566251 loss:2.8566251 loss/logits:2.8566251 num_samples_in_batch:128 var_norm/all:704.50458 I0420 07:33:53.487867 140375142418176 trainer.py:371] Steps/second: 0.113130, Examples/second: 15.385644 I0420 07:34:01.174437 140375134025472 trainer.py:511] time: 8.598373 I0420 07:34:01.176188 140375134025472 trainer.py:522] step: 17 fraction_of_correct_next_step_preds:0.1833981 
fraction_of_correct_next_step_preds/logits:0.1833981 grad_norm/all:2.2261834 grad_scale_all:0.44919929 log_pplx:2.8556404 log_pplx/logits:2.8556404 loss:2.8556404 loss/logits:2.8556404 num_samples_in_batch:128 var_norm/all:704.50323 I0420 07:34:03.499483 140375142418176 trainer.py:371] Steps/second: 0.112254, Examples/second: 15.213734 I0420 07:34:09.164508 140375134025472 trainer.py:511] time: 7.988012 I0420 07:34:09.165910 140375134025472 trainer.py:522] step: 18 fraction_of_correct_next_step_preds:0.21112825 fraction_of_correct_next_step_preds/logits:0.21112825 grad_norm/all:1.8983856 grad_scale_all:0.52676338 log_pplx:2.8433697 log_pplx/logits:2.8433697 loss:2.8433697 loss/logits:2.8433697 num_samples_in_batch:128 var_norm/all:704.50177 I0420 07:34:13.508529 140375142418176 trainer.py:371] Steps/second: 0.111489, Examples/second: 15.063375 I0420 07:34:17.413304 140375134025472 trainer.py:511] time: 8.247067 I0420 07:34:17.414127 140375134025472 trainer.py:522] step: 19 fraction_of_correct_next_step_preds:0.19221394 fraction_of_correct_next_step_preds/logits:0.19221394 grad_norm/all:1.5632075 grad_scale_all:0.63971031 log_pplx:2.8191531 log_pplx/logits:2.8191531 loss:2.8191531 loss/logits:2.8191531 num_samples_in_batch:128 var_norm/all:704.5 I0420 07:34:23.518748 140375142418176 trainer.py:371] Steps/second: 0.110812, Examples/second: 14.930481 I0420 07:34:25.138550 140375134025472 trainer.py:511] time: 7.724124 I0420 07:34:25.139547 140375134025472 trainer.py:522] step: 20 fraction_of_correct_next_step_preds:0.18065803 fraction_of_correct_next_step_preds/logits:0.18065803 grad_norm/all:1.513368 grad_scale_all:0.66077781 log_pplx:2.8195481 log_pplx/logits:2.8195481 loss:2.8195481 loss/logits:2.8195481 num_samples_in_batch:128 var_norm/all:704.4978 I0420 07:34:32.060746 140375134025472 trainer.py:511] time: 6.920765 I0420 07:34:32.062063 140375134025472 trainer.py:522] step: 21 fraction_of_correct_next_step_preds:0.22351789 
fraction_of_correct_next_step_preds/logits:0.22351789 grad_norm/all:1.794582 grad_scale_all:0.5572328 log_pplx:2.8125205 log_pplx/logits:2.8125205 loss:2.8125205 loss/logits:2.8125205 num_samples_in_batch:128 var_norm/all:704.4953 I0420 07:34:33.528932 140375142418176 trainer.py:371] Steps/second: 0.115721, Examples/second: 15.517583 I0420 07:34:38.001070 140375134025472 trainer.py:511] time: 5.938352 I0420 07:34:38.002382 140375134025472 trainer.py:522] step: 22 fraction_of_correct_next_step_preds:0.18407217 fraction_of_correct_next_step_preds/logits:0.18407217 grad_norm/all:1.6261265 grad_scale_all:0.61495829 log_pplx:2.8218467 log_pplx/logits:2.8218467 loss:2.8218467 loss/logits:2.8218467 num_samples_in_batch:128 var_norm/all:704.49268 2019-04-20 07:34:38.010061: I lingvo/core/ops/record_batcher.cc:344] 193 total seconds passed. Total records yielded: 3947. Total records skipped: 7 2019-04-20 07:34:38.010244: I lingvo/core/ops/record_batcher.cc:349] Out-of-range sample: 1711 2019-04-20 07:34:38.010277: I lingvo/core/ops/record_batcher.cc:349] Out-of-range sample: 1726 2019-04-20 07:34:38.010300: I lingvo/core/ops/record_batcher.cc:349] Out-of-range sample: 1717 2019-04-20 07:34:38.010322: I lingvo/core/ops/record_batcher.cc:349] Out-of-range sample: 1717 2019-04-20 07:34:38.010343: I lingvo/core/ops/record_batcher.cc:349] Out-of-range sample: 1735 2019-04-20 07:34:38.010362: I lingvo/core/ops/record_batcher.cc:349] Out-of-range sample: 2431 I0420 07:34:43.538116 140375142418176 trainer.py:371] Steps/second: 0.114894, Examples/second: 15.374913 I0420 07:34:47.163630 140375134025472 trainer.py:511] time: 9.160850 I0420 07:34:47.164750 140375134025472 trainer.py:522] step: 23 fraction_of_correct_next_step_preds:0.19543481 fraction_of_correct_next_step_preds/logits:0.19543481 grad_norm/all:0.95440471 grad_scale_all:1 log_pplx:2.7736437 log_pplx/logits:2.7736437 loss:2.7736437 loss/logits:2.7736437 num_samples_in_batch:128 var_norm/all:704.48987 I0420 07:34:53.550586 
140375142418176 trainer.py:371] Steps/second: 0.114148, Examples/second: 15.246183 I0420 07:34:55.746999 140375134025472 trainer.py:511] time: 8.582018 I0420 07:34:55.748157 140375134025472 trainer.py:522] step: 24 fraction_of_correct_next_step_preds:0.21631305 fraction_of_correct_next_step_preds/logits:0.21631305 grad_norm/all:1.2232789 grad_scale_all:0.81747508 log_pplx:2.7470989 log_pplx/logits:2.7470989 loss:2.7470989 loss/logits:2.7470989 num_samples_in_batch:128 var_norm/all:704.48651 I0420 07:35:03.559312 140375142418176 trainer.py:371] Steps/second: 0.113474, Examples/second: 15.129889 I0420 07:35:04.091337 140375134025472 trainer.py:511] time: 8.342919 I0420 07:35:04.092581 140375134025472 trainer.py:522] step: 25 fraction_of_correct_next_step_preds:0.21755469 fraction_of_correct_next_step_preds/logits:0.21755469 grad_norm/all:1.304142 grad_scale_all:0.76678765 log_pplx:2.7486336 log_pplx/logits:2.7486336 loss:2.7486336 loss/logits:2.7486336 num_samples_in_batch:128 var_norm/all:704.48285 I0420 07:35:12.140954 140375134025472 trainer.py:511] time: 8.048148 I0420 07:35:12.141916 140375134025472 trainer.py:522] step: 26 fraction_of_correct_next_step_preds:0.23160981 fraction_of_correct_next_step_preds/logits:0.23160981 grad_norm/all:0.91558677 grad_scale_all:1 log_pplx:2.7053049 log_pplx/logits:2.7053049 loss:2.7053049 loss/logits:2.7053049 num_samples_in_batch:128 var_norm/all:704.47894 I0420 07:35:13.568135 140375142418176 trainer.py:371] Steps/second: 0.117376, Examples/second: 15.601947 I0420 07:35:19.786127 140375134025472 trainer.py:511] time: 7.643813 I0420 07:35:19.787606 140375134025472 trainer.py:522] step: 27 fraction_of_correct_next_step_preds:0.25725779 fraction_of_correct_next_step_preds/logits:0.25725779 grad_norm/all:1.5240614 grad_scale_all:0.65614152 log_pplx:2.7172825 log_pplx/logits:2.7172825 loss:2.7172825 loss/logits:2.7172825 num_samples_in_batch:128 var_norm/all:704.47467 I0420 07:35:23.578629 140375142418176 trainer.py:371] 
Steps/second: 0.116620, Examples/second: 15.480218 I0420 07:35:27.002883 140375134025472 trainer.py:511] time: 7.215031 I0420 07:35:27.004050 140375134025472 trainer.py:522] step: 28 fraction_of_correct_next_step_preds:0.19790526 fraction_of_correct_next_step_preds/logits:0.19790526 grad_norm/all:2.5601723 grad_scale_all:0.39059871 log_pplx:2.7200246 log_pplx/logits:2.7200246 loss:2.7200246 loss/logits:2.7200246 num_samples_in_batch:128 var_norm/all:704.47046 I0420 07:35:31.298074 140375134025472 trainer.py:511] time: 4.293699 I0420 07:35:31.298980 140375134025472 trainer.py:522] step: 29 fraction_of_correct_next_step_preds:0.23508346 fraction_of_correct_next_step_preds/logits:0.23508346 grad_norm/all:1.768149 grad_scale_all:0.5655632 log_pplx:2.7536347 log_pplx/logits:2.7536347 loss:2.7536347 loss/logits:2.7536347 num_samples_in_batch:256 var_norm/all:704.46643 I0420 07:35:33.588792 140375142418176 trainer.py:371] Steps/second: 0.120067, Examples/second: 16.428504 I0420 07:35:37.132637 140375134025472 trainer.py:511] time: 5.833392 I0420 07:35:37.133708 140375134025472 trainer.py:522] step: 30 fraction_of_correct_next_step_preds:0.27157485 fraction_of_correct_next_step_preds/logits:0.27157485 grad_norm/all:1.1383265 grad_scale_all:0.87848258 log_pplx:2.6550455 log_pplx/logits:2.6550455 loss:2.6550455 loss/logits:2.6550455 num_samples_in_batch:128 var_norm/all:704.46259 I0420 07:35:43.598779 140375142418176 trainer.py:371] Steps/second: 0.119265, Examples/second: 16.283604 I0420 07:35:46.383224 140375134025472 trainer.py:511] time: 9.249290 I0420 07:35:46.384552 140375134025472 trainer.py:522] step: 31 fraction_of_correct_next_step_preds:0.26169586 fraction_of_correct_next_step_preds/logits:0.26169586 grad_norm/all:2.2130964 grad_scale_all:0.4518556 log_pplx:2.6702738 log_pplx/logits:2.6702738 loss:2.6702738 loss/logits:2.6702738 num_samples_in_batch:128 var_norm/all:704.45856 I0420 07:35:53.610418 140375142418176 trainer.py:371] Steps/second: 0.118523, 
Examples/second: 16.149695 I0420 07:35:55.118340 140375134025472 trainer.py:511] time: 8.733485 I0420 07:35:55.119571 140375134025472 trainer.py:522] step: 32 fraction_of_correct_next_step_preds:0.24295399 fraction_of_correct_next_step_preds/logits:0.24295399 grad_norm/all:2.4151533 grad_scale_all:0.4140524 log_pplx:2.6709504 log_pplx/logits:2.6709504 loss:2.6709504 loss/logits:2.6709504 num_samples_in_batch:128 var_norm/all:704.45471 I0420 07:36:03.338177 140375134025472 trainer.py:511] time: 8.218271 I0420 07:36:03.339405 140375134025472 trainer.py:522] step: 33 fraction_of_correct_next_step_preds:0.27863407 fraction_of_correct_next_step_preds/logits:0.27863407 grad_norm/all:1.0046402 grad_scale_all:0.99538124 log_pplx:2.6023529 log_pplx/logits:2.6023529 loss:2.6023529 loss/logits:2.6023529 num_samples_in_batch:128 var_norm/all:704.45111 I0420 07:36:03.618396 140375142418176 trainer.py:371] Steps/second: 0.121520, Examples/second: 16.497211 I0420 07:36:11.618510 140375134025472 trainer.py:511] time: 8.278831 I0420 07:36:11.620147 140375134025472 trainer.py:522] step: 34 fraction_of_correct_next_step_preds:0.2705746 fraction_of_correct_next_step_preds/logits:0.2705746 grad_norm/all:3.2093797 grad_scale_all:0.31158671 log_pplx:2.6450973 log_pplx/logits:2.6450973 loss:2.6450973 loss/logits:2.6450973 num_samples_in_batch:128 var_norm/all:704.44714 I0420 07:36:13.628343 140375142418176 trainer.py:371] Steps/second: 0.120751, Examples/second: 16.365323 I0420 07:36:19.194186 140375134025472 trainer.py:511] time: 7.573756 I0420 07:36:19.195812 140375134025472 trainer.py:522] step: 35 fraction_of_correct_next_step_preds:0.26238686 fraction_of_correct_next_step_preds/logits:0.26238686 grad_norm/all:1.9304246 grad_scale_all:0.51802075 log_pplx:2.6112156 log_pplx/logits:2.6112156 loss:2.6112156 loss/logits:2.6112156 num_samples_in_batch:128 var_norm/all:704.44342 I0420 07:36:23.638386 140375142418176 trainer.py:371] Steps/second: 0.120035, Examples/second: 16.242484 I0420 
07:36:26.169241 140375134025472 trainer.py:511] time: 6.973244 I0420 07:36:26.170348 140375134025472 trainer.py:522] step: 36 fraction_of_correct_next_step_preds:0.28620055 fraction_of_correct_next_step_preds/logits:0.28620055 grad_norm/all:1.9523199 grad_scale_all:0.51221114 log_pplx:2.6116624 log_pplx/logits:2.6116624 loss:2.6116624 loss/logits:2.6116624 num_samples_in_batch:128 var_norm/all:704.43976 I0420 07:36:33.649904 140375142418176 trainer.py:371] Steps/second: 0.119366, Examples/second: 16.127728 I0420 07:36:35.362273 140375134025472 trainer.py:511] time: 9.191655 I0420 07:36:35.363902 140375134025472 trainer.py:522] step: 37 fraction_of_correct_next_step_preds:0.26480764 fraction_of_correct_next_step_preds/logits:0.26480764 grad_norm/all:2.1899633 grad_scale_all:0.45662865 log_pplx:2.5841577 log_pplx/logits:2.5841577 loss:2.5841577 loss/logits:2.5841577 num_samples_in_batch:128 var_norm/all:704.43622 I0420 07:36:41.210024 140375134025472 trainer.py:511] time: 5.845796 I0420 07:36:41.211704 140375134025472 trainer.py:522] step: 38 fraction_of_correct_next_step_preds:0.27340057 fraction_of_correct_next_step_preds/logits:0.27340057 grad_norm/all:1.3420987 grad_scale_all:0.74510169 log_pplx:2.5779867 log_pplx/logits:2.5779867 loss:2.5779867 loss/logits:2.5779867 num_samples_in_batch:128 var_norm/all:704.43274 I0420 07:36:43.658245 140375142418176 trainer.py:371] Steps/second: 0.121951, Examples/second: 16.431276 I0420 07:36:49.907310 140375134025472 trainer.py:511] time: 8.695390 I0420 07:36:49.908164 140375134025472 trainer.py:522] step: 39 fraction_of_correct_next_step_preds:0.25044024 fraction_of_correct_next_step_preds/logits:0.25044024 grad_norm/all:2.3219526 grad_scale_all:0.43067202 log_pplx:2.5622635 log_pplx/logits:2.5622635 loss:2.5622635 loss/logits:2.5622635 num_samples_in_batch:128 var_norm/all:704.42908 I0420 07:36:53.669751 140375142418176 trainer.py:371] Steps/second: 0.121264, Examples/second: 16.317783 I0420 07:36:58.235479 140375134025472 
trainer.py:511] time: 8.326842 I0420 07:36:58.236478 140375134025472 trainer.py:522] step: 40 fraction_of_correct_next_step_preds:0.27789244 fraction_of_correct_next_step_preds/logits:0.27789244 grad_norm/all:1.7515401 grad_scale_all:0.57092613 log_pplx:2.5496616 log_pplx/logits:2.5496616 loss:2.5496616 loss/logits:2.5496616 num_samples_in_batch:128 var_norm/all:704.42548 I0420 07:37:03.679372 140375142418176 trainer.py:371] Steps/second: 0.120619, Examples/second: 16.211234 I0420 07:37:06.227189 140375134025472 trainer.py:511] time: 7.990343 I0420 07:37:06.228425 140375134025472 trainer.py:522] step: 41 fraction_of_correct_next_step_preds:0.27679548 fraction_of_correct_next_step_preds/logits:0.27679548 grad_norm/all:2.7463503 grad_scale_all:0.36411962 log_pplx:2.5383792 log_pplx/logits:2.5383792 loss:2.5383792 loss/logits:2.5383792 num_samples_in_batch:128 var_norm/all:704.42194 I0420 07:37:13.687947 140375142418176 trainer.py:371] Steps/second: 0.120013, Examples/second: 16.110977 I0420 07:37:13.896599 140375134025472 trainer.py:511] time: 7.667946 I0420 07:37:13.897425 140375134025472 trainer.py:522] step: 42 fraction_of_correct_next_step_preds:0.27243295 fraction_of_correct_next_step_preds/logits:0.27243295 grad_norm/all:1.3269877 grad_scale_all:0.75358647 log_pplx:2.5244553 log_pplx/logits:2.5244553 loss:2.5244553 loss/logits:2.5244553 num_samples_in_batch:128 var_norm/all:704.4184 I0420 07:37:20.823199 140375134025472 trainer.py:511] time: 6.925393 I0420 07:37:20.824395 140375134025472 trainer.py:522] step: 43 fraction_of_correct_next_step_preds:0.25611943 fraction_of_correct_next_step_preds/logits:0.25611943 grad_norm/all:2.4771316 grad_scale_all:0.40369272 log_pplx:2.5433397 log_pplx/logits:2.5433397 loss:2.5433397 loss/logits:2.5433397 num_samples_in_batch:128 var_norm/all:704.41473 I0420 07:37:23.694926 140375142418176 trainer.py:371] Steps/second: 0.122285, Examples/second: 16.380501 I0420 07:37:29.416392 140375134025472 trainer.py:511] time: 8.591591 
I0420 07:37:29.417622 140375134025472 trainer.py:522] step: 44 fraction_of_correct_next_step_preds:0.2867482 fraction_of_correct_next_step_preds/logits:0.2867482 grad_norm/all:2.2287934 grad_scale_all:0.44867328 log_pplx:2.5145919 log_pplx/logits:2.5145919 loss:2.5145919 loss/logits:2.5145919 num_samples_in_batch:128 var_norm/all:704.41107 I0420 07:37:33.707510 140375142418176 trainer.py:371] Steps/second: 0.121665, Examples/second: 16.280931 I0420 07:37:38.633066 140375134025472 trainer.py:511] time: 9.191278 I0420 07:37:38.634128 140375134025472 trainer.py:522] step: 45 fraction_of_correct_next_step_preds:0.2967239 fraction_of_correct_next_step_preds/logits:0.2967239 grad_norm/all:1.9373232 grad_scale_all:0.5161761 log_pplx:2.4962223 log_pplx/logits:2.4962223 loss:2.4962223 loss/logits:2.4962223 num_samples_in_batch:128 var_norm/all:704.40753 I0420 07:37:43.716531 140375142418176 trainer.py:371] Steps/second: 0.121079, Examples/second: 16.186881 I0420 07:37:44.558804 140375134025472 trainer.py:511] time: 5.924469 I0420 07:37:44.559990 140375134025472 trainer.py:522] step: 46 fraction_of_correct_next_step_preds:0.27961257 fraction_of_correct_next_step_preds/logits:0.27961257 grad_norm/all:1.3620263 grad_scale_all:0.73420018 log_pplx:2.4990051 log_pplx/logits:2.4990051 loss:2.4990051 loss/logits:2.4990051 num_samples_in_batch:128 var_norm/all:704.40393 I0420 07:37:52.829343 140375134025472 trainer.py:511] time: 8.269159 I0420 07:37:52.830336 140375134025472 trainer.py:522] step: 47 fraction_of_correct_next_step_preds:0.29120943 fraction_of_correct_next_step_preds/logits:0.29120943 grad_norm/all:1.6608243 grad_scale_all:0.60211062 log_pplx:2.4610975 log_pplx/logits:2.4610975 loss:2.4610975 loss/logits:2.4610975 num_samples_in_batch:128 var_norm/all:704.39996 I0420 07:37:53.725608 140375142418176 trainer.py:371] Steps/second: 0.123144, Examples/second: 16.768488 I0420 07:37:57.058876 140375134025472 trainer.py:511] time: 4.228302 I0420 07:37:57.060250 140375134025472 
trainer.py:522] step: 48 fraction_of_correct_next_step_preds:0.28546801 fraction_of_correct_next_step_preds/logits:0.28546801 grad_norm/all:1.3926181 grad_scale_all:0.718072 log_pplx:2.5202353 log_pplx/logits:2.5202353 loss:2.5202353 loss/logits:2.5202353 num_samples_in_batch:256 var_norm/all:704.396 I0420 07:38:03.736061 140375142418176 trainer.py:371] Steps/second: 0.122549, Examples/second: 16.666722 I0420 07:38:04.924762 140375134025472 trainer.py:511] time: 7.864333 I0420 07:38:04.925765 140375134025472 trainer.py:522] step: 49 fraction_of_correct_next_step_preds:0.30097163 fraction_of_correct_next_step_preds/logits:0.30097163 grad_norm/all:0.98358774 grad_scale_all:1 log_pplx:2.4348617 log_pplx/logits:2.4348617 loss:2.4348617 loss/logits:2.4348617 num_samples_in_batch:128 var_norm/all:704.39172 I0420 07:38:11.901254 140375134025472 trainer.py:511] time: 6.975202 I0420 07:38:11.902563 140375134025472 trainer.py:522] step: 50 fraction_of_correct_next_step_preds:0.29285777 fraction_of_correct_next_step_preds/logits:0.29285777 grad_norm/all:1.3046882 grad_scale_all:0.76646662 log_pplx:2.4325299 log_pplx/logits:2.4325299 loss:2.4325299 loss/logits:2.4325299 num_samples_in_batch:128 var_norm/all:704.38702 I0420 07:38:13.745528 140375142418176 trainer.py:371] Steps/second: 0.124475, Examples/second: 16.888723 I0420 07:38:19.548118 140375134025472 trainer.py:511] time: 7.645136 I0420 07:38:19.549738 140375134025472 trainer.py:522] step: 51 fraction_of_correct_next_step_preds:0.30415326 fraction_of_correct_next_step_preds/logits:0.30415326 grad_norm/all:1.2448066 grad_scale_all:0.80333763 log_pplx:2.4177649 log_pplx/logits:2.4177649 loss:2.4177649 loss/logits:2.4177649 num_samples_in_batch:128 var_norm/all:704.38208 I0420 07:38:23.756176 140375142418176 trainer.py:371] Steps/second: 0.123877, Examples/second: 16.788972 I0420 07:38:28.114622 140375134025472 trainer.py:511] time: 8.564568 I0420 07:38:28.115804 140375134025472 trainer.py:522] step: 52 
fraction_of_correct_next_step_preds:0.31214514 fraction_of_correct_next_step_preds/logits:0.31214514 grad_norm/all:1.0816773 grad_scale_all:0.92449015 log_pplx:2.4007387 log_pplx/logits:2.4007387 loss:2.4007387 loss/logits:2.4007387 num_samples_in_batch:128 var_norm/all:704.37695 I0420 07:38:33.765763 140375142418176 trainer.py:371] Steps/second: 0.123308, Examples/second: 16.694001 I0420 07:38:37.219835 140375134025472 trainer.py:511] time: 9.103512 I0420 07:38:37.221427 140375134025472 trainer.py:522] step: 53 fraction_of_correct_next_step_preds:0.31481665 fraction_of_correct_next_step_preds/logits:0.31481665 grad_norm/all:0.74732727 grad_scale_all:1 log_pplx:2.3677359 log_pplx/logits:2.3677359 loss:2.3677359 loss/logits:2.3677359 num_samples_in_batch:128 var_norm/all:704.3714 I0420 07:38:43.775006 140375142418176 trainer.py:371] Steps/second: 0.122765, Examples/second: 16.603444 I0420 07:38:45.475708 140375134025472 trainer.py:511] time: 8.253978 I0420 07:38:45.476913 140375134025472 trainer.py:522] step: 54 fraction_of_correct_next_step_preds:0.30801314 fraction_of_correct_next_step_preds/logits:0.30801314 grad_norm/all:1.1673788 grad_scale_all:0.85662001 log_pplx:2.3846834 log_pplx/logits:2.3846834 loss:2.3846834 loss/logits:2.3846834 num_samples_in_batch:128 var_norm/all:704.36542 2019-04-20 07:38:45.480067: I lingvo/core/ops/record_batcher.cc:344] 440 total seconds passed. Total records yielded: 8285. 
Total records skipped: 8 2019-04-20 07:38:45.481832: I lingvo/core/ops/record_batcher.cc:349] Out-of-range sample: 1711 I0420 07:38:51.320868 140375134025472 trainer.py:511] time: 5.843708 I0420 07:38:51.322346 140375134025472 trainer.py:522] step: 55 fraction_of_correct_next_step_preds:0.30724242 fraction_of_correct_next_step_preds/logits:0.30724242 grad_norm/all:1.4477671 grad_scale_all:0.69071883 log_pplx:2.4033675 log_pplx/logits:2.4033675 loss:2.4033675 loss/logits:2.4033675 num_samples_in_batch:128 var_norm/all:704.35919 I0420 07:38:53.785290 140375142418176 trainer.py:371] Steps/second: 0.124511, Examples/second: 16.806726 I0420 07:38:59.143594 140375134025472 trainer.py:511] time: 7.820905 I0420 07:38:59.145212 140375134025472 trainer.py:522] step: 56 fraction_of_correct_next_step_preds:0.32211804 fraction_of_correct_next_step_preds/logits:0.32211804 grad_norm/all:0.99645722 grad_scale_all:1 log_pplx:2.3391604 log_pplx/logits:2.3391604 loss:2.3391604 loss/logits:2.3391604 num_samples_in_batch:128 var_norm/all:704.35303 I0420 07:39:03.794799 140375142418176 trainer.py:371] Steps/second: 0.123966, Examples/second: 16.717675 I0420 07:39:05.955985 140375134025472 trainer.py:511] time: 6.810547 I0420 07:39:05.957156 140375134025472 trainer.py:522] step: 57 fraction_of_correct_next_step_preds:0.31032616 fraction_of_correct_next_step_preds/logits:0.31032616 grad_norm/all:1.3071382 grad_scale_all:0.76503003 log_pplx:2.3732302 log_pplx/logits:2.3732302 loss:2.3732302 loss/logits:2.3732302 num_samples_in_batch:128 var_norm/all:704.34644 I0420 07:39:13.333959 140375134025472 trainer.py:511] time: 7.376561 I0420 07:39:13.335150 140375134025472 trainer.py:522] step: 58 fraction_of_correct_next_step_preds:0.31746674 fraction_of_correct_next_step_preds/logits:0.31746674 grad_norm/all:0.7144863 grad_scale_all:1 log_pplx:2.3458943 log_pplx/logits:2.3458943 loss:2.3458943 loss/logits:2.3458943 num_samples_in_batch:128 var_norm/all:704.33972 I0420 07:39:13.804999 
140375142418176 trainer.py:371] Steps/second: 0.125610, Examples/second: 16.909668 I0420 07:39:21.758083 140375134025472 trainer.py:511] time: 8.422377 I0420 07:39:21.759244 140375134025472 trainer.py:522] step: 59 fraction_of_correct_next_step_preds:0.32590339 fraction_of_correct_next_step_preds/logits:0.32590339 grad_norm/all:0.80505741 grad_scale_all:1 log_pplx:2.3301501 log_pplx/logits:2.3301501 loss:2.3301501 loss/logits:2.3301501 num_samples_in_batch:128 var_norm/all:704.3327 I0420 07:39:23.814544 140375142418176 trainer.py:371] Steps/second: 0.125064, Examples/second: 16.822212 I0420 07:39:30.864382 140375134025472 trainer.py:511] time: 9.104589 I0420 07:39:30.866740 140375134025472 trainer.py:522] step: 60 fraction_of_correct_next_step_preds:0.3370963 fraction_of_correct_next_step_preds/logits:0.3370963 grad_norm/all:0.67958254 grad_scale_all:1 log_pplx:2.306602 log_pplx/logits:2.306602 loss:2.306602 loss/logits:2.306602 num_samples_in_batch:128 var_norm/all:704.32526 I0420 07:39:33.827307 140375142418176 trainer.py:371] Steps/second: 0.124541, Examples/second: 16.738280 I0420 07:39:39.266830 140375134025472 trainer.py:511] time: 8.399826 I0420 07:39:39.268023 140375134025472 trainer.py:522] step: 61 fraction_of_correct_next_step_preds:0.33446094 fraction_of_correct_next_step_preds/logits:0.33446094 grad_norm/all:0.94569188 grad_scale_all:1 log_pplx:2.3149652 log_pplx/logits:2.3149652 loss:2.3149652 loss/logits:2.3149652 num_samples_in_batch:128 var_norm/all:704.31757 I0420 07:39:43.848258 140375142418176 trainer.py:371] Steps/second: 0.124036, Examples/second: 16.657488 I0420 07:39:47.206741 140375134025472 trainer.py:511] time: 7.938489 I0420 07:39:47.207648 140375134025472 trainer.py:522] step: 62 fraction_of_correct_next_step_preds:0.33041391 fraction_of_correct_next_step_preds/logits:0.33041391 grad_norm/all:0.97894734 grad_scale_all:1 log_pplx:2.3103309 log_pplx/logits:2.3103309 loss:2.3103309 loss/logits:2.3103309 num_samples_in_batch:128 
var_norm/all:704.30957
I0420 07:39:52.905343 140375134025472 trainer.py:511] time: 5.697142
I0420 07:39:52.906486 140375134025472 trainer.py:522] step: 63 fraction_of_correct_next_step_preds:0.33234218 fraction_of_correct_next_step_preds/logits:0.33234218 grad_norm/all:1.9755416 grad_scale_all:0.5061903 log_pplx:2.3209085 log_pplx/logits:2.3209085 loss:2.3209085 loss/logits:2.3209085 num_samples_in_batch:128 var_norm/all:704.30127
I0420 07:39:53.844923 140375142418176 trainer.py:371] Steps/second: 0.125551, Examples/second: 16.835810
I0420 07:39:59.689280 140375134025472 trainer.py:511] time: 6.782313
I0420 07:39:59.690049 140375134025472 trainer.py:522] step: 64 fraction_of_correct_next_step_preds:0.33565956 fraction_of_correct_next_step_preds/logits:0.33565956 grad_norm/all:0.87551606 grad_scale_all:1 log_pplx:2.3032544 log_pplx/logits:2.3032544 loss:2.3032544 loss/logits:2.3032544 num_samples_in_batch:128 var_norm/all:704.2934
I0420 07:40:03.856801 140375142418176 trainer.py:371] Steps/second: 0.125049, Examples/second: 17.006665
I0420 07:40:04.021229 140375134025472 trainer.py:511] time: 4.330891
I0420 07:40:04.022106 140375134025472 trainer.py:522] step: 65 fraction_of_correct_next_step_preds:0.31422696 fraction_of_correct_next_step_preds/logits:0.31422696 grad_norm/all:1.7568246 grad_scale_all:0.5692088 log_pplx:2.3677559 log_pplx/logits:2.3677559 loss:2.3677559 loss/logits:2.3677559 num_samples_in_batch:256 var_norm/all:704.28528
I0420 07:40:11.636953 140375134025472 trainer.py:511] time: 7.614421
I0420 07:40:11.638170 140375134025472 trainer.py:522] step: 66 fraction_of_correct_next_step_preds:0.33445784 fraction_of_correct_next_step_preds/logits:0.33445784 grad_norm/all:1.4875379 grad_scale_all:0.67225182 log_pplx:2.3114297 log_pplx/logits:2.3114297 loss:2.3114297 loss/logits:2.3114297 num_samples_in_batch:128 var_norm/all:704.27734
I0420 07:40:13.866642 140375142418176 trainer.py:371] Steps/second: 0.126483, Examples/second: 17.171027
I0420 07:40:20.701528 140375134025472 trainer.py:511] time: 9.062950
I0420 07:40:20.702819 140375134025472 trainer.py:522] step: 67 fraction_of_correct_next_step_preds:0.34274742 fraction_of_correct_next_step_preds/logits:0.34274742 grad_norm/all:1.3280134 grad_scale_all:0.75300443 log_pplx:2.2751744 log_pplx/logits:2.2751744 loss:2.2751744 loss/logits:2.2751744 num_samples_in_batch:128 var_norm/all:704.26959
I0420 07:40:23.875549 140375142418176 trainer.py:371] Steps/second: 0.125983, Examples/second: 17.088548
I0420 07:40:29.276140 140375134025472 trainer.py:511] time: 8.572985
I0420 07:40:29.277910 140375134025472 trainer.py:522] step: 68 fraction_of_correct_next_step_preds:0.35038498 fraction_of_correct_next_step_preds/logits:0.35038498 grad_norm/all:0.74925572 grad_scale_all:1 log_pplx:2.257813 log_pplx/logits:2.257813 loss:2.257813 loss/logits:2.257813 num_samples_in_batch:128 var_norm/all:704.26178
I0420 07:40:33.885135 140375142418176 trainer.py:371] Steps/second: 0.125501, Examples/second: 17.009094
I0420 07:40:37.489234 140375134025472 trainer.py:511] time: 8.211099
I0420 07:40:37.490010 140375134025472 trainer.py:522] step: 69 fraction_of_correct_next_step_preds:0.3476578 fraction_of_correct_next_step_preds/logits:0.3476578 grad_norm/all:1.2810773 grad_scale_all:0.78059304 log_pplx:2.2596214 log_pplx/logits:2.2596214 loss:2.2596214 loss/logits:2.2596214 num_samples_in_batch:128 var_norm/all:704.25372
I0420 07:40:43.896034 140375142418176 trainer.py:371] Steps/second: 0.125037, Examples/second: 16.932488
I0420 07:40:45.529628 140375134025472 trainer.py:511] time: 8.039236
I0420 07:40:45.530534 140375134025472 trainer.py:522] step: 70 fraction_of_correct_next_step_preds:0.34041184 fraction_of_correct_next_step_preds/logits:0.34041184 grad_norm/all:1.2760593 grad_scale_all:0.78366268 log_pplx:2.281177 log_pplx/logits:2.281177 loss:2.281177 loss/logits:2.281177 num_samples_in_batch:128 var_norm/all:704.24561
I0420 07:40:52.443317 140375134025472 trainer.py:511] time: 6.912189
I0420 07:40:52.444982 140375134025472 trainer.py:522] step: 71 fraction_of_correct_next_step_preds:0.34689143 fraction_of_correct_next_step_preds/logits:0.34689143 grad_norm/all:1.113879 grad_scale_all:0.89776361 log_pplx:2.2540269 log_pplx/logits:2.2540269 loss:2.2540269 loss/logits:2.2540269 num_samples_in_batch:128 var_norm/all:704.23737
I0420 07:40:53.904791 140375142418176 trainer.py:371] Steps/second: 0.126369, Examples/second: 17.086490
I0420 07:40:58.216379 140375134025472 trainer.py:511] time: 5.771179
I0420 07:40:58.217495 140375134025472 trainer.py:522] step: 72 fraction_of_correct_next_step_preds:0.33866972 fraction_of_correct_next_step_preds/logits:0.33866972 grad_norm/all:1.1875348 grad_scale_all:0.84208059 log_pplx:2.2681174 log_pplx/logits:2.2681174 loss:2.2681174 loss/logits:2.2681174 num_samples_in_batch:128 var_norm/all:704.229
I0420 07:41:03.916526 140375142418176 trainer.py:371] Steps/second: 0.125905, Examples/second: 17.011183
I0420 07:41:03.917241 140375142418176 trainer.py:268] Save checkpoint
W0420 07:41:06.173212 140375142418176 meta_graph.py:447] Issue encountered when serializing __batch_norm_update_dict. Type is unsupported, or the types of the items don't match field type in CollectionDef. Note this is a warning and probably safe to ignore. 'dict' object has no attribute 'name'
W0420 07:41:06.173594 140375142418176 meta_graph.py:447] Issue encountered when serializing __model_split_id_stack. Type is unsupported, or the types of the items don't match field type in CollectionDef. Note this is a warning and probably safe to ignore. 'list' object has no attribute 'name'
I0420 07:41:06.398998 140375142418176 trainer.py:270] Save checkpoint done: /data/dingzhenyou/speech_data/librispeech/log/train/ckpt-00000072
I0420 07:41:06.594940 140375134025472 trainer.py:511] time: 8.353298
I0420 07:41:06.595875 140375134025472 trainer.py:522] step: 73 fraction_of_correct_next_step_preds:0.35307753 fraction_of_correct_next_step_preds/logits:0.35307753 grad_norm/all:0.75280231 grad_scale_all:1 log_pplx:2.2375162 log_pplx/logits:2.2375162 loss:2.2375162 loss/logits:2.2375162 num_samples_in_batch:128 var_norm/all:704.22052
I0420 07:41:13.935550 140375142418176 trainer.py:371] Steps/second: 0.125456, Examples/second: 16.938254
I0420 07:41:15.875637 140375134025472 trainer.py:511] time: 9.279429
I0420 07:41:15.876627 140375134025472 trainer.py:522] step: 74 fraction_of_correct_next_step_preds:0.36721641 fraction_of_correct_next_step_preds/logits:0.36721641 grad_norm/all:0.86656165 grad_scale_all:1 log_pplx:2.2110724 log_pplx/logits:2.2110724 loss:2.2110724 loss/logits:2.2110724 num_samples_in_batch:128 var_norm/all:704.21179
I0420 07:41:23.576463 140375134025472 trainer.py:511] time: 7.699559
I0420 07:41:23.577770 140375134025472 trainer.py:522] step: 75 fraction_of_correct_next_step_preds:0.3473832 fraction_of_correct_next_step_preds/logits:0.3473832 grad_norm/all:1.2296442 grad_scale_all:0.81324339 log_pplx:2.2439103 log_pplx/logits:2.2439103 loss:2.2439103 loss/logits:2.2439103 num_samples_in_batch:128 var_norm/all:704.20282
I0420 07:41:23.932002 140375142418176 trainer.py:371] Steps/second: 0.126716, Examples/second: 17.084698
I0420 07:41:31.270879 140375134025472 trainer.py:511] time: 7.692725
I0420 07:41:31.271617 140375134025472 trainer.py:522] step: 76 fraction_of_correct_next_step_preds:0.34664184 fraction_of_correct_next_step_preds/logits:0.34664184 grad_norm/all:1.7527937 grad_scale_all:0.57051778 log_pplx:2.2464612 log_pplx/logits:2.2464612 loss:2.2464612 loss/logits:2.2464612 num_samples_in_batch:128 var_norm/all:704.19379
I0420 07:41:33.943514 140375142418176 trainer.py:371] Steps/second: 0.126270, Examples/second: 17.013188
I0420 07:41:39.387933 140375134025472 trainer.py:511] time: 8.116039
I0420 07:41:39.389003 140375134025472 trainer.py:522] step: 77 fraction_of_correct_next_step_preds:0.35830984 fraction_of_correct_next_step_preds/logits:0.35830984 grad_norm/all:0.81886053 grad_scale_all:1 log_pplx:2.2158952 log_pplx/logits:2.2158952 loss:2.2158952 loss/logits:2.2158952 num_samples_in_batch:128 var_norm/all:704.18512
I0420 07:41:43.953299 140375142418176 trainer.py:371] Steps/second: 0.125838, Examples/second: 16.944062
I0420 07:41:46.216691 140375134025472 trainer.py:511] time: 6.827460
I0420 07:41:46.217935 140375134025472 trainer.py:522] step: 78 fraction_of_correct_next_step_preds:0.35563675 fraction_of_correct_next_step_preds/logits:0.35563675 grad_norm/all:0.79842186 grad_scale_all:1 log_pplx:2.2194674 log_pplx/logits:2.2194674 loss:2.2194674 loss/logits:2.2194674 num_samples_in_batch:128 var_norm/all:704.17609
I0420 07:41:51.873646 140375134025472 trainer.py:511] time: 5.655520
I0420 07:41:51.874325 140375134025472 trainer.py:522] step: 79 fraction_of_correct_next_step_preds:0.35122594 fraction_of_correct_next_step_preds/logits:0.35122594 grad_norm/all:1.6447265 grad_scale_all:0.6080038 log_pplx:2.24704 log_pplx/logits:2.24704 loss:2.24704 loss/logits:2.24704 num_samples_in_batch:128 var_norm/all:704.16681
I0420 07:41:53.964529 140375142418176 trainer.py:371] Steps/second: 0.127029, Examples/second: 17.082936
I0420 07:42:00.195930 140375134025472 trainer.py:511] time: 8.321231
I0420 07:42:00.196805 140375134025472 trainer.py:522] step: 80 fraction_of_correct_next_step_preds:0.36126739 fraction_of_correct_next_step_preds/logits:0.36126739 grad_norm/all:0.98546255 grad_scale_all:1 log_pplx:2.1959085 log_pplx/logits:2.1959085 loss:2.1959085 loss/logits:2.1959085 num_samples_in_batch:128 var_norm/all:704.15784
I0420 07:42:03.973375 140375142418176 trainer.py:371] Steps/second: 0.126599, Examples/second: 17.217479
I0420 07:42:04.534535 140375134025472 trainer.py:511] time: 4.337248
I0420 07:42:04.536370 140375134025472 trainer.py:522] step: 81 fraction_of_correct_next_step_preds:0.35335645 fraction_of_correct_next_step_preds/logits:0.35335645 grad_norm/all:0.70726234 grad_scale_all:1 log_pplx:2.2483265 log_pplx/logits:2.2483265 loss:2.2483265 loss/logits:2.2483265 num_samples_in_batch:256 var_norm/all:704.14862
I0420 07:42:13.410237 140375134025472 trainer.py:511] time: 8.873534
I0420 07:42:13.411325 140375134025472 trainer.py:522] step: 82 fraction_of_correct_next_step_preds:0.3593379 fraction_of_correct_next_step_preds/logits:0.3593379 grad_norm/all:1.1385168 grad_scale_all:0.87833577 log_pplx:2.1953106 log_pplx/logits:2.1953106 loss:2.1953106 loss/logits:2.1953106 num_samples_in_batch:128 var_norm/all:704.13922
I0420 07:42:13.978193 140375142418176 trainer.py:371] Steps/second: 0.127742, Examples/second: 17.347933
I0420 07:42:20.770096 140375134025472 trainer.py:511] time: 7.358493
I0420 07:42:20.771768 140375134025472 trainer.py:522] step: 83 fraction_of_correct_next_step_preds:0.35221922 fraction_of_correct_next_step_preds/logits:0.35221922 grad_norm/all:1.4923625 grad_scale_all:0.67007846 log_pplx:2.2021344 log_pplx/logits:2.2021344 loss:2.2021344 loss/logits:2.2021344 num_samples_in_batch:128 var_norm/all:704.12964
I0420 07:42:23.988683 140375142418176 trainer.py:371] Steps/second: 0.127314, Examples/second: 17.277894
I0420 07:42:28.549462 140375134025472 trainer.py:511] time: 7.777474
I0420 07:42:28.550607 140375134025472 trainer.py:522] step: 84 fraction_of_correct_next_step_preds:0.37899226 fraction_of_correct_next_step_preds/logits:0.37899226 grad_norm/all:0.82399756 grad_scale_all:1 log_pplx:2.1561263 log_pplx/logits:2.1561263 loss:2.1561263 loss/logits:2.1561263 num_samples_in_batch:128 var_norm/all:704.12036
I0420 07:42:33.999638 140375142418176 trainer.py:371] Steps/second: 0.126899, Examples/second: 17.209961
I0420 07:42:36.480242 140375134025472 trainer.py:511] time: 7.929422
I0420 07:42:36.481380 140375134025472 trainer.py:522] step: 85 fraction_of_correct_next_step_preds:0.36427727 fraction_of_correct_next_step_preds/logits:0.36427727 grad_norm/all:0.73982769 grad_scale_all:1 log_pplx:2.1625452 log_pplx/logits:2.1625452 loss:2.1625452 loss/logits:2.1625452 num_samples_in_batch:128 var_norm/all:704.11084
I0420 07:42:43.379302 140375134025472 trainer.py:511] time: 6.897688
I0420 07:42:43.380394 140375134025472 trainer.py:522] step: 86 fraction_of_correct_next_step_preds:0.35739926 fraction_of_correct_next_step_preds/logits:0.35739926 grad_norm/all:1.5248234 grad_scale_all:0.65581363 log_pplx:2.1884296 log_pplx/logits:2.1884296 loss:2.1884296 loss/logits:2.1884296 num_samples_in_batch:128 var_norm/all:704.10114
I0420 07:42:44.008449 140375142418176 trainer.py:371] Steps/second: 0.127986, Examples/second: 17.334595
I0420 07:42:52.863199 140375134025472 trainer.py:511] time: 9.482465
I0420 07:42:52.864460 140375134025472 trainer.py:522] step: 87 fraction_of_correct_next_step_preds:0.3832756 fraction_of_correct_next_step_preds/logits:0.3832756 grad_norm/all:0.99459577 grad_scale_all:1 log_pplx:2.1456861 log_pplx/logits:2.1456861 loss:2.1456861 loss/logits:2.1456861 num_samples_in_batch:128 var_norm/all:704.09155
I0420 07:42:54.018491 140375142418176 trainer.py:371] Steps/second: 0.127573, Examples/second: 17.267847
I0420 07:42:58.559226 140375134025472 trainer.py:511] time: 5.694561
I0420 07:42:58.560539 140375134025472 trainer.py:522] step: 88 fraction_of_correct_next_step_preds:0.36507463 fraction_of_correct_next_step_preds/logits:0.36507463 grad_norm/all:0.8461504 grad_scale_all:1 log_pplx:2.1562793 log_pplx/logits:2.1562793 loss:2.1562793 loss/logits:2.1562793 num_samples_in_batch:128 var_norm/all:704.08179
I0420 07:43:04.029028 140375142418176 trainer.py:371] Steps/second: 0.127173, Examples/second: 17.203016
I0420 07:43:06.955815 140375134025472 trainer.py:511] time: 8.395064
I0420 07:43:06.956528 140375134025472 trainer.py:522] step: 89 fraction_of_correct_next_step_preds:0.38289136 fraction_of_correct_next_step_preds/logits:0.38289136 grad_norm/all:0.5773949 grad_scale_all:1 log_pplx:2.1110232 log_pplx/logits:2.1110232 loss:2.1110232 loss/logits:2.1110232 num_samples_in_batch:128 var_norm/all:704.07178
I0420 07:43:14.034790 140375142418176 trainer.py:371] Steps/second: 0.126785, Examples/second: 17.140155
I0420 07:43:14.521831 140375134025472 trainer.py:511] time: 7.564866
I0420 07:43:14.522579 140375134025472 trainer.py:522] step: 90 fraction_of_correct_next_step_preds:0.37470537 fraction_of_correct_next_step_preds/logits:0.37470537 grad_norm/all:0.70084506 grad_scale_all:1 log_pplx:2.1351776 log_pplx/logits:2.1351776 loss:2.1351776 loss/logits:2.1351776 num_samples_in_batch:128 var_norm/all:704.06158
I0420 07:43:22.593677 140375134025472 trainer.py:511] time: 8.070737
I0420 07:43:22.595352 140375134025472 trainer.py:522] step: 91 fraction_of_correct_next_step_preds:0.37510994 fraction_of_correct_next_step_preds/logits:0.37510994 grad_norm/all:0.89258677 grad_scale_all:1 log_pplx:2.1301734 log_pplx/logits:2.1301734 loss:2.1301734 loss/logits:2.1301734 num_samples_in_batch:128 var_norm/all:704.05121
I0420 07:43:24.044142 140375142418176 trainer.py:371] Steps/second: 0.127811, Examples/second: 17.258747
I0420 07:43:30.330538 140375134025472 trainer.py:511] time: 7.734857
I0420 07:43:30.331336 140375134025472 trainer.py:522] step: 92 fraction_of_correct_next_step_preds:0.37568703 fraction_of_correct_next_step_preds/logits:0.37568703 grad_norm/all:1.012671 grad_scale_all:0.98748755 log_pplx:2.1166055 log_pplx/logits:2.1166055 loss:2.1166055 loss/logits:2.1166055 num_samples_in_batch:128 var_norm/all:704.04065
I0420 07:43:34.055632 140375142418176 trainer.py:371] Steps/second: 0.127424, Examples/second: 17.196722
I0420 07:43:39.545340 140375134025472 trainer.py:511] time: 9.213729
I0420 07:43:39.546376 140375134025472 trainer.py:522] step: 93 fraction_of_correct_next_step_preds:0.37040973 fraction_of_correct_next_step_preds/logits:0.37040973 grad_norm/all:1.307186 grad_scale_all:0.76500207 log_pplx:2.1472313 log_pplx/logits:2.1472313 loss:2.1472313 loss/logits:2.1472313 num_samples_in_batch:128 var_norm/all:704.02991
I0420 07:43:44.059933 140375142418176 trainer.py:371] Steps/second: 0.127049, Examples/second: 17.136553
I0420 07:43:46.515561 140375134025472 trainer.py:511] time: 6.968886
I0420 07:43:46.517144 140375134025472 trainer.py:522] step: 94 fraction_of_correct_next_step_preds:0.37518752 fraction_of_correct_next_step_preds/logits:0.37518752 grad_norm/all:1.1413982 grad_scale_all:0.87611842 log_pplx:2.1311126 log_pplx/logits:2.1311126 loss:2.1311126 loss/logits:2.1311126 num_samples_in_batch:128 var_norm/all:704.01935
I0420 07:43:52.326225 140375134025472 trainer.py:511] time: 5.808762
I0420 07:43:52.327689 140375134025472 trainer.py:522] step: 95 fraction_of_correct_next_step_preds:0.38357881 fraction_of_correct_next_step_preds/logits:0.38357881 grad_norm/all:0.84444368 grad_scale_all:1 log_pplx:2.1159191 log_pplx/logits:2.1159191 loss:2.1159191 loss/logits:2.1159191 num_samples_in_batch:128 var_norm/all:704.00879
I0420 07:43:54.059287 140375142418176 trainer.py:371] Steps/second: 0.128032, Examples/second: 17.250629
I0420 07:44:01.040597 140375134025472 trainer.py:511] time: 8.689469
I0420 07:44:01.041412 140375134025472 trainer.py:522] step: 96 fraction_of_correct_next_step_preds:0.37989101 fraction_of_correct_next_step_preds/logits:0.37989101 grad_norm/all:0.82067651 grad_scale_all:1 log_pplx:2.1037724 log_pplx/logits:2.1037724 loss:2.1037724 loss/logits:2.1037724 num_samples_in_batch:128 var_norm/all:703.99811
I0420 07:44:04.069787 140375142418176 trainer.py:371] Steps/second: 0.127657, Examples/second: 17.361416
I0420 07:44:05.287786 140375134025472 trainer.py:511] time: 4.245918
I0420 07:44:05.289007 140375134025472 trainer.py:522] step: 97 fraction_of_correct_next_step_preds:0.36707532 fraction_of_correct_next_step_preds/logits:0.36707532 grad_norm/all:1.0457711 grad_scale_all:0.95623219 log_pplx:2.1651423 log_pplx/logits:2.1651423 loss:2.1651423 loss/logits:2.1651423 num_samples_in_batch:256 var_norm/all:703.98724
I0420 07:44:12.534790 140375134025472 trainer.py:511] time: 7.245540
I0420 07:44:12.536067 140375134025472 trainer.py:522] step: 98 fraction_of_correct_next_step_preds:0.37593955 fraction_of_correct_next_step_preds/logits:0.37593955 grad_norm/all:1.6557065 grad_scale_all:0.60397178 log_pplx:2.1272745 log_pplx/logits:2.1272745 loss:2.1272745 loss/logits:2.1272745 num_samples_in_batch:128 var_norm/all:703.97632
I0420 07:44:14.079885 140375142418176 trainer.py:371] Steps/second: 0.128605, Examples/second: 17.469302
I0420 07:44:20.448647 140375134025472 trainer.py:511] time: 7.912352
I0420 07:44:20.449891 140375134025472 trainer.py:522] step: 99 fraction_of_correct_next_step_preds:0.38830405 fraction_of_correct_next_step_preds/logits:0.38830405 grad_norm/all:0.66534883 grad_scale_all:1 log_pplx:2.0848999 log_pplx/logits:2.0848999 loss:2.0848999 loss/logits:2.0848999 num_samples_in_batch:128 var_norm/all:703.96582
I0420 07:44:24.089390 140375142418176 trainer.py:371] Steps/second: 0.128233, Examples/second: 17.408603
I0420 07:44:29.597444 140375134025472 trainer.py:511] time: 9.147288
I0420 07:44:29.598999 140375134025472 trainer.py:522] step: 100 fraction_of_correct_next_step_preds:0.37406558 fraction_of_correct_next_step_preds/logits:0.37406558 grad_norm/all:1.759958 grad_scale_all:0.56819534 log_pplx:2.1023078 log_pplx/logits:2.1023078 loss:2.1023078 loss/logits:2.1023078 num_samples_in_batch:128 var_norm/all:703.9552
I0420 07:44:34.098342 140375142418176 trainer.py:371] Steps/second: 0.127871, Examples/second: 17.349474
I0420 07:44:37.641819 140375134025472 trainer.py:511] time: 8.042531
I0420 07:44:37.643115 140375134025472 base_runner.py:115] step: 101 fraction_of_correct_next_step_preds:0.38138181 fraction_of_correct_next_step_preds/logits:0.38138181 grad_norm/all:1.0481714 grad_scale_all:0.95404243 log_pplx:2.1053541 log_pplx/logits:2.1053541 loss:2.1053541 loss/logits:2.1053541 num_samples_in_batch:128 var_norm/all:703.94501
I0420 07:44:44.107460 140375142418176 trainer.py:371] Steps/second: 0.127517, Examples/second: 17.291835
I0420 07:44:44.108102 140375142418176 trainer.py:275] Write summary @101
2019-04-20 07:44:44.115563: I lingvo/core/ops/record_batcher.cc:344] 775 total seconds passed. Total records yielded: 264. Total records skipped: 0
I0420 07:44:44.722656 140375134025472 trainer.py:511] time: 7.079163
I0420 07:44:44.724056 140375134025472 trainer.py:522] step: 102 fraction_of_correct_next_step_preds:0.38316277 fraction_of_correct_next_step_preds/logits:0.38316277 grad_norm/all:1.0969374 grad_scale_all:0.91162902 log_pplx:2.0958667 log_pplx/logits:2.0958667 loss:2.0958667 loss/logits:2.0958667 num_samples_in_batch:128 var_norm/all:703.93457
I0420 07:44:55.762762 140375134025472 trainer.py:511] time: 11.038428
I0420 07:44:55.776536 140375134025472 trainer.py:522] step: 103 fraction_of_correct_next_step_preds:0.38022271 fraction_of_correct_next_step_preds/logits:0.38022271 grad_norm/all:1.1487237 grad_scale_all:0.87053132 log_pplx:2.0835736 log_pplx/logits:2.0835736 loss:2.0835736 loss/logits:2.0835736 num_samples_in_batch:128 var_norm/all:703.92407
I0420 07:45:03.670341 140375134025472 trainer.py:511] time: 7.891746
I0420 07:45:03.671505 140375134025472 trainer.py:522] step: 104 fraction_of_correct_next_step_preds:0.38704854 fraction_of_correct_next_step_preds/logits:0.38704854 grad_norm/all:0.81093085 grad_scale_all:1 log_pplx:2.0827632 log_pplx/logits:2.0827632 loss:2.0827632 loss/logits:2.0827632 num_samples_in_batch:128 var_norm/all:703.91351
I0420 07:45:14.271611 140375134025472 trainer.py:511] time: 10.599824
I0420 07:45:14.273308 140375134025472 trainer.py:522] step: 105 fraction_of_correct_next_step_preds:0.38036564 fraction_of_correct_next_step_preds/logits:0.38036564 grad_norm/all:1.2385389 grad_scale_all:0.80740303 log_pplx:2.0951517 log_pplx/logits:2.0951517 loss:2.0951517 loss/logits:2.0951517 num_samples_in_batch:128 var_norm/all:703.90283
I0420 07:45:27.701951 140375134025472 trainer.py:511] time: 13.428311
I0420 07:45:27.704013 140375134025472 trainer.py:522] step: 106 fraction_of_correct_next_step_preds:0.38831434 fraction_of_correct_next_step_preds/logits:0.38831434 grad_norm/all:0.8401624 grad_scale_all:1 log_pplx:2.0681973 log_pplx/logits:2.0681973 loss:2.0681973 loss/logits:2.0681973 num_samples_in_batch:128 var_norm/all:703.89215
I0420 07:45:40.065790 140375134025472 trainer.py:511] time: 12.361295
I0420 07:45:40.068871 140375134025472 trainer.py:522] step: 107 fraction_of_correct_next_step_preds:0.39199519 fraction_of_correct_next_step_preds/logits:0.39199519 grad_norm/all:0.82031536 grad_scale_all:1 log_pplx:2.0627809 log_pplx/logits:2.0627809 loss:2.0627809 loss/logits:2.0627809 num_samples_in_batch:128 var_norm/all:703.88129
I0420 07:45:51.887306 140375134025472 trainer.py:511] time: 11.818003
I0420 07:45:51.889163 140375134025472 trainer.py:522] step: 108 fraction_of_correct_next_step_preds:0.39280939 fraction_of_correct_next_step_preds/logits:0.39280939 grad_norm/all:1.099699 grad_scale_all:0.90933973 log_pplx:2.0636985 log_pplx/logits:2.0636985 loss:2.0636985 loss/logits:2.0636985 num_samples_in_batch:128 var_norm/all:703.87018
I0420 07:45:55.701033 140375142418176 trainer.py:284] Write summary done: step 101
I0420 07:45:55.714133 140375142418176 base_runner.py:115] step: 101, steps/sec: 0.13, examples/sec: 17.29
I0420 07:45:55.718148 140375142418176 trainer.py:371] Steps/second: 0.125049, Examples/second: 16.895521
I0420 07:45:59.875958 140375134025472 trainer.py:511] time: 7.985847
I0420 07:45:59.877171 140375134025472 trainer.py:522] step: 109 fraction_of_correct_next_step_preds:0.38182643 fraction_of_correct_next_step_preds/logits:0.38182643 grad_norm/all:1.3109812 grad_scale_all:0.76278746 log_pplx:2.0635197 log_pplx/logits:2.0635197 loss:2.0635197 loss/logits:2.0635197 num_samples_in_batch:128 var_norm/all:703.85907
I0420 07:46:05.727878 140375142418176 trainer.py:371] Steps/second: 0.124761, Examples/second: 16.848456
I0420 07:46:08.144885 140375134025472 trainer.py:511] time: 8.267169
I0420 07:46:08.146713 140375134025472 trainer.py:522] step: 110 fraction_of_correct_next_step_preds:0.39472944 fraction_of_correct_next_step_preds/logits:0.39472944 grad_norm/all:1.1448041 grad_scale_all:0.87351191 log_pplx:2.0443304 log_pplx/logits:2.0443304 loss:2.0443304 loss/logits:2.0443304 num_samples_in_batch:128 var_norm/all:703.84808
I0420 07:46:13.943571 140375134025472 trainer.py:511] time: 5.796645
I0420 07:46:13.944745 140375134025472 trainer.py:522] step: 111 fraction_of_correct_next_step_preds:0.39036024 fraction_of_correct_next_step_preds/logits:0.39036024 grad_norm/all:0.98204678 grad_scale_all:1 log_pplx:2.0808325 log_pplx/logits:2.0808325 loss:2.0808325 loss/logits:2.0808325 num_samples_in_batch:128 var_norm/all:703.8371
I0420 07:46:15.740056 140375142418176 trainer.py:371] Steps/second: 0.125611, Examples/second: 17.092108
I0420 07:46:18.146146 140375134025472 trainer.py:511] time: 4.201175
I0420 07:46:18.147413 140375134025472 trainer.py:522] step: 112 fraction_of_correct_next_step_preds:0.38091317 fraction_of_correct_next_step_preds/logits:0.38091317 grad_norm/all:0.8844955 grad_scale_all:1 log_pplx:2.1076317 log_pplx/logits:2.1076317 loss:2.1076317 loss/logits:2.1076317 num_samples_in_batch:256 var_norm/all:703.82587
I0420 07:46:25.491939 140375134025472 trainer.py:511] time: 7.344290
I0420 07:46:25.492826 140375134025472 trainer.py:522] step: 113 fraction_of_correct_next_step_preds:0.40295595 fraction_of_correct_next_step_preds/logits:0.40295595 grad_norm/all:0.7746343 grad_scale_all:1 log_pplx:2.0239127 log_pplx/logits:2.0239127 loss:2.0239127 loss/logits:2.0239127 num_samples_in_batch:128 var_norm/all:703.81451
I0420 07:46:25.765045 140375142418176 trainer.py:371] Steps/second: 0.126440, Examples/second: 17.186828
I0420 07:46:34.654975 140375134025472 trainer.py:511] time: 9.161685
I0420 07:46:34.656054 140375134025472 trainer.py:522] step: 114 fraction_of_correct_next_step_preds:0.40458798 fraction_of_correct_next_step_preds/logits:0.40458798 grad_norm/all:0.77485955 grad_scale_all:1 log_pplx:2.0200136 log_pplx/logits:2.0200136 loss:2.0200136 loss/logits:2.0200136 num_samples_in_batch:128 var_norm/all:703.80304
I0420 07:46:35.756966 140375142418176 trainer.py:371] Steps/second: 0.126148, Examples/second: 17.138438
I0420 07:46:42.850702 140375134025472 trainer.py:511] time: 8.194245
I0420 07:46:42.852169 140375134025472 trainer.py:522] step: 115 fraction_of_correct_next_step_preds:0.40469682 fraction_of_correct_next_step_preds/logits:0.40469682 grad_norm/all:1.019146 grad_scale_all:0.98121369 log_pplx:2.0309305 log_pplx/logits:2.0309305 loss:2.0309305 loss/logits:2.0309305 num_samples_in_batch:128 var_norm/all:703.79138
I0420 07:46:45.768280 140375142418176 trainer.py:371] Steps/second: 0.125860, Examples/second: 17.090745
I0420 07:46:50.728563 140375134025472 trainer.py:511] time: 7.876101
I0420 07:46:50.729914 140375134025472 trainer.py:522] step: 116 fraction_of_correct_next_step_preds:0.3940883 fraction_of_correct_next_step_preds/logits:0.3940883 grad_norm/all:1.2589619 grad_scale_all:0.79430521 log_pplx:2.0252271 log_pplx/logits:2.0252271 loss:2.0252271 loss/logits:2.0252271 num_samples_in_batch:128 var_norm/all:703.77966
2019-04-20 07:46:50.734156: I lingvo/core/ops/record_batcher.cc:344] 925 total seconds passed. Total records yielded: 16765. Total records skipped: 11
2019-04-20 07:46:50.734335: I lingvo/core/ops/record_batcher.cc:349] Out-of-range sample: 1726
2019-04-20 07:46:50.734376: I lingvo/core/ops/record_batcher.cc:349] Out-of-range sample: 1729
2019-04-20 07:46:50.734412: I lingvo/core/ops/record_batcher.cc:349] Out-of-range sample: 1966
I0420 07:46:55.777012 140375142418176 trainer.py:371] Steps/second: 0.125579, Examples/second: 17.044132
I0420 07:46:57.789110 140375134025472 trainer.py:511] time: 7.058867
I0420 07:46:57.790066 140375134025472 trainer.py:522] step: 117 fraction_of_correct_next_step_preds:0.40091401 fraction_of_correct_next_step_preds/logits:0.40091401 grad_norm/all:0.99531782 grad_scale_all:1 log_pplx:2.0293548 log_pplx/logits:2.0293548 loss:2.0293548 loss/logits:2.0293548 num_samples_in_batch:128 var_norm/all:703.76801
I0420 07:47:03.618060 140375134025472 trainer.py:511] time: 5.827690
I0420 07:47:03.619812 140375134025472 trainer.py:522] step: 118 fraction_of_correct_next_step_preds:0.3924461 fraction_of_correct_next_step_preds/logits:0.3924461 grad_norm/all:0.97392869 grad_scale_all:1 log_pplx:2.0520797 log_pplx/logits:2.0520797 loss:2.0520797 loss/logits:2.0520797 num_samples_in_batch:128 var_norm/all:703.75629
I0420 07:47:05.787250 140375142418176 trainer.py:371] Steps/second: 0.126375, Examples/second: 17.135577
I0420 07:47:12.191804 140375134025472 trainer.py:511] time: 8.571755
I0420 07:47:12.193002 140375134025472 trainer.py:522] step: 119 fraction_of_correct_next_step_preds:0.39932805 fraction_of_correct_next_step_preds/logits:0.39932805 grad_norm/all:1.4277698 grad_scale_all:0.70039302 log_pplx:2.0204892 log_pplx/logits:2.0204892 loss:2.0204892 loss/logits:2.0204892 num_samples_in_batch:128 var_norm/all:703.74438
I0420 07:47:15.798336 140375142418176 trainer.py:371] Steps/second: 0.126094, Examples/second: 17.089437
I0420 07:47:19.713876 140375134025472 trainer.py:511] time: 7.520668
I0420 07:47:19.714983 140375134025472 trainer.py:522] step: 120 fraction_of_correct_next_step_preds:0.4065825 fraction_of_correct_next_step_preds/logits:0.4065825 grad_norm/all:1.0304936 grad_scale_all:0.97040874 log_pplx:2.01543 log_pplx/logits:2.01543 loss:2.01543 loss/logits:2.01543 num_samples_in_batch:128 var_norm/all:703.73279
I0420 07:47:25.808001 140375142418176 trainer.py:371] Steps/second: 0.125819, Examples/second: 17.044289
I0420 07:47:28.746090 140375134025472 trainer.py:511] time: 9.030880
I0420 07:47:28.747275 140375134025472 trainer.py:522] step: 121 fraction_of_correct_next_step_preds:0.40240431 fraction_of_correct_next_step_preds/logits:0.40240431 grad_norm/all:0.93115848 grad_scale_all:1 log_pplx:1.9913671 log_pplx/logits:1.9913671 loss:1.9913671 loss/logits:1.9913671 num_samples_in_batch:128 var_norm/all:703.72107
I0420 07:47:35.818212 140375142418176 trainer.py:371] Steps/second: 0.125550, Examples/second: 17.000070
I0420 07:47:36.670398 140375134025472 trainer.py:511] time: 7.922841
I0420 07:47:36.671638 140375134025472 trainer.py:522] step: 122 fraction_of_correct_next_step_preds:0.39683267 fraction_of_correct_next_step_preds/logits:0.39683267 grad_norm/all:0.82973373 grad_scale_all:1 log_pplx:2.0151408 log_pplx/logits:2.0151408 loss:2.0151408 loss/logits:2.0151408 num_samples_in_batch:128 var_norm/all:703.70923
I0420 07:47:44.871414 140375134025472 trainer.py:511] time: 8.199487
I0420 07:47:44.872231 140375134025472 trainer.py:522] step: 123 fraction_of_correct_next_step_preds:0.40840337 fraction_of_correct_next_step_preds/logits:0.40840337 grad_norm/all:1.4440463 grad_scale_all:0.69249856 log_pplx:1.9989125 log_pplx/logits:1.9989125 loss:1.9989125 loss/logits:1.9989125 num_samples_in_batch:128 var_norm/all:703.69714
I0420 07:47:45.822077 140375142418176 trainer.py:371] Steps/second: 0.126314, Examples/second: 17.088316
I0420 07:47:51.782823 140375134025472 trainer.py:511] time: 6.910181
I0420 07:47:51.784365 140375134025472 trainer.py:522] step: 124 fraction_of_correct_next_step_preds:0.39865616 fraction_of_correct_next_step_preds/logits:0.39865616 grad_norm/all:0.97169137 grad_scale_all:1 log_pplx:2.017369 log_pplx/logits:2.017369 loss:2.017369 loss/logits:2.017369 num_samples_in_batch:128 var_norm/all:703.68542
I0420 07:47:55.832262 140375142418176 trainer.py:371] Steps/second: 0.126045, Examples/second: 17.044550
I0420 07:48:00.231045 140375134025472 trainer.py:511] time: 8.446399
I0420 07:48:00.232076 140375134025472 trainer.py:522] step: 125 fraction_of_correct_next_step_preds:0.41026807 fraction_of_correct_next_step_preds/logits:0.41026807 grad_norm/all:1.2182258 grad_scale_all:0.82086587 log_pplx:1.9927412 log_pplx/logits:1.9927412 loss:1.9927412 loss/logits:1.9927412 num_samples_in_batch:128 var_norm/all:703.67346
I0420 07:48:05.842519 140375142418176 trainer.py:371] Steps/second: 0.125782, Examples/second: 17.001663
I0420 07:48:09.408751 140375134025472 trainer.py:511] time: 9.176420
I0420 07:48:09.410067 140375134025472 trainer.py:522] step: 126 fraction_of_correct_next_step_preds:0.40561908 fraction_of_correct_next_step_preds/logits:0.40561908 grad_norm/all:1.1187987 grad_scale_all:0.89381582 log_pplx:2.000335 log_pplx/logits:2.000335 loss:2.000335 loss/logits:2.000335 num_samples_in_batch:128 var_norm/all:703.66162
I0420 07:48:15.108469 140375134025472 trainer.py:511] time: 5.698159
I0420 07:48:15.109544 140375134025472 trainer.py:522] step: 127 fraction_of_correct_next_step_preds:0.413378 fraction_of_correct_next_step_preds/logits:0.413378 grad_norm/all:0.75346535 grad_scale_all:1 log_pplx:1.9956678 log_pplx/logits:1.9956678 loss:1.9956678 loss/logits:1.9956678 num_samples_in_batch:128 var_norm/all:703.64972
I0420 07:48:15.850579 140375142418176 trainer.py:371] Steps/second: 0.126520, Examples/second: 17.087184
I0420 07:48:22.503793 140375134025472 trainer.py:511] time: 7.393976
I0420 07:48:22.505626 140375134025472 trainer.py:522] step: 128 fraction_of_correct_next_step_preds:0.40938216 fraction_of_correct_next_step_preds/logits:0.40938216 grad_norm/all:0.88790184 grad_scale_all:1 log_pplx:1.9819201 log_pplx/logits:1.9819201 loss:1.9819201 loss/logits:1.9819201 num_samples_in_batch:128 var_norm/all:703.6377
I0420 07:48:25.861005 140375142418176 trainer.py:371] Steps/second: 0.126257, Examples/second: 17.170978
I0420 07:48:26.824466 140375134025472 trainer.py:511] time: 4.318640
I0420 07:48:26.825557 140375134025472 trainer.py:522] step: 129 fraction_of_correct_next_step_preds:0.4062759 fraction_of_correct_next_step_preds/logits:0.4062759 grad_norm/all:0.69528306 grad_scale_all:1 log_pplx:2.0344479 log_pplx/logits:2.0344479 loss:2.0344479 loss/logits:2.0344479 num_samples_in_batch:256 var_norm/all:703.62555
I0420 07:48:34.608892 140375134025472 trainer.py:511] time: 7.783073
I0420 07:48:34.610289 140375134025472 trainer.py:522] step: 130 fraction_of_correct_next_step_preds:0.4116973 fraction_of_correct_next_step_preds/logits:0.4116973 grad_norm/all:0.97755039 grad_scale_all:1 log_pplx:1.9893242 log_pplx/logits:1.9893242 loss:1.9893242 loss/logits:1.9893242 num_samples_in_batch:128 var_norm/all:703.61316
I0420 07:48:35.871114 140375142418176 trainer.py:371] Steps/second: 0.126976, Examples/second: 17.253137
I0420 07:48:42.722692 140375134025472 trainer.py:511] time: 8.112124
I0420 07:48:42.724195 140375134025472 trainer.py:522] step: 131 fraction_of_correct_next_step_preds:0.4181805 fraction_of_correct_next_step_preds/logits:0.4181805 grad_norm/all:0.60749316 grad_scale_all:1 log_pplx:1.9431043 log_pplx/logits:1.9431043 loss:1.9431043 loss/logits:1.9431043 num_samples_in_batch:128 var_norm/all:703.60071
I0420 07:48:45.880964 140375142418176 trainer.py:371] Steps/second: 0.126714, Examples/second: 17.209899
I0420 07:48:49.451812 140375134025472 trainer.py:511] time: 6.727396
I0420 07:48:49.452764 140375134025472 trainer.py:522] step: 132 fraction_of_correct_next_step_preds:0.41381198 fraction_of_correct_next_step_preds/logits:0.41381198 grad_norm/all:0.83036727 grad_scale_all:1 log_pplx:1.9733791 log_pplx/logits:1.9733791 loss:1.9733791 loss/logits:1.9733791 num_samples_in_batch:128 var_norm/all:703.58807
I0420 07:48:55.892543 140375142418176 trainer.py:371] Steps/second: 0.126457, Examples/second: 17.167462
I0420 07:48:57.872793 140375134025472 trainer.py:511] time: 8.419803
I0420 07:48:57.873647 140375134025472 trainer.py:522] step: 133 fraction_of_correct_next_step_preds:0.41793913 fraction_of_correct_next_step_preds/logits:0.41793913 grad_norm/all:1.3502294 grad_scale_all:0.74061489 log_pplx:1.9688462 log_pplx/logits:1.9688462 loss:1.9688462 loss/logits:1.9688462 num_samples_in_batch:128 var_norm/all:703.57538
I0420 07:49:05.900649 140375142418176 trainer.py:371] Steps/second: 0.126205, Examples/second: 17.125886
I0420 07:49:06.834827 140375134025472 trainer.py:511] time: 8.960698
I0420 07:49:06.835812 140375134025472 trainer.py:522] step: 134 fraction_of_correct_next_step_preds:0.40741351 fraction_of_correct_next_step_preds/logits:0.40741351 grad_norm/all:1.2382512 grad_scale_all:0.80759054 log_pplx:1.9809607 log_pplx/logits:1.9809607 loss:1.9809607 loss/logits:1.9809607 num_samples_in_batch:128 var_norm/all:703.56293
I0420 07:49:14.305708 140375134025472 trainer.py:511] time: 7.469594
I0420 07:49:14.307034 140375134025472 trainer.py:522] step: 135 fraction_of_correct_next_step_preds:0.41747892 fraction_of_correct_next_step_preds/logits:0.41747892 grad_norm/all:0.66647661 grad_scale_all:1 log_pplx:1.9676461 log_pplx/logits:1.9676461 loss:1.9676461 loss/logits:1.9676461 num_samples_in_batch:128 var_norm/all:703.55054
I0420 07:49:15.900914 140375142418176 trainer.py:371] Steps/second: 0.126898, Examples/second: 17.205537
I0420 07:49:22.252697 140375134025472 trainer.py:511] time: 7.945407
I0420 07:49:22.254210 140375134025472 trainer.py:522] step: 136 fraction_of_correct_next_step_preds:0.41611537 fraction_of_correct_next_step_preds/logits:0.41611537 grad_norm/all:0.70107651 grad_scale_all:1 log_pplx:1.9448195 log_pplx/logits:1.9448195 loss:1.9448195 loss/logits:1.9448195 num_samples_in_batch:128 var_norm/all:703.53796
I0420 07:49:25.910279 140375142418176 trainer.py:371] Steps/second: 0.126647, Examples/second: 17.164361
I0420 07:49:28.031122 140375134025472 trainer.py:511] time: 5.776590
I0420 07:49:28.032166 140375134025472 trainer.py:522] step: 137 fraction_of_correct_next_step_preds:0.41829944 fraction_of_correct_next_step_preds/logits:0.41829944 grad_norm/all:0.84144682 grad_scale_all:1 log_pplx:1.9547106 log_pplx/logits:1.9547106 loss:1.9547106 loss/logits:1.9547106 num_samples_in_batch:128 var_norm/all:703.52521
I0420 07:49:35.922549 140375142418176 trainer.py:371] Steps/second: 0.126400, Examples/second: 17.123903
I0420 07:49:36.090886 140375134025472 trainer.py:511] time: 8.058493
I0420 07:49:36.092125 140375134025472 trainer.py:522] step: 138 fraction_of_correct_next_step_preds:0.41517875 fraction_of_correct_next_step_preds/logits:0.41517875 grad_norm/all:1.4478648 grad_scale_all:0.69067222 log_pplx:1.9582996 log_pplx/logits:1.9582996 loss:1.9582996 loss/logits:1.9582996 num_samples_in_batch:128 var_norm/all:703.51233
I0420 07:49:42.946846 140375134025472 trainer.py:511] time: 6.854485
I0420 07:49:42.947705 140375134025472 trainer.py:522] step: 139 fraction_of_correct_next_step_preds:0.41506374 fraction_of_correct_next_step_preds/logits:0.41506374 grad_norm/all:0.63443571 grad_scale_all:1 log_pplx:1.9501776 log_pplx/logits:1.9501776 loss:1.9501776 loss/logits:1.9501776 num_samples_in_batch:128 var_norm/all:703.49976
I0420 07:49:45.931711 140375142418176 trainer.py:371] Steps/second: 0.127071, Examples/second: 17.201245
I0420 07:49:51.288666 140375134025472 trainer.py:511] time: 8.340743
I0420 07:49:51.289536 140375134025472 trainer.py:522] step: 140 fraction_of_correct_next_step_preds:0.4142887 fraction_of_correct_next_step_preds/logits:0.4142887 grad_norm/all:1.1109182 grad_scale_all:0.90015632 log_pplx:1.9538394 log_pplx/logits:1.9538394 loss:1.9538394 loss/logits:1.9538394 num_samples_in_batch:128
var_norm/all:703.487 I0420 07:49:55.941807 140375142418176 trainer.py:371] Steps/second: 0.126825, Examples/second: 17.161219 I0420 07:50:00.419210 140375134025472 trainer.py:511] time: 9.104871 I0420 07:50:00.420401 140375134025472 trainer.py:522] step: 141 fraction_of_correct_next_step_preds:0.4269422 fraction_of_correct_next_step_preds/logits:0.4269422 grad_norm/all:1.5170504 grad_scale_all:0.65917391 log_pplx:1.9341648 log_pplx/logits:1.9341648 loss:1.9341648 loss/logits:1.9341648 num_samples_in_batch:128 var_norm/all:703.4743 I0420 07:50:05.952049 140375142418176 trainer.py:371] Steps/second: 0.126583, Examples/second: 17.121908 I0420 07:50:08.352792 140375134025472 trainer.py:511] time: 7.932153 I0420 07:50:08.353964 140375134025472 trainer.py:522] step: 142 fraction_of_correct_next_step_preds:0.41864634 fraction_of_correct_next_step_preds/logits:0.41864634 grad_norm/all:0.79958361 grad_scale_all:1 log_pplx:1.9402517 log_pplx/logits:1.9402517 loss:1.9402517 loss/logits:1.9402517 num_samples_in_batch:128 var_norm/all:703.46191 I0420 07:50:15.772907 140375134025472 trainer.py:511] time: 7.418736 I0420 07:50:15.774157 140375134025472 trainer.py:522] step: 143 fraction_of_correct_next_step_preds:0.42049187 fraction_of_correct_next_step_preds/logits:0.42049187 grad_norm/all:1.1368365 grad_scale_all:0.87963396 log_pplx:1.9241009 log_pplx/logits:1.9241009 loss:1.9241009 loss/logits:1.9241009 num_samples_in_batch:128 var_norm/all:703.44934 I0420 07:50:16.017703 140375142418176 trainer.py:371] Steps/second: 0.127229, Examples/second: 17.196336 I0420 07:50:21.387326 140375134025472 trainer.py:511] time: 5.612883 I0420 07:50:21.390589 140375134025472 trainer.py:522] step: 144 fraction_of_correct_next_step_preds:0.41908437 fraction_of_correct_next_step_preds/logits:0.41908437 grad_norm/all:1.8576202 grad_scale_all:0.53832316 log_pplx:1.9595932 log_pplx/logits:1.9595932 loss:1.9595932 loss/logits:1.9595932 num_samples_in_batch:128 var_norm/all:703.43665 I0420 
07:50:25.971918 140375142418176 trainer.py:371] Steps/second: 0.126994, Examples/second: 17.158259 I0420 07:50:29.444952 140375134025472 trainer.py:511] time: 8.053926 I0420 07:50:29.446449 140375134025472 trainer.py:522] step: 145 fraction_of_correct_next_step_preds:0.42030919 fraction_of_correct_next_step_preds/logits:0.42030919 grad_norm/all:0.76838368 grad_scale_all:1 log_pplx:1.9421583 log_pplx/logits:1.9421583 loss:1.9421583 loss/logits:1.9421583 num_samples_in_batch:128 var_norm/all:703.42462 I0420 07:50:35.969763 140375142418176 trainer.py:371] Steps/second: 0.126758, Examples/second: 17.120192 I0420 07:50:36.290680 140375134025472 trainer.py:511] time: 6.844025 I0420 07:50:36.291791 140375134025472 trainer.py:522] step: 146 fraction_of_correct_next_step_preds:0.40135837 fraction_of_correct_next_step_preds/logits:0.40135837 grad_norm/all:2.3181317 grad_scale_all:0.43138188 log_pplx:1.9873381 log_pplx/logits:1.9873381 loss:1.9873381 loss/logits:1.9873381 num_samples_in_batch:128 var_norm/all:703.41223 I0420 07:50:40.472434 140375134025472 trainer.py:511] time: 4.180376 I0420 07:50:40.473872 140375134025472 trainer.py:522] step: 147 fraction_of_correct_next_step_preds:0.39410341 fraction_of_correct_next_step_preds/logits:0.39410341 grad_norm/all:1.6320084 grad_scale_all:0.61274195 log_pplx:2.002454 log_pplx/logits:2.002454 loss:2.002454 loss/logits:2.002454 num_samples_in_batch:256 var_norm/all:703.40051 I0420 07:50:45.980564 140375142418176 trainer.py:371] Steps/second: 0.127391, Examples/second: 17.304445 I0420 07:50:48.851433 140375134025472 trainer.py:511] time: 8.377389 I0420 07:50:48.852555 140375134025472 trainer.py:522] step: 148 fraction_of_correct_next_step_preds:0.41687503 fraction_of_correct_next_step_preds/logits:0.41687503 grad_norm/all:1.9557862 grad_scale_all:0.51130331 log_pplx:1.9490486 log_pplx/logits:1.9490486 loss:1.9490486 loss/logits:1.9490486 num_samples_in_batch:128 var_norm/all:703.3891 I0420 07:50:55.991750 140375142418176 
trainer.py:371] Steps/second: 0.127155, Examples/second: 17.265579 I0420 07:50:57.909780 140375134025472 trainer.py:511] time: 9.057049 I0420 07:50:57.910986 140375134025472 trainer.py:522] step: 149 fraction_of_correct_next_step_preds:0.42624637 fraction_of_correct_next_step_preds/logits:0.42624637 grad_norm/all:1.7976826 grad_scale_all:0.55627173 log_pplx:1.9306651 log_pplx/logits:1.9306651 loss:1.9306651 loss/logits:1.9306651 num_samples_in_batch:128 var_norm/all:703.37805 I0420 07:51:05.706504 140375134025472 trainer.py:511] time: 7.795318 I0420 07:51:05.707679 140375134025472 trainer.py:522] step: 150 fraction_of_correct_next_step_preds:0.41418889 fraction_of_correct_next_step_preds/logits:0.41418889 grad_norm/all:1.5489297 grad_scale_all:0.64560711 log_pplx:1.9437562 log_pplx/logits:1.9437562 loss:1.9437562 loss/logits:1.9437562 num_samples_in_batch:128 var_norm/all:703.36743 I0420 07:51:06.000524 140375142418176 trainer.py:371] Steps/second: 0.127774, Examples/second: 17.336444 I0420 07:51:06.001041 140375142418176 trainer.py:268] Save checkpoint W0420 07:51:08.298032 140375142418176 meta_graph.py:447] Issue encountered when serializing __batch_norm_update_dict. Type is unsupported, or the types of the items don't match field type in CollectionDef. Note this is a warning and probably safe to ignore. 'dict' object has no attribute 'name' W0420 07:51:08.298470 140375142418176 meta_graph.py:447] Issue encountered when serializing __model_split_id_stack. Type is unsupported, or the types of the items don't match field type in CollectionDef. Note this is a warning and probably safe to ignore. 
'list' object has no attribute 'name' I0420 07:51:08.496934 140375142418176 trainer.py:270] Save checkpoint done: /data/dingzhenyou/speech_data/librispeech/log/train/ckpt-00000150 I0420 07:51:12.993555 140375134025472 trainer.py:511] time: 7.285614 I0420 07:51:12.994955 140375134025472 trainer.py:522] step: 151 fraction_of_correct_next_step_preds:0.41762453 fraction_of_correct_next_step_preds/logits:0.41762453 grad_norm/all:1.4242008 grad_scale_all:0.7021482 log_pplx:1.9302332 log_pplx/logits:1.9302332 loss:1.9302332 loss/logits:1.9302332 num_samples_in_batch:128 var_norm/all:703.35681 I0420 07:51:16.007457 140375142418176 trainer.py:371] Steps/second: 0.127539, Examples/second: 17.298026 I0420 07:51:18.698502 140375134025472 trainer.py:511] time: 5.703286 I0420 07:51:18.699486 140375134025472 trainer.py:522] step: 152 fraction_of_correct_next_step_preds:0.42245334 fraction_of_correct_next_step_preds/logits:0.42245334 grad_norm/all:1.4734817 grad_scale_all:0.67866468 log_pplx:1.9470966 log_pplx/logits:1.9470966 loss:1.9470966 loss/logits:1.9470966 num_samples_in_batch:128 var_norm/all:703.34631 I0420 07:51:26.018496 140375142418176 trainer.py:371] Steps/second: 0.127307, Examples/second: 17.260195 I0420 07:51:26.805413 140375134025472 trainer.py:511] time: 8.105734 I0420 07:51:26.806492 140375134025472 trainer.py:522] step: 153 fraction_of_correct_next_step_preds:0.41691357 fraction_of_correct_next_step_preds/logits:0.41691357 grad_norm/all:1.4528413 grad_scale_all:0.68830645 log_pplx:1.9422051 log_pplx/logits:1.9422051 loss:1.9422051 loss/logits:1.9422051 num_samples_in_batch:128 var_norm/all:703.33582 I0420 07:51:33.581444 140375134025472 trainer.py:511] time: 6.774749 I0420 07:51:33.582201 140375134025472 trainer.py:522] step: 154 fraction_of_correct_next_step_preds:0.42445534 fraction_of_correct_next_step_preds/logits:0.42445534 grad_norm/all:1.031893 grad_scale_all:0.96909273 log_pplx:1.9087092 log_pplx/logits:1.9087092 loss:1.9087092 loss/logits:1.9087092 
num_samples_in_batch:128 var_norm/all:703.32538 I0420 07:51:36.028353 140375142418176 trainer.py:371] Steps/second: 0.127910, Examples/second: 17.329321 I0420 07:51:41.803100 140375134025472 trainer.py:511] time: 8.220578 I0420 07:51:41.804466 140375134025472 trainer.py:522] step: 155 fraction_of_correct_next_step_preds:0.42430681 fraction_of_correct_next_step_preds/logits:0.42430681 grad_norm/all:1.2035612 grad_scale_all:0.83086759 log_pplx:1.9094154 log_pplx/logits:1.9094154 loss:1.9094154 loss/logits:1.9094154 num_samples_in_batch:128 var_norm/all:703.31458 I0420 07:51:46.037686 140375142418176 trainer.py:371] Steps/second: 0.127679, Examples/second: 17.291878 I0420 07:51:49.656562 140375134025472 trainer.py:511] time: 7.851894 I0420 07:51:49.657808 140375134025472 trainer.py:522] step: 156 fraction_of_correct_next_step_preds:0.43289766 fraction_of_correct_next_step_preds/logits:0.43289766 grad_norm/all:0.75488198 grad_scale_all:1 log_pplx:1.8982086 log_pplx/logits:1.8982086 loss:1.8982086 loss/logits:1.8982086 num_samples_in_batch:128 var_norm/all:703.30353 I0420 07:51:56.048183 140375142418176 trainer.py:371] Steps/second: 0.127452, Examples/second: 17.255033 I0420 07:51:58.760415 140375134025472 trainer.py:511] time: 9.102371 I0420 07:51:58.761346 140375134025472 trainer.py:522] step: 157 fraction_of_correct_next_step_preds:0.43443087 fraction_of_correct_next_step_preds/logits:0.43443087 grad_norm/all:1.1150398 grad_scale_all:0.89682895 log_pplx:1.9025295 log_pplx/logits:1.9025295 loss:1.9025295 loss/logits:1.9025295 num_samples_in_batch:128 var_norm/all:703.29211 I0420 07:52:06.058149 140375142418176 trainer.py:371] Steps/second: 0.127228, Examples/second: 17.218792 I0420 07:52:06.211291 140375134025472 trainer.py:511] time: 7.449661 I0420 07:52:06.212383 140375134025472 trainer.py:522] step: 158 fraction_of_correct_next_step_preds:0.43455517 fraction_of_correct_next_step_preds/logits:0.43455517 grad_norm/all:0.75359613 grad_scale_all:1 log_pplx:1.8875916 
log_pplx/logits:1.8875916 loss:1.8875916 loss/logits:1.8875916 num_samples_in_batch:128 var_norm/all:703.28046 I0420 07:52:11.963902 140375134025472 trainer.py:511] time: 5.751313 I0420 07:52:11.964863 140375134025472 trainer.py:522] step: 159 fraction_of_correct_next_step_preds:0.42446634 fraction_of_correct_next_step_preds/logits:0.42446634 grad_norm/all:0.78566986 grad_scale_all:1 log_pplx:1.9137037 log_pplx/logits:1.9137037 loss:1.9137037 loss/logits:1.9137037 num_samples_in_batch:128 var_norm/all:703.26849 I0420 07:52:16.069747 140375142418176 trainer.py:371] Steps/second: 0.127812, Examples/second: 17.286004 I0420 07:52:19.933387 140375134025472 trainer.py:511] time: 7.968307 I0420 07:52:19.934446 140375134025472 trainer.py:522] step: 160 fraction_of_correct_next_step_preds:0.43351826 fraction_of_correct_next_step_preds/logits:0.43351826 grad_norm/all:0.69631839 grad_scale_all:1 log_pplx:1.8786674 log_pplx/logits:1.8786674 loss:1.8786674 loss/logits:1.8786674 num_samples_in_batch:128 var_norm/all:703.2561 I0420 07:52:26.076610 140375142418176 trainer.py:371] Steps/second: 0.127590, Examples/second: 17.250133 I0420 07:52:27.057111 140375134025472 trainer.py:511] time: 7.122404 I0420 07:52:27.058207 140375134025472 trainer.py:522] step: 161 fraction_of_correct_next_step_preds:0.43161297 fraction_of_correct_next_step_preds/logits:0.43161297 grad_norm/all:0.67985362 grad_scale_all:1 log_pplx:1.8794951 log_pplx/logits:1.8794951 loss:1.8794951 loss/logits:1.8794951 num_samples_in_batch:128 var_norm/all:703.24347 I0420 07:52:31.308384 140375134025472 trainer.py:511] time: 4.249896 I0420 07:52:31.309541 140375134025472 trainer.py:522] step: 162 fraction_of_correct_next_step_preds:0.42555854 fraction_of_correct_next_step_preds/logits:0.42555854 grad_norm/all:0.81493938 grad_scale_all:1 log_pplx:1.926751 log_pplx/logits:1.926751 loss:1.926751 loss/logits:1.926751 num_samples_in_batch:256 var_norm/all:703.23053 I0420 07:52:36.087415 140375142418176 trainer.py:371] 
Steps/second: 0.128162, Examples/second: 17.417308 I0420 07:52:39.672578 140375134025472 trainer.py:511] time: 8.335060 I0420 07:52:39.673749 140375134025472 trainer.py:522] step: 163 fraction_of_correct_next_step_preds:0.4331291 fraction_of_correct_next_step_preds/logits:0.4331291 grad_norm/all:0.95574957 grad_scale_all:1 log_pplx:1.8833166 log_pplx/logits:1.8833166 loss:1.8833166 loss/logits:1.8833166 num_samples_in_batch:128 var_norm/all:703.21741 I0420 07:52:46.096961 140375142418176 trainer.py:371] Steps/second: 0.127940, Examples/second: 17.380935 I0420 07:52:48.769392 140375134025472 trainer.py:511] time: 9.095292 I0420 07:52:48.770503 140375134025472 trainer.py:522] step: 164 fraction_of_correct_next_step_preds:0.43304643 fraction_of_correct_next_step_preds/logits:0.43304643 grad_norm/all:0.83339173 grad_scale_all:1 log_pplx:1.8863426 log_pplx/logits:1.8863426 loss:1.8863426 loss/logits:1.8863426 num_samples_in_batch:128 var_norm/all:703.20416 I0420 07:52:56.107681 140375142418176 trainer.py:371] Steps/second: 0.127721, Examples/second: 17.345115 I0420 07:52:56.502384 140375134025472 trainer.py:511] time: 7.731666 I0420 07:52:56.503449 140375134025472 trainer.py:522] step: 165 fraction_of_correct_next_step_preds:0.43856791 fraction_of_correct_next_step_preds/logits:0.43856791 grad_norm/all:0.69169378 grad_scale_all:1 log_pplx:1.8613528 log_pplx/logits:1.8613528 loss:1.8613528 loss/logits:1.8613528 num_samples_in_batch:128 var_norm/all:703.19061 I0420 07:53:03.854849 140375134025472 trainer.py:511] time: 7.351137 I0420 07:53:03.856317 140375134025472 trainer.py:522] step: 166 fraction_of_correct_next_step_preds:0.43318194 fraction_of_correct_next_step_preds/logits:0.43318194 grad_norm/all:0.88919246 grad_scale_all:1 log_pplx:1.8757007 log_pplx/logits:1.8757007 loss:1.8757007 loss/logits:1.8757007 num_samples_in_batch:128 var_norm/all:703.17694 I0420 07:53:06.116745 140375142418176 trainer.py:371] Steps/second: 0.128279, Examples/second: 17.408784 I0420 
07:53:09.463217 140375134025472 trainer.py:511] time: 5.606602 I0420 07:53:09.464037 140375134025472 trainer.py:522] step: 167 fraction_of_correct_next_step_preds:0.43450034 fraction_of_correct_next_step_preds/logits:0.43450034 grad_norm/all:1.098273 grad_scale_all:0.91052037 log_pplx:1.8758377 log_pplx/logits:1.8758377 loss:1.8758377 loss/logits:1.8758377 num_samples_in_batch:128 var_norm/all:703.16309 I0420 07:53:16.127732 140375142418176 trainer.py:371] Steps/second: 0.128061, Examples/second: 17.373296 I0420 07:53:17.562886 140375134025472 trainer.py:511] time: 8.098476 I0420 07:53:17.563997 140375134025472 trainer.py:522] step: 168 fraction_of_correct_next_step_preds:0.43977895 fraction_of_correct_next_step_preds/logits:0.43977895 grad_norm/all:0.50802606 grad_scale_all:1 log_pplx:1.8625075 log_pplx/logits:1.8625075 loss:1.8625075 loss/logits:1.8625075 num_samples_in_batch:128 var_norm/all:703.14923 I0420 07:53:24.470643 140375134025472 trainer.py:511] time: 6.906448 I0420 07:53:24.471700 140375134025472 trainer.py:522] step: 169 fraction_of_correct_next_step_preds:0.43477678 fraction_of_correct_next_step_preds/logits:0.43477678 grad_norm/all:0.91479504 grad_scale_all:1 log_pplx:1.8775673 log_pplx/logits:1.8775673 loss:1.8775673 loss/logits:1.8775673 num_samples_in_batch:128 var_norm/all:703.13519 I0420 07:53:26.137291 140375142418176 trainer.py:371] Steps/second: 0.128607, Examples/second: 17.435774 I0420 07:53:33.559447 140375134025472 trainer.py:511] time: 9.087479 I0420 07:53:33.560532 140375134025472 trainer.py:522] step: 170 fraction_of_correct_next_step_preds:0.44339436 fraction_of_correct_next_step_preds/logits:0.44339436 grad_norm/all:0.80379444 grad_scale_all:1 log_pplx:1.8566017 log_pplx/logits:1.8566017 loss:1.8566017 loss/logits:1.8566017 num_samples_in_batch:128 var_norm/all:703.12097 I0420 07:53:36.147758 140375142418176 trainer.py:371] Steps/second: 0.128390, Examples/second: 17.400624 I0420 07:53:41.228625 140375134025472 trainer.py:511] time: 
7.667775 I0420 07:53:41.229818 140375134025472 trainer.py:522] step: 171 fraction_of_correct_next_step_preds:0.44055197 fraction_of_correct_next_step_preds/logits:0.44055197 grad_norm/all:0.43629116 grad_scale_all:1 log_pplx:1.8634899 log_pplx/logits:1.8634899 loss:1.8634899 loss/logits:1.8634899 num_samples_in_batch:128 var_norm/all:703.10669 I0420 07:53:46.150906 140375142418176 trainer.py:371] Steps/second: 0.128177, Examples/second: 17.366098 I0420 07:53:49.603144 140375134025472 trainer.py:511] time: 8.372799 I0420 07:53:49.604515 140375134025472 trainer.py:522] step: 172 fraction_of_correct_next_step_preds:0.4352791 fraction_of_correct_next_step_preds/logits:0.4352791 grad_norm/all:0.70247895 grad_scale_all:1 log_pplx:1.863916 log_pplx/logits:1.863916 loss:1.863916 loss/logits:1.863916 num_samples_in_batch:128 var_norm/all:703.09222 I0420 07:53:56.157335 140375142418176 trainer.py:371] Steps/second: 0.127967, Examples/second: 17.332044 I0420 07:53:56.834737 140375134025472 trainer.py:511] time: 7.229920 I0420 07:53:56.835755 140375134025472 trainer.py:522] step: 173 fraction_of_correct_next_step_preds:0.43851206 fraction_of_correct_next_step_preds/logits:0.43851206 grad_norm/all:0.53970212 grad_scale_all:1 log_pplx:1.8514322 log_pplx/logits:1.8514322 loss:1.8514322 loss/logits:1.8514322 num_samples_in_batch:128 var_norm/all:703.07758 I0420 07:54:03.801841 140375134025472 trainer.py:511] time: 6.965903 I0420 07:54:03.802872 140375134025472 trainer.py:522] step: 174 fraction_of_correct_next_step_preds:0.43447852 fraction_of_correct_next_step_preds/logits:0.43447852 grad_norm/all:0.76130515 grad_scale_all:1 log_pplx:1.8808705 log_pplx/logits:1.8808705 loss:1.8808705 loss/logits:1.8808705 num_samples_in_batch:128 var_norm/all:703.06281 I0420 07:54:06.168541 140375142418176 trainer.py:371] Steps/second: 0.128498, Examples/second: 17.392959 I0420 07:54:09.511055 140375134025472 trainer.py:511] time: 5.707862 I0420 07:54:09.512670 140375134025472 trainer.py:522] 
step: 175 fraction_of_correct_next_step_preds:0.43812004 fraction_of_correct_next_step_preds/logits:0.43812004 grad_norm/all:0.58219862 grad_scale_all:1 log_pplx:1.8744855 log_pplx/logits:1.8744855 loss:1.8744855 loss/logits:1.8744855 num_samples_in_batch:128 var_norm/all:703.04791 I0420 07:54:16.176268 140375142418176 trainer.py:371] Steps/second: 0.128288, Examples/second: 17.359190 I0420 07:54:18.592397 140375134025472 trainer.py:511] time: 9.079428 I0420 07:54:18.593550 140375134025472 trainer.py:522] step: 176 fraction_of_correct_next_step_preds:0.44414341 fraction_of_correct_next_step_preds/logits:0.44414341 grad_norm/all:0.6023612 grad_scale_all:1 log_pplx:1.8393581 log_pplx/logits:1.8393581 loss:1.8393581 loss/logits:1.8393581 num_samples_in_batch:128 var_norm/all:703.03296 I0420 07:54:26.179130 140375142418176 trainer.py:371] Steps/second: 0.128082, Examples/second: 17.325976 I0420 07:54:26.692585 140375134025472 trainer.py:511] time: 8.098862 I0420 07:54:26.693692 140375134025472 trainer.py:522] step: 177 fraction_of_correct_next_step_preds:0.44614813 fraction_of_correct_next_step_preds/logits:0.44614813 grad_norm/all:0.41581571 grad_scale_all:1 log_pplx:1.8420435 log_pplx/logits:1.8420435 loss:1.8420435 loss/logits:1.8420435 num_samples_in_batch:128 var_norm/all:703.01788 I0420 07:54:34.614481 140375134025472 trainer.py:511] time: 7.920510 I0420 07:54:34.617162 140375134025472 trainer.py:522] step: 178 fraction_of_correct_next_step_preds:0.44184723 fraction_of_correct_next_step_preds/logits:0.44184723 grad_norm/all:0.62634295 grad_scale_all:1 log_pplx:1.8419659 log_pplx/logits:1.8419659 loss:1.8419659 loss/logits:1.8419659 num_samples_in_batch:128 var_norm/all:703.00269 I0420 07:54:36.192054 140375142418176 trainer.py:371] Steps/second: 0.128600, Examples/second: 17.478068 I0420 07:54:38.892163 140375134025472 trainer.py:511] time: 4.274740 I0420 07:54:38.893187 140375134025472 trainer.py:522] step: 179 fraction_of_correct_next_step_preds:0.43799087 
fraction_of_correct_next_step_preds/logits:0.43799087 grad_norm/all:0.61439598 grad_scale_all:1 log_pplx:1.8842611 log_pplx/logits:1.8842611 loss:1.8842611 loss/logits:1.8842611 num_samples_in_batch:256 var_norm/all:702.98743 I0420 07:54:46.200571 140375142418176 trainer.py:371] Steps/second: 0.128394, Examples/second: 17.444407 I0420 07:54:47.041579 140375134025472 trainer.py:511] time: 8.148187 I0420 07:54:47.042654 140375134025472 trainer.py:522] step: 180 fraction_of_correct_next_step_preds:0.44396985 fraction_of_correct_next_step_preds/logits:0.44396985 grad_norm/all:0.8471067 grad_scale_all:1 log_pplx:1.8453927 log_pplx/logits:1.8453927 loss:1.8453927 loss/logits:1.8453927 num_samples_in_batch:128 var_norm/all:702.97211 I0420 07:54:54.404463 140375134025472 trainer.py:511] time: 7.361601 I0420 07:54:54.405478 140375134025472 trainer.py:522] step: 181 fraction_of_correct_next_step_preds:0.44369069 fraction_of_correct_next_step_preds/logits:0.44369069 grad_norm/all:0.48298407 grad_scale_all:1 log_pplx:1.8413055 log_pplx/logits:1.8413055 loss:1.8413055 loss/logits:1.8413055 num_samples_in_batch:128 var_norm/all:702.95679 I0420 07:54:56.209449 140375142418176 trainer.py:371] Steps/second: 0.128903, Examples/second: 17.502377 I0420 07:55:01.492369 140375134025472 trainer.py:511] time: 7.086571 I0420 07:55:01.493474 140375134025472 trainer.py:522] step: 182 fraction_of_correct_next_step_preds:0.44348097 fraction_of_correct_next_step_preds/logits:0.44348097 grad_norm/all:0.70817178 grad_scale_all:1 log_pplx:1.8433393 log_pplx/logits:1.8433393 loss:1.8433393 loss/logits:1.8433393 num_samples_in_batch:128 var_norm/all:702.94135 I0420 07:55:06.219089 140375142418176 trainer.py:371] Steps/second: 0.128698, Examples/second: 17.469006 I0420 07:55:10.661694 140375134025472 trainer.py:511] time: 9.167789 I0420 07:55:10.662916 140375134025472 trainer.py:522] step: 183 fraction_of_correct_next_step_preds:0.44200012 fraction_of_correct_next_step_preds/logits:0.44200012 
grad_norm/all:0.78830743 grad_scale_all:1 log_pplx:1.8404706 log_pplx/logits:1.8404706 loss:1.8404706 loss/logits:1.8404706 num_samples_in_batch:128 var_norm/all:702.92584 I0420 07:55:16.229026 140375142418176 trainer.py:371] Steps/second: 0.128496, Examples/second: 17.436100 I0420 07:55:16.485204 140375134025472 trainer.py:511] time: 5.821826 I0420 07:55:16.486191 140375134025472 trainer.py:522] step: 184 fraction_of_correct_next_step_preds:0.44723406 fraction_of_correct_next_step_preds/logits:0.44723406 grad_norm/all:0.84321076 grad_scale_all:1 log_pplx:1.8484147 log_pplx/logits:1.8484147 loss:1.8484147 loss/logits:1.8484147 num_samples_in_batch:128 var_norm/all:702.91028 I0420 07:55:24.522628 140375134025472 trainer.py:511] time: 8.036144 I0420 07:55:24.524127 140375134025472 trainer.py:522] step: 185 fraction_of_correct_next_step_preds:0.44457987 fraction_of_correct_next_step_preds/logits:0.44457987 grad_norm/all:0.94873601 grad_scale_all:1 log_pplx:1.8245742 log_pplx/logits:1.8245742 loss:1.8245742 loss/logits:1.8245742 num_samples_in_batch:128 var_norm/all:702.89471 I0420 07:55:26.238769 140375142418176 trainer.py:371] Steps/second: 0.128993, Examples/second: 17.492905 I0420 07:55:32.738296 140375134025472 trainer.py:511] time: 8.213897 I0420 07:55:32.740339 140375134025472 trainer.py:522] step: 186 fraction_of_correct_next_step_preds:0.4430176 fraction_of_correct_next_step_preds/logits:0.4430176 grad_norm/all:0.93333519 grad_scale_all:1 log_pplx:1.8442669 log_pplx/logits:1.8442669 loss:1.8442669 loss/logits:1.8442669 num_samples_in_batch:128 var_norm/all:702.87897 I0420 07:55:36.248698 140375142418176 trainer.py:371] Steps/second: 0.128792, Examples/second: 17.460290 I0420 07:55:40.608198 140375134025472 trainer.py:511] time: 7.867668 I0420 07:55:40.609160 140375134025472 trainer.py:522] step: 187 fraction_of_correct_next_step_preds:0.44213852 fraction_of_correct_next_step_preds/logits:0.44213852 grad_norm/all:0.77158284 grad_scale_all:1 log_pplx:1.8315784 
log_pplx/logits:1.8315784 loss:1.8315784 loss/logits:1.8315784 num_samples_in_batch:128 var_norm/all:702.86328 I0420 07:55:46.251959 140375142418176 trainer.py:371] Steps/second: 0.128594, Examples/second: 17.428206 I0420 07:55:48.021662 140375134025472 trainer.py:511] time: 7.412346 I0420 07:55:48.022475 140375134025472 trainer.py:522] step: 188 fraction_of_correct_next_step_preds:0.4498955 fraction_of_correct_next_step_preds/logits:0.4498955 grad_norm/all:0.72052014 grad_scale_all:1 log_pplx:1.8241475 log_pplx/logits:1.8241475 loss:1.8241475 loss/logits:1.8241475 num_samples_in_batch:128 var_norm/all:702.84747 I0420 07:55:56.254241 140375142418176 trainer.py:371] Steps/second: 0.128398, Examples/second: 17.396567 I0420 07:55:57.268708 140375134025472 trainer.py:511] time: 9.246066 I0420 07:55:57.269650 140375134025472 trainer.py:522] step: 189 fraction_of_correct_next_step_preds:0.45732975 fraction_of_correct_next_step_preds/logits:0.45732975 grad_norm/all:0.61174208 grad_scale_all:1 log_pplx:1.8068582 log_pplx/logits:1.8068582 loss:1.8068582 loss/logits:1.8068582 num_samples_in_batch:128 var_norm/all:702.8316 I0420 07:56:03.981911 140375134025472 trainer.py:511] time: 6.711774 I0420 07:56:03.983181 140375134025472 trainer.py:522] step: 190 fraction_of_correct_next_step_preds:0.44194993 fraction_of_correct_next_step_preds/logits:0.44194993 grad_norm/all:0.84591258 grad_scale_all:1 log_pplx:1.8346351 log_pplx/logits:1.8346351 loss:1.8346351 loss/logits:1.8346351 num_samples_in_batch:128 var_norm/all:702.81567 I0420 07:56:06.265222 140375142418176 trainer.py:371] Steps/second: 0.128883, Examples/second: 17.452085 I0420 07:56:09.590390 140375134025472 trainer.py:511] time: 5.606873 I0420 07:56:09.591136 140375134025472 trainer.py:522] step: 191 fraction_of_correct_next_step_preds:0.44372281 fraction_of_correct_next_step_preds/logits:0.44372281 grad_norm/all:0.93953317 grad_scale_all:1 log_pplx:1.8477401 log_pplx/logits:1.8477401 loss:1.8477401 loss/logits:1.8477401 
num_samples_in_batch:128 var_norm/all:702.79968 I0420 07:56:16.276658 140375142418176 trainer.py:371] Steps/second: 0.128687, Examples/second: 17.420608 I0420 07:56:17.942615 140375134025472 trainer.py:511] time: 8.351234 I0420 07:56:17.943763 140375134025472 trainer.py:522] step: 192 fraction_of_correct_next_step_preds:0.44863379 fraction_of_correct_next_step_preds/logits:0.44863379 grad_norm/all:0.96895403 grad_scale_all:1 log_pplx:1.819329 log_pplx/logits:1.819329 loss:1.819329 loss/logits:1.819329 num_samples_in_batch:128 var_norm/all:702.78369 I0420 07:56:26.073008 140375134025472 trainer.py:511] time: 8.128952 I0420 07:56:26.074163 140375134025472 trainer.py:522] step: 193 fraction_of_correct_next_step_preds:0.44438726 fraction_of_correct_next_step_preds/logits:0.44438726 grad_norm/all:0.88765609 grad_scale_all:1 log_pplx:1.83139 log_pplx/logits:1.83139 loss:1.83139 loss/logits:1.83139 num_samples_in_batch:128 var_norm/all:702.76752 I0420 07:56:26.339052 140375142418176 trainer.py:371] Steps/second: 0.129159, Examples/second: 17.474617 I0420 07:56:33.424146 140375134025472 trainer.py:511] time: 7.349628 I0420 07:56:33.425277 140375134025472 trainer.py:522] step: 194 fraction_of_correct_next_step_preds:0.45994666 fraction_of_correct_next_step_preds/logits:0.45994666 grad_norm/all:0.60090661 grad_scale_all:1 log_pplx:1.7978886 log_pplx/logits:1.7978886 loss:1.7978886 loss/logits:1.7978886 num_samples_in_batch:128 var_norm/all:702.7514 I0420 07:56:36.295216 140375142418176 trainer.py:371] Steps/second: 0.128969, Examples/second: 17.444049 I0420 07:56:41.154331 140375134025472 trainer.py:511] time: 7.728805 I0420 07:56:41.155858 140375134025472 trainer.py:522] step: 195 fraction_of_correct_next_step_preds:0.4580164 fraction_of_correct_next_step_preds/logits:0.4580164 grad_norm/all:0.74558669 grad_scale_all:1 log_pplx:1.7964088 log_pplx/logits:1.7964088 loss:1.7964088 loss/logits:1.7964088 num_samples_in_batch:128 var_norm/all:702.73523 I0420 07:56:45.528053 
140375134025472 trainer.py:511] time: 4.371963
I0420 07:56:45.528832 140375134025472 trainer.py:522] step: 196 fraction_of_correct_next_step_preds:0.44655734 fraction_of_correct_next_step_preds/logits:0.44655734 grad_norm/all:0.8231467 grad_scale_all:1 log_pplx:1.8481317 log_pplx/logits:1.8481317 loss:1.8481317 loss/logits:1.8481317 num_samples_in_batch:256 var_norm/all:702.71906
I0420 07:56:46.304059 140375142418176 trainer.py:371] Steps/second: 0.129437, Examples/second: 17.582339
I0420 07:56:54.434890 140375134025472 trainer.py:511] time: 8.905689
I0420 07:56:54.436141 140375134025472 trainer.py:522] step: 197 fraction_of_correct_next_step_preds:0.45484895 fraction_of_correct_next_step_preds/logits:0.45484895 grad_norm/all:0.79258645 grad_scale_all:1 log_pplx:1.8016788 log_pplx/logits:1.8016788 loss:1.8016788 loss/logits:1.8016788 num_samples_in_batch:128 var_norm/all:702.70276
I0420 07:56:56.315849 140375142418176 trainer.py:371] Steps/second: 0.129243, Examples/second: 17.550831
I0420 07:57:01.329133 140375134025472 trainer.py:511] time: 6.892796
I0420 07:57:01.330216 140375134025472 trainer.py:522] step: 198 fraction_of_correct_next_step_preds:0.44605467 fraction_of_correct_next_step_preds/logits:0.44605467 grad_norm/all:1.2284788 grad_scale_all:0.81401485 log_pplx:1.8213179 log_pplx/logits:1.8213179 loss:1.8213179 loss/logits:1.8213179 num_samples_in_batch:128 var_norm/all:702.68652
I0420 07:57:06.324517 140375142418176 trainer.py:371] Steps/second: 0.129052, Examples/second: 17.519764
I0420 07:57:09.592889 140375134025472 trainer.py:511] time: 8.262392
I0420 07:57:09.595169 140375134025472 trainer.py:522] step: 199 fraction_of_correct_next_step_preds:0.45952439 fraction_of_correct_next_step_preds/logits:0.45952439 grad_norm/all:0.93626809 grad_scale_all:1 log_pplx:1.807832 log_pplx/logits:1.807832 loss:1.807832 loss/logits:1.807832 num_samples_in_batch:128 var_norm/all:702.67047
I0420 07:57:15.281085 140375134025472 trainer.py:511] time: 5.685653
I0420 07:57:15.282196 140375134025472 trainer.py:522] step: 200 fraction_of_correct_next_step_preds:0.45597693 fraction_of_correct_next_step_preds/logits:0.45597693 grad_norm/all:1.0720363 grad_scale_all:0.93280429 log_pplx:1.8038468 log_pplx/logits:1.8038468 loss:1.8038468 loss/logits:1.8038468 num_samples_in_batch:128 var_norm/all:702.65454
I0420 07:57:16.325232 140375142418176 trainer.py:371] Steps/second: 0.129511, Examples/second: 17.572081
I0420 07:57:23.514617 140375134025472 trainer.py:511] time: 8.014513
I0420 07:57:23.515427 140375134025472 base_runner.py:115] step: 201 fraction_of_correct_next_step_preds:0.46213683 fraction_of_correct_next_step_preds/logits:0.46213683 grad_norm/all:0.51647431 grad_scale_all:1 log_pplx:1.775112 log_pplx/logits:1.775112 loss:1.775112 loss/logits:1.775112 num_samples_in_batch:128 var_norm/all:702.63855
I0420 07:57:26.336309 140375142418176 trainer.py:371] Steps/second: 0.129320, Examples/second: 17.541254
I0420 07:57:26.337402 140375142418176 trainer.py:275] Write summary @201
2019-04-20 07:57:26.341258: I lingvo/core/ops/record_batcher.cc:344] 1537 total seconds passed. Total records yielded: 304. Total records skipped: 0
I0420 07:57:32.520065 140375134025472 trainer.py:511] time: 9.004160
I0420 07:57:32.524442 140375134025472 trainer.py:522] step: 202 fraction_of_correct_next_step_preds:0.44965822 fraction_of_correct_next_step_preds/logits:0.44965822 grad_norm/all:0.92829382 grad_scale_all:1 log_pplx:1.8050215 log_pplx/logits:1.8050215 loss:1.8050215 loss/logits:1.8050215 num_samples_in_batch:128 var_norm/all:702.62244
I0420 07:57:42.691868 140375134025472 trainer.py:511] time: 10.166058
I0420 07:57:42.693574 140375134025472 trainer.py:522] step: 203 fraction_of_correct_next_step_preds:0.45663935 fraction_of_correct_next_step_preds/logits:0.45663935 grad_norm/all:0.89732426 grad_scale_all:1 log_pplx:1.7954292 log_pplx/logits:1.7954292 loss:1.7954292 loss/logits:1.7954292 num_samples_in_batch:128 var_norm/all:702.60632
I0420 07:57:56.001332 140375134025472 trainer.py:511] time: 13.307436
I0420 07:57:56.002633 140375134025472 trainer.py:522] step: 204 fraction_of_correct_next_step_preds:0.46008012 fraction_of_correct_next_step_preds/logits:0.46008012 grad_norm/all:0.80453771 grad_scale_all:1 log_pplx:1.7907335 log_pplx/logits:1.7907335 loss:1.7907335 loss/logits:1.7907335 num_samples_in_batch:128 var_norm/all:702.59015
I0420 07:58:06.379566 140375134025472 trainer.py:511] time: 10.376438
I0420 07:58:06.381184 140375134025472 trainer.py:522] step: 205 fraction_of_correct_next_step_preds:0.45583883 fraction_of_correct_next_step_preds/logits:0.45583883 grad_norm/all:0.98260665 grad_scale_all:1 log_pplx:1.7955699 log_pplx/logits:1.7955699 loss:1.7955699 loss/logits:1.7955699 num_samples_in_batch:128 var_norm/all:702.57391
I0420 07:58:11.054907 140375142418176 trainer.py:284] Write summary done: step 201
I0420 07:58:11.064054 140375142418176 base_runner.py:115] step: 201, steps/sec: 0.13, examples/sec: 17.54
I0420 07:58:11.067320 140375142418176 trainer.py:371] Steps/second: 0.128204, Examples/second: 17.370748
I0420 07:58:16.056494 140375134025472 trainer.py:511] time: 9.675025
I0420 07:58:16.057492 140375134025472 trainer.py:522] step: 206 fraction_of_correct_next_step_preds:0.46253374 fraction_of_correct_next_step_preds/logits:0.46253374 grad_norm/all:0.79528224 grad_scale_all:1 log_pplx:1.7718325 log_pplx/logits:1.7718325 loss:1.7718325 loss/logits:1.7718325 num_samples_in_batch:128 var_norm/all:702.55756
I0420 07:58:21.075256 140375142418176 trainer.py:371] Steps/second: 0.128028, Examples/second: 17.342256
I0420 07:58:21.783164 140375134025472 trainer.py:511] time: 5.725306
I0420 07:58:21.784272 140375134025472 trainer.py:522] step: 207 fraction_of_correct_next_step_preds:0.45435455 fraction_of_correct_next_step_preds/logits:0.45435455 grad_norm/all:0.87439638 grad_scale_all:1 log_pplx:1.8060277 log_pplx/logits:1.8060277 loss:1.8060277 loss/logits:1.8060277 num_samples_in_batch:128 var_norm/all:702.54114
I0420 07:58:29.925393 140375134025472 trainer.py:511] time: 8.140813
I0420 07:58:29.926698 140375134025472 trainer.py:522] step: 208 fraction_of_correct_next_step_preds:0.46172714 fraction_of_correct_next_step_preds/logits:0.46172714 grad_norm/all:0.7580263 grad_scale_all:1 log_pplx:1.7839309 log_pplx/logits:1.7839309 loss:1.7839309 loss/logits:1.7839309 num_samples_in_batch:128 var_norm/all:702.52466
I0420 07:58:31.081680 140375142418176 trainer.py:371] Steps/second: 0.128472, Examples/second: 17.393191
I0420 07:58:37.610832 140375134025472 trainer.py:511] time: 7.683848
I0420 07:58:37.611921 140375134025472 trainer.py:522] step: 209 fraction_of_correct_next_step_preds:0.45912775 fraction_of_correct_next_step_preds/logits:0.45912775 grad_norm/all:0.59735614 grad_scale_all:1 log_pplx:1.7847834 log_pplx/logits:1.7847834 loss:1.7847834 loss/logits:1.7847834 num_samples_in_batch:128 var_norm/all:702.50818
I0420 07:58:41.092617 140375142418176 trainer.py:371] Steps/second: 0.128297, Examples/second: 17.364879
I0420 07:58:45.093353 140375134025472 trainer.py:511] time: 7.481086
I0420 07:58:45.094466 140375134025472 trainer.py:522] step: 210 fraction_of_correct_next_step_preds:0.45871964 fraction_of_correct_next_step_preds/logits:0.45871964 grad_norm/all:0.63530314 grad_scale_all:1 log_pplx:1.7760886 log_pplx/logits:1.7760886 loss:1.7760886 loss/logits:1.7760886 num_samples_in_batch:128 var_norm/all:702.49158
I0420 07:58:51.101516 140375142418176 trainer.py:371] Steps/second: 0.128123, Examples/second: 17.336934
I0420 07:58:54.138889 140375134025472 trainer.py:511] time: 9.044044
I0420 07:58:54.139625 140375134025472 trainer.py:522] step: 211 fraction_of_correct_next_step_preds:0.4615027 fraction_of_correct_next_step_preds/logits:0.4615027 grad_norm/all:0.5618149 grad_scale_all:1 log_pplx:1.7774103 log_pplx/logits:1.7774103 loss:1.7774103 loss/logits:1.7774103 num_samples_in_batch:128 var_norm/all:702.47498
I0420 07:59:01.086863 140375134025472 trainer.py:511] time: 6.946899
I0420 07:59:01.088413 140375134025472 trainer.py:522] step: 212 fraction_of_correct_next_step_preds:0.45972911 fraction_of_correct_next_step_preds/logits:0.45972911 grad_norm/all:0.69746411 grad_scale_all:1 log_pplx:1.7848744 log_pplx/logits:1.7848744 loss:1.7848744 loss/logits:1.7848744 num_samples_in_batch:128 var_norm/all:702.45831
I0420 07:59:01.174531 140375142418176 trainer.py:371] Steps/second: 0.128554, Examples/second: 17.308655
I0420 07:59:09.226310 140375134025472 trainer.py:511] time: 8.137713
I0420 07:59:09.227124 140375134025472 trainer.py:522] step: 213 fraction_of_correct_next_step_preds:0.46676597 fraction_of_correct_next_step_preds/logits:0.46676597 grad_norm/all:0.57188773 grad_scale_all:1 log_pplx:1.761503 log_pplx/logits:1.761503 loss:1.761503 loss/logits:1.761503 num_samples_in_batch:128 var_norm/all:702.44159
I0420 07:59:11.122014 140375142418176 trainer.py:371] Steps/second: 0.128386, Examples/second: 17.436331
I0420 07:59:13.427846 140375134025472 trainer.py:511] time: 4.200434
I0420 07:59:13.428952 140375134025472 trainer.py:522] step: 214 fraction_of_correct_next_step_preds:0.44703782 fraction_of_correct_next_step_preds/logits:0.44703782 grad_norm/all:0.80825353 grad_scale_all:1 log_pplx:1.8188372 log_pplx/logits:1.8188372 loss:1.8188372 loss/logits:1.8188372 num_samples_in_batch:256 var_norm/all:702.42468
I0420 07:59:19.056699 140375134025472 trainer.py:511] time: 5.627521
I0420 07:59:19.057606 140375134025472 trainer.py:522] step: 215 fraction_of_correct_next_step_preds:0.45846134 fraction_of_correct_next_step_preds/logits:0.45846134 grad_norm/all:0.78043801 grad_scale_all:1 log_pplx:1.7941139 log_pplx/logits:1.7941139 loss:1.7941139 loss/logits:1.7941139 num_samples_in_batch:128 var_norm/all:702.4079
I0420 07:59:21.130456 140375142418176 trainer.py:371] Steps/second: 0.128814, Examples/second: 17.485154
I0420 07:59:27.227977 140375134025472 trainer.py:511] time: 8.170123
I0420 07:59:27.229242 140375134025472 trainer.py:522] step: 216 fraction_of_correct_next_step_preds:0.46632123 fraction_of_correct_next_step_preds/logits:0.46632123 grad_norm/all:1.0154312 grad_scale_all:0.98480332 log_pplx:1.7582886 log_pplx/logits:1.7582886 loss:1.7582886 loss/logits:1.7582886 num_samples_in_batch:128 var_norm/all:702.39105
I0420 07:59:31.131680 140375142418176 trainer.py:371] Steps/second: 0.128642, Examples/second: 17.457238
I0420 07:59:35.057662 140375134025472 trainer.py:511] time: 7.828189
I0420 07:59:35.058621 140375134025472 trainer.py:522] step: 217 fraction_of_correct_next_step_preds:0.46656239 fraction_of_correct_next_step_preds/logits:0.46656239 grad_norm/all:0.84083319 grad_scale_all:1 log_pplx:1.7765833 log_pplx/logits:1.7765833 loss:1.7765833 loss/logits:1.7765833 num_samples_in_batch:128 var_norm/all:702.37415
I0420 07:59:41.134951 140375142418176 trainer.py:371] Steps/second: 0.128472, Examples/second: 17.429633
I0420 07:59:42.318872 140375134025472 trainer.py:511] time: 7.260002
I0420 07:59:42.319998 140375134025472 trainer.py:522] step: 218 fraction_of_correct_next_step_preds:0.46115926 fraction_of_correct_next_step_preds/logits:0.46115926 grad_norm/all:0.79595202 grad_scale_all:1 log_pplx:1.7707779 log_pplx/logits:1.7707779 loss:1.7707779 loss/logits:1.7707779 num_samples_in_batch:128 var_norm/all:702.35724
I0420 07:59:51.163680 140375142418176 trainer.py:371] Steps/second: 0.128303, Examples/second: 17.402101
I0420 07:59:51.239382 140375134025472 trainer.py:511] time: 8.919143
I0420 07:59:51.240328 140375134025472 trainer.py:522] step: 219 fraction_of_correct_next_step_preds:0.47008803 fraction_of_correct_next_step_preds/logits:0.47008803 grad_norm/all:0.54538399 grad_scale_all:1 log_pplx:1.7490183 log_pplx/logits:1.7490183 loss:1.7490183 loss/logits:1.7490183 num_samples_in_batch:128 var_norm/all:702.34021
I0420 07:59:58.012236 140375134025472 trainer.py:511] time: 6.771603
I0420 07:59:58.013267 140375134025472 trainer.py:522] step: 220 fraction_of_correct_next_step_preds:0.46866179 fraction_of_correct_next_step_preds/logits:0.46866179 grad_norm/all:0.91434407 grad_scale_all:1 log_pplx:1.7713562 log_pplx/logits:1.7713562 loss:1.7713562 loss/logits:1.7713562 num_samples_in_batch:128 var_norm/all:702.32318
I0420 08:00:01.152885 140375142418176 trainer.py:371] Steps/second: 0.128723, Examples/second: 17.450165
I0420 08:00:03.786756 140375134025472 trainer.py:511] time: 5.773228
I0420 08:00:03.788918 140375134025472 trainer.py:522] step: 221 fraction_of_correct_next_step_preds:0.45591307 fraction_of_correct_next_step_preds/logits:0.45591307 grad_norm/all:0.75105029 grad_scale_all:1 log_pplx:1.7882202 log_pplx/logits:1.7882202 loss:1.7882202 loss/logits:1.7882202 num_samples_in_batch:128 var_norm/all:702.30609
I0420 08:00:11.162324 140375142418176 trainer.py:371] Steps/second: 0.128555, Examples/second: 17.423020
I0420 08:00:11.915620 140375134025472 trainer.py:511] time: 8.126472
I0420 08:00:11.916634 140375134025472 trainer.py:522] step: 222 fraction_of_correct_next_step_preds:0.46780324 fraction_of_correct_next_step_preds/logits:0.46780324 grad_norm/all:0.84238148 grad_scale_all:1 log_pplx:1.7543784 log_pplx/logits:1.7543784 loss:1.7543784 loss/logits:1.7543784 num_samples_in_batch:128 var_norm/all:702.28894
I0420 08:00:20.346482 140375134025472 trainer.py:511] time: 8.429596
I0420 08:00:20.348227 140375134025472 trainer.py:522] step: 223 fraction_of_correct_next_step_preds:0.4653933 fraction_of_correct_next_step_preds/logits:0.4653933 grad_norm/all:0.6992746 grad_scale_all:1 log_pplx:1.7581142 log_pplx/logits:1.7581142 loss:1.7581142 loss/logits:1.7581142 num_samples_in_batch:128 var_norm/all:702.27179
I0420 08:00:21.172450 140375142418176 trainer.py:371] Steps/second: 0.128968, Examples/second: 17.470208
I0420 08:00:28.055938 140375134025472 trainer.py:511] time: 7.707268
I0420 08:00:28.056710 140375134025472 trainer.py:522] step: 224 fraction_of_correct_next_step_preds:0.46492544 fraction_of_correct_next_step_preds/logits:0.46492544 grad_norm/all:0.73407185 grad_scale_all:1 log_pplx:1.7659955 log_pplx/logits:1.7659955 loss:1.7659955 loss/logits:1.7659955 num_samples_in_batch:128 var_norm/all:702.25458
I0420 08:00:31.183248 140375142418176 trainer.py:371] Steps/second: 0.128800, Examples/second: 17.443245
I0420 08:00:37.036247 140375134025472 trainer.py:511] time: 8.979163
I0420 08:00:37.037311 140375134025472 trainer.py:522] step: 225 fraction_of_correct_next_step_preds:0.47491661 fraction_of_correct_next_step_preds/logits:0.47491661 grad_norm/all:0.80260926 grad_scale_all:1 log_pplx:1.7374296 log_pplx/logits:1.7374296 loss:1.7374296 loss/logits:1.7374296 num_samples_in_batch:128 var_norm/all:702.23724
I0420 08:00:41.191685 140375142418176 trainer.py:371] Steps/second: 0.128635, Examples/second: 17.416615
I0420 08:00:44.467958 140375134025472 trainer.py:511] time: 7.430286
I0420 08:00:44.468951 140375134025472 trainer.py:522] step: 226 fraction_of_correct_next_step_preds:0.46759757 fraction_of_correct_next_step_preds/logits:0.46759757 grad_norm/all:0.47861227 grad_scale_all:1 log_pplx:1.756492 log_pplx/logits:1.756492 loss:1.756492 loss/logits:1.756492 num_samples_in_batch:128 var_norm/all:702.21997
I0420 08:00:51.193407 140375142418176 trainer.py:371] Steps/second: 0.128472, Examples/second: 17.390355
I0420 08:00:52.937412 140375134025472 trainer.py:511] time: 8.468178
I0420 08:00:52.938375 140375134025472 trainer.py:522] step: 227 fraction_of_correct_next_step_preds:0.46996406 fraction_of_correct_next_step_preds/logits:0.46996406 grad_norm/all:0.58960819 grad_scale_all:1 log_pplx:1.7427672 log_pplx/logits:1.7427672 loss:1.7427672 loss/logits:1.7427672 num_samples_in_batch:128 var_norm/all:702.20258
I0420 08:00:57.075196 140375134025472 trainer.py:511] time: 4.136514
I0420 08:00:57.076277 140375134025472 trainer.py:522] step: 228 fraction_of_correct_next_step_preds:0.46155813 fraction_of_correct_next_step_preds/logits:0.46155813 grad_norm/all:0.61242658 grad_scale_all:1 log_pplx:1.7996696 log_pplx/logits:1.7996696 loss:1.7996696 loss/logits:1.7996696 num_samples_in_batch:256 var_norm/all:702.18518
I0420 08:01:01.202909 140375142418176 trainer.py:371] Steps/second: 0.128876, Examples/second: 17.509017
I0420 08:01:04.000119 140375134025472 trainer.py:511] time: 6.923594
I0420 08:01:04.001703 140375134025472 trainer.py:522] step: 229 fraction_of_correct_next_step_preds:0.46669361 fraction_of_correct_next_step_preds/logits:0.46669361 grad_norm/all:0.83380169 grad_scale_all:1 log_pplx:1.7546902 log_pplx/logits:1.7546902 loss:1.7546902 loss/logits:1.7546902 num_samples_in_batch:128 var_norm/all:702.16779
I0420 08:01:11.213485 140375142418176 trainer.py:371] Steps/second: 0.128713, Examples/second: 17.482445
I0420 08:01:11.214068 140375142418176 trainer.py:268] Save checkpoint
I0420 08:01:12.066224 140375134025472 trainer.py:511] time: 8.064283
I0420 08:01:12.067042 140375134025472 trainer.py:522] step: 230 fraction_of_correct_next_step_preds:0.47051281 fraction_of_correct_next_step_preds/logits:0.47051281 grad_norm/all:0.89386725 grad_scale_all:1 log_pplx:1.7668961 log_pplx/logits:1.7668961 loss:1.7668961 loss/logits:1.7668961 num_samples_in_batch:128 var_norm/all:702.15027
W0420 08:01:13.397505 140375142418176 meta_graph.py:447] Issue encountered when serializing __batch_norm_update_dict. Type is unsupported, or the types of the items don't match field type in CollectionDef. Note this is a warning and probably safe to ignore. 'dict' object has no attribute 'name'
W0420 08:01:13.397821 140375142418176 meta_graph.py:447] Issue encountered when serializing __model_split_id_stack. Type is unsupported, or the types of the items don't match field type in CollectionDef. Note this is a warning and probably safe to ignore. 'list' object has no attribute 'name'
I0420 08:01:13.592818 140375142418176 trainer.py:270] Save checkpoint done: /data/dingzhenyou/speech_data/librispeech/log/train/ckpt-00000229
I0420 08:01:17.900814 140375134025472 trainer.py:511] time: 5.833391
I0420 08:01:17.902220 140375134025472 trainer.py:522] step: 231 fraction_of_correct_next_step_preds:0.46253422 fraction_of_correct_next_step_preds/logits:0.46253422 grad_norm/all:0.90772861 grad_scale_all:1 log_pplx:1.7824725 log_pplx/logits:1.7824725 loss:1.7824725 loss/logits:1.7824725 num_samples_in_batch:128 var_norm/all:702.13281
I0420 08:01:21.222120 140375142418176 trainer.py:371] Steps/second: 0.129111, Examples/second: 17.527732
I0420 08:01:25.758121 140375134025472 trainer.py:511] time: 7.855735
I0420 08:01:25.759021 140375134025472 trainer.py:522] step: 232 fraction_of_correct_next_step_preds:0.46910375 fraction_of_correct_next_step_preds/logits:0.46910375 grad_norm/all:1.3872631 grad_scale_all:0.72084379 log_pplx:1.7563069 log_pplx/logits:1.7563069 loss:1.7563069 loss/logits:1.7563069 num_samples_in_batch:128 var_norm/all:702.11523
I0420 08:01:31.223759 140375142418176 trainer.py:371] Steps/second: 0.128949, Examples/second: 17.501440
I0420 08:01:34.945935 140375134025472 trainer.py:511] time: 9.186722
I0420 08:01:34.947176 140375134025472 trainer.py:522] step: 233 fraction_of_correct_next_step_preds:0.47437832 fraction_of_correct_next_step_preds/logits:0.47437832 grad_norm/all:0.63594443 grad_scale_all:1 log_pplx:1.741677 log_pplx/logits:1.741677 loss:1.741677 loss/logits:1.741677 num_samples_in_batch:128 var_norm/all:702.09827
I0420 08:01:41.231878 140375142418176 trainer.py:371] Steps/second: 0.128788, Examples/second: 17.475373
I0420 08:01:42.245994 140375134025472 trainer.py:511] time: 7.298589
I0420 08:01:42.247441 140375134025472 trainer.py:522] step: 234 fraction_of_correct_next_step_preds:0.46905929 fraction_of_correct_next_step_preds/logits:0.46905929 grad_norm/all:0.96724659 grad_scale_all:1 log_pplx:1.7541467 log_pplx/logits:1.7541467 loss:1.7541467 loss/logits:1.7541467 num_samples_in_batch:128 var_norm/all:702.08112
I0420 08:01:50.494924 140375134025472 trainer.py:511] time: 8.247263
I0420 08:01:50.496362 140375134025472 trainer.py:522] step: 235 fraction_of_correct_next_step_preds:0.47312659 fraction_of_correct_next_step_preds/logits:0.47312659 grad_norm/all:1.1440966 grad_scale_all:0.87405205 log_pplx:1.7468596 log_pplx/logits:1.7468596 loss:1.7468596 loss/logits:1.7468596 num_samples_in_batch:128 var_norm/all:702.0639
I0420 08:01:51.243119 140375142418176 trainer.py:371] Steps/second: 0.129179, Examples/second: 17.519926
I0420 08:01:56.405097 140375134025472 trainer.py:511] time: 5.908442
I0420 08:01:56.406378 140375134025472 trainer.py:522] step: 236 fraction_of_correct_next_step_preds:0.47224933 fraction_of_correct_next_step_preds/logits:0.47224933 grad_norm/all:0.83175695 grad_scale_all:1 log_pplx:1.7403471 log_pplx/logits:1.7403471 loss:1.7403471 loss/logits:1.7403471 num_samples_in_batch:128 var_norm/all:702.04681
I0420 08:02:01.251878 140375142418176 trainer.py:371] Steps/second: 0.129019, Examples/second: 17.494039
I0420 08:02:04.464371 140375134025472 trainer.py:511] time: 8.057689
I0420 08:02:04.465137 140375134025472 trainer.py:522] step: 237 fraction_of_correct_next_step_preds:0.470265 fraction_of_correct_next_step_preds/logits:0.470265 grad_norm/all:0.5707919 grad_scale_all:1 log_pplx:1.7329159 log_pplx/logits:1.7329159 loss:1.7329159 loss/logits:1.7329159 num_samples_in_batch:128 var_norm/all:702.02966
I0420 08:02:11.283281 140375142418176 trainer.py:371] Steps/second: 0.128859, Examples/second: 17.468229
I0420 08:02:11.318855 140375134025472 trainer.py:511] time: 6.853433
I0420 08:02:11.319916 140375134025472 trainer.py:522] step: 238 fraction_of_correct_next_step_preds:0.47459942 fraction_of_correct_next_step_preds/logits:0.47459942 grad_norm/all:1.0437591 grad_scale_all:0.95807546 log_pplx:1.7425517 log_pplx/logits:1.7425517 loss:1.7425517 loss/logits:1.7425517 num_samples_in_batch:128 var_norm/all:702.01239
I0420 08:02:18.968424 140375134025472 trainer.py:511] time: 7.648252
I0420 08:02:18.969336 140375134025472 trainer.py:522] step: 239 fraction_of_correct_next_step_preds:0.46933535 fraction_of_correct_next_step_preds/logits:0.46933535 grad_norm/all:0.87260091 grad_scale_all:1 log_pplx:1.7371274 log_pplx/logits:1.7371274 loss:1.7371274 loss/logits:1.7371274 num_samples_in_batch:128 var_norm/all:701.99518
I0420 08:02:21.271883 140375142418176 trainer.py:371] Steps/second: 0.129244, Examples/second: 17.512300
I0420 08:02:28.374356 140375134025472 trainer.py:511] time: 9.404775
I0420 08:02:28.375895 140375134025472 trainer.py:522] step: 240 fraction_of_correct_next_step_preds:0.46985883 fraction_of_correct_next_step_preds/logits:0.46985883 grad_norm/all:0.69113106 grad_scale_all:1 log_pplx:1.740096 log_pplx/logits:1.740096 loss:1.740096 loss/logits:1.740096 num_samples_in_batch:128 var_norm/all:701.97778
I0420 08:02:31.282310 140375142418176 trainer.py:371] Steps/second: 0.129086, Examples/second: 17.486856
I0420 08:02:37.180418 140375134025472 trainer.py:511] time: 8.804240
I0420 08:02:37.181627 140375134025472 trainer.py:522] step: 241 fraction_of_correct_next_step_preds:0.47086367 fraction_of_correct_next_step_preds/logits:0.47086367 grad_norm/all:0.52514201 grad_scale_all:1 log_pplx:1.7458298 log_pplx/logits:1.7458298 loss:1.7458298 loss/logits:1.7458298 num_samples_in_batch:128 var_norm/all:701.96039
I0420 08:02:41.291951 140375142418176 trainer.py:371] Steps/second: 0.128930, Examples/second: 17.461692
I0420 08:02:44.532416 140375134025472 trainer.py:511] time: 7.350590
I0420 08:02:44.533498 140375134025472 trainer.py:522] step: 242 fraction_of_correct_next_step_preds:0.47320038 fraction_of_correct_next_step_preds/logits:0.47320038 grad_norm/all:0.77490425 grad_scale_all:1 log_pplx:1.7374105 log_pplx/logits:1.7374105 loss:1.7374105 loss/logits:1.7374105 num_samples_in_batch:128 var_norm/all:701.94281
I0420 08:02:48.880996 140375134025472 trainer.py:511] time: 4.347199
I0420 08:02:48.882100 140375134025472 trainer.py:522] step: 243 fraction_of_correct_next_step_preds:0.46300641 fraction_of_correct_next_step_preds/logits:0.46300641 grad_norm/all:0.83082104 grad_scale_all:1 log_pplx:1.7768013 log_pplx/logits:1.7768013 loss:1.7768013 loss/logits:1.7768013 num_samples_in_batch:256 var_norm/all:701.92523
I0420 08:02:51.303666 140375142418176 trainer.py:371] Steps/second: 0.129307, Examples/second: 17.573003
I0420 08:02:57.107769 140375134025472 trainer.py:511] time: 8.225317
I0420 08:02:57.108860 140375134025472 trainer.py:522] step: 244 fraction_of_correct_next_step_preds:0.4717139 fraction_of_correct_next_step_preds/logits:0.4717139 grad_norm/all:0.93524802 grad_scale_all:1 log_pplx:1.7399476 log_pplx/logits:1.7399476 loss:1.7399476 loss/logits:1.7399476 num_samples_in_batch:128 var_norm/all:701.90759
2019-04-20 08:02:57.112897: I lingvo/core/ops/record_batcher.cc:344] 1892 total seconds passed. Total records yielded: 33820. Total records skipped: 28
2019-04-20 08:02:57.113046: I lingvo/core/ops/record_batcher.cc:349] Out-of-range sample: 2110
2019-04-20 08:02:57.113072: I lingvo/core/ops/record_batcher.cc:349] Out-of-range sample: 1771
2019-04-20 08:02:57.113136: I lingvo/core/ops/record_batcher.cc:349] Out-of-range sample: 2152
2019-04-20 08:02:57.113153: I lingvo/core/ops/record_batcher.cc:349] Out-of-range sample: 1711
2019-04-20 08:02:57.113169: I lingvo/core/ops/record_batcher.cc:349] Out-of-range sample: 2713
2019-04-20 08:02:57.113188: I lingvo/core/ops/record_batcher.cc:349] Out-of-range sample: 1741
2019-04-20 08:02:57.113206: I lingvo/core/ops/record_batcher.cc:349] Out-of-range sample: 1717
2019-04-20 08:02:57.113225: I lingvo/core/ops/record_batcher.cc:349] Out-of-range sample: 1711
2019-04-20 08:02:57.113246: I lingvo/core/ops/record_batcher.cc:349] Out-of-range sample: 1714
2019-04-20 08:02:57.113263: I lingvo/core/ops/record_batcher.cc:349] Out-of-range sample: 2368
I0420 08:03:01.313136 140375142418176 trainer.py:371] Steps/second: 0.129151, Examples/second: 17.547652
I0420 08:03:02.861757 140375134025472 trainer.py:511] time: 5.752685
I0420 08:03:02.862689 140375134025472 trainer.py:522] step: 245 fraction_of_correct_next_step_preds:0.46972477 fraction_of_correct_next_step_preds/logits:0.46972477 grad_norm/all:0.63129705 grad_scale_all:1 log_pplx:1.7548465 log_pplx/logits:1.7548465 loss:1.7548465 loss/logits:1.7548465 num_samples_in_batch:128 var_norm/all:701.88983
I0420 08:03:11.322582 140375142418176 trainer.py:371] Steps/second: 0.128997, Examples/second: 17.522566
I0420 08:03:11.842037 140375134025472 trainer.py:511] time: 8.979055
I0420 08:03:11.843192 140375134025472 trainer.py:522] step: 246 fraction_of_correct_next_step_preds:0.47632533 fraction_of_correct_next_step_preds/logits:0.47632533 grad_norm/all:0.88023895 grad_scale_all:1 log_pplx:1.7404553 log_pplx/logits:1.7404553 loss:1.7404553 loss/logits:1.7404553 num_samples_in_batch:128 var_norm/all:701.87207
I0420 08:03:19.596901 140375134025472 trainer.py:511] time: 7.753358
I0420 08:03:19.597985 140375134025472 trainer.py:522] step: 247 fraction_of_correct_next_step_preds:0.46976799 fraction_of_correct_next_step_preds/logits:0.46976799 grad_norm/all:1.2076995 grad_scale_all:0.82802051 log_pplx:1.7418838 log_pplx/logits:1.7418838 loss:1.7418838 loss/logits:1.7418838 num_samples_in_batch:128 var_norm/all:701.85431
I0420 08:03:21.325915 140375142418176 trainer.py:371] Steps/second: 0.129369, Examples/second: 17.564842
I0420 08:03:26.526834 140375134025472 trainer.py:511] time: 6.928554
I0420 08:03:26.528428 140375134025472 trainer.py:522] step: 248 fraction_of_correct_next_step_preds:0.4831104 fraction_of_correct_next_step_preds/logits:0.4831104 grad_norm/all:0.73449987 grad_scale_all:1 log_pplx:1.7084304 log_pplx/logits:1.7084304 loss:1.7084304 loss/logits:1.7084304 num_samples_in_batch:128 var_norm/all:701.83673
I0420 08:03:31.333636 140375142418176 trainer.py:371] Steps/second: 0.129215, Examples/second: 17.539945
I0420 08:03:35.295617 140375134025472 trainer.py:511] time: 8.766982
I0420 08:03:35.296690 140375134025472 trainer.py:522] step: 249 fraction_of_correct_next_step_preds:0.48221907 fraction_of_correct_next_step_preds/logits:0.48221907 grad_norm/all:0.97499126 grad_scale_all:1 log_pplx:1.7169322 log_pplx/logits:1.7169322 loss:1.7169322 loss/logits:1.7169322 num_samples_in_batch:128 var_norm/all:701.81909
I0420 08:03:41.343588 140375142418176 trainer.py:371] Steps/second: 0.129063, Examples/second: 17.515286
I0420 08:03:42.775986 140375134025472 trainer.py:511] time: 7.479129
I0420 08:03:42.776784 140375134025472 trainer.py:522] step: 250 fraction_of_correct_next_step_preds:0.46579447 fraction_of_correct_next_step_preds/logits:0.46579447 grad_norm/all:1.0307354 grad_scale_all:0.97018111 log_pplx:1.7483062 log_pplx/logits:1.7483062 loss:1.7483062 loss/logits:1.7483062 num_samples_in_batch:128 var_norm/all:701.80145
I0420 08:03:51.352524 140375142418176 trainer.py:371] Steps/second: 0.128913, Examples/second: 17.490890
I0420 08:03:51.670706 140375134025472 trainer.py:511] time: 8.893632
I0420 08:03:51.671466 140375134025472 trainer.py:522] step: 251 fraction_of_correct_next_step_preds:0.47927254 fraction_of_correct_next_step_preds/logits:0.47927254 grad_norm/all:1.1698438 grad_scale_all:0.85481501 log_pplx:1.724764 log_pplx/logits:1.724764 loss:1.724764 loss/logits:1.724764 num_samples_in_batch:128 var_norm/all:701.78369
I0420 08:03:59.819484 140375134025472 trainer.py:511] time: 8.147690
I0420 08:03:59.821422 140375134025472 trainer.py:522] step: 252 fraction_of_correct_next_step_preds:0.47445041 fraction_of_correct_next_step_preds/logits:0.47445041 grad_norm/all:0.8330512 grad_scale_all:1 log_pplx:1.7177838 log_pplx/logits:1.7177838 loss:1.7177838 loss/logits:1.7177838 num_samples_in_batch:128 var_norm/all:701.76624
I0420 08:04:01.362076 140375142418176 trainer.py:371] Steps/second: 0.129277, Examples/second: 17.532405
I0420 08:04:06.722750 140375134025472 trainer.py:511] time: 6.901026
I0420 08:04:06.724076 140375134025472 trainer.py:522] step: 253 fraction_of_correct_next_step_preds:0.48421052 fraction_of_correct_next_step_preds/logits:0.48421052 grad_norm/all:0.87252587 grad_scale_all:1 log_pplx:1.7105172 log_pplx/logits:1.7105172 loss:1.7105172 loss/logits:1.7105172 num_samples_in_batch:128 var_norm/all:701.74854
I0420 08:04:11.372586 140375142418176 trainer.py:371] Steps/second: 0.129127, Examples/second: 17.508157
I0420 08:04:12.560870 140375134025472 trainer.py:511] time: 5.836542
I0420 08:04:12.561919 140375134025472 trainer.py:522] step: 254 fraction_of_correct_next_step_preds:0.47887152 fraction_of_correct_next_step_preds/logits:0.47887152 grad_norm/all:1.3652116 grad_scale_all:0.7324872 log_pplx:1.7168174 log_pplx/logits:1.7168174 loss:1.7168174 loss/logits:1.7168174 num_samples_in_batch:128 var_norm/all:701.73077
I0420 08:04:20.203851 140375134025472 trainer.py:511] time: 7.641743
I0420 08:04:20.204858 140375134025472 trainer.py:522] step: 255 fraction_of_correct_next_step_preds:0.48240939 fraction_of_correct_next_step_preds/logits:0.48240939 grad_norm/all:0.72274792 grad_scale_all:1 log_pplx:1.7139823 log_pplx/logits:1.7139823 loss:1.7139823 loss/logits:1.7139823 num_samples_in_batch:128 var_norm/all:701.7135
I0420 08:04:21.382539 140375142418176 trainer.py:371] Steps/second: 0.129486, Examples/second: 17.549158
I0420 08:04:27.892129 140375134025472 trainer.py:511] time: 7.686936
I0420 08:04:27.893568 140375134025472 trainer.py:522] step: 256 fraction_of_correct_next_step_preds:0.46987805 fraction_of_correct_next_step_preds/logits:0.46987805 grad_norm/all:1.5648351 grad_scale_all:0.639045 log_pplx:1.7303852 log_pplx/logits:1.7303852 loss:1.7303852 loss/logits:1.7303852 num_samples_in_batch:128 var_norm/all:701.69611
I0420 08:04:31.392179 140375142418176 trainer.py:371] Steps/second: 0.129336, Examples/second: 17.525079
I0420 08:04:36.150166 140375134025472 trainer.py:511] time: 8.256426
I0420 08:04:36.150923 140375134025472 trainer.py:522] step: 257 fraction_of_correct_next_step_preds:0.48051399 fraction_of_correct_next_step_preds/logits:0.48051399 grad_norm/all:0.92478973 grad_scale_all:1 log_pplx:1.712966 log_pplx/logits:1.712966 loss:1.712966 loss/logits:1.712966 num_samples_in_batch:128 var_norm/all:701.67926
I0420 08:04:41.403631 140375142418176 trainer.py:371] Steps/second: 0.129188, Examples/second: 17.501228
I0420 08:04:45.177848 140375134025472 trainer.py:511] time: 9.026632
I0420 08:04:45.179218 140375134025472 trainer.py:522] step: 258 fraction_of_correct_next_step_preds:0.48063409 fraction_of_correct_next_step_preds/logits:0.48063409 grad_norm/all:1.1676286 grad_scale_all:0.85643667 log_pplx:1.7200587 log_pplx/logits:1.7200587 loss:1.7200587 loss/logits:1.7200587 num_samples_in_batch:128 var_norm/all:701.66211
I0420 08:04:51.412882 140375142418176 trainer.py:371] Steps/second: 0.129042, Examples/second: 17.477632
I0420 08:04:52.784845 140375134025472 trainer.py:511] time: 7.605420
I0420 08:04:52.785636 140375134025472 trainer.py:522] step: 259 fraction_of_correct_next_step_preds:0.47774762 fraction_of_correct_next_step_preds/logits:0.47774762 grad_norm/all:1.1290226 grad_scale_all:0.88572186 log_pplx:1.7213285 log_pplx/logits:1.7213285 loss:1.7213285 loss/logits:1.7213285 num_samples_in_batch:128 var_norm/all:701.64526
I0420 08:04:59.750433 140375134025472 trainer.py:511] time: 6.964426
I0420 08:04:59.751822 140375134025472 trainer.py:522] step: 260 fraction_of_correct_next_step_preds:0.4800995 fraction_of_correct_next_step_preds/logits:0.4800995 grad_norm/all:0.97840136 grad_scale_all:1 log_pplx:1.7151138 log_pplx/logits:1.7151138 loss:1.7151138 loss/logits:1.7151138 num_samples_in_batch:128 var_norm/all:701.62842
I0420 08:05:01.424606 140375142418176 trainer.py:371] Steps/second: 0.129394, Examples/second: 17.581654
I0420 08:05:04.086353 140375134025472 trainer.py:511] time: 4.334335
I0420 08:05:04.087388 140375134025472 trainer.py:522] step: 261 fraction_of_correct_next_step_preds:0.46974903 fraction_of_correct_next_step_preds/logits:0.46974903 grad_norm/all:1.0458601 grad_scale_all:0.95615089 log_pplx:1.75725 log_pplx/logits:1.75725 loss:1.75725 loss/logits:1.75725 num_samples_in_batch:256 var_norm/all:701.61133
I0420 08:05:11.432993 140375142418176 trainer.py:371] Steps/second: 0.129248, Examples/second: 17.557903
I0420 08:05:12.238946 140375134025472 trainer.py:511] time: 8.151325
I0420 08:05:12.239756 140375134025472 trainer.py:522] step: 262 fraction_of_correct_next_step_preds:0.48408258 fraction_of_correct_next_step_preds/logits:0.48408258 grad_norm/all:0.81047904 grad_scale_all:1 log_pplx:1.7101616 log_pplx/logits:1.7101616 loss:1.7101616 loss/logits:1.7101616 num_samples_in_batch:128 var_norm/all:701.59418
I0420 08:05:17.831022 140375134025472 trainer.py:511] time: 5.590824
I0420 08:05:17.831772 140375134025472 trainer.py:522] step: 263 fraction_of_correct_next_step_preds:0.48156208 fraction_of_correct_next_step_preds/logits:0.48156208 grad_norm/all:1.1611915 grad_scale_all:0.86118442 log_pplx:1.7126567 log_pplx/logits:1.7126567 loss:1.7126567 loss/logits:1.7126567 num_samples_in_batch:128 var_norm/all:701.5769
I0420 08:05:21.436331 140375142418176 trainer.py:371] Steps/second: 0.129596, Examples/second: 17.597503
I0420 08:05:26.736340 140375134025472 trainer.py:511] time: 8.904243
I0420 08:05:26.737313 140375134025472 trainer.py:522] step: 264 fraction_of_correct_next_step_preds:0.4776144 fraction_of_correct_next_step_preds/logits:0.4776144 grad_norm/all:0.87775058 grad_scale_all:1 log_pplx:1.7094767 log_pplx/logits:1.7094767 loss:1.7094767 loss/logits:1.7094767 num_samples_in_batch:128 var_norm/all:701.55969
I0420 08:05:31.443900 140375142418176 trainer.py:371] Steps/second: 0.129451, Examples/second: 17.573912
I0420 08:05:34.129517 140375134025472 trainer.py:511] time: 7.391932
I0420 08:05:34.130951 140375134025472 trainer.py:522] step: 265 fraction_of_correct_next_step_preds:0.48271435 fraction_of_correct_next_step_preds/logits:0.48271435 grad_norm/all:1.0273635 grad_scale_all:0.97336531 log_pplx:1.7104719 log_pplx/logits:1.7104719 loss:1.7104719 loss/logits:1.7104719 num_samples_in_batch:128 var_norm/all:701.54242
I0420 08:05:41.453370 140375142418176 trainer.py:371] Steps/second: 0.129306, Examples/second: 17.550536
I0420 08:05:42.639960 140375134025472 trainer.py:511] time: 8.508722
I0420 08:05:42.641098 140375134025472 trainer.py:522] step: 266 fraction_of_correct_next_step_preds:0.48919922 fraction_of_correct_next_step_preds/logits:0.48919922 grad_norm/all:0.61877698 grad_scale_all:1 log_pplx:1.6739788 log_pplx/logits:1.6739788 loss:1.6739788 loss/logits:1.6739788 num_samples_in_batch:128 var_norm/all:701.52502
I0420 08:05:50.417263 140375134025472 trainer.py:511] time: 7.775875
I0420 08:05:50.418644 140375134025472 trainer.py:522] step: 267 fraction_of_correct_next_step_preds:0.47832456 fraction_of_correct_next_step_preds/logits:0.47832456 grad_norm/all:0.93601477 grad_scale_all:1 log_pplx:1.7074571 log_pplx/logits:1.7074571 loss:1.7074571 loss/logits:1.7074571 num_samples_in_batch:128 var_norm/all:701.50745
I0420 08:05:51.466936 140375142418176 trainer.py:371] Steps/second: 0.129649, Examples/second: 17.589507
I0420 08:05:56.287250 140375134025472 trainer.py:511] time: 5.868383
I0420 08:05:56.288091 140375134025472 trainer.py:522] step: 268 fraction_of_correct_next_step_preds:0.47999004 fraction_of_correct_next_step_preds/logits:0.47999004 grad_norm/all:0.85305828 grad_scale_all:1 log_pplx:1.7220774 log_pplx/logits:1.7220774 loss:1.7220774 loss/logits:1.7220774 num_samples_in_batch:128 var_norm/all:701.48969
I0420 08:06:01.473455 140375142418176 trainer.py:371] Steps/second: 0.129505, Examples/second: 17.566308
I0420 08:06:03.218576 140375134025472 trainer.py:511] time: 6.930291
I0420 08:06:03.219804 140375134025472 trainer.py:522] step: 269 fraction_of_correct_next_step_preds:0.48043379 fraction_of_correct_next_step_preds/logits:0.48043379 grad_norm/all:0.712749 grad_scale_all:1 log_pplx:1.6929581 log_pplx/logits:1.6929581 loss:1.6929581 loss/logits:1.6929581 num_samples_in_batch:128 var_norm/all:701.4718
I0420 08:06:11.352437 140375134025472 trainer.py:511] time: 8.132378
I0420 08:06:11.353750 140375134025472 trainer.py:522] step: 270 fraction_of_correct_next_step_preds:0.48388165 fraction_of_correct_next_step_preds/logits:0.48388165 grad_norm/all:0.61178142 grad_scale_all:1 log_pplx:1.6985079 log_pplx/logits:1.6985079 loss:1.6985079 loss/logits:1.6985079 num_samples_in_batch:128 var_norm/all:701.45386
I0420 08:06:11.496427 140375142418176 trainer.py:371] Steps/second: 0.129843, Examples/second: 17.543193
I0420 08:06:20.475584 140375134025472 trainer.py:511] time: 9.121587
I0420 08:06:20.476536 140375134025472 trainer.py:522] step: 271 fraction_of_correct_next_step_preds:0.4857007 fraction_of_correct_next_step_preds/logits:0.4857007 grad_norm/all:0.76059753 grad_scale_all:1 log_pplx:1.703305 log_pplx/logits:1.703305 loss:1.703305 loss/logits:1.703305 num_samples_in_batch:128 var_norm/all:701.43573
I0420 08:06:21.493537 140375142418176 trainer.py:371] Steps/second: 0.129700, Examples/second: 17.581776
I0420 08:06:28.765491 140375134025472 trainer.py:511] time: 8.288722
I0420 08:06:28.767008 140375134025472 trainer.py:522] step: 272 fraction_of_correct_next_step_preds:0.48852551 fraction_of_correct_next_step_preds/logits:0.48852551 grad_norm/all:0.78125584 grad_scale_all:1 log_pplx:1.6890454 log_pplx/logits:1.6890454 loss:1.6890454 loss/logits:1.6890454 num_samples_in_batch:128 var_norm/all:701.41754
I0420 08:06:31.502388 140375142418176 trainer.py:371] Steps/second: 0.129558, Examples/second: 17.558926
I0420 08:06:36.136847 140375134025472 trainer.py:511] time: 7.369630
I0420 08:06:36.137947 140375134025472 trainer.py:522] step: 273 fraction_of_correct_next_step_preds:0.47953263 fraction_of_correct_next_step_preds/logits:0.47953263 grad_norm/all:0.89224702 grad_scale_all:1 log_pplx:1.6981378 log_pplx/logits:1.6981378 loss:1.6981378 loss/logits:1.6981378 num_samples_in_batch:128 var_norm/all:701.39923
I0420 08:06:41.513550 140375142418176 trainer.py:371] Steps/second: 0.129417, Examples/second: 17.536274
I0420 08:06:41.777515 140375134025472 trainer.py:511] time: 5.639333
I0420 08:06:41.778304 140375134025472 trainer.py:522] step: 274 fraction_of_correct_next_step_preds:0.48499754 fraction_of_correct_next_step_preds/logits:0.48499754 grad_norm/all:0.78847671 grad_scale_all:1 log_pplx:1.7076409 log_pplx/logits:1.7076409 loss:1.7076409 loss/logits:1.7076409 num_samples_in_batch:128 var_norm/all:701.3808
I0420 08:06:49.734405 140375134025472 trainer.py:511] time: 7.955748
I0420 08:06:49.735455 140375134025472 trainer.py:522] step: 275 fraction_of_correct_next_step_preds:0.491671 fraction_of_correct_next_step_preds/logits:0.491671 grad_norm/all:0.54001719 grad_scale_all:1 log_pplx:1.6724101
log_pplx/logits:1.6724101 loss:1.6724101 loss/logits:1.6724101 num_samples_in_batch:128 var_norm/all:701.3623 I0420 08:06:51.523920 140375142418176 trainer.py:371] Steps/second: 0.129750, Examples/second: 17.574235 I0420 08:06:56.667988 140375134025472 trainer.py:511] time: 6.932291 I0420 08:06:56.669092 140375134025472 trainer.py:522] step: 276 fraction_of_correct_next_step_preds:0.48732969 fraction_of_correct_next_step_preds/logits:0.48732969 grad_norm/all:0.75405437 grad_scale_all:1 log_pplx:1.6760614 log_pplx/logits:1.6760614 loss:1.6760614 loss/logits:1.6760614 num_samples_in_batch:128 var_norm/all:701.34375 I0420 08:07:01.532896 140375142418176 trainer.py:371] Steps/second: 0.129609, Examples/second: 17.551739 I0420 08:07:04.908233 140375134025472 trainer.py:511] time: 8.238858 I0420 08:07:04.909634 140375134025472 trainer.py:522] step: 277 fraction_of_correct_next_step_preds:0.49006793 fraction_of_correct_next_step_preds/logits:0.49006793 grad_norm/all:0.5504958 grad_scale_all:1 log_pplx:1.6877584 log_pplx/logits:1.6877584 loss:1.6877584 loss/logits:1.6877584 num_samples_in_batch:128 var_norm/all:701.32507 I0420 08:07:09.160671 140375134025472 trainer.py:511] time: 4.250840 I0420 08:07:09.161709 140375134025472 trainer.py:522] step: 278 fraction_of_correct_next_step_preds:0.47712374 fraction_of_correct_next_step_preds/logits:0.47712374 grad_norm/all:0.75490832 grad_scale_all:1 log_pplx:1.7187968 log_pplx/logits:1.7187968 loss:1.7187968 loss/logits:1.7187968 num_samples_in_batch:256 var_norm/all:701.30634 I0420 08:07:11.542623 140375142418176 trainer.py:371] Steps/second: 0.129938, Examples/second: 17.649105 I0420 08:07:17.528034 140375134025472 trainer.py:511] time: 8.366018 I0420 08:07:17.529331 140375134025472 trainer.py:522] step: 279 fraction_of_correct_next_step_preds:0.48743379 fraction_of_correct_next_step_preds/logits:0.48743379 grad_norm/all:0.92561078 grad_scale_all:1 log_pplx:1.6828735 log_pplx/logits:1.6828735 loss:1.6828735 loss/logits:1.6828735 
num_samples_in_batch:128 var_norm/all:701.28754 I0420 08:07:21.552903 140375142418176 trainer.py:371] Steps/second: 0.129798, Examples/second: 17.626461 I0420 08:07:26.421845 140375134025472 trainer.py:511] time: 8.892352 I0420 08:07:26.422921 140375134025472 trainer.py:522] step: 280 fraction_of_correct_next_step_preds:0.49203226 fraction_of_correct_next_step_preds/logits:0.49203226 grad_norm/all:0.74147272 grad_scale_all:1 log_pplx:1.6697469 log_pplx/logits:1.6697469 loss:1.6697469 loss/logits:1.6697469 num_samples_in_batch:128 var_norm/all:701.26868 I0420 08:07:31.554145 140375142418176 trainer.py:371] Steps/second: 0.129660, Examples/second: 17.604102 I0420 08:07:33.752093 140375134025472 trainer.py:511] time: 7.328991 I0420 08:07:33.753096 140375134025472 trainer.py:522] step: 281 fraction_of_correct_next_step_preds:0.49754408 fraction_of_correct_next_step_preds/logits:0.49754408 grad_norm/all:0.9283191 grad_scale_all:1 log_pplx:1.6691937 log_pplx/logits:1.6691937 loss:1.6691937 loss/logits:1.6691937 num_samples_in_batch:128 var_norm/all:701.24982 I0420 08:07:39.357894 140375134025472 trainer.py:511] time: 5.604640 I0420 08:07:39.358714 140375134025472 trainer.py:522] step: 282 fraction_of_correct_next_step_preds:0.49084425 fraction_of_correct_next_step_preds/logits:0.49084425 grad_norm/all:0.66445577 grad_scale_all:1 log_pplx:1.6889721 log_pplx/logits:1.6889721 loss:1.6889721 loss/logits:1.6889721 num_samples_in_batch:128 var_norm/all:701.23096 I0420 08:07:41.564384 140375142418176 trainer.py:371] Steps/second: 0.129983, Examples/second: 17.640874 I0420 08:07:46.204224 140375134025472 trainer.py:511] time: 6.845251 I0420 08:07:46.205530 140375134025472 trainer.py:522] step: 283 fraction_of_correct_next_step_preds:0.49110562 fraction_of_correct_next_step_preds/logits:0.49110562 grad_norm/all:0.97234619 grad_scale_all:1 log_pplx:1.6718854 log_pplx/logits:1.6718854 loss:1.6718854 loss/logits:1.6718854 num_samples_in_batch:128 var_norm/all:701.21191 I0420 
08:07:51.573463 140375142418176 trainer.py:371] Steps/second: 0.129845, Examples/second: 17.618590 I0420 08:07:54.169616 140375134025472 trainer.py:511] time: 7.963868 I0420 08:07:54.170538 140375134025472 trainer.py:522] step: 284 fraction_of_correct_next_step_preds:0.4850997 fraction_of_correct_next_step_preds/logits:0.4850997 grad_norm/all:0.79734397 grad_scale_all:1 log_pplx:1.6948088 log_pplx/logits:1.6948088 loss:1.6948088 loss/logits:1.6948088 num_samples_in_batch:128 var_norm/all:701.19293 I0420 08:08:01.584471 140375142418176 trainer.py:371] Steps/second: 0.129708, Examples/second: 17.596494 I0420 08:08:02.195472 140375134025472 trainer.py:511] time: 8.024653 I0420 08:08:02.196593 140375134025472 trainer.py:522] step: 285 fraction_of_correct_next_step_preds:0.49519411 fraction_of_correct_next_step_preds/logits:0.49519411 grad_norm/all:0.60972106 grad_scale_all:1 log_pplx:1.6642418 log_pplx/logits:1.6642418 loss:1.6642418 loss/logits:1.6642418 num_samples_in_batch:128 var_norm/all:701.17395 I0420 08:08:10.374047 140375134025472 trainer.py:511] time: 8.177250 I0420 08:08:10.375052 140375134025472 trainer.py:522] step: 286 fraction_of_correct_next_step_preds:0.49601725 fraction_of_correct_next_step_preds/logits:0.49601725 grad_norm/all:0.71168172 grad_scale_all:1 log_pplx:1.6702002 log_pplx/logits:1.6702002 loss:1.6702002 loss/logits:1.6702002 num_samples_in_batch:128 var_norm/all:701.15485 I0420 08:08:11.585005 140375142418176 trainer.py:371] Steps/second: 0.130028, Examples/second: 17.632876 I0420 08:08:19.508388 140375134025472 trainer.py:511] time: 9.133091 I0420 08:08:19.509922 140375134025472 trainer.py:522] step: 287 fraction_of_correct_next_step_preds:0.49295774 fraction_of_correct_next_step_preds/logits:0.49295774 grad_norm/all:0.56234354 grad_scale_all:1 log_pplx:1.6584225 log_pplx/logits:1.6584225 loss:1.6584225 loss/logits:1.6584225 num_samples_in_batch:128 var_norm/all:701.13568 I0420 08:08:21.597275 140375142418176 trainer.py:371] Steps/second: 
0.129891, Examples/second: 17.610906 I0420 08:08:26.930875 140375134025472 trainer.py:511] time: 7.420712 I0420 08:08:26.931936 140375134025472 trainer.py:522] step: 288 fraction_of_correct_next_step_preds:0.4916712 fraction_of_correct_next_step_preds/logits:0.4916712 grad_norm/all:0.66766959 grad_scale_all:1 log_pplx:1.6747226 log_pplx/logits:1.6747226 loss:1.6747226 loss/logits:1.6747226 num_samples_in_batch:128 var_norm/all:701.11646 I0420 08:08:31.607119 140375142418176 trainer.py:371] Steps/second: 0.129756, Examples/second: 17.589153 I0420 08:08:34.528407 140375134025472 trainer.py:511] time: 7.596274 I0420 08:08:34.529500 140375134025472 trainer.py:522] step: 289 fraction_of_correct_next_step_preds:0.49554795 fraction_of_correct_next_step_preds/logits:0.49554795 grad_norm/all:0.59034371 grad_scale_all:1 log_pplx:1.6630248 log_pplx/logits:1.6630248 loss:1.6630248 loss/logits:1.6630248 num_samples_in_batch:128 var_norm/all:701.09717 I0420 08:08:40.189884 140375134025472 trainer.py:511] time: 5.660071 I0420 08:08:40.190975 140375134025472 trainer.py:522] step: 290 fraction_of_correct_next_step_preds:0.49784851 fraction_of_correct_next_step_preds/logits:0.49784851 grad_norm/all:0.85040981 grad_scale_all:1 log_pplx:1.6592635 log_pplx/logits:1.6592635 loss:1.6592635 loss/logits:1.6592635 num_samples_in_batch:128 var_norm/all:701.07788 I0420 08:08:41.615701 140375142418176 trainer.py:371] Steps/second: 0.130071, Examples/second: 17.625015 I0420 08:08:47.114645 140375134025472 trainer.py:511] time: 6.899251 I0420 08:08:47.115684 140375134025472 trainer.py:522] step: 291 fraction_of_correct_next_step_preds:0.49960199 fraction_of_correct_next_step_preds/logits:0.49960199 grad_norm/all:0.69848967 grad_scale_all:1 log_pplx:1.6517125 log_pplx/logits:1.6517125 loss:1.6517125 loss/logits:1.6517125 num_samples_in_batch:128 var_norm/all:701.05847 I0420 08:08:51.625663 140375142418176 trainer.py:371] Steps/second: 0.129936, Examples/second: 17.603392 I0420 08:08:55.315984 
140375134025472 trainer.py:511] time: 8.199814 I0420 08:08:55.317042 140375134025472 trainer.py:522] step: 292 fraction_of_correct_next_step_preds:0.49012542 fraction_of_correct_next_step_preds/logits:0.49012542 grad_norm/all:0.76374143 grad_scale_all:1 log_pplx:1.6770569 log_pplx/logits:1.6770569 loss:1.6770569 loss/logits:1.6770569 num_samples_in_batch:128 var_norm/all:701.03912 I0420 08:08:59.675689 140375134025472 trainer.py:511] time: 4.358358 I0420 08:08:59.676516 140375134025472 trainer.py:522] step: 293 fraction_of_correct_next_step_preds:0.48613396 fraction_of_correct_next_step_preds/logits:0.48613396 grad_norm/all:0.58962899 grad_scale_all:1 log_pplx:1.7016851 log_pplx/logits:1.7016851 loss:1.7016851 loss/logits:1.7016851 num_samples_in_batch:256 var_norm/all:701.01959 I0420 08:09:01.636991 140375142418176 trainer.py:371] Steps/second: 0.130247, Examples/second: 17.695752 I0420 08:09:09.086409 140375134025472 trainer.py:511] time: 9.409725 I0420 08:09:09.087534 140375134025472 trainer.py:522] step: 294 fraction_of_correct_next_step_preds:0.49767861 fraction_of_correct_next_step_preds/logits:0.49767861 grad_norm/all:0.91316557 grad_scale_all:1 log_pplx:1.6625874 log_pplx/logits:1.6625874 loss:1.6625874 loss/logits:1.6625874 num_samples_in_batch:128 var_norm/all:701.00018 I0420 08:09:11.645663 140375142418176 trainer.py:371] Steps/second: 0.130112, Examples/second: 17.674016 I0420 08:09:17.783576 140375134025472 trainer.py:511] time: 8.695789 I0420 08:09:17.784698 140375134025472 trainer.py:522] step: 295 fraction_of_correct_next_step_preds:0.48983049 fraction_of_correct_next_step_preds/logits:0.48983049 grad_norm/all:0.95493954 grad_scale_all:1 log_pplx:1.6716349 log_pplx/logits:1.6716349 loss:1.6716349 loss/logits:1.6716349 num_samples_in_batch:128 var_norm/all:700.98065 I0420 08:09:21.646167 140375142418176 trainer.py:371] Steps/second: 0.129979, Examples/second: 17.652537 I0420 08:09:25.372036 140375134025472 trainer.py:511] time: 7.586990 I0420 
08:09:25.372812 140375134025472 trainer.py:522] step: 296 fraction_of_correct_next_step_preds:0.5000816 fraction_of_correct_next_step_preds/logits:0.5000816 grad_norm/all:0.64577919 grad_scale_all:1 log_pplx:1.6472374 log_pplx/logits:1.6472374 loss:1.6472374 loss/logits:1.6472374 num_samples_in_batch:128 var_norm/all:700.96106 I0420 08:09:31.650913 140375142418176 trainer.py:371] Steps/second: 0.129848, Examples/second: 17.631214 I0420 08:09:32.993964 140375134025472 trainer.py:511] time: 7.620850 I0420 08:09:32.995151 140375134025472 trainer.py:522] step: 297 fraction_of_correct_next_step_preds:0.4945648 fraction_of_correct_next_step_preds/logits:0.4945648 grad_norm/all:0.72654635 grad_scale_all:1 log_pplx:1.6568946 log_pplx/logits:1.6568946 loss:1.6568946 loss/logits:1.6568946 num_samples_in_batch:128 var_norm/all:700.94147 I0420 08:09:39.829082 140375134025472 trainer.py:511] time: 6.833739 I0420 08:09:39.830120 140375134025472 trainer.py:522] step: 298 fraction_of_correct_next_step_preds:0.49532753 fraction_of_correct_next_step_preds/logits:0.49532753 grad_norm/all:0.84149033 grad_scale_all:1 log_pplx:1.6645547 log_pplx/logits:1.6645547 loss:1.6645547 loss/logits:1.6645547 num_samples_in_batch:128 var_norm/all:700.92188 I0420 08:09:41.661695 140375142418176 trainer.py:371] Steps/second: 0.130153, Examples/second: 17.665935 I0420 08:09:45.428024 140375134025472 trainer.py:511] time: 5.597765 I0420 08:09:45.428751 140375134025472 trainer.py:522] step: 299 fraction_of_correct_next_step_preds:0.49844065 fraction_of_correct_next_step_preds/logits:0.49844065 grad_norm/all:0.72662562 grad_scale_all:1 log_pplx:1.6585479 log_pplx/logits:1.6585479 loss:1.6585479 loss/logits:1.6585479 num_samples_in_batch:128 var_norm/all:700.90222 I0420 08:09:51.669598 140375142418176 trainer.py:371] Steps/second: 0.130022, Examples/second: 17.644714 I0420 08:09:54.806416 140375134025472 trainer.py:511] time: 9.276496 I0420 08:09:54.807459 140375134025472 trainer.py:522] step: 300 
fraction_of_correct_next_step_preds:0.49890438 fraction_of_correct_next_step_preds/logits:0.49890438 grad_norm/all:0.47588634 grad_scale_all:1 log_pplx:1.6455002 log_pplx/logits:1.6455002 loss:1.6455002 loss/logits:1.6455002 num_samples_in_batch:128 var_norm/all:700.88251 I0420 08:10:01.680495 140375142418176 trainer.py:371] Steps/second: 0.129891, Examples/second: 17.623655 I0420 08:10:02.882178 140375134025472 trainer.py:511] time: 8.074491 I0420 08:10:02.883347 140375134025472 base_runner.py:115] step: 301 fraction_of_correct_next_step_preds:0.49539271 fraction_of_correct_next_step_preds/logits:0.49539271 grad_norm/all:0.55232203 grad_scale_all:1 log_pplx:1.6580131 log_pplx/logits:1.6580131 loss:1.6580131 loss/logits:1.6580131 num_samples_in_batch:128 var_norm/all:700.86273 I0420 08:10:11.410650 140375134025472 trainer.py:511] time: 8.527046 I0420 08:10:11.412190 140375134025472 trainer.py:522] step: 302 fraction_of_correct_next_step_preds:0.50001818 fraction_of_correct_next_step_preds/logits:0.50001818 grad_norm/all:0.59572774 grad_scale_all:1 log_pplx:1.6452334 log_pplx/logits:1.6452334 loss:1.6452334 loss/logits:1.6452334 num_samples_in_batch:128 var_norm/all:700.84296 I0420 08:10:11.702862 140375142418176 trainer.py:371] Steps/second: 0.130192, Examples/second: 17.657870 I0420 08:10:11.703310 140375142418176 trainer.py:275] Write summary @302 2019-04-20 08:10:11.704807: I lingvo/core/ops/record_batcher.cc:344] 2302 total seconds passed. Total records yielded: 319. 
Total records skipped: 0 I0420 08:10:21.761969 140375134025472 trainer.py:511] time: 10.349495 I0420 08:10:21.766017 140375134025472 trainer.py:522] step: 303 fraction_of_correct_next_step_preds:0.49674481 fraction_of_correct_next_step_preds/logits:0.49674481 grad_norm/all:0.54272324 grad_scale_all:1 log_pplx:1.6499885 log_pplx/logits:1.6499885 loss:1.6499885 loss/logits:1.6499885 num_samples_in_batch:128 var_norm/all:700.82318 I0420 08:10:33.999098 140375134025472 trainer.py:511] time: 12.231365 I0420 08:10:34.002353 140375134025472 trainer.py:522] step: 304 fraction_of_correct_next_step_preds:0.50056696 fraction_of_correct_next_step_preds/logits:0.50056696 grad_norm/all:0.56036305 grad_scale_all:1 log_pplx:1.6472863 log_pplx/logits:1.6472863 loss:1.6472863 loss/logits:1.6472863 num_samples_in_batch:128 var_norm/all:700.80322 I0420 08:10:44.876357 140375134025472 trainer.py:511] time: 10.873799 I0420 08:10:44.878129 140375134025472 trainer.py:522] step: 305 fraction_of_correct_next_step_preds:0.49619058 fraction_of_correct_next_step_preds/logits:0.49619058 grad_norm/all:0.45682853 grad_scale_all:1 log_pplx:1.6571825 log_pplx/logits:1.6571825 loss:1.6571825 loss/logits:1.6571825 num_samples_in_batch:128 var_norm/all:700.78339 I0420 08:10:54.934130 140375134025472 trainer.py:511] time: 10.055444 I0420 08:10:54.951957 140375134025472 trainer.py:522] step: 306 fraction_of_correct_next_step_preds:0.49130249 fraction_of_correct_next_step_preds/logits:0.49130249 grad_norm/all:0.55479544 grad_scale_all:1 log_pplx:1.6631523 log_pplx/logits:1.6631523 loss:1.6631523 loss/logits:1.6631523 num_samples_in_batch:128 var_norm/all:700.76337 I0420 08:11:08.898406 140375134025472 trainer.py:511] time: 13.945316 I0420 08:11:08.900434 140375134025472 trainer.py:522] step: 307 fraction_of_correct_next_step_preds:0.50046659 fraction_of_correct_next_step_preds/logits:0.50046659 grad_norm/all:0.5233255 grad_scale_all:1 log_pplx:1.6387581 log_pplx/logits:1.6387581 loss:1.6387581 
loss/logits:1.6387581 num_samples_in_batch:128 var_norm/all:700.74335 I0420 08:11:23.458416 140375134025472 trainer.py:511] time: 14.557503 I0420 08:11:23.460469 140375134025472 trainer.py:522] step: 308 fraction_of_correct_next_step_preds:0.50056994 fraction_of_correct_next_step_preds/logits:0.50056994 grad_norm/all:0.53521502 grad_scale_all:1 log_pplx:1.6413771 log_pplx/logits:1.6413771 loss:1.6413771 loss/logits:1.6413771 num_samples_in_batch:128 var_norm/all:700.72327 I0420 08:11:25.631345 140375142418176 trainer.py:284] Write summary done: step 302 I0420 08:11:25.643549 140375142418176 base_runner.py:115] step: 302, steps/sec: 0.13, examples/sec: 17.66 I0420 08:11:25.647203 140375142418176 trainer.py:371] Steps/second: 0.128677, Examples/second: 17.486705 I0420 08:11:25.647681 140375142418176 trainer.py:268] Save checkpoint W0420 08:11:27.696300 140375142418176 meta_graph.py:447] Issue encountered when serializing __batch_norm_update_dict. Type is unsupported, or the types of the items don't match field type in CollectionDef. Note this is a warning and probably safe to ignore. 'dict' object has no attribute 'name' W0420 08:11:27.696728 140375142418176 meta_graph.py:447] Issue encountered when serializing __model_split_id_stack. Type is unsupported, or the types of the items don't match field type in CollectionDef. Note this is a warning and probably safe to ignore. 
'list' object has no attribute 'name' I0420 08:11:27.889962 140375142418176 trainer.py:270] Save checkpoint done: /data/dingzhenyou/speech_data/librispeech/log/train/ckpt-00000308 I0420 08:11:28.140069 140375134025472 trainer.py:511] time: 4.679287 I0420 08:11:28.141210 140375134025472 trainer.py:522] step: 309 fraction_of_correct_next_step_preds:0.4929764 fraction_of_correct_next_step_preds/logits:0.4929764 grad_norm/all:0.60663974 grad_scale_all:1 log_pplx:1.6712581 log_pplx/logits:1.6712581 loss:1.6712581 loss/logits:1.6712581 num_samples_in_batch:256 var_norm/all:700.70325 I0420 08:11:35.650377 140375142418176 trainer.py:371] Steps/second: 0.128558, Examples/second: 17.467183 I0420 08:11:36.566423 140375134025472 trainer.py:511] time: 8.425011 I0420 08:11:36.567435 140375134025472 trainer.py:522] step: 310 fraction_of_correct_next_step_preds:0.49667397 fraction_of_correct_next_step_preds/logits:0.49667397 grad_norm/all:0.6936717 grad_scale_all:1 log_pplx:1.6473578 log_pplx/logits:1.6473578 loss:1.6473578 loss/logits:1.6473578 num_samples_in_batch:128 var_norm/all:700.68311 I0420 08:11:44.382427 140375134025472 trainer.py:511] time: 7.814773 I0420 08:11:44.383707 140375134025472 trainer.py:522] step: 311 fraction_of_correct_next_step_preds:0.51070797 fraction_of_correct_next_step_preds/logits:0.51070797 grad_norm/all:0.60806739 grad_scale_all:1 log_pplx:1.6229703 log_pplx/logits:1.6229703 loss:1.6229703 loss/logits:1.6229703 num_samples_in_batch:128 var_norm/all:700.66296 I0420 08:11:45.648066 140375142418176 trainer.py:371] Steps/second: 0.128854, Examples/second: 17.500896 I0420 08:11:52.074405 140375134025472 trainer.py:511] time: 7.690345 I0420 08:11:52.075900 140375134025472 trainer.py:522] step: 312 fraction_of_correct_next_step_preds:0.50249594 fraction_of_correct_next_step_preds/logits:0.50249594 grad_norm/all:0.64068145 grad_scale_all:1 log_pplx:1.6338384 log_pplx/logits:1.6338384 loss:1.6338384 loss/logits:1.6338384 num_samples_in_batch:128 
var_norm/all:700.64282 I0420 08:11:55.658617 140375142418176 trainer.py:371] Steps/second: 0.128734, Examples/second: 17.481423 I0420 08:11:59.064162 140375134025472 trainer.py:511] time: 6.987994 I0420 08:11:59.065263 140375134025472 trainer.py:522] step: 313 fraction_of_correct_next_step_preds:0.50249946 fraction_of_correct_next_step_preds/logits:0.50249946 grad_norm/all:0.59031916 grad_scale_all:1 log_pplx:1.6381044 log_pplx/logits:1.6381044 loss:1.6381044 loss/logits:1.6381044 num_samples_in_batch:128 var_norm/all:700.62274 I0420 08:12:04.761533 140375134025472 trainer.py:511] time: 5.696028 I0420 08:12:04.762573 140375134025472 trainer.py:522] step: 314 fraction_of_correct_next_step_preds:0.49937415 fraction_of_correct_next_step_preds/logits:0.49937415 grad_norm/all:0.87156236 grad_scale_all:1 log_pplx:1.6565 log_pplx/logits:1.6565 loss:1.6565 loss/logits:1.6565 num_samples_in_batch:128 var_norm/all:700.60254 I0420 08:12:05.668169 140375142418176 trainer.py:371] Steps/second: 0.129026, Examples/second: 17.514715 I0420 08:12:13.708889 140375134025472 trainer.py:511] time: 8.945945 I0420 08:12:13.709841 140375134025472 trainer.py:522] step: 315 fraction_of_correct_next_step_preds:0.50379395 fraction_of_correct_next_step_preds/logits:0.50379395 grad_norm/all:0.72499329 grad_scale_all:1 log_pplx:1.631755 log_pplx/logits:1.631755 loss:1.631755 loss/logits:1.631755 num_samples_in_batch:128 var_norm/all:700.5824
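A note on reading these records (an editor's sketch, not part of the trainer output): `grad_scale_all` is numerically consistent with global-norm gradient clipping at a maximum norm of 1.0 — whenever `grad_norm/all` exceeds 1 (e.g. steps 259, 261, 263, 265) the scale equals 1/grad_norm, and otherwise it is 1. The clip norm of 1.0 is inferred from the numbers, not stated in the log. Separately, `loss` equals `log_pplx` here, so training perplexity is simply exp(loss).

```python
import math

def grad_scale(global_norm: float, clip_norm: float = 1.0) -> float:
    """Scale factor for global-norm gradient clipping: gradients are
    multiplied by this so their global norm never exceeds clip_norm."""
    return min(1.0, clip_norm / global_norm)

# Values taken from steps 259 and 260 above:
assert abs(grad_scale(1.1290226) - 0.88572186) < 1e-4  # norm > 1, so clipped
assert grad_scale(0.97840136) == 1.0                   # norm < 1, untouched

# loss == log_pplx, so the per-token perplexity at step 259 is:
print(round(math.exp(1.7213285), 2))  # ≈ 5.59
```

By step 315 the loss has dropped to about 1.63, i.e. a perplexity of roughly 5.1, which matches the slow but steady improvement visible across these records.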