WARNING: Logging before flag parsing goes to stderr.
W0420 07:30:11.011545 140395597309760 deprecation_wrapper.py:119] From /data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/core/ops/py_x_ops.py:26: The name tf.resource_loader.get_path_to_datafile is deprecated. Please use tf.compat.v1.resource_loader.get_path_to_datafile instead.
W0420 07:30:11.038940 140395597309760 deprecation_wrapper.py:119] From /data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/core/py_utils.py:1234: The name tf.get_variable_scope is deprecated. Please use tf.compat.v1.get_variable_scope instead.
W0420 07:30:11.294478 140395597309760 deprecation_wrapper.py:119] From /data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/trainer.py:1556: The name tf.app.run is deprecated. Please use tf.compat.v1.app.run instead.
W0420 07:30:11.295180 140395597309760 deprecation_wrapper.py:119] From /data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/model_imports.py:46: The name tf.logging.info is deprecated. Please use tf.compat.v1.logging.info instead.
I0420 07:30:11.295275 140395597309760 model_imports.py:46] Importing lingvo.tasks.asr.params
W0420 07:30:11.316593 140395597309760 deprecation_wrapper.py:119] From /data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/model_registry.py:121: The name tf.logging.debug is deprecated. Please use tf.compat.v1.logging.debug instead.
I0420 07:30:11.316711 140395597309760 model_registry.py:124] Registering models from module: lingvo.tasks.asr.params.librispeech
I0420 07:30:11.320092 140395597309760 model_imports.py:46] Importing lingvo.tasks.image.params
I0420 07:30:11.322403 140395597309760 model_registry.py:124] Registering models from module: lingvo.tasks.image.params.mnist
I0420 07:30:11.322523 140395597309760 model_imports.py:46] Importing lingvo.tasks.lm.params
I0420 07:30:11.324174 140395597309760 model_registry.py:124] Registering models from module: lingvo.tasks.lm.params.one_billion_wds
I0420 07:30:11.326229 140395597309760 model_imports.py:46] Importing lingvo.tasks.mt.params
I0420 07:30:11.330373 140395597309760 model_registry.py:124] Registering models from module: lingvo.tasks.mt.params.wmt14_en_de
I0420 07:30:11.335803 140395597309760 model_registry.py:124] Registering models from module: lingvo.tasks.mt.params.wmtm16_en_de
I0420 07:30:11.335920 140395597309760 model_imports.py:46] Importing lingvo.tasks.punctuator.params
I0420 07:30:11.337392 140395597309760 model_registry.py:124] Registering models from module: lingvo.tasks.punctuator.params.codelab
W0420 07:30:11.337567 140395597309760 deprecation_wrapper.py:119] From /data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/trainer.py:1515: The name tf.logging.set_verbosity is deprecated. Please use tf.compat.v1.logging.set_verbosity instead.
W0420 07:30:11.337685 140395597309760 deprecation_wrapper.py:119] From /data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/trainer.py:1515: The name tf.logging.INFO is deprecated. Please use tf.compat.v1.logging.INFO instead.
W0420 07:30:11.337846 140395597309760 deprecation_wrapper.py:119] From /data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/trainer.py:1383: The name tf.train.Server is deprecated. Please use tf.distribute.Server instead.
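The warnings above are harmless: each one names its own fix (the `tf.compat.v1` alias). Independently of that, the C++-runtime lines that follow can be quieted with the standard `TF_CPP_MIN_LOG_LEVEL` environment variable. A minimal sketch (the variable and the `tf.compat.v1` names are standard TensorFlow 1.14-era API; exactly which messages remain depends on your TF build):

```python
import os

# TF_CPP_MIN_LOG_LEVEL controls the C++-runtime log lines
# ("Your CPU supports instructions ...", device discovery, XLA setup, etc.).
# It must be set BEFORE TensorFlow is first imported:
#   "0" = all messages, "1" = hide INFO, "2" = also hide WARNING, "3" = also hide ERROR.
os.environ.setdefault("TF_CPP_MIN_LOG_LEVEL", "1")

# The Python-level deprecation warnings come from 1.x names; the log itself
# gives the replacement for each, e.g.:
#   tf.app.run                -> tf.compat.v1.app.run
#   tf.logging.set_verbosity  -> tf.compat.v1.logging.set_verbosity
#   tf.train.Server           -> tf.distribute.Server
```

Updating the call sites (rather than suppressing the warnings) is the durable fix, since the 1.x names were removed in TF 2.x.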
2019-04-20 07:30:11.338234: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-04-20 07:30:11.350665: I tensorflow/stream_executor/platform/default/dso_loader.cc:43] Successfully opened dynamic library libcuda.so.1
2019-04-20 07:30:13.961255: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x6fb62e0 executing computations on platform CUDA. Devices:
2019-04-20 07:30:13.961341: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): TITAN Xp, Compute Capability 6.1
2019-04-20 07:30:13.961383: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (1): TITAN Xp, Compute Capability 6.1
2019-04-20 07:30:13.961422: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (2): TITAN Xp, Compute Capability 6.1
2019-04-20 07:30:13.961434: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (3): TITAN Xp, Compute Capability 6.1
2019-04-20 07:30:13.969634: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2199885000 Hz
2019-04-20 07:30:13.979006: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x711be40 executing computations on platform Host. Devices:
2019-04-20 07:30:13.979064: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): ,
2019-04-20 07:30:13.979776: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1595] Found device 0 with properties: name: TITAN Xp major: 6 minor: 1 memoryClockRate(GHz): 1.582 pciBusID: 0000:02:00.0 totalMemory: 11.91GiB freeMemory: 11.75GiB
2019-04-20 07:30:13.980268: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1595] Found device 1 with properties: name: TITAN Xp major: 6 minor: 1 memoryClockRate(GHz): 1.582 pciBusID: 0000:03:00.0 totalMemory: 11.91GiB freeMemory: 11.75GiB
2019-04-20 07:30:13.980740: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1595] Found device 2 with properties: name: TITAN Xp major: 6 minor: 1 memoryClockRate(GHz): 1.582 pciBusID: 0000:82:00.0 totalMemory: 11.91GiB freeMemory: 11.75GiB
2019-04-20 07:30:13.981206: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1595] Found device 3 with properties: name: TITAN Xp major: 6 minor: 1 memoryClockRate(GHz): 1.582 pciBusID: 0000:83:00.0 totalMemory: 11.91GiB freeMemory: 11.75GiB
2019-04-20 07:30:13.984208: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1718] Adding visible gpu devices: 0, 1, 2, 3
2019-04-20 07:30:13.984778: I tensorflow/stream_executor/platform/default/dso_loader.cc:43] Successfully opened dynamic library libcudart.so.10.0
2019-04-20 07:30:13.991259: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1126] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-04-20 07:30:13.991291: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1132]    0 1 2 3
2019-04-20 07:30:13.991309: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1145] 0: N Y N N
2019-04-20 07:30:13.991319: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1145] 1: Y N N N
2019-04-20 07:30:13.991328: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1145] 2: N N N Y
2019-04-20 07:30:13.991337: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1145] 3: N N Y N
2019-04-20 07:30:13.993076: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1266] Created TensorFlow device (/job:local/replica:0/task:0/device:GPU:0 with 11427 MB memory) -> physical GPU (device: 0, name: TITAN Xp, pci bus id: 0000:02:00.0, compute capability: 6.1)
2019-04-20 07:30:13.993564: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1266] Created TensorFlow device (/job:local/replica:0/task:0/device:GPU:1 with 11427 MB memory) -> physical GPU (device: 1, name: TITAN Xp, pci bus id: 0000:03:00.0, compute capability: 6.1)
2019-04-20 07:30:13.993994: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1266] Created TensorFlow device (/job:local/replica:0/task:0/device:GPU:2 with 11427 MB memory) -> physical GPU (device: 2, name: TITAN Xp, pci bus id: 0000:82:00.0, compute capability: 6.1)
2019-04-20 07:30:13.995082: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1266] Created TensorFlow device (/job:local/replica:0/task:0/device:GPU:3 with 11427 MB memory) -> physical GPU (device: 3, name: TITAN Xp, pci bus id: 0000:83:00.0, compute capability: 6.1)
2019-04-20 07:30:13.999280: I tensorflow/core/distributed_runtime/rpc/grpc_channel.cc:250] Initialize GrpcChannelCache for job local -> {0 -> localhost:40087}
2019-04-20 07:30:14.010769: I tensorflow/core/distributed_runtime/rpc/grpc_server_lib.cc:365] Started server with target: grpc://localhost:40087
I0420 07:30:14.019190 140395597309760 trainer.py:1263] Job controller start
I0420 07:30:14.087660 140395597309760 base_runner.py:67] ============================================================
I0420 07:30:14.100358 140395597309760 base_runner.py:69] allow_implicit_capture : NoneType
I0420 07:30:14.100505 140395597309760 base_runner.py:69] cls : type/lingvo.core.base_model/SingleTaskModel
I0420 07:30:14.100614 140395597309760 base_runner.py:69] cluster.add_summary : NoneType
I0420 07:30:14.100713 140395597309760 base_runner.py:69] cluster.cls : type/lingvo.core.cluster/_Cluster
I0420 07:30:14.100819 140395597309760 base_runner.py:69] cluster.controller.cpus_per_replica : 1
I0420 07:30:14.100914 140395597309760 base_runner.py:69] cluster.controller.devices_per_split : 1
I0420 07:30:14.101003 140395597309760 base_runner.py:69] cluster.controller.gpus_per_replica : 0
I0420 07:30:14.101094 140395597309760 base_runner.py:69] cluster.controller.name : '/job:local'
I0420 07:30:14.101186 140395597309760 base_runner.py:69] cluster.controller.num_tpu_hosts : 0
I0420 07:30:14.101277 140395597309760 base_runner.py:69] cluster.controller.replicas : 1
I0420 07:30:14.101367 140395597309760 base_runner.py:69] cluster.controller.tpus_per_replica : 0
I0420 07:30:14.101458 140395597309760 base_runner.py:69] cluster.decoder.cpus_per_replica : 1
I0420 07:30:14.101547 140395597309760 base_runner.py:69] cluster.decoder.devices_per_split : 1
I0420 07:30:14.101638 140395597309760 base_runner.py:69] cluster.decoder.gpus_per_replica : 1
I0420 07:30:14.101732 140395597309760 base_runner.py:69] cluster.decoder.name : '/job:local'
I0420 07:30:14.101828 140395597309760 base_runner.py:69] cluster.decoder.num_tpu_hosts : 0
I0420 07:30:14.101918 140395597309760 base_runner.py:69] cluster.decoder.replicas : 1
I0420 07:30:14.102009 140395597309760 base_runner.py:69] cluster.decoder.tpus_per_replica : 0
I0420 07:30:14.102098 140395597309760 base_runner.py:69] cluster.evaler.cpus_per_replica : 1
I0420 07:30:14.102188 140395597309760 base_runner.py:69] cluster.evaler.devices_per_split : 1
I0420 07:30:14.102277 140395597309760 base_runner.py:69] cluster.evaler.gpus_per_replica : 1
I0420 07:30:14.102365 140395597309760 base_runner.py:69] cluster.evaler.name : '/job:local'
I0420 07:30:14.102453 140395597309760 base_runner.py:69] cluster.evaler.num_tpu_hosts : 0
I0420 07:30:14.102544 140395597309760 base_runner.py:69] cluster.evaler.replicas : 1
I0420 07:30:14.102632 140395597309760 base_runner.py:69] cluster.evaler.tpus_per_replica : 0
I0420 07:30:14.102726 140395597309760 base_runner.py:69] cluster.input.cpus_per_replica : 1
I0420 07:30:14.102821 140395597309760 base_runner.py:69] cluster.input.devices_per_split : 1
I0420 07:30:14.102912 140395597309760 base_runner.py:69] cluster.input.gpus_per_replica : 0
I0420 07:30:14.103003 140395597309760 base_runner.py:69] cluster.input.name : '/job:local'
I0420 07:30:14.103091 140395597309760 base_runner.py:69] cluster.input.num_tpu_hosts : 0
I0420 07:30:14.103180 140395597309760 base_runner.py:69] cluster.input.replicas : 0
I0420 07:30:14.103271 140395597309760 base_runner.py:69] cluster.input.tpus_per_replica : 0
I0420 07:30:14.103358 140395597309760 base_runner.py:69] cluster.job : 'controller'
I0420 07:30:14.103446 140395597309760 base_runner.py:69] cluster.mode : 'async'
I0420 07:30:14.103534 140395597309760 base_runner.py:69] cluster.ps.cpus_per_replica : 1
I0420 07:30:14.103624 140395597309760 base_runner.py:69] cluster.ps.devices_per_split : 1
I0420 07:30:14.103738 140395597309760 base_runner.py:69] cluster.ps.gpus_per_replica : 0
I0420 07:30:14.103827 140395597309760 base_runner.py:69] cluster.ps.name : '/job:local'
I0420 07:30:14.103914 140395597309760 base_runner.py:69] cluster.ps.num_tpu_hosts : 0
I0420 07:30:14.104001 140395597309760 base_runner.py:69] cluster.ps.replicas : 1
I0420 07:30:14.104088 140395597309760 base_runner.py:69] cluster.ps.tpus_per_replica : 0
I0420 07:30:14.104173 140395597309760 base_runner.py:69] cluster.task : 0
I0420 07:30:14.104264 140395597309760 base_runner.py:69] cluster.worker.cpus_per_replica : 1
I0420 07:30:14.104351 140395597309760 base_runner.py:69] cluster.worker.devices_per_split : 1
I0420 07:30:14.104446 140395597309760 base_runner.py:69] cluster.worker.gpus_per_replica : 4
I0420 07:30:14.104536 140395597309760 base_runner.py:69] cluster.worker.name : '/job:local'
I0420 07:30:14.104620 140395597309760 base_runner.py:69] cluster.worker.num_tpu_hosts : 0
I0420 07:30:14.104707 140395597309760 base_runner.py:69] cluster.worker.replicas : 1
I0420 07:30:14.104811 140395597309760 base_runner.py:69] cluster.worker.tpus_per_replica : 0
I0420 07:30:14.104897 140395597309760 base_runner.py:69] dtype : float32
I0420 07:30:14.104984 140395597309760 base_runner.py:69] fprop_dtype : NoneType
I0420 07:30:14.105072 140395597309760 base_runner.py:69] inference_driver_name : NoneType
I0420 07:30:14.105158 140395597309760 base_runner.py:69] input.allow_implicit_capture : NoneType
I0420 07:30:14.105245 140395597309760 base_runner.py:69] input.append_eos_frame : True
I0420 07:30:14.105331 140395597309760 base_runner.py:69] input.bucket_adjust_every_n : 0
I0420 07:30:14.105418 140395597309760 base_runner.py:69] input.bucket_batch_limit : [64, 32, 32, 32, 32, 32, 32, 32]
I0420 07:30:14.105505 140395597309760 base_runner.py:69] input.bucket_upper_bound : [639, 1062, 1275, 1377, 1449, 1506, 1563, 1710]
I0420 07:30:14.105678 140395597309760 base_runner.py:69] input.cls : type/lingvo.tasks.asr.input_generator/AsrInput
I0420 07:30:14.105782 140395597309760 base_runner.py:69] input.dtype : float32
I0420 07:30:14.105878 140395597309760 base_runner.py:69] input.file_buffer_size : 10000
I0420 07:30:14.105963 140395597309760 base_runner.py:69] input.file_parallelism : 16
I0420 07:30:14.106050 140395597309760 base_runner.py:69] input.file_pattern : 'tfrecord:/data/dingzhenyou/speech_data/librispeech/train/train.tfrecords-*'
I0420 07:30:14.106137 140395597309760 base_runner.py:69] input.file_random_seed : 0
I0420 07:30:14.106223 140395597309760 base_runner.py:69] input.flush_every_n : 0
I0420 07:30:14.106308 140395597309760 base_runner.py:69] input.fprop_dtype : NoneType
I0420 07:30:14.106395 140395597309760 base_runner.py:69] input.frame_size : 80
I0420 07:30:14.106481 140395597309760 base_runner.py:69] input.inference_driver_name : NoneType
I0420 07:30:14.106566 140395597309760 base_runner.py:69] input.is_eval : False
I0420 07:30:14.106653 140395597309760 base_runner.py:69] input.is_inference : NoneType
I0420 07:30:14.106745 140395597309760 base_runner.py:69] input.name : 'input'
I0420 07:30:14.106836 140395597309760 base_runner.py:69] input.num_batcher_threads : 1
I0420 07:30:14.106921 140395597309760 base_runner.py:69] input.num_samples : 281241
I0420 07:30:14.107008 140395597309760 base_runner.py:69] input.pad_to_max_seq_length : False
I0420 07:30:14.107094 140395597309760 base_runner.py:69] input.params_init.method : 'xavier'
I0420 07:30:14.107178 140395597309760 base_runner.py:69] input.params_init.scale : 1.000001
I0420 07:30:14.107264 140395597309760 base_runner.py:69] input.params_init.seed : NoneType
I0420 07:30:14.107350 140395597309760 base_runner.py:69] input.random_seed : NoneType
I0420 07:30:14.107436 140395597309760 base_runner.py:69] input.require_sequential_order : False
I0420 07:30:14.107522 140395597309760 base_runner.py:69] input.skip_lp_regularization : NoneType
I0420 07:30:14.107608 140395597309760 base_runner.py:69] input.source_max_length : 3000
I0420 07:30:14.107693 140395597309760 base_runner.py:69] input.target_max_length : 620
I0420 07:30:14.107789 140395597309760 base_runner.py:69] input.tokenizer.allow_implicit_capture : NoneType
I0420 07:30:14.107877 140395597309760 base_runner.py:69] input.tokenizer.append_eos : True
I0420 07:30:14.107960 140395597309760 base_runner.py:69] input.tokenizer.cls : type/lingvo.core.tokenizers/AsciiTokenizer
I0420 07:30:14.108072 140395597309760 base_runner.py:69] input.tokenizer.dtype : float32
I0420 07:30:14.108164 140395597309760 base_runner.py:69] input.tokenizer.fprop_dtype : NoneType
I0420 07:30:14.108249 140395597309760 base_runner.py:69] input.tokenizer.inference_driver_name : NoneType
I0420 07:30:14.108335 140395597309760 base_runner.py:69] input.tokenizer.is_eval : NoneType
I0420 07:30:14.108431 140395597309760 base_runner.py:69] input.tokenizer.is_inference : NoneType
I0420 07:30:14.108519 140395597309760 base_runner.py:69] input.tokenizer.name : 'tokenizer'
I0420 07:30:14.108606 140395597309760 base_runner.py:69] input.tokenizer.pad_to_max_length : True
I0420 07:30:14.108691 140395597309760 base_runner.py:69] input.tokenizer.params_init.method : 'xavier'
I0420 07:30:14.108793 140395597309760 base_runner.py:69] input.tokenizer.params_init.scale : 1.000001
I0420 07:30:14.108880 140395597309760 base_runner.py:69] input.tokenizer.params_init.seed : NoneType
I0420 07:30:14.108967 140395597309760 base_runner.py:69] input.tokenizer.random_seed : NoneType
I0420 07:30:14.109051 140395597309760 base_runner.py:69] input.tokenizer.skip_lp_regularization : NoneType
I0420 07:30:14.109138 140395597309760 base_runner.py:69] input.tokenizer.target_eos_id : 2
I0420 07:30:14.109224 140395597309760 base_runner.py:69] input.tokenizer.target_sos_id : 1
I0420 07:30:14.109308 140395597309760 base_runner.py:69] input.tokenizer.target_unk_id : 0
I0420 07:30:14.109395 140395597309760 base_runner.py:69] input.tokenizer.vn.global_vn : False
I0420 07:30:14.109479 140395597309760 base_runner.py:69] input.tokenizer.vn.per_step_vn : False
I0420 07:30:14.109565 140395597309760 base_runner.py:69] input.tokenizer.vn.scale : NoneType
I0420 07:30:14.109652 140395597309760 base_runner.py:69] input.tokenizer.vn.seed : NoneType
I0420 07:30:14.109747 140395597309760 base_runner.py:69] input.tokenizer.vocab_size : 76
I0420 07:30:14.109838 140395597309760 base_runner.py:69] input.tokenizer_dict : {}
I0420 07:30:14.109922 140395597309760 base_runner.py:69] input.tpu_infeed_parallism : 1
I0420 07:30:14.110008 140395597309760 base_runner.py:69] input.use_per_host_infeed : False
I0420 07:30:14.110093 140395597309760 base_runner.py:69] input.use_within_batch_mixing : False
I0420 07:30:14.110177 140395597309760 base_runner.py:69] input.vn.global_vn : False
I0420 07:30:14.110264 140395597309760 base_runner.py:69] input.vn.per_step_vn : False
I0420 07:30:14.110349 140395597309760 base_runner.py:69] input.vn.scale : NoneType
I0420 07:30:14.110435 140395597309760 base_runner.py:69] input.vn.seed : NoneType
I0420 07:30:14.110521 140395597309760 base_runner.py:69] is_eval : NoneType
I0420 07:30:14.110605 140395597309760 base_runner.py:69] is_inference : NoneType
I0420 07:30:14.110690 140395597309760 base_runner.py:69] model : 'asr.librispeech.Librispeech960Grapheme@/data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/tasks/asr/params/librispeech.py:181'
I0420 07:30:14.110785 140395597309760 base_runner.py:69] name : ''
I0420 07:30:14.110872 140395597309760 base_runner.py:69] params_init.method : 'xavier'
I0420 07:30:14.110959 140395597309760 base_runner.py:69] params_init.scale : 1.000001
I0420 07:30:14.111044 140395597309760 base_runner.py:69] params_init.seed : NoneType
I0420 07:30:14.111129 140395597309760 base_runner.py:69] random_seed : NoneType
I0420 07:30:14.111215 140395597309760 base_runner.py:69] skip_lp_regularization : NoneType
I0420 07:30:14.111299 140395597309760 base_runner.py:69] task.allow_implicit_capture : NoneType
I0420 07:30:14.111383 140395597309760 base_runner.py:69] task.cls : type/lingvo.tasks.asr.model/AsrModel
I0420 07:30:14.111469 140395597309760 base_runner.py:69] task.decoder.allow_implicit_capture : NoneType
I0420 07:30:14.111555 140395597309760 base_runner.py:69] task.decoder.atten_context_dim : 0
I0420 07:30:14.111640 140395597309760 base_runner.py:69] task.decoder.attention.allow_implicit_capture : NoneType
I0420 07:30:14.111735 140395597309760 base_runner.py:69] task.decoder.attention.atten_dropout_deterministic : False
I0420 07:30:14.111824 140395597309760 base_runner.py:69] task.decoder.attention.atten_dropout_prob : 0.0
I0420 07:30:14.111911 140395597309760 base_runner.py:69] task.decoder.attention.cls : type/lingvo.core.attention/AdditiveAttention
I0420 07:30:14.111996 140395597309760 base_runner.py:69] task.decoder.attention.dtype : float32
I0420 07:30:14.112081 140395597309760 base_runner.py:69] task.decoder.attention.fprop_dtype : NoneType
I0420 07:30:14.112174 140395597309760 base_runner.py:69] task.decoder.attention.hidden_dim : 128
I0420 07:30:14.112262 140395597309760 base_runner.py:69] task.decoder.attention.inference_driver_name : NoneType
I0420 07:30:14.112346 140395597309760 base_runner.py:69] task.decoder.attention.is_eval : NoneType
I0420 07:30:14.112432 140395597309760 base_runner.py:69] task.decoder.attention.is_inference : NoneType
I0420 07:30:14.112517 140395597309760 base_runner.py:69] task.decoder.attention.name : ''
I0420 07:30:14.112601 140395597309760 base_runner.py:69] task.decoder.attention.packed_input : False
I0420 07:30:14.112687 140395597309760 base_runner.py:69] task.decoder.attention.params_init.method : 'uniform_sqrt_dim'
I0420 07:30:14.112781 140395597309760 base_runner.py:69] task.decoder.attention.params_init.scale : 1.73205080757
I0420 07:30:14.112869 140395597309760 base_runner.py:69] task.decoder.attention.params_init.seed : NoneType
I0420 07:30:14.112955 140395597309760 base_runner.py:69] task.decoder.attention.qdomain.default : NoneType
I0420 07:30:14.113039 140395597309760 base_runner.py:69] task.decoder.attention.qdomain.fullyconnected : NoneType
I0420 07:30:14.113125 140395597309760 base_runner.py:69] task.decoder.attention.qdomain.softmax : NoneType
I0420 07:30:14.113209 140395597309760 base_runner.py:69] task.decoder.attention.query_dim : 0
I0420 07:30:14.113293 140395597309760 base_runner.py:69] task.decoder.attention.random_seed : NoneType
I0420 07:30:14.113379 140395597309760 base_runner.py:69] task.decoder.attention.same_batch_size : False
I0420 07:30:14.113464 140395597309760 base_runner.py:69] task.decoder.attention.skip_lp_regularization : NoneType
I0420 07:30:14.113549 140395597309760 base_runner.py:69] task.decoder.attention.source_dim : 0
I0420 07:30:14.113651 140395597309760 base_runner.py:69] task.decoder.attention.vn.global_vn : False
I0420 07:30:14.113740 140395597309760 base_runner.py:69] task.decoder.attention.vn.per_step_vn : False
I0420 07:30:14.113827 140395597309760 base_runner.py:69] task.decoder.attention.vn.scale : NoneType
I0420 07:30:14.113909 140395597309760 base_runner.py:69] task.decoder.attention.vn.seed : NoneType
I0420 07:30:14.113992 140395597309760 base_runner.py:69] task.decoder.attention_plot_font_properties : FontProperties
I0420 07:30:14.114078 140395597309760 base_runner.py:69] task.decoder.beam_search.allow_empty_terminated_hyp : True
I0420 07:30:14.114161 140395597309760 base_runner.py:69] task.decoder.beam_search.allow_implicit_capture : NoneType
I0420 07:30:14.114243 140395597309760 base_runner.py:69] task.decoder.beam_search.batch_major_state : True
I0420 07:30:14.114327 140395597309760 base_runner.py:69] task.decoder.beam_search.beam_size : 3.0
I0420 07:30:14.114411 140395597309760 base_runner.py:69] task.decoder.beam_search.cls : type/lingvo.core.beam_search_helper/BeamSearchHelper
I0420 07:30:14.114494 140395597309760 base_runner.py:69] task.decoder.beam_search.coverage_penalty : 0.0
I0420 07:30:14.114577 140395597309760 base_runner.py:69] task.decoder.beam_search.dtype : float32
I0420 07:30:14.114660 140395597309760 base_runner.py:69] task.decoder.beam_search.ensure_full_beam : False
I0420 07:30:14.114751 140395597309760 base_runner.py:69] task.decoder.beam_search.force_eos_in_last_step : False
I0420 07:30:14.114836 140395597309760 base_runner.py:69] task.decoder.beam_search.fprop_dtype : NoneType
I0420 07:30:14.114919 140395597309760 base_runner.py:69] task.decoder.beam_search.inference_driver_name : NoneType
I0420 07:30:14.115003 140395597309760 base_runner.py:69] task.decoder.beam_search.is_eval : NoneType
I0420 07:30:14.115087 140395597309760 base_runner.py:69] task.decoder.beam_search.is_inference : NoneType
I0420 07:30:14.115169 140395597309760 base_runner.py:69] task.decoder.beam_search.length_normalization : 0.0
I0420 07:30:14.115252 140395597309760 base_runner.py:69] task.decoder.beam_search.merge_paths : False
I0420 07:30:14.115334 140395597309760 base_runner.py:69] task.decoder.beam_search.name : 'beam_search'
I0420 07:30:14.115423 140395597309760 base_runner.py:69] task.decoder.beam_search.num_hyps_per_beam : 8
I0420 07:30:14.115508 140395597309760 base_runner.py:69] task.decoder.beam_search.params_init.method : 'xavier'
I0420 07:30:14.115592 140395597309760 base_runner.py:69] task.decoder.beam_search.params_init.scale : 1.000001
I0420 07:30:14.115674 140395597309760 base_runner.py:69] task.decoder.beam_search.params_init.seed : NoneType
I0420 07:30:14.115767 140395597309760 base_runner.py:69] task.decoder.beam_search.random_seed : NoneType
I0420 07:30:14.115852 140395597309760 base_runner.py:69] task.decoder.beam_search.skip_lp_regularization : NoneType
I0420 07:30:14.115935 140395597309760 base_runner.py:69] task.decoder.beam_search.target_eoc_id : -1
I0420 07:30:14.116018 140395597309760 base_runner.py:69] task.decoder.beam_search.target_eos_id : 2
I0420 07:30:14.116100 140395597309760 base_runner.py:69] task.decoder.beam_search.target_seq_len : 0
I0420 07:30:14.116183 140395597309760 base_runner.py:69] task.decoder.beam_search.target_seq_length_ratio : 1.0
I0420 07:30:14.116266 140395597309760 base_runner.py:69] task.decoder.beam_search.target_sos_id : 1
I0420 07:30:14.116348 140395597309760 base_runner.py:69] task.decoder.beam_search.valid_eos_max_logit_delta : 5.0
I0420 07:30:14.116430 140395597309760 base_runner.py:69] task.decoder.beam_search.vn.global_vn : False
I0420 07:30:14.116513 140395597309760 base_runner.py:69] task.decoder.beam_search.vn.per_step_vn : False
I0420 07:30:14.116596 140395597309760 base_runner.py:69] task.decoder.beam_search.vn.scale : NoneType
I0420 07:30:14.116678 140395597309760 base_runner.py:69] task.decoder.beam_search.vn.seed : NoneType
I0420 07:30:14.116769 140395597309760 base_runner.py:69] task.decoder.cls : type/lingvo.tasks.asr.decoder/AsrDecoder
I0420 07:30:14.116852 140395597309760 base_runner.py:69] task.decoder.contextualizer.allow_implicit_capture : NoneType
I0420 07:30:14.116936 140395597309760 base_runner.py:69] task.decoder.contextualizer.cls : type/lingvo.tasks.asr.contextualizer_base/NullContextualizer
I0420 07:30:14.117021 140395597309760 base_runner.py:69] task.decoder.contextualizer.dtype : float32
I0420 07:30:14.117105 140395597309760 base_runner.py:69] task.decoder.contextualizer.fprop_dtype : NoneType
I0420 07:30:14.117188 140395597309760 base_runner.py:69] task.decoder.contextualizer.inference_driver_name : NoneType
I0420 07:30:14.117273 140395597309760 base_runner.py:69] task.decoder.contextualizer.is_eval : NoneType
I0420 07:30:14.117357 140395597309760 base_runner.py:69] task.decoder.contextualizer.is_inference : NoneType
I0420 07:30:14.117440 140395597309760 base_runner.py:69] task.decoder.contextualizer.name : ''
I0420 07:30:14.117525 140395597309760 base_runner.py:69] task.decoder.contextualizer.params_init.method : 'xavier'
I0420 07:30:14.117609 140395597309760 base_runner.py:69] task.decoder.contextualizer.params_init.scale : 1.000001
I0420 07:30:14.117692 140395597309760 base_runner.py:69] task.decoder.contextualizer.params_init.seed : NoneType
I0420 07:30:14.117789 140395597309760 base_runner.py:69] task.decoder.contextualizer.random_seed : NoneType
I0420 07:30:14.117873 140395597309760 base_runner.py:69] task.decoder.contextualizer.skip_lp_regularization : NoneType
I0420 07:30:14.117958 140395597309760 base_runner.py:69] task.decoder.contextualizer.vn.global_vn : False
I0420 07:30:14.118041 140395597309760 base_runner.py:69] task.decoder.contextualizer.vn.per_step_vn : False
I0420 07:30:14.118124 140395597309760 base_runner.py:69] task.decoder.contextualizer.vn.scale : NoneType
I0420 07:30:14.118207 140395597309760 base_runner.py:69] task.decoder.contextualizer.vn.seed : NoneType
I0420 07:30:14.118290 140395597309760 base_runner.py:69] task.decoder.dropout_prob : 0.0
I0420 07:30:14.118374 140395597309760 base_runner.py:69] task.decoder.dtype : float32
I0420 07:30:14.118457 140395597309760 base_runner.py:69] task.decoder.emb.allow_implicit_capture : NoneType
I0420 07:30:14.118540 140395597309760 base_runner.py:69] task.decoder.emb.cls : type/lingvo.core.layers/EmbeddingLayer
I0420 07:30:14.118630 140395597309760 base_runner.py:69] task.decoder.emb.dtype : float32
I0420 07:30:14.118716 140395597309760 base_runner.py:69] task.decoder.emb.embedding_dim : 0
I0420 07:30:14.118808 140395597309760 base_runner.py:69] task.decoder.emb.fprop_dtype : NoneType
I0420 07:30:14.118892 140395597309760 base_runner.py:69] task.decoder.emb.inference_driver_name : NoneType
I0420 07:30:14.118976 140395597309760 base_runner.py:69] task.decoder.emb.is_eval : NoneType
I0420 07:30:14.119060 140395597309760 base_runner.py:69] task.decoder.emb.is_inference : NoneType
I0420 07:30:14.119143 140395597309760 base_runner.py:69] task.decoder.emb.max_num_shards : 1
I0420 07:30:14.119225 140395597309760 base_runner.py:69] task.decoder.emb.name : ''
I0420 07:30:14.119308 140395597309760 base_runner.py:69] task.decoder.emb.on_ps : True
I0420 07:30:14.119393 140395597309760 base_runner.py:69] task.decoder.emb.params_init.method : 'uniform'
I0420 07:30:14.119476 140395597309760 base_runner.py:69] task.decoder.emb.params_init.scale : 1.0
I0420 07:30:14.119559 140395597309760 base_runner.py:69] task.decoder.emb.params_init.seed : NoneType
I0420 07:30:14.119642 140395597309760 base_runner.py:69] task.decoder.emb.random_seed : NoneType
I0420 07:30:14.119729 140395597309760 base_runner.py:69] task.decoder.emb.scale_sqrt_depth : False
I0420 07:30:14.119817 140395597309760 base_runner.py:69] task.decoder.emb.skip_lp_regularization : NoneType
I0420 07:30:14.119899 140395597309760 base_runner.py:69] task.decoder.emb.vn.global_vn : False
I0420 07:30:14.119982 140395597309760 base_runner.py:69] task.decoder.emb.vn.per_step_vn : False
I0420 07:30:14.120064 140395597309760 base_runner.py:69] task.decoder.emb.vn.scale : NoneType
I0420 07:30:14.120146 140395597309760 base_runner.py:69] task.decoder.emb.vn.seed : NoneType
I0420 07:30:14.120229 140395597309760 base_runner.py:69] task.decoder.emb.vocab_size : 76
I0420 07:30:14.120312 140395597309760 base_runner.py:69] task.decoder.emb_dim : 76
I0420 07:30:14.120395 140395597309760 base_runner.py:69] task.decoder.fprop_dtype : NoneType
I0420 07:30:14.120480 140395597309760 base_runner.py:69] task.decoder.fusion.allow_implicit_capture : NoneType
I0420 07:30:14.120563 140395597309760 base_runner.py:69] task.decoder.fusion.base_model_logits_dim : NoneType
I0420 07:30:14.120646 140395597309760 base_runner.py:69] task.decoder.fusion.cls : type/lingvo.tasks.asr.fusion/NullFusion
I0420 07:30:14.120735 140395597309760 base_runner.py:69] task.decoder.fusion.dtype : float32
I0420 07:30:14.120822 140395597309760 base_runner.py:69] task.decoder.fusion.fprop_dtype : NoneType
I0420 07:30:14.120906 140395597309760 base_runner.py:69] task.decoder.fusion.inference_driver_name : NoneType
I0420 07:30:14.120989 140395597309760 base_runner.py:69] task.decoder.fusion.is_eval : NoneType
I0420 07:30:14.121071 140395597309760 base_runner.py:69] task.decoder.fusion.is_inference : NoneType
I0420 07:30:14.121154 140395597309760 base_runner.py:69] task.decoder.fusion.lm.allow_implicit_capture : NoneType
I0420 07:30:14.121237 140395597309760 base_runner.py:69] task.decoder.fusion.lm.cls : type/lingvo.tasks.lm.layers/NullLm
I0420 07:30:14.121320 140395597309760 base_runner.py:69] task.decoder.fusion.lm.dtype : float32
I0420 07:30:14.121403 140395597309760 base_runner.py:69] task.decoder.fusion.lm.fprop_dtype : NoneType
I0420 07:30:14.121486 140395597309760 base_runner.py:69] task.decoder.fusion.lm.inference_driver_name : NoneType
I0420 07:30:14.121571 140395597309760 base_runner.py:69] task.decoder.fusion.lm.is_eval : NoneType
I0420 07:30:14.121651 140395597309760 base_runner.py:69] task.decoder.fusion.lm.is_inference : NoneType
I0420 07:30:14.121738 140395597309760 base_runner.py:69] task.decoder.fusion.lm.name : ''
I0420 07:30:14.121825 140395597309760 base_runner.py:69] task.decoder.fusion.lm.params_init.method : 'xavier'
I0420 07:30:14.121907 140395597309760 base_runner.py:69] task.decoder.fusion.lm.params_init.scale : 1.000001
I0420 07:30:14.121992 140395597309760 base_runner.py:69] task.decoder.fusion.lm.params_init.seed : NoneType
I0420 07:30:14.122082 140395597309760 base_runner.py:69] task.decoder.fusion.lm.random_seed : NoneType
I0420 07:30:14.122168 140395597309760 base_runner.py:69] task.decoder.fusion.lm.skip_lp_regularization : NoneType
I0420 07:30:14.122251 140395597309760 base_runner.py:69] task.decoder.fusion.lm.vn.global_vn : False
I0420 07:30:14.122334 140395597309760 base_runner.py:69] task.decoder.fusion.lm.vn.per_step_vn : False
I0420 07:30:14.122416 140395597309760 base_runner.py:69] task.decoder.fusion.lm.vn.scale : NoneType
I0420 07:30:14.122499 140395597309760 base_runner.py:69] task.decoder.fusion.lm.vn.seed : NoneType
I0420 07:30:14.122582 140395597309760 base_runner.py:69] task.decoder.fusion.lm.vocab_size : 96
I0420 07:30:14.122665 140395597309760 base_runner.py:69] task.decoder.fusion.name : ''
I0420 07:30:14.122755 140395597309760 base_runner.py:69] task.decoder.fusion.params_init.method : 'xavier'
I0420 07:30:14.122842 140395597309760 base_runner.py:69] task.decoder.fusion.params_init.scale : 1.000001
I0420 07:30:14.122925 140395597309760 base_runner.py:69] task.decoder.fusion.params_init.seed : NoneType
I0420 07:30:14.123008 140395597309760 base_runner.py:69] task.decoder.fusion.random_seed : NoneType
I0420 07:30:14.123091 140395597309760 base_runner.py:69] task.decoder.fusion.skip_lp_regularization : NoneType
I0420 07:30:14.123173 140395597309760 base_runner.py:69] task.decoder.fusion.vn.global_vn : False
I0420 07:30:14.123256 140395597309760 base_runner.py:69] task.decoder.fusion.vn.per_step_vn : False
I0420 07:30:14.123338 140395597309760 base_runner.py:69] task.decoder.fusion.vn.scale : NoneType
I0420 07:30:14.123420 140395597309760 base_runner.py:69] task.decoder.fusion.vn.seed : NoneType
I0420 07:30:14.123503 140395597309760 base_runner.py:69] task.decoder.inference_driver_name : NoneType
I0420 07:30:14.123585 140395597309760 base_runner.py:69] task.decoder.is_eval : NoneType
I0420 07:30:14.123667 140395597309760 base_runner.py:69] task.decoder.is_inference : NoneType
I0420 07:30:14.123760 140395597309760 base_runner.py:69] task.decoder.label_smoothing : NoneType
I0420 07:30:14.123846 140395597309760 base_runner.py:69] task.decoder.logit_types : {'logits': 1.0}
I0420 07:30:14.123929 140395597309760 base_runner.py:69] task.decoder.min_ground_truth_prob : 1.0
I0420 07:30:14.124013 140395597309760 base_runner.py:69] task.decoder.min_prob_step : 1000000.0
I0420 07:30:14.124128 140395597309760 base_runner.py:69] task.decoder.name : ''
I0420 07:30:14.124211 140395597309760 base_runner.py:69] task.decoder.packed_input : False
I0420 07:30:14.124291 140395597309760 base_runner.py:69] task.decoder.parallel_iterations : 30
I0420 07:30:14.124373 140395597309760 base_runner.py:69] task.decoder.params_init.method : 'xavier'
I0420 07:30:14.124455 140395597309760 base_runner.py:69] task.decoder.params_init.scale : 1.000001
I0420 07:30:14.124536 140395597309760 base_runner.py:69] task.decoder.params_init.seed : NoneType
I0420 07:30:14.124619 140395597309760 base_runner.py:69] task.decoder.per_token_avg_loss : True
I0420 07:30:14.124700 140395597309760 base_runner.py:69] task.decoder.prob_decay_start_step : 10000.0
I0420 07:30:14.124792 140395597309760 base_runner.py:69] task.decoder.random_seed : NoneType
I0420 07:30:14.124874 140395597309760 base_runner.py:69] task.decoder.residual_start : 0
I0420 07:30:14.124955 140395597309760 base_runner.py:69] task.decoder.rnn_cell_dim : 1024
I0420 07:30:14.125037 140395597309760 base_runner.py:69] task.decoder.rnn_cell_hidden_dim : 0
I0420 07:30:14.125118 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.allow_implicit_capture : NoneType
I0420 07:30:14.125201 140395597309760
base_runner.py:69] task.decoder.rnn_cell_tpl.apply_pruning : False I0420 07:30:14.125283 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.bias_init.method : 'constant' I0420 07:30:14.125365 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.bias_init.scale : 0.0 I0420 07:30:14.125447 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.bias_init.seed : 0 I0420 07:30:14.125535 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.cell_value_cap : 10.0 I0420 07:30:14.125619 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.cls : type/lingvo.core.rnn_cell/LSTMCellSimple I0420 07:30:14.125703 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.couple_input_forget_gates : False I0420 07:30:14.125794 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.dtype : float32 I0420 07:30:14.125876 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.enable_lstm_bias : True I0420 07:30:14.125958 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.forget_gate_bias : 0.0 I0420 07:30:14.126040 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.fprop_dtype : NoneType I0420 07:30:14.126121 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.inference_driver_name : NoneType I0420 07:30:14.126204 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.inputs_arity : 1 I0420 07:30:14.126285 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.is_eval : NoneType I0420 07:30:14.126365 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.is_inference : NoneType I0420 07:30:14.126446 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.name : '' I0420 07:30:14.126527 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.num_hidden_nodes : 0 I0420 07:30:14.126609 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.num_input_nodes : 0 I0420 07:30:14.126689 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.num_output_nodes : 0 I0420 
07:30:14.126785 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.output_nonlinearity : True I0420 07:30:14.126868 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.params_init.method : 'uniform' I0420 07:30:14.126950 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.params_init.scale : 0.1 I0420 07:30:14.127031 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.params_init.seed : NoneType I0420 07:30:14.127113 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.qdomain.c_state : NoneType I0420 07:30:14.127194 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.qdomain.default : NoneType I0420 07:30:14.127275 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.qdomain.fullyconnected : NoneType I0420 07:30:14.127357 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.qdomain.m_state : NoneType I0420 07:30:14.127439 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.qdomain.weight : NoneType I0420 07:30:14.127520 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.random_seed : NoneType I0420 07:30:14.127602 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.reset_cell_state : False I0420 07:30:14.127684 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.skip_lp_regularization : NoneType I0420 07:30:14.127774 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.vn.global_vn : False I0420 07:30:14.127856 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.vn.per_step_vn : False I0420 07:30:14.127938 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.vn.scale : NoneType I0420 07:30:14.128021 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.vn.seed : NoneType I0420 07:30:14.128102 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.zero_state_init_params.method : 'zeros' I0420 07:30:14.128184 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.zero_state_init_params.seed : NoneType I0420 
07:30:14.128264 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.zo_prob : 0.0 I0420 07:30:14.128346 140395597309760 base_runner.py:69] task.decoder.rnn_layers : 2 I0420 07:30:14.128427 140395597309760 base_runner.py:69] task.decoder.skip_lp_regularization : NoneType I0420 07:30:14.128509 140395597309760 base_runner.py:69] task.decoder.softmax.allow_implicit_capture : NoneType I0420 07:30:14.128592 140395597309760 base_runner.py:69] task.decoder.softmax.apply_pruning : False I0420 07:30:14.128683 140395597309760 base_runner.py:69] task.decoder.softmax.chunk_size : 0 I0420 07:30:14.128774 140395597309760 base_runner.py:69] task.decoder.softmax.cls : type/lingvo.core.layers/SimpleFullSoftmax I0420 07:30:14.128859 140395597309760 base_runner.py:69] task.decoder.softmax.dtype : float32 I0420 07:30:14.128942 140395597309760 base_runner.py:69] task.decoder.softmax.fprop_dtype : NoneType I0420 07:30:14.129024 140395597309760 base_runner.py:69] task.decoder.softmax.inference_driver_name : NoneType I0420 07:30:14.129106 140395597309760 base_runner.py:69] task.decoder.softmax.input_dim : 0 I0420 07:30:14.129188 140395597309760 base_runner.py:69] task.decoder.softmax.is_eval : NoneType I0420 07:30:14.129271 140395597309760 base_runner.py:69] task.decoder.softmax.is_inference : NoneType I0420 07:30:14.129352 140395597309760 base_runner.py:69] task.decoder.softmax.logits_abs_max : NoneType I0420 07:30:14.129432 140395597309760 base_runner.py:69] task.decoder.softmax.name : '' I0420 07:30:14.129514 140395597309760 base_runner.py:69] task.decoder.softmax.num_classes : 76 I0420 07:30:14.129596 140395597309760 base_runner.py:69] task.decoder.softmax.num_sampled : 0 I0420 07:30:14.129678 140395597309760 base_runner.py:69] task.decoder.softmax.num_shards : 1 I0420 07:30:14.129767 140395597309760 base_runner.py:69] task.decoder.softmax.params_init.method : 'uniform' I0420 07:30:14.129851 140395597309760 base_runner.py:69] task.decoder.softmax.params_init.scale : 0.1 I0420 
07:30:14.129931 140395597309760 base_runner.py:69] task.decoder.softmax.params_init.seed : NoneType I0420 07:30:14.130013 140395597309760 base_runner.py:69] task.decoder.softmax.qdomain.default : NoneType I0420 07:30:14.130095 140395597309760 base_runner.py:69] task.decoder.softmax.random_seed : NoneType I0420 07:30:14.130176 140395597309760 base_runner.py:69] task.decoder.softmax.skip_lp_regularization : NoneType I0420 07:30:14.130256 140395597309760 base_runner.py:69] task.decoder.softmax.vn.global_vn : False I0420 07:30:14.130337 140395597309760 base_runner.py:69] task.decoder.softmax.vn.per_step_vn : False I0420 07:30:14.130419 140395597309760 base_runner.py:69] task.decoder.softmax.vn.scale : NoneType I0420 07:30:14.130500 140395597309760 base_runner.py:69] task.decoder.softmax.vn.seed : NoneType I0420 07:30:14.130580 140395597309760 base_runner.py:69] task.decoder.softmax_uses_attention : True I0420 07:30:14.130661 140395597309760 base_runner.py:69] task.decoder.source_dim : 2048 I0420 07:30:14.130753 140395597309760 base_runner.py:69] task.decoder.target_eos_id : 2 I0420 07:30:14.130837 140395597309760 base_runner.py:69] task.decoder.target_seq_len : 620 I0420 07:30:14.130918 140395597309760 base_runner.py:69] task.decoder.target_sequence_sampler.allow_implicit_capture : NoneType I0420 07:30:14.131000 140395597309760 base_runner.py:69] task.decoder.target_sequence_sampler.cls : type/lingvo.core.target_sequence_sampler/TargetSequenceSampler I0420 07:30:14.131082 140395597309760 base_runner.py:69] task.decoder.target_sequence_sampler.dtype : float32 I0420 07:30:14.131164 140395597309760 base_runner.py:69] task.decoder.target_sequence_sampler.fprop_dtype : NoneType I0420 07:30:14.131244 140395597309760 base_runner.py:69] task.decoder.target_sequence_sampler.inference_driver_name : NoneType I0420 07:30:14.131326 140395597309760 base_runner.py:69] task.decoder.target_sequence_sampler.is_eval : NoneType I0420 07:30:14.131407 140395597309760 base_runner.py:69] 
task.decoder.target_sequence_sampler.is_inference : NoneType I0420 07:30:14.131489 140395597309760 base_runner.py:69] task.decoder.target_sequence_sampler.name : 'target_sequence_sampler' I0420 07:30:14.131570 140395597309760 base_runner.py:69] task.decoder.target_sequence_sampler.params_init.method : 'xavier' I0420 07:30:14.131652 140395597309760 base_runner.py:69] task.decoder.target_sequence_sampler.params_init.scale : 1.000001 I0420 07:30:14.131736 140395597309760 base_runner.py:69] task.decoder.target_sequence_sampler.params_init.seed : NoneType I0420 07:30:14.131822 140395597309760 base_runner.py:69] task.decoder.target_sequence_sampler.random_seed : NoneType I0420 07:30:14.131910 140395597309760 base_runner.py:69] task.decoder.target_sequence_sampler.skip_lp_regularization : NoneType I0420 07:30:14.131994 140395597309760 base_runner.py:69] task.decoder.target_sequence_sampler.target_eoc_id : -1 I0420 07:30:14.132076 140395597309760 base_runner.py:69] task.decoder.target_sequence_sampler.target_eos_id : 2 I0420 07:30:14.132157 140395597309760 base_runner.py:69] task.decoder.target_sequence_sampler.target_seq_len : 0 I0420 07:30:14.132236 140395597309760 base_runner.py:69] task.decoder.target_sequence_sampler.target_sos_id : 1 I0420 07:30:14.132318 140395597309760 base_runner.py:69] task.decoder.target_sequence_sampler.temperature : 1.0 I0420 07:30:14.132397 140395597309760 base_runner.py:69] task.decoder.target_sequence_sampler.vn.global_vn : False I0420 07:30:14.132478 140395597309760 base_runner.py:69] task.decoder.target_sequence_sampler.vn.per_step_vn : False I0420 07:30:14.132560 140395597309760 base_runner.py:69] task.decoder.target_sequence_sampler.vn.scale : NoneType I0420 07:30:14.132641 140395597309760 base_runner.py:69] task.decoder.target_sequence_sampler.vn.seed : NoneType I0420 07:30:14.132725 140395597309760 base_runner.py:69] task.decoder.target_sos_id : 1 I0420 07:30:14.132812 140395597309760 base_runner.py:69] 
task.decoder.use_unnormalized_logits_as_log_probs : True I0420 07:30:14.132894 140395597309760 base_runner.py:69] task.decoder.use_while_loop_based_unrolling : False I0420 07:30:14.132977 140395597309760 base_runner.py:69] task.decoder.vn.global_vn : False I0420 07:30:14.133059 140395597309760 base_runner.py:69] task.decoder.vn.per_step_vn : False I0420 07:30:14.133138 140395597309760 base_runner.py:69] task.decoder.vn.scale : NoneType I0420 07:30:14.133220 140395597309760 base_runner.py:69] task.decoder.vn.seed : NoneType I0420 07:30:14.133301 140395597309760 base_runner.py:69] task.dtype : float32 I0420 07:30:14.133383 140395597309760 base_runner.py:69] task.encoder.after_conv_lstm_cnn_tpl.activation : 'RELU' I0420 07:30:14.133462 140395597309760 base_runner.py:69] task.encoder.after_conv_lstm_cnn_tpl.allow_implicit_capture : NoneType I0420 07:30:14.133543 140395597309760 base_runner.py:69] task.encoder.after_conv_lstm_cnn_tpl.batch_norm : True I0420 07:30:14.133625 140395597309760 base_runner.py:69] task.encoder.after_conv_lstm_cnn_tpl.bias : False I0420 07:30:14.133733 140395597309760 base_runner.py:69] task.encoder.after_conv_lstm_cnn_tpl.bn_decay : 0.999 I0420 07:30:14.133817 140395597309760 base_runner.py:69] task.encoder.after_conv_lstm_cnn_tpl.bn_fold_weights : NoneType I0420 07:30:14.133898 140395597309760 base_runner.py:69] task.encoder.after_conv_lstm_cnn_tpl.causal_convolution : False I0420 07:30:14.133977 140395597309760 base_runner.py:69] task.encoder.after_conv_lstm_cnn_tpl.cls : type/lingvo.core.layers/Conv2DLayer I0420 07:30:14.134061 140395597309760 base_runner.py:69] task.encoder.after_conv_lstm_cnn_tpl.conv_last : False I0420 07:30:14.134140 140395597309760 base_runner.py:69] task.encoder.after_conv_lstm_cnn_tpl.dilation_rate : (1, 1) I0420 07:30:14.134221 140395597309760 base_runner.py:69] task.encoder.after_conv_lstm_cnn_tpl.disable_activation_quantization : False I0420 07:30:14.134300 140395597309760 base_runner.py:69] 
task.encoder.after_conv_lstm_cnn_tpl.dtype : float32 I0420 07:30:14.134382 140395597309760 base_runner.py:69] task.encoder.after_conv_lstm_cnn_tpl.filter_shape : [3, 3, 'NoneType', 'NoneType'] I0420 07:30:14.134462 140395597309760 base_runner.py:69] task.encoder.after_conv_lstm_cnn_tpl.filter_stride : [1, 1] I0420 07:30:14.134541 140395597309760 base_runner.py:69] task.encoder.after_conv_lstm_cnn_tpl.fprop_dtype : NoneType I0420 07:30:14.134619 140395597309760 base_runner.py:69] task.encoder.after_conv_lstm_cnn_tpl.inference_driver_name : NoneType I0420 07:30:14.134700 140395597309760 base_runner.py:69] task.encoder.after_conv_lstm_cnn_tpl.is_eval : NoneType I0420 07:30:14.134788 140395597309760 base_runner.py:69] task.encoder.after_conv_lstm_cnn_tpl.is_inference : NoneType I0420 07:30:14.134876 140395597309760 base_runner.py:69] task.encoder.after_conv_lstm_cnn_tpl.name : '' I0420 07:30:14.134958 140395597309760 base_runner.py:69] task.encoder.after_conv_lstm_cnn_tpl.params_init.method : 'truncated_gaussian' I0420 07:30:14.135037 140395597309760 base_runner.py:69] task.encoder.after_conv_lstm_cnn_tpl.params_init.scale : 0.1 I0420 07:30:14.135118 140395597309760 base_runner.py:69] task.encoder.after_conv_lstm_cnn_tpl.params_init.seed : NoneType I0420 07:30:14.135199 140395597309760 base_runner.py:69] task.encoder.after_conv_lstm_cnn_tpl.qdomain.default : NoneType I0420 07:30:14.135278 140395597309760 base_runner.py:69] task.encoder.after_conv_lstm_cnn_tpl.random_seed : NoneType I0420 07:30:14.135358 140395597309760 base_runner.py:69] task.encoder.after_conv_lstm_cnn_tpl.skip_lp_regularization : NoneType I0420 07:30:14.135437 140395597309760 base_runner.py:69] task.encoder.after_conv_lstm_cnn_tpl.vn.global_vn : False I0420 07:30:14.135518 140395597309760 base_runner.py:69] task.encoder.after_conv_lstm_cnn_tpl.vn.per_step_vn : False I0420 07:30:14.135596 140395597309760 base_runner.py:69] task.encoder.after_conv_lstm_cnn_tpl.vn.scale : NoneType I0420 07:30:14.135677 
140395597309760 base_runner.py:69] task.encoder.after_conv_lstm_cnn_tpl.vn.seed : NoneType I0420 07:30:14.135762 140395597309760 base_runner.py:69] task.encoder.after_conv_lstm_cnn_tpl.weight_norm : False I0420 07:30:14.135844 140395597309760 base_runner.py:69] task.encoder.allow_implicit_capture : NoneType I0420 07:30:14.135925 140395597309760 base_runner.py:69] task.encoder.bidi_rnn_type : 'func' I0420 07:30:14.136006 140395597309760 base_runner.py:69] task.encoder.cls : type/lingvo.tasks.asr.encoder/AsrEncoder I0420 07:30:14.136085 140395597309760 base_runner.py:69] task.encoder.cnn_tpl.activation : 'RELU' I0420 07:30:14.136164 140395597309760 base_runner.py:69] task.encoder.cnn_tpl.allow_implicit_capture : NoneType I0420 07:30:14.136245 140395597309760 base_runner.py:69] task.encoder.cnn_tpl.batch_norm : True I0420 07:30:14.136323 140395597309760 base_runner.py:69] task.encoder.cnn_tpl.bias : False I0420 07:30:14.136404 140395597309760 base_runner.py:69] task.encoder.cnn_tpl.bn_decay : 0.999 I0420 07:30:14.136482 140395597309760 base_runner.py:69] task.encoder.cnn_tpl.bn_fold_weights : NoneType I0420 07:30:14.136563 140395597309760 base_runner.py:69] task.encoder.cnn_tpl.causal_convolution : False I0420 07:30:14.136642 140395597309760 base_runner.py:69] task.encoder.cnn_tpl.cls : type/lingvo.core.layers/Conv2DLayer I0420 07:30:14.136727 140395597309760 base_runner.py:69] task.encoder.cnn_tpl.conv_last : False I0420 07:30:14.136810 140395597309760 base_runner.py:69] task.encoder.cnn_tpl.dilation_rate : (1, 1) I0420 07:30:14.136890 140395597309760 base_runner.py:69] task.encoder.cnn_tpl.disable_activation_quantization : False I0420 07:30:14.136971 140395597309760 base_runner.py:69] task.encoder.cnn_tpl.dtype : float32 I0420 07:30:14.137051 140395597309760 base_runner.py:69] task.encoder.cnn_tpl.filter_shape : (0, 0, 0, 0) I0420 07:30:14.137130 140395597309760 base_runner.py:69] task.encoder.cnn_tpl.filter_stride : (0, 0) I0420 07:30:14.137208 140395597309760 
base_runner.py:69] task.encoder.cnn_tpl.fprop_dtype : NoneType I0420 07:30:14.137288 140395597309760 base_runner.py:69] task.encoder.cnn_tpl.inference_driver_name : NoneType I0420 07:30:14.137367 140395597309760 base_runner.py:69] task.encoder.cnn_tpl.is_eval : NoneType I0420 07:30:14.137447 140395597309760 base_runner.py:69] task.encoder.cnn_tpl.is_inference : NoneType I0420 07:30:14.137526 140395597309760 base_runner.py:69] task.encoder.cnn_tpl.name : '' I0420 07:30:14.137604 140395597309760 base_runner.py:69] task.encoder.cnn_tpl.params_init.method : 'gaussian' I0420 07:30:14.137685 140395597309760 base_runner.py:69] task.encoder.cnn_tpl.params_init.scale : 0.001 I0420 07:30:14.137772 140395597309760 base_runner.py:69] task.encoder.cnn_tpl.params_init.seed : NoneType I0420 07:30:14.137852 140395597309760 base_runner.py:69] task.encoder.cnn_tpl.qdomain.default : NoneType I0420 07:30:14.137940 140395597309760 base_runner.py:69] task.encoder.cnn_tpl.random_seed : NoneType I0420 07:30:14.138022 140395597309760 base_runner.py:69] task.encoder.cnn_tpl.skip_lp_regularization : NoneType I0420 07:30:14.138103 140395597309760 base_runner.py:69] task.encoder.cnn_tpl.vn.global_vn : False I0420 07:30:14.138181 140395597309760 base_runner.py:69] task.encoder.cnn_tpl.vn.per_step_vn : False I0420 07:30:14.138261 140395597309760 base_runner.py:69] task.encoder.cnn_tpl.vn.scale : NoneType I0420 07:30:14.138339 140395597309760 base_runner.py:69] task.encoder.cnn_tpl.vn.seed : NoneType I0420 07:30:14.138418 140395597309760 base_runner.py:69] task.encoder.cnn_tpl.weight_norm : False I0420 07:30:14.138498 140395597309760 base_runner.py:69] task.encoder.conv_filter_shapes : [(3, 3, 1, 32), (3, 3, 32, 32)] I0420 07:30:14.138577 140395597309760 base_runner.py:69] task.encoder.conv_filter_strides : [(2, 2), (2, 2)] I0420 07:30:14.138658 140395597309760 base_runner.py:69] task.encoder.conv_lstm_tpl.allow_implicit_capture : NoneType I0420 07:30:14.138742 140395597309760 base_runner.py:69] 
task.encoder.conv_lstm_tpl.cell_shape : ['NoneType', 'NoneType', 'NoneType', 'NoneType'] I0420 07:30:14.138825 140395597309760 base_runner.py:69] task.encoder.conv_lstm_tpl.cell_value_cap : 10.0 I0420 07:30:14.138905 140395597309760 base_runner.py:69] task.encoder.conv_lstm_tpl.cls : type/lingvo.core.rnn_cell/ConvLSTMCell I0420 07:30:14.138986 140395597309760 base_runner.py:69] task.encoder.conv_lstm_tpl.dtype : float32 I0420 07:30:14.139065 140395597309760 base_runner.py:69] task.encoder.conv_lstm_tpl.filter_shape : [1, 3] I0420 07:30:14.139143 140395597309760 base_runner.py:69] task.encoder.conv_lstm_tpl.fprop_dtype : NoneType I0420 07:30:14.139225 140395597309760 base_runner.py:69] task.encoder.conv_lstm_tpl.inference_driver_name : NoneType I0420 07:30:14.139303 140395597309760 base_runner.py:69] task.encoder.conv_lstm_tpl.inputs_arity : 1 I0420 07:30:14.139384 140395597309760 base_runner.py:69] task.encoder.conv_lstm_tpl.inputs_shape : ['NoneType', 'NoneType', 'NoneType', 'NoneType'] I0420 07:30:14.139463 140395597309760 base_runner.py:69] task.encoder.conv_lstm_tpl.is_eval : NoneType I0420 07:30:14.139543 140395597309760 base_runner.py:69] task.encoder.conv_lstm_tpl.is_inference : NoneType I0420 07:30:14.139622 140395597309760 base_runner.py:69] task.encoder.conv_lstm_tpl.name : '' I0420 07:30:14.139702 140395597309760 base_runner.py:69] task.encoder.conv_lstm_tpl.num_input_nodes : 0 I0420 07:30:14.139790 140395597309760 base_runner.py:69] task.encoder.conv_lstm_tpl.num_output_nodes : 0 I0420 07:30:14.139870 140395597309760 base_runner.py:69] task.encoder.conv_lstm_tpl.output_nonlinearity : True I0420 07:30:14.139950 140395597309760 base_runner.py:69] task.encoder.conv_lstm_tpl.params_init.method : 'truncated_gaussian' I0420 07:30:14.140031 140395597309760 base_runner.py:69] task.encoder.conv_lstm_tpl.params_init.scale : 0.1 I0420 07:30:14.140111 140395597309760 base_runner.py:69] task.encoder.conv_lstm_tpl.params_init.seed : NoneType I0420 07:30:14.140191 
140395597309760 base_runner.py:69] task.encoder.conv_lstm_tpl.qdomain.default : NoneType I0420 07:30:14.140269 140395597309760 base_runner.py:69] task.encoder.conv_lstm_tpl.random_seed : NoneType I0420 07:30:14.140348 140395597309760 base_runner.py:69] task.encoder.conv_lstm_tpl.reset_cell_state : False I0420 07:30:14.140429 140395597309760 base_runner.py:69] task.encoder.conv_lstm_tpl.skip_lp_regularization : NoneType I0420 07:30:14.140507 140395597309760 base_runner.py:69] task.encoder.conv_lstm_tpl.vn.global_vn : False I0420 07:30:14.140588 140395597309760 base_runner.py:69] task.encoder.conv_lstm_tpl.vn.per_step_vn : False I0420 07:30:14.140667 140395597309760 base_runner.py:69] task.encoder.conv_lstm_tpl.vn.scale : NoneType I0420 07:30:14.140753 140395597309760 base_runner.py:69] task.encoder.conv_lstm_tpl.vn.seed : NoneType I0420 07:30:14.140834 140395597309760 base_runner.py:69] task.encoder.conv_lstm_tpl.zero_state_init_params.method : 'zeros' I0420 07:30:14.140921 140395597309760 base_runner.py:69] task.encoder.conv_lstm_tpl.zero_state_init_params.seed : NoneType I0420 07:30:14.141004 140395597309760 base_runner.py:69] task.encoder.conv_lstm_tpl.zo_prob : 0.0 I0420 07:30:14.141084 140395597309760 base_runner.py:69] task.encoder.dtype : float32 I0420 07:30:14.141165 140395597309760 base_runner.py:69] task.encoder.extra_per_layer_outputs : False I0420 07:30:14.141244 140395597309760 base_runner.py:69] task.encoder.fprop_dtype : NoneType I0420 07:30:14.141324 140395597309760 base_runner.py:69] task.encoder.highway_skip : False I0420 07:30:14.141405 140395597309760 base_runner.py:69] task.encoder.highway_skip_tpl.allow_implicit_capture : NoneType I0420 07:30:14.141484 140395597309760 base_runner.py:69] task.encoder.highway_skip_tpl.batch_norm : False I0420 07:30:14.141565 140395597309760 base_runner.py:69] task.encoder.highway_skip_tpl.carry_bias_init : 1.0 I0420 07:30:14.141644 140395597309760 base_runner.py:69] task.encoder.highway_skip_tpl.cls : 
type/lingvo.core.layers/HighwaySkipLayer I0420 07:30:14.141730 140395597309760 base_runner.py:69] task.encoder.highway_skip_tpl.couple_carry_transform_gates : False I0420 07:30:14.141813 140395597309760 base_runner.py:69] task.encoder.highway_skip_tpl.dtype : float32 I0420 07:30:14.141894 140395597309760 base_runner.py:69] task.encoder.highway_skip_tpl.fprop_dtype : NoneType I0420 07:30:14.141973 140395597309760 base_runner.py:69] task.encoder.highway_skip_tpl.inference_driver_name : NoneType I0420 07:30:14.142054 140395597309760 base_runner.py:69] task.encoder.highway_skip_tpl.input_dim : 0 I0420 07:30:14.142133 140395597309760 base_runner.py:69] task.encoder.highway_skip_tpl.is_eval : NoneType I0420 07:30:14.142214 140395597309760 base_runner.py:69] task.encoder.highway_skip_tpl.is_inference : NoneType I0420 07:30:14.142293 140395597309760 base_runner.py:69] task.encoder.highway_skip_tpl.name : '' I0420 07:30:14.142373 140395597309760 base_runner.py:69] task.encoder.highway_skip_tpl.params_init.method : 'xavier' I0420 07:30:14.142453 140395597309760 base_runner.py:69] task.encoder.highway_skip_tpl.params_init.scale : 1.000001 I0420 07:30:14.142534 140395597309760 base_runner.py:69] task.encoder.highway_skip_tpl.params_init.seed : NoneType I0420 07:30:14.142612 140395597309760 base_runner.py:69] task.encoder.highway_skip_tpl.random_seed : NoneType I0420 07:30:14.142693 140395597309760 base_runner.py:69] task.encoder.highway_skip_tpl.skip_lp_regularization : NoneType I0420 07:30:14.142781 140395597309760 base_runner.py:69] task.encoder.highway_skip_tpl.vn.global_vn : False I0420 07:30:14.142863 140395597309760 base_runner.py:69] task.encoder.highway_skip_tpl.vn.per_step_vn : False I0420 07:30:14.142941 140395597309760 base_runner.py:69] task.encoder.highway_skip_tpl.vn.scale : NoneType I0420 07:30:14.143021 140395597309760 base_runner.py:69] task.encoder.highway_skip_tpl.vn.seed : NoneType I0420 07:30:14.143100 140395597309760 base_runner.py:69] 
task.encoder.inference_driver_name : NoneType I0420 07:30:14.143178 140395597309760 base_runner.py:69] task.encoder.input_shape : ['NoneType', 'NoneType', 80, 1] I0420 07:30:14.143259 140395597309760 base_runner.py:69] task.encoder.is_eval : NoneType I0420 07:30:14.143338 140395597309760 base_runner.py:69] task.encoder.is_inference : NoneType I0420 07:30:14.143419 140395597309760 base_runner.py:69] task.encoder.lstm_cell_size : 1024 I0420 07:30:14.143497 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.allow_implicit_capture : NoneType I0420 07:30:14.143578 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.apply_pruning : False I0420 07:30:14.143656 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.bias_init.method : 'constant' I0420 07:30:14.143760 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.bias_init.scale : 0.0 I0420 07:30:14.143840 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.bias_init.seed : 0 I0420 07:30:14.143918 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.cell_value_cap : 10.0 I0420 07:30:14.144005 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.cls : type/lingvo.core.rnn_cell/LSTMCellSimple I0420 07:30:14.144085 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.couple_input_forget_gates : False I0420 07:30:14.144164 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.dtype : float32 I0420 07:30:14.144243 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.enable_lstm_bias : True I0420 07:30:14.144320 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.forget_gate_bias : 0.0 I0420 07:30:14.144398 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.fprop_dtype : NoneType I0420 07:30:14.144475 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.inference_driver_name : NoneType I0420 07:30:14.144553 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.inputs_arity : 1 I0420 07:30:14.144632 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.is_eval : 
NoneType I0420 07:30:14.144709 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.is_inference : NoneType I0420 07:30:14.144798 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.name : '' I0420 07:30:14.144877 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.num_hidden_nodes : 0 I0420 07:30:14.144958 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.num_input_nodes : 0 I0420 07:30:14.145035 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.num_output_nodes : 0 I0420 07:30:14.145113 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.output_nonlinearity : True I0420 07:30:14.145191 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.params_init.method : 'uniform' I0420 07:30:14.145270 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.params_init.scale : 0.1 I0420 07:30:14.145348 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.params_init.seed : NoneType I0420 07:30:14.145426 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.qdomain.c_state : NoneType I0420 07:30:14.145504 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.qdomain.default : NoneType I0420 07:30:14.145581 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.qdomain.fullyconnected : NoneType I0420 07:30:14.145661 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.qdomain.m_state : NoneType I0420 07:30:14.145744 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.qdomain.weight : NoneType I0420 07:30:14.145826 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.random_seed : NoneType I0420 07:30:14.145904 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.reset_cell_state : False I0420 07:30:14.145982 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.skip_lp_regularization : NoneType I0420 07:30:14.146061 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.vn.global_vn : False I0420 07:30:14.146140 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.vn.per_step_vn : False I0420 
07:30:14.146218 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.vn.scale : NoneType I0420 07:30:14.146296 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.vn.seed : NoneType I0420 07:30:14.146373 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.zero_state_init_params.method : 'zeros' I0420 07:30:14.146450 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.zero_state_init_params.seed : NoneType I0420 07:30:14.146528 140395597309760 base_runner.py:69] task.encoder.lstm_tpl.zo_prob : 0.0 I0420 07:30:14.146608 140395597309760 base_runner.py:69] task.encoder.name : '' I0420 07:30:14.146686 140395597309760 base_runner.py:69] task.encoder.num_cnn_layers : 2 I0420 07:30:14.146769 140395597309760 base_runner.py:69] task.encoder.num_conv_lstm_layers : 0 I0420 07:30:14.146848 140395597309760 base_runner.py:69] task.encoder.num_lstm_layers : 4 I0420 07:30:14.146925 140395597309760 base_runner.py:69] task.encoder.packed_input : False I0420 07:30:14.147005 140395597309760 base_runner.py:69] task.encoder.pad_steps : 6 I0420 07:30:14.147083 140395597309760 base_runner.py:69] task.encoder.params_init.method : 'xavier' I0420 07:30:14.147161 140395597309760 base_runner.py:69] task.encoder.params_init.scale : 1.000001 I0420 07:30:14.147245 140395597309760 base_runner.py:69] task.encoder.params_init.seed : NoneType I0420 07:30:14.147325 140395597309760 base_runner.py:69] task.encoder.proj_tpl.activation : 'RELU' I0420 07:30:14.147403 140395597309760 base_runner.py:69] task.encoder.proj_tpl.affine_last : False I0420 07:30:14.147480 140395597309760 base_runner.py:69] task.encoder.proj_tpl.allow_implicit_capture : NoneType I0420 07:30:14.147558 140395597309760 base_runner.py:69] task.encoder.proj_tpl.batch_norm : True I0420 07:30:14.147636 140395597309760 base_runner.py:69] task.encoder.proj_tpl.bias_init : 0.0 I0420 07:30:14.147715 140395597309760 base_runner.py:69] task.encoder.proj_tpl.bn_fold_weights : NoneType I0420 07:30:14.147800 140395597309760 
base_runner.py:69] task.encoder.proj_tpl.cls : type/lingvo.core.layers/ProjectionLayer I0420 07:30:14.147881 140395597309760 base_runner.py:69] task.encoder.proj_tpl.dtype : float32 I0420 07:30:14.147958 140395597309760 base_runner.py:69] task.encoder.proj_tpl.fprop_dtype : NoneType I0420 07:30:14.148036 140395597309760 base_runner.py:69] task.encoder.proj_tpl.has_bias : False I0420 07:30:14.148116 140395597309760 base_runner.py:69] task.encoder.proj_tpl.inference_driver_name : NoneType I0420 07:30:14.148194 140395597309760 base_runner.py:69] task.encoder.proj_tpl.input_dim : 0 I0420 07:30:14.148272 140395597309760 base_runner.py:69] task.encoder.proj_tpl.is_eval : NoneType I0420 07:30:14.148350 140395597309760 base_runner.py:69] task.encoder.proj_tpl.is_inference : NoneType I0420 07:30:14.148427 140395597309760 base_runner.py:69] task.encoder.proj_tpl.name : '' I0420 07:30:14.148505 140395597309760 base_runner.py:69] task.encoder.proj_tpl.output_dim : 0 I0420 07:30:14.148583 140395597309760 base_runner.py:69] task.encoder.proj_tpl.params_init.method : 'truncated_gaussian' I0420 07:30:14.148662 140395597309760 base_runner.py:69] task.encoder.proj_tpl.params_init.scale : 0.1 I0420 07:30:14.148745 140395597309760 base_runner.py:69] task.encoder.proj_tpl.params_init.seed : NoneType I0420 07:30:14.148827 140395597309760 base_runner.py:69] task.encoder.proj_tpl.qdomain.default : NoneType I0420 07:30:14.148905 140395597309760 base_runner.py:69] task.encoder.proj_tpl.random_seed : NoneType I0420 07:30:14.148983 140395597309760 base_runner.py:69] task.encoder.proj_tpl.skip_lp_regularization : NoneType I0420 07:30:14.149061 140395597309760 base_runner.py:69] task.encoder.proj_tpl.vn.global_vn : False I0420 07:30:14.149141 140395597309760 base_runner.py:69] task.encoder.proj_tpl.vn.per_step_vn : False I0420 07:30:14.149219 140395597309760 base_runner.py:69] task.encoder.proj_tpl.vn.scale : NoneType I0420 07:30:14.149296 140395597309760 base_runner.py:69] 
task.encoder.proj_tpl.vn.seed : NoneType I0420 07:30:14.149374 140395597309760 base_runner.py:69] task.encoder.proj_tpl.weight_norm : False I0420 07:30:14.149451 140395597309760 base_runner.py:69] task.encoder.project_lstm_output : True I0420 07:30:14.149529 140395597309760 base_runner.py:69] task.encoder.random_seed : NoneType I0420 07:30:14.149607 140395597309760 base_runner.py:69] task.encoder.residual_start : 0 I0420 07:30:14.149687 140395597309760 base_runner.py:69] task.encoder.residual_stride : 1 I0420 07:30:14.149770 140395597309760 base_runner.py:69] task.encoder.skip_lp_regularization : NoneType I0420 07:30:14.149851 140395597309760 base_runner.py:69] task.encoder.vn.global_vn : False I0420 07:30:14.149930 140395597309760 base_runner.py:69] task.encoder.vn.per_step_vn : False I0420 07:30:14.150007 140395597309760 base_runner.py:69] task.encoder.vn.scale : NoneType I0420 07:30:14.150085 140395597309760 base_runner.py:69] task.encoder.vn.seed : NoneType I0420 07:30:14.150163 140395597309760 base_runner.py:69] task.eval.decoder_samples_per_summary : 0 I0420 07:30:14.150242 140395597309760 base_runner.py:69] task.eval.samples_per_summary : 5000 I0420 07:30:14.150321 140395597309760 base_runner.py:69] task.fprop_dtype : NoneType I0420 07:30:14.150398 140395597309760 base_runner.py:69] task.frontend : NoneType I0420 07:30:14.150489 140395597309760 base_runner.py:69] task.inference_driver_name : NoneType I0420 07:30:14.150569 140395597309760 base_runner.py:69] task.input : NoneType I0420 07:30:14.150648 140395597309760 base_runner.py:69] task.is_eval : NoneType I0420 07:30:14.150729 140395597309760 base_runner.py:69] task.is_inference : NoneType I0420 07:30:14.150811 140395597309760 base_runner.py:69] task.name : 'librispeech' I0420 07:30:14.150891 140395597309760 base_runner.py:69] task.online_encoder : NoneType I0420 07:30:14.150969 140395597309760 base_runner.py:69] task.params_init.method : 'xavier' I0420 07:30:14.151047 140395597309760 base_runner.py:69] 
task.params_init.scale : 1.000001 I0420 07:30:14.151125 140395597309760 base_runner.py:69] task.params_init.seed : NoneType I0420 07:30:14.151205 140395597309760 base_runner.py:69] task.random_seed : NoneType I0420 07:30:14.151283 140395597309760 base_runner.py:69] task.skip_lp_regularization : NoneType I0420 07:30:14.151360 140395597309760 base_runner.py:69] task.target_key : '' I0420 07:30:14.151438 140395597309760 base_runner.py:69] task.train.bprop_variable_filter : NoneType I0420 07:30:14.151516 140395597309760 base_runner.py:69] task.train.clip_gradient_norm_to_value : 1.0 I0420 07:30:14.151593 140395597309760 base_runner.py:69] task.train.clip_gradient_single_norm_to_value : 0.0 I0420 07:30:14.151671 140395597309760 base_runner.py:69] task.train.colocate_gradients_with_ops : True I0420 07:30:14.151755 140395597309760 base_runner.py:69] task.train.early_stop.metric_history.jobname : 'eval_dev' I0420 07:30:14.151834 140395597309760 base_runner.py:69] task.train.early_stop.metric_history.local_filesystem : False I0420 07:30:14.151913 140395597309760 base_runner.py:69] task.train.early_stop.metric_history.logdir : '' I0420 07:30:14.151992 140395597309760 base_runner.py:69] task.train.early_stop.metric_history.metric : 'log_pplx' I0420 07:30:14.152070 140395597309760 base_runner.py:69] task.train.early_stop.metric_history.minimize : True I0420 07:30:14.152148 140395597309760 base_runner.py:69] task.train.early_stop.metric_history.name : 'MetricHistory' I0420 07:30:14.152226 140395597309760 base_runner.py:69] task.train.early_stop.metric_history.tfevent_file : False I0420 07:30:14.152304 140395597309760 base_runner.py:69] task.train.early_stop.name : 'EarlyStop' I0420 07:30:14.152383 140395597309760 base_runner.py:69] task.train.early_stop.tolerance : 0.0 I0420 07:30:14.152461 140395597309760 base_runner.py:69] task.train.early_stop.verbose : True I0420 07:30:14.152538 140395597309760 base_runner.py:69] task.train.early_stop.window : 0 I0420 07:30:14.152615 
140395597309760 base_runner.py:69] task.train.ema_decay : 0.0 I0420 07:30:14.152693 140395597309760 base_runner.py:69] task.train.gate_gradients : False I0420 07:30:14.152776 140395597309760 base_runner.py:69] task.train.grad_aggregation_method : 1 I0420 07:30:14.152857 140395597309760 base_runner.py:69] task.train.grad_norm_to_clip_to_zero : 100.0 I0420 07:30:14.152935 140395597309760 base_runner.py:69] task.train.grad_norm_tracker : NoneType I0420 07:30:14.153012 140395597309760 base_runner.py:69] task.train.init_from_checkpoint_rules : {} I0420 07:30:14.153090 140395597309760 base_runner.py:69] task.train.l1_regularizer_weight : NoneType I0420 07:30:14.153168 140395597309760 base_runner.py:69] task.train.l2_regularizer_weight : 1e-06 I0420 07:30:14.153244 140395597309760 base_runner.py:69] task.train.learning_rate : 0.00025 I0420 07:30:14.153321 140395597309760 base_runner.py:69] task.train.lr_schedule.allow_implicit_capture : NoneType I0420 07:30:14.153400 140395597309760 base_runner.py:69] task.train.lr_schedule.cls : type/lingvo.core.lr_schedule/ContinuousLearningRateSchedule I0420 07:30:14.153480 140395597309760 base_runner.py:69] task.train.lr_schedule.dtype : float32 I0420 07:30:14.153557 140395597309760 base_runner.py:69] task.train.lr_schedule.fprop_dtype : NoneType I0420 07:30:14.153634 140395597309760 base_runner.py:69] task.train.lr_schedule.half_life_steps : 100000 I0420 07:30:14.153719 140395597309760 base_runner.py:69] task.train.lr_schedule.inference_driver_name : NoneType I0420 07:30:14.153809 140395597309760 base_runner.py:69] task.train.lr_schedule.initial_value : 1.0 I0420 07:30:14.153887 140395597309760 base_runner.py:69] task.train.lr_schedule.is_eval : NoneType I0420 07:30:14.153983 140395597309760 base_runner.py:69] task.train.lr_schedule.is_inference : NoneType I0420 07:30:14.154061 140395597309760 base_runner.py:69] task.train.lr_schedule.min : 0.01 I0420 07:30:14.154139 140395597309760 base_runner.py:69] task.train.lr_schedule.name : 
'LRSched' I0420 07:30:14.154215 140395597309760 base_runner.py:69] task.train.lr_schedule.params_init.method : 'xavier' I0420 07:30:14.154292 140395597309760 base_runner.py:69] task.train.lr_schedule.params_init.scale : 1.000001 I0420 07:30:14.154370 140395597309760 base_runner.py:69] task.train.lr_schedule.params_init.seed : NoneType I0420 07:30:14.154447 140395597309760 base_runner.py:69] task.train.lr_schedule.random_seed : NoneType I0420 07:30:14.154524 140395597309760 base_runner.py:69] task.train.lr_schedule.skip_lp_regularization : NoneType I0420 07:30:14.154602 140395597309760 base_runner.py:69] task.train.lr_schedule.start_step : 50000 I0420 07:30:14.154678 140395597309760 base_runner.py:69] task.train.lr_schedule.vn.global_vn : False I0420 07:30:14.154759 140395597309760 base_runner.py:69] task.train.lr_schedule.vn.per_step_vn : False I0420 07:30:14.154838 140395597309760 base_runner.py:69] task.train.lr_schedule.vn.scale : NoneType I0420 07:30:14.154913 140395597309760 base_runner.py:69] task.train.lr_schedule.vn.seed : NoneType I0420 07:30:14.154990 140395597309760 base_runner.py:69] task.train.max_steps : 4000000 I0420 07:30:14.155067 140395597309760 base_runner.py:69] task.train.optimizer.allow_implicit_capture : NoneType I0420 07:30:14.155144 140395597309760 base_runner.py:69] task.train.optimizer.beta1 : 0.9 I0420 07:30:14.155221 140395597309760 base_runner.py:69] task.train.optimizer.beta2 : 0.999 I0420 07:30:14.155299 140395597309760 base_runner.py:69] task.train.optimizer.cls : type/lingvo.core.optimizer/Adam I0420 07:30:14.155376 140395597309760 base_runner.py:69] task.train.optimizer.dtype : float32 I0420 07:30:14.155453 140395597309760 base_runner.py:69] task.train.optimizer.epsilon : 1e-06 I0420 07:30:14.155530 140395597309760 base_runner.py:69] task.train.optimizer.fprop_dtype : NoneType I0420 07:30:14.155607 140395597309760 base_runner.py:69] task.train.optimizer.inference_driver_name : NoneType I0420 07:30:14.155685 140395597309760 
base_runner.py:69] task.train.optimizer.is_eval : NoneType I0420 07:30:14.155767 140395597309760 base_runner.py:69] task.train.optimizer.is_inference : NoneType I0420 07:30:14.155847 140395597309760 base_runner.py:69] task.train.optimizer.name : 'Adam' I0420 07:30:14.155924 140395597309760 base_runner.py:69] task.train.optimizer.params_init.method : 'xavier' I0420 07:30:14.156001 140395597309760 base_runner.py:69] task.train.optimizer.params_init.scale : 1.000001 I0420 07:30:14.156078 140395597309760 base_runner.py:69] task.train.optimizer.params_init.seed : NoneType I0420 07:30:14.156155 140395597309760 base_runner.py:69] task.train.optimizer.random_seed : NoneType I0420 07:30:14.156232 140395597309760 base_runner.py:69] task.train.optimizer.skip_lp_regularization : NoneType I0420 07:30:14.156308 140395597309760 base_runner.py:69] task.train.optimizer.vn.global_vn : False I0420 07:30:14.156385 140395597309760 base_runner.py:69] task.train.optimizer.vn.per_step_vn : False I0420 07:30:14.156461 140395597309760 base_runner.py:69] task.train.optimizer.vn.scale : NoneType I0420 07:30:14.156538 140395597309760 base_runner.py:69] task.train.optimizer.vn.seed : NoneType I0420 07:30:14.156616 140395597309760 base_runner.py:69] task.train.pruning_hparams_dict : NoneType I0420 07:30:14.156693 140395597309760 base_runner.py:69] task.train.save_interval_seconds : 600 I0420 07:30:14.156775 140395597309760 base_runner.py:69] task.train.start_up_delay_steps : 200 I0420 07:30:14.156853 140395597309760 base_runner.py:69] task.train.summary_interval_steps : 100 I0420 07:30:14.156939 140395597309760 base_runner.py:69] task.train.tpu_steps_per_loop : 20 I0420 07:30:14.157018 140395597309760 base_runner.py:69] task.train.vn_start_step : 20000 I0420 07:30:14.157095 140395597309760 base_runner.py:69] task.train.vn_std : 0.075 I0420 07:30:14.157172 140395597309760 base_runner.py:69] task.vn.global_vn : True I0420 07:30:14.157249 140395597309760 base_runner.py:69] task.vn.per_step_vn : 
False I0420 07:30:14.157327 140395597309760 base_runner.py:69] task.vn.scale : NoneType I0420 07:30:14.157403 140395597309760 base_runner.py:69] task.vn.seed : NoneType I0420 07:30:14.157480 140395597309760 base_runner.py:69] train.early_stop.metric_history.jobname : 'eval_dev' I0420 07:30:14.157557 140395597309760 base_runner.py:69] train.early_stop.metric_history.local_filesystem : False I0420 07:30:14.157634 140395597309760 base_runner.py:69] train.early_stop.metric_history.logdir : '' I0420 07:30:14.157710 140395597309760 base_runner.py:69] train.early_stop.metric_history.metric : 'log_pplx' I0420 07:30:14.157793 140395597309760 base_runner.py:69] train.early_stop.metric_history.minimize : True I0420 07:30:14.157871 140395597309760 base_runner.py:69] train.early_stop.metric_history.name : 'MetricHistory' I0420 07:30:14.157948 140395597309760 base_runner.py:69] train.early_stop.metric_history.tfevent_file : False I0420 07:30:14.158025 140395597309760 base_runner.py:69] train.early_stop.name : 'EarlyStop' I0420 07:30:14.158102 140395597309760 base_runner.py:69] train.early_stop.tolerance : 0.0 I0420 07:30:14.158179 140395597309760 base_runner.py:69] train.early_stop.verbose : True I0420 07:30:14.158255 140395597309760 base_runner.py:69] train.early_stop.window : 0 I0420 07:30:14.158344 140395597309760 base_runner.py:69] train.ema_decay : 0.0 I0420 07:30:14.158476 140395597309760 base_runner.py:69] train.init_from_checkpoint_rules : {} I0420 07:30:14.158565 140395597309760 base_runner.py:69] train.max_steps : 4000000 I0420 07:30:14.158658 140395597309760 base_runner.py:69] train.save_interval_seconds : 600 I0420 07:30:14.158746 140395597309760 base_runner.py:69] train.start_up_delay_steps : 200 I0420 07:30:14.158827 140395597309760 base_runner.py:69] train.summary_interval_steps : 100 I0420 07:30:14.158905 140395597309760 base_runner.py:69] train.tpu_steps_per_loop : 20 I0420 07:30:14.158983 140395597309760 base_runner.py:69] vn.global_vn : True I0420 
07:30:14.159061 140395597309760 base_runner.py:69] vn.per_step_vn : False I0420 07:30:14.159140 140395597309760 base_runner.py:69] vn.scale : NoneType I0420 07:30:14.159218 140395597309760 base_runner.py:69] vn.seed : NoneType I0420 07:30:14.159296 140395597309760 base_runner.py:69] I0420 07:30:14.159385 140395597309760 base_runner.py:70] ============================================================ I0420 07:30:14.161322 140395597309760 base_runner.py:115] Starting ... W0420 07:30:14.161540 140395597309760 deprecation_wrapper.py:119] From /data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/trainer.py:186: The name tf.gfile.MakeDirs is deprecated. Please use tf.io.gfile.makedirs instead. W0420 07:30:14.162086 140395597309760 deprecation_wrapper.py:119] From /data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/base_runner.py:324: The name tf.summary.FileWriter is deprecated. Please use tf.compat.v1.summary.FileWriter instead. W0420 07:30:14.162875 140395597309760 deprecation_wrapper.py:119] From /data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/trainer.py:192: The name tf.container is deprecated. Please use tf.compat.v1.container instead. I0420 07:30:14.163144 140395597309760 cluster.py:429] _LeastLoadedPlacer : ['/job:local/replica:0/task:0/device:CPU:0'] W0420 07:30:14.169799 140395597309760 deprecation_wrapper.py:119] From /data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/core/py_utils.py:1258: The name tf.variable_scope is deprecated. Please use tf.compat.v1.variable_scope instead. W0420 07:30:14.170238 140395597309760 deprecation_wrapper.py:119] From /data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/core/py_utils.py:1260: The name tf.train.get_or_create_global_step is deprecated. Please use tf.compat.v1.train.get_or_create_global_step instead. 
I0420 07:30:14.174882 140395597309760 cluster.py:447] Place variable global_step on /job:local/replica:0/task:0/device:CPU:0 8 W0420 07:30:14.189157 140395597309760 deprecation_wrapper.py:119] From /data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/core/py_utils.py:1250: The name tf.train.get_global_step is deprecated. Please use tf.compat.v1.train.get_global_step instead. I0420 07:30:14.189383 140395597309760 base_model.py:1116] Training parameters for : { early_stop: { metric_history: { jobname: "eval_dev" local_filesystem: False logdir: "/data/dingzhenyou/speech_data/librispeech/log/" metric: "log_pplx" minimize: True name: "MetricHistory" tfevent_file: False } name: "EarlyStop" tolerance: 0.0 verbose: True window: 0 } ema_decay: 0.0 init_from_checkpoint_rules: {} max_steps: 4000000 save_interval_seconds: 600 start_up_delay_steps: 200 summary_interval_steps: 100 tpu_steps_per_loop: 20 } I0420 07:30:14.208252 140395597309760 base_input_generator.py:510] bucket_batch_limit [64, 32, 32, 32, 32, 32, 32, 32] W0420 07:30:14.209599 140395597309760 deprecation_wrapper.py:119] From /data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/tasks/asr/input_generator.py:47: The name tf.VarLenFeature is deprecated. Please use tf.io.VarLenFeature instead. W0420 07:30:14.209780 140395597309760 deprecation_wrapper.py:119] From /data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/tasks/asr/input_generator.py:51: The name tf.parse_single_example is deprecated. Please use tf.io.parse_single_example instead. 
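(Aside, not part of the log: every `deprecation_wrapper` warning above follows the same pattern — a TF1 symbol with a suggested replacement under `tf.compat.v1` or `tf.io`. A minimal, hypothetical sketch of scripting those renames over one's own source files with the Python standard library; the mapping below is taken only from the warnings shown in this log, not from any official upgrade tool:)

```python
import re

# Renames copied verbatim from the deprecation warnings in this log.
RENAMES = {
    "tf.logging.info": "tf.compat.v1.logging.info",
    "tf.train.get_global_step": "tf.compat.v1.train.get_global_step",
    "tf.gfile.MakeDirs": "tf.io.gfile.makedirs",
    "tf.VarLenFeature": "tf.io.VarLenFeature",
    "tf.parse_single_example": "tf.io.parse_single_example",
}

# Match longest names first so a shorter name never shadows a longer one.
_PATTERN = re.compile(
    "|".join(re.escape(k) for k in sorted(RENAMES, key=len, reverse=True))
)

def upgrade_tf1_names(source: str) -> str:
    """Rewrite deprecated TF1 symbol names to the replacements the log suggests."""
    return _PATTERN.sub(lambda m: RENAMES[m.group(0)], source)
```

For a real migration, TensorFlow ships its own `tf_upgrade_v2` script, which covers far more than these five symbols; the sketch only silences the warnings visible here.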
I0420 07:30:14.308718 140395597309760 cluster.py:447] Place variable librispeech/enc/conv_L0/w/var on /job:local/replica:0/task:0/device:CPU:0 1160 I0420 07:30:14.311394 140395597309760 py_utils.py:1220] Creating var librispeech/enc/conv_L0/w/var:0 shape=(3, 3, 1, 32) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.318114 140395597309760 cluster.py:447] Place variable librispeech/enc/conv_L0/beta/var on /job:local/replica:0/task:0/device:CPU:0 1288 I0420 07:30:14.320282 140395597309760 py_utils.py:1220] Creating var librispeech/enc/conv_L0/beta/var:0 shape=(32,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.323750 140395597309760 cluster.py:447] Place variable librispeech/enc/conv_L0/gamma/var on /job:local/replica:0/task:0/device:CPU:0 1416 I0420 07:30:14.325911 140395597309760 py_utils.py:1220] Creating var librispeech/enc/conv_L0/gamma/var:0 shape=(32,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.330802 140395597309760 cluster.py:447] Place variable librispeech/enc/conv_L0/moving_mean/var on /job:local/replica:0/task:0/device:CPU:0 1544 I0420 07:30:14.332957 140395597309760 py_utils.py:1220] Creating var librispeech/enc/conv_L0/moving_mean/var:0 shape=(32,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.336487 140395597309760 cluster.py:447] Place variable librispeech/enc/conv_L0/moving_variance/var on /job:local/replica:0/task:0/device:CPU:0 1672 I0420 07:30:14.338845 140395597309760 py_utils.py:1220] Creating var librispeech/enc/conv_L0/moving_variance/var:0 shape=(32,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.349708 140395597309760 cluster.py:447] Place variable librispeech/enc/conv_L1/w/var on /job:local/replica:0/task:0/device:CPU:0 38536 I0420 07:30:14.352178 140395597309760 py_utils.py:1220] Creating var librispeech/enc/conv_L1/w/var:0 shape=(3, 3, 32, 32) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.359050 140395597309760 
cluster.py:447] Place variable librispeech/enc/conv_L1/beta/var on /job:local/replica:0/task:0/device:CPU:0 38664 I0420 07:30:14.361227 140395597309760 py_utils.py:1220] Creating var librispeech/enc/conv_L1/beta/var:0 shape=(32,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.364701 140395597309760 cluster.py:447] Place variable librispeech/enc/conv_L1/gamma/var on /job:local/replica:0/task:0/device:CPU:0 38792 I0420 07:30:14.366914 140395597309760 py_utils.py:1220] Creating var librispeech/enc/conv_L1/gamma/var:0 shape=(32,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.371793 140395597309760 cluster.py:447] Place variable librispeech/enc/conv_L1/moving_mean/var on /job:local/replica:0/task:0/device:CPU:0 38920 I0420 07:30:14.373955 140395597309760 py_utils.py:1220] Creating var librispeech/enc/conv_L1/moving_mean/var:0 shape=(32,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.377459 140395597309760 cluster.py:447] Place variable librispeech/enc/conv_L1/moving_variance/var on /job:local/replica:0/task:0/device:CPU:0 39048 I0420 07:30:14.379628 140395597309760 py_utils.py:1220] Creating var librispeech/enc/conv_L1/moving_variance/var:0 shape=(32,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.406703 140395597309760 cluster.py:447] Place variable librispeech/enc/fwd_rnn_L0/wm/var on /job:local/replica:0/task:0/device:CPU:0 27302024 I0420 07:30:14.409185 140395597309760 py_utils.py:1220] Creating var librispeech/enc/fwd_rnn_L0/wm/var:0 shape=(1664, 4096) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.418040 140395597309760 cluster.py:447] Place variable librispeech/enc/fwd_rnn_L0/b/var on /job:local/replica:0/task:0/device:CPU:0 27318408 I0420 07:30:14.420245 140395597309760 py_utils.py:1220] Creating var librispeech/enc/fwd_rnn_L0/b/var:0 shape=(4096,) on device /job:local/replica:0/task:0/device:CPU:0 W0420 07:30:14.428427 140395597309760 deprecation_wrapper.py:119] 
From /data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/core/summary_utils.py:41: The name tf.summary.histogram is deprecated. Please use tf.compat.v1.summary.histogram instead. I0420 07:30:14.449314 140395597309760 cluster.py:447] Place variable librispeech/enc/bak_rnn_L0/wm/var on /job:local/replica:0/task:0/device:CPU:0 54581384 I0420 07:30:14.451812 140395597309760 py_utils.py:1220] Creating var librispeech/enc/bak_rnn_L0/wm/var:0 shape=(1664, 4096) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.460689 140395597309760 cluster.py:447] Place variable librispeech/enc/bak_rnn_L0/b/var on /job:local/replica:0/task:0/device:CPU:0 54597768 I0420 07:30:14.463033 140395597309760 py_utils.py:1220] Creating var librispeech/enc/bak_rnn_L0/b/var:0 shape=(4096,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.496942 140395597309760 cluster.py:447] Place variable librispeech/enc/fwd_rnn_L1/wm/var on /job:local/replica:0/task:0/device:CPU:0 104929416 I0420 07:30:14.499413 140395597309760 py_utils.py:1220] Creating var librispeech/enc/fwd_rnn_L1/wm/var:0 shape=(3072, 4096) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.509780 140395597309760 cluster.py:447] Place variable librispeech/enc/fwd_rnn_L1/b/var on /job:local/replica:0/task:0/device:CPU:0 104945800 I0420 07:30:14.512828 140395597309760 py_utils.py:1220] Creating var librispeech/enc/fwd_rnn_L1/b/var:0 shape=(4096,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.551914 140395597309760 cluster.py:447] Place variable librispeech/enc/bak_rnn_L1/wm/var on /job:local/replica:0/task:0/device:CPU:0 155277448 I0420 07:30:14.555337 140395597309760 py_utils.py:1220] Creating var librispeech/enc/bak_rnn_L1/wm/var:0 shape=(3072, 4096) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.567281 140395597309760 cluster.py:447] Place variable librispeech/enc/bak_rnn_L1/b/var on /job:local/replica:0/task:0/device:CPU:0 
155293832 I0420 07:30:14.570199 140395597309760 py_utils.py:1220] Creating var librispeech/enc/bak_rnn_L1/b/var:0 shape=(4096,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.615386 140395597309760 cluster.py:447] Place variable librispeech/enc/fwd_rnn_L2/wm/var on /job:local/replica:0/task:0/device:CPU:0 205625480 I0420 07:30:14.618185 140395597309760 py_utils.py:1220] Creating var librispeech/enc/fwd_rnn_L2/wm/var:0 shape=(3072, 4096) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.629010 140395597309760 cluster.py:447] Place variable librispeech/enc/fwd_rnn_L2/b/var on /job:local/replica:0/task:0/device:CPU:0 205641864 I0420 07:30:14.631587 140395597309760 py_utils.py:1220] Creating var librispeech/enc/fwd_rnn_L2/b/var:0 shape=(4096,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.664180 140395597309760 cluster.py:447] Place variable librispeech/enc/bak_rnn_L2/wm/var on /job:local/replica:0/task:0/device:CPU:0 255973512 I0420 07:30:14.666968 140395597309760 py_utils.py:1220] Creating var librispeech/enc/bak_rnn_L2/wm/var:0 shape=(3072, 4096) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.676595 140395597309760 cluster.py:447] Place variable librispeech/enc/bak_rnn_L2/b/var on /job:local/replica:0/task:0/device:CPU:0 255989896 I0420 07:30:14.678953 140395597309760 py_utils.py:1220] Creating var librispeech/enc/bak_rnn_L2/b/var:0 shape=(4096,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.717291 140395597309760 cluster.py:447] Place variable librispeech/enc/fwd_rnn_L3/wm/var on /job:local/replica:0/task:0/device:CPU:0 306321544 I0420 07:30:14.720068 140395597309760 py_utils.py:1220] Creating var librispeech/enc/fwd_rnn_L3/wm/var:0 shape=(3072, 4096) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.729692 140395597309760 cluster.py:447] Place variable librispeech/enc/fwd_rnn_L3/b/var on /job:local/replica:0/task:0/device:CPU:0 306337928 I0420 
07:30:14.733038 140395597309760 py_utils.py:1220] Creating var librispeech/enc/fwd_rnn_L3/b/var:0 shape=(4096,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.765098 140395597309760 cluster.py:447] Place variable librispeech/enc/bak_rnn_L3/wm/var on /job:local/replica:0/task:0/device:CPU:0 356669576 I0420 07:30:14.767899 140395597309760 py_utils.py:1220] Creating var librispeech/enc/bak_rnn_L3/wm/var:0 shape=(3072, 4096) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.777565 140395597309760 cluster.py:447] Place variable librispeech/enc/bak_rnn_L3/b/var on /job:local/replica:0/task:0/device:CPU:0 356685960 I0420 07:30:14.779922 140395597309760 py_utils.py:1220] Creating var librispeech/enc/bak_rnn_L3/b/var:0 shape=(4096,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.805918 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L0/w/var on /job:local/replica:0/task:0/device:CPU:0 373463176 I0420 07:30:14.808506 140395597309760 py_utils.py:1220] Creating var librispeech/enc/proj_L0/w/var:0 shape=(2048, 2048) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.816268 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L0/beta/var on /job:local/replica:0/task:0/device:CPU:0 373471368 I0420 07:30:14.818604 140395597309760 py_utils.py:1220] Creating var librispeech/enc/proj_L0/beta/var:0 shape=(2048,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.822424 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L0/gamma/var on /job:local/replica:0/task:0/device:CPU:0 373479560 I0420 07:30:14.824768 140395597309760 py_utils.py:1220] Creating var librispeech/enc/proj_L0/gamma/var:0 shape=(2048,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.830056 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L0/moving_mean/var on /job:local/replica:0/task:0/device:CPU:0 373487752 I0420 07:30:14.832406 140395597309760 
py_utils.py:1220] Creating var librispeech/enc/proj_L0/moving_mean/var:0 shape=(2048,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.836292 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L0/moving_variance/var on /job:local/replica:0/task:0/device:CPU:0 373495944 I0420 07:30:14.838644 140395597309760 py_utils.py:1220] Creating var librispeech/enc/proj_L0/moving_variance/var:0 shape=(2048,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.850867 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L1/w/var on /job:local/replica:0/task:0/device:CPU:0 390273160 I0420 07:30:14.853652 140395597309760 py_utils.py:1220] Creating var librispeech/enc/proj_L1/w/var:0 shape=(2048, 2048) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.861360 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L1/beta/var on /job:local/replica:0/task:0/device:CPU:0 390281352 I0420 07:30:14.863693 140395597309760 py_utils.py:1220] Creating var librispeech/enc/proj_L1/beta/var:0 shape=(2048,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.868551 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L1/gamma/var on /job:local/replica:0/task:0/device:CPU:0 390289544 I0420 07:30:14.873275 140395597309760 py_utils.py:1220] Creating var librispeech/enc/proj_L1/gamma/var:0 shape=(2048,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.879584 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L1/moving_mean/var on /job:local/replica:0/task:0/device:CPU:0 390297736 I0420 07:30:14.882167 140395597309760 py_utils.py:1220] Creating var librispeech/enc/proj_L1/moving_mean/var:0 shape=(2048,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.886128 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L1/moving_variance/var on /job:local/replica:0/task:0/device:CPU:0 390305928 I0420 07:30:14.888609 
140395597309760 py_utils.py:1220] Creating var librispeech/enc/proj_L1/moving_variance/var:0 shape=(2048,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.899238 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L2/w/var on /job:local/replica:0/task:0/device:CPU:0 407083144 I0420 07:30:14.901216 140395597309760 py_utils.py:1220] Creating var librispeech/enc/proj_L2/w/var:0 shape=(2048, 2048) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.906749 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L2/beta/var on /job:local/replica:0/task:0/device:CPU:0 407091336 I0420 07:30:14.908530 140395597309760 py_utils.py:1220] Creating var librispeech/enc/proj_L2/beta/var:0 shape=(2048,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.911384 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L2/gamma/var on /job:local/replica:0/task:0/device:CPU:0 407099528 I0420 07:30:14.913311 140395597309760 py_utils.py:1220] Creating var librispeech/enc/proj_L2/gamma/var:0 shape=(2048,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.917190 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L2/moving_mean/var on /job:local/replica:0/task:0/device:CPU:0 407107720 I0420 07:30:14.919024 140395597309760 py_utils.py:1220] Creating var librispeech/enc/proj_L2/moving_mean/var:0 shape=(2048,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.922070 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L2/moving_variance/var on /job:local/replica:0/task:0/device:CPU:0 407115912 I0420 07:30:14.923866 140395597309760 py_utils.py:1220] Creating var librispeech/enc/proj_L2/moving_variance/var:0 shape=(2048,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.952696 140395597309760 cluster.py:447] Place variable librispeech/dec/emb/var_0/var on /job:local/replica:0/task:0/device:CPU:0 407139016 I0420 07:30:14.954703 
140395597309760 py_utils.py:1220] Creating var librispeech/dec/emb/var_0/var:0 shape=(76, 76) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.963403 140395597309760 cluster.py:447] Place variable librispeech/dec/softmax/weight_0/var on /job:local/replica:0/task:0/device:CPU:0 408072904 I0420 07:30:14.965323 140395597309760 py_utils.py:1220] Creating var librispeech/dec/softmax/weight_0/var:0 shape=(3072, 76) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.972189 140395597309760 cluster.py:447] Place variable librispeech/dec/softmax/bias_0/var on /job:local/replica:0/task:0/device:CPU:0 408073208 I0420 07:30:14.973892 140395597309760 py_utils.py:1220] Creating var librispeech/dec/softmax/bias_0/var:0 shape=(76,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:14.992157 140395597309760 cluster.py:447] Place variable librispeech/dec/rnn_cell/wm/var on /job:local/replica:0/task:0/device:CPU:0 459650040 I0420 07:30:14.994079 140395597309760 py_utils.py:1220] Creating var librispeech/dec/rnn_cell/wm/var:0 shape=(3148, 4096) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:15.000926 140395597309760 cluster.py:447] Place variable librispeech/dec/rnn_cell/b/var on /job:local/replica:0/task:0/device:CPU:0 459666424 I0420 07:30:15.002634 140395597309760 py_utils.py:1220] Creating var librispeech/dec/rnn_cell/b/var:0 shape=(4096,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:15.021346 140395597309760 cluster.py:447] Place variable librispeech/dec/rnn_cell_1/wm/var on /job:local/replica:0/task:0/device:CPU:0 526775288 I0420 07:30:15.023277 140395597309760 py_utils.py:1220] Creating var librispeech/dec/rnn_cell_1/wm/var:0 shape=(4096, 4096) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:15.030137 140395597309760 cluster.py:447] Place variable librispeech/dec/rnn_cell_1/b/var on /job:local/replica:0/task:0/device:CPU:0 526791672 I0420 07:30:15.031835 140395597309760 
py_utils.py:1220] Creating var librispeech/dec/rnn_cell_1/b/var:0 shape=(4096,) on device /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.242995 140395597309760 cluster.py:447] Place variable librispeech/dec/atten/source_var/var on /job:local/replica:0/task:0/device:CPU:0 527840248
I0420 07:30:15.245058 140395597309760 py_utils.py:1220] Creating var librispeech/dec/atten/source_var/var:0 shape=(2048, 128) on device /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.255265 140395597309760 cluster.py:447] Place variable librispeech/dec/atten/query_var/var on /job:local/replica:0/task:0/device:CPU:0 528364536
I0420 07:30:15.257199 140395597309760 py_utils.py:1220] Creating var librispeech/dec/atten/query_var/var:0 shape=(1024, 128) on device /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.267314 140395597309760 cluster.py:447] Place variable librispeech/dec/atten/hidden_var/var on /job:local/replica:0/task:0/device:CPU:0 528365048
I0420 07:30:15.269248 140395597309760 py_utils.py:1220] Creating var librispeech/dec/atten/hidden_var/var:0 shape=(128,) on device /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.326706 140395597309760 py_utils.py:1277] === worker 0 ===
I0420 07:30:15.328285 140395597309760 py_utils.py:1267] worker 0: decoder.atten.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.328372 140395597309760 py_utils.py:1267] worker 0: decoder.atten.hidden_var /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.328435 140395597309760 py_utils.py:1267] worker 0: decoder.atten.query_var /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.328493 140395597309760 py_utils.py:1267] worker 0: decoder.atten.source_var /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.328548 140395597309760 py_utils.py:1267] worker 0: decoder.beam_search.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.328603 140395597309760 py_utils.py:1267] worker 0: decoder.contextualizer.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.328656 140395597309760 py_utils.py:1267] worker 0: decoder.emb.wm[0] /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.328711 140395597309760 py_utils.py:1267] worker 0: decoder.fusion.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.328783 140395597309760 py_utils.py:1267] worker 0: decoder.fusion.lm.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.328838 140395597309760 py_utils.py:1267] worker 0: decoder.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.328891 140395597309760 py_utils.py:1267] worker 0: decoder.rnn_cell[0].b /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.328944 140395597309760 py_utils.py:1267] worker 0: decoder.rnn_cell[0].global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.328996 140395597309760 py_utils.py:1267] worker 0: decoder.rnn_cell[0].wm /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.329049 140395597309760 py_utils.py:1267] worker 0: decoder.rnn_cell[1].b /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.329102 140395597309760 py_utils.py:1267] worker 0: decoder.rnn_cell[1].global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.329154 140395597309760 py_utils.py:1267] worker 0: decoder.rnn_cell[1].wm
/job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.329206 140395597309760 py_utils.py:1267] worker 0: decoder.softmax.bias_0 /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.329260 140395597309760 py_utils.py:1267] worker 0: decoder.softmax.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.329312 140395597309760 py_utils.py:1267] worker 0: decoder.softmax.weight_0 /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.329365 140395597309760 py_utils.py:1267] worker 0: decoder.target_sequence_sampler.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.329417 140395597309760 py_utils.py:1267] worker 0: encoder.conv[0].bn.beta /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.329471 140395597309760 py_utils.py:1267] worker 0: encoder.conv[0].bn.gamma /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.329523 140395597309760 py_utils.py:1267] worker 0: encoder.conv[0].bn.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.329576 140395597309760 py_utils.py:1267] worker 0: encoder.conv[0].global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.329627 140395597309760 py_utils.py:1267] worker 0: encoder.conv[0].w /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.329680 140395597309760 py_utils.py:1267] worker 0: encoder.conv[1].bn.beta /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.329735 140395597309760 py_utils.py:1267] worker 0: encoder.conv[1].bn.gamma /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.329791 140395597309760 py_utils.py:1267] worker 0: encoder.conv[1].bn.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.329848 140395597309760 py_utils.py:1267] worker 0: encoder.conv[1].global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.329902 140395597309760 py_utils.py:1267] worker 0: encoder.conv[1].w /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.329955 140395597309760 py_utils.py:1267] worker 0: encoder.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.330008 140395597309760 py_utils.py:1267] worker 0: encoder.proj[0].bn.beta /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.330060 140395597309760 py_utils.py:1267] worker 0: encoder.proj[0].bn.gamma /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.330111 140395597309760 py_utils.py:1267] worker 0: encoder.proj[0].bn.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.330164 140395597309760 py_utils.py:1267] worker 0: encoder.proj[0].global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.330216 140395597309760 py_utils.py:1267] worker 0: encoder.proj[0].w /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.330269 140395597309760 py_utils.py:1267] worker 0: encoder.proj[1].bn.beta /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.330321 140395597309760 py_utils.py:1267] worker 0: encoder.proj[1].bn.gamma /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.330373
140395597309760 py_utils.py:1267] worker 0: encoder.proj[1].bn.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.330425 140395597309760 py_utils.py:1267] worker 0: encoder.proj[1].global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.330476 140395597309760 py_utils.py:1267] worker 0: encoder.proj[1].w /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.330528 140395597309760 py_utils.py:1267] worker 0: encoder.proj[2].bn.beta /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.330580 140395597309760 py_utils.py:1267] worker 0: encoder.proj[2].bn.gamma /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.330632 140395597309760 py_utils.py:1267] worker 0: encoder.proj[2].bn.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.330684 140395597309760 py_utils.py:1267] worker 0: encoder.proj[2].global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.330740 140395597309760 py_utils.py:1267] worker 0: encoder.proj[2].w /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.330795 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[0].bak_rnn.cell.b /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.330847 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[0].bak_rnn.cell.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.330899 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[0].bak_rnn.cell.wm /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.330956 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[0].bak_rnn.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.331011 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[0].fwd_rnn.cell.b /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.331063 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[0].fwd_rnn.cell.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.331115 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[0].fwd_rnn.cell.wm /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.331168 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[0].fwd_rnn.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.331222 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[0].global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.331274 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[1].bak_rnn.cell.b /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.331326 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[1].bak_rnn.cell.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.331378 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[1].bak_rnn.cell.wm /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.331430 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[1].bak_rnn.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.331484 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[1].fwd_rnn.cell.b /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.331535
140395597309760 py_utils.py:1267] worker 0: encoder.rnn[1].fwd_rnn.cell.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.331588 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[1].fwd_rnn.cell.wm /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.331640 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[1].fwd_rnn.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.331691 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[1].global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.331751 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[2].bak_rnn.cell.b /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.331804 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[2].bak_rnn.cell.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.331856 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[2].bak_rnn.cell.wm /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.331908 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[2].bak_rnn.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.331960 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[2].fwd_rnn.cell.b /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.332014 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[2].fwd_rnn.cell.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.332071 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[2].fwd_rnn.cell.wm /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.332124 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[2].fwd_rnn.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.332178 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[2].global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.332230 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[3].bak_rnn.cell.b /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.332283 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[3].bak_rnn.cell.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.332334 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[3].bak_rnn.cell.wm /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.332386 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[3].bak_rnn.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.332438 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[3].fwd_rnn.cell.b /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.332490 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[3].fwd_rnn.cell.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.332544 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[3].fwd_rnn.cell.wm /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.332596 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[3].fwd_rnn.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.332648 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[3].global_step
/job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.332700 140395597309760 py_utils.py:1267] worker 0: global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.332758 140395597309760 py_utils.py:1267] worker 0: input._tokenizer_default.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.332813 140395597309760 py_utils.py:1267] worker 0: input.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.332865 140395597309760 py_utils.py:1267] worker 0: lr_schedule.exp.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.332918 140395597309760 py_utils.py:1267] worker 0: lr_schedule.exp.linear.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.332971 140395597309760 py_utils.py:1267] worker 0: lr_schedule.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.333023 140395597309760 py_utils.py:1267] worker 0: optimizer.global_step /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:15.333077 140395597309760 py_utils.py:1283] ==========
W0420 07:30:17.816431 140395597309760 deprecation_wrapper.py:119] From /data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/core/plot.py:239: The name tf.summary.image is deprecated. Please use tf.compat.v1.summary.image instead.
W0420 07:30:18.624974 140395597309760 deprecation_wrapper.py:119] From /data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/core/layers.py:1874: The name tf.logging.vlog is deprecated. Please use tf.compat.v1.logging.vlog instead.
I0420 07:30:18.679635 140395597309760 decoder.py:749] Merging metric loss: (, )
I0420 07:30:18.683741 140395597309760 decoder.py:749] Merging metric fraction_of_correct_next_step_preds: (, )
I0420 07:30:18.687864 140395597309760 decoder.py:749] Merging metric log_pplx: (, )
W0420 07:30:18.731125 140395597309760 deprecation_wrapper.py:119] From /data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/core/summary_utils.py:36: The name tf.summary.scalar is deprecated. Please use tf.compat.v1.summary.scalar instead.
I0420 07:30:22.142855 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: decoder.atten.hidden_var:
I0420 07:30:22.143069 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: decoder.atten.query_var:
I0420 07:30:22.143222 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: decoder.atten.source_var:
I0420 07:30:22.143345 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: decoder.emb.wm_0:
I0420 07:30:22.143487 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: decoder.rnn_cell_0.b:
I0420 07:30:22.143604 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: decoder.rnn_cell_0.wm:
I0420 07:30:22.143743 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: decoder.rnn_cell_1.b:
I0420 07:30:22.143887 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: decoder.rnn_cell_1.wm:
I0420 07:30:22.144016 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: decoder.softmax.bias_0:
I0420 07:30:22.144134 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: decoder.softmax.weight_0:
I0420 07:30:22.144254 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.conv_0.bn.beta:
I0420 07:30:22.144364 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.conv_0.bn.gamma:
I0420 07:30:22.144476 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.conv_0.w:
I0420 07:30:22.144601 140395597309760 py_utils.py:1730]
AdjustGradientsWithLpLoss: encoder.conv_1.bn.beta:
I0420 07:30:22.144728 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.conv_1.bn.gamma:
I0420 07:30:22.144845 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.conv_1.w:
I0420 07:30:22.144972 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.proj_0.bn.beta:
I0420 07:30:22.145085 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.proj_0.bn.gamma:
I0420 07:30:22.145204 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.proj_0.w:
I0420 07:30:22.145322 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.proj_1.bn.beta:
I0420 07:30:22.145435 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.proj_1.bn.gamma:
I0420 07:30:22.145554 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.proj_1.w:
I0420 07:30:22.145672 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.proj_2.bn.beta:
I0420 07:30:22.145792 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.proj_2.bn.gamma:
I0420 07:30:22.145904 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.proj_2.w:
I0420 07:30:22.146023 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_0.bak_rnn.cell.b:
I0420 07:30:22.146132 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_0.bak_rnn.cell.wm:
I0420 07:30:22.146251 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_0.fwd_rnn.cell.b:
I0420 07:30:22.146362 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_0.fwd_rnn.cell.wm:
I0420 07:30:22.146480 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_1.bak_rnn.cell.b:
I0420 07:30:22.146589 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_1.bak_rnn.cell.wm:
I0420 07:30:22.146711 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_1.fwd_rnn.cell.b:
I0420 07:30:22.146826 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_1.fwd_rnn.cell.wm:
I0420 07:30:22.146948 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_2.bak_rnn.cell.b:
I0420 07:30:22.147057 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_2.bak_rnn.cell.wm:
I0420 07:30:22.147182 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_2.fwd_rnn.cell.b:
I0420 07:30:22.147293 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_2.fwd_rnn.cell.wm:
I0420 07:30:22.147413 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_3.bak_rnn.cell.b:
I0420 07:30:22.147520 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_3.bak_rnn.cell.wm:
I0420 07:30:22.147639 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_3.fwd_rnn.cell.b:
I0420 07:30:22.147757 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_3.fwd_rnn.cell.wm:
W0420 07:30:23.409567 140395597309760 deprecation_wrapper.py:119] From /data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/core/optimizer.py:179: The name tf.train.AdamOptimizer is deprecated. Please use tf.compat.v1.train.AdamOptimizer instead.
I0420 07:30:23.411606 140395597309760 cluster.py:447] Place variable beta1_power on /job:local/replica:0/task:0/device:CPU:0 528365052
I0420 07:30:23.414588 140395597309760 cluster.py:447] Place variable beta2_power on /job:local/replica:0/task:0/device:CPU:0 528365056
I0420 07:30:23.900799 140395597309760 cluster.py:447] Place variable librispeech/total_samples/var on /job:local/replica:0/task:0/device:CPU:0 528365064
I0420 07:30:23.902559 140395597309760 py_utils.py:1220] Creating var librispeech/total_samples/var:0 shape=() on device /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:23.909670 140395597309760 cluster.py:447] Place variable total_nan_gradients/var on /job:local/replica:0/task:0/device:CPU:0 528365072
I0420 07:30:23.911412 140395597309760 py_utils.py:1220] Creating var total_nan_gradients/var:0 shape=() on device /job:local/replica:0/task:0/device:CPU:0
W0420 07:30:23.934954 140395597309760 deprecation_wrapper.py:119] From /data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/base_runner.py:156: The name tf.train.Saver is deprecated. Please use tf.compat.v1.train.Saver instead.
W0420 07:30:24.054827 140395597309760 deprecation_wrapper.py:119] From /data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/trainer.py:198: The name tf.summary.merge_all is deprecated. Please use tf.compat.v1.summary.merge_all instead.
I0420 07:30:24.163578 140395597309760 py_utils.py:1267] MODEL ANALYSIS:
I0420 07:30:24.163672 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.decoder.atten.hidden_var (128,) 128 librispeech/dec/atten/hidden_var/var
I0420 07:30:24.163784 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.decoder.atten.query_var (1024, 128) 131072 librispeech/dec/atten/query_var/var
I0420 07:30:24.163866 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.decoder.atten.source_var (2048, 128) 262144 librispeech/dec/atten/source_var/var
I0420 07:30:24.163948 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.decoder.emb.wm[0] (76, 76) 5776 librispeech/dec/emb/var_0/var
I0420 07:30:24.164036 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.decoder.rnn_cell[0].b (4096,) 4096 librispeech/dec/rnn_cell/b/var
I0420 07:30:24.164122 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.decoder.rnn_cell[0].wm (3148, 4096) 12894208 librispeech/dec/rnn_cell/wm/var
I0420 07:30:24.164206 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.decoder.rnn_cell[1].b (4096,) 4096 librispeech/dec/rnn_cell_1/b/var
I0420 07:30:24.164299 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.decoder.rnn_cell[1].wm (4096, 4096) 16777216 librispeech/dec/rnn_cell_1/wm/var
I0420 07:30:24.164386 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.decoder.softmax.bias_0 (76,) 76 librispeech/dec/softmax/bias_0/var
I0420 07:30:24.164469 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.decoder.softmax.weight_0 (3072, 76) 233472 librispeech/dec/softmax/weight_0/var
I0420 07:30:24.164551 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.conv[0].bn.beta (32,) 32 librispeech/enc/conv_L0/beta/var
I0420 07:30:24.164633 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.conv[0].bn.gamma (32,) 32 librispeech/enc/conv_L0/gamma/var
I0420 07:30:24.164714 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.conv[0].w (3, 3, 1, 32) 288 librispeech/enc/conv_L0/w/var
I0420 07:30:24.164803 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.conv[1].bn.beta (32,) 32 librispeech/enc/conv_L1/beta/var
I0420 07:30:24.164881 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.conv[1].bn.gamma (32,) 32 librispeech/enc/conv_L1/gamma/var
I0420 07:30:24.164966 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.conv[1].w (3, 3, 32, 32) 9216 librispeech/enc/conv_L1/w/var
I0420 07:30:24.165045 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.proj[0].bn.beta (2048,) 2048 librispeech/enc/proj_L0/beta/var
I0420 07:30:24.165127 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.proj[0].bn.gamma (2048,) 2048 librispeech/enc/proj_L0/gamma/var
I0420 07:30:24.165208 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.proj[0].w (2048, 2048) 4194304 librispeech/enc/proj_L0/w/var
I0420 07:30:24.165291 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.proj[1].bn.beta (2048,) 2048 librispeech/enc/proj_L1/beta/var
I0420 07:30:24.165373 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.proj[1].bn.gamma (2048,) 2048 librispeech/enc/proj_L1/gamma/var
I0420 07:30:24.165455 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.proj[1].w (2048, 2048) 4194304 librispeech/enc/proj_L1/w/var
I0420 07:30:24.165535 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.proj[2].bn.beta (2048,) 2048 librispeech/enc/proj_L2/beta/var
I0420 07:30:24.165616 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.proj[2].bn.gamma (2048,) 2048 librispeech/enc/proj_L2/gamma/var
I0420 07:30:24.165698 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.proj[2].w (2048, 2048) 4194304 librispeech/enc/proj_L2/w/var
I0420 07:30:24.165782 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.rnn[0].bak_rnn.cell.b (4096,) 4096 librispeech/enc/bak_rnn_L0/b/var
I0420 07:30:24.165864 140395597309760
py_utils.py:1267] MODEL ANALYSIS: _task.encoder.rnn[0].bak_rnn.cell.wm (1664, 4096) 6815744 librispeech/enc/bak_rnn_L0/wm/var
I0420 07:30:24.165945 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.rnn[0].fwd_rnn.cell.b (4096,) 4096 librispeech/enc/fwd_rnn_L0/b/var
I0420 07:30:24.166027 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.rnn[0].fwd_rnn.cell.wm (1664, 4096) 6815744 librispeech/enc/fwd_rnn_L0/wm/var
I0420 07:30:24.166107 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.rnn[1].bak_rnn.cell.b (4096,) 4096 librispeech/enc/bak_rnn_L1/b/var
I0420 07:30:24.166194 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.rnn[1].bak_rnn.cell.wm (3072, 4096) 12582912 librispeech/enc/bak_rnn_L1/wm/var
I0420 07:30:24.166275 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.rnn[1].fwd_rnn.cell.b (4096,) 4096 librispeech/enc/fwd_rnn_L1/b/var
I0420 07:30:24.166356 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.rnn[1].fwd_rnn.cell.wm (3072, 4096) 12582912 librispeech/enc/fwd_rnn_L1/wm/var
I0420 07:30:24.166438 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.rnn[2].bak_rnn.cell.b (4096,) 4096 librispeech/enc/bak_rnn_L2/b/var
I0420 07:30:24.166517 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.rnn[2].bak_rnn.cell.wm (3072, 4096) 12582912 librispeech/enc/bak_rnn_L2/wm/var
I0420 07:30:24.166599 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.rnn[2].fwd_rnn.cell.b (4096,) 4096 librispeech/enc/fwd_rnn_L2/b/var
I0420 07:30:24.166678 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.rnn[2].fwd_rnn.cell.wm (3072, 4096) 12582912 librispeech/enc/fwd_rnn_L2/wm/var
I0420 07:30:24.166765 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.rnn[3].bak_rnn.cell.b (4096,) 4096 librispeech/enc/bak_rnn_L3/b/var
I0420 07:30:24.166847 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.rnn[3].bak_rnn.cell.wm (3072, 4096) 12582912 librispeech/enc/bak_rnn_L3/wm/var
I0420 07:30:24.166928 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.rnn[3].fwd_rnn.cell.b (4096,) 4096 librispeech/enc/fwd_rnn_L3/b/var
I0420 07:30:24.167011 140395597309760 py_utils.py:1267] MODEL ANALYSIS: _task.encoder.rnn[3].fwd_rnn.cell.wm (3072, 4096) 12582912 librispeech/enc/fwd_rnn_L3/wm/var
I0420 07:30:24.167092 140395597309760 py_utils.py:1267] MODEL ANALYSIS: ====================================================================================================
I0420 07:30:24.167171 140395597309760 py_utils.py:1267] MODEL ANALYSIS: total #params: 132078844
I0420 07:30:24.167253 140395597309760 py_utils.py:1267] MODEL ANALYSIS:
I0420 07:30:26.209583 140395597309760 trainer.py:1263] Job trainer start
I0420 07:30:26.219883 140395597309760 base_runner.py:67] ============================================================
I0420 07:30:26.225033 140395597309760 base_runner.py:69] allow_implicit_capture : NoneType
I0420 07:30:26.225131 140395597309760 base_runner.py:69] cls : type/lingvo.core.base_model/SingleTaskModel
I0420 07:30:26.225227 140395597309760 base_runner.py:69] cluster.add_summary : NoneType
I0420 07:30:26.225331 140395597309760 base_runner.py:69] cluster.cls : type/lingvo.core.cluster/_Cluster
I0420 07:30:26.225426 140395597309760 base_runner.py:69] cluster.controller.cpus_per_replica : 1
I0420 07:30:26.225514 140395597309760 base_runner.py:69] cluster.controller.devices_per_split : 1
I0420 07:30:26.225601 140395597309760 base_runner.py:69] cluster.controller.gpus_per_replica : 0
I0420 07:30:26.225688 140395597309760 base_runner.py:69] cluster.controller.name : '/job:local'
I0420 07:30:26.225781 140395597309760 base_runner.py:69] cluster.controller.num_tpu_hosts : 0
I0420 07:30:26.225868 140395597309760 base_runner.py:69] cluster.controller.replicas : 1
I0420 07:30:26.225955 140395597309760 base_runner.py:69] cluster.controller.tpus_per_replica : 0
I0420 07:30:26.226039 140395597309760
base_runner.py:69] cluster.decoder.cpus_per_replica : 1 I0420 07:30:26.226125 140395597309760 base_runner.py:69] cluster.decoder.devices_per_split : 1 I0420 07:30:26.226210 140395597309760 base_runner.py:69] cluster.decoder.gpus_per_replica : 1 I0420 07:30:26.226296 140395597309760 base_runner.py:69] cluster.decoder.name : '/job:local' I0420 07:30:26.226378 140395597309760 base_runner.py:69] cluster.decoder.num_tpu_hosts : 0 I0420 07:30:26.226463 140395597309760 base_runner.py:69] cluster.decoder.replicas : 1 I0420 07:30:26.226557 140395597309760 base_runner.py:69] cluster.decoder.tpus_per_replica : 0 I0420 07:30:26.226644 140395597309760 base_runner.py:69] cluster.evaler.cpus_per_replica : 1 I0420 07:30:26.226736 140395597309760 base_runner.py:69] cluster.evaler.devices_per_split : 1 I0420 07:30:26.226835 140395597309760 base_runner.py:69] cluster.evaler.gpus_per_replica : 1 I0420 07:30:26.226921 140395597309760 base_runner.py:69] cluster.evaler.name : '/job:local' I0420 07:30:26.227004 140395597309760 base_runner.py:69] cluster.evaler.num_tpu_hosts : 0 I0420 07:30:26.227087 140395597309760 base_runner.py:69] cluster.evaler.replicas : 1 I0420 07:30:26.227170 140395597309760 base_runner.py:69] cluster.evaler.tpus_per_replica : 0 I0420 07:30:26.227253 140395597309760 base_runner.py:69] cluster.input.cpus_per_replica : 1 I0420 07:30:26.227339 140395597309760 base_runner.py:69] cluster.input.devices_per_split : 1 I0420 07:30:26.227423 140395597309760 base_runner.py:69] cluster.input.gpus_per_replica : 0 I0420 07:30:26.227508 140395597309760 base_runner.py:69] cluster.input.name : '/job:local' I0420 07:30:26.227591 140395597309760 base_runner.py:69] cluster.input.num_tpu_hosts : 0 I0420 07:30:26.227674 140395597309760 base_runner.py:69] cluster.input.replicas : 0 I0420 07:30:26.227766 140395597309760 base_runner.py:69] cluster.input.tpus_per_replica : 0 I0420 07:30:26.227849 140395597309760 base_runner.py:69] cluster.job : 'trainer' I0420 07:30:26.227933 
140395597309760 base_runner.py:69] cluster.mode : 'async' I0420 07:30:26.228017 140395597309760 base_runner.py:69] cluster.ps.cpus_per_replica : 1 I0420 07:30:26.228100 140395597309760 base_runner.py:69] cluster.ps.devices_per_split : 1 I0420 07:30:26.228184 140395597309760 base_runner.py:69] cluster.ps.gpus_per_replica : 0 I0420 07:30:26.228266 140395597309760 base_runner.py:69] cluster.ps.name : '/job:local' I0420 07:30:26.228353 140395597309760 base_runner.py:69] cluster.ps.num_tpu_hosts : 0 I0420 07:30:26.228435 140395597309760 base_runner.py:69] cluster.ps.replicas : 1 I0420 07:30:26.228518 140395597309760 base_runner.py:69] cluster.ps.tpus_per_replica : 0 I0420 07:30:26.228601 140395597309760 base_runner.py:69] cluster.task : 0 I0420 07:30:26.228687 140395597309760 base_runner.py:69] cluster.worker.cpus_per_replica : 1 I0420 07:30:26.228792 140395597309760 base_runner.py:69] cluster.worker.devices_per_split : 1 I0420 07:30:26.228882 140395597309760 base_runner.py:69] cluster.worker.gpus_per_replica : 4 I0420 07:30:26.228970 140395597309760 base_runner.py:69] cluster.worker.name : '/job:local' I0420 07:30:26.229058 140395597309760 base_runner.py:69] cluster.worker.num_tpu_hosts : 0 I0420 07:30:26.229155 140395597309760 base_runner.py:69] cluster.worker.replicas : 1 I0420 07:30:26.229240 140395597309760 base_runner.py:69] cluster.worker.tpus_per_replica : 0 I0420 07:30:26.229325 140395597309760 base_runner.py:69] dtype : float32 I0420 07:30:26.229409 140395597309760 base_runner.py:69] fprop_dtype : NoneType I0420 07:30:26.229495 140395597309760 base_runner.py:69] inference_driver_name : NoneType I0420 07:30:26.229579 140395597309760 base_runner.py:69] input.allow_implicit_capture : NoneType I0420 07:30:26.229664 140395597309760 base_runner.py:69] input.append_eos_frame : True I0420 07:30:26.229756 140395597309760 base_runner.py:69] input.bucket_adjust_every_n : 0 I0420 07:30:26.229837 140395597309760 base_runner.py:69] input.bucket_batch_limit : [64, 32, 32, 
32, 32, 32, 32, 32] I0420 07:30:26.229921 140395597309760 base_runner.py:69] input.bucket_upper_bound : [639, 1062, 1275, 1377, 1449, 1506, 1563, 1710] I0420 07:30:26.230006 140395597309760 base_runner.py:69] input.cls : type/lingvo.tasks.asr.input_generator/AsrInput I0420 07:30:26.230089 140395597309760 base_runner.py:69] input.dtype : float32 I0420 07:30:26.230174 140395597309760 base_runner.py:69] input.file_buffer_size : 10000 I0420 07:30:26.230257 140395597309760 base_runner.py:69] input.file_parallelism : 16 I0420 07:30:26.230343 140395597309760 base_runner.py:69] input.file_pattern : 'tfrecord:/data/dingzhenyou/speech_data/librispeech/train/train.tfrecords-*' I0420 07:30:26.230433 140395597309760 base_runner.py:69] input.file_random_seed : 0 I0420 07:30:26.230520 140395597309760 base_runner.py:69] input.flush_every_n : 0 I0420 07:30:26.230608 140395597309760 base_runner.py:69] input.fprop_dtype : NoneType I0420 07:30:26.230688 140395597309760 base_runner.py:69] input.frame_size : 80 I0420 07:30:26.230776 140395597309760 base_runner.py:69] input.inference_driver_name : NoneType I0420 07:30:26.230861 140395597309760 base_runner.py:69] input.is_eval : False I0420 07:30:26.230947 140395597309760 base_runner.py:69] input.is_inference : NoneType I0420 07:30:26.231029 140395597309760 base_runner.py:69] input.name : 'input' I0420 07:30:26.231112 140395597309760 base_runner.py:69] input.num_batcher_threads : 1 I0420 07:30:26.231197 140395597309760 base_runner.py:69] input.num_samples : 281241 I0420 07:30:26.231281 140395597309760 base_runner.py:69] input.pad_to_max_seq_length : False I0420 07:30:26.231364 140395597309760 base_runner.py:69] input.params_init.method : 'xavier' I0420 07:30:26.231447 140395597309760 base_runner.py:69] input.params_init.scale : 1.000001 I0420 07:30:26.231534 140395597309760 base_runner.py:69] input.params_init.seed : NoneType I0420 07:30:26.231617 140395597309760 base_runner.py:69] input.random_seed : NoneType I0420 07:30:26.231700 
140395597309760 base_runner.py:69] input.require_sequential_order : False I0420 07:30:26.231795 140395597309760 base_runner.py:69] input.skip_lp_regularization : NoneType I0420 07:30:26.231879 140395597309760 base_runner.py:69] input.source_max_length : 3000 I0420 07:30:26.231966 140395597309760 base_runner.py:69] input.target_max_length : 620 I0420 07:30:26.232048 140395597309760 base_runner.py:69] input.tokenizer.allow_implicit_capture : NoneType I0420 07:30:26.232131 140395597309760 base_runner.py:69] input.tokenizer.append_eos : True I0420 07:30:26.232215 140395597309760 base_runner.py:69] input.tokenizer.cls : type/lingvo.core.tokenizers/AsciiTokenizer I0420 07:30:26.232300 140395597309760 base_runner.py:69] input.tokenizer.dtype : float32 I0420 07:30:26.232384 140395597309760 base_runner.py:69] input.tokenizer.fprop_dtype : NoneType I0420 07:30:26.232472 140395597309760 base_runner.py:69] input.tokenizer.inference_driver_name : NoneType I0420 07:30:26.232552 140395597309760 base_runner.py:69] input.tokenizer.is_eval : NoneType I0420 07:30:26.232635 140395597309760 base_runner.py:69] input.tokenizer.is_inference : NoneType I0420 07:30:26.232718 140395597309760 base_runner.py:69] input.tokenizer.name : 'tokenizer' I0420 07:30:26.232809 140395597309760 base_runner.py:69] input.tokenizer.pad_to_max_length : True I0420 07:30:26.232893 140395597309760 base_runner.py:69] input.tokenizer.params_init.method : 'xavier' I0420 07:30:26.232979 140395597309760 base_runner.py:69] input.tokenizer.params_init.scale : 1.000001 I0420 07:30:26.233062 140395597309760 base_runner.py:69] input.tokenizer.params_init.seed : NoneType I0420 07:30:26.233146 140395597309760 base_runner.py:69] input.tokenizer.random_seed : NoneType I0420 07:30:26.233231 140395597309760 base_runner.py:69] input.tokenizer.skip_lp_regularization : NoneType I0420 07:30:26.233316 140395597309760 base_runner.py:69] input.tokenizer.target_eos_id : 2 I0420 07:30:26.233401 140395597309760 base_runner.py:69] 
input.tokenizer.target_sos_id : 1 I0420 07:30:26.233484 140395597309760 base_runner.py:69] input.tokenizer.target_unk_id : 0 I0420 07:30:26.233570 140395597309760 base_runner.py:69] input.tokenizer.vn.global_vn : False I0420 07:30:26.233652 140395597309760 base_runner.py:69] input.tokenizer.vn.per_step_vn : False I0420 07:30:26.233743 140395597309760 base_runner.py:69] input.tokenizer.vn.scale : NoneType I0420 07:30:26.233827 140395597309760 base_runner.py:69] input.tokenizer.vn.seed : NoneType I0420 07:30:26.233911 140395597309760 base_runner.py:69] input.tokenizer.vocab_size : 76 I0420 07:30:26.233994 140395597309760 base_runner.py:69] input.tokenizer_dict : {} I0420 07:30:26.234086 140395597309760 base_runner.py:69] input.tpu_infeed_parallism : 1 I0420 07:30:26.234169 140395597309760 base_runner.py:69] input.use_per_host_infeed : False I0420 07:30:26.234253 140395597309760 base_runner.py:69] input.use_within_batch_mixing : False I0420 07:30:26.234337 140395597309760 base_runner.py:69] input.vn.global_vn : False I0420 07:30:26.234421 140395597309760 base_runner.py:69] input.vn.per_step_vn : False I0420 07:30:26.234503 140395597309760 base_runner.py:69] input.vn.scale : NoneType I0420 07:30:26.234589 140395597309760 base_runner.py:69] input.vn.seed : NoneType I0420 07:30:26.234672 140395597309760 base_runner.py:69] is_eval : NoneType I0420 07:30:26.234760 140395597309760 base_runner.py:69] is_inference : NoneType I0420 07:30:26.234848 140395597309760 base_runner.py:69] model : 'asr.librispeech.Librispeech960Grapheme@/data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/tasks/asr/params/librispeech.py:181' I0420 07:30:26.234934 140395597309760 base_runner.py:69] name : '' I0420 07:30:26.235013 140395597309760 base_runner.py:69] params_init.method : 'xavier' I0420 07:30:26.235105 140395597309760 base_runner.py:69] params_init.scale : 1.000001 I0420 07:30:26.235184 140395597309760 base_runner.py:69] params_init.seed : NoneType I0420 
07:30:26.235270 140395597309760 base_runner.py:69] random_seed : NoneType I0420 07:30:26.235354 140395597309760 base_runner.py:69] skip_lp_regularization : NoneType I0420 07:30:26.235438 140395597309760 base_runner.py:69] task.allow_implicit_capture : NoneType I0420 07:30:26.235522 140395597309760 base_runner.py:69] task.cls : type/lingvo.tasks.asr.model/AsrModel I0420 07:30:26.235611 140395597309760 base_runner.py:69] task.decoder.allow_implicit_capture : NoneType I0420 07:30:26.235691 140395597309760 base_runner.py:69] task.decoder.atten_context_dim : 0 I0420 07:30:26.235780 140395597309760 base_runner.py:69] task.decoder.attention.allow_implicit_capture : NoneType I0420 07:30:26.235867 140395597309760 base_runner.py:69] task.decoder.attention.atten_dropout_deterministic : False I0420 07:30:26.235975 140395597309760 base_runner.py:69] task.decoder.attention.atten_dropout_prob : 0.0 I0420 07:30:26.236058 140395597309760 base_runner.py:69] task.decoder.attention.cls : type/lingvo.core.attention/AdditiveAttention I0420 07:30:26.236149 140395597309760 base_runner.py:69] task.decoder.attention.dtype : float32 I0420 07:30:26.236237 140395597309760 base_runner.py:69] task.decoder.attention.fprop_dtype : NoneType I0420 07:30:26.236335 140395597309760 base_runner.py:69] task.decoder.attention.hidden_dim : 128 I0420 07:30:26.236421 140395597309760 base_runner.py:69] task.decoder.attention.inference_driver_name : NoneType I0420 07:30:26.236505 140395597309760 base_runner.py:69] task.decoder.attention.is_eval : NoneType I0420 07:30:26.236589 140395597309760 base_runner.py:69] task.decoder.attention.is_inference : NoneType I0420 07:30:26.236674 140395597309760 base_runner.py:69] task.decoder.attention.name : '' I0420 07:30:26.236763 140395597309760 base_runner.py:69] task.decoder.attention.packed_input : False I0420 07:30:26.236850 140395597309760 base_runner.py:69] task.decoder.attention.params_init.method : 'uniform_sqrt_dim' I0420 07:30:26.236936 140395597309760 
base_runner.py:69] task.decoder.attention.params_init.scale : 1.73205080757 I0420 07:30:26.237021 140395597309760 base_runner.py:69] task.decoder.attention.params_init.seed : NoneType I0420 07:30:26.237107 140395597309760 base_runner.py:69] task.decoder.attention.qdomain.default : NoneType I0420 07:30:26.237190 140395597309760 base_runner.py:69] task.decoder.attention.qdomain.fullyconnected : NoneType I0420 07:30:26.237273 140395597309760 base_runner.py:69] task.decoder.attention.qdomain.softmax : NoneType I0420 07:30:26.237356 140395597309760 base_runner.py:69] task.decoder.attention.query_dim : 0 I0420 07:30:26.237441 140395597309760 base_runner.py:69] task.decoder.attention.random_seed : NoneType I0420 07:30:26.237524 140395597309760 base_runner.py:69] task.decoder.attention.same_batch_size : False I0420 07:30:26.237612 140395597309760 base_runner.py:69] task.decoder.attention.skip_lp_regularization : NoneType I0420 07:30:26.237698 140395597309760 base_runner.py:69] task.decoder.attention.source_dim : 0 I0420 07:30:26.237788 140395597309760 base_runner.py:69] task.decoder.attention.vn.global_vn : False I0420 07:30:26.237874 140395597309760 base_runner.py:69] task.decoder.attention.vn.per_step_vn : False I0420 07:30:26.237957 140395597309760 base_runner.py:69] task.decoder.attention.vn.scale : NoneType I0420 07:30:26.238043 140395597309760 base_runner.py:69] task.decoder.attention.vn.seed : NoneType I0420 07:30:26.238126 140395597309760 base_runner.py:69] task.decoder.attention_plot_font_properties : FontProperties I0420 07:30:26.238209 140395597309760 base_runner.py:69] task.decoder.beam_search.allow_empty_terminated_hyp : True I0420 07:30:26.238293 140395597309760 base_runner.py:69] task.decoder.beam_search.allow_implicit_capture : NoneType I0420 07:30:26.238375 140395597309760 base_runner.py:69] task.decoder.beam_search.batch_major_state : True I0420 07:30:26.238461 140395597309760 base_runner.py:69] task.decoder.beam_search.beam_size : 3.0 I0420 
07:30:26.238543 140395597309760 base_runner.py:69] task.decoder.beam_search.cls : type/lingvo.core.beam_search_helper/BeamSearchHelper I0420 07:30:26.238630 140395597309760 base_runner.py:69] task.decoder.beam_search.coverage_penalty : 0.0 I0420 07:30:26.238713 140395597309760 base_runner.py:69] task.decoder.beam_search.dtype : float32 I0420 07:30:26.238805 140395597309760 base_runner.py:69] task.decoder.beam_search.ensure_full_beam : False I0420 07:30:26.238889 140395597309760 base_runner.py:69] task.decoder.beam_search.force_eos_in_last_step : False I0420 07:30:26.238975 140395597309760 base_runner.py:69] task.decoder.beam_search.fprop_dtype : NoneType I0420 07:30:26.239058 140395597309760 base_runner.py:69] task.decoder.beam_search.inference_driver_name : NoneType I0420 07:30:26.239142 140395597309760 base_runner.py:69] task.decoder.beam_search.is_eval : NoneType I0420 07:30:26.239226 140395597309760 base_runner.py:69] task.decoder.beam_search.is_inference : NoneType I0420 07:30:26.239310 140395597309760 base_runner.py:69] task.decoder.beam_search.length_normalization : 0.0 I0420 07:30:26.239393 140395597309760 base_runner.py:69] task.decoder.beam_search.merge_paths : False I0420 07:30:26.239479 140395597309760 base_runner.py:69] task.decoder.beam_search.name : 'beam_search' I0420 07:30:26.239561 140395597309760 base_runner.py:69] task.decoder.beam_search.num_hyps_per_beam : 8 I0420 07:30:26.239645 140395597309760 base_runner.py:69] task.decoder.beam_search.params_init.method : 'xavier' I0420 07:30:26.239734 140395597309760 base_runner.py:69] task.decoder.beam_search.params_init.scale : 1.000001 I0420 07:30:26.239824 140395597309760 base_runner.py:69] task.decoder.beam_search.params_init.seed : NoneType I0420 07:30:26.239908 140395597309760 base_runner.py:69] task.decoder.beam_search.random_seed : NoneType I0420 07:30:26.239993 140395597309760 base_runner.py:69] task.decoder.beam_search.skip_lp_regularization : NoneType I0420 07:30:26.240078 140395597309760 
base_runner.py:69] task.decoder.beam_search.target_eoc_id : -1 I0420 07:30:26.240160 140395597309760 base_runner.py:69] task.decoder.beam_search.target_eos_id : 2 I0420 07:30:26.240243 140395597309760 base_runner.py:69] task.decoder.beam_search.target_seq_len : 0 I0420 07:30:26.240329 140395597309760 base_runner.py:69] task.decoder.beam_search.target_seq_length_ratio : 1.0 I0420 07:30:26.240412 140395597309760 base_runner.py:69] task.decoder.beam_search.target_sos_id : 1 I0420 07:30:26.240497 140395597309760 base_runner.py:69] task.decoder.beam_search.valid_eos_max_logit_delta : 5.0 I0420 07:30:26.240581 140395597309760 base_runner.py:69] task.decoder.beam_search.vn.global_vn : False I0420 07:30:26.240664 140395597309760 base_runner.py:69] task.decoder.beam_search.vn.per_step_vn : False I0420 07:30:26.240751 140395597309760 base_runner.py:69] task.decoder.beam_search.vn.scale : NoneType I0420 07:30:26.240845 140395597309760 base_runner.py:69] task.decoder.beam_search.vn.seed : NoneType I0420 07:30:26.240926 140395597309760 base_runner.py:69] task.decoder.cls : type/lingvo.tasks.asr.decoder/AsrDecoder I0420 07:30:26.241009 140395597309760 base_runner.py:69] task.decoder.contextualizer.allow_implicit_capture : NoneType I0420 07:30:26.241094 140395597309760 base_runner.py:69] task.decoder.contextualizer.cls : type/lingvo.tasks.asr.contextualizer_base/NullContextualizer I0420 07:30:26.241178 140395597309760 base_runner.py:69] task.decoder.contextualizer.dtype : float32 I0420 07:30:26.241261 140395597309760 base_runner.py:69] task.decoder.contextualizer.fprop_dtype : NoneType I0420 07:30:26.241348 140395597309760 base_runner.py:69] task.decoder.contextualizer.inference_driver_name : NoneType I0420 07:30:26.241427 140395597309760 base_runner.py:69] task.decoder.contextualizer.is_eval : NoneType I0420 07:30:26.241512 140395597309760 base_runner.py:69] task.decoder.contextualizer.is_inference : NoneType I0420 07:30:26.241595 140395597309760 base_runner.py:69] 
task.decoder.contextualizer.name : '' I0420 07:30:26.241678 140395597309760 base_runner.py:69] task.decoder.contextualizer.params_init.method : 'xavier' I0420 07:30:26.241766 140395597309760 base_runner.py:69] task.decoder.contextualizer.params_init.scale : 1.000001 I0420 07:30:26.241852 140395597309760 base_runner.py:69] task.decoder.contextualizer.params_init.seed : NoneType I0420 07:30:26.241936 140395597309760 base_runner.py:69] task.decoder.contextualizer.random_seed : NoneType I0420 07:30:26.242022 140395597309760 base_runner.py:69] task.decoder.contextualizer.skip_lp_regularization : NoneType I0420 07:30:26.242106 140395597309760 base_runner.py:69] task.decoder.contextualizer.vn.global_vn : False I0420 07:30:26.242187 140395597309760 base_runner.py:69] task.decoder.contextualizer.vn.per_step_vn : False I0420 07:30:26.242274 140395597309760 base_runner.py:69] task.decoder.contextualizer.vn.scale : NoneType I0420 07:30:26.242360 140395597309760 base_runner.py:69] task.decoder.contextualizer.vn.seed : NoneType I0420 07:30:26.242445 140395597309760 base_runner.py:69] task.decoder.dropout_prob : 0.0 I0420 07:30:26.242528 140395597309760 base_runner.py:69] task.decoder.dtype : float32 I0420 07:30:26.242613 140395597309760 base_runner.py:69] task.decoder.emb.allow_implicit_capture : NoneType I0420 07:30:26.242696 140395597309760 base_runner.py:69] task.decoder.emb.cls : type/lingvo.core.layers/EmbeddingLayer I0420 07:30:26.242794 140395597309760 base_runner.py:69] task.decoder.emb.dtype : float32 I0420 07:30:26.242881 140395597309760 base_runner.py:69] task.decoder.emb.embedding_dim : 0 I0420 07:30:26.242964 140395597309760 base_runner.py:69] task.decoder.emb.fprop_dtype : NoneType I0420 07:30:26.243048 140395597309760 base_runner.py:69] task.decoder.emb.inference_driver_name : NoneType I0420 07:30:26.243130 140395597309760 base_runner.py:69] task.decoder.emb.is_eval : NoneType I0420 07:30:26.243216 140395597309760 base_runner.py:69] task.decoder.emb.is_inference : 
NoneType I0420 07:30:26.243298 140395597309760 base_runner.py:69] task.decoder.emb.max_num_shards : 1 I0420 07:30:26.243381 140395597309760 base_runner.py:69] task.decoder.emb.name : '' I0420 07:30:26.243465 140395597309760 base_runner.py:69] task.decoder.emb.on_ps : True I0420 07:30:26.243550 140395597309760 base_runner.py:69] task.decoder.emb.params_init.method : 'uniform' I0420 07:30:26.243633 140395597309760 base_runner.py:69] task.decoder.emb.params_init.scale : 1.0 I0420 07:30:26.243727 140395597309760 base_runner.py:69] task.decoder.emb.params_init.seed : NoneType I0420 07:30:26.243807 140395597309760 base_runner.py:69] task.decoder.emb.random_seed : NoneType I0420 07:30:26.243894 140395597309760 base_runner.py:69] task.decoder.emb.scale_sqrt_depth : False I0420 07:30:26.243978 140395597309760 base_runner.py:69] task.decoder.emb.skip_lp_regularization : NoneType I0420 07:30:26.244060 140395597309760 base_runner.py:69] task.decoder.emb.vn.global_vn : False I0420 07:30:26.244147 140395597309760 base_runner.py:69] task.decoder.emb.vn.per_step_vn : False I0420 07:30:26.244236 140395597309760 base_runner.py:69] task.decoder.emb.vn.scale : NoneType I0420 07:30:26.244321 140395597309760 base_runner.py:69] task.decoder.emb.vn.seed : NoneType I0420 07:30:26.244405 140395597309760 base_runner.py:69] task.decoder.emb.vocab_size : 76 I0420 07:30:26.244488 140395597309760 base_runner.py:69] task.decoder.emb_dim : 76 I0420 07:30:26.244571 140395597309760 base_runner.py:69] task.decoder.fprop_dtype : NoneType I0420 07:30:26.244653 140395597309760 base_runner.py:69] task.decoder.fusion.allow_implicit_capture : NoneType I0420 07:30:26.244745 140395597309760 base_runner.py:69] task.decoder.fusion.base_model_logits_dim : NoneType I0420 07:30:26.244829 140395597309760 base_runner.py:69] task.decoder.fusion.cls : type/lingvo.tasks.asr.fusion/NullFusion I0420 07:30:26.244915 140395597309760 base_runner.py:69] task.decoder.fusion.dtype : float32 I0420 07:30:26.244997 
140395597309760 base_runner.py:69] task.decoder.fusion.fprop_dtype : NoneType I0420 07:30:26.245081 140395597309760 base_runner.py:69] task.decoder.fusion.inference_driver_name : NoneType I0420 07:30:26.245165 140395597309760 base_runner.py:69] task.decoder.fusion.is_eval : NoneType I0420 07:30:26.245248 140395597309760 base_runner.py:69] task.decoder.fusion.is_inference : NoneType I0420 07:30:26.245331 140395597309760 base_runner.py:69] task.decoder.fusion.lm.allow_implicit_capture : NoneType I0420 07:30:26.245415 140395597309760 base_runner.py:69] task.decoder.fusion.lm.cls : type/lingvo.tasks.lm.layers/NullLm I0420 07:30:26.245498 140395597309760 base_runner.py:69] task.decoder.fusion.lm.dtype : float32 I0420 07:30:26.245584 140395597309760 base_runner.py:69] task.decoder.fusion.lm.fprop_dtype : NoneType I0420 07:30:26.245666 140395597309760 base_runner.py:69] task.decoder.fusion.lm.inference_driver_name : NoneType I0420 07:30:26.245759 140395597309760 base_runner.py:69] task.decoder.fusion.lm.is_eval : NoneType I0420 07:30:26.245843 140395597309760 base_runner.py:69] task.decoder.fusion.lm.is_inference : NoneType I0420 07:30:26.245923 140395597309760 base_runner.py:69] task.decoder.fusion.lm.name : '' I0420 07:30:26.246007 140395597309760 base_runner.py:69] task.decoder.fusion.lm.params_init.method : 'xavier' I0420 07:30:26.246092 140395597309760 base_runner.py:69] task.decoder.fusion.lm.params_init.scale : 1.000001 I0420 07:30:26.246176 140395597309760 base_runner.py:69] task.decoder.fusion.lm.params_init.seed : NoneType I0420 07:30:26.246260 140395597309760 base_runner.py:69] task.decoder.fusion.lm.random_seed : NoneType I0420 07:30:26.246344 140395597309760 base_runner.py:69] task.decoder.fusion.lm.skip_lp_regularization : NoneType I0420 07:30:26.246428 140395597309760 base_runner.py:69] task.decoder.fusion.lm.vn.global_vn : False I0420 07:30:26.246514 140395597309760 base_runner.py:69] task.decoder.fusion.lm.vn.per_step_vn : False I0420 07:30:26.246598 
140395597309760 base_runner.py:69] task.decoder.fusion.lm.vn.scale : NoneType I0420 07:30:26.246678 140395597309760 base_runner.py:69] task.decoder.fusion.lm.vn.seed : NoneType I0420 07:30:26.246769 140395597309760 base_runner.py:69] task.decoder.fusion.lm.vocab_size : 96 I0420 07:30:26.246859 140395597309760 base_runner.py:69] task.decoder.fusion.name : '' I0420 07:30:26.246937 140395597309760 base_runner.py:69] task.decoder.fusion.params_init.method : 'xavier' I0420 07:30:26.247023 140395597309760 base_runner.py:69] task.decoder.fusion.params_init.scale : 1.000001 I0420 07:30:26.247107 140395597309760 base_runner.py:69] task.decoder.fusion.params_init.seed : NoneType I0420 07:30:26.247189 140395597309760 base_runner.py:69] task.decoder.fusion.random_seed : NoneType I0420 07:30:26.247276 140395597309760 base_runner.py:69] task.decoder.fusion.skip_lp_regularization : NoneType I0420 07:30:26.247361 140395597309760 base_runner.py:69] task.decoder.fusion.vn.global_vn : False I0420 07:30:26.247445 140395597309760 base_runner.py:69] task.decoder.fusion.vn.per_step_vn : False I0420 07:30:26.247528 140395597309760 base_runner.py:69] task.decoder.fusion.vn.scale : NoneType I0420 07:30:26.247617 140395597309760 base_runner.py:69] task.decoder.fusion.vn.seed : NoneType I0420 07:30:26.247699 140395597309760 base_runner.py:69] task.decoder.inference_driver_name : NoneType I0420 07:30:26.247792 140395597309760 base_runner.py:69] task.decoder.is_eval : NoneType I0420 07:30:26.247878 140395597309760 base_runner.py:69] task.decoder.is_inference : NoneType I0420 07:30:26.247961 140395597309760 base_runner.py:69] task.decoder.label_smoothing : NoneType I0420 07:30:26.248045 140395597309760 base_runner.py:69] task.decoder.logit_types : {'logits': 1.0} I0420 07:30:26.248128 140395597309760 base_runner.py:69] task.decoder.min_ground_truth_prob : 1.0 I0420 07:30:26.248213 140395597309760 base_runner.py:69] task.decoder.min_prob_step : 1000000.0 I0420 07:30:26.248296 140395597309760 
base_runner.py:69] task.decoder.name : '' I0420 07:30:26.248378 140395597309760 base_runner.py:69] task.decoder.packed_input : False I0420 07:30:26.248464 140395597309760 base_runner.py:69] task.decoder.parallel_iterations : 30 I0420 07:30:26.248547 140395597309760 base_runner.py:69] task.decoder.params_init.method : 'xavier' I0420 07:30:26.248631 140395597309760 base_runner.py:69] task.decoder.params_init.scale : 1.000001 I0420 07:30:26.248712 140395597309760 base_runner.py:69] task.decoder.params_init.seed : NoneType I0420 07:30:26.248805 140395597309760 base_runner.py:69] task.decoder.per_token_avg_loss : True I0420 07:30:26.248888 140395597309760 base_runner.py:69] task.decoder.prob_decay_start_step : 10000.0 I0420 07:30:26.248972 140395597309760 base_runner.py:69] task.decoder.random_seed : NoneType I0420 07:30:26.249056 140395597309760 base_runner.py:69] task.decoder.residual_start : 0 I0420 07:30:26.249140 140395597309760 base_runner.py:69] task.decoder.rnn_cell_dim : 1024 I0420 07:30:26.249222 140395597309760 base_runner.py:69] task.decoder.rnn_cell_hidden_dim : 0 I0420 07:30:26.249308 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.allow_implicit_capture : NoneType I0420 07:30:26.249392 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.apply_pruning : False I0420 07:30:26.249478 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.bias_init.method : 'constant' I0420 07:30:26.249562 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.bias_init.scale : 0.0 I0420 07:30:26.249648 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.bias_init.seed : 0 I0420 07:30:26.249737 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.cell_value_cap : 10.0 I0420 07:30:26.249821 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.cls : type/lingvo.core.rnn_cell/LSTMCellSimple I0420 07:30:26.249906 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.couple_input_forget_gates : False I0420 
07:30:26.249989 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.dtype : float32 I0420 07:30:26.250072 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.enable_lstm_bias : True I0420 07:30:26.250157 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.forget_gate_bias : 0.0 I0420 07:30:26.250240 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.fprop_dtype : NoneType I0420 07:30:26.250324 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.inference_driver_name : NoneType I0420 07:30:26.250407 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.inputs_arity : 1 I0420 07:30:26.250492 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.is_eval : NoneType I0420 07:30:26.250572 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.is_inference : NoneType I0420 07:30:26.250659 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.name : '' I0420 07:30:26.250751 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.num_hidden_nodes : 0 I0420 07:30:26.250835 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.num_input_nodes : 0 I0420 07:30:26.250921 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.num_output_nodes : 0 I0420 07:30:26.251012 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.output_nonlinearity : True I0420 07:30:26.251096 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.params_init.method : 'uniform' I0420 07:30:26.251180 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.params_init.scale : 0.1 I0420 07:30:26.251264 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.params_init.seed : NoneType I0420 07:30:26.251348 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.qdomain.c_state : NoneType I0420 07:30:26.251435 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.qdomain.default : NoneType I0420 07:30:26.251521 140395597309760 base_runner.py:69] 
I0420 07:30:26.251600 140395597309760 base_runner.py:69] task.decoder.rnn_cell_tpl.qdomain.fullyconnected : NoneType
task.decoder.rnn_cell_tpl.qdomain.m_state : NoneType
task.decoder.rnn_cell_tpl.qdomain.weight : NoneType
task.decoder.rnn_cell_tpl.random_seed : NoneType
task.decoder.rnn_cell_tpl.reset_cell_state : False
task.decoder.rnn_cell_tpl.skip_lp_regularization : NoneType
task.decoder.rnn_cell_tpl.vn.global_vn : False
task.decoder.rnn_cell_tpl.vn.per_step_vn : False
task.decoder.rnn_cell_tpl.vn.scale : NoneType
task.decoder.rnn_cell_tpl.vn.seed : NoneType
task.decoder.rnn_cell_tpl.zero_state_init_params.method : 'zeros'
task.decoder.rnn_cell_tpl.zero_state_init_params.seed : NoneType
task.decoder.rnn_cell_tpl.zo_prob : 0.0
task.decoder.rnn_layers : 2
task.decoder.skip_lp_regularization : NoneType
task.decoder.softmax.allow_implicit_capture : NoneType
task.decoder.softmax.apply_pruning : False
task.decoder.softmax.chunk_size : 0
task.decoder.softmax.cls : type/lingvo.core.layers/SimpleFullSoftmax
task.decoder.softmax.dtype : float32
task.decoder.softmax.fprop_dtype : NoneType
task.decoder.softmax.inference_driver_name : NoneType
task.decoder.softmax.input_dim : 0
task.decoder.softmax.is_eval : NoneType
task.decoder.softmax.is_inference : NoneType
task.decoder.softmax.logits_abs_max : NoneType
task.decoder.softmax.name : ''
task.decoder.softmax.num_classes : 76
task.decoder.softmax.num_sampled : 0
task.decoder.softmax.num_shards : 1
task.decoder.softmax.params_init.method : 'uniform'
task.decoder.softmax.params_init.scale : 0.1
task.decoder.softmax.params_init.seed : NoneType
task.decoder.softmax.qdomain.default : NoneType
task.decoder.softmax.random_seed : NoneType
task.decoder.softmax.skip_lp_regularization : NoneType
task.decoder.softmax.vn.global_vn : False
task.decoder.softmax.vn.per_step_vn : False
task.decoder.softmax.vn.scale : NoneType
task.decoder.softmax.vn.seed : NoneType
task.decoder.softmax_uses_attention : True
task.decoder.source_dim : 2048
task.decoder.target_eos_id : 2
task.decoder.target_seq_len : 620
task.decoder.target_sequence_sampler.allow_implicit_capture : NoneType
task.decoder.target_sequence_sampler.cls : type/lingvo.core.target_sequence_sampler/TargetSequenceSampler
task.decoder.target_sequence_sampler.dtype : float32
task.decoder.target_sequence_sampler.fprop_dtype : NoneType
task.decoder.target_sequence_sampler.inference_driver_name : NoneType
task.decoder.target_sequence_sampler.is_eval : NoneType
task.decoder.target_sequence_sampler.is_inference : NoneType
task.decoder.target_sequence_sampler.name : 'target_sequence_sampler'
task.decoder.target_sequence_sampler.params_init.method : 'xavier'
task.decoder.target_sequence_sampler.params_init.scale : 1.000001
task.decoder.target_sequence_sampler.params_init.seed : NoneType
task.decoder.target_sequence_sampler.random_seed : NoneType
task.decoder.target_sequence_sampler.skip_lp_regularization : NoneType
task.decoder.target_sequence_sampler.target_eoc_id : -1
task.decoder.target_sequence_sampler.target_eos_id : 2
task.decoder.target_sequence_sampler.target_seq_len : 0
task.decoder.target_sequence_sampler.target_sos_id : 1
task.decoder.target_sequence_sampler.temperature : 1.0
task.decoder.target_sequence_sampler.vn.global_vn : False
task.decoder.target_sequence_sampler.vn.per_step_vn : False
task.decoder.target_sequence_sampler.vn.scale : NoneType
task.decoder.target_sequence_sampler.vn.seed : NoneType
task.decoder.target_sos_id : 1
task.decoder.use_unnormalized_logits_as_log_probs : True
task.decoder.use_while_loop_based_unrolling : False
task.decoder.vn.global_vn : False
task.decoder.vn.per_step_vn : False
task.decoder.vn.scale : NoneType
task.decoder.vn.seed : NoneType
task.dtype : float32
task.encoder.after_conv_lstm_cnn_tpl.activation : 'RELU'
task.encoder.after_conv_lstm_cnn_tpl.allow_implicit_capture : NoneType
task.encoder.after_conv_lstm_cnn_tpl.batch_norm : True
task.encoder.after_conv_lstm_cnn_tpl.bias : False
task.encoder.after_conv_lstm_cnn_tpl.bn_decay : 0.999
task.encoder.after_conv_lstm_cnn_tpl.bn_fold_weights : NoneType
task.encoder.after_conv_lstm_cnn_tpl.causal_convolution : False
task.encoder.after_conv_lstm_cnn_tpl.cls : type/lingvo.core.layers/Conv2DLayer
task.encoder.after_conv_lstm_cnn_tpl.conv_last : False
task.encoder.after_conv_lstm_cnn_tpl.dilation_rate : (1, 1)
task.encoder.after_conv_lstm_cnn_tpl.disable_activation_quantization : False
task.encoder.after_conv_lstm_cnn_tpl.dtype : float32
task.encoder.after_conv_lstm_cnn_tpl.filter_shape : [3, 3, 'NoneType', 'NoneType']
task.encoder.after_conv_lstm_cnn_tpl.filter_stride : [1, 1]
task.encoder.after_conv_lstm_cnn_tpl.fprop_dtype : NoneType
task.encoder.after_conv_lstm_cnn_tpl.inference_driver_name : NoneType
task.encoder.after_conv_lstm_cnn_tpl.is_eval : NoneType
task.encoder.after_conv_lstm_cnn_tpl.is_inference : NoneType
task.encoder.after_conv_lstm_cnn_tpl.name : ''
task.encoder.after_conv_lstm_cnn_tpl.params_init.method : 'truncated_gaussian'
task.encoder.after_conv_lstm_cnn_tpl.params_init.scale : 0.1
task.encoder.after_conv_lstm_cnn_tpl.params_init.seed : NoneType
task.encoder.after_conv_lstm_cnn_tpl.qdomain.default : NoneType
task.encoder.after_conv_lstm_cnn_tpl.random_seed : NoneType
task.encoder.after_conv_lstm_cnn_tpl.skip_lp_regularization : NoneType
task.encoder.after_conv_lstm_cnn_tpl.vn.global_vn : False
task.encoder.after_conv_lstm_cnn_tpl.vn.per_step_vn : False
task.encoder.after_conv_lstm_cnn_tpl.vn.scale : NoneType
task.encoder.after_conv_lstm_cnn_tpl.vn.seed : NoneType
task.encoder.after_conv_lstm_cnn_tpl.weight_norm : False
task.encoder.allow_implicit_capture : NoneType
task.encoder.bidi_rnn_type : 'func'
task.encoder.cls : type/lingvo.tasks.asr.encoder/AsrEncoder
task.encoder.cnn_tpl.activation : 'RELU'
task.encoder.cnn_tpl.allow_implicit_capture : NoneType
task.encoder.cnn_tpl.batch_norm : True
task.encoder.cnn_tpl.bias : False
task.encoder.cnn_tpl.bn_decay : 0.999
task.encoder.cnn_tpl.bn_fold_weights : NoneType
task.encoder.cnn_tpl.causal_convolution : False
task.encoder.cnn_tpl.cls : type/lingvo.core.layers/Conv2DLayer
task.encoder.cnn_tpl.conv_last : False
task.encoder.cnn_tpl.dilation_rate : (1, 1)
task.encoder.cnn_tpl.disable_activation_quantization : False
task.encoder.cnn_tpl.dtype : float32
task.encoder.cnn_tpl.filter_shape : (0, 0, 0, 0)
task.encoder.cnn_tpl.filter_stride : (0, 0)
task.encoder.cnn_tpl.fprop_dtype : NoneType
task.encoder.cnn_tpl.inference_driver_name : NoneType
task.encoder.cnn_tpl.is_eval : NoneType
task.encoder.cnn_tpl.is_inference : NoneType
task.encoder.cnn_tpl.name : ''
task.encoder.cnn_tpl.params_init.method : 'gaussian'
task.encoder.cnn_tpl.params_init.scale : 0.001
task.encoder.cnn_tpl.params_init.seed : NoneType
task.encoder.cnn_tpl.qdomain.default : NoneType
task.encoder.cnn_tpl.random_seed : NoneType
task.encoder.cnn_tpl.skip_lp_regularization : NoneType
task.encoder.cnn_tpl.vn.global_vn : False
task.encoder.cnn_tpl.vn.per_step_vn : False
task.encoder.cnn_tpl.vn.scale : NoneType
task.encoder.cnn_tpl.vn.seed : NoneType
task.encoder.cnn_tpl.weight_norm : False
task.encoder.conv_filter_shapes : [(3, 3, 1, 32), (3, 3, 32, 32)]
task.encoder.conv_filter_strides : [(2, 2), (2, 2)]
task.encoder.conv_lstm_tpl.allow_implicit_capture : NoneType
task.encoder.conv_lstm_tpl.cell_shape : ['NoneType', 'NoneType', 'NoneType', 'NoneType']
task.encoder.conv_lstm_tpl.cell_value_cap : 10.0
task.encoder.conv_lstm_tpl.cls : type/lingvo.core.rnn_cell/ConvLSTMCell
task.encoder.conv_lstm_tpl.dtype : float32
task.encoder.conv_lstm_tpl.filter_shape : [1, 3]
task.encoder.conv_lstm_tpl.fprop_dtype : NoneType
task.encoder.conv_lstm_tpl.inference_driver_name : NoneType
task.encoder.conv_lstm_tpl.inputs_arity : 1
task.encoder.conv_lstm_tpl.inputs_shape : ['NoneType', 'NoneType', 'NoneType', 'NoneType']
task.encoder.conv_lstm_tpl.is_eval : NoneType
task.encoder.conv_lstm_tpl.is_inference : NoneType
task.encoder.conv_lstm_tpl.name : ''
task.encoder.conv_lstm_tpl.num_input_nodes : 0
task.encoder.conv_lstm_tpl.num_output_nodes : 0
task.encoder.conv_lstm_tpl.output_nonlinearity : True
task.encoder.conv_lstm_tpl.params_init.method : 'truncated_gaussian'
task.encoder.conv_lstm_tpl.params_init.scale : 0.1
task.encoder.conv_lstm_tpl.params_init.seed : NoneType
task.encoder.conv_lstm_tpl.qdomain.default : NoneType
task.encoder.conv_lstm_tpl.random_seed : NoneType
task.encoder.conv_lstm_tpl.reset_cell_state : False
task.encoder.conv_lstm_tpl.skip_lp_regularization : NoneType
task.encoder.conv_lstm_tpl.vn.global_vn : False
task.encoder.conv_lstm_tpl.vn.per_step_vn : False
task.encoder.conv_lstm_tpl.vn.scale : NoneType
task.encoder.conv_lstm_tpl.vn.seed : NoneType
task.encoder.conv_lstm_tpl.zero_state_init_params.method : 'zeros'
task.encoder.conv_lstm_tpl.zero_state_init_params.seed : NoneType
task.encoder.conv_lstm_tpl.zo_prob : 0.0
task.encoder.dtype : float32
task.encoder.extra_per_layer_outputs : False
task.encoder.fprop_dtype : NoneType
task.encoder.highway_skip : False
task.encoder.highway_skip_tpl.allow_implicit_capture : NoneType
task.encoder.highway_skip_tpl.batch_norm : False
task.encoder.highway_skip_tpl.carry_bias_init : 1.0
task.encoder.highway_skip_tpl.cls : type/lingvo.core.layers/HighwaySkipLayer
task.encoder.highway_skip_tpl.couple_carry_transform_gates : False
task.encoder.highway_skip_tpl.dtype : float32
task.encoder.highway_skip_tpl.fprop_dtype : NoneType
task.encoder.highway_skip_tpl.inference_driver_name : NoneType
task.encoder.highway_skip_tpl.input_dim : 0
task.encoder.highway_skip_tpl.is_eval : NoneType
task.encoder.highway_skip_tpl.is_inference : NoneType
task.encoder.highway_skip_tpl.name : ''
task.encoder.highway_skip_tpl.params_init.method : 'xavier'
task.encoder.highway_skip_tpl.params_init.scale : 1.000001
task.encoder.highway_skip_tpl.params_init.seed : NoneType
task.encoder.highway_skip_tpl.random_seed : NoneType
task.encoder.highway_skip_tpl.skip_lp_regularization : NoneType
task.encoder.highway_skip_tpl.vn.global_vn : False
task.encoder.highway_skip_tpl.vn.per_step_vn : False
task.encoder.highway_skip_tpl.vn.scale : NoneType
task.encoder.highway_skip_tpl.vn.seed : NoneType
task.encoder.inference_driver_name : NoneType
task.encoder.input_shape : ['NoneType', 'NoneType', 80, 1]
task.encoder.is_eval : NoneType
task.encoder.is_inference : NoneType
task.encoder.lstm_cell_size : 1024
task.encoder.lstm_tpl.allow_implicit_capture : NoneType
task.encoder.lstm_tpl.apply_pruning : False
task.encoder.lstm_tpl.bias_init.method : 'constant'
task.encoder.lstm_tpl.bias_init.scale : 0.0
task.encoder.lstm_tpl.bias_init.seed : 0
task.encoder.lstm_tpl.cell_value_cap : 10.0
task.encoder.lstm_tpl.cls : type/lingvo.core.rnn_cell/LSTMCellSimple
task.encoder.lstm_tpl.couple_input_forget_gates : False
task.encoder.lstm_tpl.dtype : float32
task.encoder.lstm_tpl.enable_lstm_bias : True
task.encoder.lstm_tpl.forget_gate_bias : 0.0
task.encoder.lstm_tpl.fprop_dtype : NoneType
task.encoder.lstm_tpl.inference_driver_name : NoneType
task.encoder.lstm_tpl.inputs_arity : 1
task.encoder.lstm_tpl.is_eval : NoneType
task.encoder.lstm_tpl.is_inference : NoneType
task.encoder.lstm_tpl.name : ''
task.encoder.lstm_tpl.num_hidden_nodes : 0
task.encoder.lstm_tpl.num_input_nodes : 0
task.encoder.lstm_tpl.num_output_nodes : 0
task.encoder.lstm_tpl.output_nonlinearity : True
task.encoder.lstm_tpl.params_init.method : 'uniform'
task.encoder.lstm_tpl.params_init.scale : 0.1
task.encoder.lstm_tpl.params_init.seed : NoneType
task.encoder.lstm_tpl.qdomain.c_state : NoneType
task.encoder.lstm_tpl.qdomain.default : NoneType
task.encoder.lstm_tpl.qdomain.fullyconnected : NoneType
task.encoder.lstm_tpl.qdomain.m_state : NoneType
task.encoder.lstm_tpl.qdomain.weight : NoneType
task.encoder.lstm_tpl.random_seed : NoneType
task.encoder.lstm_tpl.reset_cell_state : False
task.encoder.lstm_tpl.skip_lp_regularization : NoneType
task.encoder.lstm_tpl.vn.global_vn : False
task.encoder.lstm_tpl.vn.per_step_vn : False
task.encoder.lstm_tpl.vn.scale : NoneType
task.encoder.lstm_tpl.vn.seed : NoneType
task.encoder.lstm_tpl.zero_state_init_params.method : 'zeros'
task.encoder.lstm_tpl.zero_state_init_params.seed : NoneType
task.encoder.lstm_tpl.zo_prob : 0.0
task.encoder.name : ''
task.encoder.num_cnn_layers : 2
task.encoder.num_conv_lstm_layers : 0
task.encoder.num_lstm_layers : 4
task.encoder.packed_input : False
task.encoder.pad_steps : 6
task.encoder.params_init.method : 'xavier'
task.encoder.params_init.scale : 1.000001
task.encoder.params_init.seed : NoneType
task.encoder.proj_tpl.activation : 'RELU'
task.encoder.proj_tpl.affine_last : False
task.encoder.proj_tpl.allow_implicit_capture : NoneType
task.encoder.proj_tpl.batch_norm : True
task.encoder.proj_tpl.bias_init : 0.0
task.encoder.proj_tpl.bn_fold_weights : NoneType
task.encoder.proj_tpl.cls : type/lingvo.core.layers/ProjectionLayer
task.encoder.proj_tpl.dtype : float32
task.encoder.proj_tpl.fprop_dtype : NoneType
task.encoder.proj_tpl.has_bias : False
task.encoder.proj_tpl.inference_driver_name : NoneType
task.encoder.proj_tpl.input_dim : 0
task.encoder.proj_tpl.is_eval : NoneType
task.encoder.proj_tpl.is_inference : NoneType
task.encoder.proj_tpl.name : ''
task.encoder.proj_tpl.output_dim : 0
task.encoder.proj_tpl.params_init.method : 'truncated_gaussian'
task.encoder.proj_tpl.params_init.scale : 0.1
task.encoder.proj_tpl.params_init.seed : NoneType
task.encoder.proj_tpl.qdomain.default : NoneType
task.encoder.proj_tpl.random_seed : NoneType
task.encoder.proj_tpl.skip_lp_regularization : NoneType
task.encoder.proj_tpl.vn.global_vn : False
task.encoder.proj_tpl.vn.per_step_vn : False
task.encoder.proj_tpl.vn.scale : NoneType
task.encoder.proj_tpl.vn.seed : NoneType
task.encoder.proj_tpl.weight_norm : False
task.encoder.project_lstm_output : True
task.encoder.random_seed : NoneType
task.encoder.residual_start : 0
task.encoder.residual_stride : 1
task.encoder.skip_lp_regularization : NoneType
task.encoder.vn.global_vn : False
task.encoder.vn.per_step_vn : False
task.encoder.vn.scale : NoneType
task.encoder.vn.seed : NoneType
task.eval.decoder_samples_per_summary : 0
task.eval.samples_per_summary : 5000
task.fprop_dtype : NoneType
task.frontend : NoneType
task.inference_driver_name : NoneType
task.input : NoneType
task.is_eval : NoneType
task.is_inference : NoneType
task.name : 'librispeech'
task.online_encoder : NoneType
task.params_init.method : 'xavier'
task.params_init.scale : 1.000001
task.params_init.seed : NoneType
task.random_seed : NoneType
task.skip_lp_regularization : NoneType
task.target_key : ''
task.train.bprop_variable_filter : NoneType
task.train.clip_gradient_norm_to_value : 1.0
task.train.clip_gradient_single_norm_to_value : 0.0
task.train.colocate_gradients_with_ops : True
task.train.early_stop.metric_history.jobname : 'eval_dev'
task.train.early_stop.metric_history.local_filesystem : False
task.train.early_stop.metric_history.logdir : ''
task.train.early_stop.metric_history.metric : 'log_pplx'
task.train.early_stop.metric_history.minimize : True
task.train.early_stop.metric_history.name : 'MetricHistory'
task.train.early_stop.metric_history.tfevent_file : False
task.train.early_stop.name : 'EarlyStop'
task.train.early_stop.tolerance : 0.0
task.train.early_stop.verbose : True
task.train.early_stop.window : 0
task.train.ema_decay : 0.0
task.train.gate_gradients : False
task.train.grad_aggregation_method : 1
task.train.grad_norm_to_clip_to_zero : 100.0
task.train.grad_norm_tracker : NoneType
task.train.init_from_checkpoint_rules : {}
task.train.l1_regularizer_weight : NoneType
task.train.l2_regularizer_weight : 1e-06
task.train.learning_rate : 0.00025
task.train.lr_schedule.allow_implicit_capture : NoneType
task.train.lr_schedule.cls : type/lingvo.core.lr_schedule/ContinuousLearningRateSchedule
task.train.lr_schedule.dtype : float32
task.train.lr_schedule.fprop_dtype : NoneType
task.train.lr_schedule.half_life_steps : 100000
task.train.lr_schedule.inference_driver_name : NoneType
task.train.lr_schedule.initial_value : 1.0
task.train.lr_schedule.is_eval : NoneType
task.train.lr_schedule.is_inference : NoneType
task.train.lr_schedule.min : 0.01
task.train.lr_schedule.name : 'LRSched'
task.train.lr_schedule.params_init.method : 'xavier'
task.train.lr_schedule.params_init.scale : 1.000001
task.train.lr_schedule.params_init.seed : NoneType
task.train.lr_schedule.random_seed : NoneType
task.train.lr_schedule.skip_lp_regularization : NoneType
task.train.lr_schedule.start_step : 50000
task.train.lr_schedule.vn.global_vn : False
task.train.lr_schedule.vn.per_step_vn : False
task.train.lr_schedule.vn.scale : NoneType
task.train.lr_schedule.vn.seed : NoneType
task.train.max_steps : 4000000
task.train.optimizer.allow_implicit_capture : NoneType
task.train.optimizer.beta1 : 0.9
task.train.optimizer.beta2 : 0.999
task.train.optimizer.cls : type/lingvo.core.optimizer/Adam
task.train.optimizer.dtype : float32
task.train.optimizer.epsilon : 1e-06
task.train.optimizer.fprop_dtype : NoneType
task.train.optimizer.inference_driver_name : NoneType
task.train.optimizer.is_eval : NoneType
task.train.optimizer.is_inference : NoneType
task.train.optimizer.name : 'Adam'
task.train.optimizer.params_init.method : 'xavier'
task.train.optimizer.params_init.scale : 1.000001
task.train.optimizer.params_init.seed : NoneType
task.train.optimizer.random_seed : NoneType
task.train.optimizer.skip_lp_regularization : NoneType
task.train.optimizer.vn.global_vn : False
task.train.optimizer.vn.per_step_vn : False
task.train.optimizer.vn.scale : NoneType
task.train.optimizer.vn.seed : NoneType
task.train.pruning_hparams_dict : NoneType
task.train.save_interval_seconds : 600
task.train.start_up_delay_steps : 200
task.train.summary_interval_steps : 100
task.train.tpu_steps_per_loop : 20
task.train.vn_start_step : 20000
task.train.vn_std : 0.075
task.vn.global_vn : True
task.vn.per_step_vn : False
task.vn.scale : NoneType
task.vn.seed : NoneType
train.early_stop.metric_history.jobname : 'eval_dev'
train.early_stop.metric_history.local_filesystem : False
train.early_stop.metric_history.logdir : ''
train.early_stop.metric_history.metric : 'log_pplx'
train.early_stop.metric_history.minimize
: True I0420 07:30:26.283200 140395597309760 base_runner.py:69] train.early_stop.metric_history.name : 'MetricHistory' I0420 07:30:26.283282 140395597309760 base_runner.py:69] train.early_stop.metric_history.tfevent_file : False I0420 07:30:26.283366 140395597309760 base_runner.py:69] train.early_stop.name : 'EarlyStop' I0420 07:30:26.283451 140395597309760 base_runner.py:69] train.early_stop.tolerance : 0.0 I0420 07:30:26.283533 140395597309760 base_runner.py:69] train.early_stop.verbose : True I0420 07:30:26.283624 140395597309760 base_runner.py:69] train.early_stop.window : 0 I0420 07:30:26.283704 140395597309760 base_runner.py:69] train.ema_decay : 0.0 I0420 07:30:26.283792 140395597309760 base_runner.py:69] train.init_from_checkpoint_rules : {} I0420 07:30:26.283878 140395597309760 base_runner.py:69] train.max_steps : 4000000 I0420 07:30:26.283962 140395597309760 base_runner.py:69] train.save_interval_seconds : 600 I0420 07:30:26.284043 140395597309760 base_runner.py:69] train.start_up_delay_steps : 200 I0420 07:30:26.284126 140395597309760 base_runner.py:69] train.summary_interval_steps : 100 I0420 07:30:26.284204 140395597309760 base_runner.py:69] train.tpu_steps_per_loop : 20 I0420 07:30:26.284286 140395597309760 base_runner.py:69] vn.global_vn : True I0420 07:30:26.284372 140395597309760 base_runner.py:69] vn.per_step_vn : False I0420 07:30:26.284456 140395597309760 base_runner.py:69] vn.scale : NoneType I0420 07:30:26.284537 140395597309760 base_runner.py:69] vn.seed : NoneType I0420 07:30:26.284621 140395597309760 base_runner.py:69] I0420 07:30:26.284714 140395597309760 base_runner.py:70] ============================================================ I0420 07:30:26.286209 140395597309760 base_runner.py:115] Starting ... 
I0420 07:30:26.286483 140395597309760 cluster.py:429] _LeastLoadedPlacer : ['/job:local/replica:0/task:0/device:CPU:0'] I0420 07:30:26.294176 140395597309760 cluster.py:447] Place variable global_step on /job:local/replica:0/task:0/device:CPU:0 8 I0420 07:30:26.304714 140395597309760 base_model.py:1116] Training parameters for : { early_stop: { metric_history: { "eval_dev" local_filesystem: False "/data/dingzhenyou/speech_data/librispeech/log/" "log_pplx" minimize: True "MetricHistory" tfevent_file: False } "EarlyStop" tolerance: 0.0 verbose: True window: 0 } ema_decay: 0.0 init_from_checkpoint_rules: {} max_steps: 4000000 save_interval_seconds: 600 start_up_delay_steps: 200 summary_interval_steps: 100 tpu_steps_per_loop: 20 } I0420 07:30:26.318938 140395597309760 base_input_generator.py:510] bucket_batch_limit [256, 128, 128, 128, 128, 128, 128, 128] I0420 07:30:26.584214 140395597309760 cluster.py:447] Place variable librispeech/enc/conv_L0/w/var on /job:local/replica:0/task:0/device:CPU:0 1160 I0420 07:30:26.586292 140395597309760 py_utils.py:1220] Creating var librispeech/enc/conv_L0/w/var:0 shape=(3, 3, 1, 32) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.591540 140395597309760 cluster.py:447] Place variable librispeech/enc/conv_L0/beta/var on /job:local/replica:0/task:0/device:CPU:0 1288 I0420 07:30:26.593260 140395597309760 py_utils.py:1220] Creating var librispeech/enc/conv_L0/beta/var:0 shape=(32,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.595974 140395597309760 cluster.py:447] Place variable librispeech/enc/conv_L0/gamma/var on /job:local/replica:0/task:0/device:CPU:0 1416 I0420 07:30:26.597687 140395597309760 py_utils.py:1220] Creating var librispeech/enc/conv_L0/gamma/var:0 shape=(32,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.601490 140395597309760 cluster.py:447] Place variable librispeech/enc/conv_L0/moving_mean/var on /job:local/replica:0/task:0/device:CPU:0 1544 I0420 
07:30:26.603195 140395597309760 py_utils.py:1220] Creating var librispeech/enc/conv_L0/moving_mean/var:0 shape=(32,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.605947 140395597309760 cluster.py:447] Place variable librispeech/enc/conv_L0/moving_variance/var on /job:local/replica:0/task:0/device:CPU:0 1672 I0420 07:30:26.607652 140395597309760 py_utils.py:1220] Creating var librispeech/enc/conv_L0/moving_variance/var:0 shape=(32,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.616122 140395597309760 cluster.py:447] Place variable librispeech/enc/conv_L1/w/var on /job:local/replica:0/task:0/device:CPU:0 38536 I0420 07:30:26.618041 140395597309760 py_utils.py:1220] Creating var librispeech/enc/conv_L1/w/var:0 shape=(3, 3, 32, 32) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.623229 140395597309760 cluster.py:447] Place variable librispeech/enc/conv_L1/beta/var on /job:local/replica:0/task:0/device:CPU:0 38664 I0420 07:30:26.625087 140395597309760 py_utils.py:1220] Creating var librispeech/enc/conv_L1/beta/var:0 shape=(32,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.627810 140395597309760 cluster.py:447] Place variable librispeech/enc/conv_L1/gamma/var on /job:local/replica:0/task:0/device:CPU:0 38792 I0420 07:30:26.629525 140395597309760 py_utils.py:1220] Creating var librispeech/enc/conv_L1/gamma/var:0 shape=(32,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.633286 140395597309760 cluster.py:447] Place variable librispeech/enc/conv_L1/moving_mean/var on /job:local/replica:0/task:0/device:CPU:0 38920 I0420 07:30:26.635004 140395597309760 py_utils.py:1220] Creating var librispeech/enc/conv_L1/moving_mean/var:0 shape=(32,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.637748 140395597309760 cluster.py:447] Place variable librispeech/enc/conv_L1/moving_variance/var on /job:local/replica:0/task:0/device:CPU:0 39048 I0420 07:30:26.639466 
140395597309760 py_utils.py:1220] Creating var librispeech/enc/conv_L1/moving_variance/var:0 shape=(32,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.660010 140395597309760 cluster.py:447] Place variable librispeech/enc/fwd_rnn_L0/wm/var on /job:local/replica:0/task:0/device:CPU:0 27302024 I0420 07:30:26.662070 140395597309760 py_utils.py:1220] Creating var librispeech/enc/fwd_rnn_L0/wm/var:0 shape=(1664, 4096) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.668963 140395597309760 cluster.py:447] Place variable librispeech/enc/fwd_rnn_L0/b/var on /job:local/replica:0/task:0/device:CPU:0 27318408 I0420 07:30:26.670681 140395597309760 py_utils.py:1220] Creating var librispeech/enc/fwd_rnn_L0/b/var:0 shape=(4096,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.688720 140395597309760 cluster.py:447] Place variable librispeech/enc/bak_rnn_L0/wm/var on /job:local/replica:0/task:0/device:CPU:0 54581384 I0420 07:30:26.690670 140395597309760 py_utils.py:1220] Creating var librispeech/enc/bak_rnn_L0/wm/var:0 shape=(1664, 4096) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.697510 140395597309760 cluster.py:447] Place variable librispeech/enc/bak_rnn_L0/b/var on /job:local/replica:0/task:0/device:CPU:0 54597768 I0420 07:30:26.699352 140395597309760 py_utils.py:1220] Creating var librispeech/enc/bak_rnn_L0/b/var:0 shape=(4096,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.721532 140395597309760 cluster.py:447] Place variable librispeech/enc/fwd_rnn_L1/wm/var on /job:local/replica:0/task:0/device:CPU:0 104929416 I0420 07:30:26.723484 140395597309760 py_utils.py:1220] Creating var librispeech/enc/fwd_rnn_L1/wm/var:0 shape=(3072, 4096) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.730324 140395597309760 cluster.py:447] Place variable librispeech/enc/fwd_rnn_L1/b/var on /job:local/replica:0/task:0/device:CPU:0 104945800 I0420 07:30:26.732203 140395597309760 
py_utils.py:1220] Creating var librispeech/enc/fwd_rnn_L1/b/var:0 shape=(4096,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.750165 140395597309760 cluster.py:447] Place variable librispeech/enc/bak_rnn_L1/wm/var on /job:local/replica:0/task:0/device:CPU:0 155277448 I0420 07:30:26.752114 140395597309760 py_utils.py:1220] Creating var librispeech/enc/bak_rnn_L1/wm/var:0 shape=(3072, 4096) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.759006 140395597309760 cluster.py:447] Place variable librispeech/enc/bak_rnn_L1/b/var on /job:local/replica:0/task:0/device:CPU:0 155293832 I0420 07:30:26.760740 140395597309760 py_utils.py:1220] Creating var librispeech/enc/bak_rnn_L1/b/var:0 shape=(4096,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.782478 140395597309760 cluster.py:447] Place variable librispeech/enc/fwd_rnn_L2/wm/var on /job:local/replica:0/task:0/device:CPU:0 205625480 I0420 07:30:26.784432 140395597309760 py_utils.py:1220] Creating var librispeech/enc/fwd_rnn_L2/wm/var:0 shape=(3072, 4096) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.791290 140395597309760 cluster.py:447] Place variable librispeech/enc/fwd_rnn_L2/b/var on /job:local/replica:0/task:0/device:CPU:0 205641864 I0420 07:30:26.793705 140395597309760 py_utils.py:1220] Creating var librispeech/enc/fwd_rnn_L2/b/var:0 shape=(4096,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.811717 140395597309760 cluster.py:447] Place variable librispeech/enc/bak_rnn_L2/wm/var on /job:local/replica:0/task:0/device:CPU:0 255973512 I0420 07:30:26.813673 140395597309760 py_utils.py:1220] Creating var librispeech/enc/bak_rnn_L2/wm/var:0 shape=(3072, 4096) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.820574 140395597309760 cluster.py:447] Place variable librispeech/enc/bak_rnn_L2/b/var on /job:local/replica:0/task:0/device:CPU:0 255989896 I0420 07:30:26.822328 140395597309760 py_utils.py:1220] Creating 
var librispeech/enc/bak_rnn_L2/b/var:0 shape=(4096,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.843966 140395597309760 cluster.py:447] Place variable librispeech/enc/fwd_rnn_L3/wm/var on /job:local/replica:0/task:0/device:CPU:0 306321544 I0420 07:30:26.845961 140395597309760 py_utils.py:1220] Creating var librispeech/enc/fwd_rnn_L3/wm/var:0 shape=(3072, 4096) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.852847 140395597309760 cluster.py:447] Place variable librispeech/enc/fwd_rnn_L3/b/var on /job:local/replica:0/task:0/device:CPU:0 306337928 I0420 07:30:26.854681 140395597309760 py_utils.py:1220] Creating var librispeech/enc/fwd_rnn_L3/b/var:0 shape=(4096,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.873220 140395597309760 cluster.py:447] Place variable librispeech/enc/bak_rnn_L3/wm/var on /job:local/replica:0/task:0/device:CPU:0 356669576 I0420 07:30:26.875176 140395597309760 py_utils.py:1220] Creating var librispeech/enc/bak_rnn_L3/wm/var:0 shape=(3072, 4096) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.882061 140395597309760 cluster.py:447] Place variable librispeech/enc/bak_rnn_L3/b/var on /job:local/replica:0/task:0/device:CPU:0 356685960 I0420 07:30:26.883821 140395597309760 py_utils.py:1220] Creating var librispeech/enc/bak_rnn_L3/b/var:0 shape=(4096,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.898005 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L0/w/var on /job:local/replica:0/task:0/device:CPU:0 373463176 I0420 07:30:26.899913 140395597309760 py_utils.py:1220] Creating var librispeech/enc/proj_L0/w/var:0 shape=(2048, 2048) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.905128 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L0/beta/var on /job:local/replica:0/task:0/device:CPU:0 373471368 I0420 07:30:26.906843 140395597309760 py_utils.py:1220] Creating var 
librispeech/enc/proj_L0/beta/var:0 shape=(2048,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.909568 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L0/gamma/var on /job:local/replica:0/task:0/device:CPU:0 373479560 I0420 07:30:26.911293 140395597309760 py_utils.py:1220] Creating var librispeech/enc/proj_L0/gamma/var:0 shape=(2048,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.915091 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L0/moving_mean/var on /job:local/replica:0/task:0/device:CPU:0 373487752 I0420 07:30:26.916820 140395597309760 py_utils.py:1220] Creating var librispeech/enc/proj_L0/moving_mean/var:0 shape=(2048,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.919569 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L0/moving_variance/var on /job:local/replica:0/task:0/device:CPU:0 373495944 I0420 07:30:26.921300 140395597309760 py_utils.py:1220] Creating var librispeech/enc/proj_L0/moving_variance/var:0 shape=(2048,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.929511 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L1/w/var on /job:local/replica:0/task:0/device:CPU:0 390273160 I0420 07:30:26.931543 140395597309760 py_utils.py:1220] Creating var librispeech/enc/proj_L1/w/var:0 shape=(2048, 2048) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.936654 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L1/beta/var on /job:local/replica:0/task:0/device:CPU:0 390281352 I0420 07:30:26.938420 140395597309760 py_utils.py:1220] Creating var librispeech/enc/proj_L1/beta/var:0 shape=(2048,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.941274 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L1/gamma/var on /job:local/replica:0/task:0/device:CPU:0 390289544 I0420 07:30:26.942984 140395597309760 py_utils.py:1220] Creating var 
librispeech/enc/proj_L1/gamma/var:0 shape=(2048,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.946660 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L1/moving_mean/var on /job:local/replica:0/task:0/device:CPU:0 390297736 I0420 07:30:26.948585 140395597309760 py_utils.py:1220] Creating var librispeech/enc/proj_L1/moving_mean/var:0 shape=(2048,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.951374 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L1/moving_variance/var on /job:local/replica:0/task:0/device:CPU:0 390305928 I0420 07:30:26.953089 140395597309760 py_utils.py:1220] Creating var librispeech/enc/proj_L1/moving_variance/var:0 shape=(2048,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.961455 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L2/w/var on /job:local/replica:0/task:0/device:CPU:0 407083144 I0420 07:30:26.963459 140395597309760 py_utils.py:1220] Creating var librispeech/enc/proj_L2/w/var:0 shape=(2048, 2048) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.969258 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L2/beta/var on /job:local/replica:0/task:0/device:CPU:0 407091336 I0420 07:30:26.970977 140395597309760 py_utils.py:1220] Creating var librispeech/enc/proj_L2/beta/var:0 shape=(2048,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.973731 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L2/gamma/var on /job:local/replica:0/task:0/device:CPU:0 407099528 I0420 07:30:26.975578 140395597309760 py_utils.py:1220] Creating var librispeech/enc/proj_L2/gamma/var:0 shape=(2048,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.979238 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L2/moving_mean/var on /job:local/replica:0/task:0/device:CPU:0 407107720 I0420 07:30:26.980963 140395597309760 py_utils.py:1220] Creating var 
librispeech/enc/proj_L2/moving_mean/var:0 shape=(2048,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:26.983851 140395597309760 cluster.py:447] Place variable librispeech/enc/proj_L2/moving_variance/var on /job:local/replica:0/task:0/device:CPU:0 407115912 I0420 07:30:26.985569 140395597309760 py_utils.py:1220] Creating var librispeech/enc/proj_L2/moving_variance/var:0 shape=(2048,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:27.012496 140395597309760 cluster.py:447] Place variable librispeech/dec/emb/var_0/var on /job:local/replica:0/task:0/device:CPU:0 407139016 I0420 07:30:27.014452 140395597309760 py_utils.py:1220] Creating var librispeech/dec/emb/var_0/var:0 shape=(76, 76) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:27.023178 140395597309760 cluster.py:447] Place variable librispeech/dec/softmax/weight_0/var on /job:local/replica:0/task:0/device:CPU:0 408072904 I0420 07:30:27.025145 140395597309760 py_utils.py:1220] Creating var librispeech/dec/softmax/weight_0/var:0 shape=(3072, 76) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:27.032419 140395597309760 cluster.py:447] Place variable librispeech/dec/softmax/bias_0/var on /job:local/replica:0/task:0/device:CPU:0 408073208 I0420 07:30:27.034135 140395597309760 py_utils.py:1220] Creating var librispeech/dec/softmax/bias_0/var:0 shape=(76,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:27.052334 140395597309760 cluster.py:447] Place variable librispeech/dec/rnn_cell/wm/var on /job:local/replica:0/task:0/device:CPU:0 459650040 I0420 07:30:27.054295 140395597309760 py_utils.py:1220] Creating var librispeech/dec/rnn_cell/wm/var:0 shape=(3148, 4096) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:27.061111 140395597309760 cluster.py:447] Place variable librispeech/dec/rnn_cell/b/var on /job:local/replica:0/task:0/device:CPU:0 459666424 I0420 07:30:27.062843 140395597309760 py_utils.py:1220] Creating var 
librispeech/dec/rnn_cell/b/var:0 shape=(4096,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:27.077316 140395597309760 cluster.py:447] Place variable librispeech/dec/rnn_cell_1/wm/var on /job:local/replica:0/task:0/device:CPU:0 526775288 I0420 07:30:27.079365 140395597309760 py_utils.py:1220] Creating var librispeech/dec/rnn_cell_1/wm/var:0 shape=(4096, 4096) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:27.086210 140395597309760 cluster.py:447] Place variable librispeech/dec/rnn_cell_1/b/var on /job:local/replica:0/task:0/device:CPU:0 526791672 I0420 07:30:27.087944 140395597309760 py_utils.py:1220] Creating var librispeech/dec/rnn_cell_1/b/var:0 shape=(4096,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:27.102298 140395597309760 cluster.py:447] Place variable librispeech/dec/atten/source_var/var on /job:local/replica:0/task:0/device:CPU:0 527840248 I0420 07:30:27.104248 140395597309760 py_utils.py:1220] Creating var librispeech/dec/atten/source_var/var:0 shape=(2048, 128) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:27.114279 140395597309760 cluster.py:447] Place variable librispeech/dec/atten/query_var/var on /job:local/replica:0/task:0/device:CPU:0 528364536 I0420 07:30:27.116240 140395597309760 py_utils.py:1220] Creating var librispeech/dec/atten/query_var/var:0 shape=(1024, 128) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:27.127183 140395597309760 cluster.py:447] Place variable librispeech/dec/atten/hidden_var/var on /job:local/replica:0/task:0/device:CPU:0 528365048 I0420 07:30:27.129138 140395597309760 py_utils.py:1220] Creating var librispeech/dec/atten/hidden_var/var:0 shape=(128,) on device /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:27.232089 140395597309760 py_utils.py:1277] === worker 0 === I0420 07:30:27.233619 140395597309760 py_utils.py:1267] worker 0: decoder.atten.global_step /job:local/replica:0/task:0/device:GPU:0 -> 
/job:local/replica:0/task:0/device:GPU:0 I0420 07:30:27.233711 140395597309760 py_utils.py:1267] worker 0: decoder.atten.hidden_var /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:1 I0420 07:30:27.233795 140395597309760 py_utils.py:1267] worker 0: decoder.atten.query_var /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:2 I0420 07:30:27.233886 140395597309760 py_utils.py:1267] worker 0: decoder.atten.source_var /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:3 I0420 07:30:27.233968 140395597309760 py_utils.py:1267] worker 0: decoder.beam_search.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:0 I0420 07:30:27.234051 140395597309760 py_utils.py:1267] worker 0: decoder.contextualizer.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:1 I0420 07:30:27.234136 140395597309760 py_utils.py:1267] worker 0: decoder.emb.wm[0] /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:CPU:0 I0420 07:30:27.234220 140395597309760 py_utils.py:1267] worker 0: decoder.fusion.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:2 I0420 07:30:27.234311 140395597309760 py_utils.py:1267] worker 0: decoder.fusion.lm.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:3 I0420 07:30:27.234388 140395597309760 py_utils.py:1267] worker 0: decoder.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:0 I0420 07:30:27.234468 140395597309760 py_utils.py:1267] worker 0: decoder.rnn_cell[0].b /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:1 I0420 07:30:27.234544 140395597309760 py_utils.py:1267] worker 0: decoder.rnn_cell[0].global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:2 I0420 
07:30:27.234625 140395597309760 py_utils.py:1267] worker 0: decoder.rnn_cell[0].wm /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:3 I0420 07:30:27.234702 140395597309760 py_utils.py:1267] worker 0: decoder.rnn_cell[1].b /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:0 I0420 07:30:27.234791 140395597309760 py_utils.py:1267] worker 0: decoder.rnn_cell[1].global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:1 I0420 07:30:27.234868 140395597309760 py_utils.py:1267] worker 0: decoder.rnn_cell[1].wm /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:2 I0420 07:30:27.234951 140395597309760 py_utils.py:1267] worker 0: decoder.softmax.bias_0 /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:3 I0420 07:30:27.235027 140395597309760 py_utils.py:1267] worker 0: decoder.softmax.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:0 I0420 07:30:27.235106 140395597309760 py_utils.py:1267] worker 0: decoder.softmax.weight_0 /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:1 I0420 07:30:27.235183 140395597309760 py_utils.py:1267] worker 0: decoder.target_sequence_sampler.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:2 I0420 07:30:27.235264 140395597309760 py_utils.py:1267] worker 0: encoder.conv[0].bn.beta /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:GPU:3 I0420 07:30:27.235340 140395597309760 py_utils.py:1267] worker 0: encoder.conv[0].bn.gamma /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:0 I0420 07:30:27.235421 140395597309760 py_utils.py:1267] worker 0: encoder.conv[0].bn.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:1 I0420 07:30:27.235497 140395597309760 
py_utils.py:1267] worker 0: encoder.conv[0].global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:2 I0420 07:30:27.235578 140395597309760 py_utils.py:1267] worker 0: encoder.conv[0].w /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:GPU:3 I0420 07:30:27.235655 140395597309760 py_utils.py:1267] worker 0: encoder.conv[1].bn.beta /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:GPU:0 I0420 07:30:27.235740 140395597309760 py_utils.py:1267] worker 0: encoder.conv[1].bn.gamma /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:1 I0420 07:30:27.235817 140395597309760 py_utils.py:1267] worker 0: encoder.conv[1].bn.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:2 I0420 07:30:27.235903 140395597309760 py_utils.py:1267] worker 0: encoder.conv[1].global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:3 I0420 07:30:27.235980 140395597309760 py_utils.py:1267] worker 0: encoder.conv[1].w /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:GPU:0 I0420 07:30:27.236061 140395597309760 py_utils.py:1267] worker 0: encoder.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:1 I0420 07:30:27.236138 140395597309760 py_utils.py:1267] worker 0: encoder.proj[0].bn.beta /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:GPU:2 I0420 07:30:27.236217 140395597309760 py_utils.py:1267] worker 0: encoder.proj[0].bn.gamma /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:3 I0420 07:30:27.236293 140395597309760 py_utils.py:1267] worker 0: encoder.proj[0].bn.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:0 I0420 07:30:27.236375 140395597309760 py_utils.py:1267] worker 0: encoder.proj[0].global_step 
/job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:1 I0420 07:30:27.236449 140395597309760 py_utils.py:1267] worker 0: encoder.proj[0].w /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:GPU:2 I0420 07:30:27.236531 140395597309760 py_utils.py:1267] worker 0: encoder.proj[1].bn.beta /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:GPU:3 I0420 07:30:27.236605 140395597309760 py_utils.py:1267] worker 0: encoder.proj[1].bn.gamma /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:0 I0420 07:30:27.236687 140395597309760 py_utils.py:1267] worker 0: encoder.proj[1].bn.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:1 I0420 07:30:27.236771 140395597309760 py_utils.py:1267] worker 0: encoder.proj[1].global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:2 I0420 07:30:27.236854 140395597309760 py_utils.py:1267] worker 0: encoder.proj[1].w /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:GPU:3 I0420 07:30:27.236928 140395597309760 py_utils.py:1267] worker 0: encoder.proj[2].bn.beta /job:local/replica:0/task:0/device:CPU:0 -> /job:local/replica:0/task:0/device:GPU:0 I0420 07:30:27.237009 140395597309760 py_utils.py:1267] worker 0: encoder.proj[2].bn.gamma /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:1 I0420 07:30:27.237085 140395597309760 py_utils.py:1267] worker 0: encoder.proj[2].bn.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:2 I0420 07:30:27.237164 140395597309760 py_utils.py:1267] worker 0: encoder.proj[2].global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:3 I0420 07:30:27.237241 140395597309760 py_utils.py:1267] worker 0: encoder.proj[2].w /job:local/replica:0/task:0/device:CPU:0 -> 
/job:local/replica:0/task:0/device:GPU:0
I0420 07:30:27.237322 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[0].bak_rnn.cell.b /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:1
I0420 07:30:27.237397 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[0].bak_rnn.cell.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:2
I0420 07:30:27.237498 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[0].bak_rnn.cell.wm /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:3
I0420 07:30:27.237581 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[0].bak_rnn.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:0
I0420 07:30:27.237664 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[0].fwd_rnn.cell.b /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:1
I0420 07:30:27.237750 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[0].fwd_rnn.cell.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:2
I0420 07:30:27.237853 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[0].fwd_rnn.cell.wm /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:3
I0420 07:30:27.237926 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[0].fwd_rnn.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:0
I0420 07:30:27.238007 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[0].global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:1
I0420 07:30:27.238082 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[1].bak_rnn.cell.b /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:2
I0420 07:30:27.238162 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[1].bak_rnn.cell.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:3
I0420 07:30:27.238239 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[1].bak_rnn.cell.wm /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:0
I0420 07:30:27.238321 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[1].bak_rnn.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:1
I0420 07:30:27.238395 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[1].fwd_rnn.cell.b /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:2
I0420 07:30:27.238476 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[1].fwd_rnn.cell.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:3
I0420 07:30:27.238553 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[1].fwd_rnn.cell.wm /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:0
I0420 07:30:27.238634 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[1].fwd_rnn.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:1
I0420 07:30:27.238708 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[1].global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:2
I0420 07:30:27.238795 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[2].bak_rnn.cell.b /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:3
I0420 07:30:27.238872 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[2].bak_rnn.cell.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:0
I0420 07:30:27.238953 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[2].bak_rnn.cell.wm /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:1
I0420 07:30:27.239028 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[2].bak_rnn.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:2
I0420 07:30:27.239109 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[2].fwd_rnn.cell.b /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:3
I0420 07:30:27.239190 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[2].fwd_rnn.cell.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:0
I0420 07:30:27.239272 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[2].fwd_rnn.cell.wm /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:1
I0420 07:30:27.239350 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[2].fwd_rnn.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:2
I0420 07:30:27.239430 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[2].global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:3
I0420 07:30:27.239506 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[3].bak_rnn.cell.b /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:0
I0420 07:30:27.239588 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[3].bak_rnn.cell.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:1
I0420 07:30:27.239665 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[3].bak_rnn.cell.wm /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:2
I0420 07:30:27.239753 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[3].bak_rnn.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:3
I0420 07:30:27.239830 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[3].fwd_rnn.cell.b /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:0
I0420 07:30:27.239909 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[3].fwd_rnn.cell.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:1
I0420 07:30:27.239985 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[3].fwd_rnn.cell.wm /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:2
I0420 07:30:27.240067 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[3].fwd_rnn.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:3
I0420 07:30:27.240142 140395597309760 py_utils.py:1267] worker 0: encoder.rnn[3].global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:0
I0420 07:30:27.240221 140395597309760 py_utils.py:1267] worker 0: global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:1
I0420 07:30:27.240297 140395597309760 py_utils.py:1267] worker 0: input._tokenizer_default.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:2
I0420 07:30:27.240376 140395597309760 py_utils.py:1267] worker 0: input.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:3
I0420 07:30:27.240453 140395597309760 py_utils.py:1267] worker 0: lr_schedule.exp.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:0
I0420 07:30:27.240534 140395597309760 py_utils.py:1267] worker 0: lr_schedule.exp.linear.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:1
I0420 07:30:27.240608 140395597309760 py_utils.py:1267] worker 0: lr_schedule.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:2
I0420 07:30:27.240689 140395597309760 py_utils.py:1267] worker 0: optimizer.global_step /job:local/replica:0/task:0/device:GPU:0 -> /job:local/replica:0/task:0/device:GPU:3
I0420 07:30:27.240773 140395597309760 py_utils.py:1283] ==========
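The placement block above shows the trainer assigning successive model variables to the four visible GPUs in strict round-robin order (GPU:1, GPU:2, GPU:3, GPU:0, and so on). A minimal sketch of that placement strategy, with hypothetical names (this is not the actual `py_utils` implementation):

```python
import itertools

def make_round_robin_placer(devices):
    """Return a function that assigns each variable the next device in a cycle."""
    pool = itertools.cycle(devices)
    return lambda var_name: next(pool)

devices = ["/job:local/replica:0/task:0/device:GPU:%d" % i for i in range(4)]
place = make_round_robin_placer(devices)

# Successive variables land on GPU:0, GPU:1, GPU:2, GPU:3, then wrap around.
assignments = [(name, place(name)) for name in
               ["encoder.rnn[0].bak_rnn.cell.b",
                "encoder.rnn[0].bak_rnn.cell.global_step",
                "encoder.rnn[0].bak_rnn.cell.wm",
                "encoder.rnn[0].bak_rnn.global_step",
                "encoder.rnn[0].fwd_rnn.cell.b"]]
```

Spreading variables this way balances parameter storage and gradient traffic across the worker's GPUs rather than pinning everything to GPU:0.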
I0420 07:30:30.333229 140395597309760 decoder.py:749] Merging metric loss: (, )
I0420 07:30:30.337625 140395597309760 decoder.py:749] Merging metric fraction_of_correct_next_step_preds: (, )
I0420 07:30:30.341779 140395597309760 decoder.py:749] Merging metric log_pplx: (, )
I0420 07:30:33.729021 140395597309760 decoder.py:749] Merging metric loss: (, )
I0420 07:30:33.733247 140395597309760 decoder.py:749] Merging metric fraction_of_correct_next_step_preds: (, )
I0420 07:30:33.737484 140395597309760 decoder.py:749] Merging metric log_pplx: (, )
I0420 07:30:36.879219 140395597309760 decoder.py:749] Merging metric loss: (, )
I0420 07:30:36.883528 140395597309760 decoder.py:749] Merging metric fraction_of_correct_next_step_preds: (, )
I0420 07:30:36.887641 140395597309760 decoder.py:749] Merging metric log_pplx: (, )
I0420 07:30:40.128005 140395597309760 decoder.py:749] Merging metric loss: (, )
I0420 07:30:40.132308 140395597309760 decoder.py:749] Merging metric fraction_of_correct_next_step_preds: (, )
I0420 07:30:40.136519 140395597309760 decoder.py:749] Merging metric log_pplx: (, )
I0420 07:30:53.594866 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: decoder.atten.hidden_var:
I0420 07:30:53.595122 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: decoder.atten.query_var:
I0420 07:30:53.595273 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: decoder.atten.source_var:
I0420 07:30:53.595398 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: decoder.emb.wm_0:
I0420 07:30:53.595546 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: decoder.rnn_cell_0.b:
I0420 07:30:53.595670 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: decoder.rnn_cell_0.wm:
I0420 07:30:53.595813 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: decoder.rnn_cell_1.b:
I0420 07:30:53.595932 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: decoder.rnn_cell_1.wm:
I0420 07:30:53.596065 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: decoder.softmax.bias_0:
I0420 07:30:53.596189 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: decoder.softmax.weight_0:
I0420 07:30:53.596312 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.conv_0.bn.beta:
I0420 07:30:53.596432 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.conv_0.bn.gamma:
I0420 07:30:53.596545 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.conv_0.w:
I0420 07:30:53.596672 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.conv_1.bn.beta:
I0420 07:30:53.596795 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.conv_1.bn.gamma:
I0420 07:30:53.596910 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.conv_1.w:
I0420 07:30:53.597038 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.proj_0.bn.beta:
I0420 07:30:53.597156 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.proj_0.bn.gamma:
I0420 07:30:53.597265 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.proj_0.w:
I0420 07:30:53.597389 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.proj_1.bn.beta:
I0420 07:30:53.597506 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.proj_1.bn.gamma:
I0420 07:30:53.597620 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.proj_1.w:
I0420 07:30:53.597750 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.proj_2.bn.beta:
I0420 07:30:53.597863 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.proj_2.bn.gamma:
I0420 07:30:53.597970 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.proj_2.w:
I0420 07:30:53.598094 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_0.bak_rnn.cell.b:
I0420 07:30:53.598207 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_0.bak_rnn.cell.wm:
I0420 07:30:53.598339 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_0.fwd_rnn.cell.b:
I0420 07:30:53.598453 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_0.fwd_rnn.cell.wm:
I0420 07:30:53.598577 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_1.bak_rnn.cell.b:
I0420 07:30:53.598686 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_1.bak_rnn.cell.wm:
I0420 07:30:53.598819 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_1.fwd_rnn.cell.b:
I0420 07:30:53.598931 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_1.fwd_rnn.cell.wm:
I0420 07:30:53.599054 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_2.bak_rnn.cell.b:
I0420 07:30:53.599164 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_2.bak_rnn.cell.wm:
I0420 07:30:53.599287 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_2.fwd_rnn.cell.b:
I0420 07:30:53.599399 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_2.fwd_rnn.cell.wm:
I0420 07:30:53.599523 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_3.bak_rnn.cell.b:
I0420 07:30:53.599636 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_3.bak_rnn.cell.wm:
I0420 07:30:53.599764 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_3.fwd_rnn.cell.b:
I0420 07:30:53.599878 140395597309760 py_utils.py:1730] AdjustGradientsWithLpLoss: encoder.rnn_3.fwd_rnn.cell.wm:
I0420 07:30:54.965993 140395597309760 cluster.py:447] Place variable beta1_power on /job:local/replica:0/task:0/device:CPU:0 528365052
I0420 07:30:54.969089 140395597309760 cluster.py:447] Place variable beta2_power on /job:local/replica:0/task:0/device:CPU:0 528365056
I0420 07:30:55.471826 140395597309760 cluster.py:447] Place variable librispeech/total_samples/var on /job:local/replica:0/task:0/device:CPU:0 528365064
I0420 07:30:55.473663 140395597309760 py_utils.py:1220] Creating var librispeech/total_samples/var:0 shape=() on device /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:55.480046 140395597309760 cluster.py:447] Place variable total_nan_gradients/var on /job:local/replica:0/task:0/device:CPU:0 528365072
I0420 07:30:55.481794 140395597309760 py_utils.py:1220] Creating var total_nan_gradients/var:0 shape=() on device /job:local/replica:0/task:0/device:CPU:0
I0420 07:30:55.505249 140395597309760 trainer.py:392] Trainer number of enqueue ops: 0
I0420 07:30:55.505595 140395597309760 trainer.py:401] AttributeError. Expected for single task models.
I0420 07:31:00.711345 140395597309760 trainer.py:1329] Starting runners
I0420 07:31:00.712577 140375142418176 base_runner.py:195] controller started.
I0420 07:31:00.712903 140395597309760 trainer.py:1336] Total num runner.enqueue_ops: 0
I0420 07:31:00.713491 140375134025472 base_runner.py:195] trainer started.
I0420 07:31:00.713701 140395597309760 trainer.py:1336] Total num runner.enqueue_ops: 0
I0420 07:31:00.714096 140395597309760 trainer.py:1346] Waiting for runners to finish...
2019-04-20 07:31:01.873354: W tensorflow/compiler/jit/mark_for_compilation_pass.cc:1485] (One-time warning): Not using XLA:CPU for cluster because envvar TF_XLA_FLAGS=--tf_xla_cpu_global_jit was not set. If you want XLA:CPU, either set that envvar, or use experimental_jit_scope to enable XLA:CPU. To confirm that XLA is active, pass --vmodule=xla_compilation_cache=1 (as a proper command-line flag, not via TF_XLA_FLAGS) or set the envvar XLA_FLAGS=--xla_hlo_profile.
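The XLA warning above is informational only: XLA:CPU clustering is off unless explicitly requested. If it is actually wanted, the warning itself describes the opt-in; a sketch of the environment setup before launching the trainer (flag values taken verbatim from the warning text):

```shell
# Opt in to XLA:CPU JIT compilation before starting the trainer.
export TF_XLA_FLAGS=--tf_xla_cpu_global_jit
# Optional: make XLA emit HLO profiles so its activity can be confirmed.
export XLA_FLAGS=--xla_hlo_profile
```

For GPU-only training this warning can safely be ignored.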
I0420 07:31:02.337265 140375142418176 trainer.py:302] Uninitialized var list: ['global_step', 'librispeech/enc/conv_L0/w/var', 'librispeech/enc/conv_L0/beta/var', 'librispeech/enc/conv_L0/gamma/var', 'librispeech/enc/conv_L0/moving_mean/var', 'librispeech/enc/conv_L0/moving_variance/var', 'librispeech/enc/conv_L1/w/var', 'librispeech/enc/conv_L1/beta/var', 'librispeech/enc/conv_L1/gamma/var', 'librispeech/enc/conv_L1/moving_mean/var', 'librispeech/enc/conv_L1/moving_variance/var', 'librispeech/enc/fwd_rnn_L0/wm/var', 'librispeech/enc/fwd_rnn_L0/b/var', 'librispeech/enc/bak_rnn_L0/wm/var', 'librispeech/enc/bak_rnn_L0/b/var', 'librispeech/enc/fwd_rnn_L1/wm/var', 'librispeech/enc/fwd_rnn_L1/b/var', 'librispeech/enc/bak_rnn_L1/wm/var', 'librispeech/enc/bak_rnn_L1/b/var', 'librispeech/enc/fwd_rnn_L2/wm/var', 'librispeech/enc/fwd_rnn_L2/b/var', 'librispeech/enc/bak_rnn_L2/wm/var', 'librispeech/enc/bak_rnn_L2/b/var', 'librispeech/enc/fwd_rnn_L3/wm/var', 'librispeech/enc/fwd_rnn_L3/b/var', 'librispeech/enc/bak_rnn_L3/wm/var', 'librispeech/enc/bak_rnn_L3/b/var', 'librispeech/enc/proj_L0/w/var', 'librispeech/enc/proj_L0/beta/var', 'librispeech/enc/proj_L0/gamma/var', 'librispeech/enc/proj_L0/moving_mean/var', 'librispeech/enc/proj_L0/moving_variance/var', 'librispeech/enc/proj_L1/w/var', 'librispeech/enc/proj_L1/beta/var', 'librispeech/enc/proj_L1/gamma/var', 'librispeech/enc/proj_L1/moving_mean/var', 'librispeech/enc/proj_L1/moving_variance/var', 'librispeech/enc/proj_L2/w/var', 'librispeech/enc/proj_L2/beta/var', 'librispeech/enc/proj_L2/gamma/var', 'librispeech/enc/proj_L2/moving_mean/var', 'librispeech/enc/proj_L2/moving_variance/var', 'librispeech/dec/emb/var_0/var', 'librispeech/dec/softmax/weight_0/var', 'librispeech/dec/softmax/bias_0/var', 'librispeech/dec/rnn_cell/wm/var', 'librispeech/dec/rnn_cell/b/var', 'librispeech/dec/rnn_cell_1/wm/var', 'librispeech/dec/rnn_cell_1/b/var', 'librispeech/dec/atten/source_var/var', 'librispeech/dec/atten/query_var/var', 
'librispeech/dec/atten/hidden_var/var', 'beta1_power', 'beta2_power', 'librispeech/dec/atten/hidden_var/var/Adam', 'librispeech/dec/atten/hidden_var/var/Adam_1', 'librispeech/dec/atten/query_var/var/Adam', 'librispeech/dec/atten/query_var/var/Adam_1', 'librispeech/dec/atten/source_var/var/Adam', 'librispeech/dec/atten/source_var/var/Adam_1', 'librispeech/dec/emb/var_0/var/Adam', 'librispeech/dec/emb/var_0/var/Adam_1', 'librispeech/dec/rnn_cell/b/var/Adam', 'librispeech/dec/rnn_cell/b/var/Adam_1', 'librispeech/dec/rnn_cell/wm/var/Adam', 'librispeech/dec/rnn_cell/wm/var/Adam_1', 'librispeech/dec/rnn_cell_1/b/var/Adam', 'librispeech/dec/rnn_cell_1/b/var/Adam_1', 'librispeech/dec/rnn_cell_1/wm/var/Adam', 'librispeech/dec/rnn_cell_1/wm/var/Adam_1', 'librispeech/dec/softmax/bias_0/var/Adam', 'librispeech/dec/softmax/bias_0/var/Adam_1', 'librispeech/dec/softmax/weight_0/var/Adam', 'librispeech/dec/softmax/weight_0/var/Adam_1', 'librispeech/enc/conv_L0/beta/var/Adam', 'librispeech/enc/conv_L0/beta/var/Adam_1', 'librispeech/enc/conv_L0/gamma/var/Adam', 'librispeech/enc/conv_L0/gamma/var/Adam_1', 'librispeech/enc/conv_L0/w/var/Adam', 'librispeech/enc/conv_L0/w/var/Adam_1', 'librispeech/enc/conv_L1/beta/var/Adam', 'librispeech/enc/conv_L1/beta/var/Adam_1', 'librispeech/enc/conv_L1/gamma/var/Adam', 'librispeech/enc/conv_L1/gamma/var/Adam_1', 'librispeech/enc/conv_L1/w/var/Adam', 'librispeech/enc/conv_L1/w/var/Adam_1', 'librispeech/enc/proj_L0/beta/var/Adam', 'librispeech/enc/proj_L0/beta/var/Adam_1', 'librispeech/enc/proj_L0/gamma/var/Adam', 'librispeech/enc/proj_L0/gamma/var/Adam_1', 'librispeech/enc/proj_L0/w/var/Adam', 'librispeech/enc/proj_L0/w/var/Adam_1', 'librispeech/enc/proj_L1/beta/var/Adam', 'librispeech/enc/proj_L1/beta/var/Adam_1', 'librispeech/enc/proj_L1/gamma/var/Adam', 'librispeech/enc/proj_L1/gamma/var/Adam_1', 'librispeech/enc/proj_L1/w/var/Adam', 'librispeech/enc/proj_L1/w/var/Adam_1', 'librispeech/enc/proj_L2/beta/var/Adam', 
'librispeech/enc/proj_L2/beta/var/Adam_1', 'librispeech/enc/proj_L2/gamma/var/Adam', 'librispeech/enc/proj_L2/gamma/var/Adam_1', 'librispeech/enc/proj_L2/w/var/Adam', 'librispeech/enc/proj_L2/w/var/Adam_1', 'librispeech/enc/bak_rnn_L0/b/var/Adam', 'librispeech/enc/bak_rnn_L0/b/var/Adam_1', 'librispeech/enc/bak_rnn_L0/wm/var/Adam', 'librispeech/enc/bak_rnn_L0/wm/var/Adam_1', 'librispeech/enc/fwd_rnn_L0/b/var/Adam', 'librispeech/enc/fwd_rnn_L0/b/var/Adam_1', 'librispeech/enc/fwd_rnn_L0/wm/var/Adam', 'librispeech/enc/fwd_rnn_L0/wm/var/Adam_1', 'librispeech/enc/bak_rnn_L1/b/var/Adam', 'librispeech/enc/bak_rnn_L1/b/var/Adam_1', 'librispeech/enc/bak_rnn_L1/wm/var/Adam', 'librispeech/enc/bak_rnn_L1/wm/var/Adam_1', 'librispeech/enc/fwd_rnn_L1/b/var/Adam', 'librispeech/enc/fwd_rnn_L1/b/var/Adam_1', 'librispeech/enc/fwd_rnn_L1/wm/var/Adam', 'librispeech/enc/fwd_rnn_L1/wm/var/Adam_1', 'librispeech/enc/bak_rnn_L2/b/var/Adam', 'librispeech/enc/bak_rnn_L2/b/var/Adam_1', 'librispeech/enc/bak_rnn_L2/wm/var/Adam', 'librispeech/enc/bak_rnn_L2/wm/var/Adam_1', 'librispeech/enc/fwd_rnn_L2/b/var/Adam', 'librispeech/enc/fwd_rnn_L2/b/var/Adam_1', 'librispeech/enc/fwd_rnn_L2/wm/var/Adam', 'librispeech/enc/fwd_rnn_L2/wm/var/Adam_1', 'librispeech/enc/bak_rnn_L3/b/var/Adam', 'librispeech/enc/bak_rnn_L3/b/var/Adam_1', 'librispeech/enc/bak_rnn_L3/wm/var/Adam', 'librispeech/enc/bak_rnn_L3/wm/var/Adam_1', 'librispeech/enc/fwd_rnn_L3/b/var/Adam', 'librispeech/enc/fwd_rnn_L3/b/var/Adam_1', 'librispeech/enc/fwd_rnn_L3/wm/var/Adam', 'librispeech/enc/fwd_rnn_L3/wm/var/Adam_1', 'librispeech/total_samples/var', 'total_nan_gradients/var']
I0420 07:31:02.337812 140375142418176 trainer.py:313] Initialize ALL variables: ['global_step', 'librispeech/enc/conv_L0/w/var', 'librispeech/enc/conv_L0/beta/var', 'librispeech/enc/conv_L0/gamma/var', 'librispeech/enc/conv_L0/moving_mean/var', 'librispeech/enc/conv_L0/moving_variance/var', 'librispeech/enc/conv_L1/w/var', 'librispeech/enc/conv_L1/beta/var',
'librispeech/enc/conv_L1/gamma/var', 'librispeech/enc/conv_L1/moving_mean/var', 'librispeech/enc/conv_L1/moving_variance/var', 'librispeech/enc/fwd_rnn_L0/wm/var', 'librispeech/enc/fwd_rnn_L0/b/var', 'librispeech/enc/bak_rnn_L0/wm/var', 'librispeech/enc/bak_rnn_L0/b/var', 'librispeech/enc/fwd_rnn_L1/wm/var', 'librispeech/enc/fwd_rnn_L1/b/var', 'librispeech/enc/bak_rnn_L1/wm/var', 'librispeech/enc/bak_rnn_L1/b/var', 'librispeech/enc/fwd_rnn_L2/wm/var', 'librispeech/enc/fwd_rnn_L2/b/var', 'librispeech/enc/bak_rnn_L2/wm/var', 'librispeech/enc/bak_rnn_L2/b/var', 'librispeech/enc/fwd_rnn_L3/wm/var', 'librispeech/enc/fwd_rnn_L3/b/var', 'librispeech/enc/bak_rnn_L3/wm/var', 'librispeech/enc/bak_rnn_L3/b/var', 'librispeech/enc/proj_L0/w/var', 'librispeech/enc/proj_L0/beta/var', 'librispeech/enc/proj_L0/gamma/var', 'librispeech/enc/proj_L0/moving_mean/var', 'librispeech/enc/proj_L0/moving_variance/var', 'librispeech/enc/proj_L1/w/var', 'librispeech/enc/proj_L1/beta/var', 'librispeech/enc/proj_L1/gamma/var', 'librispeech/enc/proj_L1/moving_mean/var', 'librispeech/enc/proj_L1/moving_variance/var', 'librispeech/enc/proj_L2/w/var', 'librispeech/enc/proj_L2/beta/var', 'librispeech/enc/proj_L2/gamma/var', 'librispeech/enc/proj_L2/moving_mean/var', 'librispeech/enc/proj_L2/moving_variance/var', 'librispeech/dec/emb/var_0/var', 'librispeech/dec/softmax/weight_0/var', 'librispeech/dec/softmax/bias_0/var', 'librispeech/dec/rnn_cell/wm/var', 'librispeech/dec/rnn_cell/b/var', 'librispeech/dec/rnn_cell_1/wm/var', 'librispeech/dec/rnn_cell_1/b/var', 'librispeech/dec/atten/source_var/var', 'librispeech/dec/atten/query_var/var', 'librispeech/dec/atten/hidden_var/var', 'beta1_power', 'beta2_power', 'librispeech/dec/atten/hidden_var/var/Adam', 'librispeech/dec/atten/hidden_var/var/Adam_1', 'librispeech/dec/atten/query_var/var/Adam', 'librispeech/dec/atten/query_var/var/Adam_1', 'librispeech/dec/atten/source_var/var/Adam', 'librispeech/dec/atten/source_var/var/Adam_1', 
'librispeech/dec/emb/var_0/var/Adam', 'librispeech/dec/emb/var_0/var/Adam_1', 'librispeech/dec/rnn_cell/b/var/Adam', 'librispeech/dec/rnn_cell/b/var/Adam_1', 'librispeech/dec/rnn_cell/wm/var/Adam', 'librispeech/dec/rnn_cell/wm/var/Adam_1', 'librispeech/dec/rnn_cell_1/b/var/Adam', 'librispeech/dec/rnn_cell_1/b/var/Adam_1', 'librispeech/dec/rnn_cell_1/wm/var/Adam', 'librispeech/dec/rnn_cell_1/wm/var/Adam_1', 'librispeech/dec/softmax/bias_0/var/Adam', 'librispeech/dec/softmax/bias_0/var/Adam_1', 'librispeech/dec/softmax/weight_0/var/Adam', 'librispeech/dec/softmax/weight_0/var/Adam_1', 'librispeech/enc/conv_L0/beta/var/Adam', 'librispeech/enc/conv_L0/beta/var/Adam_1', 'librispeech/enc/conv_L0/gamma/var/Adam', 'librispeech/enc/conv_L0/gamma/var/Adam_1', 'librispeech/enc/conv_L0/w/var/Adam', 'librispeech/enc/conv_L0/w/var/Adam_1', 'librispeech/enc/conv_L1/beta/var/Adam', 'librispeech/enc/conv_L1/beta/var/Adam_1', 'librispeech/enc/conv_L1/gamma/var/Adam', 'librispeech/enc/conv_L1/gamma/var/Adam_1', 'librispeech/enc/conv_L1/w/var/Adam', 'librispeech/enc/conv_L1/w/var/Adam_1', 'librispeech/enc/proj_L0/beta/var/Adam', 'librispeech/enc/proj_L0/beta/var/Adam_1', 'librispeech/enc/proj_L0/gamma/var/Adam', 'librispeech/enc/proj_L0/gamma/var/Adam_1', 'librispeech/enc/proj_L0/w/var/Adam', 'librispeech/enc/proj_L0/w/var/Adam_1', 'librispeech/enc/proj_L1/beta/var/Adam', 'librispeech/enc/proj_L1/beta/var/Adam_1', 'librispeech/enc/proj_L1/gamma/var/Adam', 'librispeech/enc/proj_L1/gamma/var/Adam_1', 'librispeech/enc/proj_L1/w/var/Adam', 'librispeech/enc/proj_L1/w/var/Adam_1', 'librispeech/enc/proj_L2/beta/var/Adam', 'librispeech/enc/proj_L2/beta/var/Adam_1', 'librispeech/enc/proj_L2/gamma/var/Adam', 'librispeech/enc/proj_L2/gamma/var/Adam_1', 'librispeech/enc/proj_L2/w/var/Adam', 'librispeech/enc/proj_L2/w/var/Adam_1', 'librispeech/enc/bak_rnn_L0/b/var/Adam', 'librispeech/enc/bak_rnn_L0/b/var/Adam_1', 'librispeech/enc/bak_rnn_L0/wm/var/Adam', 'librispeech/enc/bak_rnn_L0/wm/var/Adam_1', 
'librispeech/enc/fwd_rnn_L0/b/var/Adam', 'librispeech/enc/fwd_rnn_L0/b/var/Adam_1', 'librispeech/enc/fwd_rnn_L0/wm/var/Adam', 'librispeech/enc/fwd_rnn_L0/wm/var/Adam_1', 'librispeech/enc/bak_rnn_L1/b/var/Adam', 'librispeech/enc/bak_rnn_L1/b/var/Adam_1', 'librispeech/enc/bak_rnn_L1/wm/var/Adam', 'librispeech/enc/bak_rnn_L1/wm/var/Adam_1', 'librispeech/enc/fwd_rnn_L1/b/var/Adam', 'librispeech/enc/fwd_rnn_L1/b/var/Adam_1', 'librispeech/enc/fwd_rnn_L1/wm/var/Adam', 'librispeech/enc/fwd_rnn_L1/wm/var/Adam_1', 'librispeech/enc/bak_rnn_L2/b/var/Adam', 'librispeech/enc/bak_rnn_L2/b/var/Adam_1', 'librispeech/enc/bak_rnn_L2/wm/var/Adam', 'librispeech/enc/bak_rnn_L2/wm/var/Adam_1', 'librispeech/enc/fwd_rnn_L2/b/var/Adam', 'librispeech/enc/fwd_rnn_L2/b/var/Adam_1', 'librispeech/enc/fwd_rnn_L2/wm/var/Adam', 'librispeech/enc/fwd_rnn_L2/wm/var/Adam_1', 'librispeech/enc/bak_rnn_L3/b/var/Adam', 'librispeech/enc/bak_rnn_L3/b/var/Adam_1', 'librispeech/enc/bak_rnn_L3/wm/var/Adam', 'librispeech/enc/bak_rnn_L3/wm/var/Adam_1', 'librispeech/enc/fwd_rnn_L3/b/var/Adam', 'librispeech/enc/fwd_rnn_L3/b/var/Adam_1', 'librispeech/enc/fwd_rnn_L3/wm/var/Adam', 'librispeech/enc/fwd_rnn_L3/wm/var/Adam_1', 'librispeech/total_samples/var', 'total_nan_gradients/var']
I0420 07:31:03.263567 140375134025472 trainer.py:455] Probably the expected race on global_step: Attempting to use uninitialized value global_step [[{{node _send_global_step_0}}]]
I0420 07:31:04.268563 140375134025472 retry.py:68] Retry: caught exception: _WaitTillInit while running FailedPreconditionError: Attempting to use uninitialized value global_step [[{{node _send_global_step_0}}]] .
Call failed at (most recent call last):
  File "/home/dingzhenyou/anaconda2/lib/python2.7/threading.py", line 774, in __bootstrap
    self.__bootstrap_inner()
  File "/home/dingzhenyou/anaconda2/lib/python2.7/threading.py", line 801, in __bootstrap_inner
    self.run()
  File "/home/dingzhenyou/anaconda2/lib/python2.7/threading.py", line 754, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/trainer.py", line 420, in Start
    self._RunLoop('trainer', self._Loop)
  File "/data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/core/retry.py", line 50, in wrapper
    return func(*args, **kwargs)
  File "/data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/base_runner.py", line 196, in _RunLoop
    loop_func(*loop_args)
Traceback for above exception (most recent call last):
  File "/data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/core/retry.py", line 50, in wrapper
    return func(*args, **kwargs)
  File "/data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/trainer.py", line 453, in _WaitTillInit
    global_step = sess.run(py_utils.GetGlobalStep())
  File "/home/dingzhenyou/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 933, in run
    run_metadata_ptr)
  File "/home/dingzhenyou/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1156, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/dingzhenyou/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1333, in _do_run
    run_metadata)
  File "/home/dingzhenyou/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1353, in _do_call
    raise type(e)(node_def, op, message)
Waiting for 1.53 seconds before retrying.
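The retry.py wrapper above treats the FailedPreconditionError as an expected race: the trainer thread polls global_step before the controller thread has finished initializing variables, so it just waits and retries (1.53 s here, 2.37 s on the next attempt). A rough sketch of that retry-with-growing-backoff pattern, with hypothetical names (not the actual lingvo/core/retry.py code):

```python
import random
import time

def retry_with_backoff(func, max_attempts=5, initial_wait=1.0, growth=1.5):
    """Call func(); on exception, sleep a jittered, growing interval and retry."""
    wait = initial_wait
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception as exc:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the real error
            delay = wait * random.uniform(0.8, 1.2)  # jitter the wait
            print("Retry: caught %r. Waiting %.2f seconds before retrying." % (exc, delay))
            time.sleep(delay)
            wait *= growth  # back off for the next attempt
```

Here the swallowed exception corresponds to the FailedPreconditionError in the log; once the controller logs "Initialize variables done.", the retried sess.run succeeds and training proceeds.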
I0420 07:31:04.272396 140375134025472 trainer.py:455] Probably the expected race on global_step: Attempting to use uninitialized value global_step [[{{node _send_global_step_0}}]]
I0420 07:31:05.105427 140375142418176 trainer.py:315] Initialize variables done.
I0420 07:31:05.805938 140375134025472 retry.py:68] Retry: caught exception: _WaitTillInit while running FailedPreconditionError: Attempting to use uninitialized value global_step [[{{node _send_global_step_0}}]] .
Call failed at (most recent call last):
  File "/home/dingzhenyou/anaconda2/lib/python2.7/threading.py", line 774, in __bootstrap
    self.__bootstrap_inner()
  File "/home/dingzhenyou/anaconda2/lib/python2.7/threading.py", line 801, in __bootstrap_inner
    self.run()
  File "/home/dingzhenyou/anaconda2/lib/python2.7/threading.py", line 754, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/trainer.py", line 420, in Start
    self._RunLoop('trainer', self._Loop)
  File "/data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/core/retry.py", line 50, in wrapper
    return func(*args, **kwargs)
  File "/data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/base_runner.py", line 196, in _RunLoop
    loop_func(*loop_args)
Traceback for above exception (most recent call last):
  File "/data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/core/retry.py", line 50, in wrapper
    return func(*args, **kwargs)
  File "/data/dingzhenyou/lingvo/bazel-bin/lingvo/trainer.runfiles/__main__/lingvo/trainer.py", line 453, in _WaitTillInit
    global_step = sess.run(py_utils.GetGlobalStep())
  File "/home/dingzhenyou/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 933, in run
    run_metadata_ptr)
  File "/home/dingzhenyou/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1156, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/dingzhenyou/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1333, in _do_run
    run_metadata)
  File "/home/dingzhenyou/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1353, in _do_call
    raise type(e)(node_def, op, message)
Waiting for 2.37 seconds before retrying.
I0420 07:31:05.807986 140375134025472 base_runner.py:115] step: 0
I0420 07:31:05.836642 140375142418176 trainer.py:371] Steps/second: 0.000000, Examples/second: 0.000000
I0420 07:31:05.837508 140375142418176 trainer.py:268] Save checkpoint
W0420 07:31:08.370887 140375142418176 meta_graph.py:447] Issue encountered when serializing __batch_norm_update_dict. Type is unsupported, or the types of the items don't match field type in CollectionDef. Note this is a warning and probably safe to ignore. 'dict' object has no attribute 'name'
W0420 07:31:08.371248 140375142418176 meta_graph.py:447] Issue encountered when serializing __model_split_id_stack. Type is unsupported, or the types of the items don't match field type in CollectionDef. Note this is a warning and probably safe to ignore. 'list' object has no attribute 'name'
I0420 07:31:08.566843 140375142418176 trainer.py:270] Save checkpoint done: /data/dingzhenyou/speech_data/librispeech/log/train/ckpt-00000000
I0420 07:31:12.045002 140375142418176 trainer.py:371] Steps/second: 0.000000, Examples/second: 0.000000
I0420 07:31:22.052608 140375142418176 trainer.py:371] Steps/second: 0.000000, Examples/second: 0.000000
2019-04-20 07:31:25.052466: I tensorflow/stream_executor/platform/default/dso_loader.cc:43] Successfully opened dynamic library libcublas.so.10.0
2019-04-20 07:31:25.262119: I ./lingvo/core/ops/input_common.h:68] Create RecordProcessor
2019-04-20 07:31:25.278404: I lingvo/core/ops/input_common.cc:30] Input source weights are empty, fall back to legacy behavior.
2019-04-20 07:31:25.280768: I lingvo/core/ops/record_yielder.cc:288] 0x7fa805806f60 Record yielder start
2019-04-20 07:31:25.280785: I lingvo/core/ops/record_yielder.cc:290] Randomly seed RecordYielder.
2019-04-20 07:31:25.281236: I ./lingvo/core/ops/input_common.h:73] Create batcher
2019-04-20 07:31:25.281280: I lingvo/core/ops/record_yielder.cc:341] Epoch 1 /data/dingzhenyou/speech_data/librispeech/train/train.tfrecords-*
2019-04-20 07:31:29.134968: I tensorflow/stream_executor/platform/default/dso_loader.cc:43] Successfully opened dynamic library libcudnn.so.7
I0420 07:31:32.057389 140375142418176 trainer.py:371] Steps/second: 0.000000, Examples/second: 0.000000
I0420 07:31:40.928052 140375134025472 trainer.py:511] time: 35.119649
I0420 07:31:40.929809 140375134025472 base_runner.py:115] step: 1 fraction_of_correct_next_step_preds:0.021143124 fraction_of_correct_next_step_preds/logits:0.021143124 grad_norm/all:95.786835 grad_scale_all:0.010439848 log_pplx:4.8166261 log_pplx/logits:4.8166261 loss:4.8166261 loss/logits:4.8166261 num_samples_in_batch:128 var_norm/all:704.50604
I0420 07:31:42.068264 140375142418176 trainer.py:371] Steps/second: 0.099892, Examples/second: 12.786187
I0420 07:31:42.069178 140375142418176 trainer.py:275] Write summary @1
2019-04-20 07:31:49.473411: I ./lingvo/core/ops/input_common.h:68] Create RecordProcessor
2019-04-20 07:31:49.476861: I lingvo/core/ops/input_common.cc:30] Input source weights are empty, fall back to legacy behavior.
2019-04-20 07:31:49.477990: I lingvo/core/ops/record_yielder.cc:288] 0x7fa7920a4690 Record yielder start
2019-04-20 07:31:49.478007: I lingvo/core/ops/record_yielder.cc:290] Randomly seed RecordYielder.
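The throughput figures in this stretch of the log are simple ratios: one step took roughly 10 seconds, giving ~0.1 steps/second, and with num_samples_in_batch:128 that yields ~12.8 examples/second, consistent with the logged "Examples/second: 12.786187" (0.099892 × 128). A small sketch of the arithmetic:

```python
def throughput(steps, seconds, batch_size):
    """Steps/s and examples/s over a measurement interval."""
    steps_per_sec = steps / float(seconds)
    return steps_per_sec, steps_per_sec * batch_size

# Approximate values from the log: 1 step in ~10 s, batches of 128 samples.
steps_per_sec, examples_per_sec = throughput(1, 10.0, 128)
```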
2019-04-20 07:31:49.478464: I ./lingvo/core/ops/input_common.h:73] Create batcher
2019-04-20 07:31:49.478492: I lingvo/core/ops/record_yielder.cc:341] Epoch 1 /data/dingzhenyou/speech_data/librispeech/train/train.tfrecords-*
I0420 07:31:50.867839 140375134025472 trainer.py:511] time: 9.934763
I0420 07:31:50.874564 140375134025472 trainer.py:522] step: 2 fraction_of_correct_next_step_preds:0.1764628 fraction_of_correct_next_step_preds/logits:0.1764628 grad_norm/all:21.061581 grad_scale_all:0.047479816 log_pplx:3.8171189 log_pplx/logits:3.8171189 loss:3.8171189 loss/logits:3.8171189 num_samples_in_batch:128 var_norm/all:704.50598
I0420 07:32:01.550363 140375134025472 trainer.py:511] time: 10.675241
I0420 07:32:01.553199 140375134025472 trainer.py:522] step: 3 fraction_of_correct_next_step_preds:0.18393762 fraction_of_correct_next_step_preds/logits:0.18393762 grad_norm/all:26.088888 grad_scale_all:0.038330495 log_pplx:3.4706612 log_pplx/logits:3.4706612 loss:3.4706612 loss/logits:3.4706612 num_samples_in_batch:128 var_norm/all:704.50604
I0420 07:32:13.798919 140375134025472 trainer.py:511] time: 12.244547
I0420 07:32:13.802406 140375134025472 trainer.py:522] step: 4 fraction_of_correct_next_step_preds:0.10271726 fraction_of_correct_next_step_preds/logits:0.10271726 grad_norm/all:22.408554 grad_scale_all:0.044625815 log_pplx:3.4633482 log_pplx/logits:3.4633482 loss:3.4633482 loss/logits:3.4633482 num_samples_in_batch:128 var_norm/all:704.50623
I0420 07:32:24.322017 140375134025472 trainer.py:511] time: 10.517802
I0420 07:32:24.333794 140375134025472 trainer.py:522] step: 5 fraction_of_correct_next_step_preds:0.063654847 fraction_of_correct_next_step_preds/logits:0.063654847 grad_norm/all:13.663131 grad_scale_all:0.073189668 log_pplx:3.3751349 log_pplx/logits:3.3751349 loss:3.3751349 loss/logits:3.3751349 num_samples_in_batch:128 var_norm/all:704.50647
I0420 07:32:33.058339 140375134025472 trainer.py:511] time: 8.714539
I0420 07:32:33.064846 140375134025472 trainer.py:522] step: 6 fraction_of_correct_next_step_preds:0.17381285 fraction_of_correct_next_step_preds/logits:0.17381285 grad_norm/all:8.5088272 grad_scale_all:0.11752501 log_pplx:3.1021802 log_pplx/logits:3.1021802 loss:3.1021802 loss/logits:3.1021802 num_samples_in_batch:128 var_norm/all:704.50677
2019-04-20 07:32:33.077313: I lingvo/core/ops/record_batcher.cc:344] 68 total seconds passed. Total records yielded: 1930. Total records skipped: 1
2019-04-20 07:32:33.077468: I lingvo/core/ops/record_batcher.cc:349] Out-of-range sample: 1720
I0420 07:32:43.407505 140375142418176 trainer.py:284] Write summary done: step 1
I0420 07:32:43.408030 140375142418176 base_runner.py:115] step: 1, steps/sec: 0.10, examples/sec: 12.79
I0420 07:32:43.427865 140375142418176 trainer.py:371] Steps/second: 0.084068, Examples/second: 10.760739
I0420 07:32:43.763868 140375134025472 trainer.py:511] time: 10.698063
I0420 07:32:43.765400 140375134025472 trainer.py:522] step: 7 fraction_of_correct_next_step_preds:0.18137592 fraction_of_correct_next_step_preds/logits:0.18137592 grad_norm/all:11.258716 grad_scale_all:0.088820077 log_pplx:3.1270289 log_pplx/logits:3.1270289 loss:3.1270289 loss/logits:3.1270289 num_samples_in_batch:128 var_norm/all:704.50696
I0420 07:32:51.922163 140375134025472 trainer.py:511] time: 8.156427
I0420 07:32:51.923819 140375134025472 trainer.py:522] step: 8 fraction_of_correct_next_step_preds:0.1022212 fraction_of_correct_next_step_preds/logits:0.1022212 grad_norm/all:10.946689 grad_scale_all:0.091351829 log_pplx:3.1191299 log_pplx/logits:3.1191299 loss:3.1191299 loss/logits:3.1191299 num_samples_in_batch:128 var_norm/all:704.5072
I0420 07:32:53.429886 140375142418176 trainer.py:371] Steps/second: 0.098313, Examples/second: 12.584109
I0420 07:33:01.255381 140375134025472 trainer.py:511] time: 9.331202
I0420 07:33:01.257316 140375134025472 trainer.py:522] step: 9 fraction_of_correct_next_step_preds:0.090148062 fraction_of_correct_next_step_preds/logits:0.090148062 grad_norm/all:5.2309275 grad_scale_all:0.19117069 log_pplx:3.0450623 log_pplx/logits:3.0450623 loss:3.0450623 loss/logits:3.0450623 num_samples_in_batch:128 var_norm/all:704.50745
I0420 07:33:03.441674 140375142418176 trainer.py:371] Steps/second: 0.098485, Examples/second: 12.606123
I0420 07:33:10.237653 140375134025472 trainer.py:511] time: 8.979791
I0420 07:33:10.239054 140375134025472 trainer.py:522] step: 10 fraction_of_correct_next_step_preds:0.18519549 fraction_of_correct_next_step_preds/logits:0.18519549 grad_norm/all:3.8206174 grad_scale_all:0.26173779 log_pplx:2.9197507 log_pplx/logits:2.9197507 loss:2.9197507 loss/logits:2.9197507 num_samples_in_batch:128 var_norm/all:704.50757
I0420 07:33:13.448293 140375142418176 trainer.py:371] Steps/second: 0.098628, Examples/second: 12.624399
I0420 07:33:18.439397 140375134025472 trainer.py:511] time: 8.200120
I0420 07:33:18.440263 140375134025472 trainer.py:522] step: 11 fraction_of_correct_next_step_preds:0.18182154 fraction_of_correct_next_step_preds/logits:0.18182154 grad_norm/all:7.2013311 grad_scale_all:0.13886322 log_pplx:2.9674447 log_pplx/logits:2.9674447 loss:2.9674447 loss/logits:2.9674447 num_samples_in_batch:128 var_norm/all:704.50751
I0420 07:33:23.461621 140375142418176 trainer.py:371] Steps/second: 0.098740, Examples/second: 12.638698
I0420 07:33:25.952996 140375134025472 trainer.py:511] time: 7.512431
I0420 07:33:25.954210 140375134025472 trainer.py:522] step: 12 fraction_of_correct_next_step_preds:0.1485029 fraction_of_correct_next_step_preds/logits:0.1485029 grad_norm/all:3.4744987 grad_scale_all:0.28781131 log_pplx:2.9354002 log_pplx/logits:2.9354002 loss:2.9354002 loss/logits:2.9354002 num_samples_in_batch:128 var_norm/all:704.50739
I0420 07:33:32.858994 140375134025472 trainer.py:511] time: 6.904439
I0420 07:33:32.860662 140375134025472 trainer.py:522] step: 13 fraction_of_correct_next_step_preds:0.16566102
fraction_of_correct_next_step_preds/logits:0.16566102 grad_norm/all:2.9549899 grad_scale_all:0.33841065 log_pplx:2.9054689 log_pplx/logits:2.9054689 loss:2.9054689 loss/logits:2.9054689 num_samples_in_batch:128 var_norm/all:704.50702 I0420 07:33:33.468287 140375142418176 trainer.py:371] Steps/second: 0.107074, Examples/second: 13.705518 I0420 07:33:38.838541 140375134025472 trainer.py:511] time: 5.977578 I0420 07:33:38.839858 140375134025472 trainer.py:522] step: 14 fraction_of_correct_next_step_preds:0.18039839 fraction_of_correct_next_step_preds/logits:0.18039839 grad_norm/all:2.9032726 grad_scale_all:0.34443888 log_pplx:2.8980184 log_pplx/logits:2.8980184 loss:2.8980184 loss/logits:2.8980184 num_samples_in_batch:128 var_norm/all:704.50647 I0420 07:33:43.207788 140375134025472 trainer.py:511] time: 4.367637 I0420 07:33:43.209603 140375134025472 trainer.py:522] step: 15 fraction_of_correct_next_step_preds:0.17200515 fraction_of_correct_next_step_preds/logits:0.17200515 grad_norm/all:2.4388075 grad_scale_all:0.41003647 log_pplx:2.9262927 log_pplx/logits:2.9262927 loss:2.9262927 loss/logits:2.9262927 num_samples_in_batch:256 var_norm/all:704.50562 I0420 07:33:43.493648 140375142418176 trainer.py:371] Steps/second: 0.114124, Examples/second: 15.581691 I0420 07:33:52.574059 140375134025472 trainer.py:511] time: 9.364165 I0420 07:33:52.575781 140375134025472 trainer.py:522] step: 16 fraction_of_correct_next_step_preds:0.18149137 fraction_of_correct_next_step_preds/logits:0.18149137 grad_norm/all:2.0045273 grad_scale_all:0.49887073 log_pplx:2.8566251 log_pplx/logits:2.8566251 loss:2.8566251 loss/logits:2.8566251 num_samples_in_batch:128 var_norm/all:704.50458 I0420 07:33:53.487867 140375142418176 trainer.py:371] Steps/second: 0.113130, Examples/second: 15.385644 I0420 07:34:01.174437 140375134025472 trainer.py:511] time: 8.598373 I0420 07:34:01.176188 140375134025472 trainer.py:522] step: 17 fraction_of_correct_next_step_preds:0.1833981 
fraction_of_correct_next_step_preds/logits:0.1833981 grad_norm/all:2.2261834 grad_scale_all:0.44919929 log_pplx:2.8556404 log_pplx/logits:2.8556404 loss:2.8556404 loss/logits:2.8556404 num_samples_in_batch:128 var_norm/all:704.50323 I0420 07:34:03.499483 140375142418176 trainer.py:371] Steps/second: 0.112254, Examples/second: 15.213734 I0420 07:34:09.164508 140375134025472 trainer.py:511] time: 7.988012 I0420 07:34:09.165910 140375134025472 trainer.py:522] step: 18 fraction_of_correct_next_step_preds:0.21112825 fraction_of_correct_next_step_preds/logits:0.21112825 grad_norm/all:1.8983856 grad_scale_all:0.52676338 log_pplx:2.8433697 log_pplx/logits:2.8433697 loss:2.8433697 loss/logits:2.8433697 num_samples_in_batch:128 var_norm/all:704.50177 I0420 07:34:13.508529 140375142418176 trainer.py:371] Steps/second: 0.111489, Examples/second: 15.063375 I0420 07:34:17.413304 140375134025472 trainer.py:511] time: 8.247067 I0420 07:34:17.414127 140375134025472 trainer.py:522] step: 19 fraction_of_correct_next_step_preds:0.19221394 fraction_of_correct_next_step_preds/logits:0.19221394 grad_norm/all:1.5632075 grad_scale_all:0.63971031 log_pplx:2.8191531 log_pplx/logits:2.8191531 loss:2.8191531 loss/logits:2.8191531 num_samples_in_batch:128 var_norm/all:704.5 I0420 07:34:23.518748 140375142418176 trainer.py:371] Steps/second: 0.110812, Examples/second: 14.930481 I0420 07:34:25.138550 140375134025472 trainer.py:511] time: 7.724124 I0420 07:34:25.139547 140375134025472 trainer.py:522] step: 20 fraction_of_correct_next_step_preds:0.18065803 fraction_of_correct_next_step_preds/logits:0.18065803 grad_norm/all:1.513368 grad_scale_all:0.66077781 log_pplx:2.8195481 log_pplx/logits:2.8195481 loss:2.8195481 loss/logits:2.8195481 num_samples_in_batch:128 var_norm/all:704.4978 I0420 07:34:32.060746 140375134025472 trainer.py:511] time: 6.920765 I0420 07:34:32.062063 140375134025472 trainer.py:522] step: 21 fraction_of_correct_next_step_preds:0.22351789 
fraction_of_correct_next_step_preds/logits:0.22351789 grad_norm/all:1.794582 grad_scale_all:0.5572328 log_pplx:2.8125205 log_pplx/logits:2.8125205 loss:2.8125205 loss/logits:2.8125205 num_samples_in_batch:128 var_norm/all:704.4953 I0420 07:34:33.528932 140375142418176 trainer.py:371] Steps/second: 0.115721, Examples/second: 15.517583 I0420 07:34:38.001070 140375134025472 trainer.py:511] time: 5.938352 I0420 07:34:38.002382 140375134025472 trainer.py:522] step: 22 fraction_of_correct_next_step_preds:0.18407217 fraction_of_correct_next_step_preds/logits:0.18407217 grad_norm/all:1.6261265 grad_scale_all:0.61495829 log_pplx:2.8218467 log_pplx/logits:2.8218467 loss:2.8218467 loss/logits:2.8218467 num_samples_in_batch:128 var_norm/all:704.49268 2019-04-20 07:34:38.010061: I lingvo/core/ops/record_batcher.cc:344] 193 total seconds passed. Total records yielded: 3947. Total records skipped: 7 2019-04-20 07:34:38.010244: I lingvo/core/ops/record_batcher.cc:349] Out-of-range sample: 1711 2019-04-20 07:34:38.010277: I lingvo/core/ops/record_batcher.cc:349] Out-of-range sample: 1726 2019-04-20 07:34:38.010300: I lingvo/core/ops/record_batcher.cc:349] Out-of-range sample: 1717 2019-04-20 07:34:38.010322: I lingvo/core/ops/record_batcher.cc:349] Out-of-range sample: 1717 2019-04-20 07:34:38.010343: I lingvo/core/ops/record_batcher.cc:349] Out-of-range sample: 1735 2019-04-20 07:34:38.010362: I lingvo/core/ops/record_batcher.cc:349] Out-of-range sample: 2431 I0420 07:34:43.538116 140375142418176 trainer.py:371] Steps/second: 0.114894, Examples/second: 15.374913 I0420 07:34:47.163630 140375134025472 trainer.py:511] time: 9.160850 I0420 07:34:47.164750 140375134025472 trainer.py:522] step: 23 fraction_of_correct_next_step_preds:0.19543481 fraction_of_correct_next_step_preds/logits:0.19543481 grad_norm/all:0.95440471 grad_scale_all:1 log_pplx:2.7736437 log_pplx/logits:2.7736437 loss:2.7736437 loss/logits:2.7736437 num_samples_in_batch:128 var_norm/all:704.48987 I0420 07:34:53.550586 
140375142418176 trainer.py:371] Steps/second: 0.114148, Examples/second: 15.246183 I0420 07:34:55.746999 140375134025472 trainer.py:511] time: 8.582018 I0420 07:34:55.748157 140375134025472 trainer.py:522] step: 24 fraction_of_correct_next_step_preds:0.21631305 fraction_of_correct_next_step_preds/logits:0.21631305 grad_norm/all:1.2232789 grad_scale_all:0.81747508 log_pplx:2.7470989 log_pplx/logits:2.7470989 loss:2.7470989 loss/logits:2.7470989 num_samples_in_batch:128 var_norm/all:704.48651 I0420 07:35:03.559312 140375142418176 trainer.py:371] Steps/second: 0.113474, Examples/second: 15.129889 I0420 07:35:04.091337 140375134025472 trainer.py:511] time: 8.342919 I0420 07:35:04.092581 140375134025472 trainer.py:522] step: 25 fraction_of_correct_next_step_preds:0.21755469 fraction_of_correct_next_step_preds/logits:0.21755469 grad_norm/all:1.304142 grad_scale_all:0.76678765 log_pplx:2.7486336 log_pplx/logits:2.7486336 loss:2.7486336 loss/logits:2.7486336 num_samples_in_batch:128 var_norm/all:704.48285 I0420 07:35:12.140954 140375134025472 trainer.py:511] time: 8.048148 I0420 07:35:12.141916 140375134025472 trainer.py:522] step: 26 fraction_of_correct_next_step_preds:0.23160981 fraction_of_correct_next_step_preds/logits:0.23160981 grad_norm/all:0.91558677 grad_scale_all:1 log_pplx:2.7053049 log_pplx/logits:2.7053049 loss:2.7053049 loss/logits:2.7053049 num_samples_in_batch:128 var_norm/all:704.47894 I0420 07:35:13.568135 140375142418176 trainer.py:371] Steps/second: 0.117376, Examples/second: 15.601947 I0420 07:35:19.786127 140375134025472 trainer.py:511] time: 7.643813 I0420 07:35:19.787606 140375134025472 trainer.py:522] step: 27 fraction_of_correct_next_step_preds:0.25725779 fraction_of_correct_next_step_preds/logits:0.25725779 grad_norm/all:1.5240614 grad_scale_all:0.65614152 log_pplx:2.7172825 log_pplx/logits:2.7172825 loss:2.7172825 loss/logits:2.7172825 num_samples_in_batch:128 var_norm/all:704.47467 I0420 07:35:23.578629 140375142418176 trainer.py:371] 
Steps/second: 0.116620, Examples/second: 15.480218 I0420 07:35:27.002883 140375134025472 trainer.py:511] time: 7.215031 I0420 07:35:27.004050 140375134025472 trainer.py:522] step: 28 fraction_of_correct_next_step_preds:0.19790526 fraction_of_correct_next_step_preds/logits:0.19790526 grad_norm/all:2.5601723 grad_scale_all:0.39059871 log_pplx:2.7200246 log_pplx/logits:2.7200246 loss:2.7200246 loss/logits:2.7200246 num_samples_in_batch:128 var_norm/all:704.47046 I0420 07:35:31.298074 140375134025472 trainer.py:511] time: 4.293699 I0420 07:35:31.298980 140375134025472 trainer.py:522] step: 29 fraction_of_correct_next_step_preds:0.23508346 fraction_of_correct_next_step_preds/logits:0.23508346 grad_norm/all:1.768149 grad_scale_all:0.5655632 log_pplx:2.7536347 log_pplx/logits:2.7536347 loss:2.7536347 loss/logits:2.7536347 num_samples_in_batch:256 var_norm/all:704.46643 I0420 07:35:33.588792 140375142418176 trainer.py:371] Steps/second: 0.120067, Examples/second: 16.428504 I0420 07:35:37.132637 140375134025472 trainer.py:511] time: 5.833392 I0420 07:35:37.133708 140375134025472 trainer.py:522] step: 30 fraction_of_correct_next_step_preds:0.27157485 fraction_of_correct_next_step_preds/logits:0.27157485 grad_norm/all:1.1383265 grad_scale_all:0.87848258 log_pplx:2.6550455 log_pplx/logits:2.6550455 loss:2.6550455 loss/logits:2.6550455 num_samples_in_batch:128 var_norm/all:704.46259 I0420 07:35:43.598779 140375142418176 trainer.py:371] Steps/second: 0.119265, Examples/second: 16.283604 I0420 07:35:46.383224 140375134025472 trainer.py:511] time: 9.249290 I0420 07:35:46.384552 140375134025472 trainer.py:522] step: 31 fraction_of_correct_next_step_preds:0.26169586 fraction_of_correct_next_step_preds/logits:0.26169586 grad_norm/all:2.2130964 grad_scale_all:0.4518556 log_pplx:2.6702738 log_pplx/logits:2.6702738 loss:2.6702738 loss/logits:2.6702738 num_samples_in_batch:128 var_norm/all:704.45856 I0420 07:35:53.610418 140375142418176 trainer.py:371] Steps/second: 0.118523, 
Examples/second: 16.149695 I0420 07:35:55.118340 140375134025472 trainer.py:511] time: 8.733485 I0420 07:35:55.119571 140375134025472 trainer.py:522] step: 32 fraction_of_correct_next_step_preds:0.24295399 fraction_of_correct_next_step_preds/logits:0.24295399 grad_norm/all:2.4151533 grad_scale_all:0.4140524 log_pplx:2.6709504 log_pplx/logits:2.6709504 loss:2.6709504 loss/logits:2.6709504 num_samples_in_batch:128 var_norm/all:704.45471 I0420 07:36:03.338177 140375134025472 trainer.py:511] time: 8.218271 I0420 07:36:03.339405 140375134025472 trainer.py:522] step: 33 fraction_of_correct_next_step_preds:0.27863407 fraction_of_correct_next_step_preds/logits:0.27863407 grad_norm/all:1.0046402 grad_scale_all:0.99538124 log_pplx:2.6023529 log_pplx/logits:2.6023529 loss:2.6023529 loss/logits:2.6023529 num_samples_in_batch:128 var_norm/all:704.45111 I0420 07:36:03.618396 140375142418176 trainer.py:371] Steps/second: 0.121520, Examples/second: 16.497211 I0420 07:36:11.618510 140375134025472 trainer.py:511] time: 8.278831 I0420 07:36:11.620147 140375134025472 trainer.py:522] step: 34 fraction_of_correct_next_step_preds:0.2705746 fraction_of_correct_next_step_preds/logits:0.2705746 grad_norm/all:3.2093797 grad_scale_all:0.31158671 log_pplx:2.6450973 log_pplx/logits:2.6450973 loss:2.6450973 loss/logits:2.6450973 num_samples_in_batch:128 var_norm/all:704.44714 I0420 07:36:13.628343 140375142418176 trainer.py:371] Steps/second: 0.120751, Examples/second: 16.365323 I0420 07:36:19.194186 140375134025472 trainer.py:511] time: 7.573756 I0420 07:36:19.195812 140375134025472 trainer.py:522] step: 35 fraction_of_correct_next_step_preds:0.26238686 fraction_of_correct_next_step_preds/logits:0.26238686 grad_norm/all:1.9304246 grad_scale_all:0.51802075 log_pplx:2.6112156 log_pplx/logits:2.6112156 loss:2.6112156 loss/logits:2.6112156 num_samples_in_batch:128 var_norm/all:704.44342 I0420 07:36:23.638386 140375142418176 trainer.py:371] Steps/second: 0.120035, Examples/second: 16.242484 I0420 
07:36:26.169241 140375134025472 trainer.py:511] time: 6.973244 I0420 07:36:26.170348 140375134025472 trainer.py:522] step: 36 fraction_of_correct_next_step_preds:0.28620055 fraction_of_correct_next_step_preds/logits:0.28620055 grad_norm/all:1.9523199 grad_scale_all:0.51221114 log_pplx:2.6116624 log_pplx/logits:2.6116624 loss:2.6116624 loss/logits:2.6116624 num_samples_in_batch:128 var_norm/all:704.43976 I0420 07:36:33.649904 140375142418176 trainer.py:371] Steps/second: 0.119366, Examples/second: 16.127728 I0420 07:36:35.362273 140375134025472 trainer.py:511] time: 9.191655 I0420 07:36:35.363902 140375134025472 trainer.py:522] step: 37 fraction_of_correct_next_step_preds:0.26480764 fraction_of_correct_next_step_preds/logits:0.26480764 grad_norm/all:2.1899633 grad_scale_all:0.45662865 log_pplx:2.5841577 log_pplx/logits:2.5841577 loss:2.5841577 loss/logits:2.5841577 num_samples_in_batch:128 var_norm/all:704.43622 I0420 07:36:41.210024 140375134025472 trainer.py:511] time: 5.845796 I0420 07:36:41.211704 140375134025472 trainer.py:522] step: 38 fraction_of_correct_next_step_preds:0.27340057 fraction_of_correct_next_step_preds/logits:0.27340057 grad_norm/all:1.3420987 grad_scale_all:0.74510169 log_pplx:2.5779867 log_pplx/logits:2.5779867 loss:2.5779867 loss/logits:2.5779867 num_samples_in_batch:128 var_norm/all:704.43274 I0420 07:36:43.658245 140375142418176 trainer.py:371] Steps/second: 0.121951, Examples/second: 16.431276 I0420 07:36:49.907310 140375134025472 trainer.py:511] time: 8.695390 I0420 07:36:49.908164 140375134025472 trainer.py:522] step: 39 fraction_of_correct_next_step_preds:0.25044024 fraction_of_correct_next_step_preds/logits:0.25044024 grad_norm/all:2.3219526 grad_scale_all:0.43067202 log_pplx:2.5622635 log_pplx/logits:2.5622635 loss:2.5622635 loss/logits:2.5622635 num_samples_in_batch:128 var_norm/all:704.42908 I0420 07:36:53.669751 140375142418176 trainer.py:371] Steps/second: 0.121264, Examples/second: 16.317783 I0420 07:36:58.235479 140375134025472 
trainer.py:511] time: 8.326842 I0420 07:36:58.236478 140375134025472 trainer.py:522] step: 40 fraction_of_correct_next_step_preds:0.27789244 fraction_of_correct_next_step_preds/logits:0.27789244 grad_norm/all:1.7515401 grad_scale_all:0.57092613 log_pplx:2.5496616 log_pplx/logits:2.5496616 loss:2.5496616 loss/logits:2.5496616 num_samples_in_batch:128 var_norm/all:704.42548 I0420 07:37:03.679372 140375142418176 trainer.py:371] Steps/second: 0.120619, Examples/second: 16.211234 I0420 07:37:06.227189 140375134025472 trainer.py:511] time: 7.990343 I0420 07:37:06.228425 140375134025472 trainer.py:522] step: 41 fraction_of_correct_next_step_preds:0.27679548 fraction_of_correct_next_step_preds/logits:0.27679548 grad_norm/all:2.7463503 grad_scale_all:0.36411962 log_pplx:2.5383792 log_pplx/logits:2.5383792 loss:2.5383792 loss/logits:2.5383792 num_samples_in_batch:128 var_norm/all:704.42194 I0420 07:37:13.687947 140375142418176 trainer.py:371] Steps/second: 0.120013, Examples/second: 16.110977 I0420 07:37:13.896599 140375134025472 trainer.py:511] time: 7.667946 I0420 07:37:13.897425 140375134025472 trainer.py:522] step: 42 fraction_of_correct_next_step_preds:0.27243295 fraction_of_correct_next_step_preds/logits:0.27243295 grad_norm/all:1.3269877 grad_scale_all:0.75358647 log_pplx:2.5244553 log_pplx/logits:2.5244553 loss:2.5244553 loss/logits:2.5244553 num_samples_in_batch:128 var_norm/all:704.4184 I0420 07:37:20.823199 140375134025472 trainer.py:511] time: 6.925393 I0420 07:37:20.824395 140375134025472 trainer.py:522] step: 43 fraction_of_correct_next_step_preds:0.25611943 fraction_of_correct_next_step_preds/logits:0.25611943 grad_norm/all:2.4771316 grad_scale_all:0.40369272 log_pplx:2.5433397 log_pplx/logits:2.5433397 loss:2.5433397 loss/logits:2.5433397 num_samples_in_batch:128 var_norm/all:704.41473 I0420 07:37:23.694926 140375142418176 trainer.py:371] Steps/second: 0.122285, Examples/second: 16.380501 I0420 07:37:29.416392 140375134025472 trainer.py:511] time: 8.591591 
I0420 07:37:29.417622 140375134025472 trainer.py:522] step: 44 fraction_of_correct_next_step_preds:0.2867482 fraction_of_correct_next_step_preds/logits:0.2867482 grad_norm/all:2.2287934 grad_scale_all:0.44867328 log_pplx:2.5145919 log_pplx/logits:2.5145919 loss:2.5145919 loss/logits:2.5145919 num_samples_in_batch:128 var_norm/all:704.41107 I0420 07:37:33.707510 140375142418176 trainer.py:371] Steps/second: 0.121665, Examples/second: 16.280931 I0420 07:37:38.633066 140375134025472 trainer.py:511] time: 9.191278 I0420 07:37:38.634128 140375134025472 trainer.py:522] step: 45 fraction_of_correct_next_step_preds:0.2967239 fraction_of_correct_next_step_preds/logits:0.2967239 grad_norm/all:1.9373232 grad_scale_all:0.5161761 log_pplx:2.4962223 log_pplx/logits:2.4962223 loss:2.4962223 loss/logits:2.4962223 num_samples_in_batch:128 var_norm/all:704.40753 I0420 07:37:43.716531 140375142418176 trainer.py:371] Steps/second: 0.121079, Examples/second: 16.186881 I0420 07:37:44.558804 140375134025472 trainer.py:511] time: 5.924469 I0420 07:37:44.559990 140375134025472 trainer.py:522] step: 46 fraction_of_correct_next_step_preds:0.27961257 fraction_of_correct_next_step_preds/logits:0.27961257 grad_norm/all:1.3620263 grad_scale_all:0.73420018 log_pplx:2.4990051 log_pplx/logits:2.4990051 loss:2.4990051 loss/logits:2.4990051 num_samples_in_batch:128 var_norm/all:704.40393 I0420 07:37:52.829343 140375134025472 trainer.py:511] time: 8.269159 I0420 07:37:52.830336 140375134025472 trainer.py:522] step: 47 fraction_of_correct_next_step_preds:0.29120943 fraction_of_correct_next_step_preds/logits:0.29120943 grad_norm/all:1.6608243 grad_scale_all:0.60211062 log_pplx:2.4610975 log_pplx/logits:2.4610975 loss:2.4610975 loss/logits:2.4610975 num_samples_in_batch:128 var_norm/all:704.39996 I0420 07:37:53.725608 140375142418176 trainer.py:371] Steps/second: 0.123144, Examples/second: 16.768488 I0420 07:37:57.058876 140375134025472 trainer.py:511] time: 4.228302 I0420 07:37:57.060250 140375134025472 
trainer.py:522] step: 48 fraction_of_correct_next_step_preds:0.28546801 fraction_of_correct_next_step_preds/logits:0.28546801 grad_norm/all:1.3926181 grad_scale_all:0.718072 log_pplx:2.5202353 log_pplx/logits:2.5202353 loss:2.5202353 loss/logits:2.5202353 num_samples_in_batch:256 var_norm/all:704.396 I0420 07:38:03.736061 140375142418176 trainer.py:371] Steps/second: 0.122549, Examples/second: 16.666722 I0420 07:38:04.924762 140375134025472 trainer.py:511] time: 7.864333 I0420 07:38:04.925765 140375134025472 trainer.py:522] step: 49 fraction_of_correct_next_step_preds:0.30097163 fraction_of_correct_next_step_preds/logits:0.30097163 grad_norm/all:0.98358774 grad_scale_all:1 log_pplx:2.4348617 log_pplx/logits:2.4348617 loss:2.4348617 loss/logits:2.4348617 num_samples_in_batch:128 var_norm/all:704.39172 I0420 07:38:11.901254 140375134025472 trainer.py:511] time: 6.975202 I0420 07:38:11.902563 140375134025472 trainer.py:522] step: 50 fraction_of_correct_next_step_preds:0.29285777 fraction_of_correct_next_step_preds/logits:0.29285777 grad_norm/all:1.3046882 grad_scale_all:0.76646662 log_pplx:2.4325299 log_pplx/logits:2.4325299 loss:2.4325299 loss/logits:2.4325299 num_samples_in_batch:128 var_norm/all:704.38702 I0420 07:38:13.745528 140375142418176 trainer.py:371] Steps/second: 0.124475, Examples/second: 16.888723 I0420 07:38:19.548118 140375134025472 trainer.py:511] time: 7.645136 I0420 07:38:19.549738 140375134025472 trainer.py:522] step: 51 fraction_of_correct_next_step_preds:0.30415326 fraction_of_correct_next_step_preds/logits:0.30415326 grad_norm/all:1.2448066 grad_scale_all:0.80333763 log_pplx:2.4177649 log_pplx/logits:2.4177649 loss:2.4177649 loss/logits:2.4177649 num_samples_in_batch:128 var_norm/all:704.38208 I0420 07:38:23.756176 140375142418176 trainer.py:371] Steps/second: 0.123877, Examples/second: 16.788972 I0420 07:38:28.114622 140375134025472 trainer.py:511] time: 8.564568 I0420 07:38:28.115804 140375134025472 trainer.py:522] step: 52 
fraction_of_correct_next_step_preds:0.31214514 fraction_of_correct_next_step_preds/logits:0.31214514 grad_norm/all:1.0816773 grad_scale_all:0.92449015 log_pplx:2.4007387 log_pplx/logits:2.4007387 loss:2.4007387 loss/logits:2.4007387 num_samples_in_batch:128 var_norm/all:704.37695 I0420 07:38:33.765763 140375142418176 trainer.py:371] Steps/second: 0.123308, Examples/second: 16.694001 I0420 07:38:37.219835 140375134025472 trainer.py:511] time: 9.103512 I0420 07:38:37.221427 140375134025472 trainer.py:522] step: 53 fraction_of_correct_next_step_preds:0.31481665 fraction_of_correct_next_step_preds/logits:0.31481665 grad_norm/all:0.74732727 grad_scale_all:1 log_pplx:2.3677359 log_pplx/logits:2.3677359 loss:2.3677359 loss/logits:2.3677359 num_samples_in_batch:128 var_norm/all:704.3714 I0420 07:38:43.775006 140375142418176 trainer.py:371] Steps/second: 0.122765, Examples/second: 16.603444 I0420 07:38:45.475708 140375134025472 trainer.py:511] time: 8.253978 I0420 07:38:45.476913 140375134025472 trainer.py:522] step: 54 fraction_of_correct_next_step_preds:0.30801314 fraction_of_correct_next_step_preds/logits:0.30801314 grad_norm/all:1.1673788 grad_scale_all:0.85662001 log_pplx:2.3846834 log_pplx/logits:2.3846834 loss:2.3846834 loss/logits:2.3846834 num_samples_in_batch:128 var_norm/all:704.36542 2019-04-20 07:38:45.480067: I lingvo/core/ops/record_batcher.cc:344] 440 total seconds passed. Total records yielded: 8285. 
Total records skipped: 8 2019-04-20 07:38:45.481832: I lingvo/core/ops/record_batcher.cc:349] Out-of-range sample: 1711 I0420 07:38:51.320868 140375134025472 trainer.py:511] time: 5.843708 I0420 07:38:51.322346 140375134025472 trainer.py:522] step: 55 fraction_of_correct_next_step_preds:0.30724242 fraction_of_correct_next_step_preds/logits:0.30724242 grad_norm/all:1.4477671 grad_scale_all:0.69071883 log_pplx:2.4033675 log_pplx/logits:2.4033675 loss:2.4033675 loss/logits:2.4033675 num_samples_in_batch:128 var_norm/all:704.35919 I0420 07:38:53.785290 140375142418176 trainer.py:371] Steps/second: 0.124511, Examples/second: 16.806726 I0420 07:38:59.143594 140375134025472 trainer.py:511] time: 7.820905 I0420 07:38:59.145212 140375134025472 trainer.py:522] step: 56 fraction_of_correct_next_step_preds:0.32211804 fraction_of_correct_next_step_preds/logits:0.32211804 grad_norm/all:0.99645722 grad_scale_all:1 log_pplx:2.3391604 log_pplx/logits:2.3391604 loss:2.3391604 loss/logits:2.3391604 num_samples_in_batch:128 var_norm/all:704.35303 I0420 07:39:03.794799 140375142418176 trainer.py:371] Steps/second: 0.123966, Examples/second: 16.717675 I0420 07:39:05.955985 140375134025472 trainer.py:511] time: 6.810547 I0420 07:39:05.957156 140375134025472 trainer.py:522] step: 57 fraction_of_correct_next_step_preds:0.31032616 fraction_of_correct_next_step_preds/logits:0.31032616 grad_norm/all:1.3071382 grad_scale_all:0.76503003 log_pplx:2.3732302 log_pplx/logits:2.3732302 loss:2.3732302 loss/logits:2.3732302 num_samples_in_batch:128 var_norm/all:704.34644 I0420 07:39:13.333959 140375134025472 trainer.py:511] time: 7.376561 I0420 07:39:13.335150 140375134025472 trainer.py:522] step: 58 fraction_of_correct_next_step_preds:0.31746674 fraction_of_correct_next_step_preds/logits:0.31746674 grad_norm/all:0.7144863 grad_scale_all:1 log_pplx:2.3458943 log_pplx/logits:2.3458943 loss:2.3458943 loss/logits:2.3458943 num_samples_in_batch:128 var_norm/all:704.33972 I0420 07:39:13.804999 
140375142418176 trainer.py:371] Steps/second: 0.125610, Examples/second: 16.909668 I0420 07:39:21.758083 140375134025472 trainer.py:511] time: 8.422377 I0420 07:39:21.759244 140375134025472 trainer.py:522] step: 59 fraction_of_correct_next_step_preds:0.32590339 fraction_of_correct_next_step_preds/logits:0.32590339 grad_norm/all:0.80505741 grad_scale_all:1 log_pplx:2.3301501 log_pplx/logits:2.3301501 loss:2.3301501 loss/logits:2.3301501 num_samples_in_batch:128 var_norm/all:704.3327 I0420 07:39:23.814544 140375142418176 trainer.py:371] Steps/second: 0.125064, Examples/second: 16.822212 I0420 07:39:30.864382 140375134025472 trainer.py:511] time: 9.104589 I0420 07:39:30.866740 140375134025472 trainer.py:522] step: 60 fraction_of_correct_next_step_preds:0.3370963 fraction_of_correct_next_step_preds/logits:0.3370963 grad_norm/all:0.67958254 grad_scale_all:1 log_pplx:2.306602 log_pplx/logits:2.306602 loss:2.306602 loss/logits:2.306602 num_samples_in_batch:128 var_norm/all:704.32526 I0420 07:39:33.827307 140375142418176 trainer.py:371] Steps/second: 0.124541, Examples/second: 16.738280 I0420 07:39:39.266830 140375134025472 trainer.py:511] time: 8.399826 I0420 07:39:39.268023 140375134025472 trainer.py:522] step: 61 fraction_of_correct_next_step_preds:0.33446094 fraction_of_correct_next_step_preds/logits:0.33446094 grad_norm/all:0.94569188 grad_scale_all:1 log_pplx:2.3149652 log_pplx/logits:2.3149652 loss:2.3149652 loss/logits:2.3149652 num_samples_in_batch:128 var_norm/all:704.31757 I0420 07:39:43.848258 140375142418176 trainer.py:371] Steps/second: 0.124036, Examples/second: 16.657488 I0420 07:39:47.206741 140375134025472 trainer.py:511] time: 7.938489 I0420 07:39:47.207648 140375134025472 trainer.py:522] step: 62 fraction_of_correct_next_step_preds:0.33041391 fraction_of_correct_next_step_preds/logits:0.33041391 grad_norm/all:0.97894734 grad_scale_all:1 log_pplx:2.3103309 log_pplx/logits:2.3103309 loss:2.3103309 loss/logits:2.3103309 num_samples_in_batch:128 
var_norm/all:704.30957
I0420 07:39:52.905343 140375134025472 trainer.py:511] time: 5.697142
I0420 07:39:52.906486 140375134025472 trainer.py:522] step: 63 fraction_of_correct_next_step_preds:0.33234218 fraction_of_correct_next_step_preds/logits:0.33234218 grad_norm/all:1.9755416 grad_scale_all:0.5061903 log_pplx:2.3209085 log_pplx/logits:2.3209085 loss:2.3209085 loss/logits:2.3209085 num_samples_in_batch:128 var_norm/all:704.30127
I0420 07:39:53.844923 140375142418176 trainer.py:371] Steps/second: 0.125551, Examples/second: 16.835810
I0420 07:39:59.689280 140375134025472 trainer.py:511] time: 6.782313
I0420 07:39:59.690049 140375134025472 trainer.py:522] step: 64 fraction_of_correct_next_step_preds:0.33565956 fraction_of_correct_next_step_preds/logits:0.33565956 grad_norm/all:0.87551606 grad_scale_all:1 log_pplx:2.3032544 log_pplx/logits:2.3032544 loss:2.3032544 loss/logits:2.3032544 num_samples_in_batch:128 var_norm/all:704.2934
I0420 07:40:03.856801 140375142418176 trainer.py:371] Steps/second: 0.125049, Examples/second: 17.006665
I0420 07:40:04.021229 140375134025472 trainer.py:511] time: 4.330891
I0420 07:40:04.022106 140375134025472 trainer.py:522] step: 65 fraction_of_correct_next_step_preds:0.31422696 fraction_of_correct_next_step_preds/logits:0.31422696 grad_norm/all:1.7568246 grad_scale_all:0.5692088 log_pplx:2.3677559 log_pplx/logits:2.3677559 loss:2.3677559 loss/logits:2.3677559 num_samples_in_batch:256 var_norm/all:704.28528
I0420 07:40:11.636953 140375134025472 trainer.py:511] time: 7.614421
I0420 07:40:11.638170 140375134025472 trainer.py:522] step: 66 fraction_of_correct_next_step_preds:0.33445784 fraction_of_correct_next_step_preds/logits:0.33445784 grad_norm/all:1.4875379 grad_scale_all:0.67225182 log_pplx:2.3114297 log_pplx/logits:2.3114297 loss:2.3114297 loss/logits:2.3114297 num_samples_in_batch:128 var_norm/all:704.27734
I0420 07:40:13.866642 140375142418176 trainer.py:371] Steps/second: 0.126483, Examples/second: 17.171027
I0420 07:40:20.701528 140375134025472 trainer.py:511] time: 9.062950
I0420 07:40:20.702819 140375134025472 trainer.py:522] step: 67 fraction_of_correct_next_step_preds:0.34274742 fraction_of_correct_next_step_preds/logits:0.34274742 grad_norm/all:1.3280134 grad_scale_all:0.75300443 log_pplx:2.2751744 log_pplx/logits:2.2751744 loss:2.2751744 loss/logits:2.2751744 num_samples_in_batch:128 var_norm/all:704.26959
I0420 07:40:23.875549 140375142418176 trainer.py:371] Steps/second: 0.125983, Examples/second: 17.088548
I0420 07:40:29.276140 140375134025472 trainer.py:511] time: 8.572985
I0420 07:40:29.277910 140375134025472 trainer.py:522] step: 68 fraction_of_correct_next_step_preds:0.35038498 fraction_of_correct_next_step_preds/logits:0.35038498 grad_norm/all:0.74925572 grad_scale_all:1 log_pplx:2.257813 log_pplx/logits:2.257813 loss:2.257813 loss/logits:2.257813 num_samples_in_batch:128 var_norm/all:704.26178
I0420 07:40:33.885135 140375142418176 trainer.py:371] Steps/second: 0.125501, Examples/second: 17.009094
I0420 07:40:37.489234 140375134025472 trainer.py:511] time: 8.211099
I0420 07:40:37.490010 140375134025472 trainer.py:522] step: 69 fraction_of_correct_next_step_preds:0.3476578 fraction_of_correct_next_step_preds/logits:0.3476578 grad_norm/all:1.2810773 grad_scale_all:0.78059304 log_pplx:2.2596214 log_pplx/logits:2.2596214 loss:2.2596214 loss/logits:2.2596214 num_samples_in_batch:128 var_norm/all:704.25372
I0420 07:40:43.896034 140375142418176 trainer.py:371] Steps/second: 0.125037, Examples/second: 16.932488
I0420 07:40:45.529628 140375134025472 trainer.py:511] time: 8.039236
I0420 07:40:45.530534 140375134025472 trainer.py:522] step: 70 fraction_of_correct_next_step_preds:0.34041184 fraction_of_correct_next_step_preds/logits:0.34041184 grad_norm/all:1.2760593 grad_scale_all:0.78366268 log_pplx:2.281177 log_pplx/logits:2.281177 loss:2.281177 loss/logits:2.281177 num_samples_in_batch:128 var_norm/all:704.24561
I0420 07:40:52.443317 140375134025472 trainer.py:511] time: 6.912189
I0420 07:40:52.444982 140375134025472 trainer.py:522] step: 71 fraction_of_correct_next_step_preds:0.34689143 fraction_of_correct_next_step_preds/logits:0.34689143 grad_norm/all:1.113879 grad_scale_all:0.89776361 log_pplx:2.2540269 log_pplx/logits:2.2540269 loss:2.2540269 loss/logits:2.2540269 num_samples_in_batch:128 var_norm/all:704.23737
I0420 07:40:53.904791 140375142418176 trainer.py:371] Steps/second: 0.126369, Examples/second: 17.086490
I0420 07:40:58.216379 140375134025472 trainer.py:511] time: 5.771179
I0420 07:40:58.217495 140375134025472 trainer.py:522] step: 72 fraction_of_correct_next_step_preds:0.33866972 fraction_of_correct_next_step_preds/logits:0.33866972 grad_norm/all:1.1875348 grad_scale_all:0.84208059 log_pplx:2.2681174 log_pplx/logits:2.2681174 loss:2.2681174 loss/logits:2.2681174 num_samples_in_batch:128 var_norm/all:704.229
I0420 07:41:03.916526 140375142418176 trainer.py:371] Steps/second: 0.125905, Examples/second: 17.011183
I0420 07:41:03.917241 140375142418176 trainer.py:268] Save checkpoint
W0420 07:41:06.173212 140375142418176 meta_graph.py:447] Issue encountered when serializing __batch_norm_update_dict. Type is unsupported, or the types of the items don't match field type in CollectionDef. Note this is a warning and probably safe to ignore. 'dict' object has no attribute 'name'
W0420 07:41:06.173594 140375142418176 meta_graph.py:447] Issue encountered when serializing __model_split_id_stack. Type is unsupported, or the types of the items don't match field type in CollectionDef. Note this is a warning and probably safe to ignore. 'list' object has no attribute 'name'
I0420 07:41:06.398998 140375142418176 trainer.py:270] Save checkpoint done: /data/dingzhenyou/speech_data/librispeech/log/train/ckpt-00000072
I0420 07:41:06.594940 140375134025472 trainer.py:511] time: 8.353298
I0420 07:41:06.595875 140375134025472 trainer.py:522] step: 73 fraction_of_correct_next_step_preds:0.35307753 fraction_of_correct_next_step_preds/logits:0.35307753 grad_norm/all:0.75280231 grad_scale_all:1 log_pplx:2.2375162 log_pplx/logits:2.2375162 loss:2.2375162 loss/logits:2.2375162 num_samples_in_batch:128 var_norm/all:704.22052
I0420 07:41:13.935550 140375142418176 trainer.py:371] Steps/second: 0.125456, Examples/second: 16.938254
I0420 07:41:15.875637 140375134025472 trainer.py:511] time: 9.279429
I0420 07:41:15.876627 140375134025472 trainer.py:522] step: 74 fraction_of_correct_next_step_preds:0.36721641 fraction_of_correct_next_step_preds/logits:0.36721641 grad_norm/all:0.86656165 grad_scale_all:1 log_pplx:2.2110724 log_pplx/logits:2.2110724 loss:2.2110724 loss/logits:2.2110724 num_samples_in_batch:128 var_norm/all:704.21179
I0420 07:41:23.576463 140375134025472 trainer.py:511] time: 7.699559
I0420 07:41:23.577770 140375134025472 trainer.py:522] step: 75 fraction_of_correct_next_step_preds:0.3473832 fraction_of_correct_next_step_preds/logits:0.3473832 grad_norm/all:1.2296442 grad_scale_all:0.81324339 log_pplx:2.2439103 log_pplx/logits:2.2439103 loss:2.2439103 loss/logits:2.2439103 num_samples_in_batch:128 var_norm/all:704.20282
I0420 07:41:23.932002 140375142418176 trainer.py:371] Steps/second: 0.126716, Examples/second: 17.084698
I0420 07:41:31.270879 140375134025472 trainer.py:511] time: 7.692725
I0420 07:41:31.271617 140375134025472 trainer.py:522] step: 76 fraction_of_correct_next_step_preds:0.34664184 fraction_of_correct_next_step_preds/logits:0.34664184 grad_norm/all:1.7527937 grad_scale_all:0.57051778 log_pplx:2.2464612 log_pplx/logits:2.2464612 loss:2.2464612 loss/logits:2.2464612 num_samples_in_batch:128 var_norm/all:704.19379
I0420 07:41:33.943514 140375142418176 trainer.py:371] Steps/second: 0.126270, Examples/second: 17.013188
I0420 07:41:39.387933 140375134025472 trainer.py:511] time: 8.116039
I0420 07:41:39.389003 140375134025472 trainer.py:522] step: 77 fraction_of_correct_next_step_preds:0.35830984 fraction_of_correct_next_step_preds/logits:0.35830984 grad_norm/all:0.81886053 grad_scale_all:1 log_pplx:2.2158952 log_pplx/logits:2.2158952 loss:2.2158952 loss/logits:2.2158952 num_samples_in_batch:128 var_norm/all:704.18512
I0420 07:41:43.953299 140375142418176 trainer.py:371] Steps/second: 0.125838, Examples/second: 16.944062
I0420 07:41:46.216691 140375134025472 trainer.py:511] time: 6.827460
I0420 07:41:46.217935 140375134025472 trainer.py:522] step: 78 fraction_of_correct_next_step_preds:0.35563675 fraction_of_correct_next_step_preds/logits:0.35563675 grad_norm/all:0.79842186 grad_scale_all:1 log_pplx:2.2194674 log_pplx/logits:2.2194674 loss:2.2194674 loss/logits:2.2194674 num_samples_in_batch:128 var_norm/all:704.17609
I0420 07:41:51.873646 140375134025472 trainer.py:511] time: 5.655520
I0420 07:41:51.874325 140375134025472 trainer.py:522] step: 79 fraction_of_correct_next_step_preds:0.35122594 fraction_of_correct_next_step_preds/logits:0.35122594 grad_norm/all:1.6447265 grad_scale_all:0.6080038 log_pplx:2.24704 log_pplx/logits:2.24704 loss:2.24704 loss/logits:2.24704 num_samples_in_batch:128 var_norm/all:704.16681
I0420 07:41:53.964529 140375142418176 trainer.py:371] Steps/second: 0.127029, Examples/second: 17.082936
I0420 07:42:00.195930 140375134025472 trainer.py:511] time: 8.321231
I0420 07:42:00.196805 140375134025472 trainer.py:522] step: 80 fraction_of_correct_next_step_preds:0.36126739 fraction_of_correct_next_step_preds/logits:0.36126739 grad_norm/all:0.98546255 grad_scale_all:1 log_pplx:2.1959085 log_pplx/logits:2.1959085 loss:2.1959085 loss/logits:2.1959085 num_samples_in_batch:128 var_norm/all:704.15784
I0420 07:42:03.973375 140375142418176 trainer.py:371] Steps/second: 0.126599, Examples/second: 17.217479
I0420 07:42:04.534535 140375134025472 trainer.py:511] time: 4.337248
I0420 07:42:04.536370 140375134025472 trainer.py:522] step: 81 fraction_of_correct_next_step_preds:0.35335645 fraction_of_correct_next_step_preds/logits:0.35335645 grad_norm/all:0.70726234 grad_scale_all:1 log_pplx:2.2483265 log_pplx/logits:2.2483265 loss:2.2483265 loss/logits:2.2483265 num_samples_in_batch:256 var_norm/all:704.14862
I0420 07:42:13.410237 140375134025472 trainer.py:511] time: 8.873534
I0420 07:42:13.411325 140375134025472 trainer.py:522] step: 82 fraction_of_correct_next_step_preds:0.3593379 fraction_of_correct_next_step_preds/logits:0.3593379 grad_norm/all:1.1385168 grad_scale_all:0.87833577 log_pplx:2.1953106 log_pplx/logits:2.1953106 loss:2.1953106 loss/logits:2.1953106 num_samples_in_batch:128 var_norm/all:704.13922
I0420 07:42:13.978193 140375142418176 trainer.py:371] Steps/second: 0.127742, Examples/second: 17.347933
I0420 07:42:20.770096 140375134025472 trainer.py:511] time: 7.358493
I0420 07:42:20.771768 140375134025472 trainer.py:522] step: 83 fraction_of_correct_next_step_preds:0.35221922 fraction_of_correct_next_step_preds/logits:0.35221922 grad_norm/all:1.4923625 grad_scale_all:0.67007846 log_pplx:2.2021344 log_pplx/logits:2.2021344 loss:2.2021344 loss/logits:2.2021344 num_samples_in_batch:128 var_norm/all:704.12964
I0420 07:42:23.988683 140375142418176 trainer.py:371] Steps/second: 0.127314, Examples/second: 17.277894
I0420 07:42:28.549462 140375134025472 trainer.py:511] time: 7.777474
I0420 07:42:28.550607 140375134025472 trainer.py:522] step: 84 fraction_of_correct_next_step_preds:0.37899226 fraction_of_correct_next_step_preds/logits:0.37899226 grad_norm/all:0.82399756 grad_scale_all:1 log_pplx:2.1561263 log_pplx/logits:2.1561263 loss:2.1561263 loss/logits:2.1561263 num_samples_in_batch:128 var_norm/all:704.12036
I0420 07:42:33.999638 140375142418176 trainer.py:371] Steps/second: 0.126899, Examples/second: 17.209961
I0420 07:42:36.480242 140375134025472 trainer.py:511] time: 7.929422
I0420 07:42:36.481380 140375134025472 trainer.py:522] step: 85 fraction_of_correct_next_step_preds:0.36427727 fraction_of_correct_next_step_preds/logits:0.36427727 grad_norm/all:0.73982769 grad_scale_all:1 log_pplx:2.1625452 log_pplx/logits:2.1625452 loss:2.1625452 loss/logits:2.1625452 num_samples_in_batch:128 var_norm/all:704.11084
I0420 07:42:43.379302 140375134025472 trainer.py:511] time: 6.897688
I0420 07:42:43.380394 140375134025472 trainer.py:522] step: 86 fraction_of_correct_next_step_preds:0.35739926 fraction_of_correct_next_step_preds/logits:0.35739926 grad_norm/all:1.5248234 grad_scale_all:0.65581363 log_pplx:2.1884296 log_pplx/logits:2.1884296 loss:2.1884296 loss/logits:2.1884296 num_samples_in_batch:128 var_norm/all:704.10114
I0420 07:42:44.008449 140375142418176 trainer.py:371] Steps/second: 0.127986, Examples/second: 17.334595
I0420 07:42:52.863199 140375134025472 trainer.py:511] time: 9.482465
I0420 07:42:52.864460 140375134025472 trainer.py:522] step: 87 fraction_of_correct_next_step_preds:0.3832756 fraction_of_correct_next_step_preds/logits:0.3832756 grad_norm/all:0.99459577 grad_scale_all:1 log_pplx:2.1456861 log_pplx/logits:2.1456861 loss:2.1456861 loss/logits:2.1456861 num_samples_in_batch:128 var_norm/all:704.09155
I0420 07:42:54.018491 140375142418176 trainer.py:371] Steps/second: 0.127573, Examples/second: 17.267847
I0420 07:42:58.559226 140375134025472 trainer.py:511] time: 5.694561
I0420 07:42:58.560539 140375134025472 trainer.py:522] step: 88 fraction_of_correct_next_step_preds:0.36507463 fraction_of_correct_next_step_preds/logits:0.36507463 grad_norm/all:0.8461504 grad_scale_all:1 log_pplx:2.1562793 log_pplx/logits:2.1562793 loss:2.1562793 loss/logits:2.1562793 num_samples_in_batch:128 var_norm/all:704.08179
I0420 07:43:04.029028 140375142418176 trainer.py:371] Steps/second: 0.127173, Examples/second: 17.203016
I0420 07:43:06.955815 140375134025472 trainer.py:511] time: 8.395064
I0420 07:43:06.956528 140375134025472 trainer.py:522] step: 89 fraction_of_correct_next_step_preds:0.38289136 fraction_of_correct_next_step_preds/logits:0.38289136 grad_norm/all:0.5773949 grad_scale_all:1 log_pplx:2.1110232 log_pplx/logits:2.1110232 loss:2.1110232 loss/logits:2.1110232 num_samples_in_batch:128 var_norm/all:704.07178
I0420 07:43:14.034790 140375142418176 trainer.py:371] Steps/second: 0.126785, Examples/second: 17.140155
I0420 07:43:14.521831 140375134025472 trainer.py:511] time: 7.564866
I0420 07:43:14.522579 140375134025472 trainer.py:522] step: 90 fraction_of_correct_next_step_preds:0.37470537 fraction_of_correct_next_step_preds/logits:0.37470537 grad_norm/all:0.70084506 grad_scale_all:1 log_pplx:2.1351776 log_pplx/logits:2.1351776 loss:2.1351776 loss/logits:2.1351776 num_samples_in_batch:128 var_norm/all:704.06158
I0420 07:43:22.593677 140375134025472 trainer.py:511] time: 8.070737
I0420 07:43:22.595352 140375134025472 trainer.py:522] step: 91 fraction_of_correct_next_step_preds:0.37510994 fraction_of_correct_next_step_preds/logits:0.37510994 grad_norm/all:0.89258677 grad_scale_all:1 log_pplx:2.1301734 log_pplx/logits:2.1301734 loss:2.1301734 loss/logits:2.1301734 num_samples_in_batch:128 var_norm/all:704.05121
I0420 07:43:24.044142 140375142418176 trainer.py:371] Steps/second: 0.127811, Examples/second: 17.258747
I0420 07:43:30.330538 140375134025472 trainer.py:511] time: 7.734857
I0420 07:43:30.331336 140375134025472 trainer.py:522] step: 92 fraction_of_correct_next_step_preds:0.37568703 fraction_of_correct_next_step_preds/logits:0.37568703 grad_norm/all:1.012671 grad_scale_all:0.98748755 log_pplx:2.1166055 log_pplx/logits:2.1166055 loss:2.1166055 loss/logits:2.1166055 num_samples_in_batch:128 var_norm/all:704.04065
I0420 07:43:34.055632 140375142418176 trainer.py:371] Steps/second: 0.127424, Examples/second: 17.196722
I0420 07:43:39.545340 140375134025472 trainer.py:511] time: 9.213729
I0420 07:43:39.546376 140375134025472 trainer.py:522] step: 93 fraction_of_correct_next_step_preds:0.37040973 fraction_of_correct_next_step_preds/logits:0.37040973 grad_norm/all:1.307186 grad_scale_all:0.76500207 log_pplx:2.1472313 log_pplx/logits:2.1472313 loss:2.1472313 loss/logits:2.1472313 num_samples_in_batch:128 var_norm/all:704.02991
I0420 07:43:44.059933 140375142418176 trainer.py:371] Steps/second: 0.127049, Examples/second: 17.136553
I0420 07:43:46.515561 140375134025472 trainer.py:511] time: 6.968886
I0420 07:43:46.517144 140375134025472 trainer.py:522] step: 94 fraction_of_correct_next_step_preds:0.37518752 fraction_of_correct_next_step_preds/logits:0.37518752 grad_norm/all:1.1413982 grad_scale_all:0.87611842 log_pplx:2.1311126 log_pplx/logits:2.1311126 loss:2.1311126 loss/logits:2.1311126 num_samples_in_batch:128 var_norm/all:704.01935
I0420 07:43:52.326225 140375134025472 trainer.py:511] time: 5.808762
I0420 07:43:52.327689 140375134025472 trainer.py:522] step: 95 fraction_of_correct_next_step_preds:0.38357881 fraction_of_correct_next_step_preds/logits:0.38357881 grad_norm/all:0.84444368 grad_scale_all:1 log_pplx:2.1159191 log_pplx/logits:2.1159191 loss:2.1159191 loss/logits:2.1159191 num_samples_in_batch:128 var_norm/all:704.00879
I0420 07:43:54.059287 140375142418176 trainer.py:371] Steps/second: 0.128032, Examples/second: 17.250629
I0420 07:44:01.040597 140375134025472 trainer.py:511] time: 8.689469
I0420 07:44:01.041412 140375134025472 trainer.py:522] step: 96 fraction_of_correct_next_step_preds:0.37989101 fraction_of_correct_next_step_preds/logits:0.37989101 grad_norm/all:0.82067651 grad_scale_all:1 log_pplx:2.1037724 log_pplx/logits:2.1037724 loss:2.1037724 loss/logits:2.1037724 num_samples_in_batch:128 var_norm/all:703.99811
I0420 07:44:04.069787 140375142418176 trainer.py:371] Steps/second: 0.127657, Examples/second: 17.361416
I0420 07:44:05.287786 140375134025472 trainer.py:511] time: 4.245918
I0420 07:44:05.289007 140375134025472 trainer.py:522] step: 97 fraction_of_correct_next_step_preds:0.36707532 fraction_of_correct_next_step_preds/logits:0.36707532 grad_norm/all:1.0457711 grad_scale_all:0.95623219 log_pplx:2.1651423 log_pplx/logits:2.1651423 loss:2.1651423 loss/logits:2.1651423 num_samples_in_batch:256 var_norm/all:703.98724
I0420 07:44:12.534790 140375134025472 trainer.py:511] time: 7.245540
I0420 07:44:12.536067 140375134025472 trainer.py:522] step: 98 fraction_of_correct_next_step_preds:0.37593955 fraction_of_correct_next_step_preds/logits:0.37593955 grad_norm/all:1.6557065 grad_scale_all:0.60397178 log_pplx:2.1272745 log_pplx/logits:2.1272745 loss:2.1272745 loss/logits:2.1272745 num_samples_in_batch:128 var_norm/all:703.97632
I0420 07:44:14.079885 140375142418176 trainer.py:371] Steps/second: 0.128605, Examples/second: 17.469302
I0420 07:44:20.448647 140375134025472 trainer.py:511] time: 7.912352
I0420 07:44:20.449891 140375134025472 trainer.py:522] step: 99 fraction_of_correct_next_step_preds:0.38830405 fraction_of_correct_next_step_preds/logits:0.38830405 grad_norm/all:0.66534883 grad_scale_all:1 log_pplx:2.0848999 log_pplx/logits:2.0848999 loss:2.0848999 loss/logits:2.0848999 num_samples_in_batch:128 var_norm/all:703.96582
I0420 07:44:24.089390 140375142418176 trainer.py:371] Steps/second: 0.128233, Examples/second: 17.408603
I0420 07:44:29.597444 140375134025472 trainer.py:511] time: 9.147288
I0420 07:44:29.598999 140375134025472 trainer.py:522] step: 100 fraction_of_correct_next_step_preds:0.37406558 fraction_of_correct_next_step_preds/logits:0.37406558 grad_norm/all:1.759958 grad_scale_all:0.56819534 log_pplx:2.1023078 log_pplx/logits:2.1023078 loss:2.1023078 loss/logits:2.1023078 num_samples_in_batch:128 var_norm/all:703.9552
I0420 07:44:34.098342 140375142418176 trainer.py:371] Steps/second: 0.127871, Examples/second: 17.349474
I0420 07:44:37.641819 140375134025472 trainer.py:511] time: 8.042531
I0420 07:44:37.643115 140375134025472 base_runner.py:115] step: 101 fraction_of_correct_next_step_preds:0.38138181 fraction_of_correct_next_step_preds/logits:0.38138181 grad_norm/all:1.0481714 grad_scale_all:0.95404243 log_pplx:2.1053541 log_pplx/logits:2.1053541 loss:2.1053541 loss/logits:2.1053541 num_samples_in_batch:128 var_norm/all:703.94501
I0420 07:44:44.107460 140375142418176 trainer.py:371] Steps/second: 0.127517, Examples/second: 17.291835
I0420 07:44:44.108102 140375142418176 trainer.py:275] Write summary @101
2019-04-20 07:44:44.115563: I lingvo/core/ops/record_batcher.cc:344] 775 total seconds passed. Total records yielded: 264. Total records skipped: 0
I0420 07:44:44.722656 140375134025472 trainer.py:511] time: 7.079163
I0420 07:44:44.724056 140375134025472 trainer.py:522] step: 102 fraction_of_correct_next_step_preds:0.38316277 fraction_of_correct_next_step_preds/logits:0.38316277 grad_norm/all:1.0969374 grad_scale_all:0.91162902 log_pplx:2.0958667 log_pplx/logits:2.0958667 loss:2.0958667 loss/logits:2.0958667 num_samples_in_batch:128 var_norm/all:703.93457
I0420 07:44:55.762762 140375134025472 trainer.py:511] time: 11.038428
I0420 07:44:55.776536 140375134025472 trainer.py:522] step: 103 fraction_of_correct_next_step_preds:0.38022271 fraction_of_correct_next_step_preds/logits:0.38022271 grad_norm/all:1.1487237 grad_scale_all:0.87053132 log_pplx:2.0835736 log_pplx/logits:2.0835736 loss:2.0835736 loss/logits:2.0835736 num_samples_in_batch:128 var_norm/all:703.92407
I0420 07:45:03.670341 140375134025472 trainer.py:511] time: 7.891746
I0420 07:45:03.671505 140375134025472 trainer.py:522] step: 104 fraction_of_correct_next_step_preds:0.38704854 fraction_of_correct_next_step_preds/logits:0.38704854 grad_norm/all:0.81093085 grad_scale_all:1 log_pplx:2.0827632 log_pplx/logits:2.0827632 loss:2.0827632 loss/logits:2.0827632 num_samples_in_batch:128 var_norm/all:703.91351
I0420 07:45:14.271611 140375134025472 trainer.py:511] time: 10.599824
I0420 07:45:14.273308 140375134025472 trainer.py:522] step: 105 fraction_of_correct_next_step_preds:0.38036564 fraction_of_correct_next_step_preds/logits:0.38036564 grad_norm/all:1.2385389 grad_scale_all:0.80740303 log_pplx:2.0951517 log_pplx/logits:2.0951517 loss:2.0951517 loss/logits:2.0951517 num_samples_in_batch:128 var_norm/all:703.90283
I0420 07:45:27.701951 140375134025472 trainer.py:511] time: 13.428311
I0420 07:45:27.704013 140375134025472 trainer.py:522] step: 106 fraction_of_correct_next_step_preds:0.38831434 fraction_of_correct_next_step_preds/logits:0.38831434 grad_norm/all:0.8401624 grad_scale_all:1 log_pplx:2.0681973 log_pplx/logits:2.0681973 loss:2.0681973 loss/logits:2.0681973 num_samples_in_batch:128 var_norm/all:703.89215
I0420 07:45:40.065790 140375134025472 trainer.py:511] time: 12.361295
I0420 07:45:40.068871 140375134025472 trainer.py:522] step: 107 fraction_of_correct_next_step_preds:0.39199519 fraction_of_correct_next_step_preds/logits:0.39199519 grad_norm/all:0.82031536 grad_scale_all:1 log_pplx:2.0627809 log_pplx/logits:2.0627809 loss:2.0627809 loss/logits:2.0627809 num_samples_in_batch:128 var_norm/all:703.88129
I0420 07:45:51.887306 140375134025472 trainer.py:511] time: 11.818003
I0420 07:45:51.889163 140375134025472 trainer.py:522] step: 108 fraction_of_correct_next_step_preds:0.39280939 fraction_of_correct_next_step_preds/logits:0.39280939 grad_norm/all:1.099699 grad_scale_all:0.90933973 log_pplx:2.0636985 log_pplx/logits:2.0636985 loss:2.0636985 loss/logits:2.0636985 num_samples_in_batch:128 var_norm/all:703.87018
I0420 07:45:55.701033 140375142418176 trainer.py:284] Write summary done: step 101
I0420 07:45:55.714133 140375142418176 base_runner.py:115] step: 101, steps/sec: 0.13, examples/sec: 17.29
I0420 07:45:55.718148 140375142418176 trainer.py:371] Steps/second: 0.125049, Examples/second: 16.895521
I0420 07:45:59.875958 140375134025472 trainer.py:511] time: 7.985847
I0420 07:45:59.877171 140375134025472 trainer.py:522] step: 109 fraction_of_correct_next_step_preds:0.38182643 fraction_of_correct_next_step_preds/logits:0.38182643 grad_norm/all:1.3109812 grad_scale_all:0.76278746 log_pplx:2.0635197 log_pplx/logits:2.0635197 loss:2.0635197 loss/logits:2.0635197 num_samples_in_batch:128 var_norm/all:703.85907
I0420 07:46:05.727878 140375142418176 trainer.py:371] Steps/second: 0.124761, Examples/second: 16.848456
I0420 07:46:08.144885 140375134025472 trainer.py:511] time: 8.267169
I0420 07:46:08.146713 140375134025472 trainer.py:522] step: 110 fraction_of_correct_next_step_preds:0.39472944 fraction_of_correct_next_step_preds/logits:0.39472944 grad_norm/all:1.1448041 grad_scale_all:0.87351191 log_pplx:2.0443304 log_pplx/logits:2.0443304 loss:2.0443304 loss/logits:2.0443304 num_samples_in_batch:128 var_norm/all:703.84808
I0420 07:46:13.943571 140375134025472 trainer.py:511] time: 5.796645
I0420 07:46:13.944745 140375134025472 trainer.py:522] step: 111 fraction_of_correct_next_step_preds:0.39036024 fraction_of_correct_next_step_preds/logits:0.39036024 grad_norm/all:0.98204678 grad_scale_all:1 log_pplx:2.0808325 log_pplx/logits:2.0808325 loss:2.0808325 loss/logits:2.0808325 num_samples_in_batch:128 var_norm/all:703.8371
I0420 07:46:15.740056 140375142418176 trainer.py:371] Steps/second: 0.125611, Examples/second: 17.092108
I0420 07:46:18.146146 140375134025472 trainer.py:511] time: 4.201175
I0420 07:46:18.147413 140375134025472 trainer.py:522] step: 112 fraction_of_correct_next_step_preds:0.38091317 fraction_of_correct_next_step_preds/logits:0.38091317 grad_norm/all:0.8844955 grad_scale_all:1 log_pplx:2.1076317 log_pplx/logits:2.1076317 loss:2.1076317 loss/logits:2.1076317 num_samples_in_batch:256 var_norm/all:703.82587
I0420 07:46:25.491939 140375134025472 trainer.py:511] time: 7.344290
I0420 07:46:25.492826 140375134025472 trainer.py:522] step: 113 fraction_of_correct_next_step_preds:0.40295595 fraction_of_correct_next_step_preds/logits:0.40295595 grad_norm/all:0.7746343 grad_scale_all:1 log_pplx:2.0239127 log_pplx/logits:2.0239127 loss:2.0239127 loss/logits:2.0239127 num_samples_in_batch:128 var_norm/all:703.81451
I0420 07:46:25.765045 140375142418176 trainer.py:371] Steps/second: 0.126440, Examples/second: 17.186828
I0420 07:46:34.654975 140375134025472 trainer.py:511] time: 9.161685
I0420 07:46:34.656054 140375134025472 trainer.py:522] step: 114 fraction_of_correct_next_step_preds:0.40458798 fraction_of_correct_next_step_preds/logits:0.40458798 grad_norm/all:0.77485955 grad_scale_all:1 log_pplx:2.0200136 log_pplx/logits:2.0200136 loss:2.0200136 loss/logits:2.0200136 num_samples_in_batch:128 var_norm/all:703.80304
I0420 07:46:35.756966 140375142418176 trainer.py:371] Steps/second: 0.126148, Examples/second: 17.138438
I0420 07:46:42.850702 140375134025472 trainer.py:511] time: 8.194245
I0420 07:46:42.852169 140375134025472 trainer.py:522] step: 115 fraction_of_correct_next_step_preds:0.40469682 fraction_of_correct_next_step_preds/logits:0.40469682 grad_norm/all:1.019146 grad_scale_all:0.98121369 log_pplx:2.0309305 log_pplx/logits:2.0309305 loss:2.0309305 loss/logits:2.0309305 num_samples_in_batch:128 var_norm/all:703.79138
I0420 07:46:45.768280 140375142418176 trainer.py:371] Steps/second: 0.125860, Examples/second: 17.090745
I0420 07:46:50.728563 140375134025472 trainer.py:511] time: 7.876101
I0420 07:46:50.729914 140375134025472 trainer.py:522] step: 116 fraction_of_correct_next_step_preds:0.3940883 fraction_of_correct_next_step_preds/logits:0.3940883 grad_norm/all:1.2589619 grad_scale_all:0.79430521 log_pplx:2.0252271 log_pplx/logits:2.0252271 loss:2.0252271 loss/logits:2.0252271 num_samples_in_batch:128 var_norm/all:703.77966
2019-04-20 07:46:50.734156: I lingvo/core/ops/record_batcher.cc:344] 925 total seconds passed. Total records yielded: 16765. Total records skipped: 11
2019-04-20 07:46:50.734335: I lingvo/core/ops/record_batcher.cc:349] Out-of-range sample: 1726
2019-04-20 07:46:50.734376: I lingvo/core/ops/record_batcher.cc:349] Out-of-range sample: 1729
2019-04-20 07:46:50.734412: I lingvo/core/ops/record_batcher.cc:349] Out-of-range sample: 1966
I0420 07:46:55.777012 140375142418176 trainer.py:371] Steps/second: 0.125579, Examples/second: 17.044132
I0420 07:46:57.789110 140375134025472 trainer.py:511] time: 7.058867
I0420 07:46:57.790066 140375134025472 trainer.py:522] step: 117 fraction_of_correct_next_step_preds:0.40091401 fraction_of_correct_next_step_preds/logits:0.40091401 grad_norm/all:0.99531782 grad_scale_all:1 log_pplx:2.0293548 log_pplx/logits:2.0293548 loss:2.0293548 loss/logits:2.0293548 num_samples_in_batch:128 var_norm/all:703.76801
I0420 07:47:03.618060 140375134025472 trainer.py:511] time: 5.827690
I0420 07:47:03.619812 140375134025472 trainer.py:522] step: 118 fraction_of_correct_next_step_preds:0.3924461 fraction_of_correct_next_step_preds/logits:0.3924461 grad_norm/all:0.97392869 grad_scale_all:1 log_pplx:2.0520797 log_pplx/logits:2.0520797 loss:2.0520797 loss/logits:2.0520797 num_samples_in_batch:128 var_norm/all:703.75629
I0420 07:47:05.787250 140375142418176 trainer.py:371] Steps/second: 0.126375, Examples/second: 17.135577
I0420 07:47:12.191804 140375134025472 trainer.py:511] time: 8.571755
I0420 07:47:12.193002 140375134025472 trainer.py:522] step: 119 fraction_of_correct_next_step_preds:0.39932805 fraction_of_correct_next_step_preds/logits:0.39932805 grad_norm/all:1.4277698 grad_scale_all:0.70039302 log_pplx:2.0204892 log_pplx/logits:2.0204892 loss:2.0204892 loss/logits:2.0204892 num_samples_in_batch:128 var_norm/all:703.74438
I0420 07:47:15.798336 140375142418176 trainer.py:371] Steps/second: 0.126094, Examples/second: 17.089437
I0420 07:47:19.713876 140375134025472 trainer.py:511] time: 7.520668
I0420 07:47:19.714983 140375134025472 trainer.py:522] step: 120 fraction_of_correct_next_step_preds:0.4065825 fraction_of_correct_next_step_preds/logits:0.4065825 grad_norm/all:1.0304936 grad_scale_all:0.97040874 log_pplx:2.01543 log_pplx/logits:2.01543 loss:2.01543 loss/logits:2.01543 num_samples_in_batch:128 var_norm/all:703.73279
I0420 07:47:25.808001 140375142418176 trainer.py:371] Steps/second: 0.125819, Examples/second: 17.044289
I0420 07:47:28.746090 140375134025472 trainer.py:511] time: 9.030880
I0420 07:47:28.747275 140375134025472 trainer.py:522] step: 121 fraction_of_correct_next_step_preds:0.40240431 fraction_of_correct_next_step_preds/logits:0.40240431 grad_norm/all:0.93115848 grad_scale_all:1 log_pplx:1.9913671 log_pplx/logits:1.9913671 loss:1.9913671 loss/logits:1.9913671 num_samples_in_batch:128 var_norm/all:703.72107
I0420 07:47:35.818212 140375142418176 trainer.py:371] Steps/second: 0.125550, Examples/second: 17.000070
I0420 07:47:36.670398 140375134025472 trainer.py:511] time: 7.922841
I0420 07:47:36.671638 140375134025472 trainer.py:522] step: 122 fraction_of_correct_next_step_preds:0.39683267 fraction_of_correct_next_step_preds/logits:0.39683267 grad_norm/all:0.82973373 grad_scale_all:1 log_pplx:2.0151408 log_pplx/logits:2.0151408 loss:2.0151408 loss/logits:2.0151408 num_samples_in_batch:128 var_norm/all:703.70923
I0420 07:47:44.871414 140375134025472 trainer.py:511] time: 8.199487
I0420 07:47:44.872231 140375134025472 trainer.py:522] step: 123 fraction_of_correct_next_step_preds:0.40840337 fraction_of_correct_next_step_preds/logits:0.40840337 grad_norm/all:1.4440463 grad_scale_all:0.69249856 log_pplx:1.9989125 log_pplx/logits:1.9989125 loss:1.9989125 loss/logits:1.9989125 num_samples_in_batch:128 var_norm/all:703.69714
I0420 07:47:45.822077 140375142418176 trainer.py:371] Steps/second: 0.126314, Examples/second: 17.088316
I0420 07:47:51.782823 140375134025472 trainer.py:511] time: 6.910181
I0420 07:47:51.784365 140375134025472 trainer.py:522] step: 124 fraction_of_correct_next_step_preds:0.39865616 fraction_of_correct_next_step_preds/logits:0.39865616 grad_norm/all:0.97169137 grad_scale_all:1 log_pplx:2.017369 log_pplx/logits:2.017369 loss:2.017369 loss/logits:2.017369 num_samples_in_batch:128 var_norm/all:703.68542
I0420 07:47:55.832262 140375142418176 trainer.py:371] Steps/second: 0.126045, Examples/second: 17.044550
I0420 07:48:00.231045 140375134025472 trainer.py:511] time: 8.446399
I0420 07:48:00.232076 140375134025472 trainer.py:522] step: 125 fraction_of_correct_next_step_preds:0.41026807 fraction_of_correct_next_step_preds/logits:0.41026807 grad_norm/all:1.2182258 grad_scale_all:0.82086587 log_pplx:1.9927412 log_pplx/logits:1.9927412 loss:1.9927412 loss/logits:1.9927412 num_samples_in_batch:128 var_norm/all:703.67346
I0420 07:48:05.842519 140375142418176 trainer.py:371] Steps/second: 0.125782, Examples/second: 17.001663
I0420 07:48:09.408751 140375134025472 trainer.py:511] time: 9.176420
I0420 07:48:09.410067 140375134025472 trainer.py:522] step: 126 fraction_of_correct_next_step_preds:0.40561908 fraction_of_correct_next_step_preds/logits:0.40561908 grad_norm/all:1.1187987 grad_scale_all:0.89381582 log_pplx:2.000335 log_pplx/logits:2.000335 loss:2.000335 loss/logits:2.000335 num_samples_in_batch:128 var_norm/all:703.66162
I0420 07:48:15.108469 140375134025472 trainer.py:511] time: 5.698159
I0420 07:48:15.109544 140375134025472 trainer.py:522] step: 127 fraction_of_correct_next_step_preds:0.413378 fraction_of_correct_next_step_preds/logits:0.413378 grad_norm/all:0.75346535 grad_scale_all:1 log_pplx:1.9956678 log_pplx/logits:1.9956678 loss:1.9956678 loss/logits:1.9956678 num_samples_in_batch:128 var_norm/all:703.64972
I0420 07:48:15.850579 140375142418176 trainer.py:371] Steps/second: 0.126520, Examples/second: 17.087184
I0420 07:48:22.503793 140375134025472 trainer.py:511] time: 7.393976
I0420 07:48:22.505626 140375134025472 trainer.py:522] step: 128 fraction_of_correct_next_step_preds:0.40938216 fraction_of_correct_next_step_preds/logits:0.40938216 grad_norm/all:0.88790184 grad_scale_all:1 log_pplx:1.9819201 log_pplx/logits:1.9819201 loss:1.9819201 loss/logits:1.9819201 num_samples_in_batch:128 var_norm/all:703.6377
I0420 07:48:25.861005 140375142418176 trainer.py:371] Steps/second: 0.126257, Examples/second: 17.170978
I0420 07:48:26.824466 140375134025472 trainer.py:511] time: 4.318640
I0420 07:48:26.825557 140375134025472 trainer.py:522] step: 129 fraction_of_correct_next_step_preds:0.4062759 fraction_of_correct_next_step_preds/logits:0.4062759 grad_norm/all:0.69528306 grad_scale_all:1 log_pplx:2.0344479 log_pplx/logits:2.0344479 loss:2.0344479 loss/logits:2.0344479 num_samples_in_batch:256 var_norm/all:703.62555
I0420 07:48:34.608892 140375134025472 trainer.py:511] time: 7.783073
I0420 07:48:34.610289 140375134025472 trainer.py:522] step: 130 fraction_of_correct_next_step_preds:0.4116973 fraction_of_correct_next_step_preds/logits:0.4116973 grad_norm/all:0.97755039 grad_scale_all:1 log_pplx:1.9893242 log_pplx/logits:1.9893242 loss:1.9893242 loss/logits:1.9893242 num_samples_in_batch:128 var_norm/all:703.61316
I0420 07:48:35.871114 140375142418176 trainer.py:371] Steps/second: 0.126976, Examples/second: 17.253137
I0420 07:48:42.722692 140375134025472 trainer.py:511] time: 8.112124
I0420 07:48:42.724195 140375134025472 trainer.py:522] step: 131 fraction_of_correct_next_step_preds:0.4181805 fraction_of_correct_next_step_preds/logits:0.4181805 grad_norm/all:0.60749316 grad_scale_all:1 log_pplx:1.9431043 log_pplx/logits:1.9431043 loss:1.9431043 loss/logits:1.9431043 num_samples_in_batch:128 var_norm/all:703.60071
I0420 07:48:45.880964 140375142418176 trainer.py:371] Steps/second: 0.126714, Examples/second: 17.209899
I0420 07:48:49.451812 140375134025472 trainer.py:511] time: 6.727396
I0420 07:48:49.452764 140375134025472 trainer.py:522] step: 132 fraction_of_correct_next_step_preds:0.41381198 fraction_of_correct_next_step_preds/logits:0.41381198 grad_norm/all:0.83036727 grad_scale_all:1 log_pplx:1.9733791 log_pplx/logits:1.9733791 loss:1.9733791 loss/logits:1.9733791 num_samples_in_batch:128 var_norm/all:703.58807
I0420 07:48:55.892543 140375142418176 trainer.py:371] Steps/second: 0.126457, Examples/second: 17.167462
I0420 07:48:57.872793 140375134025472 trainer.py:511] time: 8.419803
I0420 07:48:57.873647 140375134025472 trainer.py:522] step: 133 fraction_of_correct_next_step_preds:0.41793913 fraction_of_correct_next_step_preds/logits:0.41793913 grad_norm/all:1.3502294 grad_scale_all:0.74061489 log_pplx:1.9688462 log_pplx/logits:1.9688462 loss:1.9688462 loss/logits:1.9688462 num_samples_in_batch:128 var_norm/all:703.57538
I0420 07:49:05.900649 140375142418176 trainer.py:371] Steps/second: 0.126205, Examples/second: 17.125886
I0420 07:49:06.834827 140375134025472 trainer.py:511] time: 8.960698
I0420 07:49:06.835812 140375134025472 trainer.py:522] step: 134 fraction_of_correct_next_step_preds:0.40741351 fraction_of_correct_next_step_preds/logits:0.40741351 grad_norm/all:1.2382512 grad_scale_all:0.80759054 log_pplx:1.9809607 log_pplx/logits:1.9809607 loss:1.9809607 loss/logits:1.9809607 num_samples_in_batch:128 var_norm/all:703.56293
I0420 07:49:14.305708 140375134025472 trainer.py:511] time: 7.469594
I0420 07:49:14.307034 140375134025472 trainer.py:522] step: 135 fraction_of_correct_next_step_preds:0.41747892 fraction_of_correct_next_step_preds/logits:0.41747892 grad_norm/all:0.66647661 grad_scale_all:1 log_pplx:1.9676461 log_pplx/logits:1.9676461 loss:1.9676461 loss/logits:1.9676461 num_samples_in_batch:128 var_norm/all:703.55054
I0420 07:49:15.900914 140375142418176 trainer.py:371] Steps/second: 0.126898, Examples/second: 17.205537
I0420 07:49:22.252697 140375134025472 trainer.py:511] time: 7.945407
I0420 07:49:22.254210 140375134025472 trainer.py:522] step: 136 fraction_of_correct_next_step_preds:0.41611537 fraction_of_correct_next_step_preds/logits:0.41611537 grad_norm/all:0.70107651 grad_scale_all:1 log_pplx:1.9448195 log_pplx/logits:1.9448195 loss:1.9448195 loss/logits:1.9448195 num_samples_in_batch:128 var_norm/all:703.53796
I0420 07:49:25.910279 140375142418176 trainer.py:371] Steps/second: 0.126647, Examples/second: 17.164361
I0420 07:49:28.031122 140375134025472 trainer.py:511] time: 5.776590
I0420 07:49:28.032166 140375134025472 trainer.py:522] step: 137 fraction_of_correct_next_step_preds:0.41829944 fraction_of_correct_next_step_preds/logits:0.41829944 grad_norm/all:0.84144682 grad_scale_all:1 log_pplx:1.9547106 log_pplx/logits:1.9547106 loss:1.9547106 loss/logits:1.9547106 num_samples_in_batch:128 var_norm/all:703.52521
I0420 07:49:35.922549 140375142418176 trainer.py:371] Steps/second: 0.126400, Examples/second: 17.123903
I0420 07:49:36.090886 140375134025472 trainer.py:511] time: 8.058493
I0420 07:49:36.092125 140375134025472 trainer.py:522] step: 138 fraction_of_correct_next_step_preds:0.41517875 fraction_of_correct_next_step_preds/logits:0.41517875 grad_norm/all:1.4478648 grad_scale_all:0.69067222 log_pplx:1.9582996 log_pplx/logits:1.9582996 loss:1.9582996 loss/logits:1.9582996 num_samples_in_batch:128 var_norm/all:703.51233
I0420 07:49:42.946846 140375134025472 trainer.py:511] time: 6.854485
I0420 07:49:42.947705 140375134025472 trainer.py:522] step: 139 fraction_of_correct_next_step_preds:0.41506374 fraction_of_correct_next_step_preds/logits:0.41506374 grad_norm/all:0.63443571 grad_scale_all:1 log_pplx:1.9501776 log_pplx/logits:1.9501776 loss:1.9501776 loss/logits:1.9501776 num_samples_in_batch:128 var_norm/all:703.49976
I0420 07:49:45.931711 140375142418176 trainer.py:371] Steps/second: 0.127071, Examples/second: 17.201245
I0420 07:49:51.288666 140375134025472 trainer.py:511] time: 8.340743
I0420 07:49:51.289536 140375134025472 trainer.py:522] step: 140 fraction_of_correct_next_step_preds:0.4142887 fraction_of_correct_next_step_preds/logits:0.4142887 grad_norm/all:1.1109182 grad_scale_all:0.90015632 log_pplx:1.9538394 log_pplx/logits:1.9538394 loss:1.9538394 loss/logits:1.9538394 num_samples_in_batch:128
var_norm/all:703.487 I0420 07:49:55.941807 140375142418176 trainer.py:371] Steps/second: 0.126825, Examples/second: 17.161219 I0420 07:50:00.419210 140375134025472 trainer.py:511] time: 9.104871 I0420 07:50:00.420401 140375134025472 trainer.py:522] step: 141 fraction_of_correct_next_step_preds:0.4269422 fraction_of_correct_next_step_preds/logits:0.4269422 grad_norm/all:1.5170504 grad_scale_all:0.65917391 log_pplx:1.9341648 log_pplx/logits:1.9341648 loss:1.9341648 loss/logits:1.9341648 num_samples_in_batch:128 var_norm/all:703.4743 I0420 07:50:05.952049 140375142418176 trainer.py:371] Steps/second: 0.126583, Examples/second: 17.121908 I0420 07:50:08.352792 140375134025472 trainer.py:511] time: 7.932153 I0420 07:50:08.353964 140375134025472 trainer.py:522] step: 142 fraction_of_correct_next_step_preds:0.41864634 fraction_of_correct_next_step_preds/logits:0.41864634 grad_norm/all:0.79958361 grad_scale_all:1 log_pplx:1.9402517 log_pplx/logits:1.9402517 loss:1.9402517 loss/logits:1.9402517 num_samples_in_batch:128 var_norm/all:703.46191 I0420 07:50:15.772907 140375134025472 trainer.py:511] time: 7.418736 I0420 07:50:15.774157 140375134025472 trainer.py:522] step: 143 fraction_of_correct_next_step_preds:0.42049187 fraction_of_correct_next_step_preds/logits:0.42049187 grad_norm/all:1.1368365 grad_scale_all:0.87963396 log_pplx:1.9241009 log_pplx/logits:1.9241009 loss:1.9241009 loss/logits:1.9241009 num_samples_in_batch:128 var_norm/all:703.44934 I0420 07:50:16.017703 140375142418176 trainer.py:371] Steps/second: 0.127229, Examples/second: 17.196336 I0420 07:50:21.387326 140375134025472 trainer.py:511] time: 5.612883 I0420 07:50:21.390589 140375134025472 trainer.py:522] step: 144 fraction_of_correct_next_step_preds:0.41908437 fraction_of_correct_next_step_preds/logits:0.41908437 grad_norm/all:1.8576202 grad_scale_all:0.53832316 log_pplx:1.9595932 log_pplx/logits:1.9595932 loss:1.9595932 loss/logits:1.9595932 num_samples_in_batch:128 var_norm/all:703.43665 I0420 
07:50:25.971918 140375142418176 trainer.py:371] Steps/second: 0.126994, Examples/second: 17.158259 I0420 07:50:29.444952 140375134025472 trainer.py:511] time: 8.053926 I0420 07:50:29.446449 140375134025472 trainer.py:522] step: 145 fraction_of_correct_next_step_preds:0.42030919 fraction_of_correct_next_step_preds/logits:0.42030919 grad_norm/all:0.76838368 grad_scale_all:1 log_pplx:1.9421583 log_pplx/logits:1.9421583 loss:1.9421583 loss/logits:1.9421583 num_samples_in_batch:128 var_norm/all:703.42462 I0420 07:50:35.969763 140375142418176 trainer.py:371] Steps/second: 0.126758, Examples/second: 17.120192 I0420 07:50:36.290680 140375134025472 trainer.py:511] time: 6.844025 I0420 07:50:36.291791 140375134025472 trainer.py:522] step: 146 fraction_of_correct_next_step_preds:0.40135837 fraction_of_correct_next_step_preds/logits:0.40135837 grad_norm/all:2.3181317 grad_scale_all:0.43138188 log_pplx:1.9873381 log_pplx/logits:1.9873381 loss:1.9873381 loss/logits:1.9873381 num_samples_in_batch:128 var_norm/all:703.41223 I0420 07:50:40.472434 140375134025472 trainer.py:511] time: 4.180376 I0420 07:50:40.473872 140375134025472 trainer.py:522] step: 147 fraction_of_correct_next_step_preds:0.39410341 fraction_of_correct_next_step_preds/logits:0.39410341 grad_norm/all:1.6320084 grad_scale_all:0.61274195 log_pplx:2.002454 log_pplx/logits:2.002454 loss:2.002454 loss/logits:2.002454 num_samples_in_batch:256 var_norm/all:703.40051 I0420 07:50:45.980564 140375142418176 trainer.py:371] Steps/second: 0.127391, Examples/second: 17.304445 I0420 07:50:48.851433 140375134025472 trainer.py:511] time: 8.377389 I0420 07:50:48.852555 140375134025472 trainer.py:522] step: 148 fraction_of_correct_next_step_preds:0.41687503 fraction_of_correct_next_step_preds/logits:0.41687503 grad_norm/all:1.9557862 grad_scale_all:0.51130331 log_pplx:1.9490486 log_pplx/logits:1.9490486 loss:1.9490486 loss/logits:1.9490486 num_samples_in_batch:128 var_norm/all:703.3891 I0420 07:50:55.991750 140375142418176 
trainer.py:371] Steps/second: 0.127155, Examples/second: 17.265579 I0420 07:50:57.909780 140375134025472 trainer.py:511] time: 9.057049 I0420 07:50:57.910986 140375134025472 trainer.py:522] step: 149 fraction_of_correct_next_step_preds:0.42624637 fraction_of_correct_next_step_preds/logits:0.42624637 grad_norm/all:1.7976826 grad_scale_all:0.55627173 log_pplx:1.9306651 log_pplx/logits:1.9306651 loss:1.9306651 loss/logits:1.9306651 num_samples_in_batch:128 var_norm/all:703.37805 I0420 07:51:05.706504 140375134025472 trainer.py:511] time: 7.795318 I0420 07:51:05.707679 140375134025472 trainer.py:522] step: 150 fraction_of_correct_next_step_preds:0.41418889 fraction_of_correct_next_step_preds/logits:0.41418889 grad_norm/all:1.5489297 grad_scale_all:0.64560711 log_pplx:1.9437562 log_pplx/logits:1.9437562 loss:1.9437562 loss/logits:1.9437562 num_samples_in_batch:128 var_norm/all:703.36743 I0420 07:51:06.000524 140375142418176 trainer.py:371] Steps/second: 0.127774, Examples/second: 17.336444 I0420 07:51:06.001041 140375142418176 trainer.py:268] Save checkpoint W0420 07:51:08.298032 140375142418176 meta_graph.py:447] Issue encountered when serializing __batch_norm_update_dict. Type is unsupported, or the types of the items don't match field type in CollectionDef. Note this is a warning and probably safe to ignore. 'dict' object has no attribute 'name' W0420 07:51:08.298470 140375142418176 meta_graph.py:447] Issue encountered when serializing __model_split_id_stack. Type is unsupported, or the types of the items don't match field type in CollectionDef. Note this is a warning and probably safe to ignore. 
'list' object has no attribute 'name' I0420 07:51:08.496934 140375142418176 trainer.py:270] Save checkpoint done: /data/dingzhenyou/speech_data/librispeech/log/train/ckpt-00000150 I0420 07:51:12.993555 140375134025472 trainer.py:511] time: 7.285614 I0420 07:51:12.994955 140375134025472 trainer.py:522] step: 151 fraction_of_correct_next_step_preds:0.41762453 fraction_of_correct_next_step_preds/logits:0.41762453 grad_norm/all:1.4242008 grad_scale_all:0.7021482 log_pplx:1.9302332 log_pplx/logits:1.9302332 loss:1.9302332 loss/logits:1.9302332 num_samples_in_batch:128 var_norm/all:703.35681 I0420 07:51:16.007457 140375142418176 trainer.py:371] Steps/second: 0.127539, Examples/second: 17.298026 I0420 07:51:18.698502 140375134025472 trainer.py:511] time: 5.703286 I0420 07:51:18.699486 140375134025472 trainer.py:522] step: 152 fraction_of_correct_next_step_preds:0.42245334 fraction_of_correct_next_step_preds/logits:0.42245334 grad_norm/all:1.4734817 grad_scale_all:0.67866468 log_pplx:1.9470966 log_pplx/logits:1.9470966 loss:1.9470966 loss/logits:1.9470966 num_samples_in_batch:128 var_norm/all:703.34631 I0420 07:51:26.018496 140375142418176 trainer.py:371] Steps/second: 0.127307, Examples/second: 17.260195 I0420 07:51:26.805413 140375134025472 trainer.py:511] time: 8.105734 I0420 07:51:26.806492 140375134025472 trainer.py:522] step: 153 fraction_of_correct_next_step_preds:0.41691357 fraction_of_correct_next_step_preds/logits:0.41691357 grad_norm/all:1.4528413 grad_scale_all:0.68830645 log_pplx:1.9422051 log_pplx/logits:1.9422051 loss:1.9422051 loss/logits:1.9422051 num_samples_in_batch:128 var_norm/all:703.33582 I0420 07:51:33.581444 140375134025472 trainer.py:511] time: 6.774749 I0420 07:51:33.582201 140375134025472 trainer.py:522] step: 154 fraction_of_correct_next_step_preds:0.42445534 fraction_of_correct_next_step_preds/logits:0.42445534 grad_norm/all:1.031893 grad_scale_all:0.96909273 log_pplx:1.9087092 log_pplx/logits:1.9087092 loss:1.9087092 loss/logits:1.9087092 
num_samples_in_batch:128 var_norm/all:703.32538 I0420 07:51:36.028353 140375142418176 trainer.py:371] Steps/second: 0.127910, Examples/second: 17.329321 I0420 07:51:41.803100 140375134025472 trainer.py:511] time: 8.220578 I0420 07:51:41.804466 140375134025472 trainer.py:522] step: 155 fraction_of_correct_next_step_preds:0.42430681 fraction_of_correct_next_step_preds/logits:0.42430681 grad_norm/all:1.2035612 grad_scale_all:0.83086759 log_pplx:1.9094154 log_pplx/logits:1.9094154 loss:1.9094154 loss/logits:1.9094154 num_samples_in_batch:128 var_norm/all:703.31458 I0420 07:51:46.037686 140375142418176 trainer.py:371] Steps/second: 0.127679, Examples/second: 17.291878 I0420 07:51:49.656562 140375134025472 trainer.py:511] time: 7.851894 I0420 07:51:49.657808 140375134025472 trainer.py:522] step: 156 fraction_of_correct_next_step_preds:0.43289766 fraction_of_correct_next_step_preds/logits:0.43289766 grad_norm/all:0.75488198 grad_scale_all:1 log_pplx:1.8982086 log_pplx/logits:1.8982086 loss:1.8982086 loss/logits:1.8982086 num_samples_in_batch:128 var_norm/all:703.30353 I0420 07:51:56.048183 140375142418176 trainer.py:371] Steps/second: 0.127452, Examples/second: 17.255033 I0420 07:51:58.760415 140375134025472 trainer.py:511] time: 9.102371 I0420 07:51:58.761346 140375134025472 trainer.py:522] step: 157 fraction_of_correct_next_step_preds:0.43443087 fraction_of_correct_next_step_preds/logits:0.43443087 grad_norm/all:1.1150398 grad_scale_all:0.89682895 log_pplx:1.9025295 log_pplx/logits:1.9025295 loss:1.9025295 loss/logits:1.9025295 num_samples_in_batch:128 var_norm/all:703.29211 I0420 07:52:06.058149 140375142418176 trainer.py:371] Steps/second: 0.127228, Examples/second: 17.218792 I0420 07:52:06.211291 140375134025472 trainer.py:511] time: 7.449661 I0420 07:52:06.212383 140375134025472 trainer.py:522] step: 158 fraction_of_correct_next_step_preds:0.43455517 fraction_of_correct_next_step_preds/logits:0.43455517 grad_norm/all:0.75359613 grad_scale_all:1 log_pplx:1.8875916 
log_pplx/logits:1.8875916 loss:1.8875916 loss/logits:1.8875916 num_samples_in_batch:128 var_norm/all:703.28046 I0420 07:52:11.963902 140375134025472 trainer.py:511] time: 5.751313 I0420 07:52:11.964863 140375134025472 trainer.py:522] step: 159 fraction_of_correct_next_step_preds:0.42446634 fraction_of_correct_next_step_preds/logits:0.42446634 grad_norm/all:0.78566986 grad_scale_all:1 log_pplx:1.9137037 log_pplx/logits:1.9137037 loss:1.9137037 loss/logits:1.9137037 num_samples_in_batch:128 var_norm/all:703.26849 I0420 07:52:16.069747 140375142418176 trainer.py:371] Steps/second: 0.127812, Examples/second: 17.286004 I0420 07:52:19.933387 140375134025472 trainer.py:511] time: 7.968307 I0420 07:52:19.934446 140375134025472 trainer.py:522] step: 160 fraction_of_correct_next_step_preds:0.43351826 fraction_of_correct_next_step_preds/logits:0.43351826 grad_norm/all:0.69631839 grad_scale_all:1 log_pplx:1.8786674 log_pplx/logits:1.8786674 loss:1.8786674 loss/logits:1.8786674 num_samples_in_batch:128 var_norm/all:703.2561 I0420 07:52:26.076610 140375142418176 trainer.py:371] Steps/second: 0.127590, Examples/second: 17.250133 I0420 07:52:27.057111 140375134025472 trainer.py:511] time: 7.122404 I0420 07:52:27.058207 140375134025472 trainer.py:522] step: 161 fraction_of_correct_next_step_preds:0.43161297 fraction_of_correct_next_step_preds/logits:0.43161297 grad_norm/all:0.67985362 grad_scale_all:1 log_pplx:1.8794951 log_pplx/logits:1.8794951 loss:1.8794951 loss/logits:1.8794951 num_samples_in_batch:128 var_norm/all:703.24347 I0420 07:52:31.308384 140375134025472 trainer.py:511] time: 4.249896 I0420 07:52:31.309541 140375134025472 trainer.py:522] step: 162 fraction_of_correct_next_step_preds:0.42555854 fraction_of_correct_next_step_preds/logits:0.42555854 grad_norm/all:0.81493938 grad_scale_all:1 log_pplx:1.926751 log_pplx/logits:1.926751 loss:1.926751 loss/logits:1.926751 num_samples_in_batch:256 var_norm/all:703.23053 I0420 07:52:36.087415 140375142418176 trainer.py:371] 
Steps/second: 0.128162, Examples/second: 17.417308 I0420 07:52:39.672578 140375134025472 trainer.py:511] time: 8.335060 I0420 07:52:39.673749 140375134025472 trainer.py:522] step: 163 fraction_of_correct_next_step_preds:0.4331291 fraction_of_correct_next_step_preds/logits:0.4331291 grad_norm/all:0.95574957 grad_scale_all:1 log_pplx:1.8833166 log_pplx/logits:1.8833166 loss:1.8833166 loss/logits:1.8833166 num_samples_in_batch:128 var_norm/all:703.21741 I0420 07:52:46.096961 140375142418176 trainer.py:371] Steps/second: 0.127940, Examples/second: 17.380935 I0420 07:52:48.769392 140375134025472 trainer.py:511] time: 9.095292 I0420 07:52:48.770503 140375134025472 trainer.py:522] step: 164 fraction_of_correct_next_step_preds:0.43304643 fraction_of_correct_next_step_preds/logits:0.43304643 grad_norm/all:0.83339173 grad_scale_all:1 log_pplx:1.8863426 log_pplx/logits:1.8863426 loss:1.8863426 loss/logits:1.8863426 num_samples_in_batch:128 var_norm/all:703.20416 I0420 07:52:56.107681 140375142418176 trainer.py:371] Steps/second: 0.127721, Examples/second: 17.345115 I0420 07:52:56.502384 140375134025472 trainer.py:511] time: 7.731666 I0420 07:52:56.503449 140375134025472 trainer.py:522] step: 165 fraction_of_correct_next_step_preds:0.43856791 fraction_of_correct_next_step_preds/logits:0.43856791 grad_norm/all:0.69169378 grad_scale_all:1 log_pplx:1.8613528 log_pplx/logits:1.8613528 loss:1.8613528 loss/logits:1.8613528 num_samples_in_batch:128 var_norm/all:703.19061 I0420 07:53:03.854849 140375134025472 trainer.py:511] time: 7.351137 I0420 07:53:03.856317 140375134025472 trainer.py:522] step: 166 fraction_of_correct_next_step_preds:0.43318194 fraction_of_correct_next_step_preds/logits:0.43318194 grad_norm/all:0.88919246 grad_scale_all:1 log_pplx:1.8757007 log_pplx/logits:1.8757007 loss:1.8757007 loss/logits:1.8757007 num_samples_in_batch:128 var_norm/all:703.17694 I0420 07:53:06.116745 140375142418176 trainer.py:371] Steps/second: 0.128279, Examples/second: 17.408784 I0420 
07:53:09.463217 140375134025472 trainer.py:511] time: 5.606602 I0420 07:53:09.464037 140375134025472 trainer.py:522] step: 167 fraction_of_correct_next_step_preds:0.43450034 fraction_of_correct_next_step_preds/logits:0.43450034 grad_norm/all:1.098273 grad_scale_all:0.91052037 log_pplx:1.8758377 log_pplx/logits:1.8758377 loss:1.8758377 loss/logits:1.8758377 num_samples_in_batch:128 var_norm/all:703.16309 I0420 07:53:16.127732 140375142418176 trainer.py:371] Steps/second: 0.128061, Examples/second: 17.373296 I0420 07:53:17.562886 140375134025472 trainer.py:511] time: 8.098476 I0420 07:53:17.563997 140375134025472 trainer.py:522] step: 168 fraction_of_correct_next_step_preds:0.43977895 fraction_of_correct_next_step_preds/logits:0.43977895 grad_norm/all:0.50802606 grad_scale_all:1 log_pplx:1.8625075 log_pplx/logits:1.8625075 loss:1.8625075 loss/logits:1.8625075 num_samples_in_batch:128 var_norm/all:703.14923 I0420 07:53:24.470643 140375134025472 trainer.py:511] time: 6.906448 I0420 07:53:24.471700 140375134025472 trainer.py:522] step: 169 fraction_of_correct_next_step_preds:0.43477678 fraction_of_correct_next_step_preds/logits:0.43477678 grad_norm/all:0.91479504 grad_scale_all:1 log_pplx:1.8775673 log_pplx/logits:1.8775673 loss:1.8775673 loss/logits:1.8775673 num_samples_in_batch:128 var_norm/all:703.13519 I0420 07:53:26.137291 140375142418176 trainer.py:371] Steps/second: 0.128607, Examples/second: 17.435774 I0420 07:53:33.559447 140375134025472 trainer.py:511] time: 9.087479 I0420 07:53:33.560532 140375134025472 trainer.py:522] step: 170 fraction_of_correct_next_step_preds:0.44339436 fraction_of_correct_next_step_preds/logits:0.44339436 grad_norm/all:0.80379444 grad_scale_all:1 log_pplx:1.8566017 log_pplx/logits:1.8566017 loss:1.8566017 loss/logits:1.8566017 num_samples_in_batch:128 var_norm/all:703.12097 I0420 07:53:36.147758 140375142418176 trainer.py:371] Steps/second: 0.128390, Examples/second: 17.400624 I0420 07:53:41.228625 140375134025472 trainer.py:511] time: 
7.667775 I0420 07:53:41.229818 140375134025472 trainer.py:522] step: 171 fraction_of_correct_next_step_preds:0.44055197 fraction_of_correct_next_step_preds/logits:0.44055197 grad_norm/all:0.43629116 grad_scale_all:1 log_pplx:1.8634899 log_pplx/logits:1.8634899 loss:1.8634899 loss/logits:1.8634899 num_samples_in_batch:128 var_norm/all:703.10669 I0420 07:53:46.150906 140375142418176 trainer.py:371] Steps/second: 0.128177, Examples/second: 17.366098 I0420 07:53:49.603144 140375134025472 trainer.py:511] time: 8.372799 I0420 07:53:49.604515 140375134025472 trainer.py:522] step: 172 fraction_of_correct_next_step_preds:0.4352791 fraction_of_correct_next_step_preds/logits:0.4352791 grad_norm/all:0.70247895 grad_scale_all:1 log_pplx:1.863916 log_pplx/logits:1.863916 loss:1.863916 loss/logits:1.863916 num_samples_in_batch:128 var_norm/all:703.09222 I0420 07:53:56.157335 140375142418176 trainer.py:371] Steps/second: 0.127967, Examples/second: 17.332044 I0420 07:53:56.834737 140375134025472 trainer.py:511] time: 7.229920 I0420 07:53:56.835755 140375134025472 trainer.py:522] step: 173 fraction_of_correct_next_step_preds:0.43851206 fraction_of_correct_next_step_preds/logits:0.43851206 grad_norm/all:0.53970212 grad_scale_all:1 log_pplx:1.8514322 log_pplx/logits:1.8514322 loss:1.8514322 loss/logits:1.8514322 num_samples_in_batch:128 var_norm/all:703.07758 I0420 07:54:03.801841 140375134025472 trainer.py:511] time: 6.965903 I0420 07:54:03.802872 140375134025472 trainer.py:522] step: 174 fraction_of_correct_next_step_preds:0.43447852 fraction_of_correct_next_step_preds/logits:0.43447852 grad_norm/all:0.76130515 grad_scale_all:1 log_pplx:1.8808705 log_pplx/logits:1.8808705 loss:1.8808705 loss/logits:1.8808705 num_samples_in_batch:128 var_norm/all:703.06281 I0420 07:54:06.168541 140375142418176 trainer.py:371] Steps/second: 0.128498, Examples/second: 17.392959 I0420 07:54:09.511055 140375134025472 trainer.py:511] time: 5.707862 I0420 07:54:09.512670 140375134025472 trainer.py:522] 
step: 175 fraction_of_correct_next_step_preds:0.43812004 fraction_of_correct_next_step_preds/logits:0.43812004 grad_norm/all:0.58219862 grad_scale_all:1 log_pplx:1.8744855 log_pplx/logits:1.8744855 loss:1.8744855 loss/logits:1.8744855 num_samples_in_batch:128 var_norm/all:703.04791 I0420 07:54:16.176268 140375142418176 trainer.py:371] Steps/second: 0.128288, Examples/second: 17.359190 I0420 07:54:18.592397 140375134025472 trainer.py:511] time: 9.079428 I0420 07:54:18.593550 140375134025472 trainer.py:522] step: 176 fraction_of_correct_next_step_preds:0.44414341 fraction_of_correct_next_step_preds/logits:0.44414341 grad_norm/all:0.6023612 grad_scale_all:1 log_pplx:1.8393581 log_pplx/logits:1.8393581 loss:1.8393581 loss/logits:1.8393581 num_samples_in_batch:128 var_norm/all:703.03296 I0420 07:54:26.179130 140375142418176 trainer.py:371] Steps/second: 0.128082, Examples/second: 17.325976 I0420 07:54:26.692585 140375134025472 trainer.py:511] time: 8.098862 I0420 07:54:26.693692 140375134025472 trainer.py:522] step: 177 fraction_of_correct_next_step_preds:0.44614813 fraction_of_correct_next_step_preds/logits:0.44614813 grad_norm/all:0.41581571 grad_scale_all:1 log_pplx:1.8420435 log_pplx/logits:1.8420435 loss:1.8420435 loss/logits:1.8420435 num_samples_in_batch:128 var_norm/all:703.01788 I0420 07:54:34.614481 140375134025472 trainer.py:511] time: 7.920510 I0420 07:54:34.617162 140375134025472 trainer.py:522] step: 178 fraction_of_correct_next_step_preds:0.44184723 fraction_of_correct_next_step_preds/logits:0.44184723 grad_norm/all:0.62634295 grad_scale_all:1 log_pplx:1.8419659 log_pplx/logits:1.8419659 loss:1.8419659 loss/logits:1.8419659 num_samples_in_batch:128 var_norm/all:703.00269 I0420 07:54:36.192054 140375142418176 trainer.py:371] Steps/second: 0.128600, Examples/second: 17.478068 I0420 07:54:38.892163 140375134025472 trainer.py:511] time: 4.274740 I0420 07:54:38.893187 140375134025472 trainer.py:522] step: 179 fraction_of_correct_next_step_preds:0.43799087 
fraction_of_correct_next_step_preds/logits:0.43799087 grad_norm/all:0.61439598 grad_scale_all:1 log_pplx:1.8842611 log_pplx/logits:1.8842611 loss:1.8842611 loss/logits:1.8842611 num_samples_in_batch:256 var_norm/all:702.98743 I0420 07:54:46.200571 140375142418176 trainer.py:371] Steps/second: 0.128394, Examples/second: 17.444407 I0420 07:54:47.041579 140375134025472 trainer.py:511] time: 8.148187 I0420 07:54:47.042654 140375134025472 trainer.py:522] step: 180 fraction_of_correct_next_step_preds:0.44396985 fraction_of_correct_next_step_preds/logits:0.44396985 grad_norm/all:0.8471067 grad_scale_all:1 log_pplx:1.8453927 log_pplx/logits:1.8453927 loss:1.8453927 loss/logits:1.8453927 num_samples_in_batch:128 var_norm/all:702.97211 I0420 07:54:54.404463 140375134025472 trainer.py:511] time: 7.361601 I0420 07:54:54.405478 140375134025472 trainer.py:522] step: 181 fraction_of_correct_next_step_preds:0.44369069 fraction_of_correct_next_step_preds/logits:0.44369069 grad_norm/all:0.48298407 grad_scale_all:1 log_pplx:1.8413055 log_pplx/logits:1.8413055 loss:1.8413055 loss/logits:1.8413055 num_samples_in_batch:128 var_norm/all:702.95679 I0420 07:54:56.209449 140375142418176 trainer.py:371] Steps/second: 0.128903, Examples/second: 17.502377 I0420 07:55:01.492369 140375134025472 trainer.py:511] time: 7.086571 I0420 07:55:01.493474 140375134025472 trainer.py:522] step: 182 fraction_of_correct_next_step_preds:0.44348097 fraction_of_correct_next_step_preds/logits:0.44348097 grad_norm/all:0.70817178 grad_scale_all:1 log_pplx:1.8433393 log_pplx/logits:1.8433393 loss:1.8433393 loss/logits:1.8433393 num_samples_in_batch:128 var_norm/all:702.94135 I0420 07:55:06.219089 140375142418176 trainer.py:371] Steps/second: 0.128698, Examples/second: 17.469006 I0420 07:55:10.661694 140375134025472 trainer.py:511] time: 9.167789 I0420 07:55:10.662916 140375134025472 trainer.py:522] step: 183 fraction_of_correct_next_step_preds:0.44200012 fraction_of_correct_next_step_preds/logits:0.44200012 
grad_norm/all:0.78830743 grad_scale_all:1 log_pplx:1.8404706 log_pplx/logits:1.8404706 loss:1.8404706 loss/logits:1.8404706 num_samples_in_batch:128 var_norm/all:702.92584 I0420 07:55:16.229026 140375142418176 trainer.py:371] Steps/second: 0.128496, Examples/second: 17.436100 I0420 07:55:16.485204 140375134025472 trainer.py:511] time: 5.821826 I0420 07:55:16.486191 140375134025472 trainer.py:522] step: 184 fraction_of_correct_next_step_preds:0.44723406 fraction_of_correct_next_step_preds/logits:0.44723406 grad_norm/all:0.84321076 grad_scale_all:1 log_pplx:1.8484147 log_pplx/logits:1.8484147 loss:1.8484147 loss/logits:1.8484147 num_samples_in_batch:128 var_norm/all:702.91028 I0420 07:55:24.522628 140375134025472 trainer.py:511] time: 8.036144 I0420 07:55:24.524127 140375134025472 trainer.py:522] step: 185 fraction_of_correct_next_step_preds:0.44457987 fraction_of_correct_next_step_preds/logits:0.44457987 grad_norm/all:0.94873601 grad_scale_all:1 log_pplx:1.8245742 log_pplx/logits:1.8245742 loss:1.8245742 loss/logits:1.8245742 num_samples_in_batch:128 var_norm/all:702.89471 I0420 07:55:26.238769 140375142418176 trainer.py:371] Steps/second: 0.128993, Examples/second: 17.492905 I0420 07:55:32.738296 140375134025472 trainer.py:511] time: 8.213897 I0420 07:55:32.740339 140375134025472 trainer.py:522] step: 186 fraction_of_correct_next_step_preds:0.4430176 fraction_of_correct_next_step_preds/logits:0.4430176 grad_norm/all:0.93333519 grad_scale_all:1 log_pplx:1.8442669 log_pplx/logits:1.8442669 loss:1.8442669 loss/logits:1.8442669 num_samples_in_batch:128 var_norm/all:702.87897 I0420 07:55:36.248698 140375142418176 trainer.py:371] Steps/second: 0.128792, Examples/second: 17.460290 I0420 07:55:40.608198 140375134025472 trainer.py:511] time: 7.867668 I0420 07:55:40.609160 140375134025472 trainer.py:522] step: 187 fraction_of_correct_next_step_preds:0.44213852 fraction_of_correct_next_step_preds/logits:0.44213852 grad_norm/all:0.77158284 grad_scale_all:1 log_pplx:1.8315784 
log_pplx/logits:1.8315784 loss:1.8315784 loss/logits:1.8315784 num_samples_in_batch:128 var_norm/all:702.86328 I0420 07:55:46.251959 140375142418176 trainer.py:371] Steps/second: 0.128594, Examples/second: 17.428206 I0420 07:55:48.021662 140375134025472 trainer.py:511] time: 7.412346 I0420 07:55:48.022475 140375134025472 trainer.py:522] step: 188 fraction_of_correct_next_step_preds:0.4498955 fraction_of_correct_next_step_preds/logits:0.4498955 grad_norm/all:0.72052014 grad_scale_all:1 log_pplx:1.8241475 log_pplx/logits:1.8241475 loss:1.8241475 loss/logits:1.8241475 num_samples_in_batch:128 var_norm/all:702.84747 I0420 07:55:56.254241 140375142418176 trainer.py:371] Steps/second: 0.128398, Examples/second: 17.396567 I0420 07:55:57.268708 140375134025472 trainer.py:511] time: 9.246066 I0420 07:55:57.269650 140375134025472 trainer.py:522] step: 189 fraction_of_correct_next_step_preds:0.45732975 fraction_of_correct_next_step_preds/logits:0.45732975 grad_norm/all:0.61174208 grad_scale_all:1 log_pplx:1.8068582 log_pplx/logits:1.8068582 loss:1.8068582 loss/logits:1.8068582 num_samples_in_batch:128 var_norm/all:702.8316 I0420 07:56:03.981911 140375134025472 trainer.py:511] time: 6.711774 I0420 07:56:03.983181 140375134025472 trainer.py:522] step: 190 fraction_of_correct_next_step_preds:0.44194993 fraction_of_correct_next_step_preds/logits:0.44194993 grad_norm/all:0.84591258 grad_scale_all:1 log_pplx:1.8346351 log_pplx/logits:1.8346351 loss:1.8346351 loss/logits:1.8346351 num_samples_in_batch:128 var_norm/all:702.81567 I0420 07:56:06.265222 140375142418176 trainer.py:371] Steps/second: 0.128883, Examples/second: 17.452085 I0420 07:56:09.590390 140375134025472 trainer.py:511] time: 5.606873 I0420 07:56:09.591136 140375134025472 trainer.py:522] step: 191 fraction_of_correct_next_step_preds:0.44372281 fraction_of_correct_next_step_preds/logits:0.44372281 grad_norm/all:0.93953317 grad_scale_all:1 log_pplx:1.8477401 log_pplx/logits:1.8477401 loss:1.8477401 loss/logits:1.8477401 
num_samples_in_batch:128 var_norm/all:702.79968 I0420 07:56:16.276658 140375142418176 trainer.py:371] Steps/second: 0.128687, Examples/second: 17.420608 I0420 07:56:17.942615 140375134025472 trainer.py:511] time: 8.351234 I0420 07:56:17.943763 140375134025472 trainer.py:522] step: 192 fraction_of_correct_next_step_preds:0.44863379 fraction_of_correct_next_step_preds/logits:0.44863379 grad_norm/all:0.96895403 grad_scale_all:1 log_pplx:1.819329 log_pplx/logits:1.819329 loss:1.819329 loss/logits:1.819329 num_samples_in_batch:128 var_norm/all:702.78369 I0420 07:56:26.073008 140375134025472 trainer.py:511] time: 8.128952 I0420 07:56:26.074163 140375134025472 trainer.py:522] step: 193 fraction_of_correct_next_step_preds:0.44438726 fraction_of_correct_next_step_preds/logits:0.44438726 grad_norm/all:0.88765609 grad_scale_all:1 log_pplx:1.83139 log_pplx/logits:1.83139 loss:1.83139 loss/logits:1.83139 num_samples_in_batch:128 var_norm/all:702.76752 I0420 07:56:26.339052 140375142418176 trainer.py:371] Steps/second: 0.129159, Examples/second: 17.474617 I0420 07:56:33.424146 140375134025472 trainer.py:511] time: 7.349628 I0420 07:56:33.425277 140375134025472 trainer.py:522] step: 194 fraction_of_correct_next_step_preds:0.45994666 fraction_of_correct_next_step_preds/logits:0.45994666 grad_norm/all:0.60090661 grad_scale_all:1 log_pplx:1.7978886 log_pplx/logits:1.7978886 loss:1.7978886 loss/logits:1.7978886 num_samples_in_batch:128 var_norm/all:702.7514 I0420 07:56:36.295216 140375142418176 trainer.py:371] Steps/second: 0.128969, Examples/second: 17.444049 I0420 07:56:41.154331 140375134025472 trainer.py:511] time: 7.728805 I0420 07:56:41.155858 140375134025472 trainer.py:522] step: 195 fraction_of_correct_next_step_preds:0.4580164 fraction_of_correct_next_step_preds/logits:0.4580164 grad_norm/all:0.74558669 grad_scale_all:1 log_pplx:1.7964088 log_pplx/logits:1.7964088 loss:1.7964088 loss/logits:1.7964088 num_samples_in_batch:128 var_norm/all:702.73523 I0420 07:56:45.528053 
140375134025472 trainer.py:511] time: 4.371963
I0420 07:56:45.528832 140375134025472 trainer.py:522] step: 196 fraction_of_correct_next_step_preds:0.44655734 fraction_of_correct_next_step_preds/logits:0.44655734 grad_norm/all:0.8231467 grad_scale_all:1 log_pplx:1.8481317 log_pplx/logits:1.8481317 loss:1.8481317 loss/logits:1.8481317 num_samples_in_batch:256 var_norm/all:702.71906
I0420 07:56:46.304059 140375142418176 trainer.py:371] Steps/second: 0.129437, Examples/second: 17.582339
I0420 07:56:54.434890 140375134025472 trainer.py:511] time: 8.905689
I0420 07:56:54.436141 140375134025472 trainer.py:522] step: 197 fraction_of_correct_next_step_preds:0.45484895 fraction_of_correct_next_step_preds/logits:0.45484895 grad_norm/all:0.79258645 grad_scale_all:1 log_pplx:1.8016788 log_pplx/logits:1.8016788 loss:1.8016788 loss/logits:1.8016788 num_samples_in_batch:128 var_norm/all:702.70276
I0420 07:56:56.315849 140375142418176 trainer.py:371] Steps/second: 0.129243, Examples/second: 17.550831
I0420 07:57:01.329133 140375134025472 trainer.py:511] time: 6.892796
I0420 07:57:01.330216 140375134025472 trainer.py:522] step: 198 fraction_of_correct_next_step_preds:0.44605467 fraction_of_correct_next_step_preds/logits:0.44605467 grad_norm/all:1.2284788 grad_scale_all:0.81401485 log_pplx:1.8213179 log_pplx/logits:1.8213179 loss:1.8213179 loss/logits:1.8213179 num_samples_in_batch:128 var_norm/all:702.68652
I0420 07:57:06.324517 140375142418176 trainer.py:371] Steps/second: 0.129052, Examples/second: 17.519764
I0420 07:57:09.592889 140375134025472 trainer.py:511] time: 8.262392
I0420 07:57:09.595169 140375134025472 trainer.py:522] step: 199 fraction_of_correct_next_step_preds:0.45952439 fraction_of_correct_next_step_preds/logits:0.45952439 grad_norm/all:0.93626809 grad_scale_all:1 log_pplx:1.807832 log_pplx/logits:1.807832 loss:1.807832 loss/logits:1.807832 num_samples_in_batch:128 var_norm/all:702.67047
I0420 07:57:15.281085 140375134025472 trainer.py:511] time: 5.685653
I0420 07:57:15.282196 140375134025472 trainer.py:522] step: 200 fraction_of_correct_next_step_preds:0.45597693 fraction_of_correct_next_step_preds/logits:0.45597693 grad_norm/all:1.0720363 grad_scale_all:0.93280429 log_pplx:1.8038468 log_pplx/logits:1.8038468 loss:1.8038468 loss/logits:1.8038468 num_samples_in_batch:128 var_norm/all:702.65454
I0420 07:57:16.325232 140375142418176 trainer.py:371] Steps/second: 0.129511, Examples/second: 17.572081
I0420 07:57:23.514617 140375134025472 trainer.py:511] time: 8.014513
I0420 07:57:23.515427 140375134025472 base_runner.py:115] step: 201 fraction_of_correct_next_step_preds:0.46213683 fraction_of_correct_next_step_preds/logits:0.46213683 grad_norm/all:0.51647431 grad_scale_all:1 log_pplx:1.775112 log_pplx/logits:1.775112 loss:1.775112 loss/logits:1.775112 num_samples_in_batch:128 var_norm/all:702.63855
I0420 07:57:26.336309 140375142418176 trainer.py:371] Steps/second: 0.129320, Examples/second: 17.541254
I0420 07:57:26.337402 140375142418176 trainer.py:275] Write summary @201
2019-04-20 07:57:26.341258: I lingvo/core/ops/record_batcher.cc:344] 1537 total seconds passed. Total records yielded: 304. Total records skipped: 0
I0420 07:57:32.520065 140375134025472 trainer.py:511] time: 9.004160
I0420 07:57:32.524442 140375134025472 trainer.py:522] step: 202 fraction_of_correct_next_step_preds:0.44965822 fraction_of_correct_next_step_preds/logits:0.44965822 grad_norm/all:0.92829382 grad_scale_all:1 log_pplx:1.8050215 log_pplx/logits:1.8050215 loss:1.8050215 loss/logits:1.8050215 num_samples_in_batch:128 var_norm/all:702.62244
I0420 07:57:42.691868 140375134025472 trainer.py:511] time: 10.166058
I0420 07:57:42.693574 140375134025472 trainer.py:522] step: 203 fraction_of_correct_next_step_preds:0.45663935 fraction_of_correct_next_step_preds/logits:0.45663935 grad_norm/all:0.89732426 grad_scale_all:1 log_pplx:1.7954292 log_pplx/logits:1.7954292 loss:1.7954292 loss/logits:1.7954292 num_samples_in_batch:128 var_norm/all:702.60632
I0420 07:57:56.001332 140375134025472 trainer.py:511] time: 13.307436
I0420 07:57:56.002633 140375134025472 trainer.py:522] step: 204 fraction_of_correct_next_step_preds:0.46008012 fraction_of_correct_next_step_preds/logits:0.46008012 grad_norm/all:0.80453771 grad_scale_all:1 log_pplx:1.7907335 log_pplx/logits:1.7907335 loss:1.7907335 loss/logits:1.7907335 num_samples_in_batch:128 var_norm/all:702.59015
I0420 07:58:06.379566 140375134025472 trainer.py:511] time: 10.376438
I0420 07:58:06.381184 140375134025472 trainer.py:522] step: 205 fraction_of_correct_next_step_preds:0.45583883 fraction_of_correct_next_step_preds/logits:0.45583883 grad_norm/all:0.98260665 grad_scale_all:1 log_pplx:1.7955699 log_pplx/logits:1.7955699 loss:1.7955699 loss/logits:1.7955699 num_samples_in_batch:128 var_norm/all:702.57391
I0420 07:58:11.054907 140375142418176 trainer.py:284] Write summary done: step 201
I0420 07:58:11.064054 140375142418176 base_runner.py:115] step: 201, steps/sec: 0.13, examples/sec: 17.54
I0420 07:58:11.067320 140375142418176 trainer.py:371] Steps/second: 0.128204, Examples/second: 17.370748
I0420 07:58:16.056494 140375134025472 trainer.py:511] time: 9.675025
I0420 07:58:16.057492 140375134025472 trainer.py:522] step: 206 fraction_of_correct_next_step_preds:0.46253374 fraction_of_correct_next_step_preds/logits:0.46253374 grad_norm/all:0.79528224 grad_scale_all:1 log_pplx:1.7718325 log_pplx/logits:1.7718325 loss:1.7718325 loss/logits:1.7718325 num_samples_in_batch:128 var_norm/all:702.55756
I0420 07:58:21.075256 140375142418176 trainer.py:371] Steps/second: 0.128028, Examples/second: 17.342256
I0420 07:58:21.783164 140375134025472 trainer.py:511] time: 5.725306
I0420 07:58:21.784272 140375134025472 trainer.py:522] step: 207 fraction_of_correct_next_step_preds:0.45435455 fraction_of_correct_next_step_preds/logits:0.45435455 grad_norm/all:0.87439638 grad_scale_all:1 log_pplx:1.8060277 log_pplx/logits:1.8060277 loss:1.8060277 loss/logits:1.8060277 num_samples_in_batch:128 var_norm/all:702.54114
I0420 07:58:29.925393 140375134025472 trainer.py:511] time: 8.140813
I0420 07:58:29.926698 140375134025472 trainer.py:522] step: 208 fraction_of_correct_next_step_preds:0.46172714 fraction_of_correct_next_step_preds/logits:0.46172714 grad_norm/all:0.7580263 grad_scale_all:1 log_pplx:1.7839309 log_pplx/logits:1.7839309 loss:1.7839309 loss/logits:1.7839309 num_samples_in_batch:128 var_norm/all:702.52466
I0420 07:58:31.081680 140375142418176 trainer.py:371] Steps/second: 0.128472, Examples/second: 17.393191
I0420 07:58:37.610832 140375134025472 trainer.py:511] time: 7.683848
I0420 07:58:37.611921 140375134025472 trainer.py:522] step: 209 fraction_of_correct_next_step_preds:0.45912775 fraction_of_correct_next_step_preds/logits:0.45912775 grad_norm/all:0.59735614 grad_scale_all:1 log_pplx:1.7847834 log_pplx/logits:1.7847834 loss:1.7847834 loss/logits:1.7847834 num_samples_in_batch:128 var_norm/all:702.50818
I0420 07:58:41.092617 140375142418176 trainer.py:371] Steps/second: 0.128297, Examples/second: 17.364879
I0420 07:58:45.093353 140375134025472 trainer.py:511] time: 7.481086
I0420 07:58:45.094466 140375134025472 trainer.py:522] step: 210 fraction_of_correct_next_step_preds:0.45871964 fraction_of_correct_next_step_preds/logits:0.45871964 grad_norm/all:0.63530314 grad_scale_all:1 log_pplx:1.7760886 log_pplx/logits:1.7760886 loss:1.7760886 loss/logits:1.7760886 num_samples_in_batch:128 var_norm/all:702.49158
I0420 07:58:51.101516 140375142418176 trainer.py:371] Steps/second: 0.128123, Examples/second: 17.336934
I0420 07:58:54.138889 140375134025472 trainer.py:511] time: 9.044044
I0420 07:58:54.139625 140375134025472 trainer.py:522] step: 211 fraction_of_correct_next_step_preds:0.4615027 fraction_of_correct_next_step_preds/logits:0.4615027 grad_norm/all:0.5618149 grad_scale_all:1 log_pplx:1.7774103 log_pplx/logits:1.7774103 loss:1.7774103 loss/logits:1.7774103 num_samples_in_batch:128 var_norm/all:702.47498
I0420 07:59:01.086863 140375134025472 trainer.py:511] time: 6.946899
I0420 07:59:01.088413 140375134025472 trainer.py:522] step: 212 fraction_of_correct_next_step_preds:0.45972911 fraction_of_correct_next_step_preds/logits:0.45972911 grad_norm/all:0.69746411 grad_scale_all:1 log_pplx:1.7848744 log_pplx/logits:1.7848744 loss:1.7848744 loss/logits:1.7848744 num_samples_in_batch:128 var_norm/all:702.45831
I0420 07:59:01.174531 140375142418176 trainer.py:371] Steps/second: 0.128554, Examples/second: 17.308655
I0420 07:59:09.226310 140375134025472 trainer.py:511] time: 8.137713
I0420 07:59:09.227124 140375134025472 trainer.py:522] step: 213 fraction_of_correct_next_step_preds:0.46676597 fraction_of_correct_next_step_preds/logits:0.46676597 grad_norm/all:0.57188773 grad_scale_all:1 log_pplx:1.761503 log_pplx/logits:1.761503 loss:1.761503 loss/logits:1.761503 num_samples_in_batch:128 var_norm/all:702.44159
I0420 07:59:11.122014 140375142418176 trainer.py:371] Steps/second: 0.128386, Examples/second: 17.436331
I0420 07:59:13.427846 140375134025472 trainer.py:511] time: 4.200434
I0420 07:59:13.428952 140375134025472 trainer.py:522] step: 214 fraction_of_correct_next_step_preds:0.44703782 fraction_of_correct_next_step_preds/logits:0.44703782 grad_norm/all:0.80825353 grad_scale_all:1 log_pplx:1.8188372 log_pplx/logits:1.8188372 loss:1.8188372 loss/logits:1.8188372 num_samples_in_batch:256 var_norm/all:702.42468
I0420 07:59:19.056699 140375134025472 trainer.py:511] time: 5.627521
I0420 07:59:19.057606 140375134025472 trainer.py:522] step: 215 fraction_of_correct_next_step_preds:0.45846134 fraction_of_correct_next_step_preds/logits:0.45846134 grad_norm/all:0.78043801 grad_scale_all:1 log_pplx:1.7941139 log_pplx/logits:1.7941139 loss:1.7941139 loss/logits:1.7941139 num_samples_in_batch:128 var_norm/all:702.4079
I0420 07:59:21.130456 140375142418176 trainer.py:371] Steps/second: 0.128814, Examples/second: 17.485154
I0420 07:59:27.227977 140375134025472 trainer.py:511] time: 8.170123
I0420 07:59:27.229242 140375134025472 trainer.py:522] step: 216 fraction_of_correct_next_step_preds:0.46632123 fraction_of_correct_next_step_preds/logits:0.46632123 grad_norm/all:1.0154312 grad_scale_all:0.98480332 log_pplx:1.7582886 log_pplx/logits:1.7582886 loss:1.7582886 loss/logits:1.7582886 num_samples_in_batch:128 var_norm/all:702.39105
I0420 07:59:31.131680 140375142418176 trainer.py:371] Steps/second: 0.128642, Examples/second: 17.457238
I0420 07:59:35.057662 140375134025472 trainer.py:511] time: 7.828189
I0420 07:59:35.058621 140375134025472 trainer.py:522] step: 217 fraction_of_correct_next_step_preds:0.46656239 fraction_of_correct_next_step_preds/logits:0.46656239 grad_norm/all:0.84083319 grad_scale_all:1 log_pplx:1.7765833 log_pplx/logits:1.7765833 loss:1.7765833 loss/logits:1.7765833 num_samples_in_batch:128 var_norm/all:702.37415
I0420 07:59:41.134951 140375142418176 trainer.py:371] Steps/second: 0.128472, Examples/second: 17.429633
I0420 07:59:42.318872 140375134025472 trainer.py:511] time: 7.260002
I0420 07:59:42.319998 140375134025472 trainer.py:522] step: 218 fraction_of_correct_next_step_preds:0.46115926 fraction_of_correct_next_step_preds/logits:0.46115926 grad_norm/all:0.79595202 grad_scale_all:1 log_pplx:1.7707779 log_pplx/logits:1.7707779 loss:1.7707779 loss/logits:1.7707779 num_samples_in_batch:128 var_norm/all:702.35724
I0420 07:59:51.163680 140375142418176 trainer.py:371] Steps/second: 0.128303, Examples/second: 17.402101
I0420 07:59:51.239382 140375134025472 trainer.py:511] time: 8.919143
I0420 07:59:51.240328 140375134025472 trainer.py:522] step: 219 fraction_of_correct_next_step_preds:0.47008803 fraction_of_correct_next_step_preds/logits:0.47008803 grad_norm/all:0.54538399 grad_scale_all:1 log_pplx:1.7490183 log_pplx/logits:1.7490183 loss:1.7490183 loss/logits:1.7490183 num_samples_in_batch:128 var_norm/all:702.34021
I0420 07:59:58.012236 140375134025472 trainer.py:511] time: 6.771603
I0420 07:59:58.013267 140375134025472 trainer.py:522] step: 220 fraction_of_correct_next_step_preds:0.46866179 fraction_of_correct_next_step_preds/logits:0.46866179 grad_norm/all:0.91434407 grad_scale_all:1 log_pplx:1.7713562 log_pplx/logits:1.7713562 loss:1.7713562 loss/logits:1.7713562 num_samples_in_batch:128 var_norm/all:702.32318
I0420 08:00:01.152885 140375142418176 trainer.py:371] Steps/second: 0.128723, Examples/second: 17.450165
I0420 08:00:03.786756 140375134025472 trainer.py:511] time: 5.773228
I0420 08:00:03.788918 140375134025472 trainer.py:522] step: 221 fraction_of_correct_next_step_preds:0.45591307 fraction_of_correct_next_step_preds/logits:0.45591307 grad_norm/all:0.75105029 grad_scale_all:1 log_pplx:1.7882202 log_pplx/logits:1.7882202 loss:1.7882202 loss/logits:1.7882202 num_samples_in_batch:128 var_norm/all:702.30609
I0420 08:00:11.162324 140375142418176 trainer.py:371] Steps/second: 0.128555, Examples/second: 17.423020
I0420 08:00:11.915620 140375134025472 trainer.py:511] time: 8.126472
I0420 08:00:11.916634 140375134025472 trainer.py:522] step: 222 fraction_of_correct_next_step_preds:0.46780324 fraction_of_correct_next_step_preds/logits:0.46780324 grad_norm/all:0.84238148 grad_scale_all:1 log_pplx:1.7543784 log_pplx/logits:1.7543784 loss:1.7543784 loss/logits:1.7543784 num_samples_in_batch:128 var_norm/all:702.28894
I0420 08:00:20.346482 140375134025472 trainer.py:511] time: 8.429596
I0420 08:00:20.348227 140375134025472 trainer.py:522] step: 223 fraction_of_correct_next_step_preds:0.4653933 fraction_of_correct_next_step_preds/logits:0.4653933 grad_norm/all:0.6992746 grad_scale_all:1 log_pplx:1.7581142 log_pplx/logits:1.7581142 loss:1.7581142 loss/logits:1.7581142 num_samples_in_batch:128 var_norm/all:702.27179
I0420 08:00:21.172450 140375142418176 trainer.py:371] Steps/second: 0.128968, Examples/second: 17.470208
I0420 08:00:28.055938 140375134025472 trainer.py:511] time: 7.707268
I0420 08:00:28.056710 140375134025472 trainer.py:522] step: 224 fraction_of_correct_next_step_preds:0.46492544 fraction_of_correct_next_step_preds/logits:0.46492544 grad_norm/all:0.73407185 grad_scale_all:1 log_pplx:1.7659955 log_pplx/logits:1.7659955 loss:1.7659955 loss/logits:1.7659955 num_samples_in_batch:128 var_norm/all:702.25458
I0420 08:00:31.183248 140375142418176 trainer.py:371] Steps/second: 0.128800, Examples/second: 17.443245
I0420 08:00:37.036247 140375134025472 trainer.py:511] time: 8.979163
I0420 08:00:37.037311 140375134025472 trainer.py:522] step: 225 fraction_of_correct_next_step_preds:0.47491661 fraction_of_correct_next_step_preds/logits:0.47491661 grad_norm/all:0.80260926 grad_scale_all:1 log_pplx:1.7374296 log_pplx/logits:1.7374296 loss:1.7374296 loss/logits:1.7374296 num_samples_in_batch:128 var_norm/all:702.23724
I0420 08:00:41.191685 140375142418176 trainer.py:371] Steps/second: 0.128635, Examples/second: 17.416615
I0420 08:00:44.467958 140375134025472 trainer.py:511] time: 7.430286
I0420 08:00:44.468951 140375134025472 trainer.py:522] step: 226 fraction_of_correct_next_step_preds:0.46759757 fraction_of_correct_next_step_preds/logits:0.46759757 grad_norm/all:0.47861227 grad_scale_all:1 log_pplx:1.756492 log_pplx/logits:1.756492 loss:1.756492 loss/logits:1.756492 num_samples_in_batch:128 var_norm/all:702.21997
I0420 08:00:51.193407 140375142418176 trainer.py:371] Steps/second: 0.128472, Examples/second: 17.390355
I0420 08:00:52.937412 140375134025472 trainer.py:511] time: 8.468178
I0420 08:00:52.938375 140375134025472 trainer.py:522] step: 227 fraction_of_correct_next_step_preds:0.46996406 fraction_of_correct_next_step_preds/logits:0.46996406 grad_norm/all:0.58960819 grad_scale_all:1 log_pplx:1.7427672 log_pplx/logits:1.7427672 loss:1.7427672 loss/logits:1.7427672 num_samples_in_batch:128 var_norm/all:702.20258
I0420 08:00:57.075196 140375134025472 trainer.py:511] time: 4.136514
I0420 08:00:57.076277 140375134025472 trainer.py:522] step: 228 fraction_of_correct_next_step_preds:0.46155813 fraction_of_correct_next_step_preds/logits:0.46155813 grad_norm/all:0.61242658 grad_scale_all:1 log_pplx:1.7996696 log_pplx/logits:1.7996696 loss:1.7996696 loss/logits:1.7996696 num_samples_in_batch:256 var_norm/all:702.18518
I0420 08:01:01.202909 140375142418176 trainer.py:371] Steps/second: 0.128876, Examples/second: 17.509017
I0420 08:01:04.000119 140375134025472 trainer.py:511] time: 6.923594
I0420 08:01:04.001703 140375134025472 trainer.py:522] step: 229 fraction_of_correct_next_step_preds:0.46669361 fraction_of_correct_next_step_preds/logits:0.46669361 grad_norm/all:0.83380169 grad_scale_all:1 log_pplx:1.7546902 log_pplx/logits:1.7546902 loss:1.7546902 loss/logits:1.7546902 num_samples_in_batch:128 var_norm/all:702.16779
I0420 08:01:11.213485 140375142418176 trainer.py:371] Steps/second: 0.128713, Examples/second: 17.482445
I0420 08:01:11.214068 140375142418176 trainer.py:268] Save checkpoint
I0420 08:01:12.066224 140375134025472 trainer.py:511] time: 8.064283
I0420 08:01:12.067042 140375134025472 trainer.py:522] step: 230 fraction_of_correct_next_step_preds:0.47051281 fraction_of_correct_next_step_preds/logits:0.47051281 grad_norm/all:0.89386725 grad_scale_all:1 log_pplx:1.7668961 log_pplx/logits:1.7668961 loss:1.7668961 loss/logits:1.7668961 num_samples_in_batch:128 var_norm/all:702.15027
W0420 08:01:13.397505 140375142418176 meta_graph.py:447] Issue encountered when serializing __batch_norm_update_dict. Type is unsupported, or the types of the items don't match field type in CollectionDef. Note this is a warning and probably safe to ignore. 'dict' object has no attribute 'name'
W0420 08:01:13.397821 140375142418176 meta_graph.py:447] Issue encountered when serializing __model_split_id_stack. Type is unsupported, or the types of the items don't match field type in CollectionDef. Note this is a warning and probably safe to ignore. 'list' object has no attribute 'name'
I0420 08:01:13.592818 140375142418176 trainer.py:270] Save checkpoint done: /data/dingzhenyou/speech_data/librispeech/log/train/ckpt-00000229
I0420 08:01:17.900814 140375134025472 trainer.py:511] time: 5.833391
I0420 08:01:17.902220 140375134025472 trainer.py:522] step: 231 fraction_of_correct_next_step_preds:0.46253422 fraction_of_correct_next_step_preds/logits:0.46253422 grad_norm/all:0.90772861 grad_scale_all:1 log_pplx:1.7824725 log_pplx/logits:1.7824725 loss:1.7824725 loss/logits:1.7824725 num_samples_in_batch:128 var_norm/all:702.13281
I0420 08:01:21.222120 140375142418176 trainer.py:371] Steps/second: 0.129111, Examples/second: 17.527732
I0420 08:01:25.758121 140375134025472 trainer.py:511] time: 7.855735
I0420 08:01:25.759021 140375134025472 trainer.py:522] step: 232 fraction_of_correct_next_step_preds:0.46910375 fraction_of_correct_next_step_preds/logits:0.46910375 grad_norm/all:1.3872631 grad_scale_all:0.72084379 log_pplx:1.7563069 log_pplx/logits:1.7563069 loss:1.7563069 loss/logits:1.7563069 num_samples_in_batch:128 var_norm/all:702.11523
I0420 08:01:31.223759 140375142418176 trainer.py:371] Steps/second: 0.128949, Examples/second: 17.501440
I0420 08:01:34.945935 140375134025472 trainer.py:511] time: 9.186722
I0420 08:01:34.947176 140375134025472 trainer.py:522] step: 233 fraction_of_correct_next_step_preds:0.47437832 fraction_of_correct_next_step_preds/logits:0.47437832 grad_norm/all:0.63594443 grad_scale_all:1 log_pplx:1.741677 log_pplx/logits:1.741677 loss:1.741677 loss/logits:1.741677 num_samples_in_batch:128 var_norm/all:702.09827
I0420 08:01:41.231878 140375142418176 trainer.py:371] Steps/second: 0.128788, Examples/second: 17.475373
I0420 08:01:42.245994 140375134025472 trainer.py:511] time: 7.298589
I0420 08:01:42.247441 140375134025472 trainer.py:522] step: 234 fraction_of_correct_next_step_preds:0.46905929 fraction_of_correct_next_step_preds/logits:0.46905929 grad_norm/all:0.96724659 grad_scale_all:1 log_pplx:1.7541467 log_pplx/logits:1.7541467 loss:1.7541467 loss/logits:1.7541467 num_samples_in_batch:128 var_norm/all:702.08112
I0420 08:01:50.494924 140375134025472 trainer.py:511] time: 8.247263
I0420 08:01:50.496362 140375134025472 trainer.py:522] step: 235 fraction_of_correct_next_step_preds:0.47312659 fraction_of_correct_next_step_preds/logits:0.47312659 grad_norm/all:1.1440966 grad_scale_all:0.87405205 log_pplx:1.7468596 log_pplx/logits:1.7468596 loss:1.7468596 loss/logits:1.7468596 num_samples_in_batch:128 var_norm/all:702.0639
I0420 08:01:51.243119 140375142418176 trainer.py:371] Steps/second: 0.129179, Examples/second: 17.519926
I0420 08:01:56.405097 140375134025472 trainer.py:511] time: 5.908442
I0420 08:01:56.406378 140375134025472 trainer.py:522] step: 236 fraction_of_correct_next_step_preds:0.47224933 fraction_of_correct_next_step_preds/logits:0.47224933 grad_norm/all:0.83175695 grad_scale_all:1 log_pplx:1.7403471 log_pplx/logits:1.7403471 loss:1.7403471 loss/logits:1.7403471 num_samples_in_batch:128 var_norm/all:702.04681
I0420 08:02:01.251878 140375142418176 trainer.py:371] Steps/second: 0.129019, Examples/second: 17.494039
I0420 08:02:04.464371 140375134025472 trainer.py:511] time: 8.057689
I0420 08:02:04.465137 140375134025472 trainer.py:522] step: 237 fraction_of_correct_next_step_preds:0.470265 fraction_of_correct_next_step_preds/logits:0.470265 grad_norm/all:0.5707919 grad_scale_all:1 log_pplx:1.7329159 log_pplx/logits:1.7329159 loss:1.7329159 loss/logits:1.7329159 num_samples_in_batch:128 var_norm/all:702.02966
I0420 08:02:11.283281 140375142418176 trainer.py:371] Steps/second: 0.128859, Examples/second: 17.468229
I0420 08:02:11.318855 140375134025472 trainer.py:511] time: 6.853433
I0420 08:02:11.319916 140375134025472 trainer.py:522] step: 238 fraction_of_correct_next_step_preds:0.47459942 fraction_of_correct_next_step_preds/logits:0.47459942 grad_norm/all:1.0437591 grad_scale_all:0.95807546 log_pplx:1.7425517 log_pplx/logits:1.7425517 loss:1.7425517 loss/logits:1.7425517 num_samples_in_batch:128 var_norm/all:702.01239
I0420 08:02:18.968424 140375134025472 trainer.py:511] time: 7.648252
I0420 08:02:18.969336 140375134025472 trainer.py:522] step: 239 fraction_of_correct_next_step_preds:0.46933535 fraction_of_correct_next_step_preds/logits:0.46933535 grad_norm/all:0.87260091 grad_scale_all:1 log_pplx:1.7371274 log_pplx/logits:1.7371274 loss:1.7371274 loss/logits:1.7371274 num_samples_in_batch:128 var_norm/all:701.99518
I0420 08:02:21.271883 140375142418176 trainer.py:371] Steps/second: 0.129244, Examples/second: 17.512300
I0420 08:02:28.374356 140375134025472 trainer.py:511] time: 9.404775
I0420 08:02:28.375895 140375134025472 trainer.py:522] step: 240 fraction_of_correct_next_step_preds:0.46985883 fraction_of_correct_next_step_preds/logits:0.46985883 grad_norm/all:0.69113106 grad_scale_all:1 log_pplx:1.740096 log_pplx/logits:1.740096 loss:1.740096 loss/logits:1.740096 num_samples_in_batch:128 var_norm/all:701.97778
I0420 08:02:31.282310 140375142418176 trainer.py:371] Steps/second: 0.129086, Examples/second: 17.486856
I0420 08:02:37.180418 140375134025472 trainer.py:511] time: 8.804240
I0420 08:02:37.181627 140375134025472 trainer.py:522] step: 241 fraction_of_correct_next_step_preds:0.47086367 fraction_of_correct_next_step_preds/logits:0.47086367 grad_norm/all:0.52514201 grad_scale_all:1 log_pplx:1.7458298 log_pplx/logits:1.7458298 loss:1.7458298 loss/logits:1.7458298 num_samples_in_batch:128 var_norm/all:701.96039
I0420 08:02:41.291951 140375142418176 trainer.py:371] Steps/second: 0.128930, Examples/second: 17.461692
I0420 08:02:44.532416 140375134025472 trainer.py:511] time: 7.350590
I0420 08:02:44.533498 140375134025472 trainer.py:522] step: 242 fraction_of_correct_next_step_preds:0.47320038 fraction_of_correct_next_step_preds/logits:0.47320038 grad_norm/all:0.77490425 grad_scale_all:1 log_pplx:1.7374105 log_pplx/logits:1.7374105 loss:1.7374105 loss/logits:1.7374105 num_samples_in_batch:128 var_norm/all:701.94281
I0420 08:02:48.880996 140375134025472 trainer.py:511] time: 4.347199
I0420 08:02:48.882100 140375134025472 trainer.py:522] step: 243 fraction_of_correct_next_step_preds:0.46300641 fraction_of_correct_next_step_preds/logits:0.46300641 grad_norm/all:0.83082104 grad_scale_all:1 log_pplx:1.7768013 log_pplx/logits:1.7768013 loss:1.7768013 loss/logits:1.7768013 num_samples_in_batch:256 var_norm/all:701.92523
I0420 08:02:51.303666 140375142418176 trainer.py:371] Steps/second: 0.129307, Examples/second: 17.573003
I0420 08:02:57.107769 140375134025472 trainer.py:511] time: 8.225317
I0420 08:02:57.108860 140375134025472 trainer.py:522] step: 244 fraction_of_correct_next_step_preds:0.4717139 fraction_of_correct_next_step_preds/logits:0.4717139 grad_norm/all:0.93524802 grad_scale_all:1 log_pplx:1.7399476 log_pplx/logits:1.7399476 loss:1.7399476 loss/logits:1.7399476 num_samples_in_batch:128 var_norm/all:701.90759
2019-04-20 08:02:57.112897: I lingvo/core/ops/record_batcher.cc:344] 1892 total seconds passed. Total records yielded: 33820. Total records skipped: 28
2019-04-20 08:02:57.113046: I lingvo/core/ops/record_batcher.cc:349] Out-of-range sample: 2110
2019-04-20 08:02:57.113072: I lingvo/core/ops/record_batcher.cc:349] Out-of-range sample: 1771
2019-04-20 08:02:57.113136: I lingvo/core/ops/record_batcher.cc:349] Out-of-range sample: 2152
2019-04-20 08:02:57.113153: I lingvo/core/ops/record_batcher.cc:349] Out-of-range sample: 1711
2019-04-20 08:02:57.113169: I lingvo/core/ops/record_batcher.cc:349] Out-of-range sample: 2713
2019-04-20 08:02:57.113188: I lingvo/core/ops/record_batcher.cc:349] Out-of-range sample: 1741
2019-04-20 08:02:57.113206: I lingvo/core/ops/record_batcher.cc:349] Out-of-range sample: 1717
2019-04-20 08:02:57.113225: I lingvo/core/ops/record_batcher.cc:349] Out-of-range sample: 1711
2019-04-20 08:02:57.113246: I lingvo/core/ops/record_batcher.cc:349] Out-of-range sample: 1714
2019-04-20 08:02:57.113263: I lingvo/core/ops/record_batcher.cc:349] Out-of-range sample: 2368
I0420 08:03:01.313136 140375142418176 trainer.py:371] Steps/second: 0.129151, Examples/second: 17.547652
I0420 08:03:02.861757 140375134025472 trainer.py:511] time: 5.752685
I0420 08:03:02.862689 140375134025472 trainer.py:522] step: 245 fraction_of_correct_next_step_preds:0.46972477 fraction_of_correct_next_step_preds/logits:0.46972477 grad_norm/all:0.63129705 grad_scale_all:1 log_pplx:1.7548465 log_pplx/logits:1.7548465 loss:1.7548465 loss/logits:1.7548465 num_samples_in_batch:128 var_norm/all:701.88983
I0420 08:03:11.322582 140375142418176 trainer.py:371] Steps/second: 0.128997, Examples/second: 17.522566
I0420 08:03:11.842037 140375134025472 trainer.py:511] time: 8.979055
I0420 08:03:11.843192 140375134025472 trainer.py:522] step: 246 fraction_of_correct_next_step_preds:0.47632533 fraction_of_correct_next_step_preds/logits:0.47632533 grad_norm/all:0.88023895 grad_scale_all:1 log_pplx:1.7404553 log_pplx/logits:1.7404553 loss:1.7404553 loss/logits:1.7404553 num_samples_in_batch:128 var_norm/all:701.87207
I0420 08:03:19.596901 140375134025472 trainer.py:511] time: 7.753358
I0420 08:03:19.597985 140375134025472 trainer.py:522] step: 247 fraction_of_correct_next_step_preds:0.46976799 fraction_of_correct_next_step_preds/logits:0.46976799 grad_norm/all:1.2076995 grad_scale_all:0.82802051 log_pplx:1.7418838 log_pplx/logits:1.7418838 loss:1.7418838 loss/logits:1.7418838 num_samples_in_batch:128 var_norm/all:701.85431
I0420 08:03:21.325915 140375142418176 trainer.py:371] Steps/second: 0.129369, Examples/second: 17.564842
I0420 08:03:26.526834 140375134025472 trainer.py:511] time: 6.928554
I0420 08:03:26.528428 140375134025472 trainer.py:522] step: 248 fraction_of_correct_next_step_preds:0.4831104 fraction_of_correct_next_step_preds/logits:0.4831104 grad_norm/all:0.73449987 grad_scale_all:1 log_pplx:1.7084304 log_pplx/logits:1.7084304 loss:1.7084304 loss/logits:1.7084304 num_samples_in_batch:128 var_norm/all:701.83673
I0420 08:03:31.333636 140375142418176 trainer.py:371] Steps/second: 0.129215, Examples/second: 17.539945
I0420 08:03:35.295617 140375134025472 trainer.py:511] time: 8.766982
I0420 08:03:35.296690 140375134025472 trainer.py:522] step: 249 fraction_of_correct_next_step_preds:0.48221907 fraction_of_correct_next_step_preds/logits:0.48221907 grad_norm/all:0.97499126 grad_scale_all:1 log_pplx:1.7169322 log_pplx/logits:1.7169322 loss:1.7169322 loss/logits:1.7169322 num_samples_in_batch:128 var_norm/all:701.81909
I0420 08:03:41.343588 140375142418176 trainer.py:371] Steps/second: 0.129063, Examples/second: 17.515286
I0420 08:03:42.775986 140375134025472 trainer.py:511] time: 7.479129
I0420 08:03:42.776784 140375134025472 trainer.py:522] step: 250 fraction_of_correct_next_step_preds:0.46579447 fraction_of_correct_next_step_preds/logits:0.46579447 grad_norm/all:1.0307354 grad_scale_all:0.97018111 log_pplx:1.7483062 log_pplx/logits:1.7483062 loss:1.7483062 loss/logits:1.7483062 num_samples_in_batch:128 var_norm/all:701.80145
I0420 08:03:51.352524 140375142418176 trainer.py:371] Steps/second: 0.128913, Examples/second: 17.490890
I0420 08:03:51.670706 140375134025472 trainer.py:511] time: 8.893632
I0420 08:03:51.671466 140375134025472 trainer.py:522] step: 251 fraction_of_correct_next_step_preds:0.47927254 fraction_of_correct_next_step_preds/logits:0.47927254 grad_norm/all:1.1698438 grad_scale_all:0.85481501 log_pplx:1.724764 log_pplx/logits:1.724764 loss:1.724764 loss/logits:1.724764 num_samples_in_batch:128 var_norm/all:701.78369
I0420 08:03:59.819484 140375134025472 trainer.py:511] time: 8.147690
I0420 08:03:59.821422 140375134025472 trainer.py:522] step: 252 fraction_of_correct_next_step_preds:0.47445041 fraction_of_correct_next_step_preds/logits:0.47445041 grad_norm/all:0.8330512 grad_scale_all:1 log_pplx:1.7177838 log_pplx/logits:1.7177838 loss:1.7177838 loss/logits:1.7177838 num_samples_in_batch:128 var_norm/all:701.76624
I0420 08:04:01.362076 140375142418176 trainer.py:371] Steps/second: 0.129277, Examples/second: 17.532405
I0420 08:04:06.722750 140375134025472 trainer.py:511] time: 6.901026
I0420 08:04:06.724076 140375134025472 trainer.py:522] step: 253 fraction_of_correct_next_step_preds:0.48421052 fraction_of_correct_next_step_preds/logits:0.48421052 grad_norm/all:0.87252587 grad_scale_all:1 log_pplx:1.7105172 log_pplx/logits:1.7105172 loss:1.7105172 loss/logits:1.7105172 num_samples_in_batch:128 var_norm/all:701.74854
I0420 08:04:11.372586 140375142418176 trainer.py:371] Steps/second: 0.129127, Examples/second: 17.508157
I0420 08:04:12.560870 140375134025472 trainer.py:511] time: 5.836542
I0420 08:04:12.561919 140375134025472 trainer.py:522] step: 254 fraction_of_correct_next_step_preds:0.47887152 fraction_of_correct_next_step_preds/logits:0.47887152 grad_norm/all:1.3652116 grad_scale_all:0.7324872 log_pplx:1.7168174 log_pplx/logits:1.7168174 loss:1.7168174 loss/logits:1.7168174 num_samples_in_batch:128 var_norm/all:701.73077
I0420 08:04:20.203851 140375134025472 trainer.py:511] time: 7.641743
I0420 08:04:20.204858 140375134025472 trainer.py:522] step: 255 fraction_of_correct_next_step_preds:0.48240939 fraction_of_correct_next_step_preds/logits:0.48240939 grad_norm/all:0.72274792 grad_scale_all:1 log_pplx:1.7139823 log_pplx/logits:1.7139823 loss:1.7139823 loss/logits:1.7139823 num_samples_in_batch:128 var_norm/all:701.7135
I0420 08:04:21.382539 140375142418176 trainer.py:371] Steps/second: 0.129486, Examples/second: 17.549158
I0420 08:04:27.892129 140375134025472 trainer.py:511] time: 7.686936
I0420 08:04:27.893568 140375134025472 trainer.py:522] step: 256 fraction_of_correct_next_step_preds:0.46987805 fraction_of_correct_next_step_preds/logits:0.46987805 grad_norm/all:1.5648351 grad_scale_all:0.639045 log_pplx:1.7303852 log_pplx/logits:1.7303852 loss:1.7303852 loss/logits:1.7303852 num_samples_in_batch:128 var_norm/all:701.69611
I0420 08:04:31.392179 140375142418176 trainer.py:371] Steps/second: 0.129336, Examples/second: 17.525079
I0420 08:04:36.150166 140375134025472 trainer.py:511] time: 8.256426
I0420 08:04:36.150923 140375134025472 trainer.py:522] step: 257 fraction_of_correct_next_step_preds:0.48051399 fraction_of_correct_next_step_preds/logits:0.48051399 grad_norm/all:0.92478973 grad_scale_all:1 log_pplx:1.712966 log_pplx/logits:1.712966 loss:1.712966 loss/logits:1.712966 num_samples_in_batch:128 var_norm/all:701.67926
I0420 08:04:41.403631 140375142418176 trainer.py:371] Steps/second: 0.129188, Examples/second: 17.501228
I0420 08:04:45.177848 140375134025472 trainer.py:511] time: 9.026632
I0420 08:04:45.179218 140375134025472 trainer.py:522] step: 258 fraction_of_correct_next_step_preds:0.48063409 fraction_of_correct_next_step_preds/logits:0.48063409 grad_norm/all:1.1676286 grad_scale_all:0.85643667 log_pplx:1.7200587 log_pplx/logits:1.7200587 loss:1.7200587 loss/logits:1.7200587 num_samples_in_batch:128 var_norm/all:701.66211
I0420 08:04:51.412882 140375142418176 trainer.py:371] Steps/second: 0.129042, Examples/second: 17.477632
I0420 08:04:52.784845 140375134025472 trainer.py:511] time: 7.605420
I0420 08:04:52.785636 140375134025472 trainer.py:522] step: 259 fraction_of_correct_next_step_preds:0.47774762 fraction_of_correct_next_step_preds/logits:0.47774762 grad_norm/all:1.1290226 grad_scale_all:0.88572186 log_pplx:1.7213285 log_pplx/logits:1.7213285 loss:1.7213285 loss/logits:1.7213285 num_samples_in_batch:128 var_norm/all:701.64526
I0420 08:04:59.750433 140375134025472 trainer.py:511] time: 6.964426
I0420 08:04:59.751822 140375134025472 trainer.py:522] step: 260 fraction_of_correct_next_step_preds:0.4800995 fraction_of_correct_next_step_preds/logits:0.4800995 grad_norm/all:0.97840136 grad_scale_all:1 log_pplx:1.7151138 log_pplx/logits:1.7151138 loss:1.7151138 loss/logits:1.7151138 num_samples_in_batch:128 var_norm/all:701.62842
I0420 08:05:01.424606 140375142418176 trainer.py:371] Steps/second: 0.129394, Examples/second: 17.581654
I0420 08:05:04.086353 140375134025472 trainer.py:511] time: 4.334335
I0420 08:05:04.087388 140375134025472 trainer.py:522] step: 261 fraction_of_correct_next_step_preds:0.46974903 fraction_of_correct_next_step_preds/logits:0.46974903 grad_norm/all:1.0458601 grad_scale_all:0.95615089 log_pplx:1.75725 log_pplx/logits:1.75725 loss:1.75725 loss/logits:1.75725 num_samples_in_batch:256 var_norm/all:701.61133
I0420 08:05:11.432993 140375142418176 trainer.py:371] Steps/second: 0.129248, Examples/second: 17.557903
I0420 08:05:12.238946 140375134025472 trainer.py:511] time: 8.151325
I0420 08:05:12.239756 140375134025472 trainer.py:522] step: 262 fraction_of_correct_next_step_preds:0.48408258 fraction_of_correct_next_step_preds/logits:0.48408258 grad_norm/all:0.81047904 grad_scale_all:1 log_pplx:1.7101616 log_pplx/logits:1.7101616 loss:1.7101616 loss/logits:1.7101616 num_samples_in_batch:128 var_norm/all:701.59418
I0420 08:05:17.831022 140375134025472 trainer.py:511] time: 5.590824
I0420 08:05:17.831772 140375134025472 trainer.py:522] step: 263 fraction_of_correct_next_step_preds:0.48156208 fraction_of_correct_next_step_preds/logits:0.48156208 grad_norm/all:1.1611915 grad_scale_all:0.86118442 log_pplx:1.7126567 log_pplx/logits:1.7126567 loss:1.7126567 loss/logits:1.7126567 num_samples_in_batch:128 var_norm/all:701.5769
I0420 08:05:21.436331 140375142418176 trainer.py:371] Steps/second: 0.129596, Examples/second: 17.597503
I0420 08:05:26.736340 140375134025472 trainer.py:511] time: 8.904243
I0420 08:05:26.737313 140375134025472 trainer.py:522] step: 264 fraction_of_correct_next_step_preds:0.4776144 fraction_of_correct_next_step_preds/logits:0.4776144 grad_norm/all:0.87775058 grad_scale_all:1 log_pplx:1.7094767 log_pplx/logits:1.7094767 loss:1.7094767 loss/logits:1.7094767 num_samples_in_batch:128 var_norm/all:701.55969
I0420 08:05:31.443900 140375142418176 trainer.py:371] Steps/second: 0.129451, Examples/second: 17.573912
I0420 08:05:34.129517 140375134025472 trainer.py:511] time: 7.391932
I0420 08:05:34.130951 140375134025472 trainer.py:522] step: 265 fraction_of_correct_next_step_preds:0.48271435 fraction_of_correct_next_step_preds/logits:0.48271435 grad_norm/all:1.0273635 grad_scale_all:0.97336531 log_pplx:1.7104719 log_pplx/logits:1.7104719 loss:1.7104719 loss/logits:1.7104719 num_samples_in_batch:128 var_norm/all:701.54242
I0420 08:05:41.453370 140375142418176 trainer.py:371] Steps/second: 0.129306, Examples/second: 17.550536
I0420 08:05:42.639960 140375134025472 trainer.py:511] time: 8.508722
I0420 08:05:42.641098 140375134025472 trainer.py:522] step: 266 fraction_of_correct_next_step_preds:0.48919922 fraction_of_correct_next_step_preds/logits:0.48919922 grad_norm/all:0.61877698 grad_scale_all:1 log_pplx:1.6739788 log_pplx/logits:1.6739788 loss:1.6739788 loss/logits:1.6739788 num_samples_in_batch:128 var_norm/all:701.52502
I0420 08:05:50.417263 140375134025472 trainer.py:511] time: 7.775875
I0420 08:05:50.418644 140375134025472 trainer.py:522] step: 267 fraction_of_correct_next_step_preds:0.47832456 fraction_of_correct_next_step_preds/logits:0.47832456 grad_norm/all:0.93601477 grad_scale_all:1 log_pplx:1.7074571 log_pplx/logits:1.7074571 loss:1.7074571 loss/logits:1.7074571 num_samples_in_batch:128 var_norm/all:701.50745
I0420 08:05:51.466936 140375142418176 trainer.py:371] Steps/second: 0.129649, Examples/second: 17.589507
I0420 08:05:56.287250 140375134025472 trainer.py:511] time: 5.868383
I0420 08:05:56.288091 140375134025472 trainer.py:522] step: 268 fraction_of_correct_next_step_preds:0.47999004 fraction_of_correct_next_step_preds/logits:0.47999004 grad_norm/all:0.85305828 grad_scale_all:1 log_pplx:1.7220774 log_pplx/logits:1.7220774 loss:1.7220774 loss/logits:1.7220774 num_samples_in_batch:128 var_norm/all:701.48969
I0420 08:06:01.473455 140375142418176 trainer.py:371] Steps/second: 0.129505, Examples/second: 17.566308
I0420 08:06:03.218576 140375134025472 trainer.py:511] time: 6.930291
I0420 08:06:03.219804 140375134025472 trainer.py:522] step: 269 fraction_of_correct_next_step_preds:0.48043379 fraction_of_correct_next_step_preds/logits:0.48043379 grad_norm/all:0.712749 grad_scale_all:1 log_pplx:1.6929581 log_pplx/logits:1.6929581 loss:1.6929581 loss/logits:1.6929581 num_samples_in_batch:128 var_norm/all:701.4718
I0420 08:06:11.352437 140375134025472 trainer.py:511] time: 8.132378
I0420 08:06:11.353750 140375134025472 trainer.py:522] step: 270 fraction_of_correct_next_step_preds:0.48388165 fraction_of_correct_next_step_preds/logits:0.48388165 grad_norm/all:0.61178142 grad_scale_all:1 log_pplx:1.6985079 log_pplx/logits:1.6985079 loss:1.6985079 loss/logits:1.6985079 num_samples_in_batch:128 var_norm/all:701.45386
I0420 08:06:11.496427 140375142418176 trainer.py:371] Steps/second: 0.129843, Examples/second: 17.543193
I0420 08:06:20.475584 140375134025472 trainer.py:511] time: 9.121587
I0420 08:06:20.476536 140375134025472 trainer.py:522] step: 271 fraction_of_correct_next_step_preds:0.4857007 fraction_of_correct_next_step_preds/logits:0.4857007 grad_norm/all:0.76059753 grad_scale_all:1 log_pplx:1.703305 log_pplx/logits:1.703305 loss:1.703305 loss/logits:1.703305 num_samples_in_batch:128 var_norm/all:701.43573
I0420 08:06:21.493537 140375142418176 trainer.py:371] Steps/second: 0.129700, Examples/second: 17.581776
I0420 08:06:28.765491 140375134025472 trainer.py:511] time: 8.288722
I0420 08:06:28.767008 140375134025472 trainer.py:522] step: 272 fraction_of_correct_next_step_preds:0.48852551 fraction_of_correct_next_step_preds/logits:0.48852551 grad_norm/all:0.78125584 grad_scale_all:1 log_pplx:1.6890454 log_pplx/logits:1.6890454 loss:1.6890454 loss/logits:1.6890454 num_samples_in_batch:128 var_norm/all:701.41754
I0420 08:06:31.502388 140375142418176 trainer.py:371] Steps/second: 0.129558, Examples/second: 17.558926
I0420 08:06:36.136847 140375134025472 trainer.py:511] time: 7.369630
I0420 08:06:36.137947 140375134025472 trainer.py:522] step: 273 fraction_of_correct_next_step_preds:0.47953263 fraction_of_correct_next_step_preds/logits:0.47953263 grad_norm/all:0.89224702 grad_scale_all:1 log_pplx:1.6981378 log_pplx/logits:1.6981378 loss:1.6981378 loss/logits:1.6981378 num_samples_in_batch:128 var_norm/all:701.39923
I0420 08:06:41.513550 140375142418176 trainer.py:371] Steps/second: 0.129417, Examples/second: 17.536274
I0420 08:06:41.777515 140375134025472 trainer.py:511] time: 5.639333
I0420 08:06:41.778304 140375134025472 trainer.py:522] step: 274 fraction_of_correct_next_step_preds:0.48499754 fraction_of_correct_next_step_preds/logits:0.48499754 grad_norm/all:0.78847671 grad_scale_all:1 log_pplx:1.7076409 log_pplx/logits:1.7076409 loss:1.7076409 loss/logits:1.7076409 num_samples_in_batch:128 var_norm/all:701.3808
I0420 08:06:49.734405 140375134025472 trainer.py:511] time: 7.955748
I0420 08:06:49.735455 140375134025472 trainer.py:522] step: 275 fraction_of_correct_next_step_preds:0.491671 fraction_of_correct_next_step_preds/logits:0.491671 grad_norm/all:0.54001719 grad_scale_all:1 log_pplx:1.6724101
log_pplx/logits:1.6724101 loss:1.6724101 loss/logits:1.6724101 num_samples_in_batch:128 var_norm/all:701.3623 I0420 08:06:51.523920 140375142418176 trainer.py:371] Steps/second: 0.129750, Examples/second: 17.574235 I0420 08:06:56.667988 140375134025472 trainer.py:511] time: 6.932291 I0420 08:06:56.669092 140375134025472 trainer.py:522] step: 276 fraction_of_correct_next_step_preds:0.48732969 fraction_of_correct_next_step_preds/logits:0.48732969 grad_norm/all:0.75405437 grad_scale_all:1 log_pplx:1.6760614 log_pplx/logits:1.6760614 loss:1.6760614 loss/logits:1.6760614 num_samples_in_batch:128 var_norm/all:701.34375 I0420 08:07:01.532896 140375142418176 trainer.py:371] Steps/second: 0.129609, Examples/second: 17.551739 I0420 08:07:04.908233 140375134025472 trainer.py:511] time: 8.238858 I0420 08:07:04.909634 140375134025472 trainer.py:522] step: 277 fraction_of_correct_next_step_preds:0.49006793 fraction_of_correct_next_step_preds/logits:0.49006793 grad_norm/all:0.5504958 grad_scale_all:1 log_pplx:1.6877584 log_pplx/logits:1.6877584 loss:1.6877584 loss/logits:1.6877584 num_samples_in_batch:128 var_norm/all:701.32507 I0420 08:07:09.160671 140375134025472 trainer.py:511] time: 4.250840 I0420 08:07:09.161709 140375134025472 trainer.py:522] step: 278 fraction_of_correct_next_step_preds:0.47712374 fraction_of_correct_next_step_preds/logits:0.47712374 grad_norm/all:0.75490832 grad_scale_all:1 log_pplx:1.7187968 log_pplx/logits:1.7187968 loss:1.7187968 loss/logits:1.7187968 num_samples_in_batch:256 var_norm/all:701.30634 I0420 08:07:11.542623 140375142418176 trainer.py:371] Steps/second: 0.129938, Examples/second: 17.649105 I0420 08:07:17.528034 140375134025472 trainer.py:511] time: 8.366018 I0420 08:07:17.529331 140375134025472 trainer.py:522] step: 279 fraction_of_correct_next_step_preds:0.48743379 fraction_of_correct_next_step_preds/logits:0.48743379 grad_norm/all:0.92561078 grad_scale_all:1 log_pplx:1.6828735 log_pplx/logits:1.6828735 loss:1.6828735 loss/logits:1.6828735 
num_samples_in_batch:128 var_norm/all:701.28754 I0420 08:07:21.552903 140375142418176 trainer.py:371] Steps/second: 0.129798, Examples/second: 17.626461 I0420 08:07:26.421845 140375134025472 trainer.py:511] time: 8.892352 I0420 08:07:26.422921 140375134025472 trainer.py:522] step: 280 fraction_of_correct_next_step_preds:0.49203226 fraction_of_correct_next_step_preds/logits:0.49203226 grad_norm/all:0.74147272 grad_scale_all:1 log_pplx:1.6697469 log_pplx/logits:1.6697469 loss:1.6697469 loss/logits:1.6697469 num_samples_in_batch:128 var_norm/all:701.26868 I0420 08:07:31.554145 140375142418176 trainer.py:371] Steps/second: 0.129660, Examples/second: 17.604102 I0420 08:07:33.752093 140375134025472 trainer.py:511] time: 7.328991 I0420 08:07:33.753096 140375134025472 trainer.py:522] step: 281 fraction_of_correct_next_step_preds:0.49754408 fraction_of_correct_next_step_preds/logits:0.49754408 grad_norm/all:0.9283191 grad_scale_all:1 log_pplx:1.6691937 log_pplx/logits:1.6691937 loss:1.6691937 loss/logits:1.6691937 num_samples_in_batch:128 var_norm/all:701.24982 I0420 08:07:39.357894 140375134025472 trainer.py:511] time: 5.604640 I0420 08:07:39.358714 140375134025472 trainer.py:522] step: 282 fraction_of_correct_next_step_preds:0.49084425 fraction_of_correct_next_step_preds/logits:0.49084425 grad_norm/all:0.66445577 grad_scale_all:1 log_pplx:1.6889721 log_pplx/logits:1.6889721 loss:1.6889721 loss/logits:1.6889721 num_samples_in_batch:128 var_norm/all:701.23096 I0420 08:07:41.564384 140375142418176 trainer.py:371] Steps/second: 0.129983, Examples/second: 17.640874 I0420 08:07:46.204224 140375134025472 trainer.py:511] time: 6.845251 I0420 08:07:46.205530 140375134025472 trainer.py:522] step: 283 fraction_of_correct_next_step_preds:0.49110562 fraction_of_correct_next_step_preds/logits:0.49110562 grad_norm/all:0.97234619 grad_scale_all:1 log_pplx:1.6718854 log_pplx/logits:1.6718854 loss:1.6718854 loss/logits:1.6718854 num_samples_in_batch:128 var_norm/all:701.21191 I0420 
08:07:51.573463 140375142418176 trainer.py:371] Steps/second: 0.129845, Examples/second: 17.618590 I0420 08:07:54.169616 140375134025472 trainer.py:511] time: 7.963868 I0420 08:07:54.170538 140375134025472 trainer.py:522] step: 284 fraction_of_correct_next_step_preds:0.4850997 fraction_of_correct_next_step_preds/logits:0.4850997 grad_norm/all:0.79734397 grad_scale_all:1 log_pplx:1.6948088 log_pplx/logits:1.6948088 loss:1.6948088 loss/logits:1.6948088 num_samples_in_batch:128 var_norm/all:701.19293 I0420 08:08:01.584471 140375142418176 trainer.py:371] Steps/second: 0.129708, Examples/second: 17.596494 I0420 08:08:02.195472 140375134025472 trainer.py:511] time: 8.024653 I0420 08:08:02.196593 140375134025472 trainer.py:522] step: 285 fraction_of_correct_next_step_preds:0.49519411 fraction_of_correct_next_step_preds/logits:0.49519411 grad_norm/all:0.60972106 grad_scale_all:1 log_pplx:1.6642418 log_pplx/logits:1.6642418 loss:1.6642418 loss/logits:1.6642418 num_samples_in_batch:128 var_norm/all:701.17395 I0420 08:08:10.374047 140375134025472 trainer.py:511] time: 8.177250 I0420 08:08:10.375052 140375134025472 trainer.py:522] step: 286 fraction_of_correct_next_step_preds:0.49601725 fraction_of_correct_next_step_preds/logits:0.49601725 grad_norm/all:0.71168172 grad_scale_all:1 log_pplx:1.6702002 log_pplx/logits:1.6702002 loss:1.6702002 loss/logits:1.6702002 num_samples_in_batch:128 var_norm/all:701.15485 I0420 08:08:11.585005 140375142418176 trainer.py:371] Steps/second: 0.130028, Examples/second: 17.632876 I0420 08:08:19.508388 140375134025472 trainer.py:511] time: 9.133091 I0420 08:08:19.509922 140375134025472 trainer.py:522] step: 287 fraction_of_correct_next_step_preds:0.49295774 fraction_of_correct_next_step_preds/logits:0.49295774 grad_norm/all:0.56234354 grad_scale_all:1 log_pplx:1.6584225 log_pplx/logits:1.6584225 loss:1.6584225 loss/logits:1.6584225 num_samples_in_batch:128 var_norm/all:701.13568 I0420 08:08:21.597275 140375142418176 trainer.py:371] Steps/second: 
0.129891, Examples/second: 17.610906 I0420 08:08:26.930875 140375134025472 trainer.py:511] time: 7.420712 I0420 08:08:26.931936 140375134025472 trainer.py:522] step: 288 fraction_of_correct_next_step_preds:0.4916712 fraction_of_correct_next_step_preds/logits:0.4916712 grad_norm/all:0.66766959 grad_scale_all:1 log_pplx:1.6747226 log_pplx/logits:1.6747226 loss:1.6747226 loss/logits:1.6747226 num_samples_in_batch:128 var_norm/all:701.11646 I0420 08:08:31.607119 140375142418176 trainer.py:371] Steps/second: 0.129756, Examples/second: 17.589153 I0420 08:08:34.528407 140375134025472 trainer.py:511] time: 7.596274 I0420 08:08:34.529500 140375134025472 trainer.py:522] step: 289 fraction_of_correct_next_step_preds:0.49554795 fraction_of_correct_next_step_preds/logits:0.49554795 grad_norm/all:0.59034371 grad_scale_all:1 log_pplx:1.6630248 log_pplx/logits:1.6630248 loss:1.6630248 loss/logits:1.6630248 num_samples_in_batch:128 var_norm/all:701.09717 I0420 08:08:40.189884 140375134025472 trainer.py:511] time: 5.660071 I0420 08:08:40.190975 140375134025472 trainer.py:522] step: 290 fraction_of_correct_next_step_preds:0.49784851 fraction_of_correct_next_step_preds/logits:0.49784851 grad_norm/all:0.85040981 grad_scale_all:1 log_pplx:1.6592635 log_pplx/logits:1.6592635 loss:1.6592635 loss/logits:1.6592635 num_samples_in_batch:128 var_norm/all:701.07788 I0420 08:08:41.615701 140375142418176 trainer.py:371] Steps/second: 0.130071, Examples/second: 17.625015 I0420 08:08:47.114645 140375134025472 trainer.py:511] time: 6.899251 I0420 08:08:47.115684 140375134025472 trainer.py:522] step: 291 fraction_of_correct_next_step_preds:0.49960199 fraction_of_correct_next_step_preds/logits:0.49960199 grad_norm/all:0.69848967 grad_scale_all:1 log_pplx:1.6517125 log_pplx/logits:1.6517125 loss:1.6517125 loss/logits:1.6517125 num_samples_in_batch:128 var_norm/all:701.05847 I0420 08:08:51.625663 140375142418176 trainer.py:371] Steps/second: 0.129936, Examples/second: 17.603392 I0420 08:08:55.315984 
140375134025472 trainer.py:511] time: 8.199814 I0420 08:08:55.317042 140375134025472 trainer.py:522] step: 292 fraction_of_correct_next_step_preds:0.49012542 fraction_of_correct_next_step_preds/logits:0.49012542 grad_norm/all:0.76374143 grad_scale_all:1 log_pplx:1.6770569 log_pplx/logits:1.6770569 loss:1.6770569 loss/logits:1.6770569 num_samples_in_batch:128 var_norm/all:701.03912 I0420 08:08:59.675689 140375134025472 trainer.py:511] time: 4.358358 I0420 08:08:59.676516 140375134025472 trainer.py:522] step: 293 fraction_of_correct_next_step_preds:0.48613396 fraction_of_correct_next_step_preds/logits:0.48613396 grad_norm/all:0.58962899 grad_scale_all:1 log_pplx:1.7016851 log_pplx/logits:1.7016851 loss:1.7016851 loss/logits:1.7016851 num_samples_in_batch:256 var_norm/all:701.01959 I0420 08:09:01.636991 140375142418176 trainer.py:371] Steps/second: 0.130247, Examples/second: 17.695752 I0420 08:09:09.086409 140375134025472 trainer.py:511] time: 9.409725 I0420 08:09:09.087534 140375134025472 trainer.py:522] step: 294 fraction_of_correct_next_step_preds:0.49767861 fraction_of_correct_next_step_preds/logits:0.49767861 grad_norm/all:0.91316557 grad_scale_all:1 log_pplx:1.6625874 log_pplx/logits:1.6625874 loss:1.6625874 loss/logits:1.6625874 num_samples_in_batch:128 var_norm/all:701.00018 I0420 08:09:11.645663 140375142418176 trainer.py:371] Steps/second: 0.130112, Examples/second: 17.674016 I0420 08:09:17.783576 140375134025472 trainer.py:511] time: 8.695789 I0420 08:09:17.784698 140375134025472 trainer.py:522] step: 295 fraction_of_correct_next_step_preds:0.48983049 fraction_of_correct_next_step_preds/logits:0.48983049 grad_norm/all:0.95493954 grad_scale_all:1 log_pplx:1.6716349 log_pplx/logits:1.6716349 loss:1.6716349 loss/logits:1.6716349 num_samples_in_batch:128 var_norm/all:700.98065 I0420 08:09:21.646167 140375142418176 trainer.py:371] Steps/second: 0.129979, Examples/second: 17.652537 I0420 08:09:25.372036 140375134025472 trainer.py:511] time: 7.586990 I0420 
08:09:25.372812 140375134025472 trainer.py:522] step: 296 fraction_of_correct_next_step_preds:0.5000816 fraction_of_correct_next_step_preds/logits:0.5000816 grad_norm/all:0.64577919 grad_scale_all:1 log_pplx:1.6472374 log_pplx/logits:1.6472374 loss:1.6472374 loss/logits:1.6472374 num_samples_in_batch:128 var_norm/all:700.96106 I0420 08:09:31.650913 140375142418176 trainer.py:371] Steps/second: 0.129848, Examples/second: 17.631214 I0420 08:09:32.993964 140375134025472 trainer.py:511] time: 7.620850 I0420 08:09:32.995151 140375134025472 trainer.py:522] step: 297 fraction_of_correct_next_step_preds:0.4945648 fraction_of_correct_next_step_preds/logits:0.4945648 grad_norm/all:0.72654635 grad_scale_all:1 log_pplx:1.6568946 log_pplx/logits:1.6568946 loss:1.6568946 loss/logits:1.6568946 num_samples_in_batch:128 var_norm/all:700.94147 I0420 08:09:39.829082 140375134025472 trainer.py:511] time: 6.833739 I0420 08:09:39.830120 140375134025472 trainer.py:522] step: 298 fraction_of_correct_next_step_preds:0.49532753 fraction_of_correct_next_step_preds/logits:0.49532753 grad_norm/all:0.84149033 grad_scale_all:1 log_pplx:1.6645547 log_pplx/logits:1.6645547 loss:1.6645547 loss/logits:1.6645547 num_samples_in_batch:128 var_norm/all:700.92188 I0420 08:09:41.661695 140375142418176 trainer.py:371] Steps/second: 0.130153, Examples/second: 17.665935 I0420 08:09:45.428024 140375134025472 trainer.py:511] time: 5.597765 I0420 08:09:45.428751 140375134025472 trainer.py:522] step: 299 fraction_of_correct_next_step_preds:0.49844065 fraction_of_correct_next_step_preds/logits:0.49844065 grad_norm/all:0.72662562 grad_scale_all:1 log_pplx:1.6585479 log_pplx/logits:1.6585479 loss:1.6585479 loss/logits:1.6585479 num_samples_in_batch:128 var_norm/all:700.90222 I0420 08:09:51.669598 140375142418176 trainer.py:371] Steps/second: 0.130022, Examples/second: 17.644714 I0420 08:09:54.806416 140375134025472 trainer.py:511] time: 9.276496 I0420 08:09:54.807459 140375134025472 trainer.py:522] step: 300 
fraction_of_correct_next_step_preds:0.49890438 fraction_of_correct_next_step_preds/logits:0.49890438 grad_norm/all:0.47588634 grad_scale_all:1 log_pplx:1.6455002 log_pplx/logits:1.6455002 loss:1.6455002 loss/logits:1.6455002 num_samples_in_batch:128 var_norm/all:700.88251 I0420 08:10:01.680495 140375142418176 trainer.py:371] Steps/second: 0.129891, Examples/second: 17.623655 I0420 08:10:02.882178 140375134025472 trainer.py:511] time: 8.074491 I0420 08:10:02.883347 140375134025472 base_runner.py:115] step: 301 fraction_of_correct_next_step_preds:0.49539271 fraction_of_correct_next_step_preds/logits:0.49539271 grad_norm/all:0.55232203 grad_scale_all:1 log_pplx:1.6580131 log_pplx/logits:1.6580131 loss:1.6580131 loss/logits:1.6580131 num_samples_in_batch:128 var_norm/all:700.86273 I0420 08:10:11.410650 140375134025472 trainer.py:511] time: 8.527046 I0420 08:10:11.412190 140375134025472 trainer.py:522] step: 302 fraction_of_correct_next_step_preds:0.50001818 fraction_of_correct_next_step_preds/logits:0.50001818 grad_norm/all:0.59572774 grad_scale_all:1 log_pplx:1.6452334 log_pplx/logits:1.6452334 loss:1.6452334 loss/logits:1.6452334 num_samples_in_batch:128 var_norm/all:700.84296 I0420 08:10:11.702862 140375142418176 trainer.py:371] Steps/second: 0.130192, Examples/second: 17.657870 I0420 08:10:11.703310 140375142418176 trainer.py:275] Write summary @302 2019-04-20 08:10:11.704807: I lingvo/core/ops/record_batcher.cc:344] 2302 total seconds passed. Total records yielded: 319. 
Total records skipped: 0 I0420 08:10:21.761969 140375134025472 trainer.py:511] time: 10.349495 I0420 08:10:21.766017 140375134025472 trainer.py:522] step: 303 fraction_of_correct_next_step_preds:0.49674481 fraction_of_correct_next_step_preds/logits:0.49674481 grad_norm/all:0.54272324 grad_scale_all:1 log_pplx:1.6499885 log_pplx/logits:1.6499885 loss:1.6499885 loss/logits:1.6499885 num_samples_in_batch:128 var_norm/all:700.82318 I0420 08:10:33.999098 140375134025472 trainer.py:511] time: 12.231365 I0420 08:10:34.002353 140375134025472 trainer.py:522] step: 304 fraction_of_correct_next_step_preds:0.50056696 fraction_of_correct_next_step_preds/logits:0.50056696 grad_norm/all:0.56036305 grad_scale_all:1 log_pplx:1.6472863 log_pplx/logits:1.6472863 loss:1.6472863 loss/logits:1.6472863 num_samples_in_batch:128 var_norm/all:700.80322 I0420 08:10:44.876357 140375134025472 trainer.py:511] time: 10.873799 I0420 08:10:44.878129 140375134025472 trainer.py:522] step: 305 fraction_of_correct_next_step_preds:0.49619058 fraction_of_correct_next_step_preds/logits:0.49619058 grad_norm/all:0.45682853 grad_scale_all:1 log_pplx:1.6571825 log_pplx/logits:1.6571825 loss:1.6571825 loss/logits:1.6571825 num_samples_in_batch:128 var_norm/all:700.78339 I0420 08:10:54.934130 140375134025472 trainer.py:511] time: 10.055444 I0420 08:10:54.951957 140375134025472 trainer.py:522] step: 306 fraction_of_correct_next_step_preds:0.49130249 fraction_of_correct_next_step_preds/logits:0.49130249 grad_norm/all:0.55479544 grad_scale_all:1 log_pplx:1.6631523 log_pplx/logits:1.6631523 loss:1.6631523 loss/logits:1.6631523 num_samples_in_batch:128 var_norm/all:700.76337 I0420 08:11:08.898406 140375134025472 trainer.py:511] time: 13.945316 I0420 08:11:08.900434 140375134025472 trainer.py:522] step: 307 fraction_of_correct_next_step_preds:0.50046659 fraction_of_correct_next_step_preds/logits:0.50046659 grad_norm/all:0.5233255 grad_scale_all:1 log_pplx:1.6387581 log_pplx/logits:1.6387581 loss:1.6387581 
loss/logits:1.6387581 num_samples_in_batch:128 var_norm/all:700.74335 I0420 08:11:23.458416 140375134025472 trainer.py:511] time: 14.557503 I0420 08:11:23.460469 140375134025472 trainer.py:522] step: 308 fraction_of_correct_next_step_preds:0.50056994 fraction_of_correct_next_step_preds/logits:0.50056994 grad_norm/all:0.53521502 grad_scale_all:1 log_pplx:1.6413771 log_pplx/logits:1.6413771 loss:1.6413771 loss/logits:1.6413771 num_samples_in_batch:128 var_norm/all:700.72327 I0420 08:11:25.631345 140375142418176 trainer.py:284] Write summary done: step 302 I0420 08:11:25.643549 140375142418176 base_runner.py:115] step: 302, steps/sec: 0.13, examples/sec: 17.66 I0420 08:11:25.647203 140375142418176 trainer.py:371] Steps/second: 0.128677, Examples/second: 17.486705 I0420 08:11:25.647681 140375142418176 trainer.py:268] Save checkpoint W0420 08:11:27.696300 140375142418176 meta_graph.py:447] Issue encountered when serializing __batch_norm_update_dict. Type is unsupported, or the types of the items don't match field type in CollectionDef. Note this is a warning and probably safe to ignore. 'dict' object has no attribute 'name' W0420 08:11:27.696728 140375142418176 meta_graph.py:447] Issue encountered when serializing __model_split_id_stack. Type is unsupported, or the types of the items don't match field type in CollectionDef. Note this is a warning and probably safe to ignore. 
'list' object has no attribute 'name' I0420 08:11:27.889962 140375142418176 trainer.py:270] Save checkpoint done: /data/dingzhenyou/speech_data/librispeech/log/train/ckpt-00000308 I0420 08:11:28.140069 140375134025472 trainer.py:511] time: 4.679287 I0420 08:11:28.141210 140375134025472 trainer.py:522] step: 309 fraction_of_correct_next_step_preds:0.4929764 fraction_of_correct_next_step_preds/logits:0.4929764 grad_norm/all:0.60663974 grad_scale_all:1 log_pplx:1.6712581 log_pplx/logits:1.6712581 loss:1.6712581 loss/logits:1.6712581 num_samples_in_batch:256 var_norm/all:700.70325 I0420 08:11:35.650377 140375142418176 trainer.py:371] Steps/second: 0.128558, Examples/second: 17.467183 I0420 08:11:36.566423 140375134025472 trainer.py:511] time: 8.425011 I0420 08:11:36.567435 140375134025472 trainer.py:522] step: 310 fraction_of_correct_next_step_preds:0.49667397 fraction_of_correct_next_step_preds/logits:0.49667397 grad_norm/all:0.6936717 grad_scale_all:1 log_pplx:1.6473578 log_pplx/logits:1.6473578 loss:1.6473578 loss/logits:1.6473578 num_samples_in_batch:128 var_norm/all:700.68311 I0420 08:11:44.382427 140375134025472 trainer.py:511] time: 7.814773 I0420 08:11:44.383707 140375134025472 trainer.py:522] step: 311 fraction_of_correct_next_step_preds:0.51070797 fraction_of_correct_next_step_preds/logits:0.51070797 grad_norm/all:0.60806739 grad_scale_all:1 log_pplx:1.6229703 log_pplx/logits:1.6229703 loss:1.6229703 loss/logits:1.6229703 num_samples_in_batch:128 var_norm/all:700.66296 I0420 08:11:45.648066 140375142418176 trainer.py:371] Steps/second: 0.128854, Examples/second: 17.500896 I0420 08:11:52.074405 140375134025472 trainer.py:511] time: 7.690345 I0420 08:11:52.075900 140375134025472 trainer.py:522] step: 312 fraction_of_correct_next_step_preds:0.50249594 fraction_of_correct_next_step_preds/logits:0.50249594 grad_norm/all:0.64068145 grad_scale_all:1 log_pplx:1.6338384 log_pplx/logits:1.6338384 loss:1.6338384 loss/logits:1.6338384 num_samples_in_batch:128 
var_norm/all:700.64282 I0420 08:11:55.658617 140375142418176 trainer.py:371] Steps/second: 0.128734, Examples/second: 17.481423 I0420 08:11:59.064162 140375134025472 trainer.py:511] time: 6.987994 I0420 08:11:59.065263 140375134025472 trainer.py:522] step: 313 fraction_of_correct_next_step_preds:0.50249946 fraction_of_correct_next_step_preds/logits:0.50249946 grad_norm/all:0.59031916 grad_scale_all:1 log_pplx:1.6381044 log_pplx/logits:1.6381044 loss:1.6381044 loss/logits:1.6381044 num_samples_in_batch:128 var_norm/all:700.62274 I0420 08:12:04.761533 140375134025472 trainer.py:511] time: 5.696028 I0420 08:12:04.762573 140375134025472 trainer.py:522] step: 314 fraction_of_correct_next_step_preds:0.49937415 fraction_of_correct_next_step_preds/logits:0.49937415 grad_norm/all:0.87156236 grad_scale_all:1 log_pplx:1.6565 log_pplx/logits:1.6565 loss:1.6565 loss/logits:1.6565 num_samples_in_batch:128 var_norm/all:700.60254 I0420 08:12:05.668169 140375142418176 trainer.py:371] Steps/second: 0.129026, Examples/second: 17.514715 I0420 08:12:13.708889 140375134025472 trainer.py:511] time: 8.945945 I0420 08:12:13.709841 140375134025472 trainer.py:522] step: 315 fraction_of_correct_next_step_preds:0.50379395 fraction_of_correct_next_step_preds/logits:0.50379395 grad_norm/all:0.72499329 grad_scale_all:1 log_pplx:1.631755 log_pplx/logits:1.631755 loss:1.631755 loss/logits:1.631755 num_samples_in_batch:128 var_norm/all:700.5824
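A note on reading these records (an editor's sketch, not part of the trainer output): `grad_scale_all` is numerically consistent with global-norm gradient clipping at a maximum norm of 1.0 — whenever `grad_norm/all` exceeds 1 (e.g. steps 259, 261, 263, 265) the scale equals 1/grad_norm, and otherwise it is 1. The clip norm of 1.0 is inferred from the numbers, not stated in the log. Separately, `loss` equals `log_pplx` here, so training perplexity is simply exp(loss).

```python
import math

def grad_scale(global_norm: float, clip_norm: float = 1.0) -> float:
    """Scale factor for global-norm gradient clipping: gradients are
    multiplied by this so their global norm never exceeds clip_norm."""
    return min(1.0, clip_norm / global_norm)

# Values taken from steps 259 and 260 above:
assert abs(grad_scale(1.1290226) - 0.88572186) < 1e-4  # norm > 1, so clipped
assert grad_scale(0.97840136) == 1.0                   # norm < 1, untouched

# loss == log_pplx, so the per-token perplexity at step 259 is:
print(round(math.exp(1.7213285), 2))  # ≈ 5.59
```

By step 315 the loss has dropped to about 1.63, i.e. a perplexity of roughly 5.1, which matches the slow but steady improvement visible across these records.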