<a href="https://colab.research.google.com/github/wenxuan0923/My-notes/blob/master/Tensor2Tensor.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Transformer model for language translation 
### --- With Tensor2Tensor

The **Tensor2Tensor** package, or T2T for short, is a library of deep learning models developed by the Google Brain team.


In this note I use T2T to train the **Transformer** model proposed in the paper <a href='https://arxiv.org/abs/1706.03762' target='_blank'>Attention Is All You Need</a> on an English-Chinese translation problem. A particularly useful feature of T2T's Transformer implementation is that it can visualize the multi-head attention layers.

- The model is built and trained on Google Colab with a GPU enabled.

- Python code is used for data generation and model training. The same steps can also be run from the command line with T2T's `t2t-datagen` and `t2t-trainer` tools.

- Please refer to this <a href='https://github.com/tensorflow/tensor2tensor/blob/master/tensor2tensor/notebooks/Transformer_translate.ipynb' target='_blank'>tutorial</a> for more details.

- An implementation of the Transformer encoder in pure Keras can be found <a href='#'>here</a>.



### Initialization

In [0]:
import os
import sys
import numpy as np
import collections
import matplotlib.pyplot as plt
# Colab-only TensorFlow version selector
if 'google.colab' in sys.modules: 
  %tensorflow_version 1.x
import tensorflow as tf
from tensor2tensor import models
from tensor2tensor import problems
from tensor2tensor.layers import common_layers
from tensor2tensor.utils import trainer_lib
from tensor2tensor.utils import t2t_model
from tensor2tensor.utils import registry
from tensor2tensor.utils import metrics
from tensor2tensor.utils.trainer_lib import (create_hparams, 
                                             create_run_config, 
                                             create_experiment)
# Enable TF Eager execution
tfe = tf.contrib.eager
tfe.enable_eager_execution()
# Other setup
Modes = tf.estimator.ModeKeys

### Set up some directories


In [0]:
# Not all of these folders are necessary;
# keep only the ones that suit your needs.

DATA_DIR = os.path.expanduser("/t2t/data") # Holds the training data
TMP_DIR = os.path.expanduser("/t2t/tmp")
TRAIN_DIR = os.path.expanduser("/t2t/train") # Holds all the checkpoints
EXPORT_DIR = os.path.expanduser("/t2t/export") # Holds the model exported for production
TRANSLATIONS_DIR = os.path.expanduser("/t2t/translation") # Holds all translated sequences
EVENT_DIR = os.path.expanduser("/t2t/event") # Used when testing the BLEU score
USR_DIR = os.path.expanduser("/t2t/user") # Holds any data of our own that we want to add
 
tf.gfile.MakeDirs(DATA_DIR)
tf.gfile.MakeDirs(TMP_DIR)
tf.gfile.MakeDirs(TRAIN_DIR)
tf.gfile.MakeDirs(EXPORT_DIR)
tf.gfile.MakeDirs(TRANSLATIONS_DIR)
tf.gfile.MakeDirs(EVENT_DIR)
tf.gfile.MakeDirs(USR_DIR)

This creates the folders shown below:
<center><img src='https://drive.google.com/uc?id=1vldUGgC5SaNVjlhXCLcOaAErVckdElGq'></img></center>
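
The same folder layout can also be created with the standard library alone, which may help when TensorFlow 1.x is not installed. The sketch below is a minimal example under stated assumptions: `make_t2t_dirs` and `SUBDIRS` are hypothetical names (not part of T2T), and a temporary folder stands in for `/t2t`.

```python
import os
import tempfile

# Hypothetical helper (not part of T2T): create the working folders
# used in this note under an arbitrary base directory.
SUBDIRS = ["data", "tmp", "train", "export", "translation", "event", "user"]

def make_t2t_dirs(base_dir):
    """Create each sub-folder under base_dir and return a name -> path dict."""
    paths = {}
    for name in SUBDIRS:
        path = os.path.join(base_dir, name)
        os.makedirs(path, exist_ok=True)  # no error if the folder already exists
        paths[name] = path
    return paths

# Assumption: a throwaway temporary folder stands in for "/t2t" here.
base = tempfile.mkdtemp(prefix="t2t_")
dirs = make_t2t_dirs(base)
print(sorted(os.listdir(base)))
```

Because `exist_ok=True` is set, calling the helper a second time is harmless, which mirrors how the cell above can be re-run safely.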

### Initialize parameters

A **Problem** is a dataset together with some **fixed pre-processing**.
It could be a translation dataset with a specific tokenization,
or an image dataset with a specific resolution.

In [0]:
# problems.available()   # Show all problems
# An English-Chinese dataset with a vocabulary of about 8192 (2**13) subword tokens
PROBLEM = 'translate_enzh_wmt8k' 

# registry.list_models() # Show all registered models
MODEL = 'transformer' 

# Default hyperparameter set for the model:
# start with 'transformer_base', or use 'transformer_base_single_gpu'
# when training on a single GPU
HPARAMS = 'transformer_base_single_gpu'

### Data Generation

Now we are ready to fetch the dataset for `PROBLEM` and process it into a standard format ready for training and evaluation. The generated data will be stored in the data folder we just created. In this run the step took about 31 minutes:

CPU times: user 29min 43s, sys: 1min 4s, total: 30min 48s
Wall time: 30min 54s

In [0]:
%%time
t2t_problem = problems.problem(PROBLEM)
t2t_problem.generate_data(DATA_DIR, TMP_DIR)
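
Once generation finishes, a quick sanity check is to count the files written for each split. The training log later in this note reads from `/t2t/data/translate_enzh_wmt8k-train*` and `/t2t/data/translate_enzh_wmt8k-dev*`, so the sketch below globs for those prefixes. `count_shards` is a hypothetical helper, and the demo runs against empty placeholder files with made-up names in a temporary folder, not against the real data folder.

```python
import glob
import os
import tempfile

def count_shards(data_dir, problem, split):
    """Count files in data_dir whose names start with '<problem>-<split>'."""
    pattern = os.path.join(data_dir, "%s-%s*" % (problem, split))
    return len(glob.glob(pattern))

# Demo on placeholder files; the names below are made up for illustration.
demo_dir = tempfile.mkdtemp()
for name in ("demo_problem-train-a", "demo_problem-train-b", "demo_problem-dev-a"):
    open(os.path.join(demo_dir, name), "w").close()

train_count = count_shards(demo_dir, "demo_problem", "train")
dev_count = count_shards(demo_dir, "demo_problem", "dev")
print(train_count, dev_count)
```

Against the real output, `count_shards(DATA_DIR, PROBLEM, 'train')` would be expected to agree with the `num_data_files` value printed in the training log (10 in the run shown here).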

### Train the model

In [0]:
# Set up the training parameters
train_steps = 20000              # Total number of training steps across all epochs
eval_steps = 100                 # Number of steps to run for each evaluation
batch_size = 1000                # For text problems, T2T counts this in tokens per batch, not sentences
save_checkpoints_steps = 1000    # Save a checkpoint every 1000 steps
ALPHA = 0.1                      # Learning rate
schedule = "continuous_train_and_eval"

# Initialize the HParams object
hparams = create_hparams(HPARAMS)
# Override selected hyperparameters
hparams.batch_size = batch_size
hparams.learning_rate = ALPHA

# To see all hyperparameters, uncomment the two lines below
# import json
# print(json.loads(hparams.to_json()))
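
For context on the learning rate, the Transformer paper pairs Adam with a rate that warms up linearly and then decays with the inverse square root of the step: `lrate = d_model**-0.5 * min(step**-0.5, step * warmup_steps**-1.5)`. The sketch below implements that formula in plain Python to show its shape. It uses the paper's defaults (`d_model=512`, `warmup_steps=4000`), and `paper_lr` is a hypothetical helper. The schedule and constants that the `transformer_base_single_gpu` hyperparameter set actually applies may differ.

```python
def paper_lr(step, d_model=512, warmup_steps=4000):
    """Learning rate from 'Attention Is All You Need', section 5.3."""
    step = max(step, 1)  # avoid 0 ** -0.5 at the very first step
    return d_model ** -0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)

# The rate rises until warmup_steps, then decays.
for s in (1, 1000, 4000, 20000):
    print(s, round(paper_lr(s), 6))
```

The peak sits at `step == warmup_steps`, where the two arguments of `min` are equal.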

In [6]:
# Train the model
RUN_CONFIG = create_run_config(
    model_dir=TRAIN_DIR,
    model_name=MODEL,
    save_checkpoints_steps=save_checkpoints_steps
)

tensorflow_exp_fn = create_experiment(
    run_config=RUN_CONFIG,
    hparams=hparams,
    model_name=MODEL,
    problem_name=PROBLEM,
    data_dir=DATA_DIR,
    train_steps=train_steps,
    eval_steps=eval_steps,
    use_xla=True  # For acceleration
)

tensorflow_exp_fn.train_and_evaluate()

Instructions for updating:
When switching to tf.estimator.Estimator, use tf.estimator.RunConfig instead.

INFO:tensorflow:Configuring DataParallelism to replicate the model.

INFO:tensorflow:schedule=continuous_train_and_eval

INFO:tensorflow:worker_gpu=1

INFO:tensorflow:sync=False

INFO:tensorflow:datashard_devices: ['gpu:0']

INFO:tensorflow:caching_devices: None

INFO:tensorflow:ps_devices: ['gpu:0']

INFO:tensorflow:Using config: {'_task_type': None, '_task_id': 0, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7fc30b08c358>, '_master': '', '_num_ps_replicas': 0, '_num_worker_replicas': 0, '_environment': 'local', '_is_chief': True, '_evaluation_master': '', '_train_distribute': None, '_eval_distribute': None, '_experimental_max_worker_delay_secs': None, '_device_fn': None, '_tf_config': gpu_options {
  per_process_gpu_memory_fraction: 1.0
}
, '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_secs': None, '_log_step_count_steps': 100, '_protocol': None, '_session_config': gpu_options {
  per_process_gpu_memory_fraction: 0.95
}
allow_soft_placement: true
graph_options {
  optimizer_options {
    global_jit_level: OFF
  }
}
isolate_session_state: true
, '_save_checkpoints_steps': 1000, '_keep_checkpoint_max': 20, '_keep_checkpoint_every_n_hours': 10000, '_model_dir': '/t2t/train', '_session_creation_timeout_secs': 7200, 'use_tpu'

INFO:tensorflow:Using ValidationMonitor

Instructions for updating:
Monitors are deprecated. Please use tf.train.SessionRunHook.

Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.

INFO:tensorflow:Reading data files from /t2t/data/translate_enzh_wmt8k-train*

INFO:tensorflow:partition: 0 num_data_files: 10

Instructions for updating:
Use `tf.data.Dataset.interleave(map_func, cycle_length, block_length, num_parallel_calls=tf.data.experimental.AUTOTUNE)` instead. If sloppy execution is desired, use `tf.data.Options.experimental_determinstic`.

Instructions for updating:
Use `tf.cast` instead.

Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where

Instructions for updating:
Use `tf.cast` instead.

INFO:tensorflow:Calling model_fn.

INFO:tensorflow:Unsetting shared_embedding_and_softmax_weights.

INFO:tensorflow:Setting T2TModel mode to 'train'

INFO:tensorflow:Using variable initializer: uniform_unit_scaling

INFO:tensorflow:Transforming feature 'inputs' with symbol_modality_8182_512.bottom

Instructions for updating:
If using Keras pass *_constraint arguments to layers.

INFO:tensorflow:Transforming feature 'targets' with symbol_modality_8267_512.targets_bottom

INFO:tensorflow:Building model body

Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.

INFO:tensorflow:Transforming body output with symbol_modality_8267_512.top

INFO:tensorflow:Base learning rate: 0.100000

INFO:tensorflow:Trainable Variables Total size: 56774656

INFO:tensorflow:Non-trainable variables Total size: 5

INFO:tensorflow:Using optimizer adam

INFO:tensorflow:Done calling model_fn.

INFO:tensorflow:Create CheckpointSaverHook.

INFO:tensorflow:Graph was finalized.

INFO:tensorflow:Running local_init_op.

INFO:tensorflow:Done running local_init_op.

INFO:tensorflow:Saving checkpoints for 0 into /t2t/train/model.ckpt.

INFO:tensorflow:loss = 8.206303, step = 0

INFO:tensorflow:global_step/sec: 4.3892

INFO:tensorflow:loss = 7.5571785, step = 100 (22.786 sec)

INFO:tensorflow:global_step/sec: 7.5983

INFO:tensorflow:loss = 7.07722, step = 200 (13.161 sec)

INFO:tensorflow:global_step/sec: 7.86938

INFO:tensorflow:loss = 6.700705, step = 300 (12.707 sec)

INFO:tensorflow:global_step/sec: 7.91829

INFO:tensorflow:loss = 6.7903023, step = 400 (12.629 sec)

INFO:tensorflow:global_step/sec: 8.0249

INFO:tensorflow:loss = 6.6664696, step = 500 (12.461 sec)

INFO:tensorflow:global_step/sec: 7.91077

INFO:tensorflow:loss = 6.165196, step = 600 (12.641 sec)

INFO:tensorflow:global_step/sec: 7.88527

INFO:tensorflow:loss = 6.374553, step = 700 (12.682 sec)

INFO:tensorflow:global_step/sec: 7.93233

INFO:tensorflow:loss = 6.281819, step = 800 (12.606 sec)

INFO:tensorflow:global_step/sec: 7.96413

INFO:tensorflow:loss = 6.169307, step = 900 (12.558 sec)

INFO:tensorflow:Saving checkpoints for 1000 into /t2t/train/model.ckpt.

INFO:tensorflow:global_step/sec: 6.50445

INFO:tensorflow:loss = 6.1104093, step = 1000 (15.372 sec)

INFO:tensorflow:global_step/sec: 7.93365

INFO:tensorflow:loss = 6.202379, step = 1100 (12.605 sec)

INFO:tensorflow:global_step/sec: 7.94012

INFO:tensorflow:loss = 5.659753, step = 1200 (12.598 sec)

INFO:tensorflow:global_step/sec: 7.88336

INFO:tensorflow:loss = 5.8959856, step = 1300 (12.684 sec)

INFO:tensorflow:global_step/sec: 7.96847

INFO:tensorflow:loss = 6.234469, step = 1400 (12.547 sec)

INFO:tensorflow:global_step/sec: 7.95553

INFO:tensorflow:loss = 5.8511863, step = 1500 (12.571 sec)

INFO:tensorflow:global_step/sec: 7.93897

INFO:tensorflow:loss = 6.192944, step = 1600 (12.595 sec)

INFO:tensorflow:global_step/sec: 7.79488

INFO:tensorflow:loss = 5.86918, step = 1700 (12.829 sec)

INFO:tensorflow:global_step/sec: 7.92909

INFO:tensorflow:loss = 5.92421, step = 1800 (12.612 sec)

INFO:tensorflow:global_step/sec: 7.89793

INFO:tensorflow:loss = 5.599897, step = 1900 (12.662 sec)

INFO:tensorflow:Saving checkpoints for 2000 into /t2t/train/model.ckpt.

INFO:tensorflow:global_step/sec: 6.42329

INFO:tensorflow:Reading data files from /t2t/data/translate_enzh_wmt8k-dev*

INFO:tensorflow:partition: 0 num_data_files: 1

INFO:tensorflow:Calling model_fn.

INFO:tensorflow:Unsetting shared_embedding_and_softmax_weights.

INFO:tensorflow:Setting T2TModel mode to 'eval'

INFO:tensorflow:Setting hparams.dropout to 0.0

INFO:tensorflow:Setting hparams.label_smoothing to 0.0

INFO:tensorflow:Setting hparams.layer_prepostprocess_dropout to 0.0

INFO:tensorflow:Setting hparams.symbol_dropout to 0.0

INFO:tensorflow:Setting hparams.attention_dropout to 0.0

INFO:tensorflow:Setting hparams.relu_dropout to 0.0

INFO:tensorflow:Using variable initializer: uniform_unit_scaling

INFO:tensorflow:Transforming feature 'inputs' with symbol_modality_8182_512.bottom

INFO:tensorflow:Transforming feature 'targets' with symbol_modality_8267_512.targets_bottom

INFO:tensorflow:Building model body

INFO:tensorflow:Transforming body output with symbol_modality_8267_512.top

Instructions for updating:
tf.py_func is deprecated in TF V2. Instead, there are two
    options available in V2.
    - tf.py_function takes a python function which manipulates tf eager
    tensors instead of numpy arrays. It's easy to convert a tf eager tensor to
    an ndarray (just call tensor.numpy()) but having access to eager tensors
    means `tf.py_function`s can use accelerators such as GPUs as well as
    being differentiable using a gradient tape.
    - tf.numpy_function maintains the semantics of the deprecated tf.py_func
    (it is not differentiable, and manipulates numpy arrays). It drops the
    stateful argument making all functions stateful.

INFO:tensorflow:Done calling model_fn.

INFO:tensorflow:Starting evaluation at 2020-06-08T20:51:56Z

INFO:tensorflow:Graph was finalized.

INFO:tensorflow:Restoring parameters from /t2t/train/model.ckpt-2000

INFO:tensorflow:Running local_init_op.

INFO:tensorflow:Done running local_init_op.

INFO:tensorflow:Evaluation [10/100]

INFO:tensorflow:Evaluation [20/100]

INFO:tensorflow:Evaluation [30/100]

INFO:tensorflow:Evaluation [40/100]

INFO:tensorflow:Evaluation [50/100]

INFO:tensorflow:Evaluation [60/100]

INFO:tensorflow:Evaluation [70/100]

INFO:tensorflow:Evaluation [80/100]

INFO:tensorflow:Evaluation [90/100]

INFO:tensorflow:Evaluation [100/100]

INFO:tensorflow:Finished evaluation at 2020-06-08-20:52:16

INFO:tensorflow:Saving dict for global step 2000: global_step = 2000, loss = 6.8611174, metrics-translate_enzh_wmt8k/targets/accuracy = 0.12910478, metrics-translate_enzh_wmt8k/targets/accuracy_per_sequence = 0.0, metrics-translate_enzh_wmt8k/targets/accuracy_top5 = 0.19365717, metrics-translate_enzh_wmt8k/targets/approx_bleu_score = 0.016224528, metrics-translate_enzh_wmt8k/targets/neg_log_perplexity = -6.940546, metrics-translate_enzh_wmt8k/targets/rouge_2_fscore = 0.057352167, metrics-translate_enzh_wmt8k/targets/rouge_L_fscore = 0.14293887

INFO:tensorflow:Saving 'checkpoint_path' summary for global step 2000: /t2t/train/model.ckpt-2000

INFO:tensorflow:Validation (step 2000): loss = 6.8611174, metrics-translate_enzh_wmt8k/targets/accuracy = 0.12910478, metrics-translate_enzh_wmt8k/targets/accuracy_per_sequence = 0.0, metrics-translate_enzh_wmt8k/targets/accuracy_top5 = 0.19365717, metrics-translate_enzh_wmt8k/targets/approx_bleu_score = 0.016224528, metrics-translate_enzh_wmt8k/targets/neg_log_perplexity = -6.940546, metrics-translate_enzh_wmt8k/targets/rouge_2_fscore = 0.057352167, metrics-translate_enzh_wmt8k/targets/rouge_L_fscore = 0.14293887, global_step = 2000

INFO:tensorflow:loss = 5.8742394, step = 2000 (43.713 sec)

INFO:tensorflow:global_step/sec: 2.44415

INFO:tensorflow:loss = 5.9558845, step = 2100 (12.769 sec)

INFO:tensorflow:global_step/sec: 7.85166

INFO:tensorflow:loss = 5.536512, step = 2200 (12.737 sec)

INFO:tensorflow:global_step/sec: 7.89637

INFO:tensorflow:loss = 5.578603, step = 2300 (12.664 sec)

INFO:tensorflow:global_step/sec: 7.51758

INFO:tensorflow:loss = 4.943216, step = 2400 (13.305 sec)

INFO:tensorflow:global_step/sec: 7.94624

INFO:tensorflow:loss = 5.595986, step = 2500 (12.580 sec)

INFO:tensorflow:global_step/sec: 7.8601

INFO:tensorflow:loss = 5.3691363, step = 2600 (12.724 sec)

INFO:tensorflow:global_step/sec: 7.88726

INFO:tensorflow:loss = 5.5195036, step = 2700 (12.678 sec)

INFO:tensorflow:global_step/sec: 7.88833

INFO:tensorflow:loss = 4.666626, step = 2800 (12.681 sec)

INFO:tensorflow:global_step/sec: 7.97324

INFO:tensorflow:loss = 5.0440574, step = 2900 (12.538 sec)

INFO:tensorflow:Saving checkpoints for 3000 into /t2t/train/model.ckpt.

INFO:tensorflow:global_step/sec: 6.42961

INFO:tensorflow:loss = 5.2860193, step = 3000 (15.553 sec)

INFO:tensorflow:global_step/sec: 7.86829

INFO:tensorflow:loss = 5.0830245, step = 3100 (12.710 sec)

INFO:tensorflow:global_step/sec: 7.83137

INFO:tensorflow:loss = 4.526812, step = 3200 (12.770 sec)

INFO:tensorflow:global_step/sec: 7.86331

INFO:tensorflow:loss = 4.875972, step = 3300 (12.717 sec)

INFO:tensorflow:global_step/sec: 7.88393

INFO:tensorflow:loss = 5.0665154, step = 3400 (12.684 sec)

INFO:tensorflow:global_step/sec: 7.83054

INFO:tensorflow:loss = 4.984199, step = 3500 (12.770 sec)

INFO:tensorflow:global_step/sec: 7.90185

INFO:tensorflow:loss = 4.906101, step = 3600 (12.655 sec)

INFO:tensorflow:global_step/sec: 7.81357

INFO:tensorflow:loss = 4.742843, step = 3700 (12.799 sec)

INFO:tensorflow:global_step/sec: 7.69966

INFO:tensorflow:loss = 4.8483458, step = 3800 (12.988 sec)

INFO:tensorflow:global_step/sec: 7.83676

INFO:tensorflow:loss = 5.015508, step = 3900 (12.760 sec)

INFO:tensorflow:Saving checkpoints for 4000 into /t2t/train/model.ckpt.

INFO:tensorflow:global_step/sec: 6.40118

INFO:tensorflow:Reading data files from /t2t/data/translate_enzh_wmt8k-dev*


INFO:tensorflow:Reading data files from /t2t/data/translate_enzh_wmt8k-dev*


INFO:tensorflow:partition: 0 num_data_files: 1


INFO:tensorflow:partition: 0 num_data_files: 1


























INFO:tensorflow:Calling model_fn.


INFO:tensorflow:Calling model_fn.


INFO:tensorflow:Unsetting shared_embedding_and_softmax_weights.


INFO:tensorflow:Unsetting shared_embedding_and_softmax_weights.


INFO:tensorflow:Setting T2TModel mode to 'eval'


INFO:tensorflow:Setting T2TModel mode to 'eval'


INFO:tensorflow:Setting hparams.dropout to 0.0


INFO:tensorflow:Setting hparams.dropout to 0.0


INFO:tensorflow:Setting hparams.label_smoothing to 0.0


INFO:tensorflow:Setting hparams.label_smoothing to 0.0


INFO:tensorflow:Setting hparams.layer_prepostprocess_dropout to 0.0


INFO:tensorflow:Setting hparams.layer_prepostprocess_dropout to 0.0


INFO:tensorflow:Setting hparams.symbol_dropout to 0.0


INFO:tensorflow:Setting hparams.symbol_dropout to 0.0


INFO:tensorflow:Setting hparams.attention_dropout to 0.0


INFO:tensorflow:Setting hparams.attention_dropout to 0.0


INFO:tensorflow:Setting hparams.relu_dropout to 0.0


INFO:tensorflow:Setting hparams.relu_dropout to 0.0






INFO:tensorflow:Using variable initializer: uniform_unit_scaling


INFO:tensorflow:Using variable initializer: uniform_unit_scaling


INFO:tensorflow:Transforming feature 'inputs' with symbol_modality_8182_512.bottom


INFO:tensorflow:Transforming feature 'inputs' with symbol_modality_8182_512.bottom


INFO:tensorflow:Transforming feature 'targets' with symbol_modality_8267_512.targets_bottom


INFO:tensorflow:Transforming feature 'targets' with symbol_modality_8267_512.targets_bottom


INFO:tensorflow:Building model body


INFO:tensorflow:Building model body


INFO:tensorflow:Transforming body output with symbol_modality_8267_512.top


INFO:tensorflow:Transforming body output with symbol_modality_8267_512.top


INFO:tensorflow:Done calling model_fn.


INFO:tensorflow:Done calling model_fn.


INFO:tensorflow:Starting evaluation at 2020-06-08T20:56:45Z


INFO:tensorflow:Starting evaluation at 2020-06-08T20:56:45Z


INFO:tensorflow:Graph was finalized.


INFO:tensorflow:Graph was finalized.


INFO:tensorflow:Restoring parameters from /t2t/train/model.ckpt-4000


INFO:tensorflow:Restoring parameters from /t2t/train/model.ckpt-4000


INFO:tensorflow:Running local_init_op.


INFO:tensorflow:Running local_init_op.


INFO:tensorflow:Done running local_init_op.


INFO:tensorflow:Done running local_init_op.


INFO:tensorflow:Evaluation [10/100]


INFO:tensorflow:Evaluation [10/100]


INFO:tensorflow:Evaluation [20/100]


INFO:tensorflow:Evaluation [20/100]


INFO:tensorflow:Evaluation [30/100]


INFO:tensorflow:Evaluation [30/100]


INFO:tensorflow:Evaluation [40/100]


INFO:tensorflow:Evaluation [40/100]


INFO:tensorflow:Evaluation [50/100]


INFO:tensorflow:Evaluation [50/100]


INFO:tensorflow:Evaluation [60/100]


INFO:tensorflow:Evaluation [60/100]


INFO:tensorflow:Evaluation [70/100]


INFO:tensorflow:Evaluation [70/100]


INFO:tensorflow:Evaluation [80/100]


INFO:tensorflow:Evaluation [80/100]


INFO:tensorflow:Evaluation [90/100]


INFO:tensorflow:Evaluation [90/100]


INFO:tensorflow:Evaluation [100/100]


INFO:tensorflow:Evaluation [100/100]


INFO:tensorflow:Finished evaluation at 2020-06-08-20:57:04


INFO:tensorflow:Finished evaluation at 2020-06-08-20:57:04


INFO:tensorflow:Saving dict for global step 4000: global_step = 4000, loss = 5.9013567, metrics-translate_enzh_wmt8k/targets/accuracy = 0.18352543, metrics-translate_enzh_wmt8k/targets/accuracy_per_sequence = 0.0, metrics-translate_enzh_wmt8k/targets/accuracy_top5 = 0.2958439, metrics-translate_enzh_wmt8k/targets/approx_bleu_score = 0.0242064, metrics-translate_enzh_wmt8k/targets/neg_log_perplexity = -6.001211, metrics-translate_enzh_wmt8k/targets/rouge_2_fscore = 0.068529755, metrics-translate_enzh_wmt8k/targets/rouge_L_fscore = 0.19286706


INFO:tensorflow:Saving 'checkpoint_path' summary for global step 4000: /t2t/train/model.ckpt-4000
INFO:tensorflow:Validation (step 4000): loss = 5.9013567, metrics-translate_enzh_wmt8k/targets/accuracy = 0.18352543, metrics-translate_enzh_wmt8k/targets/accuracy_per_sequence = 0.0, metrics-translate_enzh_wmt8k/targets/accuracy_top5 = 0.2958439, metrics-translate_enzh_wmt8k/targets/approx_bleu_score = 0.0242064, metrics-translate_enzh_wmt8k/targets/neg_log_perplexity = -6.001211, metrics-translate_enzh_wmt8k/targets/rouge_2_fscore = 0.068529755, metrics-translate_enzh_wmt8k/targets/rouge_L_fscore = 0.19286706, global_step = 4000


INFO:tensorflow:loss = 4.931782, step = 4000 (43.139 sec)
INFO:tensorflow:global_step/sec: 2.48199
INFO:tensorflow:loss = 4.745625, step = 4100 (12.774 sec)
INFO:tensorflow:global_step/sec: 7.80836
INFO:tensorflow:loss = 4.571465, step = 4200 (12.807 sec)
INFO:tensorflow:global_step/sec: 7.87564
INFO:tensorflow:loss = 4.1410484, step = 4300 (12.698 sec)
INFO:tensorflow:global_step/sec: 7.89674
INFO:tensorflow:loss = 3.799463, step = 4400 (12.663 sec)
INFO:tensorflow:global_step/sec: 7.78765
INFO:tensorflow:loss = 4.6475196, step = 4500 (12.841 sec)
INFO:tensorflow:global_step/sec: 7.52567
INFO:tensorflow:loss = 4.7759104, step = 4600 (13.290 sec)
INFO:tensorflow:global_step/sec: 7.86095
INFO:tensorflow:loss = 4.523374, step = 4700 (12.719 sec)
INFO:tensorflow:global_step/sec: 7.83284
INFO:tensorflow:loss = 4.2273765, step = 4800 (12.767 sec)
INFO:tensorflow:global_step/sec: 7.80726
INFO:tensorflow:loss = 4.6141863, step = 4900 (12.808 sec)
INFO:tensorflow:Saving checkpoints for 5000 into /t2t/train/model.ckpt.
INFO:tensorflow:global_step/sec: 6.42409
INFO:tensorflow:loss = 4.1609697, step = 5000 (15.570 sec)
INFO:tensorflow:global_step/sec: 7.92297
INFO:tensorflow:loss = 4.116477, step = 5100 (12.621 sec)
INFO:tensorflow:global_step/sec: 7.85644
INFO:tensorflow:loss = 3.8781662, step = 5200 (12.726 sec)
INFO:tensorflow:global_step/sec: 7.83803
INFO:tensorflow:loss = 3.8632264, step = 5300 (12.758 sec)
INFO:tensorflow:global_step/sec: 7.91871
INFO:tensorflow:loss = 4.195269, step = 5400 (12.632 sec)
INFO:tensorflow:global_step/sec: 7.84133
INFO:tensorflow:loss = 4.5809717, step = 5500 (12.749 sec)
INFO:tensorflow:global_step/sec: 7.8042
INFO:tensorflow:loss = 4.1958055, step = 5600 (12.814 sec)
INFO:tensorflow:global_step/sec: 7.80813
INFO:tensorflow:loss = 4.1584396, step = 5700 (12.808 sec)
INFO:tensorflow:global_step/sec: 7.8048
INFO:tensorflow:loss = 3.9134655, step = 5800 (12.812 sec)
INFO:tensorflow:global_step/sec: 7.70815
INFO:tensorflow:loss = 4.4913654, step = 5900 (12.973 sec)
INFO:tensorflow:Saving checkpoints for 6000 into /t2t/train/model.ckpt.
INFO:tensorflow:global_step/sec: 6.40582
INFO:tensorflow:Reading data files from /t2t/data/translate_enzh_wmt8k-dev*
INFO:tensorflow:partition: 0 num_data_files: 1
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Unsetting shared_embedding_and_softmax_weights.
INFO:tensorflow:Setting T2TModel mode to 'eval'
INFO:tensorflow:Setting hparams.dropout to 0.0
INFO:tensorflow:Setting hparams.label_smoothing to 0.0
INFO:tensorflow:Setting hparams.layer_prepostprocess_dropout to 0.0
INFO:tensorflow:Setting hparams.symbol_dropout to 0.0
INFO:tensorflow:Setting hparams.attention_dropout to 0.0
INFO:tensorflow:Setting hparams.relu_dropout to 0.0
INFO:tensorflow:Using variable initializer: uniform_unit_scaling
INFO:tensorflow:Transforming feature 'inputs' with symbol_modality_8182_512.bottom
INFO:tensorflow:Transforming feature 'targets' with symbol_modality_8267_512.targets_bottom
INFO:tensorflow:Building model body
INFO:tensorflow:Transforming body output with symbol_modality_8267_512.top
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2020-06-08T21:01:34Z
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from /t2t/train/model.ckpt-6000
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Evaluation [10/100]
INFO:tensorflow:Evaluation [20/100]
INFO:tensorflow:Evaluation [30/100]
INFO:tensorflow:Evaluation [40/100]
INFO:tensorflow:Evaluation [50/100]
INFO:tensorflow:Evaluation [60/100]
INFO:tensorflow:Evaluation [70/100]
INFO:tensorflow:Evaluation [80/100]
INFO:tensorflow:Evaluation [90/100]
INFO:tensorflow:Evaluation [100/100]
INFO:tensorflow:Finished evaluation at 2020-06-08-21:01:53
INFO:tensorflow:Saving dict for global step 6000: global_step = 6000, loss = 5.4428215, metrics-translate_enzh_wmt8k/targets/accuracy = 0.21099804, metrics-translate_enzh_wmt8k/targets/accuracy_per_sequence = 0.0, metrics-translate_enzh_wmt8k/targets/accuracy_top5 = 0.34464413, metrics-translate_enzh_wmt8k/targets/approx_bleu_score = 0.031128103, metrics-translate_enzh_wmt8k/targets/neg_log_perplexity = -5.547634, metrics-translate_enzh_wmt8k/targets/rouge_2_fscore = 0.08039985, metrics-translate_enzh_wmt8k/targets/rouge_L_fscore = 0.21557666


INFO:tensorflow:Saving 'checkpoint_path' summary for global step 6000: /t2t/train/model.ckpt-6000
INFO:tensorflow:Validation (step 6000): loss = 5.4428215, metrics-translate_enzh_wmt8k/targets/accuracy = 0.21099804, metrics-translate_enzh_wmt8k/targets/accuracy_per_sequence = 0.0, metrics-translate_enzh_wmt8k/targets/accuracy_top5 = 0.34464413, metrics-translate_enzh_wmt8k/targets/approx_bleu_score = 0.031128103, metrics-translate_enzh_wmt8k/targets/neg_log_perplexity = -5.547634, metrics-translate_enzh_wmt8k/targets/rouge_2_fscore = 0.08039985, metrics-translate_enzh_wmt8k/targets/rouge_L_fscore = 0.21557666, global_step = 6000


INFO:tensorflow:loss = 4.4349403, step = 6000 (43.111 sec)
INFO:tensorflow:global_step/sec: 2.49258
INFO:tensorflow:loss = 3.8265464, step = 6100 (12.619 sec)
INFO:tensorflow:global_step/sec: 7.89843
INFO:tensorflow:loss = 3.7733388, step = 6200 (12.663 sec)
INFO:tensorflow:global_step/sec: 7.8487
INFO:tensorflow:loss = 3.9183242, step = 6300 (12.738 sec)
INFO:tensorflow:global_step/sec: 7.83041
INFO:tensorflow:loss = 4.308501, step = 6400 (12.771 sec)
INFO:tensorflow:global_step/sec: 7.86937
INFO:tensorflow:loss = 3.960894, step = 6500 (12.708 sec)
INFO:tensorflow:global_step/sec: 7.87572
INFO:tensorflow:loss = 4.272045, step = 6600 (12.697 sec)
INFO:tensorflow:global_step/sec: 7.63881
INFO:tensorflow:loss = 4.099188, step = 6700 (13.091 sec)
INFO:tensorflow:global_step/sec: 7.61835
INFO:tensorflow:loss = 3.9703505, step = 6800 (13.126 sec)
INFO:tensorflow:global_step/sec: 7.9061
INFO:tensorflow:loss = 4.1397934, step = 6900 (12.648 sec)
INFO:tensorflow:Saving checkpoints for 7000 into /t2t/train/model.ckpt.
INFO:tensorflow:global_step/sec: 5.88125
INFO:tensorflow:loss = 4.271044, step = 7000 (17.003 sec)
INFO:tensorflow:global_step/sec: 7.77701
INFO:tensorflow:loss = 3.7244186, step = 7100 (12.859 sec)
INFO:tensorflow:global_step/sec: 7.80699
INFO:tensorflow:loss = 3.189792, step = 7200 (12.810 sec)
INFO:tensorflow:global_step/sec: 7.73773
INFO:tensorflow:loss = 3.8653927, step = 7300 (12.922 sec)
INFO:tensorflow:global_step/sec: 7.79588
INFO:tensorflow:loss = 4.342372, step = 7400 (12.828 sec)
INFO:tensorflow:global_step/sec: 7.83059
INFO:tensorflow:loss = 3.274453, step = 7500 (12.770 sec)
INFO:tensorflow:global_step/sec: 7.77913
INFO:tensorflow:loss = 3.466512, step = 7600 (12.855 sec)
INFO:tensorflow:global_step/sec: 7.80131
INFO:tensorflow:loss = 3.9678297, step = 7700 (12.819 sec)
INFO:tensorflow:global_step/sec: 7.83772
INFO:tensorflow:loss = 3.5934412, step = 7800 (12.759 sec)
INFO:tensorflow:global_step/sec: 7.81984
INFO:tensorflow:loss = 3.7750864, step = 7900 (12.787 sec)
INFO:tensorflow:Saving checkpoints for 8000 into /t2t/train/model.ckpt.
INFO:tensorflow:global_step/sec: 6.44879
INFO:tensorflow:Reading data files from /t2t/data/translate_enzh_wmt8k-dev*
INFO:tensorflow:partition: 0 num_data_files: 1
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Unsetting shared_embedding_and_softmax_weights.
INFO:tensorflow:Setting T2TModel mode to 'eval'
INFO:tensorflow:Setting hparams.dropout to 0.0
INFO:tensorflow:Setting hparams.label_smoothing to 0.0
INFO:tensorflow:Setting hparams.layer_prepostprocess_dropout to 0.0
INFO:tensorflow:Setting hparams.symbol_dropout to 0.0
INFO:tensorflow:Setting hparams.attention_dropout to 0.0
INFO:tensorflow:Setting hparams.relu_dropout to 0.0
INFO:tensorflow:Using variable initializer: uniform_unit_scaling
INFO:tensorflow:Transforming feature 'inputs' with symbol_modality_8182_512.bottom
INFO:tensorflow:Transforming feature 'targets' with symbol_modality_8267_512.targets_bottom
INFO:tensorflow:Building model body
INFO:tensorflow:Transforming body output with symbol_modality_8267_512.top
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2020-06-08T21:06:25Z
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from /t2t/train/model.ckpt-8000
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Evaluation [10/100]
INFO:tensorflow:Evaluation [20/100]
INFO:tensorflow:Evaluation [30/100]
INFO:tensorflow:Evaluation [40/100]
INFO:tensorflow:Evaluation [50/100]
INFO:tensorflow:Evaluation [60/100]
INFO:tensorflow:Evaluation [70/100]
INFO:tensorflow:Evaluation [80/100]
INFO:tensorflow:Evaluation [90/100]
INFO:tensorflow:Evaluation [100/100]
INFO:tensorflow:Finished evaluation at 2020-06-08-21:06:44
INFO:tensorflow:Saving dict for global step 8000: global_step = 8000, loss = 5.1076803, metrics-translate_enzh_wmt8k/targets/accuracy = 0.2314114, metrics-translate_enzh_wmt8k/targets/accuracy_per_sequence = 0.0, metrics-translate_enzh_wmt8k/targets/accuracy_top5 = 0.38569567, metrics-translate_enzh_wmt8k/targets/approx_bleu_score = 0.039085366, metrics-translate_enzh_wmt8k/targets/neg_log_perplexity = -5.219601, metrics-translate_enzh_wmt8k/targets/rouge_2_fscore = 0.08715356, metrics-translate_enzh_wmt8k/targets/rouge_L_fscore = 0.2355536


INFO:tensorflow:Saving 'checkpoint_path' summary for global step 8000: /t2t/train/model.ckpt-8000
INFO:tensorflow:Validation (step 8000): loss = 5.1076803, metrics-translate_enzh_wmt8k/targets/accuracy = 0.2314114, metrics-translate_enzh_wmt8k/targets/accuracy_per_sequence = 0.0, metrics-translate_enzh_wmt8k/targets/accuracy_top5 = 0.38569567, metrics-translate_enzh_wmt8k/targets/approx_bleu_score = 0.039085366, metrics-translate_enzh_wmt8k/targets/neg_log_perplexity = -5.219601, metrics-translate_enzh_wmt8k/targets/rouge_2_fscore = 0.08715356, metrics-translate_enzh_wmt8k/targets/rouge_L_fscore = 0.2355536, global_step = 8000


INFO:tensorflow:loss = 3.9268389, step = 8000 (42.974 sec)
INFO:tensorflow:global_step/sec: 2.48751
INFO:tensorflow:loss = 3.2559953, step = 8100 (12.734 sec)
INFO:tensorflow:global_step/sec: 7.78766
INFO:tensorflow:loss = 4.0477905, step = 8200 (12.840 sec)
INFO:tensorflow:global_step/sec: 7.76412
INFO:tensorflow:loss = 3.8048196, step = 8300 (12.880 sec)
INFO:tensorflow:global_step/sec: 7.74372
INFO:tensorflow:loss = 3.6556268, step = 8400 (12.914 sec)
INFO:tensorflow:global_step/sec: 7.76052
INFO:tensorflow:loss = 3.2806036, step = 8500 (12.887 sec)
INFO:tensorflow:global_step/sec: 7.72373
INFO:tensorflow:loss = 3.1070733, step = 8600 (12.946 sec)
INFO:tensorflow:global_step/sec: 7.65521
INFO:tensorflow:loss = 3.487856, step = 8700 (13.066 sec)
INFO:tensorflow:global_step/sec: 7.68865
INFO:tensorflow:loss = 3.4546196, step = 8800 (13.004 sec)
INFO:tensorflow:global_step/sec: 7.41573
INFO:tensorflow:loss = 4.0095267, step = 8900 (13.485 sec)
INFO:tensorflow:Saving checkpoints for 9000 into /t2t/train/model.ckpt.
INFO:tensorflow:global_step/sec: 6.44246
INFO:tensorflow:loss = 3.5430999, step = 9000 (15.522 sec)
INFO:tensorflow:global_step/sec: 7.79625
INFO:tensorflow:loss = 3.3563914, step = 9100 (12.831 sec)
INFO:tensorflow:global_step/sec: 7.82176
INFO:tensorflow:loss = 3.2364528, step = 9200 (12.781 sec)
INFO:tensorflow:global_step/sec: 7.70843
INFO:tensorflow:loss = 3.81494, step = 9300 (12.974 sec)
INFO:tensorflow:global_step/sec: 7.60069
INFO:tensorflow:loss = 3.9500952, step = 9400 (13.155 sec)
INFO:tensorflow:global_step/sec: 7.55725
INFO:tensorflow:loss = 3.1188602, step = 9500 (13.235 sec)
INFO:tensorflow:global_step/sec: 7.65204
INFO:tensorflow:loss = 3.5616543, step = 9600 (13.066 sec)
INFO:tensorflow:global_step/sec: 7.63591
INFO:tensorflow:loss = 3.5397677, step = 9700 (13.098 sec)
INFO:tensorflow:global_step/sec: 7.67388
INFO:tensorflow:loss = 3.5219703, step = 9800 (13.030 sec)
INFO:tensorflow:global_step/sec: 7.62452
INFO:tensorflow:loss = 3.6437066, step = 9900 (13.115 sec)
INFO:tensorflow:Saving checkpoints for 10000 into /t2t/train/model.ckpt.
INFO:tensorflow:global_step/sec: 6.47447
INFO:tensorflow:Reading data files from /t2t/data/translate_enzh_wmt8k-dev*
INFO:tensorflow:partition: 0 num_data_files: 1
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Unsetting shared_embedding_and_softmax_weights.
INFO:tensorflow:Setting T2TModel mode to 'eval'
INFO:tensorflow:Setting hparams.dropout to 0.0
INFO:tensorflow:Setting hparams.label_smoothing to 0.0
INFO:tensorflow:Setting hparams.layer_prepostprocess_dropout to 0.0
INFO:tensorflow:Setting hparams.symbol_dropout to 0.0
INFO:tensorflow:Setting hparams.attention_dropout to 0.0
INFO:tensorflow:Setting hparams.relu_dropout to 0.0
INFO:tensorflow:Using variable initializer: uniform_unit_scaling
INFO:tensorflow:Transforming feature 'inputs' with symbol_modality_8182_512.bottom
INFO:tensorflow:Transforming feature 'targets' with symbol_modality_8267_512.targets_bottom
INFO:tensorflow:Building model body
INFO:tensorflow:Transforming body output with symbol_modality_8267_512.top
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2020-06-08T21:11:17Z
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from /t2t/train/model.ckpt-10000
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Evaluation [10/100]
INFO:tensorflow:Evaluation [20/100]
INFO:tensorflow:Evaluation [30/100]
INFO:tensorflow:Evaluation [40/100]
INFO:tensorflow:Evaluation [50/100]
INFO:tensorflow:Evaluation [60/100]
INFO:tensorflow:Evaluation [70/100]
INFO:tensorflow:Evaluation [80/100]
INFO:tensorflow:Evaluation [90/100]
INFO:tensorflow:Evaluation [100/100]
INFO:tensorflow:Finished evaluation at 2020-06-08-21:11:36
INFO:tensorflow:Saving dict for global step 10000: global_step = 10000, loss = 4.862646, metrics-translate_enzh_wmt8k/targets/accuracy = 0.25082058, metrics-translate_enzh_wmt8k/targets/accuracy_per_sequence = 0.0, metrics-translate_enzh_wmt8k/targets/accuracy_top5 = 0.41562626, metrics-translate_enzh_wmt8k/targets/approx_bleu_score = 0.049977608, metrics-translate_enzh_wmt8k/targets/neg_log_perplexity = -4.9802423, metrics-translate_enzh_wmt8k/targets/rouge_2_fscore = 0.10371607, metrics-translate_enzh_wmt8k/targets/rouge_L_fscore = 0.25396776


INFO:tensorflow:Saving 'checkpoint_path' summary for global step 10000: /t2t/train/model.ckpt-10000
INFO:tensorflow:Validation (step 10000): loss = 4.862646, metrics-translate_enzh_wmt8k/targets/accuracy = 0.25082058, metrics-translate_enzh_wmt8k/targets/accuracy_per_sequence = 0.0, metrics-translate_enzh_wmt8k/targets/accuracy_top5 = 0.41562626, metrics-translate_enzh_wmt8k/targets/approx_bleu_score = 0.049977608, metrics-translate_enzh_wmt8k/targets/neg_log_perplexity = -4.9802423, metrics-translate_enzh_wmt8k/targets/rouge_2_fscore = 0.10371607, metrics-translate_enzh_wmt8k/targets/rouge_L_fscore = 0.25396776, global_step = 10000


INFO:tensorflow:loss = 2.8132887, step = 10000 (43.006 sec)


INFO:tensorflow:global_step/sec: 2.45643


INFO:tensorflow:loss = 3.3376534, step = 10100 (13.149 sec)


INFO:tensorflow:global_step/sec: 7.69575


INFO:tensorflow:loss = 3.5048878, step = 10200 (12.997 sec)


INFO:tensorflow:global_step/sec: 7.64926


INFO:tensorflow:loss = 3.420542, step = 10300 (13.070 sec)


INFO:tensorflow:global_step/sec: 7.71005


INFO:tensorflow:loss = 3.243301, step = 10400 (12.970 sec)


INFO:tensorflow:global_step/sec: 7.61012


INFO:tensorflow:loss = 2.9840086, step = 10500 (13.140 sec)


INFO:tensorflow:global_step/sec: 7.5639


INFO:tensorflow:loss = 3.5589929, step = 10600 (13.224 sec)


INFO:tensorflow:global_step/sec: 7.6846


INFO:tensorflow:loss = 3.3088934, step = 10700 (13.010 sec)


INFO:tensorflow:global_step/sec: 7.66668


INFO:tensorflow:loss = 3.480203, step = 10800 (13.044 sec)


INFO:tensorflow:global_step/sec: 7.66992


INFO:tensorflow:loss = 3.5424097, step = 10900 (13.038 sec)


INFO:tensorflow:Saving checkpoints for 11000 into /t2t/train/model.ckpt.


INFO:tensorflow:global_step/sec: 6.13897


INFO:tensorflow:loss = 3.029024, step = 11000 (16.289 sec)


INFO:tensorflow:global_step/sec: 7.64124


INFO:tensorflow:loss = 3.194232, step = 11100 (13.088 sec)


INFO:tensorflow:global_step/sec: 7.74304


INFO:tensorflow:loss = 3.3059125, step = 11200 (12.914 sec)


INFO:tensorflow:global_step/sec: 7.68464


INFO:tensorflow:loss = 3.4165974, step = 11300 (13.013 sec)


INFO:tensorflow:global_step/sec: 7.63459


INFO:tensorflow:loss = 3.795646, step = 11400 (13.102 sec)


INFO:tensorflow:global_step/sec: 7.63216


INFO:tensorflow:loss = 3.5605738, step = 11500 (13.099 sec)


INFO:tensorflow:global_step/sec: 7.6724


INFO:tensorflow:loss = 3.098699, step = 11600 (13.033 sec)


INFO:tensorflow:global_step/sec: 7.6647


INFO:tensorflow:loss = 3.1131926, step = 11700 (13.050 sec)


INFO:tensorflow:global_step/sec: 7.60219


INFO:tensorflow:loss = 3.112894, step = 11800 (13.151 sec)


INFO:tensorflow:global_step/sec: 7.69442


INFO:tensorflow:loss = 3.3593385, step = 11900 (12.997 sec)


INFO:tensorflow:Saving checkpoints for 12000 into /t2t/train/model.ckpt.


INFO:tensorflow:global_step/sec: 6.44333


INFO:tensorflow:Reading data files from /t2t/data/translate_enzh_wmt8k-dev*


INFO:tensorflow:partition: 0 num_data_files: 1


INFO:tensorflow:Calling model_fn.


INFO:tensorflow:Unsetting shared_embedding_and_softmax_weights.


INFO:tensorflow:Setting T2TModel mode to 'eval'


INFO:tensorflow:Setting hparams.dropout to 0.0


INFO:tensorflow:Setting hparams.label_smoothing to 0.0


INFO:tensorflow:Setting hparams.layer_prepostprocess_dropout to 0.0


INFO:tensorflow:Setting hparams.symbol_dropout to 0.0


INFO:tensorflow:Setting hparams.attention_dropout to 0.0


INFO:tensorflow:Setting hparams.relu_dropout to 0.0


INFO:tensorflow:Using variable initializer: uniform_unit_scaling


INFO:tensorflow:Transforming feature 'inputs' with symbol_modality_8182_512.bottom


INFO:tensorflow:Transforming feature 'targets' with symbol_modality_8267_512.targets_bottom


INFO:tensorflow:Building model body


INFO:tensorflow:Transforming body output with symbol_modality_8267_512.top


INFO:tensorflow:Done calling model_fn.


INFO:tensorflow:Starting evaluation at 2020-06-08T21:16:12Z


INFO:tensorflow:Graph was finalized.


INFO:tensorflow:Restoring parameters from /t2t/train/model.ckpt-12000


INFO:tensorflow:Running local_init_op.


INFO:tensorflow:Done running local_init_op.


INFO:tensorflow:Evaluation [10/100]


INFO:tensorflow:Evaluation [20/100]


INFO:tensorflow:Evaluation [30/100]


INFO:tensorflow:Evaluation [40/100]


INFO:tensorflow:Evaluation [50/100]


INFO:tensorflow:Evaluation [60/100]


INFO:tensorflow:Evaluation [70/100]


INFO:tensorflow:Evaluation [80/100]


INFO:tensorflow:Evaluation [90/100]


INFO:tensorflow:Evaluation [100/100]


INFO:tensorflow:Finished evaluation at 2020-06-08-21:16:32


INFO:tensorflow:Saving dict for global step 12000: global_step = 12000, loss = 4.6434674, metrics-translate_enzh_wmt8k/targets/accuracy = 0.27270275, metrics-translate_enzh_wmt8k/targets/accuracy_per_sequence = 0.0, metrics-translate_enzh_wmt8k/targets/accuracy_top5 = 0.44263425, metrics-translate_enzh_wmt8k/targets/approx_bleu_score = 0.061126143, metrics-translate_enzh_wmt8k/targets/neg_log_perplexity = -4.759766, metrics-translate_enzh_wmt8k/targets/rouge_2_fscore = 0.11572767, metrics-translate_enzh_wmt8k/targets/rouge_L_fscore = 0.2719984


INFO:tensorflow:Saving 'checkpoint_path' summary for global step 12000: /t2t/train/model.ckpt-12000


INFO:tensorflow:Validation (step 12000): loss = 4.6434674, metrics-translate_enzh_wmt8k/targets/accuracy = 0.27270275, metrics-translate_enzh_wmt8k/targets/accuracy_per_sequence = 0.0, metrics-translate_enzh_wmt8k/targets/accuracy_top5 = 0.44263425, metrics-translate_enzh_wmt8k/targets/approx_bleu_score = 0.061126143, metrics-translate_enzh_wmt8k/targets/neg_log_perplexity = -4.759766, metrics-translate_enzh_wmt8k/targets/rouge_2_fscore = 0.11572767, metrics-translate_enzh_wmt8k/targets/rouge_L_fscore = 0.2719984, global_step = 12000


INFO:tensorflow:loss = 2.7099621, step = 12000 (43.984 sec)


INFO:tensorflow:global_step/sec: 2.41289


INFO:tensorflow:loss = 3.1451695, step = 12100 (12.980 sec)


INFO:tensorflow:global_step/sec: 7.60814


INFO:tensorflow:loss = 3.8083255, step = 12200 (13.144 sec)


INFO:tensorflow:global_step/sec: 7.68463


INFO:tensorflow:loss = 3.2100718, step = 12300 (13.012 sec)


INFO:tensorflow:global_step/sec: 7.68151


INFO:tensorflow:loss = 3.2407572, step = 12400 (13.019 sec)


INFO:tensorflow:global_step/sec: 7.66143


INFO:tensorflow:loss = 3.0543363, step = 12500 (13.052 sec)


INFO:tensorflow:global_step/sec: 7.69756


INFO:tensorflow:loss = 3.0307634, step = 12600 (12.994 sec)


INFO:tensorflow:global_step/sec: 7.62283


INFO:tensorflow:loss = 2.9677067, step = 12700 (13.115 sec)


INFO:tensorflow:global_step/sec: 7.72298


INFO:tensorflow:loss = 3.1478102, step = 12800 (12.949 sec)


INFO:tensorflow:global_step/sec: 7.69831


INFO:tensorflow:loss = 2.9882126, step = 12900 (12.990 sec)


INFO:tensorflow:Saving checkpoints for 13000 into /t2t/train/model.ckpt.


INFO:tensorflow:global_step/sec: 6.31107


INFO:tensorflow:loss = 3.5916572, step = 13000 (15.844 sec)


INFO:tensorflow:global_step/sec: 7.33996


INFO:tensorflow:loss = 3.0502279, step = 13100 (13.625 sec)


INFO:tensorflow:global_step/sec: 7.6031


INFO:tensorflow:loss = 2.7239337, step = 13200 (13.152 sec)


INFO:tensorflow:global_step/sec: 7.65394


INFO:tensorflow:loss = 3.0620496, step = 13300 (13.065 sec)


INFO:tensorflow:global_step/sec: 7.72784


INFO:tensorflow:loss = 2.5952446, step = 13400 (12.940 sec)


INFO:tensorflow:global_step/sec: 7.63351


INFO:tensorflow:loss = 3.6884592, step = 13500 (13.100 sec)


INFO:tensorflow:global_step/sec: 7.67366


INFO:tensorflow:loss = 3.1284928, step = 13600 (13.035 sec)


INFO:tensorflow:global_step/sec: 7.68466


INFO:tensorflow:loss = 2.9659512, step = 13700 (13.010 sec)


INFO:tensorflow:global_step/sec: 7.69469


INFO:tensorflow:loss = 3.202089, step = 13800 (12.996 sec)


INFO:tensorflow:global_step/sec: 7.6342


INFO:tensorflow:loss = 2.7806702, step = 13900 (13.100 sec)


INFO:tensorflow:Saving checkpoints for 14000 into /t2t/train/model.ckpt.


INFO:tensorflow:global_step/sec: 6.39621


INFO:tensorflow:Reading data files from /t2t/data/translate_enzh_wmt8k-dev*


INFO:tensorflow:partition: 0 num_data_files: 1


INFO:tensorflow:Calling model_fn.


INFO:tensorflow:Unsetting shared_embedding_and_softmax_weights.


INFO:tensorflow:Setting T2TModel mode to 'eval'


INFO:tensorflow:Setting hparams.dropout to 0.0


INFO:tensorflow:Setting hparams.label_smoothing to 0.0


INFO:tensorflow:Setting hparams.layer_prepostprocess_dropout to 0.0


INFO:tensorflow:Setting hparams.symbol_dropout to 0.0


INFO:tensorflow:Setting hparams.attention_dropout to 0.0


INFO:tensorflow:Setting hparams.relu_dropout to 0.0


INFO:tensorflow:Using variable initializer: uniform_unit_scaling


INFO:tensorflow:Transforming feature 'inputs' with symbol_modality_8182_512.bottom


INFO:tensorflow:Transforming feature 'targets' with symbol_modality_8267_512.targets_bottom


INFO:tensorflow:Building model body


INFO:tensorflow:Transforming body output with symbol_modality_8267_512.top


INFO:tensorflow:Done calling model_fn.


INFO:tensorflow:Starting evaluation at 2020-06-08T21:21:07Z


INFO:tensorflow:Graph was finalized.


INFO:tensorflow:Restoring parameters from /t2t/train/model.ckpt-14000


INFO:tensorflow:Running local_init_op.


INFO:tensorflow:Done running local_init_op.


INFO:tensorflow:Evaluation [10/100]


INFO:tensorflow:Evaluation [20/100]


INFO:tensorflow:Evaluation [30/100]


INFO:tensorflow:Evaluation [40/100]


INFO:tensorflow:Evaluation [50/100]


INFO:tensorflow:Evaluation [60/100]


INFO:tensorflow:Evaluation [70/100]


INFO:tensorflow:Evaluation [80/100]


INFO:tensorflow:Evaluation [90/100]


INFO:tensorflow:Evaluation [100/100]


INFO:tensorflow:Finished evaluation at 2020-06-08-21:21:27


INFO:tensorflow:Saving dict for global step 14000: global_step = 14000, loss = 4.475232, metrics-translate_enzh_wmt8k/targets/accuracy = 0.28737578, metrics-translate_enzh_wmt8k/targets/accuracy_per_sequence = 0.0, metrics-translate_enzh_wmt8k/targets/accuracy_top5 = 0.46610513, metrics-translate_enzh_wmt8k/targets/approx_bleu_score = 0.06475938, metrics-translate_enzh_wmt8k/targets/neg_log_perplexity = -4.5849752, metrics-translate_enzh_wmt8k/targets/rouge_2_fscore = 0.11844131, metrics-translate_enzh_wmt8k/targets/rouge_L_fscore = 0.28226086


INFO:tensorflow:Saving 'checkpoint_path' summary for global step 14000: /t2t/train/model.ckpt-14000


INFO:tensorflow:Validation (step 14000): loss = 4.475232, metrics-translate_enzh_wmt8k/targets/accuracy = 0.28737578, metrics-translate_enzh_wmt8k/targets/accuracy_per_sequence = 0.0, metrics-translate_enzh_wmt8k/targets/accuracy_top5 = 0.46610513, metrics-translate_enzh_wmt8k/targets/approx_bleu_score = 0.06475938, metrics-translate_enzh_wmt8k/targets/neg_log_perplexity = -4.5849752, metrics-translate_enzh_wmt8k/targets/rouge_2_fscore = 0.11844131, metrics-translate_enzh_wmt8k/targets/rouge_L_fscore = 0.28226086, global_step = 14000


INFO:tensorflow:loss = 3.0895848, step = 14000 (43.920 sec)


INFO:tensorflow:global_step/sec: 2.3971


INFO:tensorflow:loss = 2.8645997, step = 14100 (13.430 sec)


INFO:tensorflow:global_step/sec: 7.55984


INFO:tensorflow:loss = 3.4607165, step = 14200 (13.227 sec)


INFO:tensorflow:global_step/sec: 7.5826


INFO:tensorflow:loss = 2.7654612, step = 14300 (13.189 sec)


INFO:tensorflow:global_step/sec: 7.78852


INFO:tensorflow:loss = 2.8008876, step = 14400 (12.841 sec)


INFO:tensorflow:global_step/sec: 7.64164


INFO:tensorflow:loss = 3.4806154, step = 14500 (13.085 sec)


INFO:tensorflow:global_step/sec: 7.71831


INFO:tensorflow:loss = 3.5537312, step = 14600 (12.956 sec)


INFO:tensorflow:global_step/sec: 7.64594


INFO:tensorflow:loss = 3.0466561, step = 14700 (13.079 sec)


INFO:tensorflow:global_step/sec: 7.69692


INFO:tensorflow:loss = 3.4498734, step = 14800 (12.993 sec)


INFO:tensorflow:global_step/sec: 7.63279


INFO:tensorflow:loss = 2.7698355, step = 14900 (13.101 sec)


INFO:tensorflow:Saving checkpoints for 15000 into /t2t/train/model.ckpt.


INFO:tensorflow:global_step/sec: 6.35616


INFO:tensorflow:loss = 3.3396697, step = 15000 (15.732 sec)


INFO:tensorflow:global_step/sec: 7.65014


INFO:tensorflow:loss = 3.2845938, step = 15100 (13.074 sec)


INFO:tensorflow:global_step/sec: 7.51421


INFO:tensorflow:loss = 2.6357763, step = 15200 (13.307 sec)


INFO:tensorflow:global_step/sec: 7.54374


INFO:tensorflow:loss = 2.397761, step = 15300 (13.256 sec)


INFO:tensorflow:global_step/sec: 7.66447


INFO:tensorflow:loss = 2.6766994, step = 15400 (13.047 sec)


INFO:tensorflow:global_step/sec: 7.66887


INFO:tensorflow:loss = 3.1618454, step = 15500 (13.040 sec)


INFO:tensorflow:global_step/sec: 7.65278


INFO:tensorflow:loss = 2.7829573, step = 15600 (13.068 sec)


INFO:tensorflow:global_step/sec: 7.69855


INFO:tensorflow:loss = 2.9106302, step = 15700 (12.988 sec)


INFO:tensorflow:global_step/sec: 7.56068


INFO:tensorflow:loss = 2.6952105, step = 15800 (13.228 sec)


INFO:tensorflow:global_step/sec: 7.63384


INFO:tensorflow:loss = 2.7055595, step = 15900 (13.103 sec)


INFO:tensorflow:Saving checkpoints for 16000 into /t2t/train/model.ckpt.


INFO:tensorflow:global_step/sec: 6.2568


INFO:tensorflow:Reading data files from /t2t/data/translate_enzh_wmt8k-dev*


INFO:tensorflow:partition: 0 num_data_files: 1


INFO:tensorflow:Calling model_fn.


INFO:tensorflow:Unsetting shared_embedding_and_softmax_weights.


INFO:tensorflow:Setting T2TModel mode to 'eval'


INFO:tensorflow:Setting hparams.dropout to 0.0


INFO:tensorflow:Setting hparams.label_smoothing to 0.0


INFO:tensorflow:Setting hparams.layer_prepostprocess_dropout to 0.0


INFO:tensorflow:Setting hparams.symbol_dropout to 0.0


INFO:tensorflow:Setting hparams.attention_dropout to 0.0


INFO:tensorflow:Setting hparams.relu_dropout to 0.0


INFO:tensorflow:Using variable initializer: uniform_unit_scaling


INFO:tensorflow:Transforming feature 'inputs' with symbol_modality_8182_512.bottom


INFO:tensorflow:Transforming feature 'targets' with symbol_modality_8267_512.targets_bottom


INFO:tensorflow:Building model body


INFO:tensorflow:Transforming body output with symbol_modality_8267_512.top


INFO:tensorflow:Done calling model_fn.


INFO:tensorflow:Starting evaluation at 2020-06-08T21:26:03Z


INFO:tensorflow:Graph was finalized.


INFO:tensorflow:Restoring parameters from /t2t/train/model.ckpt-16000


INFO:tensorflow:Running local_init_op.


INFO:tensorflow:Done running local_init_op.


INFO:tensorflow:Evaluation [10/100]


INFO:tensorflow:Evaluation [20/100]


INFO:tensorflow:Evaluation [30/100]


INFO:tensorflow:Evaluation [40/100]


INFO:tensorflow:Evaluation [50/100]


INFO:tensorflow:Evaluation [60/100]


INFO:tensorflow:Evaluation [70/100]


INFO:tensorflow:Evaluation [80/100]


INFO:tensorflow:Evaluation [90/100]


INFO:tensorflow:Evaluation [100/100]


INFO:tensorflow:Finished evaluation at 2020-06-08-21:26:22


INFO:tensorflow:Saving dict for global step 16000: global_step = 16000, loss = 4.310485, metrics-translate_enzh_wmt8k/targets/accuracy = 0.30641028, metrics-translate_enzh_wmt8k/targets/accuracy_per_sequence = 0.0, metrics-translate_enzh_wmt8k/targets/accuracy_top5 = 0.48794234, metrics-translate_enzh_wmt8k/targets/approx_bleu_score = 0.076690294, metrics-translate_enzh_wmt8k/targets/neg_log_perplexity = -4.424801, metrics-translate_enzh_wmt8k/targets/rouge_2_fscore = 0.1347365, metrics-translate_enzh_wmt8k/targets/rouge_L_fscore = 0.2988083


INFO:tensorflow:Saving 'checkpoint_path' summary for global step 16000: /t2t/train/model.ckpt-16000


INFO:tensorflow:Validation (step 16000): loss = 4.310485, metrics-translate_enzh_wmt8k/targets/accuracy = 0.30641028, metrics-translate_enzh_wmt8k/targets/accuracy_per_sequence = 0.0, metrics-translate_enzh_wmt8k/targets/accuracy_top5 = 0.48794234, metrics-translate_enzh_wmt8k/targets/approx_bleu_score = 0.076690294, metrics-translate_enzh_wmt8k/targets/neg_log_perplexity = -4.424801, metrics-translate_enzh_wmt8k/targets/rouge_2_fscore = 0.1347365, metrics-translate_enzh_wmt8k/targets/rouge_L_fscore = 0.2988083, global_step = 16000


INFO:tensorflow:loss = 3.1321092, step = 16000 (43.978 sec)


INFO:tensorflow:global_step/sec: 2.43063


INFO:tensorflow:loss = 2.7276545, step = 16100 (13.142 sec)


INFO:tensorflow:global_step/sec: 7.52127


INFO:tensorflow:global_step/sec: 7.52127


INFO:tensorflow:loss = 2.90432, step = 16200 (13.295 sec)


INFO:tensorflow:loss = 2.90432, step = 16200 (13.295 sec)


INFO:tensorflow:global_step/sec: 7.61387


INFO:tensorflow:global_step/sec: 7.61387


INFO:tensorflow:loss = 2.806334, step = 16300 (13.134 sec)


INFO:tensorflow:loss = 2.806334, step = 16300 (13.134 sec)


INFO:tensorflow:global_step/sec: 7.65945


INFO:tensorflow:global_step/sec: 7.65945


INFO:tensorflow:loss = 2.915818, step = 16400 (13.056 sec)


INFO:tensorflow:loss = 2.915818, step = 16400 (13.056 sec)


INFO:tensorflow:global_step/sec: 7.77447


INFO:tensorflow:global_step/sec: 7.77447


INFO:tensorflow:loss = 2.3825846, step = 16500 (12.865 sec)


INFO:tensorflow:loss = 2.3825846, step = 16500 (12.865 sec)


INFO:tensorflow:global_step/sec: 7.75577


INFO:tensorflow:global_step/sec: 7.75577


INFO:tensorflow:loss = 2.9097972, step = 16600 (12.891 sec)


INFO:tensorflow:loss = 2.9097972, step = 16600 (12.891 sec)


INFO:tensorflow:global_step/sec: 7.77373


INFO:tensorflow:global_step/sec: 7.77373


INFO:tensorflow:loss = 2.3725939, step = 16700 (12.864 sec)


INFO:tensorflow:loss = 2.3725939, step = 16700 (12.864 sec)


INFO:tensorflow:global_step/sec: 7.716


INFO:tensorflow:global_step/sec: 7.716


INFO:tensorflow:loss = 3.2936022, step = 16800 (12.960 sec)


INFO:tensorflow:loss = 3.2936022, step = 16800 (12.960 sec)


INFO:tensorflow:global_step/sec: 7.74907


INFO:tensorflow:global_step/sec: 7.74907


INFO:tensorflow:loss = 2.848516, step = 16900 (12.905 sec)


INFO:tensorflow:loss = 2.848516, step = 16900 (12.905 sec)


INFO:tensorflow:Saving checkpoints for 17000 into /t2t/train/model.ckpt.


INFO:tensorflow:Saving checkpoints for 17000 into /t2t/train/model.ckpt.


INFO:tensorflow:global_step/sec: 6.29778
INFO:tensorflow:loss = 2.7673094, step = 17000 (15.878 sec)
INFO:tensorflow:global_step/sec: 7.74427
INFO:tensorflow:loss = 2.483264, step = 17100 (12.916 sec)
INFO:tensorflow:global_step/sec: 7.77875
INFO:tensorflow:loss = 2.822777, step = 17200 (12.853 sec)
INFO:tensorflow:global_step/sec: 7.55662
INFO:tensorflow:loss = 2.6830814, step = 17300 (13.233 sec)
INFO:tensorflow:global_step/sec: 7.47427
INFO:tensorflow:loss = 2.6408353, step = 17400 (13.379 sec)
INFO:tensorflow:global_step/sec: 7.7266
INFO:tensorflow:loss = 2.5349681, step = 17500 (12.945 sec)
INFO:tensorflow:global_step/sec: 7.71255
INFO:tensorflow:loss = 2.8018641, step = 17600 (12.963 sec)
INFO:tensorflow:global_step/sec: 7.66796
INFO:tensorflow:loss = 2.9487994, step = 17700 (13.042 sec)
INFO:tensorflow:global_step/sec: 7.65906
INFO:tensorflow:loss = 2.8175545, step = 17800 (13.056 sec)
INFO:tensorflow:global_step/sec: 7.65804
INFO:tensorflow:loss = 3.0564973, step = 17900 (13.058 sec)
INFO:tensorflow:Saving checkpoints for 18000 into /t2t/train/model.ckpt.


INFO:tensorflow:global_step/sec: 6.296
INFO:tensorflow:Reading data files from /t2t/data/translate_enzh_wmt8k-dev*
INFO:tensorflow:partition: 0 num_data_files: 1
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Unsetting shared_embedding_and_softmax_weights.
INFO:tensorflow:Setting T2TModel mode to 'eval'
INFO:tensorflow:Setting hparams.dropout to 0.0
INFO:tensorflow:Setting hparams.label_smoothing to 0.0
INFO:tensorflow:Setting hparams.layer_prepostprocess_dropout to 0.0
INFO:tensorflow:Setting hparams.symbol_dropout to 0.0
INFO:tensorflow:Setting hparams.attention_dropout to 0.0
INFO:tensorflow:Setting hparams.relu_dropout to 0.0


INFO:tensorflow:Using variable initializer: uniform_unit_scaling
INFO:tensorflow:Transforming feature 'inputs' with symbol_modality_8182_512.bottom
INFO:tensorflow:Transforming feature 'targets' with symbol_modality_8267_512.targets_bottom
INFO:tensorflow:Building model body
INFO:tensorflow:Transforming body output with symbol_modality_8267_512.top
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2020-06-08T21:30:57Z
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from /t2t/train/model.ckpt-18000
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Evaluation [10/100]
INFO:tensorflow:Evaluation [20/100]
INFO:tensorflow:Evaluation [30/100]
INFO:tensorflow:Evaluation [40/100]
INFO:tensorflow:Evaluation [50/100]
INFO:tensorflow:Evaluation [60/100]
INFO:tensorflow:Evaluation [70/100]
INFO:tensorflow:Evaluation [80/100]
INFO:tensorflow:Evaluation [90/100]
INFO:tensorflow:Evaluation [100/100]
INFO:tensorflow:Finished evaluation at 2020-06-08-21:31:16


INFO:tensorflow:Saving dict for global step 18000: global_step = 18000, loss = 4.151175, metrics-translate_enzh_wmt8k/targets/accuracy = 0.32423076, metrics-translate_enzh_wmt8k/targets/accuracy_per_sequence = 0.0005402485, metrics-translate_enzh_wmt8k/targets/accuracy_top5 = 0.5103491, metrics-translate_enzh_wmt8k/targets/approx_bleu_score = 0.08242771, metrics-translate_enzh_wmt8k/targets/neg_log_perplexity = -4.2558484, metrics-translate_enzh_wmt8k/targets/rouge_2_fscore = 0.1407957, metrics-translate_enzh_wmt8k/targets/rouge_L_fscore = 0.31169805


INFO:tensorflow:Saving 'checkpoint_path' summary for global step 18000: /t2t/train/model.ckpt-18000


INFO:tensorflow:Validation (step 18000): loss = 4.151175, metrics-translate_enzh_wmt8k/targets/accuracy = 0.32423076, metrics-translate_enzh_wmt8k/targets/accuracy_per_sequence = 0.0005402485, metrics-translate_enzh_wmt8k/targets/accuracy_top5 = 0.5103491, metrics-translate_enzh_wmt8k/targets/approx_bleu_score = 0.08242771, metrics-translate_enzh_wmt8k/targets/neg_log_perplexity = -4.2558484, metrics-translate_enzh_wmt8k/targets/rouge_2_fscore = 0.1407957, metrics-translate_enzh_wmt8k/targets/rouge_L_fscore = 0.31169805, global_step = 18000


INFO:tensorflow:loss = 3.4971333, step = 18000 (43.479 sec)
INFO:tensorflow:global_step/sec: 2.46356
INFO:tensorflow:loss = 3.4451873, step = 18100 (12.995 sec)
INFO:tensorflow:global_step/sec: 7.54917
INFO:tensorflow:loss = 2.4596133, step = 18200 (13.249 sec)
INFO:tensorflow:global_step/sec: 7.64478
INFO:tensorflow:loss = 2.3631704, step = 18300 (13.079 sec)
INFO:tensorflow:global_step/sec: 7.701
INFO:tensorflow:loss = 2.1979406, step = 18400 (12.985 sec)
INFO:tensorflow:global_step/sec: 7.772
INFO:tensorflow:loss = 3.046075, step = 18500 (12.869 sec)
INFO:tensorflow:global_step/sec: 7.73744
INFO:tensorflow:loss = 2.5964413, step = 18600 (12.921 sec)
INFO:tensorflow:global_step/sec: 7.62619
INFO:tensorflow:loss = 2.4547062, step = 18700 (13.113 sec)
INFO:tensorflow:global_step/sec: 7.70324
INFO:tensorflow:loss = 2.3899727, step = 18800 (12.982 sec)
INFO:tensorflow:global_step/sec: 7.66439
INFO:tensorflow:loss = 2.689398, step = 18900 (13.046 sec)
INFO:tensorflow:Saving checkpoints for 19000 into /t2t/train/model.ckpt.


INFO:tensorflow:global_step/sec: 6.29102
INFO:tensorflow:loss = 2.578266, step = 19000 (15.896 sec)
INFO:tensorflow:global_step/sec: 7.73644
INFO:tensorflow:loss = 2.9835713, step = 19100 (12.929 sec)
INFO:tensorflow:global_step/sec: 7.75704
INFO:tensorflow:loss = 3.0302334, step = 19200 (12.889 sec)
INFO:tensorflow:global_step/sec: 7.67552
INFO:tensorflow:loss = 2.58807, step = 19300 (13.028 sec)
INFO:tensorflow:global_step/sec: 7.74699
INFO:tensorflow:loss = 2.572075, step = 19400 (12.909 sec)
INFO:tensorflow:global_step/sec: 7.3695
INFO:tensorflow:loss = 3.7583628, step = 19500 (13.571 sec)
INFO:tensorflow:global_step/sec: 7.63061
INFO:tensorflow:loss = 2.4969592, step = 19600 (13.103 sec)
INFO:tensorflow:global_step/sec: 7.7278
INFO:tensorflow:loss = 2.596013, step = 19700 (12.940 sec)
INFO:tensorflow:global_step/sec: 7.71637
INFO:tensorflow:loss = 2.5882008, step = 19800 (12.962 sec)
INFO:tensorflow:global_step/sec: 7.67381
INFO:tensorflow:loss = 2.7925658, step = 19900 (13.029 sec)
INFO:tensorflow:Saving checkpoints for 20000 into /t2t/train/model.ckpt.
Instructions for updating:
Use standard file APIs to delete files with this prefix.
INFO:tensorflow:Loss for final step: 2.9437342.
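
The per-step training loss in a log like the one above is easier to follow once it is pulled into a list. The snippet below is a minimal sketch, assuming the log text has been saved to a string; `parse_train_loss` and `sample_log` are hypothetical names introduced here, and the regular expression only targets the `loss = ..., step = ...` lines.

```python
import re

# Matches lines such as "INFO:tensorflow:loss = 2.7276545, step = 16100 (13.142 sec)".
# Evaluation summaries use a different layout and are not matched.
LOSS_RE = re.compile(r"loss = ([0-9.]+), step = (\d+)")

def parse_train_loss(log_text):
  """Return a de-duplicated, step-ordered list of (step, loss) pairs."""
  losses = {}
  for line in log_text.splitlines():
    match = LOSS_RE.search(line)
    if match:
      losses[int(match.group(2))] = float(match.group(1))
  return sorted(losses.items())

# A few lines copied from the log, including one duplicate.
sample_log = """
INFO:tensorflow:loss = 2.7276545, step = 16100 (13.142 sec)
INFO:tensorflow:loss = 2.7276545, step = 16100 (13.142 sec)
INFO:tensorflow:global_step/sec: 7.52127
INFO:tensorflow:loss = 2.90432, step = 16200 (13.295 sec)
"""
train_loss = parse_train_loss(sample_log)
print(train_loss)
```

Keying the dictionary by step is what removes the repeated log lines.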


### Make prediction

After training, re-initialize the environment first (enable eager execution and rebuild the model in prediction mode), then run the translation.

In [7]:
# Re-run the environment
tfe = tf.contrib.eager
tfe.enable_eager_execution()
Modes = tf.estimator.ModeKeys
hparams = create_hparams(HPARAMS, data_dir=DATA_DIR, problem_name=PROBLEM)
translate_model = registry.model(MODEL)(hparams, Modes.PREDICT)

# Get the encoders (fixed pre-processing) from the problem
encoders = t2t_problem.feature_encoders(DATA_DIR)

def encode(input_str, output_str=None):
  """Input str to features dict, ready for inference"""
  inputs = encoders["inputs"].encode(input_str) + [1]  # add EOS 
  batch_inputs = tf.reshape(inputs, [1, -1, 1])  # Make it 3D
  return {"inputs": batch_inputs}

def decode(integers):
  """List of ints to str"""
  integers = list(np.squeeze(integers))
  if 1 in integers:
    integers = integers[:integers.index(1)]
  return encoders["targets"].decode(np.squeeze(integers))

# Get the latest checkpoint
ckpt_path = tf.train.latest_checkpoint(TRAIN_DIR)
print('Latest Checkpoint: ', ckpt_path)

def translate(inputs):
  encoded_inputs = encode(inputs)
  with tfe.restore_variables_on_create(ckpt_path):
    model_output = translate_model.infer(encoded_inputs)["outputs"]
  return decode(model_output)

INFO:tensorflow:Setting T2TModel mode to 'infer'


Latest Checkpoint:  /t2t/train/model.ckpt-20000
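
The `encode` and `decode` helpers above rely on two conventions: an EOS id of 1 is appended to the input ids, and the decoded output is cut at the first EOS. The sketch below reproduces those two steps with NumPy only. The toy vocabulary and the names `toy_encode` and `toy_decode` are made up for illustration; the real subword encoders come from `t2t_problem.feature_encoders`.

```python
import numpy as np

EOS_ID = 1  # T2T text encoders reserve id 0 for padding and id 1 for EOS

# Hypothetical toy vocabulary, standing in for the real subword vocabulary.
TOY_VOCAB = {"I": 2, "think": 3, "so": 4}
TOY_INV_VOCAB = {i: w for w, i in TOY_VOCAB.items()}

def toy_encode(text):
  """Map whitespace-separated words to ids, append EOS, shape as [batch, length, 1]."""
  ids = [TOY_VOCAB[w] for w in text.split()] + [EOS_ID]
  return np.reshape(ids, [1, -1, 1])

def toy_decode(ids):
  """Flatten the output ids, cut at the first EOS, and map ids back to words."""
  ids = list(np.reshape(ids, [-1]))
  if EOS_ID in ids:
    ids = ids[:ids.index(EOS_ID)]
  return " ".join(TOY_INV_VOCAB[int(i)] for i in ids)

features = toy_encode("I think so")
print(features.shape)  # (1, 4, 1): three words plus EOS

# Simulated model output: tokens, then EOS, then padding.
decoded = toy_decode(np.array([[2, 3, 4, 1, 0, 0]]))
print(decoded)  # I think so
```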


Now we are ready to do some translation!

In [12]:
# Predict
inputs = ["I think they will never come back to the US.", 
          "Human rights is the first priority.",
          "Everyone should have health insurance.",
          "President Trump's overall approval rating dropped 7% over the past month"]

for sentence in inputs:
  output = translate(sentence)
  # Color the labels, then reset to the default text color
  print("\033[34m Inputs:\033[0m %s" % sentence)
  print("\033[35m Outputs:\033[0m %s" % output)
  print()

Inputs: I think they will never come back to the US.
Outputs: 我认为他们永远不会重返美国。

Inputs: Human rights is the first priority.
Outputs: 人权是第一个优先考虑。

Inputs: Everyone should have health insurance.
Outputs: 每个人都应该拥有健康保险。

Inputs: President Trump's overall approval rating dropped 7% over the past month
Outputs: 特朗普的总支持率在过去个月中下降了7%。



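The translations are reasonable for a model trained for only 20,000 steps, though not flawless: the last output, for example, drops "President" and renders "the past month" as 过去个月, which is missing 一.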

## Attention visualization

Both the encoder and the decoder are stacks of identical layers: **6** of each in the paper Attention Is All You Need. In addition, the Transformer uses multi-head attention, with **8** parallel heads in the paper, in three different ways:

- Self-Attention layer in Encoder
- Self-Attention layer in Decoder
- Encoder-Decoder Attention layers

In the final visualization, we should see these three groups of attention weights for each of the 6 layers and each of the 8 heads.

<center><img src='https://drive.google.com/uc?id=1Ln15le_1cRdRlMm9eac-q8_q-h2hEsHi' width=800></center>
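
To make the three uses concrete, here is a minimal NumPy sketch of scaled dot-product attention weights for several heads. It is an illustration under simplifying assumptions, not the T2T implementation: random arrays stand in for the projected encoder and decoder states, the same array serves as both queries and keys in self-attention (the learned projections are left out), and `attention_weights` is a hypothetical helper name.

```python
import numpy as np

def softmax(x, axis=-1):
  x = x - np.max(x, axis=axis, keepdims=True)  # subtract the max for numerical stability
  e = np.exp(x)
  return e / np.sum(e, axis=axis, keepdims=True)

def attention_weights(queries, keys, causal=False):
  """Scaled dot-product attention weights.

  queries: [heads, query_len, depth], keys: [heads, key_len, depth].
  Returns [heads, query_len, key_len]; each row sums to 1 over the keys.
  """
  depth = queries.shape[-1]
  logits = queries @ keys.transpose(0, 2, 1) / np.sqrt(depth)
  if causal:
    # Decoder self-attention: position i must not attend to positions after i.
    q_len, k_len = logits.shape[-2:]
    mask = np.triu(np.ones((q_len, k_len), dtype=bool), k=1)
    logits = np.where(mask, -1e9, logits)
  return softmax(logits, axis=-1)

rng = np.random.default_rng(0)
num_heads, depth = 8, 64   # 8 heads of size 512 / 8, as in the paper
src_len, tgt_len = 5, 4    # toy sequence lengths (assumed)
enc = rng.normal(size=(num_heads, src_len, depth))  # stand-in for encoder states
dec = rng.normal(size=(num_heads, tgt_len, depth))  # stand-in for decoder states

enc_self = attention_weights(enc, enc)               # self-attention in the encoder
dec_self = attention_weights(dec, dec, causal=True)  # self-attention in the decoder
enc_dec = attention_weights(dec, enc)                # encoder-decoder attention

print(enc_self.shape, dec_self.shape, enc_dec.shape)
```

Each layer produces one such weight tensor per attention type, which is what the visualization displays head by head.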

In [0]:
from tensor2tensor.visualization import attention
from tensor2tensor.data_generators import text_encoder

SIZE = 35

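def encode_eval(input_str, output_str):
  """Source and target strings to a features dict for an EVAL-mode forward pass."""
  # Append EOS (id 1) and reshape to 4D: [batch, length, 1, 1].
  inputs = tf.reshape(encoders["inputs"].encode(input_str) + [1], [1, -1, 1, 1])
  # Targets must use the target-side vocabulary, which differs from the input one here.
  outputs = tf.reshape(encoders["targets"].encode(output_str) + [1], [1, -1, 1, 1])
  return {"inputs": inputs, "targets": outputs}

def resize(np_mat):
  """Truncate [heads, query_len, key_len] weights, normalize across heads, add a batch dim."""
  # Keep at most SIZE tokens along each sequence axis
  np_mat = np_mat[:, :SIZE, :SIZE]
  # Sum across heads
  row_sums = np.sum(np_mat, axis=0)
  # Normalize
  layer_mat = np_mat / row_sums[np.newaxis, :]
  lsh = layer_mat.shape
  # Add extra dim for viz code to work.
  layer_mat = np.reshape(layer_mat, (1, lsh[0], lsh[1], lsh[2]))
  return layer_mat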

def get_att_mats():
  '''Get attention weights matrices'''  
  enc_atts = []  # Encoder attentions
  dec_atts = []  # Decoder attentions
  encdec_atts = []  # Encoder-Decoder attentions

  for i in range(hparams.num_hidden_layers):
    enc_att = translate_model.attention_weights[
      "transformer/body/encoder/layer_%i/self_attention/multihead_attention/dot_product_attention" % i][0]
    dec_att = translate_model.attention_weights[
      "transformer/body/decoder/layer_%i/self_attention/multihead_attention/dot_product_attention" % i][0]
    encdec_att = translate_model.attention_weights[
      "transformer/body/decoder/layer_%i/encdec_attention/multihead_attention/dot_product_attention" % i][0]
    enc_atts.append(resize(enc_att))
    dec_atts.append(resize(dec_att))
    encdec_atts.append(resize(encdec_att))
  return enc_atts, dec_atts, encdec_atts

def to_tokens(ids, is_input=True):
  ids = np.squeeze(ids)
  if is_input:
    subtokenizer = hparams.problem_hparams.vocabulary['inputs']
  else:
    subtokenizer = hparams.problem_hparams.vocabulary['targets']
  tokens = []
  for _id in ids:
    if _id == 0:
      tokens.append('<PAD>')
    elif _id == 1:
      tokens.append('<EOS>')
    elif _id == -1:
      tokens.append('<NULL>')
    else:
      tokens.append(subtokenizer._subtoken_id_to_subtoken_string(_id))
  return tokens
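
Note that `resize` normalizes across the head axis rather than across the key axis. The sketch below checks this on made-up weights with NumPy only. `resize_demo` is a hypothetical copy of the same logic for plain arrays, and the random input is not real attention data.

```python
import numpy as np

SIZE = 35  # same cap as in the cell above

def resize_demo(np_mat):
  """NumPy-only copy of the resize logic for a [heads, query_len, key_len] array."""
  np_mat = np_mat[:, :SIZE, :SIZE]               # keep at most SIZE tokens per axis
  head_sums = np.sum(np_mat, axis=0)             # [query_len, key_len], summed over heads
  layer_mat = np_mat / head_sums[np.newaxis, :]  # each (query, key) cell sums to 1 over heads
  return layer_mat[np.newaxis, ...]              # add a leading batch dimension

rng = np.random.default_rng(0)
toy_att = rng.random((8, 40, 40)) + 0.1  # made-up positive weights: 8 heads, 40 x 40 tokens
out = resize_demo(toy_att)

print(out.shape)                          # (1, 8, 35, 35)
print(np.allclose(out.sum(axis=1), 1.0))  # True: the heads share each cell's weight
```

After this step a value is a head's share of the total weight on a given (query, key) pair, so values are comparable across heads but rows no longer sum to 1.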

In [0]:
def call_html():
  import IPython
  display(IPython.core.display.HTML('''
        <script src="/static/components/requirejs/require.js"></script>
        <script>
          requirejs.config({
            paths: {
              base: '/static/base',
              "d3": "https://cdnjs.cloudflare.com/ajax/libs/d3/3.5.8/d3.min",
              jquery: '//ajax.googleapis.com/ajax/libs/jquery/2.0.0/jquery.min',
            },
          });
        </script>
        '''))

In [15]:
# Break down inputs and outputs into sub-words.
# `output` is the translation of this sentence, left over from the loop in the previous cell.
sentence = "President Trump's overall approval rating dropped 7% over the past month"
inp_text = to_tokens(encoders["inputs"].encode(sentence))
out_text = to_tokens(encoders["targets"].encode(output), is_input=False)

# Run eval to collect attention weights
example = encode_eval(sentence, output)
# ckpt_path already points to the latest checkpoint, so pass it directly
with tfe.restore_variables_on_create(ckpt_path):
  translate_model.set_mode(Modes.EVAL)
  translate_model(example)
# Get normalized attention weights for each layer
enc_atts, dec_atts, encdec_atts = get_att_mats()

call_html()
attention.show(inp_text, out_text, enc_atts, dec_atts, encdec_atts)

The visualization shows how much attention each head pays to every sub-word when encoding an input sub-word or decoding an output sub-word.
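
The same information can also be read off numerically. The sketch below is an assumption-laden illustration: `strongest_alignment` is a hypothetical helper, and the weights are hand-made rather than taken from the trained model. It expects raw encoder-decoder weights of shape `[heads, out_len, inp_len]`, that is, before `resize`, because after normalizing across heads the head average is the same for every cell.

```python
import numpy as np

def strongest_alignment(encdec_att, inp_tokens, out_tokens):
  """For each output token, return the input token with the largest head-averaged weight.

  encdec_att: raw attention weights of shape [heads, out_len, inp_len].
  """
  mean_att = np.mean(encdec_att, axis=0)  # average over heads -> [out_len, inp_len]
  best = np.argmax(mean_att, axis=-1)     # most-attended input position per output token
  return [(out_tokens[i], inp_tokens[j]) for i, j in enumerate(best)]

# Hand-made weights for illustration only: 2 heads, 2 output tokens, 3 input tokens.
toy_att = np.array([[[0.7, 0.2, 0.1],
                     [0.1, 0.8, 0.1]],
                    [[0.6, 0.3, 0.1],
                     [0.2, 0.6, 0.2]]])
pairs = strongest_alignment(toy_att, ["health_", "insurance_", "<EOS>"], ["健康", "保险"])
print(pairs)
```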