[View in Colaboratory](https://colab.research.google.com/github/josd/eye/blob/master/i2i/transduction_roots/observation_prediction_roots.ipynb)

# Transduction from observation to prediction for the roots of a polynomial

## Introduction

Wikipedia defines [transduction (machine learning)](https://en.wikipedia.org/wiki/Transduction_(machine_learning%29) as follows:

> In logic, statistical inference, and supervised learning, transduction or
transductive inference is reasoning from observed, specific (training) cases
to specific (test) cases. In contrast, induction is reasoning from observed
training cases to general rules, which are then applied to the test cases.
The distinction is most interesting in cases where the predictions of the
transductive model are not achievable by any inductive model. Note that this
is caused by transductive inference on different test sets producing mutually
inconsistent predictions.

The Tensor2Tensor [Transformer model](https://github.com/tensorflow/tensor2tensor/blob/master/tensor2tensor/models/transformer.py) is described as follows:

> The Transformer model consists of an encoder and a decoder. Both are stacks
of self-attention layers followed by feed-forward layers. This model yields
good results on a number of problems, especially in NLP and machine translation.
See "Attention Is All You Need" (https://arxiv.org/abs/1706.03762) for the full
description of the model and the results obtained with its early version.

![Transformer model](https://pbs.twimg.com/media/DCKhefrUMAE9stK.jpg)

> The encoder is composed of a stack of N identical layers. Each layer has
two sub-layers. The first is a multi-head self-attention mechanism, and the
second is a simple, positionwise fully connected feed-forward network.
There is a residual connection around each of the two sub-layers, followed by
layer normalization.
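
The encoder layer described above can be sketched in plain NumPy. This is a minimal illustration, not the Tensor2Tensor implementation: the function names, the parameter dictionary keys, and the initialization are assumptions made for the example, and the learned gain and bias of layer normalization are omitted for brevity.

```python
import numpy as np

def layer_norm(x, eps=1e-6):
    """Normalize each position's feature vector (no learned gain/bias here)."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def softmax(x):
    """Numerically stable softmax over the last axis."""
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

def multi_head_self_attention(x, wq, wk, wv, wo, num_heads):
    """Scaled dot-product self-attention with the features split into heads."""
    seq_len, d_model = x.shape
    d_head = d_model // num_heads

    def split(t):
        # (seq_len, d_model) -> (num_heads, seq_len, d_head)
        return t.reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(x @ wq), split(x @ wk), split(x @ wv)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)
    heads = softmax(scores) @ v
    merged = heads.transpose(1, 0, 2).reshape(seq_len, d_model)
    return merged @ wo

def feed_forward(x, w1, b1, w2, b2):
    """Position-wise network: the same two-layer MLP applied at every position."""
    return np.maximum(0.0, x @ w1 + b1) @ w2 + b2

def encoder_layer(x, p, num_heads):
    """Two sub-layers, each wrapped as layer_norm(x + sublayer(x))."""
    attn = multi_head_self_attention(x, p["wq"], p["wk"], p["wv"], p["wo"], num_heads)
    x = layer_norm(x + attn)
    x = layer_norm(x + feed_forward(x, p["w1"], p["b1"], p["w2"], p["b2"]))
    return x

def init_params(rng, d_model, d_ff):
    """Hypothetical random initialization, only for trying the sketch out."""
    def w(rows, cols):
        return rng.normal(scale=rows ** -0.5, size=(rows, cols))
    return {
        "wq": w(d_model, d_model), "wk": w(d_model, d_model),
        "wv": w(d_model, d_model), "wo": w(d_model, d_model),
        "w1": w(d_model, d_ff), "b1": np.zeros(d_ff),
        "w2": w(d_ff, d_model), "b2": np.zeros(d_model),
    }

rng = np.random.default_rng(0)
params = init_params(rng, d_model=8, d_ff=16)
x = rng.normal(size=(5, 8))          # 5 positions, 8 features each
y = encoder_layer(x, params, num_heads=2)
print(y.shape)
```

Stacking N such layers, each with its own parameters, would give the encoder stack; the output keeps the input shape, which is what makes the residual connections possible.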

> The decoder is also composed of a stack of N identical layers. In addition
to the two sub-layers in each encoder layer, the decoder inserts a third
sub-layer, which performs multi-head attention over the output of the encoder
stack. The self-attention sub-layer in the decoder stack is modified to prevent
positions from attending to subsequent positions.  This masking, combined with
the fact that the output embeddings are offset by one position, ensures that the
predictions for position i can depend only on the known outputs at positions
less than i.
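
The masking and the one-position offset can be illustrated with a small NumPy sketch. The helper names and the `pad_id` value are assumptions for the example, not Tensor2Tensor identifiers. Position i of the shifted decoder input holds the output from position i-1, and the mask lets position i attend only to positions j <= i of that shifted input, so the prediction for position i sees only outputs at positions less than i.

```python
import numpy as np

def causal_mask(seq_len):
    """Entry [i, j] is True when position i may attend to position j (j <= i)."""
    return np.tril(np.ones((seq_len, seq_len), dtype=bool))

def masked_attention_weights(scores):
    """Softmax over each row after hiding subsequent positions."""
    seq_len = scores.shape[-1]
    masked = np.where(causal_mask(seq_len), scores, -np.inf)
    masked = masked - masked.max(axis=-1, keepdims=True)
    e = np.exp(masked)               # exp(-inf) is exactly 0
    return e / e.sum(axis=-1, keepdims=True)

def shift_right(targets, pad_id=0):
    """Offset the decoder input by one position (pad_id is a hypothetical start symbol)."""
    return [pad_id] + list(targets[:-1])

scores = np.random.default_rng(0).normal(size=(4, 4))
weights = masked_attention_weights(scores)
print(np.round(weights, 2))          # upper triangle is all zeros
print(shift_right([5, 6, 7]))
```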

In [1]:
# Preparation

# install tensor2tensor
! pip install -q -U tensor2tensor

# show versions
! pip show tensorflow
! pip show tensor2tensor

# get the needed resources
! curl -O http://josd.github.io/eye/i2i/transduction_roots/observation_prediction_roots.sh
! curl -O http://josd.github.io/eye/i2i/transduction_roots/observation_prediction_roots.py
! curl -O http://josd.github.io/eye/i2i/transduction_roots/__init__.py
! curl -O http://josd.github.io/eye/i2i/transduction_roots/test_roots.observation
! chmod +x observation_prediction_roots.sh

# clear data and model
% rm -fr /tmp/t2t_data/observation_prediction_roots/
% rm -fr /tmp/t2t_train/observation_prediction_roots/transformer-transformer_small/

# start tensorboard
! curl -O https://bin.equinox.io/c/4VmDzA7iaHb/ngrok-stable-linux-amd64.zip
! unzip -o ngrok-stable-linux-amd64.zip
get_ipython().system_raw('tensorboard --logdir /tmp/t2t_train/observation_prediction_roots/transformer-transformer_small --host 0.0.0.0 --port 6006 &')
get_ipython().system_raw('./ngrok http 6006 &')
! curl -s http://localhost:4040/api/tunnels | python3 -c "import sys, json; print(json.load(sys.stdin)['tunnels'][0]['public_url'])"

Name: tensorflow
Version: 1.9.0rc2
Summary: TensorFlow is an open source machine learning framework for everyone.
Home-page: https://www.tensorflow.org/
Author: Google Inc.
Author-email: opensource@google.com
License: Apache 2.0
Location: /usr/local/lib/python3.6/dist-packages
Requires: gast, tensorboard, setuptools, grpcio, numpy, termcolor, absl-py, wheel, protobuf, six, astor
Required-by: 
Name: tensor2tensor
Version: 1.6.6
Summary: Tensor2Tensor
Home-page: http://github.com/tensorflow/tensor2tensor
Author: Google Inc.
Author-email: no-reply@google.com
License: Apache 2.0
Location: /usr/local/lib/python3.6/dist-packages
Requires: gunicorn, h5py, numpy, future, gevent, bz2file, oauth2client, six, sympy, flask, tqdm, requests, gym, scipy, google-api-python-client
Required-by: 

In [2]:
# See the observation_prediction_roots problem

! pygmentize -g observation_prediction_roots.py

import random
import cmath
from tensor2tensor.data_generators import problem
from tensor2tensor.data_generators import text_problems
from tensor2tensor.utils import registry

@registry.register_problem
class ObservationPredictionRoots(text_problems.Text2TextProblem):
  """Transduction from observation to prediction for roots of polynomial ax**2 + bx + c = 0"""

  @property
  def approx_vocab_size(self):
    return 2**14  # ~16k

  @property
  def is_generate_per_split(self):
 

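The listing above is cut off before the sample generator, so the following is only a sketch of how observation/prediction pairs for `ax**2 + bx + c = 0` might be produced with `random` and `cmath`. The function names, the coefficient range, and the text format of the observation and prediction strings are assumptions for illustration; the `inputs`/`targets` keys follow the usual Text2TextProblem sample convention.

```python
import cmath
import random

def quadratic_roots(a, b, c):
    """Return both roots of a*x**2 + b*x + c = 0; they may be complex."""
    if a == 0:
        raise ValueError("a must be non-zero for a quadratic")
    d = cmath.sqrt(b * b - 4 * a * c)   # square root of the discriminant
    return (-b + d) / (2 * a), (-b - d) / (2 * a)

def make_sample(rng):
    """Build one hypothetical observation/prediction pair as text."""
    a = rng.choice([n for n in range(-9, 10) if n != 0])
    b = rng.randint(-9, 9)
    c = rng.randint(-9, 9)
    r1, r2 = quadratic_roots(a, b, c)
    return {
        "inputs": "a=%d b=%d c=%d" % (a, b, c),       # assumed observation format
        "targets": "r1=%s r2=%s" % (r1, r2),          # assumed prediction format
    }

rng = random.Random(0)
for _ in range(3):
    print(make_sample(rng))
```

Using `cmath.sqrt` rather than `math.sqrt` keeps the generator total: a negative discriminant yields a complex conjugate pair instead of raising an error.
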
In [3]:
# See the observation_prediction_roots script

! pygmentize -g observation_prediction_roots.sh

#!/bin/bash
PROBLEM=observation_prediction_roots
MODEL=transformer
HPARAMS=transformer_small

USER_DIR=$PWD
DATA_DIR=/tmp/t2t_data/$PROBLEM
TMP_DIR=/tmp/t2t_datagen/$PROBLEM
TRAIN_DIR=/tmp/t2t_train/$PROBLEM/$MODEL-$HPARAMS

mkdir -p $DATA_DIR $TMP_DIR $TRAIN_DIR

# generate data
t2t-datagen \
  --data_dir=$DATA_DIR \
  --problem=$PROBLEM \
  --t2t_usr_dir=$USER_DIR \
  --tmp_dir=$TMP_DIR

# train with Adam for 3600 steps
t2t-trainer \
  --data_dir=$DATA_DIR \
  --eval_steps=10

In [4]:
# Run the observation_prediction_roots script

! ./observation_prediction_roots.sh

INFO:tensorflow:Importing user module content from path /
Instructions for updating:
When switching to tf.estimator.Estimator, use tf.estimator.RunConfig instead.
INFO:tensorflow:schedule=continuous_train_and_eval
INFO:tensorflow:worker_gpu=1
INFO:tensorflow:sync=False
INFO:tensorflow:datashard_devices: ['gpu:0']
INFO:tensorflow:caching_devices: None
INFO:tensorflow:ps_devices: ['gpu:0']
INFO:tensorflow:Using config: {'_task_type': None, '_task_id': 0, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7fdee16d1fd0>, '_master': '', '_num_ps_replicas': 0, '_num_worker_replicas': 0, '_environment': 'local', '_is_chief': True, '_evaluation_master': '', '_train_distribute': None, '_device_fn': None, '_tf_config': gpu_options {
  per_process_gpu_memory_fraction: 1.0
}
, '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_secs': None, '_log_step_count_steps': 100, '_session_config': gpu_options {
  per_process_gpu_memory_fraction: 0.95
}
allow

INFO:tensorflow:Building model body
INFO:tensorflow:Transforming body output with symbol_modality_168_256.top
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2018-07-14-11:41:29
INFO:tensorflow:Graph was finalized.
2018-07-14 11:41:30.189603: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0
2018-07-14 11:41:30.189706: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-07-14 11:41:30.189766: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958]      0 
2018-07-14 11:41:30.189798: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   N 
2018-07-14 11:41:30.190006: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10867 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 0000:00:04.0, compute capability: 3.7)
INFO:tensorflow:Restoring paramet

INFO:tensorflow:Transforming body output with symbol_modality_168_256.top
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2018-07-14-11:42:29
INFO:tensorflow:Graph was finalized.
2018-07-14 11:42:29.708705: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0
2018-07-14 11:42:29.708837: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-07-14 11:42:29.708878: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958]      0 
2018-07-14 11:42:29.708909: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   N 
2018-07-14 11:42:29.709150: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10867 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 0000:00:04.0, compute capability: 3.7)
INFO:tensorflow:Restoring parameters from /tmp/t2t_train/observation_

INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2018-07-14-11:43:28
INFO:tensorflow:Graph was finalized.
2018-07-14 11:43:29.039836: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0
2018-07-14 11:43:29.039960: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-07-14 11:43:29.039992: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958]      0 
2018-07-14 11:43:29.040021: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   N 
2018-07-14 11:43:29.040206: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10867 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 0000:00:04.0, compute capability: 3.7)
INFO:tensorflow:Restoring parameters from /tmp/t2t_train/observation_prediction_roots/transformer-transformer_small/model.ckpt-380
INFO:tensor

INFO:tensorflow:Building model body
INFO:tensorflow:Transforming body output with symbol_modality_168_256.top
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2018-07-14-11:44:28
INFO:tensorflow:Graph was finalized.
2018-07-14 11:44:28.836947: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0
2018-07-14 11:44:28.837059: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-07-14 11:44:28.837091: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958]      0 
2018-07-14 11:44:28.837133: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   N 
2018-07-14 11:44:28.837377: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10867 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 0000:00:04.0, compute capability: 3.7)
INFO:tensorflow:Restoring paramet

INFO:tensorflow:Building model body
INFO:tensorflow:Transforming body output with symbol_modality_168_256.top
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2018-07-14-11:45:28
INFO:tensorflow:Graph was finalized.
2018-07-14 11:45:28.528774: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0
2018-07-14 11:45:28.528870: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-07-14 11:45:28.528914: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958]      0 
2018-07-14 11:45:28.528948: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   N 
2018-07-14 11:45:28.529139: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10867 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 0000:00:04.0, compute capability: 3.7)
INFO:tensorflow:Restoring paramet

INFO:tensorflow:Transforming 'targets' with symbol_modality_168_256.targets_bottom
INFO:tensorflow:Building model body
INFO:tensorflow:Transforming body output with symbol_modality_168_256.top
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2018-07-14-11:46:27
INFO:tensorflow:Graph was finalized.
2018-07-14 11:46:28.055076: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0
2018-07-14 11:46:28.055191: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-07-14 11:46:28.055231: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958]      0 
2018-07-14 11:46:28.055263: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   N 
2018-07-14 11:46:28.055494: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10867 MB memory) -> physical GPU (device: 0, name: Tesla K80, 

INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2018-07-14-11:47:27
INFO:tensorflow:Graph was finalized.
2018-07-14 11:47:27.819070: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0
2018-07-14 11:47:27.819191: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-07-14 11:47:27.819229: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958]      0 
2018-07-14 11:47:27.819261: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   N 
2018-07-14 11:47:27.819518: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10867 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 0000:00:04.0, compute capability: 3.7)
INFO:tensorflow:Restoring parameters from /tmp/t2t_train/observation_prediction_roots/transformer-transformer_small/model.ckpt-815
INFO:tensorf

INFO:tensorflow:Transforming body output with symbol_modality_168_256.top
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2018-07-14-11:48:28
INFO:tensorflow:Graph was finalized.
2018-07-14 11:48:29.340283: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0
2018-07-14 11:48:29.340417: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-07-14 11:48:29.340474: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958]      0 
2018-07-14 11:48:29.340512: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   N 
2018-07-14 11:48:29.340700: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10867 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 0000:00:04.0, compute capability: 3.7)
INFO:tensorflow:Restoring parameters from /tmp/t2t_train/observation_

INFO:tensorflow:Transforming body output with symbol_modality_168_256.top
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2018-07-14-11:49:30
INFO:tensorflow:Graph was finalized.
2018-07-14 11:49:30.957275: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0
2018-07-14 11:49:30.957402: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-07-14 11:49:30.957473: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958]      0 
2018-07-14 11:49:30.957512: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   N 
2018-07-14 11:49:30.957722: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10867 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 0000:00:04.0, compute capability: 3.7)
INFO:tensorflow:Restoring parameters from /tmp/t2t_train/observation_

INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2018-07-14-11:50:34
INFO:tensorflow:Graph was finalized.
2018-07-14 11:50:34.873483: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0
2018-07-14 11:50:34.873679: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-07-14 11:50:34.873721: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958]      0 
2018-07-14 11:50:34.873757: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   N 
2018-07-14 11:50:34.874107: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10867 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 0000:00:04.0, compute capability: 3.7)
INFO:tensorflow:Restoring parameters from /tmp/t2t_train/observation_prediction_roots/transformer-transformer_small/model.ckpt-1079
INFO:tenso

INFO:tensorflow:Transforming body output with symbol_modality_168_256.top
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2018-07-14-11:51:39
INFO:tensorflow:Graph was finalized.
2018-07-14 11:51:39.587213: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0
2018-07-14 11:51:39.587365: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-07-14 11:51:39.587408: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958]      0 
2018-07-14 11:51:39.587460: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   N 
2018-07-14 11:51:39.587663: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10867 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 0000:00:04.0, compute capability: 3.7)
INFO:tensorflow:Restoring parameters from /tmp/t2t_train/observation_

INFO:tensorflow:Transforming body output with symbol_modality_168_256.top
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2018-07-14-11:52:42
INFO:tensorflow:Graph was finalized.
2018-07-14 11:52:43.342187: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0
2018-07-14 11:52:43.342342: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-07-14 11:52:43.342383: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958]      0 
2018-07-14 11:52:43.342420: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   N 
2018-07-14 11:52:43.342644: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10867 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 0000:00:04.0, compute capability: 3.7)
INFO:tensorflow:Restoring parameters from /tmp/t2t_train/observation_

INFO:tensorflow:Transforming body output with symbol_modality_168_256.top
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2018-07-14-11:53:43
INFO:tensorflow:Graph was finalized.
2018-07-14 11:53:43.940361: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0
2018-07-14 11:53:43.940507: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-07-14 11:53:43.940550: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958]      0 
2018-07-14 11:53:43.940585: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   N 
2018-07-14 11:53:43.940773: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10867 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 0000:00:04.0, compute capability: 3.7)
INFO:tensorflow:Restoring parameters from /tmp/t2t_train/observation_

INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2018-07-14-11:54:44
INFO:tensorflow:Graph was finalized.
2018-07-14 11:54:44.859211: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0
2018-07-14 11:54:44.859342: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-07-14 11:54:44.859383: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958]      0 
2018-07-14 11:54:44.859419: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   N 
2018-07-14 11:54:44.859628: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10867 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 0000:00:04.0, compute capability: 3.7)
INFO:tensorflow:Restoring parameters from /tmp/t2t_train/observation_prediction_roots/transformer-transformer_small/model.ckpt-1422
INFO:tenso

INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2018-07-14-11:55:45
INFO:tensorflow:Graph was finalized.
2018-07-14 11:55:46.362308: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0
2018-07-14 11:55:46.362505: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-07-14 11:55:46.362563: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958]      0 
2018-07-14 11:55:46.362599: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   N 
2018-07-14 11:55:46.362819: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10867 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 0000:00:04.0, compute capability: 3.7)
INFO:tensorflow:Restoring parameters from /tmp/t2t_train/observation_prediction_roots/transformer-transformer_small/model.ckpt-1519
INFO:tensor

INFO:tensorflow:Graph was finalized.
2018-07-14 11:56:47.990982: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0
2018-07-14 11:56:47.991109: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-07-14 11:56:47.991160: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958]      0 
2018-07-14 11:56:47.991196: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   N 
2018-07-14 11:56:47.991414: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10867 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 0000:00:04.0, compute capability: 3.7)
INFO:tensorflow:Restoring parameters from /tmp/t2t_train/observation_prediction_roots/transformer-transformer_small/model.ckpt-1611
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Eval

INFO:tensorflow:Starting evaluation at 2018-07-14-11:57:48
INFO:tensorflow:Graph was finalized.
2018-07-14 11:57:49.302660: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0
2018-07-14 11:57:49.302785: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-07-14 11:57:49.302827: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958]      0 
2018-07-14 11:57:49.302860: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   N 
2018-07-14 11:57:49.303055: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10867 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 0000:00:04.0, compute capability: 3.7)
INFO:tensorflow:Restoring parameters from /tmp/t2t_train/observation_prediction_roots/transformer-transformer_small/model.ckpt-1712
INFO:tensorflow:Running local_init_op.
INFO:tensor

INFO:tensorflow:Graph was finalized.
2018-07-14 11:58:52.828923: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0
2018-07-14 11:58:52.829034: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-07-14 11:58:52.829086: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958]      0 
2018-07-14 11:58:52.829118: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   N 
2018-07-14 11:58:52.829327: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10867 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 0000:00:04.0, compute capability: 3.7)
INFO:tensorflow:Restoring parameters from /tmp/t2t_train/observation_prediction_roots/transformer-transformer_small/model.ckpt-1797
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Eval

INFO:tensorflow:Transforming body output with symbol_modality_168_256.top
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2018-07-14-11:59:54
INFO:tensorflow:Graph was finalized.
2018-07-14 11:59:54.833214: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0
2018-07-14 11:59:54.833337: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-07-14 11:59:54.833381: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958]      0 
2018-07-14 11:59:54.833430: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   N 
2018-07-14 11:59:54.833653: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10867 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 0000:00:04.0, compute capability: 3.7)
INFO:tensorflow:Restoring parameters from /tmp/t2t_train/observation_

INFO:tensorflow:Transforming body output with symbol_modality_168_256.top
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2018-07-14-12:00:55
INFO:tensorflow:Graph was finalized.
2018-07-14 12:00:55.989154: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0
2018-07-14 12:00:55.989266: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-07-14 12:00:55.989314: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958]      0 
2018-07-14 12:00:55.989351: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   N 
2018-07-14 12:00:55.989566: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10867 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 0000:00:04.0, compute capability: 3.7)
INFO:tensorflow:Restoring parameters from /tmp/t2t_train/observation_

INFO:tensorflow:Building model body
INFO:tensorflow:Transforming body output with symbol_modality_168_256.top
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2018-07-14-12:01:55
INFO:tensorflow:Graph was finalized.
2018-07-14 12:01:55.745757: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0
2018-07-14 12:01:55.745884: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-07-14 12:01:55.745931: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958]      0 
2018-07-14 12:01:55.745964: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   N 
2018-07-14 12:01:55.746165: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10867 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 0000:00:04.0, compute capability: 3.7)
INFO:tensorflow:Restoring paramet

INFO:tensorflow:Transforming 'targets' with symbol_modality_168_256.targets_bottom
INFO:tensorflow:Building model body
INFO:tensorflow:Transforming body output with symbol_modality_168_256.top
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2018-07-14-12:02:54
INFO:tensorflow:Graph was finalized.
2018-07-14 12:02:55.166144: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0
2018-07-14 12:02:55.166268: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-07-14 12:02:55.166310: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958]      0 
2018-07-14 12:02:55.166343: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   N 
2018-07-14 12:02:55.166559: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10867 MB memory) -> physical GPU (device: 0, name: Tesla K80, p

INFO:tensorflow:Transforming 'targets' with symbol_modality_168_256.targets_bottom
INFO:tensorflow:Building model body
INFO:tensorflow:Transforming body output with symbol_modality_168_256.top
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2018-07-14-12:03:55
INFO:tensorflow:Graph was finalized.
2018-07-14 12:03:55.516254: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0
2018-07-14 12:03:55.516389: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-07-14 12:03:55.516462: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958]      0 
2018-07-14 12:03:55.516501: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   N 
2018-07-14 12:03:55.516675: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10867 MB memory) -> physical GPU (device: 0, name: Tesla K80, 

INFO:tensorflow:Transforming 'targets' with symbol_modality_168_256.targets_bottom
INFO:tensorflow:Building model body
INFO:tensorflow:Transforming body output with symbol_modality_168_256.top
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2018-07-14-12:04:55
INFO:tensorflow:Graph was finalized.
2018-07-14 12:04:55.540388: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0
2018-07-14 12:04:55.540545: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-07-14 12:04:55.540588: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958]      0 
2018-07-14 12:04:55.540619: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   N 
2018-07-14 12:04:55.540822: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10867 MB memory) -> physical GPU (device: 0, name: Tesla K80, 

INFO:tensorflow:Transforming 'targets' with symbol_modality_168_256.targets_bottom
INFO:tensorflow:Building model body
INFO:tensorflow:Transforming body output with symbol_modality_168_256.top
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2018-07-14-12:05:55
INFO:tensorflow:Graph was finalized.
2018-07-14 12:05:56.100733: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0
2018-07-14 12:05:56.100867: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-07-14 12:05:56.100908: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958]      0 
2018-07-14 12:05:56.100944: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   N 
2018-07-14 12:05:56.101121: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10867 MB memory) -> physical GPU (device: 0, name: Tesla K80, 

INFO:tensorflow:Transforming feature 'inputs' with symbol_modality_168_256.bottom
INFO:tensorflow:Transforming 'targets' with symbol_modality_168_256.targets_bottom
INFO:tensorflow:Building model body
INFO:tensorflow:Transforming body output with symbol_modality_168_256.top
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2018-07-14-12:06:56
INFO:tensorflow:Graph was finalized.
2018-07-14 12:06:56.640635: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0
2018-07-14 12:06:56.640760: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-07-14 12:06:56.640815: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958]      0 
2018-07-14 12:06:56.640847: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   N 
2018-07-14 12:06:56.641052: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0

INFO:tensorflow:Transforming body output with symbol_modality_168_256.top
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2018-07-14-12:07:57
INFO:tensorflow:Graph was finalized.
2018-07-14 12:07:57.870214: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0
2018-07-14 12:07:57.870341: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-07-14 12:07:57.870386: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958]      0 
2018-07-14 12:07:57.870421: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   N 
2018-07-14 12:07:57.870639: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10867 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 0000:00:04.0, compute capability: 3.7)
INFO:tensorflow:Restoring parameters from /tmp/t2t_train/observation_

INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2018-07-14-12:09:01
INFO:tensorflow:Graph was finalized.
2018-07-14 12:09:01.439588: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0
2018-07-14 12:09:01.439736: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-07-14 12:09:01.439777: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958]      0 
2018-07-14 12:09:01.439829: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   N 
2018-07-14 12:09:01.440080: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10867 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 0000:00:04.0, compute capability: 3.7)
INFO:tensorflow:Restoring parameters from /tmp/t2t_train/observation_prediction_roots/transformer-transformer_small/model.ckpt-2807
INFO:tensor

INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2018-07-14-12:10:04
INFO:tensorflow:Graph was finalized.
2018-07-14 12:10:05.109058: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0
2018-07-14 12:10:05.109188: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-07-14 12:10:05.109242: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958]      0 
2018-07-14 12:10:05.109278: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   N 
2018-07-14 12:10:05.109510: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10867 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 0000:00:04.0, compute capability: 3.7)
INFO:tensorflow:Restoring parameters from /tmp/t2t_train/observation_prediction_roots/transformer-transformer_small/model.ckpt-2891
INFO:tensor

INFO:tensorflow:Transforming body output with symbol_modality_168_256.top
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2018-07-14-12:11:05
INFO:tensorflow:Graph was finalized.
2018-07-14 12:11:06.019512: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0
2018-07-14 12:11:06.019672: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-07-14 12:11:06.019708: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958]      0 
2018-07-14 12:11:06.019732: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   N 
2018-07-14 12:11:06.020000: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10867 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 0000:00:04.0, compute capability: 3.7)
INFO:tensorflow:Restoring parameters from /tmp/t2t_train/observation_

INFO:tensorflow:Graph was finalized.
2018-07-14 12:12:06.174204: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0
2018-07-14 12:12:06.174330: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-07-14 12:12:06.174381: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958]      0 
2018-07-14 12:12:06.174414: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   N 
2018-07-14 12:12:06.174648: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10867 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 0000:00:04.0, compute capability: 3.7)
INFO:tensorflow:Restoring parameters from /tmp/t2t_train/observation_prediction_roots/transformer-transformer_small/model.ckpt-3101
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Evaluation

INFO:tensorflow:Graph was finalized.
2018-07-14 12:13:06.886645: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0
2018-07-14 12:13:06.886765: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-07-14 12:13:06.886817: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958]      0 
2018-07-14 12:13:06.886858: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   N 
2018-07-14 12:13:06.887062: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10867 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 0000:00:04.0, compute capability: 3.7)
INFO:tensorflow:Restoring parameters from /tmp/t2t_train/observation_prediction_roots/transformer-transformer_small/model.ckpt-3194
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Eval

INFO:tensorflow:Graph was finalized.
2018-07-14 12:14:07.607041: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0
2018-07-14 12:14:07.607179: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-07-14 12:14:07.607226: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958]      0 
2018-07-14 12:14:07.607260: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   N 
2018-07-14 12:14:07.607493: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10867 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 0000:00:04.0, compute capability: 3.7)
INFO:tensorflow:Restoring parameters from /tmp/t2t_train/observation_prediction_roots/transformer-transformer_small/model.ckpt-3285
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Eval

INFO:tensorflow:Starting evaluation at 2018-07-14-12:15:08
INFO:tensorflow:Graph was finalized.
2018-07-14 12:15:08.764802: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0
2018-07-14 12:15:08.764921: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-07-14 12:15:08.764969: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958]      0 
2018-07-14 12:15:08.765003: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   N 
2018-07-14 12:15:08.765202: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10867 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 0000:00:04.0, compute capability: 3.7)
INFO:tensorflow:Restoring parameters from /tmp/t2t_train/observation_prediction_roots/transformer-transformer_small/model.ckpt-3387
INFO:tensorflow:Running local_init_op.
INFO:tensor

INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2018-07-14-12:16:09
INFO:tensorflow:Graph was finalized.
2018-07-14 12:16:09.826650: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0
2018-07-14 12:16:09.826857: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-07-14 12:16:09.826927: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958]      0 
2018-07-14 12:16:09.826959: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   N 
2018-07-14 12:16:09.827223: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10867 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 0000:00:04.0, compute capability: 3.7)
INFO:tensorflow:Restoring parameters from /tmp/t2t_train/observation_prediction_roots/transformer-transformer_small/model.ckpt-3498
INFO:tensor

INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2018-07-14-12:17:09
INFO:tensorflow:Graph was finalized.
2018-07-14 12:17:10.234899: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0
2018-07-14 12:17:10.235061: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-07-14 12:17:10.235122: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958]      0 
2018-07-14 12:17:10.235166: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   N 
2018-07-14 12:17:10.235425: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10867 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 0000:00:04.0, compute capability: 3.7)
INFO:tensorflow:Restoring parameters from /tmp/t2t_train/observation_prediction_roots/transformer-transformer_small/model.ckpt-3600
INFO:tensor

INFO:tensorflow:Finished evaluation at 2018-07-14-12:17:29
INFO:tensorflow:Saving dict for global step 3600: global_step = 3600, loss = 0.23866168, metrics-observation_prediction_roots/targets/accuracy = 0.9452165, metrics-observation_prediction_roots/targets/accuracy_per_sequence = 0.3292857, metrics-observation_prediction_roots/targets/accuracy_top5 = 0.9811635, metrics-observation_prediction_roots/targets/approx_bleu_score = 0.84327775, metrics-observation_prediction_roots/targets/neg_log_perplexity = -0.33131462, metrics-observation_prediction_roots/targets/rouge_2_fscore = 0.843067, metrics-observation_prediction_roots/targets/rouge_L_fscore = 0.9231578
INFO:tensorflow:Saving 'checkpoint_path' summary for global step 3600: /tmp/t2t_train/observation_prediction_roots/transformer-transformer_small/model.ckpt-3600
INFO:tensorflow:Importing user module content from path /
INFO:tensorflow:Overriding hparams in transformer_small with optimizer=SGD
Instructions for updating:
When switch

INFO:tensorflow:Loss for final step: 0.11300836.
INFO:tensorflow:Reading data files from /tmp/t2t_data/observation_prediction_roots/observation_prediction_roots-dev*
INFO:tensorflow:partition: 0 num_data_files: 3
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Setting T2TModel mode to 'eval'
INFO:tensorflow:Setting hparams.dropout to 0.0
INFO:tensorflow:Setting hparams.layer_prepostprocess_dropout to 0.0
INFO:tensorflow:Setting hparams.symbol_dropout to 0.0
INFO:tensorflow:Setting hparams.attention_dropout to 0.0
INFO:tensorflow:Setting hparams.relu_dropout to 0.0
INFO:tensorflow:Using variable initializer: uniform_unit_scaling
INFO:tensorflow:Transforming feature 'inputs' with symbol_modality_168_256.bottom
INFO:tensorflow:Transforming 'targets' with symbol_modality_168_256.targets_bottom
INFO:tensorflow:Building model body
INFO:tensorflow:Transforming body output with symbol_modality_168_256.top
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2018-07

INFO:tensorflow:Saving checkpoints for 3948 into /tmp/t2t_train/observation_prediction_roots/transformer-transformer_small/model.ckpt.
INFO:tensorflow:Loss for final step: 0.099700436.
INFO:tensorflow:Reading data files from /tmp/t2t_data/observation_prediction_roots/observation_prediction_roots-dev*
INFO:tensorflow:partition: 0 num_data_files: 3
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Setting T2TModel mode to 'eval'
INFO:tensorflow:Setting hparams.dropout to 0.0
INFO:tensorflow:Setting hparams.layer_prepostprocess_dropout to 0.0
INFO:tensorflow:Setting hparams.symbol_dropout to 0.0
INFO:tensorflow:Setting hparams.attention_dropout to 0.0
INFO:tensorflow:Setting hparams.relu_dropout to 0.0
INFO:tensorflow:Using variable initializer: uniform_unit_scaling
INFO:tensorflow:Transforming feature 'inputs' with symbol_modality_168_256.bottom
INFO:tensorflow:Transforming 'targets' with symbol_modality_168_256.targets_bottom
INFO:tensorflow:Building model body
INFO:tensorflow:Transform

INFO:tensorflow:Saving checkpoints for 4104 into /tmp/t2t_train/observation_prediction_roots/transformer-transformer_small/model.ckpt.
INFO:tensorflow:Loss for final step: 0.097055025.
INFO:tensorflow:Reading data files from /tmp/t2t_data/observation_prediction_roots/observation_prediction_roots-dev*
INFO:tensorflow:partition: 0 num_data_files: 3
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Setting T2TModel mode to 'eval'
INFO:tensorflow:Setting hparams.dropout to 0.0
INFO:tensorflow:Setting hparams.layer_prepostprocess_dropout to 0.0
INFO:tensorflow:Setting hparams.symbol_dropout to 0.0
INFO:tensorflow:Setting hparams.attention_dropout to 0.0
INFO:tensorflow:Setting hparams.relu_dropout to 0.0
INFO:tensorflow:Using variable initializer: uniform_unit_scaling
INFO:tensorflow:Transforming feature 'inputs' with symbol_modality_168_256.bottom
INFO:tensorflow:Transforming 'targets' with symbol_modality_168_256.targets_bottom
INFO:tensorflow:Building model body
INFO:tensorflow:Transform

INFO:tensorflow:global_step/sec: 7.17732
INFO:tensorflow:loss = 0.09523366, step = 4204 (13.933 sec)
INFO:tensorflow:Saving checkpoints for 4241 into /tmp/t2t_train/observation_prediction_roots/transformer-transformer_small/model.ckpt.
INFO:tensorflow:Loss for final step: 0.095918566.
INFO:tensorflow:Reading data files from /tmp/t2t_data/observation_prediction_roots/observation_prediction_roots-dev*
INFO:tensorflow:partition: 0 num_data_files: 3
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Setting T2TModel mode to 'eval'
INFO:tensorflow:Setting hparams.dropout to 0.0
INFO:tensorflow:Setting hparams.layer_prepostprocess_dropout to 0.0
INFO:tensorflow:Setting hparams.symbol_dropout to 0.0
INFO:tensorflow:Setting hparams.attention_dropout to 0.0
INFO:tensorflow:Setting hparams.relu_dropout to 0.0
INFO:tensorflow:Using variable initializer: uniform_unit_scaling
INFO:tensorflow:Transforming feature 'inputs' with symbol_modality_168_256.bottom
INFO:tensorflow:Transforming 'targets' wit

INFO:tensorflow:Saving checkpoints for 4241 into /tmp/t2t_train/observation_prediction_roots/transformer-transformer_small/model.ckpt.
INFO:tensorflow:loss = 0.08530588, step = 4241
INFO:tensorflow:global_step/sec: 7.12306
INFO:tensorflow:loss = 0.090404496, step = 4341 (14.040 sec)
INFO:tensorflow:Saving checkpoints for 4384 into /tmp/t2t_train/observation_prediction_roots/transformer-transformer_small/model.ckpt.
INFO:tensorflow:Loss for final step: 0.09674714.
INFO:tensorflow:Reading data files from /tmp/t2t_data/observation_prediction_roots/observation_prediction_roots-dev*
INFO:tensorflow:partition: 0 num_data_files: 3
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Setting T2TModel mode to 'eval'
INFO:tensorflow:Setting hparams.dropout to 0.0
INFO:tensorflow:Setting hparams.layer_prepostprocess_dropout to 0.0
INFO:tensorflow:Setting hparams.symbol_dropout to 0.0
INFO:tensorflow:Setting hparams.attention_dropout to 0.0
INFO:tensorflow:Setting hparams.relu_dropout to 0.0
INFO:ten

INFO:tensorflow:Saving checkpoints for 4384 into /tmp/t2t_train/observation_prediction_roots/transformer-transformer_small/model.ckpt.
INFO:tensorflow:loss = 0.09682124, step = 4384
INFO:tensorflow:global_step/sec: 7.23114
INFO:tensorflow:loss = 0.084707886, step = 4484 (13.830 sec)
INFO:tensorflow:Saving checkpoints for 4529 into /tmp/t2t_train/observation_prediction_roots/transformer-transformer_small/model.ckpt.
INFO:tensorflow:Loss for final step: 0.09872757.
INFO:tensorflow:Reading data files from /tmp/t2t_data/observation_prediction_roots/observation_prediction_roots-dev*
INFO:tensorflow:partition: 0 num_data_files: 3
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Setting T2TModel mode to 'eval'
INFO:tensorflow:Setting hparams.dropout to 0.0
INFO:tensorflow:Setting hparams.layer_prepostprocess_dropout to 0.0
INFO:tensorflow:Setting hparams.symbol_dropout to 0.0
INFO:tensorflow:Setting hparams.attention_dropout to 0.0
INFO:tensorflow:Setting hparams.relu_dropout to 0.0
INFO:ten

INFO:tensorflow:Saving checkpoints for 4529 into /tmp/t2t_train/observation_prediction_roots/transformer-transformer_small/model.ckpt.
INFO:tensorflow:loss = 0.08894437, step = 4529
INFO:tensorflow:global_step/sec: 7.31535
INFO:tensorflow:loss = 0.08766955, step = 4629 (13.669 sec)
INFO:tensorflow:Saving checkpoints for 4676 into /tmp/t2t_train/observation_prediction_roots/transformer-transformer_small/model.ckpt.
INFO:tensorflow:Loss for final step: 0.099725366.
INFO:tensorflow:Reading data files from /tmp/t2t_data/observation_prediction_roots/observation_prediction_roots-dev*
INFO:tensorflow:partition: 0 num_data_files: 3
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Setting T2TModel mode to 'eval'
INFO:tensorflow:Setting hparams.dropout to 0.0
INFO:tensorflow:Setting hparams.layer_prepostprocess_dropout to 0.0
INFO:tensorflow:Setting hparams.symbol_dropout to 0.0
INFO:tensorflow:Setting hparams.attention_dropout to 0.0
INFO:tensorflow:Setting hparams.relu_dropout to 0.0
INFO:ten

INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Saving checkpoints for 4676 into /tmp/t2t_train/observation_prediction_roots/transformer-transformer_small/model.ckpt.
INFO:tensorflow:loss = 0.08561533, step = 4676
INFO:tensorflow:global_step/sec: 7.3484
INFO:tensorflow:loss = 0.08184544, step = 4776 (13.609 sec)
INFO:tensorflow:Saving checkpoints for 4825 into /tmp/t2t_train/observation_prediction_roots/transformer-transformer_small/model.ckpt.
INFO:tensorflow:Loss for final step: 0.10749249.
INFO:tensorflow:Reading data files from /tmp/t2t_data/observation_prediction_roots/observation_prediction_roots-dev*
INFO:tensorflow:partition: 0 num_data_files: 3
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Setting T2TModel mode to 'eval'
INFO:tensorflow:Setting hparams.dropout to 0.0
INFO:tensorflow:Setting hparams.layer_prepostprocess_dropout to 0.0
INFO:tensorflow:Setting hparams.symbol_dropout to 0.0
INFO:tensorflow:Setting hparams.attention_dropout to 0.0
INFO:tensorflow:Se

INFO:tensorflow:loss = 0.10060811, step = 4825
INFO:tensorflow:global_step/sec: 7.211
INFO:tensorflow:loss = 0.066319086, step = 4925 (13.868 sec)
INFO:tensorflow:Saving checkpoints for 4974 into /tmp/t2t_train/observation_prediction_roots/transformer-transformer_small/model.ckpt.
INFO:tensorflow:Loss for final step: 0.09451805.
INFO:tensorflow:Reading data files from /tmp/t2t_data/observation_prediction_roots/observation_prediction_roots-dev*
INFO:tensorflow:partition: 0 num_data_files: 3
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Setting T2TModel mode to 'eval'
INFO:tensorflow:Setting hparams.dropout to 0.0
INFO:tensorflow:Setting hparams.layer_prepostprocess_dropout to 0.0
INFO:tensorflow:Setting hparams.symbol_dropout to 0.0
INFO:tensorflow:Setting hparams.attention_dropout to 0.0
INFO:tensorflow:Setting hparams.relu_dropout to 0.0
INFO:tensorflow:Using variable initializer: uniform_unit_scaling
INFO:tensorflow:Transforming feature 'inputs' with symbol_modality_168_256.botto

INFO:tensorflow:loss = 0.09162907, step = 4974
INFO:tensorflow:global_step/sec: 7.25537
INFO:tensorflow:loss = 0.07974915, step = 5074 (13.784 sec)
INFO:tensorflow:Saving checkpoints for 5121 into /tmp/t2t_train/observation_prediction_roots/transformer-transformer_small/model.ckpt.
INFO:tensorflow:Loss for final step: 0.10428287.
INFO:tensorflow:Reading data files from /tmp/t2t_data/observation_prediction_roots/observation_prediction_roots-dev*
INFO:tensorflow:partition: 0 num_data_files: 3
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Setting T2TModel mode to 'eval'
INFO:tensorflow:Setting hparams.dropout to 0.0
INFO:tensorflow:Setting hparams.layer_prepostprocess_dropout to 0.0
INFO:tensorflow:Setting hparams.symbol_dropout to 0.0
INFO:tensorflow:Setting hparams.attention_dropout to 0.0
INFO:tensorflow:Setting hparams.relu_dropout to 0.0
INFO:tensorflow:Using variable initializer: uniform_unit_scaling
INFO:tensorflow:Transforming feature 'inputs' with symbol_modality_168_256.bott

INFO:tensorflow:loss = 0.09846982, step = 5121
INFO:tensorflow:global_step/sec: 7.32696
INFO:tensorflow:loss = 0.0729525, step = 5221 (13.649 sec)
INFO:tensorflow:Saving checkpoints for 5270 into /tmp/t2t_train/observation_prediction_roots/transformer-transformer_small/model.ckpt.
INFO:tensorflow:Loss for final step: 0.11635221.
INFO:tensorflow:Reading data files from /tmp/t2t_data/observation_prediction_roots/observation_prediction_roots-dev*
INFO:tensorflow:partition: 0 num_data_files: 3
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Setting T2TModel mode to 'eval'
INFO:tensorflow:Setting hparams.dropout to 0.0
INFO:tensorflow:Setting hparams.layer_prepostprocess_dropout to 0.0
INFO:tensorflow:Setting hparams.symbol_dropout to 0.0
INFO:tensorflow:Setting hparams.attention_dropout to 0.0
INFO:tensorflow:Setting hparams.relu_dropout to 0.0
INFO:tensorflow:Using variable initializer: uniform_unit_scaling
INFO:tensorflow:Transforming feature 'inputs' with symbol_modality_168_256.botto

INFO:tensorflow:loss = 0.08745169, step = 5270
INFO:tensorflow:global_step/sec: 6.99705
INFO:tensorflow:loss = 0.09806531, step = 5370 (14.293 sec)
INFO:tensorflow:Saving checkpoints for 5410 into /tmp/t2t_train/observation_prediction_roots/transformer-transformer_small/model.ckpt.
INFO:tensorflow:Loss for final step: 0.11673061.
INFO:tensorflow:Reading data files from /tmp/t2t_data/observation_prediction_roots/observation_prediction_roots-dev*
INFO:tensorflow:partition: 0 num_data_files: 3
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Setting T2TModel mode to 'eval'
INFO:tensorflow:Setting hparams.dropout to 0.0
INFO:tensorflow:Setting hparams.layer_prepostprocess_dropout to 0.0
INFO:tensorflow:Setting hparams.symbol_dropout to 0.0
INFO:tensorflow:Setting hparams.attention_dropout to 0.0
INFO:tensorflow:Setting hparams.relu_dropout to 0.0
INFO:tensorflow:Using variable initializer: uniform_unit_scaling
INFO:tensorflow:Transforming feature 'inputs' with symbol_modality_168_256.bott

INFO:tensorflow:loss = 0.089396305, step = 5410
INFO:tensorflow:global_step/sec: 7.35115
INFO:tensorflow:loss = 0.09141321, step = 5510 (13.604 sec)
INFO:tensorflow:Saving checkpoints for 5561 into /tmp/t2t_train/observation_prediction_roots/transformer-transformer_small/model.ckpt.
INFO:tensorflow:Loss for final step: 0.10418109.
INFO:tensorflow:Reading data files from /tmp/t2t_data/observation_prediction_roots/observation_prediction_roots-dev*
INFO:tensorflow:partition: 0 num_data_files: 3
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Setting T2TModel mode to 'eval'
INFO:tensorflow:Setting hparams.dropout to 0.0
INFO:tensorflow:Setting hparams.layer_prepostprocess_dropout to 0.0
INFO:tensorflow:Setting hparams.symbol_dropout to 0.0
INFO:tensorflow:Setting hparams.attention_dropout to 0.0
INFO:tensorflow:Setting hparams.relu_dropout to 0.0
INFO:tensorflow:Using variable initializer: uniform_unit_scaling
INFO:tensorflow:Transforming feature 'inputs' with symbol_modality_168_256.bot

INFO:tensorflow:global_step/sec: 6.87504
INFO:tensorflow:loss = 0.076768525, step = 5661 (14.546 sec)
INFO:tensorflow:Saving checkpoints for 5698 into /tmp/t2t_train/observation_prediction_roots/transformer-transformer_small/model.ckpt.
INFO:tensorflow:Loss for final step: 0.09268806.
INFO:tensorflow:Reading data files from /tmp/t2t_data/observation_prediction_roots/observation_prediction_roots-dev*
INFO:tensorflow:partition: 0 num_data_files: 3
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Setting T2TModel mode to 'eval'
INFO:tensorflow:Setting hparams.dropout to 0.0
INFO:tensorflow:Setting hparams.layer_prepostprocess_dropout to 0.0
INFO:tensorflow:Setting hparams.symbol_dropout to 0.0
INFO:tensorflow:Setting hparams.attention_dropout to 0.0
INFO:tensorflow:Setting hparams.relu_dropout to 0.0
INFO:tensorflow:Using variable initializer: uniform_unit_scaling
INFO:tensorflow:Transforming feature 'inputs' with symbol_modality_168_256.bottom
INFO:tensorflow:Transforming 'targets' wit

INFO:tensorflow:loss = 0.10528631, step = 5698
INFO:tensorflow:global_step/sec: 7.16492
INFO:tensorflow:loss = 0.068808645, step = 5798 (13.957 sec)
INFO:tensorflow:Saving checkpoints for 5843 into /tmp/t2t_train/observation_prediction_roots/transformer-transformer_small/model.ckpt.
INFO:tensorflow:Loss for final step: 0.096280895.
INFO:tensorflow:Reading data files from /tmp/t2t_data/observation_prediction_roots/observation_prediction_roots-dev*
INFO:tensorflow:partition: 0 num_data_files: 3
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Setting T2TModel mode to 'eval'
INFO:tensorflow:Setting hparams.dropout to 0.0
INFO:tensorflow:Setting hparams.layer_prepostprocess_dropout to 0.0
INFO:tensorflow:Setting hparams.symbol_dropout to 0.0
INFO:tensorflow:Setting hparams.attention_dropout to 0.0
INFO:tensorflow:Setting hparams.relu_dropout to 0.0
INFO:tensorflow:Using variable initializer: uniform_unit_scaling
INFO:tensorflow:Transforming feature 'inputs' with symbol_modality_168_256.bo

INFO:tensorflow:global_step/sec: 7.10792
INFO:tensorflow:loss = 0.11177015, step = 5943 (14.070 sec)
INFO:tensorflow:Saving checkpoints for 5986 into /tmp/t2t_train/observation_prediction_roots/transformer-transformer_small/model.ckpt.
INFO:tensorflow:Loss for final step: 0.09190047.
INFO:tensorflow:Reading data files from /tmp/t2t_data/observation_prediction_roots/observation_prediction_roots-dev*
INFO:tensorflow:partition: 0 num_data_files: 3
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Setting T2TModel mode to 'eval'
INFO:tensorflow:Setting hparams.dropout to 0.0
INFO:tensorflow:Setting hparams.layer_prepostprocess_dropout to 0.0
INFO:tensorflow:Setting hparams.symbol_dropout to 0.0
INFO:tensorflow:Setting hparams.attention_dropout to 0.0
INFO:tensorflow:Setting hparams.relu_dropout to 0.0
INFO:tensorflow:Using variable initializer: uniform_unit_scaling
INFO:tensorflow:Transforming feature 'inputs' with symbol_modality_168_256.bottom
INFO:tensorflow:Transforming 'targets' with

INFO:tensorflow:global_step/sec: 7.02673
INFO:tensorflow:loss = 0.08773277, step = 6086 (14.232 sec)
INFO:tensorflow:Saving checkpoints for 6126 into /tmp/t2t_train/observation_prediction_roots/transformer-transformer_small/model.ckpt.
INFO:tensorflow:Loss for final step: 0.108235404.
INFO:tensorflow:Reading data files from /tmp/t2t_data/observation_prediction_roots/observation_prediction_roots-dev*
INFO:tensorflow:partition: 0 num_data_files: 3
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Setting T2TModel mode to 'eval'
INFO:tensorflow:Setting hparams.dropout to 0.0
INFO:tensorflow:Setting hparams.layer_prepostprocess_dropout to 0.0
INFO:tensorflow:Setting hparams.symbol_dropout to 0.0
INFO:tensorflow:Setting hparams.attention_dropout to 0.0
INFO:tensorflow:Setting hparams.relu_dropout to 0.0
INFO:tensorflow:Using variable initializer: uniform_unit_scaling
INFO:tensorflow:Transforming feature 'inputs' with symbol_modality_168_256.bottom
INFO:tensorflow:Transforming 'targets' wit

INFO:tensorflow:loss = 0.087180324, step = 6126
INFO:tensorflow:global_step/sec: 6.27254
INFO:tensorflow:loss = 0.090464786, step = 6226 (15.943 sec)
INFO:tensorflow:Saving checkpoints for 6234 into /tmp/t2t_train/observation_prediction_roots/transformer-transformer_small/model.ckpt.
INFO:tensorflow:Loss for final step: 0.0958982.
INFO:tensorflow:Reading data files from /tmp/t2t_data/observation_prediction_roots/observation_prediction_roots-dev*
INFO:tensorflow:partition: 0 num_data_files: 3
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Setting T2TModel mode to 'eval'
INFO:tensorflow:Setting hparams.dropout to 0.0
INFO:tensorflow:Setting hparams.layer_prepostprocess_dropout to 0.0
INFO:tensorflow:Setting hparams.symbol_dropout to 0.0
INFO:tensorflow:Setting hparams.attention_dropout to 0.0
INFO:tensorflow:Setting hparams.relu_dropout to 0.0
INFO:tensorflow:Using variable initializer: uniform_unit_scaling
INFO:tensorflow:Transforming feature 'inputs' with symbol_modality_168_256.bot

INFO:tensorflow:loss = 0.080917366, step = 6234
INFO:tensorflow:global_step/sec: 6.57386
INFO:tensorflow:loss = 0.114124, step = 6334 (15.212 sec)
INFO:tensorflow:Saving checkpoints for 6338 into /tmp/t2t_train/observation_prediction_roots/transformer-transformer_small/model.ckpt.
INFO:tensorflow:Loss for final step: 0.098501734.
INFO:tensorflow:Reading data files from /tmp/t2t_data/observation_prediction_roots/observation_prediction_roots-dev*
INFO:tensorflow:partition: 0 num_data_files: 3
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Setting T2TModel mode to 'eval'
INFO:tensorflow:Setting hparams.dropout to 0.0
INFO:tensorflow:Setting hparams.layer_prepostprocess_dropout to 0.0
INFO:tensorflow:Setting hparams.symbol_dropout to 0.0
INFO:tensorflow:Setting hparams.attention_dropout to 0.0
INFO:tensorflow:Setting hparams.relu_dropout to 0.0
INFO:tensorflow:Using variable initializer: uniform_unit_scaling
INFO:tensorflow:Transforming feature 'inputs' with symbol_modality_168_256.bott

INFO:tensorflow:loss = 0.08843404, step = 6338
INFO:tensorflow:global_step/sec: 7.1752
INFO:tensorflow:loss = 0.075338416, step = 6438 (13.937 sec)
INFO:tensorflow:Saving checkpoints for 6486 into /tmp/t2t_train/observation_prediction_roots/transformer-transformer_small/model.ckpt.
INFO:tensorflow:Loss for final step: 0.09714719.
INFO:tensorflow:Reading data files from /tmp/t2t_data/observation_prediction_roots/observation_prediction_roots-dev*
INFO:tensorflow:partition: 0 num_data_files: 3
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Setting T2TModel mode to 'eval'
INFO:tensorflow:Setting hparams.dropout to 0.0
INFO:tensorflow:Setting hparams.layer_prepostprocess_dropout to 0.0
INFO:tensorflow:Setting hparams.symbol_dropout to 0.0
INFO:tensorflow:Setting hparams.attention_dropout to 0.0
INFO:tensorflow:Setting hparams.relu_dropout to 0.0
INFO:tensorflow:Using variable initializer: uniform_unit_scaling
INFO:tensorflow:Transforming feature 'inputs' with symbol_modality_168_256.bott

INFO:tensorflow:Saving checkpoints for 6486 into /tmp/t2t_train/observation_prediction_roots/transformer-transformer_small/model.ckpt.
INFO:tensorflow:loss = 0.097704485, step = 6486
INFO:tensorflow:global_step/sec: 7.12456
INFO:tensorflow:loss = 0.08079961, step = 6586 (14.037 sec)
INFO:tensorflow:Saving checkpoints for 6634 into /tmp/t2t_train/observation_prediction_roots/transformer-transformer_small/model.ckpt.
INFO:tensorflow:Loss for final step: 0.0965113.
INFO:tensorflow:Reading data files from /tmp/t2t_data/observation_prediction_roots/observation_prediction_roots-dev*
INFO:tensorflow:partition: 0 num_data_files: 3
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Setting T2TModel mode to 'eval'
INFO:tensorflow:Setting hparams.dropout to 0.0
INFO:tensorflow:Setting hparams.layer_prepostprocess_dropout to 0.0
INFO:tensorflow:Setting hparams.symbol_dropout to 0.0
INFO:tensorflow:Setting hparams.attention_dropout to 0.0
INFO:tensorflow:Setting hparams.relu_dropout to 0.0
INFO:tens

INFO:tensorflow:loss = 0.08393594, step = 6634
INFO:tensorflow:global_step/sec: 6.98658
INFO:tensorflow:loss = 0.10352223, step = 6734 (14.314 sec)
INFO:tensorflow:Saving checkpoints for 6770 into /tmp/t2t_train/observation_prediction_roots/transformer-transformer_small/model.ckpt.
INFO:tensorflow:Loss for final step: 0.10391315.
INFO:tensorflow:Reading data files from /tmp/t2t_data/observation_prediction_roots/observation_prediction_roots-dev*
INFO:tensorflow:partition: 0 num_data_files: 3
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Setting T2TModel mode to 'eval'
INFO:tensorflow:Setting hparams.dropout to 0.0
INFO:tensorflow:Setting hparams.layer_prepostprocess_dropout to 0.0
INFO:tensorflow:Setting hparams.symbol_dropout to 0.0
INFO:tensorflow:Setting hparams.attention_dropout to 0.0
INFO:tensorflow:Setting hparams.relu_dropout to 0.0
INFO:tensorflow:Using variable initializer: uniform_unit_scaling
INFO:tensorflow:Transforming feature 'inputs' with symbol_modality_168_256.bott

INFO:tensorflow:loss = 0.09853861, step = 6770
INFO:tensorflow:global_step/sec: 6.94403
INFO:tensorflow:loss = 0.0691255, step = 6870 (14.401 sec)
INFO:tensorflow:Saving checkpoints for 6908 into /tmp/t2t_train/observation_prediction_roots/transformer-transformer_small/model.ckpt.
INFO:tensorflow:Loss for final step: 0.094906874.
INFO:tensorflow:Reading data files from /tmp/t2t_data/observation_prediction_roots/observation_prediction_roots-dev*
INFO:tensorflow:partition: 0 num_data_files: 3
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Setting T2TModel mode to 'eval'
INFO:tensorflow:Setting hparams.dropout to 0.0
INFO:tensorflow:Setting hparams.layer_prepostprocess_dropout to 0.0
INFO:tensorflow:Setting hparams.symbol_dropout to 0.0
INFO:tensorflow:Setting hparams.attention_dropout to 0.0
INFO:tensorflow:Setting hparams.relu_dropout to 0.0
INFO:tensorflow:Using variable initializer: uniform_unit_scaling
INFO:tensorflow:Transforming feature 'inputs' with symbol_modality_168_256.bott

INFO:tensorflow:global_step/sec: 7.00053
INFO:tensorflow:loss = 0.08643257, step = 7008 (14.285 sec)
INFO:tensorflow:Saving checkpoints for 7045 into /tmp/t2t_train/observation_prediction_roots/transformer-transformer_small/model.ckpt.
INFO:tensorflow:Loss for final step: 0.09902369.
INFO:tensorflow:Reading data files from /tmp/t2t_data/observation_prediction_roots/observation_prediction_roots-dev*
INFO:tensorflow:partition: 0 num_data_files: 3
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Setting T2TModel mode to 'eval'
INFO:tensorflow:Setting hparams.dropout to 0.0
INFO:tensorflow:Setting hparams.layer_prepostprocess_dropout to 0.0
INFO:tensorflow:Setting hparams.symbol_dropout to 0.0
INFO:tensorflow:Setting hparams.attention_dropout to 0.0
INFO:tensorflow:Setting hparams.relu_dropout to 0.0
INFO:tensorflow:Using variable initializer: uniform_unit_scaling
INFO:tensorflow:Transforming feature 'inputs' with symbol_modality_168_256.bottom
INFO:tensorflow:Transforming 'targets' with

INFO:tensorflow:global_step/sec: 6.92608
INFO:tensorflow:loss = 0.07900216, step = 7145 (14.439 sec)
INFO:tensorflow:Saving checkpoints for 7188 into /tmp/t2t_train/observation_prediction_roots/transformer-transformer_small/model.ckpt.
INFO:tensorflow:Loss for final step: 0.09670772.
INFO:tensorflow:Reading data files from /tmp/t2t_data/observation_prediction_roots/observation_prediction_roots-dev*
INFO:tensorflow:partition: 0 num_data_files: 3
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Setting T2TModel mode to 'eval'
INFO:tensorflow:Setting hparams.dropout to 0.0
INFO:tensorflow:Setting hparams.layer_prepostprocess_dropout to 0.0
INFO:tensorflow:Setting hparams.symbol_dropout to 0.0
INFO:tensorflow:Setting hparams.attention_dropout to 0.0
INFO:tensorflow:Setting hparams.relu_dropout to 0.0
INFO:tensorflow:Using variable initializer: uniform_unit_scaling
INFO:tensorflow:Transforming feature 'inputs' with symbol_modality_168_256.bottom
INFO:tensorflow:Transforming 'targets' with

INFO:tensorflow:Saving checkpoints for 7200 into /tmp/t2t_train/observation_prediction_roots/transformer-transformer_small/model.ckpt.
INFO:tensorflow:Loss for final step: 0.0801804.
INFO:tensorflow:Reading data files from /tmp/t2t_data/observation_prediction_roots/observation_prediction_roots-dev*
INFO:tensorflow:partition: 0 num_data_files: 3
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Setting T2TModel mode to 'eval'
INFO:tensorflow:Setting hparams.dropout to 0.0
INFO:tensorflow:Setting hparams.layer_prepostprocess_dropout to 0.0
INFO:tensorflow:Setting hparams.symbol_dropout to 0.0
INFO:tensorflow:Setting hparams.attention_dropout to 0.0
INFO:tensorflow:Setting hparams.relu_dropout to 0.0
INFO:tensorflow:Using variable initializer: uniform_unit_scaling
INFO:tensorflow:Transforming feature 'inputs' with symbol_modality_168_256.bottom
INFO:tensorflow:Transforming 'targets' with symbol_modality_168_256.targets_bottom
INFO:tensorflow:Building model body
INFO:tensorflow:Transformin

INFO:tensorflow:Evaluation [4/10]
INFO:tensorflow:Evaluation [5/10]
INFO:tensorflow:Evaluation [6/10]
INFO:tensorflow:Evaluation [7/10]
INFO:tensorflow:Evaluation [8/10]
INFO:tensorflow:Evaluation [9/10]
INFO:tensorflow:Evaluation [10/10]
INFO:tensorflow:Finished evaluation at 2018-07-14-12:42:53
INFO:tensorflow:Saving dict for global step 7200: global_step = 7200, loss = 0.07628368, metrics-observation_prediction_roots/targets/accuracy = 0.98030055, metrics-observation_prediction_roots/targets/accuracy_per_sequence = 0.54214287, metrics-observation_prediction_roots/targets/accuracy_top5 = 0.9946436, metrics-observation_prediction_roots/targets/approx_bleu_score = 0.92196065, metrics-observation_prediction_roots/targets/neg_log_perplexity = -0.14433086, metrics-observation_prediction_roots/targets/rouge_2_fscore = 0.91952455, metrics-observation_prediction_roots/targets/rouge_L_fscore = 0.95685834
INFO:tensorflow:Saving 'checkpoint_path' summary for global step 7200: /tmp/t2t_train/obs

In [5]:
# See the transductions

! pygmentize -g test_roots.observation
print("->-")
! pygmentize -g test_roots.prediction

A_POLYNOMIAL with coefficients 1 3 22
->-
A_ROOT real -1.50 imag -4.56 A_ROOT real -1.50 imag 4.56
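
As a sanity check on the transduction above, the exact roots can be computed with the quadratic formula. This is a minimal sketch, not part of the original notebook: the helper `quadratic_roots` is hypothetical, it only covers the degree-2 case, and it assumes that `coefficients 1 3 22` lists the highest-degree coefficient first (so the polynomial is x² + 3x + 22).

```python
import cmath


def quadratic_roots(a, b, c):
    """Return both roots of a*x**2 + b*x + c using the quadratic formula."""
    if a == 0:
        raise ValueError("not a quadratic: leading coefficient is zero")
    # cmath.sqrt handles a negative discriminant by returning a complex value.
    sqrt_disc = cmath.sqrt(b * b - 4 * a * c)
    return ((-b - sqrt_disc) / (2 * a), (-b + sqrt_disc) / (2 * a))


# Assumption: "coefficients 1 3 22" means x**2 + 3*x + 22 (highest degree first).
for root in quadratic_roots(1, 3, 22):
    print("A_ROOT real %.2f imag %.2f" % (root.real, root.imag))
# Prints:
# A_ROOT real -1.50 imag -4.44
# A_ROOT real -1.50 imag 4.44
```

Under that assumption the real parts agree with the prediction, while the predicted imaginary part (4.56) is close to but not equal to the exact value (about 4.44). This is in line with the evaluation metrics in the log, where per-token accuracy (0.98) is much higher than per-sequence accuracy (0.54).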


In [0]:
import os

import tensorflow as tf

from tensor2tensor import problems
from tensor2tensor.bin import t2t_decoder  # To register the hparams set
from tensor2tensor.utils import registry
from tensor2tensor.utils import trainer_lib
from tensor2tensor.visualization import attention
from tensor2tensor.visualization import visualization

In [0]:
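def call_html():
  # Load require.js and register the d3 and jquery paths that the
  # attention visualization needs when rendered in Colab.
  from IPython.display import HTML, display
  display(HTML('''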
        <script src="/static/components/requirejs/require.js"></script>
        <script>
          requirejs.config({
            paths: {
              base: '/static/base',
              "d3": "https://cdnjs.cloudflare.com/ajax/libs/d3/3.5.8/d3.min",
              jquery: '//ajax.googleapis.com/ajax/libs/jquery/2.0.0/jquery.min',
            },
          });
        </script>
        '''))

In [0]:
# MODEL
CHECKPOINT = os.path.expanduser('/tmp/t2t_train/observation_prediction_roots/transformer-transformer_small')

In [0]:
# HPARAMS
problem_name = 'observation_prediction_roots'
data_dir = os.path.expanduser('/tmp/t2t_data/observation_prediction_roots')
model_name = "transformer"
hparams_set = "transformer_small"

In [10]:
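# Presumably imported for its side effect of registering the
# 'observation_prediction_roots' problem with Tensor2Tensor.
import observation_prediction_roots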

visualizer = visualization.AttentionVisualizer(hparams_set, model_name, data_dir, problem_name, beam_size=1)

INFO:tensorflow:Setting T2TModel mode to 'eval'
INFO:tensorflow:Setting hparams.dropout to 0.0
INFO:tensorflow:Setting hparams.layer_prepostprocess_dropout to 0.0
INFO:tensorflow:Setting hparams.symbol_dropout to 0.0
INFO:tensorflow:Setting hparams.attention_dropout to 0.0
INFO:tensorflow:Setting hparams.relu_dropout to 0.0
INFO:tensorflow:Using variable initializer: uniform_unit_scaling
INFO:tensorflow:Transforming feature 'inputs' with symbol_modality_168_256.bottom
INFO:tensorflow:Transforming 'targets' with symbol_modality_168_256.targets_bottom
INFO:tensorflow:Building model body
INFO:tensorflow:Transforming body output with symbol_modality_168_256.top
INFO:tensorflow:Greedy Decoding


In [11]:
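# The checkpoint saver hook used by MonitoredTrainingSession expects a
# global_step variable in the graph, so create one before opening the session.
tf.Variable(0, dtype=tf.int64, trainable=False, name='global_step')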

sess = tf.train.MonitoredTrainingSession(
    checkpoint_dir=CHECKPOINT,
    save_summaries_secs=0,
)

INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from /tmp/t2t_train/observation_prediction_roots/transformer-transformer_small/model.ckpt-7200
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Saving checkpoints for 7200 into /tmp/t2t_train/observation_prediction_roots/transformer-transformer_small/model.ckpt.


In [12]:
input_sentence = "A_POLYNOMIAL with coefficients 1 3 22"
output_string, inp_text, out_text, att_mats = visualizer.get_vis_data_from_string(sess, input_sentence)
print(output_string)

A_ROOT real -1.50 imag -4.56 A_ROOT real -1.50 imag 4.56<EOS>
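
To judge how good a decoded string is, one option is to parse it back into complex numbers and substitute them into the polynomial. The sketch below is an illustration only: `parse_roots` and `evaluate` are hypothetical helpers, the regular expression assumes the `A_ROOT real … imag …` layout seen in the output above, and the coefficients are assumed to be ordered highest degree first.

```python
import re


def parse_roots(prediction):
    """Extract complex roots from a string like 'A_ROOT real -1.50 imag 4.56'."""
    text = prediction.replace("<EOS>", "")
    number = r"(-?\d+(?:\.\d+)?)"
    pattern = r"A_ROOT real " + number + r" imag " + number
    return [complex(float(real), float(imag))
            for real, imag in re.findall(pattern, text)]


def evaluate(coefficients, x):
    """Evaluate a polynomial (highest degree first) at x with Horner's rule."""
    value = 0
    for coefficient in coefficients:
        value = value * x + coefficient
    return value


predicted = parse_roots(
    "A_ROOT real -1.50 imag -4.56 A_ROOT real -1.50 imag 4.56<EOS>")
for z in predicted:
    # |p(z)| is 0 for an exact root; larger values mean a worse prediction.
    print(z, abs(evaluate([1, 3, 22], z)))
```

A residual of zero would indicate an exact root, so the printed magnitudes give a rough measure of how far each predicted root is from a true one.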


## Interpreting the Visualizations
- The layers dropdown allows you to view the different Transformer layers (0-indexed).
  - Tip: The first, last, and second-to-last layers are usually the most interpretable.
- The attention dropdown allows you to select which type of attention to display:
  - All: Shows all types of attention together. NOTE: Heads of the same color in the decoder self-attention and the decoder-encoder attention are unrelated, since they do not share parameters.
  - Input - Input: Shows only the encoder self-attention.
  - Input - Output: Shows the decoder’s attention on the encoder. NOTE: Every decoder layer attends to the final encoder layer, so the visualization shows the attention on the final encoder layer regardless of which layer is selected in the dropdown.
  - Output - Output: Shows only the decoder self-attention. NOTE: The visualization might be slightly misleading in the first layer: the text shown is the target of the decoder, whereas the input to the decoder at layer 0 is this text with a GO symbol prepended.
- The colored squares represent the different attention heads.
  - You can hide or show a given head by clicking on its color.
  - Double-clicking a color hides all other colors; double-clicking a color when it is the only head showing shows all the heads again.
- You can hover over a word to see the individual attention weights for just that position.
  - Hovering over the words on the left will show what that position attended to.
  - Hovering over the words on the right will show what positions attended to it.
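
The two hover behaviours described above correspond to reading one attention matrix along its rows or along its columns. The following sketch uses a made-up 3×3 matrix for a single head (the weights are invented for illustration and are not taken from the trained model); the convention that rows are the attending positions and columns the attended-to positions is an assumption of this example.

```python
# Hypothetical attention weights for one head over three input tokens.
# Row i: how position i distributes its attention (each row sums to 1).
tokens = ["A_POLYNOMIAL", "with", "coefficients"]
weights = [
    [0.7, 0.2, 0.1],
    [0.1, 0.6, 0.3],
    [0.5, 0.1, 0.4],
]


def attended_to(weights, i):
    """Row i: what position i attended to (hovering a word on the left)."""
    return weights[i]


def attended_by(weights, j):
    """Column j: which positions attended to j (hovering a word on the right)."""
    return [row[j] for row in weights]


for i, token in enumerate(tokens):
    print(token, "attends to", dict(zip(tokens, attended_to(weights, i))))
for j, token in enumerate(tokens):
    print(token, "is attended by", dict(zip(tokens, attended_by(weights, j))))
```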

In [13]:
call_html()
attention.show(inp_text, out_text, *att_mats)
