# Tensor2Tensor translation
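A first attempt at using the Tensor2Tensor deep learning framework for a translation task with the Transformer architecture. The task here is translating natural-language descriptions into Python code.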

<img style="width: 400px;" src="../images/transformer.png" />

In [3]:
import linecache
import os
import sys

import numpy as np
np.random.seed(0)

import matplotlib.pyplot as plt

In [4]:
def show_sample(fp, src_ext=".src", tgt_ext=".tgt", lines=(3, 21, 80, 99)):
    """Print the given 1-based line numbers from a parallel source/target file pair."""
    linecache.clearcache()
    for l in lines:
        print("LINE: {} \nSOURCE:    {} \nTARGET:     {}\n".format(
            l,
            linecache.getline(fp + src_ext, l),
            linecache.getline(fp + tgt_ext, l)))

In [5]:
django_fp = "../datasets/django/all"
show_sample(django_fp, src_ext=".desc", tgt_ext=".code", lines=[13,14])

LINE: 13 
SOURCE:      define the function get_cache with backend and dictionary pair of elements kwargs as arguments.
 
TARGET:         def get_cache ( backend , ** kwargs ) :


LINE: 14 
 




### Dataset and train / test split
Create a `temp` folder and write the dataset into it, split into a training set and a test set at roughly 90% / 10%.

In [8]:
dirName = "temp"
 
try:
    # Create target Directory
    os.mkdir(dirName)
    print("Directory " , dirName ,  " Created ") 
except FileExistsError:
    print("Directory " , dirName ,  " already exists")

Directory  temp  Created 


In [15]:
inputs_train_fp = "temp/inputs.train.txt"
targets_train_fp = "temp/targets.train.txt"
inputs_test_fp = "temp/inputs.eval.txt"
targets_test_fp = "temp/targets.eval.txt"

In [16]:
train_ratio = 0.9  # 90% of the data is used for training, the remaining 10% for testing
with open(django_fp + ".desc") as f:
    num_samples = sum(1 for _ in f)
train_cutoff = int(num_samples * train_ratio)

# linecache.getline counts lines from 1, so the line numbers run from 1 to num_samples
line_nums = np.arange(1, num_samples + 1)
np.random.shuffle(line_nums)

train_lines = line_nums[:train_cutoff]
test_lines = line_nums[train_cutoff:]
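
Because `linecache.getline` counts lines from 1 and returns an empty string for line 0, the shuffled split has to be taken over the line numbers 1 to `num_samples`. A minimal sketch of that step as a function follows. The name `split_line_numbers` and its `seed` parameter are hypothetical, and the sketch uses the standard-library `random` module in place of NumPy to stay dependency-free.

```python
import random


def split_line_numbers(num_samples, train_ratio=0.9, seed=0):
    """Return (train, test) lists of shuffled 1-based line numbers."""
    line_nums = list(range(1, num_samples + 1))  # 1-based, as linecache expects
    random.Random(seed).shuffle(line_nums)
    cutoff = int(num_samples * train_ratio)
    return line_nums[:cutoff], line_nums[cutoff:]


train, test = split_line_numbers(num_samples=10)

# The two parts are disjoint and together cover every line exactly once.
assert set(train).isdisjoint(test)
assert sorted(train + test) == list(range(1, 11))
```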

##### Train split for .desc and .code

In [17]:
with open(inputs_train_fp, "w") as out:
    for l in train_lines:
        src = linecache.getline(django_fp + ".desc", l)
        out.write(src)

In [18]:
with open(targets_train_fp, "w") as out:
    for l in train_lines:
        src = linecache.getline(django_fp + ".code", l)
        out.write(src)

##### Test split for .desc and .code

In [19]:
with open(inputs_test_fp, "w") as out:
    for l in test_lines:
        src = linecache.getline(django_fp + ".desc", l)
        out.write(src)

In [20]:
with open(targets_test_fp, "w") as out:
    for l in test_lines:
        src = linecache.getline(django_fp + ".code", l)
        out.write(src)
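
The four cells above repeat the same loop. A minimal sketch of a shared helper is shown here; `write_lines` is a hypothetical name, and the demo uses a small throwaway file in a temporary directory rather than the Django dataset.

```python
import linecache
import os
import tempfile


def write_lines(src_fp, out_fp, line_nums):
    """Copy the given 1-based lines of src_fp to out_fp, in the given order."""
    linecache.checkcache(src_fp)  # drop stale cache entries if the file changed
    with open(out_fp, "w") as out:
        for l in line_nums:
            out.write(linecache.getline(src_fp, l))


# Demo on a small throwaway file.
tmp = tempfile.mkdtemp()
src_fp = os.path.join(tmp, "all.desc")
with open(src_fp, "w") as f:
    f.write("first\nsecond\nthird\n")

out_fp = os.path.join(tmp, "inputs.train.txt")
write_lines(src_fp, out_fp, [3, 1])
with open(out_fp) as f:
    copied = f.read()
```

With such a helper, each of the four cells would reduce to one call, for example `write_lines(django_fp + ".desc", inputs_train_fp, train_lines)`.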

## Generating the data
Tensor2Tensor uses Problems to define the data input to the model. Here we use the library's built-in `text2text_tmpdir` problem, which generates a problem from input and target text files.
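
Since the data generation step reads the split files from the `temp` folder, it can help to confirm first that each inputs file has a matching targets file with the same number of lines. A quick pre-flight check can be sketched as follows. `check_parallel_files` and `count_lines` are hypothetical helpers, and the file naming pattern is simply the one this notebook writes; whether the library requires exactly these names is an assumption not verified here.

```python
import os
import tempfile


def count_lines(fp):
    """Count the lines in a text file."""
    with open(fp) as f:
        return sum(1 for _ in f)


def check_parallel_files(tmp_dir, splits=("train", "eval")):
    """Return {split: num_lines}; raise if inputs and targets are misaligned."""
    counts = {}
    for split in splits:
        inputs_fp = os.path.join(tmp_dir, "inputs.{}.txt".format(split))
        targets_fp = os.path.join(tmp_dir, "targets.{}.txt".format(split))
        n_in, n_tgt = count_lines(inputs_fp), count_lines(targets_fp)
        if n_in != n_tgt:
            raise ValueError(
                "{}: {} inputs but {} targets".format(split, n_in, n_tgt))
        counts[split] = n_in
    return counts


# Demo with throwaway files instead of the real dataset.
demo_dir = tempfile.mkdtemp()
demo_files = {
    "inputs.train.txt": "a\nb\n",
    "targets.train.txt": "x\ny\n",
    "inputs.eval.txt": "c\n",
    "targets.eval.txt": "z\n",
}
for name, text in demo_files.items():
    with open(os.path.join(demo_dir, name), "w") as f:
        f.write(text)

counts = check_parallel_files(demo_dir)
```

In the notebook itself the call would be `check_parallel_files("temp")`, run before `t2t-datagen`.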

In [22]:
!pwd
!ls

/my_shared/experiments
Tensor2Tensor_first_try.ipynb  datasets.ipynb  retrieval.ipynb	temp


In [25]:
!t2t-datagen --tmp_dir=./temp --problem=text2text_tmpdir

  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
  np_resource = np.dtype([("resource", np.ubyte, 1)])
W0819 13:08:31.380589 139897936963328 lazy_loader.py:50] 
The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
  * https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality n

I0819 13:08:38.116696 139897936963328 text_encoder.py:866] vocab_size = 2392
I0819 13:08:38.117123 139897936963328 text_encoder.py:802] Iteration 1
I0819 13:08:38.330088 139897936963328 text_encoder.py:866] vocab_size = 1068
I0819 13:08:38.330311 139897936963328 text_encoder.py:802] Iteration 2
I0819 13:08:38.555809 139897936963328 text_encoder.py:866] vocab_size = 1124
I0819 13:08:38.556070 139897936963328 text_encoder.py:802] Iteration 3
I0819 13:08:38.770539 139897936963328 text_encoder.py:866] vocab_size = 1115
I0819 13:08:38.773955 139897936963328 text_encoder.py:722] Trying min_count 62
I0819 13:08:38.776998 139897936963328 text_encoder.py:802] Iteration 0
I0819 13:08:39.176549 139897936963328 text_encoder.py:866] vocab_size = 3728
I0819 13:08:39.176832 139897936963328 text_encoder.py:802] Iteration 1
I0819 13:08:39.402698 139897936963328 text_encoder.py:866] vocab_size = 1565
I0819 13:08:39.402932 139897936963328 text_encoder.py:802] Iteration 2
I0819 13:08:39.637024 13989793696

In [29]:
!t2t-trainer --problem=text2text_tmpdir \
             --hparams_set=transformer_base \
             --hparams='batch_size=1024' \
             --model=transformer \
             --data_dir=/tmp \
             --output_dir=./temp/model

  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
  np_resource = np.dtype([("resource", np.ubyte, 1)])
W0819 13:21:28.144207 140450289841920 deprecation_wrapper.py:119] From /usr/local/lib/python3.5/dist-packages/tensor2tensor/utils/expert_utils.py:68: The name tf.variable_scope is deprecated. Please use tf.compat.v1.variable_scope instead.

W0819 13:21:28.625369 140450289841920 lazy_loader.py:50] 
The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more info

I0819 13:21:31.644086 140450289841920 estimator.py:209] Using config: {'_num_ps_replicas': 0, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7fbcdee5a470>, '_keep_checkpoint_every_n_hours': 10000, '_device_fn': None, '_environment': 'local', '_is_chief': True, '_session_config': gpu_options {
  per_process_gpu_memory_fraction: 0.95
}
allow_soft_placement: true
graph_options {
  optimizer_options {
    global_jit_level: OFF
  }
}
isolate_session_state: true
, '_model_dir': './temp/model', '_save_checkpoints_steps': 1000, '_experimental_max_worker_delay_secs': None, 'data_parallelism': <tensor2tensor.utils.expert_utils.Parallelism object at 0x7fbcdee5a4a8>, '_save_checkpoints_secs': None, '_log_step_count_steps': 100, '_task_type': None, 't2t_device_info': {'num_async_replicas': 1}, '_save_summary_steps': 100, '_master': '', 'use_tpu': False, '_eval_distribute': None, '_evaluation_master': '', '_train_distribute': None, '_tf_config': gpu_options {
  per_p

2019-08-19 13:21:54.132793: W tensorflow/core/common_runtime/colocation_graph.cc:1016] Failed to place the graph without changing the devices of some resources. Some of the operations (that had to be colocated with resource generating operations) are not supported on the resources' devices. Current candidate devices are [
  /job:localhost/replica:0/task:0/device:CPU:0
  /job:localhost/replica:0/task:0/device:XLA_CPU:0].
See below for details of this colocation group:
Colocation Debug Info:
Colocation group had the following types and supported devices: 
Root Member(assigned_device_name_index_=-1 requested_device_name_='/device:GPU:0' assigned_device_name_='' resource_device_name_='/device:GPU:0' supported_device_types_=[CPU, XLA_CPU] possible_devices_=[]
ResourceApplyAdam: CPU XLA_CPU 
Fill: CPU XLA_CPU 
ReadVariableOp: CPU XLA_CPU 
AssignVariableOp: CPU XLA_CPU 
VarIsInitializedOp: CPU XLA_CPU 
Add: CPU XLA_CPU 
Mul: CPU XLA_CPU 
RandomStandardNormal: CPU XLA_CPU 
VarHa

  training/transformer/body/encoder/layer_2/self_attention/multihead_attention/output_transform/kernel/Adam_1/Initializer/zeros (Fill) /device:GPU:0
  training/transformer/body/encoder/layer_2/self_attention/multihead_attention/output_transform/kernel/Adam_1 (VarHandleOp) /device:GPU:0
  training/transformer/body/encoder/layer_2/self_attention/multihead_attention/output_transform/kernel/Adam_1/IsInitialized/VarIsInitializedOp (VarIsInitializedOp) /device:GPU:0
  training/transformer/body/encoder/layer_2/self_attention/multihead_attention/output_transform/kernel/Adam_1/Assign (AssignVariableOp) /device:GPU:0
  training/transformer/body/encoder/layer_2/self_attention/multihead_attention/output_transform/kernel/Adam_1/Read/ReadVariableOp (ReadVariableOp) /device:GPU:0
  training/train/update_transformer/body/encoder/layer_2/self_attention/multihead_attention/output_transform/kernel/ResourceApplyAdam (ResourceApplyAdam) /device:GPU:0
  report_uninitialized_variables/VarIsInitializedO

  training/train/update_transformer/body/encoder/layer_0/ffn/conv2/bias/ResourceApplyAdam/ReadVariableOp (ReadVariableOp) /device:GPU:0
  training/train/update_transformer/body/encoder/layer_0/ffn/conv2/bias/ResourceApplyAdam/ReadVariableOp_1 (ReadVariableOp) /device:GPU:0
  training/train/update_transformer/body/encoder/layer_1/self_attention/layer_prepostprocess/layer_norm/layer_norm_scale/ResourceApplyAdam/ReadVariableOp (ReadVariableOp) /device:GPU:0
  training/train/update_transformer/body/encoder/layer_1/self_attention/layer_prepostprocess/layer_norm/layer_norm_scale/ResourceApplyAdam/ReadVariableOp_1 (ReadVariableOp) /device:GPU:0
  training/train/update_transformer/body/encoder/layer_1/self_attention/layer_prepostprocess/layer_norm/layer_norm_bias/ResourceApplyAdam/ReadVariableOp (ReadVariableOp) /device:GPU:0
  training/train/update_transformer/body/encoder/layer_1/self_attention/layer_prepostprocess/layer_norm/layer_norm_bias/ResourceApplyAdam/ReadVariableOp_1 (ReadVaria

VarHandleOp: CPU XLA_CPU 
AssignVariableOp: CPU XLA_CPU 
Mul: CPU XLA_CPU 
Fill: CPU XLA_CPU 

Colocation members, user-requested devices, and framework assigned devices, if any:
  transformer/body/decoder/layer_2/ffn/conv2/kernel/Initializer/random_uniform/shape (Const) 
  transformer/body/decoder/layer_2/ffn/conv2/kernel/Initializer/random_uniform/min (Const) 
  transformer/body/decoder/layer_2/ffn/conv2/kernel/Initializer/random_uniform/max (Const) 
  transformer/body/decoder/layer_2/ffn/conv2/kernel/Initializer/random_uniform/RandomUniform (RandomUniform) 
  transformer/body/decoder/layer_2/ffn/conv2/kernel/Initializer/random_uniform/sub (Sub) 
  transformer/body/decoder/layer_2/ffn/conv2/kernel/Initializer/random_uniform/mul (Mul) 
  transformer/body/decoder/layer_2/ffn/conv2/kernel/Initializer/random_uniform (Add) 
  transformer/body/decoder/layer_2/ffn/conv2/kernel (VarHandleOp) /device:GPU:0
  transformer/body/decoder/layer_2/ffn/conv2/kernel/IsInitialized/VarIsIn

I0819 13:21:57.542656 140450289841920 session_manager.py:500] Running local_init_op.
I0819 13:21:57.777920 140450289841920 session_manager.py:502] Done running local_init_op.
I0819 13:22:07.965782 140450289841920 basic_session_run_hooks.py:606] Saving checkpoints for 0 into ./temp/model/model.ckpt.
2019-08-19 13:22:35.223431: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:111] Filling up shuffle buffer (this may take a while): 117 of 512
2019-08-19 13:22:45.015565: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:111] Filling up shuffle buffer (this may take a while): 276 of 512
2019-08-19 13:22:55.033494: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:111] Filling up shuffle buffer (this may take a while): 432 of 512
2019-08-19 13:23:02.409222: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:162] Shuffle buffer filled.
I0819 13:23:08.089487 140450289841920 basic_session_run_hooks.py:262] loss = 8.330775, step = 1
Killed
