# Sequence to sequence

> **Translator's note**: This notebook is a translation of [Character Sequence to Sequence](https://github.com/haltaro/deep-learning-in-japanese/blob/master/seq2seq/sequence_to_sequence_implementation.ipynb), made as part of a project to translate the [public materials of Udacity's Deep learning nanodegree](https://github.com/udacity/deep-learning) into Japanese (the **Exercise** notebooks are not translated). The list of notebooks in this project is available [here](https://haltaro.github.io/deep-learning-in-japanese/).


In this notebook, we use Sequence to Sequence to implement a model that takes a string as input and outputs its characters in sorted order. The notebook targets TensorFlow 1.1 and draws on [Dave Currie's article](https://medium.com/towards-data-science/text-summarization-with-amazon-reviews-41801c2210b).

<img src="images/sequence-to-sequence.jpg"/>


## Dataset 

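The dataset is in the `data/` directory, which contains the following files.

 * **letters_source.txt**: The list of input strings, one string per line.
 * **letters_target.txt**: The list of target strings. Each line corresponds to the same line of `letters_source.txt`.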

In [1]:
import numpy as np
import time

import helper

source_path = 'data/letters_source.txt'
target_path = 'data/letters_target.txt'

source_sentences = helper.load_data(source_path)
target_sentences = helper.load_data(target_path)

In [2]:
import warnings
warnings.filterwarnings('ignore')

First, let's look at the contents of the dataset. `source_sentences` holds all of the input strings, separated by `\n`.

In [3]:
source_sentences[:50].split('\n')

['bsaqq',
 'npy',
 'lbwuj',
 'bqv',
 'kial',
 'tddam',
 'edxpjpg',
 'nspv',
 'huloz',
 '']

`target_sentences` holds all of the target strings, separated by `\n`. The strings are in the same order as in `source_sentences`.

In [4]:
target_sentences[:50].split('\n')

['abqqs',
 'npy',
 'bjluw',
 'bqv',
 'aikl',
 'addmt',
 'degjppx',
 'npsv',
 'hlouz',
 '']
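As a quick sanity check on how the two files relate, the sketch below re-derives a few target lines from the source lines printed above by sorting their characters. The helper name `sort_characters` is introduced here for illustration only and is not part of the notebook.

```python
# A few source/target pairs copied from the outputs above.
source_lines = ['bsaqq', 'npy', 'lbwuj', 'bqv', 'kial']
target_lines = ['abqqs', 'npy', 'bjluw', 'bqv', 'aikl']


def sort_characters(line):
    """Return the characters of `line` in sorted order (hypothetical helper)."""
    return ''.join(sorted(line))


# Each target line should equal its source line with the characters sorted.
derived = [sort_characters(line) for line in source_lines]
print(derived == target_lines)  # True
```

This is the mapping the model has to learn from examples; nothing in the model itself is told that the task is sorting.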

## Preprocess

Each string must be converted into a list of characters.

<img src="images/source_and_target_arrays.png"/>

Then each character is converted to an integer according to `vocab_to_int`.

In [5]:
def extract_character_vocab(data):
    special_words = ['<PAD>', '<UNK>', '<GO>',  '<EOS>']

    set_words = set([character for line in data.split('\n') for character in line])
    int_to_vocab = {word_i: word for word_i, word in enumerate(special_words + list(set_words))}
    vocab_to_int = {word: word_i for word_i, word in int_to_vocab.items()}

    return int_to_vocab, vocab_to_int

# Build int2letter and letter2int dicts
source_int_to_letter, source_letter_to_int = extract_character_vocab(source_sentences)
target_int_to_letter, target_letter_to_int = extract_character_vocab(target_sentences)

# Convert characters to ids
source_letter_ids = [[source_letter_to_int.get(letter, source_letter_to_int['<UNK>']) for letter in line] for line in source_sentences.split('\n')]
target_letter_ids = [[target_letter_to_int.get(letter, target_letter_to_int['<UNK>']) for letter in line] + [target_letter_to_int['<EOS>']] for line in target_sentences.split('\n')] 

print("Example source sequence")
print(source_letter_ids[:3])
print("\n")
print("Example target sequence")
print(target_letter_ids[:3])

Example source sequence
[[5, 6, 9, 23, 23], [18, 13, 11], [24, 5, 22, 8, 27]]


Example target sequence
[[9, 5, 23, 23, 6, 3], [18, 13, 11, 3], [5, 27, 24, 8, 22, 3]]
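The same mapping can be traced by hand on toy data. The following is a minimal sketch, assuming a two-line dataset; unlike the cell above, it sorts the character set so that the ids are reproducible, which is an assumption made purely for illustration (the ids it prints will not match the output above). The name `build_vocab` is hypothetical.

```python
SPECIAL_TOKENS = ['<PAD>', '<UNK>', '<GO>', '<EOS>']


def build_vocab(lines):
    """Build id<->token dictionaries; sorted() makes the ids deterministic."""
    characters = sorted(set(ch for line in lines for ch in line))
    int_to_vocab = dict(enumerate(SPECIAL_TOKENS + characters))
    vocab_to_int = {token: idx for idx, token in int_to_vocab.items()}
    return int_to_vocab, vocab_to_int


int_to_vocab, vocab_to_int = build_vocab(['bca', 'ab'])

# Special tokens take ids 0-3, so 'a', 'b', 'c' become 4, 5, 6.
source_ids = [vocab_to_int.get(ch, vocab_to_int['<UNK>']) for ch in 'bca']
# Target sequences additionally end with <EOS>.
target_ids = [vocab_to_int[ch] for ch in 'abc'] + [vocab_to_int['<EOS>']]
# Characters missing from the vocabulary fall back to <UNK>.
unknown_id = vocab_to_int.get('z', vocab_to_int['<UNK>'])

print(source_ids)  # [5, 6, 4]
print(target_ids)  # [4, 5, 6, 3]
print(unknown_id)  # 1
```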


The data is now ready.

## Model

### Check the TensorFlow version

Check which version of TensorFlow is installed.

In [6]:
from distutils.version import LooseVersion
import tensorflow as tf
from tensorflow.python.layers.core import Dense


# Check TensorFlow Version
assert LooseVersion(tf.__version__) >= LooseVersion('1.1'), 'Please use TensorFlow version 1.1 or newer'
print('TensorFlow Version: {}'.format(tf.__version__))

TensorFlow Version: 1.6.0


### Hyperparameters

In [7]:
# Number of Epochs
epochs = 60
# Batch Size
batch_size = 128
# RNN Size
rnn_size = 50
# Number of Layers
num_layers = 2
# Embedding Size
encoding_embedding_size = 15
decoding_embedding_size = 15
# Learning Rate
learning_rate = 0.001

### Input

In [8]:
def get_model_inputs():
    input_data = tf.placeholder(tf.int32, [None, None], name='input')
    targets = tf.placeholder(tf.int32, [None, None], name='targets')
    lr = tf.placeholder(tf.float32, name='learning_rate')

    target_sequence_length = tf.placeholder(
        tf.int32, (None,), name='target_sequence_length')
    max_target_sequence_length = tf.reduce_max(
        target_sequence_length, name='max_target_len')
    source_sequence_length = tf.placeholder(
        tf.int32, (None,), name='source_sequence_length')
    
    return input_data, targets, lr, target_sequence_length, \
        max_target_sequence_length, source_sequence_length


### Sequence to Sequence Model

We now start defining the functions that build the seq2seq model. The components are built bottom-up, in the following order.

    2.1 Encoder
        - Embedding
        - Encoder cell
    2.2 Decoder
        1- Process decoder inputs
        2- Set up the decoder
            - Embedding
            - Decoder cell
            - Dense output layer
            - Training decoder
            - Inference decoder
    2.3 Seq2seq model connecting the encoder and decoder
    2.4 Build the training graph hooking up the model with the 
        optimizer

### 2.1 Encoder

The first component of the model is the encoder. This function embeds the input, builds the encoder, and passes the embedded data to the encoder.

-  Embed the input data with [`tf.contrib.layers.embed_sequence`](https://www.tensorflow.org/api_docs/python/tf/contrib/layers/embed_sequence).

<img src="images/embed_sequence.png" />

- Pass the embedded data to the RNN stack. Keep the RNN state and ignore the output.

<img src="images/encoder.png" />

In [9]:
def encoding_layer(input_data, rnn_size, num_layers,
                   source_sequence_length, source_vocab_size, 
                   encoding_embedding_size):


    # Encoder embedding
    enc_embed_input = tf.contrib.layers.embed_sequence(
        input_data, source_vocab_size, encoding_embedding_size)

    # RNN cell
    def make_cell(rnn_size):
        enc_cell = tf.contrib.rnn.LSTMCell(
            rnn_size, initializer=tf.random_uniform_initializer(
                -0.1, 0.1, seed=2))
        return enc_cell

    enc_cell = tf.contrib.rnn.MultiRNNCell(
        [make_cell(rnn_size) for _ in range(num_layers)])
    
    enc_output, enc_state = tf.nn.dynamic_rnn(
        enc_cell, enc_embed_input, 
        sequence_length=source_sequence_length, dtype=tf.float32)
    
    return enc_output, enc_state
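To make "keep the state, ignore the output" concrete, here is a rough pure-Python sketch of a single-unit recurrent cell run over one padded sequence. It is not the LSTM stack used above: `toy_rnn`, its scalar state, and its weights are assumptions chosen to keep the example small. It mimics how `tf.nn.dynamic_rnn` handles `sequence_length`: past the given length, the state is carried through unchanged and the outputs are zero.

```python
import math


def toy_rnn(inputs, length, w_in=0.5, w_rec=0.5):
    """Run a one-unit tanh RNN over `inputs`, honoring the true `length`."""
    state = 0.0
    outputs = []
    for t, x in enumerate(inputs):
        if t < length:
            # Regular recurrent update inside the real sequence.
            state = math.tanh(w_in * x + w_rec * state)
            outputs.append(state)
        else:
            # Padding step: output is zero and the state is left untouched.
            outputs.append(0.0)
    return outputs, state


# A sequence of true length 2, padded to length 4.
outputs, final_state = toy_rnn([1.0, 2.0, 0.0, 0.0], length=2)
print(outputs[2:])                # [0.0, 0.0]
print(final_state == outputs[1])  # True
```

Only `final_state` plays the role of `enc_state`, the summary of the input that the decoder receives.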

## 2.2 Decoder

The decoder is probably the most involved part of this model. Building it takes the following steps.

    1- Process the decoder inputs.
    2- Build the components of the decoder.
        - Embedding
        - Decoder cell
        - Dense output layer
        - Training decoder
        - Inference decoder

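### Process the decoder inputs

During training, the target strings are used in two ways.

 1. To compute the loss.
 2. As input to the decoder, which makes the model more robust.

Let us look at the second point in more detail. Consider target data like the following (at this point it would actually be integers, but it is shown as characters for simplicity).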

<img src="images/targets_1.png"/>

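This tensor must be processed before it is fed to the decoder.

1- At each time step, one element of the sequence is passed to the decoder. Consider the last time step, where the decoder emits the final element of its output. The input at that step is the second-to-last element of the target sequence. The last element is never used as decoder input, so it must be removed.

We do this with [`tf.strided_slice()`](https://www.tensorflow.org/api_docs/python/tf/strided_slice), passing it the tensor, the begin indices, and the end indices.

> **Translator's note**: The fourth argument below (`[1, 1]`) is the stride, which does not need to be specified in this case.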

<img src="images/strided_slice_1.png"/>

2- The first element of each sequence should be the `<GO>` symbol, so we prepend `<GO>` to every sequence.

<img src="images/targets_add_go.png"/>

The tensor is now ready to be passed to the decoder. It should look like this (if the integers were converted back to characters).

<img src="images/targets_after_processing_1.png"/>

In [13]:
# Process the input we'll feed to the decoder
def process_decoder_input(target_data, vocab_to_int, batch_size):
    '''Remove the last word id from each sequence in the batch and concat 
    the <GO> id to the beginning of each sequence'''
    ending = tf.strided_slice(target_data, [0, 0], [batch_size, -1], [1, 1])
    dec_input = tf.concat([tf.fill([batch_size, 1], vocab_to_int['<GO>']), ending], 1)

    return dec_input
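The two steps (drop the last element, prepend `<GO>`) can be traced on plain Python lists. This is only a sketch of the idea, not a replacement for the TensorFlow function above; `process_decoder_input_py` and the id constants are hypothetical, with the ids following the order of `special_words` defined earlier.

```python
# Assumed ids, following the order ['<PAD>', '<UNK>', '<GO>', '<EOS>'].
PAD_ID, GO_ID, EOS_ID = 0, 2, 3


def process_decoder_input_py(target_batch, go_id):
    """Drop the last id of every row and put the <GO> id in front."""
    return [[go_id] + row[:-1] for row in target_batch]


# Two padded target sequences, each ending in <EOS> (and then padding).
targets = [[9, 5, 23, 23, 6, EOS_ID],
           [18, 13, 11, EOS_ID, PAD_ID, PAD_ID]]

dec_input = process_decoder_input_py(targets, GO_ID)
print(dec_input)
# [[2, 9, 5, 23, 23, 6], [2, 18, 13, 11, 3, 0]]
```

The rows keep their length, so the decoder input at step `t` lines up with the target at step `t`: the decoder sees `<GO>` when it should predict the first character, the first character when it should predict the second, and so on.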

### Build the components of the decoder

        - Embedding
        - Decoder cell
        - Dense output layer
        - Training decoder
        - Inference decoder

#### 1- Embedding

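The decoder's training inputs are now ready. The next step is to embed them.

We create an embedding matrix like the one below and use `tf.nn.embedding_lookup` to convert the inputs into their embedded form.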

<img src="images/embeddings.png" />
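Conceptually, an embedding lookup is just row selection: every id in the input picks one row of the matrix. The sketch below shows this with nested Python lists; `make_embedding_matrix` and `embedding_lookup` are illustrative stand-ins written for this example, not TensorFlow functions, and the sizes are assumptions.

```python
import random


def make_embedding_matrix(vocab_size, embedding_size, seed=0):
    """Random matrix with one row of `embedding_size` values per vocabulary id."""
    rng = random.Random(seed)
    return [[rng.random() for _ in range(embedding_size)]
            for _ in range(vocab_size)]


def embedding_lookup(matrix, ids):
    """Replace every id in a [batch, time] list by its row of the matrix."""
    return [[matrix[i] for i in row] for row in ids]


embeddings = make_embedding_matrix(vocab_size=30, embedding_size=15)
dec_input = [[2, 9, 5], [2, 18, 13]]  # [batch=2, time=3]
embedded = embedding_lookup(embeddings, dec_input)

# The result has shape [batch, time, embedding_size].
print(len(embedded), len(embedded[0]), len(embedded[0][0]))  # 2 3 15
```

Because both sequences start with id 2, their first time steps receive the same vector; the embedding depends only on the id, not on the position.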

#### 2- Decoder Cell

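Next, we build the decoder cell. As in the encoder, we use `tf.contrib.rnn.LSTMCell`.

We need to declare one decoder for training and one for inference (prediction). The two decoders share their parameters, so the weights and biases learned during training are the ones used at inference time.

First, we define the type of cell used in the decoder RNN. Here we chose LSTM.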

#### 3- Dense output layer

Before declaring the decoders, we need to build the output layer. It is a `tensorflow.python.layers.core.Dense` layer that converts the decoder output into logits at each time step.
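As a rough illustration of what this layer computes at a single time step, the sketch below applies a dense layer (matrix product plus bias) to a decoder output vector and picks the largest logit. The sizes and numbers are made up for the example, and `dense` is a hand-written helper, not the TensorFlow class.

```python
def dense(vector, kernel, bias):
    """Compute vector @ kernel + bias, with kernel shaped [input_size][units]."""
    return [sum(vector[i] * kernel[i][j] for i in range(len(vector))) + bias[j]
            for j in range(len(bias))]


# Assumed toy sizes: rnn_size = 2, target vocabulary size = 3.
rnn_output = [1.0, -2.0]
kernel = [[0.5, 0.0, -1.0],
          [0.0, 1.0, 0.5]]
bias = [0.0, 0.25, 0.0]

logits = dense(rnn_output, kernel, bias)
predicted_id = max(range(len(logits)), key=lambda j: logits[j])

print(logits)        # [0.5, -1.75, -2.0]
print(predicted_id)  # 0
```

There is one logit per entry of the target vocabulary, which is why the layer is created with `target_vocab_size` units.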

#### 4- Training decoder

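The basic approach is to build two decoders that share their parameters: one for training and one for inference. Both are built with `tf.contrib.seq2seq.BasicDecoder` and `tf.contrib.seq2seq.dynamic_decode`. They differ in that the training decoder is fed the target sequence as its input, which makes the model more robust.

The training decoder can be pictured as in the figure below (the figure uses words rather than characters as its units).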

<img src="images/sequence-to-sequence-training-decoder.png"/>

The training decoder does **not** feed the output of each time step into the next one. Instead, the input at each time step comes from the target dataset (orange).

#### 5- Inference decoder

The inference decoder is the one we actually use for prediction.

<img src="images/sequence-to-sequence-inference-decoder.png"/>

Both the training decoder and the inference decoder receive the encoder's hidden state. TensorFlow provides most of the logic described so far; we only need to pick the appropriate methods from `tf.contrib.seq2seq` and apply them to the right inputs.
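The difference between the two decoding modes can be shown with a toy stand-in for the decoder. In the sketch below, `toy_step` is a hypothetical lookup table that plays the role of one decoder step and deliberately makes a mistake after `'b'`; it is not how the real decoder works, only a way to see where each mode takes its inputs from.

```python
def toy_step(prev_token):
    """Hypothetical single decoder step: next token given the previous one."""
    table = {'<GO>': 'a', 'a': 'b', 'b': 'x', 'c': '<EOS>', 'x': 'x'}
    return table[prev_token]


def decode_with_teacher_forcing(target):
    """Training mode: inputs are <GO> plus the target minus its last element."""
    inputs = ['<GO>'] + target[:-1]
    return [toy_step(token) for token in inputs]


def decode_greedy(max_steps):
    """Inference mode: each prediction becomes the next input."""
    outputs, token = [], '<GO>'
    for _ in range(max_steps):
        token = toy_step(token)
        outputs.append(token)
        if token == '<EOS>':
            break
    return outputs


target = ['a', 'b', 'c', '<EOS>']
print(decode_with_teacher_forcing(target))  # ['a', 'b', 'x', '<EOS>']
print(decode_greedy(max_steps=4))           # ['a', 'b', 'x', 'x']
```

With target inputs, the mistake at the third step stays local because the next input is the correct `'c'`. In greedy mode, the wrong `'x'` is fed back in and the rest of the sequence goes wrong with it, until `max_steps` stops the loop.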

In [18]:
def decoding_layer(target_letter_to_int, decoding_embedding_size, num_layers, 
                   rnn_size, target_sequence_length, max_target_sequence_length, 
                   enc_state, dec_input):
    # 1. Decoder Embedding
    target_vocab_size = len(target_letter_to_int)
    dec_embeddings = tf.Variable(tf.random_uniform(
        [target_vocab_size, decoding_embedding_size]))
    dec_embed_input = tf.nn.embedding_lookup(dec_embeddings, dec_input)

    # 2. Construct the decoder cell
    def make_cell(rnn_size):
        dec_cell = tf.contrib.rnn.LSTMCell(
            rnn_size, initializer=tf.random_uniform_initializer(-0.1, 0.1, seed=2))
        return dec_cell

    dec_cell = tf.contrib.rnn.MultiRNNCell(
        [make_cell(rnn_size) for _ in range(num_layers)])
     
    # 3. Dense layer to translate the decoder's output at each time 
    # step into a choice from the target vocabulary
    output_layer = Dense(
        target_vocab_size, kernel_initializer = tf.truncated_normal_initializer(
            mean = 0.0, stddev=0.1))


    # 4. Set up a training decoder and an inference decoder
    # Training Decoder
    with tf.variable_scope("decode"):

        # Helper for the training process. Used by BasicDecoder to read inputs.
        training_helper = tf.contrib.seq2seq.TrainingHelper(
            inputs=dec_embed_input, sequence_length=target_sequence_length,
            time_major=False)
        
        
        # Basic decoder
        training_decoder = tf.contrib.seq2seq.BasicDecoder(
            dec_cell, training_helper, enc_state, output_layer) 
        
        # Perform dynamic decoding using the decoder
        training_decoder_output, _, __ = tf.contrib.seq2seq.dynamic_decode(
            training_decoder, impute_finished=True,
            maximum_iterations=max_target_sequence_length)
    
    # 5. Inference Decoder
    # Reuses the same parameters trained by the training process
    with tf.variable_scope("decode", reuse=True):
        start_tokens = tf.tile(
            tf.constant([target_letter_to_int['<GO>']], dtype=tf.int32), 
            [batch_size], name='start_tokens')

        # Helper for the inference process.
        inference_helper = tf.contrib.seq2seq.GreedyEmbeddingHelper(
            dec_embeddings, start_tokens, target_letter_to_int['<EOS>'])

        # Basic decoder
        inference_decoder = tf.contrib.seq2seq.BasicDecoder(
            dec_cell, inference_helper, enc_state, output_layer)
        
        # Perform dynamic decoding using the decoder
        inference_decoder_output, _, __ = tf.contrib.seq2seq.dynamic_decode(
            inference_decoder, impute_finished=True,
            maximum_iterations=max_target_sequence_length)
         
    
    return training_decoder_output, inference_decoder_output

> **Translator's note**: `tf.contrib.seq2seq.dynamic_decode` returned two values in TensorFlow 1.1 but returns three in the current version (1.6), so the calls were changed to `training_decoder_output, _, __` and `inference_decoder_output, _, __`.

## 2.3 Seq2seq model 

Let's connect the encoder and the decoder.

In [14]:

def seq2seq_model(input_data, targets, lr, target_sequence_length, 
                  max_target_sequence_length, source_sequence_length,
                  source_vocab_size, target_vocab_size,
                  enc_embedding_size, dec_embedding_size, 
                  rnn_size, num_layers):
    
    # Pass the input data through the encoder. We'll ignore the encoder output, 
    # but use the state
    _, enc_state = encoding_layer(input_data, 
                                  rnn_size, 
                                  num_layers, 
                                  source_sequence_length,
                                  source_vocab_size, 
                                  enc_embedding_size)
    
    
    # Prepare the target sequences we'll feed to the decoder in training mode
    dec_input = process_decoder_input(targets, target_letter_to_int, batch_size)
    
    # Pass encoder state and decoder inputs to the decoders
    training_decoder_output, inference_decoder_output = decoding_layer(
        target_letter_to_int, dec_embedding_size, num_layers, rnn_size,
        target_sequence_length, max_target_sequence_length, enc_state, dec_input) 
    
    return training_decoder_output, inference_decoder_output

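The model outputs `training_decoder_output` and `inference_decoder_output`, both of which contain `rnn_output` logits like the following.

<img src="images/logits.png"/>

We pass the logits from the training decoder to `tf.contrib.seq2seq.sequence_loss()` to compute the loss, from which the gradients are then derived.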
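The idea behind a masked sequence loss can be sketched without TensorFlow: compute the softmax cross-entropy at every time step, weight it by a 0/1 mask so that padded steps do not count, and average over the remaining steps. The functions below are hand-written approximations for illustration; they are assumed to mirror the spirit of `tf.sequence_mask` and `sequence_loss` with default averaging, not their exact implementation.

```python
import math


def sequence_mask(lengths, max_len):
    """1.0 for real time steps, 0.0 for padded ones."""
    return [[1.0 if t < n else 0.0 for t in range(max_len)] for n in lengths]


def softmax_cross_entropy(logits, label):
    """Numerically stable -log(softmax(logits)[label])."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum - logits[label]


def masked_sequence_loss(logits, targets, masks):
    """Mask-weighted average of the per-step cross-entropy."""
    total = weight = 0.0
    for seq_logits, seq_targets, seq_mask in zip(logits, targets, masks):
        for step_logits, label, w in zip(seq_logits, seq_targets, seq_mask):
            total += w * softmax_cross_entropy(step_logits, label)
            weight += w
    return total / weight


uniform = [0.0, 0.0, 0.0, 0.0]          # 4-class logits, all equal
confident_wrong = [10.0, 0.0, 0.0, 0.0]  # placed on a padded step only
logits = [[uniform, uniform, confident_wrong],
          [uniform, uniform, uniform]]
targets = [[1, 2, 3], [0, 1, 2]]
masks = sequence_mask([2, 3], max_len=3)

loss = masked_sequence_loss(logits, targets, masks)
print(masks)           # [[1.0, 1.0, 0.0], [1.0, 1.0, 1.0]]
print(round(loss, 4))  # 1.3863
```

The result is `log(4)`, the cross-entropy of a uniform guess over four classes; the badly wrong logits at the padded step of the first sequence have no effect because their mask weight is zero.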

In [19]:
# Build the graph
train_graph = tf.Graph()
# Set the graph to default to ensure that it is ready for training
with train_graph.as_default():
    
    # Load the model inputs    
    input_data, targets, lr, target_sequence_length, \
    max_target_sequence_length, source_sequence_length = get_model_inputs()
    
    # Create the training and inference logits
    training_decoder_output, inference_decoder_output = seq2seq_model(
        input_data, targets, lr, target_sequence_length, max_target_sequence_length, 
        source_sequence_length, len(source_letter_to_int), len(target_letter_to_int),
        encoding_embedding_size, decoding_embedding_size, rnn_size, num_layers)    
    
    # Create tensors for the training logits and inference logits
    training_logits = tf.identity(training_decoder_output.rnn_output, 
                                  'logits')
    inference_logits = tf.identity(inference_decoder_output.sample_id, 
                                   name='predictions')
    
    # Create the weights for sequence_loss
    masks = tf.sequence_mask(target_sequence_length, max_target_sequence_length, 
                             dtype=tf.float32, name='masks')

    with tf.name_scope("optimization"):
        
        # Loss function
        cost = tf.contrib.seq2seq.sequence_loss(
            training_logits, targets, masks)

        # Optimizer
        optimizer = tf.train.AdamOptimizer(lr)

        # Gradient Clipping
        gradients = optimizer.compute_gradients(cost)
        capped_gradients = [(tf.clip_by_value(grad, -5., 5.), var) 
                            for grad, var in gradients if grad is not None]
        train_op = optimizer.apply_gradients(capped_gradients)
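The gradient clipping step at the end of the cell above can be read as a plain list comprehension over `(gradient, variable)` pairs. The sketch below uses scalar gradients and string variable names as stand-ins; `clip_value` and the sample values are assumptions for illustration.

```python
def clip_value(value, low, high):
    """Clamp a scalar into the interval [low, high]."""
    return max(low, min(high, value))


# Stand-ins for the (gradient, variable) pairs an optimizer would return.
# A gradient of None means the variable does not influence the loss.
grads_and_vars = [(-12.0, 'w1'), (0.3, 'w2'), (None, 'unused'), (8.0, 'b1')]

capped = [(clip_value(grad, -5.0, 5.0), var)
          for grad, var in grads_and_vars if grad is not None]

print(capped)  # [(-5.0, 'w1'), (0.3, 'w2'), (5.0, 'b1')]
```

Values inside the range pass through unchanged, values outside are moved to the nearest bound, and pairs without a gradient are dropped before the update is applied.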

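## Get batches

A little processing is needed when generating batches. Here is a simple example with `batch_size = 2`. The source sequences are processed as follows (they are actually integers; characters are shown for simplicity):

<img src="images/source_batch.png" />

The target sequences are processed as follows: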

<img src="images/target_batch.png" />

In [27]:
def pad_sentence_batch(sentence_batch, pad_int):
    """Pad sentences with <PAD> so that each sentence of a batch has the same length"""
    max_sentence = max([len(sentence) for sentence in sentence_batch])
    return [sentence + [pad_int] * (max_sentence - len(sentence)) 
            for sentence in sentence_batch]

In [26]:
def get_batches(targets, sources, batch_size, source_pad_int, target_pad_int):
    """Batch targets, sources, and the lengths of their sentences together"""
    for batch_i in range(0, len(sources)//batch_size):
        start_i = batch_i * batch_size
        sources_batch = sources[start_i:start_i + batch_size]
        targets_batch = targets[start_i:start_i + batch_size]
        pad_sources_batch = np.array(pad_sentence_batch(sources_batch, source_pad_int))
        pad_targets_batch = np.array(pad_sentence_batch(targets_batch, target_pad_int))
        
        # Need the lengths for the _lengths parameters
        pad_targets_lengths = []
        for target in pad_targets_batch:
            pad_targets_lengths.append(len(target))
        
        pad_source_lengths = []
        for source in pad_sources_batch:
            pad_source_lengths.append(len(source))
        
        yield pad_targets_batch, pad_sources_batch, pad_targets_lengths, \
              pad_source_lengths
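Here is a small worked example of the padding step on plain lists, assuming `<PAD>` has id 0. The helper `pad_batch` is a hypothetical variant written for this example: it also returns the lengths before padding, whereas `get_batches` above measures the already padded arrays, so every length it yields equals the batch maximum.

```python
def pad_batch(batch, pad_id):
    """Pad every sequence to the longest one; also return the true lengths."""
    lengths = [len(seq) for seq in batch]
    max_len = max(lengths)
    padded = [seq + [pad_id] * (max_len - len(seq)) for seq in batch]
    return padded, lengths


# batch_size = 2, with sequences of different lengths.
padded, lengths = pad_batch([[5, 6, 9, 23, 23], [18, 13, 11]], pad_id=0)

print(padded)   # [[5, 6, 9, 23, 23], [18, 13, 11, 0, 0]]
print(lengths)  # [5, 3]
```

Padding is done per batch, so a batch is only as wide as its own longest sequence.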

## Train

We are now ready to train the model. If you run out of memory, try reducing `batch_size`.

> **Translator's note**: `checkpoint = "best_model.ckpt"` raises an error, so it was changed to `checkpoint = "./best_model.ckpt"`, following [this answer](https://stackoverflow.com/questions/42134360/tensorflow-valueerror-parent-directory-of-trained-variables-ckpt-doesnt-exis).

In [23]:
# Split data to training and validation sets
train_source = source_letter_ids[batch_size:]
train_target = target_letter_ids[batch_size:]
valid_source = source_letter_ids[:batch_size]
valid_target = target_letter_ids[:batch_size]
(valid_targets_batch, valid_sources_batch, 
 valid_targets_lengths, valid_sources_lengths) = next(
    get_batches(valid_target, valid_source, batch_size,source_letter_to_int['<PAD>'],
                target_letter_to_int['<PAD>']))

display_step = 20 # Check training loss after every 20 batches

checkpoint = "./best_model.ckpt" 
with tf.Session(graph=train_graph) as sess:
    sess.run(tf.global_variables_initializer())
        
    for epoch_i in range(1, epochs+1):
        for batch_i, (targets_batch, sources_batch, targets_lengths, sources_lengths)\
            in enumerate(get_batches(
                train_target, train_source, batch_size,
                source_letter_to_int['<PAD>'],target_letter_to_int['<PAD>'])):
            
            # Training step
            _, loss = sess.run(
                [train_op, cost],
                {input_data: sources_batch,
                 targets: targets_batch,
                 lr: learning_rate,
                 target_sequence_length: targets_lengths,
                 source_sequence_length: sources_lengths})

            # Debug message updating us on the status of the training
            if batch_i % display_step == 0 and batch_i > 0:
                
                # Calculate validation cost
                validation_loss = sess.run(
                [cost],
                {input_data: valid_sources_batch,
                 targets: valid_targets_batch,
                 lr: learning_rate,
                 target_sequence_length: valid_targets_lengths,
                 source_sequence_length: valid_sources_lengths})
                
                print('Epoch {:>3}/{} Batch {:>4}/{} - Loss: {:>6.3f}  - Validation loss: {:>6.3f}'
                      .format(epoch_i,
                              epochs, 
                              batch_i, 
                              len(train_source) // batch_size, 
                              loss, 
                              validation_loss[0]))

    
    
    # Save Model
    saver = tf.train.Saver()
    saver.save(sess, checkpoint)
    print('Model Trained and Saved')

Epoch   1/60 Batch   20/77 - Loss:  2.326  - Validation loss:  2.366
Epoch   1/60 Batch   40/77 - Loss:  2.273  - Validation loss:  2.229
Epoch   1/60 Batch   60/77 - Loss:  1.952  - Validation loss:  2.008
Epoch   2/60 Batch   20/77 - Loss:  1.662  - Validation loss:  1.743
Epoch   2/60 Batch   40/77 - Loss:  1.671  - Validation loss:  1.641
Epoch   2/60 Batch   60/77 - Loss:  1.504  - Validation loss:  1.544
Epoch   3/60 Batch   20/77 - Loss:  1.367  - Validation loss:  1.435
Epoch   3/60 Batch   40/77 - Loss:  1.447  - Validation loss:  1.413
Epoch   3/60 Batch   60/77 - Loss:  1.356  - Validation loss:  1.383
Epoch   4/60 Batch   20/77 - Loss:  1.239  - Validation loss:  1.304
Epoch   4/60 Batch   40/77 - Loss:  1.290  - Validation loss:  1.262
Epoch   4/60 Batch   60/77 - Loss:  1.177  - Validation loss:  1.212
Epoch   5/60 Batch   20/77 - Loss:  1.109  - Validation loss:  1.157
Epoch   5/60 Batch   40/77 - Loss:  1.167  - Validation loss:  1.133
Epoch   5/60 Batch   60/77 - Loss:

Epoch  40/60 Batch   60/77 - Loss:  0.027  - Validation loss:  0.046
Epoch  41/60 Batch   20/77 - Loss:  0.027  - Validation loss:  0.044
Epoch  41/60 Batch   40/77 - Loss:  0.028  - Validation loss:  0.041
Epoch  41/60 Batch   60/77 - Loss:  0.026  - Validation loss:  0.043
Epoch  42/60 Batch   20/77 - Loss:  0.025  - Validation loss:  0.041
Epoch  42/60 Batch   40/77 - Loss:  0.026  - Validation loss:  0.039
Epoch  42/60 Batch   60/77 - Loss:  0.024  - Validation loss:  0.041
Epoch  43/60 Batch   20/77 - Loss:  0.023  - Validation loss:  0.039
Epoch  43/60 Batch   40/77 - Loss:  0.024  - Validation loss:  0.037
Epoch  43/60 Batch   60/77 - Loss:  0.022  - Validation loss:  0.039
Epoch  44/60 Batch   20/77 - Loss:  0.021  - Validation loss:  0.037
Epoch  44/60 Batch   40/77 - Loss:  0.022  - Validation loss:  0.035
Epoch  44/60 Batch   60/77 - Loss:  0.021  - Validation loss:  0.036
Epoch  45/60 Batch   20/77 - Loss:  0.019  - Validation loss:  0.035
Epoch  45/60 Batch   40/77 - Loss:

## Prediction

In [24]:
def source_to_seq(text):
    '''Prepare the text for the model'''
    sequence_length = 7
    return [source_letter_to_int.get(word, source_letter_to_int['<UNK>']) 
            for word in text]+ [source_letter_to_int['<PAD>']]*(
            sequence_length-len(text))


In [25]:
input_sentence = 'hello'
text = source_to_seq(input_sentence)

checkpoint = "./best_model.ckpt"

loaded_graph = tf.Graph()
with tf.Session(graph=loaded_graph) as sess:
    # Load saved model
    loader = tf.train.import_meta_graph(checkpoint + '.meta')
    loader.restore(sess, checkpoint)

    input_data = loaded_graph.get_tensor_by_name('input:0')
    logits = loaded_graph.get_tensor_by_name('predictions:0')
    source_sequence_length = loaded_graph.get_tensor_by_name('source_sequence_length:0')
    target_sequence_length = loaded_graph.get_tensor_by_name('target_sequence_length:0')
    
    #Multiply by batch_size to match the model's input parameters
    answer_logits = sess.run(logits, {input_data: [text]*batch_size, 
                                      target_sequence_length: [len(text)]*batch_size, 
                                      source_sequence_length: [len(text)]*batch_size})[0] 


pad = source_letter_to_int["<PAD>"] 

print('Original Text:', input_sentence)

print('\nSource')
print('  Word Ids:    {}'.format([i for i in text]))
print('  Input Words: {}'.format(" ".join([source_int_to_letter[i] for i in text])))

print('\nTarget')
print('  Word Ids:       {}'.format([i for i in answer_logits if i != pad]))
print('  Response Words: {}'.format(" ".join([target_int_to_letter[i] 
                                              for i in answer_logits if i != pad])))

INFO:tensorflow:Restoring parameters from ./best_model.ckpt
Original Text: hello

Source
  Word Ids:    [29, 16, 24, 24, 21, 0, 0]
  Input Words: h e l l o <PAD> <PAD>

Target
  Word Ids:       [16, 29, 24, 24, 21, 3]
  Response Words: e h l l o <EOS>
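To turn predicted ids back into a plain string, one option is to stop at the first `<EOS>` and skip any `<PAD>` ids. The sketch below does this for the ids printed above; `ids_to_text` is a hypothetical helper, and the partial `int_to_letter` dictionary only contains the entries visible in that output.

```python
def ids_to_text(ids, int_to_letter, pad_id, eos_id):
    """Join predicted ids into a string, stopping at <EOS> and skipping <PAD>."""
    letters = []
    for i in ids:
        if i == eos_id:
            break
        if i != pad_id:
            letters.append(int_to_letter[i])
    return ''.join(letters)


# Partial mapping reconstructed from the printed ids and letters above.
int_to_letter = {0: '<PAD>', 1: '<UNK>', 2: '<GO>', 3: '<EOS>',
                 16: 'e', 21: 'o', 24: 'l', 29: 'h'}

prediction = [16, 29, 24, 24, 21, 3]
print(ids_to_text(prediction, int_to_letter, pad_id=0, eos_id=3))  # ehllo
```

The result, `ehllo`, is `hello` with its characters in sorted order.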
