# Language Translation
Training a sequence to sequence model on a dataset of English and French sentences that can translate new sentences from English to French.
## Get the Data
Since translating the whole language of English to French will take lots of time to train, i have trained it with a small portion of the English corpus.

In [1]:

import helper
import problem_unittests as tests

source_path = 'data/small_vocab_en'
target_path = 'data/small_vocab_fr'
source_text = helper.load_data(source_path)
target_text = helper.load_data(target_path)

## Explore the Data
Playing around with view_sentence_range to view different parts of the data.

In [2]:
view_sentence_range = (0, 10)

import numpy as np

print('Dataset Stats')
print('Roughly the number of unique words: {}'.format(len({word: None for word in source_text.split()})))

sentences = source_text.split('\n')
word_counts = [len(sentence.split()) for sentence in sentences]
print('Number of sentences: {}'.format(len(sentences)))
print('Average number of words in a sentence: {}'.format(np.average(word_counts)))

print()
print('English sentences {} to {}:'.format(*view_sentence_range))
print('\n'.join(source_text.split('\n')[view_sentence_range[0]:view_sentence_range[1]]))
print()
print('French sentences {} to {}:'.format(*view_sentence_range))
print('\n'.join(target_text.split('\n')[view_sentence_range[0]:view_sentence_range[1]]))

Dataset Stats
Roughly the number of unique words: 227
Number of sentences: 137861
Average number of words in a sentence: 13.225277634719028

English sentences 0 to 10:
new jersey is sometimes quiet during autumn , and it is snowy in april .
the united states is usually chilly during july , and it is usually freezing in november .
california is usually quiet during march , and it is usually hot in june .
the united states is sometimes mild during june , and it is cold in september .
your least liked fruit is the grape , but my least liked is the apple .
his favorite fruit is the orange , but my favorite is the grape .
paris is relaxing during december , but it is usually chilly in july .
new jersey is busy during spring , and it is never hot in march .
our least liked fruit is the lemon , but my least liked is the grape .
the united states is sometimes busy during january , and it is sometimes warm in november .

French sentences 0 to 10:
new jersey est parfois calme pendant l' automne 

## Implement Preprocessing Function
### Text to Word Ids
In the function `text_to_ids()`, we'll turn `source_text` and `target_text` from words to ids. Need to add the `<EOS>` word id at the end of `target_text`.  This will help the neural network predict when the sentence should end.

>We can get the `<EOS>` word id by doing:
```python
target_vocab_to_int['<EOS>']
```
WE can get other word ids using `source_vocab_to_int` and `target_vocab_to_int`.

In [3]:
def text_to_ids(source_text, target_text, source_vocab_to_int, target_vocab_to_int):
    """
    Convert source and target text to proper word ids
    :param source_text: String that contains all the source text.
    :param target_text: String that contains all the target text.
    :param source_vocab_to_int: Dictionary to go from the source words to an id
    :param target_vocab_to_int: Dictionary to go from the target words to an id
    :return: A tuple of lists (source_id_text, target_id_text)
    """
    source_id_text =[[source_vocab_to_int[i] for i in x.split()] for x in source_text.split('\n')];
    target_id_text =[[target_vocab_to_int[i] for i in x.split()] for x in target_text.split('\n')];
    
    target_id_text=[r + [target_vocab_to_int['<EOS>']] for r in target_id_text];
    return source_id_text, target_id_text ;


tests.test_text_to_ids(text_to_ids)

Tests Passed


### Preprocess all the data and save it
Running the code cell below will preprocess all the data and save it to file.

In [4]:
helper.preprocess_and_save_data(source_path, target_path, text_to_ids)

# Check Point
The preprocessed data has been saved to disk.

In [5]:

import numpy as np
import helper

(source_int_text, target_int_text), (source_vocab_to_int, target_vocab_to_int), _ = helper.load_preprocess()

### Check the Version of TensorFlow and Access to GPU


In [6]:

from distutils.version import LooseVersion
import warnings
import tensorflow as tf
from tensorflow.python.layers.core import Dense

# Check TensorFlow Version
assert LooseVersion(tf.__version__) >= LooseVersion('1.1'), 'Please use TensorFlow version 1.1 or newer'
print('TensorFlow Version: {}'.format(tf.__version__))

# Check for a GPU
if not tf.test.gpu_device_name():
    warnings.warn('No GPU found. Please use a GPU to train your neural network.')
else:
    print('Default GPU Device: {}'.format(tf.test.gpu_device_name()))

TensorFlow Version: 1.3.0
Default GPU Device: /gpu:0


## Build the Neural Network
We'll build the components necessary to build a Sequence-to-Sequence model by implementing the following functions below:
- `model_inputs`
- `process_decoder_input`
- `encoding_layer`
- `decoding_layer_train`
- `decoding_layer_infer`
- `decoding_layer`
- `seq2seq_model`

### Input
Implemented the `model_inputs()` function to create TF Placeholders for the Neural Network. It should create the following placeholders:

- Input text placeholder named "input" using the TF Placeholder name parameter with rank 2.
- Targets placeholder with rank 2.
- Learning rate placeholder with rank 0.
- Keep probability placeholder named "keep_prob" using the TF Placeholder name parameter with rank 0.
- Target sequence length placeholder named "target_sequence_length" with rank 1
- Max target sequence length tensor named "max_target_len" getting its value from applying tf.reduce_max on the target_sequence_length placeholder. Rank 0.
- Source sequence length placeholder named "source_sequence_length" with rank 1

Returns the placeholders in the following the tuple (input, targets, learning rate, keep probability, target sequence length, max target sequence length, source sequence length)

In [7]:
def model_inputs():
    """
    Create TF Placeholders for input, targets, learning rate, and lengths of source and target sequences.
    :return: Tuple (input, targets, learning rate, keep probability, target sequence length,
    max target sequence length, source sequence length)
    """
    inp= tf.placeholder(tf.int32,[None,None],name='input');
    targets=tf.placeholder(tf.int32,[None,None],name='targets');
    learning_rate=tf.placeholder(tf.float32,name='lr');
    keep_prob = tf.placeholder(tf.float32,name = 'keep_prob');
    target_sequence_length = tf.placeholder(tf.int32, [None,], name='target_sequence_length')
    max_target_sequence_length = tf.reduce_max(target_sequence_length, name='max_target_len')

    #  Computes the maximum of elements across dimensions of a tensor.
    # Reduces input_tensor along the dimensions given in axis. Unless keep_dims is true, the rank of the 
    # tensor is reduced by 1 for each entry in axis. If keep_dims is true, the reduced dimensions are 
    # retained with length 1.
    # If axis has no entries, all dimensions are reduced, and a tensor with a single element is returned.


    source_sequence_length = tf.placeholder(tf.int32, [None], name='source_sequence_length')

    return inp,targets,learning_rate,keep_prob,target_sequence_length,\
           max_target_sequence_length,source_sequence_length;


tests.test_model_inputs(model_inputs)

Object was never used (type <class 'tensorflow.python.framework.ops.Operation'>):
<tf.Operation 'assert_rank_2/Assert/Assert' type=Assert>
If you want to mark it as used call its "mark_used()" method.
It was originally created here:
Object was never used (type <class 'tensorflow.python.framework.ops.Operation'>):
<tf.Operation 'assert_rank_3/Assert/Assert' type=Assert>
If you want to mark it as used call its "mark_used()" method.
It was originally created here:


Tests Passed


### Process Decoder Input
Implemented `process_decoder_input` by removing the last word id from each batch in `target_data` and concat the GO ID to the begining of each batch.

In [8]:
def process_decoder_input(target_data, target_vocab_to_int, batch_size):
    """
    Preprocess target data for encoding
    :param target_data: Target Placehoder
    :param target_vocab_to_int: Dictionary to go from the target words to an id
    :param batch_size: Batch Size
    :return: Preprocessed target data
    """
    endon = tf.strided_slice(target_data,[0,0],[batch_size,-1],[1,1]);
    decoded = tf.concat([tf.fill([batch_size,1],target_vocab_to_int['<GO>']),endon],1);
    return decoded;

tests.test_process_encoding_input(process_decoder_input)

Tests Passed


### Encoding
Implemented `encoding_layer()` to create a Encoder RNN layer:
 * Embed the encoder input using [`tf.contrib.layers.embed_sequence`](https://www.tensorflow.org/api_docs/python/tf/contrib/layers/embed_sequence)
 * Construct a [stacked](https://github.com/tensorflow/tensorflow/blob/6947f65a374ebf29e74bb71e36fd82760056d82c/tensorflow/docs_src/tutorials/recurrent.md#stacking-multiple-lstms) [`tf.contrib.rnn.LSTMCell`](https://www.tensorflow.org/api_docs/python/tf/contrib/rnn/LSTMCell) wrapped in a [`tf.contrib.rnn.DropoutWrapper`](https://www.tensorflow.org/api_docs/python/tf/contrib/rnn/DropoutWrapper)
 * Pass cell and embedded input to [`tf.nn.dynamic_rnn()`](https://www.tensorflow.org/api_docs/python/tf/nn/dynamic_rnn)

In [9]:
from imp import reload
reload(tests)
def make_cell(rnn_size):
    enc_cell = tf.contrib.rnn.LSTMCell(rnn_size,
                                       initializer=tf.random_uniform_initializer(-0.1, 0.1, seed=2))
    return enc_cell

def encoding_layer(rnn_inputs, rnn_size, num_layers, keep_prob, 
                   source_sequence_length, source_vocab_size, 
                   encoding_embedding_size):
    """
    Create encoding layer
    :param rnn_inputs: Inputs for the RNN
    :param rnn_size: RNN Size
    :param num_layers: Number of layers
    :param keep_prob: Dropout keep probability
    :param source_sequence_length: a list of the lengths of each sequence in the batch
    :param source_vocab_size: vocabulary size of source data
    :param encoding_embedding_size: embedding size of source data
    :return: tuple (RNN output, RNN state)
    """
    # TODO: Implement Function
    encoded_embed_input = tf.contrib.layers.embed_sequence(rnn_inputs, source_vocab_size, encoding_embedding_size)
    
        
    encoded_cell = tf.contrib.rnn.MultiRNNCell([make_cell(rnn_size) for _ in range(num_layers)])

    encoded_out, encoded = tf.nn.dynamic_rnn(encoded_cell, encoded_embed_input, sequence_length=source_sequence_length, dtype=tf.float32)

    return encoded_out, encoded
    


tests.test_encoding_layer(encoding_layer)

Tests Passed


### Decoding - Training
Creates a training decoding layer:
* Creates a [`tf.contrib.seq2seq.TrainingHelper`](https://www.tensorflow.org/api_docs/python/tf/contrib/seq2seq/TrainingHelper) 
* Creates a [`tf.contrib.seq2seq.BasicDecoder`](https://www.tensorflow.org/api_docs/python/tf/contrib/seq2seq/BasicDecoder)
* Obtains the decoder outputs from [`tf.contrib.seq2seq.dynamic_decode`](https://www.tensorflow.org/api_docs/python/tf/contrib/seq2seq/dynamic_decode)

In [16]:
def decoding_layer_train(encoder_state, dec_cell, dec_embed_input, 
                         target_sequence_length, max_summary_length, 
                         output_layer, keep_prob):
    """
    Create a decoding layer for training
    :param encoder_state: Encoder State
    :param dec_cell: Decoder RNN Cell
    :param dec_embed_input: Decoder embedded input
    :param target_sequence_length: The lengths of each sequence in the target batch
    :param max_summary_length: The length of the longest sequence in the batch
    :param output_layer: Function to apply the output layer
    :param keep_prob: Dropout keep probability
    :return: BasicDecoderOutput containing training logits and sample_id
    """
#     TrainingHelper reads a sequence of integers from the encoding layer.

    training_helper = tf.contrib.seq2seq.TrainingHelper(inputs=dec_embed_input,sequence_length=target_sequence_length,
                                                        time_major=False)
#     BasicDecoder processes the sequence with the decoding
#     cell, and an output layer, which is a fully connected layer. initial_state comes from 
#     our DynamicAttentionWrapperState -- > figure this out later , lots of stuff on attention based networks 
    training_decoder = tf.contrib.seq2seq.BasicDecoder(dec_cell,training_helper,encoder_state,output_layer) 
        
#     Perform dynamic decoding using the decoder
#     dynamic_decode creates our outputs that will be used for training.
    training_decoder_output, _,_ = tf.contrib.seq2seq.dynamic_decode(training_decoder,
                                                                   impute_finished=True,maximum_iterations=max_summary_length)    
    
    return training_decoder_output



"""
DON'T MODIFY ANYTHING IN THIS CELL THAT IS BELOW THIS LINE
"""
tests.test_decoding_layer_train(decoding_layer_train)

Tests Passed


### Decoding - Inference
Create inference decoder:
* Creates a [`tf.contrib.seq2seq.GreedyEmbeddingHelper`](https://www.tensorflow.org/api_docs/python/tf/contrib/seq2seq/GreedyEmbeddingHelper)
* Creates a [`tf.contrib.seq2seq.BasicDecoder`](https://www.tensorflow.org/api_docs/python/tf/contrib/seq2seq/BasicDecoder)
* Obtains the decoder outputs from [`tf.contrib.seq2seq.dynamic_decode`](https://www.tensorflow.org/api_docs/python/tf/contrib/seq2seq/dynamic_decode)

In [18]:
def decoding_layer_infer(encoder_state, dec_cell, dec_embeddings, start_of_sequence_id,
                         end_of_sequence_id, max_target_sequence_length,
                         vocab_size, output_layer, batch_size, keep_prob):
    """
    Creates a decoding layer for inference
    :param encoder_state: Encoder state
    :param dec_cell: Decoder RNN Cell
    :param dec_embeddings: Decoder embeddings
    :param start_of_sequence_id: GO ID
    :param end_of_sequence_id: EOS Id
    :param max_target_sequence_length: Maximum length of target sequences
    :param vocab_size: Size of decoder/target vocabulary
    :param decoding_scope: TenorFlow Variable Scope for decoding
    :param output_layer: Function to apply the output layer
    :param batch_size: Batch size
    :param keep_prob: Dropout keep probability
    :return: BasicDecoderOutput containing inference logits and sample_id
    """
    # Constructs a tensor by tiling a given tensor.
    # This operation creates a new tensor by replicating input multiples times. 
    # The output tensor's i'th dimension has input.dims(i) * multiples[i] elements, and the values of input are 
    # replicated multiples[i] times along the 'i'th dimension. 
    # For example, tiling [a b c d] by [2] produces [a b c d a b c d].
     
    start_tokens = tf.tile(tf.constant([start_of_sequence_id], dtype=tf.int32), [batch_size], name='start_tokens')

    inference_helper = tf.contrib.seq2seq.GreedyEmbeddingHelper(dec_embeddings,
                                                                start_tokens,
                                                                end_of_sequence_id)

    inference_decoder = tf.contrib.seq2seq.BasicDecoder(dec_cell,
                                                        inference_helper,
                                                        encoder_state,
                                                        output_layer)
        
    inference_decoder_output,_,_ = tf.contrib.seq2seq.dynamic_decode(inference_decoder,
                                                                    impute_finished=True,
                                                                    maximum_iterations=max_target_sequence_length)
         
    return inference_decoder_output




"""
DON'T MODIFY ANYTHING IN THIS CELL THAT IS BELOW THIS LINE
"""
tests.test_decoding_layer_infer(decoding_layer_infer)

Tests Passed


### Build the Decoding Layer
Implement `decoding_layer()` to create a Decoder RNN layer.

* Embeds the target sequences
* Constructs the decoder LSTM cell (just like the constructed encoder cell above)
* Creates an output layer to map the outputs of the decoder to the elements of our vocabulary
* Uses `decoding_layer_train(encoder_state, dec_cell, dec_embed_input, target_sequence_length, max_target_sequence_length, output_layer, keep_prob)` function to get the training logits.
* Uses `decoding_layer_infer(encoder_state, dec_cell, dec_embeddings, start_of_sequence_id, end_of_sequence_id, max_target_sequence_length, vocab_size, output_layer, batch_size, keep_prob)` function to get the inference logits.

Note: We'll need to use [tf.variable_scope](https://www.tensorflow.org/api_docs/python/tf/variable_scope) to share variables between training and inference.

In [19]:
def decoding_layer(dec_input, encoder_state,
                   target_sequence_length, max_target_sequence_length,
                   rnn_size,
                   num_layers, target_vocab_to_int, target_vocab_size,
                   batch_size, keep_prob, decoding_embedding_size):
    """
    Create decoding layer
    :param dec_input: Decoder input
    :param encoder_state: Encoder state
    :param target_sequence_length: The lengths of each sequence in the target batch
    :param max_target_sequence_length: Maximum length of target sequences
    :param rnn_size: RNN Size
    :param num_layers: Number of layers
    :param target_vocab_to_int: Dictionary to go from the target words to an id
    :param target_vocab_size: Size of target vocabulary
    :param batch_size: The size of the batch
    :param keep_prob: Dropout keep probability
    :param decoding_embedding_size: Decoding embedding size
    :return: Tuple of (Training BasicDecoderOutput, Inference BasicDecoderOutput)
    """
    
    # probably the toughest part of the project , need more resources on variable scopes 
    dec_embeddings = tf.Variable(tf.random_uniform([target_vocab_size, decoding_embedding_size]))
    dec_embed_input = tf.nn.embedding_lookup(dec_embeddings, dec_input)
    dec_cell = tf.contrib.rnn.MultiRNNCell([tf.contrib.rnn.LSTMCell(rnn_size) for _ in range(num_layers)])
    
    output_layer = Dense(target_vocab_size, kernel_initializer = tf.truncated_normal_initializer(mean=0.0, stddev=0.1))

    with tf.variable_scope('decode'):
        training_decoder_output = decoding_layer_train(encoder_state, dec_cell, dec_embed_input, target_sequence_length,
                                                       max_target_sequence_length, output_layer, keep_prob)
    
    with tf.variable_scope('decode', reuse=True):
        inference_decoder_output = decoding_layer_infer(encoder_state, dec_cell, dec_embeddings,
                                                        target_vocab_to_int['<GO>'], target_vocab_to_int['<EOS>'],
                                                        max_target_sequence_length, target_vocab_size, output_layer,
                                                        batch_size, keep_prob)

    return training_decoder_output, inference_decoder_output



tests.test_decoding_layer(decoding_layer)

Tests Passed


### Build the Neural Network
Applying the functions implemented above to:

- Encodes the input using  `encoding_layer(rnn_inputs, rnn_size, num_layers, keep_prob,  source_sequence_length, source_vocab_size, encoding_embedding_size)`.
- Processes target data using  `process_decoder_input(target_data, target_vocab_to_int, batch_size)` function.
- Decodes the encoded input using  `decoding_layer(dec_input, enc_state, target_sequence_length, max_target_sentence_length, rnn_size, num_layers, target_vocab_to_int, target_vocab_size, batch_size, keep_prob, dec_embedding_size)` function.

In [20]:
def seq2seq_model(input_data, target_data, keep_prob, batch_size,
                  source_sequence_length, target_sequence_length,
                  max_target_sentence_length,
                  source_vocab_size, target_vocab_size,
                  enc_embedding_size, dec_embedding_size,
                  rnn_size, num_layers, target_vocab_to_int):
    """
    Build the Sequence-to-Sequence part of the neural network
    :param input_data: Input placeholder
    :param target_data: Target placeholder
    :param keep_prob: Dropout keep probability placeholder
    :param batch_size: Batch Size
    :param source_sequence_length: Sequence Lengths of source sequences in the batch
    :param target_sequence_length: Sequence Lengths of target sequences in the batch
    :param source_vocab_size: Source vocabulary size
    :param target_vocab_size: Target vocabulary size
    :param enc_embedding_size: Decoder embedding size
    :param dec_embedding_size: Encoder embedding size
    :param rnn_size: RNN Size
    :param num_layers: Number of layers
    :param target_vocab_to_int: Dictionary to go from the target words to an id
    :return: Tuple of (Training BasicDecoderOutput, Inference BasicDecoderOutput)
    """
    _, encoder_state = encoding_layer(input_data, rnn_size, num_layers, keep_prob, 
                                      source_sequence_length, source_vocab_size, enc_embedding_size)
    
    decode_input = process_decoder_input(target_data, target_vocab_to_int, batch_size)
    
    training_decoder_output, inference_decoder_output = decoding_layer(decode_input, encoder_state, target_sequence_length, 
                                                                       max_target_sentence_length, rnn_size,
                                                                       num_layers, target_vocab_to_int, target_vocab_size,
                                                                       batch_size, keep_prob, dec_embedding_size)
    
    return training_decoder_output, inference_decoder_output


tests.test_seq2seq_model(seq2seq_model)

Tests Passed


## Neural Network Training
### Hyperparameters
Tune the following parameters:

- Set `epochs` to the number of epochs.
- Set `batch_size` to the batch size.
- Set `rnn_size` to the size of the RNNs.
- Set `num_layers` to the number of layers.
- Set `encoding_embedding_size` to the size of the embedding for the encoder.
- Set `decoding_embedding_size` to the size of the embedding for the decoder.
- Set `learning_rate` to the learning rate.
- Set `keep_probability` to the Dropout keep probability
- Set `display_step` to state how many steps between each debug output statement

In [21]:
# Number of Epochs
epochs = 50
# Batch Size
batch_size = 256
# RNN Size
rnn_size = 150
# Number of Layers
num_layers = 3
# Embedding Size
encoding_embedding_size = 300
decoding_embedding_size = 300
# Learning Rate
learning_rate = 0.0004
# Dropout Keep Probability
keep_probability = 0.8
display_step=10

### Build the Graph
Build the graph using the neural network  implemented.

In [22]:

save_path = 'checkpoints/dev'
(source_int_text, target_int_text), (source_vocab_to_int, target_vocab_to_int), _ = helper.load_preprocess()
max_target_sentence_length = max([len(sentence) for sentence in source_int_text])

train_graph = tf.Graph()
with train_graph.as_default():
    input_data, targets, lr, keep_prob, target_sequence_length, max_target_sequence_length, source_sequence_length = model_inputs()

    #sequence_length = tf.placeholder_with_default(max_target_sentence_length, None, name='sequence_length')
    input_shape = tf.shape(input_data)

    train_logits, inference_logits = seq2seq_model(tf.reverse(input_data, [-1]),
                                                   targets,
                                                   keep_prob,
                                                   batch_size,
                                                   source_sequence_length,
                                                   target_sequence_length,
                                                   max_target_sequence_length,
                                                   len(source_vocab_to_int),
                                                   len(target_vocab_to_int),
                                                   encoding_embedding_size,
                                                   decoding_embedding_size,
                                                   rnn_size,
                                                   num_layers,
                                                   target_vocab_to_int)


    training_logits = tf.identity(train_logits.rnn_output, name='logits')
    inference_logits = tf.identity(inference_logits.sample_id, name='predictions')

    masks = tf.sequence_mask(target_sequence_length, max_target_sequence_length, dtype=tf.float32, name='masks')

    with tf.name_scope("optimization"):
        # Loss function
        cost = tf.contrib.seq2seq.sequence_loss(
            training_logits,
            targets,
            masks)

        # Optimizer
        optimizer = tf.train.AdamOptimizer(lr)

        # Gradient Clipping
        gradients = optimizer.compute_gradients(cost)
        capped_gradients = [(tf.clip_by_value(grad, -1., 1.), var) for grad, var in gradients if grad is not None]
        train_op = optimizer.apply_gradients(capped_gradients)


Batch and pad the source and target sequences

In [23]:

def pad_sentence_batch(sentence_batch, pad_int):
    """Pad sentences with <PAD> so that each sentence of a batch has the same length"""
    max_sentence = max([len(sentence) for sentence in sentence_batch])
    return [sentence + [pad_int] * (max_sentence - len(sentence)) for sentence in sentence_batch]


def get_batches(sources, targets, batch_size, source_pad_int, target_pad_int):
    """Batch targets, sources, and the lengths of their sentences together"""
    for batch_i in range(0, len(sources)//batch_size):
        start_i = batch_i * batch_size

        # Slice the right amount for the batch
        sources_batch = sources[start_i:start_i + batch_size]
        targets_batch = targets[start_i:start_i + batch_size]

        # Pad
        pad_sources_batch = np.array(pad_sentence_batch(sources_batch, source_pad_int))
        pad_targets_batch = np.array(pad_sentence_batch(targets_batch, target_pad_int))

        # Need the lengths for the _lengths parameters
        pad_targets_lengths = []
        for target in pad_targets_batch:
            pad_targets_lengths.append(len(target))

        pad_source_lengths = []
        for source in pad_sources_batch:
            pad_source_lengths.append(len(source))

        yield pad_sources_batch, pad_targets_batch, pad_source_lengths, pad_targets_lengths


### Train
Trains the neural network on the preprocessed data. 

In [24]:

def get_accuracy(target, logits):
    """
    Calculate accuracy
    """
    max_seq = max(target.shape[1], logits.shape[1])
    if max_seq - target.shape[1]:
        target = np.pad(
            target,
            [(0,0),(0,max_seq - target.shape[1])],
            'constant')
    if max_seq - logits.shape[1]:
        logits = np.pad(
            logits,
            [(0,0),(0,max_seq - logits.shape[1])],
            'constant')

    return np.mean(np.equal(target, logits))

# Split data to training and validation sets
train_source = source_int_text[batch_size:]
train_target = target_int_text[batch_size:]
valid_source = source_int_text[:batch_size]
valid_target = target_int_text[:batch_size]
(valid_sources_batch, valid_targets_batch, valid_sources_lengths, valid_targets_lengths ) = next(get_batches(valid_source,
                                                                                                             valid_target,
                                                                                                             batch_size,
                                                                                                             source_vocab_to_int['<PAD>'],
                                                                                                             target_vocab_to_int['<PAD>']))                                                                                                  
with tf.Session(graph=train_graph) as sess:
    sess.run(tf.global_variables_initializer())

    for epoch_i in range(epochs):
        for batch_i, (source_batch, target_batch, sources_lengths, targets_lengths) in enumerate(
                get_batches(train_source, train_target, batch_size,
                            source_vocab_to_int['<PAD>'],
                            target_vocab_to_int['<PAD>'])):

            _, loss = sess.run(
                [train_op, cost],
                {input_data: source_batch,
                 targets: target_batch,
                 lr: learning_rate,
                 target_sequence_length: targets_lengths,
                 source_sequence_length: sources_lengths,
                 keep_prob: keep_probability})


            if batch_i % display_step == 0 and batch_i > 0:


                batch_train_logits = sess.run(
                    inference_logits,
                    {input_data: source_batch,
                     source_sequence_length: sources_lengths,
                     target_sequence_length: targets_lengths,
                     keep_prob: 1.0})


                batch_valid_logits = sess.run(
                    inference_logits,
                    {input_data: valid_sources_batch,
                     source_sequence_length: valid_sources_lengths,
                     target_sequence_length: valid_targets_lengths,
                     keep_prob: 1.0})

                train_acc = get_accuracy(target_batch, batch_train_logits)

                valid_acc = get_accuracy(valid_targets_batch, batch_valid_logits)

                print('Epoch {:>3} Batch {:>4}/{} - Train Accuracy: {:>6.4f}, Validation Accuracy: {:>6.4f}, Loss: {:>6.4f}'
                      .format(epoch_i, batch_i, len(source_int_text) // batch_size, train_acc, valid_acc, loss))

    # Save Model
    saver = tf.train.Saver()
    saver.save(sess, save_path)
    print('Model Trained and Saved')

Epoch   0 Batch   10/538 - Train Accuracy: 0.2328, Validation Accuracy: 0.3274, Loss: 4.7898
Epoch   0 Batch   20/538 - Train Accuracy: 0.2991, Validation Accuracy: 0.3469, Loss: 3.9121
Epoch   0 Batch   30/538 - Train Accuracy: 0.3096, Validation Accuracy: 0.3777, Loss: 3.5635
Epoch   0 Batch   40/538 - Train Accuracy: 0.3974, Validation Accuracy: 0.4050, Loss: 2.9520
Epoch   0 Batch   50/538 - Train Accuracy: 0.3676, Validation Accuracy: 0.4242, Loss: 3.0064
Epoch   0 Batch   60/538 - Train Accuracy: 0.3748, Validation Accuracy: 0.4377, Loss: 2.9259
Epoch   0 Batch   70/538 - Train Accuracy: 0.4126, Validation Accuracy: 0.4498, Loss: 2.7187
Epoch   0 Batch   80/538 - Train Accuracy: 0.3791, Validation Accuracy: 0.4460, Loss: 2.7790
Epoch   0 Batch   90/538 - Train Accuracy: 0.4211, Validation Accuracy: 0.4542, Loss: 2.5765
Epoch   0 Batch  100/538 - Train Accuracy: 0.4229, Validation Accuracy: 0.4750, Loss: 2.5810
Epoch   0 Batch  110/538 - Train Accuracy: 0.4275, Validation Accuracy

Epoch   1 Batch  370/538 - Train Accuracy: 0.5504, Validation Accuracy: 0.5779, Loss: 1.0951
Epoch   1 Batch  380/538 - Train Accuracy: 0.5449, Validation Accuracy: 0.5792, Loss: 1.0377
Epoch   1 Batch  390/538 - Train Accuracy: 0.5964, Validation Accuracy: 0.5763, Loss: 0.9854
Epoch   1 Batch  400/538 - Train Accuracy: 0.5727, Validation Accuracy: 0.5765, Loss: 1.0178
Epoch   1 Batch  410/538 - Train Accuracy: 0.5564, Validation Accuracy: 0.5755, Loss: 1.0193
Epoch   1 Batch  420/538 - Train Accuracy: 0.5687, Validation Accuracy: 0.5847, Loss: 1.0229
Epoch   1 Batch  430/538 - Train Accuracy: 0.5684, Validation Accuracy: 0.5847, Loss: 1.0148
Epoch   1 Batch  440/538 - Train Accuracy: 0.5547, Validation Accuracy: 0.5772, Loss: 1.0353
Epoch   1 Batch  450/538 - Train Accuracy: 0.5800, Validation Accuracy: 0.5794, Loss: 0.9998
Epoch   1 Batch  460/538 - Train Accuracy: 0.5536, Validation Accuracy: 0.5758, Loss: 0.9767
Epoch   1 Batch  470/538 - Train Accuracy: 0.5971, Validation Accuracy

Epoch   3 Batch  200/538 - Train Accuracy: 0.6658, Validation Accuracy: 0.6660, Loss: 0.5742
Epoch   3 Batch  210/538 - Train Accuracy: 0.6497, Validation Accuracy: 0.6669, Loss: 0.5658
Epoch   3 Batch  220/538 - Train Accuracy: 0.6442, Validation Accuracy: 0.6419, Loss: 0.5557
Epoch   3 Batch  230/538 - Train Accuracy: 0.6410, Validation Accuracy: 0.6607, Loss: 0.5833
Epoch   3 Batch  240/538 - Train Accuracy: 0.6586, Validation Accuracy: 0.6685, Loss: 0.5823
Epoch   3 Batch  250/538 - Train Accuracy: 0.6623, Validation Accuracy: 0.6529, Loss: 0.5622
Epoch   3 Batch  260/538 - Train Accuracy: 0.6616, Validation Accuracy: 0.6674, Loss: 0.5608
Epoch   3 Batch  270/538 - Train Accuracy: 0.6438, Validation Accuracy: 0.6593, Loss: 0.5679
Epoch   3 Batch  280/538 - Train Accuracy: 0.6886, Validation Accuracy: 0.6577, Loss: 0.5342
Epoch   3 Batch  290/538 - Train Accuracy: 0.6729, Validation Accuracy: 0.6673, Loss: 0.5447
Epoch   3 Batch  300/538 - Train Accuracy: 0.6825, Validation Accuracy

Epoch   5 Batch   30/538 - Train Accuracy: 0.7717, Validation Accuracy: 0.7685, Loss: 0.3487
Epoch   5 Batch   40/538 - Train Accuracy: 0.8089, Validation Accuracy: 0.7793, Loss: 0.2964
Epoch   5 Batch   50/538 - Train Accuracy: 0.8045, Validation Accuracy: 0.7866, Loss: 0.3237
Epoch   5 Batch   60/538 - Train Accuracy: 0.8078, Validation Accuracy: 0.7836, Loss: 0.3178
Epoch   5 Batch   70/538 - Train Accuracy: 0.7839, Validation Accuracy: 0.7782, Loss: 0.3102
Epoch   5 Batch   80/538 - Train Accuracy: 0.8012, Validation Accuracy: 0.7750, Loss: 0.3222
Epoch   5 Batch   90/538 - Train Accuracy: 0.7753, Validation Accuracy: 0.7903, Loss: 0.3237
Epoch   5 Batch  100/538 - Train Accuracy: 0.8158, Validation Accuracy: 0.7923, Loss: 0.2982
Epoch   5 Batch  110/538 - Train Accuracy: 0.8031, Validation Accuracy: 0.7901, Loss: 0.3167
Epoch   5 Batch  120/538 - Train Accuracy: 0.8217, Validation Accuracy: 0.8036, Loss: 0.2925
Epoch   5 Batch  130/538 - Train Accuracy: 0.8278, Validation Accuracy

Epoch   6 Batch  390/538 - Train Accuracy: 0.9133, Validation Accuracy: 0.8801, Loss: 0.1416
Epoch   6 Batch  400/538 - Train Accuracy: 0.9182, Validation Accuracy: 0.8727, Loss: 0.1516
Epoch   6 Batch  410/538 - Train Accuracy: 0.8803, Validation Accuracy: 0.8890, Loss: 0.1632
Epoch   6 Batch  420/538 - Train Accuracy: 0.9197, Validation Accuracy: 0.8679, Loss: 0.1446
Epoch   6 Batch  430/538 - Train Accuracy: 0.8928, Validation Accuracy: 0.8944, Loss: 0.1445
Epoch   6 Batch  440/538 - Train Accuracy: 0.8881, Validation Accuracy: 0.8908, Loss: 0.1671
Epoch   6 Batch  450/538 - Train Accuracy: 0.8752, Validation Accuracy: 0.9002, Loss: 0.1659
Epoch   6 Batch  460/538 - Train Accuracy: 0.8731, Validation Accuracy: 0.8944, Loss: 0.1566
Epoch   6 Batch  470/538 - Train Accuracy: 0.8945, Validation Accuracy: 0.8928, Loss: 0.1414
Epoch   6 Batch  480/538 - Train Accuracy: 0.8953, Validation Accuracy: 0.9016, Loss: 0.1439
Epoch   6 Batch  490/538 - Train Accuracy: 0.8867, Validation Accuracy

Epoch   8 Batch  220/538 - Train Accuracy: 0.9018, Validation Accuracy: 0.9164, Loss: 0.0768
Epoch   8 Batch  230/538 - Train Accuracy: 0.9186, Validation Accuracy: 0.9071, Loss: 0.0813
Epoch   8 Batch  240/538 - Train Accuracy: 0.9174, Validation Accuracy: 0.9142, Loss: 0.0857
Epoch   8 Batch  250/538 - Train Accuracy: 0.9355, Validation Accuracy: 0.9148, Loss: 0.0721
Epoch   8 Batch  260/538 - Train Accuracy: 0.8999, Validation Accuracy: 0.9123, Loss: 0.0847
Epoch   8 Batch  270/538 - Train Accuracy: 0.9252, Validation Accuracy: 0.9128, Loss: 0.0761
Epoch   8 Batch  280/538 - Train Accuracy: 0.9386, Validation Accuracy: 0.9086, Loss: 0.0715
Epoch   8 Batch  290/538 - Train Accuracy: 0.9277, Validation Accuracy: 0.9205, Loss: 0.0720
Epoch   8 Batch  300/538 - Train Accuracy: 0.9020, Validation Accuracy: 0.9144, Loss: 0.0776
Epoch   8 Batch  310/538 - Train Accuracy: 0.9598, Validation Accuracy: 0.9192, Loss: 0.0749
Epoch   8 Batch  320/538 - Train Accuracy: 0.9366, Validation Accuracy

Epoch  10 Batch   50/538 - Train Accuracy: 0.9262, Validation Accuracy: 0.9238, Loss: 0.0485
Epoch  10 Batch   60/538 - Train Accuracy: 0.9523, Validation Accuracy: 0.9363, Loss: 0.0494
Epoch  10 Batch   70/538 - Train Accuracy: 0.9364, Validation Accuracy: 0.9249, Loss: 0.0475
Epoch  10 Batch   80/538 - Train Accuracy: 0.9402, Validation Accuracy: 0.9322, Loss: 0.0538
Epoch  10 Batch   90/538 - Train Accuracy: 0.9345, Validation Accuracy: 0.9350, Loss: 0.0538
Epoch  10 Batch  100/538 - Train Accuracy: 0.9439, Validation Accuracy: 0.9295, Loss: 0.0428
Epoch  10 Batch  110/538 - Train Accuracy: 0.9393, Validation Accuracy: 0.9469, Loss: 0.0487
Epoch  10 Batch  120/538 - Train Accuracy: 0.9506, Validation Accuracy: 0.9368, Loss: 0.0381
Epoch  10 Batch  130/538 - Train Accuracy: 0.9626, Validation Accuracy: 0.9521, Loss: 0.0459
Epoch  10 Batch  140/538 - Train Accuracy: 0.9215, Validation Accuracy: 0.9455, Loss: 0.0629
Epoch  10 Batch  150/538 - Train Accuracy: 0.9412, Validation Accuracy

Epoch  11 Batch  410/538 - Train Accuracy: 0.9580, Validation Accuracy: 0.9480, Loss: 0.0447
Epoch  11 Batch  420/538 - Train Accuracy: 0.9535, Validation Accuracy: 0.9252, Loss: 0.0406
Epoch  11 Batch  430/538 - Train Accuracy: 0.9500, Validation Accuracy: 0.9460, Loss: 0.0329
Epoch  11 Batch  440/538 - Train Accuracy: 0.9477, Validation Accuracy: 0.9544, Loss: 0.0383
Epoch  11 Batch  450/538 - Train Accuracy: 0.9328, Validation Accuracy: 0.9426, Loss: 0.0481
Epoch  11 Batch  460/538 - Train Accuracy: 0.9401, Validation Accuracy: 0.9425, Loss: 0.0409
Epoch  11 Batch  470/538 - Train Accuracy: 0.9591, Validation Accuracy: 0.9409, Loss: 0.0349
Epoch  11 Batch  480/538 - Train Accuracy: 0.9669, Validation Accuracy: 0.9577, Loss: 0.0353
Epoch  11 Batch  490/538 - Train Accuracy: 0.9526, Validation Accuracy: 0.9535, Loss: 0.0361
Epoch  11 Batch  500/538 - Train Accuracy: 0.9654, Validation Accuracy: 0.9322, Loss: 0.0283
Epoch  11 Batch  510/538 - Train Accuracy: 0.9645, Validation Accuracy

Epoch  13 Batch  240/538 - Train Accuracy: 0.9564, Validation Accuracy: 0.9522, Loss: 0.0308
Epoch  13 Batch  250/538 - Train Accuracy: 0.9752, Validation Accuracy: 0.9480, Loss: 0.0263
Epoch  13 Batch  260/538 - Train Accuracy: 0.9645, Validation Accuracy: 0.9529, Loss: 0.0286
Epoch  13 Batch  270/538 - Train Accuracy: 0.9537, Validation Accuracy: 0.9496, Loss: 0.0251
Epoch  13 Batch  280/538 - Train Accuracy: 0.9102, Validation Accuracy: 0.9315, Loss: 0.0964
Epoch  13 Batch  290/538 - Train Accuracy: 0.9184, Validation Accuracy: 0.8924, Loss: 0.0743
Epoch  13 Batch  300/538 - Train Accuracy: 0.9438, Validation Accuracy: 0.9265, Loss: 0.0481
Epoch  13 Batch  310/538 - Train Accuracy: 0.9693, Validation Accuracy: 0.9460, Loss: 0.0430
Epoch  13 Batch  320/538 - Train Accuracy: 0.9622, Validation Accuracy: 0.9604, Loss: 0.0309
Epoch  13 Batch  330/538 - Train Accuracy: 0.9645, Validation Accuracy: 0.9608, Loss: 0.0266
Epoch  13 Batch  340/538 - Train Accuracy: 0.9516, Validation Accuracy

Epoch  15 Batch   70/538 - Train Accuracy: 0.9764, Validation Accuracy: 0.9666, Loss: 0.0212
Epoch  15 Batch   80/538 - Train Accuracy: 0.9617, Validation Accuracy: 0.9467, Loss: 0.0212
Epoch  15 Batch   90/538 - Train Accuracy: 0.9749, Validation Accuracy: 0.9647, Loss: 0.0249
Epoch  15 Batch  100/538 - Train Accuracy: 0.9783, Validation Accuracy: 0.9526, Loss: 0.0179
Epoch  15 Batch  110/538 - Train Accuracy: 0.9711, Validation Accuracy: 0.9643, Loss: 0.0244
Epoch  15 Batch  120/538 - Train Accuracy: 0.9818, Validation Accuracy: 0.9608, Loss: 0.0168
Epoch  15 Batch  130/538 - Train Accuracy: 0.9738, Validation Accuracy: 0.9508, Loss: 0.0217
Epoch  15 Batch  140/538 - Train Accuracy: 0.9557, Validation Accuracy: 0.9595, Loss: 0.0308
Epoch  15 Batch  150/538 - Train Accuracy: 0.9779, Validation Accuracy: 0.9444, Loss: 0.0208
Epoch  15 Batch  160/538 - Train Accuracy: 0.9542, Validation Accuracy: 0.9652, Loss: 0.0206
Epoch  15 Batch  170/538 - Train Accuracy: 0.9714, Validation Accuracy

Epoch  16 Batch  430/538 - Train Accuracy: 0.9686, Validation Accuracy: 0.9648, Loss: 0.0183
Epoch  16 Batch  440/538 - Train Accuracy: 0.9811, Validation Accuracy: 0.9535, Loss: 0.0198
Epoch  16 Batch  450/538 - Train Accuracy: 0.9555, Validation Accuracy: 0.9625, Loss: 0.0250
Epoch  16 Batch  460/538 - Train Accuracy: 0.9756, Validation Accuracy: 0.9567, Loss: 0.0184
Epoch  16 Batch  470/538 - Train Accuracy: 0.9715, Validation Accuracy: 0.9590, Loss: 0.0175
Epoch  16 Batch  480/538 - Train Accuracy: 0.9678, Validation Accuracy: 0.9602, Loss: 0.0178
Epoch  16 Batch  490/538 - Train Accuracy: 0.9635, Validation Accuracy: 0.9583, Loss: 0.0191
Epoch  16 Batch  500/538 - Train Accuracy: 0.9838, Validation Accuracy: 0.9540, Loss: 0.0127
Epoch  16 Batch  510/538 - Train Accuracy: 0.9855, Validation Accuracy: 0.9668, Loss: 0.0168
Epoch  16 Batch  520/538 - Train Accuracy: 0.9809, Validation Accuracy: 0.9657, Loss: 0.0203
Epoch  16 Batch  530/538 - Train Accuracy: 0.9662, Validation Accuracy

Epoch  18 Batch  260/538 - Train Accuracy: 0.9563, Validation Accuracy: 0.9565, Loss: 0.0172
Epoch  18 Batch  270/538 - Train Accuracy: 0.9824, Validation Accuracy: 0.9705, Loss: 0.0140
Epoch  18 Batch  280/538 - Train Accuracy: 0.9860, Validation Accuracy: 0.9602, Loss: 0.0113
Epoch  18 Batch  290/538 - Train Accuracy: 0.9781, Validation Accuracy: 0.9661, Loss: 0.0132
Epoch  18 Batch  300/538 - Train Accuracy: 0.9812, Validation Accuracy: 0.9629, Loss: 0.0164
Epoch  18 Batch  310/538 - Train Accuracy: 0.9840, Validation Accuracy: 0.9597, Loss: 0.0190
Epoch  18 Batch  320/538 - Train Accuracy: 0.9749, Validation Accuracy: 0.9581, Loss: 0.0151
Epoch  18 Batch  330/538 - Train Accuracy: 0.9795, Validation Accuracy: 0.9629, Loss: 0.0154
Epoch  18 Batch  340/538 - Train Accuracy: 0.9723, Validation Accuracy: 0.9595, Loss: 0.0158
Epoch  18 Batch  350/538 - Train Accuracy: 0.9727, Validation Accuracy: 0.9652, Loss: 0.0183
Epoch  18 Batch  360/538 - Train Accuracy: 0.9783, Validation Accuracy

Epoch  20 Batch   90/538 - Train Accuracy: 0.9851, Validation Accuracy: 0.9545, Loss: 0.0134
Epoch  20 Batch  100/538 - Train Accuracy: 0.9854, Validation Accuracy: 0.9707, Loss: 0.0097
Epoch  20 Batch  110/538 - Train Accuracy: 0.9779, Validation Accuracy: 0.9560, Loss: 0.0127
Epoch  20 Batch  120/538 - Train Accuracy: 0.9779, Validation Accuracy: 0.9693, Loss: 0.0080
Epoch  20 Batch  130/538 - Train Accuracy: 0.9807, Validation Accuracy: 0.9680, Loss: 0.0128
Epoch  20 Batch  140/538 - Train Accuracy: 0.9773, Validation Accuracy: 0.9615, Loss: 0.0185
Epoch  20 Batch  150/538 - Train Accuracy: 0.9781, Validation Accuracy: 0.9625, Loss: 0.0145
Epoch  20 Batch  160/538 - Train Accuracy: 0.9814, Validation Accuracy: 0.9595, Loss: 0.0115
Epoch  20 Batch  170/538 - Train Accuracy: 0.9740, Validation Accuracy: 0.9622, Loss: 0.0137
Epoch  20 Batch  180/538 - Train Accuracy: 0.9762, Validation Accuracy: 0.9622, Loss: 0.0118
Epoch  20 Batch  190/538 - Train Accuracy: 0.9760, Validation Accuracy

Epoch  21 Batch  450/538 - Train Accuracy: 0.9715, Validation Accuracy: 0.9732, Loss: 0.0156
Epoch  21 Batch  460/538 - Train Accuracy: 0.9864, Validation Accuracy: 0.9609, Loss: 0.0111
Epoch  21 Batch  470/538 - Train Accuracy: 0.9820, Validation Accuracy: 0.9718, Loss: 0.0104
Epoch  21 Batch  480/538 - Train Accuracy: 0.9829, Validation Accuracy: 0.9668, Loss: 0.0109
Epoch  21 Batch  490/538 - Train Accuracy: 0.9721, Validation Accuracy: 0.9698, Loss: 0.0124
Epoch  21 Batch  500/538 - Train Accuracy: 0.9865, Validation Accuracy: 0.9707, Loss: 0.0079
Epoch  21 Batch  510/538 - Train Accuracy: 0.9920, Validation Accuracy: 0.9650, Loss: 0.0087
Epoch  21 Batch  520/538 - Train Accuracy: 0.9844, Validation Accuracy: 0.9657, Loss: 0.0135
Epoch  21 Batch  530/538 - Train Accuracy: 0.9764, Validation Accuracy: 0.9647, Loss: 0.0122
Epoch  22 Batch   10/538 - Train Accuracy: 0.9852, Validation Accuracy: 0.9599, Loss: 0.0108
Epoch  22 Batch   20/538 - Train Accuracy: 0.9834, Validation Accuracy

Epoch  23 Batch  280/538 - Train Accuracy: 0.9825, Validation Accuracy: 0.9585, Loss: 0.0089
Epoch  23 Batch  290/538 - Train Accuracy: 0.9803, Validation Accuracy: 0.9712, Loss: 0.0077
Epoch  23 Batch  300/538 - Train Accuracy: 0.9853, Validation Accuracy: 0.9725, Loss: 0.0118
Epoch  23 Batch  310/538 - Train Accuracy: 0.9840, Validation Accuracy: 0.9682, Loss: 0.0141
Epoch  23 Batch  320/538 - Train Accuracy: 0.9851, Validation Accuracy: 0.9613, Loss: 0.0087
Epoch  23 Batch  330/538 - Train Accuracy: 0.9840, Validation Accuracy: 0.9652, Loss: 0.0100
Epoch  23 Batch  340/538 - Train Accuracy: 0.9871, Validation Accuracy: 0.9666, Loss: 0.0109
Epoch  23 Batch  350/538 - Train Accuracy: 0.9890, Validation Accuracy: 0.9790, Loss: 0.0126
Epoch  23 Batch  360/538 - Train Accuracy: 0.9820, Validation Accuracy: 0.9730, Loss: 0.0085
Epoch  23 Batch  370/538 - Train Accuracy: 0.9746, Validation Accuracy: 0.9727, Loss: 0.0114
Epoch  23 Batch  380/538 - Train Accuracy: 0.9859, Validation Accuracy

Epoch  25 Batch  110/538 - Train Accuracy: 0.9928, Validation Accuracy: 0.9656, Loss: 0.0075
Epoch  25 Batch  120/538 - Train Accuracy: 0.9887, Validation Accuracy: 0.9721, Loss: 0.0060
Epoch  25 Batch  130/538 - Train Accuracy: 0.9849, Validation Accuracy: 0.9750, Loss: 0.0090
Epoch  25 Batch  140/538 - Train Accuracy: 0.9848, Validation Accuracy: 0.9636, Loss: 0.0108
Epoch  25 Batch  150/538 - Train Accuracy: 0.9885, Validation Accuracy: 0.9737, Loss: 0.0091
Epoch  25 Batch  160/538 - Train Accuracy: 0.9903, Validation Accuracy: 0.9666, Loss: 0.0072
Epoch  25 Batch  170/538 - Train Accuracy: 0.9829, Validation Accuracy: 0.9659, Loss: 0.0096
Epoch  25 Batch  180/538 - Train Accuracy: 0.9877, Validation Accuracy: 0.9670, Loss: 0.0093
Epoch  25 Batch  190/538 - Train Accuracy: 0.9853, Validation Accuracy: 0.9732, Loss: 0.0109
Epoch  25 Batch  200/538 - Train Accuracy: 0.9881, Validation Accuracy: 0.9638, Loss: 0.0062
Epoch  25 Batch  210/538 - Train Accuracy: 0.9825, Validation Accuracy

Epoch  26 Batch  470/538 - Train Accuracy: 0.9836, Validation Accuracy: 0.9693, Loss: 0.0086
Epoch  26 Batch  480/538 - Train Accuracy: 0.9926, Validation Accuracy: 0.9599, Loss: 0.0082
Epoch  26 Batch  490/538 - Train Accuracy: 0.9818, Validation Accuracy: 0.9719, Loss: 0.0073
Epoch  26 Batch  500/538 - Train Accuracy: 0.9918, Validation Accuracy: 0.9718, Loss: 0.0054
Epoch  26 Batch  510/538 - Train Accuracy: 0.9991, Validation Accuracy: 0.9734, Loss: 0.0058
Epoch  26 Batch  520/538 - Train Accuracy: 0.9904, Validation Accuracy: 0.9670, Loss: 0.0116
Epoch  26 Batch  530/538 - Train Accuracy: 0.9885, Validation Accuracy: 0.9579, Loss: 0.0077
Epoch  27 Batch   10/538 - Train Accuracy: 0.9902, Validation Accuracy: 0.9670, Loss: 0.0079
Epoch  27 Batch   20/538 - Train Accuracy: 0.9916, Validation Accuracy: 0.9759, Loss: 0.0105
Epoch  27 Batch   30/538 - Train Accuracy: 0.9846, Validation Accuracy: 0.9673, Loss: 0.0089
Epoch  27 Batch   40/538 - Train Accuracy: 0.9870, Validation Accuracy

Epoch  28 Batch  300/538 - Train Accuracy: 0.9916, Validation Accuracy: 0.9679, Loss: 0.0084
Epoch  28 Batch  310/538 - Train Accuracy: 0.9893, Validation Accuracy: 0.9707, Loss: 0.0100
Epoch  28 Batch  320/538 - Train Accuracy: 0.9901, Validation Accuracy: 0.9769, Loss: 0.0056
Epoch  28 Batch  330/538 - Train Accuracy: 0.9825, Validation Accuracy: 0.9703, Loss: 0.0079
Epoch  28 Batch  340/538 - Train Accuracy: 0.9889, Validation Accuracy: 0.9732, Loss: 0.0069
Epoch  28 Batch  350/538 - Train Accuracy: 0.9918, Validation Accuracy: 0.9652, Loss: 0.0099
Epoch  28 Batch  360/538 - Train Accuracy: 0.9928, Validation Accuracy: 0.9659, Loss: 0.0055
Epoch  28 Batch  370/538 - Train Accuracy: 0.9787, Validation Accuracy: 0.9712, Loss: 0.0093
Epoch  28 Batch  380/538 - Train Accuracy: 0.9943, Validation Accuracy: 0.9675, Loss: 0.0050
Epoch  28 Batch  390/538 - Train Accuracy: 0.9931, Validation Accuracy: 0.9647, Loss: 0.0060
Epoch  28 Batch  400/538 - Train Accuracy: 0.9879, Validation Accuracy

Epoch  30 Batch  130/538 - Train Accuracy: 0.9887, Validation Accuracy: 0.9682, Loss: 0.0061
Epoch  30 Batch  140/538 - Train Accuracy: 0.9846, Validation Accuracy: 0.9668, Loss: 0.0096
Epoch  30 Batch  150/538 - Train Accuracy: 0.9881, Validation Accuracy: 0.9590, Loss: 0.0071
Epoch  30 Batch  160/538 - Train Accuracy: 0.9929, Validation Accuracy: 0.9753, Loss: 0.0053
Epoch  30 Batch  170/538 - Train Accuracy: 0.9859, Validation Accuracy: 0.9751, Loss: 0.0073
Epoch  30 Batch  180/538 - Train Accuracy: 0.9864, Validation Accuracy: 0.9721, Loss: 0.0083
Epoch  30 Batch  190/538 - Train Accuracy: 0.9896, Validation Accuracy: 0.9762, Loss: 0.0112
Epoch  30 Batch  200/538 - Train Accuracy: 0.9939, Validation Accuracy: 0.9725, Loss: 0.0064
Epoch  30 Batch  210/538 - Train Accuracy: 0.9924, Validation Accuracy: 0.9666, Loss: 0.0064
Epoch  30 Batch  220/538 - Train Accuracy: 0.9911, Validation Accuracy: 0.9698, Loss: 0.0065
Epoch  30 Batch  230/538 - Train Accuracy: 0.9824, Validation Accuracy

Epoch  31 Batch  490/538 - Train Accuracy: 0.9875, Validation Accuracy: 0.9767, Loss: 0.0050
Epoch  31 Batch  500/538 - Train Accuracy: 0.9954, Validation Accuracy: 0.9664, Loss: 0.0039
Epoch  31 Batch  510/538 - Train Accuracy: 1.0000, Validation Accuracy: 0.9689, Loss: 0.0048
Epoch  31 Batch  520/538 - Train Accuracy: 0.9871, Validation Accuracy: 0.9654, Loss: 0.0128
Epoch  31 Batch  530/538 - Train Accuracy: 0.9936, Validation Accuracy: 0.9746, Loss: 0.0065
Epoch  32 Batch   10/538 - Train Accuracy: 0.9936, Validation Accuracy: 0.9833, Loss: 0.0044
Epoch  32 Batch   20/538 - Train Accuracy: 0.9924, Validation Accuracy: 0.9753, Loss: 0.0060
Epoch  32 Batch   30/538 - Train Accuracy: 0.9877, Validation Accuracy: 0.9734, Loss: 0.0067
Epoch  32 Batch   40/538 - Train Accuracy: 0.9883, Validation Accuracy: 0.9531, Loss: 0.0062
Epoch  32 Batch   50/538 - Train Accuracy: 0.9938, Validation Accuracy: 0.9668, Loss: 0.0046
Epoch  32 Batch   60/538 - Train Accuracy: 0.9865, Validation Accuracy

Epoch  33 Batch  320/538 - Train Accuracy: 0.9959, Validation Accuracy: 0.9771, Loss: 0.0037
Epoch  33 Batch  330/538 - Train Accuracy: 0.9907, Validation Accuracy: 0.9739, Loss: 0.0060
Epoch  33 Batch  340/538 - Train Accuracy: 0.9941, Validation Accuracy: 0.9723, Loss: 0.0050
Epoch  33 Batch  350/538 - Train Accuracy: 0.9927, Validation Accuracy: 0.9680, Loss: 0.0050
Epoch  33 Batch  360/538 - Train Accuracy: 1.0000, Validation Accuracy: 0.9792, Loss: 0.0039
Epoch  33 Batch  370/538 - Train Accuracy: 0.9918, Validation Accuracy: 0.9787, Loss: 0.0043
Epoch  33 Batch  380/538 - Train Accuracy: 0.9951, Validation Accuracy: 0.9721, Loss: 0.0034
Epoch  33 Batch  390/538 - Train Accuracy: 0.9924, Validation Accuracy: 0.9723, Loss: 0.0038
Epoch  33 Batch  400/538 - Train Accuracy: 0.9933, Validation Accuracy: 0.9782, Loss: 0.0041
Epoch  33 Batch  410/538 - Train Accuracy: 0.9971, Validation Accuracy: 0.9718, Loss: 0.0034
Epoch  33 Batch  420/538 - Train Accuracy: 0.9896, Validation Accuracy

Epoch  35 Batch  150/538 - Train Accuracy: 0.9996, Validation Accuracy: 0.9728, Loss: 0.0042
Epoch  35 Batch  160/538 - Train Accuracy: 0.9937, Validation Accuracy: 0.9805, Loss: 0.0041
Epoch  35 Batch  170/538 - Train Accuracy: 0.9907, Validation Accuracy: 0.9735, Loss: 0.0059
Epoch  35 Batch  180/538 - Train Accuracy: 0.9933, Validation Accuracy: 0.9735, Loss: 0.0048
Epoch  35 Batch  190/538 - Train Accuracy: 0.9933, Validation Accuracy: 0.9805, Loss: 0.0057
Epoch  35 Batch  200/538 - Train Accuracy: 0.9969, Validation Accuracy: 0.9732, Loss: 0.0032
Epoch  35 Batch  210/538 - Train Accuracy: 0.9957, Validation Accuracy: 0.9819, Loss: 0.0045
Epoch  35 Batch  220/538 - Train Accuracy: 0.9931, Validation Accuracy: 0.9702, Loss: 0.0042
Epoch  35 Batch  230/538 - Train Accuracy: 0.9930, Validation Accuracy: 0.9712, Loss: 0.0055
Epoch  35 Batch  240/538 - Train Accuracy: 0.9928, Validation Accuracy: 0.9712, Loss: 0.0047
Epoch  35 Batch  250/538 - Train Accuracy: 0.9973, Validation Accuracy

Epoch  36 Batch  510/538 - Train Accuracy: 0.9974, Validation Accuracy: 0.9775, Loss: 0.0028
Epoch  36 Batch  520/538 - Train Accuracy: 0.9934, Validation Accuracy: 0.9718, Loss: 0.0072
Epoch  36 Batch  530/538 - Train Accuracy: 0.9938, Validation Accuracy: 0.9838, Loss: 0.0037
Epoch  37 Batch   10/538 - Train Accuracy: 0.9943, Validation Accuracy: 0.9806, Loss: 0.0037
Epoch  37 Batch   20/538 - Train Accuracy: 0.9926, Validation Accuracy: 0.9787, Loss: 0.0041
Epoch  37 Batch   30/538 - Train Accuracy: 0.9953, Validation Accuracy: 0.9680, Loss: 0.0052
Epoch  37 Batch   40/538 - Train Accuracy: 0.9918, Validation Accuracy: 0.9670, Loss: 0.0036
Epoch  37 Batch   50/538 - Train Accuracy: 0.9947, Validation Accuracy: 0.9725, Loss: 0.0053
Epoch  37 Batch   60/538 - Train Accuracy: 0.9865, Validation Accuracy: 0.9849, Loss: 0.0068
Epoch  37 Batch   70/538 - Train Accuracy: 0.9924, Validation Accuracy: 0.9703, Loss: 0.0042
Epoch  37 Batch   80/538 - Train Accuracy: 1.0000, Validation Accuracy

Epoch  38 Batch  340/538 - Train Accuracy: 0.9961, Validation Accuracy: 0.9771, Loss: 0.0026
Epoch  38 Batch  350/538 - Train Accuracy: 0.9978, Validation Accuracy: 0.9787, Loss: 0.0037
Epoch  38 Batch  360/538 - Train Accuracy: 1.0000, Validation Accuracy: 0.9812, Loss: 0.0030
Epoch  38 Batch  370/538 - Train Accuracy: 0.9967, Validation Accuracy: 0.9789, Loss: 0.0033
Epoch  38 Batch  380/538 - Train Accuracy: 0.9982, Validation Accuracy: 0.9686, Loss: 0.0027
Epoch  38 Batch  390/538 - Train Accuracy: 0.9967, Validation Accuracy: 0.9782, Loss: 0.0029
Epoch  38 Batch  400/538 - Train Accuracy: 0.9974, Validation Accuracy: 0.9725, Loss: 0.0042
Epoch  38 Batch  410/538 - Train Accuracy: 0.9967, Validation Accuracy: 0.9691, Loss: 0.0030
Epoch  38 Batch  420/538 - Train Accuracy: 0.9885, Validation Accuracy: 0.9840, Loss: 0.0073
Epoch  38 Batch  430/538 - Train Accuracy: 0.9973, Validation Accuracy: 0.9789, Loss: 0.0036
Epoch  38 Batch  440/538 - Train Accuracy: 0.9953, Validation Accuracy

Epoch  40 Batch  170/538 - Train Accuracy: 0.9929, Validation Accuracy: 0.9798, Loss: 0.0050
Epoch  40 Batch  180/538 - Train Accuracy: 0.9918, Validation Accuracy: 0.9737, Loss: 0.0049
Epoch  40 Batch  190/538 - Train Accuracy: 0.9980, Validation Accuracy: 0.9775, Loss: 0.0035
Epoch  40 Batch  200/538 - Train Accuracy: 0.9977, Validation Accuracy: 0.9782, Loss: 0.0023
Epoch  40 Batch  210/538 - Train Accuracy: 0.9965, Validation Accuracy: 0.9808, Loss: 0.0041
Epoch  40 Batch  220/538 - Train Accuracy: 0.9993, Validation Accuracy: 0.9734, Loss: 0.0028
Epoch  40 Batch  230/538 - Train Accuracy: 0.9926, Validation Accuracy: 0.9753, Loss: 0.0033
Epoch  40 Batch  240/538 - Train Accuracy: 0.9973, Validation Accuracy: 0.9707, Loss: 0.0032
Epoch  40 Batch  250/538 - Train Accuracy: 0.9934, Validation Accuracy: 0.9705, Loss: 0.0037
Epoch  40 Batch  260/538 - Train Accuracy: 0.9892, Validation Accuracy: 0.9748, Loss: 0.0041
Epoch  40 Batch  270/538 - Train Accuracy: 1.0000, Validation Accuracy

Epoch  41 Batch  530/538 - Train Accuracy: 1.0000, Validation Accuracy: 0.9695, Loss: 0.0026
Epoch  42 Batch   10/538 - Train Accuracy: 1.0000, Validation Accuracy: 0.9688, Loss: 0.0023
Epoch  42 Batch   20/538 - Train Accuracy: 0.9961, Validation Accuracy: 0.9805, Loss: 0.0041
Epoch  42 Batch   30/538 - Train Accuracy: 0.9973, Validation Accuracy: 0.9764, Loss: 0.0022
Epoch  42 Batch   40/538 - Train Accuracy: 0.9968, Validation Accuracy: 0.9753, Loss: 0.0025
Epoch  42 Batch   50/538 - Train Accuracy: 0.9971, Validation Accuracy: 0.9767, Loss: 0.0027
Epoch  42 Batch   60/538 - Train Accuracy: 0.9975, Validation Accuracy: 0.9743, Loss: 0.0039
Epoch  42 Batch   70/538 - Train Accuracy: 1.0000, Validation Accuracy: 0.9755, Loss: 0.0016
Epoch  42 Batch   80/538 - Train Accuracy: 0.9930, Validation Accuracy: 0.9796, Loss: 0.0023
Epoch  42 Batch   90/538 - Train Accuracy: 0.9937, Validation Accuracy: 0.9748, Loss: 0.0031
Epoch  42 Batch  100/538 - Train Accuracy: 1.0000, Validation Accuracy

Epoch  43 Batch  360/538 - Train Accuracy: 0.9988, Validation Accuracy: 0.9801, Loss: 0.0017
Epoch  43 Batch  370/538 - Train Accuracy: 0.9967, Validation Accuracy: 0.9771, Loss: 0.0029
Epoch  43 Batch  380/538 - Train Accuracy: 0.9967, Validation Accuracy: 0.9764, Loss: 0.0021
Epoch  43 Batch  390/538 - Train Accuracy: 0.9993, Validation Accuracy: 0.9821, Loss: 0.0021
Epoch  43 Batch  400/538 - Train Accuracy: 0.9885, Validation Accuracy: 0.9808, Loss: 0.0027
Epoch  43 Batch  410/538 - Train Accuracy: 0.9973, Validation Accuracy: 0.9831, Loss: 0.0032
Epoch  43 Batch  420/538 - Train Accuracy: 0.9971, Validation Accuracy: 0.9835, Loss: 0.0051
Epoch  43 Batch  430/538 - Train Accuracy: 0.9896, Validation Accuracy: 0.9796, Loss: 0.0051
Epoch  43 Batch  440/538 - Train Accuracy: 0.9963, Validation Accuracy: 0.9698, Loss: 0.0075
Epoch  43 Batch  450/538 - Train Accuracy: 0.9980, Validation Accuracy: 0.9806, Loss: 0.0047
Epoch  43 Batch  460/538 - Train Accuracy: 0.9994, Validation Accuracy

Epoch  45 Batch  190/538 - Train Accuracy: 0.9980, Validation Accuracy: 0.9727, Loss: 0.0033
Epoch  45 Batch  200/538 - Train Accuracy: 1.0000, Validation Accuracy: 0.9805, Loss: 0.0017
Epoch  45 Batch  210/538 - Train Accuracy: 0.9957, Validation Accuracy: 0.9798, Loss: 0.0029
Epoch  45 Batch  220/538 - Train Accuracy: 0.9993, Validation Accuracy: 0.9785, Loss: 0.0019
Epoch  45 Batch  230/538 - Train Accuracy: 0.9945, Validation Accuracy: 0.9806, Loss: 0.0022
Epoch  45 Batch  240/538 - Train Accuracy: 1.0000, Validation Accuracy: 0.9798, Loss: 0.0023
Epoch  45 Batch  250/538 - Train Accuracy: 0.9965, Validation Accuracy: 0.9812, Loss: 0.0021
Epoch  45 Batch  260/538 - Train Accuracy: 0.9942, Validation Accuracy: 0.9847, Loss: 0.0037
Epoch  45 Batch  270/538 - Train Accuracy: 0.9998, Validation Accuracy: 0.9767, Loss: 0.0017
Epoch  45 Batch  280/538 - Train Accuracy: 1.0000, Validation Accuracy: 0.9782, Loss: 0.0026
Epoch  45 Batch  290/538 - Train Accuracy: 0.9957, Validation Accuracy

Epoch  47 Batch   20/538 - Train Accuracy: 1.0000, Validation Accuracy: 0.9792, Loss: 0.0026
Epoch  47 Batch   30/538 - Train Accuracy: 1.0000, Validation Accuracy: 0.9698, Loss: 0.0023
Epoch  47 Batch   40/538 - Train Accuracy: 0.9973, Validation Accuracy: 0.9769, Loss: 0.0018
Epoch  47 Batch   50/538 - Train Accuracy: 0.9982, Validation Accuracy: 0.9741, Loss: 0.0022
Epoch  47 Batch   60/538 - Train Accuracy: 0.9947, Validation Accuracy: 0.9689, Loss: 0.0040
Epoch  47 Batch   70/538 - Train Accuracy: 1.0000, Validation Accuracy: 0.9792, Loss: 0.0018
Epoch  47 Batch   80/538 - Train Accuracy: 1.0000, Validation Accuracy: 0.9753, Loss: 0.0016
Epoch  47 Batch   90/538 - Train Accuracy: 1.0000, Validation Accuracy: 0.9732, Loss: 0.0018
Epoch  47 Batch  100/538 - Train Accuracy: 1.0000, Validation Accuracy: 0.9771, Loss: 0.0008
Epoch  47 Batch  110/538 - Train Accuracy: 1.0000, Validation Accuracy: 0.9714, Loss: 0.0012
Epoch  47 Batch  120/538 - Train Accuracy: 0.9971, Validation Accuracy

Epoch  48 Batch  380/538 - Train Accuracy: 0.9994, Validation Accuracy: 0.9757, Loss: 0.0025
Epoch  48 Batch  390/538 - Train Accuracy: 1.0000, Validation Accuracy: 0.9805, Loss: 0.0011
Epoch  48 Batch  400/538 - Train Accuracy: 0.9948, Validation Accuracy: 0.9775, Loss: 0.0020
Epoch  48 Batch  410/538 - Train Accuracy: 1.0000, Validation Accuracy: 0.9796, Loss: 0.0014
Epoch  48 Batch  420/538 - Train Accuracy: 0.9982, Validation Accuracy: 0.9846, Loss: 0.0028
Epoch  48 Batch  430/538 - Train Accuracy: 0.9973, Validation Accuracy: 0.9821, Loss: 0.0021
Epoch  48 Batch  440/538 - Train Accuracy: 0.9953, Validation Accuracy: 0.9638, Loss: 0.0022
Epoch  48 Batch  450/538 - Train Accuracy: 0.9968, Validation Accuracy: 0.9828, Loss: 0.0026
Epoch  48 Batch  460/538 - Train Accuracy: 0.9998, Validation Accuracy: 0.9819, Loss: 0.0024
Epoch  48 Batch  470/538 - Train Accuracy: 0.9985, Validation Accuracy: 0.9803, Loss: 0.0022
Epoch  48 Batch  480/538 - Train Accuracy: 0.9952, Validation Accuracy

### Save Parameters
Save the `batch_size` and `save_path` parameters for inference.

In [25]:

# Save parameters for checkpoint
helper.save_params(save_path)

# Checkpoint

In [26]:

import tensorflow as tf
import numpy as np
import helper
import problem_unittests as tests

_, (source_vocab_to_int, target_vocab_to_int), (source_int_to_vocab, target_int_to_vocab) = helper.load_preprocess()
load_path = helper.load_params()

## Sentence to Sequence
To feed a sentence into the model for translation, first need to preprocess it.  This implements the function `sentence_to_seq()` to preprocess new sentences.

- Converts the sentence to lowercase
- Converts words into ids using `vocab_to_int`
- Converts words not in the vocabulary, to the `<UNK>` word id.

In [27]:
def sentence_to_seq(sentence, vocab_to_int):
    sentence = sentence.lower()
    
    ids = [vocab_to_int.get(word,vocab_to_int['<UNK>']) for word in sentence.split(" ")]

    return ids


tests.test_sentence_to_seq(sentence_to_seq)

Tests Passed


## Translate
This will translate `translate_sentence` from English to French.

In [36]:
translate_sentence = 'He likes chocolate.'


translate_sentence = sentence_to_seq(translate_sentence, source_vocab_to_int)

loaded_graph = tf.Graph()
with tf.Session(graph=loaded_graph) as sess:
    # Load saved model
    loader = tf.train.import_meta_graph(load_path + '.meta')
    loader.restore(sess, load_path)

    input_data = loaded_graph.get_tensor_by_name('input:0')
    logits = loaded_graph.get_tensor_by_name('predictions:0')
    target_sequence_length = loaded_graph.get_tensor_by_name('target_sequence_length:0')
    source_sequence_length = loaded_graph.get_tensor_by_name('source_sequence_length:0')
    keep_prob = loaded_graph.get_tensor_by_name('keep_prob:0')

    translate_logits = sess.run(logits, {input_data: [translate_sentence]*batch_size,
                                         target_sequence_length: [len(translate_sentence)*2]*batch_size,
                                         source_sequence_length: [len(translate_sentence)]*batch_size,
                                         keep_prob: 1.0})[0]

print('Input')
print('  Word Ids:      {}'.format([i for i in translate_sentence]))
print('  English Words: {}'.format([source_int_to_vocab[i] for i in translate_sentence]))

print('\nPrediction')
print('  Word Ids:      {}'.format([i for i in translate_logits]))
print('  French Words: {}'.format(" ".join([target_int_to_vocab[i] for i in translate_logits])))


INFO:tensorflow:Restoring parameters from checkpoints/dev
Input
  Word Ids:      [119, 158, 2]
  English Words: ['he', 'likes', '<UNK>']

Prediction
  Word Ids:      [296, 9, 21, 308, 301, 305]
  French Words: leur pendant , et il .


## Imperfect Translation
Noticed that some sentences translate better than others.  Since the dataset we're using only has a vocabulary of 227 English words of the thousands that we use, we're only going to see good results using these words.

## TODO
Train on the [WMT10 French-English corpus](http://www.statmt.org/wmt10/training-giga-fren.tar).  This dataset has more vocabulary and richer in topics discussed. Might take a lot of time to train .
