# Language Translation

In this notebook, we train a sequence-to-sequence model on a dataset of paired English and French sentences so that it can translate new English sentences into French.

We train on a small portion of the English and French corpus, since training on the entire corpus would take a long time.

* [Load and Explore the Data](#part1)
* [Preprocess the Data and Save](#part2)
* [Create the Neural Network](#part3)
* [Train the Neural Network](#part4)
* [Test the Sequence-to-Sequence Model](#part5)

In [None]:
# Import required libraries

import numpy as np
import helper

### Check the Version of TensorFlow and Access to GPU

In [11]:
from distutils.version import LooseVersion
import warnings
import tensorflow as tf

# Check TensorFlow Version
assert LooseVersion(tf.__version__) >= LooseVersion('1.1'), 'Please use TensorFlow version 1.1 or newer'
print('TensorFlow Version: {}'.format(tf.__version__))

# Check for a GPU
if not tf.test.gpu_device_name():
    warnings.warn('No GPU found. Please use a GPU to train your neural network.')
else:
    print('Default GPU Device: {}'.format(tf.test.gpu_device_name()))

TensorFlow Version: 1.3.0




---
<a id='part1'></a>

## Load and Explore the Data

In [1]:
source_path = 'data/small_vocab_en'
target_path = 'data/small_vocab_fr'
source_text = helper.load_data(source_path)
target_text = helper.load_data(target_path)

In [2]:
view_sentence_range = (0, 10)

print('Approximate number of unique words: {}'.format(len({word: None for word in source_text.split()})))

sentences = source_text.split('\n')
word_counts = [len(sentence.split()) for sentence in sentences]
print('Number of sentences: {}'.format(len(sentences)))
print('Average number of words in a sentence: {}'.format(np.average(word_counts)))

print()
print('English sentences {} to {}:'.format(*view_sentence_range))
print('\n'.join(source_text.split('\n')[view_sentence_range[0]:view_sentence_range[1]]))
print()
print('French sentences {} to {}:'.format(*view_sentence_range))
print('\n'.join(target_text.split('\n')[view_sentence_range[0]:view_sentence_range[1]]))

Dataset Stats
Roughly the number of unique words: 227
Number of sentences: 137861
Average number of words in a sentence: 13.225277634719028

English sentences 0 to 10:
new jersey is sometimes quiet during autumn , and it is snowy in april .
the united states is usually chilly during july , and it is usually freezing in november .
california is usually quiet during march , and it is usually hot in june .
the united states is sometimes mild during june , and it is cold in september .
your least liked fruit is the grape , but my least liked is the apple .
his favorite fruit is the orange , but my favorite is the grape .
paris is relaxing during december , but it is usually chilly in july .
new jersey is busy during spring , and it is never hot in march .
our least liked fruit is the lemon , but my least liked is the grape .
the united states is sometimes busy during january , and it is sometimes warm in november .

French sentences 0 to 10:
new jersey est parfois calme pendant l' automne 

---
<a id='part2'></a>

## Preprocess the Data and Save

In [None]:
def text_to_ids(source_text, target_text, source_vocab_to_int, target_vocab_to_int):
    """
    Convert source and target text to word ids. The <EOS> word id is appended to the end of each
    target sentence to help the neural network predict when the sentence should end.
    :param source_text: String that contains all the source text.
    :param target_text: String that contains all the target text.
    :param source_vocab_to_int: Dictionary to go from the source words to an id
    :param target_vocab_to_int: Dictionary to go from the target words to an id
    :return: A tuple of lists (source_id_text, target_id_text)
    """
    # Lowercase the text, split it into sentences and words, then map each word to its id
    source_sentences = source_text.lower().split('\n')
    source_text_split = [sentence.split() for sentence in source_sentences]
    source_ids = [[source_vocab_to_int[word] for word in sentence] for sentence in source_text_split]
    target_sentences = target_text.lower().split('\n')
    target_text_split = [sentence.split() + ['<EOS>'] for sentence in target_sentences]
    target_ids = [[target_vocab_to_int[word] for word in sentence] for sentence in target_text_split]
    return source_ids, target_ids

helper.preprocess_and_save_data(source_path, target_path, text_to_ids)
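
To make the conversion concrete, here is a minimal sketch of the same idea on two toy sentence pairs. The vocabularies and the function name `text_to_ids_sketch` are invented for illustration; the real vocabularies are built by `helper.py`, and their ids may differ.

```python
# Toy vocabularies, invented for this example only.
source_vocab = {'<PAD>': 0, '<EOS>': 1, '<UNK>': 2, '<GO>': 3,
                'the': 4, 'grape': 5, 'is': 6, 'green': 7}
target_vocab = {'<PAD>': 0, '<EOS>': 1, '<UNK>': 2, '<GO>': 3,
                'le': 4, 'raisin': 5, 'est': 6, 'vert': 7}


def text_to_ids_sketch(source_text, target_text, source_vocab, target_vocab):
    """Map each word to its id and append the <EOS> id to every target sentence."""
    source_ids = [[source_vocab[word] for word in line.lower().split()]
                  for line in source_text.split('\n')]
    target_ids = [[target_vocab[word] for word in line.lower().split()] + [target_vocab['<EOS>']]
                  for line in target_text.split('\n')]
    return source_ids, target_ids


source_ids, target_ids = text_to_ids_sketch(
    'The grape is green\nthe grape',
    'Le raisin est vert\nle raisin',
    source_vocab, target_vocab)

print(source_ids)  # [[4, 5, 6, 7], [4, 5]]
print(target_ids)  # [[4, 5, 6, 7, 1], [4, 5, 1]]
```

Only the target sentences receive the `<EOS>` id, because only the decoder has to learn when to stop.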

### Checkpoint

In [5]:
import numpy as np
import helper
import tensorflow as tf
from tensorflow.python.layers.core import Dense

(source_int_text, target_int_text), (source_vocab_to_int, target_vocab_to_int), _ = helper.load_preprocess()

---
<a id='part3'></a>

## Create the Neural Network
You'll build the components of a Sequence-to-Sequence model by implementing the following functions:
- `model_inputs`
- `process_decoder_input`
- `encoding_layer`
- `decoding_layer_train`
- `decoding_layer_infer`
- `decoding_layer`
- `seq2seq_model`

### Input

In [1]:
def model_inputs():
    """
    Create TF Placeholders for input, targets, learning rate, and lengths of source and target sequences.
    :return: Tuple (input, targets, learning rate, keep probability, target sequence length,
    max target sequence length, source sequence length)
    """
    # Batches and sequences vary in size, so the id tensors are declared with shape [None, None]
    inputs = tf.placeholder(tf.int32, [None, None], name='input')
    targets = tf.placeholder(tf.int32, [None, None], name='targets')
    learning_rate = tf.placeholder(tf.float32, [], name='learning_rate')
    keep_prob = tf.placeholder(tf.float32, [], name='keep_prob')
    target_sequence_length = tf.placeholder(tf.int32, [None], name='target_sequence_length')
    max_target_len = tf.reduce_max(target_sequence_length)
    source_sequence_length = tf.placeholder(tf.int32, [None], name='source_sequence_length')
    return inputs, targets, learning_rate, keep_prob, target_sequence_length, max_target_len, source_sequence_length


def process_decoder_input(target_data, target_vocab_to_int, batch_size):
    """
    Preprocess target data for decoding by removing the last word id from each sequence in target_data
    and concatenating the GO ID to the beginning of each sequence.
    :param target_data: Target Placeholder
    :param target_vocab_to_int: Dictionary to go from the target words to an id
    :param batch_size: Batch Size
    :return: Preprocessed target data
    """
    # Build a [batch_size, 1] column of <GO> ids and prepend it to the targets minus their last id
    go_id = tf.constant(target_vocab_to_int['<GO>'], shape=(batch_size,1), dtype=tf.int32)
    processed_data = tf.concat([go_id, target_data[:,:-1]], axis=1)
    return processed_data
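
The shift performed by `process_decoder_input` is easier to see on plain lists. The sketch below mirrors the slicing and concatenation without TensorFlow; the id values (`<GO>` as 3, `<EOS>` as 1, `<PAD>` as 0) are assumptions for the example, and `process_decoder_input_sketch` is a hypothetical name.

```python
GO_ID = 3  # assumed id of <GO>, for illustration only


def process_decoder_input_sketch(target_batch, go_id):
    """Drop the last id of every sequence and put the <GO> id in front."""
    return [[go_id] + sequence[:-1] for sequence in target_batch]


# Two padded target sequences (1 = <EOS>, 0 = <PAD> in this toy setup).
target_batch = [[4, 5, 6, 7, 1],
                [4, 5, 1, 0, 0]]

decoder_input = process_decoder_input_sketch(target_batch, GO_ID)
print(decoder_input)  # [[3, 4, 5, 6, 7], [3, 4, 5, 1, 0]]
```

At each time step the decoder therefore sees the previous target word as input and is trained to predict the current one.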

### Encoding
Implement `encoding_layer()` to create an encoder RNN layer:
 * Embed the encoder input using [`tf.contrib.layers.embed_sequence`](https://www.tensorflow.org/api_docs/python/tf/contrib/layers/embed_sequence)
 * Construct a [stacked](https://github.com/tensorflow/tensorflow/blob/6947f65a374ebf29e74bb71e36fd82760056d82c/tensorflow/docs_src/tutorials/recurrent.md#stacking-multiple-lstms) [`tf.contrib.rnn.LSTMCell`](https://www.tensorflow.org/api_docs/python/tf/contrib/rnn/LSTMCell) wrapped in a [`tf.contrib.rnn.DropoutWrapper`](https://www.tensorflow.org/api_docs/python/tf/contrib/rnn/DropoutWrapper)
 * Pass cell and embedded input to [`tf.nn.dynamic_rnn()`](https://www.tensorflow.org/api_docs/python/tf/nn/dynamic_rnn)

In [4]:
def encoding_layer(rnn_inputs, rnn_size, num_layers, keep_prob, 
                   source_sequence_length, source_vocab_size, 
                   encoding_embedding_size):
    """
    Create encoding layer
    :param rnn_inputs: Inputs for the RNN
    :param rnn_size: RNN Size
    :param num_layers: Number of layers
    :param keep_prob: Dropout keep probability
    :param source_sequence_length: a list of the lengths of each sequence in the batch
    :param source_vocab_size: vocabulary size of source data
    :param encoding_embedding_size: embedding size of source data
    :return: tuple (RNN output, RNN state)
    """
    # Embed the encoder input
    embed = tf.contrib.layers.embed_sequence(rnn_inputs, vocab_size=source_vocab_size,
                                             embed_dim=encoding_embedding_size)

    # RNN cell
    def make_cell(rnn_size, keep_prob):
        # LSTM cell
        lstm = tf.contrib.rnn.LSTMCell(rnn_size)
        # Add dropout to the cell
        drop = tf.contrib.rnn.DropoutWrapper(lstm, output_keep_prob=keep_prob)
        return drop
    
    # Stack up multiple LSTM layers, for deep learning
    cell = tf.contrib.rnn.MultiRNNCell([make_cell(rnn_size, keep_prob) for _ in range(num_layers)])

    # Pass cell and embedded input to tf.nn.dynamic_rnn
    output, final_state = tf.nn.dynamic_rnn(cell, embed, sequence_length=source_sequence_length, dtype=tf.float32)
    return output, final_state

### Decoding - Training
Create a training decoding layer:
* Create a [`tf.contrib.seq2seq.TrainingHelper`](https://www.tensorflow.org/api_docs/python/tf/contrib/seq2seq/TrainingHelper) 
* Create a [`tf.contrib.seq2seq.BasicDecoder`](https://www.tensorflow.org/api_docs/python/tf/contrib/seq2seq/BasicDecoder)
* Obtain the decoder outputs from [`tf.contrib.seq2seq.dynamic_decode`](https://www.tensorflow.org/api_docs/python/tf/contrib/seq2seq/dynamic_decode)

In [5]:
def decoding_layer_train(encoder_state, dec_cell, dec_embed_input, 
                         target_sequence_length, max_summary_length, 
                         output_layer, keep_prob):
    """
    Create a decoding layer for training
    :param encoder_state: Encoder State
    :param dec_cell: Decoder RNN Cell
    :param dec_embed_input: Decoder embedded input
    :param target_sequence_length: The lengths of each sequence in the target batch
    :param max_summary_length: The length of the longest sequence in the batch
    :param output_layer: Function to apply the output layer
    :param keep_prob: Dropout keep probability
    :return: BasicDecoderOutput containing training logits and sample_id
    """
    training_helper = tf.contrib.seq2seq.TrainingHelper(dec_embed_input, target_sequence_length)
    basic_decoder = tf.contrib.seq2seq.BasicDecoder(dec_cell, training_helper, encoder_state, output_layer)
    basic_decoder_output, _, _ = tf.contrib.seq2seq.dynamic_decode(basic_decoder, 
                                                                   maximum_iterations=max_summary_length)
    return basic_decoder_output

### Decoding - Inference
Create inference decoder:
* Create a [`tf.contrib.seq2seq.GreedyEmbeddingHelper`](https://www.tensorflow.org/api_docs/python/tf/contrib/seq2seq/GreedyEmbeddingHelper)
* Create a [`tf.contrib.seq2seq.BasicDecoder`](https://www.tensorflow.org/api_docs/python/tf/contrib/seq2seq/BasicDecoder)
* Obtain the decoder outputs from [`tf.contrib.seq2seq.dynamic_decode`](https://www.tensorflow.org/api_docs/python/tf/contrib/seq2seq/dynamic_decode)

In [6]:
def decoding_layer_infer(encoder_state, dec_cell, dec_embeddings, start_of_sequence_id,
                         end_of_sequence_id, max_target_sequence_length,
                         vocab_size, output_layer, batch_size, keep_prob):
    """
    Create a decoding layer for inference
    :param encoder_state: Encoder state
    :param dec_cell: Decoder RNN Cell
    :param dec_embeddings: Decoder embeddings
    :param start_of_sequence_id: GO ID
    :param end_of_sequence_id: EOS ID
    :param max_target_sequence_length: Maximum length of target sequences
    :param vocab_size: Size of decoder/target vocabulary
    :param output_layer: Function to apply the output layer
    :param batch_size: Batch size
    :param keep_prob: Dropout keep probability
    :return: BasicDecoderOutput containing inference logits and sample_id
    """
    start_tokens = tf.tile(tf.constant([start_of_sequence_id], dtype=tf.int32), 
                           [batch_size], name='start_tokens')
    inference_helper = tf.contrib.seq2seq.GreedyEmbeddingHelper(dec_embeddings, start_tokens, end_of_sequence_id)
    inference_decoder = tf.contrib.seq2seq.BasicDecoder(dec_cell, inference_helper, encoder_state, output_layer)
    inference_decoder_output, _, _ = tf.contrib.seq2seq.dynamic_decode(inference_decoder, 
                                                                       maximum_iterations=max_target_sequence_length)
    return inference_decoder_output
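
Conceptually, greedy decoding feeds the highest-scoring word back in as the next input until the end-of-sequence id appears or the step limit is reached. The following is a plain-Python sketch of that loop, not the TensorFlow implementation; `greedy_decode`, `toy_step`, and the score table are hypothetical stand-ins for the decoder cell and output layer.

```python
def greedy_decode(step_fn, start_id, end_id, max_steps):
    """Repeatedly pick the highest-scoring token and feed it back as the next input."""
    token, state, output = start_id, None, []
    for _ in range(max_steps):
        scores, state = step_fn(token, state)
        token = max(range(len(scores)), key=scores.__getitem__)  # argmax over the vocabulary
        output.append(token)
        if token == end_id:
            break
    return output


# Toy "decoder": scores over a 6-word vocabulary, keyed by the previous token.
TOY_SCORES = {
    3: [0.0, 0.1, 0.0, 0.0, 0.9, 0.0],  # after <GO> (3), token 4 scores highest
    4: [0.0, 0.2, 0.0, 0.0, 0.1, 0.7],  # after token 4, token 5 scores highest
    5: [0.0, 0.8, 0.0, 0.0, 0.1, 0.1],  # after token 5, <EOS> (1) scores highest
}


def toy_step(prev_token, state):
    """Stand-in for one decoder step; the state is passed through unchanged."""
    return TOY_SCORES[prev_token], state


decoded = greedy_decode(toy_step, start_id=3, end_id=1, max_steps=10)
truncated = greedy_decode(toy_step, start_id=3, end_id=1, max_steps=2)
print(decoded)    # [4, 5, 1]
print(truncated)  # [4, 5]
```

The second call shows the role of the step limit, which plays the part of `maximum_iterations` above: decoding stops even if no end-of-sequence id has been produced.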

### Build the Decoding Layer
Implement `decoding_layer()` to create a Decoder RNN layer.

* Embed the target sequences
* Construct the decoder LSTM cell (just like you constructed the encoder cell above)
* Create an output layer to map the outputs of the decoder to the elements of our vocabulary
* Use your `decoding_layer_train(encoder_state, dec_cell, dec_embed_input, target_sequence_length, max_target_sequence_length, output_layer, keep_prob)` function to get the training logits.
* Use your `decoding_layer_infer(encoder_state, dec_cell, dec_embeddings, start_of_sequence_id, end_of_sequence_id, max_target_sequence_length, vocab_size, output_layer, batch_size, keep_prob)` function to get the inference logits.

Note: You'll need to use [tf.variable_scope](https://www.tensorflow.org/api_docs/python/tf/variable_scope) to share variables between training and inference.

In [7]:
def decoding_layer(dec_input, encoder_state,
                   target_sequence_length, max_target_sequence_length,
                   rnn_size,
                   num_layers, target_vocab_to_int, target_vocab_size,
                   batch_size, keep_prob, decoding_embedding_size):
    """
    Create decoding layer
    :param dec_input: Decoder input
    :param encoder_state: Encoder state
    :param target_sequence_length: The lengths of each sequence in the target batch
    :param max_target_sequence_length: Maximum length of target sequences
    :param rnn_size: RNN Size
    :param num_layers: Number of layers
    :param target_vocab_to_int: Dictionary to go from the target words to an id
    :param target_vocab_size: Size of target vocabulary
    :param batch_size: The size of the batch
    :param keep_prob: Dropout keep probability
    :param decoding_embedding_size: Decoding embedding size
    :return: Tuple of (Training BasicDecoderOutput, Inference BasicDecoderOutput)
    """
    # 1. Decoder Embedding
    dec_embeddings = tf.Variable(tf.random_uniform([target_vocab_size, decoding_embedding_size]))
    dec_embed_input = tf.nn.embedding_lookup(dec_embeddings, dec_input)

    # 2. Construct the decoder cell
    def make_cell(rnn_size, keep_prob):
        # LSTM cell
        lstm = tf.contrib.rnn.LSTMCell(rnn_size)
        # Add dropout to the cell
        drop = tf.contrib.rnn.DropoutWrapper(lstm, output_keep_prob=keep_prob)
        return drop
    
    # Stack up multiple LSTM layers, for deep learning
    dec_cell = tf.contrib.rnn.MultiRNNCell([make_cell(rnn_size, keep_prob) for _ in range(num_layers)])

    # 3. Dense layer to translate the decoder's output at each time 
    # step into a choice from the target vocabulary
    output_layer = Dense(target_vocab_size,
                         kernel_initializer=tf.truncated_normal_initializer(mean=0.0, stddev=0.1))
    
    with tf.variable_scope('decode'):
        tr_decoder_output = decoding_layer_train(encoder_state, dec_cell, dec_embed_input, 
                                                 target_sequence_length, max_target_sequence_length, 
                                                 output_layer, keep_prob)
        start_of_sequence_id = target_vocab_to_int['<GO>']
        end_of_sequence_id = target_vocab_to_int['<EOS>']
        inf_decoder_output = decoding_layer_infer(encoder_state, dec_cell, dec_embeddings, start_of_sequence_id,
                                                  end_of_sequence_id, max_target_sequence_length,
                                                  target_vocab_size, output_layer, batch_size, keep_prob)
    return tr_decoder_output, inf_decoder_output

### Build the Neural Network
Apply the functions implemented above to:

- Encode the input using your `encoding_layer(rnn_inputs, rnn_size, num_layers, keep_prob, source_sequence_length, source_vocab_size, encoding_embedding_size)`.
- Process target data using your `process_decoder_input(target_data, target_vocab_to_int, batch_size)` function.
- Decode the encoded input using your `decoding_layer(dec_input, enc_state, target_sequence_length, max_target_sentence_length, rnn_size, num_layers, target_vocab_to_int, target_vocab_size, batch_size, keep_prob, dec_embedding_size)` function.

In [8]:
def seq2seq_model(input_data, target_data, keep_prob, batch_size,
                  source_sequence_length, target_sequence_length,
                  max_target_sentence_length,
                  source_vocab_size, target_vocab_size,
                  enc_embedding_size, dec_embedding_size,
                  rnn_size, num_layers, target_vocab_to_int):
    """
    Build the Sequence-to-Sequence part of the neural network
    :param input_data: Input placeholder
    :param target_data: Target placeholder
    :param keep_prob: Dropout keep probability placeholder
    :param batch_size: Batch Size
    :param source_sequence_length: Sequence Lengths of source sequences in the batch
    :param target_sequence_length: Sequence Lengths of target sequences in the batch
    :param max_target_sentence_length: Length of the longest target sequence in the batch
    :param source_vocab_size: Source vocabulary size
    :param target_vocab_size: Target vocabulary size
    :param enc_embedding_size: Encoder embedding size
    :param dec_embedding_size: Decoder embedding size
    :param rnn_size: RNN Size
    :param num_layers: Number of layers
    :param target_vocab_to_int: Dictionary to go from the target words to an id
    :return: Tuple of (Training BasicDecoderOutput, Inference BasicDecoderOutput)
    """
    # Pass the input data through the encoder. We'll ignore the encoder output, but use the state
    _, enc_state = encoding_layer(input_data, rnn_size, num_layers, keep_prob,  
                                  source_sequence_length, source_vocab_size, 
                                  enc_embedding_size)
    
    # Prepare the target sequences we'll feed to the decoder in training mode
    dec_input = process_decoder_input(target_data, target_vocab_to_int, batch_size)
    
    # Pass encoder state and decoder inputs to the decoders
    tr_decoder_output, inf_decoder_output = decoding_layer(dec_input, enc_state, target_sequence_length, 
                                                           max_target_sentence_length, rnn_size, num_layers, 
                                                           target_vocab_to_int, target_vocab_size, 
                                                           batch_size, keep_prob, 
                                                           dec_embedding_size)
    
    return tr_decoder_output, inf_decoder_output

---
<a id='part4'></a>

## Train the Neural Network

### Hyperparameters
Tune the following parameters:

- Set `epochs` to the number of epochs.
- Set `batch_size` to the batch size.
- Set `rnn_size` to the size of the RNNs.
- Set `num_layers` to the number of layers.
- Set `encoding_embedding_size` to the size of the embedding for the encoder.
- Set `decoding_embedding_size` to the size of the embedding for the decoder.
- Set `learning_rate` to the learning rate.
- Set `keep_probability` to the Dropout keep probability.
- Set `display_step` to the number of batches between debug output statements.
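
The batch size and epoch count together determine how many parameter updates training performs. As a rough worked example, assuming the 137861 sentences reported in the dataset statistics and the values this notebook uses (`batch_size = 512`, `epochs = 10`, one batch held out for validation):

```python
num_sentences = 137861  # from the dataset statistics printed earlier
batch_size = 512
epochs = 10

# Denominator shown in the training log: len(source_int_text) // batch_size
batches_in_corpus = num_sentences // batch_size

# One batch of sentences is set aside for validation; partial batches are dropped.
train_sentences = num_sentences - batch_size
batches_per_epoch = train_sentences // batch_size
total_updates = batches_per_epoch * epochs

print(batches_in_corpus)  # 269
print(batches_per_epoch)  # 268
print(total_updates)      # 2680
```

A larger batch size reduces the number of updates per epoch, so it usually has to be weighed against the learning rate and the number of epochs.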

In [14]:
# Number of Epochs
epochs = 10
# Batch Size
batch_size = 512
# RNN Size
rnn_size = 128
# Number of Layers
num_layers = 2
# Embedding Size
encoding_embedding_size = 128
decoding_embedding_size = 128
# Learning Rate
learning_rate = 0.001
# Dropout Keep Probability
keep_probability = 0.55
# Number of batches between progress reports
display_step = 1

### Build the Graph

In [15]:
save_path = 'checkpoints/dev'
(source_int_text, target_int_text), (source_vocab_to_int, target_vocab_to_int), _ = helper.load_preprocess()
max_target_sentence_length = max([len(sentence) for sentence in source_int_text])

train_graph = tf.Graph()
with train_graph.as_default():
    input_data, targets, lr, keep_prob, target_sequence_length, max_target_sequence_length, source_sequence_length = model_inputs()

    #sequence_length = tf.placeholder_with_default(max_target_sentence_length, None, name='sequence_length')
    input_shape = tf.shape(input_data)

    train_logits, inference_logits = seq2seq_model(tf.reverse(input_data, [-1]),
                                                   targets,
                                                   keep_prob,
                                                   batch_size,
                                                   source_sequence_length,
                                                   target_sequence_length,
                                                   max_target_sequence_length,
                                                   len(source_vocab_to_int),
                                                   len(target_vocab_to_int),
                                                   encoding_embedding_size,
                                                   decoding_embedding_size,
                                                   rnn_size,
                                                   num_layers,
                                                   target_vocab_to_int)


    training_logits = tf.identity(train_logits.rnn_output, name='logits')
    inference_logits = tf.identity(inference_logits.sample_id, name='predictions')

    masks = tf.sequence_mask(target_sequence_length, max_target_sequence_length, dtype=tf.float32, name='masks')

    with tf.name_scope("optimization"):
        # Loss function
        cost = tf.contrib.seq2seq.sequence_loss(
            training_logits,
            targets,
            masks)

        # Optimizer
        optimizer = tf.train.AdamOptimizer(lr)

        # Gradient Clipping
        gradients = optimizer.compute_gradients(cost)
        capped_gradients = [(tf.clip_by_value(grad, -1., 1.), var) for grad, var in gradients if grad is not None]
        train_op = optimizer.apply_gradients(capped_gradients)
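
The loss combines a per-step cross-entropy with the sequence mask, so positions beyond a sequence's length contribute nothing. The sketch below reproduces that idea in plain Python. It reflects my understanding of the default behaviour of `sequence_loss` (a weighted sum of cross-entropies divided by the sum of the weights), and the helper names are hypothetical.

```python
import math


def sequence_mask(lengths, max_len):
    """1.0 for real time steps, 0.0 for steps past each sequence's length."""
    return [[1.0 if step < length else 0.0 for step in range(max_len)]
            for length in lengths]


def masked_sequence_loss(logits, targets, mask):
    """Average softmax cross-entropy over the unmasked time steps only."""
    total, weight_sum = 0.0, 0.0
    for seq_logits, seq_targets, seq_mask in zip(logits, targets, mask):
        for step_logits, target, weight in zip(seq_logits, seq_targets, seq_mask):
            peak = max(step_logits)  # subtract the maximum for numerical stability
            log_norm = peak + math.log(sum(math.exp(x - peak) for x in step_logits))
            total += weight * (log_norm - step_logits[target])
            weight_sum += weight
    return total / weight_sum


# One sequence, three time steps, vocabulary of two words; only two steps are real.
logits = [[[0.0, 0.0], [0.0, 0.0], [5.0, -5.0]]]
targets = [[0, 1, 1]]
mask = sequence_mask([2], max_len=3)

loss = masked_sequence_loss(logits, targets, mask)
print(mask)  # [[1.0, 1.0, 0.0]]
print(loss)  # about 0.693, i.e. log(2); the badly wrong third step is masked out
```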


### Batch and Pad the Source and Target Sequences

In [9]:
def pad_sentence_batch(sentence_batch, pad_int):
    """Pad sentences with <PAD> so that each sentence of a batch has the same length"""
    max_sentence = max([len(sentence) for sentence in sentence_batch])
    return [sentence + [pad_int] * (max_sentence - len(sentence)) for sentence in sentence_batch]


def get_batches(sources, targets, batch_size, source_pad_int, target_pad_int):
    """Batch targets, sources, and the lengths of their sentences together"""
    for batch_i in range(0, len(sources)//batch_size):
        start_i = batch_i * batch_size

        # Slice the right amount for the batch
        sources_batch = sources[start_i:start_i + batch_size]
        targets_batch = targets[start_i:start_i + batch_size]

        # Pad
        pad_sources_batch = np.array(pad_sentence_batch(sources_batch, source_pad_int))
        pad_targets_batch = np.array(pad_sentence_batch(targets_batch, target_pad_int))

        # Need the lengths for the _lengths parameters
        pad_targets_lengths = []
        for target in pad_targets_batch:
            pad_targets_lengths.append(len(target))

        pad_source_lengths = []
        for source in pad_sources_batch:
            pad_source_lengths.append(len(source))

        yield pad_sources_batch, pad_targets_batch, pad_source_lengths, pad_targets_lengths
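
A quick way to see what padding does is to run `pad_sentence_batch` on a tiny batch. The function is repeated here so the example runs on its own; the pad id of 0 is an assumption for illustration.

```python
def pad_sentence_batch(sentence_batch, pad_int):
    """Pad sentences with pad_int so that each sentence of a batch has the same length."""
    max_sentence = max(len(sentence) for sentence in sentence_batch)
    return [sentence + [pad_int] * (max_sentence - len(sentence)) for sentence in sentence_batch]


batch = [[4, 5, 6, 7], [4, 5], [6]]
padded = pad_sentence_batch(batch, pad_int=0)

true_lengths = [len(sentence) for sentence in batch]
padded_lengths = [len(sentence) for sentence in padded]

print(padded)          # [[4, 5, 6, 7], [4, 5, 0, 0], [6, 0, 0, 0]]
print(true_lengths)    # [4, 2, 1]
print(padded_lengths)  # [4, 4, 4]
```

Note that `get_batches` measures lengths on the padded arrays, so every length it yields equals the padded width of the batch rather than the original sentence length.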

### Split Data into Training and Validation Sets

In [None]:
train_source = source_int_text[batch_size:]
train_target = target_int_text[batch_size:]
valid_source = source_int_text[:batch_size]
valid_target = target_int_text[:batch_size]
(valid_sources_batch, valid_targets_batch, valid_sources_lengths, valid_targets_lengths ) = \
    next(get_batches(valid_source,
                     valid_target,
                     batch_size,
                     source_vocab_to_int['<PAD>'],
                     target_vocab_to_int['<PAD>']))

### Train
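
The accuracy reported during training is a token-level comparison between target ids and predicted ids, with the shorter array zero-padded so that both have the same width. Here is a small pure-Python sketch of that calculation; `pad_to` and `token_accuracy` are hypothetical names, and the ids are toy values.

```python
def pad_to(rows, width, pad_value=0):
    """Right-pad every row with pad_value up to the given width."""
    return [row + [pad_value] * (width - len(row)) for row in rows]


def token_accuracy(target, predicted):
    """Fraction of positions where the target id equals the predicted id."""
    width = max(len(target[0]), len(predicted[0]))
    target = pad_to(target, width)
    predicted = pad_to(predicted, width)
    matches = sum(t == p
                  for target_row, predicted_row in zip(target, predicted)
                  for t, p in zip(target_row, predicted_row))
    return matches / (len(target) * width)


target = [[4, 5, 1, 0],
          [6, 7, 8, 1]]
predicted = [[4, 5, 1],   # one step shorter than the target
             [6, 9, 8]]

accuracy = token_accuracy(target, predicted)
print(accuracy)  # 0.75
```

Because the padding value is 0, a padded prediction position counts as correct whenever the target at that position is also 0, so padded positions can raise the reported accuracy.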

In [17]:
def get_accuracy(target, logits):
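    """
    Calculate token-level accuracy. `logits` holds the predicted word ids; the shorter of
    `target` and `logits` is zero-padded along the time axis so both arrays have the same shape.
    """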
    max_seq = max(target.shape[1], logits.shape[1])
    if max_seq - target.shape[1]:
        target = np.pad(
            target,
            [(0,0),(0,max_seq - target.shape[1])],
            'constant')
    if max_seq - logits.shape[1]:
        logits = np.pad(
            logits,
            [(0,0),(0,max_seq - logits.shape[1])],
            'constant')

    return np.mean(np.equal(target, logits))


with tf.Session(graph=train_graph) as sess:
    sess.run(tf.global_variables_initializer())

    for epoch_i in range(epochs):
        for batch_i, (source_batch, target_batch, sources_lengths, targets_lengths) in enumerate(
                get_batches(train_source, train_target, batch_size,
                            source_vocab_to_int['<PAD>'],
                            target_vocab_to_int['<PAD>'])):

            _, loss = sess.run(
                [train_op, cost],
                {input_data: source_batch,
                 targets: target_batch,
                 lr: learning_rate,
                 target_sequence_length: targets_lengths,
                 source_sequence_length: sources_lengths,
                 keep_prob: keep_probability})


            if batch_i % display_step == 0 and batch_i > 0:


                batch_train_logits = sess.run(
                    inference_logits,
                    {input_data: source_batch,
                     source_sequence_length: sources_lengths,
                     target_sequence_length: targets_lengths,
                     keep_prob: 1.0})


                batch_valid_logits = sess.run(
                    inference_logits,
                    {input_data: valid_sources_batch,
                     source_sequence_length: valid_sources_lengths,
                     target_sequence_length: valid_targets_lengths,
                     keep_prob: 1.0})

                train_acc = get_accuracy(target_batch, batch_train_logits)

                valid_acc = get_accuracy(valid_targets_batch, batch_valid_logits)

                print('Epoch {:>3} Batch {:>4}/{} - Train Accuracy: {:>6.4f}, Validation Accuracy: {:>6.4f}, Loss: {:>6.4f}'
                      .format(epoch_i, batch_i, len(source_int_text) // batch_size, train_acc, valid_acc, loss))

    # Save Model
    saver = tf.train.Saver()
    saver.save(sess, save_path)
    print('Model Trained and Saved')

Epoch   0 Batch    1/269 - Train Accuracy: 0.2336, Validation Accuracy: 0.3100, Loss: 5.7067
Epoch   0 Batch    2/269 - Train Accuracy: 0.2655, Validation Accuracy: 0.3096, Loss: 5.5046
Epoch   0 Batch    3/269 - Train Accuracy: 0.2444, Validation Accuracy: 0.3096, Loss: 5.3443
Epoch   0 Batch    4/269 - Train Accuracy: 0.2317, Validation Accuracy: 0.3096, Loss: 5.1667
Epoch   0 Batch    5/269 - Train Accuracy: 0.2325, Validation Accuracy: 0.3096, Loss: 4.9802
Epoch   0 Batch    6/269 - Train Accuracy: 0.2786, Validation Accuracy: 0.3098, Loss: 4.6574
Epoch   0 Batch    7/269 - Train Accuracy: 0.2865, Validation Accuracy: 0.3210, Loss: 4.5111
Epoch   0 Batch    8/269 - Train Accuracy: 0.2535, Validation Accuracy: 0.3224, Loss: 4.5242
Epoch   0 Batch    9/269 - Train Accuracy: 0.2923, Validation Accuracy: 0.3328, Loss: 4.2981
Epoch   0 Batch   10/269 - Train Accuracy: 0.2673, Validation Accuracy: 0.3407, Loss: 4.3466
Epoch   0 Batch   11/269 - Train Accuracy: 0.3061, Validation Accuracy

Epoch   0 Batch   90/269 - Train Accuracy: 0.4313, Validation Accuracy: 0.4840, Loss: 2.6237
Epoch   0 Batch   91/269 - Train Accuracy: 0.4654, Validation Accuracy: 0.4876, Loss: 2.4696
Epoch   0 Batch   92/269 - Train Accuracy: 0.4648, Validation Accuracy: 0.4900, Loss: 2.4512
Epoch   0 Batch   93/269 - Train Accuracy: 0.4806, Validation Accuracy: 0.4850, Loss: 2.3671
Epoch   0 Batch   94/269 - Train Accuracy: 0.4695, Validation Accuracy: 0.4847, Loss: 2.4391
Epoch   0 Batch   95/269 - Train Accuracy: 0.4676, Validation Accuracy: 0.4904, Loss: 2.4305
Epoch   0 Batch   96/269 - Train Accuracy: 0.4682, Validation Accuracy: 0.4932, Loss: 2.4399
Epoch   0 Batch   97/269 - Train Accuracy: 0.4639, Validation Accuracy: 0.4831, Loss: 2.4142
Epoch   0 Batch   98/269 - Train Accuracy: 0.4752, Validation Accuracy: 0.4882, Loss: 2.3936
Epoch   0 Batch   99/269 - Train Accuracy: 0.4394, Validation Accuracy: 0.4923, Loss: 2.5347
Epoch   0 Batch  100/269 - Train Accuracy: 0.4892, Validation Accuracy

Epoch   0 Batch  179/269 - Train Accuracy: 0.4940, Validation Accuracy: 0.5149, Loss: 1.8394
Epoch   0 Batch  180/269 - Train Accuracy: 0.4895, Validation Accuracy: 0.5150, Loss: 1.8077
Epoch   0 Batch  181/269 - Train Accuracy: 0.4605, Validation Accuracy: 0.4867, Loss: 1.8006
Epoch   0 Batch  182/269 - Train Accuracy: 0.4940, Validation Accuracy: 0.5104, Loss: 1.8258
Epoch   0 Batch  183/269 - Train Accuracy: 0.5521, Validation Accuracy: 0.5088, Loss: 1.5729
Epoch   0 Batch  184/269 - Train Accuracy: 0.4412, Validation Accuracy: 0.4950, Loss: 1.8741
Epoch   0 Batch  185/269 - Train Accuracy: 0.4962, Validation Accuracy: 0.5067, Loss: 1.7805
Epoch   0 Batch  186/269 - Train Accuracy: 0.4607, Validation Accuracy: 0.5118, Loss: 1.8602
Epoch   0 Batch  187/269 - Train Accuracy: 0.4807, Validation Accuracy: 0.5038, Loss: 1.7594
Epoch   0 Batch  188/269 - Train Accuracy: 0.4769, Validation Accuracy: 0.4907, Loss: 1.7193
Epoch   0 Batch  189/269 - Train Accuracy: 0.4965, Validation Accuracy

Epoch   1 Batch    1/269 - Train Accuracy: 0.4578, Validation Accuracy: 0.5057, Loss: 1.5066
Epoch   1 Batch    2/269 - Train Accuracy: 0.4964, Validation Accuracy: 0.5347, Loss: 1.4761
Epoch   1 Batch    3/269 - Train Accuracy: 0.4622, Validation Accuracy: 0.5077, Loss: 1.5123
Epoch   1 Batch    4/269 - Train Accuracy: 0.4548, Validation Accuracy: 0.5157, Loss: 1.4999
Epoch   1 Batch    5/269 - Train Accuracy: 0.4566, Validation Accuracy: 0.5173, Loss: 1.5131
Epoch   1 Batch    6/269 - Train Accuracy: 0.4974, Validation Accuracy: 0.5133, Loss: 1.3768
Epoch   1 Batch    7/269 - Train Accuracy: 0.4862, Validation Accuracy: 0.5036, Loss: 1.4139
Epoch   1 Batch    8/269 - Train Accuracy: 0.4558, Validation Accuracy: 0.5100, Loss: 1.4799
Epoch   1 Batch    9/269 - Train Accuracy: 0.4725, Validation Accuracy: 0.5118, Loss: 1.4352
Epoch   1 Batch   10/269 - Train Accuracy: 0.4467, Validation Accuracy: 0.4972, Loss: 1.4603
Epoch   1 Batch   11/269 - Train Accuracy: 0.4813, Validation Accuracy

Epoch   1 Batch   90/269 - Train Accuracy: 0.4832, Validation Accuracy: 0.5376, Loss: 1.2655
Epoch   1 Batch   91/269 - Train Accuracy: 0.5240, Validation Accuracy: 0.5502, Loss: 1.1860
Epoch   1 Batch   92/269 - Train Accuracy: 0.5405, Validation Accuracy: 0.5566, Loss: 1.1836
Epoch   1 Batch   93/269 - Train Accuracy: 0.5502, Validation Accuracy: 0.5568, Loss: 1.1440
Epoch   1 Batch   94/269 - Train Accuracy: 0.5437, Validation Accuracy: 0.5613, Loss: 1.2100
Epoch   1 Batch   95/269 - Train Accuracy: 0.5390, Validation Accuracy: 0.5636, Loss: 1.1849
Epoch   1 Batch   96/269 - Train Accuracy: 0.5293, Validation Accuracy: 0.5455, Loss: 1.1664
Epoch   1 Batch   97/269 - Train Accuracy: 0.5294, Validation Accuracy: 0.5557, Loss: 1.1894
Epoch   1 Batch   98/269 - Train Accuracy: 0.5667, Validation Accuracy: 0.5688, Loss: 1.1611
Epoch   1 Batch   99/269 - Train Accuracy: 0.5215, Validation Accuracy: 0.5630, Loss: 1.2470
Epoch   1 Batch  100/269 - Train Accuracy: 0.5478, Validation Accuracy

Epoch   1 Batch  179/269 - Train Accuracy: 0.5510, Validation Accuracy: 0.5602, Loss: 0.9969
Epoch   1 Batch  180/269 - Train Accuracy: 0.5514, Validation Accuracy: 0.5639, Loss: 0.9760
Epoch   1 Batch  181/269 - Train Accuracy: 0.5452, Validation Accuracy: 0.5692, Loss: 0.9921
Epoch   1 Batch  182/269 - Train Accuracy: 0.5451, Validation Accuracy: 0.5572, Loss: 1.0004
Epoch   1 Batch  183/269 - Train Accuracy: 0.6073, Validation Accuracy: 0.5554, Loss: 0.8530
Epoch   1 Batch  184/269 - Train Accuracy: 0.5479, Validation Accuracy: 0.5642, Loss: 1.0223
Epoch   1 Batch  185/269 - Train Accuracy: 0.5659, Validation Accuracy: 0.5683, Loss: 0.9685
Epoch   1 Batch  186/269 - Train Accuracy: 0.5108, Validation Accuracy: 0.5597, Loss: 1.0106
Epoch   1 Batch  187/269 - Train Accuracy: 0.5492, Validation Accuracy: 0.5585, Loss: 0.9636
Epoch   1 Batch  188/269 - Train Accuracy: 0.5624, Validation Accuracy: 0.5671, Loss: 0.9520
Epoch   1 Batch  189/269 - Train Accuracy: 0.5627, Validation Accuracy

Epoch   2 Batch    1/269 - Train Accuracy: 0.5603, Validation Accuracy: 0.5886, Loss: 0.8785
Epoch   2 Batch    2/269 - Train Accuracy: 0.5599, Validation Accuracy: 0.5916, Loss: 0.8747
Epoch   2 Batch    3/269 - Train Accuracy: 0.5676, Validation Accuracy: 0.5924, Loss: 0.8870
Epoch   2 Batch    4/269 - Train Accuracy: 0.5494, Validation Accuracy: 0.5901, Loss: 0.8931
Epoch   2 Batch    5/269 - Train Accuracy: 0.5469, Validation Accuracy: 0.5865, Loss: 0.8867
Epoch   2 Batch    6/269 - Train Accuracy: 0.5754, Validation Accuracy: 0.5869, Loss: 0.8157
Epoch   2 Batch    7/269 - Train Accuracy: 0.5769, Validation Accuracy: 0.5883, Loss: 0.8444
Epoch   2 Batch    8/269 - Train Accuracy: 0.5629, Validation Accuracy: 0.5878, Loss: 0.8912
Epoch   2 Batch    9/269 - Train Accuracy: 0.5620, Validation Accuracy: 0.5880, Loss: 0.8693
Epoch   2 Batch   10/269 - Train Accuracy: 0.5615, Validation Accuracy: 0.5874, Loss: 0.8700
Epoch   2 Batch   11/269 - Train Accuracy: 0.5697, Validation Accuracy

Epoch   2 Batch   90/269 - Train Accuracy: 0.5653, Validation Accuracy: 0.5988, Loss: 0.8237
Epoch   2 Batch   91/269 - Train Accuracy: 0.6036, Validation Accuracy: 0.6011, Loss: 0.7538
Epoch   2 Batch   92/269 - Train Accuracy: 0.5977, Validation Accuracy: 0.6083, Loss: 0.7627
Epoch   2 Batch   93/269 - Train Accuracy: 0.6134, Validation Accuracy: 0.6073, Loss: 0.7356
Epoch   2 Batch   94/269 - Train Accuracy: 0.6028, Validation Accuracy: 0.6064, Loss: 0.7796
Epoch   2 Batch   95/269 - Train Accuracy: 0.5913, Validation Accuracy: 0.6020, Loss: 0.7763
Epoch   2 Batch   96/269 - Train Accuracy: 0.5944, Validation Accuracy: 0.6008, Loss: 0.7676
Epoch   2 Batch   97/269 - Train Accuracy: 0.5751, Validation Accuracy: 0.6063, Loss: 0.7710
Epoch   2 Batch   98/269 - Train Accuracy: 0.6000, Validation Accuracy: 0.6088, Loss: 0.7625
Epoch   2 Batch   99/269 - Train Accuracy: 0.5852, Validation Accuracy: 0.6101, Loss: 0.7965
Epoch   2 Batch  100/269 - Train Accuracy: 0.6136, Validation Accuracy

Epoch   2 Batch  179/269 - Train Accuracy: 0.6193, Validation Accuracy: 0.6185, Loss: 0.7188
Epoch   2 Batch  180/269 - Train Accuracy: 0.6060, Validation Accuracy: 0.6203, Loss: 0.7103
Epoch   2 Batch  181/269 - Train Accuracy: 0.5965, Validation Accuracy: 0.6175, Loss: 0.7192
Epoch   2 Batch  182/269 - Train Accuracy: 0.6177, Validation Accuracy: 0.6158, Loss: 0.7154
Epoch   2 Batch  183/269 - Train Accuracy: 0.6626, Validation Accuracy: 0.6137, Loss: 0.6194
Epoch   2 Batch  184/269 - Train Accuracy: 0.5908, Validation Accuracy: 0.6159, Loss: 0.7404
Epoch   2 Batch  185/269 - Train Accuracy: 0.6211, Validation Accuracy: 0.6169, Loss: 0.7065
Epoch   2 Batch  186/269 - Train Accuracy: 0.5877, Validation Accuracy: 0.6168, Loss: 0.7323
Epoch   2 Batch  187/269 - Train Accuracy: 0.6168, Validation Accuracy: 0.6167, Loss: 0.7027
Epoch   2 Batch  188/269 - Train Accuracy: 0.6190, Validation Accuracy: 0.6210, Loss: 0.6922
Epoch   2 Batch  189/269 - Train Accuracy: 0.6124, Validation Accuracy

Epoch   3 Batch    1/269 - Train Accuracy: 0.6056, Validation Accuracy: 0.6325, Loss: 0.6878
Epoch   3 Batch    2/269 - Train Accuracy: 0.6030, Validation Accuracy: 0.6273, Loss: 0.6811
Epoch   3 Batch    3/269 - Train Accuracy: 0.6145, Validation Accuracy: 0.6319, Loss: 0.6769
Epoch   3 Batch    4/269 - Train Accuracy: 0.5892, Validation Accuracy: 0.6325, Loss: 0.7013
Epoch   3 Batch    5/269 - Train Accuracy: 0.5874, Validation Accuracy: 0.6262, Loss: 0.7104
Epoch   3 Batch    6/269 - Train Accuracy: 0.6149, Validation Accuracy: 0.6192, Loss: 0.6424
Epoch   3 Batch    7/269 - Train Accuracy: 0.6264, Validation Accuracy: 0.6260, Loss: 0.6596
Epoch   3 Batch    8/269 - Train Accuracy: 0.5949, Validation Accuracy: 0.6230, Loss: 0.7062
Epoch   3 Batch    9/269 - Train Accuracy: 0.5966, Validation Accuracy: 0.6179, Loss: 0.6812
Epoch   3 Batch   10/269 - Train Accuracy: 0.6114, Validation Accuracy: 0.6289, Loss: 0.6849
Epoch   3 Batch   11/269 - Train Accuracy: 0.6132, Validation Accuracy

Epoch   3 Batch   90/269 - Train Accuracy: 0.6014, Validation Accuracy: 0.6278, Loss: 0.6631
Epoch   3 Batch   91/269 - Train Accuracy: 0.6290, Validation Accuracy: 0.6246, Loss: 0.6060
Epoch   3 Batch   92/269 - Train Accuracy: 0.6231, Validation Accuracy: 0.6261, Loss: 0.6145
Epoch   3 Batch   93/269 - Train Accuracy: 0.6357, Validation Accuracy: 0.6232, Loss: 0.5960
Epoch   3 Batch   94/269 - Train Accuracy: 0.6265, Validation Accuracy: 0.6289, Loss: 0.6364
Epoch   3 Batch   95/269 - Train Accuracy: 0.6210, Validation Accuracy: 0.6337, Loss: 0.6206
Epoch   3 Batch   96/269 - Train Accuracy: 0.6253, Validation Accuracy: 0.6246, Loss: 0.6207
Epoch   3 Batch   97/269 - Train Accuracy: 0.6229, Validation Accuracy: 0.6213, Loss: 0.6170
Epoch   3 Batch   98/269 - Train Accuracy: 0.6326, Validation Accuracy: 0.6275, Loss: 0.6202
Epoch   3 Batch   99/269 - Train Accuracy: 0.6154, Validation Accuracy: 0.6253, Loss: 0.6471
Epoch   3 Batch  100/269 - Train Accuracy: 0.6531, Validation Accuracy

Epoch   3 Batch  179/269 - Train Accuracy: 0.6302, Validation Accuracy: 0.6267, Loss: 0.5895
Epoch   3 Batch  180/269 - Train Accuracy: 0.6254, Validation Accuracy: 0.6269, Loss: 0.5810
Epoch   3 Batch  181/269 - Train Accuracy: 0.6310, Validation Accuracy: 0.6352, Loss: 0.5918
Epoch   3 Batch  182/269 - Train Accuracy: 0.6451, Validation Accuracy: 0.6370, Loss: 0.5844
Epoch   3 Batch  183/269 - Train Accuracy: 0.6758, Validation Accuracy: 0.6326, Loss: 0.5108
Epoch   3 Batch  184/269 - Train Accuracy: 0.6077, Validation Accuracy: 0.6329, Loss: 0.6065
Epoch   3 Batch  185/269 - Train Accuracy: 0.6323, Validation Accuracy: 0.6351, Loss: 0.5803
Epoch   3 Batch  186/269 - Train Accuracy: 0.6085, Validation Accuracy: 0.6356, Loss: 0.5991
Epoch   3 Batch  187/269 - Train Accuracy: 0.6382, Validation Accuracy: 0.6308, Loss: 0.5737
Epoch   3 Batch  188/269 - Train Accuracy: 0.6316, Validation Accuracy: 0.6332, Loss: 0.5627
Epoch   3 Batch  189/269 - Train Accuracy: 0.6317, Validation Accuracy

Epoch   4 Batch    1/269 - Train Accuracy: 0.6201, Validation Accuracy: 0.6333, Loss: 0.5613
Epoch   4 Batch    2/269 - Train Accuracy: 0.6232, Validation Accuracy: 0.6347, Loss: 0.5496
Epoch   4 Batch    3/269 - Train Accuracy: 0.6466, Validation Accuracy: 0.6375, Loss: 0.5579
Epoch   4 Batch    4/269 - Train Accuracy: 0.6070, Validation Accuracy: 0.6330, Loss: 0.5719
Epoch   4 Batch    5/269 - Train Accuracy: 0.6224, Validation Accuracy: 0.6337, Loss: 0.5687
Epoch   4 Batch    6/269 - Train Accuracy: 0.6512, Validation Accuracy: 0.6374, Loss: 0.5247
Epoch   4 Batch    7/269 - Train Accuracy: 0.6432, Validation Accuracy: 0.6342, Loss: 0.5308
Epoch   4 Batch    8/269 - Train Accuracy: 0.6169, Validation Accuracy: 0.6333, Loss: 0.5657
Epoch   4 Batch    9/269 - Train Accuracy: 0.6260, Validation Accuracy: 0.6377, Loss: 0.5504
Epoch   4 Batch   10/269 - Train Accuracy: 0.6188, Validation Accuracy: 0.6254, Loss: 0.5494
Epoch   4 Batch   11/269 - Train Accuracy: 0.6206, Validation Accuracy

Epoch   4 Batch   90/269 - Train Accuracy: 0.6383, Validation Accuracy: 0.6620, Loss: 0.5390
Epoch   4 Batch   91/269 - Train Accuracy: 0.6922, Validation Accuracy: 0.6602, Loss: 0.4824
Epoch   4 Batch   92/269 - Train Accuracy: 0.6697, Validation Accuracy: 0.6566, Loss: 0.4999
Epoch   4 Batch   93/269 - Train Accuracy: 0.6783, Validation Accuracy: 0.6585, Loss: 0.4891
Epoch   4 Batch   94/269 - Train Accuracy: 0.6774, Validation Accuracy: 0.6679, Loss: 0.5151
Epoch   4 Batch   95/269 - Train Accuracy: 0.6481, Validation Accuracy: 0.6619, Loss: 0.5081
Epoch   4 Batch   96/269 - Train Accuracy: 0.6630, Validation Accuracy: 0.6599, Loss: 0.4994
Epoch   4 Batch   97/269 - Train Accuracy: 0.6661, Validation Accuracy: 0.6584, Loss: 0.4963
Epoch   4 Batch   98/269 - Train Accuracy: 0.6717, Validation Accuracy: 0.6689, Loss: 0.5102
Epoch   4 Batch   99/269 - Train Accuracy: 0.6607, Validation Accuracy: 0.6694, Loss: 0.5224
Epoch   4 Batch  100/269 - Train Accuracy: 0.6811, Validation Accuracy

Epoch   4 Batch  179/269 - Train Accuracy: 0.6781, Validation Accuracy: 0.6901, Loss: 0.4806
Epoch   4 Batch  180/269 - Train Accuracy: 0.6696, Validation Accuracy: 0.6779, Loss: 0.4730
Epoch   4 Batch  181/269 - Train Accuracy: 0.6889, Validation Accuracy: 0.6985, Loss: 0.4794
Epoch   4 Batch  182/269 - Train Accuracy: 0.7036, Validation Accuracy: 0.7009, Loss: 0.4789
Epoch   4 Batch  183/269 - Train Accuracy: 0.7279, Validation Accuracy: 0.6911, Loss: 0.4100
Epoch   4 Batch  184/269 - Train Accuracy: 0.6504, Validation Accuracy: 0.6851, Loss: 0.4906
Epoch   4 Batch  185/269 - Train Accuracy: 0.6955, Validation Accuracy: 0.6900, Loss: 0.4766
Epoch   4 Batch  186/269 - Train Accuracy: 0.7011, Validation Accuracy: 0.6975, Loss: 0.4936
Epoch   4 Batch  187/269 - Train Accuracy: 0.6963, Validation Accuracy: 0.6889, Loss: 0.4640
Epoch   4 Batch  188/269 - Train Accuracy: 0.6731, Validation Accuracy: 0.6688, Loss: 0.4583
Epoch   4 Batch  189/269 - Train Accuracy: 0.6948, Validation Accuracy

Epoch   5 Batch    1/269 - Train Accuracy: 0.7051, Validation Accuracy: 0.7118, Loss: 0.4567
Epoch   5 Batch    2/269 - Train Accuracy: 0.7002, Validation Accuracy: 0.7004, Loss: 0.4441
Epoch   5 Batch    3/269 - Train Accuracy: 0.7367, Validation Accuracy: 0.7136, Loss: 0.4512
Epoch   5 Batch    4/269 - Train Accuracy: 0.6798, Validation Accuracy: 0.7191, Loss: 0.4543
Epoch   5 Batch    5/269 - Train Accuracy: 0.6852, Validation Accuracy: 0.7031, Loss: 0.4529
Epoch   5 Batch    6/269 - Train Accuracy: 0.7537, Validation Accuracy: 0.7252, Loss: 0.4252
Epoch   5 Batch    7/269 - Train Accuracy: 0.7150, Validation Accuracy: 0.7203, Loss: 0.4372
Epoch   5 Batch    8/269 - Train Accuracy: 0.6814, Validation Accuracy: 0.7052, Loss: 0.4519
Epoch   5 Batch    9/269 - Train Accuracy: 0.6975, Validation Accuracy: 0.7178, Loss: 0.4442
Epoch   5 Batch   10/269 - Train Accuracy: 0.7044, Validation Accuracy: 0.7171, Loss: 0.4407
Epoch   5 Batch   11/269 - Train Accuracy: 0.7147, Validation Accuracy

Epoch   5 Batch   90/269 - Train Accuracy: 0.7064, Validation Accuracy: 0.7359, Loss: 0.4287
Epoch   5 Batch   91/269 - Train Accuracy: 0.7568, Validation Accuracy: 0.7343, Loss: 0.3914
Epoch   5 Batch   92/269 - Train Accuracy: 0.7631, Validation Accuracy: 0.7426, Loss: 0.3879
Epoch   5 Batch   93/269 - Train Accuracy: 0.7465, Validation Accuracy: 0.7415, Loss: 0.3857
Epoch   5 Batch   94/269 - Train Accuracy: 0.7391, Validation Accuracy: 0.7292, Loss: 0.4107
Epoch   5 Batch   95/269 - Train Accuracy: 0.7298, Validation Accuracy: 0.7349, Loss: 0.3993
Epoch   5 Batch   96/269 - Train Accuracy: 0.7300, Validation Accuracy: 0.7386, Loss: 0.4035
Epoch   5 Batch   97/269 - Train Accuracy: 0.7632, Validation Accuracy: 0.7485, Loss: 0.4020
Epoch   5 Batch   98/269 - Train Accuracy: 0.7468, Validation Accuracy: 0.7518, Loss: 0.3999
Epoch   5 Batch   99/269 - Train Accuracy: 0.7083, Validation Accuracy: 0.7129, Loss: 0.4133
Epoch   5 Batch  100/269 - Train Accuracy: 0.7514, Validation Accuracy

Epoch   5 Batch  179/269 - Train Accuracy: 0.7495, Validation Accuracy: 0.7572, Loss: 0.3815
Epoch   5 Batch  180/269 - Train Accuracy: 0.7677, Validation Accuracy: 0.7667, Loss: 0.3726
Epoch   5 Batch  181/269 - Train Accuracy: 0.7512, Validation Accuracy: 0.7458, Loss: 0.3783
Epoch   5 Batch  182/269 - Train Accuracy: 0.7733, Validation Accuracy: 0.7558, Loss: 0.3811
Epoch   5 Batch  183/269 - Train Accuracy: 0.8057, Validation Accuracy: 0.7684, Loss: 0.3301
Epoch   5 Batch  184/269 - Train Accuracy: 0.7256, Validation Accuracy: 0.7418, Loss: 0.3882
Epoch   5 Batch  185/269 - Train Accuracy: 0.7622, Validation Accuracy: 0.7702, Loss: 0.3765
Epoch   5 Batch  186/269 - Train Accuracy: 0.7640, Validation Accuracy: 0.7520, Loss: 0.3783
Epoch   5 Batch  187/269 - Train Accuracy: 0.7653, Validation Accuracy: 0.7535, Loss: 0.3779
Epoch   5 Batch  188/269 - Train Accuracy: 0.7514, Validation Accuracy: 0.7357, Loss: 0.3664
Epoch   5 Batch  189/269 - Train Accuracy: 0.7681, Validation Accuracy

Epoch   6 Batch    1/269 - Train Accuracy: 0.7552, Validation Accuracy: 0.7594, Loss: 0.3614
Epoch   6 Batch    2/269 - Train Accuracy: 0.7784, Validation Accuracy: 0.7851, Loss: 0.3551
Epoch   6 Batch    3/269 - Train Accuracy: 0.7951, Validation Accuracy: 0.7811, Loss: 0.3567
Epoch   6 Batch    4/269 - Train Accuracy: 0.7246, Validation Accuracy: 0.7523, Loss: 0.3620
Epoch   6 Batch    5/269 - Train Accuracy: 0.7675, Validation Accuracy: 0.7849, Loss: 0.3702
Epoch   6 Batch    6/269 - Train Accuracy: 0.8083, Validation Accuracy: 0.7846, Loss: 0.3335
Epoch   6 Batch    7/269 - Train Accuracy: 0.7800, Validation Accuracy: 0.7669, Loss: 0.3360
Epoch   6 Batch    8/269 - Train Accuracy: 0.7446, Validation Accuracy: 0.7733, Loss: 0.3632
Epoch   6 Batch    9/269 - Train Accuracy: 0.7776, Validation Accuracy: 0.7845, Loss: 0.3497
Epoch   6 Batch   10/269 - Train Accuracy: 0.7702, Validation Accuracy: 0.7765, Loss: 0.3492
Epoch   6 Batch   11/269 - Train Accuracy: 0.7773, Validation Accuracy

Epoch   6 Batch   90/269 - Train Accuracy: 0.7063, Validation Accuracy: 0.7547, Loss: 0.3468
Epoch   6 Batch   91/269 - Train Accuracy: 0.8081, Validation Accuracy: 0.7960, Loss: 0.3266
Epoch   6 Batch   92/269 - Train Accuracy: 0.8069, Validation Accuracy: 0.7915, Loss: 0.3134
Epoch   6 Batch   93/269 - Train Accuracy: 0.7781, Validation Accuracy: 0.7733, Loss: 0.3189
Epoch   6 Batch   94/269 - Train Accuracy: 0.7632, Validation Accuracy: 0.7652, Loss: 0.3427
Epoch   6 Batch   95/269 - Train Accuracy: 0.7832, Validation Accuracy: 0.7773, Loss: 0.3252
Epoch   6 Batch   96/269 - Train Accuracy: 0.7822, Validation Accuracy: 0.7900, Loss: 0.3356
Epoch   6 Batch   97/269 - Train Accuracy: 0.7943, Validation Accuracy: 0.7786, Loss: 0.3257
Epoch   6 Batch   98/269 - Train Accuracy: 0.7954, Validation Accuracy: 0.7909, Loss: 0.3235
Epoch   6 Batch   99/269 - Train Accuracy: 0.7899, Validation Accuracy: 0.8018, Loss: 0.3383
Epoch   6 Batch  100/269 - Train Accuracy: 0.7972, Validation Accuracy

Epoch   6 Batch  179/269 - Train Accuracy: 0.8070, Validation Accuracy: 0.8060, Loss: 0.3044
Epoch   6 Batch  180/269 - Train Accuracy: 0.8102, Validation Accuracy: 0.8001, Loss: 0.2976
Epoch   6 Batch  181/269 - Train Accuracy: 0.8102, Validation Accuracy: 0.8062, Loss: 0.3137
Epoch   6 Batch  182/269 - Train Accuracy: 0.8258, Validation Accuracy: 0.8060, Loss: 0.3052
Epoch   6 Batch  183/269 - Train Accuracy: 0.8258, Validation Accuracy: 0.8002, Loss: 0.2580
Epoch   6 Batch  184/269 - Train Accuracy: 0.7851, Validation Accuracy: 0.8050, Loss: 0.3138
Epoch   6 Batch  185/269 - Train Accuracy: 0.8108, Validation Accuracy: 0.7989, Loss: 0.3015
Epoch   6 Batch  186/269 - Train Accuracy: 0.8187, Validation Accuracy: 0.8120, Loss: 0.3089
Epoch   6 Batch  187/269 - Train Accuracy: 0.8203, Validation Accuracy: 0.8165, Loss: 0.2941
Epoch   6 Batch  188/269 - Train Accuracy: 0.8147, Validation Accuracy: 0.8106, Loss: 0.2865
Epoch   6 Batch  189/269 - Train Accuracy: 0.8165, Validation Accuracy

Epoch   7 Batch    1/269 - Train Accuracy: 0.8297, Validation Accuracy: 0.8132, Loss: 0.2896
Epoch   7 Batch    2/269 - Train Accuracy: 0.8116, Validation Accuracy: 0.8088, Loss: 0.2942
Epoch   7 Batch    3/269 - Train Accuracy: 0.8200, Validation Accuracy: 0.8041, Loss: 0.2856
Epoch   7 Batch    4/269 - Train Accuracy: 0.7938, Validation Accuracy: 0.8107, Loss: 0.2943
Epoch   7 Batch    5/269 - Train Accuracy: 0.8099, Validation Accuracy: 0.8166, Loss: 0.2912
Epoch   7 Batch    6/269 - Train Accuracy: 0.8326, Validation Accuracy: 0.8175, Loss: 0.2736
Epoch   7 Batch    7/269 - Train Accuracy: 0.8289, Validation Accuracy: 0.8242, Loss: 0.2798
Epoch   7 Batch    8/269 - Train Accuracy: 0.8171, Validation Accuracy: 0.8225, Loss: 0.2964
Epoch   7 Batch    9/269 - Train Accuracy: 0.8250, Validation Accuracy: 0.8268, Loss: 0.2862
Epoch   7 Batch   10/269 - Train Accuracy: 0.8373, Validation Accuracy: 0.8275, Loss: 0.2830
Epoch   7 Batch   11/269 - Train Accuracy: 0.8282, Validation Accuracy

Epoch   7 Batch   90/269 - Train Accuracy: 0.8104, Validation Accuracy: 0.8264, Loss: 0.2906
Epoch   7 Batch   91/269 - Train Accuracy: 0.8453, Validation Accuracy: 0.8257, Loss: 0.2550
Epoch   7 Batch   92/269 - Train Accuracy: 0.8649, Validation Accuracy: 0.8217, Loss: 0.2564
Epoch   7 Batch   93/269 - Train Accuracy: 0.8356, Validation Accuracy: 0.8273, Loss: 0.2543
Epoch   7 Batch   94/269 - Train Accuracy: 0.8262, Validation Accuracy: 0.8260, Loss: 0.2736
Epoch   7 Batch   95/269 - Train Accuracy: 0.8278, Validation Accuracy: 0.8334, Loss: 0.2622
Epoch   7 Batch   96/269 - Train Accuracy: 0.8101, Validation Accuracy: 0.8250, Loss: 0.2570
Epoch   7 Batch   97/269 - Train Accuracy: 0.8428, Validation Accuracy: 0.8353, Loss: 0.2613
Epoch   7 Batch   98/269 - Train Accuracy: 0.8471, Validation Accuracy: 0.8350, Loss: 0.2600
Epoch   7 Batch   99/269 - Train Accuracy: 0.8191, Validation Accuracy: 0.8310, Loss: 0.2707
Epoch   7 Batch  100/269 - Train Accuracy: 0.8548, Validation Accuracy

Epoch   7 Batch  179/269 - Train Accuracy: 0.8285, Validation Accuracy: 0.8353, Loss: 0.2463
Epoch   7 Batch  180/269 - Train Accuracy: 0.8492, Validation Accuracy: 0.8352, Loss: 0.2396
Epoch   7 Batch  181/269 - Train Accuracy: 0.8414, Validation Accuracy: 0.8488, Loss: 0.2530
Epoch   7 Batch  182/269 - Train Accuracy: 0.8628, Validation Accuracy: 0.8352, Loss: 0.2462
Epoch   7 Batch  183/269 - Train Accuracy: 0.8561, Validation Accuracy: 0.8324, Loss: 0.2108
Epoch   7 Batch  184/269 - Train Accuracy: 0.8285, Validation Accuracy: 0.8383, Loss: 0.2545
Epoch   7 Batch  185/269 - Train Accuracy: 0.8680, Validation Accuracy: 0.8417, Loss: 0.2468
Epoch   7 Batch  186/269 - Train Accuracy: 0.8415, Validation Accuracy: 0.8297, Loss: 0.2514
Epoch   7 Batch  187/269 - Train Accuracy: 0.8544, Validation Accuracy: 0.8352, Loss: 0.2430
Epoch   7 Batch  188/269 - Train Accuracy: 0.8612, Validation Accuracy: 0.8449, Loss: 0.2414
Epoch   7 Batch  189/269 - Train Accuracy: 0.8452, Validation Accuracy

Epoch   8 Batch    1/269 - Train Accuracy: 0.8721, Validation Accuracy: 0.8538, Loss: 0.2380
Epoch   8 Batch    2/269 - Train Accuracy: 0.8550, Validation Accuracy: 0.8537, Loss: 0.2399
Epoch   8 Batch    3/269 - Train Accuracy: 0.8589, Validation Accuracy: 0.8468, Loss: 0.2393
Epoch   8 Batch    4/269 - Train Accuracy: 0.8257, Validation Accuracy: 0.8477, Loss: 0.2443
Epoch   8 Batch    5/269 - Train Accuracy: 0.8511, Validation Accuracy: 0.8592, Loss: 0.2468
Epoch   8 Batch    6/269 - Train Accuracy: 0.8775, Validation Accuracy: 0.8611, Loss: 0.2264
Epoch   8 Batch    7/269 - Train Accuracy: 0.8650, Validation Accuracy: 0.8540, Loss: 0.2234
Epoch   8 Batch    8/269 - Train Accuracy: 0.8664, Validation Accuracy: 0.8589, Loss: 0.2390
Epoch   8 Batch    9/269 - Train Accuracy: 0.8584, Validation Accuracy: 0.8495, Loss: 0.2424
Epoch   8 Batch   10/269 - Train Accuracy: 0.8654, Validation Accuracy: 0.8560, Loss: 0.2236
Epoch   8 Batch   11/269 - Train Accuracy: 0.8724, Validation Accuracy

Epoch   8 Batch   90/269 - Train Accuracy: 0.8468, Validation Accuracy: 0.8598, Loss: 0.2331
Epoch   8 Batch   91/269 - Train Accuracy: 0.8839, Validation Accuracy: 0.8592, Loss: 0.2083
Epoch   8 Batch   92/269 - Train Accuracy: 0.8799, Validation Accuracy: 0.8481, Loss: 0.2046
Epoch   8 Batch   93/269 - Train Accuracy: 0.8632, Validation Accuracy: 0.8528, Loss: 0.2067
Epoch   8 Batch   94/269 - Train Accuracy: 0.8491, Validation Accuracy: 0.8567, Loss: 0.2341
Epoch   8 Batch   95/269 - Train Accuracy: 0.8658, Validation Accuracy: 0.8593, Loss: 0.2183
Epoch   8 Batch   96/269 - Train Accuracy: 0.8519, Validation Accuracy: 0.8559, Loss: 0.2124
Epoch   8 Batch   97/269 - Train Accuracy: 0.8630, Validation Accuracy: 0.8468, Loss: 0.2135
Epoch   8 Batch   98/269 - Train Accuracy: 0.8791, Validation Accuracy: 0.8597, Loss: 0.2180
Epoch   8 Batch   99/269 - Train Accuracy: 0.8609, Validation Accuracy: 0.8583, Loss: 0.2182
Epoch   8 Batch  100/269 - Train Accuracy: 0.8938, Validation Accuracy

Epoch   8 Batch  179/269 - Train Accuracy: 0.8542, Validation Accuracy: 0.8671, Loss: 0.2066
Epoch   8 Batch  180/269 - Train Accuracy: 0.8746, Validation Accuracy: 0.8673, Loss: 0.2052
Epoch   8 Batch  181/269 - Train Accuracy: 0.8597, Validation Accuracy: 0.8661, Loss: 0.2175
Epoch   8 Batch  182/269 - Train Accuracy: 0.8831, Validation Accuracy: 0.8584, Loss: 0.2046
Epoch   8 Batch  183/269 - Train Accuracy: 0.8920, Validation Accuracy: 0.8675, Loss: 0.1776
Epoch   8 Batch  184/269 - Train Accuracy: 0.8725, Validation Accuracy: 0.8746, Loss: 0.2105
Epoch   8 Batch  185/269 - Train Accuracy: 0.8875, Validation Accuracy: 0.8697, Loss: 0.2022
Epoch   8 Batch  186/269 - Train Accuracy: 0.8696, Validation Accuracy: 0.8722, Loss: 0.2034
Epoch   8 Batch  187/269 - Train Accuracy: 0.8721, Validation Accuracy: 0.8716, Loss: 0.2011
Epoch   8 Batch  188/269 - Train Accuracy: 0.8711, Validation Accuracy: 0.8673, Loss: 0.1989
Epoch   8 Batch  189/269 - Train Accuracy: 0.8756, Validation Accuracy

Epoch   9 Batch    1/269 - Train Accuracy: 0.8855, Validation Accuracy: 0.8709, Loss: 0.1933
Epoch   9 Batch    2/269 - Train Accuracy: 0.8742, Validation Accuracy: 0.8751, Loss: 0.1970
Epoch   9 Batch    3/269 - Train Accuracy: 0.8872, Validation Accuracy: 0.8753, Loss: 0.2040
Epoch   9 Batch    4/269 - Train Accuracy: 0.8487, Validation Accuracy: 0.8612, Loss: 0.2034
Epoch   9 Batch    5/269 - Train Accuracy: 0.8722, Validation Accuracy: 0.8815, Loss: 0.2076
Epoch   9 Batch    6/269 - Train Accuracy: 0.8948, Validation Accuracy: 0.8750, Loss: 0.1890
Epoch   9 Batch    7/269 - Train Accuracy: 0.8842, Validation Accuracy: 0.8737, Loss: 0.1867
Epoch   9 Batch    8/269 - Train Accuracy: 0.8889, Validation Accuracy: 0.8833, Loss: 0.1983
Epoch   9 Batch    9/269 - Train Accuracy: 0.8826, Validation Accuracy: 0.8818, Loss: 0.2028
Epoch   9 Batch   10/269 - Train Accuracy: 0.8878, Validation Accuracy: 0.8778, Loss: 0.1874
Epoch   9 Batch   11/269 - Train Accuracy: 0.8987, Validation Accuracy

Epoch   9 Batch   90/269 - Train Accuracy: 0.8724, Validation Accuracy: 0.8679, Loss: 0.1891
Epoch   9 Batch   91/269 - Train Accuracy: 0.8966, Validation Accuracy: 0.8730, Loss: 0.1723
Epoch   9 Batch   92/269 - Train Accuracy: 0.8931, Validation Accuracy: 0.8748, Loss: 0.1731
Epoch   9 Batch   93/269 - Train Accuracy: 0.8814, Validation Accuracy: 0.8711, Loss: 0.1749
Epoch   9 Batch   94/269 - Train Accuracy: 0.8760, Validation Accuracy: 0.8714, Loss: 0.1974
Epoch   9 Batch   95/269 - Train Accuracy: 0.8814, Validation Accuracy: 0.8764, Loss: 0.1785
Epoch   9 Batch   96/269 - Train Accuracy: 0.8687, Validation Accuracy: 0.8713, Loss: 0.1784
Epoch   9 Batch   97/269 - Train Accuracy: 0.8894, Validation Accuracy: 0.8761, Loss: 0.1724
Epoch   9 Batch   98/269 - Train Accuracy: 0.8968, Validation Accuracy: 0.8728, Loss: 0.1768
Epoch   9 Batch   99/269 - Train Accuracy: 0.8787, Validation Accuracy: 0.8722, Loss: 0.1768
Epoch   9 Batch  100/269 - Train Accuracy: 0.8988, Validation Accuracy

Epoch   9 Batch  179/269 - Train Accuracy: 0.8810, Validation Accuracy: 0.8824, Loss: 0.1756
Epoch   9 Batch  180/269 - Train Accuracy: 0.8879, Validation Accuracy: 0.8794, Loss: 0.1646
Epoch   9 Batch  181/269 - Train Accuracy: 0.8749, Validation Accuracy: 0.8802, Loss: 0.1798
Epoch   9 Batch  182/269 - Train Accuracy: 0.8948, Validation Accuracy: 0.8835, Loss: 0.1736
Epoch   9 Batch  183/269 - Train Accuracy: 0.8991, Validation Accuracy: 0.8846, Loss: 0.1465
Epoch   9 Batch  184/269 - Train Accuracy: 0.8846, Validation Accuracy: 0.8852, Loss: 0.1715
Epoch   9 Batch  185/269 - Train Accuracy: 0.9008, Validation Accuracy: 0.8809, Loss: 0.1716
Epoch   9 Batch  186/269 - Train Accuracy: 0.8862, Validation Accuracy: 0.8830, Loss: 0.1714
Epoch   9 Batch  187/269 - Train Accuracy: 0.8880, Validation Accuracy: 0.8781, Loss: 0.1684
Epoch   9 Batch  188/269 - Train Accuracy: 0.9009, Validation Accuracy: 0.8875, Loss: 0.1709
Epoch   9 Batch  189/269 - Train Accuracy: 0.9038, Validation Accuracy

Model Trained and Saved
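
The log above is long, so it can help to reduce it to one line per epoch. The following is a minimal sketch, assuming the log has been captured as plain text in the format printed above; `last_metrics_per_epoch` and `sample_log` are hypothetical names used only for illustration and are not part of `helper`. The sample lines are copied from the output above, and truncated lines are skipped rather than guessed at.

```python
import re

# Pattern for one complete log line; truncated lines simply do not match.
LOG_LINE = re.compile(
    r'Epoch\s+(\d+)\s+Batch\s+(\d+)/(\d+) - '
    r'Train Accuracy: ([\d.]+), Validation Accuracy: ([\d.]+), Loss: ([\d.]+)'
)

def last_metrics_per_epoch(log_text):
    """Return {epoch: (train_acc, valid_acc, loss)} from the last complete line of each epoch."""
    metrics = {}
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue  # skip blank or truncated lines
        epoch = int(match.group(1))
        # Later lines of the same epoch overwrite earlier ones.
        metrics[epoch] = tuple(float(match.group(i)) for i in (4, 5, 6))
    return metrics

# Three lines copied from the training output above (the second one is truncated).
sample_log = """\
Epoch   1 Batch   10/269 - Train Accuracy: 0.4467, Validation Accuracy: 0.4972, Loss: 1.4603
Epoch   1 Batch   11/269 - Train Accuracy: 0.4813, Validation Accuracy
Epoch   9 Batch  188/269 - Train Accuracy: 0.9009, Validation Accuracy: 0.8875, Loss: 0.1709
"""

summary = last_metrics_per_epoch(sample_log)
for epoch in sorted(summary):
    train_acc, valid_acc, loss = summary[epoch]
    print('Epoch {}: train={:.4f} valid={:.4f} loss={:.4f}'.format(epoch, train_acc, valid_acc, loss))
```

Keeping only the last complete line of each epoch is a simplification; averaging over all batches in an epoch would give a smoother picture.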


### Save Parameters

In [18]:
helper.save_params(save_path)

## Checkpoint

In [17]:
import tensorflow as tf
import numpy as np
import helper

batch_size = 512

_, (source_vocab_to_int, target_vocab_to_int), (source_int_to_vocab, target_int_to_vocab) = helper.load_preprocess()
load_path = helper.load_params()

---
<a id='part5'></a>

## Test the Sentence to Sequence Model

In [15]:
def sentence_to_seq(sentence, vocab_to_int):
    """
    Convert a sentence to a sequence of ids
    :param sentence: String
    :param vocab_to_int: Dictionary to go from the words to an id
    :return: List of word ids
    """
    # Map each lowercased word to its id; words not in the vocabulary fall back to <UNK>
    UNK_id = vocab_to_int['<UNK>']
    seq = []
    for word in sentence.lower().split():
        if word in vocab_to_int:
            seq.append(vocab_to_int[word])
        else:
            seq.append(UNK_id)
    return seq
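
To see the `<UNK>` fallback in action, here is a small sketch with a toy vocabulary. The ids in `toy_vocab_to_int` are made up for illustration; the real mapping comes from `helper.load_preprocess()`. The function is restated with `dict.get` so the block runs on its own, and it is intended to behave the same as the version above.

```python
# Toy vocabulary for illustration only; these ids are assumptions, not the real ones.
toy_vocab_to_int = {'<PAD>': 0, '<EOS>': 1, '<UNK>': 2, '<GO>': 3,
                    'he': 4, 'saw': 5, 'a': 6, 'truck': 7, '.': 8}

def sentence_to_seq(sentence, vocab_to_int):
    """Convert a sentence to word ids, using the <UNK> id for unseen words."""
    unk_id = vocab_to_int['<UNK>']
    return [vocab_to_int.get(word, unk_id) for word in sentence.lower().split()]

seq = sentence_to_seq('He saw a purple truck .', toy_vocab_to_int)
print(seq)  # 'purple' is not in the toy vocabulary, so it becomes the <UNK> id (2)
```

Because the sentence is lowercased before lookup, `'He'` and `'he'` map to the same id.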

## Translate from English to French

In [18]:
translate_sentence = 'he saw a old yellow truck .'

translate_sentence = sentence_to_seq(translate_sentence, source_vocab_to_int)

loaded_graph = tf.Graph()
with tf.Session(graph=loaded_graph) as sess:
    # Load saved model
    loader = tf.train.import_meta_graph(load_path + '.meta')
    loader.restore(sess, load_path)

    input_data = loaded_graph.get_tensor_by_name('input:0')
    logits = loaded_graph.get_tensor_by_name('predictions:0')
    target_sequence_length = loaded_graph.get_tensor_by_name('target_sequence_length:0')
    source_sequence_length = loaded_graph.get_tensor_by_name('source_sequence_length:0')
    keep_prob = loaded_graph.get_tensor_by_name('keep_prob:0')

    translate_logits = sess.run(logits, {input_data: [translate_sentence]*batch_size,
                                         target_sequence_length: [len(translate_sentence)*2]*batch_size,
                                         source_sequence_length: [len(translate_sentence)]*batch_size,
                                         keep_prob: 1.0})[0]

print('Input')
print('  Word Ids:      {}'.format([i for i in translate_sentence]))
print('  English Words: {}'.format([source_int_to_vocab[i] for i in translate_sentence]))

print('\nPrediction')
print('  Word Ids:      {}'.format([i for i in translate_logits]))
print('  French Words: {}'.format(" ".join([target_int_to_vocab[i] for i in translate_logits])))

INFO:tensorflow:Restoring parameters from checkpoints/dev
Input
  Word Ids:      [130, 190, 107, 91, 99, 127, 44]
  English Words: ['he', 'saw', 'a', 'old', 'yellow', 'truck', '.']

Prediction
  Word Ids:      [327, 260, 347, 222, 229, 130, 16, 1]
  French Words: il conduit ce vieux camion jaune . <EOS>
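
The printed prediction still contains the `<EOS>` marker. A minimal sketch of turning predicted ids into a clean sentence follows. `ids_to_sentence` is a hypothetical helper, and `toy_int_to_vocab` holds only the id-to-word pairs visible in the output above; the `<PAD>` entry at id 0 is an assumption.

```python
def ids_to_sentence(ids, int_to_vocab, eos_token='<EOS>', pad_token='<PAD>'):
    """Join predicted word ids into a string, stopping at <EOS> and skipping <PAD>."""
    words = []
    for word_id in ids:
        word = int_to_vocab[word_id]
        if word == eos_token:
            break  # everything after <EOS> is ignored
        if word != pad_token:
            words.append(word)
    return ' '.join(words)

# Id-to-word pairs taken from the prediction printed above; id 0 for <PAD> is assumed.
toy_int_to_vocab = {0: '<PAD>', 1: '<EOS>', 327: 'il', 260: 'conduit', 347: 'ce',
                    222: 'vieux', 229: 'camion', 130: 'jaune', 16: '.'}

predicted_ids = [327, 260, 347, 222, 229, 130, 16, 1]
print(ids_to_sentence(predicted_ids, toy_int_to_vocab))
```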


## Conclusion

Because the dataset has a vocabulary of only 227 English words, the model translates well for only a limited set of sentences; any word outside that vocabulary is mapped to `<UNK>`. To build a better translation model, we can train on the [WMT10 French-English corpus](http://www.statmt.org/wmt10/training-giga-fren.tar). That dataset has a larger vocabulary and covers a richer range of topics, but it takes much longer to train on.
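
One rough way to judge whether a sentence falls inside the model's limited vocabulary is to measure how many of its words are unknown. This is only a sketch: `unknown_word_rate` is a hypothetical helper, and `toy_vocab` stands in for the keys of `source_vocab_to_int`.

```python
def unknown_word_rate(sentence, vocab):
    """Return the fraction of words in the sentence that are missing from vocab."""
    words = sentence.lower().split()
    if not words:
        return 0.0
    unknown = [word for word in words if word not in vocab]
    return len(unknown) / len(words)

# Stand-in for the real source vocabulary (assumption for illustration).
toy_vocab = {'he', 'saw', 'a', 'old', 'yellow', 'truck', '.'}

rate = unknown_word_rate('she bought a new yellow truck .', toy_vocab)
print('Unknown words: {:.0%}'.format(rate))  # 3 of the 7 words are not in toy_vocab
```

A high rate suggests the translation will be unreliable, since every unknown word reaches the model as the same `<UNK>` id.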