# Creating a Spell Checker

[Original github repo](https://github.com/Currie32/Spell-Checker)

The objective of this project is to build a model that takes a sentence with spelling mistakes as input and outputs the same sentence with the mistakes corrected. The data for this project is twenty popular books from [Project Gutenberg](http://www.gutenberg.org/ebooks/search/?sort_order=downloads). The model is designed using grid search to find the best architecture and hyperparameter values. The best results, as measured by sequence loss on the 15% of the data held out for testing, came from a two-layer network with a bidirectional RNN in the encoding layer and Bahdanau attention in the decoding layer. [FloydHub's](https://www.floydhub.com/) GPU service was used to train the model.

The sections of the project are:
- Loading the Data
- Preparing the Data
- Building the Model
- Training the Model
- Fixing Custom Sentences
- Summary

In [1]:
import pandas as pd
import numpy as np
import tensorflow as tf
import random
import os
from os import listdir
from os.path import isfile, join
from collections import namedtuple
from tensorflow.python.layers.core import Dense
from tensorflow.python.ops.rnn_cell_impl import _zero_state_tensors
import time
from tqdm import tqdm
import re
from sklearn.model_selection import train_test_split

## MM

In [2]:
with open('data/rnn-mm-novel.txt', 'r', encoding='utf-8') as handle:
    sentences = [l.replace(' ', '').replace('\n', '').replace('”', '').replace(
        '“', '').replace('\ufeff', '') for l in handle.readlines()]
print(len(sentences))
print(sentences[:3])

1603
['ကိုကို', 'ဘာလဲ', 'ကိုကို့ဟာကြီးခဏပိတ်လိုက်စမ်းပါ']


In [3]:
vocab_mm = list()
with open('data/chars.mm', 'r', encoding='utf-8') as handle:
    for i, l in enumerate(handle.readlines()):
        if i % 2 == 0:
            vocab_mm.append(l.replace('\n', ''))
print(len(vocab_mm))
print(vocab_mm[-10:])

80
['၆', '၇', '၈', '၉', '၊', '။', '၌', '၍', '၎', '၏']


In [4]:
# char-int
vocab_to_int = {v: k for k, v in enumerate(vocab_mm)}
int_to_vocab = [v for v in vocab_mm]
codes = ['<PAD>', '<EOS>', '<GO~>', ' ', '…', '×', '-', '*', '.']
for i, c in zip(range(len(int_to_vocab), len(int_to_vocab) + len(codes), 1), codes):
    int_to_vocab.append(c)
    vocab_to_int[c] = i

In [5]:
# Convert sentences to integers

def sent_to_int(sentence, v2i):
    return [v2i[c] for c in sentence]

def int_to_sent(int_sent, i2v):
    return ''.join([i2v[i] for i in int_sent])

int_sentences = [sent_to_int(s, vocab_to_int) for s in sentences]

# Find the length of each sentence
lengths = pd.DataFrame([len(s) for s in int_sentences], columns=["counts"])
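
The character-to-integer mapping can be checked with a round trip. The following is a minimal sketch that builds a toy vocabulary from one sample sentence instead of reading `data/chars.mm`; the names `toy_vocab`, `toy_codes`, `toy_int_to_vocab` and `toy_vocab_to_int` are assumptions made for illustration, and only the two helper functions mirror the notebook.

```python
# Toy vocabulary derived from a single sentence (an assumption for
# illustration; the notebook reads its 80 characters from data/chars.mm)
text = 'ကိုကို'
toy_vocab = sorted(set(text))
toy_codes = ['<PAD>', '<EOS>', '<GO~>']

# Characters come first and the special codes are appended after them,
# which is the same ordering the notebook uses
toy_int_to_vocab = toy_vocab + toy_codes
toy_vocab_to_int = {c: i for i, c in enumerate(toy_int_to_vocab)}


def sent_to_int(sentence, v2i):
    return [v2i[c] for c in sentence]


def int_to_sent(int_sent, i2v):
    return ''.join(i2v[i] for i in int_sent)


encoded = sent_to_int(text, toy_vocab_to_int)
decoded = int_to_sent(encoded, toy_int_to_vocab)
print(encoded)
print(decoded == text)  # True: decoding inverts encoding
```

Because `sent_to_int` indexes the dictionary directly, any character that is missing from the vocabulary raises a `KeyError`, so the vocabulary file has to cover every character that survives the cleaning step.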

In [6]:
lengths.describe()

| | counts |
|---|---|
| count | 1603.0 |
| mean | 57.203993 |
| std | 62.575379 |
| min | 0.0 |
| 25% | 10.0 |
| 50% | 39.0 |
| 75% | 83.0 |
| max | 493.0 |


In [7]:
# Limit the data we will use to train our model
max_length = 92
min_length = 10

good_sentences = []

for sentence in int_sentences:
    if len(sentence) <= max_length and len(sentence) >= min_length:
        good_sentences.append(sentence)

print("We will use {} to train and test our model.".format(len(good_sentences)))

We will use 878 to train and test our model.


In [8]:
# Split the data into training and testing sentences
training, testing = train_test_split(good_sentences, test_size = 0.15, random_state = 2)

print("Number of training sentences:", len(training))
print("Number of testing sentences:", len(testing))

Number of training sentences: 746
Number of testing sentences: 132


In [9]:
# Sort the sentences by length to reduce padding, which will allow the model to train faster
training_sorted = []
testing_sorted = []

for i in range(min_length, max_length+1):
    for sentence in training:
        if len(sentence) == i:
            training_sorted.append(sentence)
    for sentence in testing:
        if len(sentence) == i:
            testing_sorted.append(sentence)

# Check to ensure the sentences have been selected and sorted correctly
for i in range(5):
    print(training_sorted[i], len(training_sorted[i]))

[20, 4, 58, 10, 44, 16, 44, 28, 44, 56] 10
[16, 61, 49, 55, 1, 50, 55, 28, 44, 56] 10
[6, 45, 47, 84, 24, 44, 0, 45, 16, 58] 10
[33, 44, 56, 84, 20, 44, 56, 16, 26, 58] 10
[0, 45, 47, 56, 20, 44, 27, 46, 21, 43] 10
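
The nested loop above buckets sentences by length while keeping their original order inside each length. As a sketch of an alternative, Python's stable `sorted` with `key=len` gives the same ordering, provided every sentence already lies between `min_length` and `max_length` (which holds after the filtering step). The function name `sort_by_length_loop` and the toy data are assumptions for illustration.

```python
def sort_by_length_loop(sentences, min_length, max_length):
    """Bucket sentences by length with the same nested loop as the notebook."""
    out = []
    for i in range(min_length, max_length + 1):
        for sentence in sentences:
            if len(sentence) == i:
                out.append(sentence)
    return out


# Toy data: every sentence has a length between 1 and 3
toy = [[1, 2, 3], [4], [5, 6], [7, 8, 9], [10]]

by_loop = sort_by_length_loop(toy, 1, 3)
by_sorted = sorted(toy, key=len)  # stable: ties keep their original order

print(by_loop)
print(by_loop == by_sorted)  # True
```

The loop makes one pass over the data per possible length, while `sorted` needs a single sort, which matters little at 746 sentences but would on a larger corpus.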


In [10]:
letters = list(vocab_mm)


def noise_maker(sentences=[sent_to_int('သရဖီဘုန်းဇော်၏မျက်လုံးဝိုင်းလေးများသည်ဝင်းလက်သွားပြီးပြတင်းပေါက်ဆီဘက်သို့ပြေးသွားသည်။', vocab_to_int)],
                error=True, omission=True, repetition=True,
                misspell_proba=0.25, omission_proba=0.15, repetition_proba=0.15,
                verbose=False, v2i=vocab_to_int):
    # there are currently 3 types of errors
    # misspelled errors: ပ်  က်  တ်, ခင်ဗျား ခများ
    # error by omission: း, ေကာင်
    # repetition: းးးးးးးး

    result = list()

    if False:
        error_file = './data/spelling-errors/errors.mm.txt'
        misspelled = {}
        with open(error_file, 'r') as handle:
            for line in handle:
                elements = re.split(r'[ ]+', line.replace('\n', ''))
                misspelled[v2i[elements[0]]] = [v2i[e] for e in elements[1:]]

    omission_file = './data/spelling-errors/omissions.mm.txt'
    with open(omission_file, 'r') as handle:
        omissions = [v2i[l.replace('\n', '')] for l in handle.readlines()]

    repetition_file = './data/spelling-errors/repetitions.mm.txt'
    with open(repetition_file, 'r') as handle:
        repetitions = [v2i[l.replace('\n', '')] for l in handle.readlines()]

    #import ipdb; ipdb.set_trace()
    for sentence in sentences:
        if False and error:
            # misspelled
            misspelled = {}
            misspell_match_count = 0
            misspell_noise_count = 0
            for err in list(misspelled.keys()):
                if err in sentence:
                    if verbose:
                        print('Match found:', err)
                    misspell_match_count += 1
                    if random.random() < misspell_proba:
                        misspell_noise_count += 1
                        error_choice = random.randint(
                            0, len(misspelled[err])-1)
                        if verbose:
                            print('Replacing {} with {}'.format(
                                misspelled[err][error_choice], err))
                        sentence = sentence.replace(
                            err, misspelled[err][error_choice])
            if verbose:
                print('*'*100)
                print('Misspell matches:', misspell_match_count)
                print('Misspell noise added:', misspell_noise_count)
                if misspell_noise_count > 0:
                    print(sentence)
                print('*'*100)

        if omission:
            # omissions
            omission_match_count = 0
            omission_noise_count = 0
            for om in omissions:
                if om in sentence:
                    # if the character is in the sentence,
                    # loop all appearances and remove with probability
                    match_indices = np.where(om == np.array(sentence))[0]
                    omission_match_count += match_indices.shape[0]
                    # match_indices refer to the sentence as it is now, so only
                    # removals made for this character shift later indices
                    n_removed = 0
                    for indx in match_indices:
                        indx -= n_removed
                        if random.random() < omission_proba:
                            if verbose:
                                print('Removing:', om)
                            # drop the character (the sentence becomes one shorter)
                            sentence = sentence[:indx] + sentence[indx+1:]
                            n_removed += 1
                            omission_noise_count += 1
            if verbose:
                print('*'*100)
                print('Omission matches:', omission_match_count)
                print('Omission noise added:', omission_noise_count)
                if omission_noise_count > 0:
                    print(sentence)
                print('*'*100)

        if repetition:
            # repetitions
            repetition_match_count = 0
            repetition_noise_count = 0
            for ch in repetitions:
                if ch in sentence:
                    # if the character is in the sentence,
                    # loop all appearances and repeat with probability
                    match_indices = np.where(ch == np.array(sentence))[0]
                    repetition_match_count += match_indices.shape[0]
                    # match_indices refer to the sentence as it is now, so only
                    # insertions made for this character shift later indices
                    n_added = 0
                    for indx in match_indices:
                        indx += n_added
                        if random.random() < repetition_proba:
                            if verbose:
                                print('Repeating:', ch)
                            # insert a copy of the character next to the original
                            sentence = sentence[:indx] + [ch] + sentence[indx:]
                            n_added += 1
                            repetition_noise_count += 1
            if verbose:
                print('*'*100)
                print('Repetition matches:', repetition_match_count)
                print('Repetition noise added:', repetition_noise_count)
                if repetition_noise_count > 0:
                    print(sentence)
                print('*'*100)
        if -1 in sentence:
            sentence.remove(-1)
        result.append(sentence)
    return result


if False:
    def noise_maker(sentence, threshold):
        '''Relocate, remove, or add characters to create spelling mistakes'''
        noisy_sentence = []
        i = 0
        while i < len(sentence):
            random = np.random.uniform(0, 1, 1)
            # Most characters will be correct since the threshold value is high
            if random < threshold:
                noisy_sentence.append(sentence[i])
            else:
                new_random = np.random.uniform(0, 1, 1)
                # ~33% chance characters will swap locations
                if new_random > 0.67:
                    if i == (len(sentence) - 1):
                        # If last character in sentence, it will not be typed
                        continue
                    else:
                        # if any other character, swap order with following character
                        noisy_sentence.append(sentence[i+1])
                        noisy_sentence.append(sentence[i])
                        i += 1
                # ~33% chance an extra lower case letter will be added to the sentence
                elif new_random < 0.33:
                    random_letter = np.random.choice(letters, 1)[0]
                    noisy_sentence.append(vocab_to_int[random_letter])
                    noisy_sentence.append(sentence[i])
                # ~33% chance a character will not be typed
                else:
                    pass
            i += 1
        return noisy_sentence

In [11]:
# Check to ensure noise_maker is making mistakes correctly.
noise_maker(sentences=training_sorted[:10])
print()




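*Note: The noise_maker function creates spelling mistakes similar to those a person would make. In the active version it omits characters listed in `omissions.mm.txt` and repeats characters listed in `repetitions.mm.txt`. The misspelling substitutions and the older variant that swaps, drops, or inserts random characters are both disabled with `if False`.*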
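
To make the two active noise types concrete, here is a minimal single-pass sketch. It is a simplified variant and not the notebook's implementation: the function `add_noise`, its parameters, and the toy ids are assumptions, and a dedicated `random.Random` instance is used so the behaviour at the extreme probabilities is easy to check.

```python
import random


def add_noise(sentence, omit_ids, repeat_ids, omit_p, repeat_p, rng):
    """Return a noisy copy: drop ids in omit_ids, double ids in repeat_ids."""
    noisy = []
    for token in sentence:
        if token in omit_ids and rng.random() < omit_p:
            continue  # omission: the character is not typed
        noisy.append(token)
        if token in repeat_ids and rng.random() < repeat_p:
            noisy.append(token)  # repetition: the character is typed twice
    return noisy


rng = random.Random(0)
clean = [3, 7, 3, 9, 7]

# Probabilities of 0 leave the sentence unchanged
unchanged = add_noise(clean, {7}, {3}, 0.0, 0.0, rng)
# Probabilities of 1 apply every possible edit
all_edits = add_noise(clean, {7}, {3}, 1.0, 1.0, rng)

print(unchanged)  # [3, 7, 3, 9, 7]
print(all_edits)  # [3, 3, 3, 3, 9]
```

Building a new list instead of editing in place keeps the clean sentence intact, which matters because the clean sentence is also the training target.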

## Building the Model

In [12]:
def model_inputs():
    '''Create placeholders for inputs to the model'''
    
    with tf.name_scope('inputs'):
        inputs = tf.placeholder(tf.int32, [None, None], name='inputs')
    with tf.name_scope('targets'):
        targets = tf.placeholder(tf.int32, [None, None], name='targets')
    keep_prob = tf.placeholder(tf.float32, name='keep_prob')
    inputs_length = tf.placeholder(tf.int32, (None,), name='inputs_length')
    targets_length = tf.placeholder(tf.int32, (None,), name='targets_length')
    max_target_length = tf.reduce_max(targets_length, name='max_target_len')

    return inputs, targets, keep_prob, inputs_length, targets_length, max_target_length

In [13]:
def process_encoding_input(targets, vocab_to_int, batch_size):
    '''Remove the last character id from each row of the batch and prepend the <GO~> id to each row'''
    
    with tf.name_scope("process_encoding"):
        ending = tf.strided_slice(targets, [0, 0], [batch_size, -1], [1, 1])
        dec_input = tf.concat([tf.fill([batch_size, 1], vocab_to_int['<GO~>']), ending], 1)

    return dec_input
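
The effect of `process_encoding_input` is easier to see on plain lists. The sketch below reproduces the same shift in pure Python: the last id of each target row is dropped and the `<GO~>` id is placed in front, so the decoder sees the previous target character at every step. The function name `make_decoder_input` and the ids for `GO`, `EOS` and `PAD` are assumptions for illustration.

```python
GO, EOS, PAD = 82, 81, 80  # hypothetical ids for <GO~>, <EOS>, <PAD>


def make_decoder_input(targets, go_id):
    """Drop the last id of every row and prepend the <GO~> id."""
    return [[go_id] + row[:-1] for row in targets]


targets = [
    [5, 6, 7, EOS],
    [8, 9, EOS, PAD],
]
dec_input = make_decoder_input(targets, GO)

print(dec_input)  # [[82, 5, 6, 7], [82, 8, 9, 81]]
```

Each row keeps its length, so the decoder input and the targets stay aligned step by step.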

In [14]:
def encoding_layer(rnn_size, sequence_length, num_layers, rnn_inputs, keep_prob, direction):
    '''Create the encoding layer'''
    
    if direction == 1:
        with tf.name_scope("RNN_Encoder_Cell_1D"):
            for layer in range(num_layers):
                with tf.variable_scope('encoder_{}'.format(layer)):
                    lstm = tf.contrib.rnn.LSTMCell(rnn_size)

                    drop = tf.contrib.rnn.DropoutWrapper(lstm, 
                                                         input_keep_prob = keep_prob)

                    enc_output, enc_state = tf.nn.dynamic_rnn(drop, 
                                                              rnn_inputs,
                                                              sequence_length,
                                                              dtype=tf.float32)

            return enc_output, enc_state
        
        
    if direction == 2: # bi-lstm
        with tf.name_scope("RNN_Encoder_Cell_2D"):
            for layer in range(num_layers):
                with tf.variable_scope('encoder_{}'.format(layer)):
                    cell_fw = tf.contrib.rnn.LSTMCell(rnn_size)
                    cell_fw = tf.contrib.rnn.DropoutWrapper(cell_fw, 
                                                            input_keep_prob = keep_prob)

                    cell_bw = tf.contrib.rnn.LSTMCell(rnn_size)
                    cell_bw = tf.contrib.rnn.DropoutWrapper(cell_bw, 
                                                            input_keep_prob = keep_prob)

                    enc_output, enc_state = tf.nn.bidirectional_dynamic_rnn(cell_fw, 
                                                                            cell_bw, 
                                                                            rnn_inputs,
                                                                            sequence_length,
                                                                            dtype=tf.float32)
            # Join outputs since we are using a bidirectional RNN
            enc_output = tf.concat(enc_output,2)
            # Use only the forward state because the model can't use both states at once
            return enc_output, enc_state[0]

In [15]:
def training_decoding_layer(dec_embed_input, targets_length, dec_cell, initial_state, output_layer, 
                            vocab_size, max_target_length):
    '''Create the training logits'''
    
    with tf.name_scope("Training_Decoder"):
        training_helper = tf.contrib.seq2seq.TrainingHelper(inputs=dec_embed_input,
                                                            sequence_length=targets_length,
                                                            time_major=False)

        training_decoder = tf.contrib.seq2seq.BasicDecoder(dec_cell,
                                                           training_helper,
                                                           initial_state,
                                                           output_layer) 

        training_logits, _ = tf.contrib.seq2seq.dynamic_decode(training_decoder,
                                                               output_time_major=False,
                                                               impute_finished=True,
                                                               maximum_iterations=max_target_length)
        return training_logits

In [16]:
def inference_decoding_layer(embeddings, start_token, end_token, dec_cell, initial_state, output_layer,
                             max_target_length, batch_size):
    '''Create the inference logits'''
    
    with tf.name_scope("Inference_Decoder"):
        start_tokens = tf.tile(tf.constant([start_token], dtype=tf.int32), [batch_size], name='start_tokens')

        inference_helper = tf.contrib.seq2seq.GreedyEmbeddingHelper(embeddings,
                                                                    start_tokens,
                                                                    end_token)

        inference_decoder = tf.contrib.seq2seq.BasicDecoder(dec_cell,
                                                            inference_helper,
                                                            initial_state,
                                                            output_layer)

        inference_logits, _ = tf.contrib.seq2seq.dynamic_decode(inference_decoder,
                                                                output_time_major=False,
                                                                impute_finished=True,
                                                                maximum_iterations=max_target_length)

        return inference_logits
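
At inference time there are no target characters to feed in, so the decoder consumes its own previous prediction. The following pure-Python sketch shows that greedy loop in isolation. It is not TensorFlow code: `greedy_decode`, `toy_step` and the five-token vocabulary are assumptions for illustration, and the sketch leaves the end token out of its output as a choice made for the example.

```python
def greedy_decode(step_fn, start_token, end_token, max_steps):
    """Feed back the highest-scoring token until end_token or max_steps."""
    output = []
    token, state = start_token, None
    for _ in range(max_steps):
        scores, state = step_fn(token, state)
        token = max(range(len(scores)), key=scores.__getitem__)  # argmax
        if token == end_token:
            break
        output.append(token)
    return output


def toy_step(token, state):
    """Hypothetical decoder step that always prefers token + 1 (5 tokens)."""
    scores = [0.0] * 5
    scores[(token + 1) % 5] = 1.0
    return scores, state


full = greedy_decode(toy_step, start_token=0, end_token=4, max_steps=10)
capped = greedy_decode(toy_step, start_token=0, end_token=4, max_steps=2)

print(full)    # [1, 2, 3]
print(capped)  # [1, 2]
```

The `max_steps` cap plays the role of `maximum_iterations` in the cell above: it bounds decoding when the end token is never produced.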

In [17]:
def decoding_layer(dec_embed_input, embeddings, enc_output, enc_state, vocab_size, inputs_length, targets_length, 
                   max_target_length, rnn_size, vocab_to_int, keep_prob, batch_size, num_layers, direction):
    '''Create the decoding cell and attention for the training and inference decoding layers'''
    
    with tf.name_scope("RNN_Decoder_Cell"):
        for layer in range(num_layers):
            with tf.variable_scope('decoder_{}'.format(layer)):
                lstm = tf.contrib.rnn.LSTMCell(rnn_size)
                dec_cell = tf.contrib.rnn.DropoutWrapper(lstm, 
                                                         input_keep_prob = keep_prob)
    
    output_layer = Dense(vocab_size,
                         kernel_initializer = tf.truncated_normal_initializer(mean = 0.0, stddev=0.1))
    
    attn_mech = tf.contrib.seq2seq.BahdanauAttention(rnn_size,
                                                  enc_output,
                                                  inputs_length,
                                                  normalize=False,
                                                  name='BahdanauAttention')
    
    with tf.name_scope("Attention_Wrapper"):
        dec_cell = tf.contrib.seq2seq.DynamicAttentionWrapper(dec_cell,
                                                              attn_mech,
                                                              rnn_size)
    
    initial_state = tf.contrib.seq2seq.DynamicAttentionWrapperState(enc_state,
                                                                    _zero_state_tensors(rnn_size, 
                                                                                        batch_size, 
                                                                                        tf.float32))

    with tf.variable_scope("decode"):
        training_logits = training_decoding_layer(dec_embed_input, 
                                                  targets_length, 
                                                  dec_cell, 
                                                  initial_state,
                                                  output_layer,
                                                  vocab_size, 
                                                  max_target_length)
    with tf.variable_scope("decode", reuse=True):
        inference_logits = inference_decoding_layer(embeddings,  
                                                    vocab_to_int['<GO~>'], 
                                                    vocab_to_int['<EOS>'],
                                                    dec_cell, 
                                                    initial_state, 
                                                    output_layer,
                                                    max_target_length,
                                                    batch_size)

    return training_logits, inference_logits
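
For reference, additive (Bahdanau-style) attention is usually written as below. The symbols are not taken from the notebook: $q_t$ stands for the decoder query at step $t$, $h_j$ for the encoder output at input position $j$, and $W_q$, $W_k$, $v$ for learned parameters whose inner dimension corresponds to the `rnn_size` argument passed to `BahdanauAttention`.

$$
e_{t,j} = v^{\top} \tanh\left(W_q q_t + W_k h_j\right), \qquad
\alpha_{t,j} = \frac{\exp(e_{t,j})}{\sum_{k} \exp(e_{t,k})}, \qquad
c_t = \sum_{j} \alpha_{t,j} \, h_j
$$

The context vector $c_t$ is a weighted average of the encoder outputs, and passing `inputs_length` lets the attention mechanism ignore positions beyond the end of each input sentence.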

In [18]:
def seq2seq_model(inputs, targets, keep_prob, inputs_length, targets_length, max_target_length, 
                  vocab_size, rnn_size, num_layers, vocab_to_int, batch_size, embedding_size, direction):
    '''Use the previous functions to create the training and inference logits'''
    
    enc_embeddings = tf.Variable(tf.random_uniform([vocab_size, embedding_size], -1, 1))
    enc_embed_input = tf.nn.embedding_lookup(enc_embeddings, inputs)
    enc_output, enc_state = encoding_layer(rnn_size, inputs_length, num_layers, 
                                           enc_embed_input, keep_prob, direction)
    
    dec_embeddings = tf.Variable(tf.random_uniform([vocab_size, embedding_size], -1, 1))
    dec_input = process_encoding_input(targets, vocab_to_int, batch_size)
    dec_embed_input = tf.nn.embedding_lookup(dec_embeddings, dec_input)
    
    training_logits, inference_logits  = decoding_layer(dec_embed_input, 
                                                        dec_embeddings,
                                                        enc_output,
                                                        enc_state, 
                                                        vocab_size, 
                                                        inputs_length, 
                                                        targets_length, 
                                                        max_target_length,
                                                        rnn_size, 
                                                        vocab_to_int, 
                                                        keep_prob, 
                                                        batch_size,
                                                        num_layers,
                                                        direction)
    
    return training_logits, inference_logits

In [19]:
def pad_sentence_batch(sentence_batch):
    """Pad sentences with <PAD> so that each sentence of a batch has the same length"""
    max_sentence = max([len(sentence) for sentence in sentence_batch])
    return [sentence + [vocab_to_int['<PAD>']] * (max_sentence - len(sentence)) for sentence in sentence_batch]

In [20]:
def get_batches(sentences, batch_size, threshold):
    """Batch sentences, noisy sentences, and the lengths of their sentences together.
       With each epoch, sentences will receive new mistakes"""
    
    for batch_i in range(0, len(sentences)//batch_size):
        start_i = batch_i * batch_size
        sentences_batch = sentences[start_i:start_i + batch_size]
        
        sentences_batch_noisy = []
        for sentence in sentences_batch:
            sentences_batch_noisy.append(noise_maker(sentences=[sentence])[0])
            
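        sentences_batch_eos = []
        for sentence in sentences_batch:
            # Build a new list so the stored sentence is not mutated;
            # appending in place would add another <EOS> on every epoch
            sentences_batch_eos.append(sentence + [vocab_to_int['<EOS>']])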
            
        pad_sentences_batch = np.array(pad_sentence_batch(sentences_batch_eos))
        pad_sentences_noisy_batch = np.array(pad_sentence_batch(sentences_batch_noisy))
        
        # Need the lengths for the _lengths parameters
        pad_sentences_lengths = []
        for sentence in pad_sentences_batch:
            pad_sentences_lengths.append(len(sentence))
        
        pad_sentences_noisy_lengths = []
        for sentence in pad_sentences_noisy_batch:
            pad_sentences_noisy_lengths.append(len(sentence))
        
        yield pad_sentences_noisy_batch, pad_sentences_batch, pad_sentences_noisy_lengths, pad_sentences_lengths
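
Note that `get_batches` measures lengths after padding, so every length in a batch equals the longest sentence in that batch. Because the sentences are sorted by length, the difference is usually small. As a sketch of a possible variation, the lengths can be taken before padding so that the loss mask excludes the `<PAD>` positions. The names `pad_batch` and `PAD` and the toy batch are assumptions for illustration.

```python
PAD = 0  # hypothetical id for <PAD>


def pad_batch(batch, pad_id):
    """Right-pad every sentence to the longest one in the batch."""
    longest = max(len(s) for s in batch)
    return [s + [pad_id] * (longest - len(s)) for s in batch]


batch = [[5, 6, 7], [8], [9, 10]]

true_lengths = [len(s) for s in batch]     # measured before padding
padded = pad_batch(batch, PAD)
padded_lengths = [len(s) for s in padded]  # measured after padding

print(padded)          # [[5, 6, 7], [8, 0, 0], [9, 10, 0]]
print(true_lengths)    # [3, 1, 2]
print(padded_lengths)  # [3, 3, 3]
```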

*Note: The parameter values set in the "default parameters" cell before training achieved the best results.*

In [21]:
def build_graph(keep_prob, rnn_size, num_layers, batch_size, learning_rate, embedding_size, direction):

    tf.reset_default_graph()
    
    # Load the model inputs    
    inputs, targets, keep_prob, inputs_length, targets_length, max_target_length = model_inputs()

    # Create the training and inference logits
    training_logits, inference_logits = seq2seq_model(tf.reverse(inputs, [-1]),
                                                      targets, 
                                                      keep_prob,   
                                                      inputs_length,
                                                      targets_length,
                                                      max_target_length,
                                                      len(vocab_to_int)+1,
                                                      rnn_size, 
                                                      num_layers, 
                                                      vocab_to_int,
                                                      batch_size,
                                                      embedding_size,
                                                      direction)

    # Create tensors for the training logits and inference logits
    training_logits = tf.identity(training_logits.rnn_output, 'logits')

    with tf.name_scope('predictions'):
        predictions = tf.identity(inference_logits.sample_id, name='predictions')
        tf.summary.histogram('predictions', predictions)

    # Create the weights for sequence_loss
    masks = tf.sequence_mask(targets_length, max_target_length, dtype=tf.float32, name='masks')
    
    with tf.name_scope("cost"):
        # Loss function
        cost = tf.contrib.seq2seq.sequence_loss(training_logits, 
                                                targets, 
                                                masks)
        tf.summary.scalar('cost', cost)

    with tf.name_scope("optimize"):
        optimizer = tf.train.AdamOptimizer(learning_rate)

        # Gradient Clipping
        gradients = optimizer.compute_gradients(cost)
        capped_gradients = [(tf.clip_by_value(grad, -5., 5.), var) for grad, var in gradients if grad is not None]
        train_op = optimizer.apply_gradients(capped_gradients)

    # Merge all of the summaries
    merged = tf.summary.merge_all()    

    # Export the nodes 
    export_nodes = ['inputs', 'targets', 'keep_prob', 'cost', 'inputs_length', 'targets_length',
                    'predictions', 'merged', 'train_op','optimizer']
    Graph = namedtuple('Graph', export_nodes)
    local_dict = locals()
    graph = Graph(*[local_dict[each] for each in export_nodes])

    return graph
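
The cost in `build_graph` combines a sequence mask with a cross-entropy loss. The sketch below shows the idea in pure Python under the assumption that the loss is averaged over all unmasked time steps in the batch. It is an illustration of the technique and not the TensorFlow implementation; `masked_sequence_loss` and the toy inputs are hypothetical.

```python
import math


def masked_sequence_loss(logits, targets, lengths):
    """Average softmax cross-entropy over the unmasked steps of a batch."""
    total, count = 0.0, 0
    for seq_logits, seq_targets, length in zip(logits, targets, lengths):
        for t in range(length):  # steps at or beyond `length` are masked out
            step = seq_logits[t]
            peak = max(step)  # subtract the max for numerical stability
            log_norm = peak + math.log(sum(math.exp(x - peak) for x in step))
            total += log_norm - step[seq_targets[t]]
            count += 1
    return total / count


# Two sequences, two time steps, four classes. The second sequence has
# length 1, so its second step (the row of 9.0s) is ignored.
logits = [
    [[0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]],
    [[0.0, 0.0, 0.0, 0.0], [9.0, 9.0, 9.0, 9.0]],
]
targets = [[1, 2], [3, 0]]
lengths = [2, 1]

loss = masked_sequence_loss(logits, targets, lengths)
print(loss)  # uniform logits over 4 classes give log(4)
```

With uniform logits every unmasked step contributes `log(4)`, so the average is `log(4)` regardless of what the masked step contains.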

## Training the Model

In [22]:
def train(model, epochs, log_string):
    '''Train the RNN'''

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())

        # Used to determine when to stop the training early
        testing_loss_summary = []

        # Keep track of which batch iteration is being trained
        iteration = 0

        display_step = 30  # The progress of the training will be displayed after every 30 batches
        stop_early = 0
        stop = 3e9  # Early stopping is effectively disabled: 3e9 consecutive checks without improvement is never reached (use 3 to stop after 3 checks)
        per_epoch = 3  # Test the model 3 times per epoch
        testing_check = (len(training_sorted)//batch_size//per_epoch)-1

        print()
        print("Training Model: {}".format(log_string))

        train_writer = tf.summary.FileWriter(
            './logs/1/train/{}'.format(log_string), sess.graph)
        test_writer = tf.summary.FileWriter(
            './logs/1/test/{}'.format(log_string))

        for epoch_i in range(1, epochs+1):
            print('-'*20, 'epoch', epoch_i, 'of', epochs, '-'*20)
            batch_loss = 0
            batch_time = 0

            for batch_i, (input_batch, target_batch, input_length, target_length) in tqdm(enumerate(
                    get_batches(training_sorted, batch_size, threshold))):
                start_time = time.time()
                
                summary, loss, _ = sess.run([model.merged,
                                             model.cost,
                                             model.train_op],
                                            {model.inputs: input_batch,
                                             model.targets: target_batch,
                                             model.inputs_length: input_length,
                                             model.targets_length: target_length,
                                             model.keep_prob: keep_probability})

                batch_loss += loss
                end_time = time.time()
                batch_time += end_time - start_time

                # Record the progress of training
                train_writer.add_summary(summary, iteration)

                iteration += 1

                if batch_i % display_step == 0 and batch_i > 0:
                    print('Epoch {:>3}/{} Batch {:>4}/{} - Loss: {:>6.3f}, Seconds: {:>4.2f}'
                          .format(epoch_i,
                                  epochs,
                                  batch_i,
                                  len(training_sorted) // batch_size,
                                  batch_loss / display_step,
                                  batch_time))
                    batch_loss = 0
                    batch_time = 0

                #### Testing ####
                if batch_i % testing_check == 0 and batch_i > 0:
                    batch_loss_testing = 0
                    batch_time_testing = 0
                    for batch_i, (input_batch, target_batch, input_length, target_length) in enumerate(
                            get_batches(testing_sorted, batch_size, threshold)):
                        start_time_testing = time.time()
                        summary, loss = sess.run([model.merged,
                                                  model.cost],
                                                 {model.inputs: input_batch,
                                                  model.targets: target_batch,
                                                  model.inputs_length: input_length,
                                                  model.targets_length: target_length,
                                                  model.keep_prob: 1})

                        batch_loss_testing += loss
                        end_time_testing = time.time()
                        batch_time_testing += end_time_testing - start_time_testing

                        # Record the progress of testing
                        test_writer.add_summary(summary, iteration)

                    n_batches_testing = batch_i + 1
                    print('Testing Loss: {:>6.3f}, Seconds: {:>4.2f}'
                          .format(batch_loss_testing / n_batches_testing,
                                  batch_time_testing))

                    batch_time_testing = 0

                    # If the batch_loss_testing is at a new minimum, save the model
                    testing_loss_summary.append(batch_loss_testing)
                    if batch_loss_testing <= min(testing_loss_summary):
                        print('New Record!')
                        stop_early = 0
                        checkpoint = "./{}.ckpt".format(log_string)
                        saver = tf.train.Saver()
                        saver.save(sess, checkpoint)
                    else:
                        print("No Improvement.")
                        stop_early += 1
                        if stop_early == stop:
                            break

            if stop_early == stop:
                print("Stopping Training.")
                break
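
The early-stopping bookkeeping inside `train` can be isolated into a small function. The sketch below is an assumption-laden illustration: `early_stopping_step` and `patience` are hypothetical names, and the loss values are the first ten testing losses printed in the training log further down. With a patience of 3, this rule would have stopped at the tenth check.

```python
def early_stopping_step(losses, new_loss, bad_checks, patience):
    """Record new_loss and return (is_record, bad_checks, should_stop)."""
    losses.append(new_loss)
    if new_loss <= min(losses):
        return True, 0, False  # new minimum: reset the counter
    bad_checks += 1
    return False, bad_checks, bad_checks >= patience


# Testing losses as printed in the log (epochs 1 to 4)
logged = [8.284, 3.747, 3.531, 2.906, 2.702, 2.692, 2.119, 2.167, 2.261, 2.154]

history, bad, stopped_at = [], 0, None
for check, loss_value in enumerate(logged, start=1):
    is_record, bad, stop_now = early_stopping_step(history, loss_value, bad, patience=3)
    if stop_now:
        stopped_at = check
        break

print(stopped_at)  # 10
```

In the notebook `stop` is set to `3e9`, so the counter never reaches it and training runs for the full number of epochs.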

In [23]:
# The default parameters
epochs = 100
batch_size = 32
num_layers = 2
rnn_size = 512
embedding_size = 64
learning_rate = 5e-3
direction = 2
threshold = 0.8
keep_probability = 0.75
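
The training cell that follows sweeps `keep_probability`, `num_layers` and `threshold` with three nested loops. As a sketch of an equivalent layout, `itertools.product` enumerates the same combinations from a single dictionary, which scales better when more hyperparameters are added. The names `grid` and `iter_configs` are assumptions for illustration; the `log_string` format matches the one used in the notebook.

```python
from itertools import product

# One list of candidate values per hyperparameter
grid = {
    'keep_probability': [0.75],
    'num_layers': [2],
    'threshold': [0.95],
}


def iter_configs(grid):
    """Yield one dict per combination of the values in grid."""
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))


log_strings = [
    'kp={keep_probability},nl={num_layers},th={threshold}'.format(**cfg)
    for cfg in iter_configs(grid)
]
print(log_strings)  # ['kp=0.75,nl=2,th=0.95']
```

Each configuration would then be passed to `build_graph` and `train` exactly as in the nested-loop version.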

In [None]:
# Train the model with the desired tuning parameters
for keep_probability in [0.75]:
    for num_layers in [2]:
        for threshold in [0.95]:
            log_string = 'kp={},nl={},th={}'.format(keep_probability,
                                                    num_layers,
                                                    threshold)
            model = build_graph(keep_probability, rnn_size, num_layers, batch_size,
                                learning_rate, embedding_size, direction)
            train(model, epochs, log_string)

Training Model: kp=0.75,nl=2,th=0.95


0it [00:00, ?it/s]

-------------------- epoch 1 of 100 --------------------


6it [00:11,  1.97s/it]

Testing Loss:  8.284, Seconds: 16.74
New Record!


12it [00:54,  4.58s/it]

Testing Loss:  3.747, Seconds: 16.98
New Record!


18it [02:02,  6.81s/it]

Testing Loss:  3.531, Seconds: 16.69
New Record!


23it [03:39,  9.54s/it]
0it [00:00, ?it/s]

-------------------- epoch 2 of 100 --------------------


6it [00:11,  1.89s/it]

Testing Loss:  2.906, Seconds: 14.50
New Record!


12it [00:52,  4.39s/it]

Testing Loss:  2.702, Seconds: 15.42
New Record!


18it [02:04,  6.91s/it]

Testing Loss:  2.692, Seconds: 18.11
New Record!


23it [03:36,  9.39s/it]
0it [00:00, ?it/s]

-------------------- epoch 3 of 100 --------------------


6it [00:11,  1.91s/it]

Testing Loss:  2.119, Seconds: 17.61
New Record!


13it [01:21,  6.25s/it]

Testing Loss:  2.167, Seconds: 17.61
No Improvement.


19it [02:37,  8.27s/it]

Testing Loss:  2.261, Seconds: 19.59
No Improvement.


23it [03:38,  9.50s/it]
0it [00:00, ?it/s]

-------------------- epoch 4 of 100 --------------------


7it [00:29,  4.18s/it]

Testing Loss:  2.154, Seconds: 13.43
No Improvement.


12it [00:53,  4.43s/it]

Testing Loss:  2.112, Seconds: 26.58
New Record!


18it [02:20,  7.80s/it]

Testing Loss:  1.971, Seconds: 18.13
New Record!


23it [03:52, 10.10s/it]
0it [00:00, ?it/s]

-------------------- epoch 5 of 100 --------------------


7it [00:33,  4.72s/it]

Testing Loss:  2.525, Seconds: 16.38
No Improvement.


13it [01:27,  6.71s/it]

Testing Loss:  2.504, Seconds: 23.03
No Improvement.


19it [02:50,  8.98s/it]

Testing Loss:  2.781, Seconds: 22.98
No Improvement.


23it [03:55, 10.25s/it]
0it [00:00, ?it/s]

-------------------- epoch 6 of 100 --------------------


7it [00:36,  5.21s/it]

Testing Loss:  2.380, Seconds: 18.76
No Improvement.


13it [01:33,  7.16s/it]

Testing Loss:  2.507, Seconds: 23.75
No Improvement.


19it [03:02,  9.60s/it]

Testing Loss:  2.721, Seconds: 25.05
No Improvement.


23it [04:07, 10.78s/it]
0it [00:00, ?it/s]

-------------------- epoch 7 of 100 --------------------


7it [00:38,  5.54s/it]

Testing Loss:  3.304, Seconds: 20.19
No Improvement.


13it [01:37,  7.52s/it]

Testing Loss:  3.158, Seconds: 25.02
No Improvement.


19it [03:05,  9.75s/it]

Testing Loss:  2.433, Seconds: 25.19
No Improvement.


23it [04:12, 10.99s/it]
0it [00:00, ?it/s]

-------------------- epoch 8 of 100 --------------------


6it [00:16,  2.71s/it]

Testing Loss:  1.582, Seconds: 23.79
New Record!


13it [01:45,  8.08s/it]

Testing Loss:  1.775, Seconds: 24.28
No Improvement.


19it [03:10, 10.02s/it]

Testing Loss:  2.278, Seconds: 23.46
No Improvement.


23it [04:16, 11.14s/it]
0it [00:00, ?it/s]

-------------------- epoch 9 of 100 --------------------


7it [00:46,  6.65s/it]

Testing Loss:  3.128, Seconds: 24.21
No Improvement.


13it [01:48,  8.37s/it]

Testing Loss:  2.067, Seconds: 24.42
No Improvement.


19it [03:18, 10.46s/it]

Testing Loss:  1.885, Seconds: 26.07
No Improvement.


23it [04:37, 12.09s/it]
0it [00:00, ?it/s]

-------------------- epoch 10 of 100 --------------------


7it [00:45,  6.56s/it]

Testing Loss:  2.544, Seconds: 24.69
No Improvement.


13it [01:47,  8.30s/it]

Testing Loss:  3.370, Seconds: 24.98
No Improvement.


19it [03:18, 10.46s/it]

Testing Loss:  4.302, Seconds: 25.61
No Improvement.


23it [04:29, 11.70s/it]
0it [00:00, ?it/s]

-------------------- epoch 11 of 100 --------------------


6it [00:17,  2.97s/it]

Testing Loss:  1.328, Seconds: 26.22
New Record!


13it [01:53,  8.73s/it]

Testing Loss:  1.409, Seconds: 25.89
No Improvement.


19it [03:26, 10.89s/it]

Testing Loss:  1.876, Seconds: 26.61
No Improvement.


23it [04:37, 12.05s/it]
0it [00:00, ?it/s]

-------------------- epoch 12 of 100 --------------------


7it [00:49,  7.02s/it]

Testing Loss:  1.974, Seconds: 26.32
No Improvement.


13it [01:55,  8.89s/it]

Testing Loss:  2.151, Seconds: 27.30
No Improvement.


19it [03:34, 11.26s/it]

Testing Loss:  2.032, Seconds: 28.93
No Improvement.


23it [04:55, 12.87s/it]
0it [00:00, ?it/s]

-------------------- epoch 13 of 100 --------------------


7it [00:54,  7.75s/it]

Testing Loss:  1.736, Seconds: 29.04
No Improvement.


12it [01:25,  7.14s/it]

Testing Loss:  1.292, Seconds: 28.27
New Record!


19it [03:42, 11.73s/it]

Testing Loss:  1.805, Seconds: 29.40
No Improvement.


23it [04:54, 12.82s/it]
0it [00:00, ?it/s]

-------------------- epoch 14 of 100 --------------------


6it [00:20,  3.38s/it]

Testing Loss:  1.155, Seconds: 30.34
New Record!


13it [02:07,  9.85s/it]

Testing Loss:  1.712, Seconds: 29.67
No Improvement.


19it [03:52, 12.22s/it]

Testing Loss:  1.962, Seconds: 32.96
No Improvement.


23it [05:06, 13.34s/it]
0it [00:00, ?it/s]

-------------------- epoch 15 of 100 --------------------


6it [00:20,  3.48s/it]

Testing Loss:  1.098, Seconds: 31.51
New Record!


13it [02:15, 10.44s/it]

Testing Loss:  2.079, Seconds: 33.52
No Improvement.


19it [04:11, 13.25s/it]

Testing Loss:  2.542, Seconds: 32.87
No Improvement.


23it [05:26, 14.19s/it]
0it [00:00, ?it/s]

-------------------- epoch 16 of 100 --------------------


7it [01:09,  9.94s/it]

Testing Loss:  1.200, Seconds: 42.71
No Improvement.


13it [02:26, 11.31s/it]

Testing Loss:  2.217, Seconds: 33.56
No Improvement.


19it [04:14, 13.37s/it]

Testing Loss:  2.286, Seconds: 33.58
No Improvement.


23it [05:31, 14.41s/it]
0it [00:00, ?it/s]

-------------------- epoch 17 of 100 --------------------


6it [00:22,  3.72s/it]

Testing Loss:  0.886, Seconds: 33.93
New Record!


13it [02:21, 10.92s/it]

Testing Loss:  1.631, Seconds: 34.69
No Improvement.


19it [04:13, 13.34s/it]

Testing Loss:  0.898, Seconds: 36.09
No Improvement.


23it [05:30, 14.36s/it]
0it [00:00, ?it/s]

-------------------- epoch 18 of 100 --------------------


6it [00:23,  3.88s/it]

Testing Loss:  0.813, Seconds: 34.87
New Record!


13it [02:26, 11.25s/it]

Testing Loss:  0.815, Seconds: 35.67
No Improvement.


19it [04:20, 13.69s/it]

Testing Loss:  0.854, Seconds: 36.61
No Improvement.


23it [05:38, 14.73s/it]
0it [00:00, ?it/s]

-------------------- epoch 19 of 100 --------------------


6it [00:24,  4.00s/it]

Testing Loss:  0.791, Seconds: 35.83
New Record!


13it [02:30, 11.59s/it]

Testing Loss:  0.804, Seconds: 37.26
No Improvement.


19it [04:25, 13.99s/it]

Testing Loss:  0.792, Seconds: 37.37
No Improvement.


23it [05:44, 14.99s/it]
0it [00:00, ?it/s]

-------------------- epoch 20 of 100 --------------------


6it [00:24,  4.15s/it]

Testing Loss:  0.789, Seconds: 38.14
New Record!


12it [01:47,  9.00s/it]

Testing Loss:  0.783, Seconds: 38.92
New Record!


19it [04:39, 14.73s/it]

Testing Loss:  1.004, Seconds: 39.29
No Improvement.


23it [06:01, 15.72s/it]
0it [00:00, ?it/s]

-------------------- epoch 21 of 100 --------------------


7it [01:13, 10.46s/it]

Testing Loss:  1.180, Seconds: 41.86
No Improvement.


13it [02:42, 12.49s/it]

Testing Loss:  1.204, Seconds: 39.68
No Improvement.


19it [04:43, 14.93s/it]

Testing Loss:  1.180, Seconds: 40.83
No Improvement.


23it [06:15, 16.33s/it]
0it [00:00, ?it/s]

-------------------- epoch 22 of 100 --------------------


7it [01:13, 10.46s/it]

Testing Loss:  0.804, Seconds: 40.52
No Improvement.


13it [02:48, 13.00s/it]

Testing Loss:  0.804, Seconds: 44.65
No Improvement.


18it [03:54, 13.02s/it]

Testing Loss:  0.778, Seconds: 42.73
New Record!


23it [06:19, 16.50s/it]
0it [00:00, ?it/s]

-------------------- epoch 23 of 100 --------------------


6it [00:27,  4.62s/it]

Testing Loss:  0.764, Seconds: 42.11
New Record!


13it [02:51, 13.18s/it]

Testing Loss:  0.943, Seconds: 42.36
No Improvement.


19it [04:58, 15.73s/it]

Testing Loss:  1.007, Seconds: 43.32
No Improvement.


23it [06:23, 16.67s/it]
0it [00:00, ?it/s]

-------------------- epoch 24 of 100 --------------------


7it [01:23, 11.86s/it]

Testing Loss:  0.852, Seconds: 44.23
No Improvement.


13it [03:00, 13.88s/it]

Testing Loss:  1.015, Seconds: 44.77
No Improvement.


19it [05:08, 16.21s/it]

Testing Loss:  1.029, Seconds: 44.46
No Improvement.


23it [06:32, 17.07s/it]
0it [00:00, ?it/s]

-------------------- epoch 25 of 100 --------------------


7it [01:21, 11.65s/it]

Testing Loss:  1.092, Seconds: 46.06
No Improvement.


13it [03:00, 13.88s/it]

Testing Loss:  0.796, Seconds: 45.40
No Improvement.


19it [05:12, 16.46s/it]

Testing Loss:  0.774, Seconds: 45.68
No Improvement.


23it [06:51, 17.89s/it]
0it [00:00, ?it/s]

-------------------- epoch 26 of 100 --------------------


6it [00:29,  4.98s/it]

Testing Loss:  0.720, Seconds: 47.14
New Record!


13it [03:06, 14.35s/it]

Testing Loss:  0.807, Seconds: 46.75
No Improvement.


19it [05:22, 16.95s/it]

Testing Loss:  0.883, Seconds: 47.20
No Improvement.


23it [06:50, 17.83s/it]
0it [00:00, ?it/s]

-------------------- epoch 27 of 100 --------------------


7it [01:27, 12.45s/it]

Testing Loss:  0.984, Seconds: 48.89
No Improvement.


13it [03:11, 14.74s/it]

Testing Loss:  0.800, Seconds: 48.04
No Improvement.


19it [05:41, 17.98s/it]

Testing Loss:  0.799, Seconds: 51.92
No Improvement.


23it [07:13, 18.86s/it]
0it [00:00, ?it/s]

-------------------- epoch 28 of 100 --------------------


7it [01:32, 13.15s/it]

Testing Loss:  1.075, Seconds: 51.82
No Improvement.


13it [03:19, 15.35s/it]

Testing Loss:  0.723, Seconds: 50.07
No Improvement.


18it [04:31, 15.10s/it]

Testing Loss:  0.664, Seconds: 50.81
New Record!


23it [07:12, 18.80s/it]
0it [00:00, ?it/s]

-------------------- epoch 29 of 100 --------------------


7it [01:35, 13.70s/it]

Testing Loss:  0.762, Seconds: 56.24
No Improvement.


13it [03:26, 15.89s/it]

Testing Loss:  0.720, Seconds: 51.90
No Improvement.


18it [04:40, 15.58s/it]

Testing Loss:  0.633, Seconds: 52.09
New Record!


23it [07:26, 19.40s/it]
0it [00:00, ?it/s]

-------------------- epoch 30 of 100 --------------------


6it [00:34,  5.75s/it]

Testing Loss:  0.629, Seconds: 52.69
New Record!


13it [03:28, 16.01s/it]

Testing Loss:  0.687, Seconds: 52.75
No Improvement.


19it [05:54, 18.65s/it]

Testing Loss:  0.709, Seconds: 53.60
No Improvement.


23it [07:25, 19.36s/it]
0it [00:00, ?it/s]

-------------------- epoch 31 of 100 --------------------


7it [01:36, 13.77s/it]

Testing Loss:  0.649, Seconds: 54.15
No Improvement.


13it [03:32, 16.36s/it]

Testing Loss:  0.742, Seconds: 55.61
No Improvement.


19it [06:03, 19.14s/it]

Testing Loss:  0.709, Seconds: 55.42
No Improvement.


23it [07:35, 19.79s/it]
0it [00:00, ?it/s]

-------------------- epoch 32 of 100 --------------------


7it [01:40, 14.37s/it]

Testing Loss:  0.845, Seconds: 57.44
No Improvement.


13it [03:42, 17.12s/it]

Testing Loss:  1.251, Seconds: 57.31
No Improvement.


19it [06:17, 19.85s/it]

Testing Loss:  0.728, Seconds: 57.01
No Improvement.


23it [07:50, 20.47s/it]
0it [00:00, ?it/s]

-------------------- epoch 33 of 100 --------------------


7it [01:43, 14.79s/it]

Testing Loss:  1.040, Seconds: 59.04
No Improvement.


13it [03:51, 17.83s/it]

Testing Loss:  0.884, Seconds: 63.77
No Improvement.


19it [06:27, 20.40s/it]

Testing Loss:  0.780, Seconds: 57.79
No Improvement.


23it [08:18, 21.69s/it]
0it [00:00, ?it/s]

-------------------- epoch 34 of 100 --------------------


7it [01:44, 14.96s/it]

Testing Loss:  0.812, Seconds: 59.67
No Improvement.


13it [04:01, 18.61s/it]

Testing Loss:  0.696, Seconds: 72.08
No Improvement.


19it [06:41, 21.15s/it]

Testing Loss:  0.865, Seconds: 60.73
No Improvement.


23it [08:17, 21.63s/it]
0it [00:00, ?it/s]

-------------------- epoch 35 of 100 --------------------


7it [01:47, 15.40s/it]

Testing Loss:  1.027, Seconds: 61.13
No Improvement.


13it [04:01, 18.61s/it]

Testing Loss:  0.682, Seconds: 65.97
No Improvement.


19it [06:50, 21.61s/it]

Testing Loss:  0.774, Seconds: 62.41
No Improvement.


23it [08:26, 22.04s/it]
0it [00:00, ?it/s]

-------------------- epoch 36 of 100 --------------------


7it [01:51, 16.00s/it]

Testing Loss:  0.893, Seconds: 64.50
No Improvement.


13it [04:03, 18.73s/it]

Testing Loss:  0.799, Seconds: 64.39
No Improvement.


19it [06:50, 21.60s/it]

Testing Loss:  0.953, Seconds: 64.22
No Improvement.


23it [08:28, 22.13s/it]
0it [00:00, ?it/s]

-------------------- epoch 37 of 100 --------------------


7it [01:53, 16.27s/it]

Testing Loss:  0.999, Seconds: 64.92
No Improvement.


13it [04:08, 19.11s/it]

Testing Loss:  0.983, Seconds: 65.55
No Improvement.


19it [06:59, 22.09s/it]

Testing Loss:  0.774, Seconds: 65.95
No Improvement.


23it [08:40, 22.61s/it]
0it [00:00, ?it/s]

-------------------- epoch 38 of 100 --------------------


7it [02:02, 17.55s/it]

Testing Loss:  1.140, Seconds: 67.11
No Improvement.


13it [04:23, 20.25s/it]

Testing Loss:  1.118, Seconds: 69.84
No Improvement.


19it [07:24, 23.40s/it]

Testing Loss:  1.248, Seconds: 69.89
No Improvement.


23it [09:07, 23.82s/it]
0it [00:00, ?it/s]

-------------------- epoch 39 of 100 --------------------


7it [02:04, 17.80s/it]

Testing Loss:  0.803, Seconds: 72.89
No Improvement.


13it [04:25, 20.41s/it]

Testing Loss:  1.030, Seconds: 69.70
No Improvement.


19it [07:31, 23.74s/it]

Testing Loss:  1.084, Seconds: 73.15
No Improvement.


23it [09:12, 24.04s/it]
0it [00:00, ?it/s]

-------------------- epoch 40 of 100 --------------------


6it [00:44,  7.38s/it]

Testing Loss:  0.572, Seconds: 69.57
New Record!


13it [04:27, 20.56s/it]

Testing Loss:  0.574, Seconds: 71.02
No Improvement.


19it [07:28, 23.60s/it]

Testing Loss:  0.667, Seconds: 70.62
No Improvement.


23it [09:16, 24.20s/it]
0it [00:00, ?it/s]

-------------------- epoch 41 of 100 --------------------


7it [02:23, 20.56s/it]

Testing Loss:  0.683, Seconds: 87.37
No Improvement.


13it [04:50, 22.35s/it]

Testing Loss:  0.635, Seconds: 72.50
No Improvement.


19it [07:52, 24.86s/it]

Testing Loss:  0.955, Seconds: 72.45
No Improvement.


23it [09:34, 24.96s/it]
0it [00:00, ?it/s]

-------------------- epoch 42 of 100 --------------------


7it [02:07, 18.20s/it]

Testing Loss:  0.621, Seconds: 73.44
No Improvement.


13it [04:36, 21.30s/it]

Testing Loss:  0.648, Seconds: 75.01
No Improvement.


19it [07:43, 24.40s/it]

Testing Loss:  0.713, Seconds: 74.70
No Improvement.


23it [09:29, 24.76s/it]
0it [00:00, ?it/s]

-------------------- epoch 43 of 100 --------------------


7it [02:13, 19.07s/it]

Testing Loss:  0.891, Seconds: 77.12
No Improvement.


13it [04:45, 21.99s/it]

Testing Loss:  0.615, Seconds: 76.11
No Improvement.


19it [07:56, 25.09s/it]

Testing Loss:  0.741, Seconds: 77.95
No Improvement.


23it [10:03, 26.26s/it]
0it [00:00, ?it/s]

-------------------- epoch 44 of 100 --------------------


7it [02:24, 20.71s/it]

Testing Loss:  0.672, Seconds: 87.85
No Improvement.


13it [04:59, 23.04s/it]

Testing Loss:  0.627, Seconds: 78.40
No Improvement.


19it [08:14, 26.00s/it]

Testing Loss:  0.778, Seconds: 79.11
No Improvement.


23it [09:59, 26.05s/it]
0it [00:00, ?it/s]

-------------------- epoch 45 of 100 --------------------


7it [02:17, 19.63s/it]

Testing Loss:  0.705, Seconds: 79.07
No Improvement.


13it [04:55, 22.73s/it]

Testing Loss:  0.673, Seconds: 79.86
No Improvement.


19it [08:14, 26.05s/it]

Testing Loss:  0.653, Seconds: 81.74
No Improvement.


23it [10:08, 26.44s/it]
0it [00:00, ?it/s]

-------------------- epoch 46 of 100 --------------------


7it [02:19, 19.98s/it]

Testing Loss:  0.651, Seconds: 80.26
No Improvement.


13it [05:01, 23.17s/it]

Testing Loss:  0.648, Seconds: 82.03
No Improvement.


19it [08:22, 26.46s/it]

Testing Loss:  0.786, Seconds: 84.21
No Improvement.


23it [10:27, 27.28s/it]
0it [00:00, ?it/s]

-------------------- epoch 47 of 100 --------------------


7it [02:24, 20.67s/it]

Testing Loss:  0.617, Seconds: 83.39
No Improvement.


13it [05:07, 23.67s/it]

Testing Loss:  0.609, Seconds: 83.53
No Improvement.


19it [08:30, 26.87s/it]

Testing Loss:  0.725, Seconds: 84.39
No Improvement.


23it [10:38, 27.76s/it]
0it [00:00, ?it/s]

-------------------- epoch 48 of 100 --------------------


7it [02:46, 23.84s/it]

Testing Loss:  1.071, Seconds: 104.82
No Improvement.


13it [05:34, 25.72s/it]

Testing Loss:  0.623, Seconds: 85.52
No Improvement.


19it [09:00, 28.47s/it]

Testing Loss:  0.635, Seconds: 86.97
No Improvement.


23it [10:53, 28.41s/it]
0it [00:00, ?it/s]

-------------------- epoch 49 of 100 --------------------


7it [02:31, 21.60s/it]

Testing Loss:  0.751, Seconds: 87.98
No Improvement.


13it [05:22, 24.84s/it]

Testing Loss:  0.575, Seconds: 87.78
No Improvement.


18it [07:01, 23.44s/it]

Testing Loss:  0.562, Seconds: 88.43
New Record!


23it [10:49, 28.26s/it]
0it [00:00, ?it/s]

-------------------- epoch 50 of 100 --------------------


7it [02:34, 22.06s/it]

Testing Loss:  0.706, Seconds: 89.90
No Improvement.


12it [03:41, 18.49s/it]

Testing Loss:  0.510, Seconds: 101.84
New Record!


19it [09:16, 29.28s/it]

Testing Loss:  0.552, Seconds: 89.66
No Improvement.


23it [11:09, 29.12s/it]
0it [00:00, ?it/s]

-------------------- epoch 51 of 100 --------------------


7it [02:37, 22.49s/it]

Testing Loss:  0.585, Seconds: 91.28
No Improvement.


13it [05:36, 25.91s/it]

Testing Loss:  0.686, Seconds: 93.03
No Improvement.


19it [09:30, 30.04s/it]

Testing Loss:  0.565, Seconds: 90.34
No Improvement.


23it [11:25, 29.80s/it]
0it [00:00, ?it/s]

-------------------- epoch 52 of 100 --------------------


7it [02:52, 24.58s/it]

Testing Loss:  0.545, Seconds: 104.75
No Improvement.


13it [06:10, 28.48s/it]

Testing Loss:  0.645, Seconds: 107.93
No Improvement.


19it [09:54, 31.28s/it]

Testing Loss:  0.725, Seconds: 94.77
No Improvement.


23it [11:49, 30.86s/it]
0it [00:00, ?it/s]

-------------------- epoch 53 of 100 --------------------


7it [02:44, 23.44s/it]

Testing Loss:  0.882, Seconds: 95.33
No Improvement.


13it [05:51, 27.03s/it]

Testing Loss:  0.593, Seconds: 96.93
No Improvement.


19it [09:35, 30.28s/it]

Testing Loss:  0.556, Seconds: 94.83
No Improvement.


23it [11:33, 30.16s/it]
0it [00:00, ?it/s]

-------------------- epoch 54 of 100 --------------------


7it [03:01, 25.93s/it]

Testing Loss:  0.757, Seconds: 111.48
No Improvement.


13it [06:11, 28.56s/it]

Testing Loss:  0.747, Seconds: 98.91
No Improvement.


19it [10:00, 31.63s/it]

Testing Loss:  0.699, Seconds: 99.40
No Improvement.


23it [11:59, 31.26s/it]
0it [00:00, ?it/s]

-------------------- epoch 55 of 100 --------------------


7it [02:49, 24.24s/it]

Testing Loss:  0.894, Seconds: 98.59
No Improvement.


13it [06:27, 29.80s/it]

Testing Loss:  0.629, Seconds: 125.61
No Improvement.


19it [10:23, 32.80s/it]

Testing Loss:  0.611, Seconds: 101.19
No Improvement.


23it [12:22, 32.29s/it]
0it [00:00, ?it/s]

-------------------- epoch 56 of 100 --------------------


7it [02:54, 24.88s/it]

Testing Loss:  0.696, Seconds: 100.72
No Improvement.


13it [06:09, 28.39s/it]

Testing Loss:  0.576, Seconds: 102.31
No Improvement.


19it [10:04, 31.83s/it]

Testing Loss:  0.720, Seconds: 102.23
No Improvement.


23it [12:07, 31.64s/it]
0it [00:00, ?it/s]

-------------------- epoch 57 of 100 --------------------


7it [02:58, 25.54s/it]

Testing Loss:  0.763, Seconds: 104.42
No Improvement.


13it [06:19, 29.16s/it]

Testing Loss:  0.625, Seconds: 104.75
No Improvement.


19it [10:22, 32.74s/it]

Testing Loss:  0.883, Seconds: 106.14
No Improvement.


23it [12:51, 33.53s/it]
0it [00:00, ?it/s]

-------------------- epoch 58 of 100 --------------------


7it [03:06, 26.65s/it]

Testing Loss:  0.538, Seconds: 109.23
No Improvement.


13it [06:31, 30.15s/it]

Testing Loss:  0.526, Seconds: 107.78
No Improvement.


18it [08:23, 27.96s/it]

Testing Loss:  0.482, Seconds: 108.25
New Record!


23it [12:43, 33.17s/it]
0it [00:00, ?it/s]

-------------------- epoch 59 of 100 --------------------


7it [03:07, 26.75s/it]

Testing Loss:  0.620, Seconds: 108.83
No Improvement.


12it [04:26, 22.17s/it]

Testing Loss:  0.478, Seconds: 108.52
New Record!


18it [08:27, 28.18s/it]

Testing Loss:  0.471, Seconds: 109.92
New Record!


23it [13:15, 34.58s/it]
0it [00:00, ?it/s]

-------------------- epoch 60 of 100 --------------------


6it [01:06, 11.03s/it]

Testing Loss:  0.462, Seconds: 109.48
New Record!


12it [04:31, 22.58s/it]

Testing Loss:  0.439, Seconds: 110.46
New Record!


18it [08:35, 28.62s/it]

Testing Loss:  0.433, Seconds: 111.42
New Record!


23it [13:02, 34.02s/it]
0it [00:00, ?it/s]

-------------------- epoch 61 of 100 --------------------


7it [03:15, 27.87s/it]

Testing Loss:  0.444, Seconds: 115.16
No Improvement.


12it [04:36, 23.01s/it]

Testing Loss:  0.421, Seconds: 113.06
New Record!


19it [11:06, 35.10s/it]

Testing Loss:  0.437, Seconds: 114.20
No Improvement.


23it [13:15, 34.58s/it]
0it [00:00, ?it/s]

-------------------- epoch 62 of 100 --------------------


7it [03:17, 28.26s/it]

Testing Loss:  0.441, Seconds: 115.00
No Improvement.


13it [06:56, 32.06s/it]

Testing Loss:  0.429, Seconds: 116.92
No Improvement.


19it [11:18, 35.69s/it]

Testing Loss:  0.431, Seconds: 117.80
No Improvement.


23it [13:28, 35.16s/it]
0it [00:00, ?it/s]

-------------------- epoch 63 of 100 --------------------


7it [03:20, 28.69s/it]

Testing Loss:  0.438, Seconds: 117.34
No Improvement.


12it [04:45, 23.81s/it]

Testing Loss:  0.416, Seconds: 117.98
New Record!


19it [11:30, 36.37s/it]

Testing Loss:  0.437, Seconds: 118.35
No Improvement.


23it [13:43, 35.80s/it]
0it [00:00, ?it/s]

-------------------- epoch 64 of 100 --------------------


7it [03:25, 29.29s/it]

Testing Loss:  0.439, Seconds: 120.67
No Improvement.


13it [07:11, 33.20s/it]

Testing Loss:  0.423, Seconds: 120.65
No Improvement.


19it [11:39, 36.82s/it]

Testing Loss:  0.421, Seconds: 120.82
No Improvement.


23it [13:53, 36.24s/it]
0it [00:00, ?it/s]

-------------------- epoch 65 of 100 --------------------


7it [03:46, 32.39s/it]

Testing Loss:  0.458, Seconds: 140.22
No Improvement.


13it [07:36, 35.13s/it]

Testing Loss:  0.433, Seconds: 122.88
No Improvement.


19it [12:10, 38.46s/it]

Testing Loss:  0.420, Seconds: 124.47
No Improvement.


23it [14:47, 38.60s/it]
0it [00:00, ?it/s]

-------------------- epoch 66 of 100 --------------------


7it [03:32, 30.42s/it]

Testing Loss:  0.486, Seconds: 124.92
No Improvement.


13it [07:26, 34.34s/it]

Testing Loss:  0.432, Seconds: 125.22
No Improvement.


19it [12:23, 39.14s/it]

Testing Loss:  0.416, Seconds: 126.16
No Improvement.


23it [14:40, 38.26s/it]
0it [00:00, ?it/s]

-------------------- epoch 67 of 100 --------------------


7it [03:35, 30.77s/it]

Testing Loss:  0.423, Seconds: 126.11
No Improvement.


13it [07:32, 34.85s/it]

Testing Loss:  0.467, Seconds: 127.33
No Improvement.


19it [12:17, 38.83s/it]

Testing Loss:  0.426, Seconds: 133.41
No Improvement.


23it [14:33, 37.98s/it]
0it [00:00, ?it/s]

-------------------- epoch 68 of 100 --------------------


7it [03:39, 31.36s/it]

Testing Loss:  0.435, Seconds: 129.46
No Improvement.


13it [07:41, 35.48s/it]

Testing Loss:  0.420, Seconds: 130.46
No Improvement.


19it [12:25, 39.22s/it]

Testing Loss:  0.425, Seconds: 131.14
No Improvement.


23it [14:43, 38.41s/it]
0it [00:00, ?it/s]

-------------------- epoch 69 of 100 --------------------


7it [03:43, 31.95s/it]

Testing Loss:  0.508, Seconds: 130.68
No Improvement.


13it [07:51, 36.30s/it]

Testing Loss:  0.547, Seconds: 134.95
No Improvement.


18it [10:00, 33.36s/it]

## Fixing Custom Sentences

In [159]:
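def text_to_ints(text):
    '''Convert a string into a list of character ids for the model'''
    # text = clean_text(text)  # cleaning is disabled for this dataset
    # The model works at the character level, so each item is a single character
    return [vocab_to_int[char] for char in text]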
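
`text_to_ints` raises a `KeyError` as soon as the input contains a character that is not in `vocab_to_int`, which is easy to hit with hand-typed sentences. The sketch below is one possible way to guard against that. The name `text_to_ints_safe`, the `<UNK>` token, and the toy vocabulary are all assumptions for illustration; whether the notebook's vocabulary has an unknown-character entry is not shown here.

```python
def text_to_ints_safe(text, vocab_to_int, unk_token="<UNK>"):
    """Map each character to its id and report characters missing from the vocabulary.

    Returns (ids, unknown_chars). A missing character is replaced by the id of
    `unk_token` if the vocabulary has one, and dropped otherwise.
    """
    ids, unknown = [], []
    unk_id = vocab_to_int.get(unk_token)
    for char in text:
        if char in vocab_to_int:
            ids.append(vocab_to_int[char])
        else:
            unknown.append(char)
            if unk_id is not None:
                ids.append(unk_id)
    return ids, unknown


# Toy vocabulary (three Myanmar code points), for illustration only
toy_vocab = {"\u1000": 0, "\u102d": 1, "\u102f": 2}
ids, unknown = text_to_ints_safe("\u1000\u102d\u102fx", toy_vocab)
print(ids, unknown)
```

Returning the unknown characters alongside the ids makes it possible to warn the user instead of silently changing the sentence.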

In [None]:
# Create your own sentence or use one from the dataset
#text = "Spellin is difficult, whch is wyh you need to study everyday."
text = 'မင်းသမီးထမင်းဆိုင်ကိုဘယ်သူဦးပြီးရောင်းနေသလဲမဖွင်တေါ့ဘူးလါးါး'
text = text_to_ints(text)

#random = np.random.randint(0,len(testing_sorted))
#text = testing_sorted[random]
#text = noise_maker(text, 0.95)

checkpoint = "./kp=0.75,nl=2,th=0.95.ckpt"

model = build_graph(keep_probability, rnn_size, num_layers, batch_size, learning_rate, embedding_size, direction) 

with tf.Session() as sess:
    # Load saved model
    saver = tf.train.Saver()
    saver.restore(sess, checkpoint)
    
    # Repeat the input batch_size times, because the graph expects a full batch
    answer_logits = sess.run(model.predictions, {model.inputs: [text]*batch_size, 
                                                 model.inputs_length: [len(text)]*batch_size,
                                                 model.targets_length: [len(text)+1], 
                                                 model.keep_prob: [1.0]})[0]

# Remove the padding from the generated sentence
pad = vocab_to_int["<PAD>"] 

print('\nText')
print('  Word Ids:    {}'.format([i for i in text]))
print('  Input Words: {}'.format("".join([int_to_vocab[i] for i in text])))

print('\nSummary')
print('  Word Ids:       {}'.format([i for i in answer_logits if i != pad]))
response = "".join([int_to_vocab[i] for i in answer_logits if i != pad])
print('  Response Words: {}'.format(response))

if False:
    print('*'*100)
    print('Second correction')
    response_int = text_to_ints(response)
    with tf.Session() as sess:
        # Load saved model
        saver = tf.train.Saver()
        saver.restore(sess, checkpoint)

        # Repeat the input batch_size times, because the graph expects a full batch
        answer_logits = sess.run(model.predictions, {model.inputs: [response_int]*batch_size, 
                                                     model.inputs_length: [len(response_int)]*batch_size,
                                                     model.targets_length: [len(response_int)+1], 
                                                     model.keep_prob: [1.0]})[0]

    # Remove the padding from the generated sentence
    pad = vocab_to_int["<PAD>"]

    print('\nSummary')
    print('  Word Ids:       {}'.format([i for i in answer_logits if i != pad]))
    response = "".join([int_to_vocab[i] for i in answer_logits if i != pad])
    print('  Response Words: {}'.format(response))

    print('*'*100)
    print('3rd correction')
    response_int = text_to_ints(response)
    with tf.Session() as sess:
        # Load saved model
        saver = tf.train.Saver()
        saver.restore(sess, checkpoint)

        # Repeat the input batch_size times, because the graph expects a full batch
        answer_logits = sess.run(model.predictions, {model.inputs: [response_int]*batch_size, 
                                                     model.inputs_length: [len(response_int)]*batch_size,
                                                     model.targets_length: [len(response_int)+1], 
                                                     model.keep_prob: [1.0]})[0]

    # Remove the padding from the generated sentence
    pad = vocab_to_int["<PAD>"]

    print('\nSummary')
    print('  Word Ids:       {}'.format([i for i in answer_logits if i != pad]))
    print('  Response Words: {}'.format("".join([int_to_vocab[i] for i in answer_logits if i != pad])))
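
The cell above repeats the same list comprehension several times to turn predicted ids back into text. A small helper can do this in one place. The sketch below is an assumption-laden example: `ids_to_text` is a hypothetical name, the toy mapping stands in for the notebook's `int_to_vocab`, and the optional end-of-sequence id assumes the vocabulary has such a token, which the cells shown here do not confirm.

```python
def ids_to_text(ids, int_to_vocab, pad_id, eos_id=None):
    """Turn predicted ids into a string, skipping padding and stopping at EOS."""
    chars = []
    for i in ids:
        if eos_id is not None and i == eos_id:
            break  # everything after the end-of-sequence id is ignored
        if i == pad_id:
            continue
        chars.append(int_to_vocab[i])
    return "".join(chars)


# Toy mapping for illustration; the real int_to_vocab is built from the data
toy_int_to_vocab = {0: "<PAD>", 1: "<EOS>", 2: "a", 3: "b"}
decoded = ids_to_text([2, 3, 0, 2, 1, 3], toy_int_to_vocab, pad_id=0, eos_id=1)
print(decoded)
```

With `eos_id=None` the helper behaves like the original comprehension and only removes padding.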

In [161]:
# Create your own sentence or use one from the dataset
#text = "Spellin is difficult, whch is wyh you need to study everyday."
texts = 'မင်းသမီးထမင်းဆိုင်ကို ဘယ်သူဦးပြီး ရောင်းနေသလဲ မဖွင်တေါ့ဘူးလါးါး'.split(' ')
texts = [text_to_ints(t) for t in texts]

#random = np.random.randint(0,len(testing_sorted))
#text = testing_sorted[random]
#text = noise_maker(text, 0.95)

checkpoint = "./kp=0.75,nl=2,th=0.95.ckpt"

model = build_graph(keep_probability, rnn_size, num_layers, batch_size, learning_rate, embedding_size, direction) 

with tf.Session() as sess:
    # Load saved model
    saver = tf.train.Saver()
    saver.restore(sess, checkpoint)
    for text in texts:
        # Repeat the input batch_size times, because the graph expects a full batch
        answer_logits = sess.run(model.predictions, {model.inputs: [text]*batch_size, 
                                                     model.inputs_length: [len(text)]*batch_size,
                                                     model.targets_length: [len(text)+1], 
                                                     model.keep_prob: [1.0]})[0]

        # Remove the padding from the generated sentence
        pad = vocab_to_int["<PAD>"] 

        print('\nText')
        print('  Word Ids:    {}'.format([i for i in text]))
        print('  Input Words: {}'.format("".join([int_to_vocab[i] for i in text])))

        print('\nSummary')
        print('  Word Ids:       {}'.format([i for i in answer_logits if i != pad]))
        print('  Response Words: {}'.format("".join([int_to_vocab[i] for i in answer_logits if i != pad])))

INFO:tensorflow:Restoring parameters from ./kp=0.75,nl=2,th=0.95.ckpt

Text
  Word Ids:    [25, 4, 58, 56, 30, 25, 46, 56, 17, 25, 4, 58, 56, 6, 45, 47, 4, 58, 0, 45, 47]
  Input Words: မင်းသမီးထမင်းဆိုင်ကို

Summary
  Word Ids:       [25, 4, 58, 56, 30, 48, 56, 25, 25, 62, 25, 30, 48, 56, 25, 25, 4, 58, 56, 25, 62, 44]
  Response Words: မင်းသူးမမှမသူးမမင်းမှာ

Text
  Word Ids:    [24, 26, 58, 30, 48, 38, 56, 21, 60, 46, 56]
  Input Words: ဘယ်သူဦးပြီး

Summary
  Word Ids:       [24, 48, 24, 48, 24, 48, 56, 24, 48, 56, 30, 48]
  Response Words: ဘူဘူဘူးဘူးသူ

Text
  Word Ids:    [27, 49, 44, 4, 58, 56, 20, 49, 30, 28, 50]
  Input Words: ရောင်းနေသလဲ

Summary
  Word Ids:       [27, 49, 44, 4, 58, 56, 27, 49, 44, 4, 58, 56]
  Response Words: ရောင်းရောင်း

Text
  Word Ids:    [25, 22, 61, 4, 58, 16, 49, 43, 55, 24, 48, 56, 28, 43, 56, 43, 56]
  Input Words: မဖွင်တေါ့ဘူးလါးါး

Summary
  Word Ids:       [25, 61, 49, 22, 49, 43, 55, 24, 48, 56, 24, 48, 56, 25, 49, 56, 81]
  Response Words: မွေဖ

Examples of corrected sentences (each pair shows the input with mistakes, followed by the corrected output):
- Spellin is difficult, whch is wyh you need to study everyday.
- Spelling is difficult, which is why you need to study everyday.

- The first days of her existence in th country were vrey hard for Dolly.
- The first days of her existence in the country were very hard for Dolly.

- Thi is really something impressiv thaat we should look into right away!
- This is really something impressive that we should look into right away!

## Summary

I hope that you have found this project interesting and useful. The example sentences presented above were specifically chosen, and the model will not always be able to make corrections of this quality. Given the amount of data that we are working with, this model still struggles. For it to be more useful, it would require far more training data and additional parameter tuning. The parameter values that I used above worked best for me, but I expect there are even better values that I was not able to find.

Thanks for reading!