# Introduction

We tackle the problem of OCR post-processing. OCR maps the image of a document into the text domain. This is done first with a CNN+LSTM+CTC model, in our case based on Tesseract. Since that model only maps image to text, we need something on top of it to validate and correct the language semantics.

The idea is to build a language model that takes the OCRed text and corrects it based on language knowledge. The language model could be:
- Char level: the aim is to capture word morphology, in which case it behaves like a spelling correction system.
- Word level: the aim is to capture sentence semantics, but such systems suffer from the out-of-vocabulary (OOV) problem.
- Fusion: the aim is to capture both the semantic and the morphological rules of the language. The output has to be at char level to avoid the OOV problem; the input, however, can be char, word, or both.
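
To make the OOV point concrete, here is a minimal sketch that contrasts a word-level and a char-level vocabulary. The toy corpus and the helpers `build_vocab` and `has_oov` are made up for illustration and are not part of this notebook.

```python
def build_vocab(units):
    """Map each distinct unit (word or character) to an integer id."""
    return {u: i for i, u in enumerate(sorted(set(units)))}

def has_oov(units, vocab):
    """Return True if any unit is missing from the vocabulary."""
    return any(u not in vocab for u in units)

# Toy training text (an assumption, for illustration only).
train_text = "provider first name provider last name"
word_vocab = build_vocab(train_text.split(' '))
char_vocab = build_vocab(train_text)

# "middle" was never seen as a word, but all of its characters were.
new_sentence = "provider middle name"
word_oov = has_oov(new_sentence.split(' '), word_vocab)
char_oov = has_oov(new_sentence, char_vocab)
print(word_oov, char_oov)
```

The word-level vocabulary cannot represent the unseen word, while the char-level vocabulary can, which is why the output side of the model is kept at char level.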

The target of the fusion model is to learn:

    p(char | char_context, word_context)
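
One way to read this objective, assuming a left-to-right auto-regressive decoder, is as a factorization over output characters. The symbols are introduced here for illustration only: $y_t$ is the $t$-th output character, $y_{<t}$ are the characters emitted before it, and $c$ and $w$ stand for the char-level and word-level encodings of the OCR input.

```latex
p(y_1, \dots, y_T \mid c, w) = \prod_{t=1}^{T} p(y_t \mid y_{<t}, c, w)
```

Under this reading, each factor is the per-character distribution written above, with the char context covering both $y_{<t}$ and $c$.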

In this workbook we use a vanilla Keras seq2seq implementation, adapted from the lstm_seq2seq example for the Eng-Fra translation task. The adaptation involves:

- Adapt to spelling correction at char level
- Pre-train on noisy medical sentences
- Fine-tune a residual to correct the mistakes of Tesseract
- Limit the input and output sequence lengths
- Ensure teacher forcing for the auto-regressive decoder
- Limit the padding per batch
- Learning rate schedule
- Bi-directional LSTM encoder
- Bi-directional GRU encoder
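
The list mentions a learning rate schedule without spelling one out here. As one possibility, a step decay can be sketched as follows. `make_step_decay` and all of its default values are illustrative assumptions, not the settings used for the models in this workbook.

```python
import math

def make_step_decay(initial_lr=1e-3, drop=0.5, epochs_per_drop=10):
    """Build a schedule that multiplies the rate by `drop` every `epochs_per_drop` epochs.

    All default values are placeholders chosen for illustration.
    """
    def schedule(epoch):
        return initial_lr * math.pow(drop, math.floor(epoch / epochs_per_drop))
    return schedule

schedule = make_step_decay()
for epoch in (0, 9, 10, 25):
    print(epoch, schedule(epoch))
```

A function that maps an epoch index to a rate, like `schedule`, is the kind of callable that the Keras `LearningRateScheduler` callback accepts.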


# Imports

In [1]:
from __future__ import print_function
import tensorflow as tf
import keras.backend as K
from keras.backend.tensorflow_backend import set_session
from keras.models import Model
from keras.layers import Input, LSTM, Dense, Bidirectional, Concatenate, GRU
from keras import optimizers
from keras.callbacks import ModelCheckpoint, TensorBoard, LearningRateScheduler
from keras.models import load_model
import numpy as np
import os
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt
from autocorrect import spell
import re
%matplotlib inline

Using TensorFlow backend.


# Utility functions

In [2]:
# Limit gpu allocation. allow_growth, or gpu_fraction
def gpu_alloc():
    config = tf.ConfigProto()
    config.gpu_options.allow_growth = True
    set_session(tf.Session(config=config))

In [3]:
gpu_alloc()

In [4]:
def calculate_WER_sent(gt, pred):
    '''
    Word-level edit distance between a ground-truth and a predicted sentence, e.g.
    calculate_WER_sent('calculating wer between two sentences', 'calculate wer between two sentences')
    '''
    gt_words = gt.lower().split(' ')
    pred_words = pred.lower().split(' ')
    # Use a wide integer type: uint8 would wrap around once a distance exceeds 255
    d = np.zeros(((len(gt_words) + 1), (len(pred_words) + 1)), dtype=np.int32)

    # Initializing error matrix
    for i in range(len(gt_words) + 1):
        for j in range(len(pred_words) + 1):
            if i == 0:
                d[0][j] = j
            elif j == 0:
                d[i][0] = i

    # computation
    for i in range(1, len(gt_words) + 1):
        for j in range(1, len(pred_words) + 1):
            if gt_words[i - 1] == pred_words[j - 1]:
                d[i][j] = d[i - 1][j - 1]
            else:
                substitution = d[i - 1][j - 1] + 1
                insertion = d[i][j - 1] + 1
                deletion = d[i - 1][j] + 1
                d[i][j] = min(substitution, insertion, deletion)
    return d[len(gt_words)][len(pred_words)]

In [5]:
def calculate_WER(gt, pred):
    '''
    :param gt: list of sentences of the ground truth
    :param pred: list of sentences of the predictions
    both lists must have the same length
    :return: total word errors divided by the total number of ground-truth words
    '''
    # assert len(gt) == len(pred)
    WER = 0
    nb_w = 0
    for i in range(len(gt)):
        WER += calculate_WER_sent(gt[i], pred[i])
        # Normalize by the number of ground-truth words, not characters
        nb_w += len(gt[i].split(' '))

    return WER / nb_w
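
As a cross-check on the metric, here is a compact, dependency-free sketch of word error rate: the word-level edit distance summed over all sentence pairs, divided by the number of ground-truth words. The names `word_edit_distance` and `corpus_wer` are introduced here for illustration. The two sentence pairs are taken from the OCR sample shown later in this notebook.

```python
def word_edit_distance(gt, pred):
    """Levenshtein distance between two sentences, counted in words."""
    gt_words = gt.lower().split(' ')
    pred_words = pred.lower().split(' ')
    # prev[j] is the distance between the gt words seen so far and the first j pred words.
    prev = list(range(len(pred_words) + 1))
    for i, gt_word in enumerate(gt_words, start=1):
        curr = [i]
        for j, pred_word in enumerate(pred_words, start=1):
            cost = 0 if gt_word == pred_word else 1
            curr.append(min(prev[j - 1] + cost,  # match or substitution
                            curr[j - 1] + 1,     # insertion
                            prev[j] + 1))        # deletion
        prev = curr
    return prev[-1]

def corpus_wer(gt_sentences, pred_sentences):
    """Total word errors divided by the total number of ground-truth words."""
    errors = sum(word_edit_distance(g, p) for g, p in zip(gt_sentences, pred_sentences))
    words = sum(len(g.split(' ')) for g in gt_sentences)
    return errors / words

gt = ['Medical Provider Roles: Treating', 'Postal Code: 53188']
ocr = ['Me dieal Provider Roles: Treating', 'Postal Code: 5 31 88']
print(corpus_wer(gt, ocr))
```

Because OCR often splits one word into several, the number of errors can exceed the number of ground-truth words, so a WER above 1.0 is possible.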

In [6]:
def load_data_with_gt(file_name, num_samples, max_sent_len, min_sent_len, delimiter='\t', gt_index=1, prediction_index=0):
    '''Load data from a txt file in which each line is <TXT><TAB><GT>. The target to the decoder must have \t as the start trigger and \n as the stop trigger.'''
    cnt = 0  
    input_texts = []
    gt_texts = []
    target_texts = []
    for row in open(file_name, encoding='utf8'):
        if cnt < num_samples :
            #print(row)
            sents = row.split(delimiter)
            input_text = sents[prediction_index]
            
            target_text = '\t' + sents[gt_index] + '\n'
            if len(input_text) > min_sent_len and len(input_text) < max_sent_len and len(target_text) > min_sent_len and len(target_text) < max_sent_len:
                cnt += 1
                
                input_texts.append(input_text)
                target_texts.append(target_text)
                gt_texts.append(sents[gt_index])
    return input_texts, target_texts, gt_texts
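
The start and stop triggers matter because of teacher forcing: during training the decoder is fed the ground-truth sequence and asked to predict the same sequence shifted one step ahead. A minimal sketch of that pairing follows; `make_decoder_pair` is a hypothetical helper, not a function from this notebook.

```python
START, STOP = '\t', '\n'

def make_decoder_pair(gt_text):
    """Return (decoder_input, decoder_target) character lists for one sentence."""
    target_text = START + gt_text + STOP
    decoder_input = list(target_text[:-1])   # starts with START, ends before STOP
    decoder_target = list(target_text[1:])   # same sequence shifted one step ahead
    return decoder_input, decoder_target

dec_in, dec_out = make_decoder_pair('WI')
print(dec_in)   # ['\t', 'W', 'I']
print(dec_out)  # ['W', 'I', '\n']
```

At every position the decoder sees the correct previous character and must predict the next one, ending with the stop trigger.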

In [7]:
def load_data(file_name, num_samples, max_sent_len, min_sent_len):
    '''Load raw input sentences from a txt file, one sentence per line, keeping only lines within the length limits.'''
    cnt = 0  
    input_texts = []   
    
    #for row in open(file_name, encoding='utf8'):
    for row in open(file_name):
        if cnt < num_samples :            
            input_text = row           
            if len(input_text) > min_sent_len and len(input_text) < max_sent_len:
                cnt += 1                
                input_texts.append(input_text)
    return input_texts

In [8]:
def vectorize_data(input_texts, max_encoder_seq_length, num_encoder_tokens, vocab_to_int):
    '''Prepares the input texts as the zero-padded numpy array expected by the seq2seq encoder'''

    # Note: this caps the number of sentences (not their length) at max_encoder_seq_length
    if(len(input_texts) > max_encoder_seq_length):
        input_texts = input_texts[:max_encoder_seq_length]

    encoder_input_data = np.zeros(
    (len(input_texts), max_encoder_seq_length),
    dtype='float32')
    
    for i, input_text in enumerate(input_texts):
        for t, char in enumerate(input_text[:max_encoder_seq_length]):
            # c0..cn
            encoder_input_data[i, t] = vocab_to_int[char]
                
    return encoder_input_data
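
The following sketch shows the same encoding idea on a tiny example, using plain lists instead of NumPy arrays so it runs on its own. `build_vocab` and `vectorize` are hypothetical helpers. Reserving id 0 for padding is an assumption made here, since the vocabulary files loaded later are not shown.

```python
def build_vocab(texts):
    """Assign ids starting at 1 so that 0 can be reserved for padding (an assumption)."""
    chars = sorted(set(''.join(texts)))
    vocab_to_int = {c: i + 1 for i, c in enumerate(chars)}
    int_to_vocab = {i: c for c, i in vocab_to_int.items()}
    return vocab_to_int, int_to_vocab

def vectorize(texts, max_len, vocab_to_int):
    """Encode each text as a fixed-length list of ids, truncating or zero-padding."""
    rows = []
    for text in texts:
        ids = [vocab_to_int[c] for c in text[:max_len]]
        rows.append(ids + [0] * (max_len - len(ids)))
    return rows

texts = ['cab', 'ab']
vocab_to_int, int_to_vocab = build_vocab(texts)
encoded = vectorize(texts, max_len=4, vocab_to_int=vocab_to_int)
print(encoded)  # [[3, 1, 2, 0], [1, 2, 0, 0]]
```

Every row has the same length, which is what lets a batch of sentences be stacked into one array for the encoder.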

In [9]:
def decode_sequence(input_seq, encoder_model, decoder_model, num_decoder_tokens, max_decoder_seq_length, vocab_to_int, int_to_vocab):
    
    #print(max_decoder_seq_length)
    # Encode the input as state vectors.
    encoder_outputs, h, c  = encoder_model.predict(input_seq)
    states_value = [h,c]
    # Generate empty target sequence of length 1.
    target_seq = np.zeros((1, 1))
    # Populate the first character of target sequence with the start character.
    target_seq[0, 0] = vocab_to_int['\t']

    # Sampling loop for a batch of sequences
    # (to simplify, here we assume a batch of size 1).
    stop_condition = False
    decoded_sentence = ''
    #print(input_seq)
    attention_density = []
    i = 0
    special_chars = ['\\', '/', '-', '—' , ':', '[', ']', ',', '.', '"', ';', '%', '~', '(', ')', '{', '}', '$', '#']
    #special_chars = []
    while not stop_condition:
        #print(target_seq)
        output_tokens, attention, h, c  = decoder_model.predict(
            [target_seq, encoder_outputs] + states_value)
        #print(attention.shape)
        attention_density.append(attention[0][0])# attention is max_sent_len x 1 since we have num_time_steps = 1 for the output
        # Sample a token
        #print(output_tokens.shape)
        sampled_token_index = np.argmax(output_tokens[0, -1, :])
        
        #print(sampled_token_index)
        sampled_char = int_to_vocab[sampled_token_index]
        
        orig_char = int_to_vocab[int(input_seq[:,i][0])]
        
        # Exit condition: either hit max length
        # or find stop character.
        if (sampled_char == '\n' or
           len(decoded_sentence) > max_decoder_seq_length):
            stop_condition = True
            #print('End', sampled_char, 'Len ', len(decoded_sentence), 'Max len ', max_decoder_seq_length)
            sampled_char = ''
        
        # Copy digits and special characters as is, since the spelling corrector is not good at correcting them
        
        if(orig_char.isdigit() or orig_char in special_chars):
            decoded_sentence += orig_char            
        else:
            if(sampled_char.isdigit() or sampled_char in special_chars):
                decoded_sentence += ''
            else:
                decoded_sentence += sampled_char
        
        #decoded_sentence += sampled_char


        # Update the target sequence (of length 1).
        target_seq = np.zeros((1, 1))
        target_seq[0, 0] = sampled_token_index

        # Update states
        states_value = [h, c]
        
        i += 1
        # Wrap the input pointer once it passes the end of the encoder input
        if(i >= input_seq.shape[1]):
            i = 0
    attention_density = np.array(attention_density)
    
    # Word level spell correct
    '''
    corrected_decoded_sentence = ''
    for w in decoded_sentence.split(' '):
        corrected_decoded_sentence += spell(w) + ' '
    decoded_sentence = corrected_decoded_sentence
    '''
    return decoded_sentence, attention_density
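
Stripped of the attention bookkeeping and the digit handling, the loop above is greedy decoding: predict one character, feed it back in, and stop at the stop trigger or at the length limit. The sketch below isolates that control flow. `greedy_decode` and `toy_step` are hypothetical, and `toy_step` only stands in for the call to `decoder_model.predict`.

```python
def greedy_decode(step_fn, state, start='\t', stop='\n', max_len=50):
    """Generate characters one at a time, feeding each prediction back in."""
    decoded = ''
    prev_char = start
    while len(decoded) < max_len:
        scores, state = step_fn(prev_char, state)
        next_char = max(scores, key=scores.get)  # argmax over the vocabulary
        if next_char == stop:
            break
        decoded += next_char
        prev_char = next_char
    return decoded

def toy_step(prev_char, state):
    """Stand-in for the decoder model: spells out a fixed string, then emits stop."""
    text, pos = state
    target = text[pos] if pos < len(text) else '\n'
    scores = {c: 0.0 for c in set(text) | {'\n'}}
    scores[target] = 1.0
    return scores, (text, pos + 1)

print(greedy_decode(toy_step, ('City', 0)))  # City
```

Greedy decoding keeps only the single best character at each step. A beam search would be the usual alternative, at a higher inference cost.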


In [50]:
def word_spell_correct(decoded_sentence):
    if(decoded_sentence == ''):
        return ''
    corrected_decoded_sentence = ''
    special_chars = ['\\', '/', '-', '—' , ':', '[', ']', ',', '.', '"', ';', '%', '~', '(', ')', '{', '}', '$', '#', '☒', '■', '☐', '□', '☑', '@']
    for w in decoded_sentence.split(' '):
        #print(w)
        if((len(re.findall(r'\d+', w))==0) and not (w in special_chars)):
            corrected_decoded_sentence += spell(w) + ' '
        else:
            corrected_decoded_sentence += w + ' '
    return corrected_decoded_sentence
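
The token-skipping rule can be tried without the `autocorrect` package by passing the corrector in as a function. In this sketch `correct_words`, `toy_correct`, and the two-entry `LEXICON` are illustrative stand-ins, and the list of protected symbols is shortened.

```python
import re

def correct_words(sentence, correct_word, protected=('-', ':', '[', ']', '/')):
    """Apply `correct_word` to each token, skipping numeric and protected tokens."""
    out = []
    for token in sentence.split(' '):
        if re.search(r'\d', token) or token in protected:
            out.append(token)            # leave dates, codes and symbols untouched
        else:
            out.append(correct_word(token))
    return ' '.join(out)

# Toy stand-in for a real spell checker (an assumption, for illustration only).
LEXICON = {'dorsl': 'dorsal', 'lacration': 'laceration'}

def toy_correct(word):
    return LEXICON.get(word, word)

print(correct_words('dorsl lacration on 11/3/2017', toy_correct))
# dorsal laceration on 11/3/2017
```

Unlike the cell above, this variant joins tokens with `' '.join`, so it does not leave a trailing space.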

In [11]:
def clean_up_sentence(sentence, vocab):
    s = ''
    prev_char = ''
    for c in sentence.strip():
        if c not in vocab or (c == ' ' and prev_char == ' '):
            s += ''
        else:
            s += c
        prev_char = c
            
    return s
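
A small variant of the clean-up step is sketched below. It compares each space against the last character that was kept rather than the last character that was read, so removing an out-of-vocabulary character between two spaces does not leave a double space, which the cell above appears to do. `clean_up` and the lowercase toy vocabulary are assumptions for illustration.

```python
def clean_up(sentence, vocab):
    """Drop out-of-vocabulary characters and collapse repeated spaces."""
    kept = []
    for c in sentence.strip():
        if c not in vocab:
            continue                      # unknown character: drop it
        if c == ' ' and kept and kept[-1] == ' ':
            continue                      # previous kept character was already a space
        kept.append(c)
    return ''.join(kept)

vocab = set('abcdefghijklmnopqrstuvwxyz ')
print(repr(clean_up('  left  great @ toe ', vocab)))  # 'left great toe'
```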

# Load data

# Load model params

In [12]:
data_path = '../../dat/'

In [13]:
max_sent_lengths = [50, 100]

In [14]:
vocab_file = {}
model_file = {}
encoder_model_file = {}
decoder_model_file = {}
model = {}
encoder_model = {}
decoder_model = {}
vocab = {}
vocab_to_int = {}
int_to_vocab = {}
max_sent_len = {}
min_sent_len = {}
num_decoder_tokens = {}
num_encoder_tokens = {}
max_encoder_seq_length = {}
max_decoder_seq_length = {}

In [15]:

for i in max_sent_lengths:
    vocab_file[i] = 'vocab-{}.npz'.format(i)
    model_file[i] = 'best_model-{}.hdf5'.format(i)
    encoder_model_file[i] = 'encoder_model-{}.hdf5'.format(i)
    decoder_model_file[i] = 'decoder_model-{}.hdf5'.format(i)
    
    vocab = np.load(file=vocab_file[i])
    vocab_to_int[i] = vocab['vocab_to_int'].item()
    int_to_vocab[i] = vocab['int_to_vocab'].item()
    max_sent_len[i] = vocab['max_sent_len']
    min_sent_len[i] = vocab['min_sent_len']
    input_characters = sorted(list(vocab_to_int[i]))
    num_decoder_tokens[i] = num_encoder_tokens[i] = len(input_characters) #int(encoder_model.layers[0].input.shape[2])
    max_encoder_seq_length[i] = max_decoder_seq_length[i] = max_sent_len[i] - 1#max([len(txt) for txt in input_texts])
    
    model[i] = load_model(model_file[i])
    encoder_model[i] = load_model(encoder_model_file[i])
    decoder_model[i] = load_model(decoder_model_file[i])



In [16]:
num_samples = 1000000
#tess_correction_data = os.path.join(data_path, 'test_data.txt')
#input_texts = load_data(tess_correction_data, num_samples, max_sent_len, min_sent_len)

OCR_data = os.path.join(data_path, 'new_trained_data.txt')
#input_texts, target_texts, gt_texts = load_data_with_gt(OCR_data, num_samples, max_sent_len, min_sent_len, delimiter='|',gt_index=0, prediction_index=1)
input_texts, target_texts, gt_texts = load_data_with_gt(OCR_data, num_samples, max_sent_len=10000, min_sent_len=0)

In [17]:
# Sample data
print(len(input_texts))
for i in range(10):
    print(input_texts[i], '\n', target_texts[i])

1951
Me dieal Provider Roles: Treating  
 	Medical Provider Roles: Treating


Provider First Name: Christine  
 	Provider First Name: Christine


Provider Last Name: Nolen, MD  
 	Provider Last Name: Nolen, MD


Address Line 1 : 7 25 American Avenue  
 	Address Line 1 : 725 American Avenue


City. W’aukesha  
 	City: Waukesha


StatefProvinee: ‘WI  
 	State/Province: WI


Postal Code: 5 31 88  
 	Postal Code: 53188


Country". US  
 	Country:  US


Business Telephone: (2 62) 92 8- 1000  
 	Business Telephone: (262) 928- 1000


Date ot‘Pirst Visit: 1 2/01f20 17  
 	Date of First Visit: 12/01/2017




In [18]:
# Spell correct before inference
'''
input_texts_ = []
for sent in input_texts:
    sent_ = ''
    for word in sent.split(' '):
        sent_ += spell(word) + ' '
    input_texts_.append(sent_)
input_texts = input_texts_
input_texts_ = []
# Sample data
print(len(input_texts))
for i in range(10):
    print(input_texts[i], '\n', target_texts[i])
'''

In [19]:
decoded_sentences = []
corrected_sentences = []

#for seq_index in range(len(input_texts)):
results = open('RESULTS.md', 'w')
results.write('|OCR sentence|GT sentence|Char decoded sentence|Word decoded sentence|Sentence length (chars)|\n')
results.write('|---------------|-----------|----------------|----------------|----------------|\n')
     

for i, input_text in enumerate(input_texts):
    #print(input_text)
    # Find the input length range to choose the proper model to use
    len_range = max_sent_lengths[-1] # Take the longest range
    for length in max_sent_lengths:
        if(len(input_text) < length):
            len_range = length
            break
    #print(len_range)
    
    input_text = clean_up_sentence(input_text, vocab_to_int[len_range])
    encoder_input_data = vectorize_data(input_texts=[input_text], max_encoder_seq_length=max_encoder_seq_length[len_range], num_encoder_tokens=num_encoder_tokens[len_range], vocab_to_int=vocab_to_int[len_range])
    
    

    target_text = gt_texts[i]
    
    input_seq = encoder_input_data
    #print(input_seq.shape)
    #print(max_decoder_seq_length[len_range])
    #print(max_decoder_seq_length)
    decoded_sentence,_  = decode_sequence(input_seq, encoder_model[len_range], decoder_model[len_range], num_decoder_tokens[len_range],  max_decoder_seq_length[len_range], vocab_to_int[len_range], int_to_vocab[len_range])
    corrected_sentence = word_spell_correct(input_text)
    print('-Lenght = ', len_range)
    print('Input sentence:', input_text)
    print('GT sentence:', target_text.strip())
    print('Char Decoded sentence:', decoded_sentence)   
    print('Word Decoded sentence:', corrected_sentence) 
    results.write(' | ' + input_text + ' | ' + target_text.strip() + ' | ' + decoded_sentence + ' | ' + corrected_sentence + ' | ' + str(len_range) + ' | \n')
    decoded_sentences.append(decoded_sentence)
    corrected_sentences.append(corrected_sentence)
results.close()    

    

-Lenght =  50
Input sentence: Me dieal Provider Roles: Treating
GT sentence: Medical Provider Roles: Treating
Char Decoded sentence: Medical Provider Roles:Treating
Word Decoded sentence: Me deal Provider Roles Treating 
-Lenght =  50
Input sentence: Provider First Name: Christine
GT sentence: Provider First Name: Christine
Char Decoded sentence: Provider First Name: Christine
Word Decoded sentence: Provider First Name Christine 
-Lenght =  50
Input sentence: Provider Last Name: Nolen, MD
GT sentence: Provider Last Name: Nolen, MD
Char Decoded sentence: Provider Last Name: Norle, MD
Word Decoded sentence: Provider Last Name Dolens MD 
-Lenght =  50
Input sentence: Address Line 1 : 7 25 American Avenue
GT sentence: Address Line 1 : 725 American Avenue
Char Decoded sentence: Address Line 1:7A25ent Admentine Ave
Word Decoded sentence: Address Line 1 : 7 25 American Avenue 
-Lenght =  50
Input sentence: City. W’aukesha
GT sentence: City: Waukesha
Char Decoded sentence: City.States
Word Dec

KeyboardInterrupt: 
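
The model-selection loop at the top of the cell above recurs in the cells that follow, and can be factored into a small helper. `pick_length_bucket` is a hypothetical name. The logic mirrors the loop: take the first length range that is strictly longer than the text, and fall back to the largest one.

```python
def pick_length_bucket(text, buckets):
    """Return the smallest bucket strictly longer than the text, else the largest."""
    for length in sorted(buckets):
        if len(text) < length:
            return length
    return max(buckets)

buckets = [50, 100]
print(pick_length_bucket('City. Waukesha', buckets))   # 50
print(pick_length_bucket('x' * 75, buckets))           # 100
print(pick_length_bucket('x' * 300, buckets))          # 100
```

Texts longer than the largest range still go to the largest model, where `vectorize_data` truncates them to the encoder length.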

In [25]:
input_texts = ['SUBJECTIVE: This is a S-year-old +@W his left great toe with the handleh lacration.',
               'Thera was no handlebarthe lacration.',
               'Patiet last tet is needing this for school at this',
               'OBJECTIVE : The temp is 99.8, the f tha blood pressure 99/64, O2 sat 94 8/10 at this time.',
               'Left great toe the dorsl surface, extending ta th active hemorrhage at this time.',
               'Th anaathetized with a cotton ball sat Left this in place for 20 minutes.',
               'with Betadine again and injected th he tolerated very well.',
               'The wound sutures. Antibiotic eintment and g',
               'Patient tolerated very well. Pat',
               'IMPRESSION: Lacration te left grs',
               'PLAN: Patent is to do dressing ch advised as far as checking tha waurn it with soap and water.',
               'Sutures oy have any problems prior te that tim ona teaspoon three times a day rer Ibuprofan far pain, discomfort. Cg',
               'hite male who accidently dropped a bike onto ar end hitting the left great toe,',
               'causing a guard to the end of the bike, which caused anus shot is more that three years ago and time.',
               'nlse ef 105 and regular, resprations 286,% on room air.',
               'Patient rates hia pain at — there i15 noted a 3-om laceration across a lateral aspect of tha toe.',
               'There is no e toa ir cleansed with Betadine.',
               'It is then urated with 5 cu of 2% Hylocaine plain.',
               'We then cleansed ae toa with 3 cc of 2% Xylacaina plain',
               'which was then clesed with five 5-0 Prolene ressure dressing was then applied to the tos',
               'paient is given DPT 0.5 ee intramucular (IM).at toe.',
               'Kefylex 250 mg per 5 ml, the next seven days.',
               'He may use Tylenol or 11 if any problems.',
               'Unum Life Insurance Company of America 2211',               
               'Congress Street Portland, Maine 04122',
               'APPLICATION FOR GROUP CRITICAL LLNESS INSURANCE',
               'I Evidence of Insurability',
               '',
               'Application Type: @ New Enrollee Change to',
               'Existing Coverage  Reinstatement  Internal',
               'Replacement  Late Applicant  Rehire SECTION 1:',
               'Employee(Applicant) Information  Always',
               'Complete Employee Name(First, Middle, Last)',
               'Social Security Number Nikolas J Jones',
               '123 - 456 - 7890 Home Address(Street/ PO Box)',
               'Gender 1634 Stewert St  F  M City Date of Birth',
               '(mm / dd / yyyy) Seattle 06 / 15 / 1991 State Zip',
               'Code Home Phone # Washington 98101 854-555-1212',
               'Are you Actively at Work? Employee ID / Payroll #',
               ' Yes  No55624 a.Are you a U.S.Citizen or',
               'Canadian Citizen working in the U.S.? b.Are you',
               'legally authorized to work in  Yes  No(If No',
               'reply to part b) the U.S.?  Yes  No Employer',
               'Name Group Number Date of Hire(mm/ dd / yyyy)',
               'Facebook 11 - 555566 11 / 30 / 2016 Occupation',
               'Eligibility Class Software Engineer 7 Scheduled',
               'Number of Work Hours per Week Work Phone # 35',
               '854-555-6622 SECTION 2: Spouse Information ',
               'Complete Only if applying for Spouse coverage Name',
               '(First, Middle, Last) Social Security Number',
               'Gender Date of Birth(mm / dd / yyyy) Does the',
               '1019 - 07 - AZ 1',
              'if claint is for a child, please state your relationship 10 the child',
              'date of accident 3d _ time of accident ram. 0 p.m.',
              'have you slopped working? (of yes [1 no if yes, what was the last day that you worked? (mm/ddryy)_| —3 | —{% cnslamegs bil =']
               
for input_text in input_texts:
    len_range = max_sent_lengths[-1] # Take the longest range
    for length in max_sent_lengths:
        if(len(input_text) < length):
            len_range = length
            break
    #print(len_range)
    pre_corrected_sentence = word_spell_correct(input_text)
    input_text = clean_up_sentence(input_text, vocab_to_int[len_range])
    encoder_input_data = vectorize_data(input_texts=[input_text], max_encoder_seq_length=max_encoder_seq_length[len_range], num_encoder_tokens=num_encoder_tokens[len_range], vocab_to_int=vocab_to_int[len_range])



    # gt_texts does not cover these hand-entered sentences, so no ground truth is looked up here

    input_seq = encoder_input_data
    #print(input_seq.shape)
    #print(max_decoder_seq_length[len_range])
    #print(max_decoder_seq_length)

    decoded_sentence,_  = decode_sequence(input_seq, encoder_model[len_range], decoder_model[len_range], num_decoder_tokens[len_range],  max_decoder_seq_length[len_range], vocab_to_int[len_range], int_to_vocab[len_range])
    corrected_sentence = word_spell_correct(input_text)
    #print('-Lenght = ', len_range)
    print('Input sentence:', input_text)
    #print('Spell Decoded sentence:', pre_corrected_sentence) 
    #print('Char Decoded sentence:', decoded_sentence)   
    print('Word Decoded sentence:', corrected_sentence) 
    print('\n')



Input sentence: SUBJECTIVE: This is a S-year-old +@W his left great toe with the handleh lacration.
Word Decoded sentence: SUBJECTIVES This is a S-year-old New his left great toe with the handled laceration 


Input sentence: Thera was no handlebarthe lacration.
Word Decoded sentence: Thera was no handlebarthe laceration 


Input sentence: Patiet last tet is needing this for school at this
Word Decoded sentence: Patient last tet is needing this for school at this 


Input sentence: OBJECTIVE : The temp is 99.8, the f tha blood pressure 99/64, O2 sat 94 8/10 at this time.
Word Decoded sentence: OBJECTIVE : The temp is 99.8, the f tha blood pressure 99/64, O2 sat 94 8/10 at this time 


Input sentence: Left great toe the dorsl surface, extending ta th active hemorrhage at this time.
Word Decoded sentence: Left great toe the dorsal surface extending ta th active hemorrhage at this time 


Input sentence: Th anaathetized with a cotton ball sat Left this in place for 20 minutes.
Word Decode

In [32]:

input_texts = ['text',
'',
'',
'',
'',
' ',
'',
'',
'',
'Fai',
'10',
'7521509',
'(FISTDEOO)',
'at',
'11/3/2017',
'5:23:19',
'from',
'-9373834004',
'Req',
'IC',
'2017:1030525109:292E.',
'Page',
'4',
'of',
'5',
'(C)',
'',
'',
'',
' ',
'',
'',
'',
' ',
'',
'',
'',
'11/3/2017',
'FRI',
'8:26',
'FAX',
'2373834004',
'Kjooas00s',
'',
'',
'',
'as3-ursasy3',
'11:30:11',
'11/2/2017',
'vis',
'',
'',
'',
'®',
'®',
'&',
'ACCIDENT',
'CLAIM',
'FORM',
'',
'uu',
'num’',
'Tha',
'Benelits',
'Canter',
'',
'P.O.',
'Bax',
'100158,',
'Calumbin,',
'EC',
'20202-3150',
'',
'Tol-frea:',
'1-800-635-5587',
'Fax:',
'1-800-447-2488',
'',
'Gall',
'toll-free',
'Monday',
'through',
'Friday,',
'8',
'a.m.',
'lo',
'8',
'p.m,',
'Eagtarn',
'Time.',
'',
'',
'',
' ',
'',
'',
'',
' ',
'',
'',
'',
' ',
'',
'',
'',
' ',
'',
'',
'',
' ',
'',
'',
'',
' ',
'',
'',
'',
' ',
'',
'',
'',
' ',
'',
'',
'',
' ',
'',
'',
'',
'[',
'ATTENDING',
'PHYSICIAN',
'STATEMENT',
']',
'',
'',
'IneurexiPolicyt',
'alcar',
'Hama',
'(Lael',
'Name,',
'Flis!',
'Nama,',
'MI,',
'Suffix)',
'Data',
'of',
'Risth',
'{msmidrfyy)',
'-',
'',
'',
'Faupi',
'Nana',
'{Laut',
'Hume,',
'Flial',
'Numa,',
'1',
'Sut)',
'Dats',
'al',
'Bln',
'rAvad)',
'Ul',
'_',
'',
'-[ECIpENT',
'DETAILS',
']',
'',
'',
'',
' ',
'',
'',
'',
' ',
'',
'',
'',
' ',
'',
'',
'',
' ',
'',
'',
'',
'a',
'thls',
'Gundilan',
'the',
'result',
'of',
'a',
'acddental',
'inury?',
'ves',
'O',
'No',
'if',
'yas,',
'dale',
'of',
'accident',
'qre/ddlyy)',
'[1',
'0]',
'[z]e',
'[=]',
'',
'',
'',
'Is',
'Mig',
'condition',
'Lhe',
'result',
'of',
'hefer',
'employment',
'£1',
'Yes',
'pNo',
'[1',
'Unknown',
'',
'',
'',
'Plaaze',
'verily',
'treatment',
'for',
'the',
'accident',
'lalad',
'above.',
'',
'',
'',
' ',
'',
'',
'',
' ',
'',
'',
'',
' ',
'',
'',
'',
' ',
'',
'',
'',
' ',
'',
'',
'',
'Dalaw',
'of',
'Diagnosis',
'Diagncsis',
'Description',
'Prosadure',
'Procedure',
'Dascription',
'',
'Branden',
'(Including',
'|',
'Cudo',
'(GD)',
'ous',
'',
'Confinement)',
'eR',
'ap',
'HAS',
'TTT',
'',
'BEEF',
'eR',
'',
'wiz]',
'.',
'S33,5XxA',
'Hh',
'rioes',
'ey',
'race',
'Word',
'',
'awqd]',
'',
'weak',
'3',
'n',
'[aveny',
'[d',
'',
'wifi',
'Wl',
'',
'oa',
'',
'',
'',
' ',
'',
'',
'',
' ',
'',
'',
'',
' ',
'',
'',
'',
' ',
'',
'',
'',
' ',
'',
'',
'',
' ',
'',
'',
'',
' ',
'',
'',
'',
' ',
'',
'',
'',
'Has',
'lhe',
'pallet',
'bean',
'trastad',
'for',
'tha',
'same',
'ar',
'&',
'S(tilar',
'candillan',
'by',
'anolher',
'phyalelan',
'In',
'tha',
'past?',
'[1',
'Yen',
'Bho',
'',
'M',
'yor,',
'pioona',
'provid',
'tha',
'fares:',
'',
'',
'',
' ',
'',
'',
'',
'Diageosis:',
'Tramiment',
'Daten:',
'',
'',
'',
' ',
'',
'',
'',
' ',
'',
'',
'',
'id',
'ya.1',
'#dving',
'Lhe',
'patient',
'to',
'clap',
'working?',
'RECEIVED',
'',
'It',
'yes,',
'B8',
'of',
'what',
'cate?',
'(mmidkyy)',
'',
'',
'',
'[23]',
'[117]',
'',
'',
'',
'[Ih',
'cielih',
'fa',
'rotated',
'to',
'normal',
'prepnency,',
'please',
'grovida',
'tha',
'idliawing:',
'NOV',
'',
'Expecigd',
'Delivery',
'Dale',
'(mimicd/yy)',
'Aclual',
'Delivery',
'Dale',
'{mmiddlyy',
'',
'',
'',
' ',
'',
'',
'',
'Phyeiclan',
'informaiton',
'HUMAN',
'REGOURCITE',
'',
'',
'',
'FRAUD',
'NOTICE:',
'Any',
'person',
'wha',
'knowingly',
'files',
'&',
'statement',
'of',
'clalm',
'containing',
'FALSE',
'or',
'misleading',
'information',
'8',
'',
'subject',
'to',
'criminal',
'and',
'elvil',
'penallies.',
'This',
'includes',
'Attending',
'Physician',
'portions',
'of',
'the',
'claim',
'farm.',
'',
'',
'',
'CS',
'yma',
'SEAS',
'Ta',
'hve',
'glan',
'=',
'',
'The',
'above',
'statements',
'ara',
'trun',
'And',
'rompints',
'to',
'tho',
'bot',
'of',
'my',
'knowledge',
'and',
'bolluf.',
'',
'',
'',
'Physician',
'Name',
'(Lea!',
'Name,',
'Firat',
'Name,',
'MI,',
'Suita)',
'Plases',
'Print',
'Co',
'FHman',
'log',
'Mm',
'',
'/',
'‘',
'',
'',
'',
'Medical',
'Speclaty',
'[Tr',
'eactal-',
']',
'|',
'D',
'of',
'r',
'of',
'Ch',
'2',
'',
'2',
'Le',
'',
'',
'==',
'Zoi!',
'M',
'o',
'“Fanart',
'',
'',
'=',
'Balfrone',
'ie',
'2',
'Sle',
'iu',
'',
'il',
'HY',
'BY',
'1942',
'Fax',
'Number',
'yz—',
'43',
'-8',
'7775',
'Fhyalafans',
'Tax',
'ID',
'Number.',
'',
'',
'',
'Aro',
'you',
'refateq',
'to',
'hiv',
'pollen?',
'0',
'Yoe',
'LlMo',
'|',
'yes,',
'wal',
'iv',
'the',
'relelianshipT',
'',
'',
'',
' ',
'',
'',
'',
' ',
' ',
'',
'',
'',
'Physlclan',
'Slgnature',
'Date',
'',
'CL-1023',
'-2717',
'=',
'',
'',
'',
' ',
'',
'',
'',
'—',]
               
for input_text in input_texts:
    len_range = max_sent_lengths[-1] # Take the longest range
    for length in max_sent_lengths:
        if(len(input_text) < length):
            len_range = length
            break
    #print(len_range)
    pre_corrected_sentence = word_spell_correct(input_text)
    input_text = clean_up_sentence(input_text, vocab_to_int[len_range])
    encoder_input_data = vectorize_data(input_texts=[input_text], max_encoder_seq_length=max_encoder_seq_length[len_range], num_encoder_tokens=num_encoder_tokens[len_range], vocab_to_int=vocab_to_int[len_range])



    # gt_texts does not cover these hand-entered tokens, so no ground truth is looked up here

    input_seq = encoder_input_data
    #print(input_seq.shape)
    #print(max_decoder_seq_length[len_range])
    #print(max_decoder_seq_length)

    decoded_sentence,_  = decode_sequence(input_seq, encoder_model[len_range], decoder_model[len_range], num_decoder_tokens[len_range],  max_decoder_seq_length[len_range], vocab_to_int[len_range], int_to_vocab[len_range])
    corrected_sentence = word_spell_correct(input_text)
    #print('-Lenght = ', len_range)
    #print('Input sentence:', input_text)
    #print('Spell Decoded sentence:', pre_corrected_sentence) 
    #print('Char Decoded sentence:', decoded_sentence)   
    
    #print('Word Decoded sentence:', corrected_sentence) 
    print(corrected_sentence) 
    #print('\n')



text 








Fai 
10 
7521509 
(FISTDEOO) 
at 
11/3/2017 
5:23:19 
from 
-9373834004 
Req 
IC 
2017:1030525109:292E. 
Page 
4 
of 
5 
Act 











11/3/2017 
FRI 
8:26 
FAX 
2373834004 
Kjooas00s 



as3-ursasy3 
11:30:11 
11/2/2017 
vis 



a 
a 
a 
ACCIDENT 
CLAIM 
FORM 

UU 
numb 
Tha 
Benefits 
Canter 

Poor 
Bax 
100158, 
Calumbin, 
EC 
20202-3150 

Tol-frea: 
1-800-635-5587 
Fax 
1-800-447-2488 

Gall 
toll-free 
Monday 
through 
Friday 
8 
am 
lo 
8 
pm 
Eastern 
Time 







































[ 
ATTENDING 
PHYSICIAN 
STATEMENT 
] 


IneurexiPolicyt 
altar 
Hama 
Lael 
Name 
Flisk 
Naman 
MID 
Suffix 
Data 
of 
Isth 
{msmidrfyy) 
- 


Fault 
Nana 
Claut 
Humet 
Filial 
Numac 
1 
Suth 
Days 
al 
Ban 
raved 
Ul 
a 

-[ECIpENT 
DETAILS 
] 



















a 
this 
Gundilan 
the 
result 
of 
a 
accidental 
inury 
yes 
O 
No 
if 
yas 
dale 
of 
accident 
qre/ddlyy) 
[1 
0] 
[z]e 
[=] 



Is 
Mig 
condition 
Lhe 
result 
of 
refer 
employment 
£1 
Yes 
no 
[1 
Unk

In [51]:
input_texts = ['☑ @ New Enrollee ☐ Change to Existing Coverage ☐ Reinstatement']
for input_text in input_texts:
    len_range = max_sent_lengths[-1] # Take the longest range
    for length in max_sent_lengths:
        if(len(input_text) < length):
            len_range = length
            break
    #print(len_range)
    print(input_text)
    pre_corrected_sentence = word_spell_correct(input_text)


    input_text = clean_up_sentence(input_text, vocab_to_int[len_range])
    encoder_input_data = vectorize_data(input_texts=[input_text], max_encoder_seq_length=max_encoder_seq_length[len_range], num_encoder_tokens=num_encoder_tokens[len_range], vocab_to_int=vocab_to_int[len_range])



    # gt_texts does not cover this hand-entered sentence, so no ground truth is looked up here

    input_seq = encoder_input_data
    #print(input_seq.shape)
    #print(max_decoder_seq_length[len_range])
    #print(max_decoder_seq_length)

    decoded_sentence,_  = decode_sequence(input_seq, encoder_model[len_range], decoder_model[len_range], num_decoder_tokens[len_range],  max_decoder_seq_length[len_range], vocab_to_int[len_range], int_to_vocab[len_range])
    corrected_sentence = word_spell_correct(input_text)
    #print('-Lenght = ', len_range)
    print('Input sentence:', input_text)
    #print('Spell Decoded sentence:', pre_corrected_sentence) 
    #print('Char Decoded sentence:', decoded_sentence)   
    
    print('Word Decoded sentence:', corrected_sentence) 
    #print(corrected_sentence) 
    #print('\n')



☑ @ New Enrollee ☐ Change to Existing Coverage ☐ Reinstatement
☑ @ New Enrollee ☐ Change to Existing Coverage ☐ Reinstatement 
☑ @ New Enrollee ☐ Change to Existing Coverage ☐ Reinstatement 


# Handwriting correction

In [54]:
num_samples = 1000000

OCR_data = os.path.join(data_path, 'handwritten_output.txt')
input_texts, target_texts, gt_texts = load_data_with_gt(OCR_data, num_samples, max_sent_len=10000, min_sent_len=0, delimiter='|', gt_index=0, prediction_index=1)

# Sample data
print(len(input_texts))
for i in range(100):
    print(input_texts[i], '\n', target_texts[i])

3749
 is insisting on a policy of change . 
 
 	is insisting on a policy of change . 

 12/29/17 
 
 	12/29/17 

 SIAL)TH 
 
 	SLP(L) THA 

 Arcadia CA 91007 
 
 	Arcadia CA 91007 

 (012) 667 9375 
 
 	(012) 6674375 

 In this 200-fathom trench the herring do not tood the botton . 
 
 	In this 200-fathom trench the herring do not touch the bottom . 

 43638556X1 
 
 	43638556X1 

 Pretoria 
 
 	Pretoria 

 fiddaling about with bils of cost . 
 
 	fiddling about with bills of cost . 

 ( Fig. 3) . Loop threed tound liite finger t 
 
 	( Fig. 3 ) . Loop thread round little finger , 

 200681383 
 
 	200681383 

 for a working week of 34 to 36 houns . 
 
 	for a working week of 34 to 36 hours . 

 Daugher 
 
 	Daugther 

 Electronically Signed 
 
 	Electronically Signed 

 15122 
 
 	15122 

 50 
 
 	50 

 ShE WAS MOVVD A Picvic TablE TO SWEAP leAveS And DROREd iT oN hER BSC 
 
 	ShE wAs Moving A Picnic TAble To swEEp lEAvEs And DRoPEd iT on hER ToE 

 0724603309 
 
 	0724603509 

 lwas 
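
The correction loop in this section routes every sentence to the model trained for the smallest length bucket that still fits it, and falls back to the longest bucket otherwise. A minimal sketch of that lookup, assuming `max_sent_lengths` is sorted in ascending order; `pick_length_bucket` and the bucket values `[50, 100, 150]` are hypothetical and only illustrate the rule:

```python
from bisect import bisect_right


def pick_length_bucket(text, bucket_lengths):
    """Return the smallest bucket length strictly greater than len(text).

    Falls back to the largest bucket when the text is too long for all of them.
    Assumes bucket_lengths is sorted in ascending order.
    """
    # bisect_right gives the index of the first bucket that is > len(text).
    idx = bisect_right(bucket_lengths, len(text))
    if idx == len(bucket_lengths):
        return bucket_lengths[-1]
    return bucket_lengths[idx]


# Hypothetical bucket boundaries, for illustration only.
example_buckets = [50, 100, 150]
print(pick_length_bucket("Arcadia CA 91007", example_buckets))
print(pick_length_bucket("x" * 400, example_buckets))
```

This mirrors the strict `<` comparison used in the loop: a sentence whose length equals a bucket boundary moves up to the next bucket.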

In [None]:
# Write a Markdown table comparing each handwritten (HW) OCR line with its
# word-level spell-corrected version and the ground truth (GT).
with open('RESULTS_HW.md', 'w') as results:
    results.write('|HW sentence|Corrected sentence|GT sentence|\n')
    results.write('|---------------|-----------|----------------|\n')

    # Iterate over inputs together with their ground truth so each row is aligned.
    for input_text, target_text in zip(input_texts, gt_texts):
        # Pick the smallest length bucket that fits; default to the longest one.
        len_range = max_sent_lengths[-1]
        for length in max_sent_lengths:
            if len(input_text) < length:
                len_range = length
                break

        # Word-level correction of the raw OCR text (kept for optional debugging).
        pre_corrected_sentence = word_spell_correct(input_text)

        input_text = clean_up_sentence(input_text, vocab_to_int[len_range])
        encoder_input_data = vectorize_data(input_texts=[input_text], max_encoder_seq_length=max_encoder_seq_length[len_range], num_encoder_tokens=num_encoder_tokens[len_range], vocab_to_int=vocab_to_int[len_range])

        input_seq = encoder_input_data

        # Char-level seq2seq decoding (only printed when the debug line is enabled).
        decoded_sentence, _ = decode_sequence(input_seq, encoder_model[len_range], decoder_model[len_range], num_decoder_tokens[len_range], max_decoder_seq_length[len_range], vocab_to_int[len_range], int_to_vocab[len_range])
        # Word-level correction of the cleaned-up text; this is what gets reported.
        corrected_sentence = word_spell_correct(input_text)

        print('Input sentence:', input_text)
        #print('Spell Decoded sentence:', pre_corrected_sentence)
        #print('Char Decoded sentence:', decoded_sentence)
        print('Word Decoded sentence:', corrected_sentence)
        results.write(' | ' + input_text + ' | ' + corrected_sentence + ' | ' + target_text.strip() + ' | \n')


Input sentence: is insisting on a policy of change .
Word Decoded sentence: is insisting on a policy of change . 
Input sentence: 12/29/17
Word Decoded sentence: 12/29/17 
Input sentence: SIAL)TH
Word Decoded sentence: SIAL)TH 
Input sentence: Arcadia CA 91007
Word Decoded sentence: Arcadia CA 91007 
Input sentence: (012) 667 9375
Word Decoded sentence: (012) 667 9375 
Input sentence: In this 200-fathom trench the herring do not tood the botton .
Word Decoded sentence: In this 200-fathom trench the herring do not good the cotton . 
Input sentence: 43638556X1
Word Decoded sentence: 43638556X1 
Input sentence: Pretoria
Word Decoded sentence: Pretoria 
Input sentence: fiddaling about with bils of cost .
Word Decoded sentence: fiddling about with bill of cost . 
Input sentence: ( Fig. 3) . Loop threed tound liite finger t
Word Decoded sentence: ( Fig 3) . Loop three Tound lite finger t 
Input sentence: 200681383
Word Decoded sentence: 200681383 
Input sentence: for a working week of 34 to 

Input sentence: 336-545-5000
Word Decoded sentence: 336-545-5000 
Input sentence: ()
Word Decoded sentence: of 
Input sentence: as 1830 , when Anglesey believed himself
Word Decoded sentence: as 1830 , when Anglesey believed himself 
Input sentence: gunshot, wound to abdomen
Word Decoded sentence: gunshot wound to abdomen 
Input sentence: Commie of 100 , the anti-nuclear army
Word Decoded sentence: Commie of 100 , the antinuclear army 
Input sentence: NA
Word Decoded sentence: NA 
Input sentence: no
Word Decoded sentence: no 
Input sentence: 9321 North Oak Traffieway
Word Decoded sentence: 9321 North Oak Trafficway 
Input sentence: 3:00 P
Word Decoded sentence: 3:00 P 
Input sentence: M. Macleod thought the two Rhodesian
Word Decoded sentence: My Macleod thought the two Rhodesian 
Input sentence: 3-9-18
Word Decoded sentence: 3-9-18 
Input sentence: 20/07/2018
Word Decoded sentence: 20/07/2018 
Input sentence: 045154387X2
Word Decoded sentence: 045154387X2 
Input sentence: Tylendl 325i

Input sentence: Karnes, Jonathon L.
Word Decoded sentence: Earnest Jonathon Ll 
Input sentence: 2/14/18
Word Decoded sentence: 2/14/18 
Input sentence: Julie Vandee Werff PAC
Word Decoded sentence: Julie Candee Werf PAC 
Input sentence: lost 2days
Word Decoded sentence: lost 2days 
Input sentence: 6305270199083
Word Decoded sentence: 6305270199083 
Input sentence: 4541A
Word Decoded sentence: 4541A 
Input sentence: 12-15-17
Word Decoded sentence: 12-15-17 
Input sentence: services was wholly difterent - findamentally
Word Decoded sentence: services was wholly different - fundamentally 
Input sentence: ben uhu()mweb.co.za.
Word Decoded sentence: ben uhu()mweb.co.za. 
Input sentence: the 6tic douloureux . As early as 1830 , whan
Word Decoded sentence: the 6tic douloureux . As early as 1830 , whan 
Input sentence: in fighting . Pesterday the shirs turned away
Word Decoded sentence: in fighting . Yesterday the shirs turned away 
Input sentence: 10
Word Decoded sentence: 10 
Input sentence:

Input sentence: gossip " that Weaver once had Commist affilia-
Word Decoded sentence: gossip " that Weaver once had Commits affilia- 
Input sentence: Northern Rhodesia is a member of the Federation .
Word Decoded sentence: Northern Rhodesia is a member of the Federation . 
Input sentence: Pasbus 29758, Danhof Bloemfontein 9310
Word Decoded sentence: Passus 29758, Danhof Bloemfontein 9310 
Input sentence: to angry uproar . One dealt with the humitn
Word Decoded sentence: to angry uproar . One dealt with the Humian 
Input sentence: T
Word Decoded sentence: T 
Input sentence: Melkbesstrand
Word Decoded sentence: Melkbesstrand 
Input sentence: 236 west main Ste 202
Word Decoded sentence: 236 west main Ste 202 
Input sentence: bably the taugheet man in Mr. Hkrumahi's team ,
Word Decoded sentence: badly the taught man in Mrp Hkrumahi's team , 
Input sentence: 5 lb weight bearing
Word Decoded sentence: 5 lb weight bearing 
Input sentence: 45109417X1
Word Decoded sentence: 45109417X1 
Input se

Input sentence: inprtrent adlmoson at Melly
Word Decoded sentence: inprtrent adlmoson at Melly 
Input sentence: childrens and sick people than to take onthis vast
Word Decoded sentence: children and sick people than to take Orthis vast 
Input sentence: cafe 27 . " Dec said , they ace pocted
Word Decoded sentence: cafe 27 . " Dec said , they ace posted 
Input sentence: 1/17/18
Word Decoded sentence: 1/17/18 
Input sentence: 6403075030080
Word Decoded sentence: 6403075030080 
Input sentence: BA00000034812X
Word Decoded sentence: BA00000034812X 
Input sentence: (517) 205-1591
Word Decoded sentence: (517) 205-1591 
Input sentence: See abive
Word Decoded sentence: See above 
Input sentence: 98940
Word Decoded sentence: 98940 
Input sentence: mahing progess . His basic defece of the
Word Decoded sentence: making process . His basic defect of the 
Input sentence: complaih abast Weaver's toyaltet
Word Decoded sentence: complain apast weavers toyaltet 
Input sentence: 3/2/18
Word Decoded senten

Input sentence: Christiaan Daniel Jacobs
Word Decoded sentence: Christiaan Daniel Jacobs 
Input sentence: St.Vincents Hospital
Word Decoded sentence: St.Vincents Hospital 
Input sentence: 20 yrars they will have free food , housing , light ,
Word Decoded sentence: 20 years they will have free food , housing , light , 
Input sentence: anneric) Chrutheline.co.za I moedken)
Word Decoded sentence: anneric) Chrutheline.co.za I moedken) 
Input sentence: 719 West Hamitton Avenue Suite C
Word Decoded sentence: 719 West Hamilton Avenue Suite C 
Input sentence: rowing this dinghy ( Fig. 3). They are very simple, Cheep
Word Decoded sentence: rowing this dinghy ( Fig 3). They are very simple Cheep 
Input sentence: Urgent Care
Word Decoded sentence: Urgent Care 
Input sentence: 6-15-17
Word Decoded sentence: 6-15-17 
Input sentence: Dayles town
Word Decoded sentence: Rayles town 
Input sentence: " bame 161 ient ragt as a mmooth ,
Word Decoded sentence: " same 161 ient rat as a smooth , 
Input sente

Input sentence: Pparson 34 per cent) . How far is Mr.
Word Decoded sentence: Pearson 34 per cent . How far is Mrp 
Input sentence: GRAD
Word Decoded sentence: GRAD 
Input sentence: M47.812
Word Decoded sentence: M47.812 
Input sentence: step . If thatis their decision they should also
Word Decoded sentence: step . If thetis their decision they should also 
Input sentence: 3/6/2018
Word Decoded sentence: 3/6/2018 
Input sentence: DATTON
Word Decoded sentence: HATTON 
Input sentence: mother
Word Decoded sentence: mother 
Input sentence: 712-239-2866
Word Decoded sentence: 712-239-2866 
Input sentence: T9RE Press , many docters and public were
Word Decoded sentence: T9RE Press , many doctors and public were 
Input sentence: 6307275112 087
Word Decoded sentence: 6307275112 087 
Input sentence: hodesia , but the Coldhial Secretayy , Mr. Iain
Word Decoded sentence: rhodesia , but the Colonial Secretary , Mrp Iain 
Input sentence: Farrelcinigh (a)claredon schools.co8a
Word Decoded sentence: F

Input sentence: Vanderoiglpark
Word Decoded sentence: Vanderoiglpark 
Input sentence: with pur o more childen 19 per cent less . The only
Word Decoded sentence: with pur o more children 19 per cent less . The only 
Input sentence: Son all foode the nange was fron 3.6 per cent above the natiml anage
Word Decoded sentence: Son all food the range was from 3.6 per cent above the nation manage 
Input sentence: 03-05-18
Word Decoded sentence: 03-05-18 
Input sentence: went to her fianct22's house and
Word Decoded sentence: went to her fianct22's house and 
Input sentence: 1
Word Decoded sentence: 1 
Input sentence: Peeresses have beern created . Most Labour
Word Decoded sentence: Peeresses have been created . Most Labour 
Input sentence: 79701
Word Decoded sentence: 79701 
Input sentence: Petersburg WV 26547
Word Decoded sentence: Petersburg WV 26547 
Input sentence: as a percentage of social service eependiture ,
Word Decoded sentence: as a percentage of social service expenditure , 
Input 

Input sentence: deitth" was underpinning - not undermining -
Word Decoded sentence: deitth" was underpinning - not undermining - 
Input sentence: 49.505
Word Decoded sentence: 49.505 
Input sentence: Phoenix,
Word Decoded sentence: Phoenix 
Input sentence: level of emplayees 7 2.If so , whch occupational
Word Decoded sentence: level of employees 7 2.If so , which occupational 
Input sentence: 363012420
Word Decoded sentence: 363012420 
Input sentence: CeHD
Word Decoded sentence: Ced 
Input sentence: wok in 1360 420000 ( 23 per cent)
Word Decoded sentence: wok in 1360 420000 ( 23 per cent 
Input sentence: Purley , and from 1922 at Suteito I was Lontinnousty
Word Decoded sentence: Purely , and from 1922 at Suteito I was Lontinnousty 
Input sentence: PAREND JACOBUS BUPGER
Word Decoded sentence: PARENT JACOBUS BUNGER 
Input sentence: 071 163 5856
Word Decoded sentence: 071 163 5856 
Input sentence: 3
Word Decoded sentence: 3 
Input sentence: 30/07/2018
Word Decoded sentence: 30/07/2018 
In

Input sentence: K40.90
Word Decoded sentence: K40.90 
Input sentence: 2-26-18
Word Decoded sentence: 2-26-18 
Input sentence: 93940
Word Decoded sentence: 93940 
Input sentence: 7501125013086
Word Decoded sentence: 7501125013086 
Input sentence: POSBUS 755 MUSINA 0900
Word Decoded sentence: POSTBUS 755 MUSING 0900 
Input sentence: Pretoria
Word Decoded sentence: Pretoria 
Input sentence: 350-00
Word Decoded sentence: 350-00 
Input sentence: 1-10-18
Word Decoded sentence: 1-10-18 
Input sentence: 100
Word Decoded sentence: 100 
Input sentence: their whies to defeat a censure notion
Word Decoded sentence: their whites to defeat a censure notion 
Input sentence: 20/27/2018
Word Decoded sentence: 20/27/2018 
Input sentence: Daughter
Word Decoded sentence: Daughter 
Input sentence: 9/13/17
Word Decoded sentence: 9/13/17 
Input sentence: 3-7-18
Word Decoded sentence: 3-7-18 
Input sentence: 4401255 0072085
Word Decoded sentence: 4401255 0072085 
Input sentence: 1/31/18
Word Decoded sentence:

Input sentence: anneria Chrucheline.co.za ( moeden)
Word Decoded sentence: anaemia Chrucheline.co.za ( moeden) 
Input sentence: Cervical Spondylosi
Word Decoded sentence: Cervical Spondylosis 
Input sentence: 7403100109080
Word Decoded sentence: 7403100109080 
Input sentence: Hee attached
Word Decoded sentence: Hee attached 
Input sentence: Family Practice .
Word Decoded sentence: Family Practice . 
Input sentence: Kyauwdm(a)gmaul.com
Word Decoded sentence: Kyauwdm(a)gmaul.com 
Input sentence: 72 0719 0180 087
Word Decoded sentence: 72 0719 0180 087 
Input sentence: film so vividly to life . In Fanny , which
Word Decoded sentence: film so vividly to life . In Fanny , which 
Input sentence: AR
Word Decoded sentence: AR 
Input sentence: 6e mistaken , 1tho' I cannot but fear that
Word Decoded sentence: 6e mistaken , 1tho' I cannot but fear that 
Input sentence: Party But representatives of Sir Roy
Word Decoded sentence: Party But representatives of Sir Roy 
Input sentence: balance-sheet m

Input sentence: OFF ALUARSL File #
Word Decoded sentence: OFF ALTARS File # 
Input sentence: Surgery
Word Decoded sentence: Surgery 
Input sentence: 12/30/17
Word Decoded sentence: 12/30/17 
Input sentence: 2/27/18
Word Decoded sentence: 2/27/18 
Input sentence: RSA
Word Decoded sentence: RSA 
Input sentence: young people aged 15-17 starting
Word Decoded sentence: young people aged 15-17 starting 
Input sentence: timbee as shown in tig. 1 . Although the timber will have aleeady
Word Decoded sentence: timber as shown in tig 1 . Although the timber will have already 
Input sentence: 80 Sese Hill JR pr No
Word Decoded sentence: 80 Sese Hill JR pr No 
Input sentence: M47.812
Word Decoded sentence: M47.812 
Input sentence: of 700 , told Kennedy that he should
Word Decoded sentence: of 700 , told Kennedy that he should 
Input sentence: 045141073X2
Word Decoded sentence: 045141073X2 
Input sentence: 1 . Grasp thread near end between thumb
Word Decoded sentence: 1 . Grasp thread near end betwe

Input sentence: 7447 W.Talcott #316
Word Decoded sentence: 7447 W.Talcott #316 
Input sentence: Kibcials were ahead ( 43 per cent were in farour
Word Decoded sentence: Kibcials were ahead ( 43 per cent were in favour 
Input sentence: MICAEL N. RIRER
Word Decoded sentence: MICHAEL No RIVER 
Input sentence: Hair Salon
Word Decoded sentence: Hair Salon 
Input sentence: (2)
Word Decoded sentence: (2) 
Input sentence: 2
Word Decoded sentence: 2 
Input sentence: being taken directly from the work . Fig. 4
Word Decoded sentence: being taken directly from the work . Fig 4 
Input sentence: 3000-00
Word Decoded sentence: 3000-00 
Input sentence: Rlemfmkin
Word Decoded sentence: Rlemfmkin 
Input sentence: cecile-houy(a)gyima.com
Word Decoded sentence: cecile-houy(a)gyima.com 
Input sentence: 02/19/2018
Word Decoded sentence: 02/19/2018 
Input sentence: (S92.412D)
Word Decoded sentence: (S92.412D) 
Input sentence: inside of the chair arms about 2 1/2 in. from the
Word Decoded sentence: inside of t

Input sentence: has to puss Mr. Weaver's nomination blgire it
Word Decoded sentence: has to puss Mrp weavers nomination Blaire it 
Input sentence: 8 Rilemoods, Church Road, Walmer, Pott Elizabch , 6070
Word Decoded sentence: 8 Rilemoods, Church Road Warmer Pott Elizabch , 6070 
Input sentence: Post Operative visit
Word Decoded sentence: Post Operative visit 
Input sentence: 02/09/1958
Word Decoded sentence: 02/09/1958 
Input sentence: 12/29/17
Word Decoded sentence: 12/29/17 
Input sentence: 2
Word Decoded sentence: 2 
Input sentence: Son
Word Decoded sentence: Son 
Input sentence: 3149912003
Word Decoded sentence: 3149912003 
Input sentence: 46545
Word Decoded sentence: 46545 
Input sentence: 370r ago
Word Decoded sentence: 370r ago 
Input sentence: Washington next week . A big slice of
Word Decoded sentence: Washington next week . A big slice of 
Input sentence: 5-6-18
Word Decoded sentence: 5-6-18 
Input sentence: Loganathan, Amritray
Word Decoded sentence: Loganathan, Amritray 
Inp

Input sentence: Fiance
Word Decoded sentence: Fiance 
Input sentence: RAT DE WEE
Word Decoded sentence: RAT DE WEE 
Input sentence: SS86.011D
Word Decoded sentence: SS86.011D 
Input sentence: Mudi Mina Pnon
Word Decoded sentence: Mud Mina Non 
Input sentence: OP lpost op Ofice visit
Word Decoded sentence: OP lost op Office visit 
Input sentence: Displaled fracture or lateral malleolus , left fibula
Word Decoded sentence: Displaced fracture or lateral malleolus , left fibula 
Input sentence: 20 06, 1951
Word Decoded sentence: 20 06, 1951 
Input sentence: Northern Rhodesia is a memmber of the Fede-
Word Decoded sentence: Northern Rhodesia is a member of the Fedex 
Input sentence: 471227 0067 081
Word Decoded sentence: 471227 0067 081 
Input sentence: brijlal300(a)gmail.com
Word Decoded sentence: brijlal300(a)gmail.com 
Input sentence: N.A.
Word Decoded sentence: Near 
Input sentence: RSA
Word Decoded sentence: RSA 
Input sentence: Felicity Maniatis
Word Decoded sentence: Felicity Manatis

Input sentence: tremtle for my country " I may be mistcken , 1the' 1
Word Decoded sentence: tremble for my country " I may be mistaken , 1the' 1 
Input sentence: CRELA4BRAU ELOAECHO in Hobrew, Salt is the coveneut of thy God .
Word Decoded sentence: CRELA4BRAU ELOAECHO in Hobrew, Salt is the covenant of thy God . 
Input sentence: S21.90XA
Word Decoded sentence: S21.90XA 
Input sentence: 127-04-2019
Word Decoded sentence: 127-04-2019 
Input sentence: (021) 553 3235
Word Decoded sentence: (021) 553 3235 
Input sentence: PO Box 252 Connwall 1/11, 0178
Word Decoded sentence: PO Box 252 Cornwall 1/11, 0178 
Input sentence: IC
Word Decoded sentence: IC 
Input sentence: PO Box 870. Greatbrakriver 6525
Word Decoded sentence: PO Box 870. Greatbrakriver 6525 
Input sentence: nO bendingiho tois ting, no lifting greiter then 4 lbs
Word Decoded sentence: no bendingiho this tinge no lifting greater then 4 lbs 
Input sentence: 0187/2931X7
Word Decoded sentence: 0187/2931X7 
Input sentence: Ne came th

Input sentence: to werk with ; and , altuonye it does
Word Decoded sentence: to were with ; and , altuonye it does 
Input sentence: Voice Authorized
Word Decoded sentence: Voice Authorized 
Input sentence: 4
Word Decoded sentence: 4 
Input sentence: 17/02/2008
Word Decoded sentence: 17/02/2008 
Input sentence: TEbello N Masweneng
Word Decoded sentence: TEbello N Masweneng 
Input sentence: Fay Compton stars in " Ns Hiding (tace " ( T7V , 1.75 p.m. ) .
Word Decoded sentence: Fay Compton stars in " Ns Hiding Stace " ( T7V , 1.75 pm ) . 
Input sentence: Same as life insured
Word Decoded sentence: Same as life insured 
Input sentence: 045149682X2
Word Decoded sentence: 045149682X2 
Input sentence: Premier urgent Core
Word Decoded sentence: Premier urgent Core 
Input sentence: 3commer
Word Decoded sentence: 3commer 
Input sentence: 26-2589825
Word Decoded sentence: 26-2589825 
Input sentence: his days as Brtain's chief UN dele-
Word Decoded sentence: his days as brains chief UN dele 
Input s

Input sentence: Les0 ontshioagae
Word Decoded sentence: Les0 ontshioagae 
Input sentence: MARGARETHA 5.3 VENTER
Word Decoded sentence: MARGARETHA 5.3 VENTER 
Input sentence: Med +
Word Decoded sentence: Med a 
Input sentence: 33.4
Word Decoded sentence: 33.4 
Input sentence: (L)hipoA
Word Decoded sentence: (L)hipoA 
Input sentence: 3/9/18
Word Decoded sentence: 3/9/18 
Input sentence: Eau Claire
Word Decoded sentence: Eau Claire 
Input sentence: 2/14/18
Word Decoded sentence: 2/14/18 
Input sentence: Begundigile
Word Decoded sentence: Begundigile 
Input sentence: Thaseate Bankng Commiter , wiih is headed
Word Decoded sentence: Thaseate Banking Commuter , with is headed 
Input sentence: 3/27/18
Word Decoded sentence: 3/27/18 
Input sentence: 136 Van Jaarrveld Stret
Word Decoded sentence: 136 Van Jaarrveld Stret 
Input sentence: 574-875-9323
Word Decoded sentence: 574-875-9323 
Input sentence: 49201
Word Decoded sentence: 49201 
Input sentence: VREDENAUROT
Word Decoded sentence: VREDENAU

Input sentence: lumbar sprain/strain
Word Decoded sentence: lumbar sprain/strain 
Input sentence: in the Hy hoch , only 30 miles from Glengow , a
Word Decoded sentence: in the Hy hoch , only 30 miles from Glengow , a 
Input sentence: MO
Word Decoded sentence: MO 
Input sentence: at te is to be backed by Mr. Will
Word Decoded sentence: at te is to be backed by Mrp Will 
Input sentence: 2.22.18
Word Decoded sentence: 2.22.18 
Input sentence: negotiatias with Sir Roy's representative ,
Word Decoded sentence: negotiations with Sir royal representative , 
Input sentence: Commonwealtn enay facility passinleda
Word Decoded sentence: Commonwealth nay facility passinleda 
Input sentence: Christiaan Daniel Jacobs
Word Decoded sentence: Christiaan Daniel Jacobs 
Input sentence: Foreign Minister , and Mr. Heath . MR. Seliyn Llogd-
Word Decoded sentence: Foreign Minister , and Mrp Heath . MRP Selion Llogd- 
Input sentence: the riots in Istanbul , which eenlivened the NATO
Word Decoded sentence: the

Input sentence: 01/25/18
Word Decoded sentence: 01/25/18 
Input sentence: KD Swiegelaa
Word Decoded sentence: KD Swiegelaa 
Input sentence: 13-08-1969
Word Decoded sentence: 13-08-1969 
Input sentence: 9:00 A4
Word Decoded sentence: 9:00 A4 
Input sentence: The Govenment's pompous little statement
Word Decoded sentence: The governments pompous little statement 
Input sentence: 0
Word Decoded sentence: 0 
Input sentence: Feb 6, 2018
Word Decoded sentence: Feb 6, 2018 
Input sentence: SLSANNA A PONRAOIE
Word Decoded sentence: SUSANNA A PONRAOIE 
Input sentence: MD
Word Decoded sentence: MD 
Input sentence: becuuse tir Roy had found messages
Word Decoded sentence: because sir Roy had found messages 
Input sentence: N/A
Word Decoded sentence: Na 
Input sentence: 1201 E. Michigan Ave Ste 240
Word Decoded sentence: 1201 E Michigan Ave Ste 240 
Input sentence: 3/6/18
Word Decoded sentence: 3/6/18 
Input sentence: 50
Word Decoded sentence: 50 
Input sentence: a man with trobles enough lack fom

Input sentence: SugamaC. J. Jansen Vom Vunon
Word Decoded sentence: SugamaC. J Jansen Vom Upon 
Input sentence: 12/30/17
Word Decoded sentence: 12/30/17 
Input sentence: 12-15-17
Word Decoded sentence: 12-15-17 
Input sentence: DPM
Word Decoded sentence: DPM 
Input sentence: " The jackals bay when there is nothing belter they can do . "
Word Decoded sentence: " The jackals bay when there is nothing belter they can do . " 
Input sentence: Christiaan Daniel Jacoks
Word Decoded sentence: Christiaan Daniel Jacks 
Input sentence: Pol plcasant Vailly 7d
Word Decoded sentence: Pol pleasant Vainly 7d 
Input sentence: 3-15-18
Word Decoded sentence: 3-15-18 
Input sentence: 11:10
Word Decoded sentence: 11:10 
Input sentence: you ar out of date . Gricet in 1461 is ploye
Word Decoded sentence: you ar out of date . Grivet in 1461 is ploce 
Input sentence: 9563724X5 41350303X8
Word Decoded sentence: 9563724X5 41350303X8 
Input sentence: 1000-00
Word Decoded sentence: 1000-00 
Input sentence: majorit

Input sentence: Occopaton Meliccn
Word Decoded sentence: Occupation Meliccn 
Input sentence: 100
Word Decoded sentence: 100 
Input sentence: the Government adequcte power to maintain all
Word Decoded sentence: the Government adequate power to maintain all 
Input sentence: cilegates fron Mr. Fennth Kaunda's Unitea
Word Decoded sentence: delegates from Mrp Length Kaunda's United 
Input sentence: m25.512 M75.102
Word Decoded sentence: m25.512 M75.102 
Input sentence: a linitaled state of emorpency " was dedornt , vitg
Word Decoded sentence: a linitaled state of emergency " was deodorant , vite 
Input sentence: S89202A
Word Decoded sentence: S89202A 
Input sentence: Dertisy
Word Decoded sentence: Destiny 
Input sentence: 6801180089083
Word Decoded sentence: 6801180089083 
Input sentence: 20
Word Decoded sentence: 20 
Input sentence: Seanswanepoela)gmail.com
Word Decoded sentence: Seanswanepoela)gmail.com 
Input sentence: 1.8.2018
Word Decoded sentence: 1.8.2018 
Input sentence: ERI Wobr Jo

Input sentence: and arrived at Hounslow around t P77 , whee
Word Decoded sentence: and arrived at Hounslow around t P77 , whee 
Input sentence: at (B) , Fig 1 . To mak i leg suchas
Word Decoded sentence: at By , Fig 1 . To mak i leg Suchos 
Input sentence: 11/03/2000
Word Decoded sentence: 11/03/2000 
Input sentence: 8809165098084
Word Decoded sentence: 8809165098084 
Input sentence: 014121349X6
Word Decoded sentence: 014121349X6 
Input sentence: 513-354-3700
Word Decoded sentence: 513-354-3700 
Input sentence: 56-0951114
Word Decoded sentence: 56-0951114 
Input sentence: 3/2/2018
Word Decoded sentence: 3/2/2018 
Input sentence: PasBus 10032, PANABAAT , 6510
Word Decoded sentence: passus 10032, PANABAAT , 6510 
Input sentence: Suy 10 6R
Word Decoded sentence: Say 10 6R 
Input sentence: the is not beig hopocritical . That is what he
Word Decoded sentence: the is not being hypocritical . That is what he 
Input sentence: 0
Word Decoded sentence: 0 
Input sentence: 30
Word Decoded sentence

Input sentence: Pdlyaipstence abuse
Word Decoded sentence: Pdlyaipstence abuse 
Input sentence: N/A
Word Decoded sentence: Na 
Input sentence: Lifting bending, tistig
Word Decoded sentence: Lifting bending testis 
Input sentence: defented the appaintmment of a Neyo as
Word Decoded sentence: defeated the appointment of a Neo as 
Input sentence: put down a vesolution on the subject
Word Decoded sentence: put down a resolution on the subject 
Input sentence: The film version of Miss thelagh Delancy's play
Word Decoded sentence: The film version of Miss Shelagh Delancy's play 
Input sentence: 4.5
Word Decoded sentence: 4.5 
Input sentence: Harry Carroll from Ceicester ( BBC , 8.25 ) . A
Word Decoded sentence: Harry Carroll from Leicester ( BBC , 8.25 ) . A 
Input sentence: lenting friendship begon in 1913 i Purley end
Word Decoded sentence: renting friendship began in 1913 i Purely end 
Input sentence: instalment of almost 8000,000
Word Decoded sentence: instalment of almost 8000,000 
Inpu

Input sentence: by 4in. attoched to the front 2egs with a pair
Word Decoded sentence: by 4in. attached to the front 2egs with a pair 
Input sentence: racial discrimination in Government
Word Decoded sentence: racial discrimination in Government 
Input sentence: 6
Word Decoded sentence: 6 
Input sentence: Next apt 2/20/18
Word Decoded sentence: Next apt 2/20/18 
Input sentence: Willochell, Teri
Word Decoded sentence: Willochell, Teri 
Input sentence: 2/20/18
Word Decoded sentence: 2/20/18 
Input sentence: WIFE
Word Decoded sentence: WIFE 
Input sentence: Fx
Word Decoded sentence: Fx 
Input sentence: 64 inches
Word Decoded sentence: 64 inches 
Input sentence: the Uuted Federel Party and the Domnon Pory.
Word Decoded sentence: the Usted Federal Party and the Common Pory 
Input sentence: PO.90X 50611 WATERPRONI 5 0617
Word Decoded sentence: PO.90X 50611 WATERPROOF 5 0617 
Input sentence: 2 12 18
Word Decoded sentence: 2 12 18 
Input sentence: tremble for my country ! I may be mistaken , 1t

Input sentence: at the week -end fortalks with Mr. Macmillan
Word Decoded sentence: at the week end mortals with Mrp Macmillan 
Input sentence: noon
Word Decoded sentence: noon 
Input sentence: 044766389X9
Word Decoded sentence: 044766389X9 
Input sentence: people .
Word Decoded sentence: people . 
Input sentence: his Hausing Minister . It has aroused strong
Word Decoded sentence: his Hausing Minister . It has aroused strong 
Input sentence: Kassie(odischem.co.za. / Kossemeit31970(a)gmail.cony
Word Decoded sentence: Kassie(odischem.co.za. / Kossemeit31970(a)gmail.cony 
Input sentence: Neurosurgery
Word Decoded sentence: Neurosurgery 
Input sentence: 2-19-18
Word Decoded sentence: 2-19-18 
Input sentence: Charity lowdermilk
Word Decoded sentence: Charity lowdermilk 
Input sentence: day called on Mr. Macmillan to cease his
Word Decoded sentence: day called on Mrp Macmillan to cease his 
Input sentence: RSA
Word Decoded sentence: RSA 
Input sentence: See page 13
Word Decoded sentence: See

Input sentence: Opia douleureux . As early as 1830 , when
Word Decoded sentence: Oria douleureux . As early as 1830 , when 
Input sentence: Mr. Macod was not at the west-end meeteng . But he told
Word Decoded sentence: Mrp Macon was not at the west-end meeting . But he told 
Input sentence: 011907868X1/019098415X3
Word Decoded sentence: 011907868X1/019098415X3 
Input sentence: 9/28/17
Word Decoded sentence: 9/28/17 
Input sentence: 4401135008080
Word Decoded sentence: 4401135008080 
Input sentence: Julan Tix Frea
Word Decoded sentence: Yulan Six Free 
Input sentence: (dguphto)
Word Decoded sentence: (dguphto) 
Input sentence: 9/14/17
Word Decoded sentence: 9/14/17 
Input sentence: Acute reppuetery feailuct
Word Decoded sentence: Acute reppuetery feailuct 
Input sentence: 4169/778X9
Word Decoded sentence: 4169/778X9 
Input sentence: 2/6/18
Word Decoded sentence: 2/6/18 
Input sentence: (L Shoulder Strain
Word Decoded sentence: Ll Shoulder Strain 
Input sentence: 574-875-9323
Word Decode

Input sentence: (L)hipoA
Word Decoded sentence: (L)hipoA 
Input sentence: Christiaan Daniel Jacobs
Word Decoded sentence: Christiaan Daniel Jacobs 
Input sentence: 3/2
Word Decoded sentence: 3/2 
Input sentence: 082 3304 7983
Word Decoded sentence: 082 3304 7983 
Input sentence: har been removed , and if care is taken
Word Decoded sentence: had been removed , and if care is taken 
Input sentence: 3/22/18
Word Decoded sentence: 3/22/18 
Input sentence: 5
Word Decoded sentence: 5 
Input sentence: 0837957927
Word Decoded sentence: 0837957927 
Input sentence: (S90.122A)
Word Decoded sentence: (S90.122A) 
Input sentence: 3
Word Decoded sentence: 3 
Input sentence: 21/02/2011
Word Decoded sentence: 21/02/2011 
Input sentence: 4230 Hamilton Blvd
Word Decoded sentence: 4230 Hamilton Blvd 
Input sentence: 806-7930013
Word Decoded sentence: 806-7930013 
Input sentence: 018712931X7
Word Decoded sentence: 018712931X7 
Input sentence: 773 631-7898
Word Decoded sentence: 773 631-7898 
Input sentence

Input sentence: lage mojirity o Labour M Ps are lkely to
Word Decoded sentence: large majority o Labour M Ps are likely to 
Input sentence: Gade
Word Decoded sentence: Gade 
Input sentence: MD
Word Decoded sentence: MD 
Input sentence: 29/03/1965
Word Decoded sentence: 29/03/1965 
Input sentence: A 90-day war , the West German
Word Decoded sentence: A 90-day war , the West German 
Input sentence: PuL Schcppere
Word Decoded sentence: pul Schcppere 
Input sentence: 2002 OXfard Ave
Word Decoded sentence: 2002 Oxford Ave 
Input sentence: 320-732-2141
Word Decoded sentence: 320-732-2141 
Input sentence: fallen from 28.5 to 23.A per cent. Then
Word Decoded sentence: fallen from 28.5 to 23.A per cent Then 
Input sentence: (Capitec Bank) Felicity Maniatis
Word Decoded sentence: (Capitec Bank Felicity Manatis 
Input sentence: to dican Weaver's appointement . Sorator
Word Decoded sentence: to divan weavers appointment . Orator 
Input sentence: 1306135525087
Word Decoded sentence: 1306135525087 


Input sentence: alont unchanged in 1959 for couples with two or more
Word Decoded sentence: along unchanged in 1959 for couples with two or more 
Input sentence: (PBC , 10.15 ) .
Word Decoded sentence: PBC , 10.15 ) . 
Input sentence: Daughber
Word Decoded sentence: Daughter 
Input sentence: N/A
Word Decoded sentence: Na 
Input sentence: 840245903
Word Decoded sentence: 840245903 
Input sentence: MD
Word Decoded sentence: MD 
Input sentence: thon aown the Joot-crrifaths resoition . M
Word Decoded sentence: thon down the Joot-crrifaths resolution . M 
Input sentence: 68137
Word Decoded sentence: 68137 
Input sentence: egeat ( B) , Fig. 3 . A saw cnt is
Word Decoded sentence: geat ( By , Fig 3 . A saw cut is 
Input sentence: S82.62xA
Word Decoded sentence: S82.62xA 
Input sentence: White Oak
Word Decoded sentence: White Oak 
Input sentence: 12/28/2017
Word Decoded sentence: 12/28/2017 
Input sentence: RC170193
Word Decoded sentence: RC170193 
Input sentence: 07/09/1986
Word Decoded sente

Input sentence: OY a mas wrpped in the impenctralhe cocoon of what he rpads
Word Decoded sentence: OY a mas wrapped in the impenctralhe cocoon of what he roads 
Input sentence: tremble for my country l I may be mistaten , 1 tho'
Word Decoded sentence: tremble for my country l I may be mistaken , 1 thoo 
Input sentence: 31072018
Word Decoded sentence: 31072018 
Input sentence: S/A
Word Decoded sentence: Sea 
Input sentence: Christiaan Daniel Jacobs
Word Decoded sentence: Christiaan Daniel Jacobs 
Input sentence: (800) 793 8869
Word Decoded sentence: (800) 793 8869 
Input sentence: Goverment . Mr. James Callaghan, Labour's Colonial spokesman ,
Word Decoded sentence: Government . Mrp James Callaghan labours Colonial spokesman , 
Input sentence: 11714955X9
Word Decoded sentence: 11714955X9 
Input sentence: today . PRESIDENT KENNEDY today defed
Word Decoded sentence: today . PRESIDENT KENNEDY today deed 
Input sentence: opplications , of wham 100 were considered snitable by
Word Decoded sen

Input sentence: rever were . " Intercusted by engry Tories,
Word Decoded sentence: rever were . " Interested by angry Toriest 
Input sentence: SD
Word Decoded sentence: SD 
Input sentence: All aillfs relations were 4makrodeb now , but tom
Word Decoded sentence: All ailles relations were 4makrodeb now , but tom 
Input sentence: 3-10-18
Word Decoded sentence: 3-10-18 
Input sentence: suture removal
Word Decoded sentence: suture removal 
Input sentence: 247.89
Word Decoded sentence: 247.89 
Input sentence: 1/5/2018
Word Decoded sentence: 1/5/2018 
Input sentence: cross pieces and is made of 1/2 in. plywont
Word Decoded sentence: cross pieces and is made of 1/2 in plywont 
Input sentence: (Fhand contusion
Word Decoded sentence: Hand contusion 
Input sentence: 2/28/18
Word Decoded sentence: 2/28/18 
Input sentence: 897.0
Word Decoded sentence: 897.0 
Input sentence: RSA
Word Decoded sentence: RSA 
Input sentence: manleenrona)gmail.com
Word Decoded sentence: manleenrona)gmail.com 
Input sent

Input sentence: lefore coming to rest at 5.s . " Here a btf
Word Decoded sentence: before coming to rest at 5.s . " Here a tf 
Input sentence: 28
Word Decoded sentence: 28 
Input sentence: offife sisits
Word Decoded sentence: office visits 
Input sentence: torthe welfare food scheme . It was maintained
Word Decoded sentence: tortie welfare food scheme . It was maintained 
Input sentence: Ne R. Menkshiaagac
Word Decoded sentence: Ne Re Menkshiaagac 
Input sentence: 1602 870-6353
Word Decoded sentence: 1602 870-6353 
Input sentence: 8om
Word Decoded sentence: 8om 
Input sentence: aides to come up with the answer in
Word Decoded sentence: aides to come up with the answer in 
Input sentence: to boput out for the sake of 50000 is there
Word Decoded sentence: to bout out for the sake of 50000 is there 
Input sentence: Caillin Rop20 Lncoln Way
Word Decoded sentence: Caitlin Rop20 Lincoln Way 
Input sentence: rity they are seeking . African delrgates ave
Word Decoded sentence: city they are se

In [None]:
# WER of the model's decoded sentences against the ground truth (test set)
WER_spell_correction = calculate_WER(gt_texts, decoded_sentences)
print('WER_spell_correction |TEST= ', WER_spell_correction)

In [None]:
# WER after the additional word-level correction step (corrected_sentences)
WER_spell_word_correction = calculate_WER(gt_texts, corrected_sentences)
print('WER_spell_word_correction |TEST= ', WER_spell_word_correction)

In [None]:
# Baseline: WER of the raw OCR input against the ground truth (test set)
WER_OCR = calculate_WER(gt_texts, input_texts)
print('WER_OCR |TEST= ', WER_OCR)
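
The three cells above compare the word error rate (WER) of the raw OCR text, the decoded sentences, and the word-corrected sentences against the same ground truth. The sketch below shows one common way such a metric can be computed, under stated assumptions: `sketch_wer` and `word_edit_distance` are hypothetical names introduced here for illustration, sentences are tokenized on whitespace, and the error count is pooled over the whole corpus. It is not necessarily what this notebook's `calculate_WER` does.

```python
def word_edit_distance(ref_words, hyp_words):
    """Levenshtein distance between two lists of words."""
    # prev[j] holds the distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def sketch_wer(references, hypotheses):
    """Corpus-level WER: total word edits / total reference words.

    Assumption: whitespace tokenization and pooled counts; the notebook's
    own calculate_WER may tokenize or average differently.
    """
    total_errors = 0
    total_words = 0
    for ref, hyp in zip(references, hypotheses):
        ref_words = ref.split()
        hyp_words = hyp.split()
        total_errors += word_edit_distance(ref_words, hyp_words)
        total_words += len(ref_words)
    return total_errors / total_words if total_words else 0.0


# Illustrative strings only, not results from this notebook.
example_refs = ["large majority of Labour", "office visits"]
example_hyps = ["lage mojirity of Labour", "office visits"]
print("sketch WER =", sketch_wer(example_refs, example_hyps))
```

With a metric of this kind, a correction step helps only if its WER falls below the `WER_OCR` baseline; a higher value means the model introduced more word errors than it fixed.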