# Introduction

We tackle the problem of OCR post processing. In OCR, we map the image form of the document into the text domain. This is done first using an CNN+LSTM+CTC model, in our case based on tesseract. Since this output maps only image to text, we need something on top to validate and correct language semantics.

The idea is to build a language model, that takes the OCRed text and corrects it based on language knowledge. The langauge model could be:
- Char level: the aim is to capture the word morphology. In which case it's like a spelling correction system.
- Word level: the aim is to capture the sentence semnatics. But such systems suffer from the OOV problem.
- Fusion: to capture semantics and morphology language rules. The output has to be at char level, to avoid the OOV. However, the input can be char, word or both.

The fusion model target is to learn:

    p(char | char_context, word_context)

In this workbook we use seq2seq vanilla Keras implementation, adapted from the lstm_seq2seq example on Eng-Fra translation task. The adaptation involves:

- Adapt to spelling correction, on char level
- Pre-train on a noisy, medical sentences
- Fine tune a residual, to correct the mistakes of tesseract 
- Limit the input and output sequence lengths
- Enusre teacher forcing auto regressive model in the decoder
- Limit the padding per batch
- Learning rate schedule
- Bi-directional LSTM Encoder
- Bi-directional GRU Encoder


# Imports

In [1]:
from __future__ import print_function
import tensorflow as tf
import keras.backend as K
from keras.backend.tensorflow_backend import set_session
from keras.models import Model
from keras.layers import Input, LSTM, Dense, Bidirectional, Concatenate, GRU
from keras import optimizers
from keras.callbacks import ModelCheckpoint, TensorBoard, LearningRateScheduler
from keras.models import load_model
import numpy as np
import os
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt
from autocorrect import spell
import re
%matplotlib inline

Using TensorFlow backend.


# Utility functions

In [2]:
# Limit gpu allocation. allow_growth, or gpu_fraction
def gpu_alloc():
    config = tf.ConfigProto()
    config.gpu_options.allow_growth = True
    set_session(tf.Session(config=config))

In [3]:
gpu_alloc()

In [4]:
def calculate_WER_sent(gt, pred):
    '''
    calculate_WER('calculating wer between two sentences', 'calculate wer between two sentences')
    '''
    gt_words = gt.lower().split(' ')
    pred_words = pred.lower().split(' ')
    d = np.zeros(((len(gt_words) + 1), (len(pred_words) + 1)), dtype=np.uint8)
    # d = d.reshape((len(gt_words)+1, len(pred_words)+1))

    # Initializing error matrix
    for i in range(len(gt_words) + 1):
        for j in range(len(pred_words) + 1):
            if i == 0:
                d[0][j] = j
            elif j == 0:
                d[i][0] = i

    # computation
    for i in range(1, len(gt_words) + 1):
        for j in range(1, len(pred_words) + 1):
            if gt_words[i - 1] == pred_words[j - 1]:
                d[i][j] = d[i - 1][j - 1]
            else:
                substitution = d[i - 1][j - 1] + 1
                insertion = d[i][j - 1] + 1
                deletion = d[i - 1][j] + 1
                d[i][j] = min(substitution, insertion, deletion)
    return d[len(gt_words)][len(pred_words)]

In [5]:
def calculate_WER(gt, pred):
    '''

    :param gt: list of sentences of the ground truth
    :param pred: list of sentences of the predictions
    both lists must have the same length
    :return: accumulated WER
    '''
#    assert len(gt) == len(pred)
    WER = 0
    nb_w = 0
    for i in range(len(gt)):
        #print(gt[i])
        #print(pred[i])
        WER += calculate_WER_sent(gt[i], pred[i])
        nb_w += len(gt[i])

    return WER / nb_w

In [6]:
def load_data_with_gt(file_name, num_samples, max_sent_len, min_sent_len, delimiter='\t', gt_index=1, prediction_index=0):
    '''Load data from txt file, with each line has: <TXT><TAB><GT>. The  target to the decoder muxt have \t as the start trigger and \n as the stop trigger.'''
    cnt = 0  
    input_texts = []
    gt_texts = []
    target_texts = []
    for row in open(file_name, encoding='utf8'):
        if cnt < num_samples :
            #print(row)
            sents = row.split(delimiter)
            input_text = sents[prediction_index]
            
            target_text = '\t' + sents[gt_index] + '\n'
            if len(input_text) > min_sent_len and len(input_text) < max_sent_len and len(target_text) > min_sent_len and len(target_text) < max_sent_len:
                cnt += 1
                
                input_texts.append(input_text)
                target_texts.append(target_text)
                gt_texts.append(sents[gt_index])
    return input_texts, target_texts, gt_texts

In [7]:
def load_data(file_name, num_samples, max_sent_len, min_sent_len):
    '''Load data from txt file, with each line has: <TXT><TAB><GT>. The  target to the decoder muxt have \t as the start trigger and \n as the stop trigger.'''
    cnt = 0  
    input_texts = []   
    
    #for row in open(file_name, encoding='utf8'):
    for row in open(file_name):
        if cnt < num_samples :            
            input_text = row           
            if len(input_text) > min_sent_len and len(input_text) < max_sent_len:
                cnt += 1                
                input_texts.append(input_text)
    return input_texts

In [8]:
def vectorize_data(input_texts, max_encoder_seq_length, num_encoder_tokens, vocab_to_int):
    
    if(len(input_texts) > max_encoder_seq_length):
        input_texts = input_texts[:max_encoder_seq_length]
    
    '''Prepares the input text and targets into the proper seq2seq numpy arrays'''
    encoder_input_data = np.zeros(
    (len(input_texts), max_encoder_seq_length),
    dtype='float32')
    
    for i, input_text in enumerate(input_texts):
        for t, char in enumerate(input_text[:max_encoder_seq_length]):
            # c0..cn
            encoder_input_data[i, t] = vocab_to_int[char]
                
    return encoder_input_data

In [9]:
def decode_sequence(input_seq, encoder_model, decoder_model, num_decoder_tokens, max_decoder_seq_length, vocab_to_int, int_to_vocab):
    
    #print(max_decoder_seq_length)
    # Encode the input as state vectors.
    encoder_outputs, h, c  = encoder_model.predict(input_seq)
    states_value = [h,c]
    # Generate empty target sequence of length 1.
    target_seq = np.zeros((1, 1))
    # Populate the first character of target sequence with the start character.
    target_seq[0, 0] = vocab_to_int['\t']

    # Sampling loop for a batch of sequences
    # (to simplify, here we assume a batch of size 1).
    stop_condition = False
    decoded_sentence = ''
    #print(input_seq)
    attention_density = []
    i = 0
    special_chars = ['\\', '/', '-', '—' , ':', '[', ']', ',', '.', '"', ';', '%', '~', '(', ')', '{', '}', '$', '#']
    #special_chars = []
    while not stop_condition:
        #print(target_seq)
        output_tokens, attention, h, c  = decoder_model.predict(
            [target_seq, encoder_outputs] + states_value)
        #print(attention.shape)
        attention_density.append(attention[0][0])# attention is max_sent_len x 1 since we have num_time_steps = 1 for the output
        # Sample a token
        #print(output_tokens.shape)
        sampled_token_index = np.argmax(output_tokens[0, -1, :])
        
        #print(sampled_token_index)
        sampled_char = int_to_vocab[sampled_token_index]
        
        orig_char = int_to_vocab[int(input_seq[:,i][0])]
        
        # Exit condition: either hit max length
        # or find stop character.
        if (sampled_char == '\n' or
           len(decoded_sentence) > max_decoder_seq_length):
            stop_condition = True
            #print('End', sampled_char, 'Len ', len(decoded_sentence), 'Max len ', max_decoder_seq_length)
            sampled_char = ''
        
        # Copy digits as it, since the spelling corrector is not good at digit corrections
        if(orig_char.isdigit() or orig_char in special_chars):
            decoded_sentence += orig_char            
        else:
            if(sampled_char.isdigit() or sampled_char in special_chars):
                decoded_sentence += ''
            else:
                decoded_sentence += sampled_char
        
        #decoded_sentence += sampled_char


        # Update the target sequence (of length 1).
        target_seq = np.zeros((1, 1))
        target_seq[0, 0] = sampled_token_index

        # Update states
        states_value = [h, c]
        
        i += 1
        if(i > 48):
            i = 0
    attention_density = np.array(attention_density)
    
    # Word level spell correct
    '''
    corrected_decoded_sentence = ''
    for w in decoded_sentence.split(' '):
        corrected_decoded_sentence += spell(w) + ' '
    decoded_sentence = corrected_decoded_sentence
    '''
    return decoded_sentence, attention_density


In [10]:
def word_spell_correct(decoded_sentence):
    corrected_decoded_sentence = ''
    special_chars = ['\\', '/', '-', '—' , ':', '[', ']', ',', '.', '"', ';', '%', '~', '(', ')', '{', '}', '$', '#']
    for w in decoded_sentence.split(' '):
        if((len(re.findall(r'\d+', w))==0) and not (w in special_chars)):
            corrected_decoded_sentence += spell(w) + ' '
        else:
            corrected_decoded_sentence += w + ' '
    return corrected_decoded_sentence

# Load data

# Load model params

In [11]:
data_path = '../../dat/'

In [12]:
max_sent_lengths = [50, 100]

In [13]:
vocab_file = {}
model_file = {}
encoder_model_file = {}
decoder_model_file = {}
model = {}
encoder_model = {}
decoder_model = {}
vocab = {}
vocab_to_int = {}
int_to_vocab = {}
max_sent_len = {}
min_sent_len = {}
num_decoder_tokens = {}
num_encoder_tokens = {}
max_encoder_seq_length = {}
max_decoder_seq_length = {}

In [14]:

for i in max_sent_lengths:
    vocab_file[i] = 'vocab-{}.npz'.format(i)
    model_file[i] = 'best_model-{}.hdf5'.format(i)
    encoder_model_file[i] = 'encoder_model-{}.hdf5'.format(i)
    decoder_model_file[i] = 'decoder_model-{}.hdf5'.format(i)
    
    vocab = np.load(file=vocab_file[i])
    vocab_to_int[i] = vocab['vocab_to_int'].item()
    int_to_vocab[i] = vocab['int_to_vocab'].item()
    max_sent_len[i] = vocab['max_sent_len']
    min_sent_len[i] = vocab['min_sent_len']
    input_characters = sorted(list(vocab_to_int))
    num_decoder_tokens[i] = num_encoder_tokens[i] = len(input_characters) #int(encoder_model.layers[0].input.shape[2])
    max_encoder_seq_length[i] = max_decoder_seq_length[i] = max_sent_len[i] - 1#max([len(txt) for txt in input_texts])
    
    model[i] = load_model(model_file[i])
    encoder_model[i] = load_model(encoder_model_file[i])
    decoder_model[i] = load_model(decoder_model_file[i])



In [15]:
num_samples = 1000000
#tess_correction_data = os.path.join(data_path, 'test_data.txt')
#input_texts = load_data(tess_correction_data, num_samples, max_sent_len, min_sent_len)

OCR_data = os.path.join(data_path, 'new_trained_data.txt')
#input_texts, target_texts, gt_texts = load_data_with_gt(OCR_data, num_samples, max_sent_len, min_sent_len, delimiter='|',gt_index=0, prediction_index=1)
input_texts, target_texts, gt_texts = load_data_with_gt(OCR_data, num_samples, max_sent_len=10000, min_sent_len=0)

In [16]:
# Sample data
print(len(input_texts))
for i in range(10):
    print(input_texts[i], '\n', target_texts[i])

1951
Me dieal Provider Roles: Treating  
 	Medical Provider Roles: Treating


Provider First Name: Christine  
 	Provider First Name: Christine


Provider Last Name: Nolen, MD  
 	Provider Last Name: Nolen, MD


Address Line 1 : 7 25 American Avenue  
 	Address Line 1 : 725 American Avenue


City. W’aukesha  
 	City: Waukesha


StatefProvinee: ‘WI  
 	State/Province: WI


Postal Code: 5 31 88  
 	Postal Code: 53188


Country". US  
 	Country:  US


Business Telephone: (2 62) 92 8- 1000  
 	Business Telephone: (262) 928- 1000


Date ot‘Pirst Visit: 1 2/01f20 17  
 	Date of First Visit: 12/01/2017




In [17]:
# Spell correct before inference
'''
input_texts_ = []
for sent in input_texts:
    sent_ = ''
    for word in sent.split(' '):
        sent_ += spell(word) + ' '
    input_texts_.append(sent_)
input_texts = input_texts_
input_texts_ = []
# Sample data
print(len(input_texts))
for i in range(10):
    print(input_texts[i], '\n', target_texts[i])
'''

"\ninput_texts_ = []\nfor sent in input_texts:\n    sent_ = ''\n    for word in sent.split(' '):\n        sent_ += spell(word) + ' '\n    input_texts_.append(sent_)\ninput_texts = input_texts_\ninput_texts_ = []\n# Sample data\nprint(len(input_texts))\nfor i in range(10):\n    print(input_texts[i], '\n', target_texts[i])\n"

In [18]:
decoded_sentences = []
corrected_sentences = []

#for seq_index in range(len(input_texts)):
results = open('RESULTS.md', 'w')
results.write('|OCR sentence|GT sentence|Char decoded sentence|Word decoded sentence|Sentence length (chars)|\n')
results.write('---------------|-----------|----------------|----------------|----------------|\n')
     

for i, input_text in enumerate(input_texts):
    #print(input_text)
    # Find the input length range to choose the proper model to use
    len_range = max_sent_lengths[-1] # Take the longest range
    for length in max_sent_lengths:
        if(len(input_text) < length):
            len_range = length
            break
    #print(len_range)
    encoder_input_data = vectorize_data(input_texts=[input_text], max_encoder_seq_length=max_encoder_seq_length[len_range], num_encoder_tokens=num_encoder_tokens[len_range], vocab_to_int=vocab_to_int[len_range])
    
    

    target_text = gt_texts[i]
    
    input_seq = encoder_input_data
    #print(input_seq.shape)
    #print(max_decoder_seq_length[len_range])
    #print(max_decoder_seq_length)
    decoded_sentence,_  = decode_sequence(input_seq, encoder_model[len_range], decoder_model[len_range], num_decoder_tokens[len_range],  max_decoder_seq_length[len_range], vocab_to_int[len_range], int_to_vocab[len_range])
    corrected_sentence = word_spell_correct(decoded_sentence)
    print('-Lenght = ', len_range)
    print('Input sentence:', input_text)
    print('GT sentence:', target_text.strip())
    print('Char Decoded sentence:', decoded_sentence)   
    print('Word Decoded sentence:', corrected_sentence) 
    results.write(' | ' + input_text + ' | ' + target_text.strip() + ' | ' + decoded_sentence + ' | ' + corrected_sentence + ' | ' + str(len_range) + ' | \n')
    decoded_sentences.append(decoded_sentence)
    corrected_sentences.append(corrected_sentence)
results.close()    

    

-Lenght =  50
Input sentence: Me dieal Provider Roles: Treating 
GT sentence: Medical Provider Roles: Treating
Char Decoded sentence: Medical Provider Roles:Treating 
Word Decoded sentence: Medical Provider Roles:Treating a 
-Lenght =  50
Input sentence: Provider First Name: Christine 
GT sentence: Provider First Name: Christine
Char Decoded sentence: Provider First Name: Christine 
Word Decoded sentence: Provider First Name Christine a 
-Lenght =  50
Input sentence: Provider Last Name: Nolen, MD 
GT sentence: Provider Last Name: Nolen, MD
Char Decoded sentence: Provider Last Name: Nolen, MD 
Word Decoded sentence: Provider Last Name Dolens MD a 
-Lenght =  50
Input sentence: Address Line 1 : 7 25 American Avenue 
GT sentence: Address Line 1 : 725 American Avenue
Char Decoded sentence: Address Line 1 : 725 Americal Avenuer
Word Decoded sentence: Address Line 1 : 725 America Avenue 
-Lenght =  50
Input sentence: City. W’aukesha 
GT sentence: City: Waukesha
Char Decoded sentence: City.St

-Lenght =  100
Input sentence: TT INSURANCIEINFORMATICN PRIMARY: UMR SECONDARY: '
GT sentence: INSURANCE INFORMATION PRIMARY: UMR SECONDARY:
Char Decoded sentence: INSURANCEN ORN ORMATION PRIMARY:UMMARY UMCOND:RY
Word Decoded sentence: INSURANCE ON FORMATION PRIMARY:UMMARY UMCOND:RY 
-Lenght =  50
Input sentence: STATEMENT DATE 1121/2018
GT sentence: STATEMENT DATE 1/21/2018
Char Decoded sentence: STATEMENT DATE 1121/2
Word Decoded sentence: STATEMENT DATE 1121/2 
-Lenght =  50
Input sentence: PATIENT NAME 
GT sentence: PATIENT NAME
Char Decoded sentence: PATIENT NAME
Word Decoded sentence: PATIENT NAME 
-Lenght =  100
Input sentence: pAvMENTsm'E—I: STATEMENT DATE WILL NOT APPEAR ON THIS STATEMENT 
GT sentence: PAYMENTS RECEIVED AFTER STATE DATE WILL NOT APPEAR ON THIS STATEMENT
Char Decoded sentence: PAYMENTE IN— :TATEMENT DATE WILL NOT APPEAR ON THIS STATEMEN— :ENT 
Word Decoded sentence: PAYMENT IN STATEMENT DATE WILL NOT APPEAR ON THIS STATEMENT WENT a 
-Lenght =  50
Input sentence

-Lenght =  100
Input sentence: To pay your bill on line with a credit card, log on to www.ebixinc.comlpayonline.html. 
GT sentence: To pay your bill on line with a credit card, log on to www.ebixinc.com/payonline.html.
Char Decoded sentence: To pay your bill on line with a credity car, log on to wwwentixincehinccompatoly 
Word Decoded sentence: To pay your bill on line with a credit care log on to wwwentixincehinccompatoly a 
-Lenght =  50
Input sentence: ACCOUNT# EMA297232
GT sentence: ACCOUNT# EMA297232
Char Decoded sentence: ACCOUNT# EMA297232
Word Decoded sentence: ACCOUNT EMA297232 
-Lenght =  100
Input sentence: PLACE OF SERVICE 11 Office 21 Inpatient 22 Outpatient Hospital 23 Emergency Room-Hospital
GT sentence: PLACE OF SERVICE 11 Office 21 Inpatient 22 Outpatient Hospital 23 Emergency Room-Hospital
Char Decoded sentence: PLACE OF ORVERVIC11 Offic21 Oulpatie22 Hospital  Emergency R11m Hospit21  Emergen22
Word Decoded sentence: PLACE OF ORVERVIC11 Offic21 Oulpatie22 Hospital a E

-Lenght =  100
Input sentence: To disclose information, whether from before, during or after the date of this authorization, about my health, including HIV, AIDS or other disorders of the Immune system, use of drugs or alcohol, mental or phy5ica| histor , condition, advice or treatment (except this authorization does not authorize release of psychotherapy notesi, prescription drug history, earnings, financial or credit history, professional licenses, employment history, insurance claims and benefits, and all other claims and benefits, including Social Security claims and benefits (“My Information"); 
GT sentence: To disclose information, whether from before, during or after the date of this authorization, about my health, including HIV, AIDS or other disorders of the immune system, use of drugs or alcohol, mental or physical history, condition, advice or treatment (except this authorization does not authorize release of psychotherapy notes, prescription drug history, earnings, financia

-Lenght =  50
Input sentence: Dependent Information 
GT sentence: Dependent Information
Char Decoded sentence: Dependent Information
Word Decoded sentence: Dependent Information 
-Lenght =  50
Input sentence: First Name: 
GT sentence: First Name:
Char Decoded sentence: First Name: 
Word Decoded sentence: First Name a 
-Lenght =  50
Input sentence: Middle Nameﬂnitial: 
GT sentence: Middle Name/Initial:
Char Decoded sentence: Middle NameInitia: 
Word Decoded sentence: Middle NameInitia: a 
-Lenght =  50
Input sentence: Last Name: 
GT sentence: Last Name:
Char Decoded sentence: Last Name: 
Word Decoded sentence: Last Name a 
-Lenght =  50
Input sentence: Social Security Number: 
GT sentence: Social Security Number:
Char Decoded sentence: Social Security Number: 
Word Decoded sentence: Social Security Number a 
-Lenght =  50
Input sentence: Birth Date: 
GT sentence: Birth Date:
Char Decoded sentence: Birth Date: 
Word Decoded sentence: Birth Date a 
-Lenght =  50
Input sentence: Gender: 
G

-Lenght =  100
Input sentence: Your actual coverage and amounts are subject to all the terms, conditions, limitations and exclusions stated in your Certificate of Coverage and Policy. Please refer to your Certificate of Coverage for a detailed description of your coverage. If the information in this letter differs from the Policy held by your Employer or Plan Administrator, the terms of the Policy will govern. Please contact your Plan Administrator with any questions or call our Contact Center at 1-800-635-5597 and we can assist you. 
GT sentence: Your actual coverage and amounts are subject to all the terms, conditions, limitations and exclusions stated in your Certificate of Coverage and Policy. Please refer to your Certificate of Coverage for a detailed description of your coverage. If the information in this letter differs from the Policy held by your Employer or Plan Administrator, the terms of the Policy will govern. Please contact your Plan Administrator with any questions or ca

-Lenght =  100
Input sentence: 1. Hydrocodone-Acetaminophen 5-325 MG Oral Tablet: Take 1-2 tablets every 6 hours as needed for pain; 
GT sentence: 1. Hydrocodone-Acetaminophen 5-325 MG Oral Tablet; Take 1-2 tablets every 6 hours as needed for pain;
Char Decoded sentence: 1. Hydrocodoci-o Accident  5-325 Oralet Selvi1.s Chable Se-able to hours 5-325eded for paidle1.
Word Decoded sentence: 1. Hydrocodoci-o Accident a 5-325 Oralee Selvi1.s Cable secable to hours 5-325eded for paidle1. 
-Lenght =  100
Input sentence: Therapy: 09Feb2018 to (Last Rx:09Feb2018) Ordered 
GT sentence: Therapy: 09Feb2018 to (Last Rx:09Feb2018) Ordered
Char Decoded sentence: Therapy: 09Feb2018 to (Last Rx:09Feb2018) Ordered
Word Decoded sentence: Therapy 09Feb2018 to Last Rx:09Feb2018) Ordered 
-Lenght =  100
Input sentence: 2. HydrOXYzine Pamoate 25 MG Oral Capsule; Take 1—2 capsules every 6 hours as needed; 
GT sentence: 2. HydrOXYzine Pamoate 25 MG Oral Capsule; Take 1-2 capsules every 6 hours as needed;
Char 

-Lenght =  100
Input sentence: 011241201 8 10:24:20 AM:Ordered; For:New tear of anterior cruciate ligament of right knee; Ordered ByzHolm, Jason; 
GT sentence: 10:24:20 AM;Ordered; For:New tear of anterior cruciate ligament of right knee; Ordered By:Holm, Jason;
Char Decoded sentence: 0112412018S10:24:20r :or Ne;r tr:at of anteri011241201t8 10:24:20 of:right k;ee :rdered Order0112
Word Decoded sentence: 0112412018S10:24:20r for near treat of anteri011241201t8 10:24:20 fright knee ordered Order0112 
-Lenght =  50
Input sentence: DiscussionlSumrnary 
GT sentence: Discussion/Sumrnary
Char Decoded sentence: DiscussionSumrnary 
Word Decoded sentence: DiscussionSumrnary a 
-Lenght =  100
Input sentence: I had a long discussion with today going over the MRI and discussing her ACL rupture and high grade MCL tear. I used a model to discuss the role of the ACL and MCL and we reviewed operative versus nonoperative management. At this point. based on her current symptoms and activity goals I feel 

-Lenght =  50
Input sentence: ENT: no ears, nose or throat symptoms. 
GT sentence: ENT: no ears, nose or throat symptoms.
Char Decoded sentence: ENT: no eare, nose throat symptoms
Word Decoded sentence: ENTR no eared nose throat symptoms 
-Lenght =  50
Input sentence: Endocrine: no endocrine symptoms. 
GT sentence: Endocrine: no endocrine symptoms.
Char Decoded sentence: Endocrine: no endocrine symptoms.
Word Decoded sentence: Endocrines no endocrine symptoms 
-Lenght =  50
Input sentence: Eyes: glasseslcontact. 
GT sentence: Eyes: glasses/contact.
Char Decoded sentence: Eyes: glassescontact.
Word Decoded sentence: Eyes glassescontact. 
-Lenght =  50
Input sentence: Gastrointestinal: no gastrointestinal symptoms. 
GT sentence: Gastrointestinal: no gastrointestinal symptoms.
Char Decoded sentence: Gast rintestinal: no gastivestional symptoms
Word Decoded sentence: Gast intestinal no gastivestional symptoms 
-Lenght =  50
Input sentence: Genitourinary: no genitourinary symptoms. 
GT sent

-Lenght =  50
Input sentence: Eyes: glassesicontact. 
GT sentence: Eyes: glasses/contact.
Char Decoded sentence: Eyes: glassessiconta.t
Word Decoded sentence: Eyes glassessiconta.t 
-Lenght =  50
Input sentence: Gastrointestinal: no gastrointestinal symptoms. 
GT sentence: Gastrointestinal: no gastrointestinal symptoms.
Char Decoded sentence: Gast rintestinal: no gastivestional symptoms
Word Decoded sentence: Gast intestinal no gastivestional symptoms 
-Lenght =  50
Input sentence: Genitourinary: no genitourinary symptoms. 
GT sentence: Genitourinary: no genitourinary symptoms.
Char Decoded sentence: Genitourinary: no genitourinary symptoms.
Word Decoded sentence: Genitourinary no genitourinary symptoms 
-Lenght =  50
Input sentence: HematologiciLymphatic no hematologic symptoms. 
GT sentence: Hematologic/Lymphatic no hematologic symptoms.
Char Decoded sentence: HematologicLymphatic no hematologic symptoms.
Word Decoded sentence: HematologicLymphatic no hematologic symptoms 
-Lenght = 

-Lenght =  100
Input sentence: PROCEDURE: The patient was identiﬁed in the preoperative holding area and the operative site was marked. The consent was again reviewed and all questions were answered. She was taken to the operating room, placed under general anesthesia, and positioned supine on the operating table with the right lower extremity prepped and draped in the usual sterile fashion. Preoperative antibiotics were administered. A time-out was called to ensure the correct operative site and procedure and the tourniquet was inﬂated to 300 mmHg. The arthroscope was introduced along with a working portal. A diagnostic arthroscopy showed no chondral wear to the patella or trochlea. The medial femoral condyle and medial tibial plateau were without chondral Wear. There was a large positive drive-through sign of the medial joint. The mediai meniscus showed a superior peripheral tear that did not extend to the inferior articular surface. This was probed and felt to be overall stable. Lat

-Lenght =  50
Input sentence: Height: 5 ft 8 in 
GT sentence: Height: 5 ft 8 in
Char Decoded sentence: Height: 5 ft 8 in 
Word Decoded sentence: Height 5 ft 8 in a 
-Lenght =  50
Input sentence: Weight: 270 lb 
GT sentence: Weight: 270 lb
Char Decoded sentence: Weight: 270 lb
Word Decoded sentence: Weight 270 lb 
-Lenght =  50
Input sentence: BMl Calculated: 41.05 
GT sentence: BMI Calculated: 41.05
Char Decoded sentence: BMICl Clated :41.05
Word Decoded sentence: Bill Elated :41.05 
-Lenght =  50
Input sentence: BSA Calculated: 2.33 
GT sentence: BSA Calculated: 2.33
Char Decoded sentence: BSA Calculated: 2.33 
Word Decoded sentence: BSA Calculated 2.33 a 
-Lenght =  50
Input sentence: Physical Exam 
GT sentence: Physical Exam
Char Decoded sentence: Physical Exam
Word Decoded sentence: Physical Exam 
-Lenght =  100
Input sentence: Constitutional- Patient is healthy, well nourished and appears stated age. 
GT sentence: Constitutional- Patient is healthy, well nourished and appears stat

-Lenght =  100
Input sentence: Continue with physicai therapy. Follow up in 4 weeks. 
GT sentence: Continue with physical therapy. Follow up in 4 weeks.
Char Decoded sentence: Continue with physical therapy. Follow up in 4 weeks
Word Decoded sentence: Continue with physical therapy Follow up in 4 weeks 
-Lenght =  50
Input sentence: DiscussionlSummary 
GT sentence: Discussion/Summary
Char Decoded sentence: DiscussionSummary 
Word Decoded sentence: DiscussionSummary a 
-Lenght =  100
Input sentence: I believe that: a is doing quite Well at this time following a right knee ACL repair with right knee MCL repair. We did review precautions and once again reviewed her protocol which would include an additional week of toe touch weight bearing with the brace locked at zero. At three weeks post op, she may progress to weight bearing as tolerated with the brace locked at zero. I would like to see her back in 4 weeks for reassessment. We did review some home exercises to work on ROM. Based on qu

-Lenght =  50
Input sentence: Signatures 
GT sentence: Signatures
Char Decoded sentence: Signatures
Word Decoded sentence: Signatures 
-Lenght =  50
Input sentence: Electronically signed by : David Felvor, PA~C:
GT sentence: Electronically signed by : David Feivor, PA~C:
Char Decoded sentence: Electronically signed by:Davide Felvic,PAC~
Word Decoded sentence: Electronically signed by:Davide Felvic,PAC~ 
-Lenght =  100
Input sentence: Provider Statement: I personally performed thls service and the scribe documentation accurately reﬂects this service. 
GT sentence: Provider Statement: I personally performed this service and the scribe documentation accurately reflects this service.
Char Decoded sentence: Provider Statement: I press service and the scrive and the scrive a:d the documentation accurately r
Word Decoded sentence: Provider Statement I press service and the scrive and the scrive and the documentation accurately r 
-Lenght =  50
Input sentence: oTé'iLNog'aiiiés 
GT sentence: TW

-Lenght =  50
Input sentence: a Family history of other condition (284.89) 
GT sentence: • Family history of other condition (Z84.89)
Char Decoded sentence: • Family history of other condition (284.89)
Word Decoded sentence: a Family history of other condition (284.89) 
-Lenght =  50
Input sentence: a Family history of other condition (284.89) 
GT sentence: • Family history of other condition (Z84.89)
Char Decoded sentence: • Family history of other condition (284.89)
Word Decoded sentence: a Family history of other condition (284.89) 
-Lenght =  50
Input sentence: Social History 
GT sentence: Social History
Char Decoded sentence: Social History
Word Decoded sentence: Social History 
-Lenght =  50
Input sentence: 0 Age reporting 
GT sentence: • Age reporting
Char Decoded sentence: 0 Age reporting Corting Sporite
Word Decoded sentence: 0 Age reporting Carting Sprite 
-Lenght =  50
Input sentence: . Consumes alcohol (278.9) 
GT sentence: • Consumes alcohol (278.9)
Char Decoded sentence: 

-Lenght =  50
Input sentence: nonTTP along the lateral joint line 
GT sentence: nonTTP along the lateral joint line
Char Decoded sentence: nonTTP along the lateral joint line 
Word Decoded sentence: nonTTP along the lateral joint line a 
-Lenght =  50
Input sentence: no parapatellar tenderness 
GT sentence: no parapatellar tenderness
Char Decoded sentence: no parapatellar tenderness 
Word Decoded sentence: no prepatellar tenderness a 
-Lenght =  100
Input sentence: 1 quadrants medial, 1 quadrants lateral patellar translation without apprehension. 
GT sentence: 1 quadrants medial, 1 quadrants lateral patellar translation without apprehension.
Char Decoded sentence: 1 quadrants medial, 1 quadrants lateral patellar 1ranslation withou, 1pprehnesion
Word Decoded sentence: 1 quadrants medial 1 quadrants lateral patellar 1ranslation without 1pprehnesion 
-Lenght =  50
Input sentence: Full extension 
GT sentence: Full extension
Char Decoded sentence: Full extension
Word Decoded sentence: Full 

-Lenght =  50
Input sentence: Physician authorization - mail 
GT sentence: Physician authorization - mail
Char Decoded sentence: Physician authorization - mail
Word Decoded sentence: Physician authorization - mail 
-Lenght =  50
Input sentence: Home Email — 
GT sentence: Home Email -
Char Decoded sentence: Home Email —
Word Decoded sentence: Home Email — 
-Lenght =  50
Input sentence: Register for Claim Self Service — no 
GT sentence: Register for Claim Self Service - no
Char Decoded sentence: Register for Claim Self Service — no
Word Decoded sentence: Register for Claim Self Service — no 
-Lenght =  50
Input sentence: Health insurance through employer - yes 
GT sentence: Health insurance through employer - yes
Char Decoded sentence: Health insurance thart the employe-  yes
Word Decoded sentence: Health insurance that the employed a yes 
-Lenght =  50
Input sentence: Health insurance provider — bcbs 
GT sentence: Health insurance provider - bcbs
Char Decoded sentence: Health insurance 

-Lenght =  100
Input sentence: Note: Final cost may vary slightly due to rounding differences. 
GT sentence: Note: Final cost may vary slightly due to rounding differences.
Char Decoded sentence: Note: Final cost may vary slightly due to rounding di:ferences
Word Decoded sentence: Note Final cost may vary slightly due to rounding differences 
-Lenght =  100
Input sentence: This information is in abbreviated form only. It is provided to give you a general understanding of your On 3L Off-Job Accident coverage. 
GT sentence: This information is in abbreviated form only. It is provided to give you a general understanding of your On & Off-Job Accident coverage.
Char Decoded sentence: This information to gived form only in provi.ed to giver to gived to gist manderal understand.ng off
Word Decoded sentence: This information to gived form only in provided to giver to gived to gist mineral understanding off 
-Lenght =  100
Input sentence: Your actual coverage and amounts are subject to all the 

-Lenght =  50
Input sentence: Work and live same state — yes 
GT sentence: Work and live same state - yes
Char Decoded sentence: Work and live same state — yes
Word Decoded sentence: Work and live same state — yes 
-Lenght =  50
Input sentence: Work from home — no 
GT sentence: Work from home - no
Char Decoded sentence: Work from home — no
Word Decoded sentence: Work from home — no 
-Lenght =  50
Input sentence: Does schedule vary? - yes 
GT sentence: Does schedule vary? - yes
Char Decoded sentence: Does schedule vary? - yes
Word Decoded sentence: Does schedule vary - yes 
-Lenght =  50
Input sentence: How does it vary — hours and days vary 
GT sentence: How does it vary - hours and days vary
Char Decoded sentence: How does it vary — hours and days vary 
Word Decoded sentence: How does it vary — hours and days vary a 
-Lenght =  100
Input sentence: Scheduled work days — ["mon","tue","wed“rll thull I"fri"] 
GT sentence: Scheduled work days — ["mon","tue","wed“,"thu","fri"]
Char Decoded 

-Lenght =  50
Input sentence: Created Date: 
GT sentence: Created Date:
Char Decoded sentence: Created Date: 
Word Decoded sentence: Created Date a 
-Lenght =  50
Input sentence: Create Site: Chattanooga 
GT sentence: Create Site: Chattanooga
Char Decoded sentence: Create Site: Chattanooga
Word Decoded sentence: Create Site Chattanooga 
-Lenght =  50
Input sentence: Completed By: Hughes, Brittany 
GT sentence: Completed By: Hughes, Brittany
Char Decoded sentence: Completed By: Hughes, Brittany
Word Decoded sentence: Completed By Hughes Brittany 
-Lenght =  50
Input sentence: Completed Date:  
GT sentence: Completed Date:
Char Decoded sentence: Completed Date: 
Word Decoded sentence: Completed Date a 
-Lenght =  50
Input sentence: Complete Site: Chattanooga 
GT sentence: Complete Site: Chattanooga
Char Decoded sentence: Complete Site: Chattanooga
Word Decoded sentence: Complete Site Chattanooga 
-Lenght =  50
Input sentence: unum°
GT sentence: unum
Char Decoded sentence: unum
Word Decod

-Lenght =  50
Input sentence: Time of Accident
GT sentence: Time of Accident a.m. p.m
Char Decoded sentence: Time of Accident
Word Decoded sentence: Time of Accident 
-Lenght =  100
Input sentence: 0' YOU need more space. please attach a separate sheet of paper).
GT sentence: Please explain how your accident happened. (If you need more space, please attach a separate sheet of paper).
Char Decoded sentence: 0U YOU need more space. please atach a semparate 0heet of paper
Word Decoded sentence: 0U YOU need more space please attach a separate 0heet of paper 
-Lenght =  100
Input sentence: I got up quickly from a seated position. planted my left foot on the ground and turned. My leg twisted and i heard and felt a very large snap in my
GT sentence: I got up quickly from a seated position, planted my left foot on the ground and turned. My leg twisted and I heard and felt a very large snap in my left knee and fell back onto the sofa.
Char Decoded sentence: Is to p torm troo anticily my left fo

-Lenght =  50
Input sentence: lfyee. as of what date”? (mrna'ddiyy)
GT sentence: If yes, as of what date”? (mm/dd/yy)
Char Decoded sentence: fffye. as of what date”? (mmddyy
Word Decoded sentence: fffye. as of what date (mmddyy 
-Lenght =  100
Input sentence: If this claim is related to normal pregnancy, please provide the following:
GT sentence: If this claim is related to normal pregnancy, please provide the following:
Char Decoded sentence: If this claim is related to normal pregnancy, please provide the following
Word Decoded sentence: If this claim is related to normal pregnancy please provide the following 
-Lenght =  50
Input sentence: ExPeoled Dei‘ ery _ate (mmlddiyy)
GT sentence: Expected Delivery Date (mm/dd/yy)
Char Decoded sentence: ExPloled Dedite fry Cat(lly
Word Decoded sentence: explored Dedie fry fatally 
-Lenght =  50
Input sentence: Actual Delivgry Date (
GT sentence: Actual Delivery Date (mm/dd/yy)
Char Decoded sentence: Actual Delivery Date (mmddyy
Word Decoded sen

-Lenght =  50
Input sentence: enonsu0W00001500: INSURANCE 3 
GT sentence: CURRENT INSURANCE
Char Decoded sentence: Clensu0e00001500:RANCE RANCE3
Word Decoded sentence: Clensu0e00001500:RANCE RANCE3 
-Lenght =  50
Input sentence: AUTHl: 
GT sentence: AUTH#
Char Decoded sentence: AUTH:of
Word Decoded sentence: Author 
-Lenght =  50
Input sentence: Piedmont Healihcn r1.- 
GT sentence: Piedmont Healthcare
Char Decoded sentence: Piedmont Healthcare1.
Word Decoded sentence: Piedmont Healthcare1. 
-Lenght =  50
Input sentence: Piscamway, NJ  
GT sentence: Piscataway, NJ
Char Decoded sentence: Piscamary, No
Word Decoded sentence: Piscamary, No 
-Lenght =  50
Input sentence: Electronic. Survicr. Reqester 
GT sentence: Electronic Service Requested
Char Decoded sentence: Electronic. Survice.Requester
Word Decoded sentence: Electronic Survice.Requester 
-Lenght =  50
Input sentence: MyHealthBéO" 5-080“ 
GT sentence: MyHealth360° PHC & Me
Char Decoded sentence: MyHealth" 5-08
Word Decoded sentence:

-Lenght =  50
Input sentence: ORTHDATLA A. Lu: 
GT sentence: ORTHOATLANTA, L.L.C.
Char Decoded sentence: ORTHOATLANT. L:LC
Word Decoded sentence: ORTHOATLANT. LLC 
-Lenght =  50
Input sentence: billm -hone: 
GT sentence: billing phone:
Char Decoded sentence: bille - Pho:e 
Word Decoded sentence: Bille - phone a 
-Lenght =  50
Input sentence: do artment of service: 
GT sentence: department of service:
Char Decoded sentence: deartment of service:
Word Decoded sentence: department of service 
-Lenght =  50
Input sentence: FA ETI'EVILLE 
GT sentence: FAYETTEVILLE printed
Char Decoded sentence: FALETT VILLE Date
Word Decoded sentence: FALETTI VILLE Date 
-Lenght =  50
Input sentence: dept phone: 
GT sentence: deparment phone:
Char Decoded sentence: dept phone: 
Word Decoded sentence: dept phone a 
-Lenght =  50
Input sentence: GUARANT'UH NAME AND. ADDRESS 
GT sentence: GUARANTOR NAME AND ADDRESS
Char Decoded sentence: GUARANTOR NAME AND .DDRESS 
Word Decoded sentence: GUARANTOR NAME AND ADD

-Lenght =  50
Input sentence: (Name / Relationship) (Telephone Number) 
GT sentence: (Name / Relationship) (Telephone Number)
Char Decoded sentence: (Name / Relationship) (Telephone Number)
Word Decoded sentence: Name / Relationship Telephone Number 
-Lenght =  100
Input sentence: I au rizEI Unum to leave messages about my claim on my voicemail I answering machine. 
GT sentence: I authorize Unum to leave messages about my claim on my voicemail / answering machine.
Char Decoded sentence: I auth Unum to leave messages about my claim on my voicemail  answering machine machine
Word Decoded sentence: I auth Unum to leave messages about my claim on my voicemail a answering machine machine 
-Lenght =  50
Input sentence: as No 
GT sentence: Yes No
Char Decoded sentence: Ys No 
Word Decoded sentence: Ys No a 
-Lenght =  100
Input sentence: i understand that information about my claim may include information about my health and that such information about my health me be related to any disorder 

-Lenght =  100
Input sentence: Note: Final cost may vary slightly due to rounding differences. 
GT sentence: Note: Final cost may vary slightly due to rounding differences.
Char Decoded sentence: Note: Final cost may vary slightly due to rounding di:ferences
Word Decoded sentence: Note Final cost may vary slightly due to rounding differences 
-Lenght =  100
Input sentence: This information is in abbreviated form only. It is provided to give you a general understanding of your Off-Job Accident coverage.
GT sentence: This information is in abbreviated form only. It is provided to give you a general understanding of your Off-Job Accident coverage.
Char Decoded sentence: This information to gived form only in provi.ed to giver to gived to gist manderal understand.ng off
Word Decoded sentence: This information to gived form only in provided to giver to gived to gist mineral understanding off 
-Lenght =  100
Input sentence: Your actual coverage and amounts are subject to all the terms, condi

-Lenght =  100
Input sentence: ' The medial collateral ligament is intact. The distal insertions of the biceps femoris and the iliotibiai hand are unremarkable. The ﬁbular collateral ligament is‘intact. The popliteus muscle and tendon are unremarkable. 
GT sentence: The medial collateral ligament is intact. The distal insertions of the biceps femoris and the iliotibial band are unremarkable. The fibular collateral ligament is intact. The popliteus muscle and tendon are unremarkable.
Char Decoded sentence: The ablent is intament is intact insertio.s of the bisatent is the bisates fem ristal and .he ligame
Word Decoded sentence: The absent is ointment is intact insertions of the bisatent is the bites fem distal and the ligate 
-Lenght =  50
Input sentence: 3.1 cm x 0.9 cm Baker's Cyst. 
GT sentence: 3.1 cm x 0.9 cm Baker's cyst.
Char Decoded sentence: 3.1 cm x 0.9 cm Baker's cyst.
Word Decoded sentence: 3.1 cm x 0.9 cm bakers cyst 
-Lenght =  100
Input sentence: The distal insertions of 

-Lenght =  100
Input sentence: TIER 1 Individual Deductible  0110112018 - 1213112018 
GT sentence: TIER 1 Individual Deductible 01/01/2018 - 12/31/2018
Char Decoded sentence: IndEv1dulal Deductible Deducti0110112018-121311^L1
Word Decoded sentence: IndEv1dulal Deductible Deducti0110112018-121311^L1 
-Lenght =  100
Input sentence: TIER 1 Individual MODF‘ Max  0110112018 - 1213112018 
GT sentence: TIER 1 Individual MOOP Max 01/01/2018 - 12/31/2018
Char Decoded sentence: IndEv1r  dividual MOD Fax M0110112018l-1213112  1  m
Word Decoded sentence: IndEv1r a dividual MOD Fax M0110112018l-1213112 a 1 a m 
-Lenght =  100
Input sentence: TIER 2 Family Deduclibie  0110112010 — 1213112010 
GT sentence: TIER 2 Family Deductible 01/01/2018 - 12/31/2018
Char Decoded sentence: TIER 2 Family Deductible D0110112010 —1213112010
Word Decoded sentence: TIER 2 Family Deductible D0110112010 —1213112010 
-Lenght =  50
Input sentence: TIER 2 Family MOOP Max  0110112018 - 1213112013 
GT sentence: TIER 2 Family

-Lenght =  100
Input sentence: So that Unum ma evaluate and administer my claims, includin providin assistance with return to work. For such eva uation and administration of claims, this authorize ion is valid or two years, or the duration of my claim for beneﬁts (to include any subsi/cltuent ﬁnanCial management and/or beneﬁt recovery review), whic ever is shorter. i understand the once y Information is disc csed to Unum, any privacy rotections established by/IHIPAA may not apply to the information. but other privacy laws continue to apply. num may y then disclose Informaticn only as permitted by law, including. state fraud reporting laws or as authorized by me. 
GT sentence: So that Unum may evaluate and administer my claims, including providing assistance with return to work. For such evaluation and administration of claims, this authorization is valid or two years, or the duration of my claim for benefits (to include any subsequent financial management and/or benefit recovery review

-Lenght =  50
Input sentence: F. Additional Medical Information Required 
GT sentence: F. Additional Medical Information Required
Char Decoded sentence: F. Additional Medical Information Required 
Word Decoded sentence: Ff Additional Medical Information Required a 
-Lenght =  100
Input sentence: Plom attach llol'l'llznd canto! 91‘ any hill! related to this accident including doctor. emergency room. hospital. and motor vehicle incidentrsccldent report. 
GT sentence: Please attach itemized copies any bills related to this accident including doctor, emergency room, hospital, and motor vehicle incident/accident report.
Char Decoded sentence: Plomatically llatom and cantoma91d to this and chill or this and chictung doctor91emergency and octi
Word Decoded sentence: Plomatically atom and cantoma91d to this and chill or this and chictung doctor91emergency and OCI 
-Lenght =  100
Input sentence: Bills should Include cleanest: Information (from your medical provider). Additional medical Informat

-Lenght =  50
Input sentence: Proxider First Name: Jason 
GT sentence: Provider First Name: Jason
Char Decoded sentence: Provider First Name: Jason
Word Decoded sentence: Provider First Name Jason 
-Lenght =  50
Input sentence: Prowider Last Name: Holm 
GT sentence: Provider Last Name: Holm
Char Decoded sentence: Provider Last Name: Holm 
Word Decoded sentence: Provider Last Name Holm a 
-Lenght =  50
Input sentence: Address Line 1: 1000 “7 140th St #301 
GT sentence: Address Line 1: 1000 W 140th St #201
Char Decoded sentence: Address Line 1: 10007 140 ?
Word Decoded sentence: Address Line 1: 10007 140 a 
-Lenght =  50
Input sentence: City. BmlsViJle 
GT sentence: City: Burnsville
Char Decoded sentence: City. Bals Vile 
Word Decoded sentence: City Bals Vile a 
-Lenght =  50
Input sentence: State/Province: MN 
GT sentence: State/Province: MN
Char Decoded sentence: State/Province: MN 
Word Decoded sentence: State/Province: MN a 
-Lenght =  50
Input sentence: Postal Code: 5533 
GT sentenc

-Lenght =  50
Input sentence: Total Employee Bi—Weekly Payroll Deduction: 
GT sentence: Total Employee Bi-Weekly Payroll Deduction:
Char Decoded sentence: Total Employee Bi—Weekly Payroll Deduction:
Word Decoded sentence: Total Employee Biweekly Payroll Deduction 
-Lenght =  100
Input sentence: Note: ﬁnal cost may vary slightly due to rounding differenca. 
GT sentence: Note: Final cost may vary slightly due to rounding differences.
Char Decoded sentence: Note: Final cost may vary slightly due to rounding di:ferencea
Word Decoded sentence: Note Final cost may vary slightly due to rounding difference 
-Lenght =  100
Input sentence: This information is in abbreviated form only. It is provided to give you a general understanding of your Off—Job Accident coverage. 
GT sentence: This information is in abbreviated form only. It is provided to give you a general understanding of your Off-Job Accident coverage.
Char Decoded sentence: This information to gived form only in provi.ed to giver to g

-Lenght =  100
Input sentence: If relatedW to a fracture or dislocation, please Indicate: 
GT sentence: If related to a fracture or dislocation, please Indicate:
Char Decoded sentence: If related to a fracture or dislocation,please Indicate
Word Decoded sentence: If related to a fracture or dislocation,please Indicate 
-Lenght =  100
Input sentence: III Closed El Open El Unknown Name of bone fractured or dislocated: 
GT sentence: Closed Open Unknown Name of bone fractured or dislocated:
Char Decoded sentence: Information Elect Unknown Name of bone fractured or dislocated 
Word Decoded sentence: Information Elect Unknown Name of bone fractured or dislocated a 
-Lenght =  100
Input sentence: If related to a laceration, please indicate the length: 
GT sentence: If related to a laceration, please indicate the length:
Char Decoded sentence: If related to a laceration, please indicate the length
Word Decoded sentence: If related to a laceration please indicate the length 
-Lenght =  100
Inpu

-Lenght =  100
Input sentence: Is the patient permanently disabled? III Yes El No If yes. what Is the recommended frequency of treatment? 
GT sentence: Is the patient permanently disabled? Yes No If yes, what is the recommended frequency of treatment?
Char Decoded sentence: Is the patient permaned? Ins beled? Ind the recomment Yes No If yes what is the requency of trenct
Word Decoded sentence: Is the patient permanent Ins bleed Ind the recommend Yes No If yes what is the frequency of trench 
-Lenght =  100
Input sentence: Does the patient have permanent restrictions and limitations? El Yes El No If yes, please list the permanent restrictions and limitations. 
GT sentence: Does the patient have permanent restrictions and limitations? Yes No If yes, please list the permanent restrictions and limitations.
Char Decoded sentence: Dees the patient have pregnance and limitations? and limitations? Yes please lase lightal the the th
Word Decoded sentence: Dees the patient have pregnance and lim

-Lenght =  100
Input sentence: To disclose information, whether from before, during or after the date of this authorization, about my health, including HIV, AIDS or other disorders of the immune system, use of drugs or alcohol, mental or physical histor , condition, advice or treatment (except this authorization does not authorize release of psychotherapy notesil, prescription drug history, earnings, financial or credit history, professional licenses, employment history, insurance claims and benefits, and all other claims and benefits, including Social Security claims and benefits (“My Information"); 
GT sentence: To disclose information, whether from before, during or after the date of this authorization, about my health, including HIV, AIDS or other disorders of the immune system, use of drugs or alcohol, mental or physical history, condition, advice or treatment (except this authorization does not authorize release of psychotherapy notes), prescription drug history, earnings, financ

-Lenght =  50
Input sentence: Middle Name/Initial: 
GT sentence: Middle Name/Initial:
Char Decoded sentence: Middle Name/Initial: 
Word Decoded sentence: Middle Name/Initial: a 
-Lenght =  50
Input sentence: Last Name: 
GT sentence: Last Name:
Char Decoded sentence: Last Name: 
Word Decoded sentence: Last Name a 
-Lenght =  50
Input sentence: S ocial Security Number: 
GT sentence: Social Security Number:
Char Decoded sentence: Social Security Number:
Word Decoded sentence: Social Security Number 
-Lenght =  50
Input sentence: Binh Date: 
GT sentence: Birth Date:
Char Decoded sentence: Birth Dat: 
Word Decoded sentence: Birth Date a 
-Lenght =  50
Input sentence: Gender: 
GT sentence: Gender:
Char Decoded sentence: Gender: 
Word Decoded sentence: Gender a 
-Lenght =  50
Input sentence: Claim Event Information 
GT sentence: Claim Event Information
Char Decoded sentence: Claim Event Information
Word Decoded sentence: Claim Event Information 
-Lenght =  100
Input sentence: Accident Descn'p

-Lenght =  100
Input sentence: FRAUD N TICE: Any person who knowingly files a statement of claim containing faise or misleading information issub'ect to criminal and civil penalties. This includes Attending Physician portions of the claim form. 
GT sentence: FRAUD NOTICE: Any person who knowingly files a statement of claim containing false or misleading information is subject to criminal and civil penalties. This includes Attending Physician portions of the claim form.
Char Decoded sentence: FUAR Date A:y person the resont of claim contain cont no the:resoning farse or mis leading ing ingli
Word Decoded sentence: FAR Date Any person the resort of claim contain cont no the:resoning farse or mis leading ing inglu 
-Lenght =  50
Input sentence: C. Signature of Attending Physician 
GT sentence: C. Signature of Attending Physician
Char Decoded sentence: C. Signature of Attending Physician
Word Decoded sentence: C Signature of Attending Physician 
-Lenght =  100
Input sentence: The above sta

-Lenght =  50
Input sentence: State ”Produce: DIN
GT sentence: State/Province: MN
Char Decoded sentence: StateProducte: DIN
Word Decoded sentence: StateProducte: DIN 
-Lenght =  50
Input sentence: Postal Code: 5 53 3 7
GT sentence: Postal Code: 55337
Char Decoded sentence: Postal Code: 5 53
Word Decoded sentence: Postal Code 5 53 
-Lenght =  50
Input sentence: Ciotmtry. US
GT sentence: Country: US
Char Decoded sentence: Country.US
Word Decoded sentence: Country.US 
-Lenght =  50
Input sentence: Date ot'Visit’AdmissiorL 02102520 1 8
GT sentence: Date of Visit/Admission: 02/02/2018
Char Decoded sentence: Date of VisitAdmission 02102520
Word Decoded sentence: Date of VisitAdmission 02102520 
-Lenght =  50
Input sentence: Date ofDisehaJ'ge: 02/022018
GT sentence: Date of Discharge: 02/02/2018
Char Decoded sentence: Date of Discharge: 02/022
Word Decoded sentence: Date of Discharge 02/022 
-Lenght =  100
Input sentence: Procedure: ACL, MCL, meniscus repair in light knee 
GT sentence: Proced

-Lenght =  50
Input sentence: T‘mfﬂnynn TD:
GT sentence: Employee ID:
Char Decoded sentence: Complenn T
Word Decoded sentence: Complete T 
-Lenght =  50
Input sentence: Employer Nana;
GT sentence: Employer Name:
Char Decoded sentence: Employer Name;
Word Decoded sentence: Employer Name 
-Lenght =  50
Input sentence: Gender! Male
GT sentence: Gender: Male
Char Decoded sentence: Gender Male
Word Decoded sentence: Gender Male 
-Lenght =  50
Input sentence: Marital Status; Single
GT sentence: Marital Status: Single
Char Decoded sentence: Marital Status; Single
Word Decoded sentence: Marital Status Single 
-Lenght =  50
Input sentence: 0c: Title: Rasanﬂxxe: 
GT sentence: Occ Title: ResinMixer
Char Decoded sentence: 0c: Title: ResinMixe:
Word Decoded sentence: 0c: Title ResinMixe: 
-Lenght =  50
Input sentence: Original Hire Date:
GT sentence: Original Hire Date:
Char Decoded sentence: Original Hire Date:
Word Decoded sentence: Original Hire Date 
-Lenght =  50
Input sentence: Kira ”ate: ‘ ,

-Lenght =  50
Input sentence: Plan Larnlngs:
GT sentence: Plan Earnings:
Char Decoded sentence: Plan Lareding:
Word Decoded sentence: Plan Lareding: 
-Lenght =  50
Input sentence: r , = _ E-Tl ELIZABETH 
GT sentence: ST. ELIZABETH
Char Decoded sentence: ST, ELIZA-ETH STH ABTH 
Word Decoded sentence: ST ELIZABETH STH BATH a 
-Lenght =  50
Input sentence: EUGEWOOD 
GT sentence: EDGEWOOD
Char Decoded sentence: EDGEWOOD 
Word Decoded sentence: EDGEWOOD a 
-Lenght =  50
Input sentence: OP Notes 
GT sentence: OP Notes
Char Decoded sentence: OP Notes 
Word Decoded sentence: OP Notes a 
-Lenght =  50
Input sentence: MFIN: DOB: 
GT sentence: MRN: DOB:
Char Decoded sentence: MRIN: DOB:
Word Decoded sentence: MIND DOBB 
-Lenght =  50
Input sentence: Acct #: 
GT sentence: Acct #:
Char Decoded sentence: Acct#:
Word Decoded sentence: Acct#: 
-Lenght =  50
Input sentence: Adm: 3."16I2CI‘18! DIG: 3a‘1 £62013 
GT sentence: Adm: 3/16/18, D/C: 3/16/18
Char Decoded sentence: Adm:3."16218 DI:B3 162
Word De

-Lenght =  50
Input sentence: Adm: 3115:2013, arc: 3.1 some 
GT sentence: Adm: 3/16/18, D/C: 3/16/18
Char Decoded sentence: Adm: 3115:2013,c F:r3.1 some 
Word Decoded sentence: Adm 3115:2013,c F:r3.1 some a 
-Lenght =  50
Input sentence: Operative 8‘ Procedure Notes (continued) 
GT sentence: Operative & Procedure Notes (continued)
Char Decoded sentence: Operative 8 Procedure Notes (ontinued)
Word Decoded sentence: Operative 8 Procedure Notes continued 
-Lenght =  100
Input sentence: E} Note si need in Larkin. John J, MD at 32’2032813 8:36 AM continued 
GT sentence: Op Note by Larkin, John J, MD at 3/20/18 8:36 AM (continued)
Char Decoded sentence: O} Note singed Ond Jarkin. John J, MD at 3220328} AM  AM MA clined
Word Decoded sentence: Of Note singed Ond Marking John J MD at 3220328} AM a AM MA lined 
-Lenght =  50
Input sentence: butt)? Larkin,John J, MD 
GT sentence: Author: Larkin, John J MD
Char Decoded sentence: Auth)r Larki, John,J MD 
Word Decoded sentence: author Larkin Johnny 

-Lenght =  50
Input sentence: SSN xxxexxvmocx 
GT sentence: SSN xxx-xx-xxxx
Char Decoded sentence: SSN xxxxxxx
Word Decoded sentence: SSN xxxxxxx 
-Lenght =  50
Input sentence: Sex
GT sentence: Sex
Char Decoded sentence: Sex
Word Decoded sentence: Sex 
-Lenght =  50
Input sentence: EEFU'I Date
GT sentence: Birth Date
Char Decoded sentence: Birth Date
Word Decoded sentence: Birth Date 
-Lenght =  50
Input sentence: Address 
GT sentence: Address
Char Decoded sentence: Address
Word Decoded sentence: Address 
-Lenght =  50
Input sentence: Fho ne 
GT sentence: Phone
Char Decoded sentence: Phone 
Word Decoded sentence: Phone a 
-Lenght =  50
Input sentence: Email 
GT sentence: Email
Char Decoded sentence: Email 
Word Decoded sentence: Email a 
-Lenght =  50
Input sentence: Empioyer 
GT sentence: Employer
Char Decoded sentence: Employer 
Word Decoded sentence: Employer a 
-Lenght =  50
Input sentence: Reg. Status Verified 
GT sentence: Reg Status Verified
Char Decoded sentence: Reg.st Status 

-Lenght =  50
Input sentence: www.unum.mrn
GT sentence: www.unum.com
Char Decoded sentence: www.unum.com
Word Decoded sentence: www.unum.com 
-Lenght =  100
Input sentence: Call ion-free Monday through Friday. 8 am. It! 8 pm. (Eastern Time)
GT sentence: Call toll-free Monday through Friday, 8 a.m. to 8 p.m. (Eastern Time)
Char Decoded sentence: Call tol-free Monday through Frida.8 a.m to8 pm Ea-tern Time
Word Decoded sentence: Call tol-free Monday through Frida.8 am to8 pm eastern Time 
-Lenght =  50
Input sentence: ATTENDING PHYSICIAN STATEMENT (PLEASE PRINT) 
GT sentence: ATTENDING PHYSICIAN STATEMENT (PLEASE PRINT)
Char Decoded sentence: ATTENDING PHYSICIAN STATEMENT (PLEASE PRINT)
Word Decoded sentence: ATTENDING PHYSICIAN STATEMENT PLEASE PRINT 
-Lenght =  100
Input sentence: TO BE COMPLETED BY PHYSICIAN OR TREATING PROVIDER 
GT sentence: TO BE COMPLETED BY PHYSICIAN OR TREATING PROVIDER
Char Decoded sentence: TO BE COMPLETED BY PHYSICIAN OR TREATING PROVIDER RE
Word Decoded sente

-Lenght =  50
Input sentence: Haa the patient baan hospitalized? D Yes No 
GT sentence: Has the patient been hospitalized? Yes No
Char Decoded sentence: Has the patient baan hospitalized? Yes No
Word Decoded sentence: Has the patient barn hospitalized Yes No 
-Lenght =  100
Input sentence: if yes, data nospimllzad immfdwyy): through immiddiyy); 
GT sentence: If yes, date hospitalized (mm/dd/yy): through (mm/dd/yy):
Char Decoded sentence: If yes, date nospimalea dimimatio):l demase dimiday
Word Decoded sentence: If yes date nospimalea dimimatio):l demise timidly 
-Lenght =  50
Input sentence: Was surge performed? 
GT sentence: Was surgery performed? Yes No
Char Decoded sentence: Was surger performed? 
Word Decoded sentence: Was surger performed a 
-Lenght =  50
Input sentence: If yes, what pro 
GT sentence: If yes, what procedure was performed?
Char Decoded sentence: If yes, what pare
Word Decoded sentence: If yes what pare 
-Lenght =  50
Input sentence: cad rl a 4-. cp'er: 
GT sentence

-Lenght =  100
Input sentence: The Gan-u: Il'ﬂdlmnbon Nondhodmmtlon not or 20118 (GlNA pmnlbllt! amplmm and otnor nnllllu mm! by GINA Tltlo II from moons-LIN or I'Iqulrlng nanotlc lnfonnatlnn of an 1ndlvldunl or Family morn r at th- indFvidunl, «mum an wooiﬁoilly allow-d by thin law. To oomrny wllh W8 llwl W0 Mi “Hm that Wu not pro-Ade any alﬂﬂlﬁ lnbmﬂmwhnn responding to lhls mount for medical information. 'Ganollo Informntlon' I: deﬁned GINA. Includes an Indlvmm‘l family mIoIo-l nlatory, the row": of an Murmurs Orﬂll'l'llly member's ganetlo teats. tho last that an lndlvldua of I latun nun-Inn by on Mdhrldlml arm lndlvtdunt‘o tan-ﬂy mnrilbir sought or moment aanallo oarvloon. and ganntlo Inronnnum or an tndlvldunl's tnl'rlly mambot’ or on ornbryo ltml'ulfy held by an Indlvtdual or family mamtm manna “mum reprodumhlo sorvlcaa.
GT sentence: The Genetic Information Nondiscrimination Act of 2008 (GINA) prohibits employers and other entities covered by GINA Title II from requesting or requi

-Lenght =  100
Input sentence: 10. We: modlnetlon prescribed (excluding cventhmmter mdicetion)? Was No 
GT sentence: 10. Was medication prescribed (excluding over the counter medication)? Yes No
Char Decoded sentence: 10. Wo:kness prescribed sectu(ing coventaming co10.t o: acciden? Was No
Word Decoded sentence: 10. weakness prescribed securing coventaming co10.t of accident Was No 
-Lenght =  100
Input sentence: 1 I. We 3.” patient referred to other health care prewar-(s) for eveluetten or treatment (ea .. phﬂlﬂl thlmplat)? as No 
GT sentence: 11. Was the patient referred to other health care provider(s) for evaluation or treatment (e.g., physical therapist)? Yes No
Char Decoded sentence: 1 I.  t3.s patient referred to the veluped to th1 v.luer3.o the relatment or treatment Chargent 1Ch.
Word Decoded sentence: 1 In a t3.s patient referred to the velured to th1 v.luer3.o the relament or treatment Chargen 1Ch. 
-Lenght =  50
Input sentence: Nature and ntlmatod duration of treatments: 
GT

-Lenght =  100
Input sentence: Call toll-frat: Monday through Friday. 8 am. to 8 p.m. Eastern Tima.
GT sentence: Call toll-free Monday through Friday, 8 a.m. to 8 p.m. Eastern Time.
Char Decoded sentence: Call toll-frat: Monday through Friday. 8 a. to8 pm Ea-tern:Tima
Word Decoded sentence: Call toll-frat: Monday through Friday 8 a to8 pm Ea-tern:Tima 
-Lenght =  50
Input sentence: ATTENDING PHYSICIAN $TATEMENT 
GT sentence: ATTENDING PHYSICIAN STATEMENT
Char Decoded sentence: ATTENDING PHYSICIAN $TATEMENT
Word Decoded sentence: ATTENDING PHYSICIAN STATEMENT 
-Lenght =  100
Input sentence: InsuredfPolicyi'i-oidor Name [Last Nan-no. First Name, MI, aufﬁx  
GT sentence: Insured Name (Last Name, First Name, MI, Suffix)
Char Decoded sentence: InsuredPolicyNa-e Last Nam[ First -am. MI Suffix Fimalix 
Word Decoded sentence: InsuredPolicyNa-e Last Name First same MI Suffix Fimalix a 
-Lenght =  50
Input sentence: Data or Blrlh (mmnoryy) 
GT sentence: Date of Birth (mm/dd/yy)
Char Decoded sent

-Lenght =  100
Input sentence: Call tolH‘ree Monday through Friday. 8 am. to 8 pm (Eastern Time) 
GT sentence: Call toll-free Monday through Friday, 8 a.m. to 8 p.m. Eastern Time.
Char Decoded sentence: Call tollfree Monday through Frida.8 a.m t8  pm Eastern Time
Word Decoded sentence: Call tollfree Monday through Frida.8 am t8 a pm Eastern Time 
-Lenght =  50
Input sentence: ananomo PHYSICIAN STATEMENT (continued) 
GT sentence: ATTENDING PHYSICIAN STATEMENT (Continued)
Char Decoded sentence: ATTENDING PHYSICIAN STATEMEN( Continu)d
Word Decoded sentence: ATTENDING PHYSICIAN STATEMENT continued 
-Lenght =  50
Input sentence: Patient Name (Last Name, Final Name, MI, Sufﬁx) 
GT sentence: Patient’s Name (Last Name, First Name, MI. Suffix)
Char Decoded sentence: Patient Name (ast Name ,ingl Name ,ame, Sufti)
Word Decoded sentence: Patient Name last Name single Name same Mufti 
-Lenght =  50
Input sentence: Date of Birth [mmrddiyﬁ 
GT sentence: Date of Birth (mm/dd/yy)
Char Decoded sentence:

-Lenght =  50
Input sentence: - at;
GT sentence: Date
Char Decoded sentence: -ate;
Word Decoded sentence: later 
-Lenght =  50
Input sentence: unumu'
GT sentence: unum
Char Decoded sentence: unum
Word Decoded sentence: unum 
-Lenght =  50
Input sentence: O O O The Benefits Center
GT sentence: The Benefits Center
Char Decoded sentence: The Benefits Center
Word Decoded sentence: The Benefits Center 
-Lenght =  100
Input sentence: Call toll-free Monday through Friday, 8 am. to 8 pm. Eastern Time.
GT sentence: Call toll-free Monday through Friday, 8 a.m. to 8 p.m. Eastern Time.
Char Decoded sentence: Call toll-free Monday through Friday, 8 a. to8 pm Eas-ern Time
Word Decoded sentence: Call toll-free Monday through Friday 8 a to8 pm eastern Time 
-Lenght =  100
Input sentence: Please si n and return this authorization‘to The Benefits Center at the address above. You are entitled to‘receive a copy 0 this authorization. This authorization is desrgned to comply With the Health Insurance Portab

-Lenght =  100
Input sentence: Information authorized for use or disclosure may include information which may indicate the presence of a communicable or non-communicable disease.
GT sentence: Information authorized for use or disclosure may include information which may indicate the presence of a communicable or non-communicable disease.
Char Decoded sentence: Information authorized for disclose may incoude for dislon to hishoral which may indicate the presen
Word Decoded sentence: Information authorized for disclose may include for diplon to hishoral which may indicate the present 
-Lenght =  100
Input sentence: If I do not sign this authorization or if I alter or revoke it, except as specified above, Unum may not be able to evaluate or administer my claim(s), which may lead to my claim(s) being denied. I may revoke this authorization at any time by sending written notice to the address above. I understand that revocation will not apply to any information that Unum requests or disclos

-Lenght =  100
Input sentence: I authorize the followin persons: health care professionals, hospitals, clinics, laboratories, pharmacies and all other medical or me ically related providers, facilities or services, rehabilitation professionals, vocational evaluators, health plans, insurance companies, third party administrators, insurance producers, insurance service providers, consumer reporting agencies including credit bureaus, GENEX Services, Inc., The Advocator Group and other Social Security advocacy vendors, professional licensing bodies, employers, attorneys, financial institutions and/or banks, and governmental entities; 
GT sentence: I authorize the following persons: health care professionals, hospitals, clinics, laboratories, pharmacies and all other medical or medically related providers, facilities or services, rehabilitation professionals, vocational evaluators, health plans, insurance companies, third party administrators, insurance producers, insurance service provider

-Lenght =  50
Input sentence: 03/15/2018 Date Signed 
GT sentence: 03/15/2018 Date Signed
Char Decoded sentence: 03/15/2018 Date Signed Insured Paing
Word Decoded sentence: 03/15/2018 Date Signed Insured Pain 
-Lenght =  50
Input sentence: Printed Name 
GT sentence: Printed Name
Char Decoded sentence: Printed Name 
Word Decoded sentence: Printed Name a 
-Lenght =  50
Input sentence: Social Security Number 
GT sentence: Social Security Number
Char Decoded sentence: Social Security Number 
Word Decoded sentence: Social Security Number a 
-Lenght =  100
Input sentence: | signed on behalf of the Insured as MOTher (Relationship). If Power of Attorney Designee, Guardian, or Conservator, please attach a copy of the document granting authority. 
GT sentence: I signed on behalf of the Insured as Mother (Relationship). If Power of Attorney Designee, Guardian, or Conservator, please attach a copy of the document granting authority.
Char Decoded sentence: Allerged on beal fon the Insured as MOT on

-Lenght =  100
Input sentence: Genitourinary: Negative for dysuria, flank pain, frequency and urgency. 
GT sentence: Genitourinary: Negative for dysuria, flank pain, frequency and urgency.
Char Decoded sentence: Genitourinary: Negative for dysuria, flank pain, frequency and:urgency
Word Decoded sentence: Genitourinary Negative for dysuria flank pain frequency and:urgency 
-Lenght =  50
Input sentence: Skin: Negative for rash. 
GT sentence: Skin: Negative for rash.
Char Decoded sentence: Skin: Negative for rash.
Word Decoded sentence: Skin Negative for rash 
-Lenght =  50
Input sentence: Neurological: Negative for headaches. 
GT sentence: Neurological: Negative for headaches.
Char Decoded sentence: Neurological: Negative for headaches.
Word Decoded sentence: Neurological Negative for headaches 
-Lenght =  50
Input sentence: Objective: . . 
GT sentence: Objective:
Char Decoded sentence: Objective:
Word Decoded sentence: Objective 
-Lenght =  100
Input sentence: Blood pressure (l) 98MB. p

-Lenght =  100
Input sentence: If your blood pressure on today‘s visit was greater than 120/80 you may be at risk of developing pro-hypertension or hypertension. We recommend that you follow-up with your Primary Care Provider for further evaluation. 
GT sentence: If your blood pressure on today's visit was greater than 120/80 you may be at risk of developing pre-hypertension or hypertension. We recommend that you follow-up with your Primary Care Provider for further evaluation.
Char Decoded sentence: If your by on today’s visit was no may be a as breater than gres visit was deelopeding of deeloped p
Word Decoded sentence: If your by on todays visit was no may be a as greater than gres visit was deelopeding of developed p 
-Lenght =  100
Input sentence: If your BMI on today's visit was less than 18.5 or greater than or equal to 25, we recommend that you follow-up with your Primary Care Provider for nutritional counseling. 
GT sentence: If your BMI on today's visit was less than 18.5 or 

-Lenght =  50
Input sentence: Stater‘Pronnce: 
GT sentence: State/Province:
Char Decoded sentence: StateProvince: 
Word Decoded sentence: StateProvince: a 
-Lenght =  50
Input sentence: Postal Code: 
GT sentence: Postal Code:
Char Decoded sentence: Postal Code: 
Word Decoded sentence: Postal Code a 
-Lenght =  50
Input sentence: C mmtry', 
GT sentence: Country:
Char Decoded sentence: Country,
Word Decoded sentence: Country 
-Lenght =  50
Input sentence: “Fork State-Comm: 
GT sentence: Work State/Country:
Char Decoded sentence: Fork State-orn:
Word Decoded sentence: Fork State-orn: 
-Lenght =  50
Input sentence: Best Phone Number to be Reached During the Day: 
GT sentence: Best Phone Number to be Reached During the Day:
Char Decoded sentence: Best Phone Number to be Reached During the Day:
Word Decoded sentence: Best Phone Number to be Reached During the Day 
-Lenght =  50
Input sentence: Email Address: 
GT sentence: Email Address:
Char Decoded sentence: Email Address: 
Word Decoded sen

-Lenght =  50
Input sentence: Location: MedExprcss Jackson, N West Ave 
GT sentence: Location: MedExpress Jackson, N West Ave
Char Decoded sentence: Location: MedExproxprachsst ,ame Avest
Word Decoded sentence: Location MedExproxprachsst same Avert 
-Lenght =  50
Input sentence: 1325 North West Avenue 
GT sentence: 1325 North West Avenue
Char Decoded sentence: 1325Not West Avenue 
Word Decoded sentence: 1325Not West Avenue a 
-Lenght =  50
Input sentence: Jackson: MI 49202—2050 
GT sentence: Jackson, MI 49202-2050
Char Decoded sentence: Jackson: MIN49202—2050
Word Decoded sentence: Jackson MIN49202—2050 
-Lenght =  50
Input sentence:  
GT sentence: 517-768-0384
Char Decoded sentence: Required
Word Decoded sentence: Required 
-Lenght =  50
Input sentence: Policy Holder: 
GT sentence: Policy Holder:
Char Decoded sentence: Policy Holder: 
Word Decoded sentence: Policy Holder a 
-Lenght =  50
Input sentence:  
GT sentence: Relation:
Char Decoded sentence: Required
Word Decoded sentence: Re

-Lenght =  100
Input sentence: VIPdira] Pl'mirlel' Information — Hospitalirm'inn 
GT sentence: Medical Provider Information - Hospitalization
Char Decoded sentence: Middita] Provider Information  —ospitalirment 
Word Decoded sentence: Middita] Provider Information a —ospitalirment a 
-Lenght =  50
Input sentence: Hospital N anle: Medlixpress 
GT sentence: Hospital Name: MedExpress
Char Decoded sentence: Hospital Name :edExpress 
Word Decoded sentence: Hospital Name :edExpress a 
-Lenght =  50
Input sentence: Address Line 1: 1 325 North West Ave. 
GT sentence: Address Line 1: 1325 North West Ave.
Char Decoded sentence: Address Line 1: 1325 West Aves
Word Decoded sentence: Address Line 1: 1325 West Aves 
-Lenght =  50
Input sentence: City .Tackson 
GT sentence: City: Jackson
Char Decoded sentence: City .tachson 
Word Decoded sentence: City .tachson a 
-Lenght =  50
Input sentence: State/Province: MI 
GT sentence: State/Province: MI
Char Decoded sentence: State/Province: MI 
Word Decoded 

-Lenght =  50
Input sentence: Amount Billed: 
GT sentence: Amount Billed: $ 264.00
Char Decoded sentence: Amount Billed: 
Word Decoded sentence: Amount Billed a 
-Lenght =  50
Input sentence: Allowed Amnum:
GT sentence: Allowed Amount: $ 136.22
Char Decoded sentence: Allowed Amoun:
Word Decoded sentence: Allowed Amount 
-Lenght =  50
Input sentence: Service Date: 03:0912013
GT sentence: Service Date: 03/09/2018
Char Decoded sentence: Service Date: 03:09120
Word Decoded sentence: Service Date 03:09120 
-Lenght =  50
Input sentence: Paramount Paid: $ 101.22
GT sentence: Paramount Paid: $ 101.22
Char Decoded sentence: Paramount Paid: $101.22
Word Decoded sentence: Paramount Paid $101.22 
-Lenght =  50
Input sentence: Type: Medical 
GT sentence: Type: Medical
Char Decoded sentence: Type: Medical @ Secal Sedical @ Medical @ Sed 
Word Decoded sentence: Type Medical a Seal Medical a Medical a Sed a 
-Lenght =  50
Input sentence: Provider/Facility: Ryan C Klsh. DPM
GT sentence: Provider/Facili

-Lenght =  50
Input sentence: Middle Name/initial: 
GT sentence: Middle Name/Initial:
Char Decoded sentence: Middle Name/Initial: 
Word Decoded sentence: Middle Name/Initial: a 
-Lenght =  50
Input sentence: Last Name: 
GT sentence: Last Name:
Char Decoded sentence: Last Name: 
Word Decoded sentence: Last Name a 
-Lenght =  50
Input sentence: Social Security Number: 
GT sentence: Social Security Number:
Char Decoded sentence: Social Security Number: 
Word Decoded sentence: Social Security Number a 
-Lenght =  50
Input sentence: Birth Date: 
GT sentence: Birth Date:
Char Decoded sentence: Birth Date: 
Word Decoded sentence: Birth Date a 
-Lenght =  50
Input sentence: Gender: 
GT sentence: Gender:
Char Decoded sentence: Gender: 
Word Decoded sentence: Gender a 
-Lenght =  50
Input sentence: Claim Event Information 
GT sentence: Claim Event Information
Char Decoded sentence: Claim Event Information
Word Decoded sentence: Claim Event Information 
-Lenght =  100
Input sentence: Accident Des

-Lenght =  50
Input sentence: Country". US 
GT sentence: Country: US
Char Decoded sentence: Country".US 
Word Decoded sentence: Country".US a 
-Lenght =  50
Input sentence: Business Telephone: (419) 474- 1210 
GT sentence: Business Telephone: (419) 474-1210
Char Decoded sentence: Business Telephone: (419)474-1210
Word Decoded sentence: Business Telephone (419)474-1210 
-Lenght =  50
Input sentence: Business Fax (419) 474-3076 
GT sentence: Business Fax: (419) 474-3076
Char Decoded sentence: Business Fax (419)474-3076 
Word Decoded sentence: Business Fax (419)474-3076 a 
-Lenght =  50
Input sentence: Date ofFirst Visit: 03/09/2018 
GT sentence: Date of First Visit: 03/09/2018
Char Decoded sentence: Date of First Visi:03/09/2018
Word Decoded sentence: Date of First Visi:03/09/2018 
-Lenght =  50
Input sentence: Date ofNen Visit: 03/16/2018
GT sentence: Date of Next Visit: 03/16/2018
Char Decoded sentence: Date of Next Vis:t03/16/2018
Word Decoded sentence: Date of Next Vis:t03/16/2018 
-

-Lenght =  50
Input sentence: City. 
GT sentence: City:
Char Decoded sentence: City. 
Word Decoded sentence: City a 
-Lenght =  50
Input sentence: State/PrOVirice: 
GT sentence: State/Province:
Char Decoded sentence: State/Province:
Word Decoded sentence: State/Province: 
-Lenght =  50
Input sentence: Postal Code 
GT sentence: Postal Code:
Char Decoded sentence: Postal Code 
Word Decoded sentence: Postal Code a 
-Lenght =  50
Input sentence: Country. 
GT sentence: Country:
Char Decoded sentence: Country. 
Word Decoded sentence: Country a 
-Lenght =  50
Input sentence: Best Phone Number to be Reached During the Day: 
GT sentence: Best Phone Number to be Reached During the Day:
Char Decoded sentence: Best Phone Number to be Reached During the Day:
Word Decoded sentence: Best Phone Number to be Reached During the Day 
-Lenght =  50
Input sentence: Email Address: 
GT sentence: Email Address:
Char Decoded sentence: Email Address: 
Word Decoded sentence: Email Address a 
-Lenght =  50
Input 

-Lenght =  100
Input sentence: Information authorized for use or disclosure may include information which may indicate the presence of a communicable or non-communicable disease. 
GT sentence: Information authorized for use or disclosure may include information which may indicate the presence of a communicable or non-communicable disease.
Char Decoded sentence: Information authorized for disclose may incoude for dislon to hishoral which may indicate the presen
Word Decoded sentence: Information authorized for disclose may include for diplon to hishoral which may indicate the present 
-Lenght =  100
Input sentence: If I do not sign this authorization or if I alter or revoke it, except as specified above, Unum may not be able to evaluate or administer my claim(s), which may lead to my claim(s) being denied. I may revoke this authorization at any time by sending written notice to the address above. I understand that revocation will not apply to any information that Unum requests or disclo

-Lenght =  50
Input sentence: Employee Off—Job Acc June 18, 2012 
GT sentence: Employee Off-Job Acc June 18, 2012
Char Decoded sentence: Employee Off—Job Acc June 18, 2012
Word Decoded sentence: Employee Off—Job Acc June 18, 2012 
-Lenght =  50
Input sentence: Spouse Off—Job Acc June 18, 2012 
GT sentence: Spouse Off-Job Acc June 18, 2012
Char Decoded sentence: Spouse Off—Job Acc June 18, 2012
Word Decoded sentence: Spouse Off—Job Acc June 18, 2012 
-Lenght =  50
Input sentence: Child Off—Job Acc June 18, 2012 
GT sentence: Child Off-Job Acc June 18, 2012
Char Decoded sentence: Child Off—Job Acc June 18, 2012
Word Decoded sentence: Child Off—Job Acc June 18, 2012 
-Lenght =  50
Input sentence: Total Monthly Premium: $31.81 
GT sentence: Total Monthly Premium: $31.81
Char Decoded sentence: Total Monthly Premium: $31.81
Word Decoded sentence: Total Monthly Premium $31.81 
-Lenght =  100
Input sentence: Total Employee Semi-Monthly Payroll Deduction: $15.92 
GT sentence: Total Employee Sem

In [19]:
WER_spell_correction = calculate_WER(gt_texts, decoded_sentences)
print('WER_spell_correction |TEST= ', WER_spell_correction)

WER_spell_correction |TEST=  0.113000765838


In [20]:
WER_spell_word_correction = calculate_WER(gt_texts, corrected_sentences)
print('WER_spell_word_correction |TEST= ', WER_spell_word_correction)

WER_spell_word_correction |TEST=  0.129743576201


In [21]:
WER_OCR = calculate_WER(gt_texts, input_texts)
print('WER_OCR |TEST= ', WER_OCR)

WER_OCR |TEST=  0.0563287295006
