# Introduction

We tackle the problem of OCR post processing. In OCR, we map the image form of the document into the text domain. This is done first using an CNN+LSTM+CTC model, in our case based on tesseract. Since this output maps only image to text, we need something on top to validate and correct language semantics.

The idea is to build a language model, that takes the OCRed text and corrects it based on language knowledge. The langauge model could be:
- Char level: the aim is to capture the word morphology. In which case it's like a spelling correction system.
- Word level: the aim is to capture the sentence semnatics. But such systems suffer from the OOV problem.
- Fusion: to capture semantics and morphology language rules. The output has to be at char level, to avoid the OOV. However, the input can be char, word or both.

The fusion model target is to learn:

    p(char | char_context, word_context)

In this workbook we use seq2seq vanilla Keras implementation, adapted from the lstm_seq2seq example on Eng-Fra translation task. The adaptation involves:

- Adapt to spelling correction, on char level
- Pre-train on a noisy, medical sentences
- Fine tune a residual, to correct the mistakes of tesseract 
- Limit the input and output sequence lengths
- Enusre teacher forcing auto regressive model in the decoder
- Limit the padding per batch
- Learning rate schedule
- Bi-directional LSTM Encoder
- Bi-directional GRU Encoder


# Imports

In [1]:
from __future__ import print_function
import tensorflow as tf
import keras.backend as K
from keras.backend.tensorflow_backend import set_session
from keras.models import Model
from keras.layers import Input, LSTM, Dense, Bidirectional, Concatenate, GRU
from keras import optimizers
from keras.callbacks import ModelCheckpoint, TensorBoard, LearningRateScheduler
from keras.models import load_model
import numpy as np
import os
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt
from autocorrect import spell
%matplotlib inline

Using TensorFlow backend.


# Utility functions

In [2]:
# Limit gpu allocation. allow_growth, or gpu_fraction
def gpu_alloc():
    config = tf.ConfigProto()
    config.gpu_options.allow_growth = True
    set_session(tf.Session(config=config))

In [3]:
gpu_alloc()

In [4]:
def calculate_WER_sent(gt, pred):
    '''
    calculate_WER('calculating wer between two sentences', 'calculate wer between two sentences')
    '''
    gt_words = gt.lower().split(' ')
    pred_words = pred.lower().split(' ')
    d = np.zeros(((len(gt_words) + 1), (len(pred_words) + 1)), dtype=np.uint8)
    # d = d.reshape((len(gt_words)+1, len(pred_words)+1))

    # Initializing error matrix
    for i in range(len(gt_words) + 1):
        for j in range(len(pred_words) + 1):
            if i == 0:
                d[0][j] = j
            elif j == 0:
                d[i][0] = i

    # computation
    for i in range(1, len(gt_words) + 1):
        for j in range(1, len(pred_words) + 1):
            if gt_words[i - 1] == pred_words[j - 1]:
                d[i][j] = d[i - 1][j - 1]
            else:
                substitution = d[i - 1][j - 1] + 1
                insertion = d[i][j - 1] + 1
                deletion = d[i - 1][j] + 1
                d[i][j] = min(substitution, insertion, deletion)
    return d[len(gt_words)][len(pred_words)]

In [5]:
def calculate_WER(gt, pred):
    '''

    :param gt: list of sentences of the ground truth
    :param pred: list of sentences of the predictions
    both lists must have the same length
    :return: accumulated WER
    '''
#    assert len(gt) == len(pred)
    WER = 0
    nb_w = 0
    for i in range(len(gt)):
        #print(gt[i])
        #print(pred[i])
        WER += calculate_WER_sent(gt[i], pred[i])
        nb_w += len(gt[i])

    return WER / nb_w

In [6]:
def load_data_with_gt(file_name, num_samples, max_sent_len, min_sent_len, delimiter='\t', gt_index=1, prediction_index=0):
    '''Load data from txt file, with each line has: <TXT><TAB><GT>. The  target to the decoder muxt have \t as the start trigger and \n as the stop trigger.'''
    cnt = 0  
    input_texts = []
    gt_texts = []
    target_texts = []
    for row in open(file_name, encoding='utf8'):
        if cnt < num_samples :
            #print(row)
            sents = row.split(delimiter)
            input_text = sents[prediction_index]
            
            target_text = '\t' + sents[gt_index] + '\n'
            if len(input_text) > min_sent_len and len(input_text) < max_sent_len and len(target_text) > min_sent_len and len(target_text) < max_sent_len:
                cnt += 1
                
                input_texts.append(input_text)
                target_texts.append(target_text)
                gt_texts.append(sents[gt_index])
    return input_texts, target_texts, gt_texts

In [7]:
def load_data(file_name, num_samples, max_sent_len, min_sent_len):
    '''Load data from txt file, with each line has: <TXT><TAB><GT>. The  target to the decoder muxt have \t as the start trigger and \n as the stop trigger.'''
    cnt = 0  
    input_texts = []   
    
    #for row in open(file_name, encoding='utf8'):
    for row in open(file_name):
        if cnt < num_samples :            
            input_text = row           
            if len(input_text) > min_sent_len and len(input_text) < max_sent_len:
                cnt += 1                
                input_texts.append(input_text)
    return input_texts

In [8]:
def vectorize_data(input_texts, max_encoder_seq_length, num_encoder_tokens, vocab_to_int):
    
    if(len(input_texts) > max_encoder_seq_length):
        input_texts = input_texts[:max_encoder_seq_length]
    
    '''Prepares the input text and targets into the proper seq2seq numpy arrays'''
    encoder_input_data = np.zeros(
    (len(input_texts), max_encoder_seq_length),
    dtype='float32')
    
    for i, input_text in enumerate(input_texts):
        for t, char in enumerate(input_text[:max_encoder_seq_length]):
            # c0..cn
            encoder_input_data[i, t] = vocab_to_int[char]
                
    return encoder_input_data

In [9]:
def decode_sequence(input_seq, encoder_model, decoder_model, num_decoder_tokens, max_decoder_seq_length, vocab_to_int, int_to_vocab):
    
    #print(max_decoder_seq_length)
    # Encode the input as state vectors.
    encoder_outputs, h, c  = encoder_model.predict(input_seq)
    states_value = [h,c]
    # Generate empty target sequence of length 1.
    target_seq = np.zeros((1, 1))
    # Populate the first character of target sequence with the start character.
    target_seq[0, 0] = vocab_to_int['\t']

    # Sampling loop for a batch of sequences
    # (to simplify, here we assume a batch of size 1).
    stop_condition = False
    decoded_sentence = ''
    #print(input_seq)
    attention_density = []
    i = 0
    special_chars = ['\\', '/', '-', '—' , ':', '[', ']', ',', '.', '"', ';', '%', '~', '(', ')', '{', '}', '$']
    #special_chars = []
    while not stop_condition:
        #print(target_seq)
        output_tokens, attention, h, c  = decoder_model.predict(
            [target_seq, encoder_outputs] + states_value)
        #print(attention.shape)
        attention_density.append(attention[0][0])# attention is max_sent_len x 1 since we have num_time_steps = 1 for the output
        # Sample a token
        #print(output_tokens.shape)
        sampled_token_index = np.argmax(output_tokens[0, -1, :])
        
        #print(sampled_token_index)
        sampled_char = int_to_vocab[sampled_token_index]
        
        orig_char = int_to_vocab[int(input_seq[:,i][0])]
        
        # Exit condition: either hit max length
        # or find stop character.
        if (sampled_char == '\n' or
           len(decoded_sentence) > max_decoder_seq_length):
            stop_condition = True
            #print('End', sampled_char, 'Len ', len(decoded_sentence), 'Max len ', max_decoder_seq_length)
            sampled_char = ''
        
        # Copy digits as it, since the spelling corrector is not good at digit corrections
        if(orig_char.isdigit() or orig_char in special_chars):
            decoded_sentence += orig_char            
        else:
            if(sampled_char.isdigit() or sampled_char in special_chars):
                decoded_sentence += ''
            else:
                decoded_sentence += sampled_char
        
        #decoded_sentence += sampled_char


        # Update the target sequence (of length 1).
        target_seq = np.zeros((1, 1))
        target_seq[0, 0] = sampled_token_index

        # Update states
        states_value = [h, c]
        
        i += 1
        if(i > 48):
            i = 0
    attention_density = np.array(attention_density)
    
    # Word level spell correct
    
    corrected_decoded_sentence = ''
    for w in decoded_sentence.split(' '):
        corrected_decoded_sentence += spell(w) + ' '
    decoded_sentence = corrected_decoded_sentence
    
    return decoded_sentence, attention_density


# Load data

# Load model params

In [10]:
data_path = '../../dat/'

In [11]:
max_sent_lengths = [50, 100]

In [12]:
vocab_file = {}
model_file = {}
encoder_model_file = {}
decoder_model_file = {}
model = {}
encoder_model = {}
decoder_model = {}
vocab = {}
vocab_to_int = {}
int_to_vocab = {}
max_sent_len = {}
min_sent_len = {}
num_decoder_tokens = {}
num_encoder_tokens = {}
max_encoder_seq_length = {}
max_decoder_seq_length = {}

In [13]:

for i in max_sent_lengths:
    vocab_file[i] = 'vocab-{}.npz'.format(i)
    model_file[i] = 'best_model-{}.hdf5'.format(i)
    encoder_model_file[i] = 'encoder_model-{}.hdf5'.format(i)
    decoder_model_file[i] = 'decoder_model-{}.hdf5'.format(i)
    
    vocab = np.load(file=vocab_file[i])
    vocab_to_int[i] = vocab['vocab_to_int'].item()
    int_to_vocab[i] = vocab['int_to_vocab'].item()
    max_sent_len[i] = vocab['max_sent_len']
    min_sent_len[i] = vocab['min_sent_len']
    input_characters = sorted(list(vocab_to_int))
    num_decoder_tokens[i] = num_encoder_tokens[i] = len(input_characters) #int(encoder_model.layers[0].input.shape[2])
    max_encoder_seq_length[i] = max_decoder_seq_length[i] = max_sent_len[i] - 1#max([len(txt) for txt in input_texts])
    
    model[i] = load_model(model_file[i])
    encoder_model[i] = load_model(encoder_model_file[i])
    decoder_model[i] = load_model(decoder_model_file[i])



In [14]:
num_samples = 1000000
#tess_correction_data = os.path.join(data_path, 'test_data.txt')
#input_texts = load_data(tess_correction_data, num_samples, max_sent_len, min_sent_len)

OCR_data = os.path.join(data_path, 'new_trained_data.txt')
#input_texts, target_texts, gt_texts = load_data_with_gt(OCR_data, num_samples, max_sent_len, min_sent_len, delimiter='|',gt_index=0, prediction_index=1)
input_texts, target_texts, gt_texts = load_data_with_gt(OCR_data, num_samples, max_sent_len=10000, min_sent_len=0)

In [15]:
# Sample data
print(len(input_texts))
for i in range(10):
    print(input_texts[i], '\n', target_texts[i])

1951
Me dieal Provider Roles: Treating  
 	Medical Provider Roles: Treating


Provider First Name: Christine  
 	Provider First Name: Christine


Provider Last Name: Nolen, MD  
 	Provider Last Name: Nolen, MD


Address Line 1 : 7 25 American Avenue  
 	Address Line 1 : 725 American Avenue


City. W’aukesha  
 	City: Waukesha


StatefProvinee: ‘WI  
 	State/Province: WI


Postal Code: 5 31 88  
 	Postal Code: 53188


Country". US  
 	Country:  US


Business Telephone: (2 62) 92 8- 1000  
 	Business Telephone: (262) 928- 1000


Date ot‘Pirst Visit: 1 2/01f20 17  
 	Date of First Visit: 12/01/2017




In [16]:
# Spell correct before inference
'''
input_texts_ = []
for sent in input_texts:
    sent_ = ''
    for word in sent.split(' '):
        sent_ += spell(word) + ' '
    input_texts_.append(sent_)
input_texts = input_texts_
input_texts_ = []
# Sample data
print(len(input_texts))
for i in range(10):
    print(input_texts[i], '\n', target_texts[i])
'''

"\ninput_texts_ = []\nfor sent in input_texts:\n    sent_ = ''\n    for word in sent.split(' '):\n        sent_ += spell(word) + ' '\n    input_texts_.append(sent_)\ninput_texts = input_texts_\ninput_texts_ = []\n# Sample data\nprint(len(input_texts))\nfor i in range(10):\n    print(input_texts[i], '\n', target_texts[i])\n"

In [17]:
decoded_sentences = []

#for seq_index in range(len(input_texts)):


for i, input_text in enumerate(input_texts):
    #print(input_text)
    # Find the input length range to choose the proper model to use
    len_range = max_sent_lengths[-1] # Take the longest range
    for length in max_sent_lengths:
        if(len(input_text) < length):
            len_range = length
            break
    #print(len_range)
    encoder_input_data = vectorize_data(input_texts=[input_text], max_encoder_seq_length=max_encoder_seq_length[len_range], num_encoder_tokens=num_encoder_tokens[len_range], vocab_to_int=vocab_to_int[len_range])
    
    

    target_text = gt_texts[i]
    
    input_seq = encoder_input_data
    #print(input_seq.shape)
    #print(max_decoder_seq_length[len_range])
    #print(max_decoder_seq_length)
    decoded_sentence,_  = decode_sequence(input_seq, encoder_model[len_range], decoder_model[len_range], num_decoder_tokens[len_range],  max_decoder_seq_length[len_range], vocab_to_int[len_range], int_to_vocab[len_range])
    
    print('-')
    print('Input sentence:', input_text)
    print('GT sentence:', target_text)
    print('Decoded sentence:', decoded_sentence)   
    decoded_sentences.append(decoded_sentence)
    

    

-
Input sentence: Me dieal Provider Roles: Treating 
GT sentence: Medical Provider Roles: Treating

Decoded sentence: Medical Provider Roles:Treating a 
-
Input sentence: Provider First Name: Christine 
GT sentence: Provider First Name: Christine

Decoded sentence: Provider First Name Christine a 
-
Input sentence: Provider Last Name: Nolen, MD 
GT sentence: Provider Last Name: Nolen, MD

Decoded sentence: Provider Last Name Dolens MD a 
-
Input sentence: Address Line 1 : 7 25 American Avenue 
GT sentence: Address Line 1 : 725 American Avenue

Decoded sentence: Address Line a a 725 American Avenue 
-
Input sentence: City. W’aukesha 
GT sentence: City: Waukesha

Decoded sentence: City.Stakes Week 
-
Input sentence: StatefProvinee: ‘WI 
GT sentence: State/Province: WI

Decoded sentence: StateProvinee: WI I N S 
-
Input sentence: Postal Code: 5 31 88 
GT sentence: Postal Code: 53188

Decoded sentence: Postal Code a 3188 a 
-
Input sentence: Country". US 
GT sentence: Country:  US

Decoded

-
Input sentence: [:I Piease Check box if address is inoorreot or insurawce information has changed andmdicatechange(s)0n revalse side. 
GT sentence: Please check if address is incorrect or insurance information has changed and indicate charge(s) on reverse side

Decoded sentence: these Check box indurated in ore insurant and in[:race information has changed and dis cantechang[: 
-
Input sentence: - VIASTERCARD
GT sentence: MASTERCARD

Decoded sentence: MASTERCARD 
-
Input sentence: DISCOVER
GT sentence: DISCOVER

Decoded sentence: DISCOVER 
-
Input sentence: VISA
GT sentence: VISA

Decoded sentence: VISA 
-
Input sentence: RATIENTIIAME
GT sentence: PATIENT NAME

Decoded sentence: PATIENTS SAME 
-
Input sentence: DUEDATE 
GT sentence: DUE DATE

Decoded sentence: DUE DATE 
-
Input sentence: GUARANTORID
GT sentence: GUARANTOR ID

Decoded sentence: GUARANTOR ID 
-
Input sentence: BALANCE DUE
GT sentence: BALANCE DUE

Decoded sentence: BALANCE DUE 
-
Input sentence: Amount Enclosed 
GT sen

-
Input sentence: I authorize the followin persons: health care professionals, hospitals, clinics, laboratories, pharmacies and all other medical or me ically related providers, facilities or services, rehabilitation professionals, vocational evaluators, health plans, insurance companies, third party administrators, insurance producers, insurance service providers, consumer reporting agencies including credit bureaus, GENEX Services, Inc., The Advocator Group and other Social Security advocacy vendors, professional licensing bodies, employers, attorneys, financial institutions and/or banks, and governmental entities; 
GT sentence: I authorize the following persons: health care professionals, hospitals, clinics, laboratories, pharmacies and all other medical or medically related providers, facilities or services, rehabilitation professionals, vocational evaluators, health plans, insurance companies, third party administrators, insurance producers, insurance service providers, consumer r

-
Input sentence: I signed on behalf of the Insured as (Relationship). If Power of Attorney Designee, Guardian, or Conservator, please attach a copy of the document granting authority.
GT sentence: I signed on behalf of the Insured as (Relationship). If Power of Attorney Designee, Guardian, or Conservator, please attach a copy of the document granting authority.

Decoded sentence: Is inkled on bleed as Relationship of the Insured as melinae Designed Describe guardian Cor Contain a 
-
Input sentence: Unum is a registered trademark and marketing brand of Unum Group and its insuring subsidiaries. 
GT sentence: Unum is a registered trademark and marketing brand of Unum Group and its insuring subsidiaries.

Decoded sentence: Unum is a registered trademark and marketing brand of Unum Group and its insuring subsidiaries 
-
Input sentence: CL-1116 ( 
GT sentence: CL-1116

Decoded sentence: CL-1116 d 
-
Input sentence: Daytime Phone: 
GT sentence: Daytime Phone:

Decoded sentence: Daytime Phone

-
Input sentence: Date ofVisithdmission 
GT sentence: Date of Visit/Admission: 12/01/2017

Decoded sentence: Date of VisitAdmission a 
-
Input sentence: Date ot’DischaIge: 1210112017 
GT sentence: Date of Discharge: 12/01/2017

Decoded sentence: Date of Discharge 1210112017 
-
Input sentence: Proc edure : Cleaning, may: bandage
GT sentence: Procedure: Cleaning, xray, bandage

Decoded sentence: Procedure leaning and bandage 
-
Input sentence: E11113] arm en t In formation 
GT sentence: Employment Information

Decoded sentence: E11113] ardent Information 
-
Input sentence: Employer Name: 
GT sentence: Employer Name:

Decoded sentence: Employer Name a 
-
Input sentence: Electronic Submission 
GT sentence: Electronic Submission

Decoded sentence: Electronic Submission 
-
Input sentence: Claim Event Identiﬁer: 
GT sentence: Claim Event Identifier:

Decoded sentence: Claim Event Identified a 
-
Input sentence: Submission Date: 03411419018
GT sentence: Submission Date: 03/14/2018

Decoded sen

-
Input sentence: 4. Lateral compartment osseous contusions involving the posterior rim of the lateral tibial plateau and terminal sulcus of lateral femoral condyle. 
GT sentence: 4. Lateral compartment osseous contusions involving the posterior rim of the lateral tibial plateau and terminal sulcus of lateral femoral condyle.

Decoded sentence: of Lateral compartment incoupang the postering th4.postering the patient to distent is after lateral 
-
Input sentence: 5. Patellar apical grade 1-2 chondromalacia. 
GT sentence: 5. Patellar apical grade 1-2 chondromalacia.

Decoded sentence: of Patellar apical grade 1-2 chondromata 
-
Input sentence: Diagnosis 
GT sentence: Diagnosis

Decoded sentence: Diagnosis 
-
Input sentence: Right knee ACL rupture and high grade MCL tear 
GT sentence: Right knee ACL rupture and high grade MCL tear

Decoded sentence: a Right knee ACL rupture and hight rider MAL tear a 
-
Input sentence: Plan 
GT sentence: Plan

Decoded sentence: Plan Cand 
-
Input sentence

-
Input sentence: ENT: no ears, nose or throat symptoms. 
GT sentence: ENT: no ears, nose or throat symptoms.

Decoded sentence: ENTR no eared nose throat symptoms 
-
Input sentence: Endocrine: no endocrine symptoms. 
GT sentence: Endocrine: no endocrine symptoms.

Decoded sentence: Endocrines no endocrine symptoms 
-
Input sentence: Eyes: glasseslcontact. 
GT sentence: Eyes: glasses/contact.

Decoded sentence: Eyes glassescontact. 
-
Input sentence: Gastrointestinal: no gastrointestinal symptoms. 
GT sentence: Gastrointestinal: no gastrointestinal symptoms.

Decoded sentence: Gast intestinal no gastivestional symptoms 
-
Input sentence: Genitourinary: no genitourinary symptoms. 
GT sentence: Genitourinary: no genitourinary symptoms.

Decoded sentence: Genitourinary no genitourinary symptoms 
-
Input sentence: HematologicILymphatlc no hematologic symptoms. 
GT sentence: Hematologic/Lymphatlc no hematologic symptoms.

Decoded sentence: HematologicLymphatlc no hematologic symptoms 
-
Inp

-
Input sentence: OPERATIVE REPORT 
GT sentence: OPERATIVE REPORT

Decoded sentence: OPERATIVE REPORT a 
-
Input sentence: MR #: 
GT sentence: MR #:

Decoded sentence: MR of 
-
Input sentence: SURGEON: JASON HOLM, M.D. 
GT sentence: SURGEON: JASON HOLM, M.D. 

Decoded sentence: SURGEON JASON HOLM Made a 
-
Input sentence: DATE: 02/02/2018 
GT sentence: DATE: 02/02/2018

Decoded sentence: DATE 02/02/2018 a 
-
Input sentence: 05/09/1980 
GT sentence: 05/09/1980

Decoded sentence: 05/09/1980 a 
-
Input sentence: PREOPERATIVE DIAGNOSES: 
GT sentence: PREOPERATIVE DIAGNOSES:

Decoded sentence: PREOPERATIVE DIAGNOSES a 
-
Input sentence: 1. Right knee anterior cruciate ligament tear. 
GT sentence: 1. Right knee anterior cruciate ligament tear.

Decoded sentence: of Right knee anterior cruciate ligament tears 
-
Input sentence: 2. Me dial collateral ligament tear. 
GT sentence: 2. Medial collateral ligament tear.

Decoded sentence: of Medial collateral ligament tears 
-
Input sentence: POSTOP

-
Input sentence: 9 Family history of Cancer (C801) 
GT sentence: • Family history of Cancer (C80.1)

Decoded sentence: a Family history of Cancer (C801) 
-
Input sentence: 0 Family history of other condition (284.89) 
GT sentence: • Family history of other condition (Z84.89)

Decoded sentence: a Family history of other condition (284.89) 
-
Input sentence: 0 Family history of other condition (284.89) 
GT sentence: • Family history of other condition (Z84.89)

Decoded sentence: a Family history of other condition (284.89) 
-
Input sentence: Social History 
GT sentence: Social History

Decoded sentence: Social History 
-
Input sentence: 0 Age reporting 
GT sentence: • Age reporting

Decoded sentence: a Age reporting Carting Sorite 
-
Input sentence: a Consumes alcohol (278.9) 
GT sentence: • Consumes alcohol (Z78.9)

Decoded sentence: a Consumes alcohol (278.9) a 
-
Input sentence: - Exercises regularly 
GT sentence: • Exercises regularly

Decoded sentence: a Exercises regularly a 
-
In

-
Input sentence: Continue with physicai therapy. Follow up in 4 weeks. 
GT sentence: Continue with physical therapy. Follow up in 4 weeks.

Decoded sentence: Continue with physical therapy Follow up in a weeks 
-
Input sentence: DiscussionlSummary 
GT sentence: Discussion/Summary

Decoded sentence: DiscussionSummary a 
-
Input sentence: I believe that: a is doing quite Well at this time following a right knee ACL repair with right knee MCL repair. We did review precautions and once again reviewed her protocol which would include an additional week of toe touch weight bearing with the brace locked at zero. At three weeks post op, she may progress to weight bearing as tolerated with the brace locked at zero. I would like to see her back in 4 weeks for reassessment. We did review some home exercises to work on ROM. Based on quad set. which is actually quite good today with a mild effusion, i did not feet an aspiration was indicated. I did however recommend use of a Kneehab unit given her

-
Input sentence: Postoperatively, the patient will be touchdown weightbearing only for the ﬁrst three weeks to allow some early healing of the MCL'. She was then allowed to progressively weight bear as tolerated with the knee locked in extension. She will then progress back onto the standard ACL rehabilitation protocol. 
GT sentence: Postoperatively, the patient will be touchdown weightbearing only for the first three weeks to allow some early healing of the MCL. She was then allowed to progressively weight bear as tolerated with the knee locked in extension. She will then progress back onto the standard ACL rehabilitation protocol.

Decoded sentence: Providerativer,the patient will abn weighting only for the res ,ecthent week week freaks to allowed a 
-
Input sentence: Jamie Birkelo, PA-C, was present and scrubbed throughout the case and his assistance was critical for patient positioning, assistance with prepping and draping, soft tissue retraction, leg manipulation, operating power

-
Input sentence: Grade 2b Lachman's 
GT sentence: Grade 2b Lachman's

Decoded sentence: Grade b Lachman's 
-
Input sentence: -- posterior drawer 
GT sentence: -- posterior drawer

Decoded sentence: of posterior drawer 
-
Input sentence: +anterior drawer 
GT sentence: +anterior drawer

Decoded sentence: anterior drawer a 
-
Input sentence: opening to valgus stress at 0 degrees. 
GT sentence: opening to valgus stress at 0 degrees.

Decoded sentence: opening to valgus stress at a degrees 
-
Input sentence: grade 3 opening to valgus at 30 degrees 
GT sentence: grade 3 opening to valgus at 30 degrees

Decoded sentence: grade a opening to valgus a degrees a 
-
Input sentence: stable to varus stress at 0 and 30 degrees. 
GT sentence: stable to varus stress at 0 and 30 degrees.

Decoded sentence: stable to varus stress at a and of degrees 
-
Input sentence: o‘é’iiiéi'gﬁés 
GT sentence: TWIN CITIES ORTHOPEDICS

Decoded sentence: Right Tillers a 
-
Input sentence: "shortTermDisability"] 
GT sen

-
Input sentence: December 27, 2016
GT sentence: December 27, 2016

Decoded sentence: December 27, 2016 
-
Input sentence: Confirmation of Coverage
GT sentence: Confirmation of Coverage

Decoded sentence: Confirmation of Coverage 
-
Input sentence: Employer: ‘
GT sentence: Employer:

Decoded sentence: Employers 
-
Input sentence: Group Policy #: 
GT sentence: Group Policy #:

Decoded sentence: Group Policy of 
-
Input sentence: Customer Policy #: 
GT sentence: Customer Policy #:

Decoded sentence: Customer Policy of 
-
Input sentence: EE Name: 
GT sentence: EE Name:

Decoded sentence: EE Name a 
-
Input sentence: The information below is provided to give you a general summary of your coverage and premium consistent with the benefits outlined in your certificate. 
GT sentence: The information below is provided to give you a general summary of your coverage and premium consistent with the benefits outlined in your certificate.

Decoded sentence: The above aition beloved to giver to giver

-
Input sentence: Served military last 12 mths — no 
GT sentence: Served military last 12 mths - no

Decoded sentence: Served lizary last 12mthe no 
-
Input sentence: Hired as temp - no 
GT sentence: Hired as temp - no

Decoded sentence: Hired as temp a no a 
-
Input sentence: Work and live same state — yes 
GT sentence: Work and live same state - yes

Decoded sentence: Work and live same state a yes 
-
Input sentence: Work from home — no 
GT sentence: Work from home - no

Decoded sentence: Work from home a no 
-
Input sentence: Does schedule vary? - yes 
GT sentence: Does schedule vary? - yes

Decoded sentence: Does schedule vary a yes 
-
Input sentence: How does it vary — hours and days vary 
GT sentence: How does it vary - hours and days vary

Decoded sentence: How does it vary a hours and days vary a 
-
Input sentence: Scheduled work days — ["mon","tue","wed“rll thull I"fri"] 
GT sentence: Scheduled work days — ["mon","tue","wed“,"thu","fri"]

Decoded sentence: Scheduled work days 

-
Input sentence: l:l Short Term Disability iﬁ Voluntary Benefits ( A cci cl ten-i» d.— i-ioep [+41 lﬁdé’mni ly)
GT sentence: Short Term Disability Voluntary Benefits

Decoded sentence: Short Term Disability Voluntary Benefits incl c(l:untary Benefits IDDD a left S lighty 
-
Input sentence: El Long Term Disability El Voluntary Benefits ConcorlGrilical Illness insurance
GT sentence: Long Term Disability Voluntary Benefits Cancer/Critical Illness Insurance

Decoded sentence: Longe Term Disability Voluntary Bencordital Concervical Illness insurances insurance 
-
Input sentence: El Lite Insurance :1 Voluntary Benefits ModSupporl insurance
GT sentence: Life Insurance Voluntary Benefits MedSupport Insurance

Decoded sentence: Listed Insurance voluntary Benefits ModSupporl insurance 
-
Input sentence: While there to no legal requirementtor you to provide intermatlen regarding other policies you may have with Unum. thle information will help us Identify any other coverage you have with up for 

-
Input sentence: Detee orService (including
GT sentence: Dates  of Service (including Confinement) 

Decoded sentence: Deter WorkServic( including 
-
Input sentence: Diagnosis Code (ICD)
GT sentence: Diagnosis Code (ICD)

Decoded sentence: Diagnosis Code (ICD) 
-
Input sentence: UIBQnOSIS Description
GT sentence: Diagnosis Description

Decoded sentence: RIB LOS St Description 
-
Input sentence: Prooodure Code
GT sentence: Procedure Code

Decoded sentence: Procedure Code 
-
Input sentence: Procedure Description
GT sentence: Procedure Description

Decoded sentence: Procedure Description 
-
Input sentence: Has the patient been treated for the same or a similar condition by another physician in the past? [:1 Yes X31“:
GT sentence: Has the patient been treated for the same or a similar condition by another physician in the past? Yes No

Decoded sentence: Has the patient been treated for the same or the same same condition by another physician in the pas 
-
Input sentence: It yes, please pr

-
Input sentence: Provlder: FAYETI'E COMMUNITY HOSPI Employee: 0911001: 010110 1105” Provldor Nu: Member No: Pal. Acct No: l 
GT sentence: Provider: FAYETTE COMMUNITY HOSPI Employee Member No: Patient Pat Acct No: Claim No:

Decoded sentence: Provider FAYETENURI Employee No Ins repl:y0911: Employee No Member No Medical:A091 
-
Input sentence: lREFnarks Description 
GT sentence: Services Description

Decoded sentence: Remarks Description a 
-
Input sentence: lREFnarks Description 
GT sentence: Remarks Description

Decoded sentence: Remarks Description a 
-
Input sentence: 1 Processed at [he Tier 1 Contracted Rate 
GT sentence: 1 Processed at the Tier 1 Contracted Rate

Decoded sentence: a Processed at the Tier a Contrate Rate a 
-
Input sentence: I" "" Ann—1.101 Hagan Amount Satisﬁed 130mm Period "Wm—I 
GT sentence: Annual Amount Amount Satisfied Benefits Period

Decoded sentence: A"n""l Am—1.101mount Satisfied mm P130ided Werk"P""d 
-
Input sentence: TERWamny nmctmle __""'"‘"‘”"“ " 
GT

-
Input sentence: To assist in the evaluation or administration of my claim(s), I authorize Unum Group, its subsidiaries and duly authorized representatives (“Unum") to share personal health and ﬁnancial information relating to my ciaim with the family members, friends. andfor other third parties listed below: 
GT sentence: To assist in the evaluation or administration of my claim(s), I authorize Unum Group, its subsidiaries and duly authorized representatives ("Unum") to share personal health and financial information relating to my claim with the family members, friends, and/or other third parties listed below:

Decoded sentence: To as sist in the vanity in my claims Indinistration of my claim s Group Group and subsidiary 
-
Input sentence: My Spouse: f"— 
GT sentence: My Spouse: 

Decoded sentence: My Spouse to 
-
Input sentence: (Name) ._ (Telephone Number) 
GT sentence: (Name)  (Telephone Number)

Decoded sentence: Name Telephone Number a 
-
Input sentence: Other Family Member: 
G

-
Input sentence: impression 
GT sentence: Impression

Decoded sentence: Impression 
-
Input sentence: 1. Horizontal flap tear through the posterior medial meniscai horn. Separate focal radial tear within the posterior horn. Another separate complex tear at the junction of the posterior medial meniscai horn and root. No displaced meniscal fragments identiﬁed. 
GT sentence: 1. Horizontal flap tear through the posterior medial meniscal horn. Separate focal radial tear within the posterior horn. Another separate complex tear at the junction of the posterior medial meniscal horn and root. No displaced meniscal fragments identified.

Decoded sentence: of Horrizal for the posterior medial meniscal men1.cal meniscal meniscal are for a the rear withers 
-
Input sentence: 2. 0.3 cm extrusion of the medial meniscal body. 
GT sentence: 2. 0.3 cm extrusion of the medial meniscal body.

Decoded sentence: of 0.3 cmextression of the medical bodisial by.d2 
-
Input sentence: 3. Mild medial and patello

-
Input sentence: Provld-or: GEORGE STONE M0 
GT sentence: Provider: GEORGE STONE MD

Decoded sentence: Provid-r:GEORGE STONE MD 
-
Input sentence: En'lployee: 
GT sentence: Employee:

Decoded sentence: Employee a 
-
Input sentence: PaiTiEMWWTM“
GT sentence: Patient: 

Decoded sentence: STATEMENT 
-
Input sentence: Claim No: ' _" 
GT sentence: Claim No:

Decoded sentence: Claim Not 
-
Input sentence: Provider Me:  
GT sentence: Provider No:

Decoded sentence: Provider Me 
-
Input sentence: Member Ne: 
GT sentence: Member No:

Decoded sentence: Member name a 
-
Input sentence: Pat Acct No: 
GT sentence: Pat Acct No:

Decoded sentence: Pat Acct Not 
-
Input sentence: Service Date of Charged I MRI 
GT sentence: Services Description MRI

Decoded sentence: Service Date of Charged 
-
Input sentence: Description Service Ameung , Porgy} . 02:15:18 
GT sentence: Date of Service 02/15/18

Decoded sentence: Description Service Amount a Port a .02:15:1 
-
Input sentence: Claim Totals _ 
GT sentenc

-
Input sentence: So that Unum ma evaluate and administer my claims, includin providin assistance with return to work. For such eva uation and administration of claims, this authorize ion is valid or two years, or the duration of my claim for beneﬁts (to include any subsi/cltuent ﬁnanCial management and/or beneﬁt recovery review), whic ever is shorter. i understand the once y Information is disc csed to Unum, any privacy rotections established by/IHIPAA may not apply to the information. but other privacy laws continue to apply. num may y then disclose Informaticn only as permitted by law, including. state fraud reporting laws or as authorized by me. 
GT sentence: So that Unum may evaluate and administer my claims, including providing assistance with return to work. For such evaluation and administration of claims, this authorization is valid or two years, or the duration of my claim for benefits (to include any subsequent financial management and/or benefit recovery review), whichever 

-

Decoded sentence: Frequent From cont contre to appar to paper on the following:to appear on this claim for and fire 
-
Input sentence: Any person who knowingly and with the intent to injure. defraud or deceive an insurance company presents a false or fraudulent claim for payment of a loss or beneﬁt or knowingly presents false information in an application for insurance is guilty of a crime and may be subject to ﬁnes and confinement in prison. 
GT sentence: Any person who knowingly and with the intent to injure, defraud or deceive an insurance company presents a false or fraudulent claim for payment of a loss or benefit or knowingly presents false information in an application for insurance is guilty of a crime and may be subject to fines and confinement in prison.

Decoded sentence: Any person ho knownscinglan dang withen to injured and withen to insurance an normande company park 
-

Decoded sentence: Are the warking For your procine,to appeal to paper on the request on the request

-
Input sentence: December 6, 2017 
GT sentence: December 6, 2017

Decoded sentence: December of 2017 
-
Input sentence: Conﬁrmation of Coverage 
GT sentence: Confirmation of Coverage

Decoded sentence: Confirmation of Coverage 
-
Input sentence: Employer: 
GT sentence: Employer:

Decoded sentence: Employers 
-
Input sentence: Group Policy #: 
GT sentence: Group Policy #:

Decoded sentence: Group Policy of 
-
Input sentence: Customer Policy #: 
GT sentence: Customer Policy #:

Decoded sentence: Customer Policy of 
-
Input sentence: EE Name: 
GT sentence: EE Name:

Decoded sentence: EE Name a 
-
Input sentence: The information below is provided to give you a general summary of your coverage and premium consistent with the benefits outlined in your certiﬁcate. 
GT sentence: The information below is provided to give you a general summary of your coverage and premium consistent with the benefits outlined in your certificate.

Decoded sentence: The above aition beloved to giver to giver to 

-
Input sentence: If related to a burn. please Indicate the degree: I3 First degree III Second degree-percent of body burned % or square Inches of body surface burned El Third degree-percent of body burned % or square inches of body surface burned 
GT sentence: If related to a burn, please indicate the degree: First degree Second degree-percent of body burned % or square Inches of body surface burned Third degree-percent of body burned % or square inches of body surface burned

Decoded sentence: If related to a frequent dedicate the degree In grease indicate the.degreent degrees of bode of bod 
-
Input sentence: MRI Yes El No Date: (mmlddlyy) I 11?, 
GT sentence: MRI Yes No Date: (mm/dd/yy)

Decoded sentence: MRI Yes No Date Duty mmdd)yy11 
-
Input sentence: Is the patient's condition related to his/her employment? El Yes “El No El Unknown 
GT sentence: Is the patient's condition related to his/her employment? Yes No Unknown

Decoded sentence: Is the patients condition related to highe

-
Input sentence: TWIN CITIES ORTHOPEDICS PA  
GT sentence: TWIN CITIES ORTHOPEDICS PA

Decoded sentence: TWIN CITIES ORTHOPEDICS PA 
-
Input sentence: St. Croix Orthopaedics and Twin Cities Orthopedics integrated on Oi/O‘l/ZO‘I 5. All billing for any date of service on or after 07/011201 5 wiil be biiled under the Twin Cities Or‘thmpedics practice name. Pieahe visit our website at www.1‘c0mn.com for onetime bin payments. 
GT sentence: St. Croix Orthopaedics and Twin Cities Orthopedics integrated on 07/01/2015. All billing for any date of service on or after 07/01/2015 will be billed under the Twin Cities Orthopedics practice name. Please visit our website at www.TCOmn.com for on-line bill payments.

Decoded sentence: St Croix Orthopedics and Tind Linet cand To the radius instegnance on Or ll ed for and filling for 
-
Input sentence: SEE REVERSE SIDE FOR IMPORTANT BILLING I FORMATION 
GT sentence: SEE REVERSE SIDE FOR IMPORTANT BILLING INFORMATION

Decoded sentence: SEE REVERSE SIDE FO

-
Input sentence: Information authorized for use or disclosure may include information which may indicate the presence of a communicable or non-communicable disease. 
GT sentence: Information authorized for use or disclosure may include information which may indicate the presence of a communicable or non-communicable disease.

Decoded sentence: Information authorized for disclose may include for diplon to hishoral which may indicate the present 
-
Input sentence: If I do not sign this authorization or if I alter or revoke it, except as specified above, Unum may not be able to evaluate or administer my claim(s), which may lead to my claim(s) being denied. I may revoke this authorization at any time by sending written notice to the address above. I understand that revocation will not apply to any information that Unum requests or discloses prior to Unum receiving my revocation request. 
GT sentence: If I do not sign this authorization or if I alter or revoke it, except as specified above

-
Input sentence: Date of Surgery (mmiddlyy) . El Inpatient X? Outpatient (choose one) 
GT sentence: Date of Surgery (mm/dd/yy) Inpatient Outpatient (choose one)

Decoded sentence: Date of Surgery (mmddyy) rd out Inpatient Chosen choose on 
-
Input sentence: Surgical rocedure CPT Code: 
GT sentence: Surgical Procedure CPT Code:

Decoded sentence: Surgical Procedure CPT Code a 
-
Input sentence: FRAUD N TICE: Any person who knowingly files a statement of claim containing faise or misleading information issub'ect to criminal and civil penalties. This includes Attending Physician portions of the claim form. 
GT sentence: FRAUD NOTICE: Any person who knowingly files a statement of claim containing false or misleading information is subject to criminal and civil penalties. This includes Attending Physician portions of the claim form.

Decoded sentence: FAR Date Any person the resort of claim contain cont no the:resoning farse or mis leading ing inglu 
-
Input sentence: C. Signature of Atten

-
Input sentence: Electronically Signed Indicator: Yes
GT sentence: Electronically Signed Indicator: Yes

Decoded sentence: Electronically Signed Indicator Yes 
-
Input sentence: Electronically S igncd Date: Monday, 03’12/2018 01:26 PM
GT sentence: Electronically Signed Date: Monday, 03/12/2018 01:26 PM

Decoded sentence: Electronically Signed Date:Monday,D03e12/20180 P 
-
Input sentence: (1 ° ‘ 
GT sentence: unum

Decoded sentence: sum 
-
Input sentence: November 30, 2016 
GT sentence: November 30, 2016

Decoded sentence: November 30,2016 
-
Input sentence: Confirmation of Coverage 
GT sentence: Confirmation of Coverage

Decoded sentence: Confirmation of Coverage 
-
Input sentence: Employer: 
GT sentence: Employer:

Decoded sentence: Employers 
-
Input sentence: Group Policy #: 
GT sentence: Group Policy #:

Decoded sentence: Group Policy of 
-
Input sentence: Customer Policy #: 
GT sentence: Customer Policy #:

Decoded sentence: Customer Policy of 
-
Input sentence: EE Name: 
GT sent

-
Input sentence: EALHJ' 11.93 Type , [‘UlAI Ry
GT sentence: Earnings Type: Hourly

Decoded sentence: Earning11.93e medico PAIN R 
-
Input sentence: Earn1ng€ Mnda: Monthly
GT sentence: Earnings Mode: Monthly

Decoded sentence: earning Midan:ationshly 
-
Input sentence: ﬁftsr Tax: 0.069
GT sentence: After Tax: 0.000

Decoded sentence: After Tax 0.069 
-
Input sentence: Repart Group' ?6
GT sentence: Report Group: 26

Decoded sentence: Repart Group a 
-
Input sentence: Product: Short Term Blsab111ty
GT sentence: Product: Short Term Disability

Decoded sentence: Products Short Term Bisab111y 
-
Input sentence: Product Type: £53
GT sentence: Product Type: ASO

Decoded sentence: Product Type And 
-
Input sentence: Funding: Self insured
GT sentence: Funding: Self Insured

Decoded sentence: Funding Self Insured 
-
Input sentence: Staff: Flhh. ND
GT sentence: State Plan: No

Decoded sentence: State:Plain.ND 
-
Input sentence: Employee Coverage: res
GT sentence: Employee Coverage: Yes

Decoded s

-
Input sentence: tr Ego: Physioan 
GT sentence: Author Type: Physician

Decoded sentence: Author:Type Physician 
-
Input sentence: Flea; 352052518 8:36 AM Daletlucs 3.."16-"2018 5:08 PM 
GT sentence: Filed: 3/20/2018 8:36 AM Date of Service: 3/16/2018 5:08 PM

Decoded sentence: a Ma;d352052518 8:36 AMD Dast Cale3.."16-"20185: ;M35 
-
Input sentence: Q'. .2 Signed 
GT sentence: Status: Signed

Decoded sentence: St.t.2Signed Insured Pain 
-
Input sentence: Edsur Lmtm_JomiJ.MDcPhysdanJ 
GT sentence: Editor: Larkin, John J. MD (Physician)

Decoded sentence: Editor Part om Did J Physician 
-
Input sentence: DATE CIF OPERATION: ﬂ3f16f2018 
GT sentence: DATE OF OPERATION: 03/16/2018

Decoded sentence: DATE CIF OPERATION 316 2018 
-
Input sentence: PREOPERATIVE DIAGNOSIS: Medial meniscus tear, left knea. 
GT sentence: PREOPERATIVE DIAGNOSIS: Medial meniscus tear, left knee.

Decoded sentence: PREOPERATIVE DIAGNOSIS Medial meniscus tears left knee 
-
Input sentence: POSTDPERATIVE DIAGNOSIS: Co

-
Input sentence: Service Area ST. ELIEABETH SERVJCE AREA 
GT sentence: Service Area ST. ELIZABETH SERVICE AREA

Decoded sentence: Service Area ST ELE BREATHY CREATH AREA 
-
Input sentence: EDG SC CRESTWEW HELL-S 
GT sentence: EDG SC CRESTVIEW HILLS

Decoded sentence: EDGE SC CRESTVIEW HILLS 
-
Input sentence: Roomi‘Eed EDGSCCEEDGSCC 
GT sentence: Room/Bed EDGSCC/EDGSCC

Decoded sentence: roomed EDGSCCEDGSCC a 
-
Input sentence: Admission Status Discharged {Confirmedi 
GT sentence: Admission Status Discharged (Confirmed)

Decoded sentence: Admission Status Discharged Confirmed 
-
Input sentence: Hospital Account 
GT sentence: HOSE. ital Account

Decoded sentence: Hospital Account a 
-
Input sentence: Primary 
GT sentence: Primary

Decoded sentence: Primary a 
-
Input sentence: Name 
GT sentence: Name

Decoded sentence: Name a 
-
Input sentence: Acct ID 
GT sentence: Acct ID

Decoded sentence: Acct ID a 
-
Input sentence: Class Same Day Surgery 
GT sentence: Class Same Day Surgery

Deco

-
Input sentence: If yes. please pmvida iraﬁlmanl dates (mnﬁddfyy): From Through 
GT sentence: If yes, please provide treatment dates (mm/dd/yy): From Through

Decoded sentence: If yes please previde and dates mmd(yy From through Type Coverage Amid Coming a 
-
Input sentence: Is the pelican-Li's condition work reiaiad‘? El Yes No C] Unknown 
GT sentence: Is the patient’s condition work related?  Yes No Unknown

Decoded sentence: Is the patient-List condition work relative Yes No Unknown 
-
Input sentence: Patient’s Height: 
GT sentence: Patient’s Height:

Decoded sentence: patients Height a 
-
Input sentence: Paliant’s Weighi 
GT sentence: Patient's Weight:

Decoded sentence: patients Weight a 
-
Input sentence: Primary Dlﬁgnosis: 
GT sentence: Primary Diagnosis:

Decoded sentence: Primary Diagnosis a 
-
Input sentence: Primary [CD Code: 
GT sentence: Primary ICD Code:

Decoded sentence: Primary BCD Code a 
-
Input sentence: Secondary Diagnosis: 
GT sentence: Secondary Diagnosis:

Deco

-
Input sentence: l. Doscrlbo any relevant modlonl fools relntnd to tho common tor which the pollen! sank: law. (any include symptoms. ”metﬂmmwm zoomwww tmwmmg WA mEﬂﬂLﬂeﬂlﬁM, 5 #51252 as -/# var {JEFF gym Hamlet» (3 Eﬂﬂé ”7.544;, m our of Pail-M7 WW! M 51:: #:5111910 
GT sentence: 1. Describe any relevant medical facts related to the condition for which the patient seeks leave (may include symptoms, diagnosis, or any regimen of continuing treatment such as the use of specialized equipment). In CT, do not disclose diagnosis without patient’s consent:

Decoded sentence: In Descrive to the completed to tho common dot the common OND tho complete on the mala any muddy 
-
Input sentence: 2. la the medlaal downton prognnnnyﬂ __Yos ling. tf yon. ammo omenrdm: 
GT sentence: 2. Is the medical condition pregnancy? Yes No. If yes, expected delivery date:

Decoded sentence: of Is the medial downtonal a dress ling to yog.a2. ling to roundrime 
-
Input sentence: a. Appnoxlmale um lymptomllmadlcal ma

-
Input sentence: 13. Answer the tottowlng questions for en lurerml'ﬂanl leave or at reduced work schedufe. 
GT sentence: 13. Answer the following questions for an intermittent leave or a reduced work schedule.

Decoded sentence: 13. Answer the total howling questinal leave and 13. a reduced work deduced work schedule 
-
Input sentence: a, It! It madloelly nawnary for the patient to he oFl' work due to tnoclc ﬂora-ups on an Interrntttent boots or to work teal than the petlent‘u normal work echeduia? You a 
GT sentence: a. Is it medically necessary for the patient to be off work due to episodic flare-ups on an intermittent basis or to work less than the patient's normal work schedule? Yes No

Decoded sentence: In the Indoors Care for the patient to the patient to the patient to the res on an Internttronttatt 
-
Input sentence: If "Year", pruvt'de an outline!“ frequency and duration betow: 
GT sentence: If "Yes", provide an estimated frequency and duration below:

Decoded sentence: If "r

-
Input sentence: g g . SHORT TERM DISABILITY CLAIM FORM 
GT sentence: SHORT TERM DISABILITY CLAIM FORM

Decoded sentence: ART SHORE TER DISABILITY CATIE FORM a 
-
Input sentence: w The Beneﬁts Cantor 
GT sentence: The Benefits Center

Decoded sentence: The Benefits Center 
-
Input sentence: Call tolH‘ree Monday through Friday. 8 am. to 8 pm (Eastern Time) 
GT sentence: Call toll-free Monday through Friday, 8 a.m. to 8 p.m. Eastern Time.

Decoded sentence: Call tollfree Monday through Friday am to a pm Eastern Time 
-
Input sentence: ananomo PHYSICIAN STATEMENT (continued) 
GT sentence: ATTENDING PHYSICIAN STATEMENT (Continued)

Decoded sentence: ATTENDING PHYSICIAN STATEMENT continued 
-
Input sentence: Patient Name (Last Name, Final Name, MI, Sufﬁx) 
GT sentence: Patient’s Name (Last Name, First Name, MI. Suffix)

Decoded sentence: Patient Name last Name single Name same Mufti 
-
Input sentence: Date of Birth [mmrddiyﬁ 
GT sentence: Date of Birth (mm/dd/yy)

Decoded sentence: Date of

-
Input sentence: To disclose information, whether from before, during or after the date of this authorization, about my health, including HIV, AIDS or other disorders of the immune system, use of drugs or alcohol, mental or physical histor , condition, advice or treatment (except this authorization does not authorize release of psychotherapy notesi, prescription drug history, earnings, financial or credit history, professional licenses, employment history, insurance claims and benefits, and all other claims and benefits, including Social Security claims and benefits (“My Information”); 
GT sentence: To disclose information, whether from before, during or after the date of this authorization, about my health, including HIV, AIDS or other disorders of the immune system, use of drugs or alcohol, mental or physical history, condition, advice or treatment (except this authorization does not authorize release of psychotherapy notes prescription drug history, earnings, financial or credit hi

-
Input sentence: Accident Description r jump oﬂ‘the bottom stair and hit his head on the corner ot‘the ceﬂjng. 
GT sentence: Accident Description: jump off the bottom stair and hit his head on the corner of the ceiling.

Decoded sentence: Accident Description hemp or mus of the bottom stair and hit his head on the celling the ceiling 
-
Input sentence: Accident Work Related: No 
GT sentence: Accident Work Related: No

Decoded sentence: Accident Work Related No a 
-
Input sentence: Time ofAccident: 1:30 pm 
GT sentence: Time of Accident: 1:30 pm

Decoded sentence: Time of Acciden:1:30 
-
Input sentence: Accident Date: 02/] 1 52018 
GT sentence: Accident Date: 02/11/2018

Decoded sentence: Accident Date 02/]1 52018 
-
Input sentence: Diagnosis Code: Concussion 
GT sentence: Diagnosis Code: Concussion

Decoded sentence: Diagnosis Code Concussion 
-
Input sentence: SII l‘g Ei')’ Information 
GT sentence: Surgery Information

Decoded sentence: Surgery information a 
-
Input sentence: 15 Su

-
Input sentence: Information authorized for use or disclosure may include information which may indicate the presence of a communicable or non-communicable disease. 
GT sentence: Information authorized for use or disclosure may include information which may indicate the presence of a communicable or non-communicable disease.

Decoded sentence: Information authorized for disclose may include for diplon to hishoral which may indicate the present 
-
Input sentence: If I do not sign this authorization or if I alter or revoke it, except as specified above, Unum may not be able to evaluate or administer my claim(s), which may lead to my claim(s) being denied. I may revoke this authorization at any time by sending written notice to the address above. I understand that revocation will not apply to any information that Unum requests or discloses prior to Unum receiving my revocation request. 
GT sentence: If I do not sign this authorization or if I alter or revoke it, except as specified above

-
Input sentence: Blood pressure (l) 98MB. pulse (l) 100, temperature 97.5 °F, temperature source Tympanic, height 5' 4.5", weight 120 lb, Sp02100 %. 
GT sentence: Blood pressure (1) 98/48, pulse (1) 100, temperature 97.6 °F, temperature source Tympanic, height 5' 4.5", weight 120 lb, SpO2 100 %.

Decoded sentence: Best Service my )98 a a Ore(e)n100, or my the pervert tre tr(e)T98pa.ic hel(e)a100,height a 
-
Input sentence: Physical Exam 
GT sentence: Physical Exam

Decoded sentence: Physical Exam 
-
Input sentence: Constitutional: He appears well-developed. He is active. 
GT sentence: Constitutional: He appears well-developed. He is active.

Decoded sentence: Constitutional He appears well-developed. He is active 
-
Input sentence: HENT: 
GT sentence: HENT:

Decoded sentence: HENTY a 
-
Input sentence: Right Ear: Tympanic membrane normal. 
GT sentence: Right Ear: Tympanic membrane normal.

Decoded sentence: Right Ears Tympanic membrane normal 
-
Input sentence: Nose: Nase normal. 
GT 

-
Input sentence: We provide free services to help you communicate with us. Such as, letters in other languages or large print. Or, you can ask for an interpreter. To ask for help, please call 888-249-6365. 
GT sentence: We provide free services to help you communicate with us. Such as, letters in other languages or large print. Or, you can ask for an interpreter. To ask for help, please call 888-249-6365.

Decoded sentence: Wervores tree sect you communicate with USS not communis other las letter languages ANG wages or lan 
-
Input sentence: ATENCION: 5i habla espaﬁol (Spanish), hay servicios de asistencia de idiomas, sin cargo, a su disposicion. Llarne al 888-249-6355. 
GT sentence: ATENCIÓN: Si habla español (Spanish), hay servicios de asistencia de idiomas, sin cargo, a su disposición. Llame al 888-249-6365.

Decoded sentence: ATENCCIO:5ilbal espanist( dedivi),s and divan services medial see did Menis argo),as surgery gastro 
-
Input sentence: aﬁi‘i‘ﬁn ﬁuz‘lvliéiéﬁﬁhﬁt‘ (Chinese), 

-
Input sentence: I: I— 
GT sentence: ME

Decoded sentence: Date 
-
Input sentence: Med Express 
GT sentence: MedExpress

Decoded sentence: MedExpreass a 
-
Input sentence: Patient Name: 
GT sentence: Patient Name:

Decoded sentence: Patient Name a 
-
Input sentence: Patient DOB: 
GT sentence: Patient DOB:

Decoded sentence: Patient DOBE 
-
Input sentence: Date ofVisit: February II 2018 
GT sentence: Date of Visit: February 11 2018

Decoded sentence: Date of Visit Ferry IZ 2018 
-
Input sentence: Seen By: Vijay Patel, MD 
GT sentence: Seen By: Vijay Patel, MD

Decoded sentence: Seen By Vijao Patel MD a 
-
Input sentence: Location: MedExprcss Jackson, N West Ave 
GT sentence: Location: MedExpress Jackson, N West Ave

Decoded sentence: Location MedExproxprachsst same Avert 
-
Input sentence: 1325 North West Avenue 
GT sentence: 1325 North West Avenue

Decoded sentence: 1325Not West Avenue a 
-
Input sentence: Jackson: MI 49202—2050 
GT sentence: Jackson, MI 49202-2050

Decoded sentence: 

-
Input sentence: Medical Provider Specially. Podiﬂll'ibl 
GT sentence: Medical Provider Specially: Podiatrist

Decoded sentence: Medical Provider Specialty Podicible 
-
Input sentence: Medical Provider Roles: [rearing 
GT sentence: Medical Provider Roles: Treating

Decoded sentence: Medical Provider Roles rearing a 
-
Input sentence: Provider First Name: Ryan 
GT sentence: Provider First Name: Ryan

Decoded sentence: Provider First Name Ryan 
-
Input sentence: Pl'mider la st Name: Kish 
GT sentence: Provider Last Name: Kish

Decoded sentence: Provider Last Name:Kish a 
-
Input sentence: Address Line 1: 3905. W’est Sylvania Ave. 
GT sentence: Address Line 1: 3905 West Sylvania Ave.

Decoded sentence: Address Line of 3905.eldania Sellate a 
-
Input sentence: City: Toledo 
GT sentence: City: Toledo

Decoded sentence: City Toled a 
-
Input sentence: Stater'Proxince: OH 
GT sentence: State/Province: OH

Decoded sentence: StateProvince:HOH 
-
Input sentence: Claim Tji'pe: VB Accident - Acci

-
Input sentence: Electronically Signed Date: Monday, 03:"12/2018 08:04 AM 
GT sentence: Electronically Signed Date: Monday, 03/12/2018 08:04 AM

Decoded sentence: Electronically Signed Date Monday 03:"12/20180 a 
-
Input sentence: Spou s 0 Information 
GT sentence: Spouse Information

Decoded sentence: Spouse information 
-
Input sentence: First Name: 
GT sentence: First Name:

Decoded sentence: First Name a 
-
Input sentence: Middle Name/initial: 
GT sentence: Middle Name/Initial:

Decoded sentence: Middle Name/Initial: a 
-
Input sentence: Last Name: 
GT sentence: Last Name:

Decoded sentence: Last Name a 
-
Input sentence: Social Security Number: 
GT sentence: Social Security Number:

Decoded sentence: Social Security Number a 
-
Input sentence: Birth Date: 
GT sentence: Birth Date:

Decoded sentence: Birth Date a 
-
Input sentence: Gender: 
GT sentence: Gender:

Decoded sentence: Gender a 
-
Input sentence: Claim Event Information 
GT sentence: Claim Event Information

Decoded sen

-
Input sentence: Address Line 1 2142 N Cove Blvd  
GT sentence: Address Line 1 : 2142 N Cove Blvd

Decoded sentence: Address Line a 2142 Cover Give a 
-
Input sentence: C ity. OH 
GT sentence: City: OH

Decoded sentence: City.OH NAME 
-
Input sentence: S lute riProx-inee: 
GT sentence: State/Province: OH

Decoded sentence: Saluter province a 
-
Input sentence: Postal Code: 43606 
GT sentence: Postal Code: 43606

Decoded sentence: Postal Code 43606 a 
-
Input sentence: (10th US 
GT sentence: Country: US

Decoded sentence: (10 US S 
-
Input sentence: Date ofVisit/A d‘miss ion 03/09/2018 
GT sentence: Date of Visit/Admission: 03/09/2018

Decoded sentence: Date of Visi/Admiss on 03/09/201 
-
Input sentence: Date ofDiseharge: 03/09/2018 
GT sentence: Date of Discharge: 03/09/2018

Decoded sentence: Date of Discharg:03/09/2018 
-
Input sentence: Procedure: ER visit, me 
GT sentence: Procedure: ER visit, Xray

Decoded sentence: Procedure ER visit,Number a 
-
Input sentence: Emplomi ent In fo

-
Input sentence: To Unum Group and its subsidiaries, Unum Life Insurance Company ofAmerica, Provident Life and Accident Insurance Compan , The Paul Revere Life Insurance Company, and persons who evaluate claims for any of those companies (“ num"); 
GT sentence: To Unum Group and its subsidiaries, Unum Life Insurance Company of America, Provident Life and Accident Insurance Company, The Paul Revere Life Insurance Company, and persons who evaluate claims for any of those companies (“Unum”);

Decoded sentence: To Unum Group and its subsidiaries Unum Life Insurance Complice Insurance Complete,and Accident Con 
-
Input sentence: So that Unum may evaluate and administer my claims, including providing assistance with return to work. For such evaluation and administration of claims, this authorization is valid for two years, or the duration of my claim for benefits, whichever is shorter. I understand that once My Information is disclosed to Unum, any privacy rotections established by HIPAA ma

-
Input sentence: 15 Surgery Required: No 
GT sentence: Is Surgery Required: No

Decoded sentence: of Surgery Required No a 
-
Input sentence: Medical Proxitler Information , Physician 
GT sentence: Medical Provider Information - Physician

Decoded sentence: Medical Provider Information Physician 
-
Input sentence: Medical Provider Specially. Orthopedic Surgeon 
GT sentence: Medical Provider Specially: Orthopedic Surgeon

Decoded sentence: Medical Provider Specialty Orthopedic Surgeon a 
-
Input sentence: unum‘ 
GT sentence: unum

Decoded sentence: unum 
-
Input sentence: June 19, 2012 
GT sentence: June 19, 2012

Decoded sentence: June 19, 2012 
-
Input sentence: Conﬁrmation of Coveraoe 
GT sentence: Confirmation of Coverage

Decoded sentence: Confirmation of Coverage 
-
Input sentence: The information below is provided to give you a general summary of your coverage and premium consistent with the benefits outlined in your certiﬁcate. 
GT sentence: The information below is provided to

In [18]:
WER_spell_correction = calculate_WER(gt_texts, decoded_sentences)
print('WER_spell_correction |TEST= ', WER_spell_correction)

WER_spell_correction |TEST=  0.130755891233


In [19]:
WER_OCR = calculate_WER(gt_texts, input_texts)
print('WER_OCR |TEST= ', WER_OCR)

WER_OCR |TEST=  0.0563287295006
