# Processing PDF's into csv files of OCR results

In [1]:
import numpy as np
import pandas as pd
import sys
import os
import pytesseract                            # API for letting python interface with Google's tesseract OCR software

import cv2                                    # Open Computer Vision library
import PyPDF2                                 # All things PDF format related
import io                                     # Something about messing with memory
from wand.image import Image                  # For messing with images
from PIL import Image as Im                   # Likewise images
import codecs                                 # Unknown

## 1. Convert all PDF's to single image files in "processed_images" directory

In [2]:
# Alternative process that uses command-line tool pdftocairo directly
def pdf_reader_cairo(filename):
    os.system("pdftocairo -r 300 -png ./CH_records/" + filename + ".pdf ./working/converted/" + filename)
    return("Converted " + filename + " to png")

In [3]:
# Get a list of all of the pdf files in the directory "CH_records"
files = [filename.split(".")[0] for filename in os.listdir("./CH_records") if ".pdf" in filename]

In [4]:
# Optional, do just two for testing purposes
#files = files[0:2]

### FOR REPAIRING DAMAGED PDF's, command-line tool from poppler-utils

There's been a lot of fiddling to get this process working.  First, some of the pdf's are damaged, possibly during zip file compression - they lack metadata in an xref table, they may never have had it if they're just scanned pages.  To deal with this the commandline tool "pdftocairo" from poppler-utils is used to create a repaired version that the python library PyPDF2 will process without complaint.

Second problem was that the resources the ImageMagick program that the image libraries call were assigned to be too low, this was solved by manually editing the file at /etc/ImageMagick-6/policy.xml to give it more memory for nearly everything (alternatively, one can process the pages individually with pdf_reader2 above and reduce the resolution in the Image command, but I didn't want to sacrifice image quality).

In [5]:
# The loop that actually triggers the process.
# Final images are dumped into the "working" directory
# with the original filename

for filename in files:
    print(pdf_reader_cairo(filename))

Converted 01804186 to png
Converted 06034603 to png
Converted 01002610 to png
Converted 3459907 to png
Converted 00542515 to png
Converted 02959325 to png
Converted 01539777 to png
Converted 1983517 to png
Converted 02266230 to png
Converted 983951 to png
Converted 09457025 to png
Converted 868273 to png
Converted 01369166 to png
Converted 00030177 to png
Converted 5508774 to png
Converted 00002404 to png
Converted 01370175 to png
Converted 00782931 to png
Converted 3824626 to png
Converted 00468115 to png
Converted 01337451 to png
Converted 04802747 to png
Converted 04558828 to png
Converted 02430955 to png
Converted 00983951 to png
Converted 00053475 to png
Converted 2303730 to png
Converted 00477955 to png
Converted 06005142 to png
Converted 02714555 to png
Converted 02245999 to png
Converted 00553535 to png
Converted 02582534 to png
Converted 3387163 to png
Converted 04860660 to png
Converted 03293902 to png
Converted 00178090 to png
Converted 2765595 to png


## 2. Apply pre-processing to every image

In [6]:
# Read in images, greyscale, apply filters, save
def pre_process(filename):
    
    #find all of the converted pages
    png_files = [pngname for pngname in os.listdir("./working/converted") if filename in pngname]
    
    for pngname in png_files:
        
        # Read in as greyscale
        concatenated = cv2.imread('./working/converted/'+pngname, 0)
    
        # Threshold image to black/white (threshold = 127 I presume)
        num, grey_composite = cv2.threshold(concatenated, 127, 255, cv2.THRESH_BINARY)
    
        # inverting the image for morphological operations
        inverted_composite = 255-grey_composite
    
        # Perform closing, dilation followed by erosion
        kernel = np.ones((2,2), np.uint8) 
        closed_composite = cv2.morphologyEx(inverted_composite, cv2.MORPH_CLOSE, kernel)
        
        # Undo inversion
        closed_composite = 255-closed_composite
        
        # Write to file ready for OCR
        cv2.imwrite("./working/preprocessed/"+pngname, closed_composite)
        
    print("Image pre-processing complete for " + filename)

    return(1)
    
#    except Exception as e:
#        print("Failed to preprocess " + filename + "_concatenated.png")
#        print(e)
#       
#        return(0)

In [7]:
# Pre-process all of the concatenated files
for each in files:
    pre_process(each)

Image pre-processing complete for 01804186
Image pre-processing complete for 06034603
Image pre-processing complete for 01002610
Image pre-processing complete for 3459907
Image pre-processing complete for 00542515
Image pre-processing complete for 02959325
Image pre-processing complete for 01539777
Image pre-processing complete for 1983517
Image pre-processing complete for 02266230
Image pre-processing complete for 983951
Image pre-processing complete for 09457025
Image pre-processing complete for 868273
Image pre-processing complete for 01369166
Image pre-processing complete for 00030177
Image pre-processing complete for 5508774
Image pre-processing complete for 00002404
Image pre-processing complete for 01370175
Image pre-processing complete for 00782931
Image pre-processing complete for 3824626
Image pre-processing complete for 00468115
Image pre-processing complete for 01337451
Image pre-processing complete for 04802747
Image pre-processing complete for 04558828
Image pre-processin

## 3.  Apply OCR to every pre-processed image

In [8]:
# THE KEY ONE, Get all of the tesseract data and location metadata
# into one convenient csv file.  I should add, I don't know how to read
# the data generated directly into a dataframe :S
for filename in files:
    
    #find all of the pre-processed pages
    png_files = [pngname for pngname in os.listdir("./working/preprocessed") if filename in pngname]
    
    for pngname in png_files:
        # Iterate through all of the pages
        f = open("./working/ocr_output/"+pngname+"._ocr_data.csv", "w")
        f.write(pytesseract.image_to_data(Im.open("./working/preprocessed/"+pngname)))
        f.close()
        print("OCR'ed " + pngname)

OCR'ed 01804186-11.png
OCR'ed 01804186-19.png
OCR'ed 01804186-22.png
OCR'ed 01804186-04.png
OCR'ed 01804186-14.png
OCR'ed 01804186-13.png
OCR'ed 01804186-01.png
OCR'ed 01804186-07.png
OCR'ed 01804186-06.png
OCR'ed 01804186-21.png
OCR'ed 01804186-23.png
OCR'ed 01804186-08.png
OCR'ed 01804186-25.png
OCR'ed 01804186-02.png
OCR'ed 01804186-17.png
OCR'ed 01804186-16.png
OCR'ed 01804186-15.png
OCR'ed 01804186-12.png
OCR'ed 01804186-20.png
OCR'ed 01804186-10.png
OCR'ed 01804186-18.png
OCR'ed 01804186-24.png
OCR'ed 01804186-03.png
OCR'ed 01804186-05.png
OCR'ed 01804186-09.png
OCR'ed 06034603-24.png
OCR'ed 06034603-16.png
OCR'ed 06034603-03.png
OCR'ed 06034603-09.png
OCR'ed 06034603-11.png
OCR'ed 06034603-07.png
OCR'ed 06034603-23.png
OCR'ed 06034603-18.png
OCR'ed 06034603-01.png
OCR'ed 06034603-08.png
OCR'ed 06034603-10.png
OCR'ed 06034603-15.png
OCR'ed 06034603-22.png
OCR'ed 06034603-05.png
OCR'ed 06034603-06.png
OCR'ed 06034603-17.png
OCR'ed 06034603-14.png
OCR'ed 06034603-21.png
OCR'ed 0603

OCR'ed 09457025-07.png
OCR'ed 09457025-11.png
OCR'ed 09457025-26.png
OCR'ed 09457025-25.png
OCR'ed 09457025-18.png
OCR'ed 09457025-14.png
OCR'ed 09457025-30.png
OCR'ed 09457025-19.png
OCR'ed 09457025-15.png
OCR'ed 09457025-20.png
OCR'ed 09457025-27.png
OCR'ed 09457025-10.png
OCR'ed 09457025-17.png
OCR'ed 09457025-28.png
OCR'ed 09457025-02.png
OCR'ed 09457025-29.png
OCR'ed 09457025-24.png
OCR'ed 868273-54.png
OCR'ed 868273-35.png
OCR'ed 868273-24.png
OCR'ed 868273-22.png
OCR'ed 868273-07.png
OCR'ed 868273-34.png
OCR'ed 868273-51.png
OCR'ed 868273-04.png
OCR'ed 868273-40.png
OCR'ed 868273-32.png
OCR'ed 868273-06.png
OCR'ed 868273-20.png
OCR'ed 868273-58.png
OCR'ed 868273-56.png
OCR'ed 868273-48.png
OCR'ed 868273-45.png
OCR'ed 868273-15.png
OCR'ed 868273-47.png
OCR'ed 868273-33.png
OCR'ed 868273-43.png
OCR'ed 868273-02.png
OCR'ed 868273-19.png
OCR'ed 868273-01.png
OCR'ed 868273-31.png
OCR'ed 868273-41.png
OCR'ed 868273-23.png
OCR'ed 868273-16.png
OCR'ed 868273-44.png
OCR'ed 868273-03.png


OCR'ed 01337451-29.png
OCR'ed 01337451-10.png
OCR'ed 01337451-07.png
OCR'ed 01337451-33.png
OCR'ed 01337451-01.png
OCR'ed 01337451-12.png
OCR'ed 01337451-14.png
OCR'ed 01337451-28.png
OCR'ed 01337451-06.png
OCR'ed 01337451-24.png
OCR'ed 01337451-23.png
OCR'ed 01337451-02.png
OCR'ed 01337451-08.png
OCR'ed 01337451-13.png
OCR'ed 01337451-26.png
OCR'ed 01337451-05.png
OCR'ed 04802747-10.png
OCR'ed 04802747-03.png
OCR'ed 04802747-07.png
OCR'ed 04802747-06.png
OCR'ed 04802747-04.png
OCR'ed 04802747-02.png
OCR'ed 04802747-08.png
OCR'ed 04802747-01.png
OCR'ed 04802747-05.png
OCR'ed 04802747-11.png
OCR'ed 04802747-09.png
OCR'ed 04558828-20.png
OCR'ed 04558828-09.png
OCR'ed 04558828-06.png
OCR'ed 04558828-19.png
OCR'ed 04558828-11.png
OCR'ed 04558828-13.png
OCR'ed 04558828-07.png
OCR'ed 04558828-10.png
OCR'ed 04558828-02.png
OCR'ed 04558828-01.png
OCR'ed 04558828-18.png
OCR'ed 04558828-14.png
OCR'ed 04558828-16.png
OCR'ed 04558828-15.png
OCR'ed 04558828-05.png
OCR'ed 04558828-12.png
OCR'ed 0455

OCR'ed 04860660-15.png
OCR'ed 04860660-17.png
OCR'ed 04860660-16.png
OCR'ed 04860660-30.png
OCR'ed 04860660-12.png
OCR'ed 04860660-24.png
OCR'ed 04860660-36.png
OCR'ed 04860660-09.png
OCR'ed 04860660-19.png
OCR'ed 04860660-29.png
OCR'ed 04860660-18.png
OCR'ed 04860660-20.png
OCR'ed 04860660-04.png
OCR'ed 04860660-23.png
OCR'ed 04860660-08.png
OCR'ed 04860660-01.png
OCR'ed 04860660-06.png
OCR'ed 04860660-32.png
OCR'ed 04860660-03.png
OCR'ed 04860660-35.png
OCR'ed 04860660-26.png
OCR'ed 04860660-27.png
OCR'ed 04860660-37.png
OCR'ed 04860660-11.png
OCR'ed 04860660-34.png
OCR'ed 04860660-31.png
OCR'ed 04860660-10.png
OCR'ed 04860660-05.png
OCR'ed 04860660-13.png
OCR'ed 04860660-14.png
OCR'ed 04860660-28.png
OCR'ed 03293902-13.png
OCR'ed 03293902-06.png
OCR'ed 03293902-16.png
OCR'ed 03293902-14.png
OCR'ed 03293902-18.png
OCR'ed 03293902-08.png
OCR'ed 03293902-17.png
OCR'ed 03293902-02.png
OCR'ed 03293902-03.png
OCR'ed 03293902-22.png
OCR'ed 03293902-10.png
OCR'ed 03293902-15.png
OCR'ed 0329

## 4. Compile OCR results for pages into per-document csv files

In [20]:
# THE KEY ONE, Get all of the tesseract data and location metadata
# into one convenient csv file.  I should add, I don't know how to read
# the data generated directly into a dataframe :S

# FAIL - YOU'VE NOT ORDERED THEM CORRECTLY
# RERUN

for filename in files:
    
    # Blank DF for data
    df_doc = pd.DataFrame()
    
    #find all of the pre-processed pages
    csv_files = sorted([csvname for csvname in os.listdir("./working/ocr_output") if filename in csvname])
    
    csv_num = 1
    
    for each in csv_files:
        
        try:
            # Reading csv is tricky, weird save format separated by spaces + tabs
            df_page = pd.read_csv("./working/ocr_output/" + each,
                                  sep=' |\t',
                                  error_bad_lines=False,
                                  engine='python')
        
            # Append csv (page) number
            df_page['csv_num'] = csv_num
        
            df_doc = df_doc.append(df_page)
            
            print("Processed "+filename+" page "+str(csv_num))
            csv_num = csv_num + 1
        
        except:
            print("Failed on "+filename+" page"+str(csv_num))
            csv_num = csv_num + 1
    
    
    df_doc.to_csv("./working/ocr_output_compiled/"+filename+".csv")
    
    

Processed 01804186 page 1
Processed 01804186 page 2
Processed 01804186 page 3
Processed 01804186 page 4
Processed 01804186 page 5
Processed 01804186 page 6
Processed 01804186 page 7
Processed 01804186 page 8
Processed 01804186 page 9
Processed 01804186 page 10
Processed 01804186 page 11
Processed 01804186 page 12
Processed 01804186 page 13
Processed 01804186 page 14
Processed 01804186 page 15
Processed 01804186 page 16
Processed 01804186 page 17
Processed 01804186 page 18
Processed 01804186 page 19
Processed 01804186 page 20
Processed 01804186 page 21
Processed 01804186 page 22
Processed 01804186 page 23
Processed 01804186 page 24
Processed 01804186 page 25


Skipping line 148: Expected 12 fields in line 148, saw 13. Error could possibly be due to quotes being ignored when a multi-char delimiter is used.
Skipping line 45: Expected 12 fields in line 45, saw 13. Error could possibly be due to quotes being ignored when a multi-char delimiter is used.


Processed 06034603 page 1
Processed 06034603 page 2
Processed 06034603 page 3
Processed 06034603 page 4
Processed 06034603 page 5
Processed 06034603 page 6
Processed 06034603 page 7
Processed 06034603 page 8
Processed 06034603 page 9
Processed 06034603 page 10
Processed 06034603 page 11
Processed 06034603 page 12
Processed 06034603 page 13
Processed 06034603 page 14
Processed 06034603 page 15
Processed 06034603 page 16
Processed 06034603 page 17
Processed 06034603 page 18
Processed 06034603 page 19
Processed 06034603 page 20
Processed 06034603 page 21
Processed 06034603 page 22
Processed 06034603 page 23
Processed 06034603 page 24
Processed 01002610 page 1
Processed 01002610 page 2
Processed 01002610 page 3
Processed 01002610 page 4
Processed 01002610 page 5
Processed 01002610 page 6
Processed 01002610 page 7
Processed 01002610 page 8
Processed 01002610 page 9
Processed 01002610 page 10
Processed 01002610 page 11
Processed 01002610 page 12
Processed 01002610 page 13
Processed 01002610 

Skipping line 193: Expected 12 fields in line 193, saw 13. Error could possibly be due to quotes being ignored when a multi-char delimiter is used.


Processed 3459907 page 6
Processed 3459907 page 7
Processed 3459907 page 8
Processed 3459907 page 9
Processed 3459907 page 10
Processed 3459907 page 11
Processed 3459907 page 12
Processed 3459907 page 13
Processed 3459907 page 14
Processed 3459907 page 15
Processed 3459907 page 16
Processed 3459907 page 17
Processed 3459907 page 18
Processed 3459907 page 19
Processed 3459907 page 20
Processed 3459907 page 21
Processed 3459907 page 22
Processed 3459907 page 23
Processed 3459907 page 24
Processed 3459907 page 25
Processed 3459907 page 26
Processed 3459907 page 27
Processed 3459907 page 28
Processed 3459907 page 29
Processed 3459907 page 30
Processed 3459907 page 31
Processed 3459907 page 32
Processed 3459907 page 33
Processed 3459907 page 34
Processed 3459907 page 35
Processed 3459907 page 36
Processed 3459907 page 37
Processed 3459907 page 38
Processed 3459907 page 39
Processed 3459907 page 40
Processed 00542515 page 1
Processed 00542515 page 2
Processed 00542515 page 3
Processed 005425

Skipping line 377: Expected 12 fields in line 377, saw 13. Error could possibly be due to quotes being ignored when a multi-char delimiter is used.


Processed 02959325 page 1
Processed 02959325 page 2
Processed 02959325 page 3
Processed 02959325 page 4
Processed 02959325 page 5
Processed 02959325 page 6
Processed 02959325 page 7
Processed 02959325 page 8
Processed 02959325 page 9
Processed 02959325 page 10
Processed 02959325 page 11
Processed 02959325 page 12
Processed 02959325 page 13
Processed 02959325 page 14
Processed 02959325 page 15
Processed 02959325 page 16
Processed 02959325 page 17
Processed 02959325 page 18
Processed 02959325 page 19
Processed 02959325 page 20
Processed 02959325 page 21
Processed 02959325 page 22
Processed 02959325 page 23
Processed 02959325 page 24
Processed 02959325 page 25
Processed 02959325 page 26
Processed 02959325 page 27
Processed 02959325 page 28
Processed 02959325 page 29


Skipping line 52: Expected 12 fields in line 52, saw 13. Error could possibly be due to quotes being ignored when a multi-char delimiter is used.


Processed 01539777 page 1
Processed 01539777 page 2
Processed 01539777 page 3
Processed 01539777 page 4
Processed 01539777 page 5
Processed 01539777 page 6
Processed 01539777 page 7
Processed 01539777 page 8
Processed 01539777 page 9
Processed 01539777 page 10
Processed 01539777 page 11
Processed 01539777 page 12
Processed 01539777 page 13
Processed 01539777 page 14
Processed 01539777 page 15
Processed 01539777 page 16
Processed 01539777 page 17
Processed 01539777 page 18
Processed 01539777 page 19
Processed 01539777 page 20
Processed 01539777 page 21
Processed 01539777 page 22
Processed 01539777 page 23
Processed 01539777 page 24
Processed 1983517 page 1
Processed 1983517 page 2
Processed 1983517 page 3
Processed 1983517 page 4
Processed 1983517 page 5
Processed 1983517 page 6
Processed 1983517 page 7
Processed 1983517 page 8
Processed 1983517 page 9
Processed 1983517 page 10
Processed 1983517 page 11
Processed 1983517 page 12
Processed 1983517 page 13
Processed 1983517 page 14
Proces

Skipping line 60: Expected 12 fields in line 60, saw 13. Error could possibly be due to quotes being ignored when a multi-char delimiter is used.
Skipping line 78: Expected 12 fields in line 78, saw 13. Error could possibly be due to quotes being ignored when a multi-char delimiter is used.


Processed 01369166 page 1
Processed 01369166 page 2
Processed 01369166 page 3
Processed 01369166 page 4
Processed 01369166 page 5
Processed 01369166 page 6
Processed 01369166 page 7
Processed 01369166 page 8
Processed 01369166 page 9
Processed 01369166 page 10
Processed 01369166 page 11
Processed 01369166 page 12
Processed 01369166 page 13
Processed 01369166 page 14
Processed 01369166 page 15
Processed 01369166 page 16
Processed 01369166 page 17
Processed 01369166 page 18
Processed 01369166 page 19
Processed 01369166 page 20
Processed 01369166 page 21
Processed 01369166 page 22
Processed 01369166 page 23
Processed 01369166 page 24
Processed 01369166 page 25
Processed 01369166 page 26
Processed 00030177 page 1
Processed 00030177 page 2
Processed 00030177 page 3
Processed 00030177 page 4
Processed 00030177 page 5
Processed 00030177 page 6
Processed 00030177 page 7
Processed 00030177 page 8
Processed 00030177 page 9
Processed 00030177 page 10
Processed 00030177 page 11
Processed 00030177 

Skipping line 469: Expected 12 fields in line 469, saw 13. Error could possibly be due to quotes being ignored when a multi-char delimiter is used.


Processed 5508774 page 1
Processed 5508774 page 2
Processed 5508774 page 3
Processed 5508774 page 4
Processed 5508774 page 5
Processed 5508774 page 6
Processed 5508774 page 7
Processed 5508774 page 8
Processed 5508774 page 9
Processed 5508774 page 10
Processed 5508774 page 11
Processed 5508774 page 12
Processed 5508774 page 13
Processed 5508774 page 14
Processed 5508774 page 15
Processed 5508774 page 16
Processed 5508774 page 17
Processed 5508774 page 18
Processed 5508774 page 19
Processed 5508774 page 20
Processed 5508774 page 21
Processed 5508774 page 22
Processed 5508774 page 23
Processed 5508774 page 24
Processed 5508774 page 25
Processed 5508774 page 26
Processed 5508774 page 27
Processed 5508774 page 28
Processed 5508774 page 29
Processed 5508774 page 30
Processed 5508774 page 31
Processed 5508774 page 32
Processed 5508774 page 33
Processed 5508774 page 34
Processed 5508774 page 35
Processed 5508774 page 36
Processed 5508774 page 37
Processed 5508774 page 38
Processed 5508774 pag

Skipping line 184: Expected 12 fields in line 184, saw 13. Error could possibly be due to quotes being ignored when a multi-char delimiter is used.


Processed 00002404 page 25
Processed 00002404 page 26
Processed 00002404 page 27
Processed 00002404 page 28
Processed 00002404 page 29
Processed 00002404 page 30
Processed 00002404 page 31
Processed 00002404 page 32
Processed 00002404 page 33
Processed 00002404 page 34
Processed 00002404 page 35
Processed 00002404 page 36
Processed 00002404 page 37
Processed 00002404 page 38
Processed 00002404 page 39
Processed 00002404 page 40
Processed 00002404 page 41
Processed 01370175 page 1
Processed 01370175 page 2
Processed 01370175 page 3
Processed 01370175 page 4
Processed 01370175 page 5
Processed 01370175 page 6
Processed 01370175 page 7
Processed 01370175 page 8
Processed 01370175 page 9
Processed 01370175 page 10
Processed 01370175 page 11
Processed 01370175 page 12
Processed 01370175 page 13
Processed 01370175 page 14
Processed 01370175 page 15
Processed 01370175 page 16
Processed 01370175 page 17
Processed 01370175 page 18
Processed 01370175 page 19
Processed 01370175 page 20
Processed 

Skipping line 372: Expected 12 fields in line 372, saw 13. Error could possibly be due to quotes being ignored when a multi-char delimiter is used.


Processed 01337451 page 24
Processed 01337451 page 25
Processed 01337451 page 26
Processed 01337451 page 27
Processed 01337451 page 28
Processed 01337451 page 29
Processed 01337451 page 30
Processed 01337451 page 31
Processed 01337451 page 32
Processed 01337451 page 33
Processed 01337451 page 34
Processed 01337451 page 35
Processed 04802747 page 1
Processed 04802747 page 2
Processed 04802747 page 3
Processed 04802747 page 4
Processed 04802747 page 5
Processed 04802747 page 6
Processed 04802747 page 7
Processed 04802747 page 8
Processed 04802747 page 9
Processed 04802747 page 10
Processed 04802747 page 11
Processed 04558828 page 1
Processed 04558828 page 2
Processed 04558828 page 3
Processed 04558828 page 4
Processed 04558828 page 5
Processed 04558828 page 6
Processed 04558828 page 7
Processed 04558828 page 8
Processed 04558828 page 9
Processed 04558828 page 10
Processed 04558828 page 11
Processed 04558828 page 12
Processed 04558828 page 13
Processed 04558828 page 14
Processed 04558828 

Skipping line 12: Expected 12 fields in line 12, saw 13. Error could possibly be due to quotes being ignored when a multi-char delimiter is used.
Skipping line 95: Expected 12 fields in line 95, saw 13. Error could possibly be due to quotes being ignored when a multi-char delimiter is used.


Processed 3387163 page 1
Processed 3387163 page 2
Processed 3387163 page 3
Processed 3387163 page 4
Processed 3387163 page 5
Processed 3387163 page 6
Processed 3387163 page 7
Processed 3387163 page 8
Processed 3387163 page 9
Processed 3387163 page 10
Processed 3387163 page 11
Processed 3387163 page 12
Processed 3387163 page 13
Processed 3387163 page 14
Processed 3387163 page 15
Processed 3387163 page 16
Processed 3387163 page 17
Processed 3387163 page 18
Processed 3387163 page 19
Processed 3387163 page 20
Processed 3387163 page 21
Processed 3387163 page 22
Processed 3387163 page 23
Processed 3387163 page 24
Processed 3387163 page 25
Processed 3387163 page 26
Processed 3387163 page 27
Processed 3387163 page 28
Processed 3387163 page 29
Processed 04860660 page 1
Processed 04860660 page 2
Processed 04860660 page 3
Processed 04860660 page 4
Processed 04860660 page 5
Processed 04860660 page 6
Processed 04860660 page 7
Processed 04860660 page 8
Processed 04860660 page 9
Processed 04860660 pa

Skipping line 113: Expected 12 fields in line 113, saw 13. Error could possibly be due to quotes being ignored when a multi-char delimiter is used.


Processed 00178090 page 1
Processed 00178090 page 2
Processed 00178090 page 3
Processed 00178090 page 4
Processed 00178090 page 5
Processed 00178090 page 6
Processed 00178090 page 7
Processed 00178090 page 8
Processed 00178090 page 9
Processed 00178090 page 10
Processed 00178090 page 11
Processed 00178090 page 12
Processed 00178090 page 13
Processed 00178090 page 14
Processed 00178090 page 15
Processed 00178090 page 16
Processed 00178090 page 17
Processed 00178090 page 18
Processed 00178090 page 19
Processed 00178090 page 20
Processed 00178090 page 21


Skipping line 494: Expected 12 fields in line 494, saw 13. Error could possibly be due to quotes being ignored when a multi-char delimiter is used.


Processed 00178090 page 22
Processed 00178090 page 23
Processed 00178090 page 24
Processed 00178090 page 25
Processed 00178090 page 26
Processed 00178090 page 27
Processed 00178090 page 28
Processed 00178090 page 29
Processed 00178090 page 30
Processed 00178090 page 31
Processed 00178090 page 32
Processed 00178090 page 33
Processed 00178090 page 34
Processed 00178090 page 35


Skipping line 89: Expected 12 fields in line 89, saw 13. Error could possibly be due to quotes being ignored when a multi-char delimiter is used.


Processed 2765595 page 1
Processed 2765595 page 2
Processed 2765595 page 3
Processed 2765595 page 4
Processed 2765595 page 5
Processed 2765595 page 6
Processed 2765595 page 7
Processed 2765595 page 8
Processed 2765595 page 9
Processed 2765595 page 10
Processed 2765595 page 11
Processed 2765595 page 12
Processed 2765595 page 13
Processed 2765595 page 14
Processed 2765595 page 15
Processed 2765595 page 16
Processed 2765595 page 17
Processed 2765595 page 18
Processed 2765595 page 19
Processed 2765595 page 20
Processed 2765595 page 21
Processed 2765595 page 22
Processed 2765595 page 23
Processed 2765595 page 24
Processed 2765595 page 25
Processed 2765595 page 26
Processed 2765595 page 27
