# Eyeglass Prescription Parsing using Google Cloud Document AI

This notebook shows how to use Google Cloud Document AI to parse an eyeglass prescription form.



## Enable Document AI

1. First enable Document AI in your project by visiting
https://console.developers.google.com/apis/api/documentai.googleapis.com/overview

2. Find out who you are running as:

In [None]:
!gcloud auth list

3. Create a service account authorization by visiting
https://console.cloud.google.com/iam-admin/serviceaccounts/create
Give this service account the Document AI Core Service Account authorization.

4. Give the above ACTIVE ACCOUNT the ability to use the service account you just created.

## Call Document AI
#### Put the Cloud Storage path of your source document in the PDF variable

In [None]:
PDF="gs://glottman-project-0/RX_samples/Example_1.gif" 

In [None]:
%%bash -s "$PDF"

PDF=$1

REGION="us"  # change to "eu" (lowercase) if the bucket is in the EU

cat <<EOM > request.json
{
   "inputConfig":{
      "gcsSource":{
         "uri":"${PDF}"
      },
      "mimeType":"image/gif"
   },
   "documentType":"general",
   "formExtractionParams":{
      "enabled":true
   }
}
EOM

# Send request to Document AI.
PROJECT=$(gcloud config get-value project)
echo "Sending the following request to Document AI in ${PROJECT} ($REGION region), saving to response.json"
cat request.json

# The location in the URL path must match the regional endpoint, so use REGION in both places.
curl -X POST \
  -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  -d @request.json \
  "https://${REGION}-documentai.googleapis.com/v1beta2/projects/${PROJECT}/locations/${REGION}/documents:process" \
  > response.json
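
The request body written by the heredoc above can also be assembled in Python. The following is only a sketch: build_process_request and SUPPORTED_MIME_TYPES are hypothetical names, not part of any Google library, and the extension-to-MIME-type table is an assumption about what the v1beta2 endpoint accepts (PDF, TIFF and GIF).

```python
import json
import posixpath

# Assumed mapping from file extension to a MIME type accepted by v1beta2.
SUPPORTED_MIME_TYPES = {
    ".pdf": "application/pdf",
    ".tif": "image/tiff",
    ".tiff": "image/tiff",
    ".gif": "image/gif",
}


def build_process_request(gcs_uri):
    """Return the same request body as request.json, picking mimeType from the extension."""
    ext = posixpath.splitext(gcs_uri)[1].lower()
    if ext not in SUPPORTED_MIME_TYPES:
        raise ValueError("Unsupported file extension: {!r}".format(ext))
    return {
        "inputConfig": {
            "gcsSource": {"uri": gcs_uri},
            "mimeType": SUPPORTED_MIME_TYPES[ext],
        },
        "documentType": "general",
        "formExtractionParams": {"enabled": True},
    }


request = build_process_request("gs://glottman-project-0/RX_samples/Example_1.gif")
print(json.dumps(request, indent=3))
```

Deriving mimeType from the path avoids a mismatch between the file in the PDF variable and the hard-coded "image/gif" value when a different sample is used.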

In [None]:
!tail response.json

**Note**: If you get a 403 PERMISSION DENIED error, please re-run all the cells from the top.
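
Rather than reading the tail of response.json by eye, the saved payload can be checked for an error in Python. This is a minimal sketch that assumes the usual Google API error envelope (a top-level "error" object with "code", "status" and "message"); describe_error is a hypothetical helper and the sample payload is made up for illustration.

```python
def describe_error(response):
    """Return a one-line description if the payload is an error, otherwise None."""
    error = response.get("error")
    if error is None:
        return None
    return "{} {}: {}".format(
        error.get("code"), error.get("status"), error.get("message")
    )


# Made-up error payload, for illustration only.
sample_error = {
    "error": {"code": 403, "status": "PERMISSION_DENIED", "message": "Permission denied."}
}
print(describe_error(sample_error))

# A successful response has no "error" key, so nothing is reported.
print(describe_error({"text": "Sph."}))
```

In the notebook, the dictionary loaded from response.json would be passed in place of the sample payload.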

## Show the document sent

In [None]:
%%bash -s "$PDF"

PDF=$1

# Download the source document unless a local copy already exists.
FILENAME="$(basename "$PDF")"
if [ ! -f "$FILENAME" ]; then
   gsutil cp "$PDF" .
fi

In [None]:
import os
filename = os.path.basename(PDF)

from IPython.display import IFrame
IFrame(filename, width=1200, height=500)

## Parse the response

Let's use Python to parse the response and pull out specific fields.
Start by printing the beginning of the extracted text.

In [None]:
import json

# Load the Document AI response saved by the curl call and close the file afterwards.
with open('response.json') as ifp:
    response = json.load(ifp)

In [None]:
# Print the first 1000 characters of the extracted text

allText = response['text']
print(allText[:1000])

## Explore the layout of the extracted document dictionary

In [None]:
# Print the character offset at which "Sph." first appears in the extracted text
print(allText.index("Sph."))

In [None]:
# the response starts at page 0, contains 1 page and has 25 extracted blocks
response['pages'][0]

In [None]:
# Slice the full text from the start of block 1 to the end of the last block
blocks = response['pages'][0]['blocks']
startIndex = int(blocks[1]['layout']['textAnchor']['textSegments'][0]['startIndex'])
endIndex = int(blocks[-1]['layout']['textAnchor']['textSegments'][0]['endIndex'])
allText[startIndex:endIndex]

### Option 1: Parsing text elements from layout blocks

Now that we understand the layout blocks, we can extract values by slicing the full text with each block's text anchor.

In [None]:
def extractText(allText, elem):
    # Slice the full text using the first text segment of the element's text anchor.
    # startIndex may be missing from the JSON when it is 0, so default to 0.
    segment = elem['textAnchor']['textSegments'][0]
    startIndex = int(segment.get('startIndex', 0))
    endIndex = int(segment['endIndex'])
    return allText[startIndex:endIndex].strip()

# Pair each label block with the block holding its value.
# The block indices are specific to this sample form.
blocks = response['pages'][0]['blocks']
for labelIndex, valueIndex in [(1, 7), (2, 8), (4, 9)]:
    label = extractText(allText, blocks[labelIndex]['layout'])
    value = extractText(allText, blocks[valueIndex]['layout'])
    print(label + ":" + value)
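
The extractText helper above reads only the first text segment of an anchor. A text anchor can hold several segments, so a variant that joins all of them may be useful. The sketch below uses a hypothetical helper name and made-up sample data; it also assumes that startIndex can be absent from the JSON when it is 0.

```python
def extract_text_all_segments(all_text, elem):
    """Join the text of every segment in the element's text anchor."""
    segments = elem.get("textAnchor", {}).get("textSegments", [])
    parts = []
    for segment in segments:
        # Indices arrive as strings in the JSON; a missing startIndex is treated as 0.
        start = int(segment.get("startIndex", 0))
        end = int(segment["endIndex"])
        parts.append(all_text[start:end])
    return "".join(parts).strip()


# Made-up text and layout, for illustration only.
sample_text = "Sph.\n-2.00\nCyl.\n-0.50\n"
sample_layout = {
    "textAnchor": {
        "textSegments": [
            {"endIndex": "5"},
            {"startIndex": "5", "endIndex": "11"},
        ]
    }
}
print(extract_text_all_segments(sample_text, sample_layout))
```

An element with an empty or missing text anchor yields an empty string instead of raising a KeyError.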

### Option 2: Parsing form fields

What we did with blocks of text was quite low-level. Document AI understands that forms tend to have key-value pairs, and part of the JSON response includes these extracted key-value pairs as well.

Besides FormField, Document AI also supports extracting Paragraph and Table elements from the document.
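
As an illustration of the Table element, the sketch below turns one table into a list of rows of cell text. It assumes each table carries "headerRows" and "bodyRows", each row a list of "cells" with a "layout"; whether the sample prescription actually yields any tables is not known. The helper names and the sample data are hypothetical.

```python
def layout_text(all_text, layout):
    """Return the text covered by the first segment of a layout's text anchor."""
    segments = layout.get("textAnchor", {}).get("textSegments", [])
    if not segments:
        return ""
    start = int(segments[0].get("startIndex", 0))
    end = int(segments[0]["endIndex"])
    return all_text[start:end].strip()


def table_rows(all_text, table):
    """Return header rows followed by body rows, each as a list of cell strings."""
    rows = []
    for row in table.get("headerRows", []) + table.get("bodyRows", []):
        rows.append([layout_text(all_text, cell["layout"]) for cell in row.get("cells", [])])
    return rows


def anchor(start, end):
    """Build a made-up layout dictionary covering all_text[start:end]."""
    return {"textAnchor": {"textSegments": [{"startIndex": str(start), "endIndex": str(end)}]}}


# Made-up text and table, for illustration only.
sample_text = "Sph. Cyl. -2.00 -0.50"
sample_table = {
    "headerRows": [{"cells": [{"layout": anchor(0, 4)}, {"layout": anchor(5, 9)}]}],
    "bodyRows": [{"cells": [{"layout": anchor(10, 15)}, {"layout": anchor(16, 21)}]}],
}
for row in table_rows(sample_text, sample_table):
    print(row)
```

With a real response, the tables would be read from response['pages'][0].get('tables', []), using get so that a page without tables yields an empty list.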

In [None]:
# Review the dictionary keys extracted
response['pages'][0].keys()

In [None]:
# Review what a form field includes: a field name and a field value, each with a text anchor holding text segments (important for later)
response['pages'][0]['formFields'][3]

In [None]:
fieldName = extractText(allText, response['pages'][0]['formFields'][0]['fieldName'])
fieldValue = extractText(allText, response['pages'][0]['formFields'][0]['fieldValue'])
print('key={}\nvalue={}'.format(fieldName, fieldValue))

listLength = len(response['pages'][0]['formFields'])

In [None]:
# Loop over every form field and print its key and value
for field in response['pages'][0]['formFields']:
    fieldName = extractText(allText, field['fieldName'])
    # A field with no detected value has an empty textAnchor
    if len(field['fieldValue']['textAnchor']) == 0:
        fieldValue = 0  # placeholder for a missing value
    else:
        fieldValue = extractText(allText, field['fieldValue'])
    print('key={}\nvalue={}'.format(fieldName, fieldValue))
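
Instead of printing each pair, the form fields can be collected into a dictionary for later lookups. This is a minimal sketch with hypothetical helper names and made-up sample data; it stores None for fields without a detected value and strips a trailing colon from field names, both of which are design choices rather than Document AI behavior.

```python
def segment_text(all_text, elem):
    """Return the text of the first segment, or an empty string if there is none."""
    segments = elem.get("textAnchor", {}).get("textSegments", [])
    if not segments:
        return ""
    start = int(segments[0].get("startIndex", 0))
    end = int(segments[0]["endIndex"])
    return all_text[start:end].strip()


def form_fields_to_dict(all_text, page):
    """Map each form field name to its value, using None when no value was found."""
    result = {}
    for field in page.get("formFields", []):
        name = segment_text(all_text, field["fieldName"]).rstrip(":")
        value = segment_text(all_text, field["fieldValue"])
        result[name] = value or None
    return result


# Made-up text and page, for illustration only.
sample_text = "Sph.: -2.00\nCyl.:\n"
sample_page = {
    "formFields": [
        {
            "fieldName": {"textAnchor": {"textSegments": [{"endIndex": "5"}]}},
            "fieldValue": {"textAnchor": {"textSegments": [{"startIndex": "6", "endIndex": "11"}]}},
        },
        {
            "fieldName": {"textAnchor": {"textSegments": [{"startIndex": "12", "endIndex": "17"}]}},
            "fieldValue": {"textAnchor": {}},
        },
    ]
}
print(form_fields_to_dict(sample_text, sample_page))
```

In the notebook this would be called as form_fields_to_dict(allText, response['pages'][0]).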

Enjoy!

Copyright 2021 Google Inc. Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.