# Amazon Textract
### Redacting Sensitive Information from a Form

**Based on AWS GitHub account:**  https://github.com/aws-samples  
**File:** aws-samples/amazon-textract-code-samples/**09-forms-redaction.py**
  
James Reed, Centennial Data Science

![Amazon](https://media.gettyimages.com/photos/closeup-of-sign-with-logo-on-facade-of-the-regional-headquarters-of-picture-id1065011338?s=2048x2048)

## Original Form Requiring Redaction

Fields to redact:

 * Home Address
 * Mailing Address

![Employment Form](../data/employmentapp.png)

In [1]:
import boto3
from trp import Document
from PIL import Image, ImageDraw

In [6]:
# Document
s3BucketName = "jdreed-hadley"
documentName = "employmentapp.png"

# Amazon Textract client
textract = boto3.client('textract')

# Call Amazon Textract
response = textract.analyze_document(
    Document={
        'S3Object': {
            'Bucket': s3BucketName,
            'Name': documentName
        }
    },
    FeatureTypes=["FORMS"])

#print(response)
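
Rather than printing the whole response, it can be quicker to summarize it. The sketch below assumes the usual Textract response layout, a `Blocks` list whose entries each carry a `BlockType`. The `count_block_types` helper and the trimmed `sample_response` dictionary are hypothetical stand-ins for illustration, not part of the AWS sample.

```python
from collections import Counter


def count_block_types(response):
    """Count the blocks in a Textract-style response by their BlockType."""
    return Counter(block["BlockType"] for block in response.get("Blocks", []))


# Hypothetical, heavily trimmed stand-in for a real analyze_document response
sample_response = {
    "Blocks": [
        {"BlockType": "PAGE", "Id": "p1"},
        {"BlockType": "LINE", "Id": "l1", "Text": "Home Address:"},
        {"BlockType": "WORD", "Id": "w1", "Text": "Home"},
        {"BlockType": "WORD", "Id": "w2", "Text": "Address:"},
        {"BlockType": "KEY_VALUE_SET", "Id": "k1", "EntityTypes": ["KEY"]},
        {"BlockType": "KEY_VALUE_SET", "Id": "v1", "EntityTypes": ["VALUE"]},
    ]
}

block_counts = count_block_types(sample_response)
print(dict(block_counts))
```

With `FeatureTypes=["FORMS"]`, a real response should include `KEY_VALUE_SET` blocks. If that count is zero, the form parser in the next cell will likely find no fields to redact.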

In [7]:
doc = Document(response)

# Redact document
img = Image.open('../data/' + documentName)

width, height = img.size

# Create the drawing context once; every rectangle is painted onto the same image
draw = ImageDraw.Draw(img)

if doc.pages:
    # The sample form has a single page
    page = doc.pages[0]
    for field in page.form.fields:
        # Redact only fields whose key mentions "address";
        # drop that test to redact every field that has a value
        if field.key and field.value and "address" in field.key.text.lower():
            print("Redacting => Key: {}, Value: {}".format(field.key.text, field.value.text))

            # Bounding boxes are ratios of the page size, so scale them to pixels
            box = field.value.geometry.boundingBox
            x1 = box.left * width
            y1 = box.top * height - 2         # start 2 px above the value
            x2 = x1 + box.width * width + 5   # extend 5 px past the right edge
            y2 = y1 + box.height * height + 2

            draw.rectangle([x1, y1, x2, y2], fill="black")

img.save("redacted-{}".format(documentName))

Redacting => Key: Home Address:, Value: 123 Any Street, Any Town, USA
Redacting => Key: Mailing Address:, Value: same as home address
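
The coordinate arithmetic in the cell above can be checked in isolation. This is a minimal sketch, assuming bounding-box values are ratios between 0 and 1 of the page size; `bbox_to_pixels` and its `pad_x`/`pad_y` parameters are hypothetical names that mirror the constants 5 and 2 used in the cell.

```python
def bbox_to_pixels(left, top, box_width, box_height, img_width, img_height,
                   pad_x=5, pad_y=2):
    """Scale a normalized bounding box to a pixel rectangle [x1, y1, x2, y2].

    Mirrors the notebook cell: the top edge is raised by pad_y, the right
    edge is extended by pad_x, and y2 is measured from the raised top edge.
    """
    x1 = left * img_width
    y1 = top * img_height - pad_y
    x2 = x1 + box_width * img_width + pad_x
    y2 = y1 + box_height * img_height + pad_y
    return [x1, y1, x2, y2]


# A box covering the middle half of an 800 x 1000 pixel image
rect = bbox_to_pixels(0.25, 0.5, 0.5, 0.125, img_width=800, img_height=1000)
print(rect)  # [200.0, 498.0, 605.0, 625.0]
```

Because `y2` is measured from the already raised `y1`, the bottom edge lands exactly on the bottom of the bounding box (500 + 125 = 625) rather than 2 px below it. If text descenders show through the rectangle, a larger `pad_y` on the final term would be the place to adjust.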


### Redacted Document

![Redacted Document](redacted-employmentapp.png)

In [1]:
import datetime

complete = datetime.datetime.now()
print(f'Notebook end: {complete}')

Notebook end: 2021-01-20 14:00:54.048637
