# Text detection using Amazon Rekognition

***
This notebook provides a walkthrough of the [text detection API](https://docs.aws.amazon.com/rekognition/latest/dg/text-detection.html) in Amazon Rekognition. You can quickly identify text in your video and image libraries to catalog footage and photos for marketing, advertising, and media industry use cases.
***

# Initialize the notebook

In [None]:
# Update boto3 to current version
!conda upgrade -y boto3

In [None]:
# Check that the current version of boto3 is installed
import boto3
print(boto3.__version__)

import botocore
print(botocore.__version__)

In [None]:
# Initialise Notebook
import boto3
from IPython.display import HTML, display, Image as IImage
from PIL import Image, ImageDraw, ImageFont
import time
import os

In [None]:
# Current AWS Region. Use this to choose the corresponding S3 bucket with sample content

mySession = boto3.session.Session()
awsRegion = "us-east-1"

In [None]:
# Init clients
rekognition = boto3.client('rekognition')
s3 = boto3.client('s3')

# Detect text in image
***

In [None]:
imagePath = "./static/aws-indonesia.png"

In [None]:
img=Image.open(imagePath)
display(img)

#### Call Rekognition to detect text in the image

In [None]:
# Call Amazon Rekognition to detect text in the image
# https://docs.aws.amazon.com/rekognition/latest/dg/API_DetectText.html
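with open(imagePath, 'rb') as image:
    # Todo-1: call rekognition.detect_text with the image bytes from image.read()
    # and a word filter with a minimum confidence of 90
    detectTextResponse = rekognition.detect_text(
        Image={'Bytes': image.read()},
        Filters={'WordFilter': {'MinConfidence': 90}}
    )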

#### Review the raw JSON response from Rekognition

In [None]:
# Show JSON response returned by Rekognition Text API (Text Detection)
# In the JSON response below, you will see detected text, confidence score, and additional information.

display(detectTextResponse)

#### Display the list of detected lines

In [None]:
for textDetection in detectTextResponse["TextDetections"]:
    if textDetection["Type"]=="LINE":
        print(textDetection["DetectedText"])
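Each entry in `TextDetections` also carries an `Id`, and `WORD` entries carry a `ParentId` that points at the line they belong to. The sketch below groups words under their parent line. It is only an illustration: `words_by_line` is a hypothetical helper, and `sample` is made-up data shaped like a `detect_text` response, not real output for the image above.

```python
from collections import defaultdict

def words_by_line(response):
    """Return a list of (line_text, [word_text, ...]) pairs, in response order."""
    lines = []                   # (line Id, line text) in the order they appear
    words = defaultdict(list)    # line Id -> texts of the words under that line
    for detection in response["TextDetections"]:
        if detection["Type"] == "LINE":
            lines.append((detection["Id"], detection["DetectedText"]))
        elif detection["Type"] == "WORD" and "ParentId" in detection:
            words[detection["ParentId"]].append(detection["DetectedText"])
    return [(text, words[line_id]) for line_id, text in lines]

# Hypothetical sample data (assumption: not produced by Rekognition)
sample = {"TextDetections": [
    {"Id": 0, "Type": "LINE", "DetectedText": "AWS Indonesia"},
    {"Id": 1, "ParentId": 0, "Type": "WORD", "DetectedText": "AWS"},
    {"Id": 2, "ParentId": 0, "Type": "WORD", "DetectedText": "Indonesia"},
]}

grouped = words_by_line(sample)
print(grouped)
```

In the notebook you could pass `detectTextResponse` instead of `sample`.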

# Detect text in image using Filters and Regions of Interest
***

In [None]:
imagePath = "./static/aws-indonesia.png"

In [None]:
img=Image.open(imagePath)
display(img)

In [None]:
# Call Amazon Rekognition to detect text in the image
# https://docs.aws.amazon.com/rekognition/latest/dg/API_DetectText.html
with open(imagePath, 'rb') as image:
    detectTextResponse = rekognition.detect_text(
        Image={
            'Bytes': image.read()
          },
        Filters={
            'WordFilter': {
                'MinConfidence': 90,
                'MinBoundingBoxHeight': 0.05,
                'MinBoundingBoxWidth': 0.02
            },
            'RegionsOfInterest': [
                {
                    'BoundingBox': {
                        'Width': 0.9461569786071777,
                        'Height': 0.08966819196939468,
                        'Left': 0.021472634747624397,
                        'Top': 0.1912720501422882},
                },
            ]
        }
    )
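The `BoundingBox` values in a region of interest (and in each detection's `Geometry`) are ratios of the overall image width and height, not pixels. A minimal sketch of converting such a box to pixel coordinates follows; `box_to_pixels` is a hypothetical helper and the `region` values are chosen only to make the arithmetic easy to follow.

```python
def box_to_pixels(bounding_box, image_width, image_height):
    """Convert a ratio-based BoundingBox into (left, top, right, bottom) pixels."""
    left = round(bounding_box["Left"] * image_width)
    top = round(bounding_box["Top"] * image_height)
    right = round((bounding_box["Left"] + bounding_box["Width"]) * image_width)
    bottom = round((bounding_box["Top"] + bounding_box["Height"]) * image_height)
    return left, top, right, bottom

# Hypothetical region covering the lower-middle part of an 800x400 image
region = {"Width": 0.5, "Height": 0.25, "Left": 0.25, "Top": 0.5}
pixels = box_to_pixels(region, 800, 400)
print(pixels)  # (200, 200, 600, 300)
```

The tuple uses the (left, top, right, bottom) order that Pillow's `Image.crop` and `ImageDraw.rectangle` accept, so with `img.size` as the width and height it could be used to preview a region on the sample image.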

In [None]:
# Show JSON response returned by Rekognition Text API (Text Detection)
# In the JSON response below, you will see detected text, confidence score, and additional information.

display(detectTextResponse)

In [None]:
for textDetection in detectTextResponse["TextDetections"]:
    text = textDetection["DetectedText"]
    if textDetection["Type"] == "WORD":
        print("Word: {}".format(text))

# Detect text in video
Text detection in video is an asynchronous operation. See the [video text detection procedure](https://docs.aws.amazon.com/rekognition/latest/dg/text-detecting-video-procedure.html) for details.

- First, we start a text detection job, which returns a job ID.
- We then call `get_text_detection` to check the job status and, once the job is complete, retrieve the detected text with its timestamps.
- In production use cases, you would usually use AWS Step Functions or an Amazon SNS topic to get notified when the job is complete.
***

In [None]:
# Todo-2: download the text-detection.mp4 file and upload it to your own S3 bucket,
# then set bucketName to your bucket name and videoName to the object key
videoName = ""
bucketName = ""

#### Call Rekognition to start a job for text detection

In [None]:
# Start video text job
startTextDetection = rekognition.start_text_detection(
    Video={
        'S3Object': {
            'Bucket': bucketName,
            'Name': videoName,
        }
    },
)

textJobId = startTextDetection['JobId']
display("Job Id: {0}".format(textJobId))
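For the production pattern mentioned earlier, `start_text_detection` also accepts a `NotificationChannel` with an SNS topic ARN and an IAM role ARN, so that Rekognition publishes the completion status instead of you polling for it. The sketch below only builds the request arguments; `build_start_request` is a hypothetical helper, and the bucket name, topic ARN, and role ARN are placeholders, not resources from this workshop.

```python
def build_start_request(bucket, key, sns_topic_arn=None, role_arn=None):
    """Build keyword arguments for rekognition.start_text_detection."""
    request = {"Video": {"S3Object": {"Bucket": bucket, "Name": key}}}
    # Only ask for a completion notification when both ARNs are supplied
    if sns_topic_arn and role_arn:
        request["NotificationChannel"] = {
            "SNSTopicArn": sns_topic_arn,
            "RoleArn": role_arn,
        }
    return request

# Placeholder values (assumptions for illustration only)
request = build_start_request(
    "example-bucket",
    "text-detection.mp4",
    sns_topic_arn="arn:aws:sns:us-east-1:111122223333:example-topic",
    role_arn="arn:aws:iam::111122223333:role/example-rekognition-role",
)
print(sorted(request))
```

In the notebook the call would then be `rekognition.start_text_detection(**request)`. The role has to allow Rekognition to publish to the topic.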

#### Wait for text detection job to complete

In [None]:
# Wait for text detection job to complete
# In production use cases, you would usually use AWS Step Functions or an Amazon SNS topic to get notified when the job is complete.
getTextDetection = rekognition.get_text_detection(
    JobId=textJobId
)

while getTextDetection['JobStatus'] == 'IN_PROGRESS':
    time.sleep(5)
    print('.', end='')

    getTextDetection = rekognition.get_text_detection(
        JobId=textJobId
    )

display(getTextDetection['JobStatus'])
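The loop above keeps polling for as long as the job reports `IN_PROGRESS`. A variant with an upper time limit might look like the sketch below. `wait_for_text_detection` and `FakeClient` are hypothetical names; the fake client stands in for the Rekognition client so the example runs without AWS access.

```python
import time

def wait_for_text_detection(client, job_id, poll_seconds=5, timeout_seconds=600):
    """Poll get_text_detection until the job leaves IN_PROGRESS or time runs out."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        response = client.get_text_detection(JobId=job_id)
        if response["JobStatus"] != "IN_PROGRESS":
            return response  # SUCCEEDED or FAILED
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "Job {} still in progress after {} s".format(job_id, timeout_seconds)
            )
        time.sleep(poll_seconds)

class FakeClient:
    """Hypothetical stand-in that replays a fixed sequence of job statuses."""
    def __init__(self, statuses):
        self._statuses = iter(statuses)

    def get_text_detection(self, JobId):
        return {"JobStatus": next(self._statuses), "TextDetections": []}

result = wait_for_text_detection(
    FakeClient(["IN_PROGRESS", "SUCCEEDED"]), "job-123", poll_seconds=0
)
print(result["JobStatus"])
```

With the real client this would be `wait_for_text_detection(rekognition, textJobId)`. Checking for `FAILED` on the returned response is left to the caller.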

#### Review the raw JSON response from Rekognition

In [None]:
# Show the JSON response returned by the Rekognition Text Detection API
# In the JSON response below, you will see the list of detected text.
# For each detection, you will see information such as the timestamp.

display(getTextDetection)
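A single `get_text_detection` call returns one page of results. For longer videos the response can include a `NextToken`, which you pass back to fetch the next page. The sketch below collects all pages; `collect_text_detections` and `FakePagedClient` are hypothetical, and the fake client only imitates paging so the example runs offline.

```python
def collect_text_detections(client, job_id):
    """Follow NextToken until every page of TextDetections has been gathered."""
    detections = []
    kwargs = {"JobId": job_id}
    while True:
        page = client.get_text_detection(**kwargs)
        detections.extend(page.get("TextDetections", []))
        token = page.get("NextToken")
        if not token:
            return detections
        kwargs["NextToken"] = token

class FakePagedClient:
    """Hypothetical stand-in that serves pre-built pages keyed by token."""
    def __init__(self, pages):
        self._pages = pages

    def get_text_detection(self, JobId, NextToken=None):
        return self._pages[NextToken]

# Made-up pages (assumption: not real Rekognition output)
pages = {
    None: {"JobStatus": "SUCCEEDED", "NextToken": "page-2", "TextDetections": [
        {"Timestamp": 0, "TextDetection": {"Type": "WORD", "DetectedText": "AWS"}}]},
    "page-2": {"JobStatus": "SUCCEEDED", "TextDetections": [
        {"Timestamp": 500, "TextDetection": {"Type": "WORD", "DetectedText": "Cloud"}}]},
}

all_detections = collect_text_detections(FakePagedClient(pages), "job-123")
print(len(all_detections))
```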

#### Display recognized text in the video

In [None]:
flaggedTextInVideo = ["AWS"]

theLines = {}

# Build an HTML listing of each detected word with its timestamp, plus an overall summary
strDetail = "Text detected in video<br>=======================================<br>"
strOverall = "Text in the overall video:<br>=======================================<br>"

# Words detected at each timestamp
for obj in getTextDetection['TextDetections']:
    if obj['TextDetection']['Type'] == 'WORD':
        ts = obj["Timestamp"]
        confidence = obj['TextDetection']["Confidence"]
        word = obj['TextDetection']["DetectedText"]

        if word in flaggedTextInVideo:
            print("Found flagged text at {} ms: {} (Confidence: {})".format(ts, word, round(confidence, 2)))

        strDetail = strDetail + "At {} ms: {} (Confidence: {})<br>".format(ts, word, round(confidence, 2))
        # Count how many times each distinct word was detected
        theLines[word] = theLines.get(word, 0) + 1

# Unique words detected in the video
for word, count in theLines.items():
    strOverall = strOverall + "Name: {}, Count: {}<br>".format(word, count)

# Display results
display(HTML(strOverall))
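The counting and flagging logic above can also be separated from the HTML formatting. The sketch below uses `collections.Counter`; `summarize_words` is a hypothetical helper, and `sample_detections` is made-up data in the shape of the `TextDetections` list returned by `get_text_detection`.

```python
from collections import Counter

def summarize_words(detections, flagged=()):
    """Return (word counts, [(timestamp_ms, word), ...] for flagged words)."""
    counts = Counter()
    flagged_hits = []
    for item in detections:
        detection = item["TextDetection"]
        if detection["Type"] != "WORD":
            continue  # skip LINE entries so words are not counted twice
        word = detection["DetectedText"]
        counts[word] += 1
        if word in flagged:
            flagged_hits.append((item["Timestamp"], word))
    return counts, flagged_hits

# Made-up detections (assumption: not real Rekognition output)
sample_detections = [
    {"Timestamp": 0, "TextDetection": {"Type": "LINE", "DetectedText": "AWS Cloud"}},
    {"Timestamp": 0, "TextDetection": {"Type": "WORD", "DetectedText": "AWS"}},
    {"Timestamp": 0, "TextDetection": {"Type": "WORD", "DetectedText": "Cloud"}},
    {"Timestamp": 500, "TextDetection": {"Type": "WORD", "DetectedText": "AWS"}},
]

counts, hits = summarize_words(sample_detections, flagged=["AWS"])
print(counts.most_common())
print(hits)
```

In the notebook you could call `summarize_words(getTextDetection['TextDetections'], flaggedTextInVideo)`.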

#### Show video in the player

In [None]:
# Show video in a player

s3VideoUrl = s3.generate_presigned_url('get_object', Params={'Bucket': bucketName, 'Key': videoName})

videoTag = "<video controls='controls' autoplay width='640' height='360' name='Video' src='{0}'></video>".format(s3VideoUrl)

videoui = "<table><tr><td style='vertical-align: top'>{}</td></tr></table>".format(videoTag)

display(HTML(videoui))

In [None]:
listui = "<table><tr><td style='vertical-align: top'>{}</td></tr></table>".format(strDetail)
display(HTML(listui))

***
### References
- https://docs.aws.amazon.com/rekognition/latest/dg/API_DetectText.html
- https://docs.aws.amazon.com/rekognition/latest/dg/API_StartTextDetection.html
- https://docs.aws.amazon.com/rekognition/latest/dg/API_GetTextDetection.html

***

You have successfully used Amazon Rekognition to identify text in images and videos.