## Using AI Services for Analyzing Images and Text
by Manav Sehgal | on APR 30 2019 | 
by Tom Liu | on Dec 2020 | modified edition for AI Labs for Amazon Rekognition and Amazon Comprehend

So far we have been working with structured data in flat files as our data source. What if the source is images and unstructured text? AWS AI services provide vision, transcription, translation, personalization, and forecasting capabilities without the need to train and deploy machine learning models. AWS manages the machine learning complexity; you focus on the problem at hand, send the required inputs for analysis, and receive output from these services within your applications.

Extending our open data analytics use case to New York traffic, let us use the AWS AI services to turn open data available in social media, Wikipedia, and other sources into structured datasets and insights.

We will start by importing dependencies for the AWS SDK, pandas DataFrames, file operations, handling JSON data, and display formatting. We will also initialize the Rekognition client for use in the rest of this notebook.

In [None]:
import boto3
import pandas as pd
import io
import json
from IPython.display import display, Markdown, Image
import sagemaker

boto_session = boto3.Session()
region = boto_session.region_name

rekognition = boto3.client('rekognition', region_name=region)
bucket_name = sagemaker.Session().default_bucket()
prefix = "images"

In [None]:
# download image set for the lab
!wget https://df4l9poikws9t.cloudfront.net/images.zip

In [None]:
!unzip -d test_images images.zip

In [None]:
!aws s3 cp ./test_images s3://$bucket_name/$prefix/ --recursive --exclude "*" --include "*.png" --include "*.jpg"

### Show Image
We will work with a number of images, so we need a way to show them within this notebook. Our function displays a local image file at a given width.

In [None]:
def show_image(filename, img_width=300):
    # Render a local image file inline in the notebook
    return Image(filename=filename, width=img_width)

In [None]:
file_name = 'sydney-street-02-unsplash.jpg'

In [None]:
show_image(f'./test_images/{file_name}')

### Image Labels
One of the use cases for traffic analytics is processing traffic CCTV imagery or social media uploads. Let's consider a traffic location where, depending on the number of cars, trucks, and pedestrians, we can identify whether there is a traffic jam. This insight can be used to better manage the flow of traffic around the location and plan ahead for future use of this route.

The first step in this kind of analytics is to recognize that we are actually looking at an image which may represent a traffic jam. We create the ``image_labels`` function, which uses the ``detect_labels`` Rekognition API to detect objects within an image. The function prints each detected label with its confidence score.

In the given example, notice that somewhere in the middle of the label listing, at 73% confidence, the Rekognition computer vision model has actually identified a traffic jam.

In [None]:
def image_labels(bucket, key):
    image_object = {'S3Object':{'Bucket': bucket,'Name': key}}

    response = rekognition.detect_labels(Image=image_object)
    for label in response['Labels']:
        print('{} ({:.0f}%)'.format(label['Name'], label['Confidence']))

In [None]:
image_labels(bucket_name, f'images/{file_name}')
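
To focus on the labels most likely to be correct, the printed list can be filtered by confidence. The sketch below works on a hand-written dictionary shaped like a `detect_labels` response, so it runs without an AWS call. The names `filter_labels` and `sample_labels_response`, the sample values, and the 70% threshold are illustrative assumptions, not part of the original notebook.

```python
def filter_labels(response, min_confidence=70.0):
    """Return (name, confidence) pairs at or above min_confidence, highest first."""
    labels = [
        (label['Name'], label['Confidence'])
        for label in response.get('Labels', [])
        if label['Confidence'] >= min_confidence
    ]
    return sorted(labels, key=lambda pair: pair[1], reverse=True)

# Hypothetical data; real values would come from rekognition.detect_labels(...)
sample_labels_response = {
    'Labels': [
        {'Name': 'Car', 'Confidence': 98.2},
        {'Name': 'Traffic Jam', 'Confidence': 73.0},
        {'Name': 'Bicycle', 'Confidence': 55.4},
    ]
}

for name, confidence in filter_labels(sample_labels_response):
    print(f'{name} ({confidence:.0f}%)')
```

Rekognition can also apply a threshold on the service side through the `MinConfidence` parameter of `detect_labels`; filtering locally, as here, lets you try several thresholds against one response.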

#### Questions:
* How well does image label detection work for images such as 'olive_*.png', 'gear*.png', and 'coffee*.png'?

### Image Label Count
Now that we have a label detecting a traffic jam and some of the ingredients of a busy traffic location, like pedestrians, trucks, and cars, let us determine quantitative data for benchmarking different traffic locations. If we can count the number of cars, trucks, and persons in the image, we can compare these numbers with other images. Our function does just that: it counts the number of instances of a matching label.

In [None]:
def image_label_count(bucket, key, match):
    image_object = {'S3Object':{'Bucket': bucket,'Name': key}}

    response = rekognition.detect_labels(Image=image_object)
    count = 0
    for label in response['Labels']:
        # Substring match: 'Car' also matches label names that contain 'Car'
        if match in label['Name']:
            # Each instance is one detected occurrence with its own bounding box
            count += len(label['Instances'])
    print(f'Found {match} {count} times.')

In [None]:
image_label_count(bucket_name, f'images/{file_name}', 'Car')

In [None]:
image_label_count(bucket_name, f'images/{file_name}', 'Truck')

In [None]:
image_label_count(bucket_name, f'images/{file_name}', 'Person')
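
Each call above sends the same image to Rekognition again. As a rough sketch, several labels can be counted from a single response instead. The example below uses a hand-written dictionary shaped like a `detect_labels` response; `count_instances` and `sample_counts_response` are hypothetical names, and the function matches label names exactly rather than by substring as `image_label_count` does.

```python
def count_instances(response, names):
    """Count detected instances for each exact label name in names."""
    counts = {name: 0 for name in names}
    for label in response.get('Labels', []):
        if label['Name'] in counts:
            counts[label['Name']] += len(label.get('Instances', []))
    return counts

# Hypothetical data; each empty dict stands in for one instance with a bounding box
sample_counts_response = {
    'Labels': [
        {'Name': 'Car', 'Instances': [{}, {}, {}]},
        {'Name': 'Truck', 'Instances': [{}]},
        {'Name': 'Person', 'Instances': [{}, {}]},
        {'Name': 'Road', 'Instances': []},
    ]
}

counts = count_instances(sample_counts_response, ['Car', 'Truck', 'Person'])
print(counts)
```

Returning a dictionary rather than printing makes it easier to compare counts across images, for example by collecting one dictionary per traffic location.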

### Image Text
Another use case of traffic location analytics using social media content is to understand more about a traffic location at a given moment, for instance whether an incident has been reported, like an accident, jam, or VIP movement. For a computer program to understand a random traffic location, it may help to capture any text within the image. The ``image_text`` function uses the Amazon Rekognition service to detect text in an image.

You will notice that the text recognition is capable of reading blurry text like "The Lion King", text at a perspective like the bus route, text which may be ignored by the human eye like the address below the shoes banner, and even the text representing the taxi number. Suddenly the image starts telling a story programmatically: what time it may represent, what the landmarks are, which bus route and which taxi number were on the streets, and so on.

In [None]:
def image_text(bucket, key, sort_column='', parents=True):
    response = rekognition.detect_text(Image={'S3Object':{'Bucket':bucket,'Name': key}})
    df = pd.read_json(io.StringIO(json.dumps(response['TextDetections'])))
    df['Width'] = df['Geometry'].apply(lambda x: x['BoundingBox']['Width'])
    df['Height'] = df['Geometry'].apply(lambda x: x['BoundingBox']['Height'])
    df['Left'] = df['Geometry'].apply(lambda x: x['BoundingBox']['Left'])
    df['Top'] = df['Geometry'].apply(lambda x: x['BoundingBox']['Top'])
    df = df.drop(columns=['Geometry'])
    if sort_column:
        df = df.sort_values([sort_column])
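    if not parents:
        # Keep only WORD rows. Comparing ParentId > 0 would also drop
        # words whose parent LINE has Id 0.
        df = df[df['Type'] == 'WORD']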
    return df

In [None]:
text_image_file = 'street-01-unsplash.jpg'

In [None]:
show_image(f'./test_images/{text_image_file}')

Sorting on the ``Top`` column keeps text on the same horizontal line together.

In [None]:
image_text(bucket_name, f'images/{text_image_file}', sort_column='Top', parents=False)
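
Rekognition returns both `LINE` and `WORD` detections, and each word refers to its line through `ParentId`. The sketch below shows one way to regroup words under their lines, using a hand-written list shaped like the `TextDetections` field. The function name `words_by_line` and the sample entries are assumptions for illustration only.

```python
from collections import defaultdict

def words_by_line(text_detections):
    """Return a list of (line_text, [word_text, ...]) pairs in line order."""
    words = defaultdict(list)
    for detection in text_detections:
        if detection['Type'] == 'WORD':
            words[detection['ParentId']].append(detection['DetectedText'])
    return [
        (detection['DetectedText'], words[detection['Id']])
        for detection in text_detections
        if detection['Type'] == 'LINE'
    ]

# Hypothetical data; real values would come from response['TextDetections']
sample_detections = [
    {'Id': 0, 'Type': 'LINE', 'DetectedText': 'The Lion King'},
    {'Id': 1, 'Type': 'LINE', 'DetectedText': 'TAXI'},
    {'Id': 2, 'Type': 'WORD', 'ParentId': 0, 'DetectedText': 'The'},
    {'Id': 3, 'Type': 'WORD', 'ParentId': 0, 'DetectedText': 'Lion'},
    {'Id': 4, 'Type': 'WORD', 'ParentId': 0, 'DetectedText': 'King'},
    {'Id': 5, 'Type': 'WORD', 'ParentId': 1, 'DetectedText': 'TAXI'},
]

for line, line_words in words_by_line(sample_detections):
    print(line, '->', line_words)
```

Note that in this sample the first line has `Id` 0, so its words carry `ParentId` 0.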

#### Questions:
* How well does the image text detection function work for images such as 'olive_coffee_shop_*.png'?

### Detect Celebs
Traffic analytics may also involve detecting VIP movement to divert traffic or monitor security events. Detecting a VIP in a scene starts with facial recognition. Our function ``detect_celebs`` works as well with political figures as it does with movie celebrities.

In [None]:
def detect_celebs(bucket, key, sort_column=''):
    image_object = {'S3Object':{'Bucket': bucket,'Name': key}}

    response = rekognition.recognize_celebrities(Image=image_object)
    df = pd.DataFrame(response['CelebrityFaces'])
    df['Width'] = df['Face'].apply(lambda x: x['BoundingBox']['Width'])
    df['Height'] = df['Face'].apply(lambda x: x['BoundingBox']['Height'])
    df['Left'] = df['Face'].apply(lambda x: x['BoundingBox']['Left'])
    df['Top'] = df['Face'].apply(lambda x: x['BoundingBox']['Top'])
    df = df.drop(columns=['Face'])
    if sort_column:
        df = df.sort_values([sort_column])
    return df

In [None]:
show_image('./test_images/celeb-02-unsplash.jpg')

In [None]:
detect_celebs(bucket_name, 'images/celeb-02-unsplash.jpg', sort_column='Left')
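
Not every match is equally reliable, and some faces in a scene may not be recognized at all. As a minimal sketch, the response can be summarized into confident names plus a count of unrecognized faces. The example runs on a hand-written dictionary shaped like a `recognize_celebrities` response; `summarize_celebrities`, the placeholder names, and the 90% threshold are assumptions.

```python
def summarize_celebrities(response, min_confidence=90.0):
    """Return (names at or above min_confidence, number of unrecognized faces)."""
    names = [
        face['Name']
        for face in response.get('CelebrityFaces', [])
        if face.get('MatchConfidence', 0.0) >= min_confidence
    ]
    unrecognized = len(response.get('UnrecognizedFaces', []))
    return names, unrecognized

# Hypothetical data; real values would come from rekognition.recognize_celebrities(...)
sample_celebs_response = {
    'CelebrityFaces': [
        {'Name': 'Celebrity A', 'MatchConfidence': 99.0},
        {'Name': 'Celebrity B', 'MatchConfidence': 62.5},
    ],
    'UnrecognizedFaces': [{}, {}],
}

names, unrecognized = summarize_celebrities(sample_celebs_response)
print(names, unrecognized)
```

Because the helper uses `.get` with defaults, it also returns an empty result for an image in which no faces are found.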

### Comprehend Syntax
Many data sources contain natural language and free text. Understanding the structure and semantics of this unstructured text can help further our open data analytics use cases.

Let us assume we are processing traffic updates into structured data so we can take appropriate actions. The first step in understanding natural language is to break it up into its grammatical syntax. Nouns like "today" can tell us about a particular event, such as when it is occurring. Verbs and adjectives like "snowing" and "windy" tell us what is happening at that moment in time.

In [None]:
comprehend = boto3.client('comprehend', region_name=region)

traffic_update = """
It is snowing and windy today in New York. The temperature is 50 degrees Fahrenheit. 
The traffic is slow 10 mph with several jams along the I-86.
"""

In [None]:
def comprehend_syntax(text): 
    response = comprehend.detect_syntax(Text=text, LanguageCode='en')
    df = pd.read_json(io.StringIO(json.dumps(response['SyntaxTokens'])))
    df['Tag'] = df['PartOfSpeech'].apply(lambda x: x['Tag'])
    df['Score'] = df['PartOfSpeech'].apply(lambda x: x['Score'])
    df = df.drop(columns=['PartOfSpeech'])
    return df

In [None]:
comprehend_syntax(traffic_update)
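
Once each token carries a part-of-speech tag, grouping tokens by tag gives a quick view of the nouns, adjectives, and so on. The sketch below does this for a hand-written list shaped like the `SyntaxTokens` field. The function name `tokens_by_tag` is hypothetical, and the tags and scores in the sample are illustrative, not actual Comprehend output for this sentence.

```python
from collections import defaultdict

def tokens_by_tag(syntax_tokens):
    """Group token texts by their part-of-speech tag."""
    grouped = defaultdict(list)
    for token in syntax_tokens:
        grouped[token['PartOfSpeech']['Tag']].append(token['Text'])
    return dict(grouped)

# Hypothetical data; real values would come from response['SyntaxTokens']
sample_tokens = [
    {'Text': 'windy', 'PartOfSpeech': {'Tag': 'ADJ', 'Score': 0.99}},
    {'Text': 'today', 'PartOfSpeech': {'Tag': 'NOUN', 'Score': 0.98}},
    {'Text': 'New', 'PartOfSpeech': {'Tag': 'PROPN', 'Score': 0.99}},
    {'Text': 'York', 'PartOfSpeech': {'Tag': 'PROPN', 'Score': 0.99}},
]

print(tokens_by_tag(sample_tokens))
```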

### Comprehend Entities
More insights can be derived by extracting entities from the natural language. These entities can be dates, locations, and quantities, among others. Just a few of these entities can tell a structured story to a program.

In [None]:
def comprehend_entities(text):
    response = comprehend.detect_entities(Text=text, LanguageCode='en')
    df = pd.read_json(io.StringIO(json.dumps(response['Entities'])))
    return df

In [None]:
comprehend_entities(traffic_update)
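
To turn the entity table into something a program can act on, one option is to group entity text by type and drop low-scoring matches. The sketch below assumes a hand-written list shaped like the `Entities` field. The name `entities_by_type`, the sample scores, and the 0.8 threshold are illustrative assumptions rather than actual Comprehend output.

```python
def entities_by_type(entities, min_score=0.8):
    """Group entity text by entity type, skipping entities below min_score."""
    grouped = {}
    for entity in entities:
        if entity['Score'] >= min_score:
            grouped.setdefault(entity['Type'], []).append(entity['Text'])
    return grouped

# Hypothetical data; real values would come from response['Entities']
sample_entities = [
    {'Type': 'DATE', 'Text': 'today', 'Score': 0.99},
    {'Type': 'LOCATION', 'Text': 'New York', 'Score': 0.98},
    {'Type': 'QUANTITY', 'Text': '50 degrees Fahrenheit', 'Score': 0.97},
    {'Type': 'QUANTITY', 'Text': '10 mph', 'Score': 0.95},
    {'Type': 'OTHER', 'Text': 'I-86', 'Score': 0.41},
]

print(entities_by_type(sample_entities))
```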

### Comprehend Phrases
Analysis of key phrases within natural language text complements the other two methods, helping a program better route actions based on the derived structure of the event.

In [None]:
def comprehend_phrases(text):
    response = comprehend.detect_key_phrases(Text=text, LanguageCode='en')
    df = pd.read_json(io.StringIO(json.dumps(response['KeyPhrases'])))
    return df

In [None]:
comprehend_phrases(traffic_update)

### Comprehend Sentiment
Sentiment analysis is common for user-generated social media content. Sentiment can give us signals about the users' mood when publishing such content.

In [None]:
def comprehend_sentiment(text):
    response = comprehend.detect_sentiment(Text=text, LanguageCode='en')
    return response['SentimentScore']

In [None]:
comprehend_sentiment(traffic_update)
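
The returned scores are easier to act on once reduced to a single label. As a small sketch, the helper below picks the highest-scoring sentiment from a dictionary shaped like `SentimentScore`. The name `dominant_sentiment` and the sample scores are assumptions for illustration.

```python
def dominant_sentiment(scores):
    """Return the (label, score) pair with the highest score."""
    label = max(scores, key=scores.get)
    return label, scores[label]

# Hypothetical data; real values would come from response['SentimentScore']
sample_scores = {'Positive': 0.02, 'Negative': 0.31, 'Neutral': 0.65, 'Mixed': 0.02}

label, score = dominant_sentiment(sample_scores)
print(f'{label} ({score:.0%})')
```

The `detect_sentiment` response also includes a top-level `Sentiment` field with the overall label, which can be read directly when the individual scores are not needed.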

Type your own thoughts and check the related sentiment.

In [None]:
# Replace the empty string with your own text; Comprehend requires non-empty input
comprehend_sentiment("")

### Original notebook

[Original notebook](https://github.com/aws-samples/aws-open-data-analytics-notebooks/blob/master/ai-services/using-ai-services-for-analyzing-public-data.ipynb) created by Manav Sehgal on APR 30 2019
