# Custom Celebrity Recognition Using Amazon Rekognition

***
This notebook walks through recognizing custom celebrities with Amazon Rekognition. You will first index the faces of your custom celebrities, then use the [SearchFacesByImage](https://docs.aws.amazon.com/rekognition/latest/dg/API_SearchFacesByImage.html) and [StartFaceSearch](https://docs.aws.amazon.com/rekognition/latest/dg/API_StartFaceSearch.html) APIs on a sample image and a sample video to recognize them.

***

**Prerequisites:**

The user or role that executes the commands must have permissions in AWS Identity and Access Management (IAM) to perform those actions. AWS provides a set of managed policies that help you get started quickly. For our example, you need to apply the following minimum managed policies to your user or role:

* AmazonRekognitionFullAccess
* AmazonDynamoDBFullAccess
* AmazonS3FullAccess

For production implementations, we recommend that you follow AWS IAM best practices; those are out of scope for this workshop.
***

In [None]:
# boto3 update might be required if SageMaker has older version of boto3
#!conda upgrade -y boto3

In [None]:
#Check to ensure that current version of boto3 is installed
#import boto3
#print(boto3.__version__)

## Initialize Stuff
***

In [None]:
# initialise Notebook
import boto3
from IPython.display import HTML, display
from PIL import Image, ImageDraw, ImageFont
import time

In [None]:
# Initialize clients
rekognition = boto3.client('rekognition')
dynamodb = boto3.client('dynamodb')
s3 = boto3.client('s3')

In [None]:
# S3 bucket that contains sample images and videos
bucketName = "ki-reinvent-content"

# DynamoDB Table and Rekognition Collection names. We will be creating these in this module.
ddbTableName = "my-celebrities"
collectionId = "my-celebrities"

In [None]:
# Create temporary directory
# This directory is not needed to call Rekognition APIs.
# We will only use this directory to download images from the S3 bucket and draw bounding boxes
# around recognized celebrities to show them here in the notebook.

!mkdir -p m2tmp
tempFolder = 'm2tmp/'

## DynamoDB table to store custom celebrity metadata
***
In this step we will create a DynamoDB table to store custom celebrity metadata including id, name and url. You can store additional attributes for each celebrity if needed.

In [None]:
# List existing DynamoDB Tables
# Before creating the DynamoDB table, let us first look at the list of existing DynamoDB tables in our account.

listTablesResponse = dynamodb.list_tables()
display(listTablesResponse["TableNames"])

In [None]:
# Create new DynamoDB Table
        
createTableResponse = dynamodb.create_table(
    TableName=ddbTableName,
    KeySchema=[
        {
            'AttributeName': 'id',
            'KeyType': 'HASH'  #Partition key
        }
    ],
    AttributeDefinitions=[
        {
            'AttributeName': 'id',
            'AttributeType': 'S'
        },

    ],
    ProvisionedThroughput={
        'ReadCapacityUnits': 10,
        'WriteCapacityUnits': 10
    }
)

display(createTableResponse)
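CreateTable is asynchronous: the call returns while the table is still in `CREATING` status, and the table only accepts writes once it is `ACTIVE`. The following is a minimal sketch of waiting for that state. The helper name `waitForTableActive` and its polling defaults are assumptions made for illustration, not part of the original notebook; boto3 also provides a built-in waiter, `dynamodb.get_waiter('table_exists')`, that serves the same purpose.

```python
import time

def waitForTableActive(dynamodbClient, tableName, pollSeconds=2, maxAttempts=30, sleep=time.sleep):
    """Poll DescribeTable until the table is ACTIVE.

    Returns True once the table is ACTIVE, or False if maxAttempts polls
    pass without reaching that status. The defaults are illustrative only.
    """
    for _ in range(maxAttempts):
        description = dynamodbClient.describe_table(TableName=tableName)
        if description['Table']['TableStatus'] == 'ACTIVE':
            return True
        sleep(pollSeconds)
    return False

# Hypothetical usage with the client and table name defined earlier in this notebook:
# waitForTableActive(dynamodb, ddbTableName)
```

The `sleep` parameter exists only so the helper can be exercised without real delays.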

In [None]:
# List DynamoDB Tables
# Let us list our DynamoDB tables again to make sure that the table we just created appears in the list.

listTablesResponse = dynamodb.list_tables()
display(listTablesResponse["TableNames"])

## Rekognition Collections
***
Amazon Rekognition can store information about detected faces in server-side containers known as [collections](https://docs.aws.amazon.com/rekognition/latest/dg/collections.html). You can use the facial information that's stored in a collection to search for known faces in images, stored videos, and streaming videos. In this section you will learn how you can create and manage Rekognition Collections.

In [None]:
# List Rekognition Collections
# Let us first see if we have already created any Rekognition collections in our account.
# If there are no Rekognition collections in your account, you will see an empty list;
# otherwise you will see a list with the names of the Rekognition collections and their face model versions.

listCollectionsResponse = rekognition.list_collections()

display(listCollectionsResponse["CollectionIds"])
display(listCollectionsResponse["FaceModelVersions"])

In [None]:
# Create Rekognition Collection
# Let us now create a new Rekognition collection that we will use to store faces of custom celebrities.

createCollectionResponse = rekognition.create_collection(
    CollectionId=collectionId
)
display(createCollectionResponse)
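Running the cell above a second time raises `ResourceAlreadyExistsException`, because the collection already exists. The sketch below shows one way to make the step safe to re-run: page through ListCollections and create the collection only when it is missing. The helper name `ensureCollection` is an assumption for illustration; it takes the Rekognition client as a parameter rather than relying on the notebook's global.

```python
def ensureCollection(rekognitionClient, collectionId):
    """Create the collection only if ListCollections does not already contain it.

    Returns True if the collection was created, False if it already existed.
    """
    kwargs = {}
    while True:
        page = rekognitionClient.list_collections(**kwargs)
        if collectionId in page.get('CollectionIds', []):
            return False
        nextToken = page.get('NextToken')
        if not nextToken:
            break
        # ListCollections is paginated; follow NextToken until it is absent.
        kwargs = {'NextToken': nextToken}

    rekognitionClient.create_collection(CollectionId=collectionId)
    return True

# Hypothetical usage with the names defined earlier in this notebook:
# ensureCollection(rekognition, collectionId)
```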


In [None]:
# List Rekognition Collections
# Let us make sure that the Rekognition collection we just created now appears in the list of collections in our AWS account.
listCollectionsResponse = rekognition.list_collections()

display(listCollectionsResponse["CollectionIds"])
display(listCollectionsResponse["FaceModelVersions"])

In [None]:
#Describe Rekognition Collection
# You can use DescribeCollection to get information, 
# such as the number of faces indexed into a collection 
# and the version of the model used by the collection for face detection etc.
# https://docs.aws.amazon.com/rekognition/latest/dg/API_DescribeCollection.html

# Since we have not indexed any faces yet, you should see FaceCount: 0

describeCollectionResponse = rekognition.describe_collection(
    CollectionId=collectionId
)
display(describeCollectionResponse)

## Index Custom Celebrity Faces
***
We will now index multiple images for each celebrity. Indexing several faces per celebrity increases the likelihood of recognizing them when their face appears at different angles, for example. We will use [IndexFaces](https://docs.aws.amazon.com/rekognition/latest/dg/API_IndexFaces.html) to detect faces in the input image and add them to the specified collection.

You can read more about some of the best practices around [indexing faces here in the blog](https://aws.amazon.com/blogs/machine-learning/save-time-and-money-by-filtering-faces-during-indexing-with-amazon-rekognition/).

In [None]:
# We will define a method to index a face along with the celebrity id
# https://docs.aws.amazon.com/rekognition/latest/dg/API_IndexFaces.html

def indexFace (bucketName, imageName, celebrityId):

    indexFaceResponse = rekognition.index_faces(
        CollectionId=collectionId,
        Image={
            'S3Object': {
                'Bucket': bucketName,
                'Name': imageName,
            }
        },
        ExternalImageId=celebrityId,
        DetectionAttributes=[
            'DEFAULT' #'DEFAULT'|'ALL',
        ],
        MaxFaces=1,
        QualityFilter='AUTO' #'NONE'|'AUTO'
    )
    
    display(indexFaceResponse)

# We will define a method to write metadata (id, name, url) of celebrity to DynamoDB
def addCelebrityToDynamoDB(celebrityId, celebrityName, celebrityUrl):
    ddbPutItemResponse = dynamodb.put_item(
        Item={
            'id': {'S': celebrityId},
            'name': {'S': celebrityName},
            'url': { 'S': celebrityUrl},
        },
        TableName=ddbTableName,
    )

### Index first celebrity

In [None]:
#Index Celebrity 1
celebrityId = "1"
celebrityName = "Chris Munns"
celebrityUrl = "http://www.amazon.com"

In [None]:
addCelebrityToDynamoDB(celebrityId, celebrityName, celebrityUrl)

In [None]:
# After you run this cell, the biggest face in the image will be indexed (we set MaxFaces=1).
# You will get a JSON response with a variety of information; notice FaceId, ImageId and ExternalImageId.
# Later, when we search for celebrities, we will use this ExternalImageId to look up metadata in DynamoDB.

#Indexing face: https://s3.amazonaws.com/ki-reinvent-content/ch-0.png

indexFace(bucketName, "ch-0.png", celebrityId)
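The IndexFaces response is verbose. The sketch below pulls out the two parts that matter most here: which faces were indexed (`FaceRecords`) and which detected faces were skipped and why (`UnindexedFaces`, for example because of `MaxFaces` or the quality filter). The helper name `summarizeIndexFaces` is an assumption for illustration, and it expects the raw response dictionary; the `indexFace` helper above displays that response but does not return it, so you would need to add a `return indexFaceResponse` to use the two together.

```python
def summarizeIndexFaces(indexFaceResponse):
    """Return (indexed, skipped) from an IndexFaces response dictionary.

    indexed: list of (FaceId, ExternalImageId) tuples for faces added to the collection.
    skipped: list of reason lists, one per face that was detected but not indexed.
    """
    indexed = [
        (record['Face']['FaceId'], record['Face'].get('ExternalImageId'))
        for record in indexFaceResponse.get('FaceRecords', [])
    ]
    skipped = [
        unindexed.get('Reasons', [])
        for unindexed in indexFaceResponse.get('UnindexedFaces', [])
    ]
    return indexed, skipped
```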

In [None]:
#Indexing face: https://s3.amazonaws.com/ki-reinvent-content/ch-1.png

indexFace(bucketName, "ch-1.png", celebrityId)

In [None]:
# Indexing face: https://s3.amazonaws.com/ki-reinvent-content/ch-2.png

indexFace(bucketName, "ch-2.png", celebrityId)

In [None]:
# Describe Rekognition Collection
# With three faces indexed for celebrity 1, you should now see FaceCount: 3

describeCollectionResponse = rekognition.describe_collection(
    CollectionId=collectionId
)
display("FaceCount: {0}".format(describeCollectionResponse["FaceCount"]))

### Index second celebrity

In [None]:
#Index Celebrity 2
celebrityId = "2"
celebrityName = "Kashif Imran"
celebrityUrl = "http://aws.amazon.com"

In [None]:
addCelebrityToDynamoDB(celebrityId, celebrityName, celebrityUrl)

In [None]:
# Indexing face: https://s3.amazonaws.com/ki-reinvent-content/k-0.png

indexFace(bucketName, "k-0.png", celebrityId)

In [None]:
# Indexing face: https://s3.amazonaws.com/ki-reinvent-content/k-1.png

indexFace(bucketName, "k-1.png", celebrityId)

In [None]:
# Indexing face: https://s3.amazonaws.com/ki-reinvent-content/k-2.png

indexFace(bucketName, "k-2.png", celebrityId)

In [None]:
# Describe Rekognition Collection
# You should now have FaceCount: 6, since we have indexed 3 faces for each of the 2 celebrities.
describeCollectionResponse = rekognition.describe_collection(
    CollectionId=collectionId
)
display("FaceCount: {0}".format(describeCollectionResponse["FaceCount"]))

## Recognize custom celebrities in image
***

In [None]:
imageName = "serverless-bytes.png"

In [None]:
searchFacesResponse = rekognition.search_faces_by_image(
    CollectionId=collectionId,
    Image={
        'S3Object': {
            'Bucket': bucketName,
            'Name': imageName,
        }
    },
    MaxFaces=2,
    FaceMatchThreshold=95
)

In [None]:
# You will see the Rekognition response with SearchedFaceBoundingBox (the bounding box of the biggest face
# in the image). Rekognition also returns FaceMatches, a list of matched faces. Each matched face has additional
# information including FaceId, ImageId and ExternalImageId. We will use ExternalImageId to look up information
# about this celebrity in DynamoDB.

display(searchFacesResponse)
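Because we indexed three faces per celebrity under the same `ExternalImageId`, `FaceMatches` can contain several entries that point to the same person. The sketch below collapses the list to the highest `Similarity` per `ExternalImageId`. The helper name `bestMatchPerCelebrity` is an assumption for illustration.

```python
def bestMatchPerCelebrity(faceMatches):
    """Map each ExternalImageId to the highest Similarity among its matches."""
    best = {}
    for match in faceMatches:
        eid = match['Face'].get('ExternalImageId')
        if eid is None:
            # Faces indexed without an ExternalImageId cannot be mapped to a celebrity.
            continue
        similarity = match['Similarity']
        if eid not in best or similarity > best[eid]:
            best[eid] = similarity
    return best

# Hypothetical usage with the response from the cell above:
# bestMatchPerCelebrity(searchFacesResponse['FaceMatches'])
```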

In [None]:
# Define functions to show image and bounded boxes around recognized celebrities
  
def displayWithBoundingBoxes (sourceImage, boxes):
    # RGB colors cycled across the boxes: grey, orange, blue, green
    colors = ((220,220,220),(242,168,73),(76,182,252),(52,194,123))
    
    # Download image locally
    imageLocation = tempFolder+sourceImage
    s3.download_file(bucketName, sourceImage, imageLocation)

    # Draws BB on Image
    bbImage = Image.open(imageLocation)
    draw = ImageDraw.Draw(bbImage)
    width, height = bbImage.size
    col = 0
    maxcol = len(colors)
    line= 3
    for box in boxes:
        x1 = int(box[1]['Left'] * width)
        y1 = int(box[1]['Top'] * height)
        x2 = int(box[1]['Left'] * width + box[1]['Width'] * width)
        y2 = int(box[1]['Top'] * height + box[1]['Height']  * height)
        
        draw.text((x1,y1),box[0],colors[col])
        for l in range(line):
            draw.rectangle((x1-l,y1-l,x2+l,y2+l),outline=colors[col])
        col = (col+1)%maxcol
    
    imageFormat = "PNG"
    ext = sourceImage.lower()
    if ext.endswith(('jpg', 'jpeg')):
        imageFormat = 'JPEG'

    bbImage.save(imageLocation,format=imageFormat)

    display(bbImage)
    
def getDynamoDBItem(itemId):
    ddbGetItemResponse = dynamodb.get_item(
        Key={'id': {'S': itemId} },
        TableName=ddbTableName
    )
    
    itemToReturn = ('', '', '')
    
    if('Item' in ddbGetItemResponse):
        itemToReturn = (ddbGetItemResponse['Item']['id']['S'], 
                ddbGetItemResponse['Item']['name']['S'],
                ddbGetItemResponse['Item']['url']['S'])
    
    return itemToReturn



In [None]:
# After you run this cell, you should see one of the faces recognized using Amazon Rekognition.
# You only see one face recognized in this example because
# SearchFacesByImage, for a given input image, first detects the largest face in the image,
# and then searches the specified collection for matching faces.

# To recognize every face in an image, you can first call DetectFaces to find the faces and then
# call SearchFacesByImage for each detected face.

def displaySearchedFace(sfr):  

    boxes = []
    
    if(len(sfr['FaceMatches']) > 0):
        bb = sfr['SearchedFaceBoundingBox']
        eid = sfr['FaceMatches'][0]['Face']['ExternalImageId']
        conf = sfr['FaceMatches'][0]['Similarity']

        celeb = getDynamoDBItem(eid)

        boxes.append(("{0}-{1}-{2}%".format(celeb[0], celeb[1], round(conf,2)), bb))

        displayWithBoundingBoxes(imageName, boxes)

displaySearchedFace(searchFacesResponse)
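Since SearchFacesByImage only searches with the largest face, recognizing everyone in a picture takes a second step: detect all faces with DetectFaces, crop each one, and search the collection with each crop. The following is a rough sketch of that approach, not a tested production routine. The helper names `toPixelBox` and `searchEachFace` are assumptions for illustration, the default threshold simply mirrors the value used above, and the image is passed as raw bytes (for example, read from the file downloaded into the temporary folder).

```python
import io

def toPixelBox(boundingBox, width, height):
    """Convert a Rekognition ratio-based BoundingBox to a pixel box clamped to the image."""
    left = max(0, int(boundingBox['Left'] * width))
    top = max(0, int(boundingBox['Top'] * height))
    right = min(width, int((boundingBox['Left'] + boundingBox['Width']) * width))
    bottom = min(height, int((boundingBox['Top'] + boundingBox['Height']) * height))
    return (left, top, right, bottom)

def searchEachFace(rekognitionClient, collectionId, imageBytes, threshold=95):
    """Detect every face, then search the collection once per cropped face.

    Returns a list of (BoundingBox, FaceMatches) tuples, one per searchable face.
    """
    from PIL import Image  # Pillow, already imported at the top of this notebook

    image = Image.open(io.BytesIO(imageBytes))
    width, height = image.size
    detected = rekognitionClient.detect_faces(Image={'Bytes': imageBytes})

    results = []
    for face in detected['FaceDetails']:
        crop = image.crop(toPixelBox(face['BoundingBox'], width, height))
        buffer = io.BytesIO()
        crop.convert('RGB').save(buffer, format='JPEG')
        try:
            response = rekognitionClient.search_faces_by_image(
                CollectionId=collectionId,
                Image={'Bytes': buffer.getvalue()},
                MaxFaces=1,
                FaceMatchThreshold=threshold,
            )
        except rekognitionClient.exceptions.InvalidParameterException:
            # Raised when Rekognition finds no usable face in the crop; skip it.
            continue
        results.append((face['BoundingBox'], response['FaceMatches']))
    return results
```

Each returned tuple can be turned into a label and box pair and passed to `displayWithBoundingBoxes` in the same way `displaySearchedFace` does.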

## Recognize custom celebrities in video
***

In [None]:
videoName = "serverless-bytes.mov"

In [None]:
startFaceSearchResponse = rekognition.start_face_search(
    Video={
        'S3Object': {
            'Bucket': bucketName,
            'Name': videoName
        }
    },
    FaceMatchThreshold=99,
    CollectionId=collectionId,
)


faceSearchJobId = startFaceSearchResponse['JobId']
display("Job ID: {0}".format(faceSearchJobId))

In [None]:
getFaceSearch = rekognition.get_face_search(
    JobId=faceSearchJobId,
    SortBy='TIMESTAMP'
)

while(getFaceSearch['JobStatus'] == 'IN_PROGRESS'):
    time.sleep(5)
    print('.', end='')

    getFaceSearch = rekognition.get_face_search(
        JobId=faceSearchJobId,
        SortBy='TIMESTAMP'
    )
    
display(getFaceSearch['JobStatus'])
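The loop above polls until the job leaves `IN_PROGRESS`, but it has no upper bound on waiting, and a single GetFaceSearch call returns only one page of results. The sketch below adds a timeout and follows `NextToken` to gather every page. The helper names `waitForFaceSearch` and `collectPersons` and the timeout default are assumptions for illustration.

```python
import time

def waitForFaceSearch(rekognitionClient, jobId, pollSeconds=5, timeoutSeconds=600, sleep=time.sleep):
    """Poll GetFaceSearch until the job is no longer IN_PROGRESS; return the final status."""
    waited = 0
    while True:
        response = rekognitionClient.get_face_search(JobId=jobId, SortBy='TIMESTAMP')
        status = response['JobStatus']
        if status != 'IN_PROGRESS':
            return status  # 'SUCCEEDED' or 'FAILED'
        if waited >= timeoutSeconds:
            raise TimeoutError('Face search job {0} did not finish in time'.format(jobId))
        sleep(pollSeconds)
        waited += pollSeconds

def collectPersons(rekognitionClient, jobId):
    """Return the Persons entries from every result page of a finished job."""
    persons = []
    kwargs = {'JobId': jobId, 'SortBy': 'TIMESTAMP'}
    while True:
        page = rekognitionClient.get_face_search(**kwargs)
        persons.extend(page.get('Persons', []))
        nextToken = page.get('NextToken')
        if not nextToken:
            return persons
        kwargs['NextToken'] = nextToken

# Hypothetical usage with the names defined earlier in this notebook:
# if waitForFaceSearch(rekognition, faceSearchJobId) == 'SUCCEEDED':
#     allPersons = collectPersons(rekognition, faceSearchJobId)
```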

In [None]:
display(getFaceSearch)

In [None]:
theCelebs = {}

# Display timestamps and celebrities detected at that time
strDetail = "Celebrities detected in each frame<br>=======================================<br>"
strOverall = "Celebrities in the overall video:<br>=======================================<br>"

# Faces detected in each frame
for person in getFaceSearch['Persons']:
    if('FaceMatches' in person and len(person["FaceMatches"])> 0):
        ts = person["Timestamp"]
        theFaceMatches = {}
        for fm in person["FaceMatches"]:
            conf = fm["Similarity"]
            eid =  fm["Face"]["ExternalImageId"]
            if(eid not in theFaceMatches):
                theFaceMatches[eid] = (eid, ts, round(conf,2))
            if(eid not in theCelebs):
                theCelebs[eid] = (getDynamoDBItem(eid))
        for theFaceMatch in theFaceMatches:
            celeb = theCelebs[theFaceMatch]
            fminfo = theFaceMatches[theFaceMatch]
            strDetail = strDetail + "Timestamp: {0} ms, EID:{1}, Name: {2}, Url: {3}, Conf: {4}%<br>".format(fminfo[1],
                       celeb[0], celeb[1], celeb[2], fminfo[2])

# Unique faces detected in video
for theCeleb in theCelebs:
    tc = theCelebs[theCeleb]
    strOverall = strOverall + "id: {0}, name: {1}, url: {2}<br>".format(tc[0], tc[1], tc[2])

# Display results
display(HTML(strOverall))
display(HTML(strDetail))
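The per-frame listing can get long for anything but a short clip. As a compact alternative, the sketch below reduces the `Persons` list to the first and last timestamp (in milliseconds) at which each `ExternalImageId` was matched. The helper name `summarizeAppearances` is an assumption for illustration.

```python
def summarizeAppearances(persons):
    """Map each ExternalImageId to (firstTimestampMs, lastTimestampMs) across all matches."""
    spans = {}
    for person in persons:
        ts = person['Timestamp']
        # FaceMatches may be missing or empty when the tracked person matched nobody.
        for fm in person.get('FaceMatches') or []:
            eid = fm['Face'].get('ExternalImageId')
            if eid is None:
                continue
            first, last = spans.get(eid, (ts, ts))
            spans[eid] = (min(first, ts), max(last, ts))
    return spans

# Hypothetical usage with the response from the cells above:
# summarizeAppearances(getFaceSearch['Persons'])
```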
    

In [None]:
# Display video in player

s3VideoUrl = "https://s3.amazonaws.com/{0}/{1}".format(bucketName, videoName)

videoTag = "<video controls='controls' autoplay width='800' height='600' name='Video' src='{0}'></video>".format(s3VideoUrl)

display(HTML(videoTag))

In [None]:
# Delete Collection
#deleteCollectionResponse = rekognition.delete_collection(
#    CollectionId=collectionId
#)
#display(deleteCollectionResponse)
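This module created two resources: the Rekognition collection and the DynamoDB table. If you want to remove both when you are done, a cleanup step could look like the sketch below. The helper name `cleanUp` is an assumption for illustration, and the function is only defined here, not called, because both deletions are irreversible.

```python
def cleanUp(rekognitionClient, dynamodbClient, collectionId, tableName):
    """Delete the Rekognition collection and the DynamoDB table created in this module.

    Returns the StatusCode reported by DeleteCollection.
    """
    deleteCollectionResponse = rekognitionClient.delete_collection(CollectionId=collectionId)
    dynamodbClient.delete_table(TableName=tableName)
    return deleteCollectionResponse['StatusCode']

# Hypothetical usage with the names defined earlier in this notebook:
# cleanUp(rekognition, dynamodb, collectionId, ddbTableName)
```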